This project will focus on two related challenges in non-linear regression:
- Automated Model Selection for Structured Data – Selecting non-linear model forms for regression is a largely manual process, informed by exploratory data analysis and iterative improvement. The number of ways in which variables can interact in non-linear relationships increases exponentially with the number of variables. Particularly for complex observed systems, where there is limited intuition about the relationships between variables, manual exploration of the model space is limited and can lead to unsuccessful or sub-optimal model selection. Mathematical expressions can be represented by tree-like data structures, which are convenient for perturbation operators like those used in meta-heuristics (approximate optimisation algorithms). This project proposes to develop techniques for automating and optimising the selection of non-linear regression models for complex data.
- Application and selection of meta-models for stochastic optimisation – There are many examples of problems where discrete-event simulation and optimisation can be used cooperatively to determine optimal strategic or operational plans for systems with high complexity and uncertainty. One example is determining the optimal allocation of infrastructure and resources in a large hospital to optimise patient flow and minimise waiting lists. Combining simulation and optimisation is computationally challenging, and the use of meta-models to approximate the simulation model has proven to be an effective technique for reducing the computational burden. This project aims to apply automated model selection techniques to determine good-fitting mathematical forms for meta-models, whereas previous research has always used forms defined a priori. The project will also aim to improve on existing techniques for incorporating meta-models into optimisation frameworks.
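As an illustration of the tree representation mentioned in the first challenge, the following is a minimal sketch in Python, not the project's method. It assumes a small, arbitrary operator set, and the names Node, evaluate, to_string and mutate_operator are hypothetical. It shows a candidate model form stored as an expression tree and a single perturbation operator of the kind a meta-heuristic could apply when moving between neighbouring model forms.

```python
import copy
import random
from dataclasses import dataclass
from typing import Dict, List, Optional

# Binary operators allowed at internal nodes (an illustrative assumption).
OPS = {
    "+": lambda a, b: a + b,
    "-": lambda a, b: a - b,
    "*": lambda a, b: a * b,
}


@dataclass
class Node:
    """One node of an expression tree: an operator, a variable name, or a constant."""
    value: object
    left: Optional["Node"] = None
    right: Optional["Node"] = None

    def is_leaf(self) -> bool:
        return self.left is None and self.right is None


def evaluate(node: Node, env: Dict[str, float]) -> float:
    """Evaluate the expression for the variable values given in env."""
    if node.is_leaf():
        # String leaves are variables; anything else is treated as a constant.
        if isinstance(node.value, str):
            return env[node.value]
        return float(node.value)
    return OPS[node.value](evaluate(node.left, env), evaluate(node.right, env))


def to_string(node: Node) -> str:
    """Render the tree as a fully parenthesised infix expression."""
    if node.is_leaf():
        return str(node.value)
    return f"({to_string(node.left)} {node.value} {to_string(node.right)})"


def internal_nodes(node: Node) -> List[Node]:
    """Collect operator nodes in pre-order."""
    if node.is_leaf():
        return []
    return [node] + internal_nodes(node.left) + internal_nodes(node.right)


def mutate_operator(root: Node, rng: random.Random) -> Node:
    """Return a copy of the tree with one randomly chosen operator replaced."""
    new_root = copy.deepcopy(root)
    candidates = internal_nodes(new_root)
    if not candidates:
        return new_root  # a single leaf has no operator to perturb
    target = rng.choice(candidates)
    target.value = rng.choice([op for op in OPS if op != target.value])
    return new_root


# Hypothetical model form: x1 * x2 + 2.0
expr = Node("+", Node("*", Node("x1"), Node("x2")), Node(2.0))
print(to_string(expr))                            # ((x1 * x2) + 2.0)
print(evaluate(expr, {"x1": 3.0, "x2": 4.0}))     # 14.0
print(to_string(mutate_operator(expr, random.Random(0))))
```

In a full search, each perturbed tree would be fitted to the data and scored, with the meta-heuristic deciding whether to accept the move. Other perturbations, such as swapping a variable or replacing a subtree, could be defined in the same way.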
Paul Corry and other Program Leaders / Participants
Involvement of Centre researchers and cross-disciplinary nature
There are obvious potential cross-disciplinary applications. For this project, these will depend on which types of datasets are used for testing and who the relevant stakeholders are in relation to those datasets. This is yet to be determined.
Alignment to Centre Vision and Core Research Themes
This project aligns most closely with the themes Frontier Methods in Data-Focused Modelling, Advances in Computational Algorithms, and Innovation in Data Acquisition and Design.
Novelty of research
To our knowledge, the proposed approach to automated model selection is yet to be explored, and stochastic optimisation still presents many challenges to both researchers and practitioners.
Potential for impact
Automating the non-linear model selection process would be a useful contribution for data-science researchers and practitioners, resulting in significant efficiencies. It may also make the use of regression more favourable compared to machine learning approaches, benefitting end-users with the additional insights that come with a regression approach.
The motivation for dealing with stochastic optimisation comes from recent Food Agility projects involving both the beef and vegetable production industries, along with an ARC Linkage project involving Queensland Health. Stakeholders in these projects share a common desire to optimise the allocation of resources across their value chains under complex and uncertain operating environments.
Potential for external and ongoing funding
ARC, Food Agility, Queensland Health, Teys and Mulgowie Farms are possible funding sources for application of this work.
Academic journal papers, R libraries.