Approximate Bayesian computation with surrogate posteriors
A key ingredient in approximate Bayesian computation (ABC) procedures is the choice of a discrepancy that describes how different the simulated and observed data are, often based on a set of summary statistics when the data cannot be compared directly. Unless discrepancies and summaries are available from experts or prior knowledge, which seldom occurs, they have to be chosen, and this choice can affect the quality of approximations. The choice between discrepancies is an active research topic, which has mainly considered data discrepancies requiring samples of observations or distances between summary statistics. In this work, we introduce a preliminary learning step in which surrogate posteriors are built from finite Gaussian mixtures, using an inverse regression approach. These surrogate posteriors are then used in place of summary statistics and compared using metrics between distributions in place of data discrepancies. Two such metrics are investigated: a standard L2 distance and an optimal transport-based distance. The whole procedure can be seen as an extension of the semi-automatic ABC framework to a functional summary statistics setting and can also be used as an alternative to sample-based approaches. The resulting ABC quasi-posterior distribution is shown to converge to the true one under standard conditions. Performance is illustrated on both synthetic and real data sets, where it is shown that our approach is particularly useful when the posterior is multimodal.
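To make the idea of comparing surrogate posteriors with an L2 distance concrete, here is a minimal sketch in Python. It is not the speaker's implementation: it assumes one-dimensional Gaussian mixtures written as lists of (weight, mean, variance) tuples, and the names `normal_pdf`, `mixture_inner`, `l2_distance`, `bimodal` and `unimodal` are hypothetical. It relies on the standard identity that the integral of the product of two Gaussian densities N(x; m1, v1) and N(x; m2, v2) equals N(m1; m2, v1 + v2), which gives the L2 distance between two mixtures in closed form.

```python
import math


def normal_pdf(x, mean, var):
    """Density of a univariate Gaussian with the given mean and variance."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)


def mixture_inner(p, q):
    """Integral of p(x) * q(x) dx for two 1D Gaussian mixtures.

    Each mixture is a list of (weight, mean, variance) tuples. The product of
    two Gaussian densities integrates to N(m1; m2, v1 + v2).
    """
    return sum(
        wp * wq * normal_pdf(mp, mq, vp + vq)
        for wp, mp, vp in p
        for wq, mq, vq in q
    )


def l2_distance(p, q):
    """Closed-form L2 distance between two 1D Gaussian mixtures."""
    squared = mixture_inner(p, p) - 2.0 * mixture_inner(p, q) + mixture_inner(q, q)
    return math.sqrt(max(squared, 0.0))  # guard against tiny negative round-off


# Hypothetical surrogate posteriors (illustrative values, not from the talk):
# a bimodal mixture and a single Gaussian with the same mean and variance.
bimodal = [(0.5, -2.0, 0.25), (0.5, 2.0, 0.25)]
unimodal = [(1.0, 0.0, 4.25)]

print(l2_distance(bimodal, bimodal))
print(l2_distance(bimodal, unimodal))
```

The two example distributions share the same mean (0) and variance (4.25), so a summary based on those two moments alone could not tell them apart, whereas the L2 distance between the densities is clearly positive. This is one way to see why a functional summary can help when the posterior is multimodal.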
Florence Forbes is Director of Research at Inria in Grenoble, France, and head of the Statify group. She has been working on graphical Markov models, classification methods for spatially localized data, and statistical image analysis for more than 20 years. Her publications span four main domains, reflecting a balance at the interface of statistics and probability, machine learning and pattern recognition, signal and image processing, and biology and medicine. Her current interests are mainly model-based clustering methods, supervised (learning) or unsupervised (parameter estimation), statistical model selection, and Bayesian techniques for integrating various sources of information and a priori knowledge. She has experience with different types of data from domains as diverse as genetics and genomics, computer vision, and planetary science. She is also the cofounder and scientific advisor of Pixyl Automatic Neuroimaging Solutions.
Details:
Location: | GP-419; or https://qut.zoom.us/j/84037026355?pwd=QzlrcTBscmZLOEEzUzhaaEZQd05IUT09 |
Start Date: | 08/04/2022 |
Start Time: | 12pm |
End Date: | 08/04/2022 |
End Time: | 1pm (AEST) |
Enquiries: | datascience@qut.edu.au |