SMC Down Under: AMSI-AustMS Workshop on Sequential Monte Carlo

The SMC Down Under workshop will bring together the SMC community to discuss the theory and practice of sequential Monte Carlo. The workshop will consist of contributed talks, posters, and collaborative sessions to discuss current trends in SMC and its future directions.

Location and Dates
Queensland University of Technology, Brisbane
10–13 July 2023

  • All sessions will be held at QUT Gardens Point, P-block.
  • Sessions will be in room P512 unless otherwise stated in the schedule below.

Keynote Speakers

  • Dr Francesca Crucinio, ENSAE Paris
  • Prof Sumeetpal Singh, University of Wollongong
  • Dr Saifuddin Syed, University of Oxford

Schedule (see below for details about individual talks)

Monday, July 10 (pre-workshop)

Tutorial 1: Matt Sutton - An introduction to SMC samplers

Tutorial 2: Imke Botha - An introduction to particle filters

Tutorial 3: Joshua Bon - An introduction to general SMC

HelloSMC.jl: GitHub page for tutorials
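
For orientation ahead of the tutorials, below is a minimal bootstrap particle filter for a toy linear-Gaussian model. It is a generic Python sketch written for this page, not an excerpt from the tutorial material (the HelloSMC.jl tutorials are in Julia), and the model and all settings are illustrative.

```python
import numpy as np

def bootstrap_particle_filter(y, n_particles, rng=None):
    """Bootstrap particle filter for the toy model
    x_t = 0.9 x_{t-1} + N(0, 1),  y_t = x_t + N(0, 1).
    Returns the log of an unbiased estimate of the likelihood p(y)."""
    rng = rng or np.random.default_rng(0)
    N = n_particles
    x = rng.normal(0.0, 1.0, N)                   # initial particles
    log_Z = 0.0
    for y_t in y:
        x = 0.9 * x + rng.normal(0.0, 1.0, N)     # propagate
        log_w = -0.5 * (y_t - x) ** 2 - 0.5 * np.log(2 * np.pi)  # weight
        m = log_w.max()
        w = np.exp(log_w - m)                     # stabilised weights
        log_Z += m + np.log(w.mean())             # evidence increment
        x = x[rng.choice(N, N, p=w / w.sum())]    # multinomial resample
    return log_Z

y_obs = np.cumsum(np.random.default_rng(1).normal(size=50))  # toy data
print(bootstrap_particle_filter(y_obs, n_particles=500))
```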

Networking event: With AMSI Winter School – light food provided

Tuesday, July 11

Keynote 1: Dr Francesca Crucinio, ENSAE Paris - Divide-and-Conquer sequential Monte Carlo with applications to high dimensional filtering.

In this talk I will describe the Divide-and-Conquer sequential Monte Carlo (DaC-SMC) algorithm introduced in Lindsten et al. (2017), an extension of standard SMC which exploits auxiliary tree-structured decompositions of the target distribution to turn the overall inferential task into a collection of smaller sub-problems. The tree structure embedded in DaC-SMC makes it easier to parallelise and distribute than standard SMC. I will present some recent results on the convergence properties of these algorithms and describe an application of DaC-SMC to high dimensional filtering for spatial state space models. This talk is based on joint work with Adam M. Johansen and Juan Kuntz.
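
To fix ideas, here is a stylised Python sketch of a single DaC-SMC merge step on a two-leaf tree, under the simplifying assumption that the target is a correlated bivariate Gaussian and each leaf targets one standard-normal marginal. The full algorithm adds recursion over the tree, resampling and MCMC moves within each node; none of that is shown here.

```python
import numpy as np
from scipy.stats import multivariate_normal, norm

rng = np.random.default_rng(0)
N, rho = 2000, 0.8
Sigma = np.array([[1.0, rho], [rho, 1.0]])       # joint target covariance

# Leaf populations: each child targets one standard-normal marginal.
x1 = rng.normal(size=N)
x2 = rng.normal(size=N)

# Merge step: pair child particles, then reweight each pair by the
# ratio of the joint target to the product of the leaf targets.
pairs = np.column_stack([x1, rng.permutation(x2)])
log_w = (multivariate_normal(mean=np.zeros(2), cov=Sigma).logpdf(pairs)
         - norm.logpdf(pairs[:, 0]) - norm.logpdf(pairs[:, 1]))
w = np.exp(log_w - log_w.max())
w /= w.sum()

# Resample to obtain an unweighted approximation of the joint target.
merged = pairs[rng.choice(N, N, p=w)]
print("empirical correlation after merge:", np.corrcoef(merged.T)[0, 1])
```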

About:

Dr Francesca Crucinio is a postdoctoral fellow at ENSAE in Paris and a member of the Center for Research in Economics and Statistics (CREST), working with Nicolas Chopin. She is interested in Monte Carlo and interacting particle methods (e.g. importance sampling, sequential Monte Carlo, Markov chain Monte Carlo, McKean-Vlasov stochastic differential equations) from both a methodological and a theoretical point of view.

Before joining ENSAE, Francesca was a Research Fellow in the Department of Statistics of the University of Warwick working with Gareth Roberts and Adam Johansen.

Personal website: https://francescacrucinio.github.io/

David Gunawan, The University of Wollongong - The Block-Correlated Pseudo Marginal Sampler for State Space Models

Particle Marginal Metropolis-Hastings (PMMH) is a general approach to Bayesian inference when the likelihood is intractable but can be estimated unbiasedly. Our article develops an efficient PMMH method, which we call the multiple PMMH (MPM) algorithm, for estimating the parameters of complex state space models. Several important innovations are proposed. First, multiple particle filters are run in parallel and the trimmed mean of their unbiased likelihood estimates is used. Second, a novel block version of PMMH that works with multiple particle filters is proposed. Third, the article develops an efficient auxiliary disturbance particle filter, which is necessary when the bootstrap filter is inefficient but the state transition density cannot be expressed in closed form. Fourth, a novel fast sorting algorithm is developed to preserve the correlation between the logs of the likelihood estimates at the current and proposed parameter values. These features enable our sampler to scale up better to higher dimensional state vectors than previous approaches. The performance of the sampler is investigated empirically by applying it to non-linear Dynamic Stochastic General Equilibrium models with relatively high state dimensions and intractable state transition densities, and to multivariate stochastic volatility in the mean models. Although our focus is on applying the method to state space models, the approach will be useful in a wide range of applications, such as large panel data models and stochastic differential equation models with mixed effects.
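
The flavour of the MPM accept/reject step can be sketched as follows. Here `pf` is a hypothetical user-supplied function returning one unbiased likelihood estimate (e.g. from a bootstrap or auxiliary disturbance filter), the trimming proportion is illustrative, and the proposal is assumed symmetric; the published method additionally correlates estimates across iterations via the sorting step mentioned above, which is not reproduced here.

```python
import numpy as np
from scipy.stats import trim_mean

def mpm_loglike(theta, y, pf, n_filters=8, prop_trim=0.1):
    """Trimmed mean of likelihood estimates from several independent
    particle filters (a loose sketch of the MPM estimator)."""
    ests = np.array([pf(theta, y, seed=s) for s in range(n_filters)])
    return np.log(trim_mean(ests, prop_trim))

def pmmh_step(theta, loglike, y, log_prior, propose, pf, rng):
    """One pseudo-marginal MH step with a symmetric proposal; the
    current likelihood estimate is recycled, as in standard PMMH."""
    theta_new = propose(theta, rng)
    loglike_new = mpm_loglike(theta_new, y, pf)
    log_alpha = (loglike_new + log_prior(theta_new)
                 - loglike - log_prior(theta))
    if np.log(rng.uniform()) < log_alpha:
        return theta_new, loglike_new
    return theta, loglike
```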

Imke Botha, QUT Centre for Data Science - Automatically adapting the mutation kernel in SMC2

Sequential Monte Carlo squared (SMC2) methods give exact parameter inference of state-space models where the likelihood of the model parameters is unknown in closed form. SMC methods propagate a set of particles through a sequence of distributions using a combination of reweighting, resampling and mutation steps. In the Bayesian setting, this sequence often starts at the prior and ends at the posterior. SMC2 is similar to particle Markov chain Monte Carlo (MCMC) methods in the sense that it replaces the intractable likelihood in the sequence of distributions being traversed with a particle filter estimator. As a result, particle MCMC methods are a natural choice to mutate the particles within SMC2. We introduce a method that adaptively chooses between two particle MCMC algorithms for the mutation step: particle marginal Metropolis-Hastings and particle Gibbs. The most efficient mutation kernel greatly depends on both the model and the target distribution, i.e. the current distribution in the sequence, so adaptively switching between mutation kernels can greatly improve the performance of SMC2.
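
One plausible, purely illustrative switching rule in the spirit of the talk: trial each candidate mutation kernel (say, a PMMH move and a particle Gibbs move) on a pilot subset of particles and keep whichever moves them furthest. The authors' actual adaptation criterion may differ.

```python
import numpy as np

def choose_mutation_kernel(particles, kernels, n_pilot=50, rng=None):
    """Pick a mutation kernel by expected squared jumping distance on a
    pilot subset. `particles` is a 1-D array of (scalar) parameter values;
    `kernels` holds callables theta -> theta' applying one mutation move
    (e.g. PMMH or particle Gibbs). This selection rule is an assumption
    made for illustration, not the paper's criterion."""
    rng = rng or np.random.default_rng()
    idx = rng.choice(len(particles), n_pilot, replace=False)
    scores = []
    for kernel in kernels:
        moved = np.array([kernel(particles[i], rng) for i in idx])
        scores.append(np.mean((moved - particles[idx]) ** 2))
    return kernels[int(np.argmax(scores))]   # largest average jump wins
```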

Joshua Bon, QUT Centre for Data Science - Monte Carlo twisting for SMC

We consider the problem of designing efficient particle filters for twisted Feynman–Kac models. Particle filters using twisted models can deliver low error approximations of statistical quantities and such twisting functions can be learnt iteratively. Practical implementations of these algorithms are complicated by the need to (i) sample from the twisted transition dynamics, and (ii) calculate the twisted potential functions. We expand the class of applicable models using rejection sampling for (i) and unbiased approximations for (ii) using a random weight particle filter. We characterise the average acceptance rates within the particle filter in order to control the computational cost, and analyse the asymptotic variance. Empirical results show the mean squared error of the normalising constant estimate in our method is smaller than a memory-equivalent particle filter but not a computation-equivalent filter. Both comparisons are improved when more efficient sampling is possible which we demonstrate on a stochastic volatility model.

Andrew Hoegh, Montana State University - Particle filtering methods for animal movement modeling

There is a natural harmony between particle methods and telemetric animal location data: visual displays of particle methods can be directly overlaid on the telemetry data to estimate movement from recorded point locations. However, there are two common challenges in applying particle methods to movement data. First, modern data collection devices can record and store location data at ever higher temporal resolution, which necessitates models, and computational tools, equipped to analyze such data; these high-resolution data streams can present difficulties for particle methods. A second challenge arises from collective movement models, where the behavior of one animal depends on the locations of other animals. From a computational perspective, this requires that all animals be analyzed jointly, as opposed to permitting parallel computation. This poster will explore some of the limits of particle methods for animal movement models.

Adam Bretherton, QUT Centre for Data Science - Transfer Sequential Monte Carlo: A Framework for Bayesian Model Transfer

Model transfer attempts to incorporate information from related source domains to improve inference on the target domain. Unfortunately, it is not clear when to transfer information, which information to transfer, and how to transfer it. Current statistical model transfer methods are limited to conjugate distributions and suffer from some theoretical issues. We develop a new framework for Bayesian model transfer, transfer sequential Monte Carlo (TSMC), that takes a principled statistical approach to the transfer problem. The new framework permits a convenient comparison of four different approaches for selecting the amount of tempering of source information, and does not require that the Bayesian model under consideration be conjugate. Two of these approaches take a Bayesian model selection view of the transfer problem, treating each level of transfer as a potential model to be selected; we explore two selection criteria, the model evidence and the widely applicable information criterion. The other two approaches treat the transfer factor as a random variable and utilise the joint power prior and the normalised power prior, respectively. We use our new TSMC framework to evaluate the efficacy of Bayesian model transfer, with the empirical results expanding on previously explored theoretical justifications and limitations.
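
The tempering of source information has a compact form. The sketch below assumes a power-prior style target in which a transfer factor gamma in [0, 1] discounts the source likelihood; the framework's evidence- and WAIC-based selection of gamma, and the normalised power prior variant, are not shown.

```python
def log_transfer_target(theta, log_prior, log_like_target,
                        log_like_source, gamma):
    """Unnormalised log-density of a power-prior style transfer target:
    prior(theta) * L_source(theta)^gamma * L_target(theta).
    gamma = 0 ignores the source domain; gamma = 1 pools both datasets."""
    return (log_prior(theta)
            + gamma * log_like_source(theta)
            + log_like_target(theta))
```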

AMSI Winter School Finalist Presentations

Wednesday, July 12

Keynote 2: Prof Sumeetpal Singh, University of Wollongong - On resampling schemes for particle filters with weakly informative observations

We consider particle filters with weakly informative observations (or 'potentials') relative to the latent state dynamics. The particular focus of this work is on particle filters that approximate time-discretisations of continuous-time Feynman-Kac path integral models, a scenario that naturally arises when addressing filtering and smoothing problems in continuous time, but our findings are indicative of weakly informative settings beyond this context too. We study the performance of different resampling schemes, such as systematic resampling, SSP (Srinivasan sampling process) and stratified resampling, as the time-discretisation becomes finer, and identify their continuous-time limits, which are expressed as suitably defined 'infinitesimal generators'. By contrasting these generators, we find that (certain modifications of) systematic and SSP resampling 'dominate' stratified and independent 'killing' resampling in terms of their limiting overall resampling rate. The reduced intensity of resampling manifests itself in lower variance in our numerical experiments. This efficiency result, through an ordering of the resampling rate, is new to the literature. The second major contribution of this work concerns the limiting behaviour of the entire population of particles as the time discretisation becomes finer. We provide the first proof, under general conditions, that the particle approximation of the discretised continuous-time Feynman-Kac path integral models converges to a (uniformly weighted) continuous-time particle system. Joint work with N. Chopin, T. Soto and M. Vihola. DOI: 10.1214/22-AOS2222
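
For reference, here are minimal implementations of two of the schemes compared in the talk. Stratified and systematic resampling differ only in whether each stratum receives its own uniform draw or all strata share a single one; this is a generic sketch, not code from the paper.

```python
import numpy as np

def stratified_resample(w, rng):
    """Stratified resampling: one uniform per stratum [i/N, (i+1)/N)."""
    N = len(w)
    u = (np.arange(N) + rng.uniform(size=N)) / N
    return np.searchsorted(np.cumsum(w), u)

def systematic_resample(w, rng):
    """Systematic resampling: one shared uniform across all strata."""
    N = len(w)
    u = (np.arange(N) + rng.uniform()) / N
    return np.searchsorted(np.cumsum(w), u)

rng = np.random.default_rng(0)
w = rng.dirichlet(np.ones(10))                  # normalised weights
print(systematic_resample(w, rng))
print(stratified_resample(w, rng))
```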

About our speaker:

Professor Sumeetpal Singh recently started as the Tibra Foundation Chair in Mathematical Sciences at the University of Wollongong, having moved to UOW from the University of Cambridge. His areas of expertise are statistical signal processing and computational statistics, and he is well known for his contributions to some of the most effective computational techniques for contemporary data science. At UOW, Professor Singh will continue to develop an internationally renowned research program in Computational Methodology for Statistics and Machine Learning, and hopes to collaborate with other applied sciences within the University, such as the life sciences and engineering.

Jacopo Iollo, INRIA - Combining stochastic approximation (SA) and sequential Monte Carlo (SMC) for sequential optimal design

We propose a new algorithm for Bayesian design that performs sequential design optimization while providing an estimate of the posterior distribution for parameter inference. The sequential design optimization is carried out through a stochastic approximation (SA) procedure and uses a tempering principle to handle the fact that maximizing information gain jeopardizes the accuracy of standard SMC samplers. The approach behaves well on a preliminary 2D source localization example, efficiently exploring target distributions. It illustrates that, with offline computer simulations, the number of field measurements performed can be significantly reduced, with lower costs for practitioners, while preserving estimation quality.

Laurence Davies, QUT Centre for Data Science - Weight-stabilised transdimensional sequential Monte Carlo

Geometric annealing is a well-known approach to enhance sampling algorithms such as parallel tempering, simulated tempering, and static sequential Monte Carlo. These approaches improve exploration of Monte Carlo methods by introducing a sequence of auxiliary distributions which are more amenable to sampling. A benefit of geometric annealing is its ease of implementation regardless of the geometry of the target distribution, which has led to a natural uptake of the approach in transdimensional problems. However, for certain problems the sequence of distributions can exhibit a "dropout" of density mass in regions of the state space, or sometimes in entire models, only to see a return of this mass in the final distribution. In this work, we examine the latter case in a transdimensional static sequential Monte Carlo sampler and propose a robust "weight-stabilised" approach which counteracts the dropout effect and drastically improves the efficiency of the sampler. Two flavours of weight-stabilisation are considered, the first using a Gaussian-based correction, and the second using an evidence-approximation correction. The robustness of the second approach is demonstrated through two distinct and challenging numerical examples.

Brodie Lawson, QUT Centre for Data Science - Sequential Monte Carlo for Population Calibration

Deterministic, black-box models of complex systems have seen increasing interface with Bayesian statistics as a means of selecting values for their parameters that comport with reality, and of quantifying our uncertainty in these parameter estimates. However, this approach relies upon the view that there are some true values for these parameters that we seek to discover (or at least, it operates by evaluating how likely observing the data would be for each set of parameter values). This breaks down when we are dealing with data known to come from a variable population, where each data point presumably corresponds to a separate set of parameter values. When these parameters describe meaningful qualities or quantities, or affect subsequent predictions in non-trivial ways, it is insufficient to simply push this variability off to be described by random effects. Instead, we come to consider the pushforward and pullback of distributions through our black box. In this work, such "population calibration" problems are introduced and discussed in the broader statistical context, before we explore how sequential Monte Carlo may be used to assist in solving them.

Matt Moores, University of Wollongong - Fitting a Doubly-Stochastic Point Process to Peaks in Spectroscopy using SMC

Raman spectroscopy is a measurement technique that can be used to quantify the chemical composition of a sample. For example, the amount of alcohol in a glass of beer, or the presence of inflammatory biomarkers in blood. The “Perseverance” rover is currently using Raman spectroscopy to search for traces of ancient life on Mars. The main difficulties in analysing this data are due to multiple, overlapping peaks and a curved baseline that is not directly observed.

We introduce a sequential Monte Carlo algorithm for model-based analysis of Raman spectroscopy. The locations of the peaks depend on the structure of the molecule, so it is often possible to obtain informative, Bayesian priors using computational chemistry. In the absence of such prior information, we can model the peaks as a convolution of Dirac delta functions with a line-broadening function, such as Lorentzian or Voigt. We deconvolve the signal by transforming to the Fourier domain, then smooth the resulting line-narrowed spectrum by fitting an inhomogeneous point process. By sampling from this process, we are able to obtain posterior distributions for the peak locations. Since the model is conditionally linear, the baseline function can be integrated out by Rao-Blackwellized particle filtering, thereby reducing the dimension of the parameter space. We use RcppEigen and OpenMP to implement our SMC algorithm in the R package ‘serrsBayes.’

This is joint work with Lassi Roininen, Teemu Härkönen, Emma Hannula, and Erik Vartiainen (LUT University, Finland).

Gloria Monsalve Bravo, The University of Queensland - Gas sorption isotherms in glassy polymers: are mixture predictions sensitive to parameter uncertainty?

Investigation of mixed-gas sorption is necessary to robustly design and optimize membrane-based processes. While sorption models for glassy polymers are well established, they often deviate from observed mixture data. Moreover, these models' adjustable parameters are often estimated via traditional least-squares optimization methods, so parametric uncertainty is typically ignored in mixture sorption predictions. As an alternative, we use Bayesian inference to explore the parameter space, estimate probability distributions for the sorption models' parameters, and thus provide statistically defensible mixture sorption predictions that rigorously reflect parameter uncertainty. We exploit molecular sorption simulations to focus on two different glassy polymer systems (i.e., single- and mixed-gas sorption in a fluorinated polyimide and in a polymer of intrinsic microporosity) and three popular sorption models (i.e., the Dual-Mode Sorption model, the Non-Equilibrium Thermodynamics for Glassy Polymers model, and the Ideal Adsorbed Solution Theory), and thus showcase the benefits of this technique for uncertainty quantification and propagation in sorption applications. We show that observed sorption data at typical working pressures (e.g., 0-25 atm) are often insufficient to accurately estimate model parameters, and consequently sorption models often fail to represent observed mixture data. Furthermore, we show that sorption data able to capture the intrinsic isotherm nonlinearity of sorption models (e.g., 0-100 atm) are critical to considerably improve parameter inference from collective model-data fits and thus to accurately predict mixture sorption.

Sarah Vollert, QUT Centre for Data Science - Dataless ecosystem modelling via approximate Bayesian sequential Monte Carlo

In conservation planning, decision-makers often analyse the potential risks for each species in an ecosystem via quantitative ecosystem models. Systems of differential equations are combined with information about species interactions via food webs to model species abundances. However, parameterising these models is challenging – particularly for ecosystems without monitoring. Models parameterised with expected behaviours of the system – stable coexistence of species – help us understand ecosystems without data. However, existing methods are computationally inefficient, preventing larger networks from being studied. We adapt an approximate Bayesian sequential Monte Carlo sampling approach that yields equivalent parameter inferences and model predictions but is orders of magnitude faster than existing methods. In one case study, we demonstrate how our new method speeds up the ensemble-generating process from 46 days to 41 minutes. Now, for the first time, larger and more realistic networks can be practically simulated.
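
As a point of comparison for the speed-up reported above, the stability constraint can be targeted by plain rejection, sketched below for a generalised Lotka-Volterra model with illustrative priors: keep parameter sets whose interior equilibrium is positive and locally stable. The talk's contribution is an SMC scheme that reaches this target far more efficiently.

```python
import numpy as np

def sample_stable_glv(n_species, n_keep, rng=None):
    """Rejection sampler for generalised Lotka-Volterra parameters
    (dx/dt = x * (r + A x)) that yield feasible, stable coexistence.
    The priors on r and A below are illustrative assumptions."""
    rng = rng or np.random.default_rng(0)
    kept = []
    while len(kept) < n_keep:
        r = rng.uniform(0, 1, n_species)                   # growth rates
        A = -np.eye(n_species) + rng.normal(0, 0.2, (n_species, n_species))
        x_eq = np.linalg.solve(A, -r)                      # equilibrium
        if np.all(x_eq > 0):                               # all species coexist?
            J = np.diag(x_eq) @ A                          # Jacobian at x_eq
            if np.max(np.linalg.eigvals(J).real) < 0:      # locally stable?
                kept.append((r, A))
    return kept

ensemble = sample_stable_glv(n_species=4, n_keep=100)
print(len(ensemble), "stable parameter sets")
```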

Sarah is a PhD student at the Queensland University of Technology working on applied mathematics and statistical methods, with a focus on ecological applications. Sarah is interested in expressing and understanding uncertainty in environmental models, so that they can be more informative in a decision-making context.

Ming received his B. Comm (Hons) at the University of Melbourne and worked as an actuarial analyst for three years before returning to academia. He completed his M.Sc. in Mathematics at the University of New South Wales, performing research in Bayesian computation, and completed his PhD at QUT under the supervision of Prof. Michael Milford, Dr Niko Sünderhauf and Dr Tobias Fischer. He is now a Research Fellow at the Australian National University under Prof. Stephen Gould.

Ming is currently working on a variety of robot learning and computer vision problems, more recently focusing on combining traditional non-linear optimization techniques and deep networks for state estimation and control.

Thursday, July 13

Keynote 3: Dr Saifuddin Syed, University of Oxford - Annealed sequential Monte Carlo samplers.

Sequential Monte Carlo Samplers (SMCS) constitute a widely used class of SMC algorithms that calculate normalizing constants and simulate complex multi-modal target distributions. Typically, SMCS utilizes a process known as annealing, which propagates solutions from a tractable reference distribution to the intractable target through a continuous path of increasingly complex distributions. SMCS delivers state-of-the-art performance when adequately tuned, but tuning poses a challenge for current methods, which yield a random run-time and compromise the unbiasedness of the normalizing constant estimate.

In this talk, we will describe all the components of an SMCS algorithm and their influence on the variance of the normalizing constant estimate. Specifically, we will demonstrate that SMCS exhibits fundamentally different behaviour in the large-particle and dense-schedule limits. The dense-schedule limit reveals the natural geometry induced by annealing, which can pinpoint optimal performance and be used to tune the number of particles, the number of annealing distributions, the annealing schedule, the resampling schedule, and the path. Lastly, we propose an efficient, black-box algorithm for tuning SMCS that delivers optimal performance within a fixed, user-specified computation budget, all while preserving the unbiasedness of the normalizing constant estimate.
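
A standard building block behind such adaptive schedules, included here as a generic sketch rather than the talk's method: choose each new annealing temperature by bisection so that the effective sample size (ESS) of the incremental weights hits a target fraction. The talk's dense-schedule analysis characterises the limits of schedules like this one.

```python
import numpy as np

def next_temperature(log_like, beta, target_frac=0.5):
    """Find beta' in (beta, 1] such that the ESS of the incremental
    weights w_i ∝ exp((beta' - beta) * log_like_i) is roughly
    target_frac * N, using bisection. `log_like` is the array of the
    particles' log-likelihood values at the current step."""
    N = len(log_like)
    def ess(b):
        lw = (b - beta) * log_like
        w = np.exp(lw - lw.max())
        return w.sum() ** 2 / (w ** 2).sum()
    if ess(1.0) >= target_frac * N:        # can jump straight to the target
        return 1.0
    lo, hi = beta, 1.0                     # ess(lo) = N >= target > ess(hi)
    for _ in range(50):
        mid = 0.5 * (lo + hi)
        lo, hi = (mid, hi) if ess(mid) >= target_frac * N else (lo, mid)
    return lo
```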

About:

Saifuddin Syed is a Florence Nightingale Bicentennial Fellow in computational statistics and machine learning at the University of Oxford’s Department of Statistics, where he is also a member of the Algorithms and Inference Working Group for the next-generation Event Horizon Telescope (ngEHT). His research focuses on developing mathematically rigorous and scalable algorithms for Bayesian inference in scientific applications. Saifuddin’s approach is centered around the use of “annealing” techniques, which involve progressively transforming solutions from a more manageable reference problem to address intractable problems. He draws inspiration from a range of fields, including stochastic analysis, statistical physics, and differential geometry.

Personal website: https://www.saifsyed.com/

Mitchell O'Sullivan, QUT Centre for Data Science - Adaptive Summary Statistic Weighting for SMC-ABC using Sparse Optimisation

Approximate Bayesian Computation (ABC) is a simulation-based inference method that enables the analysis of implicit models with intractable likelihoods. Using summary statistics in ABC improves the rate of convergence by reducing the dimensionality of the comparison between the simulated and observed datasets. Although the choice of summary statistics greatly affects the performance of ABC, it is immensely challenging to determine informative summary statistics due to the intractable nature of the model. Previous methods have assigned learned weights to each statistic, which may enable the efficient use of a large initial set of candidate statistics. However, this can be computationally burdensome and must generally be performed before a complete analysis. Sequential Monte Carlo targeting the ABC posterior (SMC-ABC) is a popular method that allows the efficient generation of samples from the approximate posterior. Here, we present an SMC-ABC algorithm that automatically selects the single most informative summary statistic at each iteration to maximise the information content of the next target distribution. This sparse optimisation approach keeps the algorithm fast and robust without compromising consistency and accuracy. Selecting summary statistics automatically and adaptively removes the trade-off between increasing the information content and limiting the dimensionality of the problem.
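
Schematically, one iteration of such a sampler might look as follows, with `select_stat` standing in for the paper's sparse-optimisation criterion, which is not reproduced here; everything else is a generic SMC-ABC skeleton.

```python
import numpy as np

def smc_abc_iteration(particles, sims, s_obs, select_stat, rng):
    """One stylised SMC-ABC step using a single chosen summary statistic.
    particles: (N, d) parameter values; sims: (N, k) summary statistics of
    the corresponding simulated datasets; s_obs: (k,) observed statistics.
    `select_stat` is a placeholder returning the index of the statistic
    judged most informative at this iteration."""
    j = select_stat(particles, sims, s_obs)
    dist = np.abs(sims[:, j] - s_obs[j])        # compare on one statistic
    eps = np.quantile(dist, 0.5)                # shrink the ABC tolerance
    keep = np.flatnonzero(dist <= eps)          # surviving particles
    idx = rng.choice(keep, len(particles))      # resample with replacement
    # (A perturbation/move step and fresh simulations would follow here.)
    return particles[idx], eps
```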

Leah South, QUT - Unbiased and Consistent Nested Sampling via Sequential Monte Carlo

In this talk I will introduce a new class of Nested Sampling methods that we refer to as Nested Sampling via Sequential Monte Carlo (NS-SMC). NS-SMC reframes the Nested Sampling method of [1] in terms of sequential Monte Carlo techniques. This new framework allows one to obtain provably consistent estimates of the marginal likelihood and posterior expectations. An additional benefit is that marginal likelihood estimates are unbiased. To help practitioners use the method, I will give practical advice on how to tune NS-SMC. I will also provide empirical comparisons between NS-SMC and temperature-annealed SMC for several Bayesian inference examples.

[1] Skilling, J. (2006). Nested sampling for general Bayesian computation.
Bayesian Analysis, 1(4), 833-859.

Hien Nguyen, The University of Queensland - Convergence of Bayesian posterior statistics and related quantities

Many statistical quantities of interest can be expressed as an integral of a data-dependent function, on some parameter space, with respect to a data-dependent measure. Such objects appear frequently in Bayesian statistics, in the form of posterior risk statistics, such as the usual Bayes risks, and posterior utility statistics, such as the Bayesian predictive information criterion and the deviance information criterion. In this work, we investigate sufficient conditions for guaranteeing the convergence of such data-dependent integrals, almost surely and in L1. We provide examples of how our results can be applied in practice.

Clara Grazian, The University of Sydney - Stochastic Variational Inference for GARCH Models

Stochastic variational inference algorithms are derived for fitting various heteroskedastic time series models using Gaussian approximating densities. Gaussian, t and skew-t response GARCH models are examined. We implement an efficient stochastic gradient ascent approach based upon the use of control variates or the reparameterization trick and show that the proposed approach offers a fast and accurate alternative to Markov chain Monte Carlo sampling. We also present a sequential updating implementation of our variational algorithms, which is suitable for the construction of an efficient portfolio optimization strategy.
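
A minimal sketch of the reparameterization-trick gradient for a univariate Gaussian approximating density. `log_joint_grad` is a hypothetical callable returning d/dtheta log p(y, theta) for the model at hand (e.g. a GARCH likelihood plus prior); in the talk's setting theta is multivariate and the control-variate alternative is also available.

```python
import numpy as np

def elbo_gradient(mu, log_sigma, log_joint_grad, n_mc=100, rng=None):
    """Monte Carlo ELBO gradient for q(theta) = N(mu, sigma^2) via the
    reparameterization theta = mu + sigma * eps, eps ~ N(0, 1)."""
    rng = rng or np.random.default_rng(0)
    sigma = np.exp(log_sigma)
    eps = rng.normal(size=n_mc)
    theta = mu + sigma * eps
    g = np.array([log_joint_grad(t) for t in theta])
    grad_mu = g.mean()
    # Chain rule through theta = mu + exp(log_sigma) * eps, plus the
    # entropy term: d/dlog_sigma of 0.5 * log(2*pi*e*sigma^2) equals 1.
    grad_log_sigma = (g * eps * sigma).mean() + 1.0
    return grad_mu, grad_log_sigma
```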


Committee Members

  • Joshua Bon, QUT
  • Adam Bretherton, QUT
  • KD Dang, University of Melbourne
  • Mohammad Davoudabadi, University of Sydney
  • Christopher Drovandi, QUT
  • Kerrie Mengersen, QUT
  • Minh-Ngoc Tran, University of Sydney
  • Sarah Vollert, QUT

Contact us

For any enquiries, please email us at smcdownunder@gmail.com.

Sponsors

Details:

Location: QUT Gardens Point
Start Date: 10/07/2023
End Date: 13/07/2023