This document provides an introduction to Approximate Bayesian Computation (ABC), a likelihood-free method for approximating posterior distributions when the likelihood function is unavailable or computationally intractable. It describes the ABC rejection sampling algorithm, key concepts such as tolerance levels, distance functions, and summary statistics, and improvements such as ABC-MCMC and ABC-SMC. ABC is presented as an alternative to traditional Bayesian inference for models where direct likelihood evaluation is impossible or too expensive.
1. An Introduction to Approximate Bayesian Computation
Matt Moores
Mathematical Sciences School
Queensland University of Technology
Brisbane, Australia
ABC in Sydney
July 3, 2014
2. Motivation
Inference for a parameter θ when it is impossible, or very expensive, to evaluate the likelihood p(y|θ).
ABC is a likelihood-free method for approximating the posterior distribution π(θ|y) by generating pseudo-data from the model:
w ∼ f(·|θ)
3. Likelihood-free rejection sampler
Algorithm 1 Likelihood-free rejection sampler
1: Draw parameter value θ′ ∼ π(θ)
2: Generate w ∼ f(·|θ′)
3: if w = y (the observed data) then
4: accept θ′
5: end if
But if the observations y are continuous (or the space y ∈ Y is enormous), then P(w = y) ≈ 0.
Tavaré, Balding, Griffiths & Donnelly (1997) Genetics 145(2)
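To illustrate Algorithm 1, a toy example of my own (not from the slides): with discrete data the exact-match condition in step 3 has positive probability. Here y ∼ Binomial(10, θ) with a Uniform(0, 1) prior.

y <- 7                                          # observed count out of 10 trials
theta_prop <- runif(100000)                     # step 1: theta' ~ pi(theta)
w <- rbinom(100000, size=10, prob=theta_prop)   # step 2: w ~ f(.|theta')
theta_acc <- theta_prop[w == y]                 # step 3: accept theta' iff w = y
# theta_acc are exact posterior draws; here pi(theta|y) is Beta(8, 4)
hist(theta_acc)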
4. ABC tolerance
accept θ′ if δ(w, y) < ε
where
ε > 0 is the tolerance level
δ(·, ·) is a distance function (for an appropriate choice of norm)
Inference is more exact when ε is close to zero, but more proposed θ′ are rejected (a tradeoff between accuracy and computational cost).
Pritchard, Seielstad, Perez-Lezaun & Feldman (1999) Mol. Biol. Evol. 16(12)
5. Summary statistics
Computing δ(w, y) for w1, . . . , wn and y1, . . . , yn can be very expensive for large n.
Instead, compute summary statistics s(y), e.g. sufficient statistics (only available for the exponential family).
6. Sufficient statistics
Fisher–Neyman factorisation theorem: if s(y) is sufficient for θ, then
p(y|θ) = f(y) g(s(y)|θ)
This only applies to exponential-family models such as the Potts, Ising, and exponential random graph models (ERGM); otherwise, selection of suitable summary statistics can be a very difficult problem.
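To make this concrete, a minimal sketch (my addition, with assumed variable names): for the Ising/Potts model on a regular 2D lattice, the sufficient statistic is the number of like-valued neighbour pairs, s(z) = Σ over neighbours i∼j of I(z_i = z_j).

potts_stat <- function(z) {
  # z is a matrix of discrete labels; count matching first-order neighbours
  horiz <- sum(z[, -ncol(z)] == z[, -1])    # left-right neighbour pairs
  vert  <- sum(z[-nrow(z), ] == z[-1, ])    # up-down neighbour pairs
  horiz + vert
}
z <- matrix(sample(1:2, 16, replace=TRUE), nrow=4)  # random 4x4 Ising labelling
potts_stat(z)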
7. ABC rejection sampler
Algorithm 2 ABC rejection sampler
1: for all iterations t ∈ 1 . . . T do
2: Draw independent proposal θ′ ∼ π(θ)
3: Generate w ∼ f(·|θ′)
4: if ‖s(w) − s(y)‖ < ε then
5: set θt ← θ′
6: else
7: set θt ← θt−1
8: end if
9: end for
Approximates π(θ|y) by πε(θ | ‖s(w) − s(y)‖ < ε)
Marin, Pudlo, Robert & Ryder (2012) Stat. Comput. 22(6)
Marin & Robert (2014) Bayesian Essentials with R §8.3
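Algorithm 2 transcribes directly into R; this sketch is my own, where prior_draw, simulate and s are hypothetical user-supplied functions and the summary is assumed scalar, so the norm reduces to abs. Slides 9 and 10 instantiate the same scheme for the Gaussian example, vectorised over proposals.

abc_reject <- function(niter, prior_draw, simulate, s, y, eps) {
  theta <- numeric(niter)               # chain of length T = niter
  theta[1] <- prior_draw()              # initialise from the prior
  for (t in 2:niter) {
    prop <- prior_draw()                # step 2: theta' ~ pi(theta)
    w <- simulate(prop)                 # step 3: w ~ f(.|theta')
    if (abs(s(w) - s(y)) < eps) {       # step 4: compare summaries
      theta[t] <- prop                  # step 5: accept the proposal
    } else {
      theta[t] <- theta[t - 1]          # step 7: keep the previous value
    }
  }
  theta
}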
8. A trivial (counter) example
Gaussian with unknown mean: y ∼ N(µ, 1)
natural conjugate prior: π(µ) ∼ N(0, 10⁶)
sufficient statistic: ȳ = (1/n) Σᵢ₌₁ⁿ yᵢ
posterior is analytically tractable: π(µ|y) ∼ N(m, s²), where
1/s² = n/1 + 1/10⁶
m = s² (nȳ/1 + 0) = nȳ / (n + 10⁻⁶)
∴ no need for ABC (nor MCMC) in practice
9. R code
[Plot: the exact posterior density π(µ|y)]
y <- rnorm(n=5, mean=3, sd=1)                     # simulate n = 5 observations
n <- length(y)
ybar <- sum(y)/n                                  # sufficient statistic
post_s <- 1/(n + 1e-6)                            # posterior variance s^2
post_m <- post_s * n * ybar                       # posterior mean m
post_sim <- rnorm(10000, post_m, sd=sqrt(post_s))
10. now with ABC
[Plots: the prior density π(µ) and the ABC posterior πε(µ | δ(s(w), s(y)) < ε)]
prop_mu <- rnorm(10000, 0, sqrt(1e6))             # proposals from the prior
pseudo <- rnorm(n*10000, prop_mu, 1)              # pseudo-data, n obs per proposal
pseudoMx <- matrix(pseudo, nrow=10000, ncol=n)
ps_ybar <- rowMeans(pseudoMx)                     # summary statistic s(w)
ps_norm <- abs(ps_ybar - ybar)                    # distance to s(y)
epsilon <- sort(ps_norm)[20]                      # tolerance: keep the 20 closest
prop_keep <- prop_mu[ps_norm <= epsilon]
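As a quick sanity check (my addition, reusing the variables above), the retained proposals should resemble the exact conjugate posterior simulated on slide 9:

c(mean(prop_keep), mean(post_sim))   # ABC vs exact posterior mean
c(sd(prop_keep), sd(post_sim))       # ABC spread is slightly inflated when eps > 0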
12. Improvements to ABC
Alternatives to i.i.d. proposals:
ABC-MCMC (see the sketch after this list)
ABC-SMC
Regression adjustment compensates for larger ε
Validation of the ABC approximation
ABC for model choice
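A minimal sketch of the first item (my own, in the spirit of Marjoram et al. 2003, which is not among this deck's references; names and tuning constants are assumptions): reuse the Gaussian example, replace the i.i.d. prior proposals with a random walk, and accept using the prior ratio times the indicator that the simulated summary lands within ε.

abc_mcmc <- function(niter, y, eps, sd_prop = 0.5) {
  n <- length(y); sy <- mean(y)
  mu <- numeric(niter)                     # chain for the unknown mean, mu[1] = 0
  for (t in 2:niter) {
    prop <- rnorm(1, mu[t - 1], sd_prop)   # symmetric random-walk proposal
    w <- rnorm(n, prop, 1)                 # pseudo-data w ~ f(.|prop)
    # log acceptance ratio: prior ratio only, since the proposal is symmetric
    log_a <- dnorm(prop, 0, sqrt(1e6), log = TRUE) -
             dnorm(mu[t - 1], 0, sqrt(1e6), log = TRUE)
    if (abs(mean(w) - sy) < eps && log(runif(1)) < log_a) {
      mu[t] <- prop                        # accept within the ABC tolerance
    } else {
      mu[t] <- mu[t - 1]
    }
  }
  mu
}
chain <- abc_mcmc(niter = 10000, y = y, eps = 0.5)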
13. Summary
ABC is a method for likelihood-free inference. It enables inference for models that are otherwise computationally intractable.
Main components of ABC:
π(θ) proposal density for θ
f(·|θ) generative model for w
ε tolerance level
δ(·, ·) distance function
s(y) summary statistics
14. References
Jean-Michel Marin & Christian Robert. Bayesian Essentials with R. Springer-Verlag, 2014.
Jean-Michel Marin, Pierre Pudlo, Christian Robert & Robin Ryder. Approximate Bayesian computational methods. Statistics & Computing, 22(6): 1167–80, 2012.
Simon Tavaré, David Balding, Robert Griffiths & Peter Donnelly. Inferring coalescence times from DNA sequence data. Genetics, 145(2): 505–18, 1997.
Jonathan Pritchard, Mark Seielstad, Anna Perez-Lezaun & Marcus Feldman. Population Growth of Human Y Chromosomes: A Study of Y Chromosome Microsatellites. Mol. Biol. Evol., 16(12): 1791–98, 1999.