The document discusses methods for efficiently and accurately estimating integrals, including Monte Carlo simulation, low-discrepancy sampling, and Bayesian cubature. It notes that product rules for estimating high-dimensional integrals become prohibitively expensive as dimension increases. Adaptive low-discrepancy sampling is proposed as a method that uses Sobol' or lattice points and normally doubles the number of points until a tolerance is reached.
The document describes various adaptive methods for numerical integration or cubature of functions, including Monte Carlo methods, low-discrepancy sampling, and Bayesian cubature. It discusses approaches to choose sample sizes and weights to guarantee the integral estimate is within a given tolerance of the true integral with high probability. Specific examples discussed include multidimensional Gaussian integrals and estimating Sobol' sensitivity indices.
The document discusses different perspectives on simulating the mean of a function, including deterministic, randomized, and Bayesian approaches. It summarizes Monte Carlo methods using the central limit theorem and Berry-Esseen inequality to estimate error bounds. Low-discrepancy sampling and cubature methods are described which use Fourier coefficients to bound integration errors. Bayesian cubature is outlined, which assumes the function is drawn from a Gaussian process prior to perform optimal quadrature. Maximum likelihood is used to estimate the kernel hyperparameters.
This document discusses error analysis for quasi-Monte Carlo methods. It introduces the trio error identity that decomposes the error into three terms: the variation of the integrand, the discrepancy of the sampling measure from the probability measure, and the alignment between the integrand and the difference between the measures. Several examples are provided to illustrate the identity, including integration over a reproducing kernel Hilbert space. The discrepancy term can be evaluated in O(n^2) operations and converges at different rates depending on the sampling method and properties of the integrand.
We will describe and analyze accurate and efficient numerical algorithms to interpolate and approximate the integral of multivariate functions. The algorithms can be applied when we are given function values at an arbitrarily positioned, usually small, existing sparse set of sample points, and additional samples are impossible or difficult (e.g., expensive) to obtain. The methods are based on local and global tensor-product sparse quasi-interpolation methods that are exact for a class of sparse multivariate orthogonal polynomials.
One of the central tasks in computational mathematics and statistics is to accurately approximate unknown target functions. This is typically done with the help of data — samples of the unknown functions. The emergence of Big Data presents both opportunities and challenges. On one hand, big data introduces more information about the unknowns and, in principle, allows us to create more accurate models. On the other hand, data storage and processing become highly challenging. In this talk, we present a set of sequential algorithms for function approximation in high dimensions with large data sets. The algorithms are of iterative nature and involve only vector operations. They use one data sample at each step and can handle dynamic/stream data. We present both the numerical algorithms, which are easy to implement, as well as rigorous analysis for their theoretical foundation.
This document describes an automatic Bayesian method for numerical integration. It begins by introducing the problem of multivariate integration and current approaches like Monte Carlo integration that have limitations. It then presents the Bayesian cubature algorithm which chooses sample points and weights to minimize the error in approximating an integral. This is done by modeling the integrand as a Gaussian process, deriving identities relating the error to properties of the covariance kernel, and estimating its hyperparameters. The kernel used is shift-invariant, allowing fast matrix computations. Simulation results show Bayesian cubature achieves high accuracy with fewer samples compared to other methods.
Multidimensional integrals may be approximated by weighted averages of integrand values. Quasi-Monte Carlo (QMC) methods are more accurate than simple Monte Carlo methods because they carefully choose where to evaluate the integrand. This tutorial focuses on how quickly QMC methods converge to the correct answer as the number of integrand values increases. The answer may depend on the smoothness of the integrand and the sophistication of the QMC method. QMC error analysis may assume the integrand belongs to a reproducing kernel Hilbert space or may assume that the integrand is an instance of a stochastic process with known covariance structure. These two approaches have interesting parallels. This tutorial also explores how the computational cost of achieving a good approximation to the integral depends on the dimension of the domain of the integrand. Finally, this tutorial explores methods for determining how many integrand values are needed to satisfy the error tolerance. Relevant software is described.
Markov chain Monte Carlo (MCMC) methods are popularly used in Bayesian computation. However, they need a large number of samples for convergence, which can become costly when the posterior distribution is expensive to evaluate. Deterministic sampling techniques such as Quasi-Monte Carlo (QMC) can be a useful alternative to MCMC, but the existing QMC methods are mainly developed only for sampling from unit hypercubes. Unfortunately, the posterior distributions can be highly correlated and nonlinear, making them occupy very little space inside a hypercube. Thus, most of the samples from QMC can get wasted. The QMC samples can be saved if they can be pulled towards the high probability regions of the posterior distribution using inverse probability transforms. But this can be done only when the distribution function is known, which is rarely the case in Bayesian problems. In this talk, I will discuss a deterministic sampling technique, known as minimum energy designs, which can directly sample from the posterior distributions.
1. The document discusses the author's research in three areas: graph-based clustering methods, approximate Bayesian computation (ABC), and Bayesian computation using empirical likelihood.
2. For graph-based clustering, the author presents asymptotic results for spectral clustering as the number of data points and bandwidth approach infinity.
3. For ABC, the author discusses sequential ABC algorithms and challenges of model choice and high-dimensional summary statistics. Machine learning methods are proposed to analyze simulated ABC data.
4. For empirical likelihood, the author proposes using it for Bayesian computation when the likelihood is intractable and simulation is infeasible, as it provides correct confidence intervals unlike composite likelihoods.
1. The document presents Plug-and-Play priors for Bayesian imaging using Langevin-based sampling methods.
2. It introduces the Bayesian framework for image restoration and discusses challenges in modeling the prior.
3. A Plug-and-Play approach is proposed that uses an implicit prior defined by a denoising network in conjunction with Langevin sampling, termed PnP-ULA. Experiments demonstrate its effectiveness on image deblurring and inpainting tasks.
Slides: On the Chi Square and Higher-Order Chi Distances for Approximating f-Divergences, by Frank Nielsen
Slides for the paper:
On the Chi Square and Higher-Order Chi Distances for Approximating f-Divergences
published in IEEE SPL:
http://ieeexplore.ieee.org/xpl/articleDetails.jsp?arnumber=6654274
This document discusses macrocanonical models for texture synthesis. It begins by introducing the goal of texture synthesis and providing a brief history. It then describes the parametric question of combining randomness and structure in images. Specifically, it discusses maximizing entropy under geometric constraints. The document goes on to discuss links to statistical physics, defining microcanonical and macrocanonical models. It focuses on studying the macrocanonical model, describing how to find optimal parameters through gradient descent and how to sample from the model using Langevin dynamics. The document provides examples of texture synthesis and compares results to other methods.
Maximum likelihood estimation of regularisation parameters in inverse problems, by Valentin De Bortoli
This document discusses an empirical Bayesian approach for estimating regularization parameters in inverse problems using maximum likelihood estimation. It proposes the Stochastic Optimization with Unadjusted Langevin (SOUL) algorithm, which uses Markov chain sampling to approximate gradients in a stochastic projected gradient descent scheme for optimizing the regularization parameter. The algorithm is shown to converge to the maximum likelihood estimate under certain conditions on the log-likelihood and prior distributions.
The document discusses achieving higher-order convergence for integration on R^N using quasi-Monte Carlo (QMC) rules. It describes the problem that when using tensor product QMC rules on truncated domains, the convergence rate scales with the dimension s as (α log N)^s N^(-α). The goal is to obtain a convergence rate independent of the dimension s. The document proposes using a multivariate decomposition method (MDM) to decompose an infinite-dimensional integral into a sum of finite-dimensional integrals, then applying QMC rules to each integral to achieve the desired higher-order convergence rate.
1) The document discusses proximal algorithms for solving inverse problems in probability spaces, where the goal is to estimate an unknown variable x given noisy measurements y.
2) It describes using Bayesian methods like maximum a posteriori (MAP) estimation and Markov chain Monte Carlo (MCMC) to account for uncertainty, where the posterior distribution p(x|y) is assumed to be log-concave.
3) Proximal algorithms like the unadjusted Langevin algorithm (ULA) and proximal ULA (MYULA) are proposed for sampling from the posterior in high dimensions when p(x|y) is not differentiable.
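As a rough illustration of the Langevin samplers listed above, here is a minimal unadjusted Langevin algorithm (ULA) sketch for a smooth, log-concave toy target; the Gaussian target and the step size are assumptions made only for illustration, not the imaging posterior of the talk.

```python
import numpy as np

def ula_sample(grad_log_post, x0, step, n_iter, rng):
    """Unadjusted Langevin algorithm: x_{k+1} = x_k + step * grad log p(x_k) + sqrt(2*step) * noise."""
    x = np.array(x0, dtype=float)
    samples = np.empty((n_iter, x.size))
    for k in range(n_iter):
        noise = rng.standard_normal(x.size)
        x = x + step * grad_log_post(x) + np.sqrt(2.0 * step) * noise
        samples[k] = x
    return samples

# Toy target (assumption): standard Gaussian posterior, so grad log p(x) = -x.
rng = np.random.default_rng(0)
chain = ula_sample(lambda x: -x, x0=np.zeros(2), step=0.05, n_iter=5000, rng=rng)
print(chain.mean(axis=0), chain.var(axis=0))  # should be near 0 and near 1 (up to discretization bias)
```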
This document provides an introduction to Approximate Bayesian Computation (ABC), a likelihood-free method for approximating posterior distributions when the likelihood function is unavailable or computationally intractable. It describes the ABC rejection sampling algorithm and key concepts like tolerance levels, distance functions, summary statistics, and improvements like ABC-MCMC and ABC-SMC. ABC is presented as an alternative to traditional Bayesian inference methods for models where direct likelihood evaluation is impossible or too expensive.
This document summarizes results on analyzing stochastic gradient descent (SGD) algorithms for minimizing convex functions. It shows that a continuous-time version of SGD (SGD-c) can strongly approximate the discrete-time version (SGD-d) under certain conditions. It also establishes that SGD achieves the minimax optimal convergence rate of O(t^-1/2) for α=1/2 by using an "averaging from the past" procedure, closing the gap between previous lower and upper bound results.
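A minimal sketch of SGD with averaging of past iterates (Polyak-Ruppert style) and a step size proportional to t^(-1/2), on a toy least-squares problem; the data and constants are assumptions chosen only to illustrate the averaging idea mentioned above.

```python
import numpy as np

rng = np.random.default_rng(1)
A = rng.standard_normal((1000, 5))
x_true = rng.standard_normal(5)
b = A @ x_true + 0.1 * rng.standard_normal(1000)

x = np.zeros(5)
x_avg = np.zeros(5)
for t in range(1, 10001):
    i = rng.integers(len(b))               # pick one data point at random
    grad = (A[i] @ x - b[i]) * A[i]        # stochastic gradient of 0.5 * (a_i . x - b_i)^2
    x = x - (0.1 / np.sqrt(t)) * grad      # step size ~ t^(-1/2), i.e. alpha = 1/2
    x_avg += (x - x_avg) / t               # running average over past iterates
print(np.linalg.norm(x_avg - x_true))
```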
Lattice rules are one of the two main classes of methods for quasi-Monte Carlo (QMC) and randomized quasi-Monte Carlo (RQMC) integration. In this tutorial, we recall the definition and summarize the key properties of lattice rules. We discuss what classes of functions these rules are good to integrate, and how their parameters can be chosen in terms of variance bounds for these classes of functions. We consider integration lattices in the real space as well as in a polynomial space over the finite field F2. We provide various numerical examples of how these rules perform compared with standard Monte Carlo. Some examples involve high-dimensional integrals, others involve Markov chains. We also discuss software design for RQMC and what software is available.
Quantitative Propagation of Chaos for SGD in Wide Neural Networks, by Valentin De Bortoli
The document discusses quantitative analysis of stochastic gradient descent (SGD) for training wide neural networks. It presents two different regimes - a deterministic regime where the limiting dynamics is described by an ordinary differential equation, and a stochastic regime where the limiting dynamics is a stochastic differential equation. Experiments on MNIST classification show that the stochastic regime with larger step sizes exhibits better regularization properties. The analysis provides insights into the behavior of neural network training as the number of neurons becomes large.
This document describes fuzzy clustering and fuzzy c-means clustering. It begins by introducing fuzzy clustering and discussing how the cost function for k-means clustering can be modified to allow fuzzy membership. Specifically, it proposes using fuzzy membership values between 0 and 1 instead of the hard 0 or 1 membership of k-means. This modifies the cost function to include fuzzy membership values raised to a power m. Lagrange multipliers are then used to derive update equations for the fuzzy memberships and cluster centroids. The final equations assign membership based on the distance of a point to cluster centroids, and update centroids as the weighted mean of points based on their fuzzy memberships.
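A compact sketch of the fuzzy c-means updates just described: memberships between 0 and 1 raised to a power m, membership updates driven by distances to the centroids, and centroids computed as membership-weighted means. The synthetic data and parameter choices are assumptions made only for illustration.

```python
import numpy as np

def fuzzy_c_means(X, c, m=2.0, n_iter=100, rng=None):
    rng = rng or np.random.default_rng(0)
    n = len(X)
    U = rng.random((n, c))
    U /= U.sum(axis=1, keepdims=True)                 # fuzzy memberships in [0, 1], rows sum to 1
    for _ in range(n_iter):
        W = U ** m                                    # memberships raised to the fuzzifier power m
        centroids = (W.T @ X) / W.sum(axis=0)[:, None]  # centroids = membership-weighted means
        d = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2) + 1e-12
        U = 1.0 / (d ** (2.0 / (m - 1.0)))            # memberships from inverse relative distances
        U /= U.sum(axis=1, keepdims=True)
    return centroids, U

# Two synthetic blobs (assumption), clustered with c = 2.
X = np.vstack([np.random.default_rng(2).normal(0, 1, (50, 2)),
               np.random.default_rng(3).normal(5, 1, (50, 2))])
centroids, U = fuzzy_c_means(X, c=2)
print(centroids)
```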
Recently, there has been a surge in activity at the interface of optimal transport and statistics (with special emphasis on machine learning applications). The talk will summarize new results and challenges in this active area. For example, we will show how many of the most popular estimators in machine learning (such as the Lasso and SVMs) can be interpreted as games. This interpretation opens the door for new and potentially better estimators and algorithms, as well as questions about the underlying complexity of this new class of estimators.
(This talk is based on joint work with F. He, Y. Kang, K. Murthy, and F. Zhang)
Computational Information Geometry: A quick review (ICMS), by Frank Nielsen
From the workshop
Computational information geometry for image and signal processing
Sep 21, 2015 - Sep 25, 2015
ICMS, 15 South College Street, Edinburgh
http://www.icms.org.uk/workshop.php?id=343
The document discusses computing averages and provides examples of calculating average speed and estimating population proportions. It explains that averages can be used to estimate values for large populations by taking samples. Care must be taken with sampling to ensure respondents are chosen randomly and independently to minimize errors. Averages also come up in assessing financial risk by considering expectations as averages over infinite scenarios.
Natural Language Processing in R (rNLP), by fridolin.wild
The introductory slides of a workshop given to the doctoral school at the Institute of Business Informatics of the Goethe University Frankfurt. The tutorials are available on http://crunch.kmi.open.ac.uk/w/index.php/Tutorials
This document provides an overview of Markov chain Monte Carlo (MCMC) methods. It begins with motivations for using MCMC, such as computational difficulties that arise in models with latent variables like mixture models. It then discusses likelihood-based and Bayesian approaches, noting limitations of maximum likelihood methods. Conjugate priors are described that allow tractable Bayesian inference for some simple models. However, conjugate priors are not available for more complex models, motivating the use of MCMC methods which can approximate integrals and distributions of interest for more complex models.
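Since the summary above motivates MCMC without showing a sampler, here is a minimal random-walk Metropolis sketch; the Gaussian toy target and the proposal scale are assumptions for illustration and are not taken from the document.

```python
import numpy as np

def random_walk_metropolis(log_post, x0, scale, n_iter, rng):
    """Random-walk Metropolis: propose x' = x + scale * noise, accept with prob min(1, p(x')/p(x))."""
    x = np.array(x0, dtype=float)
    lp = log_post(x)
    chain = np.empty((n_iter, x.size))
    for k in range(n_iter):
        prop = x + scale * rng.standard_normal(x.size)
        lp_prop = log_post(prop)
        if np.log(rng.random()) < lp_prop - lp:   # Metropolis acceptance step
            x, lp = prop, lp_prop
        chain[k] = x
    return chain

# Toy target (assumption): unnormalized standard Gaussian log-density.
rng = np.random.default_rng(0)
chain = random_walk_metropolis(lambda x: -0.5 * np.sum(x**2), np.zeros(2), 0.5, 10000, rng)
print(chain[2000:].mean(axis=0))   # discard burn-in, then estimate the posterior mean
```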
Monte Carlo methods use random sampling to solve problems numerically. They work by setting up probabilistic models and running simulations using random numbers. This allows approximating solutions to problems in physics, finance, optimization, and other fields. Examples include estimating pi by simulating dart throws, and using a "drunken wino" random walk simulation to approximate the solution to a partial differential equation on a grid. The accuracy of Monte Carlo methods increases with more simulation iterations, requiring truly random numbers for best results.
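The dart-throwing estimate of pi mentioned above takes only a few lines; this is a minimal sketch using pseudo-random numbers, with the usual n^(-1/2) decay of the error as the number of samples grows.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 1_000_000
pts = rng.random((n, 2))                        # uniform "darts" thrown into the unit square
inside = (pts ** 2).sum(axis=1) <= 1.0          # darts that land inside the quarter circle
pi_hat = 4.0 * inside.mean()                    # area ratio times 4 estimates pi
print(pi_hat, "error ~", abs(pi_hat - np.pi))   # error shrinks roughly like n**-0.5
```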
Applying Monte Carlo Simulation to Microsoft Project Schedules, by jimparkpmp
Jim Park presented on applying Monte Carlo simulation (MCS) to Microsoft Project schedules. MCS can model schedule uncertainty by using multi-point estimates rather than single point estimates. It runs simulations with randomly selected values from the estimates' distributions to calculate schedule outcomes. This helps determine higher probability finish dates compared to traditional PERT analysis. Garbage in leads to garbage out, so quality estimates tied to quantifiable risks are important. The presentation demonstrated MCS tools and their use to improve schedule confidence levels.
Monte Carlo simulation is a statistical technique that uses random numbers and probability to simulate real-world processes. It was developed in the 1940s by scientists working on nuclear weapons research. Monte Carlo simulation provides approximate solutions to problems by running simulations many times. It allows for sensitivity analysis and scenario analysis. Some examples include estimating pi by randomly generating points within a circle, and approximating integrals by treating the area under a curve as a target for random darts. The technique provides probabilistic results and allows modeling of correlated inputs.
High Dimensional Quasi Monte Carlo Method in Finance, by Marco Bianchetti
Monte Carlo simulation in finance has been traditionally focused on pricing derivatives. Actually nowadays market and counterparty risk measures, based on multi-dimensional multi-step Monte Carlo simulation, are very important tools for managing risk, both on the front office side (sensitivities, CVA) and on the risk management side (estimating risk and capital allocation). Furthermore, they are typically required for internal models and validated by regulators.
The daily production of prices and risk measures for large portfolios with multiple counterparties is a computationally intensive task, which requires a complex framework and an industrial approach. It is a typical high budget, high effort project in banks.
In this presentation we focus on the Monte Carlo simulation, showing that, despite some common wisdom, Quasi Monte Carlo techniques can be applied, under appropriate conditions, to successfully improve price and risk figures and to reduce the computational effort.
This work includes and extends our paper M. Bianchetti, S. Kucherenko and S. Scoleri, “Pricing and Risk Management with High-Dimensional Quasi Monte Carlo and Global Sensitivity Analysis”, Wilmott Journal, July 2015 (also available at http://ssrn.com/abstract=2592753).
- The document discusses methods for determining when to stop sampling in Monte Carlo integration to achieve a desired error tolerance.
- For independent and identically distributed (IID) sampling, the central limit theorem can be used to determine the necessary sample size based on the variance of the integrand.
- Quasi-Monte Carlo sampling can achieve faster convergence rates by using low-discrepancy point sets that more uniformly sample the domain. The error can be analyzed in the frequency domain based on the decay of the true Fourier coefficients.
- Bayesian cubature methods model the integrand as a Gaussian process, allowing inference of hyperparameters from sample points to improve integration accuracy.
The document discusses error analysis for quasi-Monte Carlo methods used for numerical integration. It introduces the concepts of reproducing kernel Hilbert spaces and mean square discrepancy to analyze integration error. Specifically, it shows that the mean square discrepancy of randomized low-discrepancy point sets can be computed in O(n) operations, whereas the standard discrepancy requires O(n^2) operations, making randomized quasi-Monte Carlo methods more efficient for high-dimensional integration problems.
This document discusses automatic Bayesian cubature for numerical integration. It begins with an introduction to multivariate integration and the challenges it poses. It then describes an automatic cubature algorithm that generates sample points and computes error bounds iteratively until a tolerance threshold is met. Next, it covers Bayesian cubature, which treats integrands as random functions to obtain probabilistic error bounds. It defines a Bayesian trio identity relating the integration error to discrepancies, variations, and alignments. The document concludes with discussions of future work.
This talk introduces the theory behind Bayesian Deep Learning, which has recently become a hot topic, along with its recent applications. It briefly explains the theory of Bayesian inference and then presents the theory and applications of Yarin Gal's Monte Carlo Dropout.
This document discusses Bayesian inference on mixtures models. It covers several key topics:
1. Density approximation and consistency results for mixtures as a way to approximate unknown distributions.
2. The "scarcity phenomenon" where the posterior probabilities of most component allocations in mixture models are zero, concentrating on just a few high probability allocations.
3. Challenges with Bayesian inference for mixtures, including identifiability issues, label switching, and complex combinatorial calculations required to integrate over all possible component allocations.
H2O World - Consensus Optimization and Machine Learning - Stephen Boyd (Sri Ambati)
This document discusses consensus optimization and its applications to machine learning model fitting. Convex optimization problems can be solved effectively using interior point methods or customized algorithms. Model fitting is commonly formulated as regularized loss minimization, which is convex for many useful cases like linear regression. Consensus optimization allows distributed model fitting by splitting the data across nodes and coordinating local model parameters with consensus constraints. The alternating direction method of multipliers (ADMM) solves the consensus problem iteratively. Applications demonstrate distributed training of support vector machines and logistic regression models using ADMM consensus optimization.
The document discusses various methods for modeling input distributions in simulation models, including trace-driven simulation, empirical distributions, and fitting theoretical distributions to real data. It provides examples of several continuous and discrete probability distributions commonly used in simulation, including the exponential, normal, gamma, Weibull, binomial, and Poisson distributions. Key parameters and properties of each distribution are defined. Methods for selecting an appropriate input distribution based on summary statistics of real data are also presented.
In this work we discuss how to compute KLE with complexity O(k n log n), how to approximate large covariance matrices (in H-matrix format), how to use the Lanczos method.
We solve elliptic PDE with uncertain coefficients. We apply Karhunen-Loeve expansion to separate stochastic part from spatial part. The corresponding eigenvalue problem with covariance function is solved via the Hierarchical Matrix technique. We also demonstrate how low-rank tensor method can be applied for high-dimensional problems (e.g., to compute higher order statistical moments) . We provide explicit formulas to compute statistical moments of order k with linear complexity.
This document discusses using the sequence of iterates generated by inertial methods to minimize convex functions. It introduces inertial methods and how they can be used to generate sequences that converge to the minimum. While the last iterate is often used, sometimes averaging over iterates or using extrapolations like Aitken acceleration can provide better estimates of the minimum. Inertial methods allow for more exploration of the function space than gradient descent alone. The geometry of the function may provide opportunities to analyze the iterate sequence and obtain improved convergence estimates.
The document summarizes a presentation on minimizing tensor estimation error using alternating minimization. It begins with an introduction to tensor decompositions including CP, Tucker, and tensor train decompositions. It then discusses nonparametric tensor estimation using an alternating minimization method. The method iteratively updates components while holding other components fixed, achieving efficient computation. The analysis shows that after t iterations, the estimation error is bounded by the sum of a statistical error term and an optimization error term decaying exponentially in t. Real data analysis uses the method for multitask learning.
This document discusses various methods for approximating marginal likelihoods and Bayes factors, including:
1. Geyer's 1994 logistic regression approach for approximating marginal likelihoods using importance sampling.
2. Bridge sampling and its connection to Geyer's approach. Optimal bridge sampling requires knowledge of unknown normalizing constants.
3. Using mixtures of importance distributions and the target distribution as proposals to estimate marginal likelihoods through Rao-Blackwellization. This connects to bridge sampling estimates.
4. The document discusses various methods for approximating marginal likelihoods and comparing hypotheses using Bayes factors. It outlines the historical development and connections between different approximation techniques.
A new implementation of k-MLE for mixture modelling of Wishart distributions, by Frank Nielsen
This document discusses a new implementation of k-MLE for mixture modelling of Wishart distributions. It begins with an overview of the Wishart distribution and its properties as an exponential family. It then describes the original k-MLE algorithm and how it can be adapted for Wishart distributions by using Hartigan and Wang's strategy instead of Lloyd's strategy to avoid empty clusters. The document also discusses approaches for initializing the clusters, such as k-means++, and proposes a heuristic to determine the number of clusters on-the-fly rather than fixing k.
The document discusses inertial algorithms for minimizing convex functions. It begins by introducing the gradient method and accelerated/inertial gradient method. It then reviews several classic approaches for analyzing the convergence of inertial algorithms, such as algebraic proofs, estimate sequences, and viewing the algorithm as a discretization of an ordinary differential equation (ODE). More recent approaches discussed include analyzing inertial algorithms as a combination of primal and mirror descent steps or using Bregman estimate sequences. The document raises questions about interpreting the difference between inertial algorithms and the heavy ball method from an ODE perspective. It also discusses a new direction of analyzing inertial algorithms by viewing them as numerical integration schemes approximating the solution to an ODE.
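As a concrete instance of the inertial (accelerated) gradient methods discussed above, the following is a minimal Nesterov-style sketch on a toy quadratic; the test problem, step size, and momentum schedule are assumptions for illustration, not taken from the document.

```python
import numpy as np

def nesterov_accelerated_gradient(grad, x0, lipschitz, n_iter):
    """Inertial gradient step: extrapolate with momentum, then take a gradient step at the extrapolated point."""
    x_prev = x = np.array(x0, dtype=float)
    for k in range(1, n_iter + 1):
        y = x + (k - 1) / (k + 2) * (x - x_prev)   # inertial extrapolation
        x_prev = x
        x = y - grad(y) / lipschitz                # gradient step with step size 1/L
    return x

# Toy convex quadratic f(x) = 0.5 x^T A x - b^T x (an assumption for illustration).
A = np.diag([1.0, 10.0, 100.0])
b = np.ones(3)
x_star = np.linalg.solve(A, b)
x = nesterov_accelerated_gradient(lambda x: A @ x - b, np.zeros(3), lipschitz=100.0, n_iter=200)
print(np.linalg.norm(x - x_star))
```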
This document summarizes techniques for approximating marginal likelihoods and Bayes factors, which are important quantities in Bayesian inference. It discusses Geyer's 1994 logistic regression approach, links to bridge sampling, and how mixtures can be used as importance sampling proposals. Specifically, it shows how optimizing the logistic pseudo-likelihood relates to the bridge sampling optimal estimator. It also discusses non-parametric maximum likelihood estimation based on simulations.
Maximizing Submodular Function over the Integer Lattice, by Tasuku Soma
The document describes generalizations of submodular function maximization and submodular cover problems from sets to integer lattices. It presents polynomial-time approximation algorithms for maximizing monotone diminishing return (DR) submodular functions subject to constraints like cardinality, polymatroid and knapsack on the integer lattice. It also presents an algorithm for the DR-submodular cover problem of minimizing cost subject to achieving a quality threshold. The results provide useful extensions of submodular optimization to settings that cannot be modeled as set functions.
This document discusses various methods for estimating normalizing constants that arise when evaluating integrals numerically. It begins by noting there are many computational methods for approximating normalizing constants across different communities. It then lists the topics that will be covered in the upcoming workshop, including discussions on estimating constants using Monte Carlo methods and Bayesian versus frequentist approaches. The document provides examples of estimating normalizing constants using Monte Carlo integration, reverse logistic regression, and Xiao-Li Meng's maximum likelihood estimation approach. It concludes by discussing some of the challenges in bringing a statistical framework to constant estimation problems.
1. Simulating the Mean Efficiently
and to a Given Tolerance
Fred J. Hickernell
Department of Applied Mathematics, Illinois Institute of Technology
hickernell@iit.edu mypages.iit.edu/~hickernell
Thanks to Lan Jiang, Tony Jiménez Rugama, Jagadees Rathinavel,
and the rest of the Guaranteed Automatic Integration Library (GAIL) team
Supported by NSF-DMS-1522687
Thanks for your kind invitation
2-5. Introduction | IID Monte Carlo | Low Discrepancy Sampling | Bayesian Cubature | Examples | References
Estimating/Simulating/Computing an Integral

Gaussian probability = \int_{[a,b]} \frac{e^{-x^{T} \Sigma^{-1} x/2}}{(2\pi)^{d/2} |\Sigma|^{1/2}} \, dx

option price = \int_{\mathbb{R}^d} \mathrm{payoff}(x) \, \underbrace{\frac{e^{-x^{T} \Sigma^{-1} x/2}}{(2\pi)^{d/2} |\Sigma|^{1/2}}}_{\text{PDF of Brownian motion at } d \text{ times}} \, dx

Bayesian \hat{\beta}_j = \int_{\mathbb{R}^d} \beta_j \, \mathrm{prob}(\beta \mid \text{data}) \, d\beta
  = \frac{\int_{\mathbb{R}^d} \beta_j \, \mathrm{prob}(\text{data} \mid \beta) \, \mathrm{prob}_{\text{prior}}(\beta) \, d\beta}{\int_{\mathbb{R}^d} \mathrm{prob}(\text{data} \mid \beta) \, \mathrm{prob}_{\text{prior}}(\beta) \, d\beta}

Sobol' index_j = \frac{\int_{[0,1]^{2d}} \big(\mathrm{output}(x) - \mathrm{output}(x_j, x'_{-j})\big) \, \mathrm{output}(x') \, dx \, dx'}{\int_{[0,1]^d} \mathrm{output}(x)^2 \, dx - \big(\int_{[0,1]^d} \mathrm{output}(x) \, dx\big)^2}
6. Estimating/Simulating/Computing the Mean

\mu = \int_{\mathbb{R}^d} g(x) \, dx = \mathbb{E}[f(X)] = \int_{\mathbb{R}^d} f(x) \, \nu(dx) = {?}, \qquad \hat{\mu}_n = \sum_{i=1}^{n} w_i f(x_i)

How to choose \nu, \{x_i\}_{i=1}^{n}, and \{w_i\}_{i=1}^{n} to make |\mu - \hat{\mu}_n| small? (trio identity)
Given \varepsilon_a, how big must n be to guarantee |\mu - \hat{\mu}_n| \le \varepsilon_a? (adaptive cubature)
7. Product Rules Using Rectangular Grids

\mu = \int_{\mathbb{R}^d} f(x) \, \nu(dx) \approx \hat{\mu}_n = \sum_{i=1}^{n} w_i f(x_i)

If \int_0^1 f(x) \, dx - \sum_{i=1}^{m} w_i f(t_i) = O(m^{-r}), then

\int_{[0,1]^d} f(x) \, dx - \sum_{i_1=1}^{m} \cdots \sum_{i_d=1}^{m} w_{i_1} \cdots w_{i_d} \, f(t_{i_1}, \ldots, t_{i_d}) = O(m^{-r}) = O(n^{-r/d}),

assuming rth derivatives in each direction exist.
9. Product Rules Using Rectangular Grids (continued)

But the computational cost becomes prohibitive for large dimensions, d:

d          1    2     5        10       100
n = 8^d    8    64    3.3E4    1.0E9    2.0E90

Product rules are typically a bad idea unless d is small.
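To make the cost table concrete, here is a small sketch (not from the slides) of a tensor-product rule: a one-dimensional composite midpoint rule extended to d dimensions, so the number of nodes grows as n = m^d. The smooth test integrand below is an assumption chosen only for illustration.

```python
import itertools
import numpy as np

def product_midpoint(f, d, m):
    """Tensor-product midpoint rule on [0,1]^d using m nodes per coordinate, i.e. n = m**d nodes in total."""
    t = (np.arange(m) + 0.5) / m                   # 1-D midpoint nodes, each with weight 1/m
    total = 0.0
    for idx in itertools.product(range(m), repeat=d):
        total += f(t[list(idx)])
    return total / m**d

f = lambda x: np.exp(np.sum(x))                    # smooth test integrand (assumption); exact value is (e-1)**d
for d in (1, 2, 5):
    print(d, 8**d, product_midpoint(f, d, 8))      # the node count 8**d grows rapidly with d
```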
10. Monte Carlo Simulation in the News
Sampling with a computer can be fast. How big is our error?
12. Central Limit Theorem Stopping Rule for IID Monte Carlo

\mu = \int_{\mathbb{R}^d} f(x) \, \nu(dx) \approx \hat{\mu}_n = \frac{1}{n} \sum_{i=1}^{n} f(x_i), \qquad x_i \overset{\text{IID}}{\sim} \nu

\mu - \hat{\mu}_n = \underbrace{\frac{\mu - \hat{\mu}_n}{\mathrm{std}(f(X))/\sqrt{n}}}_{\mathrm{CNF} \sim (0,1)} \times \underbrace{\frac{1}{\sqrt{n}}}_{\mathrm{DSC}(\{x_i\})} \times \underbrace{\mathrm{std}(f(X))}_{\mathrm{VAR}(f)}

P[|\mu - \hat{\mu}_n| \le \mathrm{err}_n] \approx 99\% for \mathrm{err}_n = \frac{2.58 \times 1.2 \, \hat{\sigma}}{\sqrt{n}} by the Central Limit Theorem (CLT), where \hat{\sigma}^2 is the sample variance.
Central Limit Theorem Stopping Rule for IID Monte Carlo

$$\mu = \int_{\mathbb{R}^{d}} f(x)\,\nu(dx) \approx \hat{\mu}_{n} = \frac{1}{n}\sum_{i=1}^{n} f(x_{i}), \qquad x_{i} \overset{\text{IID}}{\sim} \nu$$

$$\mu - \hat{\mu}_{n} = \underbrace{\frac{\mu - \hat{\mu}_{n}}{\operatorname{std}(f(X))/\sqrt{n}}}_{\text{CNF}\,\approx\,N(0,1)} \times \underbrace{\frac{1}{\sqrt{n}}}_{\text{DSC}(\{x_{i}\})} \times \underbrace{\operatorname{std}(f(X))}_{\text{VAR}(f)}$$

$\mathbb{P}[|\mu - \hat{\mu}_{n}| \le \text{err}_{n}] \approx 99\%$ for $\text{err}_{n} = \dfrac{2.58 \times 1.2\,\hat{\sigma}}{\sqrt{n}}$ by the Central Limit Theorem (CLT), where $\hat{\sigma}^{2}$ is the sample variance. But the CLT is only an asymptotic result, and $1.2\hat{\sigma}$ may be an overly optimistic upper bound on $\sigma$.
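A minimal sketch of the two-stage logic behind such a stopping rule follows; it uses the CLT bound err_n = 2.58 x 1.2 sigma_hat / sqrt(n) from above, but the function names, pilot-sample size, and test problem are our own illustrative assumptions, not the GAIL implementation.

```python
# Minimal sketch (illustrative, not GAIL): a two-stage IID Monte Carlo rule in the
# spirit of the CLT error bound err_n = 2.58 * 1.2 * sigma_hat / sqrt(n).
import numpy as np

def clt_sample_size(f, sample, n_pilot=1024, eps_a=1e-3, inflate=1.2, z=2.58):
    """Estimate the variance from a pilot sample, then pick n so the CLT bound <= eps_a."""
    rng = np.random.default_rng(0)
    pilot = f(sample(rng, n_pilot))
    sigma_hat = pilot.std(ddof=1)
    n = int(np.ceil((z * inflate * sigma_hat / eps_a) ** 2))
    estimate = f(sample(rng, n)).mean()
    return estimate, n

# Example: mean of f(X) = ||X||^2 with X ~ N(0, I_3); the exact mean is 3.
f = lambda x: (x ** 2).sum(axis=1)
sample = lambda rng, n: rng.standard_normal((n, 3))
print(clt_sample_size(f, sample, eps_a=1e-2))
```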
Berry-Esseen Stopping Rule for IID Monte Carlo

$$\mu = \int_{\mathbb{R}^{d}} f(x)\,\nu(dx) \approx \hat{\mu}_{n} = \frac{1}{n}\sum_{i=1}^{n} f(x_{i}), \qquad x_{i} \overset{\text{IID}}{\sim} \nu,$$
with the same trio decomposition as above.

$\mathbb{P}[|\mu - \hat{\mu}_{n}| \le \text{err}_{n}] \ge 99\%$ for $\text{err}_{n}$ satisfying
$$\Phi\bigl(-\sqrt{n}\,\text{err}_{n}/(1.2\,\hat{\sigma}_{n_{\sigma}})\bigr) + \Delta_{n}\bigl(-\sqrt{n}\,\text{err}_{n}/(1.2\,\hat{\sigma}_{n_{\sigma}}),\, \kappa_{\max}\bigr) = 0.0025$$
by the Berry-Esseen Inequality, where $\hat{\sigma}^{2}_{n_{\sigma}}$ is the sample variance computed from a sample independent of the one used to simulate the mean, provided that $\operatorname{kurt}(f(X)) \le \kappa_{\max}(n_{\sigma})$ (H. et al., 2013; Jiang, 2016).
Adaptive Low Discrepancy Sampling Cubature

$$\mu = \int_{[0,1]^{d}} f(x)\,dx, \qquad \hat{\mu}_{n} = \frac{1}{n}\sum_{i=1}^{n} f(x_{i}), \quad x_{i} \text{ Sobol' or lattice points}$$

Normally $n$ should be a power of 2.
Let $\{\hat{f}(k)\}_{k}$ denote the coefficients of the Fourier (Walsh or complex exponential) expansion of $f$, and let $\{\omega(k)\}_{k}$ be some weights. Then

$$\mu - \hat{\mu}_{n} = -\sum_{0 \ne k \in \text{dual}} \hat{f}(k) = \underbrace{\frac{-\sum_{0 \ne k \in \text{dual}} \hat{f}(k)}{\bigl\|\{\hat{f}(k)/\omega(k)\}_{k}\bigr\|_{2}\,\bigl\|\{\omega(k)\}_{0 \ne k \in \text{dual}}\bigr\|_{2}}}_{\text{CNF}\,\in\,[-1,1]} \times \underbrace{\bigl\|\{\omega(k)\}_{0 \ne k \in \text{dual}}\bigr\|_{2}}_{\text{DSC}(\{x_{i}\}_{i=1}^{n}) = O(n^{-1+\epsilon})} \times \underbrace{\bigl\|\{\hat{f}(k)/\omega(k)\}_{k}\bigr\|_{2}}_{\text{VAR}(f)}$$

Assuming that the $\hat{f}(k)$ do not decay erratically as $k \to \infty$, the discrete transform coefficients $\{\tilde{f}_{n}(k)\}_{k}$ may be used to bound the error reliably (H. and Jiménez Rugama, 2016; Jiménez Rugama and H., 2016; H. et al., 2017+):

$$|\mu - \hat{\mu}_{n}| \le \text{err}_{n} := C(n) \sum_{\text{certain } k} \bigl|\tilde{f}_{n}(k)\bigr|$$
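For orientation, here is a deliberately simplified sketch of the doubling strategy: it doubles the number of scrambled Sobol' points until two successive estimates agree to within the tolerance. The published algorithms instead bound the error through the discrete Walsh/Fourier coefficients as above; the stopping criterion, names, and test integrand here are our own simplifications.

```python
# Minimal sketch (our simplification, not the published algorithm): double n = 2^m
# until two successive Sobol' estimates differ by less than eps_a.
import numpy as np
from scipy.stats import qmc

def adaptive_sobol(f, d, eps_a=1e-4, m_min=8, m_max=20, seed=7):
    est_prev = None
    for m in range(m_min, m_max + 1):
        # Same seed each time, so the first 2^(m-1) points coincide with the previous draw.
        x = qmc.Sobol(d=d, scramble=True, seed=seed).random_base2(m)
        est = f(x).mean()
        if est_prev is not None and abs(est - est_prev) < eps_a:
            return est, 2 ** m
        est_prev = est
    return est_prev, 2 ** m_max

# Example: a smooth illustrative integrand on [0,1]^3.
f = lambda x: np.cos(np.linalg.norm(x - 0.5, axis=1))
print(adaptive_sobol(f, d=3, eps_a=1e-5))
```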
Bayesian Cubature: f Is Random

$$\mu = \int_{\mathbb{R}^{d}} f(x)\,\nu(dx) \approx \hat{\mu}_{n} = \sum_{i=1}^{n} w_{i}\,f(x_{i})$$

Assume $f \sim \mathcal{GP}(0, s^{2} C_{\theta})$ (Diaconis, 1988; O'Hagan, 1991; Ritter, 2000; Rasmussen and Ghahramani, 2003), and define
$$c_{0} = \int_{\mathbb{R}^{d} \times \mathbb{R}^{d}} C_{\theta}(x, t)\,\nu(dx)\,\nu(dt), \qquad c = \Bigl(\int_{\mathbb{R}^{d}} C_{\theta}(x_{i}, t)\,\nu(dt)\Bigr)_{i=1}^{n}, \qquad C = \bigl(C_{\theta}(x_{i}, x_{j})\bigr)_{i,j=1}^{n}.$$

Choosing $w = (w_{i})_{i=1}^{n} = C^{-1} c$ is optimal, and with $y = (f(x_{i}))_{i=1}^{n}$,

$$\mu - \hat{\mu}_{n} = \underbrace{\frac{\mu - \hat{\mu}_{n}}{\sqrt{(c_{0} - c^{T}C^{-1}c)\,y^{T}C^{-1}y/n}}}_{\text{CNF}\,\sim\,N(0,1)} \times \underbrace{\sqrt{c_{0} - c^{T}C^{-1}c}}_{\text{DSC}} \times \underbrace{\sqrt{\frac{y^{T}C^{-1}y}{n}}}_{\text{VAR}(f)}$$

$$\mathbb{P}[|\mu - \hat{\mu}_{n}| \le \text{err}_{n}] = 99\% \quad \text{for } \text{err}_{n} = 2.58\,\sqrt{\frac{(c_{0} - c^{T}C^{-1}c)\,y^{T}C^{-1}y}{n}}.$$

But $\theta$ needs to be inferred (by MLE), and $C^{-1}$ typically requires $O(n^{3})$ operations.
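The formulas above can be exercised directly for the uniform measure on [0,1]^d with a kernel whose integrals are available in closed form. The sketch below uses the product kernel C_theta(x,t) = prod_j (1 + theta*B2({x_j - t_j})) with B2(u) = u^2 - u + 1/6, for which c0 = 1 and c = (1, ..., 1); the fixed theta, point set, and integrand are illustrative choices of ours (no MLE, no fast structured solves).

```python
# Minimal sketch (illustrative, not GAIL): Bayesian cubature with a Bernoulli-polynomial
# product kernel, whose integrals against the uniform measure on [0,1]^d are c0 = 1, c = 1.
import numpy as np
from scipy.stats import qmc

def bernoulli2(u):
    return u ** 2 - u + 1.0 / 6.0

def bayesian_cubature(f, x, theta=1.0):
    """Return (mu_hat, err_n) for mu = int_{[0,1]^d} f, given design points x."""
    n, d = x.shape
    diff = np.abs(x[:, None, :] - x[None, :, :])          # pairwise |x_i - x_j|
    C = np.prod(1.0 + theta * bernoulli2(diff), axis=2)   # Gram matrix, shape (n, n)
    y = f(x)
    c = np.ones(n)                                         # int C_theta(x_i, t) dt = 1
    c0 = 1.0                                               # double integral of the kernel
    Cinv_y = np.linalg.solve(C, y)
    Cinv_c = np.linalg.solve(C, c)
    mu_hat = c @ Cinv_y
    dsc2 = max(c0 - c @ Cinv_c, 0.0)                       # guard against tiny round-off
    err = 2.58 * np.sqrt(dsc2 * (y @ Cinv_y) / n)          # 99% half-width from the slide's formula
    return mu_hat, err

# Example with 256 scrambled Sobol' points in d = 2 and an illustrative integrand.
x = qmc.Sobol(d=2, scramble=True, seed=1).random_base2(8)
f = lambda t: np.exp(-np.sum(t ** 2, axis=1))
print(bayesian_cubature(f, x))
```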
Gaussian Probability

$$\mu = \int_{[a,b]} \frac{\exp\bigl(-\tfrac{1}{2} t^{T}\Sigma^{-1} t\bigr)}{\sqrt{(2\pi)^{d}\det(\Sigma)}}\,dt \overset{\text{Genz (1993)}}{=} \int_{[0,1]^{d-1}} f(x)\,dx$$

For a typical choice of $a$, $b$, $\Sigma$, with $d = 3$ and $\varepsilon_{a} = 0$, $\mu \approx 0.6763$:

    ε_r     Method             % Accuracy   Worst 10% n   Worst 10% Time (s)
    1E-2    IID Monte Carlo    100%         8.1E4         1.8E-2
            Sobol' Sampling    100%         1.0E3         5.1E-3
            Bayesian Lattice   100%         1.0E3         2.8E-3
    1E-3    IID Monte Carlo    100%         2.0E6         3.8E-1
            Sobol' Sampling    100%         2.0E3         7.7E-3
            Bayesian Lattice   100%         1.0E3         2.8E-3
    1E-4    Sobol' Sampling    100%         1.6E4         1.8E-2
            Bayesian Lattice   100%         8.2E3         1.4E-2

Bayesian lattice cubature uses a covariance kernel for which the Gram matrix $C$ is circulant, so operations involving $C$ require only $O(n \log n)$ work.
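For readers who want to reproduce the flavor of this example, here is a minimal sketch of a Genz-style transformation to [0,1]^(d-1) followed by Sobol' sampling; it is our own transcription of the idea, unverified against Genz (1993), and the covariance matrix and box below are arbitrary illustrative choices rather than the ones behind the table.

```python
# Minimal sketch (our transcription of the Genz-style sequential conditioning transform):
# write the Gaussian box probability as an integral over [0,1]^(d-1), then use Sobol' points.
import numpy as np
from scipy.stats import norm, qmc

def genz_integrand(u, a, b, Sigma):
    """Evaluate the transformed integrand at points u in [0,1]^(d-1)."""
    L = np.linalg.cholesky(Sigma)
    n, d = u.shape[0], len(a)
    y = np.zeros((n, d - 1))
    lo, hi = norm.cdf(a[0] / L[0, 0]), norm.cdf(b[0] / L[0, 0])
    f = np.full(n, hi - lo)
    for i in range(1, d):
        y[:, i - 1] = norm.ppf(lo + u[:, i - 1] * (hi - lo))   # condition on previous coordinates
        shift = y[:, :i] @ L[i, :i]
        lo = norm.cdf((a[i] - shift) / L[i, i])
        hi = norm.cdf((b[i] - shift) / L[i, i])
        f *= hi - lo
    return f

# Illustrative box probability in d = 3 (parameters are ours, not those of the table).
Sigma = np.array([[4.0, 2.0, 1.0], [2.0, 3.0, 1.5], [1.0, 1.5, 2.0]])
a, b = np.full(3, -2.0), np.full(3, 2.0)
u = qmc.Sobol(d=2, scramble=True, seed=3).random_base2(12)
print(genz_integrand(u, a, b, Sigma).mean())
```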
Asian Option Pricing

$$\text{fair price} = \int_{\mathbb{R}^{d}} e^{-rT} \max\Bigl(\frac{1}{d}\sum_{j=1}^{d} S_{j} - K,\, 0\Bigr) \frac{e^{-x^{T}\Sigma^{-1}x/2}}{(2\pi)^{d/2}\,|\Sigma|^{1/2}}\,dx \approx \$13.12,$$
where $S_{j} = S_{0}\,e^{(r-\sigma^{2}/2)\,jT/d + \sigma x_{j}}$ is the stock price at time $jT/d$ and $\Sigma = \bigl(\min(i,j)\,T/d\bigr)_{i,j=1}^{d}$.

    ε_a = 1E-4   Method                                % Accuracy   Worst 10% n   Worst 10% Time (s)
                 Sobol' Sampling                       100%         2.1E6         4.3
                 Sobol' Sampling w/ control variates   97%          1.0E6         2.1

The coefficient of the control variate for low discrepancy sampling is different from the one for IID Monte Carlo (H. et al., 2005; H. et al., 2017+).
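A bare-bones version of this pricing setup, with parameters of our own choosing (so it does not reproduce the $13.12 figure or the table), might look as follows; there is no adaptive stopping or control variate here, just Sobol' sampling of the Brownian path.

```python
# Minimal sketch (illustrative parameters of our own choosing): price an arithmetic-mean
# Asian call by Sobol' sampling of the Brownian motion at d monitoring times.
import numpy as np
from scipy.stats import norm, qmc

S0, K, r, sigma, T, d = 100.0, 100.0, 0.05, 0.5, 1.0, 12
t = T * np.arange(1, d + 1) / d
Sigma = np.minimum.outer(t, t)        # covariance of Brownian motion at the d times
A = np.linalg.cholesky(Sigma)         # one of several possible path constructions

u = qmc.Sobol(d=d, scramble=True, seed=11).random_base2(16)
x = norm.ppf(u) @ A.T                                       # Brownian paths, shape (n, d)
S = S0 * np.exp((r - 0.5 * sigma ** 2) * t + sigma * x)     # stock prices at the d times
payoff = np.exp(-r * T) * np.maximum(S.mean(axis=1) - K, 0.0)
print(payoff.mean())
```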
Summary

- The error in simulating the mean can be decomposed as a trio identity (Meng, 2017+; H., 2017+).
- Knowing when to stop a simulation of the mean is not trivial (H. et al., 2017+).
- The Berry-Esseen inequality can tell us when to stop an IID simulation.
- Fourier analysis can tell us when to stop a low discrepancy simulation.
- Bayesian cubature can tell us when to stop a simulation if you can afford the computational cost.
- All methods can be fooled by nasty functions, f.
- Relative error tolerances and problems involving functions of integrals can be handled (H. et al., 2017+).
- Our algorithms are implemented in the Guaranteed Automatic Integration Library (GAIL) (Choi et al., 2013-2015), which is under continuous development.
Upcoming SAMSI Quasi-Monte Carlo Program
References

Bratley, P., B. L. Fox, and H. Niederreiter. 1992. Implementation and tests of low-discrepancy sequences, ACM Trans. Model. Comput. Simul. 2, 195-213.

Choi, S.-C. T., Y. Ding, F. J. H., L. Jiang, Ll. A. Jiménez Rugama, X. Tong, Y. Zhang, and X. Zhou. 2013-2015. GAIL: Guaranteed Automatic Integration Library (versions 1.0-2.1).

Cools, R. and D. Nuyens (eds.) 2016. Monte Carlo and quasi-Monte Carlo methods: MCQMC, Leuven, Belgium, April 2014, Springer Proceedings in Mathematics and Statistics, vol. 163, Springer-Verlag, Berlin.

Diaconis, P. 1988. Bayesian numerical analysis, Statistical decision theory and related topics IV, Papers from the 4th Purdue symp., West Lafayette, Indiana 1986, pp. 163-175.

Genz, A. 1993. Comparison of methods for the computation of multivariate normal probabilities, Computing Science and Statistics 25, 400-405.

H., F. J. 2017+. Error analysis of quasi-Monte Carlo methods, submitted for publication, arXiv:1702.01487.

H., F. J., L. Jiang, Y. Liu, and A. B. Owen. 2013. Guaranteed conservative fixed width confidence intervals via Monte Carlo sampling, Monte Carlo and quasi-Monte Carlo methods 2012, pp. 105-128.

H., F. J. and Ll. A. Jiménez Rugama. 2016. Reliable adaptive cubature using digital sequences, Monte Carlo and quasi-Monte Carlo methods: MCQMC, Leuven, Belgium, April 2014, pp. 367-383, arXiv:1410.8615 [math.NA].

H., F. J., Ll. A. Jiménez Rugama, and D. Li. 2017+. Adaptive quasi-Monte Carlo methods, submitted for publication, arXiv:1702.01491 [math.NA].

H., F. J., C. Lemieux, and A. B. Owen. 2005. Control variates for quasi-Monte Carlo, Statist. Sci. 20, 1-31.

Jiang, L. 2016. Guaranteed adaptive Monte Carlo methods for estimating means of random variables, Ph.D. Thesis.

Jiménez Rugama, Ll. A. and F. J. H. 2016. Adaptive multidimensional integration based on rank-1 lattices, Monte Carlo and quasi-Monte Carlo methods: MCQMC, Leuven, Belgium, April 2014, pp. 407-422, arXiv:1411.1966.

Meng, X. 2017+. Statistical paradises and paradoxes in big data, in preparation.

O'Hagan, A. 1991. Bayes-Hermite quadrature, J. Statist. Plann. Inference 29, 245-260.

Rasmussen, C. E. and Z. Ghahramani. 2003. Bayesian Monte Carlo, Advances in Neural Information Processing Systems, pp. 489-496.

Ritter, K. 2000. Average-case analysis of numerical problems, Lecture Notes in Mathematics, vol. 1733, Springer-Verlag, Berlin.

Sobol', I. M. 1990. On sensitivity estimation for nonlinear mathematical models, Matem. Mod. 2, no. 1, 112-118.

Sobol', I. M. 2001. Global sensitivity indices for nonlinear mathematical models and their Monte Carlo estimates, Math. Comput. Simul. 55, no. 1-3, 271-280.
Maximum Likelihood Estimation of the Covariance Kernel

$$f \sim \mathcal{GP}(0, s^{2} C_{\theta}), \qquad C_{\theta} = \bigl(C_{\theta}(x_{i}, x_{j})\bigr)_{i,j=1}^{n}, \qquad y = \bigl(f(x_{i})\bigr)_{i=1}^{n}, \qquad \hat{\mu}_{n} = c_{\hat{\theta}}^{T} C_{\hat{\theta}}^{-1} y$$

$$\hat{\theta} = \operatorname*{argmin}_{\theta} \frac{y^{T} C_{\theta}^{-1} y}{\bigl[\det(C_{\theta}^{-1})\bigr]^{1/n}}$$

$$\mathbb{P}[|\mu - \hat{\mu}_{n}| \le \text{err}_{n}] = 99\% \quad \text{for } \text{err}_{n} = \frac{2.58}{\sqrt{n}} \sqrt{\bigl(c_{0,\hat{\theta}} - c_{\hat{\theta}}^{T} C_{\hat{\theta}}^{-1} c_{\hat{\theta}}\bigr)\, y^{T} C_{\hat{\theta}}^{-1} y}$$

There is a de-randomized interpretation of Bayesian cubature (H., 2017+): take $f$ in the Hilbert space with reproducing kernel $C_{\theta}$ and let $\tilde{f}_{y}$ be the best interpolant of the data. Then

$$\hat{\theta} = \operatorname*{argmin}_{\theta} \frac{y^{T} C_{\theta}^{-1} y}{\bigl[\det(C_{\theta}^{-1})\bigr]^{1/n}} = \operatorname*{argmin}_{\theta} \operatorname{vol}\bigl\{ z \in \mathbb{R}^{n} : \|\tilde{f}_{z}\|_{\theta} \le \|\tilde{f}_{y}\|_{\theta} \bigr\}$$

$$|\mu - \hat{\mu}_{n}| \le \frac{2.58}{\sqrt{n}} \underbrace{\sqrt{c_{0,\hat{\theta}} - c_{\hat{\theta}}^{T} C_{\hat{\theta}}^{-1} c_{\hat{\theta}}}}_{\text{error representer, } \hat{\theta}} \times \underbrace{\sqrt{y^{T} C_{\hat{\theta}}^{-1} y}}_{\|\tilde{f}_{y}\|_{\hat{\theta}}} \qquad \text{if } \|f - \tilde{f}_{y}\|_{\hat{\theta}} \le \frac{2.58\,\|\tilde{f}_{y}\|_{\hat{\theta}}}{\sqrt{n}}$$
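As a final illustration, here is a minimal sketch of profiling the MLE criterion y^T C_theta^{-1} y * det(C_theta)^{1/n} over a grid of theta, reusing the Bernoulli-polynomial kernel from the earlier sketch; the grid, data, and point set are our own illustrative choices, and a practical implementation would exploit the fast structured solves mentioned above.

```python
# Minimal sketch (illustrative): grid search for the profile-MLE criterion
# L(theta) = (y^T C_theta^{-1} y) * det(C_theta)^(1/n), i.e. y^T C^{-1} y / [det(C^{-1})]^{1/n}.
import numpy as np
from scipy.stats import qmc

def bernoulli2(u):
    return u ** 2 - u + 1.0 / 6.0

def mle_objective(theta, x, y):
    n = len(y)
    diff = np.abs(x[:, None, :] - x[None, :, :])
    C = np.prod(1.0 + theta * bernoulli2(diff), axis=2)   # Gram matrix for this theta
    Cinv_y = np.linalg.solve(C, y)
    _, logdet = np.linalg.slogdet(C)
    return (y @ Cinv_y) * np.exp(logdet / n)

# Illustrative data: 128 scrambled Sobol' points in d = 2 and a smooth test function.
x = qmc.Sobol(d=2, scramble=True, seed=5).random_base2(7)
y = np.exp(-np.sum(x ** 2, axis=1))
thetas = np.logspace(-2, 2, 21)
theta_hat = min(thetas, key=lambda t: mle_objective(t, x, y))
print(theta_hat)
```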