Explaining “Explaining Variational 
Approximation” 
Based on Paper 
“Explaining Variational Approximation” 
JT Ormerod, MP Wand (2010) 
Presentation by Wayne Tai Lee
My Goal 
● Convert the paper into a short presentation 
● Not covering the examples (really helpful!) 
● Intuition and motivation only
Why do we want to use 
variational approximations? 
● In Statistics, Bayesian solutions always 
involve the posterior: 
p(Θ|data) = p(data | Θ) p(Θ) / p(data) 
p(Θ|data): posterior, belief after updating with data 
p(data | Θ): likelihood, data generation; specified by you 
p(Θ): prior, belief before updating with data; specified by you 
p(data): “normalizing constant” to ensure the posterior is 
a density function; a nasty integral that cannot be calculated 
explicitly in general 
● Consequence: 
– Posterior often has no analytical expression
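To make the “nasty integral” concrete: the normalizing constant is the likelihood averaged over the prior. This is a standard identity, written here in the slides' notation.

```latex
\[
  p(\text{data}) \;=\; \int p(\text{data} \mid \Theta)\, p(\Theta)\, d\Theta
\]
```

When Θ has p components this is a p-dimensional integral, which is why a closed form is typically unavailable outside conjugate models.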
Most popular alternative 
● To obtain the posterior or any related statistic 
– Sample the posterior via MCMC methods 
● Pro: 
– Can get arbitrarily close to the posterior with enough 
samples (resource/time intensive) 
● Cons: 
– Lots of tuning necessary 
– Time consuming to run
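A minimal sketch of the sampling idea, assuming a random-walk Metropolis sampler (one of many MCMC methods; the slides do not name a specific one). The function name `metropolis`, the step size, and the standard normal target in the usage lines are illustrative assumptions. Only the unnormalized log posterior is needed, so p(data) never has to be computed.

```python
import math
import random


def metropolis(log_unnorm, x0, n_samples, step=1.0, seed=0):
    """Random-walk Metropolis for a one-dimensional target.

    log_unnorm: log of the target density up to an additive constant
                (e.g. log-likelihood + log-prior; no p(data) needed).
    """
    rng = random.Random(seed)
    x, lp = x0, log_unnorm(x0)
    samples = []
    for _ in range(n_samples):
        proposal = x + rng.gauss(0.0, step)      # symmetric proposal
        lp_prop = log_unnorm(proposal)
        # Accept with probability min(1, target(proposal) / target(x)).
        # 1 - random() lies in (0, 1], so the log is always defined.
        if math.log(1.0 - rng.random()) < lp_prop - lp:
            x, lp = proposal, lp_prop
        samples.append(x)                        # repeat x on rejection
    return samples


if __name__ == "__main__":
    # Hypothetical target: a standard normal known only up to a constant.
    draws = metropolis(lambda t: -0.5 * t * t, x0=0.0, n_samples=20000)
    print(sum(draws) / len(draws))
```

The step size is the kind of tuning the slide refers to: too small and the chain moves slowly, too large and most proposals are rejected.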
Variational Approximation 
● Intuition: 
– Approximate the posterior with a class of 
functions that are easier to deal with 
mathematically 
– Find the function in this class that minimizes the KL 
divergence to the posterior 
● Pros: 
– Suuuuper fast 
● Cons: 
– No guarantees on closeness
Big Picture 
Method to get posterior | MCMC                                                  | Variational method 
Strategy                | Sampling                                              | Optimization 
Solution                | Asymptotically exact                                  | Approximation with no bounds 
Speed                   | Often slow                                            | Fast 
The “catch”             | Tuning and convergence assessment require experience  | Need tractable mathematical setup
Explaining Variational 
Approximation 
● Change notation: p(y) = p(data) 
● Use q(Θ) to approximate p(Θ|y) 
● Will assume a product-form family of functions for q(Θ) 
– q(Θ) = q1(Θ1)q2(Θ2)...qp(Θp) 
– Each qi(Θi) is a density
Max Lower Bound = 
Min KL-Divergence
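The identity behind this slide can be sketched as follows. The symbol L(q) for the lower bound is an assumption introduced here; the rest follows the slides' notation.

```latex
\[
  \log p(y)
  \;=\;
  \underbrace{\int q(\Theta)\,\log\frac{p(y,\Theta)}{q(\Theta)}\,d\Theta}_{\mathcal{L}(q)}
  \;+\;
  \underbrace{\int q(\Theta)\,\log\frac{q(\Theta)}{p(\Theta \mid y)}\,d\Theta}_{\mathrm{KL}\left(q \,\|\, p(\cdot \mid y)\right)\;\ge\;0}
\]
```

The two integrands sum to q(Θ) log p(y), which integrates to log p(y) because q is a density. Since log p(y) does not depend on q and the KL term is non-negative, L(q) is a lower bound on log p(y), and maximizing it over q is the same as minimizing the KL divergence.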
Sanity Check: Optimal Solution is 
THE solution 
● Optimal q(Θ) is p(Θ|y) for general form: 
● Important: this is a very general solution for 
arbitrary dependence/distribution of Θ and y 
● Product form of q(Θ) allows us to divide and 
conquer!
Focus on each Θ separately
Our assumptions so far 
● Product form of q(Θ) allowed us to optimize 
each term separately 
● qi(Θi) being densities allows the other factors 
to integrate out nicely (each integrates to 1)
How to convert into an optimization 
problem that we can solve?
We've only learned one trick... 
● Optimal q1(Θ1) is then
To get a density, just normalize
Unfold our definitions
Focus on Θ1
Repeat for Θi 
● General Mean Field Variational Approximation 
Solution: 
The density that is proportional to
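The expression this slide points to is presumably the standard mean-field result, sketched here in the slides' notation. The subscript −Θi, meaning an expectation over all other factors qj with j ≠ i, is notation assumed for this sketch.

```latex
\[
  q_i^{*}(\Theta_i) \;\propto\;
  \exp\!\left\{ \mathrm{E}_{-\Theta_i}\!\left[ \log p(y, \Theta) \right] \right\},
  \qquad
  \mathrm{E}_{-\Theta_i}[\,\cdot\,]
  \;=\; \int (\,\cdot\,) \prod_{j \neq i} q_j(\Theta_j)\, d\Theta_j .
\]
```

Because each update depends on the other factors through this expectation, the qi are updated in turn until the lower bound stops improving.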
Similarity to Full Conditional in 
Gibbs Sampling 
● Optimal qi(Θi) is proportional to 
● Need to do algebra until this is “tractable” 
– i.e. something we recognize as a standard 
distribution that is easily normalized 
– This is where the “setup” becomes important
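A short sketch of why the full conditional appears, assuming the usual factorization of the joint density. Here Θ−i denotes all components of Θ except Θi, a notation introduced for this sketch.

```latex
\[
  \log p(y, \Theta)
  \;=\; \log p(\Theta_i \mid y, \Theta_{-i}) \;+\; \log p(y, \Theta_{-i})
  \quad\Longrightarrow\quad
  q_i^{*}(\Theta_i) \;\propto\;
  \exp\!\left\{ \mathrm{E}_{-\Theta_i}\!\left[ \log p(\Theta_i \mid y, \Theta_{-i}) \right] \right\}
\]
```

The second term does not involve Θi, so it is absorbed into the normalizing constant. Gibbs sampling draws from the full conditional; the variational update instead averages its logarithm over the other factors.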
For example 
● If the unnormalized optimal qi(Θi) 
resembles exp(a·Θi^2 + b·Θi) with a < 0, then we know this 
must be a Gaussian density!
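The recognition step can be made explicit by completing the square. The coefficients a and b are illustrative symbols; the quadratic coefficient a must be negative for the function to be normalizable.

```latex
\[
  \exp\!\left( a\,\Theta_i^{2} + b\,\Theta_i \right)
  \;\propto\;
  \exp\!\left\{ -\frac{(\Theta_i - \mu)^{2}}{2\sigma^{2}} \right\},
  \qquad
  \sigma^{2} = -\frac{1}{2a},
  \quad
  \mu = -\frac{b}{2a},
  \quad a < 0 .
\]
```

So the approximating factor is N(μ, σ²), and the normalizing constant never has to be computed by integration.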
Final Solution 
● The product of all qi(θi) is then the approximation to 
p(θ|y) 
● Naturally doesn't do well when there's strong 
dependence between the θi 
● You should try the examples in the paper!
First Example 
● Data generated as 
– Y | μ, σ^2 ~ N(μ,σ^2) 
● Priors 
– μ ~ N(m,s^2) 
– σ^2 ~InvGamma(a, b)
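A minimal sketch of the mean-field updates for this model, assuming the factorization q(μ, σ^2) = q(μ) q(σ^2) and an InvGamma(a, b) density proportional to (σ^2)^(-a-1) exp(-b/σ^2). Under those assumptions q(μ) is Gaussian and q(σ^2) is inverse gamma. The function name `cavi_normal`, its arguments, and the stopping rule are hypothetical; this illustrates the technique in the slides' notation and is not a transcription of the paper's algorithm.

```python
import random


def cavi_normal(y, m, s2, a, b, iters=200, tol=1e-10):
    """Coordinate-wise mean-field updates for
    y_i ~ N(mu, sigma2), mu ~ N(m, s2), sigma2 ~ InvGamma(a, b).

    Returns (mu_q, var_q, a_q, b_q) with q(mu) = N(mu_q, var_q)
    and q(sigma2) = InvGamma(a_q, b_q).
    """
    n = len(y)
    ybar = sum(y) / n
    a_q = a + n / 2.0            # first parameter of q(sigma2); never changes
    b_q = b                      # starting value for the second parameter
    mu_q, var_q = ybar, 1.0
    for _ in range(iters):
        e_prec = a_q / b_q       # E_q[1 / sigma2] under InvGamma(a_q, b_q)
        # Update q(mu): precision-weighted combination of data and prior.
        var_q = 1.0 / (n * e_prec + 1.0 / s2)
        mu_new = (n * ybar * e_prec + m / s2) * var_q
        # Update q(sigma2): E_q[sum (y_i - mu)^2] = sum (y_i - mu_q)^2 + n * var_q.
        b_new = b + 0.5 * (sum((yi - mu_new) ** 2 for yi in y) + n * var_q)
        done = abs(mu_new - mu_q) < tol and abs(b_new - b_q) < tol
        mu_q, b_q = mu_new, b_new
        if done:
            break
    return mu_q, var_q, a_q, b_q


if __name__ == "__main__":
    random.seed(0)
    data = [random.gauss(2.0, 1.5) for _ in range(100)]   # simulated data
    print(cavi_normal(data, m=0.0, s2=100.0, a=0.01, b=0.01))
```

Each pass costs one sweep over the data, which is where the speed advantage over sampling comes from; the price is that q(μ) and q(σ^2) are forced to be independent.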
Gibbs Sampling vs Variational 
Samples 
[Figure panels: comparison for N=100 and N=20]
Discussion 
● Hard to know when the approximation is poor 
relative to the true posterior...

Explaining the Basics of Mean Field Variational Approximation for Statisticians
