SlideShare a Scribd company logo
1 of 19
The Gaussian Distribution
CSC 411/2515
Adopted from PRML 2.3
The Gaussian Distribution
 For a 𝐷-dimensional vector 𝑥, the multivariate Gaussian
distribution takes the form:
𝒩 𝑥|𝜇, Σ =
1
2𝜋
𝐷
2 Σ
1
2
exp −
1
2
𝑥 − 𝜇 ⊤
Σ−1
𝑥 − 𝜇
 Motivations:
 Maximum of the entropy
 Central limit theorem
The Gaussian Distribution: Properties
 The law is a function of the Mahalanobis distance from 𝑥 to 𝜇:
Δ2 = 𝑥 − 𝜇 ⊤Σ−1 𝑥 − 𝜇
 The expectation of 𝑥 under the Gaussian distribution is:
𝔼 𝑥 = 𝜇
 The covariance matrix of 𝑥 is:
cov 𝑥 = Σ
The Gaussian Distribution: Properties
 The law (quadratic function) is constant on elliptical surfaces:
 𝜆𝑖 are the eigenvalues of Σ
 𝑢𝑖 are the associated eigenvectors
The Gaussian Distribution: more examples
 Contours of constant probability density:
in which the covariance matrix is:
a) of general form
b) diagonal
c) proportional to the identity matrix
Conditional Law
 Given a Gaussian distribution 𝒩 𝑥|𝜇, Σ with
𝑥 = 𝑥𝑎, 𝑥𝑏
⊤, 𝜇 = 𝜇𝑎, 𝜇𝑏
⊤
Σ =
Σ𝑎𝑎 Σ𝑎𝑏
Σ𝑏𝑎 Σ𝑏𝑏
=
Λ𝑎𝑎 Λ𝑎𝑏
Λ𝑏𝑎 Λ𝑏𝑏
−1
 What’s the conditional distribution 𝑝(𝑥𝑎|𝑥𝑏)?
Conditional Law
 Given a Gaussian distribution 𝒩 𝑥|𝜇, Σ with
𝑥 = 𝑥𝑎, 𝑥𝑏
⊤, 𝜇 = 𝜇𝑎, 𝜇𝑏
⊤
Σ =
Σ𝑎𝑎 Σ𝑎𝑏
Σ𝑏𝑎 Σ𝑏𝑏
=
Λ𝑎𝑎 Λ𝑎𝑏
Λ𝑏𝑎 Λ𝑏𝑏
−1
 What’s the conditional distribution 𝑝(𝑥𝑎|𝑥𝑏)?
−
1
2
𝑥 − 𝜇 ⊤Σ−1 𝑥 − 𝜇 =
−
1
2
𝑥𝑎 − 𝜇𝑎
⊤Λ𝑎𝑎 𝑥𝑎 − 𝜇𝑎 −
1
2
𝑥𝑎 − 𝜇𝑎
⊤Λ𝑎𝑏 𝑥𝑏 − 𝜇𝑏
−
1
2
𝑥𝑏 − 𝜇𝑏
⊤
Λ𝑏𝑎 𝑥𝑎 − 𝜇𝑎 −
1
2
𝑥𝑏 − 𝜇𝑏
⊤
Λ𝑏𝑏 𝑥𝑏 − 𝜇𝑏
Conditional Law
 Using the definition:
−
1
2
𝑥 − 𝜇 ⊤
Σ−1
𝑥 − 𝜇 = −
1
2
𝑥⊤
Σ−1
𝑥 + 𝑥⊤
Σ−1
𝜇 + const
 What’s the conditional distribution 𝑝(𝑥𝑎|𝑥𝑏)?
 Σ𝑎|𝑏
−1
= Λ𝑎𝑎
 Σ𝑎|𝑏
−1
𝜇𝑎|𝑏 = Λ𝑎𝑎𝜇𝑎 − Λ𝑎𝑏 𝑥𝑏 − 𝜇𝑏
Σ =
Σ𝑎𝑎 Σ𝑎𝑏
Σ𝑏𝑎 Σ𝑏𝑏
=
Λ𝑎𝑎 Λ𝑎𝑏
Λ𝑏𝑎 Λ𝑏𝑏
−1
 Λ𝑎𝑎 = Σ𝑎𝑎 − Σ𝑎𝑏Σ𝑏𝑏
−1
Σ𝑏𝑎
−1
 Λ𝑎𝑏 = − Σ𝑎𝑎 − Σ𝑎𝑏Σ𝑏𝑏
−1
Σ𝑏𝑎
−1
Σ𝑎𝑏Σ𝑏𝑏
−1
Inverse partition identity:
𝐴 𝐵
𝐶 𝐷
−1
= 𝑀 −𝑀𝐵𝐷−1
−𝐷−1
𝐶𝑀 𝐷−1
+ 𝐷−1
𝐶𝑀𝐵𝐷−1
𝑀 = 𝐴 − 𝐵𝐷−1
𝐶 −1
Conditional Law
𝜇𝑎|𝑏 = 𝜇𝑎 + Λ𝑎𝑎Λ𝑎𝑏
−1
𝑥𝑏 − 𝜇𝑏
 The form using precision matrix:
Λ𝑎|𝑏 = Λ𝑎𝑎
𝜇𝑎|𝑏 = 𝜇𝑎 + Σ𝑎𝑏Σ𝑏𝑏
−1
𝑥𝑏 − 𝜇𝑏
 The conditional distribution 𝑝 𝑥𝑎 𝑥𝑏 is a Gaussian with:
Σ𝑎|𝑏 = Σ𝑎𝑎 − Σ𝑎𝑏Σ𝑏𝑏
−1
Σ𝑏𝑎
Marginal Law
 The marginal distribution is given by:
𝑝 𝑥𝑎 = 𝑝 𝑥𝑎, 𝑥𝑏 𝑑𝑥𝑏
 Picking out those terms that involve 𝑥𝑏, we have
−
1
2
𝑥𝑏
⊤
Λ𝑏𝑏𝑥𝑏 + 𝑥𝑏
⊤
𝑚 = −
1
2
𝑥𝑏 − Λ𝑏𝑏
−1
𝑚
⊤
Λ𝑏𝑏 𝑥𝑏 − Λ𝑏𝑏
−1
𝑚 +
1
2
𝑚⊤Λ𝑏𝑏
−1
𝑚
𝑚 = Λ𝑏𝑏𝜇𝑏 − Λ𝑏𝑎 𝑥𝑎 − 𝜇𝑎
 Integrate over 𝑥𝑏 (unnormalized Gaussian)
exp −
1
2
𝑥𝑏 − Λ𝑏𝑏
−1
𝑚
⊤
Λ𝑏𝑏 𝑥𝑏 − Λ𝑏𝑏
−1
𝑚 𝑑𝑥𝑏
 The integral is equal to the normalization term
Marginal Law
 After integrating over 𝑥𝑏, we pick out the remaining terms:
 The marginal distribution is a Gaussian with
𝔼 𝑥𝑎 = 𝜇𝑎 cov 𝑥𝑎 = Σ𝑎𝑎
−
1
2
𝑥𝑎
⊤Λ𝑎𝑎𝑥𝑎 + 𝑥𝑎
⊤ Λ𝑎𝑎𝜇𝑎 + Λ𝑎𝑏𝜇𝑏 +
1
2
𝑚⊤Λ𝑏𝑏
−1
𝑚 + const
𝑚 = Λ𝑏𝑏𝜇𝑏 − Λ𝑏𝑎 𝑥𝑎 − 𝜇𝑎
Short Summary
 Conditional distribution:
Σ =
Σ𝑎𝑎 Σ𝑎𝑏
Σ𝑏𝑎 Σ𝑏𝑏
=
Λ𝑎𝑎 Λ𝑎𝑏
Λ𝑏𝑎 Λ𝑏𝑏
−1
𝑝 𝑥𝑎 𝑥𝑏 = 𝒩 𝑥𝑎|𝜇𝑎|𝑏, Λ𝑎𝑎
−1
𝜇𝑎|𝑏 = 𝜇𝑎 − Λ𝑎𝑎
−1
Λ𝑎𝑏(𝑥𝑏 − 𝜇𝑏)
 Marginal distribution:
𝑝 𝑥𝑎 = 𝒩 𝑥𝑎|𝜇𝑎, Σ𝑎𝑎
Bayes’ theorem for Gaussian variables
 Setup:
 What’s the marginal distribution 𝑝 𝑦 and conditional
distribution 𝑝 𝑥|𝑦 ?
𝑝 𝑥 = 𝒩 𝑥|𝜇, Λ−1
𝑝 𝑦|𝑥 = 𝒩 𝑦|A𝑥 + 𝑏, L−1
 How about first compute 𝑝 𝑧 , where 𝑧 = 𝑥, 𝑦 ⊤
 𝑝 𝑧 is a Gaussian distribution, consider the log of the
joint distribution
ln 𝑝 𝑧 = ln 𝑝 𝑥 + ln 𝑝 𝑦 𝑥
= −
1
2
𝑥 − 𝜇 ⊤
Λ 𝑥 − 𝜇
−
1
2
𝑦 − A𝑥 + 𝑏 ⊤L 𝑦 − A𝑥 + 𝑏 + const
Bayes’ theorem for Gaussian variables
 The same trick (consider the second order terms), we get
𝔼 𝑧 =
𝜇
A𝜇 + 𝑏
cov 𝑧 = Λ−1
Λ−1
A
AΛ−1 𝐿−1 + AΛ−1A
 We can then get 𝑝 𝑦 and 𝑝 𝑥|𝑦 by marginal and conditional
laws!
Maximum likelihood for the Gaussian
 Assume we have X = 𝑥1, … , 𝑥𝑁
⊤
in which the observation
𝑥𝑛 are assumed to be drawn independently from a
multivariate Gaussian, the log likelihood function is given by
ln 𝑝 X 𝜇, Σ = −
𝑁𝐷
2
ln 2𝜋 −
𝑁
2
ln Σ −
1
2
𝑛=1
𝑁
𝑥𝑛 − 𝜇 ⊤
Σ−1
𝑥𝑛 − 𝜇
 Setting the derivative to zero, we obtain the solution for the
maximum likelihood estimator:
𝜕
𝜕𝜇
ln 𝑝 X 𝜇, Σ =
𝑛=1
𝑁
Σ−1 𝑥𝑛 − 𝜇 = 0
ΣML =
1
𝑁
𝑛=1
𝑁
𝑥𝑛 − 𝜇ML 𝑥𝑛 − 𝜇ML
⊤
𝜇ML =
1
𝑁
𝑛=1
𝑁
𝑥𝑛
Maximum likelihood for the Gaussian
 The empirical mean is unbiased in the sense
𝔼 𝜇ML = 𝜇
 However, the maximum likelihood estimate for the covariance
has an expectation that is less that the true value:
𝔼 ΣML =
𝑁 − 1
𝑁
Σ
 We can correct it by multiplying ΣML by the factor
𝑁
𝑁−1
Conjugate prior for the Gaussian
 Introducing prior distributions over the parameters of the
Gaussian
 The maximum likelihood framework only gives point estimates
for the parameters, we would like to have uncertainty
estimation (confidence interval) for the estimation
 The conjugate prior for 𝜇 is a Gaussian
 The conjugate prior for precision Λ is a Gamma distribution
 We would like the posterior 𝑝 𝜃 𝐷 ∝ 𝑝 𝜃 𝑝(𝐷|𝜃) has the
same form as the prior (Conjugate prior!)
The Gaussian Distribution: limitations
 A lot of parameters to estimate 𝐷 +
𝐷2+𝐷
2
: structured
approximation (e.g., diagonal variance matrix)
 Maximum likelihood estimators are not robust to outliers:
Student’s t-distribution (bottom left)
 Not able to describe periodic data: von Mises distribution
 Unimodel distribution: Mixture of Gaussian (bottom right)
The Gaussian Distribution: frontiers
 Gaussian Process
 Bayesian Neural Networks
 Generative modeling (Variational Autoencoder)
 ……

More Related Content

Similar to tut07.pptx

Similar to tut07.pptx (20)

E04 06 3943
E04 06 3943E04 06 3943
E04 06 3943
 
Probability Proficiency.pptx
Probability Proficiency.pptxProbability Proficiency.pptx
Probability Proficiency.pptx
 
Estimation Theory Class (Summary and Revision)
Estimation Theory Class (Summary and Revision)Estimation Theory Class (Summary and Revision)
Estimation Theory Class (Summary and Revision)
 
Helmholtz equation (Motivations and Solutions)
Helmholtz equation (Motivations and Solutions)Helmholtz equation (Motivations and Solutions)
Helmholtz equation (Motivations and Solutions)
 
A Probabilistic Algorithm for Computation of Polynomial Greatest Common with ...
A Probabilistic Algorithm for Computation of Polynomial Greatest Common with ...A Probabilistic Algorithm for Computation of Polynomial Greatest Common with ...
A Probabilistic Algorithm for Computation of Polynomial Greatest Common with ...
 
Generalised Statistical Convergence For Double Sequences
Generalised Statistical Convergence For Double SequencesGeneralised Statistical Convergence For Double Sequences
Generalised Statistical Convergence For Double Sequences
 
Lesson 3: Problem Set 4
Lesson 3: Problem Set 4Lesson 3: Problem Set 4
Lesson 3: Problem Set 4
 
Optimum Engineering Design - Day 2b. Classical Optimization methods
Optimum Engineering Design - Day 2b. Classical Optimization methodsOptimum Engineering Design - Day 2b. Classical Optimization methods
Optimum Engineering Design - Day 2b. Classical Optimization methods
 
Conformal Boundary conditions
Conformal Boundary conditionsConformal Boundary conditions
Conformal Boundary conditions
 
Tutorial_3_Solution_.pdf
Tutorial_3_Solution_.pdfTutorial_3_Solution_.pdf
Tutorial_3_Solution_.pdf
 
Calculas
CalculasCalculas
Calculas
 
Backpropagation (DLAI D3L1 2017 UPC Deep Learning for Artificial Intelligence)
Backpropagation (DLAI D3L1 2017 UPC Deep Learning for Artificial Intelligence)Backpropagation (DLAI D3L1 2017 UPC Deep Learning for Artificial Intelligence)
Backpropagation (DLAI D3L1 2017 UPC Deep Learning for Artificial Intelligence)
 
Solving Poisson Equation using Conjugate Gradient Method and its implementation
Solving Poisson Equation using Conjugate Gradient Methodand its implementationSolving Poisson Equation using Conjugate Gradient Methodand its implementation
Solving Poisson Equation using Conjugate Gradient Method and its implementation
 
Computational Method for Engineers: Solving a system of Linear Equations
Computational Method for Engineers: Solving a system of Linear EquationsComputational Method for Engineers: Solving a system of Linear Equations
Computational Method for Engineers: Solving a system of Linear Equations
 
Unit 5 Correlation
Unit 5 CorrelationUnit 5 Correlation
Unit 5 Correlation
 
Gaussian Process Regression
Gaussian Process Regression  Gaussian Process Regression
Gaussian Process Regression
 
WavesAppendix.pdf
WavesAppendix.pdfWavesAppendix.pdf
WavesAppendix.pdf
 
Strong convexity on gradient descent and newton's method
Strong convexity on gradient descent and newton's methodStrong convexity on gradient descent and newton's method
Strong convexity on gradient descent and newton's method
 
HK_Partial Differential Equations_Laplace equation.pdf
HK_Partial Differential Equations_Laplace equation.pdfHK_Partial Differential Equations_Laplace equation.pdf
HK_Partial Differential Equations_Laplace equation.pdf
 
Stochastic Optimization
Stochastic OptimizationStochastic Optimization
Stochastic Optimization
 

Recently uploaded

Abortion pills in Riyadh Saudi Arabia (+966572737505 buy cytotec
Abortion pills in Riyadh Saudi Arabia (+966572737505 buy cytotecAbortion pills in Riyadh Saudi Arabia (+966572737505 buy cytotec
Abortion pills in Riyadh Saudi Arabia (+966572737505 buy cytotec
Abortion pills in Riyadh +966572737505 get cytotec
 
如何办理英国诺森比亚大学毕业证(NU毕业证书)成绩单原件一模一样
如何办理英国诺森比亚大学毕业证(NU毕业证书)成绩单原件一模一样如何办理英国诺森比亚大学毕业证(NU毕业证书)成绩单原件一模一样
如何办理英国诺森比亚大学毕业证(NU毕业证书)成绩单原件一模一样
wsppdmt
 
Abortion pills in Jeddah | +966572737505 | Get Cytotec
Abortion pills in Jeddah | +966572737505 | Get CytotecAbortion pills in Jeddah | +966572737505 | Get Cytotec
Abortion pills in Jeddah | +966572737505 | Get Cytotec
Abortion pills in Riyadh +966572737505 get cytotec
 
Abortion pills in Doha {{ QATAR }} +966572737505) Get Cytotec
Abortion pills in Doha {{ QATAR }} +966572737505) Get CytotecAbortion pills in Doha {{ QATAR }} +966572737505) Get Cytotec
Abortion pills in Doha {{ QATAR }} +966572737505) Get Cytotec
Abortion pills in Riyadh +966572737505 get cytotec
 
一比一原版(曼大毕业证书)曼尼托巴大学毕业证成绩单留信学历认证一手价格
一比一原版(曼大毕业证书)曼尼托巴大学毕业证成绩单留信学历认证一手价格一比一原版(曼大毕业证书)曼尼托巴大学毕业证成绩单留信学历认证一手价格
一比一原版(曼大毕业证书)曼尼托巴大学毕业证成绩单留信学历认证一手价格
q6pzkpark
 
如何办理(UCLA毕业证书)加州大学洛杉矶分校毕业证成绩单学位证留信学历认证原件一样
如何办理(UCLA毕业证书)加州大学洛杉矶分校毕业证成绩单学位证留信学历认证原件一样如何办理(UCLA毕业证书)加州大学洛杉矶分校毕业证成绩单学位证留信学历认证原件一样
如何办理(UCLA毕业证书)加州大学洛杉矶分校毕业证成绩单学位证留信学历认证原件一样
jk0tkvfv
 
如何办理澳洲拉筹伯大学毕业证(LaTrobe毕业证书)成绩单原件一模一样
如何办理澳洲拉筹伯大学毕业证(LaTrobe毕业证书)成绩单原件一模一样如何办理澳洲拉筹伯大学毕业证(LaTrobe毕业证书)成绩单原件一模一样
如何办理澳洲拉筹伯大学毕业证(LaTrobe毕业证书)成绩单原件一模一样
wsppdmt
 
Displacement, Velocity, Acceleration, and Second Derivatives
Displacement, Velocity, Acceleration, and Second DerivativesDisplacement, Velocity, Acceleration, and Second Derivatives
Displacement, Velocity, Acceleration, and Second Derivatives
23050636
 
如何办理(Dalhousie毕业证书)达尔豪斯大学毕业证成绩单留信学历认证
如何办理(Dalhousie毕业证书)达尔豪斯大学毕业证成绩单留信学历认证如何办理(Dalhousie毕业证书)达尔豪斯大学毕业证成绩单留信学历认证
如何办理(Dalhousie毕业证书)达尔豪斯大学毕业证成绩单留信学历认证
zifhagzkk
 
Simplify hybrid data integration at an enterprise scale. Integrate all your d...
Simplify hybrid data integration at an enterprise scale. Integrate all your d...Simplify hybrid data integration at an enterprise scale. Integrate all your d...
Simplify hybrid data integration at an enterprise scale. Integrate all your d...
varanasisatyanvesh
 
Reconciling Conflicting Data Curation Actions: Transparency Through Argument...
Reconciling Conflicting Data Curation Actions:  Transparency Through Argument...Reconciling Conflicting Data Curation Actions:  Transparency Through Argument...
Reconciling Conflicting Data Curation Actions: Transparency Through Argument...
Bertram Ludäscher
 

Recently uploaded (20)

Digital Transformation Playbook by Graham Ware
Digital Transformation Playbook by Graham WareDigital Transformation Playbook by Graham Ware
Digital Transformation Playbook by Graham Ware
 
社内勉強会資料_Object Recognition as Next Token Prediction
社内勉強会資料_Object Recognition as Next Token Prediction社内勉強会資料_Object Recognition as Next Token Prediction
社内勉強会資料_Object Recognition as Next Token Prediction
 
DATA SUMMIT 24 Building Real-Time Pipelines With FLaNK
DATA SUMMIT 24  Building Real-Time Pipelines With FLaNKDATA SUMMIT 24  Building Real-Time Pipelines With FLaNK
DATA SUMMIT 24 Building Real-Time Pipelines With FLaNK
 
Abortion pills in Riyadh Saudi Arabia (+966572737505 buy cytotec
Abortion pills in Riyadh Saudi Arabia (+966572737505 buy cytotecAbortion pills in Riyadh Saudi Arabia (+966572737505 buy cytotec
Abortion pills in Riyadh Saudi Arabia (+966572737505 buy cytotec
 
如何办理英国诺森比亚大学毕业证(NU毕业证书)成绩单原件一模一样
如何办理英国诺森比亚大学毕业证(NU毕业证书)成绩单原件一模一样如何办理英国诺森比亚大学毕业证(NU毕业证书)成绩单原件一模一样
如何办理英国诺森比亚大学毕业证(NU毕业证书)成绩单原件一模一样
 
Identify Customer Segments to Create Customer Offers for Each Segment - Appli...
Identify Customer Segments to Create Customer Offers for Each Segment - Appli...Identify Customer Segments to Create Customer Offers for Each Segment - Appli...
Identify Customer Segments to Create Customer Offers for Each Segment - Appli...
 
Abortion pills in Jeddah | +966572737505 | Get Cytotec
Abortion pills in Jeddah | +966572737505 | Get CytotecAbortion pills in Jeddah | +966572737505 | Get Cytotec
Abortion pills in Jeddah | +966572737505 | Get Cytotec
 
DBMS UNIT 5 46 CONTAINS NOTES FOR THE STUDENTS
DBMS UNIT 5 46 CONTAINS NOTES FOR THE STUDENTSDBMS UNIT 5 46 CONTAINS NOTES FOR THE STUDENTS
DBMS UNIT 5 46 CONTAINS NOTES FOR THE STUDENTS
 
Abortion pills in Doha {{ QATAR }} +966572737505) Get Cytotec
Abortion pills in Doha {{ QATAR }} +966572737505) Get CytotecAbortion pills in Doha {{ QATAR }} +966572737505) Get Cytotec
Abortion pills in Doha {{ QATAR }} +966572737505) Get Cytotec
 
一比一原版(曼大毕业证书)曼尼托巴大学毕业证成绩单留信学历认证一手价格
一比一原版(曼大毕业证书)曼尼托巴大学毕业证成绩单留信学历认证一手价格一比一原版(曼大毕业证书)曼尼托巴大学毕业证成绩单留信学历认证一手价格
一比一原版(曼大毕业证书)曼尼托巴大学毕业证成绩单留信学历认证一手价格
 
如何办理(UCLA毕业证书)加州大学洛杉矶分校毕业证成绩单学位证留信学历认证原件一样
如何办理(UCLA毕业证书)加州大学洛杉矶分校毕业证成绩单学位证留信学历认证原件一样如何办理(UCLA毕业证书)加州大学洛杉矶分校毕业证成绩单学位证留信学历认证原件一样
如何办理(UCLA毕业证书)加州大学洛杉矶分校毕业证成绩单学位证留信学历认证原件一样
 
Bios of leading Astrologers & Researchers
Bios of leading Astrologers & ResearchersBios of leading Astrologers & Researchers
Bios of leading Astrologers & Researchers
 
如何办理澳洲拉筹伯大学毕业证(LaTrobe毕业证书)成绩单原件一模一样
如何办理澳洲拉筹伯大学毕业证(LaTrobe毕业证书)成绩单原件一模一样如何办理澳洲拉筹伯大学毕业证(LaTrobe毕业证书)成绩单原件一模一样
如何办理澳洲拉筹伯大学毕业证(LaTrobe毕业证书)成绩单原件一模一样
 
Displacement, Velocity, Acceleration, and Second Derivatives
Displacement, Velocity, Acceleration, and Second DerivativesDisplacement, Velocity, Acceleration, and Second Derivatives
Displacement, Velocity, Acceleration, and Second Derivatives
 
如何办理(Dalhousie毕业证书)达尔豪斯大学毕业证成绩单留信学历认证
如何办理(Dalhousie毕业证书)达尔豪斯大学毕业证成绩单留信学历认证如何办理(Dalhousie毕业证书)达尔豪斯大学毕业证成绩单留信学历认证
如何办理(Dalhousie毕业证书)达尔豪斯大学毕业证成绩单留信学历认证
 
RESEARCH-FINAL-DEFENSE-PPT-TEMPLATE.pptx
RESEARCH-FINAL-DEFENSE-PPT-TEMPLATE.pptxRESEARCH-FINAL-DEFENSE-PPT-TEMPLATE.pptx
RESEARCH-FINAL-DEFENSE-PPT-TEMPLATE.pptx
 
Simplify hybrid data integration at an enterprise scale. Integrate all your d...
Simplify hybrid data integration at an enterprise scale. Integrate all your d...Simplify hybrid data integration at an enterprise scale. Integrate all your d...
Simplify hybrid data integration at an enterprise scale. Integrate all your d...
 
Reconciling Conflicting Data Curation Actions: Transparency Through Argument...
Reconciling Conflicting Data Curation Actions:  Transparency Through Argument...Reconciling Conflicting Data Curation Actions:  Transparency Through Argument...
Reconciling Conflicting Data Curation Actions: Transparency Through Argument...
 
5CL-ADBA,5cladba, Chinese supplier, safety is guaranteed
5CL-ADBA,5cladba, Chinese supplier, safety is guaranteed5CL-ADBA,5cladba, Chinese supplier, safety is guaranteed
5CL-ADBA,5cladba, Chinese supplier, safety is guaranteed
 
Las implicancias del memorándum de entendimiento entre Codelco y SQM según la...
Las implicancias del memorándum de entendimiento entre Codelco y SQM según la...Las implicancias del memorándum de entendimiento entre Codelco y SQM según la...
Las implicancias del memorándum de entendimiento entre Codelco y SQM según la...
 

tut07.pptx

  • 1. The Gaussian Distribution CSC 411/2515 Adopted from PRML 2.3
  • 2. The Gaussian Distribution  For a 𝐷-dimensional vector 𝑥, the multivariate Gaussian distribution takes the form: 𝒩 𝑥|𝜇, Σ = 1 2𝜋 𝐷 2 Σ 1 2 exp − 1 2 𝑥 − 𝜇 ⊤ Σ−1 𝑥 − 𝜇  Motivations:  Maximum of the entropy  Central limit theorem
  • 3. The Gaussian Distribution: Properties  The law is a function of the Mahalanobis distance from 𝑥 to 𝜇: Δ2 = 𝑥 − 𝜇 ⊤Σ−1 𝑥 − 𝜇  The expectation of 𝑥 under the Gaussian distribution is: 𝔼 𝑥 = 𝜇  The covariance matrix of 𝑥 is: cov 𝑥 = Σ
  • 4. The Gaussian Distribution: Properties  The law (quadratic function) is constant on elliptical surfaces:  𝜆𝑖 are the eigenvalues of Σ  𝑢𝑖 are the associated eigenvectors
  • 5. The Gaussian Distribution: more examples  Contours of constant probability density: in which the covariance matrix is: a) of general form b) diagonal c) proportional to the identity matrix
  • 6. Conditional Law  Given a Gaussian distribution 𝒩 𝑥|𝜇, Σ with 𝑥 = 𝑥𝑎, 𝑥𝑏 ⊤, 𝜇 = 𝜇𝑎, 𝜇𝑏 ⊤ Σ = Σ𝑎𝑎 Σ𝑎𝑏 Σ𝑏𝑎 Σ𝑏𝑏 = Λ𝑎𝑎 Λ𝑎𝑏 Λ𝑏𝑎 Λ𝑏𝑏 −1  What’s the conditional distribution 𝑝(𝑥𝑎|𝑥𝑏)?
  • 7. Conditional Law  Given a Gaussian distribution 𝒩 𝑥|𝜇, Σ with 𝑥 = 𝑥𝑎, 𝑥𝑏 ⊤, 𝜇 = 𝜇𝑎, 𝜇𝑏 ⊤ Σ = Σ𝑎𝑎 Σ𝑎𝑏 Σ𝑏𝑎 Σ𝑏𝑏 = Λ𝑎𝑎 Λ𝑎𝑏 Λ𝑏𝑎 Λ𝑏𝑏 −1  What’s the conditional distribution 𝑝(𝑥𝑎|𝑥𝑏)? − 1 2 𝑥 − 𝜇 ⊤Σ−1 𝑥 − 𝜇 = − 1 2 𝑥𝑎 − 𝜇𝑎 ⊤Λ𝑎𝑎 𝑥𝑎 − 𝜇𝑎 − 1 2 𝑥𝑎 − 𝜇𝑎 ⊤Λ𝑎𝑏 𝑥𝑏 − 𝜇𝑏 − 1 2 𝑥𝑏 − 𝜇𝑏 ⊤ Λ𝑏𝑎 𝑥𝑎 − 𝜇𝑎 − 1 2 𝑥𝑏 − 𝜇𝑏 ⊤ Λ𝑏𝑏 𝑥𝑏 − 𝜇𝑏
  • 8. Conditional Law  Using the definition: − 1 2 𝑥 − 𝜇 ⊤ Σ−1 𝑥 − 𝜇 = − 1 2 𝑥⊤ Σ−1 𝑥 + 𝑥⊤ Σ−1 𝜇 + const  What’s the conditional distribution 𝑝(𝑥𝑎|𝑥𝑏)?  Σ𝑎|𝑏 −1 = Λ𝑎𝑎  Σ𝑎|𝑏 −1 𝜇𝑎|𝑏 = Λ𝑎𝑎𝜇𝑎 − Λ𝑎𝑏 𝑥𝑏 − 𝜇𝑏 Σ = Σ𝑎𝑎 Σ𝑎𝑏 Σ𝑏𝑎 Σ𝑏𝑏 = Λ𝑎𝑎 Λ𝑎𝑏 Λ𝑏𝑎 Λ𝑏𝑏 −1  Λ𝑎𝑎 = Σ𝑎𝑎 − Σ𝑎𝑏Σ𝑏𝑏 −1 Σ𝑏𝑎 −1  Λ𝑎𝑏 = − Σ𝑎𝑎 − Σ𝑎𝑏Σ𝑏𝑏 −1 Σ𝑏𝑎 −1 Σ𝑎𝑏Σ𝑏𝑏 −1 Inverse partition identity: 𝐴 𝐵 𝐶 𝐷 −1 = 𝑀 −𝑀𝐵𝐷−1 −𝐷−1 𝐶𝑀 𝐷−1 + 𝐷−1 𝐶𝑀𝐵𝐷−1 𝑀 = 𝐴 − 𝐵𝐷−1 𝐶 −1
  • 9. Conditional Law 𝜇𝑎|𝑏 = 𝜇𝑎 + Λ𝑎𝑎Λ𝑎𝑏 −1 𝑥𝑏 − 𝜇𝑏  The form using precision matrix: Λ𝑎|𝑏 = Λ𝑎𝑎 𝜇𝑎|𝑏 = 𝜇𝑎 + Σ𝑎𝑏Σ𝑏𝑏 −1 𝑥𝑏 − 𝜇𝑏  The conditional distribution 𝑝 𝑥𝑎 𝑥𝑏 is a Gaussian with: Σ𝑎|𝑏 = Σ𝑎𝑎 − Σ𝑎𝑏Σ𝑏𝑏 −1 Σ𝑏𝑎
  • 10. Marginal Law  The marginal distribution is given by: 𝑝 𝑥𝑎 = 𝑝 𝑥𝑎, 𝑥𝑏 𝑑𝑥𝑏  Picking out those terms that involve 𝑥𝑏, we have − 1 2 𝑥𝑏 ⊤ Λ𝑏𝑏𝑥𝑏 + 𝑥𝑏 ⊤ 𝑚 = − 1 2 𝑥𝑏 − Λ𝑏𝑏 −1 𝑚 ⊤ Λ𝑏𝑏 𝑥𝑏 − Λ𝑏𝑏 −1 𝑚 + 1 2 𝑚⊤Λ𝑏𝑏 −1 𝑚 𝑚 = Λ𝑏𝑏𝜇𝑏 − Λ𝑏𝑎 𝑥𝑎 − 𝜇𝑎  Integrate over 𝑥𝑏 (unnormalized Gaussian) exp − 1 2 𝑥𝑏 − Λ𝑏𝑏 −1 𝑚 ⊤ Λ𝑏𝑏 𝑥𝑏 − Λ𝑏𝑏 −1 𝑚 𝑑𝑥𝑏  The integral is equal to the normalization term
  • 11. Marginal Law  After integrating over 𝑥𝑏, we pick out the remaining terms:  The marginal distribution is a Gaussian with 𝔼 𝑥𝑎 = 𝜇𝑎 cov 𝑥𝑎 = Σ𝑎𝑎 − 1 2 𝑥𝑎 ⊤Λ𝑎𝑎𝑥𝑎 + 𝑥𝑎 ⊤ Λ𝑎𝑎𝜇𝑎 + Λ𝑎𝑏𝜇𝑏 + 1 2 𝑚⊤Λ𝑏𝑏 −1 𝑚 + const 𝑚 = Λ𝑏𝑏𝜇𝑏 − Λ𝑏𝑎 𝑥𝑎 − 𝜇𝑎
  • 12. Short Summary  Conditional distribution: Σ = Σ𝑎𝑎 Σ𝑎𝑏 Σ𝑏𝑎 Σ𝑏𝑏 = Λ𝑎𝑎 Λ𝑎𝑏 Λ𝑏𝑎 Λ𝑏𝑏 −1 𝑝 𝑥𝑎 𝑥𝑏 = 𝒩 𝑥𝑎|𝜇𝑎|𝑏, Λ𝑎𝑎 −1 𝜇𝑎|𝑏 = 𝜇𝑎 − Λ𝑎𝑎 −1 Λ𝑎𝑏(𝑥𝑏 − 𝜇𝑏)  Marginal distribution: 𝑝 𝑥𝑎 = 𝒩 𝑥𝑎|𝜇𝑎, Σ𝑎𝑎
  • 13. Bayes’ theorem for Gaussian variables  Setup:  What’s the marginal distribution 𝑝 𝑦 and conditional distribution 𝑝 𝑥|𝑦 ? 𝑝 𝑥 = 𝒩 𝑥|𝜇, Λ−1 𝑝 𝑦|𝑥 = 𝒩 𝑦|A𝑥 + 𝑏, L−1  How about first compute 𝑝 𝑧 , where 𝑧 = 𝑥, 𝑦 ⊤  𝑝 𝑧 is a Gaussian distribution, consider the log of the joint distribution ln 𝑝 𝑧 = ln 𝑝 𝑥 + ln 𝑝 𝑦 𝑥 = − 1 2 𝑥 − 𝜇 ⊤ Λ 𝑥 − 𝜇 − 1 2 𝑦 − A𝑥 + 𝑏 ⊤L 𝑦 − A𝑥 + 𝑏 + const
  • 14. Bayes’ theorem for Gaussian variables  The same trick (consider the second order terms), we get 𝔼 𝑧 = 𝜇 A𝜇 + 𝑏 cov 𝑧 = Λ−1 Λ−1 A AΛ−1 𝐿−1 + AΛ−1A  We can then get 𝑝 𝑦 and 𝑝 𝑥|𝑦 by marginal and conditional laws!
  • 15. Maximum likelihood for the Gaussian  Assume we have X = 𝑥1, … , 𝑥𝑁 ⊤ in which the observation 𝑥𝑛 are assumed to be drawn independently from a multivariate Gaussian, the log likelihood function is given by ln 𝑝 X 𝜇, Σ = − 𝑁𝐷 2 ln 2𝜋 − 𝑁 2 ln Σ − 1 2 𝑛=1 𝑁 𝑥𝑛 − 𝜇 ⊤ Σ−1 𝑥𝑛 − 𝜇  Setting the derivative to zero, we obtain the solution for the maximum likelihood estimator: 𝜕 𝜕𝜇 ln 𝑝 X 𝜇, Σ = 𝑛=1 𝑁 Σ−1 𝑥𝑛 − 𝜇 = 0 ΣML = 1 𝑁 𝑛=1 𝑁 𝑥𝑛 − 𝜇ML 𝑥𝑛 − 𝜇ML ⊤ 𝜇ML = 1 𝑁 𝑛=1 𝑁 𝑥𝑛
  • 16. Maximum likelihood for the Gaussian  The empirical mean is unbiased in the sense 𝔼 𝜇ML = 𝜇  However, the maximum likelihood estimate for the covariance has an expectation that is less that the true value: 𝔼 ΣML = 𝑁 − 1 𝑁 Σ  We can correct it by multiplying ΣML by the factor 𝑁 𝑁−1
  • 17. Conjugate prior for the Gaussian  Introducing prior distributions over the parameters of the Gaussian  The maximum likelihood framework only gives point estimates for the parameters, we would like to have uncertainty estimation (confidence interval) for the estimation  The conjugate prior for 𝜇 is a Gaussian  The conjugate prior for precision Λ is a Gamma distribution  We would like the posterior 𝑝 𝜃 𝐷 ∝ 𝑝 𝜃 𝑝(𝐷|𝜃) has the same form as the prior (Conjugate prior!)
  • 18. The Gaussian Distribution: limitations  A lot of parameters to estimate 𝐷 + 𝐷2+𝐷 2 : structured approximation (e.g., diagonal variance matrix)  Maximum likelihood estimators are not robust to outliers: Student’s t-distribution (bottom left)  Not able to describe periodic data: von Mises distribution  Unimodel distribution: Mixture of Gaussian (bottom right)
  • 19. The Gaussian Distribution: frontiers  Gaussian Process  Bayesian Neural Networks  Generative modeling (Variational Autoencoder)  ……