This document discusses Bayesian methods for Gaussian and Student's t-distribution models. It covers Bayesian linear regression, Gaussian processes, and variational mixtures of Gaussians. The key points are:
- Bayesian inference for Gaussian distributions uses conjugate priors: a Gaussian prior over the mean when the variance is known, and a Gaussian-Gamma prior when the variance is also unknown.
- The Student's t-distribution can be represented as an infinite mixture of Gaussians, making it more robust to outliers than a single Gaussian.
- Gibbs sampling can be used to fit finite and infinite Gaussian mixture models to data in an unsupervised manner.
17. Assume we are given samples from p(z, μ), one at a time. The goal is to find the root μ* of the regression function f(μ) ≡ E[z|μ].
The Robbins-Monro Algorithm
18. Successive estimates μ^(N) of the root are then given by
μ^(N) = μ^(N−1) + a_{N−1} z(μ^(N−1)).
Conditions on aN for convergence:
lim_{N→∞} aN = 0,  Σ_{N=1}^∞ aN = ∞,  Σ_{N=1}^∞ aN² < ∞.
The Robbins-Monro Algorithm
19. Example: estimate the mean of a Gaussian.
Robbins-Monro for Maximum Likelihood
Here z = ∂/∂μML ln p(x|μML, σ²) = (x − μML)/σ², so the distribution of z is Gaussian with mean (μ − μML)/σ². For the Robbins-Monro update equation, aN = σ²/N, which recovers the sequential maximum-likelihood estimate
μML^(N) = μML^(N−1) + (1/N)(xN − μML^(N−1)).
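As a small sketch (with made-up sample values), the Robbins-Monro update with step size aN = σ²/N reduces to the familiar running-average estimate of the mean, since the σ² factors cancel:

```python
# Robbins-Monro sequential estimation of a Gaussian mean.
# With z = (x - mu) / sigma^2 and a_N = sigma^2 / N, the update
# mu_N = mu_{N-1} + a_{N-1} * z is exactly the running-average formula.

def robbins_monro_mean(samples, sigma2=1.0):
    mu = 0.0
    for n, x in enumerate(samples, start=1):
        a = sigma2 / n                  # step size a_N = sigma^2 / N
        z = (x - mu) / sigma2           # z = d/dmu log p(x | mu, sigma^2)
        mu = mu + a * z                 # Robbins-Monro update
    return mu

samples = [1.0, 3.0, 2.0, 4.0]
print(robbins_monro_mean(samples))   # equals the batch mean, 2.5
```

Note that the final estimate does not depend on σ²: the step size and the gradient scale cancel, which is why the sequential and batch maximum-likelihood estimates coincide here.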
21. Bayesian Inference for the Gaussian (1)
Assume σ² is known. Given i.i.d. data X = {x1, …, xN}, the likelihood function for μ is given by
p(X|μ) = ∏_{n=1}^N p(xn|μ) = (2πσ²)^(−N/2) exp{ −(1/(2σ²)) Σ_{n=1}^N (xn − μ)² }.
This has a Gaussian shape as a function of μ (but it is not a distribution over μ).
22. Bayesian Inference for the Gaussian (2)
Combined with a Gaussian prior over μ, p(μ) = N(μ|μ0, σ0²), this gives the posterior
p(μ|X) ∝ p(X|μ) p(μ).
Completing the square over μ, we see that p(μ|X) = N(μ|μN, σN²), where
μN = (σ² μ0 + N σ0² μML) / (N σ0² + σ²),  1/σN² = 1/σ0² + N/σ².
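As a sketch of these closed-form updates (the data values and hyperparameters below are invented for illustration), the posterior mean interpolates between the prior mean μ0 and the maximum-likelihood estimate μML:

```python
# Posterior over the mean mu of a Gaussian with known variance sigma2,
# given a Gaussian prior N(mu | mu0, sigma0_2).

def posterior_mean_params(data, sigma2, mu0, sigma0_2):
    n = len(data)
    mu_ml = sum(data) / n
    # mu_N = (sigma^2 mu_0 + N sigma_0^2 mu_ML) / (N sigma_0^2 + sigma^2)
    mu_n = (sigma2 * mu0 + n * sigma0_2 * mu_ml) / (n * sigma0_2 + sigma2)
    # 1 / sigma_N^2 = 1 / sigma_0^2 + N / sigma^2
    sigma_n_2 = 1.0 / (1.0 / sigma0_2 + n / sigma2)
    return mu_n, sigma_n_2

mu_n, var_n = posterior_mean_params([0.9, 1.1, 1.3], sigma2=1.0,
                                    mu0=0.0, sigma0_2=1.0)
print(mu_n, var_n)  # mu_ML = 1.1 is shrunk toward the prior mean 0
```

As N grows, μN approaches μML and σN² shrinks toward zero, so the data eventually dominate the prior.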
26. Bayesian Inference for the Gaussian (5)
Now assume μ is known. The likelihood function for λ ≡ 1/σ² is given by
p(X|λ) = ∏_{n=1}^N N(xn|μ, λ⁻¹) ∝ λ^(N/2) exp{ −(λ/2) Σ_{n=1}^N (xn − μ)² }.
This has a Gamma shape as a function of λ.
28. Bayesian Inference for the Gaussian (7)
Now we combine a Gamma prior, Gam(λ|a0, b0), with the likelihood function for λ to obtain
p(λ|X) ∝ λ^(a0−1) λ^(N/2) exp{ −b0 λ − (λ/2) Σ_{n=1}^N (xn − μ)² },
which we recognize as Gam(λ|aN, bN) with
aN = a0 + N/2,  bN = b0 + (1/2) Σ_{n=1}^N (xn − μ)² = b0 + (N/2) σ²ML.
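A minimal sketch of this conjugate update (the data and prior values are made up for illustration):

```python
# Conjugate Gamma update for the precision lambda = 1/sigma^2 of a
# Gaussian with known mean mu: Gam(a0, b0) prior -> Gam(aN, bN) posterior.

def posterior_precision_params(data, mu, a0, b0):
    n = len(data)
    a_n = a0 + n / 2.0                                  # a_N = a_0 + N/2
    b_n = b0 + 0.5 * sum((x - mu) ** 2 for x in data)   # b_N = b_0 + sum/2
    return a_n, b_n

a_n, b_n = posterior_precision_params([1.0, 3.0], mu=2.0, a0=1.0, b0=1.0)
print(a_n, b_n)  # a_N = 2.0, b_N = 2.0
```

The prior here behaves like 2·a0 "effective" prior observations with a total squared deviation of 2·b0, which foreshadows the "prior as memory" discussion at the end of this document.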
29. Bayesian Inference for the Gaussian (8)
If both μ and λ are unknown, the joint likelihood function is given by
p(X|μ, λ) = ∏_{n=1}^N (λ/(2π))^(1/2) exp{ −(λ/2)(xn − μ)² }.
We need a prior with the same functional dependence on μ and λ: the Gaussian-Gamma (normal-gamma) distribution
p(μ, λ) = N(μ|μ0, (βλ)⁻¹) Gam(λ|a, b).
38. Student’s t-Distribution (2)
If we use the inverse Wishart distribution as the prior (over the covariance of a multivariate Gaussian) and integrate it out, we also get a Student's t-distribution.
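The univariate version of this mixture representation can be checked numerically: integrating N(x|0, λ⁻¹) against a Gam(λ|ν/2, ν/2) prior over the precision reproduces the Student's t density. The values ν = 3 and x = 1 below are arbitrary test points:

```python
import math

def t_density(x, nu):
    # Closed-form Student's t density (zero mean, unit scale).
    c = math.gamma((nu + 1) / 2) / (math.gamma(nu / 2) * math.sqrt(nu * math.pi))
    return c * (1 + x * x / nu) ** (-(nu + 1) / 2)

def t_density_as_mixture(x, nu, steps=120000, lam_max=60.0):
    # Numerically integrate N(x | 0, 1/lam) * Gam(lam | nu/2, nu/2) d lam,
    # i.e. the "infinite mixture of Gaussians" view of Student's t.
    a = b = nu / 2.0
    h = lam_max / steps
    total = 0.0
    for i in range(1, steps + 1):
        lam = i * h
        gauss = math.sqrt(lam / (2 * math.pi)) * math.exp(-0.5 * lam * x * x)
        gamma_pdf = (b ** a / math.gamma(a)) * lam ** (a - 1) * math.exp(-b * lam)
        total += gauss * gamma_pdf * h
    return total

print(t_density(1.0, nu=3.0))
print(t_density_as_mixture(1.0, nu=3.0))  # agrees to ~4 decimal places
```

Because the mixture puts mass on many different precisions, the resulting density has heavier tails than any single Gaussian, which is the source of the robustness to outliers noted above.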
Gibbs sampling for fitting finite and infinite Gaussian mixture models by Herman Kamper
39. An application of the IGMM (1)
How many Gaussian distributions are needed to approximate the data shown on the right?
40. An application of the IGMM (2)
For this data set the model settles on 6 Gaussians. Let the data speak for itself.
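The Kamper notes referenced above derive full Gibbs samplers for finite and infinite GMMs. As a much simpler teaching sketch (not Kamper's implementation: this toy version assumes one dimension, a known shared variance, uniform mixing weights, and invented hyperparameters), a Gibbs sampler for a finite 1-D GMM alternates between sampling assignments and sampling component means:

```python
import math, random

def gibbs_gmm_1d(data, K, sweeps=50, sigma2=1.0, mu0=0.0, sigma0_2=100.0, seed=0):
    """Toy Gibbs sampler: K Gaussians with known shared variance sigma2,
    uniform mixing weights, and a N(mu0, sigma0_2) prior on each mean."""
    rng = random.Random(seed)
    z = [rng.randrange(K) for _ in data]          # random initial assignments
    mu = [rng.gauss(mu0, 1.0) for _ in range(K)]  # initial component means
    for _ in range(sweeps):
        # 1. Resample each assignment given the current means.
        for i, x in enumerate(data):
            logp = [-0.5 * (x - mu[k]) ** 2 / sigma2 for k in range(K)]
            m = max(logp)
            w = [math.exp(lp - m) for lp in logp]
            r = rng.random() * sum(w)
            acc = 0.0
            for k in range(K):
                acc += w[k]
                if r <= acc:
                    z[i] = k
                    break
        # 2. Resample each mean from its Gaussian conditional posterior
        #    (the same known-variance update derived earlier).
        for k in range(K):
            xs = [x for x, zi in zip(data, z) if zi == k]
            n = len(xs)
            var_n = 1.0 / (1.0 / sigma0_2 + n / sigma2)
            mean_n = var_n * (mu0 / sigma0_2 + sum(xs) / sigma2)
            mu[k] = rng.gauss(mean_n, math.sqrt(var_n))
    return sorted(mu)

random.seed(1)
data = [random.gauss(0.0, 1.0) for _ in range(60)] + \
       [random.gauss(10.0, 1.0) for _ in range(60)]
print(gibbs_gmm_1d(data, K=2))  # means should land near 0 and 10
```

The infinite (IGMM) case replaces step 1 with Chinese-restaurant-process assignment probabilities so that K itself is inferred; the conditional for the means is unchanged.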
The likelihood function depends on the data set only through the two quantities Σn xn and Σn xn² (the sufficient statistics).
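This can be verified directly (with made-up data and parameters): the Gaussian log-likelihood computed from the raw data matches the one computed from the two sufficient statistics alone.

```python
import math

def loglik_raw(data, mu, sigma2):
    # Log-likelihood summed over the raw observations.
    return sum(-0.5 * math.log(2 * math.pi * sigma2)
               - (x - mu) ** 2 / (2 * sigma2) for x in data)

def loglik_suffstats(n, s1, s2, mu, sigma2):
    # Same quantity from the sufficient statistics s1 = sum(x), s2 = sum(x^2),
    # using sum((x - mu)^2) = s2 - 2*mu*s1 + n*mu^2.
    return (-0.5 * n * math.log(2 * math.pi * sigma2)
            - (s2 - 2 * mu * s1 + n * mu * mu) / (2 * sigma2))

data = [0.5, 1.5, 2.5, -1.0]
a = loglik_raw(data, mu=0.7, sigma2=2.0)
b = loglik_suffstats(len(data), sum(data), sum(x * x for x in data), 0.7, 2.0)
print(a, b)  # identical up to floating-point rounding
```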
We want a better explanation. The prior does not only encode prior knowledge; it also acts as a "memory" in the inference process. For example: a professor publishes 0 papers this week and I publish 1. Looking at this week alone, it seems I am the better researcher. But nobody reasons this way: you have to consider the history. Perhaps the professor has published 1000 papers before, while I have only 2. What we did before is the prior, and it helps us reach a better judgement.
But if I keep publishing while the professor stops, then by the time I have 2000 papers of the same quality, it becomes likely that I have surpassed him. Throughout this process, the prior acts as a "memory".
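The "prior as memory" idea can be made concrete with a conjugate Beta-Binomial sketch. The modelling here is deliberately loose (weekly paper counts treated as Bernoulli pseudo-counts, with numbers invented to mirror the story), but it shows how the prior carries the history:

```python
# Prior as "memory": a Beta(a, b) prior over a success rate behaves like
# a + b pseudo-observations that this week's data must compete with.

def posterior_mean(successes, failures, prior_a, prior_b):
    # Posterior mean of a Beta-Binomial model: (a + s) / (a + b + s + f).
    return (prior_a + successes) / (prior_a + prior_b + successes + failures)

# History as the prior: the professor's 1000 past papers vs my 2,
# encoded (loosely) as pseudo-counts of productive weeks.
prof = posterior_mean(0, 1, prior_a=1000, prior_b=1)   # 0 papers this week
me = posterior_mean(1, 0, prior_a=2, prior_b=1)        # 1 paper this week
print(prof, me)  # the professor's estimate stays far higher
```

Only as my own counts grow large does the posterior forget the initial gap, which matches the 2000-papers ending of the story above.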