# 03 cv mil_probability_distributions

Published in: Technology, Education
### 03 cv mil_probability_distributions

1. Computer vision: models, learning and inference. Chapter 3: Probability distributions. Please send errata to s.prince@cs.ucl.ac.uk
2. Computer vision: models, learning and inference. ©2011 Simon J.D. Prince
3. Why model these complicated quantities? Because we need probability distributions over model parameters as well as over data and world state. Hence, some of the distributions describe the parameters of the others.
4. Why model these complicated quantities? Because we need probability distributions over model parameters as well as over data and world state. Hence, some of the distributions describe the parameters of the others. Example: the parameters of one distribution are modelled by a second distribution, part of which models the variance and part of which models the mean.
5. Bernoulli distribution. $Pr(y=0) = 1-\lambda$ and $Pr(y=1) = \lambda$, or $Pr(y) = \lambda^{y}(1-\lambda)^{1-y}$. For short we write: $Pr(y) = \text{Bern}_y[\lambda]$. The Bernoulli distribution describes a situation with only two possible outcomes, y=0/y=1 or failure/success. It takes a single parameter $\lambda \in [0,1]$.
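6. Beta distribution. Defined over data $\lambda \in [0,1]$ (i.e. the parameter of the Bernoulli): $Pr(\lambda) = \frac{\Gamma(\alpha+\beta)}{\Gamma(\alpha)\Gamma(\beta)}\lambda^{\alpha-1}(1-\lambda)^{\beta-1}$. For short we write: $Pr(\lambda) = \text{Beta}_\lambda[\alpha,\beta]$. It has two parameters $\alpha, \beta$, both > 0. The mean depends on their relative values: $E[\lambda] = \frac{\alpha}{\alpha+\beta}$. The concentration depends on their magnitude.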
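
The two statements about the Beta parameters (mean set by relative values, concentration set by magnitude) can be checked numerically. The sketch below is illustrative only: it assumes the usual parameter names `alpha` and `beta` and the standard closed forms for the Beta mean and variance, which are not spelled out in the transcript.

```python
def beta_mean(alpha, beta):
    """Mean of a Beta(alpha, beta) distribution: alpha / (alpha + beta)."""
    return alpha / (alpha + beta)


def beta_variance(alpha, beta):
    """Variance of a Beta(alpha, beta) distribution (standard closed form)."""
    total = alpha + beta
    return alpha * beta / (total ** 2 * (total + 1))


# Pairs with the same ratio share a mean; larger magnitudes give a
# smaller variance, i.e. a more concentrated distribution.
for alpha, beta in [(2, 6), (20, 60), (1, 1), (10, 10)]:
    print(alpha, beta, beta_mean(alpha, beta), beta_variance(alpha, beta))
```

Scaling both parameters by the same factor leaves the mean unchanged and shrinks the variance.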
7. Categorical distribution. $Pr(y=k) = \lambda_k$, or we can think of the data as a vector with all elements zero except the kth, e.g. [0,0,0,1,0], so that $Pr(\mathbf{y}) = \prod_{j=1}^{K}\lambda_j^{y_j} = \lambda_k$. For short we write: $Pr(y) = \text{Cat}_y[\lambda_{1\ldots K}]$. The categorical distribution describes a situation with K possible outcomes y=1 … y=K. It takes K parameters $\lambda_k \geq 0$ where $\sum_{k=1}^{K}\lambda_k = 1$.
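
The one-hot view of categorical data can be sketched as follows. The helper names `one_hot` and `categorical_prob`, and the parameter values, are hypothetical and chosen only for illustration; the product form is the standard way to evaluate a categorical probability on a one-hot vector.

```python
def one_hot(k, num_outcomes):
    """Return a vector of zeros with a single 1 at position k (1-based)."""
    return [1 if j == k else 0 for j in range(1, num_outcomes + 1)]


def categorical_prob(y, lam):
    """Evaluate prod_j lam_j ** y_j for a one-hot vector y."""
    prob = 1.0
    for y_j, lam_j in zip(y, lam):
        prob *= lam_j ** y_j
    return prob


# Illustrative parameters: non-negative and summing to one.
lam = [0.1, 0.2, 0.3, 0.25, 0.15]
y = one_hot(4, 5)  # outcome k = 4 out of K = 5
print(y, categorical_prob(y, lam))
```

Because every exponent except the kth is zero, the product picks out the single parameter for the observed outcome.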
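8. Dirichlet distribution. Defined over K values $\lambda_k \in [0,1]$ where $\sum_{k=1}^{K}\lambda_k = 1$: $Pr(\lambda_{1\ldots K}) = \frac{\Gamma\left(\sum_{k=1}^{K}\alpha_k\right)}{\prod_{k=1}^{K}\Gamma(\alpha_k)}\prod_{k=1}^{K}\lambda_k^{\alpha_k-1}$. Or for short: $Pr(\lambda_{1\ldots K}) = \text{Dir}_{\lambda_{1\ldots K}}[\alpha_{1\ldots K}]$. It has K parameters $\alpha_k > 0$.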
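
One way to see that a Dirichlet sample is itself a valid set of categorical parameters is to draw one. The sketch below uses the well-known construction of normalizing independent gamma draws; this construction is an assumption brought in for illustration and is not part of the slides, and `sample_dirichlet` is a hypothetical helper name.

```python
import random


def sample_dirichlet(alphas, rng):
    """Draw one Dirichlet sample by normalizing Gamma(alpha_k, 1) draws."""
    draws = [rng.gammavariate(a, 1.0) for a in alphas]
    total = sum(draws)
    return [d / total for d in draws]


rng = random.Random(0)  # fixed seed so the run is repeatable
sample = sample_dirichlet([2.0, 3.0, 5.0], rng)
print(sample, sum(sample))  # entries lie in (0, 1) and sum to 1
```

Averaging many such samples should approach $\alpha_k / \sum_j \alpha_j$ for each component.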
9. Univariate normal distribution. $Pr(x) = \frac{1}{\sqrt{2\pi\sigma^2}}\exp\left[-0.5\,(x-\mu)^2/\sigma^2\right]$. For short we write: $Pr(x) = \text{Norm}_x[\mu,\sigma^2]$. The univariate normal distribution describes a single continuous variable. It takes two parameters, $\mu$ and $\sigma^2 > 0$.
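10. Normal inverse gamma distribution. Defined on two variables, $\mu$ and $\sigma^2 > 0$: $Pr(\mu,\sigma^2) = \frac{\sqrt{\gamma}}{\sigma\sqrt{2\pi}}\frac{\beta^{\alpha}}{\Gamma(\alpha)}\left(\frac{1}{\sigma^2}\right)^{\alpha+1}\exp\left[-\frac{2\beta+\gamma(\delta-\mu)^2}{2\sigma^2}\right]$, or for short $Pr(\mu,\sigma^2) = \text{NormInvGam}_{\mu,\sigma^2}[\alpha,\beta,\gamma,\delta]$. It has four parameters: $\alpha, \beta, \gamma > 0$ and $\delta \in [-\infty,\infty]$.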
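
A rough way to get a feel for the normal inverse gamma is to sample from it. The sketch below assumes the common parameterization in which $\sigma^2$ follows an inverse gamma with shape `alpha` and scale `beta`, and $\mu$ given $\sigma^2$ is normal with mean `delta` and variance $\sigma^2/\gamma$. The parameter names and the helper `sample_norm_inv_gamma` are assumptions for illustration.

```python
import random


def sample_norm_inv_gamma(alpha, beta, gamma, delta, rng):
    """Draw (mu, sigma_sq) under the assumed parameterization."""
    # If G ~ Gamma(shape=alpha, rate=beta) then 1/G ~ InverseGamma(alpha, beta).
    # random.gammavariate takes (shape, scale), so the scale is 1 / beta.
    sigma_sq = 1.0 / rng.gammavariate(alpha, 1.0 / beta)
    # Given sigma_sq, the mean is normal around delta with variance sigma_sq / gamma.
    mu = rng.gauss(delta, (sigma_sq / gamma) ** 0.5)
    return mu, sigma_sq


rng = random.Random(0)
for _ in range(3):
    print(sample_norm_inv_gamma(5.0, 2.0, 4.0, 1.0, rng))
```

Each draw is a plausible (mean, variance) pair for a univariate normal; the variance is always positive.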
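11. Multivariate normal distribution. $Pr(\mathbf{x}) = \frac{1}{(2\pi)^{D/2}|\boldsymbol{\Sigma}|^{1/2}}\exp\left[-0.5\,(\mathbf{x}-\boldsymbol{\mu})^{T}\boldsymbol{\Sigma}^{-1}(\mathbf{x}-\boldsymbol{\mu})\right]$. For short we write: $Pr(\mathbf{x}) = \text{Norm}_{\mathbf{x}}[\boldsymbol{\mu},\boldsymbol{\Sigma}]$. The multivariate normal distribution describes multiple continuous variables. It takes two parameters: a vector $\boldsymbol{\mu}$ containing the mean position, and a symmetric "positive definite" covariance matrix $\boldsymbol{\Sigma}$. Positive definite: $\mathbf{z}^{T}\boldsymbol{\Sigma}\mathbf{z}$ is positive for any real non-zero $\mathbf{z}$.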
12. Types of covariance. The covariance matrix has three forms, termed spherical, diagonal and full.
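
The three covariance forms can be written out for a small case. In the sketch below the 2×2 matrices and the helper names are hypothetical; it assumes the usual meanings (spherical: a multiple of the identity, diagonal: independent per-dimension variances, full: any symmetric positive definite matrix).

```python
def quadratic_form(sigma, z):
    """Compute z^T Sigma z for a square matrix given as nested lists."""
    n = len(z)
    return sum(z[i] * sigma[i][j] * z[j] for i in range(n) for j in range(n))


def num_free_parameters(kind, dim):
    """Number of free parameters in each covariance form for dimension dim."""
    counts = {"spherical": 1, "diagonal": dim, "full": dim * (dim + 1) // 2}
    return counts[kind]


spherical = [[2.0, 0.0], [0.0, 2.0]]  # sigma^2 * I
diagonal = [[2.0, 0.0], [0.0, 0.5]]   # separate variance per dimension
full = [[2.0, 0.6], [0.6, 0.5]]       # symmetric, with off-diagonal covariance

# Spot-check z^T Sigma z > 0 on a few non-zero vectors. This illustrates the
# positive definite property but does not prove it for all z.
test_vectors = [(1.0, 0.0), (0.0, 1.0), (1.0, -1.0), (-3.0, 2.0)]
for name, sigma in [("spherical", spherical), ("diagonal", diagonal), ("full", full)]:
    values = [quadratic_form(sigma, z) for z in test_vectors]
    print(name, num_free_parameters(name, 2), values)
```

The parameter counts show the trade-off: the full form is the most expressive but grows quadratically with the dimension.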
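13. Normal inverse Wishart. Defined on two variables: a mean vector $\boldsymbol{\mu}$ and a symmetric positive definite matrix $\boldsymbol{\Sigma}$. For short we write: $Pr(\boldsymbol{\mu},\boldsymbol{\Sigma}) = \text{NorIWis}_{\boldsymbol{\mu},\boldsymbol{\Sigma}}[\alpha,\boldsymbol{\Psi},\gamma,\boldsymbol{\delta}]$. It has four parameters: a positive scalar $\alpha$, a positive definite matrix $\boldsymbol{\Psi}$, a positive scalar $\gamma$, and a vector $\boldsymbol{\delta}$.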
14. Samples from normal inverse Wishart.
15. Conjugate distributions. The pairs of distributions discussed have a special relationship: they are conjugate distributions. Beta is conjugate to Bernoulli. Dirichlet is conjugate to categorical. Normal inverse gamma is conjugate to univariate normal. Normal inverse Wishart is conjugate to multivariate normal.
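16. Conjugate distributions. When we take the product of a distribution and its conjugate, the result has the same form as the conjugate. For example, consider the case where $Pr(y) = \text{Bern}_y[\lambda]$ and $Pr(\lambda) = \text{Beta}_\lambda[\alpha,\beta]$; then $\text{Bern}_y[\lambda]\cdot\text{Beta}_\lambda[\alpha,\beta] = \kappa(y,\alpha,\beta)\cdot\text{Beta}_\lambda[\tilde{\alpha},\tilde{\beta}]$, that is, a constant times a new Beta distribution.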
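
The conjugate relation for the Bernoulli and Beta pair can be checked numerically. The sketch below assumes the standard Bernoulli and Beta densities and the usual updated parameters (`alpha + y`, `beta + 1 - y`); the function names and the constant's name `kappa` are illustrative assumptions.

```python
import math


def beta_pdf(lam, alpha, beta):
    """Beta density at lam, with the normalizer computed via log-gamma."""
    log_norm = math.lgamma(alpha + beta) - math.lgamma(alpha) - math.lgamma(beta)
    return math.exp(log_norm) * lam ** (alpha - 1) * (1 - lam) ** (beta - 1)


def bernoulli_pmf(y, lam):
    """Bernoulli probability of outcome y in {0, 1}."""
    return lam ** y * (1 - lam) ** (1 - y)


def conjugate_product(y, alpha, beta):
    """Return (kappa, alpha_new, beta_new) so that
    Bern(y; lam) * Beta(lam; alpha, beta) = kappa * Beta(lam; alpha_new, beta_new)."""
    alpha_new = alpha + y
    beta_new = beta + 1 - y
    log_kappa = (
        math.lgamma(alpha + beta) - math.lgamma(alpha) - math.lgamma(beta)
        + math.lgamma(alpha_new) + math.lgamma(beta_new)
        - math.lgamma(alpha_new + beta_new)
    )
    return math.exp(log_kappa), alpha_new, beta_new


kappa, alpha_new, beta_new = conjugate_product(1, 2.0, 3.0)
for lam in (0.1, 0.5, 0.9):
    left = bernoulli_pmf(1, lam) * beta_pdf(lam, 2.0, 3.0)
    right = kappa * beta_pdf(lam, alpha_new, beta_new)
    print(lam, left, right)  # the two columns should agree
```

The constant does not depend on `lam`, which is what makes the product a scaled Beta distribution rather than a new functional form.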
17. Example proof. When we take the product of a distribution and its conjugate, the result has the same form as the conjugate.
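
The slide's own working is not preserved in this transcript, so the following is a reconstruction under assumptions: it uses the standard Bernoulli and Beta densities, writes the Bernoulli variable as $y$, and names the constant $\kappa$. It shows one way the product can be rearranged into a constant times a new Beta distribution.

```latex
\begin{align}
\text{Bern}_y[\lambda]\cdot\text{Beta}_\lambda[\alpha,\beta]
&= \lambda^{y}(1-\lambda)^{1-y}\cdot
   \frac{\Gamma(\alpha+\beta)}{\Gamma(\alpha)\Gamma(\beta)}
   \lambda^{\alpha-1}(1-\lambda)^{\beta-1} \\
&= \frac{\Gamma(\alpha+\beta)}{\Gamma(\alpha)\Gamma(\beta)}
   \lambda^{y+\alpha-1}(1-\lambda)^{1-y+\beta-1} \\
&= \frac{\Gamma(\alpha+\beta)}{\Gamma(\alpha)\Gamma(\beta)}\cdot
   \frac{\Gamma(y+\alpha)\,\Gamma(1-y+\beta)}{\Gamma(\alpha+\beta+1)}\cdot
   \text{Beta}_\lambda[y+\alpha,\;1-y+\beta] \\
&= \kappa(y,\alpha,\beta)\cdot\text{Beta}_\lambda[\tilde{\alpha},\tilde{\beta}]
\end{align}
```

Here $\tilde{\alpha} = y+\alpha$ and $\tilde{\beta} = 1-y+\beta$. The third line multiplies and divides by the normalizer of the new Beta distribution, so everything that does not depend on $\lambda$ collects into $\kappa$.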
18. Bayes' rule terminology. $Pr(y|x) = \frac{Pr(x|y)\,Pr(y)}{Pr(x)}$. Likelihood: the propensity for observing a certain value of x given a certain value of y. Prior: what we know about y before seeing x. Posterior: what we know about y after seeing x. Evidence: a constant to ensure that the left-hand side is a valid distribution.
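
The four terms can be made concrete with a small discrete example. The numbers below are made up for illustration and the function name `posterior` is hypothetical; the sketch just applies Bayes' rule to a binary world state y and a binary observation x.

```python
def posterior(prior, likelihood, x):
    """Apply Bayes' rule for a discrete world state.

    prior: dict mapping y -> Pr(y)
    likelihood: dict mapping y -> dict mapping x -> Pr(x | y)
    Returns (dict mapping y -> Pr(y | x), evidence Pr(x)).
    """
    unnormalized = {y: likelihood[y][x] * prior[y] for y in prior}
    evidence = sum(unnormalized.values())  # Pr(x), the normalizing constant
    return {y: v / evidence for y, v in unnormalized.items()}, evidence


# Illustrative numbers only.
prior = {0: 0.8, 1: 0.2}
likelihood = {0: {0: 0.9, 1: 0.1}, 1: {0: 0.3, 1: 0.7}}

post, evidence = posterior(prior, likelihood, x=1)
print(post, evidence)
```

Dividing by the evidence is what makes the posterior values sum to one.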
19. Importance of the conjugate relation 1. Learning parameters: $Pr(\theta|x) = \frac{Pr(x|\theta)\,Pr(\theta)}{\int Pr(x|\theta)\,Pr(\theta)\,d\theta}$. (1) Choose a prior that is conjugate to the likelihood. (2) This implies that the posterior must have the same form as the conjugate prior distribution. (3) The posterior must be a distribution, which implies that the evidence must equal the constant from the conjugate relation.
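
For the Bernoulli and Beta pair, learning the parameter reduces to updating two counts. The sketch below assumes a Beta prior with parameters `alpha` and `beta` and binary data; the helper name `beta_bernoulli_posterior` and the data are illustrative assumptions.

```python
def beta_bernoulli_posterior(data, alpha, beta):
    """Return the Beta posterior parameters after observing binary data.

    Each observed 1 increments alpha and each observed 0 increments beta,
    so no integral has to be evaluated explicitly.
    """
    successes = sum(data)
    failures = len(data) - successes
    return alpha + successes, beta + failures


data = [1, 0, 1, 1, 0, 1]  # illustrative observations
alpha_post, beta_post = beta_bernoulli_posterior(data, alpha=1, beta=1)
print(alpha_post, beta_post, alpha_post / (alpha_post + beta_post))
```

With a uniform Beta(1, 1) prior and four successes in six trials, the posterior is Beta(5, 3).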
20. Importance of the conjugate relation 2. Marginalizing over parameters: $Pr(x^{*}|x) = \int Pr(x^{*}|\theta)\,Pr(\theta|x)\,d\theta$. (1) The posterior $Pr(\theta|x)$ is chosen so that it is conjugate to the other term. (2) The integral becomes easy: the product becomes a constant times a distribution. Integral of a constant times a probability distribution = constant times the integral of a probability distribution = constant × 1 = constant.
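
The claim that the integral collapses to a constant can be checked for the Bernoulli and Beta pair. The sketch below compares the closed-form probability of a new success under a Beta posterior with a brute-force midpoint-rule integral; the function names and the posterior parameters (5, 3) are assumptions for illustration.

```python
import math


def beta_pdf(lam, alpha, beta):
    """Beta density at lam, with the normalizer computed via log-gamma."""
    log_norm = math.lgamma(alpha + beta) - math.lgamma(alpha) - math.lgamma(beta)
    return math.exp(log_norm) * lam ** (alpha - 1) * (1 - lam) ** (beta - 1)


def predictive_closed_form(alpha_post, beta_post):
    """Probability of a new success: the constant from the conjugate relation."""
    return alpha_post / (alpha_post + beta_post)


def predictive_numeric(alpha_post, beta_post, steps=100000):
    """Midpoint-rule estimate of the integral of lam * Beta(lam) over [0, 1]."""
    h = 1.0 / steps
    total = 0.0
    for i in range(steps):
        lam = (i + 0.5) * h
        total += lam * beta_pdf(lam, alpha_post, beta_post) * h
    return total


print(predictive_closed_form(5, 3), predictive_numeric(5, 3))
```

The two values should agree closely, and the closed form needs no numerical integration at all.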
21. Conclusions. Presented four distributions which model useful quantities. Presented four other distributions which model the parameters of the first four. They are paired in a special way: the second set is conjugate to the first. In the following material we'll see that this relationship is very useful.