Common Probability Distributions

Presentation for a reading session of Computer Vision: Models, Learning, and Inference.

  1. Chapter 3: Common Probability Distributions. Computer Vision: Models, Learning and Inference. Lukas Tencer
  2. Computer vision: models, learning and inference. ©2011 Simon J.D. Prince
  3. Why model these complicated quantities? Because we need probability distributions over model parameters as well as over data and world state. Hence, some of the distributions describe the parameters of the others.
  4. Why model these complicated quantities? (continued) Example: the univariate normal Norm_x[μ, σ²] has two parameters; the normal inverse gamma models them jointly, i.e. it models the mean μ and the variance σ².
  5. Bernoulli Distribution. Pr(x = 1) = λ and Pr(x = 0) = 1 − λ, or equivalently Pr(x) = λ^x (1 − λ)^(1−x). For short we write: Pr(x) = Bern_x[λ]. The Bernoulli distribution describes a situation with only two possible outcomes, x = 0 / x = 1 (failure/success). It takes a single parameter λ ∈ [0, 1].
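A minimal Python sketch of this using scipy.stats (the value of λ below is illustrative, not from the slides):

    # Bernoulli: two outcomes, one parameter lam = Pr(x = 1).
    from scipy.stats import bernoulli

    lam = 0.7                           # illustrative value
    print(bernoulli.pmf(1, lam))        # 0.7
    print(bernoulli.pmf(0, lam))        # 0.3 = 1 - lam
    print(bernoulli.rvs(lam, size=10))  # ten samples in {0, 1}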
  6. Beta Distribution. Defined over λ ∈ [0, 1] (i.e., the parameter of the Bernoulli): Pr(λ) = (Γ(α + β) / (Γ(α)Γ(β))) λ^(α−1) (1 − λ)^(β−1).
     • Two parameters α, β, both > 0. For short we write: Pr(λ) = Beta_λ[α, β].
     • The mean depends on the relative values: E[λ] = α / (α + β).
     • The concentration depends on the magnitude α + β.
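A hedged sketch of the Beta density over λ (the α, β values below are our own, chosen for illustration):

    # Beta distribution over lam in [0, 1]; mean is alpha / (alpha + beta).
    from scipy.stats import beta

    a, b = 4.0, 2.0                     # both > 0
    dist = beta(a, b)
    print(dist.mean())                  # 4 / 6 = 0.666...
    print(dist.pdf(0.5))                # density at lam = 0.5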
  7. Categorical Distribution. Pr(x = k) = λ_k, or we can think of the data as a vector with all elements zero except the kth, e.g. e₄ = [0, 0, 0, 1, 0]ᵀ, so that Pr(x = e_k) = λ_k. For short we write: Pr(x) = Cat_x[λ]. The categorical distribution describes a situation with K possible outcomes x = 1…K. It takes K parameters λ_k, where λ_k ≥ 0 and Σ_k λ_k = 1.
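A small sketch of sampling a categorical outcome and its one-hot encoding; numpy has no categorical object, so np.random.choice stands in (the λ values are illustrative):

    import numpy as np

    lam = np.array([0.1, 0.2, 0.3, 0.3, 0.1])   # K parameters, sum to 1
    k = np.random.choice(len(lam), p=lam)        # outcome in {0, ..., K-1}
    e_k = np.eye(len(lam), dtype=int)[k]         # one-hot vector, e.g. [0, 0, 0, 1, 0]
    print(k, e_k)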
  8. Dirichlet Distribution. Defined over K values λ₁…λ_K, where λ_k ≥ 0 and Σ_k λ_k = 1: Pr(λ₁…λ_K) = (Γ(Σ_k α_k) / Π_k Γ(α_k)) Π_k λ_k^(α_k − 1). Or for short: Pr(λ₁…λ_K) = Dir_λ[α₁…α_K]. Has K parameters α_k > 0.
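A sketch showing that Dirichlet samples are valid categorical parameter vectors (the α values are illustrative):

    import numpy as np

    alpha = np.full(5, 2.0)                      # K parameters, all > 0
    lam = np.random.dirichlet(alpha)
    print(lam, lam.sum())                        # non-negative, sums to 1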
  9. Univariate Normal Distribution. Pr(x) = (1/√(2πσ²)) exp(−(x − μ)² / (2σ²)). For short we write: Pr(x) = Norm_x[μ, σ²]. The univariate normal distribution describes a single continuous variable. It takes 2 parameters, μ and σ² > 0.
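A sketch of evaluating and sampling the univariate normal; note that scipy parameterizes by standard deviation, not variance (values illustrative):

    from scipy.stats import norm

    mu, sigma2 = 1.0, 4.0
    dist = norm(loc=mu, scale=sigma2 ** 0.5)
    print(dist.pdf(mu))                          # peak density at x = mu
    print(dist.rvs(size=3))                      # three samples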
 10. Normal Inverse Gamma Distribution. Defined on 2 variables, μ and σ² > 0: Pr(μ, σ²) = (√γ / (σ√(2π))) (β^α / Γ(α)) (1/σ²)^(α+1) exp(−(2β + γ(δ − μ)²) / (2σ²)). Or for short: Pr(μ, σ²) = NormInvGam_{μ,σ²}[α, β, γ, δ]. Four parameters: α, β, γ and δ.
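A hedged sketch of drawing (μ, σ²) by ancestral sampling, using the factorization implied by the density above: σ² ~ InvGamma(α, β), then μ | σ² ~ Norm(δ, σ²/γ) (parameter values are our own):

    from scipy.stats import invgamma, norm

    alpha, beta_, gamma, delta = 3.0, 2.0, 5.0, 0.0   # illustrative values
    sigma2 = invgamma(a=alpha, scale=beta_).rvs()     # sample the variance
    mu = norm(loc=delta, scale=(sigma2 / gamma) ** 0.5).rvs()
    print(mu, sigma2)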
 11. Multivariate Normal Distribution. Pr(x) = (2π)^(−D/2) |Σ|^(−1/2) exp(−(x − μ)ᵀ Σ⁻¹ (x − μ) / 2). For short we write: Pr(x) = Norm_x[μ, Σ]. The multivariate normal distribution describes multiple continuous variables. It takes 2 parameters:
     • a vector μ containing the mean position,
     • a symmetric "positive definite" covariance matrix Σ. Positive definite: zᵀΣz is positive for any real z ≠ 0.
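A sketch of a 2-D multivariate normal with a full covariance matrix (values illustrative):

    import numpy as np
    from scipy.stats import multivariate_normal

    mu = np.array([0.0, 1.0])
    Sigma = np.array([[2.0, 0.8],
                      [0.8, 1.0]])               # symmetric, positive definite
    dist = multivariate_normal(mean=mu, cov=Sigma)
    print(dist.pdf(mu))                          # density at the mean
    print(dist.rvs(size=2))                      # two 2-D samples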
 12. Types of covariance. The covariance matrix has three forms, termed spherical, diagonal and full: Σ_spherical = σ²I, Σ_diag = diag(σ₁², …, σ_D²), and Σ_full an arbitrary symmetric positive definite matrix.
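The three forms written out as a sketch for D = 3 (numbers illustrative):

    import numpy as np

    D = 3
    spherical = 1.5 * np.eye(D)                  # one shared variance
    diagonal = np.diag([0.5, 1.0, 2.0])          # one variance per dimension
    full = np.array([[1.0, 0.3, 0.1],            # arbitrary symmetric
                     [0.3, 1.0, 0.2],            # positive definite matrix
                     [0.1, 0.2, 1.0]])
    print(spherical, diagonal, full, sep="\n\n")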
 13. Normal Inverse Wishart. Defined on two variables: a mean vector μ and a symmetric positive definite matrix Σ. Or for short: Pr(μ, Σ) = NorIWis_{μ,Σ}[α, Ψ, γ, δ]. Has four parameters:
     • a positive scalar α,
     • a positive definite matrix Ψ,
     • a positive scalar γ,
     • a vector δ.
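A hedged sketch of drawing (μ, Σ) by ancestral sampling, assuming α plays the role of the inverse Wishart degrees of freedom and Ψ its scale matrix, with μ | Σ ~ Norm(δ, Σ/γ) (parameter values are our own):

    import numpy as np
    from scipy.stats import invwishart, multivariate_normal

    D = 2
    alpha, gamma = 5.0, 2.0                      # positive scalars
    Psi = np.eye(D)                              # positive definite matrix
    delta = np.zeros(D)                          # mean vector
    Sigma = invwishart(df=alpha, scale=Psi).rvs()
    mu = multivariate_normal(mean=delta, cov=Sigma / gamma).rvs()
    print(mu, Sigma, sep="\n")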
 14. Samples from Normal Inverse Wishart. [Figure: samples for different parameter settings; α controls the dispersion of the covariances, Ψ the average covariance, γ the dispersion of the means, δ the average of the means.]
 15. Conjugate Distributions. The pairs of distributions discussed have a special relationship: they are conjugate distributions.
     • Beta is conjugate to Bernoulli.
     • Dirichlet is conjugate to categorical.
     • Normal inverse gamma is conjugate to univariate normal.
     • Normal inverse Wishart is conjugate to multivariate normal.
 16. Conjugate Distributions. When we take the product of a distribution and its conjugate, the result has the same form as the conjugate. For example, consider the case of a Bernoulli and a Beta: Bern_x[λ] · Beta_λ[α, β] = κ(x, α, β) · Beta_λ[α + x, β + 1 − x], i.e. a constant times a new Beta distribution.
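A numeric check of this relation (a sketch; the values of α, β, x are our own). The ratio κ comes out the same whatever λ we evaluate at, confirming it is a constant:

    from scipy.stats import bernoulli, beta

    a, b, x = 2.0, 3.0, 1
    for lam in (0.2, 0.5, 0.8):
        lhs = bernoulli.pmf(x, lam) * beta(a, b).pdf(lam)
        post = beta(a + x, b + 1 - x).pdf(lam)
        print(lhs / post)                        # same constant kappa each time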
 17. Example proof. When we take the product of a distribution and its conjugate, the result has the same form as the conjugate:
     Bern_x[λ] · Beta_λ[α, β] = λ^x (1 − λ)^(1−x) · (Γ(α + β) / (Γ(α)Γ(β))) λ^(α−1) (1 − λ)^(β−1)
     = (Γ(α + β) / (Γ(α)Γ(β))) λ^(x+α−1) (1 − λ)^(1−x+β−1)
     = (Γ(α + β)Γ(x + α)Γ(1 − x + β) / (Γ(α)Γ(β)Γ(α + β + 1))) · Beta_λ[x + α, 1 − x + β].
 18. Bayes' Rule Terminology. Pr(y|x) = Pr(x|y) Pr(y) / Pr(x).
     • Likelihood Pr(x|y) – the propensity for observing a certain value of x given a certain value of y.
     • Prior Pr(y) – what we know about y before seeing x.
     • Posterior Pr(y|x) – what we know about y after seeing x.
     • Evidence Pr(x) – a constant that ensures the left hand side is a valid distribution.
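A tiny worked example with a binary world state y and a single observed x = 1 (numbers illustrative):

    prior = {0: 0.5, 1: 0.5}                     # Pr(y)
    likelihood = {0: 0.2, 1: 0.9}                # Pr(x = 1 | y)
    evidence = sum(likelihood[y] * prior[y] for y in prior)            # Pr(x = 1)
    posterior = {y: likelihood[y] * prior[y] / evidence for y in prior}
    print(posterior)                             # {0: 0.1818..., 1: 0.8181...}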
 19. Importance of the Conjugate, Relation 1: learning parameters. Pr(θ|x) = Pr(x|θ) Pr(θ) / Pr(x).
     1. Choose a prior Pr(θ) that is conjugate to the likelihood Pr(x|θ).
     2. This implies that the posterior must have the same form as the conjugate prior.
     3. The posterior must be a valid distribution, which implies that the evidence must equal the constant κ from the conjugate distribution relation.
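A sketch of what this buys us for the Bernoulli/Beta pair: learning reduces to updating counts (the prior and data below are our own):

    import numpy as np

    a, b = 1.0, 1.0                              # uniform Beta prior
    data = np.array([1, 0, 1, 1, 1, 0, 1])       # observed Bernoulli outcomes
    a_post = a + data.sum()                      # add the number of ones
    b_post = b + len(data) - data.sum()          # add the number of zeros
    print(a_post, b_post)                        # posterior is Beta(6, 3)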
 20. Importance of the Conjugate, Relation 2: marginalizing over parameters. Pr(x) = ∫ Pr(x|θ) Pr(θ) dθ.
     1. The prior is chosen so that it is conjugate to the other term.
     2. The integral then becomes easy: the product becomes a constant times a distribution, and the integral of a constant times a probability distribution = the constant times the integral of the probability distribution = the constant × 1 = the constant.
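A sketch verifying this numerically for the Bernoulli/Beta pair: the integral ∫ Bern_x[λ] Beta_λ[α, β] dλ for x = 1 collapses to the constant E[λ] = α/(α + β) (values our own):

    from scipy.integrate import quad
    from scipy.stats import bernoulli, beta

    a, b, x = 2.0, 3.0, 1
    integral, _ = quad(lambda lam: bernoulli.pmf(x, lam) * beta(a, b).pdf(lam),
                       0.0, 1.0)
    print(integral)                              # numeric integral
    print(a / (a + b))                           # closed form: 0.4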
 21. Conclusions.
     • Presented four distributions which model useful quantities.
     • Presented four other distributions which model the parameters of the first four.
     • They are paired in a special way – the second set is conjugate to the first.
     • In the following material we'll see that this relationship is very useful.
 22. Thank you for your attention. Based on: Computer vision: models, learning and inference. ©2011 Simon J.D. Prince. http://www.computervisionmodels.com/
