The Dirichlet distribution is surprisingly expressive on its own, but it can also be used as a building block for even more powerful and deep models such as mixtures and topic models.


- 1. Digging into the Dirichlet. Max Sklar (@maxsklar), New York Machine Learning Meetup, December 19th, 2013
- 2. Dedication: Meyer Marks, 1925–2013
- 3. The Dirichlet Distribution
- 4. Let’s start with something simpler
- 5. Let’s start with something simpler A Pie Chart!
- 6. Let’s start with something simpler A Pie Chart! AKA a discrete distribution, or multinomial distribution
- 7. Let’s start with something simpler A Pie Chart! K = The number of categories. K=5
- 8. Examples of Multinomial Distributions
- 9. Examples of Multinomial Distributions
- 10. Examples of Multinomial Distributions
- 11. Examples of Multinomial Distributions
- 12. Examples of Multinomial Distributions
- 13. What does the raw data look like?
- 14. What does the raw data look like? Counts!
- 15. What does the raw data look like? More specifically: K columns of counts, N rows of data. (id, # likes, # dislikes): (1, 231, 23), (2, 81, 40), (3, 67, 9), (4, 121, 14), (5, 9, 31), (6, 18, 0), (7, 1, 1)
- 16. BUT... Counts != Multinomial Distribution
- 17. BUT... We can estimate the multinomial distribution with the counts, using the maximum likelihood estimate 366 181 203
- 18. BUT... We can estimate the multinomial distribution with the counts, using the maximum likelihood estimate Sum = 366 + 181 + 203 = 750 366 181 203
- 19. BUT... We can estimate the multinomial distribution with the counts, using the maximum likelihood estimate 366 / 750 181 / 750 203 / 750 366 181 203
- 20. BUT... We can estimate the multinomial distribution with the counts, using the maximum likelihood estimate 48.8% 24.1% 27.1% 366 181 203
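The arithmetic on these slides is just count normalization; as a quick check, the percentages above come straight from dividing each count by the total:

```python
# Maximum likelihood estimate of a multinomial from raw counts:
# divide each count by the total, exactly as on the slides.
counts = [366, 181, 203]
total = sum(counts)                       # 750
mle = [c / total for c in counts]
percentages = [round(100 * p, 1) for p in mle]
print(total)        # 750
print(percentages)  # [48.8, 24.1, 27.1]
```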
- 21. BUT... Uh Oh: alongside (366, 181, 203), a new row arrives: (1, 2, 1)
- 22. BUT... This column will be all Yellow right? Another row: (0, 1, 0)
- 23. BUT... Panic!!!! A row of all zeros: (0, 0, 0)
- 24. Bayesian Statistics to the Rescue
- 25. Bayesian Statistics to the Rescue Still assume each row was generated by a multinomial distribution
- 26. Bayesian Statistics to the Rescue Still assume each row was generated by a multinomial distribution We just don’t know which one!
- 27. The Dirichlet Distribution Is a probability distribution over all possible multinomial distributions, p.
- 28. The Dirichlet Distribution ? Represents our uncertainty over the actual distribution that created the row. ?
- 29. The Dirichlet Distribution. p: represents a multinomial distribution; alpha: the parameters of the Dirichlet; K: the number of categories
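The slides don't spell the formula out in text, but the density being described is the standard Dirichlet, with the multivariate Beta function as its normalizing constant. A minimal sketch for evaluating it (function name is mine, not from the talk):

```python
import math

def dirichlet_log_pdf(p, alpha):
    """Log-density of the Dirichlet at a multinomial p (entries > 0,
    summing to 1): log Gamma(sum alpha) - sum log Gamma(alpha_k)
    plus sum (alpha_k - 1) * log p_k."""
    log_norm = math.lgamma(sum(alpha)) - sum(math.lgamma(a) for a in alpha)
    return log_norm + sum((a - 1) * math.log(x) for a, x in zip(alpha, p))
```

With alpha = (1, 1, 1) the distribution is uniform over the simplex, so the density is the constant 2 everywhere (log 2 in log space).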
- 30. Bayesian Updates
- 31. Bayesian Updates Also a Dirichlet!
- 32. Bayesian Updates Also a Dirichlet! (Conjugate Prior)
- 33. Bayesian Updates
- 34. Bayesian Updates
- 35. Bayesian Updates +1
- 36. Why Does this Work? Let’s look at it again.
- 37. Entropy
- 38. Entropy Information Content
- 39. Entropy Information Content Energy
- 40. Entropy Information Content Energy Log Likelihood
- 41. Entropy Information Content Energy Log Likelihood
- 42. The Dirichlet Distribution
- 43. The Dirichlet Distribution Normalizing Constant
- 44. The Dirichlet Distribution Normalizing Constant
- 45. The Dirichlet Distribution
- 46. The Dirichlet Distribution
- 47. The Dirichlet Distribution
- 48. The Dirichlet Distribution Linear
- 49. The Dirichlet MACHINE Prior 1.2 3.0 0.3
- 50. The Dirichlet MACHINE Prior 1.2 3.0 0.3
- 51. The Dirichlet MACHINE Update 2.2 3.0 0.3
- 52. The Dirichlet MACHINE Prior 2.2 3.0 0.3
- 53. The Dirichlet MACHINE Update 2.2 3.0 1.3
- 54. The Dirichlet MACHINE Prior 2.2 3.0 1.3
- 55. The Dirichlet MACHINE Update 2.2 3.0 2.3
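The MACHINE slides above walk through conjugate updates starting from the prior (1.2, 3.0, 0.3): each observation simply adds 1 to the observed category's parameter. A minimal sketch (the function name is mine):

```python
def dirichlet_update(alpha, category):
    """Conjugate Bayesian update: observing one draw from `category`
    adds 1 to that category's pseudo-count, and the posterior is
    again a Dirichlet."""
    posterior = list(alpha)
    posterior[category] += 1
    return posterior

alpha = [1.2, 3.0, 0.3]             # prior from the slides
alpha = dirichlet_update(alpha, 0)  # slide 51: (2.2, 3.0, 0.3)
alpha = dirichlet_update(alpha, 2)  # slide 53: (2.2, 3.0, 1.3)
alpha = dirichlet_update(alpha, 2)  # slide 55: (2.2, 3.0, 2.3)
```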
- 56. Interpreting the Parameters 1 3 2 What does this alpha vector really mean?
- 57. Interpreting the Parameters: alpha = (1, 3, 2); sum = 6; normalized: (1/6, 3/6, 2/6)
- 58. Interpreting the Parameters: alpha = (1, 3, 2), sum = 6. The normalized vector is the Expected Value
- 59. Interpreting the Parameters: alpha = (1, 3, 2), sum = 6. The sum is the Weight
- 60. ANALOGY: Normal Distribution Precision = 1 / variance
- 61. ANALOGY: Normal Distribution High precision: data is close to the mean Low precision: far away from the mean
- 62. Interpreting the Parameters: alpha = (1, 3, 2), sum = 6. The sum plays the role of Precision
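The two interpretations above are one line of arithmetic each: normalize alpha for the expected multinomial, and read the sum as the weight/precision:

```python
# Interpreting alpha = (1, 3, 2): the normalized vector is the
# expected multinomial; the sum acts as the weight (precision).
alpha = [1, 3, 2]
weight = sum(alpha)                     # 6
expected = [a / weight for a in alpha]  # [1/6, 3/6, 2/6]
```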
- 63. High Weight Dirichlet 0.4 0.4 0.2
- 64. Low Weight Dirichlet 0.4 0.4 0.2
- 65. Urn Model At each step, pick a ball from the urn. Replace it, and add another ball of that color into the urn
- 66. Urn Model At each step, pick a ball from the urn. Replace it, and add another ball of that color into the urn
- 67. Urn Model At each step, pick a ball from the urn. Replace it, and add another ball of that color into the urn
- 68. Urn Model At each step, pick a ball from the urn. Replace it, and add another ball of that color into the urn
- 69. Urn Model At each step, pick a ball from the urn. Replace it, and add another ball of that color into the urn
- 70. Urn Model Rich get richer...
- 71. Urn Model Rich get richer...
- 72. Urn Model Finally yellow catches a break
- 73. Urn Model Finally yellow catches a break
- 74. Urn Model But it’s too late...
- 75. Urn Model As the urn gets more populated, the distribution gets “stuck” in place.
- 76. Urn Model Once lots of data has been collected, or the Dirichlet has high precision, it’s hard to overturn that with new data
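The urn process on these slides (the Pólya urn) is easy to simulate; a minimal sketch, with the function and parameter names mine:

```python
import random

def polya_urn(initial_counts, steps, seed=0):
    """Polya urn: draw a ball, put it back, and add one more of the
    same color. Early draws dominate (rich get richer), and the color
    proportions get 'stuck' as the urn fills up."""
    rng = random.Random(seed)
    urn = list(initial_counts)
    for _ in range(steps):
        # pick a color with probability proportional to its count
        color = rng.choices(range(len(urn)), weights=urn)[0]
        urn[color] += 1  # replace, plus one extra of the same color
    return urn
```

The initial counts play the role of the Dirichlet parameters: the limiting color proportions of the urn are themselves a draw from a Dirichlet with those parameters.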
- 77. Chinese Restaurant Process When you find the white ball, throw a new color into the mix.
- 78. Chinese Restaurant Process When you find the white ball, throw a new color into the mix.
- 79. Chinese Restaurant Process When you find the white ball, throw a new color into the mix.
- 80. Chinese Restaurant Process When you find the white ball, throw a new color into the mix.
- 81. Chinese Restaurant Process When you find the white ball, throw a new color into the mix.
- 82. Chinese Restaurant Process When you find the white ball, throw a new color into the mix.
- 83. Chinese Restaurant Process When you find the white ball, throw a new color into the mix.
- 84. Chinese Restaurant Process When you find the white ball, throw a new color into the mix.
- 85. Chinese Restaurant Process The expected infinite distribution (mean) is exponential. # of white balls controls the exponent
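The white-ball mechanic described above can be simulated the same way; a minimal sketch (names and the `new_color_weight` parameter are my labels for the number of white balls, not the talk's):

```python
import random

def crp(n, new_color_weight=1.0, seed=0):
    """Chinese Restaurant Process via the urn: with probability
    proportional to `new_color_weight` (the white balls), the next
    draw starts a brand-new color; otherwise it joins an existing
    color with probability proportional to that color's count."""
    rng = random.Random(seed)
    counts = []  # balls per color seen so far
    for _ in range(n):
        choice = rng.choices(range(len(counts) + 1),
                             weights=counts + [new_color_weight])[0]
        if choice == len(counts):
            counts.append(1)      # white ball: a new color enters the mix
        else:
            counts[choice] += 1   # rich get richer among existing colors
    return counts
```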
- 86. So coming back to the count data: What Dirichlet parameters explain the data? Rows of counts: (20, 0, 0), (2, 1, 17), (14, 6, 0), (15, 5, 0), (0, 20, 0), (0, 14, 6)
- 87. So coming back to the count data: Newton’s Method: Requires Gradient + Hessian
- 88. So coming back to the count data: Reads all of the data...
- 89. So coming back to the count data: https://github.com/maxsklar/BayesPy/tree/master/ConjugatePriorTools
- 90. So coming back to the count data: Compress the data into a Matrix and a Vector: Works for lots of sparsely populated rows
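The slides fit alpha with Newton's method (gradient plus Hessian, as in the linked BayesPy code). As an illustrative alternative, Minka's fixed-point iteration needs only the digamma function; everything below, including the digamma approximation, is a sketch of mine and not the repo's actual code:

```python
import math

def digamma(x):
    """Digamma via the recurrence psi(x) = psi(x+1) - 1/x plus an
    asymptotic series; assumes x > 0, plenty accurate for fitting."""
    r = 0.0
    while x < 6.0:
        r -= 1.0 / x
        x += 1.0
    f = 1.0 / (x * x)
    return r + math.log(x) - 0.5 / x - f * (1.0/12 - f * (1.0/120 - f / 252))

def fit_dirichlet(rows, iterations=200):
    """Minka-style fixed-point iteration for the Dirichlet MLE over
    count rows. Assumes every category appears in at least one row."""
    K, N = len(rows[0]), len(rows)
    totals = [sum(r) for r in rows]
    alpha = [1.0] * K
    for _ in range(iterations):
        s = sum(alpha)
        denom = sum(digamma(t + s) for t in totals) - N * digamma(s)
        alpha = [a * (sum(digamma(r[k] + a) for r in rows)
                      - N * digamma(a)) / denom
                 for k, a in enumerate(alpha)]
    return alpha
```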
- 91. The Compression MACHINE: start from a table of zeros
- 92. The Compression MACHINE: K = 3 (the 4th row is a special, total row), M = 6 (the maximum # of samples per input). Table: all zeros
- 93. The Compression MACHINE: input row (1, 0, 0)
- 94. The Compression MACHINE: table is now [1 0 0 0 0 0; 0 0 0 0 0 0; 0 0 0 0 0 0; 1 0 0 0 0 0]
- 95. The Compression MACHINE: input row (1, 3, 2)
- 96. The Compression MACHINE: table is now [2 0 0 0 0 0; 1 1 1 0 0 0; 1 1 0 0 0 0; 2 1 1 1 1 1]
- 97. The Compression MACHINE: input row (0, 6, 0)
- 98. The Compression MACHINE: table is now [2 0 0 0 0 0; 2 2 2 1 1 1; 1 1 0 0 0 0; 3 2 2 2 2 2]
- 99. The Compression MACHINE: input row (2, 1, 1)
- 100. The Compression MACHINE: table is now [3 1 0 0 0 0; 3 2 2 1 1 1; 2 1 0 0 0 0; 4 3 3 3 2 2]
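My reconstruction of what the machine computes, which reproduces the tables above: entry [k][j] counts the rows with more than j samples in category k, with the extra last row doing the same for row totals. This helps because psi(n + a) - psi(a) = sum over j < n of 1/(a + j), so the fitting gradient only needs these tallies, not the raw rows:

```python
def compress(rows):
    """Compress N rows of K counts into a (K+1) x M tally table:
    table[k][j] = number of rows with more than j samples in
    category k; table[K][j] does the same for the row totals."""
    K = len(rows[0])
    M = max(sum(r) for r in rows)  # maximum # of samples per input
    table = [[0] * M for _ in range(K + 1)]
    for row in rows:
        for k in range(K):
            for j in range(row[k]):
                table[k][j] += 1   # this row has more than j in category k
        for j in range(sum(row)):
            table[K][j] += 1       # total row
    return table
```

Feeding in the four input rows from the slides reproduces the final table on slide 100.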
- 101. DEMO
- 102. Our Popularity Prior
- 103. Dirichlet Mixture Models Anything you can do with a Gaussian, you can also do with a Dirichlet
- 104. Dirichlet Mixture Models Example: Mixture of Gaussians using Expectation-Maximization
- 105. Dirichlet Mixture Models Assume each row is a mixture of multinomials. And the parameters of that mixture are pulled from a Dirichlet. ?
- 106. Dirichlet Mixture Models Latent Dirichlet Allocation Topic Model
- 107. Questions
