Slideshare uses cookies to improve functionality and performance, and to provide you with relevant advertising. If you continue browsing the site, you agree to the use of cookies on this website. See our User Agreement and Privacy Policy.

Slideshare uses cookies to improve functionality and performance, and to provide you with relevant advertising. If you continue browsing the site, you agree to the use of cookies on this website. See our Privacy Policy and User Agreement for details.

Successfully reported this slideshow.

Like this presentation? Why not share!

- Variational Inference by Tushar Tank 1824 views
- Non-linear density estimation using... by Arthur Breitman 735 views
- Machine Learning by Arthur Breitman 2087 views
- Gibbs cloner を用いた組み合わせ最適化と cross-en... by Takeshi Nagae 3193 views
- Deep Reinforcement Learning Through... by Jack Clark 2623 views
- 正則化つき線形モデル(「入門機械学習第6章」より) by Eric Sartre 43624 views

The cross-entropy method is a parametric technique performing adaptive importance sampling in Monte-Carlo methods. Information about the distribution being integrated is collected as samples are generated from a parametric sampling distribution. The parameters of this distribution are iteratively updated to minimize the cross-entropy between the sample of the distribution of interest and the sampling distribution. This versatile adaptive technique has found many applications in rare even simulation, combinatorial optimization and optimization of functions with multiple extrema. Through a series of use cases, I'll present a quick practioner's guide to using the cross-entropy method and will discuss common tricks and pitfalls.

No Downloads

Total views

2,014

On SlideShare

0

From Embeds

0

Number of Embeds

1

Shares

0

Downloads

71

Comments

4

Likes

1

No notes for slide

- 1. What is cross-entropy? From Riemann to Monte-Carlo Cross-Entropy techniques Cross-Entropy tricks QuestionsUsing cross-entropy techniques for rare event simulation and optimization Arthur Breitman NYC Machine learning meetup August 18, 2011 Arthur Breitman crossentropy for rare event simulation and optimization
- 2. What is cross-entropy? From Riemann to Monte-Carlo Entropy Cross-Entropy techniques Kullback-Leibler divergence Cross-Entropy tricks QuestionsOutline What is cross-entropy? Entropy Kullback-Leibler divergence From Riemann to Monte-Carlo Riemann integration Monte-Carlo integration Importance sampling Cross-Entropy techniques Analytical expressions Simulation of rare events Optimization Fitting parameters Cross-Entropy tricks Multiple maxima Slow convergence Questions Arthur Breitman crossentropy for rare event simulation and optimization
- 3. What is cross-entropy? From Riemann to Monte-Carlo Entropy Cross-Entropy techniques Kullback-Leibler divergence Cross-Entropy tricks QuestionsInformation entropy deﬁnition of information entropy ◮ Entropy measures disorder of a physical system ◮ Entropy measures information (Shannon) ◮ Entropy measures ignorance (E.T. Jaynes) ◮ Formally: H=− p(x) ln(p(x)) x∈Ω Arthur Breitman crossentropy for rare event simulation and optimization
- 4. What is cross-entropy? From Riemann to Monte-Carlo Entropy Cross-Entropy techniques Kullback-Leibler divergence Cross-Entropy tricks QuestionsThe continuous case In the continuous case, for a random variable X with p.d.f p(x) entropy is deﬁned as H(X ) = − P(x) ln(p(x))dx Ω Simple, right? Arthur Breitman crossentropy for rare event simulation and optimization
- 5. What is cross-entropy? From Riemann to Monte-Carlo Entropy Cross-Entropy techniques Kullback-Leibler divergence Cross-Entropy tricks QuestionsThe entropy of a probability distribution is meaningless Wrong! ◮ Not invariant under a change of variable ◮ Can even be negative! ◮ Not an extension of Shannon’s entropy. Arthur Breitman crossentropy for rare event simulation and optimization
- 6. What is cross-entropy? From Riemann to Monte-Carlo Entropy Cross-Entropy techniques Kullback-Leibler divergence Cross-Entropy tricks QuestionsE.T. Jaynes to the rescue E.T. Jaynes, adjusted the deﬁnition. Consider a sequence of discrete values in Ω dense in Ω, it must a approach a distribution m. Set p(x) H(X ) = − P(x) ln dx Ω m(x) N.B. m is not necessarily a probability distribution, just a density, so improper priors are O.K. Arthur Breitman crossentropy for rare event simulation and optimization
- 7. What is cross-entropy? From Riemann to Monte-Carlo Entropy Cross-Entropy techniques Kullback-Leibler divergence Cross-Entropy tricks QuestionsOutline What is cross-entropy? Entropy Kullback-Leibler divergence From Riemann to Monte-Carlo Riemann integration Monte-Carlo integration Importance sampling Cross-Entropy techniques Analytical expressions Simulation of rare events Optimization Fitting parameters Cross-Entropy tricks Multiple maxima Slow convergence Questions Arthur Breitman crossentropy for rare event simulation and optimization
- 8. What is cross-entropy? From Riemann to Monte-Carlo Entropy Cross-Entropy techniques Kullback-Leibler divergence Cross-Entropy tricks QuestionsDeﬁnition of KL divergence Kullback-Leibler divergence: entropy of a probability distribution p relative to probability distribution q p(x) DKL (P||Q) = − P(x) ln dx Ω q(x) ◮ Similar but distinct from entropy. ◮ Expected number of nats (or bits) to encode data drawn from Q assuming it is drawn from P. ◮ Not symmetric! Arthur Breitman crossentropy for rare event simulation and optimization
- 9. What is cross-entropy? From Riemann to Monte-Carlo Entropy Cross-Entropy techniques Kullback-Leibler divergence Cross-Entropy tricks QuestionsWhy code length matter ◮ All ML problems ⇔ ﬁtting a probability distribution ◮ KL divergence measures how concise your description is ◮ Relates to MDL and Solomonoﬀ induction ◮ PAC-learning patches against a lack of epistemology Arthur Breitman crossentropy for rare event simulation and optimization
- 10. What is cross-entropy? From Riemann to Monte-Carlo Entropy Cross-Entropy techniques Kullback-Leibler divergence Cross-Entropy tricks QuestionsLikelihood of parameters and Cross-Entropy Given a sample {q}i of Q, and {P}θ∈Θ , 1 LL(θ|{q}i ) = H(Pθ ) + DKL Pθ δqi N i The likelihood of θ is the KL-divergence of Pθ w.r.t a Dirac comb. Arthur Breitman crossentropy for rare event simulation and optimization
- 11. What is cross-entropy? From Riemann to Monte-Carlo Riemann integration Cross-Entropy techniques Monte-Carlo integration Cross-Entropy tricks Importance sampling QuestionsOutline What is cross-entropy? Entropy Kullback-Leibler divergence From Riemann to Monte-Carlo Riemann integration Monte-Carlo integration Importance sampling Cross-Entropy techniques Analytical expressions Simulation of rare events Optimization Fitting parameters Cross-Entropy tricks Multiple maxima Slow convergence Questions Arthur Breitman crossentropy for rare event simulation and optimization
- 12. What is cross-entropy? From Riemann to Monte-Carlo Riemann integration Cross-Entropy techniques Monte-Carlo integration Cross-Entropy tricks Importance sampling QuestionsRiemann integration How does one compute the integral of a function? Rectangle method: b N−1 1 i f (x)dx → f a + (b − a) a N N i=0 Linear convergence. Arthur Breitman crossentropy for rare event simulation and optimization
- 13. What is cross-entropy? From Riemann to Monte-Carlo Riemann integration Cross-Entropy techniques Monte-Carlo integration Cross-Entropy tricks Importance sampling QuestionsThe curse of dimensionality Multiple dimensions? b1 bm N−1 N−1 1 1 ··· f (x)dx → m ··· f a+ i ◦ (b − a) a1 am N N i1 =0 im =0 Computation is exponential in m. Arthur Breitman crossentropy for rare event simulation and optimization
- 14. What is cross-entropy? From Riemann to Monte-Carlo Riemann integration Cross-Entropy techniques Monte-Carlo integration Cross-Entropy tricks Importance sampling QuestionsOutline What is cross-entropy? Entropy Kullback-Leibler divergence From Riemann to Monte-Carlo Riemann integration Monte-Carlo integration Importance sampling Cross-Entropy techniques Analytical expressions Simulation of rare events Optimization Fitting parameters Cross-Entropy tricks Multiple maxima Slow convergence Questions Arthur Breitman crossentropy for rare event simulation and optimization
- 15. What is cross-entropy? From Riemann to Monte-Carlo Riemann integration Cross-Entropy techniques Monte-Carlo integration Cross-Entropy tricks Importance sampling QuestionsMonte-Carlo integration If P is a probability distribution over Ω, draw {x}i from P: N 1 f (xi ) f (x)dx ∼ Ω N p(xi ) i=1 Very simple to implement, often p ∼ 1 Arthur Breitman crossentropy for rare event simulation and optimization
- 16. What is cross-entropy? From Riemann to Monte-Carlo Riemann integration Cross-Entropy techniques Monte-Carlo integration Cross-Entropy tricks Importance sampling QuestionsMonte-Carlo convergence ◮ Let random variable Xp = f (x)/p(x) ◮ If var(Xp ) < ∞, convergence is O(N 1/2 ) by the central-limit theorem! ◮ If m > 2, Monte-Carlo becomes attractive. Arthur Breitman crossentropy for rare event simulation and optimization
- 17. What is cross-entropy? From Riemann to Monte-Carlo Riemann integration Cross-Entropy techniques Monte-Carlo integration Cross-Entropy tricks Importance sampling QuestionsProblems with MC ◮ If the mass of f is concentrated in a small region, convergence can be very slow. ◮ also a problem with Riemann integration... Arthur Breitman crossentropy for rare event simulation and optimization
- 18. What is cross-entropy? From Riemann to Monte-Carlo Riemann integration Cross-Entropy techniques Monte-Carlo integration Cross-Entropy tricks Importance sampling QuestionsOutline What is cross-entropy? Entropy Kullback-Leibler divergence From Riemann to Monte-Carlo Riemann integration Monte-Carlo integration Importance sampling Cross-Entropy techniques Analytical expressions Simulation of rare events Optimization Fitting parameters Cross-Entropy tricks Multiple maxima Slow convergence Questions Arthur Breitman crossentropy for rare event simulation and optimization
- 19. What is cross-entropy? From Riemann to Monte-Carlo Riemann integration Cross-Entropy techniques Monte-Carlo integration Cross-Entropy tricks Importance sampling QuestionsImportance sampling ◮ Sample preferably the regions of interest by picking p to minimize the variance of f /p ◮ In Riemann world, equivalent to an irregular grid f ◮ Ideal sampling distribution (if f > 0) is f , but we don’t know f! ◮ Best convergence when χ2 of f w.r.t p is minimized Arthur Breitman crossentropy for rare event simulation and optimization
- 20. What is cross-entropy? From Riemann to Monte-Carlo Riemann integration Cross-Entropy techniques Monte-Carlo integration Cross-Entropy tricks Importance sampling QuestionsAdaptive importance sampling ◮ What if we don’t know the shape of f ? ◮ Learn it adaptively from the sampling. ◮ Iteratively improve the importance sampling function. Arthur Breitman crossentropy for rare event simulation and optimization
- 21. What is cross-entropy? From Riemann to Monte-Carlo Riemann integration Cross-Entropy techniques Monte-Carlo integration Cross-Entropy tricks Importance sampling QuestionsVegas algorithm and cross-entropy ◮ Vegas algorithm, use histograms and separate variables ◮ Cross-entropy algorithm, pick p from a family of distributions to minimize cross-entropy to the sample Arthur Breitman crossentropy for rare event simulation and optimization
- 22. What is cross-entropy? Analytical expressions From Riemann to Monte-Carlo Simulation of rare events Cross-Entropy techniques Optimization Cross-Entropy tricks Fitting parameters QuestionsOutline What is cross-entropy? Entropy Kullback-Leibler divergence From Riemann to Monte-Carlo Riemann integration Monte-Carlo integration Importance sampling Cross-Entropy techniques Analytical expressions Simulation of rare events Optimization Fitting parameters Cross-Entropy tricks Multiple maxima Slow convergence Questions Arthur Breitman crossentropy for rare event simulation and optimization
- 23. What is cross-entropy? Analytical expressions From Riemann to Monte-Carlo Simulation of rare events Cross-Entropy techniques Optimization Cross-Entropy tricks Fitting parameters QuestionsWhy cross-entropy? In many cases, the expression is analytical and computationally cheap to derive, e.g. ◮ the uniform distribution ◮ the categorical distribution (ﬁnite, discrete) ◮ all the natural exponential family Arthur Breitman crossentropy for rare event simulation and optimization
- 24. What is cross-entropy? Analytical expressions From Riemann to Monte-Carlo Simulation of rare events Cross-Entropy techniques Optimization Cross-Entropy tricks Fitting parameters QuestionsThe natural exponential distribution? fX (x|θ) = h(x) exp (θ∗ x − A(θ)) ◮ theta is the suﬃcient statistic ◮ maximum cross-entropy distribution given θ w.r.t dH ◮ Examples: normal, multivariate normal, gamma, binomial, multinomial, negative binomial Arthur Breitman crossentropy for rare event simulation and optimization
- 25. What is cross-entropy? Analytical expressions From Riemann to Monte-Carlo Simulation of rare events Cross-Entropy techniques Optimization Cross-Entropy tricks Fitting parameters QuestionsBeta distribution Not analytical! To ﬁt, start with approximate values from the moment’s method ¯ ¯ X (1 − X ¯ ¯ X (1 − X ¯ α=X ¯ − 1 , β = (1 − X ) −1 S2 S2 The likelihood is given by n n n(ln(Γ(α+β)−ln(Γ(α)−ln(Γ(β))+(α−1) ln(Xi )+(β−1) ln(1−Xi ) i=0 i=0 The ﬁrst and second derivatives are the digamma and trigamma function, available in the gsl. Newton’s method using the Jacobian converges in a couple iterations. Very useful to model bounded variables. Arthur Breitman crossentropy for rare event simulation and optimization
- 26. What is cross-entropy? Analytical expressions From Riemann to Monte-Carlo Simulation of rare events Cross-Entropy techniques Optimization Cross-Entropy tricks Fitting parameters QuestionsOutline What is cross-entropy? Entropy Kullback-Leibler divergence From Riemann to Monte-Carlo Riemann integration Monte-Carlo integration Importance sampling Cross-Entropy techniques Analytical expressions Simulation of rare events Optimization Fitting parameters Cross-Entropy tricks Multiple maxima Slow convergence Questions Arthur Breitman crossentropy for rare event simulation and optimization
- 27. What is cross-entropy? Analytical expressions From Riemann to Monte-Carlo Simulation of rare events Cross-Entropy techniques Optimization Cross-Entropy tricks Fitting parameters QuestionsSurviving the zombie hordes Figure: Electric fences, the horde and you Arthur Breitman crossentropy for rare event simulation and optimization
- 28. What is cross-entropy? Analytical expressions From Riemann to Monte-Carlo Simulation of rare events Cross-Entropy techniques Optimization Cross-Entropy tricks Fitting parameters QuestionsSimulating zombie breakouts ◮ Each fence (Ui , λi ) delivers u ∼ max(Ui − Exp(λi ), 0) volts. ◮ Crossing a fence deals u damage to a zombie ◮ Zombies come from everywhere and can take 5 damage hits each. ◮ Zombies outbreaks are very rare! Arthur Breitman crossentropy for rare event simulation and optimization
- 29. What is cross-entropy? Analytical expressions From Riemann to Monte-Carlo Simulation of rare events Cross-Entropy techniques Optimization Cross-Entropy tricks Fitting parameters QuestionsMere integration fails! ◮ We can estimate this probability by sampling the random voltages and ﬁnding a shortest path. ◮ Speed of Monte-Carlo proportional to poutbreak (1 − poutbreak ), too slow! Arthur Breitman crossentropy for rare event simulation and optimization
- 30. What is cross-entropy? Analytical expressions From Riemann to Monte-Carlo Simulation of rare events Cross-Entropy techniques Optimization Cross-Entropy tricks Fitting parameters QuestionsCross-Entropy to the rescue ◮ We want to approximate the multivariate power distribution conditional on an outbreak occurring! ◮ Approximate the shape by changing the parameters Ui and λi for each fence ◮ Generate samples, ﬁt Ui and λi on the samples inducing an outbreak Arthur Breitman crossentropy for rare event simulation and optimization
- 31. What is cross-entropy? Analytical expressions From Riemann to Monte-Carlo Simulation of rare events Cross-Entropy techniques Optimization Cross-Entropy tricks Fitting parameters QuestionsThe elite sample What if the probability is so low that we don’t observe any outbreak in our sample? ◮ Generate n samplings using the sampling distribution ◮ If more than e samples are outbreaks, ﬁt to those samples, break ◮ Otherwise, ﬁt on the e best sample, the elite sample. ◮ Iterate ◮ Generate a sample, weight each points by the importance sampling weight, estimate probability Arthur Breitman crossentropy for rare event simulation and optimization
- 32. What is cross-entropy? Analytical expressions From Riemann to Monte-Carlo Simulation of rare events Cross-Entropy techniques Optimization Cross-Entropy tricks Fitting parameters QuestionsOther examples ◮ Modeling rare event for any complex probability distribution, e.g. Bayesian networks. ◮ Estimating tails for the sum of fat-tailed distributions Arthur Breitman crossentropy for rare event simulation and optimization
- 33. What is cross-entropy? Analytical expressions From Riemann to Monte-Carlo Simulation of rare events Cross-Entropy techniques Optimization Cross-Entropy tricks Fitting parameters QuestionsOutline What is cross-entropy? Entropy Kullback-Leibler divergence From Riemann to Monte-Carlo Riemann integration Monte-Carlo integration Importance sampling Cross-Entropy techniques Analytical expressions Simulation of rare events Optimization Fitting parameters Cross-Entropy tricks Multiple maxima Slow convergence Questions Arthur Breitman crossentropy for rare event simulation and optimization
- 34. What is cross-entropy? Analytical expressions From Riemann to Monte-Carlo Simulation of rare events Cross-Entropy techniques Optimization Cross-Entropy tricks Fitting parameters QuestionsFrom integration to optimization Using an elite sample to help convergence is a trick that does a form of hill climbing of a smooth function approximating the indicator function of the rare event. ◮ Interesting even if not interested in integrating f . ◮ Keep iterating based on an elite sample to converge towards one global maximum. ◮ variance of the sampling distribution follows the curvature of f. ◮ e.g. using a multivariate normal allows the covariance to reﬂect the diﬀerential Arthur Breitman crossentropy for rare event simulation and optimization
- 35. What is cross-entropy? Analytical expressions From Riemann to Monte-Carlo Simulation of rare events Cross-Entropy techniques Optimization Cross-Entropy tricks Fitting parameters QuestionsCombinatorial optimization One classical example if combinatorial optimization. To solve a TSP with Cross-Entropy: ◮ Assume the travel is a Markov chain on the graph nodes. ◮ Generate travels by coercing them to be permutations. ◮ Update transition probabilities from the elite sample. Arthur Breitman crossentropy for rare event simulation and optimization
- 36. What is cross-entropy? Analytical expressions From Riemann to Monte-Carlo Simulation of rare events Cross-Entropy techniques Optimization Cross-Entropy tricks Fitting parameters QuestionsClustering CE does clustering too! ◮ Assign probabilities of membership to classes for each point (the sampling distribution). ◮ Sample random membership assignments. ◮ Use average distance to centroids to ﬁnd an elite sample. ◮ Slower than K-means but much less sensitive to initial choice of centroids. Arthur Breitman crossentropy for rare event simulation and optimization
- 37. What is cross-entropy? Analytical expressions From Riemann to Monte-Carlo Simulation of rare events Cross-Entropy techniques Optimization Cross-Entropy tricks Fitting parameters QuestionsA form of global optimization Is it global optimization? ◮ If the sampling distribution is bounded below by a distribution that covers the global maximum, yes, with probability 1! ◮ In practice we may never see one maximum and converge to another local maximum. Arthur Breitman crossentropy for rare event simulation and optimization
- 38. What is cross-entropy? Analytical expressions From Riemann to Monte-Carlo Simulation of rare events Cross-Entropy techniques Optimization Cross-Entropy tricks Fitting parameters QuestionsOutline What is cross-entropy? Entropy Kullback-Leibler divergence From Riemann to Monte-Carlo Riemann integration Monte-Carlo integration Importance sampling Cross-Entropy techniques Analytical expressions Simulation of rare events Optimization Fitting parameters Cross-Entropy tricks Multiple maxima Slow convergence Questions Arthur Breitman crossentropy for rare event simulation and optimization
- 39. What is cross-entropy? Analytical expressions From Riemann to Monte-Carlo Simulation of rare events Cross-Entropy techniques Optimization Cross-Entropy tricks Fitting parameters QuestionsFitting model parameters with CE Cross-Entropy techniques work generally very well for ﬁnding ML parameters of a model. Why? ◮ Models often have diﬀerent sensitivities to diﬀerent parameters, CE reﬂects that. ◮ With a covariance structure, it does a form of gradient ascent. ◮ But it can deal with discrete parameters at the same time! ◮ It does not tend to get trapped in local maxima. ◮ Well suited for high-dimensional parameter spaces. Arthur Breitman crossentropy for rare event simulation and optimization
- 40. What is cross-entropy? From Riemann to Monte-Carlo Multiple maxima Cross-Entropy techniques Slow convergence Cross-Entropy tricks QuestionsOutline What is cross-entropy? Entropy Kullback-Leibler divergence From Riemann to Monte-Carlo Riemann integration Monte-Carlo integration Importance sampling Cross-Entropy techniques Analytical expressions Simulation of rare events Optimization Fitting parameters Cross-Entropy tricks Multiple maxima Slow convergence Questions Arthur Breitman crossentropy for rare event simulation and optimization
- 41. What is cross-entropy? From Riemann to Monte-Carlo Multiple maxima Cross-Entropy techniques Slow convergence Cross-Entropy tricks QuestionsForgetting maxima Some maxima can be ”forgotten” ◮ Smooth changes in the sampling function. ◮ Expand the sampling function (equivalent to applying a prior or ”shrinkage”). ◮ Keep the entire sample Arthur Breitman crossentropy for rare event simulation and optimization
- 42. What is cross-entropy? From Riemann to Monte-Carlo Multiple maxima Cross-Entropy techniques Slow convergence Cross-Entropy tricks QuestionsNot converging to a maximum Multiple maxima may prevent variance of the sampling from decreasing. ◮ Mixtures of multivariate normals can deal with this. ◮ They can be introduced dynamically. ◮ Fit with EM. Arthur Breitman crossentropy for rare event simulation and optimization
- 43. What is cross-entropy? From Riemann to Monte-Carlo Multiple maxima Cross-Entropy techniques Slow convergence Cross-Entropy tricks QuestionsOutline What is cross-entropy? Entropy Kullback-Leibler divergence From Riemann to Monte-Carlo Riemann integration Monte-Carlo integration Importance sampling Cross-Entropy techniques Analytical expressions Simulation of rare events Optimization Fitting parameters Cross-Entropy tricks Multiple maxima Slow convergence Questions Arthur Breitman crossentropy for rare event simulation and optimization
- 44. What is cross-entropy? From Riemann to Monte-Carlo Multiple maxima Cross-Entropy techniques Slow convergence Cross-Entropy tricks QuestionsIndependent variables If the sampling distribution is separable, convergence can be sped up by sampling over one dimension at a time. Arthur Breitman crossentropy for rare event simulation and optimization
- 45. What is cross-entropy? From Riemann to Monte-Carlo Cross-Entropy techniques Cross-Entropy tricks QuestionsQuestions Questions? Arthur Breitman crossentropy for rare event simulation and optimization

No public clipboards found for this slide

Login to see the comments