
# Bn

Published in: Technology, Education

1. Learning the Structure of Dynamic Probabilistic Networks. Matt Hink, March 27, 2012
2. Overview
   • Definitions and introduction to DPNs
   • Learning from complete data
   • Experimental results
   • Applications
3. Regular probabilistic networks (Bayesian networks) are well established for representing probabilistic relationships among many random variables. Dynamic Probabilistic Networks (DPNs) extend this representation to model the stochastic evolution of a set of random variables over time. (Think "probabilistic state machines".)
4. Notation
   • Capital letters ($X, Y, Z$): sets of variables
   • $X_i$: a random variable in the set $X$
   • $\mathrm{Val}(X_i)$: the finite set of values of $X_i$
   • $|X_i|$: the size of $\mathrm{Val}(X_i)$
   • Lowercase italic letters ($x, y, z$): instantiations of those sets
5. DPNs extend the common Bayesian network representation to the case where the variables evolve over time according to some stochastic process. Assume that $X$ is a set of variables in a probabilistic network whose values vary with time. Then $X_i[t]$ is the value of the attribute $X_i$ at time $t$, and $X[t]$ is the collection of all such variables at time $t$.
6. For simplicity's sake, we assume that the stochastic process governing transitions is Markovian: $P(X[t+1] \mid X[0 \ldots t]) = P(X[t+1] \mid X[t])$. That is, the distribution over the next state depends only on its immediate predecessor.
7. We also assume the process is stationary, i.e., $P(X[t+1] \mid X[t])$ is independent of $t$.
8. Given these two assumptions, we can describe a DPN representing the joint distribution over all possible trajectories of a process using two parts:
   • a prior network $B_0$ that specifies a distribution over the initial states $X[0]$; and
   • a transition network $B_\to$ over the variables $X[0] \cup X[1]$, which specifies the transition probability $P(X[t+1] \mid X[t])$ for all $t$.
9. Figure: a prior network (left) and a transition network (right) for a dynamic probabilistic network.
10. In light of this structure, the joint distribution over the entire history of the DPN up to time $T$ is given by $P_B(x[0 \ldots T]) = P_{B_0}(x[0]) \prod_{t=0}^{T-1} P_{B_\to}(x[t+1] \mid x[t])$; in other words, the prior probability of the initial state multiplied by the probability of every transition along the trajectory.
11. Learning from Complete Data
12. Common traditional methods: search algorithms using scoring methods (BIC, BDe), given a dataset $D$. DPN methods: search algorithms using scoring methods as well, given a dataset $D$ consisting of $N_{seq}$ observation sequences.
13. Each entry in our dataset is thus an observation of a set of variables over time. The $m$th such sequence has length $N_m$ and specifies values for the variable set $X^m[0 \ldots N_m]$. We then have $N_{seq}$ instances of the initial state and $N = \sum_m N_m$ instances of transitions. We can use these to learn the structure of the prior network and the transition network, respectively.
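14. BIC scores for DPNs: definitions [equation not captured in the transcript]
15. BIC scores for DPNs: we find the log-likelihood using [equation not captured in the transcript]
16. BIC scores for DPNs: we can then find the BIC score using [equation not captured in the transcript]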
17. Experimental Results
18. Application: Modeling driver behavior
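
The factorization into a prior network and a transition network, and the matching split of the data into initial-state instances and transition instances, can be sketched in a few lines of Python. This is a minimal illustration under strong assumptions, not the method from the slides: each state is treated as a single flat, hashable value (so no network structure is learned), parameters are plain maximum-likelihood frequencies, and `bic` uses the generic form "log-likelihood minus (log n / 2) times the number of parameters" rather than the slide formulas, which the transcript does not preserve. All function names are hypothetical.

```python
import math
from collections import Counter


def trajectory_log_prob(traj, prior, transition):
    """log P(x[0..T]) = log P0(x[0]) + sum over t of log P(x[t+1] | x[t])."""
    logp = math.log(prior[traj[0]])
    for cur, nxt in zip(traj[:-1], traj[1:]):
        logp += math.log(transition[(cur, nxt)])
    return logp


def count_instances(sequences):
    """Split the data into initial-state counts and transition counts."""
    init = Counter(seq[0] for seq in sequences)
    trans = Counter()
    for seq in sequences:
        # A sequence x[0..Nm] contributes Nm (current, next) pairs.
        trans.update(zip(seq[:-1], seq[1:]))
    return init, trans


def max_log_likelihoods(sequences):
    """Maximised log-likelihood of the prior part and of the transition part.

    Assumption: a flat, unstructured state; parameters are relative frequencies.
    """
    init, trans = count_instances(sequences)
    n_seq = sum(init.values())
    out_of = Counter()  # how many transitions leave each state
    for (cur, _), c in trans.items():
        out_of[cur] += c
    ll_prior = sum(c * math.log(c / n_seq) for c in init.values())
    ll_trans = sum(c * math.log(c / out_of[cur]) for (cur, _), c in trans.items())
    return ll_prior, ll_trans


def bic(log_lik, n_params, n_samples):
    """Generic BIC in 'higher is better' form (an assumption, see lead-in)."""
    return log_lik - 0.5 * math.log(n_samples) * n_params


if __name__ == "__main__":
    data = [["a", "b", "b"], ["a", "a", "b"]]
    init, trans = count_instances(data)
    print("initial-state instances:", sum(init.values()))
    print("transition instances:", sum(trans.values()))
    print("log-likelihoods (prior, transition):", max_log_likelihoods(data))
```

Because the prior and transition parts use disjoint data (first states versus consecutive pairs), their log-likelihoods can be computed and scored separately, which is what makes learning the two networks independently possible.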