# Graphical Models 4dummies

1. Graphical Models for Dummies. Max Khesin, Data Strategist, Liquidnet Inc.
2. Graphical Models For By Dummies
3. Grand Theme • “Probabilistic graphical models are an elegant framework which combines uncertainty (probabilities) and logical structure (independence constraints) to compactly represent complex, real-world phenomena.” (Koller 2007)
4. Trying to guess if the family is home • When my wife leaves the house, she leaves the outdoor light on (but she sometimes also leaves it on for a guest) • When my wife leaves the house, she usually puts the dog out • When the dog has a bowel problem, it goes to the backyard • If the dog is in the backyard, I will probably hear it (but it might be the neighbor's dog)
5. These causal connections are not absolute • Three causes of uncertainty (Russell, Norvig 2009): laziness, theoretical ignorance, practical ignorance
6. Problem with probability • Too many parameters • A full joint distribution over $n$ binary random variables needs $2^n - 1$ parameters
7. Bayesian Network - definition • A Bayesian network is a directed graph in which each node is annotated with quantitative probability information. The full specification is as follows: 1. Each node corresponds to a random variable, which may be discrete or continuous. 2. A set of directed links or arrows connects pairs of nodes. If there is an arrow from node X to node Y, X is said to be a parent of Y. The graph has no directed cycles (and hence is a directed acyclic graph, or DAG). 3. Each node Xi has a conditional probability distribution P(Xi | Parents(Xi)) that quantifies the effect of the parents on the node. (Russell, Norvig 2009)
8. Compressed distribution (factorization) • In our Bayesian network, we only have 10 parameters, versus $2^5 - 1 = 31$ for the full joint distribution over the five binary variables • The compression is due to independence • Independence is how causality manifests itself in the distribution
9. D-connection
10. Definition: conditional probability • $P(A \mid B) = P(A, B) / P(B)$, for $P(B) > 0$
11. Decomposing a joint distribution …
12. Topological sort • “topological ordering of a directed acyclic graph (DAG) is a linear ordering of its nodes in which each node comes before all nodes to which it has outbound edges” (Wikipedia)
13. Sorting the family-home example • Family-out(1), Bowel-problem(2), Lights-On(3), Dog-out(4), Hear-bark(5) • Bowel-problem(1), Family-out(2), Lights-On(3), Dog-out(4), Hear-bark(5) • Right away, everything to the left of a variable is one of its non-descendants
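14. Only the parents matter • Given its parents, each variable is independent of the other variables that precede it in the topological order: $P(X_i \mid X_1, \ldots, X_{i-1}) = P(X_i \mid \mathrm{Parents}(X_i))$
15. Chain rule for Bayesian networks • $P(X_1, \ldots, X_n) = \prod_{i=1}^{n} P(X_i \mid \mathrm{Parents}(X_i))$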
16. Markov Networks • These models are useful in modeling a variety of phenomena where one cannot naturally ascribe a directionality to the interaction between variables
17. Markov Networks
18. Markov Networks • $D$ is a set of random variables • A factor is a function from $\mathrm{Val}(D)$ to $\mathbb{R}^+$
19. Markov Networks • $H$ is a Markov network structure • $D_1, \ldots, D_m$ is a set of subsets of the variables, where each $D_i$ is a complete subgraph of $H$ • Factors $\phi_1(D_1), \ldots, \phi_m(D_m)$
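20. Markov Network - factorization • $P(X_1, \ldots, X_n) = \frac{1}{Z} \tilde{P}(X_1, \ldots, X_n)$ • where the unnormalized measure is $\tilde{P}(X_1, \ldots, X_n) = \prod_{i=1}^{m} \phi_i(D_i)$ • and the normalization factor is $Z = \sum_{X_1, \ldots, X_n} \tilde{P}(X_1, \ldots, X_n)$
21. Factor product (pointwise multiplication)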
22. Decision Networks • Combine Bayesian networks with utility theory
23. Utility-based agent
24. Decision network
25. Evaluating decision network
26. Applications – Bayes Nets • Expert systems • “…A later evaluation showed that the diagnostic accuracy of Pathfinder IV was at least as good as that of the expert used to design the system. When used with less expert pathologists, the system significantly improved the diagnostic accuracy of the physicians alone. Moreover, the system showed greater ability to identify important findings and to integrate these findings into a correct diagnosis. Unfortunately, multiple reasons prevent the widespread adoption of Bayesian networks as an aid for medical diagnosis, including legal liability issues for misdiagnoses and incompatibility with the physicians' workflow” (Koller 2009)
27. Applications – Markov Networks • Computer vision: segmentation over “superpixels” • Regions are contiguous • A glove is next to the arm
28. Application – Markov Nets (combining logic and probability) • Weight 1.5: $\forall x\; \mathrm{Smokes}(x) \Rightarrow \mathrm{Cancer}(x)$ • Weight 1.1: $\forall x, y\; \mathrm{Friends}(x, y) \Rightarrow (\mathrm{Smokes}(x) \Leftrightarrow \mathrm{Smokes}(y))$ • Two constants: Anna (A) and Bob (B) • Ground atoms: Friends(A,A), Friends(A,B), Friends(B,A), Friends(B,B), Smokes(A), Smokes(B), Cancer(A), Cancer(B)
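
The family-home slides (parameter count, topological sort, chain rule) can be tied together in a short sketch. The graph structure below follows the slides, but the probability values in `CPT` are illustrative assumptions (the transcript gives none), and all function and variable names are hypothetical. The sketch counts parameters, produces a topological order, and evaluates the joint distribution as a product of per-node conditionals.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+
from itertools import product

# Parents of each node in the family-home network (structure from the slides).
PARENTS = {
    "family_out": (),
    "bowel_problem": (),
    "lights_on": ("family_out",),
    "dog_out": ("family_out", "bowel_problem"),
    "hear_bark": ("dog_out",),
}

# P(node = True | parent values), keyed by the tuple of parent values.
# ASSUMPTION: these numbers are made up for illustration only.
CPT = {
    "family_out": {(): 0.15},
    "bowel_problem": {(): 0.01},
    "lights_on": {(True,): 0.60, (False,): 0.05},
    "dog_out": {
        (True, True): 0.99,
        (True, False): 0.90,
        (False, True): 0.97,
        (False, False): 0.30,
    },
    "hear_bark": {(True,): 0.70, (False,): 0.01},
}


def parameter_counts():
    """Return (full joint table size, factored size) for binary variables."""
    full = 2 ** len(PARENTS) - 1
    factored = sum(len(table) for table in CPT.values())
    return full, factored


def topological_order():
    """Return one ordering in which every node comes after its parents."""
    # TopologicalSorter takes a mapping node -> predecessors, which is PARENTS.
    return list(TopologicalSorter(PARENTS).static_order())


def joint(assignment):
    """Chain rule for Bayesian networks: product of P(x_i | parents(x_i))."""
    p = 1.0
    for node, parents in PARENTS.items():
        p_true = CPT[node][tuple(assignment[q] for q in parents)]
        p *= p_true if assignment[node] else 1.0 - p_true
    return p


def total_probability():
    """Sum the joint over all 2^5 assignments; should be 1 up to rounding."""
    names = list(PARENTS)
    return sum(
        joint(dict(zip(names, values)))
        for values in product([True, False], repeat=len(names))
    )


if __name__ == "__main__":
    print(parameter_counts())
    print(topological_order())
    print(total_probability())
```

Storing one number per row of each conditional table is what gives the count of 10: 1 + 1 + 2 + 4 + 2. A full joint table over the same five binary variables would need 31 independent entries.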