# Bayesian Networks - A Brief Introduction

Bayesian networks primer

Published in: Education, Technology

### Transcript

• 1. Bayesian Networks: A Brief Introduction
  Adnan Masood (scis.nova.edu/~adnan, adnan@nova.edu), Doctoral Candidate, Nova Southeastern University
• 2. What is a Bayesian Network?
  - A Bayesian network (BN) is a graphical model for depicting probabilistic relationships among a set of variables.
  - A BN encodes the conditional independence relationships between the variables in the graph structure.
  - It provides a compact representation of the joint probability distribution over the variables.
  - A problem domain is modeled by a list of variables X1, …, Xn.
  - Knowledge about the problem domain is represented by a joint probability P(X1, …, Xn).
  - Directed links represent direct causal influences.
  - Each node has a conditional probability table quantifying the effects of its parents.
  - No directed cycles are allowed.
• 3. What a Bayesian network consists of
  - A directed acyclic graph (DAG)
  - A set of conditional probability tables, one for each node in the graph
  (The slide shows an example DAG over nodes A, B, C, D.)
• 4. So BN = (DAG, CPD)
  - DAG: directed acyclic graph (the BN's structure)
    - Nodes: random variables (typically binary or discrete, but methods also exist to handle continuous variables)
    - Arcs: indicate probabilistic dependencies between nodes (the lack of a link signifies conditional independence)
  - CPD: conditional probability distribution (the BN's parameters)
    - Conditional probabilities at each node, usually stored as a table (conditional probability table, or CPT)
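The pair (DAG, CPD) can be sketched as a small data structure. This is an illustrative helper of mine, not something from the slides: each node records its parents (the DAG) and a CPT mapping each parent assignment to P(node = True) (the CPD). The True-branch numbers follow the worked example later in the deck; the False-branch numbers are made-up placeholders.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Node:
    name: str
    parents: tuple   # parent node names; together these define the DAG
    cpt: dict        # {tuple of parent values: P(node = True)}

# The four-node DAG from the slides: A -> B, B -> C, B -> D.
# True-branch probabilities match the deck's worked example;
# False-branch probabilities (0.6, 0.5, 0.2) are placeholders.
network = {
    'A': Node('A', (), {(): 0.4}),
    'B': Node('B', ('A',), {(True,): 0.3, (False,): 0.6}),
    'C': Node('C', ('B',), {(True,): 0.1, (False,): 0.5}),
    'D': Node('D', ('B',), {(True,): 0.95, (False,): 0.2}),
}
```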
• 5. So, what is a DAG?
  - Directed acyclic graphs use only unidirectional arrows to show the direction of causation.
  - Each node in the graph represents a random variable.
  - The usual graph terminology applies: a node A is a parent of another node B if there is an arrow from node A to node B.
  - Informally, an arrow from node X to node Y means X has a direct influence on Y.
  (The slide shows an example DAG over nodes A, B, C, D.)
• 6. Where do all these numbers come from?
  - There is a table for each node in the network.
  - Each node Xi has a conditional probability distribution P(Xi | Parents(Xi)) that quantifies the effect of the parents on the node.
  - The parameters are the probabilities in these conditional probability tables (CPTs).
• 7. The infamous Burglary-Alarm example
  - Structure: Burglary → Alarm ← Earthquake; Alarm → JohnCalls; Alarm → MaryCalls
  - Priors: P(B) = 0.001, P(E) = 0.002
  - P(A | B, E):

    | B | E | P(A)  |
    |---|---|-------|
    | T | T | 0.95  |
    | T | F | 0.94  |
    | F | T | 0.29  |
    | F | F | 0.001 |

  - P(J | A): P(J | A=T) = 0.90, P(J | A=F) = 0.05
  - P(M | A): P(M | A=T) = 0.70, P(M | A=F) = 0.01
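As a minimal sketch, the CPTs above can be encoded as plain Python dicts, with the joint probability read off the DAG via the chain rule (the encoding and helper names are mine; the numbers are the slide's):

```python
from itertools import product

# Burglary-alarm CPTs from the slide, as plain dicts.
P_B = {True: 0.001, False: 0.999}   # P(Burglary)
P_E = {True: 0.002, False: 0.998}   # P(Earthquake)
P_A = {(True, True): 0.95, (True, False): 0.94,    # P(Alarm=T | B, E)
       (False, True): 0.29, (False, False): 0.001}
P_J = {True: 0.90, False: 0.05}     # P(JohnCalls=T | Alarm)
P_M = {True: 0.70, False: 0.01}     # P(MaryCalls=T | Alarm)

def joint(b, e, a, j, m):
    """P(B=b, E=e, A=a, J=j, M=m) via the chain rule implied by the DAG."""
    pa = P_A[(b, e)] if a else 1 - P_A[(b, e)]
    pj = P_J[a] if j else 1 - P_J[a]
    pm = P_M[a] if m else 1 - P_M[a]
    return P_B[b] * P_E[e] * pa * pj * pm

# Sanity check: the joint sums to 1 over all 32 complete assignments.
total = sum(joint(*w) for w in product([True, False], repeat=5))
```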
• 8. Calculations on the belief network (cont.)
  - Using the four-node network from the earlier slides, suppose you want to calculate:
    P(A = true, B = true, C = true, D = true)
      = P(A = true) × P(B = true | A = true) × P(C = true | B = true) × P(D = true | B = true)
      = (0.4)(0.3)(0.1)(0.95) = 0.0114
  - The factorization comes from the graph structure; the numbers come from the conditional probability tables.
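The product above can be checked in a few lines (the factor names are mine; the numbers are the slide's):

```python
# Chain-rule factors read off the slide's CPTs.
p_a  = 0.4    # P(A = true)
p_ba = 0.3    # P(B = true | A = true)
p_cb = 0.1    # P(C = true | B = true)
p_db = 0.95   # P(D = true | B = true)

p_joint = p_a * p_ba * p_cb * p_db   # ≈ 0.0114
```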
• 9. So let's see how you can calculate P(John calls) given a burglary
  - Predictive inference from cause to effect: given a burglary, what is P(J | B)?
  - First marginalize out the earthquake:
    P(A | B) = P(A | B, E) P(E) + P(A | B, ¬E) P(¬E) = (0.95)(0.002) + (0.94)(0.998) ≈ 0.94
  - Then condition John's call on the alarm:
    P(J | B) = P(J | A) P(A | B) + P(J | ¬A) P(¬A | B) = (0.9)(0.94) + (0.05)(0.06) ≈ 0.85
  - The same calculation gives P(M | B) = (0.7)(0.94) + (0.01)(0.06) ≈ 0.66
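A minimal numeric version of this two-step derivation, assuming the CPT values from the burglary-alarm tables above (the variable names are mine):

```python
# CPT entries from the burglary-alarm slide.
P_E = 0.002
P_A_given_B_E, P_A_given_B_notE = 0.95, 0.94
P_J_given_A, P_J_given_notA = 0.90, 0.05
P_M_given_A, P_M_given_notA = 0.70, 0.01

# Step 1: marginalize out Earthquake.
# P(A|B) = P(A|B,E) P(E) + P(A|B,~E) P(~E)
p_A_given_B = P_A_given_B_E * P_E + P_A_given_B_notE * (1 - P_E)   # ≈ 0.94002

# Step 2: condition each call on the alarm.
# P(J|B) = P(J|A) P(A|B) + P(J|~A) P(~A|B)
p_J_given_B = P_J_given_A * p_A_given_B + P_J_given_notA * (1 - p_A_given_B)  # ≈ 0.849
p_M_given_B = P_M_given_A * p_A_given_B + P_M_given_notA * (1 - p_A_given_B)  # ≈ 0.659
```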
• 10. Why Bayesian networks?
  - Bayesian probability represents the degree of belief in an event, while classical (frequentist) probability deals with the true or physical probability of an event.
  - Bayesian networks offer:
    - Handling of incomplete data sets
    - Learning about causal networks
    - Facilitating the combination of domain knowledge and data
    - An efficient and principled approach for avoiding the overfitting of data
• 11. What are belief computations?
  - Belief revision
    - Models explanatory/diagnostic tasks
    - Given evidence, what is the most likely hypothesis to explain the evidence?
    - Also called abductive reasoning
    - Example: given some evidence variables, find the state of all other variables that maximizes the probability. E.g., we know John calls but Mary does not. What is the most likely state? Consider only assignments where J = T and M = F, and maximize.
  - Belief updating
    - Queries
    - Given evidence, what is the probability of some other random variable occurring?
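The belief-revision example (evidence J = T, M = F) can be sketched by brute force: enumerate the eight assignments of the remaining variables B, E, A and keep the one that maximizes the joint. The CPT values are from the burglary-alarm slide; the helper names are mine.

```python
from itertools import product

# True-branch CPT entries from the burglary-alarm slide.
P_B, P_E = 0.001, 0.002
P_A = {(True, True): 0.95, (True, False): 0.94,
       (False, True): 0.29, (False, False): 0.001}
P_J = {True: 0.90, False: 0.05}
P_M = {True: 0.70, False: 0.01}

def f(p, x):
    """Turn a True-branch probability into P(X = x)."""
    return p if x else 1 - p

def joint(b, e, a, j, m):
    return (f(P_B, b) * f(P_E, e) * f(P_A[(b, e)], a)
            * f(P_J[a], j) * f(P_M[a], m))

# Evidence: John calls, Mary does not. Maximize over (B, E, A).
best = max(product([True, False], repeat=3),
           key=lambda bea: joint(*bea, True, False))
# The most likely explanation is "nothing happened": no burglary, no
# earthquake, no alarm; John's call is most probably a false report.
```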
• 12. What is conditional independence?
  - The Markov condition: given its parents (P1, P2), a node (X) is conditionally independent of its non-descendants (ND1, ND2).
  (The slide shows X with parents P1, P2, children C1, C2, and non-descendants ND1, ND2.)
• 13. What is d-separation?
  - A variable a is d-separated from b by a set of variables E if there is no d-connecting path between a and b, i.e., no path such that:
    - none of its linear or diverging nodes is in E, and
    - for each of its converging nodes, either the node itself or one of its descendants is in E.
  - Intuition: any influence between a and b must propagate through a d-connecting path.
  - If a and b are d-separated by E, then they are conditionally independent of each other given E: P(a, b | E) = P(a | E) × P(b | E).
• 14. Construction of a belief network
  Procedure for constructing a BN:
  - Choose a set of variables describing the application domain.
  - Choose an ordering of the variables.
  - Start with an empty network and add variables to the network one by one according to the ordering.
  - To add the i-th variable Xi:
    - Determine a subset pa(Xi) of the variables already in the network (X1, …, Xi−1) such that P(Xi | X1, …, Xi−1) = P(Xi | pa(Xi)) (domain knowledge is needed here).
    - Draw an arc from each variable in pa(Xi) to Xi.
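A small sketch of this procedure on the burglary-alarm domain, assuming the ordering B, E, A, J, M. The parent sets here stand in for the domain knowledge the slide mentions (e.g., that P(A | B, E) depends on both B and E, while J and M depend only on A); the helper function is mine.

```python
parents = {}   # the DAG under construction: variable -> set of parents

def add_variable(x, pa):
    """Add variable x with parent set pa; parents must already be in the network."""
    assert all(p in parents for p in pa), "parents must precede x in the ordering"
    parents[x] = set(pa)

# Add variables one by one, in the chosen ordering.
add_variable('B', [])
add_variable('E', [])
add_variable('A', ['B', 'E'])
add_variable('J', ['A'])
add_variable('M', ['A'])
```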
• 15. What is inference in a BN?
  - Using a Bayesian network to compute probabilities is called inference.
  - In general, inference involves queries of the form P(X | E), where X is the query variable and E is the evidence.
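As a sketch of such a query, here is enumeration-based inference on the burglary-alarm network: sum the joint over all complete assignments consistent with the evidence, then normalize. This is exact but exponential in the number of variables; the table encoding and function names are mine, the CPT numbers are the slide's.

```python
from itertools import product

CPT = {  # node -> (parents, {tuple of parent values: P(node = True)})
    'B': ((), {(): 0.001}),
    'E': ((), {(): 0.002}),
    'A': (('B', 'E'), {(True, True): 0.95, (True, False): 0.94,
                       (False, True): 0.29, (False, False): 0.001}),
    'J': (('A',), {(True,): 0.90, (False,): 0.05}),
    'M': (('A',), {(True,): 0.70, (False,): 0.01}),
}
ORDER = ['B', 'E', 'A', 'J', 'M']

def joint(world):
    """Chain-rule product over the DAG for one complete assignment."""
    p = 1.0
    for var in ORDER:
        pa, table = CPT[var]
        p_true = table[tuple(world[q] for q in pa)]
        p *= p_true if world[var] else 1 - p_true
    return p

def query(x, evidence):
    """P(x = True | evidence) by enumerating all complete assignments."""
    num = den = 0.0
    for values in product([True, False], repeat=len(ORDER)):
        world = dict(zip(ORDER, values))
        if any(world[k] != v for k, v in evidence.items()):
            continue
        p = joint(world)
        den += p
        if world[x]:
            num += p
    return num / den

# Diagnostic query (effect to cause): both neighbors call, yet a
# burglary remains unlikely.
p = query('B', {'J': True, 'M': True})   # ≈ 0.284
```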
• 16. Representing causality in Bayesian networks
  - A causal Bayesian network, or simply a causal network, is a Bayesian network whose arcs are interpreted as indicating cause-effect relationships.
  - To build a causal network:
    - Choose a set of variables that describes the domain.
    - Draw an arc to a variable from each of its direct causes (domain knowledge required).
  (The slide shows the classic example with nodes Visit Africa, Tuberculosis, Smoking, Lung Cancer, Bronchitis, Tuberculosis-or-Lung-Cancer, X-Ray, and Dyspnea.)
• 17. Limitations of Bayesian networks
  - They typically require initial knowledge of many probabilities; the quality and extent of prior knowledge play an important role.
  - Exact inference carries significant computational cost (it is an NP-hard task).
  - Events not anticipated in the model are not accounted for.
• 18. Summary
  - Bayesian methods provide a sound theory and framework for the implementation of classifiers.
  - Bayesian networks are a natural way to represent conditional independence information: qualitative information in the links, quantitative information in the tables.
  - Computing exact values is NP-complete or NP-hard; it is typical to make simplifying assumptions or use approximate methods.
  - Many Bayesian tools and systems exist.
  - Bayesian networks are an efficient and effective representation of the joint probability distribution of a set of random variables:
    - Efficient: local models; independence (d-separation)
    - Effective: algorithms take advantage of structure to compute posterior probabilities, compute the most probable instantiation, and support decision making.
• 19. Bayesian network resources
  - Repository: www.cs.huji.ac.il/labs/compbio/Repository/
  - Software:
    - Infer.NET: http://research.microsoft.com/en-us/um/cambridge/projects/infernet/
    - GeNIe: genie.sis.pitt.edu
    - Hugin: www.hugin.com
    - SamIam: http://reasoning.cs.ucla.edu/samiam/
    - JavaBayes: www.cs.cmu.edu/~javabayes/Home/
    - Bayesware: www.bayesware.com
  - BN info sites:
    - Bayesian Belief Network site (Russell Greiner): http://webdocs.cs.ualberta.ca/~greiner/bn.html
    - Summary of BN software and links to software sites (Kevin Murphy)
• 20. References and further reading
  - Charniak, E. (1991). Bayesian Networks without Tears. http://www.cs.ubc.ca/~murphyk/Bayes/Charniak_91.pdf
  - Russell, S. and Norvig, P. (1995). Artificial Intelligence: A Modern Approach. Prentice Hall.
  - Weiss, S. and Kulikowski, C. (1991). Computer Systems That Learn. Morgan Kaufmann.
  - Heckerman, D. (1996). A Tutorial on Learning with Bayesian Networks. Microsoft Technical Report MSR-TR-95-06.
  - Internet resources on Bayesian networks and machine learning: http://www.cs.orst.edu/~wangxi/resource.html
• 21. Modeling and Reasoning with Bayesian Networks
• 22. Machine Learning: A Probabilistic Perspective
• 23. Bayesian Reasoning and Machine Learning