Bayesian Networks

  CSC 371: Spring 2012
Today’s Lecture
• Recap: Joint distribution, independence, marginal
  independence, conditional independence



• Bayesian networks



• Reading:
   – Sections 14.1-14.4 in AIMA [Russell & Norvig]
Marginal Independence

• Definition: X and Y are marginally independent (written X ╨ Y) iff
  P(X = x, Y = y) = P(X = x) P(Y = y) for all values x and y;
  equivalently, P(X | Y = y) = P(X) for all y

•   Intuitively: if X ╨ Y, then
     – learning that Y=y does not change your belief in X
     – and this is true for all values y that Y could take

•   For example, weather is marginally independent
    of the result of a coin toss
Conditional Independence

• Definition: X and Y are conditionally independent given Z
  (written X ╨ Y | Z) iff P(X | Y = y, Z = z) = P(X | Z = z)
  for all values y and z

• Intuitively: if X ╨ Y | Z, then
   – learning that Y=y does not change your belief in X
     when we already know Z=z
   – and this is true for all values y that Y could take
     and all values z that Z could take
• For example,
  ExamGrade ╨ AssignmentGrade | UnderstoodMaterial
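A minimal Python sketch of what this definition means computationally (all numbers are made up): we build a toy joint in which X ╨ Y | Z holds by construction, then verify that P(X | Y=y, Z=z) equals P(X | Z=z) for every y and z.

    # Toy joint with X and Y conditionally independent given Z by construction:
    # P(x, y, z) = P(z) P(x|z) P(y|z). All numbers are illustrative.
    P_z  = {True: 0.4, False: 0.6}
    P_xz = {True: 0.7, False: 0.2}    # P(X=True | z)
    P_yz = {True: 0.5, False: 0.9}    # P(Y=True | z)

    def P(x, y, z):
        px = P_xz[z] if x else 1 - P_xz[z]
        py = P_yz[z] if y else 1 - P_yz[z]
        return P_z[z] * px * py

    def cond_x(x, z, y=None):
        """P(x | z), or P(x | y, z) when y is given."""
        tf = (False, True)
        ys = tf if y is None else (y,)
        num = sum(P(x, yy, z) for yy in ys)
        den = sum(P(xx, yy, z) for xx in tf for yy in ys)
        return num / den

    # The definition: P(x | y, z) = P(x | z) for all y and z
    for z in (True, False):
        for y in (True, False):
            assert abs(cond_x(True, z, y) - cond_x(True, z)) < 1e-12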
Conditional Independence
“…probability theory is more fundamentally concerned with
the structure of reasoning and causation than with
numbers.”



                     Glenn Shafer and Judea Pearl
                     Introduction to Readings in Uncertain Reasoning,
                     Morgan Kaufmann, 1990
Bayesian Network Motivation
• We want a representation and reasoning system that is based on
  conditional (and marginal) independence
     – Compact yet expressive representation
     – Efficient reasoning procedures
• Bayesian (Belief) Networks are such a representation
     – Named after Thomas Bayes (ca. 1702–1761)
     – Term coined in 1985 by Judea Pearl (1936– )
     – Their invention changed the primary focus of AI from logic to probability!




                 [Portraits: Thomas Bayes, Judea Pearl]
Bayesian Networks: Intuition
• A graphical representation for a joint probability
  distribution
      – Nodes are random variables
          • Can be assigned (observed) or unassigned (unobserved)
      – Arcs are interactions between nodes
          • Encode conditional independence
          • An arrow from one variable to another indicates direct influence
          • Directed arcs between nodes reflect dependence
      – A compact specification of full joint distributions
• Some informal examples:
   – UnderstoodMaterial → AssignmentGrade, UnderstoodMaterial → ExamGrade
   – SmokingAtSensor → Alarm, Fire → Alarm
Example of a simple Bayesian network

• Nodes A, B, C with directed edges A → C and B → C

    p(A,B,C) = p(A) p(B) p(C|A,B)
• Probability model has simple factored form

• Directed edges => direct dependence

• Absence of an edge => conditional independence

• Also known as belief networks, graphical models, causal networks

• Other formulations, e.g., undirected graphical models
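As a concrete illustration of this factored form, here is a minimal Python sketch (the CPT numbers are made up) that computes the joint p(A,B,C) = p(A) p(B) p(C|A,B) from the three local distributions and checks that it normalizes:

    # Local distributions for the network A -> C <- B (illustrative numbers)
    p_A = {True: 0.3, False: 0.7}
    p_B = {True: 0.6, False: 0.4}
    p_C_given_AB = {                        # p(C=True | a, b)
        (True, True): 0.9, (True, False): 0.5,
        (False, True): 0.4, (False, False): 0.05,
    }

    def joint(a, b, c):
        """Joint probability from the factored form p(A) p(B) p(C|A,B)."""
        pc = p_C_given_AB[(a, b)]
        return p_A[a] * p_B[b] * (pc if c else 1 - pc)

    # Sanity check: the eight joint entries sum to 1
    tf = (False, True)
    assert abs(sum(joint(a, b, c) for a in tf for b in tf for c in tf) - 1.0) < 1e-12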
Bayesian Networks: Definition

• A Bayesian network is a directed acyclic graph (DAG) in which
   – nodes are random variables
   – each node X has a conditional probability distribution
     P(X | Parents(X)) quantifying the influence of its parents
• The graph together with these local distributions compactly
  specifies the full joint distribution
Bayesian Networks: Definition




• Discrete Bayesian networks:
     – Domain of each variable is finite
     – Conditional probability distribution is a conditional probability
       table
     – We will assume this discrete case
         • But everything we say about independence (marginal & conditional)
           carries over to the continuous case
Examples of 3-way Bayesian Networks

• Nodes A, B, C with no edges — marginal independence:

     p(A,B,C) = p(A) p(B) p(C)
Examples of 3-way Bayesian Networks

• Edges A → B and A → C — conditionally independent effects:

     p(A,B,C) = p(B|A) p(C|A) p(A)

• B and C are conditionally independent given A
• e.g., A is a disease, and we model B and C as
  conditionally independent symptoms given A
Examples of 3-way Bayesian Networks

• Edges A → C and B → C — independent causes:

     p(A,B,C) = p(C|A,B) p(A) p(B)

• “Explaining away” effect: given C, observing A makes B less likely
  (e.g., the earthquake/burglary/alarm example)
• A and B are (marginally) independent
  but become dependent once C is known
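The explaining-away effect can be checked numerically. Below is a hedged Python sketch for the Burglary → Alarm ← Earthquake structure, with illustrative numbers not taken from the slides: once the alarm is observed, additionally learning that an earthquake occurred sharply lowers the probability of a burglary.

    # Explaining away in B -> A <- E (Burglary, Earthquake, Alarm).
    # All numbers are illustrative.
    P_b, P_e = 0.01, 0.02
    P_a = {(True, True): 0.95, (True, False): 0.94,     # P(alarm | b, e)
           (False, True): 0.29, (False, False): 0.001}

    def joint(b, e, a):
        pb = P_b if b else 1 - P_b
        pe = P_e if e else 1 - P_e
        pa = P_a[(b, e)] if a else 1 - P_a[(b, e)]
        return pb * pe * pa

    tf = (False, True)
    # P(burglary | alarm)
    p1 = (sum(joint(True, e, True) for e in tf)
          / sum(joint(b, e, True) for b in tf for e in tf))
    # P(burglary | alarm, earthquake)
    p2 = joint(True, True, True) / sum(joint(b, True, True) for b in tf)
    print(p1, p2)   # ≈ 0.58 vs. ≈ 0.03: the earthquake explains the alarm away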
Examples of 3-way Bayesian Networks

• Edges A → B and B → C — Markov dependence:

     p(A,B,C) = p(C|B) p(B|A) p(A)
Example: Burglar Alarm
• I have a burglar alarm that is sometimes set off by minor
  earthquakes. My two neighbors, John and Mary, promised to call
  me at work if they hear the alarm
   – Example inference task: suppose Mary calls and John doesn’t call. What is
     the probability of a burglary?
• What are the random variables?
   – Burglary, Earthquake, Alarm, JohnCalls, MaryCalls
Example
 5 binary variables:
       B = a burglary occurs at your house
       E = an earthquake occurs at your house
       A = the alarm goes off
       J = John calls to report the alarm
       M = Mary calls to report the alarm

 What is P(B | M, J) ? (for example)
     We can use the full joint distribution to answer this question
           Requires 2⁵ = 32 probabilities
 Can we use prior domain knowledge to come up with a Bayesian
  network that requires fewer probabilities?
 What are the direct influence relationships?
       A burglary can set the alarm off
       An earthquake can set the alarm off
       The alarm can cause Mary to call
       The alarm can cause John to call
Example: Burglar Alarm

[Figure: the network Burglary → Alarm ← Earthquake, with Alarm → JohnCalls and Alarm → MaryCalls]

             What are the model parameters?
Conditional probability
              distributions
• To specify the full joint distribution, we need to specify a
  conditional distribution for each node given its parents:
  P(X | Parents(X))




                [Figure: node X with parents Z1, Z2, …, Zn; we must specify P(X | Z1, …, Zn)]
Example: Burglar Alarm

[Figure: the alarm network annotated with its conditional probability tables]
The joint probability distribution
• For each node Xi, we know P(Xi | Parents(Xi))
• How do we get the full joint distribution P(X1, …, Xn)?
• Using chain rule:


     P(X1, …, Xn) = ∏ᵢ₌₁ⁿ P(Xi | X1, …, Xi−1) = ∏ᵢ₌₁ⁿ P(Xi | Parents(Xi))




• For example, P(j, m, a, ¬b, ¬e)
  = P(¬b) P(¬e) P(a | ¬b, ¬e) P(j | a) P(m | a)
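This product can be evaluated directly. The sketch below assumes the standard textbook CPT numbers for this network from Russell & Norvig, since the slide's actual tables were in a figure; treat the values as illustrative.

    # CPTs for the alarm network (textbook-style numbers, assumed here)
    P_B, P_E = 0.001, 0.002
    P_A = {(True, True): 0.95, (True, False): 0.94,     # P(a | b, e)
           (False, True): 0.29, (False, False): 0.001}
    P_J = {True: 0.90, False: 0.05}                     # P(j | a)
    P_M = {True: 0.70, False: 0.01}                     # P(m | a)

    # P(j, m, a, ¬b, ¬e) = P(¬b) P(¬e) P(a|¬b,¬e) P(j|a) P(m|a)
    p = (1 - P_B) * (1 - P_E) * P_A[(False, False)] * P_J[True] * P_M[True]
    print(p)   # ≈ 0.00063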
Constructing a Bayesian Network:
             Step 1
 • Order the variables in terms of causality (may be a partial order)

         e.g., {E, B} -> {A} -> {J, M}


 • P(J, M, A, E, B) = P(J, M | A, E, B) P(A| E, B) P(E, B)

                    ~ P(J, M | A)        P(A| E, B) P(E) P(B)

                    ~ P(J | A) P(M | A) P(A| E, B) P(E) P(B)


   These CI assumptions are reflected in the graph structure of the
   Bayesian network
Constructing this Bayesian
          Network: Step 2
•   P(J, M, A, E, B) =
      P(J | A) P(M | A) P(A | E, B) P(E) P(B)




•   There are 3 conditional probability tables (CPTs) to be determined:
    P(J | A), P(M | A), P(A | E, B)
     –   Requiring 2 + 2 + 4 = 8 probabilities

•   And 2 marginal probabilities P(E), P(B) -> 2 more probabilities

     • 2 + 2 + 4 + 1 + 1 = 10 numbers (vs. 2⁵ − 1 = 31)

•   Where do these probabilities come from?
     –   Expert knowledge
     –   From data (relative frequency estimates)
     –   Or a combination of both
Number of Probabilities in
         Bayesian Networks
• Consider n binary variables

• Unconstrained joint distribution requires O(2ⁿ) probabilities


• If we have a Bayesian network with a maximum of k parents
  for any node, then we need only O(n 2ᵏ) probabilities

• Example
   – Full unconstrained joint distribution
       • n = 30: need 2³⁰ ≈ 10⁹ probabilities for the full joint distribution
   – Bayesian network
       • n = 30, k = 4: need 30 × 2⁴ = 480 probabilities
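A quick arithmetic check of these counts (the constants match the slide's example):

    # Parameter counts for n binary variables
    n, k = 30, 4
    full_joint = 2**n        # ≈ 1.07e9 entries in the unconstrained joint
    bayes_net = n * 2**k     # ≤ 480 CPT entries with at most k parents per node
    print(full_joint, bayes_net)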
Constructing Bayesian networks
1. Choose an ordering of variables X1, … , Xn
2. For i = 1 to n
   – add Xi to the network
   – select parents from X1, …, Xi−1 such that
     P(Xi | Parents(Xi)) = P(Xi | X1, …, Xi−1) (see the sketch below)
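A hedged sketch of what this parent-selection test could look like if we can query a full joint distribution, represented as a dict mapping assignment tuples (ordered as in `order`) to probabilities. The helper names and representation are hypothetical, not from the slides.

    from itertools import combinations, product

    def conditional(joint, order, target, given):
        """P(target=True | given), computed by summing the full joint table."""
        num = den = 0.0
        for assign, p in joint.items():
            a = dict(zip(order, assign))
            if all(a[v] == val for v, val in given.items()):
                den += p
                if a[target]:
                    num += p
        return num / den if den else 0.0

    def select_parents(joint, order, i, tol=1e-9):
        """Smallest set of predecessors with P(Xi | parents) = P(Xi | X1..Xi-1)."""
        target, preds = order[i], order[:i]
        for size in range(len(preds) + 1):
            for parents in combinations(preds, size):
                if all(abs(conditional(joint, order, target, dict(zip(preds, vals)))
                           - conditional(joint, order, target,
                                         {v: b for v, b in zip(preds, vals)
                                          if v in parents})) < tol
                       for vals in product((False, True), repeat=len(preds))):
                    return parents
        return tuple(preds)

Calling select_parents for each variable in the chosen ordering reproduces step 2; note that the result depends on the ordering, as the next slides show.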
Example
• Suppose we choose the ordering M, J, A, B, E

P(J | M) = P(J)?                    No
P(A | J, M) = P(A)?                 No
P(A | J, M) = P(A | J)?             No
P(A | J, M) = P(A | M)?             No
P(B | A, J, M) = P(B)?              No
P(B | A, J, M) = P(B | A)?          Yes
P(E | B, A, J, M) = P(E)?           No
P(E | B, A, J, M) = P(E | A, B)?    Yes
Example contd.

[Figure: the network induced by the ordering M, J, A, B, E]
• Deciding conditional independence is hard in noncausal directions
   – The causal direction seems much more natural
• Network is less compact: 1 + 2 + 4 + 2 + 4 = 13 numbers needed
A more realistic Bayes Network:
               Car diagnosis
•   Initial observation: car won’t start
•   Orange: “broken, so fix it” nodes
•   Green: testable evidence
•   Gray: “hidden variables” to ensure sparse structure, reduce parameters
The Bayesian Network from a different Variable Ordering
Given a graph, can we “read off” conditional independencies?
Are there wrong network
              structures?
• Some variable orderings yield more compact, some less
  compact structures
   – Compact ones are better
   – But all representations resulting from this process are correct
   – One extreme: the fully connected network is always correct
     but rarely the best choice
• How can a network structure be wrong?
   – If it misses directed edges that are required
Summary
• Bayesian networks provide a natural representation for
  (causally induced) conditional independence
• Topology + conditional probability tables
• Generally easy for domain experts to construct
Probabilistic inference
• A general scenario:
    – Query variables: X
    – Evidence (observed) variables: E = e
    – Unobserved variables: Y
• If we know the full joint distribution P(X , E, Y), how can we
  perform inference about X?
         P(X | E = e) = P(X, e) / P(e) ∝ ∑y P(X, e, y)

• Problems
   – Full joint distributions are too large
   – Marginalizing out Y may involve too many summation terms
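To make this concrete, here is a hedged sketch of inference by enumeration on the alarm network, using the same textbook-style CPT numbers assumed earlier; it answers P(B | j, m) by summing the factored joint over the unobserved variables E and A.

    # CPTs (textbook-style numbers, assumed as before)
    P_B, P_E = 0.001, 0.002
    P_A = {(True, True): 0.95, (True, False): 0.94,
           (False, True): 0.29, (False, False): 0.001}
    P_J = {True: 0.90, False: 0.05}
    P_M = {True: 0.70, False: 0.01}

    def joint(b, e, a, j, m):
        """P(b, e, a, j, m) from the network's factored form."""
        pb = P_B if b else 1 - P_B
        pe = P_E if e else 1 - P_E
        pa = P_A[(b, e)] if a else 1 - P_A[(b, e)]
        pj = P_J[a] if j else 1 - P_J[a]
        pm = P_M[a] if m else 1 - P_M[a]
        return pb * pe * pa * pj * pm

    tf = (False, True)
    # P(B=true | j, m) ∝ sum over e, a of P(b, e, a, j, m)
    num = sum(joint(True, e, a, True, True) for e in tf for a in tf)
    den = sum(joint(b, e, a, True, True) for b in tf for e in tf for a in tf)
    print(num / den)   # ≈ 0.284 with these numbers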
Conclusions…
• Full joint distributions are intractable to work with
   – Conditional independence assumptions allow us to model
     real-world phenomena with much simpler models
   – Bayesian networks are a systematic way to construct
     parsimonious structured distributions

• How do we do inference (reasoning) in Bayesian
  networks?
   – Systematic algorithms exist
   – Complexity depends on the structure of the graph
