Bayesian Networks - A Brief Introduction

Bayesian Networks
A Brief Introduction
Adnan Masood
scis.nova.edu/~adnan
adnan@nova.edu
Doctoral Candidate
Nova Southeastern University

What is a Bayesian Network?
• A Bayesian network (BN) is a graphical model for depicting probabilistic relationships among a set of variables.
• A BN encodes the conditional independence relationships between the variables in its graph structure.
• It provides a compact representation of the joint probability distribution over the variables.
• A problem domain is modeled by a list of variables X1, …, Xn.
• Knowledge about the problem domain is represented by a joint probability P(X1, …, Xn).
• Directed links represent direct causal influences.
• Each node has a conditional probability table quantifying the effects of its parents.
• There are no directed cycles.

A Bayesian network consists of:
• A directed acyclic graph (DAG)
• A conditional probability table for each node in the graph
[Figure: example DAG with arcs A → B, B → C, B → D]

So BN = (DAG, CPD)
• DAG: directed acyclic graph (the BN’s structure)
    • Nodes: random variables (typically binary or discrete, but methods also exist to handle continuous variables)
    • Arcs: indicate probabilistic dependencies between nodes (the lack of a link signifies conditional independence)
• CPD: conditional probability distribution (the BN’s parameters)
    • Conditional probabilities at each node, usually stored as a table (conditional probability table, or CPT)

So, what is a DAG?
[Figure: example DAG with arcs A → B, B → C, B → D]
• Directed acyclic graphs use only unidirectional arrows to show the direction of causation, and following the arrows never leads back to the starting node.
• Each node in the graph represents a random variable.
• The usual graph terminology applies: node A is a parent of node B if there is an arrow from A to B.
• Informally, an arrow from node X to node Y means that X has a direct influence on Y.

Where do all these numbers come from?
There is a conditional probability table for each node in the network.
Each node Xi has a conditional probability distribution P(Xi | Parents(Xi)) that quantifies the effect of the parents on the node.
The parameters are the probabilities in these conditional probability tables (CPTs).

The infamous Burglary-Alarm Example
[Figure: network with arcs Burglary → Alarm, Earthquake → Alarm, Alarm → John Calls, Alarm → Mary Calls]

P(B) = 0.001
P(E) = 0.002

B   E   P(A | B, E)
T   T   0.95
T   F   0.94
F   T   0.29
F   F   0.001

A   P(J | A)
T   0.90
F   0.05

A   P(M | A)
T   0.70
F   0.01
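
As a minimal sketch of how these tables define the full joint distribution, the Python below multiplies one CPT entry per node. The function names `prob` and `joint` are illustrative choices, not part of the original slides; the numbers are copied from the tables above.

```python
# CPTs from the burglary-alarm example; each value is P(variable = True | parents).
P_B = 0.001
P_E = 0.002
P_A = {(True, True): 0.95, (True, False): 0.94,
       (False, True): 0.29, (False, False): 0.001}
P_J = {True: 0.90, False: 0.05}
P_M = {True: 0.70, False: 0.01}


def prob(p_true, value):
    """Return P(X = value) given P(X = True)."""
    return p_true if value else 1.0 - p_true


def joint(b, e, a, j, m):
    """P(B=b, E=e, A=a, J=j, M=m) as a product of one CPT entry per node."""
    return (prob(P_B, b) * prob(P_E, e) * prob(P_A[(b, e)], a)
            * prob(P_J[a], j) * prob(P_M[a], m))


# The alarm rings and both neighbours call, but there is no burglary or earthquake.
p = joint(b=False, e=False, a=True, j=True, m=True)
print(f"P(~b, ~e, a, j, m) = {p:.6f}")
```

Summing `joint` over all 32 assignments gives 1, which is a quick sanity check that the tables were entered consistently.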

Calculations on the belief network (cont.)
Using the four-node example network (A → B, B → C, B → D), suppose you want to calculate:
P(A = true, B = true, C = true, D = true)
= P(A = true) * P(B = true | A = true) * P(C = true | B = true) * P(D = true | B = true)
= (0.4) * (0.3) * (0.1) * (0.95)
= 0.0114
The factorization in the second line comes from the graph structure; the numbers in the third line come from the conditional probability tables.

So let’s see how you can calculate P(John called)
if there was a burglary?
• Inference from cause to effect: given a burglary, what is P(J | B)?

P(J | B) = ?
P(A | B) = P(B) P(¬E) (0.94) + P(B) P(E) (0.95)    (with P(B) = 1, since the burglary is given)
P(A | B) = (1)(0.998)(0.94) + (1)(0.002)(0.95)
P(A | B) ≈ 0.94
P(J | B) = P(A | B) (0.9) + P(¬A | B) (0.05)
P(J | B) = (0.94)(0.9) + (0.06)(0.05)
P(J | B) ≈ 0.85

• P(M | B) can be calculated the same way: (0.94)(0.7) + (0.06)(0.01) ≈ 0.66
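
The two-step calculation on this slide can be checked with a short script. This is only a sketch under the CPT values of the burglary example; the helper name `p_call_given_burglary` is made up for illustration.

```python
# CPT entries from the burglary-alarm example (P(variable = True | parents)).
P_E = 0.002
P_A = {(True, True): 0.95, (True, False): 0.94}  # only the rows with B = True
P_J = {True: 0.90, False: 0.05}
P_M = {True: 0.70, False: 0.01}

# Step 1: P(A | B = true), summing out Earthquake (independent of Burglary).
p_a_given_b = P_A[(True, False)] * (1 - P_E) + P_A[(True, True)] * P_E


def p_call_given_burglary(cpt):
    """Step 2: P(call | B = true), summing out Alarm."""
    return cpt[True] * p_a_given_b + cpt[False] * (1 - p_a_given_b)


p_j_given_b = p_call_given_burglary(P_J)
p_m_given_b = p_call_given_burglary(P_M)
print(f"P(A|B) = {p_a_given_b:.4f}")
print(f"P(J|B) = {p_j_given_b:.4f}")
print(f"P(M|B) = {p_m_given_b:.4f}")
```

Without intermediate rounding, the values come out at roughly 0.940 for the alarm, 0.849 for John and 0.659 for Mary.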

Why Bayesian Networks?
• Bayesian probability represents a degree of belief in an event, while classical probability (the frequentist approach) deals with the true or physical probability of an event.
• Bayesian networks offer:
    • Handling of incomplete data sets
    • Learning about causal networks
    • Facilitating the combination of domain knowledge and data
    • An efficient and principled approach for avoiding the overfitting of data

What are Belief Computations?
• Belief revision
    • Models explanatory/diagnostic tasks
    • Given evidence, what is the most likely hypothesis to explain the evidence?
    • Also called abductive reasoning
    • Example: given some evidence variables, find the state of all other variables that maximizes the probability. E.g., we know that John calls but Mary does not. What is the most likely state of the remaining variables? Consider only assignments where J = T and M = F, and maximize.
• Belief updating
    • Queries
    • Given evidence, what is the probability of some other random variable taking a given value?
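
The belief-revision example (John calls, Mary does not) can be sketched as a brute-force search over the unobserved variables. This assumes the CPTs of the burglary-alarm example and is practical only for tiny networks; the names `joint` and `best` are illustrative.

```python
from itertools import product

# CPTs from the burglary-alarm example (P(variable = True | parents)).
P_B, P_E = 0.001, 0.002
P_A = {(True, True): 0.95, (True, False): 0.94,
       (False, True): 0.29, (False, False): 0.001}
P_J = {True: 0.90, False: 0.05}
P_M = {True: 0.70, False: 0.01}


def prob(p_true, value):
    return p_true if value else 1.0 - p_true


def joint(b, e, a, j, m):
    """Joint probability as a product of one CPT entry per node."""
    return (prob(P_B, b) * prob(P_E, e) * prob(P_A[(b, e)], a)
            * prob(P_J[a], j) * prob(P_M[a], m))


# Evidence: John calls, Mary does not.
evidence = {"j": True, "m": False}

# Belief revision: pick the (B, E, A) assignment with the highest joint
# probability among those consistent with the evidence.
best = max(product([True, False], repeat=3),
           key=lambda bea: joint(*bea, **evidence))
print(dict(zip(("Burglary", "Earthquake", "Alarm"), best)))
```

With these numbers the most likely explanation is that nothing happened (no burglary, no earthquake, no alarm) and John simply called by mistake.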

What is conditional independence?
The Markov condition says that, given its parents (P1, P2), a node (X) is conditionally independent of its non-descendants (ND1, ND2).
[Figure: node X with parents P1 and P2, children C1 and C2, and non-descendants ND1 and ND2]
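
As a numerical illustration of the Markov condition, the sketch below uses the burglary-alarm network from the earlier example: John Calls has the single parent Alarm, and Burglary is one of its non-descendants. The helper `marginal` is a hypothetical name; it sums the joint distribution by brute force.

```python
from itertools import product

# CPTs from the burglary-alarm example (P(variable = True | parents)).
P_B, P_E = 0.001, 0.002
P_A = {(True, True): 0.95, (True, False): 0.94,
       (False, True): 0.29, (False, False): 0.001}
P_J = {True: 0.90, False: 0.05}
P_M = {True: 0.70, False: 0.01}
NAMES = ("b", "e", "a", "j", "m")


def prob(p_true, value):
    return p_true if value else 1.0 - p_true


def joint(b, e, a, j, m):
    return (prob(P_B, b) * prob(P_E, e) * prob(P_A[(b, e)], a)
            * prob(P_J[a], j) * prob(P_M[a], m))


def marginal(**fixed):
    """Probability of the partial assignment `fixed`, summing out the rest."""
    total = 0.0
    for values in product([True, False], repeat=5):
        world = dict(zip(NAMES, values))
        if all(world[k] == v for k, v in fixed.items()):
            total += joint(**world)
    return total


# Given the parent A, learning the non-descendant B does not change P(J).
for a, b in product([True, False], repeat=2):
    p_j_given_a = marginal(j=True, a=a) / marginal(a=a)
    p_j_given_ab = marginal(j=True, a=a, b=b) / marginal(a=a, b=b)
    print(f"a={a}, b={b}: P(j|a)={p_j_given_a:.4f}  P(j|a,b)={p_j_given_ab:.4f}")

# Without conditioning on A, the two variables are clearly dependent.
print(f"P(j) = {marginal(j=True):.4f}, "
      f"P(j|b) = {marginal(j=True, b=True) / marginal(b=True):.4f}")
```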

What is D-Separation?
• A variable a is d-separated from b by a set of variables E if there is no d-connecting path between a and b. A path is d-connecting given E if:
    • none of its linear or diverging nodes is in E, and
    • for each of its converging nodes, either the node or one of its descendants is in E.
• Intuition:
    • The influence between a and b must propagate through a d-connecting path.
• If a and b are d-separated by E, then they are conditionally independent of each other given E:
P(a, b | E) = P(a | E) x P(b | E)
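
A d-separation test can be sketched in a few lines of Python. Note that this sketch does not enumerate paths as in the definition above; it uses the equivalent moralised-ancestral-graph test (keep only the ancestors of a, b and E, connect co-parents, drop directions, delete E, and check reachability). The graph is the burglary network from the earlier example, and all names are illustrative.

```python
from itertools import combinations

# Parents of each node in the burglary-alarm network.
PARENTS = {
    "B": [], "E": [],
    "A": ["B", "E"],
    "J": ["A"], "M": ["A"],
}


def ancestors(nodes, parents):
    """Return the given nodes together with all of their ancestors."""
    seen, stack = set(), list(nodes)
    while stack:
        n = stack.pop()
        if n not in seen:
            seen.add(n)
            stack.extend(parents[n])
    return seen


def d_separated(a, b, evidence, parents=PARENTS):
    """True if a and b are d-separated by the set `evidence`.

    Assumes a and b are not themselves in `evidence`.
    """
    keep = ancestors({a, b} | set(evidence), parents)
    # Moralise: link each node to its parents and "marry" co-parents.
    adj = {n: set() for n in keep}
    for child in keep:
        for p in parents[child]:
            adj[child].add(p)
            adj[p].add(child)
        for p, q in combinations(parents[child], 2):
            adj[p].add(q)
            adj[q].add(p)
    # Delete the evidence nodes, then test whether b is reachable from a.
    blocked = set(evidence)
    seen, stack = set(), [a]
    while stack:
        n = stack.pop()
        if n in seen or n in blocked:
            continue
        seen.add(n)
        stack.extend(adj[n])
    return b not in seen


print(d_separated("J", "M", {"A"}))   # diverging node A is observed
print(d_separated("B", "E", set()))   # converging node A is unobserved
print(d_separated("B", "E", {"J"}))   # a descendant of A is observed
```

The three calls cover the three cases in the definition: an observed diverging node blocks the path, an unobserved converging node blocks it, and observing a descendant of the converging node unblocks it.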

Construction of a Belief Network
Procedure for constructing a BN:
• Choose a set of variables describing the application domain.
• Choose an ordering of the variables.
• Start with an empty network and add variables to the network one by one according to the ordering.
• To add the i-th variable Xi:
    • Determine a set pa(Xi) of variables already in the network (X1, …, Xi-1) such that
      P(Xi | X1, …, Xi-1) = P(Xi | pa(Xi))
      (domain knowledge is needed here)
    • Draw an arc from each variable in pa(Xi) to Xi.

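The construction procedure can be sketched as follows. The parent sets are supplied by hand, standing in for the domain knowledge the procedure calls for; the variable names, the `build_network` helper and the chosen ordering are assumptions for illustration, using the burglary-alarm network.

```python
def build_network(ordering, parents_of):
    """Add variables one by one; every parent must already be in the network."""
    arcs, added = [], []
    for x in ordering:
        for p in parents_of[x]:
            if p not in added:
                raise ValueError(f"{p} must precede {x} in the ordering")
            arcs.append((p, x))  # draw an arc from each parent to x
        added.append(x)
    return arcs


ORDERING = ["Burglary", "Earthquake", "Alarm", "JohnCalls", "MaryCalls"]
# pa(Xi) for each variable, chosen from domain knowledge (assumed here).
PARENTS_OF = {
    "Burglary": [],
    "Earthquake": [],
    "Alarm": ["Burglary", "Earthquake"],
    "JohnCalls": ["Alarm"],
    "MaryCalls": ["Alarm"],
}

arcs = build_network(ORDERING, PARENTS_OF)

# Each Boolean node needs one free parameter per configuration of its parents.
cpt_params = sum(2 ** len(PARENTS_OF[x]) for x in ORDERING)
full_joint_params = 2 ** len(ORDERING) - 1
print(arcs)
print(f"CPT parameters: {cpt_params}, full joint table: {full_joint_params}")
```

Because arcs only ever run from earlier to later variables in the ordering, the result cannot contain a directed cycle. The parameter count (10 versus 31 for five Boolean variables) illustrates why the representation is compact.
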
What is Inference in BN?
• Using a Bayesian network to compute probabilities is called inference.
• In general, inference involves queries of the form:
P( X | E )
where X is the query variable and E is the set of evidence variables.
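
A query of this form can be answered on a small network by enumeration: sum the joint distribution over the hidden variables and normalise. The sketch below assumes the burglary-alarm CPTs from the earlier example and asks for P(Burglary | John calls, Mary calls); `query_burglary` is an illustrative name, and practical tools use more efficient algorithms.

```python
from itertools import product

# CPTs from the burglary-alarm example (P(variable = True | parents)).
P_B, P_E = 0.001, 0.002
P_A = {(True, True): 0.95, (True, False): 0.94,
       (False, True): 0.29, (False, False): 0.001}
P_J = {True: 0.90, False: 0.05}
P_M = {True: 0.70, False: 0.01}


def prob(p_true, value):
    return p_true if value else 1.0 - p_true


def joint(b, e, a, j, m):
    return (prob(P_B, b) * prob(P_E, e) * prob(P_A[(b, e)], a)
            * prob(P_J[a], j) * prob(P_M[a], m))


def query_burglary(j, m):
    """Return P(B = True | J = j, M = m) by summing out E and A."""
    weight = {}
    for b in (True, False):
        weight[b] = sum(joint(b, e, a, j, m)
                        for e, a in product([True, False], repeat=2))
    return weight[True] / (weight[True] + weight[False])


posterior = query_burglary(j=True, m=True)
print(f"P(B | j, m) = {posterior:.3f}")
```

Even with both neighbours calling, the posterior probability of a burglary stays below 30% because burglaries are so rare a priori.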

Representing causality in Bayesian Networks
• A causal Bayesian network, or simply a causal network, is a Bayesian network whose arcs are interpreted as indicating cause-effect relationships.
• To build a causal network:
    • Choose a set of variables that describes the domain.
    • Draw an arc to each variable from each of its direct causes (domain knowledge required).
[Figure: example causal network with nodes Visit Africa, Tuberculosis, Smoking, Lung Cancer, Bronchitis, Tuberculosis or Lung Cancer, X-Ray, Dyspnea]

Limitations of Bayesian Networks
• Typically require initial knowledge of many probabilities; the quality and extent of prior knowledge play an important role.
• Significant computational cost: exact inference is NP-hard in general.
• Events that were not anticipated when the network was built are not accounted for.

Summary
• Bayesian methods provide a sound theory and framework for the implementation of classifiers.
• Bayesian networks are a natural way to represent conditional independence information: qualitative information in the links, quantitative information in the tables.
• Computing exact values is NP-hard in general, so it is typical to make simplifying assumptions or use approximate methods.
• Many Bayesian tools and systems exist.
• Bayesian networks are an efficient and effective representation of the joint probability distribution of a set of random variables.
    • Efficient:
        • Local models
        • Independence (d-separation)
    • Effective:
        • Algorithms take advantage of structure to
            • Compute posterior probabilities
            • Compute the most probable instantiation
            • Support decision making

Bayesian Network Resources
• Repository: www.cs.huji.ac.il/labs/compbio/Repository/
• Software:
    • Infer.NET: http://research.microsoft.com/en-us/um/cambridge/projects/infernet/
    • Genie: genie.sis.pitt.edu
    • Hugin: www.hugin.com
    • SamIam: http://reasoning.cs.ucla.edu/samiam/
    • JavaBayes: www.cs.cmu.edu/~javabayes/Home/
    • Bayesware: www.bayesware.com
• BN info sites
    • Bayesian Belief Network site (Russell Greiner): http://webdocs.cs.ualberta.ca/~greiner/bn.html
    • Summary of BN software and links to software sites (Kevin Murphy)

