An Importance sampling approach to
integrate expert knowledge when learning
Bayesian Networks from data
Andrés Cano, Andrés R. Masegosa and Serafín Moral
Department of Computer Science and Artificial Intelligence
University of Granada (Spain)
Dortmund, June 2010
Information Processing and Management of Uncertainty in Knowledge-Based Systems
Outline
1 Introduction
2 Learning Bayesian Networks (BN) from data
3 Importance Sampling for learning BN
4 Integration of Expert Knowledge
5 Experimental Evaluation
6 Conclusions & Future Work
Part I
Introduction
Bayesian Networks
Excellent models to graphically represent the dependency structure of the
underlying distribution in multivariate domains.
Learning this dependency structure from data in a multivariate problem
domain provides a very relevant source of knowledge (direct interactions,
conditional independencies, ...).
Learning Bayesian Networks from Data
Uncertainty in Model Selection
When learning BNs from data there usually are several models with a high
score (high posterior probability given the data).
This situation is especially common in problem domains with a high number of
variables and low sample sizes.
Integration of Expert Knowledge
Expert Knowledge
In many problem domains, expert knowledge is available.
The graphical structure of BNs greatly eases the interaction with a human expert:
Causal ordering.
D-separation criteria.
Previous Works I
There have been many attempts to introduce expert knowledge when learning
BNs from data.
Via Prior Distribution [2,5]: Use of specific prior distributions over the
possible graph structures to integrate expert knowledge:
The expert assigns higher prior probabilities to the most likely edges.
Previous Works II
Via structural restrictions [6]: the expert codifies his/her knowledge as structural
restrictions.
The expert defines the existence/absence of arcs and/or edges and causal
ordering restrictions.
The retrieved model must satisfy these restrictions.
Limitations of "Prior" Expert Knowledge
The system would have to ask the expert about his/her beliefs on every possible
feature of the BN (not feasible in large domains).
The expert could be biased towards providing only the “easiest” or clearest knowledge.
The system does not guide the user when introducing information about the BN
structure.
Interactive Learning of Bayesian Networks
Active Interaction with the Expert
Strategy: ask the expert about the presence of the edges that most reduce the
model uncertainty.
Method: a framework that allows an efficient and effective interaction with the expert.
The expert is only asked about these controversial structural features.
Part II
Learning Bayesian Networks from data
Notation
Let X = (X1, ..., Xn) be a set of n random variables. Val(Xi) is the set of values
of Xi.
We assume variables are enumerated in a total causal order.
We also assume a fully observed data set D.
A Bayesian Network B can be described by:
G the graph structure.
θG the parameters.
A graph G can be decomposed as a vector of parent sets:
G = (Pa(X1), ..., Pa(Xn))
We also define Ui as a random variable taking values in the space of all possible
parent sets of Xi , Val(Ui ).
Let G be a random variable taking values in the set Val(G) of all possible graph
structures consistent with the total order.
The Bayesian Learning Framework
Scoring a graph structure
Marginal likelihood of a graph structure:

P(G = G | D) = P(G|D) ∝ P(G) · P(D|G) = ∏i score(Xi, PaG(Xi) | D)

scoreBDeu(Xi, Ui | D) = Pi(U) · ∏j=0..|Ui| [ Γ(αij) / Γ(αij + Nij) ] · ∏k=1..|Xi| [ Γ(αijk + Nijk) / Γ(αijk) ]

Pi(U) is the prior probability that U is the parent set of Xi.
Approximating the posterior P(G|D)
Our approach relies on approximating P(G|D).
It lets us know which graph structures are the most likely (i.e. which best explain
the data).
Exhaustive enumeration is not feasible because the space of graph structures is
super-exponential.
Approximating the Posterior
Factorization of P(G|D)
The assumption of a total order implies that the selections of the parent sets for
each Xi are independent of one another:

P(G|D) = ∏i P(Ui | D)

P(G|D) can thus be decomposed into n independent problems.
P(Ui|D) is the posterior probability over the possible parent sets of variable Xi.
Each of the sub-problems still has exponential size.
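As a small illustration of this factorization, assuming the per-variable posteriors are available as dictionaries (all names are illustrative):

```python
import math

def log_graph_posterior(graph, parent_posteriors):
    """log P(G|D) = sum_i log P(Ui = Pa(Xi) | D), under a fixed total order.

    graph             : dict {i: frozenset of parents of Xi}.
    parent_posteriors : dict {i: {parent_set: probability}}.
    """
    return sum(math.log(parent_posteriors[i][u]) for i, u in graph.items())
```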
Approximating the Posterior P(Ui|D)
Closed Form Solution
In [3], a closed-form solution was proposed, assuming a node can have at most
K parents.
It has polynomial complexity, O(n^(K+1)).
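A hedged sketch of the exhaustive computation behind this closed form: enumerate every parent set of size at most K and normalize the scores (the function names and the use of log scores are assumptions):

```python
import math
from itertools import combinations

def posterior_up_to_k(candidates, log_score, K=3):
    """Exact P(Ui | D) when Xi has at most K parents: O(n^(K+1)) parent sets."""
    sets = [frozenset(c) for k in range(K + 1)
            for c in combinations(candidates, k)]
    m = max(log_score(u) for u in sets)          # shift for numerical stability
    w = {u: math.exp(log_score(u) - m) for u in sets}
    z = sum(w.values())
    return {u: v / z for u, v in w.items()}
```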
Markov Chain Monte Carlo
Let Val(Ui) be the state space of the Markov chain.
If the Markov chain is in some state U at iteration t, a new model U′ is randomly
drawn by adding, deleting or switching an edge.
The Markov chain moves to state U′ at iteration t + 1 with probability:

m(U^t, U^{t+1}) = min{ 1, [ N(U) · score(D|U′) ] / [ N(U′) · score(D|U) ] }

If not, the Markov chain remains in state U.
This Markov chain has a stationary distribution (t → ∞) which is P(U|D).
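A minimal sketch of this Metropolis-Hastings scheme over the parent sets of one variable. It uses only add/delete (toggle) moves, so every state has the same number of neighbours and the N(U) terms cancel in the acceptance ratio; all names and the burn-in fraction are assumptions:

```python
import math
import random

def mcmc_parent_sets(candidates, log_score, n_iter=10000):
    """Approximate P(Ui | D) by the visit frequencies of a Markov chain."""
    state = frozenset()                        # start from the empty parent set
    visits = {}
    for t in range(n_iter):
        x = random.choice(candidates)          # toggle one candidate edge
        proposal = state - {x} if x in state else state | {x}
        log_r = log_score(proposal) - log_score(state)
        if random.random() < math.exp(min(0.0, log_r)):
            state = proposal                   # accept the move
        if t >= n_iter // 10:                  # crude burn-in (assumption)
            visits[state] = visits.get(state, 0) + 1
    total = sum(visits.values())
    return {u: c / total for u, c in visits.items()}
```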
Part III
Importance Sampling
Description
Based on the use of an auxiliary distribution Q which roughly approximates P,
the target distribution.
Q is a distribution which is easier to sample from.

EP(f(x)) = ∫ [P(x)/Q(x)] f(x) Q(x) dx = EQ(w(x) f(x))   (1)

where w(x) = P(x)/Q(x) acts as a weight function.
A set of T samples {x1, ..., xT} is generated from Q, and then the weights
wt = P(xt)/Q(xt) are computed.
The estimator μ̂ of EP(f(x)) is finally computed as follows:

μ̂ = [ Σt=1..T w(xt) f(xt) ] / [ Σt=1..T w(xt) ]   (2)
Key aspect: P and Q need only be known up to a multiplicative constant.
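A minimal sketch of the self-normalized estimator in Eq. (2). The target here is an unnormalized Gaussian (so EP[X] = 1) and the proposal Q is a wider Gaussian; both densities are placeholders chosen only for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)

def p_unnorm(x):
    """Target density known only up to a constant: P(x) ∝ exp(-(x-1)^2 / 2)."""
    return np.exp(-0.5 * (x - 1.0) ** 2)

T = 10_000
x = rng.normal(loc=0.0, scale=2.0, size=T)              # samples from Q
q = np.exp(-0.5 * (x / 2.0) ** 2) / (2.0 * np.sqrt(2 * np.pi))
w = p_unnorm(x) / q                                     # weights w(x) = P/Q
mu_hat = np.sum(w * x) / np.sum(w)                      # Eq. (2) with f(x) = x
print(mu_hat)                                           # close to 1.0
```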
Importance Sampling for learning BNs
Step 0:
Candidate parents are considered in a random permutation.
Score of the initial model: score(X, {∅}|D)
Step 1:
Evaluate C as parent of X.
Compute the ratio:

r = score(X, {C}|D) / ( score(X, {C}|D) + score(X, {∅}|D) ) = 0.8

Randomly accept C as parent of X with probability r = 0.8 −→ Accepted.
Q = 0.8
Step 2:
Evaluate B as parent of X.
Compute the ratio:

r = score(X, {C, B}|D) / ( score(X, {C, B}|D) + score(X, {C}|D) ) = 0.1

Randomly accept B as parent of X with probability r = 0.1 −→ Not accepted.
Q = 0.8 · 0.9
Step 3:
Evaluate A as parent of X.
Compute the ratio:

r = score(X, {C, A}|D) / ( score(X, {C, A}|D) + score(X, {C}|D) ) = 0.7

Randomly accept A as parent of X with probability r = 0.7 −→ Accepted.
Q = 0.8 · 0.9 · 0.7 = 0.504
Weight of the final model:

W1 = score(X, {C, A}|D) / 0.504
The process is repeated T times.
Using these samples, we obtain an approximation of P(Ui|D), as sketched below.
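Below is a hedged sketch of the sequential proposal illustrated in Steps 0-3: each candidate parent is accepted with the score ratio r, the product of the per-step probabilities gives Q, and each sample is weighted by score/Q. Function and variable names are illustrative; log scores are used for numerical stability:

```python
import math
import random

def sample_parent_set(candidates, log_score):
    """Draw one weighted parent-set sample for a variable X (a sketch).

    Returns (parent_set, log_weight) with log_weight = log score - log Q.
    """
    order = list(candidates)
    random.shuffle(order)                  # Step 0: random permutation
    parents, log_q = frozenset(), 0.0
    for c in order:
        s_with = log_score(parents | {c})
        s_without = log_score(parents)
        m = max(s_with, s_without)         # log-sum-exp trick
        r = math.exp(s_with - m) / (math.exp(s_with - m) + math.exp(s_without - m))
        if random.random() < r:            # accept c as a parent w.p. r
            parents = parents | {c}
            log_q += math.log(r)
        else:
            log_q += math.log(1.0 - r)
    return parents, log_score(parents) - log_q

def approximate_posterior(candidates, log_score, T=1000):
    """Aggregate T weighted samples into an estimate of P(Ui | D)."""
    samples = [sample_parent_set(candidates, log_score) for _ in range(T)]
    m = max(lw for _, lw in samples)       # shift log weights for stability
    acc = {}
    for u, lw in samples:
        acc[u] = acc.get(u, 0.0) + math.exp(lw - m)
    total = sum(acc.values())
    return {u: w / total for u, w in acc.items()}
```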
Part IV
Integrating Expert Knowledge
Methodology Description
Prior Knowledge
Representing absence of prior knowledge
“A uniform prior over structures is usually chosen for convenience” [3].
P(G) does not grow with the data, but it matters at low sample sizes.
Let us assume that each edge has prior probability p, independently of the others.
If Xi has k parents out of m candidate nodes:

P(Pa(Xi)) = p^k (1 − p)^(m−k)

If the number of candidate parents m grows, p should be decreased to control the
number of false positive edges: “multiplicity correction”.
This is solved by assuming that p has a Beta prior with parameter α = 0.5 [11]:

Pi(Pa(Xi)) = [ Γ(2α) / Γ(m + 2α) ] · [ Γ(k + α) Γ(m − k + α) / ( Γ(α) Γ(α) ) ]
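A small sketch of this marginal structural prior (a Beta-binomial with α = 0.5); the function name is illustrative:

```python
from scipy.special import gammaln

def log_structure_prior(k, m, a=0.5):
    """log Pi(Pa(Xi)) for k parents among m candidates, with p ~ Beta(a, a)."""
    return (gammaln(2 * a) - gammaln(m + 2 * a)
            + gammaln(k + a) + gammaln(m - k + a)
            - 2 * gammaln(a))
```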
Interacting with the Expert
Example edge posteriors: P(S → X|D) = 0.85, P(R → X|D) = 0.80, P(C → X|D) = 0.45.
Description
Key idea: the lower the entropy of P(Ui|D), the more reliable the learning.
Expert interaction is carried out in order to reduce the entropy H(P(Ui|D)).
The system asks the expert about the edges with the highest entropy.
After querying the expert (here the presence of C → X is confirmed):
P(S → X|D) = 0.88, P(R → X|D) = 0.77, P(C → X|D) = 1.0.
Description
The entropy of P(Ui|D) is reduced: the probability mass concentrates around
one model.
This methodology can be applied iteratively, asking about the presence/absence
of more edges.
The interaction stops when the probability of the MAP model is L times higher
than that of the second most probable model. A sketch of this query loop follows.
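A hedged sketch of the query loop just described: edge marginals are read off the approximate posterior over parent sets, the most uncertain edge is queried next, and the posterior is conditioned on the expert's answer. All names are illustrative; only the stopping factor L comes from the slides:

```python
import math

def edge_entropy(p):
    """Binary entropy of an edge marginal p = P(c -> X | D)."""
    if p in (0.0, 1.0):
        return 0.0
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)

def interaction_loop(posterior, candidates, ask_expert, L=10.0):
    """posterior: dict {parent_set: prob}; ask_expert(c) -> True if c -> X."""
    while True:
        ranked = sorted(posterior.values(), reverse=True)
        if len(ranked) < 2 or ranked[0] >= L * ranked[1]:
            return posterior               # MAP model is L times more probable
        marg = {c: sum(p for u, p in posterior.items() if c in u)
                for c in candidates}       # edge marginals P(c -> X | D)
        c = max(marg, key=lambda v: edge_entropy(marg[v]))
        if edge_entropy(marg[c]) == 0.0:
            return posterior               # nothing uncertain left to ask
        answer = ask_expert(c)             # simulated expert in the experiments
        posterior = {u: p for u, p in posterior.items() if (c in u) == answer}
        z = sum(posterior.values())
        if z == 0.0:
            return posterior               # answer contradicts every sample
        posterior = {u: p / z for u, p in posterior.items()}
```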
Part V
Experimental Evaluation
Experimental Set-up
Bayesian Networks:
alarm (37 nodes), boblo (23 nodes), boerlage-92 (23 nodes), hailfinder (56
nodes), insurance (27 nodes).
Sample Sizes:
We ran the algorithms 10 times with different sample sizes: 50, 100, 500
and 1000.
Evaluation Measures
Number of missing/extra links, Kullback-Leibler distance...
We report average values across the five networks.
Expert interaction is simulated: the true BN model is consulted when asking
about the presence/absence of an edge.
Structure Prior Evaluation
(Plots: number of structural errors; KL distance)
Analysis
The Beta prior reduces the number of structural errors for both IS and MCMC.
IS has a lower number of errors than MCMC, especially with low sample sizes.
The Beta prior also reduces the KL distance for both IS and MCMC.
Expert Interaction Evaluation
(Plots: number of structural errors; KL distance)
Analysis
The more the posterior probability mass is concentrated around one model,
the lower the number of structural errors.
The KL distance does not improve significantly with large sample sizes (these
structural errors do not have a great impact on the predictive capacity).
(Plots: number of interactions; interaction accuracy)
Analysis
The number of interactions is feasible for a human expert.
Prior exhaustive querying: 600 questions on average.
Interaction accuracy: the ratio between the number of structural errors removed
and the number of interactions.
Average accuracy of random interactions: 1%.
Conclusions & Future Work
Conclusions
A new methodology to introduce expert knowledge when
learning BNs from data.
A new importance sampling technique for sampling BN structures.
The system asks the expert a feasible number of questions.
The interaction improves the quality of the inferred BN models.
Future Work
Extend these methods to learning BN models without
causal ordering assumptions.
Thanks for your attention!!
Questions?

More Related Content

What's hot

NBBC15, Reyjavik, June 08, 2015
NBBC15, Reyjavik, June 08, 2015NBBC15, Reyjavik, June 08, 2015
NBBC15, Reyjavik, June 08, 2015
Christian Robert
 
06 cv mil_learning_and_inference
06 cv mil_learning_and_inference06 cv mil_learning_and_inference
06 cv mil_learning_and_inference
zukun
 
Intractable likelihoods
Intractable likelihoodsIntractable likelihoods
Intractable likelihoods
Christian Robert
 
Support Vector Machine
Support Vector MachineSupport Vector Machine
Support Vector Machine
Lucas Xu
 
Reliable ABC model choice via random forests
Reliable ABC model choice via random forestsReliable ABC model choice via random forests
Reliable ABC model choice via random forests
Christian Robert
 
Kernel methods for data integration in systems biology
Kernel methods for data integration in systems biology Kernel methods for data integration in systems biology
Kernel methods for data integration in systems biology
tuxette
 
(DL輪読)Matching Networks for One Shot Learning
(DL輪読)Matching Networks for One Shot Learning(DL輪読)Matching Networks for One Shot Learning
(DL輪読)Matching Networks for One Shot Learning
Masahiro Suzuki
 
Data-Driven Recommender Systems
Data-Driven Recommender SystemsData-Driven Recommender Systems
Data-Driven Recommender Systems
recsysfr
 
Pattern learning and recognition on statistical manifolds: An information-geo...
Pattern learning and recognition on statistical manifolds: An information-geo...Pattern learning and recognition on statistical manifolds: An information-geo...
Pattern learning and recognition on statistical manifolds: An information-geo...
Frank Nielsen
 
About functional SIR
About functional SIRAbout functional SIR
About functional SIR
tuxette
 
Dictionary Learning for Massive Matrix Factorization
Dictionary Learning for Massive Matrix FactorizationDictionary Learning for Massive Matrix Factorization
Dictionary Learning for Massive Matrix Factorization
recsysfr
 
Graph Neural Network for Phenotype Prediction
Graph Neural Network for Phenotype PredictionGraph Neural Network for Phenotype Prediction
Graph Neural Network for Phenotype Prediction
tuxette
 
Matching networks for one shot learning
Matching networks for one shot learningMatching networks for one shot learning
Matching networks for one shot learning
Kazuki Fujikawa
 
Random Matrix Theory in Array Signal Processing: Application Examples
Random Matrix Theory in Array Signal Processing: Application ExamplesRandom Matrix Theory in Array Signal Processing: Application Examples
Random Matrix Theory in Array Signal Processing: Application Examples
Förderverein Technische Fakultät
 
Minghui Conference Cross-Validation Talk
Minghui Conference Cross-Validation TalkMinghui Conference Cross-Validation Talk
Minghui Conference Cross-Validation Talk
Wei Wang
 
Parallel Filter-Based Feature Selection Based on Balanced Incomplete Block De...
Parallel Filter-Based Feature Selection Based on Balanced Incomplete Block De...Parallel Filter-Based Feature Selection Based on Balanced Incomplete Block De...
Parallel Filter-Based Feature Selection Based on Balanced Incomplete Block De...
AMIDST Toolbox
 
Explanable models for time series with random forest
Explanable models for time series with random forestExplanable models for time series with random forest
Explanable models for time series with random forest
tuxette
 
Prototype-based models in machine learning
Prototype-based models in machine learningPrototype-based models in machine learning
Prototype-based models in machine learning
University of Groningen
 
Boston talk
Boston talkBoston talk
Boston talk
Christian Robert
 
Considerate Approaches to ABC Model Selection
Considerate Approaches to ABC Model SelectionConsiderate Approaches to ABC Model Selection
Considerate Approaches to ABC Model Selection
Michael Stumpf
 

What's hot (20)

NBBC15, Reyjavik, June 08, 2015
NBBC15, Reyjavik, June 08, 2015NBBC15, Reyjavik, June 08, 2015
NBBC15, Reyjavik, June 08, 2015
 
06 cv mil_learning_and_inference
06 cv mil_learning_and_inference06 cv mil_learning_and_inference
06 cv mil_learning_and_inference
 
Intractable likelihoods
Intractable likelihoodsIntractable likelihoods
Intractable likelihoods
 
Support Vector Machine
Support Vector MachineSupport Vector Machine
Support Vector Machine
 
Reliable ABC model choice via random forests
Reliable ABC model choice via random forestsReliable ABC model choice via random forests
Reliable ABC model choice via random forests
 
Kernel methods for data integration in systems biology
Kernel methods for data integration in systems biology Kernel methods for data integration in systems biology
Kernel methods for data integration in systems biology
 
(DL輪読)Matching Networks for One Shot Learning
(DL輪読)Matching Networks for One Shot Learning(DL輪読)Matching Networks for One Shot Learning
(DL輪読)Matching Networks for One Shot Learning
 
Data-Driven Recommender Systems
Data-Driven Recommender SystemsData-Driven Recommender Systems
Data-Driven Recommender Systems
 
Pattern learning and recognition on statistical manifolds: An information-geo...
Pattern learning and recognition on statistical manifolds: An information-geo...Pattern learning and recognition on statistical manifolds: An information-geo...
Pattern learning and recognition on statistical manifolds: An information-geo...
 
About functional SIR
About functional SIRAbout functional SIR
About functional SIR
 
Dictionary Learning for Massive Matrix Factorization
Dictionary Learning for Massive Matrix FactorizationDictionary Learning for Massive Matrix Factorization
Dictionary Learning for Massive Matrix Factorization
 
Graph Neural Network for Phenotype Prediction
Graph Neural Network for Phenotype PredictionGraph Neural Network for Phenotype Prediction
Graph Neural Network for Phenotype Prediction
 
Matching networks for one shot learning
Matching networks for one shot learningMatching networks for one shot learning
Matching networks for one shot learning
 
Random Matrix Theory in Array Signal Processing: Application Examples
Random Matrix Theory in Array Signal Processing: Application ExamplesRandom Matrix Theory in Array Signal Processing: Application Examples
Random Matrix Theory in Array Signal Processing: Application Examples
 
Minghui Conference Cross-Validation Talk
Minghui Conference Cross-Validation TalkMinghui Conference Cross-Validation Talk
Minghui Conference Cross-Validation Talk
 
Parallel Filter-Based Feature Selection Based on Balanced Incomplete Block De...
Parallel Filter-Based Feature Selection Based on Balanced Incomplete Block De...Parallel Filter-Based Feature Selection Based on Balanced Incomplete Block De...
Parallel Filter-Based Feature Selection Based on Balanced Incomplete Block De...
 
Explanable models for time series with random forest
Explanable models for time series with random forestExplanable models for time series with random forest
Explanable models for time series with random forest
 
Prototype-based models in machine learning
Prototype-based models in machine learningPrototype-based models in machine learning
Prototype-based models in machine learning
 
Boston talk
Boston talkBoston talk
Boston talk
 
Considerate Approaches to ABC Model Selection
Considerate Approaches to ABC Model SelectionConsiderate Approaches to ABC Model Selection
Considerate Approaches to ABC Model Selection
 

Viewers also liked

“Probabilistic Logic Programs and Their Applications”
“Probabilistic Logic Programs and Their Applications”“Probabilistic Logic Programs and Their Applications”
“Probabilistic Logic Programs and Their Applications”
diannepatricia
 
Monte Carlo Methods
Monte Carlo MethodsMonte Carlo Methods
Monte Carlo Methods
James Bell
 
Monte carlo integration, importance sampling, basic idea of markov chain mont...
Monte carlo integration, importance sampling, basic idea of markov chain mont...Monte carlo integration, importance sampling, basic idea of markov chain mont...
Monte carlo integration, importance sampling, basic idea of markov chain mont...
BIZIMANA Appolinaire
 
Particle Filter
Particle FilterParticle Filter
Particle Filter
Takahiro Inoue
 
History of Mathematics
History of MathematicsHistory of Mathematics
History of Mathematics
tmp44
 
Particle Filters and Applications in Computer Vision
Particle Filters and Applications in Computer VisionParticle Filters and Applications in Computer Vision
Particle Filters and Applications in Computer Vision
zukun
 
Monte Carlo Simulation Methods
Monte Carlo Simulation MethodsMonte Carlo Simulation Methods
Monte Carlo Simulation Methods
ioneec
 
Cognitive Radio
Cognitive RadioCognitive Radio
Cognitive Radio
Rajan Kumar
 
BER PERFORMANCE ANALYSIS OF OFDM IN COGNITIVE RADIO NETWORK IN RAYLEIGH FADIN...
BER PERFORMANCE ANALYSIS OF OFDM IN COGNITIVE RADIO NETWORK IN RAYLEIGH FADIN...BER PERFORMANCE ANALYSIS OF OFDM IN COGNITIVE RADIO NETWORK IN RAYLEIGH FADIN...
BER PERFORMANCE ANALYSIS OF OFDM IN COGNITIVE RADIO NETWORK IN RAYLEIGH FADIN...
International Journal of Technical Research & Application
 

Viewers also liked (9)

“Probabilistic Logic Programs and Their Applications”
“Probabilistic Logic Programs and Their Applications”“Probabilistic Logic Programs and Their Applications”
“Probabilistic Logic Programs and Their Applications”
 
Monte Carlo Methods
Monte Carlo MethodsMonte Carlo Methods
Monte Carlo Methods
 
Monte carlo integration, importance sampling, basic idea of markov chain mont...
Monte carlo integration, importance sampling, basic idea of markov chain mont...Monte carlo integration, importance sampling, basic idea of markov chain mont...
Monte carlo integration, importance sampling, basic idea of markov chain mont...
 
Particle Filter
Particle FilterParticle Filter
Particle Filter
 
History of Mathematics
History of MathematicsHistory of Mathematics
History of Mathematics
 
Particle Filters and Applications in Computer Vision
Particle Filters and Applications in Computer VisionParticle Filters and Applications in Computer Vision
Particle Filters and Applications in Computer Vision
 
Monte Carlo Simulation Methods
Monte Carlo Simulation MethodsMonte Carlo Simulation Methods
Monte Carlo Simulation Methods
 
Cognitive Radio
Cognitive RadioCognitive Radio
Cognitive Radio
 
BER PERFORMANCE ANALYSIS OF OFDM IN COGNITIVE RADIO NETWORK IN RAYLEIGH FADIN...
BER PERFORMANCE ANALYSIS OF OFDM IN COGNITIVE RADIO NETWORK IN RAYLEIGH FADIN...BER PERFORMANCE ANALYSIS OF OFDM IN COGNITIVE RADIO NETWORK IN RAYLEIGH FADIN...
BER PERFORMANCE ANALYSIS OF OFDM IN COGNITIVE RADIO NETWORK IN RAYLEIGH FADIN...
 

Similar to An Importance Sampling Approach to Integrate Expert Knowledge When Learning Bayesian Networks From Data

Interval Pattern Structures: An introdution
Interval Pattern Structures: An introdutionInterval Pattern Structures: An introdution
Interval Pattern Structures: An introdution
INSA Lyon - L'Institut National des Sciences Appliquées de Lyon
 
Accelerating Metropolis Hastings with Lightweight Inference Compilation
Accelerating Metropolis Hastings with Lightweight Inference CompilationAccelerating Metropolis Hastings with Lightweight Inference Compilation
Accelerating Metropolis Hastings with Lightweight Inference Compilation
Feynman Liang
 
Bayesian_Decision_Theory-3.pdf
Bayesian_Decision_Theory-3.pdfBayesian_Decision_Theory-3.pdf
Bayesian_Decision_Theory-3.pdf
sivasanthoshdasari1
 
20070702 Text Categorization
20070702 Text Categorization20070702 Text Categorization
20070702 Text Categorization
midi
 
Interactive Learning of Bayesian Networks
Interactive Learning of Bayesian NetworksInteractive Learning of Bayesian Networks
Interactive Learning of Bayesian Networks
NTNU
 
ML unit-1.pptx
ML unit-1.pptxML unit-1.pptx
ML unit-1.pptx
SwarnaKumariChinni
 
Dimensionality reduction with UMAP
Dimensionality reduction with UMAPDimensionality reduction with UMAP
Dimensionality reduction with UMAP
Jakub Bartczuk
 
SASA 2016
SASA 2016SASA 2016
CLIM Program: Remote Sensing Workshop, Statistical Emulation with Dimension R...
CLIM Program: Remote Sensing Workshop, Statistical Emulation with Dimension R...CLIM Program: Remote Sensing Workshop, Statistical Emulation with Dimension R...
CLIM Program: Remote Sensing Workshop, Statistical Emulation with Dimension R...
The Statistical and Applied Mathematical Sciences Institute
 
When Classifier Selection meets Information Theory: A Unifying View
When Classifier Selection meets Information Theory: A Unifying ViewWhen Classifier Selection meets Information Theory: A Unifying View
When Classifier Selection meets Information Theory: A Unifying View
Mohamed Farouk
 
Machine Learning
Machine LearningMachine Learning
Machine Learning
butest
 
Lecture11 xing
Lecture11 xingLecture11 xing
Lecture11 xing
Tianlu Wang
 
CLIM: Transition Workshop - Statistical Emulation with Dimension Reduction fo...
CLIM: Transition Workshop - Statistical Emulation with Dimension Reduction fo...CLIM: Transition Workshop - Statistical Emulation with Dimension Reduction fo...
CLIM: Transition Workshop - Statistical Emulation with Dimension Reduction fo...
The Statistical and Applied Mathematical Sciences Institute
 
Econometrics 2017-graduate-3
Econometrics 2017-graduate-3Econometrics 2017-graduate-3
Econometrics 2017-graduate-3
Arthur Charpentier
 
Machine Learning: Foundations Course Number 0368403401
Machine Learning: Foundations Course Number 0368403401Machine Learning: Foundations Course Number 0368403401
Machine Learning: Foundations Course Number 0368403401
butest
 
Extracting biclusters of similar values with Triadic Concept Analysis
Extracting biclusters of similar values with Triadic Concept AnalysisExtracting biclusters of similar values with Triadic Concept Analysis
Extracting biclusters of similar values with Triadic Concept Analysis
INSA Lyon - L'Institut National des Sciences Appliquées de Lyon
 
Divide_and_Contrast__Source_free_Domain_Adaptation_via_Adaptive_Contrastive_L...
Divide_and_Contrast__Source_free_Domain_Adaptation_via_Adaptive_Contrastive_L...Divide_and_Contrast__Source_free_Domain_Adaptation_via_Adaptive_Contrastive_L...
Divide_and_Contrast__Source_free_Domain_Adaptation_via_Adaptive_Contrastive_L...
Huang Po Chun
 
Lecture15 xing
Lecture15 xingLecture15 xing
Lecture15 xing
Tianlu Wang
 
Bayesian Deep Learning
Bayesian Deep LearningBayesian Deep Learning
Bayesian Deep Learning
RayKim51
 
Probabilistic Modelling with Information Filtering Networks
Probabilistic Modelling with Information Filtering NetworksProbabilistic Modelling with Information Filtering Networks
Probabilistic Modelling with Information Filtering Networks
Tomaso Aste
 

Similar to An Importance Sampling Approach to Integrate Expert Knowledge When Learning Bayesian Networks From Data (20)

Interval Pattern Structures: An introdution
Interval Pattern Structures: An introdutionInterval Pattern Structures: An introdution
Interval Pattern Structures: An introdution
 
Accelerating Metropolis Hastings with Lightweight Inference Compilation
Accelerating Metropolis Hastings with Lightweight Inference CompilationAccelerating Metropolis Hastings with Lightweight Inference Compilation
Accelerating Metropolis Hastings with Lightweight Inference Compilation
 
Bayesian_Decision_Theory-3.pdf
Bayesian_Decision_Theory-3.pdfBayesian_Decision_Theory-3.pdf
Bayesian_Decision_Theory-3.pdf
 
20070702 Text Categorization
20070702 Text Categorization20070702 Text Categorization
20070702 Text Categorization
 
Interactive Learning of Bayesian Networks
Interactive Learning of Bayesian NetworksInteractive Learning of Bayesian Networks
Interactive Learning of Bayesian Networks
 
ML unit-1.pptx
ML unit-1.pptxML unit-1.pptx
ML unit-1.pptx
 
Dimensionality reduction with UMAP
Dimensionality reduction with UMAPDimensionality reduction with UMAP
Dimensionality reduction with UMAP
 
SASA 2016
SASA 2016SASA 2016
SASA 2016
 
CLIM Program: Remote Sensing Workshop, Statistical Emulation with Dimension R...
CLIM Program: Remote Sensing Workshop, Statistical Emulation with Dimension R...CLIM Program: Remote Sensing Workshop, Statistical Emulation with Dimension R...
CLIM Program: Remote Sensing Workshop, Statistical Emulation with Dimension R...
 
When Classifier Selection meets Information Theory: A Unifying View
When Classifier Selection meets Information Theory: A Unifying ViewWhen Classifier Selection meets Information Theory: A Unifying View
When Classifier Selection meets Information Theory: A Unifying View
 
Machine Learning
Machine LearningMachine Learning
Machine Learning
 
Lecture11 xing
Lecture11 xingLecture11 xing
Lecture11 xing
 
CLIM: Transition Workshop - Statistical Emulation with Dimension Reduction fo...
CLIM: Transition Workshop - Statistical Emulation with Dimension Reduction fo...CLIM: Transition Workshop - Statistical Emulation with Dimension Reduction fo...
CLIM: Transition Workshop - Statistical Emulation with Dimension Reduction fo...
 
Econometrics 2017-graduate-3
Econometrics 2017-graduate-3Econometrics 2017-graduate-3
Econometrics 2017-graduate-3
 
Machine Learning: Foundations Course Number 0368403401
Machine Learning: Foundations Course Number 0368403401Machine Learning: Foundations Course Number 0368403401
Machine Learning: Foundations Course Number 0368403401
 
Extracting biclusters of similar values with Triadic Concept Analysis
Extracting biclusters of similar values with Triadic Concept AnalysisExtracting biclusters of similar values with Triadic Concept Analysis
Extracting biclusters of similar values with Triadic Concept Analysis
 
Divide_and_Contrast__Source_free_Domain_Adaptation_via_Adaptive_Contrastive_L...
Divide_and_Contrast__Source_free_Domain_Adaptation_via_Adaptive_Contrastive_L...Divide_and_Contrast__Source_free_Domain_Adaptation_via_Adaptive_Contrastive_L...
Divide_and_Contrast__Source_free_Domain_Adaptation_via_Adaptive_Contrastive_L...
 
Lecture15 xing
Lecture15 xingLecture15 xing
Lecture15 xing
 
Bayesian Deep Learning
Bayesian Deep LearningBayesian Deep Learning
Bayesian Deep Learning
 
Probabilistic Modelling with Information Filtering Networks
Probabilistic Modelling with Information Filtering NetworksProbabilistic Modelling with Information Filtering Networks
Probabilistic Modelling with Information Filtering Networks
 

More from NTNU

Varying parameter in classification based on imprecise probabilities
Varying parameter in classification based on imprecise probabilitiesVarying parameter in classification based on imprecise probabilities
Varying parameter in classification based on imprecise probabilities
NTNU
 
Bagging Decision Trees on Data Sets with Classification Noise
Bagging Decision Trees on Data Sets with Classification NoiseBagging Decision Trees on Data Sets with Classification Noise
Bagging Decision Trees on Data Sets with Classification Noise
NTNU
 
lassification with decision trees from a nonparametric predictive inference p...
lassification with decision trees from a nonparametric predictive inference p...lassification with decision trees from a nonparametric predictive inference p...
lassification with decision trees from a nonparametric predictive inference p...
NTNU
 
Locally Averaged Bayesian Dirichlet Metrics
Locally Averaged Bayesian Dirichlet MetricsLocally Averaged Bayesian Dirichlet Metrics
Locally Averaged Bayesian Dirichlet Metrics
NTNU
 
Application of a Selective Gaussian Naïve Bayes Model for Diffuse-Large B-Cel...
Application of a Selective Gaussian Naïve Bayes Model for Diffuse-Large B-Cel...Application of a Selective Gaussian Naïve Bayes Model for Diffuse-Large B-Cel...
Application of a Selective Gaussian Naïve Bayes Model for Diffuse-Large B-Cel...
NTNU
 
An interactive approach for cleaning noisy observations in Bayesian networks ...
An interactive approach for cleaning noisy observations in Bayesian networks ...An interactive approach for cleaning noisy observations in Bayesian networks ...
An interactive approach for cleaning noisy observations in Bayesian networks ...
NTNU
 
Learning classifiers from discretized expression quantitative trait loci
Learning classifiers from discretized expression quantitative trait lociLearning classifiers from discretized expression quantitative trait loci
Learning classifiers from discretized expression quantitative trait loci
NTNU
 
Split Criterions for Variable Selection Using Decision Trees
Split Criterions for Variable Selection Using Decision TreesSplit Criterions for Variable Selection Using Decision Trees
Split Criterions for Variable Selection Using Decision Trees
NTNU
 
A Semi-naive Bayes Classifier with Grouping of Cases
A Semi-naive Bayes Classifier with Grouping of CasesA Semi-naive Bayes Classifier with Grouping of Cases
A Semi-naive Bayes Classifier with Grouping of Cases
NTNU
 
Combining Decision Trees Based on Imprecise Probabilities and Uncertainty Mea...
Combining Decision Trees Based on Imprecise Probabilities and Uncertainty Mea...Combining Decision Trees Based on Imprecise Probabilities and Uncertainty Mea...
Combining Decision Trees Based on Imprecise Probabilities and Uncertainty Mea...
NTNU
 
A Bayesian approach to estimate probabilities in classification trees
A Bayesian approach to estimate probabilities in classification treesA Bayesian approach to estimate probabilities in classification trees
A Bayesian approach to estimate probabilities in classification trees
NTNU
 
A Bayesian Random Split to Build Ensembles of Classification Trees
A Bayesian Random Split to Build Ensembles of Classification TreesA Bayesian Random Split to Build Ensembles of Classification Trees
A Bayesian Random Split to Build Ensembles of Classification Trees
NTNU
 
An Experimental Study about Simple Decision Trees for Bagging Ensemble on Dat...
An Experimental Study about Simple Decision Trees for Bagging Ensemble on Dat...An Experimental Study about Simple Decision Trees for Bagging Ensemble on Dat...
An Experimental Study about Simple Decision Trees for Bagging Ensemble on Dat...
NTNU
 
Selective Gaussian Naïve Bayes Model for Diffuse Large-B-Cell Lymphoma Classi...
Selective Gaussian Naïve Bayes Model for Diffuse Large-B-Cell Lymphoma Classi...Selective Gaussian Naïve Bayes Model for Diffuse Large-B-Cell Lymphoma Classi...
Selective Gaussian Naïve Bayes Model for Diffuse Large-B-Cell Lymphoma Classi...
NTNU
 
Evaluating query-independent object features for relevancy prediction
Evaluating query-independent object features for relevancy predictionEvaluating query-independent object features for relevancy prediction
Evaluating query-independent object features for relevancy prediction
NTNU
 
Effects of Highly Agreed Documents in Relevancy Prediction
Effects of Highly Agreed Documents in Relevancy PredictionEffects of Highly Agreed Documents in Relevancy Prediction
Effects of Highly Agreed Documents in Relevancy Prediction
NTNU
 
Conference poster 6
Conference poster 6Conference poster 6
Conference poster 6
NTNU
 

More from NTNU (17)

Varying parameter in classification based on imprecise probabilities
Varying parameter in classification based on imprecise probabilitiesVarying parameter in classification based on imprecise probabilities
Varying parameter in classification based on imprecise probabilities
 
Bagging Decision Trees on Data Sets with Classification Noise
Bagging Decision Trees on Data Sets with Classification NoiseBagging Decision Trees on Data Sets with Classification Noise
Bagging Decision Trees on Data Sets with Classification Noise
 
lassification with decision trees from a nonparametric predictive inference p...
lassification with decision trees from a nonparametric predictive inference p...lassification with decision trees from a nonparametric predictive inference p...
lassification with decision trees from a nonparametric predictive inference p...
 
Locally Averaged Bayesian Dirichlet Metrics
Locally Averaged Bayesian Dirichlet MetricsLocally Averaged Bayesian Dirichlet Metrics
Locally Averaged Bayesian Dirichlet Metrics
 
Application of a Selective Gaussian Naïve Bayes Model for Diffuse-Large B-Cel...
Application of a Selective Gaussian Naïve Bayes Model for Diffuse-Large B-Cel...Application of a Selective Gaussian Naïve Bayes Model for Diffuse-Large B-Cel...
Application of a Selective Gaussian Naïve Bayes Model for Diffuse-Large B-Cel...
 
An interactive approach for cleaning noisy observations in Bayesian networks ...
An interactive approach for cleaning noisy observations in Bayesian networks ...An interactive approach for cleaning noisy observations in Bayesian networks ...
An interactive approach for cleaning noisy observations in Bayesian networks ...
 
Learning classifiers from discretized expression quantitative trait loci
Learning classifiers from discretized expression quantitative trait lociLearning classifiers from discretized expression quantitative trait loci
Learning classifiers from discretized expression quantitative trait loci
 
Split Criterions for Variable Selection Using Decision Trees
Split Criterions for Variable Selection Using Decision TreesSplit Criterions for Variable Selection Using Decision Trees
Split Criterions for Variable Selection Using Decision Trees
 
A Semi-naive Bayes Classifier with Grouping of Cases
A Semi-naive Bayes Classifier with Grouping of CasesA Semi-naive Bayes Classifier with Grouping of Cases
A Semi-naive Bayes Classifier with Grouping of Cases
 
Combining Decision Trees Based on Imprecise Probabilities and Uncertainty Mea...
Combining Decision Trees Based on Imprecise Probabilities and Uncertainty Mea...Combining Decision Trees Based on Imprecise Probabilities and Uncertainty Mea...
Combining Decision Trees Based on Imprecise Probabilities and Uncertainty Mea...
 
A Bayesian approach to estimate probabilities in classification trees
A Bayesian approach to estimate probabilities in classification treesA Bayesian approach to estimate probabilities in classification trees
A Bayesian approach to estimate probabilities in classification trees
 
A Bayesian Random Split to Build Ensembles of Classification Trees
A Bayesian Random Split to Build Ensembles of Classification TreesA Bayesian Random Split to Build Ensembles of Classification Trees
A Bayesian Random Split to Build Ensembles of Classification Trees
 
An Experimental Study about Simple Decision Trees for Bagging Ensemble on Dat...
An Experimental Study about Simple Decision Trees for Bagging Ensemble on Dat...An Experimental Study about Simple Decision Trees for Bagging Ensemble on Dat...
An Experimental Study about Simple Decision Trees for Bagging Ensemble on Dat...
 
Selective Gaussian Naïve Bayes Model for Diffuse Large-B-Cell Lymphoma Classi...
Selective Gaussian Naïve Bayes Model for Diffuse Large-B-Cell Lymphoma Classi...Selective Gaussian Naïve Bayes Model for Diffuse Large-B-Cell Lymphoma Classi...
Selective Gaussian Naïve Bayes Model for Diffuse Large-B-Cell Lymphoma Classi...
 
Evaluating query-independent object features for relevancy prediction
Evaluating query-independent object features for relevancy predictionEvaluating query-independent object features for relevancy prediction
Evaluating query-independent object features for relevancy prediction
 
Effects of Highly Agreed Documents in Relevancy Prediction
Effects of Highly Agreed Documents in Relevancy PredictionEffects of Highly Agreed Documents in Relevancy Prediction
Effects of Highly Agreed Documents in Relevancy Prediction
 
Conference poster 6
Conference poster 6Conference poster 6
Conference poster 6
 

Recently uploaded

Immersive Learning That Works: Research Grounding and Paths Forward
Immersive Learning That Works: Research Grounding and Paths ForwardImmersive Learning That Works: Research Grounding and Paths Forward
Immersive Learning That Works: Research Grounding and Paths Forward
Leonel Morgado
 
Deep Software Variability and Frictionless Reproducibility
Deep Software Variability and Frictionless ReproducibilityDeep Software Variability and Frictionless Reproducibility
Deep Software Variability and Frictionless Reproducibility
University of Rennes, INSA Rennes, Inria/IRISA, CNRS
 
NuGOweek 2024 Ghent programme overview flyer
NuGOweek 2024 Ghent programme overview flyerNuGOweek 2024 Ghent programme overview flyer
NuGOweek 2024 Ghent programme overview flyer
pablovgd
 
Phenomics assisted breeding in crop improvement
Phenomics assisted breeding in crop improvementPhenomics assisted breeding in crop improvement
Phenomics assisted breeding in crop improvement
IshaGoswami9
 
Authoring a personal GPT for your research and practice: How we created the Q...
Authoring a personal GPT for your research and practice: How we created the Q...Authoring a personal GPT for your research and practice: How we created the Q...
Authoring a personal GPT for your research and practice: How we created the Q...
Leonel Morgado
 
Shallowest Oil Discovery of Turkiye.pptx
Shallowest Oil Discovery of Turkiye.pptxShallowest Oil Discovery of Turkiye.pptx
Shallowest Oil Discovery of Turkiye.pptx
Gokturk Mehmet Dilci
 
Sharlene Leurig - Enabling Onsite Water Use with Net Zero Water
Sharlene Leurig - Enabling Onsite Water Use with Net Zero WaterSharlene Leurig - Enabling Onsite Water Use with Net Zero Water
Sharlene Leurig - Enabling Onsite Water Use with Net Zero Water
Texas Alliance of Groundwater Districts
 
Applied Science: Thermodynamics, Laws & Methodology.pdf
Applied Science: Thermodynamics, Laws & Methodology.pdfApplied Science: Thermodynamics, Laws & Methodology.pdf
Applied Science: Thermodynamics, Laws & Methodology.pdf
University of Hertfordshire
 
The binding of cosmological structures by massless topological defects
The binding of cosmological structures by massless topological defectsThe binding of cosmological structures by massless topological defects
The binding of cosmological structures by massless topological defects
Sérgio Sacani
 
aziz sancar nobel prize winner: from mardin to nobel
aziz sancar nobel prize winner: from mardin to nobelaziz sancar nobel prize winner: from mardin to nobel
aziz sancar nobel prize winner: from mardin to nobel
İsa Badur
 
Micronuclei test.M.sc.zoology.fisheries.
Micronuclei test.M.sc.zoology.fisheries.Micronuclei test.M.sc.zoology.fisheries.
Micronuclei test.M.sc.zoology.fisheries.
Aditi Bajpai
 
Cytokines and their role in immune regulation.pptx
Cytokines and their role in immune regulation.pptxCytokines and their role in immune regulation.pptx
Cytokines and their role in immune regulation.pptx
Hitesh Sikarwar
 
Randomised Optimisation Algorithms in DAPHNE
Randomised Optimisation Algorithms in DAPHNERandomised Optimisation Algorithms in DAPHNE
Randomised Optimisation Algorithms in DAPHNE
University of Maribor
 
Equivariant neural networks and representation theory
Equivariant neural networks and representation theoryEquivariant neural networks and representation theory
Equivariant neural networks and representation theory
Daniel Tubbenhauer
 
EWOCS-I: The catalog of X-ray sources in Westerlund 1 from the Extended Weste...
EWOCS-I: The catalog of X-ray sources in Westerlund 1 from the Extended Weste...EWOCS-I: The catalog of X-ray sources in Westerlund 1 from the Extended Weste...
EWOCS-I: The catalog of X-ray sources in Westerlund 1 from the Extended Weste...
Sérgio Sacani
 
Topic: SICKLE CELL DISEASE IN CHILDREN-3.pdf
Topic: SICKLE CELL DISEASE IN CHILDREN-3.pdfTopic: SICKLE CELL DISEASE IN CHILDREN-3.pdf
Topic: SICKLE CELL DISEASE IN CHILDREN-3.pdf
TinyAnderson
 
Unlocking the mysteries of reproduction: Exploring fecundity and gonadosomati...
Unlocking the mysteries of reproduction: Exploring fecundity and gonadosomati...Unlocking the mysteries of reproduction: Exploring fecundity and gonadosomati...
Unlocking the mysteries of reproduction: Exploring fecundity and gonadosomati...
AbdullaAlAsif1
 
原版制作(carleton毕业证书)卡尔顿大学毕业证硕士文凭原版一模一样
原版制作(carleton毕业证书)卡尔顿大学毕业证硕士文凭原版一模一样原版制作(carleton毕业证书)卡尔顿大学毕业证硕士文凭原版一模一样
原版制作(carleton毕业证书)卡尔顿大学毕业证硕士文凭原版一模一样
yqqaatn0
 
ESR spectroscopy in liquid food and beverages.pptx
ESR spectroscopy in liquid food and beverages.pptxESR spectroscopy in liquid food and beverages.pptx
ESR spectroscopy in liquid food and beverages.pptx
PRIYANKA PATEL
 
Eukaryotic Transcription Presentation.pptx
Eukaryotic Transcription Presentation.pptxEukaryotic Transcription Presentation.pptx
Eukaryotic Transcription Presentation.pptx
RitabrataSarkar3
 

Recently uploaded (20)

Immersive Learning That Works: Research Grounding and Paths Forward
Immersive Learning That Works: Research Grounding and Paths ForwardImmersive Learning That Works: Research Grounding and Paths Forward
Immersive Learning That Works: Research Grounding and Paths Forward
 
Deep Software Variability and Frictionless Reproducibility
Deep Software Variability and Frictionless ReproducibilityDeep Software Variability and Frictionless Reproducibility
Deep Software Variability and Frictionless Reproducibility
 
NuGOweek 2024 Ghent programme overview flyer
NuGOweek 2024 Ghent programme overview flyerNuGOweek 2024 Ghent programme overview flyer
NuGOweek 2024 Ghent programme overview flyer
 
Phenomics assisted breeding in crop improvement
Phenomics assisted breeding in crop improvementPhenomics assisted breeding in crop improvement
Phenomics assisted breeding in crop improvement
 
Authoring a personal GPT for your research and practice: How we created the Q...
Authoring a personal GPT for your research and practice: How we created the Q...Authoring a personal GPT for your research and practice: How we created the Q...
Authoring a personal GPT for your research and practice: How we created the Q...
 
Shallowest Oil Discovery of Turkiye.pptx
Shallowest Oil Discovery of Turkiye.pptxShallowest Oil Discovery of Turkiye.pptx
Shallowest Oil Discovery of Turkiye.pptx
 
Sharlene Leurig - Enabling Onsite Water Use with Net Zero Water
Sharlene Leurig - Enabling Onsite Water Use with Net Zero WaterSharlene Leurig - Enabling Onsite Water Use with Net Zero Water
Sharlene Leurig - Enabling Onsite Water Use with Net Zero Water
 
Applied Science: Thermodynamics, Laws & Methodology.pdf
Applied Science: Thermodynamics, Laws & Methodology.pdfApplied Science: Thermodynamics, Laws & Methodology.pdf
Applied Science: Thermodynamics, Laws & Methodology.pdf
 
The binding of cosmological structures by massless topological defects
The binding of cosmological structures by massless topological defectsThe binding of cosmological structures by massless topological defects
The binding of cosmological structures by massless topological defects
 
aziz sancar nobel prize winner: from mardin to nobel
aziz sancar nobel prize winner: from mardin to nobelaziz sancar nobel prize winner: from mardin to nobel
aziz sancar nobel prize winner: from mardin to nobel
 
Micronuclei test.M.sc.zoology.fisheries.
Micronuclei test.M.sc.zoology.fisheries.Micronuclei test.M.sc.zoology.fisheries.
Micronuclei test.M.sc.zoology.fisheries.
 
Cytokines and their role in immune regulation.pptx
Cytokines and their role in immune regulation.pptxCytokines and their role in immune regulation.pptx
Cytokines and their role in immune regulation.pptx
 
Randomised Optimisation Algorithms in DAPHNE
Randomised Optimisation Algorithms in DAPHNERandomised Optimisation Algorithms in DAPHNE
Randomised Optimisation Algorithms in DAPHNE
 
Equivariant neural networks and representation theory
Equivariant neural networks and representation theoryEquivariant neural networks and representation theory
Equivariant neural networks and representation theory
 
EWOCS-I: The catalog of X-ray sources in Westerlund 1 from the Extended Weste...
EWOCS-I: The catalog of X-ray sources in Westerlund 1 from the Extended Weste...EWOCS-I: The catalog of X-ray sources in Westerlund 1 from the Extended Weste...
EWOCS-I: The catalog of X-ray sources in Westerlund 1 from the Extended Weste...
 
Topic: SICKLE CELL DISEASE IN CHILDREN-3.pdf
Topic: SICKLE CELL DISEASE IN CHILDREN-3.pdfTopic: SICKLE CELL DISEASE IN CHILDREN-3.pdf
Topic: SICKLE CELL DISEASE IN CHILDREN-3.pdf
 
Unlocking the mysteries of reproduction: Exploring fecundity and gonadosomati...
Unlocking the mysteries of reproduction: Exploring fecundity and gonadosomati...Unlocking the mysteries of reproduction: Exploring fecundity and gonadosomati...
Unlocking the mysteries of reproduction: Exploring fecundity and gonadosomati...
 
原版制作(carleton毕业证书)卡尔顿大学毕业证硕士文凭原版一模一样
原版制作(carleton毕业证书)卡尔顿大学毕业证硕士文凭原版一模一样原版制作(carleton毕业证书)卡尔顿大学毕业证硕士文凭原版一模一样
原版制作(carleton毕业证书)卡尔顿大学毕业证硕士文凭原版一模一样
 
ESR spectroscopy in liquid food and beverages.pptx
ESR spectroscopy in liquid food and beverages.pptxESR spectroscopy in liquid food and beverages.pptx
ESR spectroscopy in liquid food and beverages.pptx
 
Eukaryotic Transcription Presentation.pptx
Eukaryotic Transcription Presentation.pptxEukaryotic Transcription Presentation.pptx
Eukaryotic Transcription Presentation.pptx
 

An Importance Sampling Approach to Integrate Expert Knowledge When Learning Bayesian Networks From Data

  • 1. An Importance sampling approach to integrate expert knowledge when learning Bayesian Networks from data Andrés Cano, Andrés R. Masegosa and Serafín Moral Department of Computer Science and Artificial Intelligence University of Granada (Spain) Dortmund, June 2010 Information Processing and Management of Uncertainty in Knowledge-Based Systems IPMU 2010 Dortmund (Germany) 1/32
  • 2. Outline 1 Introduction 2 Learning Bayesian Networks(BN) from data 3 Importance Sampling for learning BN 4 Integration of Expert Knowledge 5 Experimental Evaluation 6 Conclusions & Future Works IPMU 2010 Dortmund (Germany) 2/32
  • 4. Introduction Bayesian Networks Bayesian Networks Excellent models to graphically represent the dependency structure of the underlying distribution in multivariate domains. The learning from data of this dependency structure in a multivariate problem domain represents a very relevant source of knowledge (direct interactions, conditional independencies...) IPMU 2010 Dortmund (Germany) 4/32
  • 5. Introduction Learning Bayesian Networks from Data Uncertainty in Model Selection When learning BNs from data there usually are several models with a high score (high posterior probability given the data). This situation is specially common in problem domains with high number of variables and low sample sizes. IPMU 2010 Dortmund (Germany) 5/32
  • 6. Introduction Integration of Expert Knowledge Expert Knowledge In many domain problems expert knowledge is available. The graphical structure of BNs greatly ease the interaction with a human expert: Causal ordering. D-separtion criteria. IPMU 2010 Dortmund (Germany) 6/32
  • 7. Introduction Integration of Expert Knowledge Expert Knowledge In many domain problems expert knowledge is available. The graphical structure of BNs greatly ease the interaction with a human expert: Causal ordering. D-separtion criteria. Previous Works I There have been many attempts to introduce expert knowledge when learning BNs from data. Via Prior Distribution [2,5]: Use of specific prior distributions over the possible graph structures to integrate expert knowledge: Expert assigns higher prior probabilities to most likely edges. IPMU 2010 Dortmund (Germany) 6/32
  • 8. Introduction Integration of Expert Knowledge Previous Works II Via structural Restrictions [6]: Expert codify his/her knowledge as structural restrictions. Expert defines the existence/absence of arcs and/or edges and causal ordering restrictions. Retrieved model should satisfy these restrictions. IPMU 2010 Dortmund (Germany) 7/32
  • 9. Introduction Integration of Expert Knowledge Previous Works II Via structural Restrictions [6]: Expert codify his/her knowledge as structural restrictions. Expert defines the existence/absence of arcs and/or edges and causal ordering restrictions. Retrieved model should satisfy these restrictions. Limitations of "Prior" Expert Knowledge The system would ask to the expert his/her belief about any possible feature of the BN (non feasible in high domains). The expert could be biased to provide the most “easy” or clear knowledge. The system does not help to the user to introduce information about the BN structure. IPMU 2010 Dortmund (Germany) 7/32
  • 10. Introduction Interactive Learning of Bayesian Networks IPMU 2010 Dortmund (Germany) 8/32
  • 11. Introduction Interactive Learning of Bayesian Networks Active Interaction with the Expert Strategy: Ask to the expert by the presence of the edges that most reduce the model uncertainty. Method: Framework to allow an efficient and effective interaction with the expert. Expert is only asked for this controversial structural features. IPMU 2010 Dortmund (Germany) 8/32
  • 12. Previous Knowledge Part II Learning Bayesian Networks from data IPMU 2010 Dortmund (Germany) 9/32
Notation
Let X = (X1, ..., Xn) be a set of n random variables, and Val(Xi) the set of values of Xi. We assume the variables are enumerated in a total causal order, and that the data set D is fully observed.
A Bayesian network B is described by G, the graph structure, and θG, the parameters. A graph G can be decomposed as a vector of parent sets:
G = (Pa(X1), ..., Pa(Xn))
We also define Ui as a random variable taking values in Val(Ui), the space of all possible parent sets of Xi, and G as a random variable taking values in Val(G), the set of all graph structures consistent with the total order.
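As a rough illustration of this notation (hypothetical Python, with names of my choosing, not from the paper), a graph consistent with the total order can be stored as a tuple of parent sets, and Val(Ui) enumerated as the subsets of the variables preceding Xi:

```python
from itertools import chain, combinations

def candidate_parent_sets(i):
    """Val(U_i): every subset of the variables {X_0, ..., X_{i-1}}
    that precede X_i in the total causal order."""
    return [frozenset(s) for s in chain.from_iterable(
        combinations(range(i), k) for k in range(i + 1))]

# A structure consistent with the order is a tuple of parent sets,
# e.g. the chain X0 -> X1 -> X2:
G = (frozenset(), frozenset({0}), frozenset({1}))
print(len(candidate_parent_sets(2)))  # 4: the subsets of {X0, X1}
```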
The Bayesian Learning Framework
Scoring a graph structure
Marginal likelihood of a graph structure:
P(G = G | D) ∝ P(G) P(D | G) = ∏_i score(Xi, Pa_G(Xi) | D)
score_BDeu(Xi, Ui | D) = P_i(Ui) ∏_{j=1}^{|Ui|} [ Γ(α_ij) / Γ(α_ij + N_ij) ] ∏_{k=1}^{|Xi|} [ Γ(α_ijk + N_ijk) / Γ(α_ijk) ]
where P_i(U) is the prior probability that U is the parent set of Xi, j ranges over the configurations of Ui, and k over the states of Xi.
Approximating the posterior P(G | D)
Our approach rests on approximating P(G | D), which tells us which graph structures are most likely (best explain the data). Exhaustive enumeration is not feasible because the space of graph structures is super-exponential.
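A minimal sketch of this local score in log space (my code, assuming SciPy and the usual BDeu hyperparameters α_ij = s/q and α_ijk = s/(q·r) for an equivalent sample size s; the function and argument names are hypothetical):

```python
import numpy as np
from scipy.special import gammaln

def log_score_bdeu(counts, ess=1.0, log_prior=0.0):
    """log score_BDeu(X_i, U_i | D). `counts` is the matrix N_ijk with one
    row per parent configuration j and one column per state k of X_i."""
    q, r = counts.shape
    a_ij = ess / q                        # alpha_ij
    a_ijk = ess / (q * r)                 # alpha_ijk
    n_ij = counts.sum(axis=1)             # N_ij, total count per configuration
    term_j = gammaln(a_ij) - gammaln(a_ij + n_ij)
    term_jk = gammaln(a_ijk + counts) - gammaln(a_ijk)
    return log_prior + term_j.sum() + term_jk.sum()
```

Working in log space avoids overflow in the Gamma functions; one exponentiates only when ratios of scores are needed.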
Approximating the Posterior
Factorization of P(G | D)
The assumption of a total order implies that the parent sets of the different Xi are selected independently of one another:
P(G | D) = ∏_i P(Ui | D)
So P(G | D) decomposes into n independent problems, where P(Ui | D) is the posterior probability over the possible parent sets of Xi. Each of these sub-problems still has exponential size. A toy example of the factorization follows.
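A toy illustration (invented numbers, hypothetical names) of how per-variable posteriors combine into the posterior of a whole graph:

```python
from math import prod

# Toy per-variable posteriors P(U_i | D) for three variables in the
# order X0 < X1 < X2 (numbers invented for illustration).
p_parents = [
    {frozenset(): 1.0},
    {frozenset(): 0.3, frozenset({0}): 0.7},
    {frozenset(): 0.1, frozenset({0}): 0.2,
     frozenset({1}): 0.6, frozenset({0, 1}): 0.1},
]

def posterior_of_graph(G):
    """P(G | D) = prod_i P(U_i = Pa(X_i) | D), by the factorization above."""
    return prod(p_parents[i][pa] for i, pa in enumerate(G))

G = (frozenset(), frozenset({0}), frozenset({1}))  # chain X0 -> X1 -> X2
print(posterior_of_graph(G))                       # 1.0 * 0.7 * 0.6 = 0.42
```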
Approximating the Posterior P(Ui | D)
Closed-form solution
In [3] a closed-form solution was proposed under the assumption that a node has at most K parents, with polynomial cost O(n^(K+1)).
Markov Chain Monte Carlo
Let Val(Ui) be the state space of the Markov chain. If the chain is in state U at iteration t, a new model U' is randomly drawn by adding, deleting, or switching an edge. The chain moves to state U' at iteration t + 1 with probability
m(Ut, Ut+1) = min{ 1, [N(U) · score(D | U')] / [N(U') · score(D | U)] }
where N(U) denotes the number of neighbor models of U. Otherwise, the chain remains in state U. This Markov chain has a stationary distribution (t → ∞), which is P(U | D).
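A hedged sketch of this Metropolis-Hastings chain over the parent sets of one variable (my reconstruction, not the paper's code; `score` and `neighbors` are assumed helpers, with `score` on a linear scale):

```python
import random

def mcmc_parent_sets(score, neighbors, U0, iterations=10_000):
    """Metropolis-Hastings over Val(U_i); returns the visited states, whose
    empirical distribution approximates P(U_i | D) as iterations grow."""
    U, samples = U0, []
    for _ in range(iterations):
        nbrs = neighbors(U)                # models reachable from U
        U_new = random.choice(nbrs)        # uniform proposal over neighbors
        # m(U_t, U_{t+1}) = min{1, N(U) score(D|U') / (N(U') score(D|U))}
        ratio = len(nbrs) * score(U_new) / (len(neighbors(U_new)) * score(U))
        if random.random() < min(1.0, ratio):
            U = U_new                      # accept the move
        samples.append(U)                  # on rejection, U is repeated
    return samples
```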
Part III
Importance Sampling
Importance Sampling
Description
The method rests on an auxiliary distribution Q that roughly approximates the target distribution P, where Q is easier to sample from:
E_P(f(x)) = ∫ f(x) [P(x) / Q(x)] Q(x) dx = E_Q(w(x) f(x))   (1)
where w(x) = P(x) / Q(x) acts as a weight function.
A set of T samples {x1, ..., xT} is generated from Q, and the weights wt = P(xt) / Q(xt) are computed. The estimator μ̂ of E_P(f(x)) is then
μ̂ = [Σ_{t=1}^{T} w(xt) f(xt)] / [Σ_{t=1}^{T} w(xt)]   (2)
Key aspect: P and Q need only be known up to a multiplicative constant.
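A generic, self-contained illustration of Eqs. (1)-(2) (my code; the toy densities are invented for the example):

```python
import numpy as np

rng = np.random.default_rng(0)

def importance_estimate(f, p_unnorm, q_sampler, q_unnorm, T=100_000):
    xs = q_sampler(T)                     # x_1, ..., x_T drawn from Q
    w = p_unnorm(xs) / q_unnorm(xs)       # w(x_t) = P(x_t) / Q(x_t)
    return np.sum(w * f(xs)) / np.sum(w)  # self-normalized estimator, Eq. (2)

# Toy check: E_P[X] = 1 for P = N(1, 1), using Q = N(0, 2) as the proposal.
# Both densities are deliberately unnormalized: the normalization constants
# cancel in the ratio of sums (the "key aspect" above).
est = importance_estimate(
    f=lambda x: x,
    p_unnorm=lambda x: np.exp(-0.5 * (x - 1.0) ** 2),
    q_sampler=lambda T: rng.normal(0.0, 2.0, size=T),
    q_unnorm=lambda x: np.exp(-0.125 * x ** 2),
)
print(est)  # approximately 1.0
```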
Importance Sampling for Learning BNs
Step 0: Candidate parents are considered in a random permutation. Score of the initial model (empty parent set): score(X, {∅} | D).
Importance Sampling for Learning BNs
Step 1: Evaluate C as a parent of X. Compute the ratio
r = score(X, {C} | D) / [score(X, {C} | D) + score(X, {∅} | D)] = 0.8
Randomly accept C as a parent of X with probability r = 0.8 → accepted. Q = 0.8
Importance Sampling for Learning BNs
Step 2: Evaluate B as a parent of X. Compute the ratio
r = score(X, {C, B} | D) / [score(X, {C, B} | D) + score(X, {C} | D)] = 0.1
Randomly accept B as a parent of X with probability r = 0.1 → not accepted. Q = 0.8 · 0.9 (a rejection contributes the factor 1 − r = 0.9)
Importance Sampling for Learning BNs
Step 3: Evaluate A as a parent of X. Compute the ratio
r = score(X, {C, A} | D) / [score(X, {C, A} | D) + score(X, {C} | D)] = 0.7
Randomly accept A as a parent of X with probability r = 0.7 → accepted. Q = 0.8 · 0.9 · 0.7 = 0.504
Weight of the final model: W1 = score(X, {C, A} | D) / 0.504
The process is repeated T times; from these weighted samples we obtain an approximation of P(Ui | D), as in the sketch below.
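Putting the walkthrough together, a hedged sketch of one draw from this proposal (my reconstruction from the slides; `score` is an assumed helper returning score(X, U | D) on a linear scale, and `candidates` a list of the possible parents of X under the causal order):

```python
import random

def sample_parent_set(score, candidates):
    """One draw from the proposal Q, plus its importance weight W."""
    order = random.sample(candidates, len(candidates))  # Step 0: random permutation
    parents = frozenset()
    s_current = score(parents)            # score(X, {∅} | D)
    q = 1.0                               # probability Q of the path taken
    for c in order:
        extended = parents | {c}
        s_ext = score(extended)
        r = s_ext / (s_ext + s_current)   # the ratio of Steps 1-3
        if random.random() < r:           # accept c as a parent w.p. r
            parents, s_current = extended, s_ext
            q *= r
        else:
            q *= 1.0 - r                  # rejections also enter Q
    return parents, s_current / q         # W = score(X, parents | D) / Q
```

Repeating this T times and normalizing the weights, as in Eq. (2), yields the approximation of P(Ui | D).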
Part IV
Integrating Expert Knowledge
Methodology Description
(Figure-only slide: diagram of the interactive learning methodology.)
Prior Knowledge
Representing absence of prior knowledge
A uniform prior over structures is "usually chosen for convenience" [3]. P(G) does not change with the data, but it matters at low sample sizes.
Assume each edge has prior probability p, independently of the others. If Xi has k parents out of m candidate nodes:
P(Pa(Xi)) = p^k (1 − p)^(m − k)
As the number of candidate parents m grows, p should be decreased to control the number of false-positive edges: the "multiplicity correction". This is achieved by giving p a Beta prior with parameter α = 0.5 [11]:
P_i(Pa(Xi)) = [Γ(2α) / Γ(m + 2α)] · [Γ(k + α) Γ(m − k + α) / (Γ(α) Γ(α))]
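A minimal sketch of this prior in log space (my code; SciPy assumed, argument names hypothetical):

```python
from scipy.special import gammaln

def log_prior_parent_set(k, m, alpha=0.5):
    """log P_i(Pa(X_i)) for k parents chosen among m candidates,
    following the Beta-prior formula above."""
    return (gammaln(2 * alpha) - gammaln(m + 2 * alpha)
            + gammaln(k + alpha) + gammaln(m - k + alpha)
            - 2 * gammaln(alpha))
```

Since the penalty depends on m, enlarging the candidate set automatically strengthens the sparsity correction, which is exactly the multiplicity correction described above.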
Interacting with the Expert
P(S → X | D) = 0.85, P(R → X | D) = 0.8, P(C → X | D) = 0.45
Description
Key idea: the lower the entropy of P(Ui | D), the more reliable the learning. Expert interaction is therefore carried out so as to reduce the entropy H(P(Ui | D)): the system asks the expert about the edges with the highest entropy (here C → X, whose probability 0.45 is closest to 0.5).
Interacting with the Expert
P(S → X | D) = 0.88, P(R → X | D) = 0.77, P(C → X | D) = 1.0
Description
After the expert's answer, the entropy of P(Ui | D) is reduced and the probability mass concentrates around one model. The methodology can be applied iteratively, asking about the presence/absence of further edges, stopping once the probability of the MAP model is L times higher than that of the second most probable model. A sketch of this loop follows.
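A hedged sketch of the query-selection and stopping criteria (my reconstruction; `posterior` is an assumed map from sampled parent sets to their estimated probabilities):

```python
import math

def edge_entropy(p):
    """Binary entropy of an edge-inclusion probability."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)

def most_uncertain_edge(posterior, candidates):
    """Pick the candidate parent whose marginal P(c -> X | D) is most
    uncertain, i.e. has the highest entropy."""
    marginal = {c: sum(p for U, p in posterior.items() if c in U)
                for c in candidates}
    return max(candidates, key=lambda c: edge_entropy(marginal[c]))

def interaction_done(posterior, L=10.0):
    """Stop when the MAP model is L times more probable than the runner-up."""
    probs = sorted(posterior.values(), reverse=True)
    return len(probs) < 2 or probs[0] >= L * probs[1]
```

After each answer, the posterior would be conditioned on it, e.g. by discarding sampled parent sets inconsistent with the expert's statement and renormalizing.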
Part V
Experimental Evaluation
Experimental Set-up
Bayesian networks: alarm (37 nodes), boblo (23 nodes), boerlage-92 (23 nodes), hailfinder (56 nodes), insurance (27 nodes).
Sample sizes: each algorithm was run 10 times with sample sizes of 50, 100, 500, and 1000.
Evaluation measures
Number of missing/extra links, Kullback-Leibler distance... We report average values across the five networks. Expert interaction is simulated: queries about the presence/absence of an edge are answered from the true BN model.
Structure Prior Evaluation
(Plots: number of structural errors; KL distance.)
Analysis
The Beta prior reduces the number of structural errors for both IS and MCMC. IS makes fewer errors than MCMC, especially at low sample sizes. The Beta prior also reduces the KL distance for both IS and MCMC.
Expert Interaction Evaluation
(Plots: number of structural errors; KL distance.)
Analysis
The more the posterior probability mass concentrates around one model, the lower the number of structural errors. The KL distance does not improve significantly at large sample sizes: the remaining structural errors have little impact on predictive capacity.
Expert Interaction Evaluation
(Plots: number of interactions; interaction accuracy.)
Analysis
The number of interactions is feasible for a human expert; exhaustive prior querying would require about 600 questions on average. Interaction accuracy is the ratio between the number of structural errors removed and the number of interactions; for comparison, the average accuracy of random interactions is 1%.
Part VI
Conclusions & Future Works
Conclusions & Future Works
Conclusions
A new methodology to introduce expert knowledge when learning BNs from data.
A new importance sampling technique for sampling BN structures.
The system asks the expert a feasible number of questions.
Interaction improves the quality of the inferred BN models.
Future works
Extend these methods to learning BN models without causal-ordering assumptions.
Thanks for your attention!! Questions?