Decision Making with Hierarchical Credal Sets
Alessandro Antonucci1
, Alexander Karlsson2
, and David Sundgren3
(1) IDSIA (Switzerland)
(2) University of Skövde (Sweden)
(3) Stockholm University (Sweden)
IPMU 2014, Montpellier, July 18th, 2014
Outline
Background on credal sets and hierarchical models
Credal sets are NOT hierarchical models
Hierarchical credal sets
Decision making with hierarchical credal sets
Application to credal classification
Conclusions and outlooks
Background on credal sets and hierarchical models
Model of uncertainty about variable X taking values in ΩX
Estimating the (expected) value of f : ΩX → R
Probability mass function P(X):
    E_P[f] := \sum_{x \in \Omega_X} P(x) \, f(x)
Credal set K(X) (convex set of mass functions):
    \underline{E}_K[f] := \min_{P(X) \in K(X)} \sum_{x \in \Omega_X} P(x) \, f(x)
Hierarchical model [K(X), \pi(\Theta)]:
    E_{K,\pi}[f] := \int_{\Omega_\Theta} E_{P_\theta}[f] \, \pi(\theta) \, d\theta = E_{P_{K,\pi}}[f]
where \{P_\theta(X)\}_{\theta \in \Omega_\Theta} = K(X)
and P_{K,\pi}(X) := \int_{\Omega_\Theta} P_\theta(X) \, \pi(\theta) \, d\theta (weighted centre of mass, CoM)
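The three expectation operators can be compared on a small numerical example (all values below are illustrative assumptions, with a discrete prior over the vertices standing in for \pi(\Theta)):

```python
import numpy as np

# Illustrative payoff f : Omega_X -> R on a ternary Omega_X (assumed values)
f = np.array([1.0, 0.0, 0.5])

# Precise model: a single mass function P(X)
P = np.array([0.5, 0.3, 0.2])
E_P = P @ f                             # E_P[f] = sum_x P(x) f(x)

# Credal set K(X): for a linear f the lower/upper expectations are
# attained at extreme points, so a min/max over the vertices suffices
vertices = np.array([[0.6, 0.2, 0.2],
                     [0.3, 0.5, 0.2],
                     [0.4, 0.2, 0.4]])
E_K_lower = min(v @ f for v in vertices)
E_K_upper = max(v @ f for v in vertices)

# Hierarchical model [K(X), pi(Theta)]: with a discrete prior over the
# parametrization, E_{K,pi}[f] is the expectation under the weighted
# centre of mass P_{K,pi}(X)
pi = np.array([0.2, 0.5, 0.3])          # assumed prior weights
P_com = pi @ vertices                   # P_{K,pi}(X)
E_hier = P_com @ f

# The hierarchical expectation always lies between the credal bounds
assert E_K_lower <= E_hier <= E_K_upper
```

The point of the sketch: the credal answer is an interval [E_K_lower, E_K_upper], while the hierarchical model collapses it to a single value inside that interval.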
(Of course) Credal sets are not hierarchical models
A parametrization with \Theta exists even for a pure credal set K(X)
\underline{E}_K[f] = E_{P^*}[f] for at least one P^*(X) \in K(X) [P^*(X) = P_{\theta^*}(X)]
The (improper) prior \pi(\theta) = \delta_{\theta,\theta^*} gives \underline{E}_K, but only for this f!
Different priors for different f \Rightarrow a set of priors
A credal set over \Theta: it should be the vacuous one, K_0(\Theta)
Credal sets are (sort of) hierarchical models,
but a vacuous credal set should be placed on the second level
K(X) \equiv [P_\Theta(X), K_0(\Theta)]
For credal networks, this is the Cano-Cano-Moral transformation!
Hierarchical credal sets
Hierarchical model [P_\Theta(X), \pi(\Theta)]
(Hierarchical view of) credal sets: [P_\Theta(X), K_0(\Theta)]
"Hierarchical credal set" [P_\Theta(X), K'(\Theta)], equivalent to
    K'(X) = \left\{ \int_{\Omega_\Theta} P_\theta(X) \, \pi(\theta) \, d\theta \;:\; \pi(\Theta) \in K'(\Theta) \right\} \subseteq K(X)
Trade-off between realism/cautiousness and informativeness:
    \underline{E}_K[f] \le \underline{E}_{K'}[f] \le E_{K,\pi}[f] \le \overline{E}_{K'}[f] \le \overline{E}_K[f]
assuming \pi(\Theta) \in K'(\Theta)
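A quick numerical check of the sandwiching inequality: shrinking a credal set K towards its centre yields a smaller K' \subseteq K whose bounds are tighter, hence more informative, while still containing any compatible precise expectation. All values are assumptions:

```python
import numpy as np

f = np.array([1.0, 0.0, 0.5])           # assumed payoff
K = np.array([[0.6, 0.2, 0.2],          # assumed vertices of K(X)
              [0.3, 0.5, 0.2],
              [0.4, 0.2, 0.4]])

# K'(X): mix each vertex with the centre, giving a set nested inside K
centre = K.mean(axis=0)
K_prime = 0.5 * K + 0.5 * centre        # K' is a shrunk copy of K

lo_K,  hi_K  = min(v @ f for v in K),       max(v @ f for v in K)
lo_Kp, hi_Kp = min(v @ f for v in K_prime), max(v @ f for v in K_prime)

# The chain of inequalities from the slide, restricted to the credal terms
assert lo_K <= lo_Kp <= hi_Kp <= hi_K
```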
How to choose K'(\Theta)?
Shrinking (but not too much!)
Likelihood-based learning of credal sets [Cattaneo]:
    \pi(\Theta) \propto P_\Theta(D)
Model revision:
    \pi(\Theta) \to K_\alpha(\Theta) = \{ \pi'(\Theta) \;:\; \pi'(\theta) = 0 \text{ if } \pi(\theta) < \alpha \cdot \pi(\theta_{ML}) \}
Cope with [P_\Theta(X), K_\alpha(\Theta)]
Shifted Dirichlet prior [Karlsson & Sundgren]:
A prior over credal sets induced by probability intervals
    \pi_{s,t}(\Theta) \propto \prod_{i=1}^n [\theta_i - \underline{P}(x_i)]^{s t_i - 1}
    P_{K,\pi}(x_i) = \underline{P}(x_i) + t_i \, [1 - \sum_{j=1}^n \underline{P}(x_j)]
Back to an imprecise model?
Sampling from K(X) based on \pi_{s,t}
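The centre-of-mass formula of the shifted Dirichlet can be checked numerically; the lower probabilities and the t vector below are assumed values:

```python
import numpy as np

# Illustrative lower bounds P_low(x_i) from probability intervals (assumed)
P_low = np.array([0.2, 0.1, 0.3])
slack = 1.0 - P_low.sum()               # mass left to distribute: 0.4 here

# t shares out the slack (t_i >= 0, sum t_i = 1); assumed values
t = np.array([0.5, 0.25, 0.25])

# Centre of mass of the shifted Dirichlet prior:
# P_{K,pi}(x_i) = P_low(x_i) + t_i * (1 - sum_j P_low(x_j))
P_com = P_low + t * slack
assert abs(P_com.sum() - 1.0) < 1e-12   # a valid mass function
```

Since each coordinate exceeds its lower bound and the total is one, P_com lies inside the credal set induced by the intervals.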
Sampling from a credal set
A swarm of “particles” \{P_k(X)\} \subset K(X), sampled from \pi_{s,t}(\Theta)
Weighted sampling from polytopes as a two-step process
(i) Uniform sampling by convex combination of the vertices
(convex combination by uniform sampling from the simplex)
(ii) “Sampling from the sample”
(discrete sampling weighted by the prior)
For big swarms, the empirical and theoretical CoMs coincide
Heuristics to remove particles: KL distance from the CoM
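The two-step procedure above can be sketched as follows; the vertices and the prior weighting are illustrative assumptions (a simple stand-in density is used instead of the talk's shifted Dirichlet \pi_{s,t}):

```python
import numpy as np

rng = np.random.default_rng(0)

# Vertices of the credal set K(X), a polytope inside the simplex (assumed)
V = np.array([[0.6, 0.2, 0.2],
              [0.3, 0.5, 0.2],
              [0.4, 0.2, 0.4]])

# Step (i): draw convex-combination weights uniformly from the simplex
# (a flat Dirichlet over the vertex weights), spreading particles over
# the polytope
n = 5000
W = rng.dirichlet(np.ones(len(V)), size=n)
particles = W @ V                        # each row is a P_k(X) in K(X)

# Step (ii): "sampling from the sample" -- discrete resampling of the
# particles with probabilities proportional to the prior density; here a
# stand-in prior favouring the first state (an assumption, not pi_{s,t})
weights = particles[:, 0]
weights = weights / weights.sum()
idx = rng.choice(n, size=n, p=weights)
swarm = particles[idx]

# Empirical centre of mass of the weighted swarm
com = swarm.mean(axis=0)
assert abs(com.sum() - 1.0) < 1e-9
```

With a large swarm, the empirical CoM approaches the theoretical weighted centre of mass; here the weighting visibly pulls the CoM towards the first state.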
Application to decision making
Simplest DM task: find the most probable state
    x^* := \arg\max_x P(x)
With K(X):
    \Omega^*_X = \{ x^* \in \Omega_X \;:\; \exists P(X) \in K(X) \text{ s.t. } x^* = \arg\max_x P(x) \}
With [K(X), \pi_{s,t}(\Theta)]:
    x^* := \arg\max_x P_{K,\pi_{s,t}}(x)
Alternatively:
    [K(X), \pi_{s,t}(\Theta)] \to \{P_j(X)\}_{j=1}^m
    Shrink it to K'(X) (heuristics)
    Take the decision with K'(X)
[Figure: the probability simplex with axes P(x1), P(x2), P(x3)]
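The two decision rules can be sketched on an assumed particle swarm (no shrinking heuristic applied):

```python
import numpy as np

# Particles {P_j(X)} sampled from the hierarchical model (assumed values)
particles = np.array([[0.50, 0.30, 0.20],
                      [0.35, 0.40, 0.25],
                      [0.45, 0.35, 0.20]])

# Credal decision: keep every state that is most probable under at least
# one particle (a sampled analogue of Omega*_X)
optimal_states = sorted({int(np.argmax(p)) for p in particles})

# Hierarchical decision: a single state, the mode of the centre of mass
# (unweighted mean here; prior weights could be plugged in)
com = particles.mean(axis=0)
x_star = int(np.argmax(com))
```

On these numbers the credal rule returns two candidate states while the hierarchical rule commits to one, which is exactly the informativeness gain the talk is after.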
Testing the approach on a (credal) classification setup
Classification setup: class C and features F
Given an instance of the features, F = \tilde{f}, which c \in \Omega_C?
(B) naive Bayes: P(c, f) = P(c) \prod_i P(f_i \mid c)
    Decision based on P(C \mid \tilde{f})
(C) naive credal: K(C), with each P(F_i \mid c) learned by a (local) IDM
    Decision based on an (outer approximation of) K(C \mid \tilde{f})
(H) hierarchical/credal approach on K(C \mid \tilde{f})
Priors can be easily propagated (multiplied)
provided that someone assessed them
(C) and (H) are credal classifiers (more than a single class in output)
Accuracy of (B) compared with utility-based performance descriptor
for (C) and (H) [Zaffalon et al., 2014]
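The precise classifier (B) reduces to the usual naive Bayes posterior; a minimal sketch with assumed (hypothetical) probabilities for two classes and two observed feature values:

```python
import numpy as np

# Class prior P(C) over two classes (assumed values)
P_c = np.array([0.6, 0.4])

# P(f_i = observed value | c) for two features, one row per class (assumed)
P_f_given_c = np.array([[0.7, 0.2],      # class 0
                        [0.3, 0.5]])     # class 1

# Naive Bayes joint: P(c, f~) = P(c) * prod_i P(f_i | c)
joint = P_c * P_f_given_c.prod(axis=1)

# Posterior P(C | f~) by normalisation, then the precise decision
posterior = joint / joint.sum()
decision = int(np.argmax(posterior))
```

The credal variants (C) and (H) replace the single prior and likelihoods with sets, so the output can be a set of classes rather than this single argmax.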
Results
Dataset n d (B) (C) (H)
Lenses 24 3 77.2 53.7 72.2
Labor 51 2 87.0 92.7 93.7
Hayes 160 4 59.5 51.1 72.4
Monk 556 2 64.1 70.6 72.9
Conclusions and outlooks
A (better?) formalization of the relation between hierarchical and imprecise-probabilistic models
Heuristics to take more informative decisions in credal networks
(provided that a prior can be assessed)
To do:
Better heuristics: finding the smallest credal set covering a given number of particles can be done with MILP
More ambitiously: a sound approach to learn K'(\Theta)
Release an R package for that

More Related Content

What's hot

On the Jensen-Shannon symmetrization of distances relying on abstract means
On the Jensen-Shannon symmetrization of distances relying on abstract meansOn the Jensen-Shannon symmetrization of distances relying on abstract means
On the Jensen-Shannon symmetrization of distances relying on abstract meansFrank Nielsen
 
Testing for mixtures by seeking components
Testing for mixtures by seeking componentsTesting for mixtures by seeking components
Testing for mixtures by seeking componentsChristian Robert
 
Bayesian hybrid variable selection under generalized linear models
Bayesian hybrid variable selection under generalized linear modelsBayesian hybrid variable selection under generalized linear models
Bayesian hybrid variable selection under generalized linear modelsCaleb (Shiqiang) Jin
 
Approximate Bayesian Computation with Quasi-Likelihoods
Approximate Bayesian Computation with Quasi-LikelihoodsApproximate Bayesian Computation with Quasi-Likelihoods
Approximate Bayesian Computation with Quasi-LikelihoodsStefano Cabras
 
Truth, deduction, computation lecture i (last one)
Truth, deduction, computation   lecture i (last one)Truth, deduction, computation   lecture i (last one)
Truth, deduction, computation lecture i (last one)Vlad Patryshev
 
Predicates and Quantifiers
Predicates and Quantifiers Predicates and Quantifiers
Predicates and Quantifiers Istiak Ahmed
 
Linear Discriminant Analysis (LDA) Under f-Divergence Measures
Linear Discriminant Analysis (LDA) Under f-Divergence MeasuresLinear Discriminant Analysis (LDA) Under f-Divergence Measures
Linear Discriminant Analysis (LDA) Under f-Divergence MeasuresAnmol Dwivedi
 
ABC based on Wasserstein distances
ABC based on Wasserstein distancesABC based on Wasserstein distances
ABC based on Wasserstein distancesChristian Robert
 
Harmonic Analysis and Deep Learning
Harmonic Analysis and Deep LearningHarmonic Analysis and Deep Learning
Harmonic Analysis and Deep LearningSungbin Lim
 
Introduction to modern Variational Inference.
Introduction to modern Variational Inference.Introduction to modern Variational Inference.
Introduction to modern Variational Inference.Tomasz Kusmierczyk
 
Tutorial on Belief Propagation in Bayesian Networks
Tutorial on Belief Propagation in Bayesian NetworksTutorial on Belief Propagation in Bayesian Networks
Tutorial on Belief Propagation in Bayesian NetworksAnmol Dwivedi
 
Higher-Order (F, α, β, ρ, d) –Convexity for Multiobjective Programming Problem
Higher-Order (F, α, β, ρ, d) –Convexity for Multiobjective Programming ProblemHigher-Order (F, α, β, ρ, d) –Convexity for Multiobjective Programming Problem
Higher-Order (F, α, β, ρ, d) –Convexity for Multiobjective Programming Probleminventionjournals
 
Maximum likelihood estimation of regularisation parameters in inverse problem...
Maximum likelihood estimation of regularisation parameters in inverse problem...Maximum likelihood estimation of regularisation parameters in inverse problem...
Maximum likelihood estimation of regularisation parameters in inverse problem...Valentin De Bortoli
 
Lecture 2 predicates quantifiers and rules of inference
Lecture 2 predicates quantifiers and rules of inferenceLecture 2 predicates quantifiers and rules of inference
Lecture 2 predicates quantifiers and rules of inferenceasimnawaz54
 
Information in the Weights
Information in the WeightsInformation in the Weights
Information in the WeightsMark Chang
 
Improved Trainings of Wasserstein GANs (WGAN-GP)
Improved Trainings of Wasserstein GANs (WGAN-GP)Improved Trainings of Wasserstein GANs (WGAN-GP)
Improved Trainings of Wasserstein GANs (WGAN-GP)Sangwoo Mo
 
Tailored Bregman Ball Trees for Effective Nearest Neighbors
Tailored Bregman Ball Trees for Effective Nearest NeighborsTailored Bregman Ball Trees for Effective Nearest Neighbors
Tailored Bregman Ball Trees for Effective Nearest NeighborsFrank Nielsen
 

What's hot (20)

On the Jensen-Shannon symmetrization of distances relying on abstract means
On the Jensen-Shannon symmetrization of distances relying on abstract meansOn the Jensen-Shannon symmetrization of distances relying on abstract means
On the Jensen-Shannon symmetrization of distances relying on abstract means
 
Testing for mixtures by seeking components
Testing for mixtures by seeking componentsTesting for mixtures by seeking components
Testing for mixtures by seeking components
 
Bayesian hybrid variable selection under generalized linear models
Bayesian hybrid variable selection under generalized linear modelsBayesian hybrid variable selection under generalized linear models
Bayesian hybrid variable selection under generalized linear models
 
Approximate Bayesian Computation with Quasi-Likelihoods
Approximate Bayesian Computation with Quasi-LikelihoodsApproximate Bayesian Computation with Quasi-Likelihoods
Approximate Bayesian Computation with Quasi-Likelihoods
 
talk MCMC & SMC 2004
talk MCMC & SMC 2004talk MCMC & SMC 2004
talk MCMC & SMC 2004
 
Truth, deduction, computation lecture i (last one)
Truth, deduction, computation   lecture i (last one)Truth, deduction, computation   lecture i (last one)
Truth, deduction, computation lecture i (last one)
 
Predicates and Quantifiers
Predicates and Quantifiers Predicates and Quantifiers
Predicates and Quantifiers
 
Linear Discriminant Analysis (LDA) Under f-Divergence Measures
Linear Discriminant Analysis (LDA) Under f-Divergence MeasuresLinear Discriminant Analysis (LDA) Under f-Divergence Measures
Linear Discriminant Analysis (LDA) Under f-Divergence Measures
 
QMC Program: Trends and Advances in Monte Carlo Sampling Algorithms Workshop,...
QMC Program: Trends and Advances in Monte Carlo Sampling Algorithms Workshop,...QMC Program: Trends and Advances in Monte Carlo Sampling Algorithms Workshop,...
QMC Program: Trends and Advances in Monte Carlo Sampling Algorithms Workshop,...
 
ABC based on Wasserstein distances
ABC based on Wasserstein distancesABC based on Wasserstein distances
ABC based on Wasserstein distances
 
Harmonic Analysis and Deep Learning
Harmonic Analysis and Deep LearningHarmonic Analysis and Deep Learning
Harmonic Analysis and Deep Learning
 
Introduction to modern Variational Inference.
Introduction to modern Variational Inference.Introduction to modern Variational Inference.
Introduction to modern Variational Inference.
 
Matrix calculus
Matrix calculusMatrix calculus
Matrix calculus
 
Tutorial on Belief Propagation in Bayesian Networks
Tutorial on Belief Propagation in Bayesian NetworksTutorial on Belief Propagation in Bayesian Networks
Tutorial on Belief Propagation in Bayesian Networks
 
Higher-Order (F, α, β, ρ, d) –Convexity for Multiobjective Programming Problem
Higher-Order (F, α, β, ρ, d) –Convexity for Multiobjective Programming ProblemHigher-Order (F, α, β, ρ, d) –Convexity for Multiobjective Programming Problem
Higher-Order (F, α, β, ρ, d) –Convexity for Multiobjective Programming Problem
 
Maximum likelihood estimation of regularisation parameters in inverse problem...
Maximum likelihood estimation of regularisation parameters in inverse problem...Maximum likelihood estimation of regularisation parameters in inverse problem...
Maximum likelihood estimation of regularisation parameters in inverse problem...
 
Lecture 2 predicates quantifiers and rules of inference
Lecture 2 predicates quantifiers and rules of inferenceLecture 2 predicates quantifiers and rules of inference
Lecture 2 predicates quantifiers and rules of inference
 
Information in the Weights
Information in the WeightsInformation in the Weights
Information in the Weights
 
Improved Trainings of Wasserstein GANs (WGAN-GP)
Improved Trainings of Wasserstein GANs (WGAN-GP)Improved Trainings of Wasserstein GANs (WGAN-GP)
Improved Trainings of Wasserstein GANs (WGAN-GP)
 
Tailored Bregman Ball Trees for Effective Nearest Neighbors
Tailored Bregman Ball Trees for Effective Nearest NeighborsTailored Bregman Ball Trees for Effective Nearest Neighbors
Tailored Bregman Ball Trees for Effective Nearest Neighbors
 

Similar to Decision Making with Hierarchical Credal Sets (IPMU 2014)

The dual geometry of Shannon information
The dual geometry of Shannon informationThe dual geometry of Shannon information
The dual geometry of Shannon informationFrank Nielsen
 
ABC with data cloning for MLE in state space models
ABC with data cloning for MLE in state space modelsABC with data cloning for MLE in state space models
ABC with data cloning for MLE in state space modelsUmberto Picchini
 
On learning statistical mixtures maximizing the complete likelihood
On learning statistical mixtures maximizing the complete likelihoodOn learning statistical mixtures maximizing the complete likelihood
On learning statistical mixtures maximizing the complete likelihoodFrank Nielsen
 
Rosser's theorem
Rosser's theoremRosser's theorem
Rosser's theoremWathna
 
Hierarchical matrices for approximating large covariance matries and computin...
Hierarchical matrices for approximating large covariance matries and computin...Hierarchical matrices for approximating large covariance matries and computin...
Hierarchical matrices for approximating large covariance matries and computin...Alexander Litvinenko
 
Tensor train to solve stochastic PDEs
Tensor train to solve stochastic PDEsTensor train to solve stochastic PDEs
Tensor train to solve stochastic PDEsAlexander Litvinenko
 
IVR - Chapter 1 - Introduction
IVR - Chapter 1 - IntroductionIVR - Chapter 1 - Introduction
IVR - Chapter 1 - IntroductionCharles Deledalle
 
Bayesian Deep Learning
Bayesian Deep LearningBayesian Deep Learning
Bayesian Deep LearningRayKim51
 
Variations on the method of Coleman-Chabauty
Variations on the method of Coleman-ChabautyVariations on the method of Coleman-Chabauty
Variations on the method of Coleman-Chabautymmasdeu
 
Efficient Analysis of high-dimensional data in tensor formats
Efficient Analysis of high-dimensional data in tensor formatsEfficient Analysis of high-dimensional data in tensor formats
Efficient Analysis of high-dimensional data in tensor formatsAlexander Litvinenko
 
Some Thoughts on Sampling
Some Thoughts on SamplingSome Thoughts on Sampling
Some Thoughts on SamplingDon Sheehy
 
Appendix to MLPI Lecture 2 - Monte Carlo Methods (Basics)
Appendix to MLPI Lecture 2 - Monte Carlo Methods (Basics)Appendix to MLPI Lecture 2 - Monte Carlo Methods (Basics)
Appendix to MLPI Lecture 2 - Monte Carlo Methods (Basics)Dahua Lin
 
Monte Carlo Methods
Monte Carlo MethodsMonte Carlo Methods
Monte Carlo MethodsJames Bell
 
Delayed acceptance for Metropolis-Hastings algorithms
Delayed acceptance for Metropolis-Hastings algorithmsDelayed acceptance for Metropolis-Hastings algorithms
Delayed acceptance for Metropolis-Hastings algorithmsChristian Robert
 
Unique fixed point theorems for generalized weakly contractive condition in o...
Unique fixed point theorems for generalized weakly contractive condition in o...Unique fixed point theorems for generalized weakly contractive condition in o...
Unique fixed point theorems for generalized weakly contractive condition in o...Alexander Decker
 
(α ψ)- Construction with q- function for coupled fixed point
(α   ψ)-  Construction with q- function for coupled fixed point(α   ψ)-  Construction with q- function for coupled fixed point
(α ψ)- Construction with q- function for coupled fixed pointAlexander Decker
 
Error control coding bch, reed-solomon etc..
Error control coding   bch, reed-solomon etc..Error control coding   bch, reed-solomon etc..
Error control coding bch, reed-solomon etc..Madhumita Tamhane
 

Similar to Decision Making with Hierarchical Credal Sets (IPMU 2014) (20)

The dual geometry of Shannon information
The dual geometry of Shannon informationThe dual geometry of Shannon information
The dual geometry of Shannon information
 
ABC with data cloning for MLE in state space models
ABC with data cloning for MLE in state space modelsABC with data cloning for MLE in state space models
ABC with data cloning for MLE in state space models
 
8803-09-lec16.pdf
8803-09-lec16.pdf8803-09-lec16.pdf
8803-09-lec16.pdf
 
On learning statistical mixtures maximizing the complete likelihood
On learning statistical mixtures maximizing the complete likelihoodOn learning statistical mixtures maximizing the complete likelihood
On learning statistical mixtures maximizing the complete likelihood
 
Rosser's theorem
Rosser's theoremRosser's theorem
Rosser's theorem
 
Hierarchical matrices for approximating large covariance matries and computin...
Hierarchical matrices for approximating large covariance matries and computin...Hierarchical matrices for approximating large covariance matries and computin...
Hierarchical matrices for approximating large covariance matries and computin...
 
Tensor train to solve stochastic PDEs
Tensor train to solve stochastic PDEsTensor train to solve stochastic PDEs
Tensor train to solve stochastic PDEs
 
IVR - Chapter 1 - Introduction
IVR - Chapter 1 - IntroductionIVR - Chapter 1 - Introduction
IVR - Chapter 1 - Introduction
 
Bayesian Deep Learning
Bayesian Deep LearningBayesian Deep Learning
Bayesian Deep Learning
 
Variations on the method of Coleman-Chabauty
Variations on the method of Coleman-ChabautyVariations on the method of Coleman-Chabauty
Variations on the method of Coleman-Chabauty
 
Efficient Analysis of high-dimensional data in tensor formats
Efficient Analysis of high-dimensional data in tensor formatsEfficient Analysis of high-dimensional data in tensor formats
Efficient Analysis of high-dimensional data in tensor formats
 
Some Thoughts on Sampling
Some Thoughts on SamplingSome Thoughts on Sampling
Some Thoughts on Sampling
 
Appendix to MLPI Lecture 2 - Monte Carlo Methods (Basics)
Appendix to MLPI Lecture 2 - Monte Carlo Methods (Basics)Appendix to MLPI Lecture 2 - Monte Carlo Methods (Basics)
Appendix to MLPI Lecture 2 - Monte Carlo Methods (Basics)
 
Monte Carlo Methods
Monte Carlo MethodsMonte Carlo Methods
Monte Carlo Methods
 
Delayed acceptance for Metropolis-Hastings algorithms
Delayed acceptance for Metropolis-Hastings algorithmsDelayed acceptance for Metropolis-Hastings algorithms
Delayed acceptance for Metropolis-Hastings algorithms
 
QMC Program: Trends and Advances in Monte Carlo Sampling Algorithms Workshop,...
QMC Program: Trends and Advances in Monte Carlo Sampling Algorithms Workshop,...QMC Program: Trends and Advances in Monte Carlo Sampling Algorithms Workshop,...
QMC Program: Trends and Advances in Monte Carlo Sampling Algorithms Workshop,...
 
Unique fixed point theorems for generalized weakly contractive condition in o...
Unique fixed point theorems for generalized weakly contractive condition in o...Unique fixed point theorems for generalized weakly contractive condition in o...
Unique fixed point theorems for generalized weakly contractive condition in o...
 
QMC: Operator Splitting Workshop, Using Sequences of Iterates in Inertial Met...
QMC: Operator Splitting Workshop, Using Sequences of Iterates in Inertial Met...QMC: Operator Splitting Workshop, Using Sequences of Iterates in Inertial Met...
QMC: Operator Splitting Workshop, Using Sequences of Iterates in Inertial Met...
 
(α ψ)- Construction with q- function for coupled fixed point
(α   ψ)-  Construction with q- function for coupled fixed point(α   ψ)-  Construction with q- function for coupled fixed point
(α ψ)- Construction with q- function for coupled fixed point
 
Error control coding bch, reed-solomon etc..
Error control coding   bch, reed-solomon etc..Error control coding   bch, reed-solomon etc..
Error control coding bch, reed-solomon etc..
 

Recently uploaded

Raman spectroscopy.pptx M Pharm, M Sc, Advanced Spectral Analysis
Raman spectroscopy.pptx M Pharm, M Sc, Advanced Spectral AnalysisRaman spectroscopy.pptx M Pharm, M Sc, Advanced Spectral Analysis
Raman spectroscopy.pptx M Pharm, M Sc, Advanced Spectral AnalysisDiwakar Mishra
 
All-domain Anomaly Resolution Office U.S. Department of Defense (U) Case: “Eg...
All-domain Anomaly Resolution Office U.S. Department of Defense (U) Case: “Eg...All-domain Anomaly Resolution Office U.S. Department of Defense (U) Case: “Eg...
All-domain Anomaly Resolution Office U.S. Department of Defense (U) Case: “Eg...Sérgio Sacani
 
Recombinant DNA technology (Immunological screening)
Recombinant DNA technology (Immunological screening)Recombinant DNA technology (Immunological screening)
Recombinant DNA technology (Immunological screening)PraveenaKalaiselvan1
 
Spermiogenesis or Spermateleosis or metamorphosis of spermatid
Spermiogenesis or Spermateleosis or metamorphosis of spermatidSpermiogenesis or Spermateleosis or metamorphosis of spermatid
Spermiogenesis or Spermateleosis or metamorphosis of spermatidSarthak Sekhar Mondal
 
Traditional Agroforestry System in India- Shifting Cultivation, Taungya, Home...
Traditional Agroforestry System in India- Shifting Cultivation, Taungya, Home...Traditional Agroforestry System in India- Shifting Cultivation, Taungya, Home...
Traditional Agroforestry System in India- Shifting Cultivation, Taungya, Home...jana861314
 
Orientation, design and principles of polyhouse
Orientation, design and principles of polyhouseOrientation, design and principles of polyhouse
Orientation, design and principles of polyhousejana861314
 
Unlocking the Potential: Deep dive into ocean of Ceramic Magnets.pptx
Unlocking  the Potential: Deep dive into ocean of Ceramic Magnets.pptxUnlocking  the Potential: Deep dive into ocean of Ceramic Magnets.pptx
Unlocking the Potential: Deep dive into ocean of Ceramic Magnets.pptxanandsmhk
 
Animal Communication- Auditory and Visual.pptx
Animal Communication- Auditory and Visual.pptxAnimal Communication- Auditory and Visual.pptx
Animal Communication- Auditory and Visual.pptxUmerFayaz5
 
Disentangling the origin of chemical differences using GHOST
Disentangling the origin of chemical differences using GHOSTDisentangling the origin of chemical differences using GHOST
Disentangling the origin of chemical differences using GHOSTSérgio Sacani
 
Recombination DNA Technology (Nucleic Acid Hybridization )
Recombination DNA Technology (Nucleic Acid Hybridization )Recombination DNA Technology (Nucleic Acid Hybridization )
Recombination DNA Technology (Nucleic Acid Hybridization )aarthirajkumar25
 
Types of different blotting techniques.pptx
Types of different blotting techniques.pptxTypes of different blotting techniques.pptx
Types of different blotting techniques.pptxkhadijarafiq2012
 
Isotopic evidence of long-lived volcanism on Io
Isotopic evidence of long-lived volcanism on IoIsotopic evidence of long-lived volcanism on Io
Isotopic evidence of long-lived volcanism on IoSérgio Sacani
 
Artificial Intelligence In Microbiology by Dr. Prince C P
Artificial Intelligence In Microbiology by Dr. Prince C PArtificial Intelligence In Microbiology by Dr. Prince C P
Artificial Intelligence In Microbiology by Dr. Prince C PPRINCE C P
 
Caco-2 cell permeability assay for drug absorption
Caco-2 cell permeability assay for drug absorptionCaco-2 cell permeability assay for drug absorption
Caco-2 cell permeability assay for drug absorptionPriyansha Singh
 
Presentation Vikram Lander by Vedansh Gupta.pptx
Presentation Vikram Lander by Vedansh Gupta.pptxPresentation Vikram Lander by Vedansh Gupta.pptx
Presentation Vikram Lander by Vedansh Gupta.pptxgindu3009
 
Analytical Profile of Coleus Forskohlii | Forskolin .pptx
Analytical Profile of Coleus Forskohlii | Forskolin .pptxAnalytical Profile of Coleus Forskohlii | Forskolin .pptx
Analytical Profile of Coleus Forskohlii | Forskolin .pptxSwapnil Therkar
 
PossibleEoarcheanRecordsoftheGeomagneticFieldPreservedintheIsuaSupracrustalBe...
PossibleEoarcheanRecordsoftheGeomagneticFieldPreservedintheIsuaSupracrustalBe...PossibleEoarcheanRecordsoftheGeomagneticFieldPreservedintheIsuaSupracrustalBe...
PossibleEoarcheanRecordsoftheGeomagneticFieldPreservedintheIsuaSupracrustalBe...Sérgio Sacani
 
Work, Energy and Power for class 10 ICSE Physics
Work, Energy and Power for class 10 ICSE PhysicsWork, Energy and Power for class 10 ICSE Physics
Work, Energy and Power for class 10 ICSE Physicsvishikhakeshava1
 

Recently uploaded (20)

Raman spectroscopy.pptx M Pharm, M Sc, Advanced Spectral Analysis
Raman spectroscopy.pptx M Pharm, M Sc, Advanced Spectral AnalysisRaman spectroscopy.pptx M Pharm, M Sc, Advanced Spectral Analysis
Raman spectroscopy.pptx M Pharm, M Sc, Advanced Spectral Analysis
 
All-domain Anomaly Resolution Office U.S. Department of Defense (U) Case: “Eg...
All-domain Anomaly Resolution Office U.S. Department of Defense (U) Case: “Eg...All-domain Anomaly Resolution Office U.S. Department of Defense (U) Case: “Eg...
All-domain Anomaly Resolution Office U.S. Department of Defense (U) Case: “Eg...
 
Recombinant DNA technology (Immunological screening)
Recombinant DNA technology (Immunological screening)Recombinant DNA technology (Immunological screening)
Recombinant DNA technology (Immunological screening)
 
Spermiogenesis or Spermateleosis or metamorphosis of spermatid
Spermiogenesis or Spermateleosis or metamorphosis of spermatidSpermiogenesis or Spermateleosis or metamorphosis of spermatid
Spermiogenesis or Spermateleosis or metamorphosis of spermatid
 
Traditional Agroforestry System in India- Shifting Cultivation, Taungya, Home...
Traditional Agroforestry System in India- Shifting Cultivation, Taungya, Home...Traditional Agroforestry System in India- Shifting Cultivation, Taungya, Home...
Traditional Agroforestry System in India- Shifting Cultivation, Taungya, Home...
 
Orientation, design and principles of polyhouse
Orientation, design and principles of polyhouseOrientation, design and principles of polyhouse
Orientation, design and principles of polyhouse
 
The Philosophy of Science
The Philosophy of ScienceThe Philosophy of Science
The Philosophy of Science
 
Unlocking the Potential: Deep dive into ocean of Ceramic Magnets.pptx
Unlocking  the Potential: Deep dive into ocean of Ceramic Magnets.pptxUnlocking  the Potential: Deep dive into ocean of Ceramic Magnets.pptx
Unlocking the Potential: Deep dive into ocean of Ceramic Magnets.pptx
 
Animal Communication- Auditory and Visual.pptx
Animal Communication- Auditory and Visual.pptxAnimal Communication- Auditory and Visual.pptx
Animal Communication- Auditory and Visual.pptx
 
Disentangling the origin of chemical differences using GHOST
Disentangling the origin of chemical differences using GHOSTDisentangling the origin of chemical differences using GHOST
Disentangling the origin of chemical differences using GHOST
 
Recombination DNA Technology (Nucleic Acid Hybridization )
Recombination DNA Technology (Nucleic Acid Hybridization )Recombination DNA Technology (Nucleic Acid Hybridization )
Recombination DNA Technology (Nucleic Acid Hybridization )
 
Types of different blotting techniques.pptx
Types of different blotting techniques.pptxTypes of different blotting techniques.pptx
Types of different blotting techniques.pptx
 
Isotopic evidence of long-lived volcanism on Io
Isotopic evidence of long-lived volcanism on IoIsotopic evidence of long-lived volcanism on Io
Isotopic evidence of long-lived volcanism on Io
 
Artificial Intelligence In Microbiology by Dr. Prince C P
Artificial Intelligence In Microbiology by Dr. Prince C PArtificial Intelligence In Microbiology by Dr. Prince C P
Artificial Intelligence In Microbiology by Dr. Prince C P
 
Caco-2 cell permeability assay for drug absorption
Caco-2 cell permeability assay for drug absorptionCaco-2 cell permeability assay for drug absorption
Caco-2 cell permeability assay for drug absorption
 
Presentation Vikram Lander by Vedansh Gupta.pptx
Presentation Vikram Lander by Vedansh Gupta.pptxPresentation Vikram Lander by Vedansh Gupta.pptx
Presentation Vikram Lander by Vedansh Gupta.pptx
 
Analytical Profile of Coleus Forskohlii | Forskolin .pptx
Analytical Profile of Coleus Forskohlii | Forskolin .pptxAnalytical Profile of Coleus Forskohlii | Forskolin .pptx
Analytical Profile of Coleus Forskohlii | Forskolin .pptx
 
PossibleEoarcheanRecordsoftheGeomagneticFieldPreservedintheIsuaSupracrustalBe...
PossibleEoarcheanRecordsoftheGeomagneticFieldPreservedintheIsuaSupracrustalBe...PossibleEoarcheanRecordsoftheGeomagneticFieldPreservedintheIsuaSupracrustalBe...
PossibleEoarcheanRecordsoftheGeomagneticFieldPreservedintheIsuaSupracrustalBe...
 
Engler and Prantl system of classification in plant taxonomy
Engler and Prantl system of classification in plant taxonomyEngler and Prantl system of classification in plant taxonomy
Engler and Prantl system of classification in plant taxonomy
 
Work, Energy and Power for class 10 ICSE Physics
Work, Energy and Power for class 10 ICSE PhysicsWork, Energy and Power for class 10 ICSE Physics
Work, Energy and Power for class 10 ICSE Physics
 

Decision Making with Hierarchical Credal Sets (IPMU 2014)

  • 1. Decision Making with Hierarchical Credal Sets Alessandro Antonucci1 , Alexander Karlsson2 , and David Sundgren3 (1) IDSIA (Switzerland) (2) University of Sk¨ovde (Sweden) (3) Stockholm University (Sweden) IPMU 2014, Montpellier, July 18th, 2014
  • 2. Outline Background on credal sets and hierarchical models Credal sets and NOT hierarchical models Hierarchical credal sets Decision making with hierarchical credal sets Application to credal classification Conclusions and outlooks
  • 3. Background on credal sets and hierarchical models Model of uncertainty about variable X taking values in ΩX Estimating the (expected) value of f : ΩX → R Probability mass function P(X) EP [f ] := x∈ΩX P(x) · f (x) Credal set K(X) (convex set of mass functions) EK [f ] := minP(X)∈K(X) x∈ΩX P(x) · f (x) Hierarchical model [K(X), π(Θ)] EK,π[f ] := ΩΘ EPθ [f ] · π(θ) · dθ = EPK,π [f ] where {Pθ(X)}θ∈ΩΘ = K(X) and PK,π(X) := ΩΘ Pθ(X) · π(θ) · dθ (weighted CoM)
  • 4. (Of course) credal sets are not hierarchical models
    – Parametrize with \Theta even a pure credal set K(X): \underline{E}_K[f] = E_{P^*}[f] for at least one P^*(X) \in K(X) [with P^*(X) = P_{\theta^*}(X)]
    – The (improper) prior \pi(\theta) = \delta_{\theta,\theta^*} gives \underline{E}_K, but only for this f!
    – Different priors for different f ⇒ a set of priors, i.e., a credal set over \Theta: it should be the vacuous one, K_0(\Theta)
    – So credal sets are (sort of) hierarchical models, but a vacuous credal set must be placed on the second level: K(X) \equiv [P_\Theta(X), K_0(\Theta)]
    – For credal networks, this is the Cano-Cano-Moral transformation!
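The Dirac-prior observation above amounts to a one-line derivation (our notation, with \theta^* a minimizer of the parametrized expectation):

```latex
\theta^* \in \arg\min_{\theta \in \Omega_\Theta} E_{P_\theta}[f]
\quad\Longrightarrow\quad
E_{K,\delta_{\theta^*}}[f]
= \int_{\Omega_\Theta} E_{P_\theta}[f]\,\delta_{\theta,\theta^*}\,\mathrm{d}\theta
= E_{P_{\theta^*}}[f]
= \underline{E}_K[f].
```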
  • 5. Hierarchical credal sets
    – Hierarchical model [P_\Theta(X), \pi(\Theta)]; (hierarchical view of) credal sets [P_\Theta(X), K_0(\Theta)]
    – A "hierarchical credal set" [P_\Theta(X), K'(\Theta)] is equivalent to K'(X) = \{ \int_{\Omega_\Theta} P_\theta(X) \cdot \pi(\theta)\, \mathrm{d}\theta : \pi(\Theta) \in K'(\Theta) \} \subseteq K(X)
    – Trade-off between realism/cautiousness and informativeness: \underline{E}_K[f] \le \underline{E}_{K'}[f] \le E_{K,\pi}[f] \le \overline{E}_{K'}[f] \le \overline{E}_K[f], assuming \pi(\Theta) \in K'(\Theta)
    – How to choose K'(\Theta)?
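The inequality chain above can be checked numerically: whatever prior is placed on the parametrization, the resulting hierarchical expectation stays between the lower and upper expectations of K(X). A hedged Monte Carlo sketch with an invented credal set and a stand-in prior (uniform Dirichlet weights over the vertices):

```python
import random

# Sketch: for any prior over the parametrization of K(X), the hierarchical
# expectation E_{K,pi}[f] lies in [lower, upper].  Credal set and prior
# are invented for illustration.
random.seed(0)

vertices = [[0.1, 0.9], [0.6, 0.4], [0.3, 0.7]]  # extreme points of K(X)
f = [1.0, 0.0]                                    # indicator gamble

def expectation(p, f):
    return sum(pi * fi for pi, fi in zip(p, f))

lower = min(expectation(v, f) for v in vertices)
upper = max(expectation(v, f) for v in vertices)

def random_simplex_weights(n):
    # Dirichlet(1, ..., 1) weights via normalized Gamma(1) draws.
    g = [random.gammavariate(1.0, 1.0) for _ in range(n)]
    s = sum(g)
    return [x / s for x in g]

# Monte Carlo estimate of E_{K,pi}[f] under this stand-in prior.
total, m = 0.0, 5000
for _ in range(m):
    w = random_simplex_weights(len(vertices))
    p = [sum(wi * v[i] for wi, v in zip(w, vertices)) for i in range(len(f))]
    total += expectation(p, f)
e_hier = total / m

assert lower <= e_hier <= upper  # the chain of inequalities holds
```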
  • 6. Shrinking (but not too much!)
    – Likelihood-based learning of credal sets [Cattaneo]: \pi(\Theta) \propto P_\Theta(D)
    – Model revision: \pi(\Theta) \to K_\alpha(\Theta) = \{ \pi'(\Theta) : \pi'(\theta) = 0 \text{ if } \pi(\theta) < \alpha \cdot \pi(\theta_{ML}) \}; then cope with [P_\Theta(X), K_\alpha(\Theta)]
    – Shifted Dirichlet prior [Karlsson & Sundgren]: a prior over credal sets induced by probability intervals, \pi_{s,t}(\Theta) \propto \prod_{i=1}^n [\theta_i - \underline{P}(x_i)]^{s t_i - 1}, with P_{K,\pi}(x_i) = \underline{P}(x_i) + t_i [1 - \sum_{j=1}^n \underline{P}(x_j)]
    – Back to an imprecise model? Sampling from K(X) based on \pi_{s,t}
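The shifted Dirichlet prior can be sampled by an affine change of variables: subtracting the lower probabilities turns \pi_{s,t} into an ordinary Dirichlet with parameters s·t_i on the leftover mass. A hedged sketch (function names and numerical values are ours, not from the paper):

```python
import random

def dirichlet(alphas, rng):
    """Sample an ordinary Dirichlet via normalized Gamma draws."""
    g = [rng.gammavariate(a, 1.0) for a in alphas]
    s = sum(g)
    return [x / s for x in g]

def sample_shifted_dirichlet(p_low, s, t, rng):
    """Sample theta ~ pi_{s,t}: a Dirichlet(s * t) rescaled into the
    probability-interval credal set {theta : theta_i >= p_low[i]}."""
    slack = 1.0 - sum(p_low)           # mass left above the lower bounds
    u = dirichlet([s * ti for ti in t], rng)
    return [pl + slack * ui for pl, ui in zip(p_low, u)]

rng = random.Random(1)
p_low = [0.2, 0.1, 0.3]                # lower probabilities, sum < 1
t = [0.5, 0.3, 0.2]                    # t_i > 0, summing to 1
theta = sample_shifted_dirichlet(p_low, 2.0, t, rng)

# Every sample is a mass function dominating the lower probabilities;
# the sample mean recovers P_{K,pi}(x_i) = p_low[i] + t[i] * slack.
assert abs(sum(theta) - 1.0) < 1e-9
assert all(th >= pl for th, pl in zip(theta, p_low))
```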
  • 7. Sampling from a credal set
    – A swarm of "particles": K(X) \supset \{P^k(X)\} \sim \pi_{s,t}(\Theta)
    – Weighted sampling from polytopes as a two-step process: (i) uniform sampling by convex combination of the vertices (combination weights uniformly sampled from the simplex); (ii) "sampling from the sample" (discrete sampling weighted by the prior)
    – For big swarms the empirical and theoretical CoMs coincide
    – Heuristics to remove particles: KL distance from the CoM
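The two-step scheme above can be sketched as follows; the prior density used in the resampling step is an arbitrary stand-in (not the paper's \pi_{s,t}), and all names are ours:

```python
import random

def simplex_uniform(n, rng):
    """Uniform weights on the n-simplex via normalized exponentials."""
    e = [rng.expovariate(1.0) for _ in range(n)]
    s = sum(e)
    return [x / s for x in e]

def sample_swarm(vertices, prior, m, rng):
    dim = len(vertices[0])
    # (i) candidate mass functions inside the polytope, as convex
    #     combinations of the vertices with simplex-uniform weights
    cand = []
    for _ in range(m):
        w = simplex_uniform(len(vertices), rng)
        cand.append([sum(wi * v[i] for wi, v in zip(w, vertices))
                     for i in range(dim)])
    # (ii) "sampling from the sample": resample the candidates with
    #      probability proportional to the prior density
    weights = [prior(p) for p in cand]
    return rng.choices(cand, weights=weights, k=m)

rng = random.Random(7)
vertices = [[0.1, 0.9], [0.7, 0.3]]   # a segment of mass functions
swarm = sample_swarm(vertices, lambda p: p[0], 2000, rng)
com = [sum(p[i] for p in swarm) / len(swarm) for i in range(2)]
# The empirical CoM is pulled toward the high-prior end of the segment.
```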
  • 8. Application to decision making
    – Simplest DM task: the most probable state, x^* := \arg\max_x P(x)
    – With K(X): \Omega^*_X = \{x^* \in \Omega_X \mid \exists P(X) \in K(X) : x^* = \arg\max_x P(x)\}
    – With [K(X), \pi_{s,t}(\Theta)]: x^* := \arg\max_x P_{K,\pi_{s,t}}(x)
    – Alternatively: [K(X), \pi_{s,t}(\Theta)] \to \{P^j(X)\}_{j=1}^m; shrink it to K'(X) (heuristics); take the decision with K'(X)
    (Figure: particles on the probability simplex with coordinates P(x_1), P(x_2), P(x_3))
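The credal decision rule can be approximated on a particle swarm by collecting every state that is most probable under at least one particle. A sketch with invented particles; with finitely many candidates this only under-approximates \Omega^*_X:

```python
def possibly_optimal(candidates):
    """States x attaining arg max P(x) for at least one candidate P(X):
    a finite-sample approximation of Omega*_X."""
    opt = set()
    for p in candidates:
        m = max(p.values())
        opt.update(x for x, px in p.items() if px == m)
    return opt

# Two hypothetical particles drawn from the credal set.
particles = [{"a": 0.5, "b": 0.3, "c": 0.2},
             {"a": 0.3, "b": 0.4, "c": 0.3}]
print(sorted(possibly_optimal(particles)))  # → ['a', 'b']
```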
  • 15. Testing the approach on a (credal) classification setup
    – Classification setup: class C and features F; given an instance of the features F = \tilde{f}, which c \in \Omega_C?
    – (B) Naive Bayes: P(c, f) = P(c) \prod_i P(f_i | c); decision based on P(C | \tilde{f})
    – (C) Naive credal: K(C) and P(F_i | c) learned by the (local) IDM; decision based on an (outer approximation of) K(C | \tilde{f})
    – (H) Hierarchical/credal approach on K(C | \tilde{f}); priors can be easily propagated (multiplied) provided that someone assessed them
    – (C) and (H) are credal classifiers (possibly more than a single class in output); the accuracy of (B) is compared with the utility-based performance descriptor for (C) and (H) [Zaffalon et al., 2014]
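The precise baseline (B) reduces to a product of class-conditional terms followed by normalization. A minimal sketch with invented toy parameters:

```python
def naive_bayes_posterior(prior, cond, features):
    """P(c | f~) obtained by normalizing P(c) * prod_i P(f_i | c)."""
    scores = dict(prior)
    for i, fi in enumerate(features):
        for c in scores:
            scores[c] *= cond[c][i][fi]
    z = sum(scores.values())
    return {c: s / z for c, s in scores.items()}

# Hypothetical two-class problem with a single binary feature.
prior = {"c1": 0.6, "c2": 0.4}
cond = {"c1": [{"hi": 0.8, "lo": 0.2}],
        "c2": [{"hi": 0.3, "lo": 0.7}]}
post = naive_bayes_posterior(prior, cond, ["hi"])
```

Here `post["c1"]` evaluates to 0.6·0.8 / (0.6·0.8 + 0.4·0.3) ≈ 0.8; the decision rule of (B) then picks the arg-max class.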
  • 16. Results

    Dataset    n    d    (B)    (C)    (H)
    Lenses    24    3   77.2   53.7   72.2
    Labor     51    2   87.0   92.7   93.7
    Hayes    160    4   59.5   51.1   72.4
    Monk     556    2   64.1   70.6   72.9
  • 17. Conclusions and outlooks
    – A (better?) formalization of the relation between hierarchical and imprecise-probabilistic models
    – Heuristics to take more informative decisions in credal networks (provided that a prior can be assessed)
    – To do: better heuristics (finding the smallest credal set covering a given number of particles can be done with MILP); more ambitiously, a sound approach to learn K'(\Theta); release an R package for this