We elaborate on hierarchical credal sets, which are sets of probability mass functions paired with second-order distributions. A new criterion to make decisions based on these models is proposed. This is achieved by sampling from the set of mass functions and considering the Kullback-Leibler divergence from the weighted center of mass of the set. We evaluate this criterion in a simple classification scenario: the results show performance improvements when compared to a credal classifier where the second-order distribution is not taken into account.
Decision Making with Hierarchical Credal Sets (IPMU 2014)
1. Decision Making with Hierarchical Credal Sets
Alessandro Antonucci (1), Alexander Karlsson (2), David Sundgren (3)
(1) IDSIA (Switzerland)
(2) University of Skövde (Sweden)
(3) Stockholm University (Sweden)
IPMU 2014, Montpellier, July 18th, 2014
2. Outline
Background on credal sets and hierarchical models
Credal sets are not hierarchical models
Hierarchical credal sets
Decision making with hierarchical credal sets
Application to credal classification
Conclusions and outlooks
3. Background on credal sets and hierarchical models
Model of uncertainty about a variable X taking values in $\Omega_X$; goal: estimating the expected value of $f : \Omega_X \to \mathbb{R}$.
Probability mass function $P(X)$:
$E_P[f] := \sum_{x \in \Omega_X} P(x) \cdot f(x)$
Credal set $K(X)$ (convex set of mass functions):
$\underline{E}_K[f] := \min_{P(X) \in K(X)} \sum_{x \in \Omega_X} P(x) \cdot f(x)$
Hierarchical model $[K(X), \pi(\Theta)]$:
$E_{K,\pi}[f] := \int_{\Omega_\Theta} E_{P_\theta}[f] \cdot \pi(\theta) \, d\theta = E_{P_{K,\pi}}[f]$
where $\{P_\theta(X)\}_{\theta \in \Omega_\Theta} = K(X)$ and $P_{K,\pi}(X) := \int_{\Omega_\Theta} P_\theta(X) \cdot \pi(\theta) \, d\theta$ (the weighted center of mass).
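The three expectations above can be contrasted numerically. A minimal sketch, with a hypothetical three-state credal set given by its vertices and the hierarchical model approximated by Monte Carlo over uniform Dirichlet weights (all numbers are illustrative, not from the paper):

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical credal set K(X) over Omega_X = {x1, x2, x3}, given by its
# extreme points, and a payoff function f on Omega_X.
vertices = np.array([[0.5, 0.3, 0.2],
                     [0.2, 0.5, 0.3],
                     [0.3, 0.2, 0.5]])
f = np.array([1.0, 0.0, -1.0])

# Precise model: E_P[f] = sum_x P(x) f(x) for a single mass function.
E_P = vertices[0] @ f

# Credal set: the lower expectation is attained at a vertex of the polytope.
E_lower = min(v @ f for v in vertices)

# Hierarchical model: E_{K,pi}[f] = E_{P_{K,pi}}[f], approximated by Monte
# Carlo over theta (here a uniform Dirichlet prior on the vertex weights).
weights = rng.dirichlet(np.ones(len(vertices)), size=10_000)
center_of_mass = (weights @ vertices).mean(axis=0)  # weighted CoM of K(X)
E_hier = center_of_mass @ f

# The lower expectation bounds the hierarchical one from below.
assert E_lower <= E_hier + 1e-9
```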
4. (Of course) Credal sets are not hierarchical models
Parametrization with $\Theta$ even with a pure credal set $K(X)$:
$\underline{E}_K[f] = E_{P^*}[f]$ for at least one $P^*(X) \in K(X)$, with $P^*(X) = P_{\theta^*}(X)$.
The (improper) prior $\pi(\theta) = \delta_{\theta,\theta^*}$ reproduces $\underline{E}_K$, but only for this particular $f$!
Different priors for different $f$ $\Rightarrow$ a set of priors.
A credal set over $\Theta$: it should be the vacuous one, $K_0(\Theta)$.
Credal sets are (sort of) hierarchical models, but a vacuous credal set should be placed on the second level:
$K(X) \equiv [P_\Theta(X), K_0(\Theta)]$
For credal networks, this is the Cano-Cano-Moral transformation!
5. Hierarchical credal sets
Hierarchical model: $[P_\Theta(X), \pi(\Theta)]$
(Hierarchical view of) credal sets: $[P_\Theta(X), K_0(\Theta)]$
"Hierarchical credal set" $[P_\Theta(X), K'(\Theta)]$, equivalent to
$K'(X) = \left\{ \int_{\Omega_\Theta} P_\theta(X) \cdot \pi(\theta) \, d\theta : \pi(\Theta) \in K'(\Theta) \right\} \subseteq K(X)$
Trade-off between realism/cautiousness and informativeness:
$\underline{E}_K[f] \le \underline{E}_{K'}[f] \le E_{K,\pi}[f] \le \overline{E}_{K'}[f] \le \overline{E}_K[f]$
assuming $\pi(\Theta) \in K'(\Theta)$.
How to choose $K'(\Theta)$?
6. Shrinking (but not too much!)
Likelihood-based learning of credal sets [Cattaneo]:
$\pi(\Theta) \propto P_\Theta(\mathcal{D})$
Model revision:
$\pi(\Theta) \to K_\alpha(\Theta) = \{ \pi'(\Theta) : \pi'(\theta) = 0 \text{ if } \pi(\theta) < \alpha \cdot \pi(\theta_{ML}) \}$
Cope with $[P_\Theta(X), K_\alpha(\Theta)]$
Shifted Dirichlet prior [Karlsson & Sundgren]: a prior over credal sets induced by probability intervals
$\pi_{s,t}(\Theta) \propto \prod_{i=1}^n [\theta_i - \underline{P}(x_i)]^{s t_i - 1}$
$P_{K,\pi}(x_i) = \underline{P}(x_i) + t_i \left[ 1 - \sum_{j=1}^n \underline{P}(x_j) \right]$
Back to an imprecise model?
Sampling from $K(X)$ based on $\pi_{s,t}$
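The center-of-mass formula above is easy to evaluate in closed form. A minimal sketch with hypothetical lower bounds and shift parameters (not values from the paper):

```python
import numpy as np

# Hypothetical probability intervals via lower bounds P_low (the credal set
# of mass functions dominating them) and shifted-Dirichlet parameters t
# (t_i >= 0, sum(t) = 1) controlling where the prior mass concentrates.
P_low = np.array([0.2, 0.1, 0.3])
t = np.array([0.5, 0.3, 0.2])

# Center of mass of the shifted Dirichlet: each state gets its lower bound
# plus a t_i-share of the leftover mass 1 - sum_j P_low(x_j).
leftover = 1.0 - P_low.sum()
P_com = P_low + t * leftover

# P_com is a valid mass function dominating the lower bounds.
assert np.isclose(P_com.sum(), 1.0)
assert np.all(P_com >= P_low)
```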
7. Sampling from a credal set
A swarm of "particles": $K(X) \supset \{P_k(X)\} \sim \pi_{s,t}(\Theta)$
Weighted sampling from polytopes as a two-step process:
(i) Uniform sampling by convex combination of the vertices
(convex combination weights sampled uniformly from the simplex)
(ii) "Sampling from the sample"
(discrete resampling weighted by the prior)
For large swarms, the empirical and theoretical CoMs coincide
Heuristics to remove particles: Kullback-Leibler divergence from the CoM
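The two-step sampling scheme can be sketched as follows. The credal set, the prior, and all parameter values are hypothetical; the Dirichlet(1,...,1) convex combination is uniform over the polytope when the vertices are affinely independent, as in this triangle example:

```python
import numpy as np

rng = np.random.default_rng(0)

def sample_credal_set(vertices, prior, n_uniform=5000, n_out=500):
    """Two-step weighted sampling from a polytope of mass functions.

    (i)  Convex combinations of the vertices with Dirichlet(1,...,1)
         weights drawn uniformly from the simplex.
    (ii) 'Sampling from the sample': discrete resampling of the particles
         with probabilities proportional to the prior density.
    """
    k = len(vertices)
    weights = rng.dirichlet(np.ones(k), size=n_uniform)
    particles = weights @ vertices                 # step (i): points in K(X)
    p = np.array([prior(P) for P in particles])
    p = p / p.sum()                                # normalize prior weights
    idx = rng.choice(n_uniform, size=n_out, p=p)   # step (ii): resample
    return particles[idx]

# Hypothetical triangular credal set and a prior peaked near its CoM.
vertices = np.array([[0.6, 0.2, 0.2],
                     [0.2, 0.6, 0.2],
                     [0.2, 0.2, 0.6]])
com = vertices.mean(axis=0)
prior = lambda P: np.exp(-50 * np.sum((P - com) ** 2))

swarm = sample_credal_set(vertices, prior)
print(swarm.mean(axis=0))  # empirical CoM, close to the theoretical one
```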
8. Application to decision making
Simplest DM task: most probable state $x^* := \arg\max_x P(x)$
With $K(X)$: $\Omega^*_X = \{ x^* \in \Omega_X \mid \exists P(X) \in K(X) : x^* = \arg\max_x P(x) \}$
With $[K(X), \pi_{s,t}(\Theta)]$: $x^* := \arg\max_x P_{K,\pi_{s,t}}(x)$
Alternatively:
$[K(X), \pi_{s,t}(\Theta)] \to \{P_j(X)\}_{j=1}^m$
Shrink it to $K'(X)$ (heuristics)
Take the decision with $K'(X)$
[Figure: particle swarm on the probability simplex over $P(x_1)$, $P(x_2)$, $P(x_3)$]
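The two decision rules can be contrasted on a particle swarm. A minimal sketch with a hypothetical credal set (numbers are illustrative only): the credal rule returns every state that is most probable under some particle, while the hierarchical rule returns the single most probable state under the center of mass.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical particle swarm {P_j(X)} drawn from a credal set by convex
# combination of its vertices with uniform Dirichlet weights.
vertices = np.array([[0.5, 0.3, 0.2],
                     [0.3, 0.45, 0.25],
                     [0.35, 0.3, 0.35]])
weights = rng.dirichlet(np.ones(3), size=1000)
particles = weights @ vertices

# Credal decision: all states most probable under SOME particle.
credal_decision = set(np.argmax(particles, axis=1))

# Hierarchical decision: single most probable state under the CoM.
com_decision = int(np.argmax(particles.mean(axis=0)))

print(credal_decision, com_decision)
# The single hierarchical decision always lies in the credal output set.
assert com_decision in credal_decision
```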
15. Testing the approach on a (credal) classification setup
Classification setup: class $C$ and features $F$.
Given an instance of the features $F = \tilde{f}$, which $c \in \Omega_C$?
(B) Naive Bayes: $P(c, f) = P(c) \prod_i P(f_i \mid c)$; decision based on $P(C \mid \tilde{f})$.
(C) Naive credal: $K(C)$ and $P(F_i \mid c)$ learned by the (local) IDM; decision based on an outer approximation of $K(C \mid \tilde{f})$.
(H) Hierarchical/credal approach on $K(C \mid \tilde{f})$: priors can be easily propagated (multiplied), provided that someone assessed them.
(C) and (H) are credal classifiers (possibly more than a single class in output).
Accuracy of (B) compared with a utility-based performance descriptor for (C) and (H) [Zaffalon et al., 2014].
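Set-valued outputs make plain accuracy incomparable across classifiers, hence the utility-based descriptor. A minimal sketch of one common instance, u65, assuming the quadratic discounted-accuracy form from the utility-based evaluation literature (the class labels are hypothetical):

```python
def discounted_accuracy(pred_set, true_class):
    # d = 1/|Z| if the true class is in the output set Z, else 0.
    return 1.0 / len(pred_set) if true_class in pred_set else 0.0

def u65(d):
    # Quadratic utility rewarding informative (small) output sets:
    # u65(0) = 0, u65(1/2) = 0.65, u65(1) = 1.
    return -0.6 * d ** 2 + 1.6 * d

# A correct single-class output scores 1; a correct but vacuous
# two-class output scores 0.65; a wrong output scores 0.
print(u65(discounted_accuracy({"a"}, "a")))       # -> 1.0
print(u65(discounted_accuracy({"a", "b"}, "a")))  # -> about 0.65
print(u65(discounted_accuracy({"b"}, "a")))       # -> 0.0
```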
17. Conclusions and outlooks
A (better?) formalization of the relation between hierarchical and imprecise-probabilistic models
Heuristics to take more informative decisions in credal networks (provided that a prior can be assessed)
To do:
Better heuristics: finding the smallest credal set covering a given number of particles can be done with MILP
More ambitiously: a sound approach to learn $K'(\Theta)$
Release an R package for this