Learning graphical models in high dimensional settings
Apr 04, 2017 - Apr 07, 2017
ICMS, 15 South College Street
Probabilistic Relational Models (PRMs) extend Bayesian networks to work with relational databases rather than propositional data. Existing approaches for PRM structure learning are inspired from classical methods of learning the BN structure. We will present in this talk our works about:
- hybrid structure learning methods, with our adaptation of the Max-Min Hill Climbing (MMHC) algorithm proposed for Bayesian networks.
- exact and anytime structure learning methods, with a preliminary work.
All-domain Anomaly Resolution Office U.S. Department of Defense (U) Case: “Eg...
Learning Probabilistic Relational Models
1. Background Hybrid Structure Learning Exact Structure Learning Conclusion
Learning Probabilistic Relational Models
Philippe Leray, Mouna Ben Ishak, Nahla Ben Amor,
Nourhene Ettouzi, Montassar Ben Messaoud
LS2N, Nantes University, France
LARODEC, ISG Tunis & Sousse, Tunisia
ICMS Workshop
Learning graphical models in high dimensional settings
4-7 April 2017, Edinburgh, UK
PRM structure learning Ph. Leray et al. 4-7 April 2017 1 / 24
2. Background Hybrid Structure Learning Exact Structure Learning Conclusion
Motivations
Probabilistic Relational Models (PRMs) extend Bayesian
networks to work with relational databases rather than
propositional data
Vote
Rating
Movie User
RealiseDate
Genre
AgeGender
Occupation
0.60.4
FM
User.Gender
0.40.6Comedy, F
0.50.5Comedy, M
0.10.9Horror, F
0.80.2Horror, M
0.70.3Drama, F
0.50.5Drama, M
HighLow
Votes.RatingMovie.GenreUser.Gender
PRM structure learning Ph. Leray et al. 4-7 April 2017 2 / 24
3. Background Hybrid Structure Learning Exact Structure Learning Conclusion
Motivations
Our goal : PRM structure learning from a relational
database
Only few works, approximate methods (score-based and
constraint-based), inspired from classical BNs learning
approaches, were proposed to learn PRM structure
Our contributions
an approximate hybrid method [Ben Ishak et al., 2015]
an exact score-based method [Etouzzi et al., 2016]
PRM structure learning Ph. Leray et al. 4-7 April 2017 3 / 24
4. Background Hybrid Structure Learning Exact Structure Learning Conclusion
Motivations
Our goal : PRM structure learning from a relational
database
Only few works, approximate methods (score-based and
constraint-based), inspired from classical BNs learning
approaches, were proposed to learn PRM structure
Our contributions
an approximate hybrid method [Ben Ishak et al., 2015]
an exact score-based method [Etouzzi et al., 2016]
PRM structure learning Ph. Leray et al. 4-7 April 2017 3 / 24
5. Background Hybrid Structure Learning Exact Structure Learning Conclusion
Motivations
Our goal : PRM structure learning from a relational
database
Only few works, approximate methods (score-based and
constraint-based), inspired from classical BNs learning
approaches, were proposed to learn PRM structure
Our contributions
an approximate hybrid method [Ben Ishak et al., 2015]
an exact score-based method [Etouzzi et al., 2016]
PRM structure learning Ph. Leray et al. 4-7 April 2017 3 / 24
7. Background Hybrid Structure Learning Exact Structure Learning Conclusion
Relational data
Relational schema R
Movie
User
Vote
Movie
User
Rating
Gender
Age
Occupation
RealiseDate
Genre
Definitions
classes + attributes, X.A denotes
an attribute A of a class X
reference slots = foreign keys (e.g.
Vote.Movie, Vote.User)
inverse reference slots (e.g.
User.User−1)
slot chain = a sequence of
(inverse) reference slots
ex: Vote.User.User−1
.Movie: all
the movies voted by a particular
user
PRM structure learning Ph. Leray et al. 4-7 April 2017 5 / 24
8. Background Hybrid Structure Learning Exact Structure Learning Conclusion
Relational data
Relational schema R
Movie
User
Vote
Movie
User
Rating
Gender
Age
Occupation
RealiseDate
Genre
Definitions
classes + attributes, X.A denotes
an attribute A of a class X
reference slots = foreign keys (e.g.
Vote.Movie, Vote.User)
inverse reference slots (e.g.
User.User−1)
slot chain = a sequence of
(inverse) reference slots
ex: Vote.User.User−1
.Movie: all
the movies voted by a particular
user
PRM structure learning Ph. Leray et al. 4-7 April 2017 5 / 24
9. Background Hybrid Structure Learning Exact Structure Learning Conclusion
Relational data
Relational schema R
Movie
User
Vote
Movie
User
Rating
Gender
Age
Occupation
RealiseDate
Genre
Definitions
classes + attributes, X.A denotes
an attribute A of a class X
reference slots = foreign keys (e.g.
Vote.Movie, Vote.User)
inverse reference slots (e.g.
User.User−1)
slot chain = a sequence of
(inverse) reference slots
ex: Vote.User.User−1
.Movie: all
the movies voted by a particular
user
PRM structure learning Ph. Leray et al. 4-7 April 2017 5 / 24
10. Background Hybrid Structure Learning Exact Structure Learning Conclusion
Relational data
Relational schema R
Movie
User
Vote
Movie
User
Rating
Gender
Age
Occupation
RealiseDate
Genre
Definitions
classes + attributes, X.A denotes
an attribute A of a class X
reference slots = foreign keys (e.g.
Vote.Movie, Vote.User)
inverse reference slots (e.g.
User.User−1)
slot chain = a sequence of
(inverse) reference slots
ex: Vote.User.User−1
.Movie: all
the movies voted by a particular
user
PRM structure learning Ph. Leray et al. 4-7 April 2017 5 / 24
11. Background Hybrid Structure Learning Exact Structure Learning Conclusion
Relational data
Relational database
Database = an instance of the relational schema
set of objects for each class
with a value for each reference slot and each attribute
Age
Rating
Age
Gender
Occupation
Age
Gender
Occupation
Gender
Occupation
Genre
RealiseDate
Genre
Genre
Genre
Genre
U1
U2
U3
M1
M2
M3
M4
M5
#U1, #M1
Rating
#U1, #M2
Rating
#U2, #M1
Rating
#U2, #M3
Rating
#U2, #M4
Rating
#U3, #M1
Rating
#U3, #M2
Rating
#U3, #M3
Rating
#U3, #M5
RealiseDate
RealiseDate
RealiseDate
RealiseDate
PRM structure learning Ph. Leray et al. 4-7 April 2017 6 / 24
12. Background Hybrid Structure Learning Exact Structure Learning Conclusion
Probabilistic relational models
Probabilistic Relational Models [Koller & Pfeffer, 1998]
Definition
A PRM Π associated to a relational
schema R:
a qualitative dependency
structure S (with possible long
slot chains and aggregation
functions)
a set of parameters θS
Vote
Rating
Movie User
RealiseDate
Genre
AgeGender
Occupation
0.60.4
FM
User.Gender
0.40.6Comedy, F
0.50.5Comedy, M
0.10.9Horror, F
0.80.2Horror, M
0.70.3Drama, F
0.50.5Drama, M
HighLow
Votes.RatingMovie.Genre
User.Gender
Aggregators
Mode(Vote.User.User−1.Movie.genre) → Vote.rating
movie rating from one user can be dependent with the most
frequent genre of movies voted by this user
PRM structure learning Ph. Leray et al. 4-7 April 2017 7 / 24
13. Background Hybrid Structure Learning Exact Structure Learning Conclusion
PRM Structure Learning
PRM structure learning
Task definition
given a relational database
finding the ”meta” probabilistic model (dependency structure
S (with slot chains and aggregation functions) and its
parameters θS
Vote
Rating
Movie User
RealiseDate
Genre
AgeGender
Occupation
0.60.4
FM
User.Gender
0.40.6Comedy, F
0.50.5Comedy, M
0.10.9Horror, F
0.80.2Horror, M
0.70.3Drama, F
0.50.5Drama, M
HighLow
Votes.RatingMovie.Genre
User.Gender
PRM structure learning Ph. Leray et al. 4-7 April 2017 8 / 24
14. Background Hybrid Structure Learning Exact Structure Learning Conclusion
PRM Structure Learning
PRM structure learning
Task definition
given a relational database
finding the ”meta” probabilistic model (dependency structure
S (with slot chains and aggregation functions) and its
parameters θS
What’s the difference with BN SL ?
adding another dimension in the search space (slot chain
length x aggregators) + limitation to a given maximal length
possible existence of multiple dependencies between 2
attributes
PRM structure learning Ph. Leray et al. 4-7 April 2017 8 / 24
15. Background Hybrid Structure Learning Exact Structure Learning Conclusion
PRM Structure Learning
PRM structure learning
Constraint-based methods
relational PC [Maier et al., 2010] relational CD [Maier et al.,
2013], rCD light [Lee and Honavar, 2016]
Score-based methods
greedy search [Getoor et al., 2007]
Hybrid methods
spanning tree + particle swarm optimization [Li & He, 2014]
PRM structure learning Ph. Leray et al. 4-7 April 2017 9 / 24
16. Background Hybrid Structure Learning Exact Structure Learning Conclusion
PRM Structure Learning
PRM structure learning
Constraint-based methods
relational PC [Maier et al., 2010] relational CD [Maier et al.,
2013], rCD light [Lee and Honavar, 2016]
Score-based methods
greedy search [Getoor et al., 2007]
Hybrid methods
spanning tree + particle swarm optimization [Li & He, 2014]
PRM structure learning Ph. Leray et al. 4-7 April 2017 9 / 24
17. Background Hybrid Structure Learning Exact Structure Learning Conclusion
PRM Structure Learning
PRM structure learning
Constraint-based methods
relational PC [Maier et al., 2010] relational CD [Maier et al.,
2013], rCD light [Lee and Honavar, 2016]
Score-based methods
greedy search [Getoor et al., 2007]
Hybrid methods
spanning tree + particle swarm optimization [Li & He, 2014]
PRM structure learning Ph. Leray et al. 4-7 April 2017 9 / 24
19. Background Hybrid Structure Learning Exact Structure Learning Conclusion
Hybrid PRM structure learning
Hybrid BN structure learning
Local search and global learning
search one local neighborhood for a given node T
reiterate for each T
learn the global BN structure with these local informations
which neighborhood ?
PC(T) : Parents and Children of T (without distinction)
MB(T) : Markov Blanket of T
MMHC [Tsamardinos et al., 2006]
Parents and Children identification + Greedy search
PRM structure learning Ph. Leray et al. 4-7 April 2017 11 / 24
20. Background Hybrid Structure Learning Exact Structure Learning Conclusion
Hybrid PRM structure learning
Hybrid BN structure learning
Local search and global learning
search one local neighborhood for a given node T
reiterate for each T
learn the global BN structure with these local informations
which neighborhood ?
PC(T) : Parents and Children of T (without distinction)
MB(T) : Markov Blanket of T
MMHC [Tsamardinos et al., 2006]
Parents and Children identification + Greedy search
PRM structure learning Ph. Leray et al. 4-7 April 2017 11 / 24
21. Background Hybrid Structure Learning Exact Structure Learning Conclusion
Hybrid PRM structure learning
Hybrid BN structure learning
Local search and global learning
search one local neighborhood for a given node T
reiterate for each T
learn the global BN structure with these local informations
which neighborhood ?
PC(T) : Parents and Children of T (without distinction)
MB(T) : Markov Blanket of T
MMHC [Tsamardinos et al., 2006]
Parents and Children identification + Greedy search
PRM structure learning Ph. Leray et al. 4-7 April 2017 11 / 24
22. Background Hybrid Structure Learning Exact Structure Learning Conclusion
Hybrid PRM structure learning
Relational MMHC ‘ [Ben Ishak et al., 2015]
Association and relational data ?
what is an association between Movie.genre and Vote.rating ?
Vote.Movie.genre → Vote.rating ?
Mode(Movie.Movie−1.rating) → Movie.genre
Mode(Vote.User.User−1.Movie.genre) → Vote.rating ? ...
PC identification
[Maier et al., 2010] (undirected) edge is kept if at least one
association exists, but every other information (slot chain,
aggregation, direction) is lost
[Ben Ishak et al., 2015] all the information is kept in order to
simplify next steps
PRM structure learning Ph. Leray et al. 4-7 April 2017 12 / 24
23. Background Hybrid Structure Learning Exact Structure Learning Conclusion
Hybrid PRM structure learning
Relational MMHC ‘ [Ben Ishak et al., 2015]
Association and relational data ?
what is an association between Movie.genre and Vote.rating ?
Vote.Movie.genre → Vote.rating ?
Mode(Movie.Movie−1.rating) → Movie.genre
Mode(Vote.User.User−1.Movie.genre) → Vote.rating ? ...
PC identification
[Maier et al., 2010] (undirected) edge is kept if at least one
association exists, but every other information (slot chain,
aggregation, direction) is lost
[Ben Ishak et al., 2015] all the information is kept in order to
simplify next steps
PRM structure learning Ph. Leray et al. 4-7 April 2017 12 / 24
24. Background Hybrid Structure Learning Exact Structure Learning Conclusion
Hybrid PRM structure learning
Relational MMHC ‘ [Ben Ishak et al., 2015]
Symmetrical correction : 2 solutions
non conservative : X really in PC(T) iff T also detected in
PC(X)
conservative : X really in PC(T) if one of both is true
Greedy Search : 2 solutions
first identify PC until max. slot chain length, then perform
one (single) GS in the ”full” space
interleave PC identification and GS
PRM structure learning Ph. Leray et al. 4-7 April 2017 13 / 24
25. Background Hybrid Structure Learning Exact Structure Learning Conclusion
Hybrid PRM structure learning
Relational MMHC ‘ [Ben Ishak et al., 2015]
Symmetrical correction : 2 solutions
non conservative : X really in PC(T) iff T also detected in
PC(X)
conservative : X really in PC(T) if one of both is true
Greedy Search : 2 solutions
first identify PC until max. slot chain length, then perform
one (single) GS in the ”full” space
interleave PC identification and GS
PRM structure learning Ph. Leray et al. 4-7 April 2017 13 / 24
28. Background Hybrid Structure Learning Exact Structure Learning Conclusion
Exact BN structure learning
Exact BN approaches
Main issue : find the highest-scoring network
1 decomposable scoring function (BDe, MDL/BIC, AIC, ...)
2 optimal search strategy
PRM structure learning Ph. Leray et al. 4-7 April 2017 16 / 24
29. Background Hybrid Structure Learning Exact Structure Learning Conclusion
Exact BN structure learning
Exact BN approaches
Main issue : find the highest-scoring network
1 decomposable scoring function (BDe, MDL/BIC, AIC, ...)
2 optimal search strategy
Decomposability
Score(BN(V)) =
X∈V
Score(X | PaX )
Each local score Score(X | PaX ) is a function of one node and its
parents
PRM structure learning Ph. Leray et al. 4-7 April 2017 16 / 24
30. Background Hybrid Structure Learning Exact Structure Learning Conclusion
Exact BN structure learning
Exact BN approaches
Main issue : find the highest-scoring network
1 decomposable scoring function (BDe, MDL/BIC, AIC, ...)
2 optimal search strategy
Spanning Tree [Chow et Liu, 1968]
Mathematical Programming [Cussens, 2012]
Dynamic Programming [Singh et al., 2005]
A* search [Yuan et al., 2011]
variant of Best First Search (BFS) [Pearl, 1984]
BN structure learning as a shortest path finding problem
evaluation functions g and h based on local scoring function
PRM structure learning Ph. Leray et al. 4-7 April 2017 16 / 24
31. Background Hybrid Structure Learning Exact Structure Learning Conclusion
Exact BN structure learning
Exact BN approaches
Main issue : find the highest-scoring network
1 decomposable scoring function (BDe, MDL/BIC, AIC, ...)
2 optimal search strategy
Spanning Tree [Chow et Liu, 1968]
Mathematical Programming [Cussens, 2012]
Dynamic Programming [Singh et al., 2005]
A* search [Yuan et al., 2011]
variant of Best First Search (BFS) [Pearl, 1984]
BN structure learning as a shortest path finding problem
evaluation functions g and h based on local scoring function
PRM structure learning Ph. Leray et al. 4-7 April 2017 16 / 24
33. Background Hybrid Structure Learning Exact Structure Learning Conclusion
Exact BN structure learning
A* search for BNs [Yuan et al., 2011]
Ø
X4X3X2X1
X1, X2 X1, X3 X1, X4 X2, X3 X2, X4 X3, X4
X1, X2, X3 X1, X2, X4 X1, X3, X4 X2, X3, X4
X1, X2, X3, X4
g(U) : sum of edge costs on the
best path from the start node to U.
g(U{U∪X}) = BestScore(X,U)
PRM structure learning Ph. Leray et al. 4-7 April 2017 17 / 24
34. Background Hybrid Structure Learning Exact Structure Learning Conclusion
Exact BN structure learning
A* search for BNs [Yuan et al., 2011]
Ø
X4X3X2X1
X1, X2 X1, X3 X1, X4 X2, X3 X2, X4 X3, X4
X1, X2, X3, X4
h(U) enables variables not yet present in the
ordering to choose their optimal parents from V.
h(U) = 𝑋∈𝑉U 𝐵𝑒𝑠𝑡𝑆𝑐𝑜𝑟𝑒(𝑋|𝑉{𝑋})
h
g(U) : sum of edge costs on the
best path from the start node to U.
g(U{U∪X}) = BestScore(X,U)
PRM structure learning Ph. Leray et al. 4-7 April 2017 17 / 24
36. Background Hybrid Structure Learning Exact Structure Learning Conclusion
Exact PRM structure learning
Relational BFS [Ettouzi et al., 2016]
Adaptation of [Yuan et al., 2011] to the relational context
Two key points
search space : how to deal with ”relational variables” ?
⇒ relational order graph
parent determination : how to deal with slot chains,
aggregators, and possible ”multiple” dependencies between
two attributes ?
⇒ relational parent graph
⇒ evaluation functions
PRM structure learning Ph. Leray et al. 4-7 April 2017 18 / 24
37. Background Hybrid Structure Learning Exact Structure Learning Conclusion
Exact PRM structure learning
Relational BFS [Ettouzi et al., 2016]
Adaptation of [Yuan et al., 2011] to the relational context
Two key points
search space : how to deal with ”relational variables” ?
⇒ relational order graph
parent determination : how to deal with slot chains,
aggregators, and possible ”multiple” dependencies between
two attributes ?
⇒ relational parent graph
⇒ evaluation functions
PRM structure learning Ph. Leray et al. 4-7 April 2017 18 / 24
38. Background Hybrid Structure Learning Exact Structure Learning Conclusion
Relational order graph
Definition
lattice over 2X.A, powerset of all possible attributes
no big change wrt. BNs
{ }
m.genre t.ratingm.rdate u.gender
m.rdate,
m.genre
m.rdate,
u.gender
m.rdate,
t.rating
m.genre,
u.gender
m.genre,
t.rating
u.gender,
t.rating
m.rdate,
m.genre,
u.gender
m.rdate,
m.genre,
t.rating
m.rdate,
u.gender,
t.rating
m.genre,
u.gender,
t.rating
m.rdate, m.genre,
u.gender, t.rating
PRM structure learning Ph. Leray et al. 4-7 April 2017 19 / 24
39. Background Hybrid Structure Learning Exact Structure Learning Conclusion
Relational parent graph
Definition (one for each attribute X.A)
lattice over the candidate parents for a given maximal slot
chain length (+ local score value)
the same attribute can appear several times in this graph
one attribute can appear in its own parent graph
m.genre { }
9
m.rdate
6
� (m.mo-1.u.gender)
5
� (m.mo-1.rating)
10
m.rdate,
� (m.mo-1.u.gender)
4
m.rdate,
� (m.mo-1.rating)
8
� (m.mo-1.rating),
� (m.mo-1.u.gender)
7
m.rdate, � (m.mo-1.rating),
� (m.mo-1.u.gender)
8
PRM structure learning Ph. Leray et al. 4-7 April 2017 20 / 24
40. Background Hybrid Structure Learning Exact Structure Learning Conclusion
Evaluation functions
Relational cost so far (more complex than BNs)
g(U → U ∪ {X.A}) = interest of having a set of attributes U
(and X.A) as candidate parents of X.A
= BestScore(X.A | {CPai /A(CPai ) ∈ A(U) ∪ {A}})
Relational heuristic function (similar to BNs)
h(U) =
X A∈A(X)A(U)
BestScore(X.A | CPa(X.A))
PRM structure learning Ph. Leray et al. 4-7 April 2017 21 / 24
41. Background Hybrid Structure Learning Exact Structure Learning Conclusion
Evaluation functions
Relational cost so far (more complex than BNs)
g(U → U ∪ {X.A}) = interest of having a set of attributes U
(and X.A) as candidate parents of X.A
= BestScore(X.A | {CPai /A(CPai ) ∈ A(U) ∪ {A}})
Relational heuristic function (similar to BNs)
h(U) =
X A∈A(X)A(U)
BestScore(X.A | CPa(X.A))
PRM structure learning Ph. Leray et al. 4-7 April 2017 21 / 24
42. Background Hybrid Structure Learning Exact Structure Learning Conclusion
Relational BFS : example
PRM structure learning Ph. Leray et al. 4-7 April 2017 22 / 24
44. Background Hybrid Structure Learning Exact Structure Learning Conclusion
Conclusion and Perspectives
Visible face of this talk
an approximate hybrid PRM structure learning algorithm
an exact score-based PRM structure learning algorithm
To do list
implementations on our software platform PILGRIM
next step : an anytime and incremental PRM structure
learning algorithm, following the ideas presented in [Roure,
2004; Aine et al., 2007; Malone et al., 2013] for BNs
Thank you for your attention
PRM structure learning Ph. Leray et al. 4-7 April 2017 24 / 24
45. Background Hybrid Structure Learning Exact Structure Learning Conclusion
Conclusion and Perspectives
Visible face of this talk
an approximate hybrid PRM structure learning algorithm
an exact score-based PRM structure learning algorithm
To do list
implementations on our software platform PILGRIM
next step : an anytime and incremental PRM structure
learning algorithm, following the ideas presented in [Roure,
2004; Aine et al., 2007; Malone et al., 2013] for BNs
Thank you for your attention
PRM structure learning Ph. Leray et al. 4-7 April 2017 24 / 24
46. Background Hybrid Structure Learning Exact Structure Learning Conclusion
Conclusion and Perspectives
Visible face of this talk
an approximate hybrid PRM structure learning algorithm
an exact score-based PRM structure learning algorithm
To do list
implementations on our software platform PILGRIM
next step : an anytime and incremental PRM structure
learning algorithm, following the ideas presented in [Roure,
2004; Aine et al., 2007; Malone et al., 2013] for BNs
Thank you for your attention
PRM structure learning Ph. Leray et al. 4-7 April 2017 24 / 24