Exception-enriched Rule
Learning from Knowledge
Graphs
Mohamed Gad-Elrab1, Daria Stepanova1, Jacopo Urbani 2, Gerhard Weikum1
1Max-Planck-Institut für Informatik, Saarland Informatics Campus, Germany
2 Vrije Universiteit Amsterdam, Amsterdam, The Netherlands
21st October 2016
Knowledge Graphs (KGs)
2
• Huge collection of < 𝑠𝑢𝑏𝑗𝑒𝑐𝑡, 𝑝𝑟𝑒𝑑𝑖𝑐𝑎𝑡𝑒, 𝑜𝑏𝑗𝑒𝑐𝑡 > triples
• Positive facts under Open World Assumption (OWA)
• Possibly incomplete and/or inaccurate
Mining Rules from KGs
3
Amsterdam
isMarriedToJohn Kate
Chicago
isMarriedToBrad Anna
Berlin
hasBrother
isMarriedToDave ClaraisMarriedToBob Alice
Researcher
Berlin Football
Mining Rules from KGs
4
𝑟: 𝑙𝑖𝑣𝑒𝑠𝐼𝑛 𝑋, 𝑍 ← 𝑖𝑠𝑀𝑎𝑟𝑟𝑖𝑒𝑑𝑇𝑜 𝑋, 𝑌 , 𝑙𝑖𝑣𝑒𝑠𝐼𝑛(𝑌, 𝑍)
Amsterdam
isMarriedToJohn Kate
Chicago
isMarriedToBrad Anna
Berlin
hasBrother
isMarriedToDave ClaraisMarriedToBob Alice
Researcher
Berlin Football
[Galárraga et al., 2015]
Mining Rules from KGs
5
𝑟: 𝑙𝑖𝑣𝑒𝑠𝐼𝑛 𝑋, 𝑍 ← 𝑖𝑠𝑀𝑎𝑟𝑟𝑖𝑒𝑑𝑇𝑜 𝑋, 𝑌 , 𝑙𝑖𝑣𝑒𝑠𝐼𝑛(𝑌, 𝑍)
Amsterdam
isMarriedToJohn Kate
Chicago
isMarriedToBrad Anna
Berlin
hasBrother
isMarriedToDave ClaraisMarriedToBob Alice
Researcher
Berlin Football
1 2
Our Goal
6
𝑟: 𝑙𝑖𝑣𝑒𝑠𝐼𝑛 𝑋, 𝑍 ← 𝑖𝑠𝑀𝑎𝑟𝑟𝑖𝑒𝑑𝑇𝑜 𝑋, 𝑌 , 𝑙𝑖𝑣𝑒𝑠𝐼𝑛 𝑌, 𝑍 , 𝑛𝑜𝑡 𝑖𝑠𝐴(𝑋, 𝑟𝑒𝑠)
Amsterdam
isMarriedToJohn Kate
Chicago
isMarriedToBrad Anna
Berlin
hasBrother
isMarriedToDave ClaraisMarriedToBob Alice
Researcher
Berlin Football
1 2
Problem Statement
• Quality-based theory revision problem
• Given
• Knowledge graph 𝐾𝐺
• Set of Horn rules 𝑅 𝐻
• Find the nonmonotonic revision 𝑅 𝑁𝑀 of 𝑅 𝐻
• Maximize top-k avg. confidence
• Minimize conflicting prediction
7
Problem Statement
• Quality-based theory revision problem
• Given
• Knowledge graph 𝐾𝐺
• Set of Horn rules 𝑅 𝐻
• Find the nonmonotonic revision 𝑅 𝑁𝑀 of 𝑅 𝐻
• Maximize top-k avg. confidence
• Minimize conflicting prediction
8
Unknown
Problem Statement: Conflicting Predictions
• Defining conflicts
• Measuring conflicts (auxiliary rules)
9
𝑟2: 𝑟𝑒𝑙𝑒𝑎𝑠𝑒𝑑𝐼𝑛𝐽𝑃 𝑋 ← ℎ𝑎𝑠𝐽𝑃𝐶𝑜𝑚𝑝𝑜𝑠𝑒𝑟 𝑋 , 𝑛𝑜𝑡 𝑝𝑢𝑏𝑙𝑖𝑠ℎ𝑒𝑟𝑈𝑆𝐴(𝑋)
𝑟2
𝑎𝑢𝑥
: 𝑛𝑜𝑡_𝑟𝑒𝑙𝑒𝑎𝑠𝑒𝑑𝐼𝑛𝐽𝑃 𝑋 ← ℎ𝑎𝑠𝐽𝑃𝐶𝑜𝑚𝑝𝑜𝑠𝑒𝑟 𝑋 , 𝑝𝑢𝑏𝑙𝑖𝑠ℎ𝑒𝑟𝑈𝑆𝐴(𝑋)
𝑅 =
𝑟1: 𝑟𝑒𝑙𝑒𝑎𝑠𝑒𝑑𝐼𝑛𝐽𝑃 𝑋 ← 𝑖𝑠𝐺𝑎𝑚𝑒 𝑋 , 𝑖𝑠𝐵𝑎𝑠𝑒𝑑𝑂𝑛𝐽𝑃𝐴𝑛𝑖𝑚𝑒 (𝑋)
𝑟2: 𝑟𝑒𝑙𝑒𝑎𝑠𝑒𝑑𝐼𝑛𝐽𝑃 𝑋 ← ℎ𝑎𝑠𝐽𝑃𝐶𝑜𝑚𝑝𝑜𝑠𝑒𝑟 𝑋 , 𝑛𝑜𝑡 𝑝𝑢𝑏𝑙𝑖𝑠ℎ𝑒𝑟𝑈𝑆𝐴(𝑋)
{𝑖𝑠𝐺𝑎𝑚𝑒 𝑎 , 𝑖𝑠𝐵𝑎𝑠𝑒𝑑𝑂𝑛𝐽𝑃𝐴𝑛𝑖𝑚𝑒 𝑎 , ℎ𝑎𝑠𝐽𝑃𝐶𝑜𝑚𝑝𝑜𝑠𝑒𝑟 𝑎 , 𝑝𝑢𝑏𝑙𝑖𝑠ℎ𝑒𝑟𝑈𝑆𝐴 𝑎 }
Output: {(𝑟𝑒𝑙𝑒𝑎𝑠𝑒𝑑𝐼𝑛𝐽𝑃 𝑎 , 𝑟2: 𝑛𝑜𝑡 𝑟𝑒𝑙𝑒𝑎𝑠𝑒𝑑𝐼𝑛𝐽𝑃 𝑎 )}
𝑎 =
Problem Statement: Conflicting Predictions
• Defining conflicts
• Measuring conflicts (auxiliary rules)
10
𝑟2: 𝑟𝑒𝑙𝑒𝑎𝑠𝑒𝑑𝐼𝑛𝐽𝑃 𝑋 ← ℎ𝑎𝑠𝐽𝑃𝐶𝑜𝑚𝑝𝑜𝑠𝑒𝑟 𝑋 , 𝑛𝑜𝑡 𝑝𝑢𝑏𝑙𝑖𝑠ℎ𝑒𝑟𝑈𝑆𝐴(𝑋)
𝑟2
𝑎𝑢𝑥
: 𝑛𝑜𝑡_𝑟𝑒𝑙𝑒𝑎𝑠𝑒𝑑𝐼𝑛𝐽𝑃 𝑋 ← ℎ𝑎𝑠𝐽𝑃𝐶𝑜𝑚𝑝𝑜𝑠𝑒𝑟 𝑋 , 𝑝𝑢𝑏𝑙𝑖𝑠ℎ𝑒𝑟𝑈𝑆𝐴(𝑋)
𝑅 =
𝑟1: 𝑟𝑒𝑙𝑒𝑎𝑠𝑒𝑑𝐼𝑛𝐽𝑃 𝑋 ← 𝑖𝑠𝐺𝑎𝑚𝑒 𝑋 , 𝑖𝑠𝐵𝑎𝑠𝑒𝑑𝑂𝑛𝐽𝑃𝐴𝑛𝑖𝑚𝑒 (𝑋)
𝑟2: 𝑟𝑒𝑙𝑒𝑎𝑠𝑒𝑑𝐼𝑛𝐽𝑃 𝑋 ← ℎ𝑎𝑠𝐽𝑃𝐶𝑜𝑚𝑝𝑜𝑠𝑒𝑟 𝑋 , 𝑛𝑜𝑡 𝑝𝑢𝑏𝑙𝑖𝑠ℎ𝑒𝑟𝑈𝑆𝐴(𝑋)
{𝑖𝑠𝐺𝑎𝑚𝑒 𝑎 , 𝑖𝑠𝐵𝑎𝑠𝑒𝑑𝑂𝑛𝐽𝑃𝐴𝑛𝑖𝑚𝑒 𝑎 , ℎ𝑎𝑠𝐽𝑃𝐶𝑜𝑚𝑝𝑜𝑠𝑒𝑟 𝑎 , 𝑝𝑢𝑏𝑙𝑖𝑠ℎ𝑒𝑟𝑈𝑆𝐴 𝑎 }
Output: {(𝑟𝑒𝑙𝑒𝑎𝑠𝑒𝑑𝐼𝑛𝐽𝑃 𝑎 , 𝑟2: 𝑛𝑜𝑡 𝑟𝑒𝑙𝑒𝑎𝑠𝑒𝑑𝐼𝑛𝐽𝑃 𝑎 )}
𝑎 =
Problem Statement: Conflicting Predictions
• Defining conflicts
• Measuring conflicts (auxiliary rules)
11
𝑟2: 𝑟𝑒𝑙𝑒𝑎𝑠𝑒𝑑𝐼𝑛𝐽𝑃 𝑋 ← ℎ𝑎𝑠𝐽𝑃𝐶𝑜𝑚𝑝𝑜𝑠𝑒𝑟 𝑋 , 𝑛𝑜𝑡 𝑝𝑢𝑏𝑙𝑖𝑠ℎ𝑒𝑟𝑈𝑆𝐴(𝑋)
𝑟2
𝑎𝑢𝑥
: 𝑛𝑜𝑡_𝑟𝑒𝑙𝑒𝑎𝑠𝑒𝑑𝐼𝑛𝐽𝑃 𝑋 ← ℎ𝑎𝑠𝐽𝑃𝐶𝑜𝑚𝑝𝑜𝑠𝑒𝑟 𝑋 , 𝑝𝑢𝑏𝑙𝑖𝑠ℎ𝑒𝑟𝑈𝑆𝐴(𝑋)
𝑅 =
𝑟1: 𝑟𝑒𝑙𝑒𝑎𝑠𝑒𝑑𝐼𝑛𝐽𝑃 𝑋 ← 𝑖𝑠𝐺𝑎𝑚𝑒 𝑋 , 𝑖𝑠𝐵𝑎𝑠𝑒𝑑𝑂𝑛𝐽𝑃𝐴𝑛𝑖𝑚𝑒 (𝑋)
𝑟2: 𝑟𝑒𝑙𝑒𝑎𝑠𝑒𝑑𝐼𝑛𝐽𝑃 𝑋 ← ℎ𝑎𝑠𝐽𝑃𝐶𝑜𝑚𝑝𝑜𝑠𝑒𝑟 𝑋 , 𝑛𝑜𝑡 𝑝𝑢𝑏𝑙𝑖𝑠ℎ𝑒𝑟𝑈𝑆𝐴(𝑋)
{𝑖𝑠𝐺𝑎𝑚𝑒 𝑎 , 𝑖𝑠𝐵𝑎𝑠𝑒𝑑𝑂𝑛𝐽𝑃𝐴𝑛𝑖𝑚𝑒 𝑎 , ℎ𝑎𝑠𝐽𝑃𝐶𝑜𝑚𝑝𝑜𝑠𝑒𝑟 𝑎 , 𝑝𝑢𝑏𝑙𝑖𝑠ℎ𝑒𝑟𝑈𝑆𝐴 𝑎 }
(𝑟𝑒𝑙𝑒𝑎𝑠𝑒𝑑𝐼𝑛𝐽𝑃 𝑎 , 𝑛𝑜𝑡_𝑟𝑒𝑙𝑒𝑎𝑠𝑒𝑑𝐼𝑛𝐽𝑃 𝑎 )
𝑎 =
Approach Overview
Step 1
• Mining Horn Rules
Step 2
• Extracting Exception Witness Set (EWS)
Step 3
• Constructing Candidate Revisions
Step 4
• Selecting the Best Revision
12
bornInUSA livesInUSA stateless emigrant singer poet
p1 ✓ ✓
p2 ✓ ✓
p3 ✓ ✓
p4 ✓ ✓ ✓
p5 ✓ ✓ ✓
p6 ✓ ✓
p7 ✓ ✓
p8 ✓ ✓ ✓ ✓
p9 ✓ ✓ ✓
p10 ✓ ✓ ✓ ✓
p11 ✓ ✓ ✓
Step 1: Mining Horn Rules
13
𝒓𝒊: 𝒍𝒊𝒗𝒆𝒔𝑰𝒏𝑼𝑺𝑨 𝑿 ← 𝒃𝒐𝒓𝒏𝑰𝒏𝑼𝑺𝑨(𝑿)
NormalAb-normal
bornInUSA livesInUSA stateless emigrant singer poet
p1 ✓ ✓
p2 ✓ ✓
p3 ✓ ✓
p4 ✓ ✓ ✓
p5 ✓ ✓ ✓
p6 ✓ ✓
p7 ✓ ✓
p8 ✓ ✓ ✓ ✓
p9 ✓ ✓ ✓
p10 ✓ ✓ ✓ ✓
p11 ✓ ✓ ✓
𝒓𝒊: 𝒍𝒊𝒗𝒆𝒔𝑰𝒏𝑼𝑺𝑨 𝑿 ← 𝒃𝒐𝒓𝒏𝑰𝒏𝑼𝑺𝑨(𝑿)
NormalAb-normalStep 2: Extracting Exception Witness Set (EWS)
14
𝐸𝑊𝑆𝑖 = {𝑒𝑚𝑖𝑔𝑟𝑎𝑛𝑡 𝑋 , 𝑝𝑜𝑒𝑡 𝑋 }
Step 3: Constructing Candidate Revisions
• Horn rules
• Rule revisions
15
𝑟1: 𝑙𝑖𝑣𝑒𝑠𝐼𝑛𝑈𝑆𝐴 𝑋 ← 𝑏𝑜𝑟𝑛𝐼𝑛𝑈𝑆𝐴 𝑋 , 𝑛𝑜𝑡 𝑝𝑜𝑒𝑡 𝑋
𝑟1: 𝑙𝑖𝑣𝑒𝑠𝐼𝑛𝑈𝑆𝐴 𝑋 ← 𝑏𝑜𝑟𝑛𝐼𝑛𝑈𝑆𝐴 𝑋 , 𝑛𝑜𝑡 𝑒𝑚𝑖𝑔𝑟𝑎𝑛𝑡 𝑋
…
𝑅 =
𝑟1: 𝑙𝑖𝑣𝑒𝑠𝐼𝑛𝑈𝑆𝐴 𝑋 ← 𝑏𝑜𝑟𝑛𝐼𝑛𝑈𝑆𝐴(𝑋)
𝑟2: 𝑒𝑚𝑖𝑔𝑟𝑎𝑛𝑡 𝑋 ← 𝑠𝑡𝑎𝑡𝑒𝑙𝑒𝑠𝑠(𝑋)
𝐸𝑊𝑆 = {𝑝𝑜𝑒𝑡 𝑋 , 𝑒𝑚𝑖𝑔𝑟𝑎𝑛𝑡 𝑋 , … }
𝐸𝑊𝑆 = {𝑒1 𝑋 , 𝑒2(𝑋), … }
Step 4: Selecting the Best Revision
Finding globally best revision is expensive!
• Naïve ranker
• For each rule, pick the revision that maximizes confidence
• Works in isolation from other rules
• Partial materialization ranker
• 𝐾𝐺s are incomplete!
• Augment the original 𝐾𝐺 with predictions of other rules
• Rank revisions on avg. confidence of the rule and its auxiliary.
16
• Partial materialization
Ranking Rule’s Revisions
17
bornInUSA livesInUSA stateless emigrant singer poet
p1 ✓ ✓
p2 ✓ ✓
p3 ✓ ✓
p4 ✓ ✓ ✓
p5 ✓ ✓ ✓
p6 ✓ ✓
p7 ✓ ✓
p8 ✓ ✓ ✓ ✓
p9 ✓ ✓ ✓
p10 ✓ ✓ ✓ ✓
p11 ✓ ✓ ✓
𝒓𝒊: 𝒍𝒊𝒗𝒆𝒔𝑰𝒏𝑼𝑺𝑨 𝑿 ← 𝒃𝒐𝒓𝒏𝑰𝒏𝑼𝑺𝑨(𝑿)
NormalAb-normal
• Partial materialization
bornInUSA livesInUSA stateless emigrant singer poet
p1 ✓ ✓
p2 ✓ ✓
p3 ✓ ✓
p4 ✓ ✓ ✓ ✓
p5 ✓ ✓ ✓ ✓
p6 ✓ ✓ ✓
p7 ✓ ✓ ✓
p8 ✓ ✓ ✓ ✓
p9 ✓ ✓ ✓ ✓ ✓
p10 ✓ ✓ ✓ ✓
p11 ✓ ✓ ✓ ✓
𝒓𝒊: 𝒍𝒊𝒗𝒆𝒔𝑰𝒏𝑼𝑺𝑨 𝑿 ← 𝒃𝒐𝒓𝒏𝑰𝒏𝑼𝑺𝑨(𝑿)
NormalAb-normalRanking Rule’s Revisions
18
Ranking Rule’s Revisions
• Ordered partial materialization ranker
• Only rules with higher quality
• Ordered weighted partial materialization ranker
• KG fact weight = 1
• Predicted facts inherit their weights from the rules
19
Experiments
• Ruleset quality
20
*Higher is better
0.60
0.65
0.70
0.75
0.80
20 40 60 80 100
Avg.Confidence
Top-K (%) Rules
YAGO3
Horn Naive Ordered & Weighted PM
0.8
0.82
0.84
0.86
0.88
0.9
0.92
0.94
20 40 60 80 100Avg.Confidence
Top-K (%) Rules
IMDB
Horn Naive Ordered & Weighted PM
Facts 10M 2M
Rules 10K 25K
Experiments
• Predictions consistency
21*Lower is better
0.00
0.05
0.10
0.15
0.20
0.25
0.30
0.35
0.40
200 400 600 800 1000
ConflictRatio
Top-K Rules
YAGO3
Naive Ordered & Weighted PM
0
0.1
0.2
0.3
0.4
0.5
200 400 600 800 1000ConflictRatio
Top-K Rules
IMDB
Naive Ordered & Weighted PM
Experiments
• Examples
22
𝑖𝑠𝑀𝑜𝑢𝑛𝑡𝑎𝑖𝑛(𝑋) ← 𝑖𝑠𝐼𝑛𝐴𝑢𝑠𝑡𝑟𝑖𝑎(𝑋), 𝑖𝑠𝐼𝑛𝐼𝑡𝑎𝑙𝑦 𝑋 , 𝑛𝑜𝑡 𝑖𝑠𝑅𝑖𝑣𝑒𝑟 (𝑋)
𝑖𝑠𝑃𝑜𝑙𝑖𝑡𝑂𝑓𝑈𝑆𝐴 𝑋 ← 𝑏𝑜𝑟𝑛𝐼𝑛𝑈𝑆𝐴 𝑋 , 𝑖𝑠𝐺𝑜𝑣 𝑋 , 𝑛𝑜𝑡 𝑖𝑠𝑃𝑜𝑙𝑖𝑡𝑃𝑢𝑒𝑟𝑡𝑜𝑅𝑖𝑐𝑜(𝑋)
𝑏𝑜𝑟𝑛𝐼𝑛𝑈𝑆𝐴 𝑋 ← 𝑎𝑐𝑡𝑒𝑑𝐼𝑛𝑀𝑜𝑣𝑖𝑒 𝑋 , 𝑐𝑟𝑒𝑎𝑡𝑒𝑑𝑀𝑜𝑣𝑖𝑒 𝑋 , 𝑛𝑜𝑡 𝑤𝑜𝑛𝐹𝑖𝑙𝑚𝑓𝑎𝑟𝑒(𝑋)
Summary
• Conclusion
• Quality-based theory revision under OWA
• Partial materialization for ranking revisions
• Comparison of ranking methods on real life KGs
• Outlook
• Extending to higher arity predicates
• Binary predicates [Tran et al., to appear ILP2016]
• Evidence from text corpora
• Exploiting partial completeness
23
References
• [Angiulli and Fassetti, 2014] Fabrizio Angiulli and Fabio Fassetti. Exploiting domain knowledge to
detect outliers. Data Min. Knowl. Discov., 28(2):519–568, 2014.
• [Dimopoulos and Kakas, 1995] Yannis Dimopoulos and Antonis C. Kakas. Learning non-monotonic
logic programs: Learning exceptions. In Machine Learning: ECML-95, 8th European Conference on
Machine Learning, Heraclion, Crete, Greece, April 25-27, 1995, Proceedings, pages 122–137, 1995.
• [Galarraga et al., 2015] Luis Galarraga, Christina Teflioudi, Katja Hose, and Fabian M. Suchanek.
Fast Rule Mining in Ontological Knowledge Bases with AMIE+. In VLDB Journal, 2015.
• [Law et al., 2015] Mark Law, Alessandra Russo, and Krysia Broda. The ILASP system for learning
answer set programs, 2015.
• [Leone et al., 2006] Nicola Leone, Gerald Pfeifer, Wolfgang Faber, Thomas Eiter, Georg Gottlob,
Simona Perri, and Francesco Scarcello. 2006. The DLV system for knowledge representation and
reasoning. ACM Trans. Comput. Logic 7, 3 (July 2006), 499-562.
• [Suzuki, 2006] Einoshin Suzuki. Data mining methods for discovering interesting exceptions from
an unsupervised table. J. UCS, 12(6):627–653, 2006.
• [Tran et al., 2016] Hai Dang Tran, Daria Stepanova, Mohamed H. Gad-Elrab, Francesca A. Lisi,
Gerhard Weikum. Towards Nonmonotonic Relational Learning from Knowledge Graphs. ILP2016,
London, UK, to appear.
• [Katzouris et al., 2015] Nikos Katzouris, Alexander Artikis, and Georgios Paliouras. Incremental
learning of event definitions with inductive logic programming. Machine Learning, 100(2-3):555–
585, 2015.
24
Related Work
• Learning nonmonotonic programs
• E.g., [Dimopoulos and Kakas, 1995], ILASP [Law et al., 2015],
ILED [Katzouris et al., 2015], etc.
• Outlier detection in logic programs
• E.g., [Angiulli and Fassetti, 2014], etc.
• Mining exception rules
• E.g., [Suzuki, 2006], etc.
25
Problem Statement: Ruleset Quality
• Independent Rule Measure (𝑟𝑚)
• Support: 𝑠𝑢𝑝𝑝 𝐻 ← 𝐵 = 𝑠𝑢𝑝𝑝(𝐻 ∪ 𝐵)
• Coverage: c𝑜𝑣 𝐻 ← 𝐵 = 𝑠𝑢𝑝𝑝(𝐵)
• Confidence: c𝑜𝑛𝑓 𝐻 ← 𝐵 =
𝑠𝑢𝑝𝑝(𝐻∪𝐵)
𝑠𝑢𝑝𝑝(𝐵)
• Lift: 𝑙𝑖𝑓𝑡 𝐻 ← 𝐵 =
𝑐𝑜𝑛𝑓(𝐻←𝐵)
𝑠𝑢𝑝𝑝(𝐻)
• …
• Average Ruleset Quality
26
𝑞 𝑟𝑚 𝑅 𝑁𝑀, 𝐺 =
𝑟 ∈𝑅 𝑁𝑀
𝑟𝑚(𝑟, 𝐺)
𝑅 𝑁𝑀
Problem Statement: Conflicting Predictions
• Measuring conflicts (auxiliary rules)
27
𝑟2: 𝑟𝑒𝑙𝑒𝑎𝑠𝑒𝑑𝐼𝑛𝐽𝑃 𝑋 ← ℎ𝑎𝑠𝐽𝑃𝐶𝑜𝑚𝑝𝑜𝑠𝑒𝑟 𝑋 , 𝑛𝑜𝑡 𝑝𝑢𝑏𝑙𝑖𝑠ℎ𝑒𝑟𝑈𝑆𝐴(𝑋)
𝑟2
𝑎𝑢𝑥
: 𝑛𝑜𝑡_𝑟𝑒𝑙𝑒𝑎𝑠𝑒𝑑𝐼𝑛𝐽𝑃 𝑋 ← ℎ𝑎𝑠𝐽𝑃𝐶𝑜𝑚𝑝𝑜𝑠𝑒𝑟 𝑋 , 𝑝𝑢𝑏𝑙𝑖𝑠ℎ𝑒𝑟𝑈𝑆𝐴(𝑋)
𝑞 𝑐𝑜𝑛𝑓𝑙𝑖𝑐𝑡 𝑅 𝑁𝑀, 𝐺 =
|{(𝑝 𝑎 , 𝑛𝑜𝑡_𝑝 𝑎 ), … }|
{𝑛𝑜𝑡_𝑝 𝑎 , … }
Propositionalization
• Unary predicates
28
Binary Predicates
• hasType(enstein, scientist)
• isMarriedTo(elsa, einstein)
• bornIn(einstein, um)
Unary
• isAScientist(einstein)
• isMarriedToEinstein(elsa)
• bornInUlm(einstein)
Abstraction
• isAScientist(einstein)
• isMarriedToScientist(elsa)
• bornInGermany(einstein)
Experiments
• Input
• Experiment statistics
29
YAGO3 IMDB
Input Facts 10M 2M
Horn Rules 10K 25K
Revised Rules 6K 22K
General-purpose KG Domain-specific KG (Movies)
Experiments
• Predictions assessment
• Run DLV on YAGO and 𝑅 𝐻 then 𝑅 𝑁𝑀 seperately
• Sample facts such that fact 𝑓 ∈ 𝑌𝐴𝐺𝑂 𝐻𝑌𝐴𝐺𝑂 𝑁𝑀
• 73% of the sampled facts were found to be erroneous
30
checked
predictions
Ranking Rule’s Revisions
• Partial materialization ranker
• Augment the original 𝐾𝐺 with predictions of other rules
• Rank revisions on Avg. confidence of the 𝑟 and 𝑟 𝑎𝑢𝑥
31
𝑠𝑐𝑜𝑟𝑒 𝑟𝑒, 𝐾𝐺∗ =
𝑐𝑜𝑛𝑓 𝑟𝑒, 𝐾𝐺∗ + 𝑐𝑜𝑛𝑓(𝑟𝑒
𝑎𝑢𝑥, 𝐾𝐺∗)
2
where 𝑟𝑒 is the rule 𝑟 with exception 𝑒 & 𝐾𝐺∗
is the augmented 𝐾𝐺.

Exception-enrcihed Rule Learning from Knowledge Graphs

  • 1.
    Exception-enriched Rule Learning fromKnowledge Graphs Mohamed Gad-Elrab1, Daria Stepanova1, Jacopo Urbani 2, Gerhard Weikum1 1Max-Planck-Institut für Informatik, Saarland Informatics Campus, Germany 2 Vrije Universiteit Amsterdam, Amsterdam, The Netherlands 21st October 2016
  • 2.
    Knowledge Graphs (KGs) 2 •Huge collection of < 𝑠𝑢𝑏𝑗𝑒𝑐𝑡, 𝑝𝑟𝑒𝑑𝑖𝑐𝑎𝑡𝑒, 𝑜𝑏𝑗𝑒𝑐𝑡 > triples • Positive facts under Open World Assumption (OWA) • Possibly incomplete and/or inaccurate
  • 3.
    Mining Rules fromKGs 3 Amsterdam isMarriedToJohn Kate Chicago isMarriedToBrad Anna Berlin hasBrother isMarriedToDave ClaraisMarriedToBob Alice Researcher Berlin Football
  • 4.
    Mining Rules fromKGs 4 𝑟: 𝑙𝑖𝑣𝑒𝑠𝐼𝑛 𝑋, 𝑍 ← 𝑖𝑠𝑀𝑎𝑟𝑟𝑖𝑒𝑑𝑇𝑜 𝑋, 𝑌 , 𝑙𝑖𝑣𝑒𝑠𝐼𝑛(𝑌, 𝑍) Amsterdam isMarriedToJohn Kate Chicago isMarriedToBrad Anna Berlin hasBrother isMarriedToDave ClaraisMarriedToBob Alice Researcher Berlin Football [Galárraga et al., 2015]
  • 5.
    Mining Rules fromKGs 5 𝑟: 𝑙𝑖𝑣𝑒𝑠𝐼𝑛 𝑋, 𝑍 ← 𝑖𝑠𝑀𝑎𝑟𝑟𝑖𝑒𝑑𝑇𝑜 𝑋, 𝑌 , 𝑙𝑖𝑣𝑒𝑠𝐼𝑛(𝑌, 𝑍) Amsterdam isMarriedToJohn Kate Chicago isMarriedToBrad Anna Berlin hasBrother isMarriedToDave ClaraisMarriedToBob Alice Researcher Berlin Football 1 2
  • 6.
    Our Goal 6 𝑟: 𝑙𝑖𝑣𝑒𝑠𝐼𝑛𝑋, 𝑍 ← 𝑖𝑠𝑀𝑎𝑟𝑟𝑖𝑒𝑑𝑇𝑜 𝑋, 𝑌 , 𝑙𝑖𝑣𝑒𝑠𝐼𝑛 𝑌, 𝑍 , 𝑛𝑜𝑡 𝑖𝑠𝐴(𝑋, 𝑟𝑒𝑠) Amsterdam isMarriedToJohn Kate Chicago isMarriedToBrad Anna Berlin hasBrother isMarriedToDave ClaraisMarriedToBob Alice Researcher Berlin Football 1 2
  • 7.
    Problem Statement • Quality-basedtheory revision problem • Given • Knowledge graph 𝐾𝐺 • Set of Horn rules 𝑅 𝐻 • Find the nonmonotonic revision 𝑅 𝑁𝑀 of 𝑅 𝐻 • Maximize top-k avg. confidence • Minimize conflicting prediction 7
  • 8.
    Problem Statement • Quality-basedtheory revision problem • Given • Knowledge graph 𝐾𝐺 • Set of Horn rules 𝑅 𝐻 • Find the nonmonotonic revision 𝑅 𝑁𝑀 of 𝑅 𝐻 • Maximize top-k avg. confidence • Minimize conflicting prediction 8 Unknown
  • 9.
    Problem Statement: ConflictingPredictions • Defining conflicts • Measuring conflicts (auxiliary rules) 9 𝑟2: 𝑟𝑒𝑙𝑒𝑎𝑠𝑒𝑑𝐼𝑛𝐽𝑃 𝑋 ← ℎ𝑎𝑠𝐽𝑃𝐶𝑜𝑚𝑝𝑜𝑠𝑒𝑟 𝑋 , 𝑛𝑜𝑡 𝑝𝑢𝑏𝑙𝑖𝑠ℎ𝑒𝑟𝑈𝑆𝐴(𝑋) 𝑟2 𝑎𝑢𝑥 : 𝑛𝑜𝑡_𝑟𝑒𝑙𝑒𝑎𝑠𝑒𝑑𝐼𝑛𝐽𝑃 𝑋 ← ℎ𝑎𝑠𝐽𝑃𝐶𝑜𝑚𝑝𝑜𝑠𝑒𝑟 𝑋 , 𝑝𝑢𝑏𝑙𝑖𝑠ℎ𝑒𝑟𝑈𝑆𝐴(𝑋) 𝑅 = 𝑟1: 𝑟𝑒𝑙𝑒𝑎𝑠𝑒𝑑𝐼𝑛𝐽𝑃 𝑋 ← 𝑖𝑠𝐺𝑎𝑚𝑒 𝑋 , 𝑖𝑠𝐵𝑎𝑠𝑒𝑑𝑂𝑛𝐽𝑃𝐴𝑛𝑖𝑚𝑒 (𝑋) 𝑟2: 𝑟𝑒𝑙𝑒𝑎𝑠𝑒𝑑𝐼𝑛𝐽𝑃 𝑋 ← ℎ𝑎𝑠𝐽𝑃𝐶𝑜𝑚𝑝𝑜𝑠𝑒𝑟 𝑋 , 𝑛𝑜𝑡 𝑝𝑢𝑏𝑙𝑖𝑠ℎ𝑒𝑟𝑈𝑆𝐴(𝑋) {𝑖𝑠𝐺𝑎𝑚𝑒 𝑎 , 𝑖𝑠𝐵𝑎𝑠𝑒𝑑𝑂𝑛𝐽𝑃𝐴𝑛𝑖𝑚𝑒 𝑎 , ℎ𝑎𝑠𝐽𝑃𝐶𝑜𝑚𝑝𝑜𝑠𝑒𝑟 𝑎 , 𝑝𝑢𝑏𝑙𝑖𝑠ℎ𝑒𝑟𝑈𝑆𝐴 𝑎 } Output: {(𝑟𝑒𝑙𝑒𝑎𝑠𝑒𝑑𝐼𝑛𝐽𝑃 𝑎 , 𝑟2: 𝑛𝑜𝑡 𝑟𝑒𝑙𝑒𝑎𝑠𝑒𝑑𝐼𝑛𝐽𝑃 𝑎 )} 𝑎 =
  • 10.
    Problem Statement: ConflictingPredictions • Defining conflicts • Measuring conflicts (auxiliary rules) 10 𝑟2: 𝑟𝑒𝑙𝑒𝑎𝑠𝑒𝑑𝐼𝑛𝐽𝑃 𝑋 ← ℎ𝑎𝑠𝐽𝑃𝐶𝑜𝑚𝑝𝑜𝑠𝑒𝑟 𝑋 , 𝑛𝑜𝑡 𝑝𝑢𝑏𝑙𝑖𝑠ℎ𝑒𝑟𝑈𝑆𝐴(𝑋) 𝑟2 𝑎𝑢𝑥 : 𝑛𝑜𝑡_𝑟𝑒𝑙𝑒𝑎𝑠𝑒𝑑𝐼𝑛𝐽𝑃 𝑋 ← ℎ𝑎𝑠𝐽𝑃𝐶𝑜𝑚𝑝𝑜𝑠𝑒𝑟 𝑋 , 𝑝𝑢𝑏𝑙𝑖𝑠ℎ𝑒𝑟𝑈𝑆𝐴(𝑋) 𝑅 = 𝑟1: 𝑟𝑒𝑙𝑒𝑎𝑠𝑒𝑑𝐼𝑛𝐽𝑃 𝑋 ← 𝑖𝑠𝐺𝑎𝑚𝑒 𝑋 , 𝑖𝑠𝐵𝑎𝑠𝑒𝑑𝑂𝑛𝐽𝑃𝐴𝑛𝑖𝑚𝑒 (𝑋) 𝑟2: 𝑟𝑒𝑙𝑒𝑎𝑠𝑒𝑑𝐼𝑛𝐽𝑃 𝑋 ← ℎ𝑎𝑠𝐽𝑃𝐶𝑜𝑚𝑝𝑜𝑠𝑒𝑟 𝑋 , 𝑛𝑜𝑡 𝑝𝑢𝑏𝑙𝑖𝑠ℎ𝑒𝑟𝑈𝑆𝐴(𝑋) {𝑖𝑠𝐺𝑎𝑚𝑒 𝑎 , 𝑖𝑠𝐵𝑎𝑠𝑒𝑑𝑂𝑛𝐽𝑃𝐴𝑛𝑖𝑚𝑒 𝑎 , ℎ𝑎𝑠𝐽𝑃𝐶𝑜𝑚𝑝𝑜𝑠𝑒𝑟 𝑎 , 𝑝𝑢𝑏𝑙𝑖𝑠ℎ𝑒𝑟𝑈𝑆𝐴 𝑎 } Output: {(𝑟𝑒𝑙𝑒𝑎𝑠𝑒𝑑𝐼𝑛𝐽𝑃 𝑎 , 𝑟2: 𝑛𝑜𝑡 𝑟𝑒𝑙𝑒𝑎𝑠𝑒𝑑𝐼𝑛𝐽𝑃 𝑎 )} 𝑎 =
  • 11.
    Problem Statement: ConflictingPredictions • Defining conflicts • Measuring conflicts (auxiliary rules) 11 𝑟2: 𝑟𝑒𝑙𝑒𝑎𝑠𝑒𝑑𝐼𝑛𝐽𝑃 𝑋 ← ℎ𝑎𝑠𝐽𝑃𝐶𝑜𝑚𝑝𝑜𝑠𝑒𝑟 𝑋 , 𝑛𝑜𝑡 𝑝𝑢𝑏𝑙𝑖𝑠ℎ𝑒𝑟𝑈𝑆𝐴(𝑋) 𝑟2 𝑎𝑢𝑥 : 𝑛𝑜𝑡_𝑟𝑒𝑙𝑒𝑎𝑠𝑒𝑑𝐼𝑛𝐽𝑃 𝑋 ← ℎ𝑎𝑠𝐽𝑃𝐶𝑜𝑚𝑝𝑜𝑠𝑒𝑟 𝑋 , 𝑝𝑢𝑏𝑙𝑖𝑠ℎ𝑒𝑟𝑈𝑆𝐴(𝑋) 𝑅 = 𝑟1: 𝑟𝑒𝑙𝑒𝑎𝑠𝑒𝑑𝐼𝑛𝐽𝑃 𝑋 ← 𝑖𝑠𝐺𝑎𝑚𝑒 𝑋 , 𝑖𝑠𝐵𝑎𝑠𝑒𝑑𝑂𝑛𝐽𝑃𝐴𝑛𝑖𝑚𝑒 (𝑋) 𝑟2: 𝑟𝑒𝑙𝑒𝑎𝑠𝑒𝑑𝐼𝑛𝐽𝑃 𝑋 ← ℎ𝑎𝑠𝐽𝑃𝐶𝑜𝑚𝑝𝑜𝑠𝑒𝑟 𝑋 , 𝑛𝑜𝑡 𝑝𝑢𝑏𝑙𝑖𝑠ℎ𝑒𝑟𝑈𝑆𝐴(𝑋) {𝑖𝑠𝐺𝑎𝑚𝑒 𝑎 , 𝑖𝑠𝐵𝑎𝑠𝑒𝑑𝑂𝑛𝐽𝑃𝐴𝑛𝑖𝑚𝑒 𝑎 , ℎ𝑎𝑠𝐽𝑃𝐶𝑜𝑚𝑝𝑜𝑠𝑒𝑟 𝑎 , 𝑝𝑢𝑏𝑙𝑖𝑠ℎ𝑒𝑟𝑈𝑆𝐴 𝑎 } (𝑟𝑒𝑙𝑒𝑎𝑠𝑒𝑑𝐼𝑛𝐽𝑃 𝑎 , 𝑛𝑜𝑡_𝑟𝑒𝑙𝑒𝑎𝑠𝑒𝑑𝐼𝑛𝐽𝑃 𝑎 ) 𝑎 =
  • 12.
    Approach Overview Step 1 •Mining Horn Rules Step 2 • Extracting Exception Witness Set (EWS) Step 3 • Constructing Candidate Revisions Step 4 • Selecting the Best Revision 12
  • 13.
    bornInUSA livesInUSA statelessemigrant singer poet p1 ✓ ✓ p2 ✓ ✓ p3 ✓ ✓ p4 ✓ ✓ ✓ p5 ✓ ✓ ✓ p6 ✓ ✓ p7 ✓ ✓ p8 ✓ ✓ ✓ ✓ p9 ✓ ✓ ✓ p10 ✓ ✓ ✓ ✓ p11 ✓ ✓ ✓ Step 1: Mining Horn Rules 13 𝒓𝒊: 𝒍𝒊𝒗𝒆𝒔𝑰𝒏𝑼𝑺𝑨 𝑿 ← 𝒃𝒐𝒓𝒏𝑰𝒏𝑼𝑺𝑨(𝑿) NormalAb-normal
  • 14.
    bornInUSA livesInUSA statelessemigrant singer poet p1 ✓ ✓ p2 ✓ ✓ p3 ✓ ✓ p4 ✓ ✓ ✓ p5 ✓ ✓ ✓ p6 ✓ ✓ p7 ✓ ✓ p8 ✓ ✓ ✓ ✓ p9 ✓ ✓ ✓ p10 ✓ ✓ ✓ ✓ p11 ✓ ✓ ✓ 𝒓𝒊: 𝒍𝒊𝒗𝒆𝒔𝑰𝒏𝑼𝑺𝑨 𝑿 ← 𝒃𝒐𝒓𝒏𝑰𝒏𝑼𝑺𝑨(𝑿) NormalAb-normalStep 2: Extracting Exception Witness Set (EWS) 14 𝐸𝑊𝑆𝑖 = {𝑒𝑚𝑖𝑔𝑟𝑎𝑛𝑡 𝑋 , 𝑝𝑜𝑒𝑡 𝑋 }
  • 15.
    Step 3: ConstructingCandidate Revisions • Horn rules • Rule revisions 15 𝑟1: 𝑙𝑖𝑣𝑒𝑠𝐼𝑛𝑈𝑆𝐴 𝑋 ← 𝑏𝑜𝑟𝑛𝐼𝑛𝑈𝑆𝐴 𝑋 , 𝑛𝑜𝑡 𝑝𝑜𝑒𝑡 𝑋 𝑟1: 𝑙𝑖𝑣𝑒𝑠𝐼𝑛𝑈𝑆𝐴 𝑋 ← 𝑏𝑜𝑟𝑛𝐼𝑛𝑈𝑆𝐴 𝑋 , 𝑛𝑜𝑡 𝑒𝑚𝑖𝑔𝑟𝑎𝑛𝑡 𝑋 … 𝑅 = 𝑟1: 𝑙𝑖𝑣𝑒𝑠𝐼𝑛𝑈𝑆𝐴 𝑋 ← 𝑏𝑜𝑟𝑛𝐼𝑛𝑈𝑆𝐴(𝑋) 𝑟2: 𝑒𝑚𝑖𝑔𝑟𝑎𝑛𝑡 𝑋 ← 𝑠𝑡𝑎𝑡𝑒𝑙𝑒𝑠𝑠(𝑋) 𝐸𝑊𝑆 = {𝑝𝑜𝑒𝑡 𝑋 , 𝑒𝑚𝑖𝑔𝑟𝑎𝑛𝑡 𝑋 , … } 𝐸𝑊𝑆 = {𝑒1 𝑋 , 𝑒2(𝑋), … }
  • 16.
    Step 4: Selectingthe Best Revision Finding globally best revision is expensive! • Naïve ranker • For each rule, pick the revision that maximizes confidence • Works in isolation from other rules • Partial materialization ranker • 𝐾𝐺s are incomplete! • Augment the original 𝐾𝐺 with predictions of other rules • Rank revisions on avg. confidence of the rule and its auxiliary. 16
  • 17.
    • Partial materialization RankingRule’s Revisions 17 bornInUSA livesInUSA stateless emigrant singer poet p1 ✓ ✓ p2 ✓ ✓ p3 ✓ ✓ p4 ✓ ✓ ✓ p5 ✓ ✓ ✓ p6 ✓ ✓ p7 ✓ ✓ p8 ✓ ✓ ✓ ✓ p9 ✓ ✓ ✓ p10 ✓ ✓ ✓ ✓ p11 ✓ ✓ ✓ 𝒓𝒊: 𝒍𝒊𝒗𝒆𝒔𝑰𝒏𝑼𝑺𝑨 𝑿 ← 𝒃𝒐𝒓𝒏𝑰𝒏𝑼𝑺𝑨(𝑿) NormalAb-normal
  • 18.
    • Partial materialization bornInUSAlivesInUSA stateless emigrant singer poet p1 ✓ ✓ p2 ✓ ✓ p3 ✓ ✓ p4 ✓ ✓ ✓ ✓ p5 ✓ ✓ ✓ ✓ p6 ✓ ✓ ✓ p7 ✓ ✓ ✓ p8 ✓ ✓ ✓ ✓ p9 ✓ ✓ ✓ ✓ ✓ p10 ✓ ✓ ✓ ✓ p11 ✓ ✓ ✓ ✓ 𝒓𝒊: 𝒍𝒊𝒗𝒆𝒔𝑰𝒏𝑼𝑺𝑨 𝑿 ← 𝒃𝒐𝒓𝒏𝑰𝒏𝑼𝑺𝑨(𝑿) NormalAb-normalRanking Rule’s Revisions 18
  • 19.
    Ranking Rule’s Revisions •Ordered partial materialization ranker • Only rules with higher quality • Ordered weighted partial materialization ranker • KG fact weight = 1 • Predicted facts inherit their weights from the rules 19
  • 20.
    Experiments • Ruleset quality 20 *Higheris better 0.60 0.65 0.70 0.75 0.80 20 40 60 80 100 Avg.Confidence Top-K (%) Rules YAGO3 Horn Naive Ordered & Weighted PM 0.8 0.82 0.84 0.86 0.88 0.9 0.92 0.94 20 40 60 80 100Avg.Confidence Top-K (%) Rules IMDB Horn Naive Ordered & Weighted PM Facts 10M 2M Rules 10K 25K
  • 21.
    Experiments • Predictions consistency 21*Loweris better 0.00 0.05 0.10 0.15 0.20 0.25 0.30 0.35 0.40 200 400 600 800 1000 ConflictRatio Top-K Rules YAGO3 Naive Ordered & Weighted PM 0 0.1 0.2 0.3 0.4 0.5 200 400 600 800 1000ConflictRatio Top-K Rules IMDB Naive Ordered & Weighted PM
  • 22.
    Experiments • Examples 22 𝑖𝑠𝑀𝑜𝑢𝑛𝑡𝑎𝑖𝑛(𝑋) ←𝑖𝑠𝐼𝑛𝐴𝑢𝑠𝑡𝑟𝑖𝑎(𝑋), 𝑖𝑠𝐼𝑛𝐼𝑡𝑎𝑙𝑦 𝑋 , 𝑛𝑜𝑡 𝑖𝑠𝑅𝑖𝑣𝑒𝑟 (𝑋) 𝑖𝑠𝑃𝑜𝑙𝑖𝑡𝑂𝑓𝑈𝑆𝐴 𝑋 ← 𝑏𝑜𝑟𝑛𝐼𝑛𝑈𝑆𝐴 𝑋 , 𝑖𝑠𝐺𝑜𝑣 𝑋 , 𝑛𝑜𝑡 𝑖𝑠𝑃𝑜𝑙𝑖𝑡𝑃𝑢𝑒𝑟𝑡𝑜𝑅𝑖𝑐𝑜(𝑋) 𝑏𝑜𝑟𝑛𝐼𝑛𝑈𝑆𝐴 𝑋 ← 𝑎𝑐𝑡𝑒𝑑𝐼𝑛𝑀𝑜𝑣𝑖𝑒 𝑋 , 𝑐𝑟𝑒𝑎𝑡𝑒𝑑𝑀𝑜𝑣𝑖𝑒 𝑋 , 𝑛𝑜𝑡 𝑤𝑜𝑛𝐹𝑖𝑙𝑚𝑓𝑎𝑟𝑒(𝑋)
  • 23.
    Summary • Conclusion • Quality-basedtheory revision under OWA • Partial materialization for ranking revisions • Comparison of ranking methods on real life KGs • Outlook • Extending to higher arity predicates • Binary predicates [Tran et al., to appear ILP2016] • Evidence from text corpora • Exploiting partial completeness 23
  • 24.
    References • [Angiulli andFassetti, 2014] Fabrizio Angiulli and Fabio Fassetti. Exploiting domain knowledge to detect outliers. Data Min. Knowl. Discov., 28(2):519–568, 2014. • [Dimopoulos and Kakas, 1995] Yannis Dimopoulos and Antonis C. Kakas. Learning non-monotonic logic programs: Learning exceptions. In Machine Learning: ECML-95, 8th European Conference on Machine Learning, Heraclion, Crete, Greece, April 25-27, 1995, Proceedings, pages 122–137, 1995. • [Galarraga et al., 2015] Luis Galarraga, Christina Teflioudi, Katja Hose, and Fabian M. Suchanek. Fast Rule Mining in Ontological Knowledge Bases with AMIE+. In VLDB Journal, 2015. • [Law et al., 2015] Mark Law, Alessandra Russo, and Krysia Broda. The ILASP system for learning answer set programs, 2015. • [Leone et al., 2006] Nicola Leone, Gerald Pfeifer, Wolfgang Faber, Thomas Eiter, Georg Gottlob, Simona Perri, and Francesco Scarcello. 2006. The DLV system for knowledge representation and reasoning. ACM Trans. Comput. Logic 7, 3 (July 2006), 499-562. • [Suzuki, 2006] Einoshin Suzuki. Data mining methods for discovering interesting exceptions from an unsupervised table. J. UCS, 12(6):627–653, 2006. • [Tran et al., 2016] Hai Dang Tran, Daria Stepanova, Mohamed H. Gad-Elrab, Francesca A. Lisi, Gerhard Weikum. Towards Nonmonotonic Relational Learning from Knowledge Graphs. ILP2016, London, UK, to appear. • [Katzouris et al., 2015] Nikos Katzouris, Alexander Artikis, and Georgios Paliouras. Incremental learning of event definitions with inductive logic programming. Machine Learning, 100(2-3):555– 585, 2015. 24
  • 25.
    Related Work • Learningnonmonotonic programs • E.g., [Dimopoulos and Kakas, 1995], ILASP [Law et al., 2015], ILED [Katzouris et al., 2015], etc. • Outlier detection in logic programs • E.g., [Angiulli and Fassetti, 2014], etc. • Mining exception rules • E.g., [Suzuki, 2006], etc. 25
  • 26.
    Problem Statement: RulesetQuality • Independent Rule Measure (𝑟𝑚) • Support: 𝑠𝑢𝑝𝑝 𝐻 ← 𝐵 = 𝑠𝑢𝑝𝑝(𝐻 ∪ 𝐵) • Coverage: c𝑜𝑣 𝐻 ← 𝐵 = 𝑠𝑢𝑝𝑝(𝐵) • Confidence: c𝑜𝑛𝑓 𝐻 ← 𝐵 = 𝑠𝑢𝑝𝑝(𝐻∪𝐵) 𝑠𝑢𝑝𝑝(𝐵) • Lift: 𝑙𝑖𝑓𝑡 𝐻 ← 𝐵 = 𝑐𝑜𝑛𝑓(𝐻←𝐵) 𝑠𝑢𝑝𝑝(𝐻) • … • Average Ruleset Quality 26 𝑞 𝑟𝑚 𝑅 𝑁𝑀, 𝐺 = 𝑟 ∈𝑅 𝑁𝑀 𝑟𝑚(𝑟, 𝐺) 𝑅 𝑁𝑀
  • 27.
    Problem Statement: ConflictingPredictions • Measuring conflicts (auxiliary rules) 27 𝑟2: 𝑟𝑒𝑙𝑒𝑎𝑠𝑒𝑑𝐼𝑛𝐽𝑃 𝑋 ← ℎ𝑎𝑠𝐽𝑃𝐶𝑜𝑚𝑝𝑜𝑠𝑒𝑟 𝑋 , 𝑛𝑜𝑡 𝑝𝑢𝑏𝑙𝑖𝑠ℎ𝑒𝑟𝑈𝑆𝐴(𝑋) 𝑟2 𝑎𝑢𝑥 : 𝑛𝑜𝑡_𝑟𝑒𝑙𝑒𝑎𝑠𝑒𝑑𝐼𝑛𝐽𝑃 𝑋 ← ℎ𝑎𝑠𝐽𝑃𝐶𝑜𝑚𝑝𝑜𝑠𝑒𝑟 𝑋 , 𝑝𝑢𝑏𝑙𝑖𝑠ℎ𝑒𝑟𝑈𝑆𝐴(𝑋) 𝑞 𝑐𝑜𝑛𝑓𝑙𝑖𝑐𝑡 𝑅 𝑁𝑀, 𝐺 = |{(𝑝 𝑎 , 𝑛𝑜𝑡_𝑝 𝑎 ), … }| {𝑛𝑜𝑡_𝑝 𝑎 , … }
  • 28.
    Propositionalization • Unary predicates 28 BinaryPredicates • hasType(enstein, scientist) • isMarriedTo(elsa, einstein) • bornIn(einstein, um) Unary • isAScientist(einstein) • isMarriedToEinstein(elsa) • bornInUlm(einstein) Abstraction • isAScientist(einstein) • isMarriedToScientist(elsa) • bornInGermany(einstein)
  • 29.
    Experiments • Input • Experimentstatistics 29 YAGO3 IMDB Input Facts 10M 2M Horn Rules 10K 25K Revised Rules 6K 22K General-purpose KG Domain-specific KG (Movies)
  • 30.
    Experiments • Predictions assessment •Run DLV on YAGO and 𝑅 𝐻 then 𝑅 𝑁𝑀 seperately • Sample facts such that fact 𝑓 ∈ 𝑌𝐴𝐺𝑂 𝐻𝑌𝐴𝐺𝑂 𝑁𝑀 • 73% of the sampled facts were found to be erroneous 30 checked predictions
  • 31.
    Ranking Rule’s Revisions •Partial materialization ranker • Augment the original 𝐾𝐺 with predictions of other rules • Rank revisions on Avg. confidence of the 𝑟 and 𝑟 𝑎𝑢𝑥 31 𝑠𝑐𝑜𝑟𝑒 𝑟𝑒, 𝐾𝐺∗ = 𝑐𝑜𝑛𝑓 𝑟𝑒, 𝐾𝐺∗ + 𝑐𝑜𝑛𝑓(𝑟𝑒 𝑎𝑢𝑥, 𝐾𝐺∗) 2 where 𝑟𝑒 is the rule 𝑟 with exception 𝑒 & 𝐾𝐺∗ is the augmented 𝐾𝐺.