Machine Learning Methods for Analysing and Linking RDF Data

Jens Lehmann (AKSW, Uni Leipzig)
September 16, 2014

Structured Machine Learning
How to analyse
structured data?

Detecting Crime Patterns: Series Finder
Constructs the "modus operandi" of criminals; identified 9 new crime
patterns in Cambridge, MA, USA
Wang, Tong, et al. "Detecting Patterns of Crime with Series Finder." AAAI 2013.

Discovery of Laws of Physics
Background data generated using experiments
Mathematical functions on input variables form hypothesis space
Schmidt, Lipson. "Distilling free-form natural laws from experimental data." Science 2009.

Protein Interaction
Rules learned via Inductive Logic Programming (ProGolem)
understandable by experts and competitive with statistical learners
Possibly better drug design and reduction of side effects
Santos et al. "Automated identification of protein-ligand interaction features using Inductive
Logic Programming: a hexose binding case study." BMC Bioinformatics 2012.

Background Knowledge

RDF and the Linked Data Principles
RDF Triple:
Example:
Subject: http://cs.ox.ac.uk/John
Predicate: http://cs.ox.ac.uk/studies
Object: http://cs.ox.ac.uk/CS

The term Linked Data refers to a set of best practices for publishing and
interlinking structured data on the Web.
Linked Data principles (simplified version):
1 Use RDF and URLs as identifiers
2 Include links to other datasets
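
A minimal sketch of how such triples can be written down in the N-Triples syntax, using the example URIs from the RDF triple slide (read here as http://cs.ox.ac.uk/...). The helper `to_ntriples` and the DBpedia link target are assumptions made for illustration and are not part of the talk.

```python
# Hypothetical helper: serialise (subject, predicate, object) URI triples as N-Triples lines.
def to_ntriples(triples):
    return "\n".join(f"<{s}> <{p}> <{o}> ." for s, p, o in triples)


triples = [
    # Principle 1: RDF with URLs as identifiers (example from the slide).
    ("http://cs.ox.ac.uk/John", "http://cs.ox.ac.uk/studies", "http://cs.ox.ac.uk/CS"),
    # Principle 2: a link into another dataset (illustrative target, not from the slides).
    ("http://cs.ox.ac.uk/CS",
     "http://www.w3.org/2000/01/rdf-schema#seeAlso",
     "http://dbpedia.org/resource/Computer_science"),
]

print(to_ntriples(triples))
```

This only covers triples whose three parts are all URIs; literals and blank nodes would need additional handling.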

OWL Ontologies
Web Ontology Language (OWL) builds on RDF and Description
Logics

Objects
Specific resources (constants)
Examples: MARIA, LEIPZIG
Classes
Sets of objects (unary predicates)
Examples: Student, Car, Country
Properties
Connections between objects (binary predicates)
Examples: hasChild, partOf
Can be combined into complex concepts (OWL class expressions), e.g.:
$\mathit{Child} \sqcap \exists \mathit{hasParent}.\mathit{Professor}$
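
To make the reading of such an expression concrete, the following sketch evaluates Child ⊓ ∃hasParent.Professor over a tiny hand-made interpretation. All individuals and facts are invented, and the evaluation is a plain closed-world set computation, which is an assumption of this sketch; an OWL reasoner under the open world assumption may answer differently.

```python
# Toy interpretation (all individuals and facts are made up for illustration).
classes = {
    "Child": {"anna", "ben"},
    "Professor": {"carl"},
}
properties = {
    "hasParent": {("anna", "carl"), ("ben", "dora")},
}


def atomic(name):
    """Instances of a named class (unary predicate)."""
    return set(classes.get(name, set()))


def intersection(left, right):
    """C ⊓ D: individuals that belong to both sets."""
    return left & right


def exists(prop, filler):
    """∃prop.filler: individuals with at least one prop-successor in filler."""
    return {s for (s, o) in properties.get(prop, set()) if o in filler}


# Child ⊓ ∃hasParent.Professor
result = intersection(atomic("Child"), exists("hasParent", atomic("Professor")))
print(result)  # {'anna'}
```

Only anna qualifies: ben is a Child, but his only known parent is not a Professor.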

Learning OWL Class Expressions - Definition
Given:
Background Knowledge (OWL ontologies and RDF datasets)
Positive and negative examples (objects in datasets)
Goal:
Find OWL class expression describing positive but not negative
examples

Application Example: Therapy Response Prediction
0.5-1% of population affected by Rheumatoid Arthritis
Anti-TNF not effective for several million persons for unknown reasons

Learning OWL Class Expressions - Approaches
Least common subsumers
Cohen et al. Computing least common subsumers in description
logics. AAAI 1992
Terminological decision trees
Fanizzi et al. Induction of concepts in web ontologies through
terminological decision trees. ECML PKDD 2010
Rule-based
Fanizzi et al. DL-FOIL concept learning in description logics. ILP
2008
Genetic Programming
Lehmann, Jens. Hybrid learning of ontology classes. MLDM 2007
Refinement operators
Lehmann et al. Concept learning in description logics using refinement
operators. ML 2010
Iannone et al. An algorithm based on counterfactuals for concept
learning in the semantic web. AI 2007

Refinement Operators - Definitions
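Given a DL $\mathcal{L}$, consider the quasi-ordered space $\langle C(\mathcal{L}), \sqsubseteq_T \rangle$ over
concepts of $\mathcal{L}$
$\rho : C(\mathcal{L}) \to 2^{C(\mathcal{L})}$ is a downward $\mathcal{L}$ refinement operator if for any
$C \in C(\mathcal{L})$:
$D \in \rho(C)$ implies $D \sqsubseteq_T C$
Notation: Write $C \rightsquigarrow_\rho D$ instead of $D \in \rho(C)$
Example refinement chain:
$\mathit{Person} \rightsquigarrow_\rho \mathit{Man} \rightsquigarrow_\rho \mathit{Man} \sqcap \exists \mathit{hasChild}.\top$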

Learning using Refinement Operators
Search tree (figure), each concept with its heuristic score:
$\top$ (0.45)
  Car (too weak)
  Person (0.73)
    $\mathit{Person} \sqcap \exists \mathit{attends}.\top$ (0.78)
      $\mathit{Person} \sqcap \exists \mathit{attends}.\mathit{Talk}$ (0.97)
      . . .
    . . .
  . . .
Start with most general concept (top down)
Heuristic evaluates using pos/neg examples
Operator specialises
Continue until termination criterion met
= Learning Algorithm
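
The loop described on this slide can be sketched as a best-first search. Everything below is a toy under stated assumptions: the individuals, the two-slot concept representation `(cls, filler)`, the hand-written operator `refine`, plain accuracy as heuristic, and "too weak" read as "misses a positive example" are all invented for illustration and are not the algorithm implemented in DL-Learner.

```python
import heapq

# Toy data, invented for illustration: individual -> (classes, attended events).
INDIVIDUALS = {
    "alice": ({"Person"}, {"talk1"}),
    "bob": ({"Person"}, {"talk2"}),
    "carol": ({"Person"}, {"party1"}),
    "dave": ({"Person"}, set()),
    "beetle": ({"Car"}, set()),
    "camera": ({"Device"}, {"talk1"}),
}
EVENT_CLASSES = {"talk1": "Talk", "talk2": "Talk", "party1": "Party"}
POS = {"alice", "bob"}
NEG = {"carol", "dave", "beetle", "camera"}

# A concept is (cls, filler): cls None means top; filler None means no restriction,
# "Thing" means "attends something", any other value means "attends some <filler>".
TOP = (None, None)


def covers(concept, name):
    cls, filler = concept
    classes, events = INDIVIDUALS[name]
    if cls is not None and cls not in classes:
        return False
    if filler == "Thing":
        return bool(events)
    if filler is not None:
        return any(EVENT_CLASSES[e] == filler for e in events)
    return True


def refine(concept):
    """Toy downward operator: every result is at most as general as its input."""
    cls, filler = concept
    out = []
    if cls is None:
        out += [(c, filler) for c in ("Car", "Person")]
    if filler is None:
        out.append((cls, "Thing"))
    elif filler == "Thing":
        out += [(cls, f) for f in ("Party", "Talk")]
    return out


def accuracy(concept):
    hits = sum(covers(concept, p) for p in POS) + sum(not covers(concept, n) for n in NEG)
    return hits / (len(POS) + len(NEG))


def learn(max_expansions=50):
    frontier = [(-accuracy(TOP), 0, TOP)]  # max-heap via negated score
    seen, counter, best = {TOP}, 1, TOP
    for _ in range(max_expansions):
        if not frontier:
            break
        neg_score, _, concept = heapq.heappop(frontier)
        if -neg_score > accuracy(best):
            best = concept
        if -neg_score == 1.0:  # termination criterion: perfect on the examples
            break
        for child in refine(concept):
            too_weak = not all(covers(child, p) for p in POS)
            if child in seen or too_weak:
                continue
            seen.add(child)
            heapq.heappush(frontier, (-accuracy(child), counter, child))
            counter += 1
    return best


print(learn())  # ('Person', 'Talk')
```

On this toy data the search passes through Person and "Person attending something" before it reaches "Person attending some Talk", mirroring the chain on the slide; Car is discarded as too weak.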

Properties of Refinement Operators
An $\mathcal{L}$ downward refinement operator $\rho$ is called
Finite iff $\rho(C)$ is finite for any concept $C \in C(\mathcal{L})$
Redundant iff there exist two different $\rho$ refinement chains from a
concept $C$ to a concept $D$.
Proper iff for $C, D \in C(\mathcal{L})$, $C \rightsquigarrow_\rho D$ implies $C \not\equiv_T D$
Complete iff for $C, D \in C(\mathcal{L})$ with $D \sqsubset_T C$ there is a concept $E$ with
$E \equiv_T D$ and a refinement chain $C \rightsquigarrow_\rho \dots \rightsquigarrow_\rho E$
Weakly complete iff for any concept $C$ with $C \sqsubset_T \top$ we can reach a
concept $E$ with $E \equiv_T C$ from $\top$ by $\rho$.

Properties indicate how suitable a refinement operator is for solving
the learning problem:
Incomplete operators may miss solutions
Redundant operators may lead to duplicate concepts in the search tree
Improper operators may produce equivalent concepts (which cover the
same examples)
For infinite operators it may not be possible to compute all refinements
of a given concept
Key question: Which properties can be combined?
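
As an illustration of why redundancy leads to duplicate concepts in the search tree, the sketch below counts refinement chains in a small, finite refinement graph. The graph and the function names are hypothetical; real operators over description logics are generally infinite and cannot be enumerated like this.

```python
# Hypothetical finite refinement graph: concept -> set of direct refinements.
# Two disjuncts A and B can each be refined once, in either order.
GRAPH = {
    "A|B": {"A'|B", "A|B'"},
    "A'|B": {"A'|B'"},
    "A|B'": {"A'|B'"},
    "A'|B'": set(),
}

# A tree-shaped graph for comparison: every concept has exactly one chain leading to it.
TREE = {"C": {"C1", "C2"}, "C1": set(), "C2": set()}


def count_chains(graph, start, goal):
    """Number of distinct refinement chains from start to goal (graph must be acyclic)."""
    if start == goal:
        return 1
    return sum(count_chains(graph, nxt, goal) for nxt in graph[start])


def is_redundant(graph):
    """Redundant iff some pair of concepts is connected by more than one chain."""
    return any(count_chains(graph, c, d) > 1 for c in graph for d in graph if c != d)


print(is_redundant(GRAPH))  # True
print(is_redundant(TREE))   # False
```

In `GRAPH`, the concept `A'|B'` is reached twice, so a search that does not detect this would evaluate it twice.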

Theorem: Properties of L Refinement Operators
Theorem
Maximum sets of combinable properties of $\mathcal{L}$ refinement operators for
$\mathcal{L} \in \{\mathcal{ALC}, \mathcal{ALCN}, \mathcal{SHOIN}, \mathcal{SROIQ}\}$ are:
1 {weakly complete, complete, finite}
2 {weakly complete, complete, proper}
3 {weakly complete, non-redundant, finite}
4 {weakly complete, non-redundant, proper}
5 {non-redundant, finite, proper}
Concept Learning in Description Logics Using Refinement Operators; Lehmann, Hitzler; Machine
Learning journal, 2010
Foundations of Refinement Operators for Description Logics; Lehmann, Hitzler; ILP conference,
2008

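Definition of $\rho_\downarrow$
\[
\rho_\downarrow(C) =
\begin{cases}
\{\bot\} \cup \rho_\top(C) & \text{if } C = \top \\
\rho_\top(C) & \text{otherwise}
\end{cases}
\]
\[
\rho_B(C) =
\begin{cases}
\emptyset & \text{if } C = \bot \\
\{C_1 \sqcup \dots \sqcup C_n \mid C_i \in M_B \ (1 \leq i \leq n)\} & \text{if } C = \top \\
\{A' \mid A' \in sh_\downarrow(A)\} \cup \{A \sqcap D \mid D \in \rho_B(\top)\} & \text{if } C = A \ (A \in N_C) \\
\{\neg A' \mid A' \in sh_\uparrow(A)\} \cup \{\neg A \sqcap D \mid D \in \rho_B(\top)\} & \text{if } C = \neg A \ (A \in N_C) \\
\{\exists r.E \mid A = ar(r),\ E \in \rho_A(D)\} \cup \{\exists r.D \sqcap E \mid E \in \rho_B(\top)\} \cup \{\exists s.D \mid s \in sh_\downarrow(r)\} & \text{if } C = \exists r.D \\
\{\forall r.E \mid A = ar(r),\ E \in \rho_A(D)\} \cup \{\forall r.D \sqcap E \mid E \in \rho_B(\top)\} \cup \{\forall r.\bot \mid D = A \in N_C \text{ and } sh_\downarrow(A) = \emptyset\} \cup \{\forall s.D \mid s \in sh_\downarrow(r)\} & \text{if } C = \forall r.D \\
\{C_1 \sqcap \dots \sqcap C_{i-1} \sqcap D \sqcap C_{i+1} \sqcap \dots \sqcap C_n \mid D \in \rho_B(C_i),\ 1 \leq i \leq n\} & \text{if } C = C_1 \sqcap \dots \sqcap C_n \ (n \geq 2) \\
\{C_1 \sqcup \dots \sqcup C_{i-1} \sqcup D \sqcup C_{i+1} \sqcup \dots \sqcup C_n \mid D \in \rho_B(C_i),\ 1 \leq i \leq n\} \cup \{(C_1 \sqcup \dots \sqcup C_n) \sqcap D \mid D \in \rho_B(\top)\} & \text{if } C = C_1 \sqcup \dots \sqcup C_n \ (n \geq 2)
\end{cases}
\]
Base Operator (Excerpt)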

Definition of $\rho_\downarrow$: case $C = \exists r.D$
\[
\{\exists r.E \mid A = ar(r),\ E \in \rho_A(D)\} \cup \{\exists r.D \sqcap E \mid E \in \rho_B(\top)\} \cup \{\exists s.D \mid s \in sh_\downarrow(r)\} \quad \text{if } C = \exists r.D
\]
Examples:
$\exists \mathit{takesPartIn}.\mathit{SocialEvent} \rightsquigarrow \exists \mathit{takesPartIn}.\mathit{Meeting}$
$\exists \mathit{takesPartIn}.\mathit{SocialEvent} \rightsquigarrow \mathit{Student} \sqcap \exists \mathit{takesPartIn}.\mathit{SocialEvent}$
$\exists \mathit{takesPartIn}.\mathit{SocialEvent} \rightsquigarrow \exists \mathit{leads}.\mathit{SocialEvent}$

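Properties of $\rho_\downarrow$
$\rho_\downarrow$ is complete
$\rho_\downarrow$ is infinite, e.g. there are infinitely many refinement steps of the
form:
$\top \rightsquigarrow_{\rho_\downarrow} C_1 \sqcup C_2 \sqcup C_3 \sqcup \dots$
$\rho_\downarrow^{cl}$ is proper
$\rho_\downarrow$ is redundant, e.g. two different chains lead from
$\forall r_1.A_1 \sqcup \forall r_2.A_1$ to $\forall r_1.(A_1 \sqcap A_2) \sqcup \forall r_2.(A_1 \sqcap A_2)$:
via $\forall r_1.(A_1 \sqcap A_2) \sqcup \forall r_2.A_1$
via $\forall r_1.A_1 \sqcup \forall r_2.(A_1 \sqcap A_2)$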
“DL-Learner: Learning Concepts in Description Logics”,
Jens Lehmann, Journal of Machine Learning Research (JMLR), 2009

[Figure: search tree with nodes $\top$, Car (too weak), Person, $\mathit{Person} \sqcap \exists \mathit{attends}.\top$, each annotated with its heuristic score and, in brackets, its expansion value]
Redundancy elimination technique with polynomial complexity wrt. search tree size
Length of children limited by expansion value
Infinite operators applicable
Expansion value (he) used by heuristic (bias towards short concepts, Occam’s Razor)

Scalability
Refinement operator should build coherent concepts
Class Expression Learning for Ontology Engineering; Jens Lehmann, Sören Auer, Lorenz
Bühmann, Sebastian Tramp; Journal of Web Semantics (JWS), 2011

Scalability
Inference:
Complete and sound vs. approximation
Open World Assumption (OWA) vs. Closed World Assumption (CWA)

Scalability
Inference:
Stochastic coverage computation
Pick random example → perform instance check → compute
confidence interval (e.g. via Wald method) wrt. objective function
(e.g. F-measure)
Up to 99% fewer instance checks in test examples
Low influence on accuracy shown for 380 learning tasks using 7
ontologies (0.2% ± 0.4% F-measure difference)
Fragment extraction for application on large knowledge bases
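
The sampling idea can be sketched as follows. This is a simplified illustration, not the procedure evaluated in the talk: it estimates the plain share of covered examples rather than an F-measure, and the parameters `max_width`, `min_samples` and `seed` as well as both function names are assumptions of this sketch.

```python
import math
import random


def wald_interval(successes, n, z=1.96):
    """Wald confidence interval for a proportion (z=1.96 gives roughly 95%)."""
    p = successes / n
    half = z * math.sqrt(p * (1 - p) / n)
    return max(0.0, p - half), min(1.0, p + half)


def estimate_coverage(examples, instance_check, max_width=0.1, min_samples=30, seed=0):
    """Check randomly ordered examples until the Wald interval is narrow enough.

    Returns (estimated coverage, number of instance checks performed).
    """
    order = list(examples)
    if not order:
        raise ValueError("need at least one example")
    random.Random(seed).shuffle(order)
    covered = 0
    for n, example in enumerate(order, start=1):
        covered += bool(instance_check(example))  # in practice: an expensive reasoner call
        if n >= min_samples:
            low, high = wald_interval(covered, n)
            if high - low <= max_width:
                return covered / n, n
    return covered / len(order), len(order)


# Stand-in for a reasoner: 90% of 10,000 examples are instances of the concept.
estimate, checks = estimate_coverage(range(10_000), lambda x: x % 10 != 0)
print(estimate, checks)
```

The Wald interval collapses to zero width when every sampled example agrees, which is why the sketch enforces a minimum sample size before it is allowed to stop.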

Carcinogenesis
Goal: predict whether substance causes cancer
Why:
Each year 1000 new substances developed
Substances can often only be validated using time-consuming and
expensive experiments with mice → prioritise those with high risk
Background knowledge:
Database of the US National Toxicology Program (NTP)
“Obtaining accurate structural alerts for the causes of chemical cancers is
a problem of great scientific and humanitarian value.” (A. Srinivasan, R.D.
King, S.H. Muggleton, M.J.E. Sternberg 1997)

Knowledge Base Enrichment
Pattern Based Knowledge Base Enrichment; Lorenz Bühmann, Jens Lehmann; International
Semantic Web Conference (ISWC) 2013
Universal OWL Axiom Enrichment for Large Knowledge Bases; Lorenz Bühmann, Jens
Lehmann; Knowledge Engineering and Knowledge Management (EKAW) 2012

Protégé Plugin
Support for ontology creation and maintenance

Ontology Debugging: ORE
ORE - A Tool for Repairing and Enriching Knowledge Bases; Lehmann, Bühmann; International
Semantic Web Conference (ISWC) 2010

Data Quality Measurement: RDFUnit
Test-driven Evaluation of Linked Data Quality; World Wide Web Conference (WWW),
ACM, 2014; Dimitris Kontokostas, Patrick Westphal, Sören Auer, Sebastian Hellmann, Jens
Lehmann, Roland Cornelissen, Amrapali J. Zaveri

Robot Scientists Adam & Eve
Abduction to form hypothesis and 1 000 experiments per day
12 new scientific discoveries regarding functions of genes in yeast
King, Ross D et al. The automation of science. Science 324 (2009): 85-89.

Link Discovery - Motivation
Links are backbone of traditional WWW and Data Web
Links are central for data integration, deduplication, cross-ontology
question answering, reasoning, federated queries . . .
Central problem for many large IT companies
Automated tools (LIMES, SILK) can create a high number of links
between RDF resources by using heuristics

Link Discovery - Definition
Definition (Link Discovery)
Given sets $S$ and $T$ of resources and relation $R$ (often owl:sameAs)
Common approach: Find $M = \{(s, t) \in S \times T : \delta(s, t) \leq \theta\}$

S: DBpedia
rdfs:label: African Elephant
T: BBC Wildlife
dc:title: African Bush Elephant
dbpedia:AfricanElephant owl:sameAs bbc:hfzw82929 ?
$\delta$ = levenshtein(S.rdfs:label, T.dc:title)
$\delta$(dbpedia:AfricanElephant, bbc:hfzw82929) = 5
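
The value 5 can be checked with a short implementation of the Levenshtein distance. The function below is a standard dynamic-programming sketch written for this example; the threshold `THETA` is a hypothetical value chosen only to show how a distance would be turned into a link decision.

```python
def levenshtein(a, b):
    """Edit distance with unit-cost insertions, deletions and substitutions."""
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            current.append(min(
                previous[j] + 1,               # delete ca
                current[j - 1] + 1,            # insert cb
                previous[j - 1] + (ca != cb),  # substitute, or keep if equal
            ))
        previous = current
    return previous[-1]


distance = levenshtein("African Elephant", "African Bush Elephant")
print(distance)  # 5

THETA = 5  # hypothetical threshold, not taken from the slides
print(distance <= THETA)  # True
```

The five edits are the insertion of the characters in "Bush " (including the space).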

Example: Link Specification
$f(\mathrm{trigrams}(\text{:name}, \text{:label}), 0.5) \sqcup f(\mathrm{edit}(\text{:socId}, \text{:socId}), 0.5)$

Link Specification Syntax and Semantics
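\[
\begin{array}{ll}
LS & [\![LS]\!] \\
\hline
f(m, \theta, M) & \{(s, t, r) \mid (s, t, r) \in M \wedge m(s, t) \geq \theta\} \\
LS_1 \sqcap LS_2 & \{(s, t, r) \mid (s, t, r_1) \in [\![LS_1]\!] \wedge (s, t, r_2) \in [\![LS_2]\!] \wedge r = \min(r_1, r_2)\} \\
LS_1 \sqcup LS_2 & \left\{ (s, t, r) \;\middle|\;
\begin{array}{ll}
r = r_1 & \text{if } \exists (s, t, r_1) \in [\![LS_1]\!] \wedge \neg(\exists r_2 : (s, t, r_2) \in [\![LS_2]\!]), \\
r = r_2 & \text{if } \exists (s, t, r_2) \in [\![LS_2]\!] \wedge \neg(\exists r_1 : (s, t, r_1) \in [\![LS_1]\!]), \\
r = \max(r_1, r_2) & \text{if } (s, t, r_1) \in [\![LS_1]\!] \wedge (s, t, r_2) \in [\![LS_2]\!]
\end{array}
\right\}
\end{array}
\]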
Syntax and semantics allow defining an ordering similar to
subsumption (more specific specs generate fewer links)
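
A small sketch of these semantics, with mappings stored as dictionaries from (s, t) pairs to a score r. The function names `f`, `conj` and `disj`, the resource names and all scores are invented for illustration; this is not the LIMES API.

```python
def f(measure, theta, mapping):
    """Filter: keep the links of `mapping` whose measure value reaches theta."""
    return {pair: r for pair, r in mapping.items() if measure(*pair) >= theta}


def conj(m1, m2):
    """Intersection: links present in both mappings, scored with the minimum."""
    return {pair: min(m1[pair], m2[pair]) for pair in m1.keys() & m2.keys()}


def disj(m1, m2):
    """Union: links present in either mapping, scored with the maximum if in both."""
    out = dict(m1)
    for pair, r in m2.items():
        out[pair] = max(out[pair], r) if pair in out else r
    return out


# Made-up similarity scores for three candidate links.
name_sim = {("s1", "t1"): 0.9, ("s1", "t2"): 0.3, ("s2", "t2"): 0.6}
id_sim = {("s1", "t1"): 1.0, ("s1", "t2"): 0.0, ("s2", "t2"): 0.4}

by_name = f(lambda s, t: name_sim[(s, t)], 0.5, name_sim)
by_id = f(lambda s, t: id_sim[(s, t)], 0.5, id_sim)

print(conj(by_name, by_id))  # {('s1', 't1'): 0.9}
print(disj(by_name, by_id))
```

On this data the conjunction yields one link and the disjunction two, which matches the remark that more specific specifications generate fewer links.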

Link Specification Refinement Operator
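\[
\rho(LS) =
\begin{cases}
\{f(m_1, 1, \top) \sqcap \dots \sqcap f(m_n, 1, \top) \mid m_i \in SM,\ 1 \leq i \leq n,\ n \leq 2^{|SM|}\} & \text{if } LS = \bot \\
\{f(m, dt(\theta), M)\} \cup \{LS \sqcup f(m', 1, M) \mid m \in SM,\ m \neq m'\} & \text{if } LS = f(m, \theta, M) \text{ (atomic)} \\
\{LS_1 \sqcap \dots \sqcap LS_{i-1} \sqcap LS' \sqcap LS_{i+1} \sqcap \dots \sqcap LS_n \mid LS' \in \rho(LS_i)\} & \text{if } LS = LS_1 \sqcap \dots \sqcap LS_n \ (n \geq 2) \\
\{LS_1 \sqcup \dots \sqcup LS_{i-1} \sqcup LS' \sqcup LS_{i+1} \sqcup \dots \sqcup LS_n \mid LS' \in \rho(LS_i)\} \cup \{LS \sqcup f(m, 1, M) \mid m \in SM,\ m \text{ not used in } LS\} & \text{if } LS = LS_1 \sqcup \dots \sqcup LS_n \ (n \geq 2)
\end{cases}
\]
Upward refinement operator
Positive: Weakly complete, finite
Negative: Not complete, redundant, not proper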

Refinement Chain Example
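$f(\mathrm{edit}(\text{:socId}, \text{:socId}), 1.0)$
[Figure: three further refinement steps, each a link specification with $\sqcup$ at its root]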

Projects: DL-Learner and LIMES
DL-Learner
Open-source project: http://dl-learner.org
Extensible Platform for concept learning algorithms
Supports all RDF/OWL serialisations and major reasoners
Several thousand downloads
LIMES (http://aksw.org/Projects/LIMES.html)
Highly scalable engine (fastest RDF link discovery tool)
Several machine learning approaches integrated (including the one
presented)
“DL-Learner: Learning Concepts in Description Logics”,
Jens Lehmann, Journal of Machine Learning Research (JMLR), 2009

Summary & Conclusions
Many interesting applications of structured machine learning (therapy
response prediction, disease prediction, protein folding, data quality
measurement, ontology debugging)
Still few machine learning tools for working with RDF/OWL although
more and more data is available
Refinement operators allow applying supervised machine learning to
complex background knowledge
Can be applied to other languages like link specifications
