Fixing the Domain and Range of Properties in Linked Data by Context Disambiguation
1. Fixing the Domain and Range of Properties in Linked Data by Context Disambiguation
Alberto Tonon, Michele Catasta,
Gianluca Demartini, Philippe Cudré-Mauroux
LDOW - May the 19th, 2015
2. Linked Data…
[Figure: example RDF graph. Literals: "Cobie Smulders", "Neil Patrick Harris", "How I Met Your Mother". Edge labels: showName, starring (twice), name (twice), network, type. Types: TV Show, Work, Actor, Person, TV Network, Broadcast Network.]
3. … and its Schema
Type Hierarchy
[Figure: tree rooted at Thing. Thing → Person → Actor; Thing → Work → TV show; Thing → Organisation → Broadcaster; ...]
Property Definitions
• network: domain Broadcaster, range Broadcaster
• starring: domain Work, range Actor
4. Data-Schema Coherence
[Figure: the example graph from slide 2 shown together with the property definitions from slide 3 (network: domain Broadcaster, range Broadcaster; starring: domain Work, range Actor).]
5. Data-Schema Coherence
[Figure: same graph and property definitions as on slide 4, with two domain/range checks marked ✔.]
6. Data-Schema Coherence
[Figure: same graph and property definitions as on slide 4, with two domain/range checks marked ✔ and one marked ✘.]
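A minimal sketch of what a data-schema coherence check could look like, assuming a toy single-parent type hierarchy and the two property definitions from slide 3. The resource identifiers (`show`, `cobie`, `net`), the type assigned to `net`, and the helper names are hypothetical and only illustrate the idea; they are not taken from the slides.

```python
# Toy schema (assumption: each type has at most one parent).
PARENT = {
    "Actor": "Person", "TVShow": "Work", "Broadcaster": "Organisation",
    "Person": "Thing", "Work": "Thing", "Organisation": "Thing",
}
DOMAIN = {"starring": "Work", "network": "Broadcaster"}
RANGE = {"starring": "Actor", "network": "Broadcaster"}


def ancestors(t):
    """Return t together with all of its super-types."""
    out = {t}
    while t in PARENT:
        t = PARENT[t]
        out.add(t)
    return out


def conforms(resource_types, required):
    """True if at least one declared type is the required type or a sub-type of it."""
    return any(required in ancestors(t) for t in resource_types)


def check(triple, types):
    """Return (domain_ok, range_ok) for one (subject, property, object) triple."""
    s, p, o = triple
    return conforms(types[s], DOMAIN[p]), conforms(types[o], RANGE[p])


# Hypothetical instance data.
TYPES = {"show": {"TVShow"}, "cobie": {"Actor"}, "net": {"Broadcaster"}}

starring_result = check(("show", "starring", "cobie"), TYPES)
network_result = check(("show", "network", "net"), TYPES)
```

Under these toy assumptions the starring triple passes both checks, while the network triple fails the domain check because a TV show is not a Broadcaster.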
7. Incoherences in Real KBs
Property             Domain Incoherences   % Domain Incoherences
dpo:years            ~641k                 100%
dpo:currentMember    ~260k                 100%
…                    …                     …

Property             Domain Incoherences   % Domain Incoherences
fb:[…]object.type    ~99M                  61%
fb:[…]object.name    ~41M                  100%
…                    …                     …
8. Data-Driven Domains/Ranges
• Just intersect the types of all resources appearing as subject/object…
• …being consistent with the type hierarchy.
[Figure: the type hierarchy from slide 3. Thing → Person → Actor; Thing → Work → TV show; Thing → Organisation → Broadcaster; ...]
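The two bullets above can be sketched as follows, assuming a single-parent hierarchy stored as a child-to-parent map. Each subject's types are first closed under the hierarchy (so an Actor also counts as a Person and a Thing), the closed sets are intersected, and the deepest surviving types are returned. The function names and the toy hierarchy are assumptions made for illustration.

```python
# Hypothetical toy hierarchy: child -> parent (None marks the root).
PARENT = {
    "Thing": None, "Person": "Thing", "Work": "Thing",
    "Actor": "Person", "TVShow": "Work",
}


def closure(types, parent):
    """Add every super-type of the given types."""
    out = set()
    for t in types:
        while t is not None:
            out.add(t)
            t = parent.get(t)
    return out


def depth(t, parent):
    """Number of edges between t and the root."""
    d = 0
    while parent.get(t) is not None:
        t = parent[t]
        d += 1
    return d


def data_driven_domain(subject_types, parent):
    """Most specific types shared by all subjects (expects a non-empty list)."""
    common = set.intersection(*(closure(ts, parent) for ts in subject_types))
    deepest = max(depth(t, parent) for t in common)
    return {t for t in common if depth(t, parent) == deepest}


domain_a = data_driven_domain([{"Actor"}, {"Person"}], PARENT)
domain_b = data_driven_domain([{"Actor"}, {"TVShow"}], PARENT)
```

When the subjects come from unrelated branches, as in the second call, the intersection collapses to the root type, which is the over-general outcome illustrated by the dpo:manager example.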
12. LEXT: an Example
Computing the domain of dpo:manager
dpo:manager is used in two different contexts
[Figure: type tree with a value on each node. Thing 1.00; SportSeason 0.55 → SportsTeamSeason 0.55 → SoccerClubSeason 0.55; Agent 0.44 → Organisation 0.44 → SportsTeam 0.44; further nodes Soccer, Cricket, Rugby, Baseball, ... whose values are not legible in this transcript.]
manager (Thing) → soccer club season manager (SoccerClubSeason), sports team manager (SportsTeam)
13. LEXT: an Example
Computing the domain of dpo:manager
[Figure: same type tree and dpo:manager example as on slide 12.]
LEXT: visit the hierarchy until:
1) Pr(type | property) ≥ λ
&&
2) H(Pr(property | children)) < η
14. LEXT: an Example
Computing the domain of dpo:manager
[Figure: same type tree and dpo:manager example as on slide 12, now annotated with the entropy values H = 1.96 and H = 0.9.]
LEXT: visit the hierarchy until:
1) Pr(type | property) ≥ λ
&&
2) H(Pr(property | children)) < η
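A rough sketch of one literal reading of the stopping rule above; the slides do not spell out the traversal, so several choices here are assumptions: types with Pr(type | property) below λ are pruned, leaf types always stop the visit, and the entropy is taken over the children's Pr(property | child) values after normalising them to sum to 1. The function names, the simplified two-level hierarchy, the child type names, and all Pr(property | type) numbers are invented for illustration.

```python
import math


def entropy(weights):
    """Shannon entropy in bits of non-negative weights, normalised to sum to 1."""
    total = sum(weights)
    if total == 0:
        return 0.0
    probs = [w / total for w in weights if w > 0]
    return -sum(p * math.log2(p) for p in probs)


def find_contexts(root, children, pr_type, pr_prop, lam=0.1, eta=1.0):
    """Return the types at which the visit stops (candidate domains)."""
    contexts = []
    stack = [root]
    while stack:
        t = stack.pop()
        if pr_type.get(t, 0.0) < lam:
            continue  # assumption: rarely used types are pruned
        kids = children.get(t, [])
        h = entropy([pr_prop.get(c, 0.0) for c in kids])
        if not kids or h < eta:
            contexts.append(t)  # both conditions hold (or t is a leaf): stop here
        else:
            stack.extend(kids)  # otherwise keep descending
    return contexts


# Hypothetical, simplified hierarchy and probabilities.
CHILDREN = {
    "Thing": ["SportSeason", "SportsTeam"],
    "SportsTeam": ["SoccerClub", "CricketTeam", "RugbyClub", "BaseballTeam"],
}
PR_TYPE = {"Thing": 1.0, "SportSeason": 0.55, "SportsTeam": 0.44}
PR_PROP = {
    "SportSeason": 0.5, "SportsTeam": 0.5,
    "SoccerClub": 0.9, "CricketTeam": 0.05, "RugbyClub": 0.03, "BaseballTeam": 0.02,
}

contexts = find_contexts("Thing", CHILDREN, PR_TYPE, PR_PROP)
```

With these made-up numbers the visit does not stop at Thing (its children have entropy exactly 1, which is not below η = 1) and returns SportSeason and SportsTeam as the two contexts. The actual method may define the probabilities and the descent differently.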
15. REXT and LERIXT
• REXT = LEXT but with types of object resources
• LERIXT = LEXT + REXT
  • two type trees (one for Domain and one for Range), current state is a pair (subject type, object type)
[Figure: two copies of the type tree from the LEXT example (Thing; SportSeason, Agent, ...; SportsTeamSeason, Organisation; SoccerClubSeason, SportsTeam; Soccer, Cricket, Rugby, Baseball, ...), with the current state marked.]
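To make the pair state concrete, here is a small sketch that tabulates how often each (subject type, object type) pair occurs for one property. It only shows the bookkeeping behind such a state; how LERIXT moves between states is not described on the slide and is not modelled here. The function name and the toy triples are hypothetical.

```python
from collections import Counter


def pair_counts(triples, types, prop):
    """Count (subject type, object type) pairs over the triples of one property."""
    counts = Counter()
    for s, p, o in triples:
        if p != prop:
            continue
        for st in types.get(s, ()):
            for ot in types.get(o, ()):
                counts[(st, ot)] += 1
    return counts


# Hypothetical instance data.
TRIPLES = [
    ("season1", "manager", "m1"),
    ("team1", "manager", "m2"),
    ("season1", "name", "x"),
]
TYPES = {
    "season1": {"SoccerClubSeason"}, "team1": {"SportsTeam"},
    "m1": {"Person"}, "m2": {"Person"},
}

counts = pair_counts(TRIPLES, TYPES, "manager")
```

Because every combination of a subject type and an object type is a separate state, the number of candidate contexts grows multiplicatively compared with looking at only one side of the property.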
17. Evaluation
• Fixed λ = 0.1, η = 1
• 3 authors + 2 experts (majority vote) evaluated the output of LEXT, REXT, and LERIXT.
• LERIXT generates too many new sub-properties
            LEXT     REXT     LERIXT
Precision   96.50%   91.40%   87.00%
Table 2: Precision of LEXT, REXT, and LERIXT
18. Conclusion
• Three different methods for identifying contexts
  • LEXT: exploits the type of the subject resources
  • REXT: exploits the type of the object resources
  • LERIXT: exploits both
• Up to 96.50% precision.