
Fixing the Domain and Range of Properties in Linked Data by Context Disambiguation

Presentation given at LDOW2015

Published in: Science

  1. Fixing the Domain and Range of Properties in Linked Data by Context Disambiguation. Alberto Tonon, Michele Catasta, Gianluca Demartini, Philippe Cudré-Mauroux. LDOW, May 19, 2015.
  2. Linked Data… [Figure: example RDF graph. Literals: "Cobie Smulders", "Neil Patrick Harris", "How I Met Your Mother". Properties: showName, starring (twice), name (twice), network, type. Types: TV Show, Work, Person, Actor, TV Network, Broadcast Network.]
  3. … and its Schema
     Type hierarchy: Thing, Person, Work, TV show, Actor, Organisation, Broadcaster, ...
     Property definitions:
       network: domain Broadcaster, range Broadcaster
       starring: domain Work, range Actor
  4. Data-Schema Coherence (slides 4 to 6 build up one figure) [Figure: the example graph from slide 2 shown next to the property definitions from slide 3 (network: domain Broadcaster, range Broadcaster; starring: domain Work, range Actor). In the final build, two check marks (✔ ✔) and one cross (✘) are overlaid on the graph.]
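A coherence check of this kind can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the parent links in `PARENT` are one plausible reading of the type hierarchy on slide 3, the `ex:` resource names and their types are hypothetical toy data, and `check` is not taken from the authors' implementation.

```python
# Toy schema. The parent links are an assumed reading of the slide's hierarchy.
PARENT = {
    "Person": "Thing", "Work": "Thing", "Organisation": "Thing",
    "Actor": "Person", "TV show": "Work", "Broadcaster": "Organisation",
}
DOMAIN = {"starring": "Work", "network": "Broadcaster"}
RANGE = {"starring": "Actor", "network": "Broadcaster"}

# Hypothetical instance data, loosely modelled on the example graph.
TYPES = {
    "ex:HIMYM": {"TV show"},
    "ex:Cobie_Smulders": {"Actor"},
    "ex:Neil_Patrick_Harris": {"Actor"},
    "ex:SomeNetwork": {"Broadcaster"},
}
TRIPLES = [
    ("ex:HIMYM", "starring", "ex:Cobie_Smulders"),
    ("ex:HIMYM", "starring", "ex:Neil_Patrick_Harris"),
    ("ex:HIMYM", "network", "ex:SomeNetwork"),
]


def ancestors_or_self(type_name):
    """Return the type together with all of its ancestors in the hierarchy."""
    found = {type_name}
    while type_name in PARENT:
        type_name = PARENT[type_name]
        found.add(type_name)
    return found


def has_type(resource, expected):
    """True if any declared type of the resource is `expected` or a subtype of it."""
    return any(expected in ancestors_or_self(t) for t in TYPES.get(resource, ()))


def check(triple):
    """Return (domain_ok, range_ok) for one (subject, property, object) triple."""
    subject, prop, obj = triple
    return has_type(subject, DOMAIN[prop]), has_type(obj, RANGE[prop])


def incoherent_triples(triples):
    """Keep the triples that violate the declared domain or range."""
    return [t for t in triples if not all(check(t))]
```

In this toy data both `starring` triples pass, while the `network` triple fails the domain check because its subject is a TV show rather than a Broadcaster.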
  7. Incoherences in Real KBs
     Property             Dom Incoherences   % Dom Incoherences
     dpo:years            ~641k              100%
     dpo:currentMember    ~260k              100%
     …                    …                  …

     Property             Dom Incoherences   % Dom Incoherences
     fb:[…]object.type    ~99M               61%
     fb:[…]object.name    ~41M               100%
     …                    …                  …
  8. Data-Driven Domains/Ranges
     • Just intersect the types of all resources appearing as subject/object…
     • …being consistent with the type hierarchy.
     [Figure: type hierarchy with Thing, Person, Work, TV show, Actor, Organisation, Broadcaster, ...]
  9. Data-Driven Domains/Ranges
     • Dom(foaf:name) = Thing → Everything has a name (acceptable)
     • Dom(dpo:manager) = Thing → Everything has a manager (too generic)
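The naive intersection approach, and why it collapses to Thing, can be illustrated with a short Python sketch. The parent links below are an assumption (they follow the type names that appear in the dpo:manager example), and `naive_domain` is a hypothetical helper, not the authors' code.

```python
# Assumed parent links for the types that appear in the dpo:manager example.
PARENT = {
    "SportSeason": "Thing",
    "SportsTeamSeason": "SportSeason",
    "SoccerClubSeason": "SportsTeamSeason",
    "Agent": "Thing",
    "Organisation": "Agent",
    "SportsTeam": "Organisation",
}


def closure(type_name):
    """List the type and all of its ancestors, from the type up to the root."""
    chain = [type_name]
    while type_name in PARENT:
        type_name = PARENT[type_name]
        chain.append(type_name)
    return chain


def naive_domain(subject_types):
    """Intersect the type closures of all subjects and return the deepest common type."""
    if not subject_types:
        raise ValueError("need at least one subject type")
    common = set.intersection(*(set(closure(t)) for t in subject_types))
    # In a tree the common ancestors form a single chain, so the deepest one is unique.
    return max(common, key=lambda t: len(closure(t)))
```

With subjects typed `SoccerClubSeason` and `SportsTeam`, the only shared ancestor is `Thing`, which is the over-general domain shown on the slide. With subjects drawn from a single branch, the result stays specific.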
  10. LEXT: an Example. Computing the domain of dpo:manager (slides 10 to 14 build up one figure)
      [Figure: type tree annotated with Pr(type | property): Thing 1.00; SportSeason 0.55; Agent 0.44; SportsTeamSeason 0.55; Organisation 0.44; SoccerClubSeason 0.55; SportsTeam 0.44; further nodes with truncated labels and values: Soccer…, Cricket…, Rugby…, Baseball…; entropy annotations H = 1.96 and H = 0.9.]
      dpo:manager is used in two different contexts.
      Resulting properties and domains: manager (Thing), with sub-properties soccer club season manager (SoccerClubSeason) and sports team manager (SportsTeam).
      LEXT: visit the hierarchy until: 1) Pr(type | property) ≥ λ && 2) H(Pr(property | children)) < η
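The two quantities in the LEXT criterion can be computed as in the following Python sketch. It only evaluates the test literally as written on the slide and does not reproduce the tree traversal. Several points are assumptions: entropy is measured in bits, the distribution over children is obtained by normalising per-child usage counts, and the function names are hypothetical.

```python
import math


def shannon_entropy(probabilities):
    """Entropy in bits of a discrete distribution; zero entries contribute nothing."""
    return -sum(p * math.log2(p) for p in probabilities if p > 0)


def normalise(counts):
    """Turn non-negative counts into a probability distribution."""
    total = sum(counts)
    if total == 0:
        raise ValueError("counts must not all be zero")
    return [c / total for c in counts]


def meets_slide_criterion(type_prob, child_counts, lam, eta):
    """Literal reading of the slide: Pr(type | property) >= lam and H(children) < eta.

    type_prob    -- Pr(type | property) for the node being visited
    child_counts -- how often the property occurs with each child type (assumed input)
    """
    if sum(child_counts) == 0:
        entropy = 0.0  # assumption: a node with no usage among children has zero entropy
    else:
        entropy = shannon_entropy(normalise(child_counts))
    return type_prob >= lam and entropy < eta
```

For instance, a node with Pr(type | property) = 0.55 whose usage is concentrated in one child has low entropy and passes with λ = 0.1 and η = 1, whereas a node whose usage is spread evenly over four children has an entropy of 2 bits and does not.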
  15. REXT and LERIXT
      • REXT = LEXT but with types of object resources
      • LERIXT = LEXT + REXT
      • two type trees (one for Domain and one for Range); the current state is a pair (subject type, object type)
      [Figure: two copies of the type tree (Thing, SportSeason, Agent, SportsTeamSeason, Organisation, SoccerClubSeason, SportsTeam, ...) with the current state highlighted.]
  16. About λ [Figure 1: Coverage and number of new sub-properties varying λ.]
  17. Evaluation
      • Fixed λ = 0.1, η = 1
      • 3 authors + 2 experts (majority vote) evaluated the output of LEXT, REXT, and LERIXT.
      • LERIXT generates too many new sub-properties
      Table 2: Precision of LEXT, REXT, and LERIXT
                  LEXT     REXT     LERIXT
      Precision   96.50%   91.40%   87.00%
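How precision follows from majority-voted judgements can be sketched as below. The vote lists used in the usage note are made-up illustrations, not the evaluation data, and the helper names are hypothetical.

```python
def majority(votes):
    """True if strictly more than half of the boolean votes are positive."""
    return sum(votes) * 2 > len(votes)


def precision(judgements):
    """Fraction of generated sub-properties that the majority of judges accepted.

    judgements -- one list of boolean votes per generated sub-property
    """
    if not judgements:
        raise ValueError("no judgements given")
    accepted = sum(1 for votes in judgements if majority(votes))
    return accepted / len(judgements)
```

With five judges per item, a sub-property counts as correct when at least three of them accept it. For example, four items of which three reach a majority give a precision of 0.75.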
  18. Conclusion
      • Three different methods for identifying contexts
        • LEXT: exploits the type of the subject resources
        • REXT: exploits the type of the object resources
        • LERIXT: exploits both
      • Up to 96.50% precision.
      LEXT: visit the hierarchy until: 1) Pr(type | property) ≥ λ && 2) H(Pr(property | children)) < η
