Linked Data, Ontologies and Inference


Presented at the New York SemWeb Meetup, April 2013

  1. 1. Linked Data, Ontologies and Inference
      Barry Norton, Solutions Architect, Ontotext (UK), London
      SemWeb Meet-up, NYC, April 2013
  2. 2. Linked Data
      • Defined in a W3C Technical Note including these core principles:
        1. Use URIs as names for things.
        2. Use HTTP URIs so that people can look up those names.
        3. When someone looks up a URI, provide useful information, using the standards (RDF*, SPARQL).
        4. Include links to other URIs, so that they can discover more things.
  3. 3. Linked Open Data
      • The Linking Open Data (LOD) project of the W3C Semantic Web Outreach and Education Task Force has developed a good deal of best practice and exposed a large number of interlinked datasets.
  4. 4. Linked Data Vision
      • Many datasets – variety of publishers
      • Re-using URIs enables Linked Data
      • Browse using URIs to datasets
  5. 5. FactForge and LinkedLifeData
      [Figure: Linking Open Data cloud diagram, by Richard Cyganiak and Anja Jentzsch.]
  6. 6. • FactForge (indicated in red on the next slide)
        – Some of the central LOD datasets
        – General-knowledge information
        – 1.2B explicit plus 0.9B inferred indexed statements, 10B retrievable statements
        – Contents
      • Linked Life Data (indicated in yellow)
        – 25 of the most popular life-science datasets
        – Complemented by gluing ontologies
        – 2.7B explicit and 1.4B inferred, a total of 4.1B indexed statements
  7. 7. FactForge
      • Datasets: DBpedia, Freebase, Geonames, UMBEL, MusicBrainz, Wordnet, CIA World Factbook, Lingvoj
      • Ontologies: Dublin Core, SKOS, RSS, FOAF
      • Inference: materialization with respect to OWL 2 RL
        – owl:sameAs optimization in BigOWLIM allows reduction of the indices without loss of semantics, with big gains in performance
      • Free public service at,
        – Incremental URI auto-suggest
        – Query and explore through Forest and Tabulator
        – RDF Search: retrieve ranked list of URIs by keywords
        – SPARQL end-point
  8. 8. FactForge: Datasets

      | Dataset | Explicit indexed triples (000) | Inferred indexed triples (000) | Total stored triples (000) | Entities (000 of nodes in the graph) | Inferred closure ratio |
      |---|---|---|---|---|---|
      | Schemata and ontologies | 11 | 7 | 18 | 6 | 0.6 |
      | DBpedia (categories) | 2,877 | 42,587 | 45,464 | 1,144 | 14.8 |
      | DBpedia (sameAs) | 5,544 | 566 | 6,110 | 8,464 | 0.1 |
      | UMBEL | 5,162 | 42,212 | 47,374 | 500 | 8.2 |
      | Lingvoj | 20 | 863 | 883 | 18 | 43.8 |
      | CIA Factbook | 76 | 4 | 80 | 25 | 0.1 |
      | Wordnet | 2,281 | 9,296 | 11,577 | 830 | 4.1 |
      | Geonames | 91,908 | 125,025 | 216,933 | 33,382 | 1.4 |
      | DBpedia core | 560,096 | 198,043 | 758,139 | 127,931 | 0.4 |
      | Freebase | 463,689 | 40,840 | 504,529 | 94,810 | 0.1 |
      | MusicBrainz | 45,536 | 421,093 | 466,630 | 15,595 | 9.2 |
      | Total | 1,177,961 | 881,224 | 2,058,185 | 283,253 | 0.7 |
  9. 9. Querying Linked Data
      Presented by: Barry Norton
  10. 10. Motivation: Music!
      [Figure: architecture diagram. Recoverable labels: Application (Visualization Module, Analysis & Mining Module); Access (LD Dataset, Vocabulary, SPARQL Endpoint, Publishing, RDFa); Integrated Dataset (Interlinking, Cleansing, Vocabulary Mapping); Data acquisition (Physical Wrapper, LD Wrapper, D2R Transf., RDF/XML, Downloads); sources: Metadata, Streaming providers, Musical Content, Other content.]
  11. 11. Extracting the Data
      • The data of interest may be stored in a wide range of formats:
        – Spreadsheets or tabular data
        – Databases
        – Text
      • Several tools support the process of mining data from different repositories, for example: R2RML
      EUCLID - Providing Linked Data
  12. 12. Reasoning for Linked Data Integration
      • Example: Integration of the MusicBrainz data set and the DBpedia data set
      [Figure: two data sets combined through integration.]
      EUCLID - Querying Linked Data
  13. 13. Reasoning for Linked Data Integration
      MusicBrainz data set:
        mo:b10bbbfc-cf9e-42e0-be17-e2c3e1d2600d
          foaf:name "The Beatles";
          mo:member mo:ba550d0e-adac-4864-b88b-407cab5e76af;
          mo:member mo:4d5447d7-c61c-4120-ba1b-d7f471d385b9;
          mo:member mo:42a8f507-8412-4611-854f-926571049fa0;
          mo:member mo:300c4c73-33ac-4255-9d57-4e32627f5e13.
      DBpedia data set:
        dbpedia:The_Beatles
          dbpedia-ont:origin dbpedia:Liverpool;
          dbpedia-ont:genre dbpedia:Rock_music;
          foaf:depiction .
      [Figure: the two resources are joined by a "same" link, integrating the two data sets.]
  14. 14. Reasoning for Linked Data Integration
      Data: as on the previous slide (the MusicBrainz and DBpedia descriptions of The Beatles, joined by a "same" link).
      Query:
        SELECT ?m ?g WHERE {
          dbpedia:The_Beatles dbpedia-ont:genre ?g;
                              mo:member ?m.
        }
      Result set:

      | ?m | ?g |
      |---|---|
      | mo:ba550d0e-adac-4864-b88b-407cab5e76af | dbpedia:Rock_music |
      | mo:4d5447d7-c61c-4120-ba1b-d7f471d385b9 | dbpedia:Rock_music |
      | mo:42a8f507-8412-4611-854f-926571049fa0 | dbpedia:Rock_music |
      | mo:300c4c73-33ac-4255-9d57-4e32627f5e13 | dbpedia:Rock_music |
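The integration step on this slide can be sketched in plain Python. This is a minimal illustration, not how a triple store implements it: triples are modelled as tuples, the identifiers `mo:beatles`, `mo:member-1` and `mo:member-2` are shortened placeholders for the MusicBrainz IDs, and the "same" link is assumed to be `owl:sameAs`. Only statements in subject position are copied across the link.

```python
# Triples as (subject, predicate, object); shortened IDs are placeholders.
triples = {
    ("mo:beatles", "mo:member", "mo:member-1"),
    ("mo:beatles", "mo:member", "mo:member-2"),
    ("dbpedia:The_Beatles", "dbpedia-ont:genre", "dbpedia:Rock_music"),
    ("mo:beatles", "owl:sameAs", "dbpedia:The_Beatles"),  # assumed "same" link
}

def same_as_closure(triples):
    """Copy statements across owl:sameAs links (subject position only)."""
    inferred = set(triples)
    changed = True
    while changed:
        changed = False
        links = [(s, o) for s, p, o in inferred if p == "owl:sameAs"]
        for a, b in links:
            for x, y in ((a, b), (b, a)):  # sameAs works in both directions
                for s, p, o in list(inferred):
                    if s == x and p != "owl:sameAs" and (y, p, o) not in inferred:
                        inferred.add((y, p, o))
                        changed = True
    return inferred

def members_and_genres(triples, band):
    """Rough equivalent of the slide's SELECT ?m ?g query."""
    members = sorted(o for s, p, o in triples if s == band and p == "mo:member")
    genres = sorted(o for s, p, o in triples if s == band and p == "dbpedia-ont:genre")
    return [(m, g) for m in members for g in genres]

without_inference = members_and_genres(triples, "dbpedia:The_Beatles")
with_inference = members_and_genres(same_as_closure(triples), "dbpedia:The_Beatles")
```

Without the closure the query has no rows, because the member statements hang off the MusicBrainz resource only; after the closure every member is paired with the genre.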
  15. 15. SPARQL 1.1: Entailment Regimes
      • SPARQL 1.0 was defined only for simple entailment (pattern matching)
      • SPARQL 1.1 is extended with entailment regimes other than simple entailment:
        – RDF entailment
        – RDFS entailment
        – D-Entailment
        – OWL RL entailment
        – OWL Full entailment
        – OWL 2 DL, EL, and QL entailment
        – RIF entailment
      Source:
  16. 16. RDFS: Resource Description Framework Schema
      Taxonomies and inferences
      [Figure: Semantic Web Stack, Berners-Lee (2006).]
  17. 17. RDFS Entailment Regimes
      • Contains 13 entailment rules, named rdfs1 to rdfs13, for inference over RDFS definitions*:
        – rdfs:Literal (rdfs1, rdfs13)
        – rdfs:domain (rdfs2), rdfs:range (rdfs3)
        – rdfs:Resource (rdfs4a, rdfs4b, rdfs8)
        – rdfs:subPropertyOf (rdfs5, rdfs6, rdfs7, rdfs12)
        – rdfs:Class (rdfs8, rdfs10)
        – rdfs:subClassOf (rdfs9, rdfs10, rdfs11)
        – rdfs:ContainerMembershipProperty (rdfs12)
        – rdfs:Datatype (rdfs13)
      * Source:
  18. 18. rdfs2 – rdfs:domain
      Data: dbpedia:The_Beatles mo:member dbpedia:Paul_McCartney, dbpedia:John_Lennon, dbpedia:George_Harrison, dbpedia:Ringo_Starr .
      Schema: mo:member rdfs:domain mo:MusicGroup .
      Query: SELECT ?x WHERE { ?x a mo:MusicGroup. }
      Result set: ?x = (no rows)
      Result set with inference: ?x = dbpedia:The_Beatles, …
  19. 19. rdfs3 – rdfs:range
      Data: dbpedia:The_Beatles dbpedia-ont:bandMember dbpedia:Paul_McCartney, dbpedia:John_Lennon, dbpedia:George_Harrison, dbpedia:Ringo_Starr .
      Schema: dbpedia-ont:bandMember rdfs:range foaf:Agent .
      Query: SELECT ?x WHERE { ?x a foaf:Agent. }
      Result set: ?x = (no rows)
      Result set with inference: ?x = dbpedia:Paul_McCartney, dbpedia:John_Lennon, dbpedia:Ringo_Starr, dbpedia:George_Harrison, …
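The two rules on the slides above (rdfs2 for domains, rdfs3 for ranges) can be sketched as a single pass over a set of triples. This is an illustrative sketch only; for brevity it declares both a domain and a range on `mo:member`, and the function names are invented for the example.

```python
triples = {
    ("dbpedia:The_Beatles", "mo:member", "dbpedia:Paul_McCartney"),
    ("dbpedia:The_Beatles", "mo:member", "dbpedia:John_Lennon"),
    ("dbpedia:The_Beatles", "mo:member", "dbpedia:George_Harrison"),
    ("dbpedia:The_Beatles", "mo:member", "dbpedia:Ringo_Starr"),
    ("mo:member", "rdfs:domain", "mo:MusicGroup"),
    ("mo:member", "rdfs:range", "foaf:Agent"),
}

def apply_domain_range(triples):
    """rdfs2: subjects get the domain class; rdfs3: objects get the range class."""
    inferred = set(triples)
    for prop, kind, cls in triples:
        if kind not in ("rdfs:domain", "rdfs:range"):
            continue
        for s, p, o in triples:
            if p == prop:
                typed = s if kind == "rdfs:domain" else o
                inferred.add((typed, "rdf:type", cls))
    return inferred

def instances_of(triples, cls):
    """Rough equivalent of SELECT ?x WHERE { ?x a <cls> }."""
    return sorted(s for s, p, o in triples if p == "rdf:type" and o == cls)

inferred = apply_domain_range(triples)
groups = instances_of(inferred, "mo:MusicGroup")
agents = instances_of(inferred, "foaf:Agent")
```

No `rdf:type` statement is asserted in the input, so both queries are empty without inference; with it, the band is typed by the domain and the four members by the range.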
  20. 20. rdfs7 – rdfs:subPropertyOf
      Data: dbpedia:Yesterday mo:singer dbpedia:Paul_McCartney ;
            mo:performer dbpedia:John_Lennon, dbpedia:George_Harrison, dbpedia:Ringo_Starr .
      Schema: mo:singer rdfs:subPropertyOf mo:performer .
      Query: SELECT ?x WHERE { dbpedia:Yesterday mo:performer ?x. }
      Result set: ?x = dbpedia:John_Lennon, dbpedia:Ringo_Starr, dbpedia:George_Harrison
      Result set with inference: ?x = dbpedia:John_Lennon, dbpedia:Ringo_Starr, dbpedia:George_Harrison, dbpedia:Paul_McCartney
  21. 21. rdfs9 – rdfs:subClassOf
      Data: dbpedia:The_Beatles rdf:type mo:MusicGroup .
      Schema: mo:MusicGroup rdfs:subClassOf mo:MusicArtist .
      Query: SELECT ?x WHERE { ?x a mo:MusicArtist. }
      Result set: ?x = (no rows)
      Result set with inference: ?x = dbpedia:The_Beatles, … (via the inferred dbpedia:The_Beatles rdf:type mo:MusicArtist)
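Rules rdfs7 and rdfs9 from the two slides above can be sketched together as a small fixpoint loop, repeating until no new triple appears. This is a naive illustration under the assumption that triples are plain tuples; real reasoners index the data rather than comparing all pairs.

```python
triples = {
    ("dbpedia:Yesterday", "mo:singer", "dbpedia:Paul_McCartney"),
    ("dbpedia:Yesterday", "mo:performer", "dbpedia:John_Lennon"),
    ("dbpedia:Yesterday", "mo:performer", "dbpedia:George_Harrison"),
    ("dbpedia:Yesterday", "mo:performer", "dbpedia:Ringo_Starr"),
    ("mo:singer", "rdfs:subPropertyOf", "mo:performer"),
    ("dbpedia:The_Beatles", "rdf:type", "mo:MusicGroup"),
    ("mo:MusicGroup", "rdfs:subClassOf", "mo:MusicArtist"),
}

def rdfs7_rdfs9_closure(triples):
    """Apply rdfs7 (subPropertyOf) and rdfs9 (subClassOf) until nothing changes."""
    inferred = set(triples)
    while True:
        new = set()
        for s, p, o in inferred:
            for s2, p2, o2 in inferred:
                if p2 == "rdfs:subPropertyOf" and p == s2:
                    new.add((s, o2, o))            # rdfs7
                if p2 == "rdfs:subClassOf" and p == "rdf:type" and o == s2:
                    new.add((s, "rdf:type", o2))   # rdfs9
        if new <= inferred:
            return inferred
        inferred |= new

inferred = rdfs7_rdfs9_closure(triples)
performers = sorted(
    o for s, p, o in inferred if s == "dbpedia:Yesterday" and p == "mo:performer"
)
artists = sorted(
    s for s, p, o in inferred if p == "rdf:type" and o == "mo:MusicArtist"
)
```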
  22. 22. Inference from Schema
      • Knowledge encoded in the schema leads to the inference of new facts
        Schema:
          mo:MusicGroup rdfs:subClassOf mo:MusicArtist .
        Inferred facts:
          mo:MusicGroup a rdfs:Class .
          mo:MusicArtist a rdfs:Class .
      • This is also captured in the set of axiomatic triples, which provide basic meaning for all the vocabulary terms
          rdfs:subClassOf rdfs:domain rdfs:Class .
          rdfs:subClassOf rdfs:range rdfs:Class .
  23. 23. RDFS: Lack of Consistency Check
      • It is possible to infer facts that seem incorrect, but RDFS cannot prevent this:
        Schema:
          mo:member rdfs:domain mo:MusicGroup ;
                    rdfs:range foaf:Agent .
        Existing facts:
          :PaulMcCartney a :SoloMusicArtist ;
                         :member :TheBeatles .
        Inferred facts (rdfs2):
          :PaulMcCartney a :MusicGroup .
      • No contradiction! The mis-modeling is not diagnosed
  24. 24. RDFS: Inference Limitations
      • We might wish for further inferences, but these are beyond the entailment rules implemented by RDFS
        Schema:
          foaf:knows rdfs:domain foaf:Person ;
                     rdfs:range foaf:Person .
          foaf:made rdfs:domain foaf:Agent .
        Existing fact:
          :PaulMcCartney foaf:made :Yesterday ;
                         foaf:knows :RingoStarr .
        Inferred facts:
          :PaulMcCartney a foaf:Agent ;
                         a foaf:Person .
          :RingoStarr a foaf:Person .
        NOT inferred:
          :Yesterday dc:creator :PaulMcCartney .
          :RingoStarr foaf:knows :PaulMcCartney .
      • Cannot model with RDFS that 'x knows y' implies 'y knows x'
      • Cannot model with RDFS that 'x makes y' implies 'the creator of y is x'
      • These inferences require OWL!
  25. 25. OWL: Web Ontology Language
      Ontologies and inferences
      [Figure: Semantic Web Stack, Berners-Lee (2006).]
  26. 26. Introduction to OWL
      • Provides more ontological constructs and avoids some of the potential confusion in RDFS
      • OWL 2 is divided into sub-languages called profiles (more restrictive than OWL DL):
        – OWL 2 EL: Limited to basic classification, but with polynomial-time reasoning
        – OWL 2 QL: Designed to be translatable to relational database querying
        – OWL 2 RL: Designed to be efficiently implementable in rule-based systems
      • Most triple stores concentrate on the use of RDFS with a subset of OWL features, called OWL-Horst or RDFS++
  27. 27. OWL Properties
      OWL distinguishes between two types of properties:
      • OWL ObjectProperties: resources as values
      • OWL DatatypeProperties: literals as values
          :plays rdf:type owl:ObjectProperty;
                 rdfs:domain :Musician;
                 rdfs:range :Instrument .
          :hasMembers rdf:type owl:DatatypeProperty;
                      rdfs:domain :MusicGroup;
                      rdfs:range xsd:int .
  28. 28. Property Axioms
      • Property axioms include those from RDF Schema
      • OWL allows for property equivalence. Example:
          EquivalentObjectProperties(dbpedia-ont:bandMember mo:member)
          ≡ dbpedia-ont:bandMember owl:equivalentProperty mo:member .
      Data: dbpedia:The_Beatles mo:member dbpedia:Paul_McCartney, dbpedia:John_Lennon, dbpedia:George_Harrison, dbpedia:Ringo_Starr .
      Query: SELECT ?x { dbpedia:The_Beatles dbpedia-ont:bandMember ?x. }
      Result set: ?x = (no rows)
      Result set with inference: ?x = dbpedia:Paul_McCartney, dbpedia:John_Lennon, dbpedia:Ringo_Starr, dbpedia:George_Harrison
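Property equivalence can be read as "every statement made with one property also holds with the other". A minimal sketch of that reading, assuming tuple triples and an invented helper name:

```python
triples = {
    ("dbpedia:The_Beatles", "mo:member", "dbpedia:Paul_McCartney"),
    ("dbpedia:The_Beatles", "mo:member", "dbpedia:John_Lennon"),
    ("dbpedia:The_Beatles", "mo:member", "dbpedia:George_Harrison"),
    ("dbpedia:The_Beatles", "mo:member", "dbpedia:Ringo_Starr"),
    ("dbpedia-ont:bandMember", "owl:equivalentProperty", "mo:member"),
}

def apply_equivalent_properties(triples):
    """Restate each triple under the equivalent property, in both directions."""
    inferred = set(triples)
    pairs = [(s, o) for s, p, o in triples if p == "owl:equivalentProperty"]
    for a, b in pairs:
        for s, p, o in triples:
            if p == a:
                inferred.add((s, b, o))
            if p == b:
                inferred.add((s, a, o))
    return inferred

def band_members(triples):
    """Rough equivalent of the slide's query over dbpedia-ont:bandMember."""
    return sorted(
        o for s, p, o in triples
        if s == "dbpedia:The_Beatles" and p == "dbpedia-ont:bandMember"
    )

before = band_members(triples)
after = band_members(apply_equivalent_properties(triples))
```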
  29. 29. Property Axioms
      • Property axioms include those from RDF Schema
      • OWL allows for property equivalence. Example:
          EquivalentObjectProperties(dbpedia-ont:bandMember mo:member)
          ≡ dbpedia-ont:bandMember owl:equivalentProperty mo:member .
      • OWL allows for property disjointness. Example:
          DisjointObjectProperties(dbpedia-ont:length mo:duration)
          ≡ dbpedia-ont:length owl:propertyDisjointWith mo:duration .
      • There is no standard for implementing inconsistency reports under SPARQL
  30. 30. Property Axioms (2)
      OWL allows the definition of property characteristics to infer new facts relating to instances and their properties:
      • Symmetry
      • Transitivity
      • Inverse
      • Functional
      • Inverse Functional
  31. 31. Property Axioms: Symmetry
      [Graph: dbpedia:The_Beatles, dbpedia:Plastic_Ono_Band and dbpedia:Billy_Preston connected by :associatedMusicalArtist edges.]
      Schema: :associatedMusicalArtist a owl:SymmetricProperty .
      Query: SELECT ?x WHERE { dbpedia:The_Beatles :associatedMusicalArtist ?x. }
      Result set: ?x = dbpedia:Plastic_Ono_Band
      Result set with inference: ?x = dbpedia:Plastic_Ono_Band, dbpedia:Billy_Preston
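A symmetric property can be sketched as "add the reversed triple". The edge directions in the slide's graph are not recoverable from the transcript, so the data below assumes that the Plastic Ono Band edge starts at The Beatles and the Billy Preston edge points to The Beatles; that assumption reproduces the two result sets shown.

```python
triples = {
    (":associatedMusicalArtist", "rdf:type", "owl:SymmetricProperty"),
    # Assumed edge directions (not stated explicitly in the transcript):
    ("dbpedia:The_Beatles", ":associatedMusicalArtist", "dbpedia:Plastic_Ono_Band"),
    ("dbpedia:Billy_Preston", ":associatedMusicalArtist", "dbpedia:The_Beatles"),
}

def apply_symmetry(triples):
    """For every symmetric property p, add (o, p, s) for each (s, p, o)."""
    symmetric = {
        s for s, p, o in triples
        if p == "rdf:type" and o == "owl:SymmetricProperty"
    }
    return set(triples) | {(o, p, s) for s, p, o in triples if p in symmetric}

def associated_with_beatles(triples):
    return sorted(
        o for s, p, o in triples
        if s == "dbpedia:The_Beatles" and p == ":associatedMusicalArtist"
    )

before = associated_with_beatles(triples)
after = associated_with_beatles(apply_symmetry(triples))
```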
  32. 32. Property Axioms: Transitivity
      [Graph: :Rock, :Heavy_metal, :Punk_rock and :Black_metal connected by :subgenre edges.]
      Schema: :subgenre a owl:TransitiveProperty .
      Query: SELECT ?genre WHERE { :Rock :subgenre ?genre . }
      Result set: ?genre = :Heavy_metal, :Punk_rock
      Result set with inference: ?genre = :Heavy_metal, :Punk_rock, :Black_metal
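Transitivity amounts to computing a transitive closure over the property's edges. The sketch below assumes that `:Black_metal` is attached under `:Heavy_metal`, which the transcript does not state explicitly but which matches the result sets on the slide.

```python
triples = {
    (":Rock", ":subgenre", ":Heavy_metal"),
    (":Rock", ":subgenre", ":Punk_rock"),
    (":Heavy_metal", ":subgenre", ":Black_metal"),  # assumed edge
}

def transitive_closure(triples, prop):
    """Add (a, prop, c) whenever (a, prop, b) and (b, prop, c) hold, to a fixpoint."""
    inferred = set(triples)
    changed = True
    while changed:
        changed = False
        edges = [(s, o) for s, p, o in inferred if p == prop]
        for a, b in edges:
            for c, d in edges:
                if b == c and (a, prop, d) not in inferred:
                    inferred.add((a, prop, d))
                    changed = True
    return inferred

def subgenres_of_rock(triples):
    return sorted(o for s, p, o in triples if s == ":Rock" and p == ":subgenre")

before = subgenres_of_rock(triples)
after = subgenres_of_rock(transitive_closure(triples, ":subgenre"))
```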
  33. 33. Property Axioms: Inverse
      [Graph: dbpedia:John_Lennon and dbpedia:George_Harrison linked to dbpedia:The_Beatles by mo:member_of; dbpedia:The_Beatles linked to dbpedia:Paul_McCartney and dbpedia:Ringo_Starr by mo:member.]
      Schema: mo:member_of owl:inverseOf mo:member .
      Query: SELECT ?x WHERE { ?x mo:member_of dbpedia:The_Beatles . }
      Result set: ?x = dbpedia:John_Lennon, dbpedia:George_Harrison
      Result set with inference: ?x = dbpedia:John_Lennon, dbpedia:George_Harrison, dbpedia:Paul_McCartney, dbpedia:Ringo_Starr
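An inverse declaration lets a statement in one direction answer a query phrased in the other. A minimal sketch, assuming the explicit edges described for the slide's graph (two `mo:member_of` statements and two `mo:member` statements):

```python
triples = {
    ("mo:member_of", "owl:inverseOf", "mo:member"),
    ("dbpedia:John_Lennon", "mo:member_of", "dbpedia:The_Beatles"),
    ("dbpedia:George_Harrison", "mo:member_of", "dbpedia:The_Beatles"),
    ("dbpedia:The_Beatles", "mo:member", "dbpedia:Paul_McCartney"),
    ("dbpedia:The_Beatles", "mo:member", "dbpedia:Ringo_Starr"),
}

def apply_inverse(triples):
    """For a owl:inverseOf b, add (o, b, s) for each (s, a, o) and vice versa."""
    inferred = set(triples)
    for a, p, b in triples:
        if p != "owl:inverseOf":
            continue
        for s, q, o in triples:
            if q == a:
                inferred.add((o, b, s))
            if q == b:
                inferred.add((o, a, s))
    return inferred

def members_of_beatles(triples):
    return sorted(
        s for s, p, o in triples
        if p == "mo:member_of" and o == "dbpedia:The_Beatles"
    )

before = members_of_beatles(triples)
after = members_of_beatles(apply_inverse(triples))
```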
  34. 34. Property Axioms: Functional
      A functional property can have only one (unique) value for each instance.
      Example: Every artist primarily plays only one musical instrument
          mo:primary_instrument rdf:type owl:FunctionalProperty .
          dbpedia:Jimi_Hendrix mo:primary_instrument dbpedia:Electric_Guitar .
          dbpedia:Jimi_Hendrix mo:primary_instrument dbpedia:E-Guitar .
      Conclusion:
          dbpedia:Electric_Guitar owl:sameAs dbpedia:E-Guitar .
  35. 35. Property Axioms: Inverse Functional
      An inverse functional property is useful for specifying unique properties that identify an individual.
      Example: Every recording has a unique ISRC (International Standard Recording Code)
          mo:isrc rdf:type owl:InverseFunctionalProperty .
          mo:21047249-7b3f-4651-acca-246669c081fd mo:isrc "GBAYE6300412" .
          dbpedia:She_Loves_You mo:isrc "GBAYE6300412" .
      Conclusion:
          mo:21047249-7b3f-4651-acca-246669c081fd owl:sameAs dbpedia:She_Loves_You .
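Both characteristics lead to `owl:sameAs` conclusions rather than errors: two values of a functional property for one subject must be the same thing, and two subjects sharing a value of an inverse functional property must be the same thing. A sketch under the assumption that triples are tuples and that each conclusion is returned as an unordered pair:

```python
triples = {
    ("mo:primary_instrument", "rdf:type", "owl:FunctionalProperty"),
    ("dbpedia:Jimi_Hendrix", "mo:primary_instrument", "dbpedia:Electric_Guitar"),
    ("dbpedia:Jimi_Hendrix", "mo:primary_instrument", "dbpedia:E-Guitar"),
    ("mo:isrc", "rdf:type", "owl:InverseFunctionalProperty"),
    ("mo:21047249-7b3f-4651-acca-246669c081fd", "mo:isrc", '"GBAYE6300412"'),
    ("dbpedia:She_Loves_You", "mo:isrc", '"GBAYE6300412"'),
}

def infer_same_as(triples):
    """Return the set of unordered pairs that must denote the same individual."""
    def declared(kind):
        return {s for s, p, o in triples if p == "rdf:type" and o == kind}

    functional = declared("owl:FunctionalProperty")
    inverse_functional = declared("owl:InverseFunctionalProperty")
    same = set()
    for s1, p1, o1 in triples:
        for s2, p2, o2 in triples:
            if p1 != p2:
                continue
            # Functional: same subject, different values => values are the same.
            if p1 in functional and s1 == s2 and o1 != o2:
                same.add(frozenset((o1, o2)))
            # Inverse functional: same value, different subjects => subjects are the same.
            if p1 in inverse_functional and o1 == o2 and s1 != s2:
                same.add(frozenset((s1, s2)))
    return same

same_pairs = infer_same_as(triples)
```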
  36. 36. Individual Axioms
      OWL Individuals represent instances of classes. They are related to their class by the rdf:type property.
      • We can state that two individuals are the same
          SameIndividual(<artist/ba550d0e-adac-4864-b88b-407cab5e76af#_> dbpedia:PaulMcCartney)
          ≡ <artist/ba550d0e-adac-4864-b88b-407cab5e76af#_> owl:sameAs dbpedia:PaulMcCartney .
      • We can state that two individuals are different
          DifferentIndividuals(:TheBeatles_band :TheBeatles_TVseries)
          ≡ :TheBeatles_band owl:differentFrom :TheBeatles_TVseries .
  37. 37. Class Axioms
      Axioms declare general statements about concepts which are used in logical inference (reasoning). Class axioms:
      • Sub-class relationship (from RDF Schema)
      • Equivalent relationship: classes have the same individuals
          EquivalentClasses(:Musician :MusicArtist)
          ≡ :Musician owl:equivalentClass :MusicArtist .
      • Disjointness: classes have no shared individuals
          DisjointClasses(:SoloMusicArtist :MusicGroup)
          ≡ :SoloMusicArtist owl:disjointWith :MusicGroup .
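Disjointness is what turns the earlier RDFS mis-modeling (Paul McCartney inferred to be a music group) into a detectable inconsistency. The sketch below is illustrative only: it takes the facts from the "Lack of Consistency Check" slide, including the inferred type, and reports individuals typed with two disjoint classes. The function name and the report format are invented for the example.

```python
triples = {
    (":SoloMusicArtist", "owl:disjointWith", ":MusicGroup"),
    (":PaulMcCartney", "rdf:type", ":SoloMusicArtist"),
    (":PaulMcCartney", "rdf:type", ":MusicGroup"),  # inferred earlier via rdfs2
    (":TheBeatles", "rdf:type", ":MusicGroup"),
}

def find_disjointness_violations(triples):
    """Return (individual, class_a, class_b) for each clash with owl:disjointWith."""
    disjoint = [(s, o) for s, p, o in triples if p == "owl:disjointWith"]
    types = {}
    for s, p, o in triples:
        if p == "rdf:type":
            types.setdefault(s, set()).add(o)
    return sorted(
        (individual, a, b)
        for individual, classes in types.items()
        for a, b in disjoint
        if a in classes and b in classes
    )

violations = find_disjointness_violations(triples)
```

How such a violation is reported to the user is left open here, in line with the earlier remark that there is no standard for inconsistency reports under SPARQL.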
  38. 38. Class Construction
      • OWL classes are defined by the OWL term owl:Class
      • OWL classes can be subclassed as in RDFS:
          :MusicArtist rdfs:subClassOf :Artist .
        [Figure: Music Artist shown as a subset of Artist.]
      • OWL classes may be combined with class constructs to build new classes
  39. 39. Class Construction (2)
      These class constructs are available in OWL, not in RDFS. NOTE: Anonymous classes!
      • The class of female music artists
          ObjectIntersectionOf(:Female :MusicArtist)
          ≡ [a owl:Class; owl:intersectionOf (:Female :MusicArtist)]
      • The class of music artists
          ObjectUnionOf(:SoloMusicArtist :MusicGroup)
          ≡ [a owl:Class; owl:unionOf (:SoloMusicArtist :MusicGroup)]
      • Everything that's not instrumental music
          ObjectComplementOf(:InstrumentalMusic)
          ≡ [a owl:Class; owl:complementOf :InstrumentalMusic]
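The three constructs correspond to intersection, union and complement over sets of individuals. The sketch below uses hypothetical individuals (`:w` to `:z`, invented for illustration) and a closed, fully known universe. That is a simplification: OWL reasons under the open-world assumption, so a real reasoner would not conclude complement membership merely from missing data.

```python
# Hypothetical individuals, invented purely for illustration.
universe = {":w", ":x", ":y", ":z"}
female = {":w", ":x"}
solo_music_artist = {":x", ":y"}
music_group = {":z"}
instrumental_music = {":w"}

# ObjectUnionOf(:SoloMusicArtist :MusicGroup)
music_artist = solo_music_artist | music_group

# ObjectIntersectionOf(:Female :MusicArtist)
female_music_artist = female & music_artist

# ObjectComplementOf(:InstrumentalMusic), relative to the closed universe
not_instrumental_music = universe - instrumental_music
```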
  40. 40. Naming Class Constructions
      • Direct naming can be achieved via owl:equivalentClass
          EquivalentClasses(:MusicArtist ObjectUnionOf(:SoloMusicArtist :MusicGroup))
          ≡ :MusicArtist owl:equivalentClass [owl:unionOf (:SoloMusicArtist :MusicGroup)]
      • This construction provides necessary and sufficient conditions for class membership
      • Class naming can also be achieved using rdfs:subClassOf, which provides a necessary but not sufficient condition for class membership
  41. 41. For exercises, quizzes and further material visit our website: http://www.euclid-project.eu
      • eBook
      • Course
      Other channels: @euclid_project, EUCLID project, EUCLIDproject