Jérôme Euzenat - Ontology Matching

Jerome Euzenat's presentation at SSSW 2012

Transcript

  • 1. Slide 1: Ontology matching
      Jérôme Euzenat, Montbonnot, France
      Jerome.Euzenat@inrialpes.fr
      Thanks to Pavel Shvaiko and Natasha Noy for our collaboration on former versions of these slides.

      Slide 2: What you have learned so far
      Data can be expressed in RDF,
      linked through URIs,
      modelled with OWL ontologies, and
      retrieved through SPARQL queries.

      Slide 3: Being serious about the semantic web
      It is not one person's ontology.
      It is not several people's common ontology.
      It is many people's many ontologies.
      So it is a mess, but a meaningful mess.
      Heterogeneity is not a bug, it is a feature.

      Slide 4: Ontology heterogeneity
      [Figure: two ontologies describing books and related items. Labels: Monograph, Item, Essay, Literary critics, Politics, Biography, Autobiography, Literature, Book, Paperback, Hardcover, DVD, CD, Person, Human, Writer; properties pages, price, isbn, title, author, creator, doi, pp, subject; datatypes integer, string, uri.]
  • 2. Slide 5: Heterogeneity problem
      Resources being expressed in different ways must be reconciled before being used.
      Mismatch between formalized knowledge can occur when:
        - different languages are used (OWL vs. Topic maps);
        - different terminologies are used:
            English vs. Chinese;
            Book vs. Monograph;
        - different models are used:
            different classes: Autobiography vs. Paperback;
            classes vs. property: Essay vs. literarygenre;
            classes vs. instances: one physical book as an instance vs. one work as an instance;
        - different scopes and granularity are used:
            only books vs. cultural items vs. any product;
            books detailed to the print and translation level vs. books as works.

      Slide 6: How can we address the problem?
      [Figure: the matching operation takes a first ontology, a second ontology, an initial alignment, parameters and resources, and produces a resulting alignment.]

      Slide 7: Ontology alignment
      [Figure: the two book ontologies of slide 4 connected by correspondences labelled =, ≤ and ≥.]

      Slide 8: Expressive alignments (EDOAL)
      [Figure: correspondences involving Volume, Pocket, size, Book, topic, author and Autobiography.]
      ∀x, Pocket(x) ⇐ Volume(x) ∧ size(x, y) ∧ y ≤ 14
      ∀x, Book(x) ∧ author(x, y) ∧ topic(x, y) ≡ Autobiography(x)
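
The two EDOAL rules on slide 8 can be read as tests on instances. The following is a minimal sketch in Python, not the EDOAL syntax or the Alignment API: the `Volume` and `Book` record types and their field names are assumptions made for illustration, and the rules are only applied in the direction that classifies source instances.

```python
from dataclasses import dataclass


@dataclass
class Volume:
    """Hypothetical instance of the class Volume."""
    size: int  # value of the size property, compared with 14 as on the slide


@dataclass
class Book:
    """Hypothetical instance of the class Book."""
    author: str
    topic: str


def is_pocket(v: Volume) -> bool:
    # Pocket(x) <= Volume(x) and size(x, y) and y <= 14
    return v.size <= 14


def is_autobiography(b: Book) -> bool:
    # Book(x) and author(x, y) and topic(x, y) == Autobiography(x):
    # the same value y fills both the author and the topic property.
    return b.author == b.topic


russell = Book(author="Bertrand Russell", topic="Bertrand Russell")
print(is_autobiography(russell))  # a book whose author is also its topic
print(is_pocket(Volume(size=12)))
```

The point of the sketch is that neither rule can be written as a plain class-to-class correspondence: each one needs a condition on property values, which is what expressive alignments add.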
  • 3. Slide 9: Transformation and mediation
      [Figure: a mediator sits between two sources and rewrites a query on one ontology into a query on the other.]
      Query on one ontology:
          SELECT x.doi
          WHERE x : Book
            AND x.author = "Bertrand Russell"
            AND x.topic = "Bertrand Russell"
      Answer: x.doi=http://dx.doi.org/10.1080/041522862X
      Query on the other ontology:
          SELECT x.isbn
          WHERE x : Autobiography
            AND x.author = "Bertrand Russell"
      Answer: x.isbn=041522862X

      Slide 10: Ontology networks
      [Figure: a network of ontologies o1 to o5, each with its own entities (a1 to e1, a2 to g2, a3 to g3, a4 to g4, b5 to j5), connected by alignments A1,2, A1,3, A2,3, A2,4 and A3,4.]

      Slide 11: Why should we deal with this?
      Applications of semantic integration:
        - Catalogue integration
        - Schema and data integration
        - Query answering
        - Peer-to-peer information sharing
        - Web service composition
        - Agent communication
        - Data transformation
        - Ontology evolution
        - Data interlinking

      Slide 12: Applications: Query answering
      [Figure: a matcher takes the first and second ontologies and produces an alignment; a generator turns the alignment into a mediator. The first peer sends a query, the mediator passes the reformulated query to the second peer, and the answer comes back as a reformulated answer.]
  • 4. Slide 13: Applications: Agent communication
      [Figure: a matcher takes the first and second ontologies and produces an alignment; a generator turns the alignment into a translator. A message from the first agent reaches the second agent as a transformed message.]

      Slide 14: Data interlinking
      [Figure: a matcher takes the first and second ontologies and produces an alignment; a generator uses it to produce links between the first dataset and the second dataset. Labels also include "Transformed ontology".]

      Slide 15: Ontology matching in three steps
      Reconciliation can be performed in 3 steps:
        1. Match (Matcher): determine the alignment A between the ontologies o and o'.
        2. Generate (Generator): produce a processor (for merging, transforming, etc.).
        3. Apply (Transformation).

      Slide 16: On what basis can we match?
      Content: relying on what is inside the ontology
        - Name, comments, alternate names, names of related entities: NLP, IR, etc.
        - Internal structure: constraints on relations, typing
        - External structure: relations between entities: data mining, discrete mathematics
        - Extension: statistics, data analysis, data mining, machine learning
        - Semantics (models): reasoning techniques
      Context: the relations of the ontology with the outside
        - Annotated resources: the web
        - External ontologies: dbpedia, etc.
        - External resources: wordnet, etc.
  • 5. Slide 17: Name similarity
      [Figure: the two book ontologies, with correspondences found by comparing entity names such as title, author, creator, doi, pp, price, isbn and pages.]

      Slide 18: Structure similarity
      [Figure: the two book ontologies, with correspondences found by comparing the structure around entities such as Person, Human, Writer, Book, Essay, Biography and Autobiography.]

      Slide 19: Instance similarity
      [Figure: the two book ontologies, with correspondences found through shared instances: "Bertrand Russell: My life" and "Albert Camus: La chute".]

      Slide 20: Combining different techniques
      Basic matchers provide candidate correspondences. Most systems use several such matchers and further combine and filter their results.
      [Figure: matchers M applied to ontologies o and o' produce alignments A, which are combined through matcher composition, aggregation, filtering and iteration.]
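
The combination scheme of slide 20 can be sketched as follows. This is an illustrative toy, not the method of any system named in these slides: the two basic matchers (a character-level similarity from Python's standard `difflib` and a common-prefix ratio), the equal weights and the 0.7 threshold are all assumptions, and the entity lists are hand-picked from the labels shown in the figures.

```python
from difflib import SequenceMatcher
from itertools import product


def name_similarity(a, b):
    """Basic matcher 1: character-level similarity of lower-cased names."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def prefix_similarity(a, b):
    """Basic matcher 2: common-prefix length divided by the longer name."""
    a, b = a.lower(), b.lower()
    common = 0
    for x, y in zip(a, b):
        if x != y:
            break
        common += 1
    return common / max(len(a), len(b), 1)


def match(onto1, onto2, weights=(0.5, 0.5), threshold=0.7):
    """Aggregate both matchers by weighted average, then filter by threshold."""
    alignment = []
    for e1, e2 in product(onto1, onto2):
        score = (weights[0] * name_similarity(e1, e2)
                 + weights[1] * prefix_similarity(e1, e2))
        if score >= threshold:  # filtering step
            alignment.append((e1, e2, round(score, 2)))
    return alignment


o1 = ["Book", "Autobiography", "author", "title"]
o2 = ["Monograph", "Autobiography", "creator", "title"]
for correspondence in match(o1, o2):
    print(correspondence)
```

With these inputs only the identically named entities pass the filter. Pairs such as Book and Monograph or author and creator share no usable string evidence, which is why name-based matchers are combined with structure-based, instance-based and context-based ones.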
  • 6. Slide 21: How well do these approaches work?
      Ontology Alignment Evaluation Initiative (OAEI)
        - Formal comparative evaluation of different ontology-matching tools;
        - Run every year since 2004;
        - Variety of test cases (in size, in formalism, in content);
        - Results consistent across test cases;
        - Results very dependent on the tasks and the data (from under 50% of precision and recall to well over 80% if ontologies are relatively similar);
        - Progress every year!
      http://oaei.ontologymatching.org
      Now involved in the SEALS (Semantics Evaluation At Large Scale) project.

      Slide 22: Evaluation process
      [Figure: matching takes two ontologies o and o', parameters and resources, and produces an alignment A; an evaluator compares A with a reference alignment R and outputs a measure m.]

      Slide 23: Benchmark results (precision and recall curves)
      [Figure: precision plotted against recall, both from 0 to 1, with one curve per yearly best performer: 2005 Falcon, 2006 RiMOM, 2007 ASMOV, 2008 Lily, 2009 Lily, 2010 ASMOV, and the edna baseline.]

      Slide 24: Tools you should be aware of
      Frameworks
        - Alignment API: used by many tools; provides an exchange format and evaluation tools for OAEI. Alignment server for sharing.
        - PROMPT (a Protégé plug-in): includes a user interface and a plug-in architecture.
        - COMA++: oriented toward database integration (many basic algorithms implemented).
      Matching systems
        - OAEI best performers (Falcon, RiMOM, ASMOV, etc.)
        - Available systems (FOAM, Falcon, COMA++, Aroma, etc.)
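
Precision and recall, the measures reported in the OAEI results above, compare the alignment A returned by a matcher with a reference alignment R. A minimal sketch follows; the correspondences in the example are made up for illustration and are not taken from any OAEI test case.

```python
def precision_recall(found, reference):
    """Return (precision, recall) of a found alignment against a reference."""
    found, reference = set(found), set(reference)
    correct = found & reference
    # Precision: share of returned correspondences that are correct.
    precision = len(correct) / len(found) if found else 0.0
    # Recall: share of reference correspondences that were returned.
    recall = len(correct) / len(reference) if reference else 0.0
    return precision, recall


# Hypothetical reference alignment R and matcher output A.
R = {("Book", "Monograph"), ("Person", "Human"),
     ("author", "creator"), ("Autobiography", "Autobiography")}
A = {("Book", "Monograph"), ("Person", "Human"), ("Book", "Paperback")}

p, r = precision_recall(A, R)
print(f"precision={p:.2f} recall={r:.2f}")
```

Here two of the three returned correspondences are correct, and two of the four expected ones are found. A precision and recall curve such as the one on slide 23 is obtained by computing such pairs at varying confidence thresholds.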
  • 7. Slide 25: The data interlinking problem
      [Figure: data interlinking produces owl:sameAs links between URI1 in one dataset and URI2 in another; the datasets are described by ontologies o and o'.]

      Slide 26: Example: Linking INSEE and NUTS
      NUTS: Nomenclature of territorial units for statistics

      #INSEE   INSEE name       NUTS level   #NUTS
      1        Pays             0            34
                                1            142
      26       Région           2            344
      100      Département      3            1488
      342      Arrondissement
      4036     Canton           4
      52422    Commune          5

      Slide 27: INSEE and NUTS: ontology alignment
      [Figure: correspondences (=, ≤) between the INSEE ontology (Territoire FR, Pays, Region, Departement, Arrondissement, Commune; properties nom, code, subdivision, chef-lieu) and the NUTS ontology (Country, Region, NUTSRegion, LAURegion; properties name, code, level, hasSubRegion), with datatypes integer and string.]

      Slide 28: Simple alignments are not sufficient
      [Figure: the same alignment applied to instances: DEP 75 and FR101, both named Paris, are related through Departement ≤ NUTSRegion; COM 75056 is related through Commune = LAURegion.]
  • 8. Slide 29: Expressive alignments are necessary
      [Figure: Region = NUTSRegion restricted to level = 2 and hasParentRegion = FR1; subdivision = hasSubRegion; nom = name.]

      Slide 30: Query generation
          PREFIX insee: <http://rdf.insee.fr/ontologie-geo-2006.rdf#>
          PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
          SELECT ?r
          FROM <http://rdf.insee.fr/geo/regions-2011.rdf>
          WHERE {
            ?r rdf:type insee:Region .
          }

          PREFIX nuts: <http://ec.europa.eu/eurostat/ramon/ontologies/geographic.rdf#>
          PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
          PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>
          SELECT ?n
          FROM <http://ec.europa.eu/eurostat/ramon/rdfdata/nuts2008/>
          WHERE {
            ?n rdf:type nuts:NUTSRegion .
            ?n nuts:level "2"^^xsd:int .
            ?n nuts:hasParentRegion nuts:FR1 .
          }

      Slide 31: Query generation
          PREFIX insee: <http://rdf.insee.fr/ontologie-geo-2006.rdf#>
          PREFIX nuts: <http://ec.europa.eu/eurostat/ramon/ontologies/geographic.rdf#>
          PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
          PREFIX owl: <http://www.w3.org/2002/07/owl#>
          PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>
          CONSTRUCT { ?r owl:sameAs ?n . }
          FROM <http://rdf.insee.fr/geo/regions-2011.rdf>
          FROM <http://ec.europa.eu/eurostat/ramon/rdfdata/nuts2008/>
          WHERE {
            ?r rdf:type insee:Region .
            ?r insee:nom ?l .
            ?n rdf:type nuts:NUTSRegion .
            ?n nuts:name ?l .
            ?n nuts:level "2"^^xsd:int .
            ?n nuts:hasParentRegion nuts:FR1 .
          }

      Slide 32: What does this mean?
        - Ontology alignments are schema-level expressions of correspondences;
        - They are useful for focussing the search;
        - Expressive alignments are necessary;
        - They can be turned into SPARQL-based link generators.
      But it is also necessary to express instance-level constraints:
        - for converting data (e.g., mph vs. m/s);
        - for expressing matching constraints on data (e.g., similarity).
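
Turning an alignment into a SPARQL-based link generator, as described on slides 31 and 32, can be sketched as plain string assembly. The `link_query` function and its parameters are hypothetical and are not part of the Alignment API or of any tool named in these slides; the prefixes and graph URIs are the ones shown on the slides, with the standard `owl` and `xsd` namespaces added.

```python
PREFIXES = {
    "insee": "http://rdf.insee.fr/ontologie-geo-2006.rdf#",
    "nuts": "http://ec.europa.eu/eurostat/ramon/ontologies/geographic.rdf#",
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "owl": "http://www.w3.org/2002/07/owl#",
    "xsd": "http://www.w3.org/2001/XMLSchema#",
}


def link_query(source_class, source_label, target_class, target_label,
               target_constraints=(), graphs=()):
    """Build a CONSTRUCT query linking instances that share a label value."""
    lines = [f"PREFIX {prefix}: <{uri}>" for prefix, uri in PREFIXES.items()]
    lines.append("CONSTRUCT { ?r owl:sameAs ?n . }")
    lines.extend(f"FROM <{graph}>" for graph in graphs)
    lines.append("WHERE {")
    lines.append(f"  ?r rdf:type {source_class} .")
    lines.append(f"  ?r {source_label} ?l .")
    lines.append(f"  ?n rdf:type {target_class} .")
    lines.append(f"  ?n {target_label} ?l .")
    # Extra restrictions on the target side come from the expressive alignment.
    for prop, value in target_constraints:
        lines.append(f"  ?n {prop} {value} .")
    lines.append("}")
    return "\n".join(lines)


query = link_query(
    "insee:Region", "insee:nom", "nuts:NUTSRegion", "nuts:name",
    target_constraints=[("nuts:level", '"2"^^xsd:int'),
                        ("nuts:hasParentRegion", "nuts:FR1")],
    graphs=["http://rdf.insee.fr/geo/regions-2011.rdf",
            "http://ec.europa.eu/eurostat/ramon/rdfdata/nuts2008/"],
)
print(query)
```

The class correspondence supplies the two `rdf:type` patterns, the property correspondence between nom and name supplies the shared variable `?l`, and the restrictions on level and hasParentRegion supply the remaining patterns. Exact equality on `?l` is the simplest possible instance-level constraint; a similarity measure would need a filter or post-processing.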
  • 9. Slide 33: General framework
      [Figure: ontology matching produces an alignment A between ontologies o and o'; data interlinking uses it to produce owl:sameAs links between URI1 and URI2.]

      Slide 34: Selected challenges
        - Scalability and efficiency: current matchers can be fast, scalable and accurate, but not all at once.
        - New sources of matching
        - Context-based matching
        - General purpose matching (vs. special purpose matching)
        - Matcher combination
        - Matcher selection and self-configuration
        - User involvement
        - Matching (serendipitously) while working
        - How to explain alignments?
        - Social and collaborative ontology matching
        - Alignment management: infrastructure and support
        - How do we maintain alignments when ontologies evolve?
        - Reasoning with alignments
        - Being robust to incorrect alignments
        - and, of course, many others.

      Slide 35: Further reading
        - "Ontology Matching" by Euzenat and Shvaiko
        - Proceedings of ISWC, ASWC, ESWC, WWW conferences, etc.
        - Journal of web semantics, Semantic web journal, Journal on data semantics, etc.
        - http://www.ontologymatching.org
      Jerome.Euzenat@inria.fr
      http://exmo.inrialpes.fr