{Ontology: Resource} x {Matching : Mapping} x {Schema : Instance} :: Components of the same challenge?

Invited Talk, International Workshop on Ontology Matching
collocated with the 5th International Semantic Web Conference
ISWC-2006, November 5, 2006, Athens GA

  • Over time, information systems and their use of semantic metadata and ontologies have evolved: from structured data exchange to integration and capturing semantic metadata, to using one ontology to mediate between sources, to using multiple ontologies for information integration, and on to analysis and discovery in distributed, multi-ontology, multi-domain, heterogeneous Web resource environments.
  • And with this, the need for and goals of ontology matching have evolved
  • Christopher 11/3/2006 can maybe mention the static nature of databases that require large efforts to extend the schema vs. the extensible nature of ontologies due to the use of semi-structured data
  • A predictor can predict a pathway from a gene sequence, but we don’t know whether the predicted pathway is actually possible. It needs to be verified in the literature if the pathway is not already in the ontology or is not allowed according to the ontology. Components: ontology, literature, DBs, prediction systems, etc. The predictor depends on the application: for hypothesis verification, a human feeds in available knowledge; for discovery, it can be an HMM or another machine learning technique. When the system is asked, e.g., to predict or verify a pathway or some other complex relationship, the predicted result is then verified by the ontology management system. If the predicted pathway/complex relationship is not in the ontology, the literature and DBs are queried for the concepts involved in it, and these are correlated with known concepts in the ontology. The output is relevant publications, DB entries and possibly a predicted likelihood of the pathway/complex relationship being true, according to the literature found.
  • Migraine patients experience stress; calcium channel blockers inhibit stress; magnesium is a natural calcium channel blocker. Does magnesium alleviate the effects of migraine in patients?
  • The process of matching needs to support the generation of complex mappings

    1. 1. {Ontology: Resource} x {Matching : Mapping} x {Schema : Instance} :: Components of the same challenge? Invited Talk, International Workshop on Ontology Matching collocated with the 5th International Semantic Web Conference ISWC-2006 , November 5, 2006, Athens GA Professor Amit Sheth Special Thanks: Meena Nagarajan Acknowledgment: SemDis project, funded by NSF
    2. 2. Information System needs and Ontology Matching goals
        Generation III (information brokering), 1997...: Semantics (Ontology, Context, Relationships, KB). SemDis, ISIS Semantic Web, some DL-II projects, Semagix SCORE, Applied Semantics; VideoAnywhere, InfoQuilt, OBSERVER
        Generation II (mediators), 1990s: Metadata (Domain model). InfoSleuth, KMed, DL-I projects, Infoscopes, HERMES, SIMS, Garlic, TSIMMIS, Harvest, RUFUS, ...; VisualHarness, InfoHarness
        Generation I (federated DB / multidatabases), 1980s: Data (Schema, “semantic data modeling”). Mermaid, DDTS, Multibase, MRDSM, ADDS, IISS, Omnibase, ...
    3. 3. Information systems - From mediators to information brokering <ul><li>Mediators between heterogeneous information sources </li></ul><ul><ul><li>InfoHarness, VisualHarness, InfoSleuth, SIMS, Garlic etc. </li></ul></ul>Circa 1992-1996. IH Server Raw Data IH Clients Image Text Video Audio VisualHarness Architecture End User Web Browsers End User Web Browsers End User Web Browsers Internet Information Resources Metadata Database (Metabase) (Oracle) Repository 1 Repository m ..... IH administrative tools
    4. 4. Information systems - From mediators to information brokers <ul><li>Information brokers </li></ul><ul><ul><li>InfoQuilt, OBSERVER etc. </li></ul></ul>Circa 1996-2000 INFORMATION CONSUMERS INFORMATION PROVIDERS Corporations Universities People Government Programs User Query User Query User Query Information System Data Repository Information System Newswires Universities Corporations Research Labs INFORMATION BROKERING Domain Specific Ontologies
    5. 5. Need for querying across multiple ontologies OBSERVER Circa 1994, 1996-2002 IRM Interontologies Relationships ... Repositories Mappings/ Ontology Server Query Processor ... Repositories Mappings/ Ontology Server Query Processor ... ... Mappings/ Ontology Server Query Processor User Query Ontologies Ontologies Ontologies
    6. 6. Ontology Matching – goals <ul><li>Goals of ontology matching (and mapping, or integration) </li></ul><ul><ul><li>Shallow analysis to identify dependencies for integration </li></ul></ul><ul><ul><li>Deeper analysis to create mappings for query based transformations / integration </li></ul></ul><ul><ul><li>Integrate schemas to create a global schema </li></ul></ul><ul><ul><li>Integrate instance bases </li></ul></ul><ul><ul><li>Sheth, Review of a real world experience in database schema integration (Bellcore, ca. 1993) </li></ul></ul>
    7. 7. Ontology Matching – changing notions <ul><li>Given the distributed nature of modeling domains and metadata, the need for matching advanced to Information Integration </li></ul><ul><li>Now </li></ul><ul><ul><li>Query processing not limited to multiple databases or ontologies, but multiple domains and sources of information </li></ul></ul><ul><ul><li>Exploiting structured, semi-structured and unstructured data sources, multi-model Web sources </li></ul></ul>
    8. 8. The process of Ontology Matching <ul><li>Different for purposes of merging / aligning ontologies </li></ul><ul><ul><li>Type of relationships that suffice to be discovered are limited to equivalence / inclusion / disjointness / overlap mappings </li></ul></ul><ul><li>Different for purposes of information integration to analytics to discovery </li></ul><ul><ul><li>Need for discovering more Complex mappings </li></ul></ul><ul><ul><ul><li>Named relationships / associations </li></ul></ul></ul><ul><ul><ul><li>Graph based / numerical mappings </li></ul></ul></ul>
    9. 9. Top down and bottom up view to ontology matching <ul><li>Top Down: schema + instance integration to provide information integration </li></ul>
    10. 10. Top down and bottom up view to ontology matching <ul><li>Bottom up: exploit external data sources to drive schema matching </li></ul>
    11. 11. A step back DB vs. Ontology - Fundamental differences
    12. 12. Schema integration goals – DB vs. Ontology <ul><li>DB schema integration goal </li></ul><ul><ul><li>“Defining an integrated view of the data for all applications using the data.” </li></ul></ul><ul><li>Ontology schema integration goal </li></ul><ul><ul><li>“Defining an agreement between multiple ontology schemas modeled for the same domain .” </li></ul></ul>
    13. 13. Goals are different because of differences in: <ul><li>The modeling paradigms </li></ul><ul><ul><li>A database schema is a model for the data that one or more applications intend to use. </li></ul></ul><ul><ul><li>An ontology is a model of knowledge for a bounded region of interest (also known as a domain ) </li></ul></ul><ul><li>Data vs. Knowledge : A DB instance base is not the same as an ontology instance base </li></ul><ul><ul><li>A database models data to be used by one or more applications </li></ul></ul><ul><ul><li>An ontology models knowledge about a domain , independent of the application </li></ul></ul>
    14. 14. Modeling Database vs. Ontology schemas - Fundamental differences
        Axis of comparison | Database schemas | Ontology schemas
        Modeling perspective | Intended to model data being used by one or more applications | Intended to model a domain
        Structure vs. Semantics | Emphasis while modeling is on structure of the tables | Emphasis while modeling is on the semantics of the domain – emphasis on relationships, also facts/knowledge/ground truth
    15. 15. Axis of comparison | Database schemas | Ontology schemas
        Context of modeling | Well defined by applications using the data | Modeling of a domain irrespective of applications
        Instance metadata modeling / expressiveness | Limited expressivity in capturing instance level metadata due to static schemas | More expressive modeling paradigm
        Agreement | Limited to a syntactic agreement between applications using the data | Symbolizes agreement of the modeling of a domain possibly used by applications in varying contexts
        Choice of modeling affects the possible space of heterogeneities and therefore the process of matching. In both cases however, the schema is only an abstraction of the real world; the real power/semantics lies at the instance level.
    16. 16. The space of heterogeneities in DB schema integration <ul><li>Conflicts/Heterogeneities in DB schema integration </li></ul><ul><ul><li>Model / representation : relational vs. network vs. hierarchical models </li></ul></ul><ul><ul><li>Structural / schematic : </li></ul></ul><ul><ul><ul><li>Domain Incompatibilities </li></ul></ul></ul><ul><ul><ul><li>Entity Definition Incompatibilities </li></ul></ul></ul><ul><ul><ul><li>Data Value Incompatibilities </li></ul></ul></ul><ul><ul><ul><li>Abstraction level Incompatibilities </li></ul></ul></ul><ul><li>Largely syntactic and structural; relatively few semantic conflicts </li></ul>(Sheth/Kashyap 1992, Kim/Seo 1993, Kashyap/Sheth 1996)
    17. 17. The space of heterogeneities in ontology schema integration <ul><li>Conflicts/Heterogeneities in ontology schema integration </li></ul><ul><ul><li>Significant conflicts in perception of a domain – semantic conflicts </li></ul></ul><ul><ul><li>Other heterogeneities are similar to those in the DB world </li></ul></ul><ul><ul><ul><li>Model / representation : OWL/RDF ; topic maps etc. </li></ul></ul></ul><ul><ul><ul><li>Structural : modeling as an entity vs. an attribute/property; generalization vs. abstraction etc. </li></ul></ul></ul><ul><li>Largely semantic conflicts; comparable syntactic conflicts </li></ul>
    18. 18. Key Observations <ul><li>There are significant philosophical differences in how a DB schema and an Ontology schema are modeled </li></ul><ul><li>In spite of these distinctions, many schema matching techniques overlap significantly . </li></ul><ul><li>Have we advanced the state of art in ontology schema matching? </li></ul>
    19. 19. Schema Integration – DB vs. Ontology Have we advanced the state of art ?
    20. 20. Schema Integration – techniques used <ul><li>Syntactic </li></ul><ul><ul><li>Linguistic: Matching names, descriptions, namespaces etc. </li></ul></ul><ul><ul><li>Constraint-based: Constraint matches on data types, value ranges, uniqueness, cardinalities etc. </li></ul></ul>Schema matching techniques Information exploited DB Ontology <ul><li>Matching Table and column level names and constraints </li></ul><ul><li>Matching class, properties/ relationship, attribute level names and constraints </li></ul>Schema level
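The linguistic, schema-level matching described on this slide can be sketched roughly as below. This is a minimal illustration, not the method of any particular matcher: the helper names (`normalize`, `name_similarity`, `match_names`), the sample element names, and the 0.8 threshold are all assumptions made for the example. It uses only the standard-library `difflib.SequenceMatcher`.

```python
from difflib import SequenceMatcher


def normalize(name):
    """Lower-case a schema element name and drop separators such as '_'."""
    return "".join(ch for ch in name.lower() if ch.isalnum())


def name_similarity(a, b):
    """String similarity in [0, 1] between two normalized element names."""
    return SequenceMatcher(None, normalize(a), normalize(b)).ratio()


def match_names(db_columns, onto_properties, threshold=0.8):
    """Pair each DB column with its most similar ontology property.

    Only pairs whose similarity reaches the (assumed) threshold are kept.
    """
    matches = []
    for col in db_columns:
        best = max(onto_properties, key=lambda p: name_similarity(col, p))
        score = name_similarity(col, best)
        if score >= threshold:
            matches.append((col, best, round(score, 2)))
    return matches


if __name__ == "__main__":
    # Hypothetical column and property names, for illustration only.
    columns = ["first_name", "phone_no"]
    properties = ["firstName", "phoneNumber", "worksFor"]
    print(match_names(columns, properties))
    print(match_names(columns, properties, threshold=0.6))
```

A purely string-based matcher like this one illustrates the talk's point: it can line up `first_name` with `firstName`, but it has no access to what a named relationship such as `worksFor` means.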
    21. 21. Schema Integration – techniques used <ul><li>Structural </li></ul><ul><ul><li>Constraint-based: Tree / Graph structure matching </li></ul></ul>Schema matching techniques Information exploited <ul><li>Matching structures of relational tables </li></ul><ul><li>Matching class hierarchies and structures </li></ul>DB Ontology Schema level
    22. 22. Schema Integration – techniques used <ul><li>Linguistic </li></ul><ul><ul><li>IR techniques, word frequencies, key terms, combination of key terms etc. </li></ul></ul><ul><li>Constraint based </li></ul><ul><ul><li>Numerical value patterns, ranges useful for recognizing phone numbers etc. </li></ul></ul>Schema matching techniques Information exploited DB Ontology Instance level <ul><li>Hybrid approaches use a combination of all techniques </li></ul>
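One way to picture the instance-level, constraint-based technique mentioned here (value patterns that help recognize phone numbers and similar fields) is the sketch below. The shape encoding (digits to `9`, letters to `A`), the Jaccard-style overlap score, and the sample values are assumptions chosen for illustration; the slides do not prescribe a specific algorithm.

```python
import re


def value_pattern(value):
    """Abstract a value into a shape: digits become '9', letters become 'A'."""
    shape = re.sub(r"[0-9]", "9", value)
    return re.sub(r"[A-Za-z]", "A", shape)


def pattern_profile(values):
    """Set of distinct value shapes observed in a column or property."""
    return {value_pattern(v) for v in values}


def pattern_overlap(values_a, values_b):
    """Jaccard overlap of the two shape profiles, in [0, 1]."""
    a, b = pattern_profile(values_a), pattern_profile(values_b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


if __name__ == "__main__":
    # Made-up sample values, for illustration only.
    db_phone_column = ["706-542-2911", "404-555-0100"]
    onto_phone_values = ["706-555-0199"]
    onto_city_values = ["Athens", "Dayton"]
    print(pattern_overlap(db_phone_column, onto_phone_values))
    print(pattern_overlap(db_phone_column, onto_city_values))
```

A high overlap suggests that two attributes hold the same kind of value even when their names differ, which is why hybrid approaches combine this signal with name and structure matching.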
    23. 23. Discovered semantic relationships <ul><li>State of the art – in DBs and Ontologies </li></ul><ul><ul><li>Relationships with set semantics: overlap / disjointness / exclusion / equivalence / subsumption </li></ul></ul><ul><ul><li>Their logical encodings are what they mean </li></ul></ul><ul><li>Of more interest is discovering arbitrary named relationships </li></ul><ul><ul><li>Relationships such as works_for or causes have “real-world” semantics. Their encoding in first order logic lacks semantic grounding. </li></ul></ul><ul><li>Matching and mapping closely tied. Ability to capture complex mapping (e.g., semantic proximity) puts significantly different demand on matching </li></ul>
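The relationships with set semantics listed on this slide can be read directly off the instance sets (extensions) of two classes. The sketch below assumes each class is represented simply as a collection of instance identifiers; the function name and the sample identifiers are hypothetical.

```python
def set_relationship(ext_a, ext_b):
    """Classify two class extensions by their set-theoretic relationship."""
    a, b = set(ext_a), set(ext_b)
    if a == b:
        return "equivalence"
    if a < b:
        return "subsumption (A within B)"
    if b < a:
        return "subsumption (B within A)"
    if a & b:
        return "overlap"
    return "disjointness"


if __name__ == "__main__":
    # Hypothetical instance identifiers, for illustration only.
    reviewers = {"p1", "p2"}
    persons = {"p1", "p2", "p3"}
    papers = {"d1", "d2"}
    print(set_relationship(reviewers, persons))
    print(set_relationship(persons, papers))
```

Nothing comparable exists for a named relationship such as works_for or causes: its meaning is not captured by comparing extensions, which is the gap this slide points to.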
    24. 24. Key Observation <ul><li>DB and Ontology schema matching techniques overlap significantly </li></ul><ul><ul><li>Not much advancement since DB schema integration efforts </li></ul></ul><ul><li>Ontologies formalize the semantics of a domain, but matching is still primarily syntactic / structural. </li></ul><ul><ul><li>The semantics of ‘named relationships’ is largely unexploited </li></ul></ul><ul><li>The real semantics lies in the relationships connecting entities </li></ul><ul><ul><li>Modeled as first class objects in Ontologies </li></ul></ul><ul><ul><li>In DB, they are not explicit and have to be inferred </li></ul></ul>
    25. 25. (Complex) named relationships and Ontology Matching
    26. 26. (Complex) named relationships - example AFFECTS VOLCANO LOCATION ASH RAIN PYROCLASTIC FLOW ENVIRON. LOCATION PEOPLE WEATHER PLANT BUILDING DESTROYS COOLS TEMP DESTROYS KILLS
    27. 27. Discovering such (complex) named relationships <ul><li>Matching techniques have exhausted Schema + Instance properties </li></ul><ul><li>Ontology modeling decouples schema + instance base </li></ul><ul><ul><li>Tremendous opportunity to exploit knowledge present outside the ontology knowledge base (External structured, semi-structured and unstructured data sources) </li></ul></ul>
    28. 28. Knowledge discovery and validation PubMed etc. Rele-vant docs Query and update DBs Prediction of - Pathways - Symptoms of Diseases - Other complex relationship
    29. 29. A Vision for Ontology Matching : Discovering simple to complex matches – from schema, instances and corpus SIMPLE TO COMPLEX MATCHES Possible identifiable matches: equivalence / inclusion / overlap / disjointness Possible to identify more complex relationships from the corpus. Ontologies Heterogeneous data Today , the Food and Drug Administration ( FDA ) is announcing that it has asked Pfizer , Inc . to voluntarily withdraw Bextra from the market . Pfizer has agreed to suspend sales and marketing of Bextra in the , pending further discussions with the agency . Semantic metadata
    30. 30. Corpus based schema matching
    31. 31. The Intuition 9284 documents 4733 documents Disease or Syndrome Biologically active substance causes affects causes complicates Fish Oils Raynaud’s Disease ??????? instance_of instance_of 5 documents UMLS MeSH PubMed Lipid affects
    32. 32. The Method – Identify entities and Relationships in Parse Tree Modifiers Modified entities Composite Entities
    33. 33. Key Observation <ul><li>What is interesting is not the entity “estrogen” or “endometrium” </li></ul><ul><li>The real knowledge lies in the complex and modified entities “an excessive endogenous stimulation by estrogen” </li></ul>Current KR frameworks do not model this. Capturing this might affect the way we think of matching and mapping.
    34. 34. Converting candidate relationships to ontology matches <ul><li>Linguistic and statistical challenges: </li></ul><ul><ul><li>Variations of entities, relationships and associations </li></ul></ul><ul><li>Translating instance level findings to the schema level </li></ul><ul><ul><li>GOING FROM several discovered relationships like “Deficiency in magnesium causes Migraine” TO “substance X causes condition Y” </li></ul></ul>
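The step of translating instance-level findings to the schema level can be sketched as a simple lift-and-count: replace each entity in a discovered triple by its class and keep the class-level patterns that recur. This is only one plausible reading of the slide; the function name, the `min_support` cutoff, and the placeholder triples are assumptions, and the input is toy data rather than real extraction results. The class labels reuse the UMLS semantic types named in the talk.

```python
from collections import Counter


def generalize(instance_triples, instance_of, min_support=2):
    """Lift (subject, relation, object) triples to class-level candidates.

    instance_of maps an entity to its class; triples with an untyped entity
    are skipped. Returns [(class_triple, support)] with support >= min_support.
    """
    counts = Counter()
    for subj, rel, obj in instance_triples:
        if subj in instance_of and obj in instance_of:
            counts[(instance_of[subj], rel, instance_of[obj])] += 1
    return [(t, n) for t, n in counts.most_common() if n >= min_support]


if __name__ == "__main__":
    # Placeholder entities and triples: toy input, not findings from any corpus.
    instance_of = {
        "substance_a": "Biologically active substance",
        "substance_b": "Biologically active substance",
        "condition_x": "Disease or Syndrome",
        "condition_y": "Disease or Syndrome",
    }
    triples = [
        ("substance_a", "causes", "condition_x"),
        ("substance_b", "causes", "condition_y"),
        ("substance_a", "affects", "condition_y"),
        ("unknown_entity", "causes", "condition_x"),
    ]
    for class_triple, support in generalize(triples, instance_of):
        print(class_triple, support)
```

The linguistic and statistical challenges noted on the slide sit upstream of this step: variant spellings of entities and relationships have to be normalized before counting means anything.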
    35. 35. Discovery vs. Validation of relationships – two sides of the coin <ul><li>Discovering complex relationships from text is a hard problem </li></ul><ul><ul><li>Natural Language challenges (not all sentences are well formed) </li></ul></ul><ul><li>Validating complex relationships / hypotheses is relatively simpler </li></ul>
    36. 36. Corpus based Hypothesis validation PubMed Does magnesium alleviate effects of migraine in patients? One possible hypothesized connection between magnesium and migraine…. isa Magnesium Migraine Stress Calcium Channel Blockers Patient affectedBy inhibit Complex Query Supporting Document sets retrieved
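The hypothesized connection on this slide can be viewed as a path of named relationships between two concepts. The sketch below searches for such a path with a breadth-first search, treating edges as traversable in both directions. The edge list is an assumption pieced together from the slide labels and the speaker notes (the figure's exact edge directions are not recoverable from the transcript, and the `experiences` label is taken from the notes); the function name is hypothetical, and the actual system's retrieval of supporting PubMed documents is not modeled here.

```python
from collections import deque


def find_association(edges, start, goal):
    """Return a shortest path [node, label, node, ...] from start to goal.

    edges are (subject, label, object) triples; an edge followed against its
    direction is reported with the suffix '^-1'. Returns None if no path exists.
    """
    neighbors = {}
    for subj, label, obj in edges:
        neighbors.setdefault(subj, []).append((label, obj))
        neighbors.setdefault(obj, []).append((label + "^-1", subj))
    queue = deque([(start, [start])])
    seen = {start}
    while queue:
        node, path = queue.popleft()
        if node == goal:
            return path
        for label, nxt in neighbors.get(node, []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, path + [label, nxt]))
    return None


if __name__ == "__main__":
    # Assumed edges, reconstructed from the slide labels and speaker notes.
    edges = [
        ("Magnesium", "isa", "Calcium Channel Blockers"),
        ("Calcium Channel Blockers", "inhibit", "Stress"),
        ("Patient", "experiences", "Stress"),
        ("Patient", "affectedBy", "Migraine"),
    ]
    print(find_association(edges, "Magnesium", "Migraine"))
```

Each edge on a returned path could then be turned into a query against the corpus, so that the hypothesis is supported or rejected by the document sets retrieved for its individual links.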
    37. 37. From matching to mappings – several challenges <ul><li>Mappings are not always simple mathematical / string transformations </li></ul><ul><li>Examples of complex mappings </li></ul><ul><ul><li>Associations / paths between classes </li></ul></ul><ul><ul><li>Graph based / form fitting functions </li></ul></ul>[Figure: form-fitting example] The number of earthquakes with magnitude > 7 is almost constant, so nuclear tests, if they cause earthquakes at all, only cause earthquakes with magnitude < 7. [Figure: association graph] Entities E1: Reviewer, E2: Paper, E3: Person, E4: Paper, E5: Person, E6: Person, E7: Submission; edge labels: author_of, knows.
    38. 38. The take home message
    39. 39. A world beyond simple matches and mappings <ul><li>The distinction between schema and instances is slowly disappearing </li></ul><ul><li>Integrating new and external data sources, mining and analyzing them is gaining importance. </li></ul><ul><li>Tremendous opportunities and challenges in using more information than what is modeled in a schema and captured in an instance base. </li></ul>Need to go beyond well-mannered schemas and knowledge representations; and relatively simpler mappings
    40. 40. For more information <ul><li>LSDIS Lab: http://lsdis.cs.uga.edu </li></ul><ul><li>Kno.e.sis Center: http://www.knoesis.org </li></ul>
