Ontology matching


      Ícaro Medeiros
 Jaumir Valença da Silveira
     Franklin Amorim
     Pedro Henrique
Outline


●   Context
●   Definitions
●   Classifications of Ontology Matching Techniques
●   Basic Techniques
●   Matching Strategies
Bibliography
[1] Jerome Euzenat and Pavel Shvaiko. 2010. Ontology Matching (1st ed.).
Springer Publishing Company, Incorporated.
[2] Namyoun Choi, Il-Yeol Song, and Hyoil Han. 2006. A survey on ontology
mapping. SIGMOD Rec. 35, 3 (September 2006), 34-41.
[3] Yannis Kalfoglou and Marco Schorlemmer. 2003. Ontology mapping: the
state of the art. Knowl. Eng. Rev. 18, 1 (January 2003), 1-31.
[4] Noy, N., 2005. Ontology Mapping and Alignment. Search, p.1-34. Available
at: http://www.aifb.uni-karlsruhe.de/WBS/meh/foam/.
[5] Casanova, M. A., 2012. Tecnologias de Banco de Dados para a Web
Semântica - Módulo 9a - Ontologias - Matching.
Context
● We have to deal with heterogeneity

● Different models are based on different
  domains of knowledge and use different
  tools, at different detail levels

● Distributed nature of ontology development
  has led to different ontologies in the same
  or overlapping domains
The need for ontology matching


●   Creating global ontologies from local ontologies
●   Reuse information between ontologies
●   Dealing with heterogeneity
●   Queries across multiple distributed resources
●   Data transformation
What is ontology matching?


It is the process of finding relationships
or correspondences between entities of
            different ontologies.

entities - classes, instances, properties
                or formulas
Other terms used
The matching process

● Inputs: ontologies o and o', an input alignment A,
  parameters and external resources
● Output: a new alignment A'
Ontology matching example
Classifying ontology matching in
  regard to the use

● Matching local ontologies to global ontologies

● Matching ontologies of complementary domains

● Merging two ontologies of the same domain
Synthetic Classifications


●   Granularity/Input Interpretation Layer
    ○   e.g. element- or structure-level

●   Kind of Input Layer
    ○   Classification based on the kind of input used by
        elementary matching techniques

●   Basic Techniques Layer
    ○   Classification based on how input information is
        interpreted
Granularity/Input Interpretation Layer



●   Element-level matching techniques
    ○   Analysing entities or instances in isolation
    ○   Ignoring their relations with other entities or their
        instances

●   Structure-level techniques
    ○   Analysing how entities or their instances appear
        together in a structure (e.g. by representing
        ontologies as a graph)
Granularity/Input Interpretation Layer


Syntactic techniques
   ○   Interpret the input with regard to its sole structure
External techniques
   ○   Use external resources of a domain and common
       knowledge
Semantic techniques
   ○   Interpret the input by using model-theoretic
       semantics
Kind of Input Layer


●   Terminological
    ○   Strings found in the ontology descriptions
●   Structural
    ○   Structures found in the ontology descriptions
●   Semantic
    ○   Requires some semantic interpretation of the
        ontology
●   Extensional
    ○   Use data instances
●   In some papers, semantic=logic;
    extensional=semantic
Kind of Input Layer (Second level)


●   Terminological
    ○ String-based: terms as sequences of characters
    ○ Linguistic: interpretation of the terms as linguistic
      objects

● Structural
    ○ Internal: consider the internal structure of entities
    ○ Relational: consider the relation of entities with other
      entities
Basic Techniques Layer


A label can be interpreted as
   ○   A string (a sequence of letters)
   ○   A word or a phrase in some natural language


A hierarchy can be considered as
   ○   A graph
   ○   A taxonomy
Basic Techniques Layer


Element-level

 ●   String-based
 ●   Language-based
 ●   Based on linguistic resources
 ●   Constraint-based
 ●   Alignment reuse
 ●   Based on upper level and domain specific formal
     ontologies
Basic Techniques Layer


Structure-level

  ● Graph-based
  ● Taxonomy-based
Element-level Techniques


●   String-based techniques
    ●   The more similar the strings, the more likely they
        are to denote the same concepts
    ●   Distance functions map a pair of strings to a real
        number


●   Language-based techniques
    ●   Based on natural language processing techniques
        exploiting morphological properties of the input
        words
Element-level Techniques


●   Constraint-based techniques
    ●   Deal with the internal constraints being applied to the
        definitions of entities, such as types, cardinality of
        attributes, etc


●   Linguistic resources
    ●   Lexicons or domain specific thesauri, used to match
        words based on linguistic relations between them like
        synonyms, hyponyms, etc
Element-level Techniques



●   Alignment reuse
    ●   Record alignments of previously matched
        ontologies



●   Upper level and domain specific ontologies
    ●   Used as external sources of common knowledge
Structure-level Techniques



●   Graph-based techniques
    ●   Treat input ontologies as labelled graphs
    ●   If two nodes from two ontologies are similar, their
        neighbours may also be somehow similar


●   Taxonomy-based techniques
    ●   is-a links connect terms that are already similar,
        therefore their neighbours may be also somehow
        similar
Basic Techniques

● Examples of metrics: Similarity and
  Distance
● Name-based techniques
● Structure-based techniques
● Extensional techniques
● Semantic-based techniques
Basic Techniques
Similarity: Function from a pair of entities to a real number
Name-based Techniques

● They can be applied to the name, the label
  or the comments of entities in order to find
  those which are similar


● They can be used for comparing class
  names and/or URIs
String-based methods

●   Based on string similarity only


●   Useful if conceptual schemas (or ontologies) use
    very similar strings to denote the same concepts

●   Yield a low similarity, if schemas use synonyms with
    different syntax

●   Yield many false positives, if pairs of strings with low
    similarity are selected
String-based methods

String distance functions:
String-based methods


Levenshtein (edit) distance
   ●   Measures the distance between two strings as
       the minimum number of insertions, deletions, and
       substitutions of characters required to transform
       one string into the other
   ●   Example:

δ(“Gaming”, “Games”) = 2 substitutions [“i” by “e” and “n” by “s”]
                        + 1 deletion [“g”]
                      = 3
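
The edit distance can be sketched with the usual dynamic-programming recurrence. This is a minimal illustration, not a reference implementation; the function name levenshtein is our own choice.

```python
def levenshtein(s: str, t: str) -> int:
    """Minimum number of single-character insertions, deletions and
    substitutions needed to turn s into t."""
    # previous[j] holds the distance between the processed prefix of s
    # and the first j characters of t.
    previous = list(range(len(t) + 1))
    for i, cs in enumerate(s, start=1):
        current = [i]  # turning s[:i] into "" takes i deletions
        for j, ct in enumerate(t, start=1):
            cost = 0 if cs == ct else 1
            current.append(min(
                previous[j] + 1,         # delete cs
                current[j - 1] + 1,      # insert ct
                previous[j - 1] + cost,  # substitute cs by ct (or keep it)
            ))
        previous = current
    return previous[-1]


print(levenshtein("Gaming", "Games"))  # 3
```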
String-based methods


Token-based distance

  ●   Usually applied to the complete description of a
      concept
  ●   Treats strings as a bag of words (multisets of
      substrings)
  ●   May split strings into independent tokens
      ●   Example: "InProceedings" is represented by
           ●   the bag of words {In, Proceedings}
           ●   or a bag of substrings of length 3 {InP, roc, eed, ing, s}
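
A small sketch of both tokenizations of "InProceedings". The splitting rules are assumptions made for illustration (split before each capital letter; cut consecutive chunks of 3 characters), and the helper names split_words and chunks are hypothetical.

```python
import re
from collections import Counter


def split_words(label: str) -> list[str]:
    """Split a CamelCase label into words (assumed rule: a new word
    starts at every capital letter)."""
    return re.findall(r"[A-Z][a-z]*|[a-z]+|[0-9]+", label)


def chunks(label: str, n: int = 3) -> list[str]:
    """Cut a label into consecutive substrings of length n
    (the last one may be shorter)."""
    return [label[i:i + n] for i in range(0, len(label), n)]


# A bag (multiset) is represented here by a Counter.
bag_of_words = Counter(split_words("InProceedings"))
bag_of_chunks = Counter(chunks("InProceedings"))

print(sorted(bag_of_words))      # ['In', 'Proceedings']
print(chunks("InProceedings"))   # ['InP', 'roc', 'eed', 'ing', 's']
```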
String-based methods

Bag of words represented as a vector
      ●   Each dimension corresponds to a token
      ●   Each position of the vector is the number of occurrences of the
          token
Cosine Similarity

[Figure: two vectors in the plane with axes Mapping (horizontal) and
Ontology (vertical): “Ontology mapping” = (1,1) and
“Mapping, ontology mapping” = (1,2)]

V = {"Ontology", "Mapping" }
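
The two vectors of this example and their cosine can be recomputed as follows. This is a sketch under our own assumptions: tokens are lower-cased alphabetic runs, and the helper names are hypothetical.

```python
import math
import re
from collections import Counter

VOCABULARY = ["ontology", "mapping"]  # V, lower-cased


def to_vector(text: str, vocabulary: list[str]) -> list[int]:
    """Count how often each vocabulary token occurs in the text."""
    bag = Counter(re.findall(r"[a-z]+", text.lower()))
    return [bag[token] for token in vocabulary]


def cosine_similarity(u: list[int], v: list[int]) -> float:
    """Cosine of the angle between u and v (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


u = to_vector("Ontology mapping", VOCABULARY)           # [1, 1]
v = to_vector("Mapping, ontology mapping", VOCABULARY)  # [1, 2]
print(u, v, round(cosine_similarity(u, v), 4))          # ... 0.9487
```

The value is 3 / (sqrt(2) * sqrt(5)), about 0.95: the two strings point in nearly the same direction even though their token counts differ.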
Language-based methods

Intrinsic methods
  ●   reduce each term to a normal form to facilitate
      matching
  ●   use traditional natural language processing
      techniques
      ●   stopword elimination
      ●   tokenization: segment strings into sequences of tokens
      ●   lemmatization: reduce words to normal forms
           ●   suppress tense, gender and number
Language-based methods

Example – Variants of the term “theory paper”
Language-based methods

Extrinsic methods


Use dictionaries, lexicons and terminologies to
help match terms from different schemas or
ontologies
    ●   e.g. a terminology - a thesaurus which very often
        contains phrases rather than single words
    ●   deal with synonyms
    ●   word sense disambiguation
Language-based methods

WordNet – an example of an external resource
    ●   an electronic lexical database for English
    ●   based on the notion of synsets (sets of synonyms)
        ●   a synset denotes a concept or a sense of a group of terms


    ●   WordNet also provides:
        ●   a hypernym structure (superconcept / subconcept)
        ●   a meronym relation (part of)
        ●   textual descriptions of the concepts (glossary)
Language-based methods

●   Example
      ● WordNet 2.0 entry for the word author
      author1 noun: Someone who originates or causes or initiates something;
        Example ‘he was the generator of several complaints’. Synonym
        generator, source. Hypernym maker. Hyponym coiner.
      author2 noun: Writes (books or stories or articles or the like) professionally
        (for pay). Synonym writer2. Hypernym communicator. Hyponym
        abstractor, alliterator, authoress, biographer, coauthor, commentator,
        contributor, cyberpunk, drafter, dramatist, encyclopedist, essayist, folk
        writer, framer, gagman, ghostwriter, Gothic romancer, hack, journalist,
        librettist, lyricist, novelist, pamphleteer, paragrapher, poet, polemist,
        rhymer, scriptwriter, space writer, speechwriter, tragedian, wordmonger,
        word-painter, wordsmith, Andersen, Asimov...
      author3 verb.: Be the author of; Example ‘She authored this play’.
        Hypernym write. Hyponym co-author, ghost.
Language-based methods

●   Example
      ●   fragment of the WordNet hierarchy (limited to nouns) for
           “illustrator”, “author”, “creator”, “person”, “writer”




      Σ(t) = set of synsets of the term t

      Σ(“author”) =
        {A1, A2W2}


      Σ(“writer”) =
        {W1, A2W2, W3}
Language-based methods

Example – Synonym Similarity

      σ(s,t) = 1 iff Σ(s) ∩ Σ(t) ≠ ∅   (terms have a synset in common)

             = 0 otherwise

     Σ(“author”) = {A1, A2W2}
     Σ(“writer”) = {W1, A2W2, W3}

     Σ(“author”) ∩ Σ(“writer”) = {A2W2} ≠ ∅, so σ(“author”, “writer”) = 1
Language-based methods

Example – Co-synonymy similarity

    σ’(s,t) = |Σ(s) ∩ Σ(t)| / |Σ(s) ∪ Σ(t)|


    Σ(“author”) = {A1, A2W2}
    Σ(“writer”) = {W1, A2W2, W3}
    |Σ(“author”) ∩ Σ(“writer”)| = 1
    |Σ(“author”) ∪ Σ(“writer”)| = 4
    σ’(“author”, “writer”) = 1/4
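
Both synset-based measures reduce to set operations. The sketch below hard-codes the synset labels of the example instead of querying WordNet; the dictionary SYNSETS and the function names are illustrative assumptions.

```python
# Synsets of each term, copied from the example (not read from WordNet).
SYNSETS = {
    "author": {"A1", "A2W2"},
    "writer": {"W1", "A2W2", "W3"},
}


def synonymy(s: str, t: str) -> int:
    """1 if the two terms share at least one synset, 0 otherwise."""
    return 1 if SYNSETS[s] & SYNSETS[t] else 0


def cosynonymy(s: str, t: str) -> float:
    """Shared synsets divided by all synsets of the two terms."""
    shared = SYNSETS[s] & SYNSETS[t]
    combined = SYNSETS[s] | SYNSETS[t]
    return len(shared) / len(combined) if combined else 0.0


print(synonymy("author", "writer"))    # 1
print(cosynonymy("author", "writer"))  # 0.25
```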
Structure-based techniques

Internal structure (constraint-based approaches)

●   based on the internal structure of classes


●   calculate the similarity between two classes based on
     ○   the set of their properties, including keys
     ○   the range of their properties (attributes and relations)
     ○   the cardinality of their properties
     ○   the transitivity or symmetry of their properties
Structure-based techniques

Internal structure (constraint-based approaches)
Structure-based techniques

Internal structure (constraint-based approaches)
  ●   positive point:
       ●   can be used to eliminate incompatible matches
  ●   negative points:
       ●   does not provide much information about the classes to
           compare
       ●   different classes may have properties with the same datatypes
       ●   different models of a concept use different, and incompatible,
           types
  ●   approach suggested:
       ●   use method in combination with other methods
Structure-based techniques


Relational Structure
●   similarity between two concepts
●   based on the relations between the concepts with other
    concepts
     ○   similar concepts should have similar related concepts


●   given a relation r, a pair of concepts may be:
     ○   directly related through r
     ○   inversely related through r
     ○   transitively related through r
     ○   the maximal elements of r+
Structure-based techniques

Example
  subclass(Book) =
    {Science, Pocket, Children}
  subclass−1(Book) =
    {Product}
  subclass+(Book) =
    {Science, Pocket, Textbook, Popular, Children}
  subclass ↑ (Book) =
    {Textbook, Popular, Pocket, Children}
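
The four sets of this example can be derived from a small subclass graph. The graph below is an assumption reconstructed from the listed results (in particular, that Textbook and Popular are subclasses of Science); the function names are hypothetical.

```python
# parent -> direct subclasses (assumed hierarchy, consistent with the example)
SUBCLASSES = {
    "Product": {"Book"},
    "Book": {"Science", "Pocket", "Children"},
    "Science": {"Textbook", "Popular"},
}


def direct(c: str) -> set[str]:
    """subclass(c): the direct subclasses of c."""
    return set(SUBCLASSES.get(c, ()))


def inverse(c: str) -> set[str]:
    """subclass^-1(c): the classes that have c as a direct subclass."""
    return {parent for parent, kids in SUBCLASSES.items() if c in kids}


def transitive(c: str) -> set[str]:
    """subclass+(c): everything reachable through subclass links."""
    seen, stack = set(), list(direct(c))
    while stack:
        node = stack.pop()
        if node not in seen:
            seen.add(node)
            stack.extend(direct(node))
    return seen


def maximal(c: str) -> set[str]:
    """The elements of subclass+(c) that have no subclasses themselves."""
    return {node for node in transitive(c) if not direct(node)}


print(sorted(direct("Book")))
print(sorted(inverse("Book")))
print(sorted(transitive("Book")))
print(sorted(maximal("Book")))
```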
Structure-based techniques

Taxonomic Structure
●   Similarity between two concepts
     ○   Based on the graph of the subClassOf relation
     ○   Example
          ■ δ(e,e’) = number of edges of the taxonomy between e and e’,
                    normalized by dividing by the longest path
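
A rough sketch of such an edge-counting distance. Two assumptions are ours: the taxonomy is a hypothetical one (the same classes as in the Book example), and "longest path" is read as the longest shortest path between any two classes.

```python
from collections import deque
from itertools import combinations

# Hypothetical subClassOf edges (parent, child).
EDGES = [
    ("Product", "Book"), ("Book", "Science"), ("Book", "Pocket"),
    ("Book", "Children"), ("Science", "Textbook"), ("Science", "Popular"),
]


def build_graph(edges):
    """Undirected adjacency sets: edges are counted in both directions."""
    graph = {}
    for parent, child in edges:
        graph.setdefault(parent, set()).add(child)
        graph.setdefault(child, set()).add(parent)
    return graph


def edge_count(graph, start, goal):
    """Number of edges on a shortest path (breadth-first search)."""
    queue, seen = deque([(start, 0)]), {start}
    while queue:
        node, dist = queue.popleft()
        if node == goal:
            return dist
        for nxt in graph[node]:
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, dist + 1))
    raise ValueError("classes are not connected")


def taxonomic_distance(graph, e1, e2):
    """Edge count between e1 and e2, divided by the longest such count."""
    longest = max(edge_count(graph, a, b) for a, b in combinations(graph, 2))
    return edge_count(graph, e1, e2) / longest


G = build_graph(EDGES)
print(taxonomic_distance(G, "Science", "Pocket"))   # 2 edges out of 3
print(taxonomic_distance(G, "Pocket", "Textbook"))  # 3 edges out of 3
```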
Structure-based techniques

Bounded path matchers

  ●   use anchors relating paths from two distinct
      taxonomies
      ●   take two paths with links from two distinct taxonomies
      ●   compare terms and their positions along these paths
      ●   identify similar terms
Structure-based techniques

Example

  “Book -> Volume” and
  “Popular -> Autobiography”
  implies that possibly
  “Science -> Biography” or
  “Science -> Essay”
Structure-based techniques

Summary of relational structure methods

●   Powerful methods to match conceptual schemas and
    ontologies
     ○   Allow relations between concepts to be taken into account


●   Often used in combination with internal structural and
    terminological methods
Extensional techniques

When two ontologies share the same set of
individuals, matching is highly facilitated.
Extensional techniques

●   Jaccard Similarity: Given two sets A and B, let P(X)
    be the probability that a random instance is in the set
    X. Then

        σ(A, B) = P(A ∩ B) / P(A ∪ B)

●   Note that the Jaccard Similarity reaches 1 when A = B
    and 0 when they are disjoint.
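
With a finite set of shared individuals, the probabilities can be estimated by counting, which gives the familiar set form of the measure. The instance names below are made up for illustration.

```python
def jaccard(a: set, b: set) -> float:
    """|A ∩ B| / |A ∪ B|, i.e. P(A ∩ B) / P(A ∪ B) estimated by counting."""
    union = a | b
    if not union:
        return 1.0  # convention chosen here: two empty sets are equal
    return len(a & b) / len(union)


# Hypothetical instances of a class in each of two ontologies.
books_in_o1 = {"b1", "b2", "b3", "b4"}
volumes_in_o2 = {"b2", "b3", "b4", "b5"}

print(jaccard(books_in_o1, volumes_in_o2))  # 0.6
print(jaccard(books_in_o1, books_in_o1))    # 1.0
print(jaccard(books_in_o1, {"x1"}))         # 0.0
```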
Semantic-based techniques

●   Semantic-based techniques rely on the axioms of the
    ontologies and on deductive methods.

●   Ontology matching is an inductive task, so deductive
    methods do not perform well alone; a preprocessing step
    is needed.

●   First, the lack of common ground between the
    ontologies has to be compensated for.

●   For those reasons, authors propose applying semantic
    techniques in two steps: the so-called anchoring step
    and the deriving relations step.
Semantic-based techniques
●   Anchoring: matching ontologies o' and o'' to a
    background ontology o. This can be done using any
    method described so far.

●   Deriving relations: (indirectly) matching
    ontologies o' and o'' by using the correspondences
    discovered during the anchoring step.

●   Example: Micro-company: Has at most 5 employees.
                SME: Has at most 10 associates.
    anchoring: employees ---> EMPLOYEE <--- associates
                 Micro-company ---> FIRM <--- SME
    deriving relations: Micro-company is a subclass of SME.
Matching strategies - Global Methods
● Aggregating the results of the basic methods
● Developing a strategy for computing these
  similarities
● Learning from data the best method and the
  best parameters for matching
● Using probabilistic methods to combine
  matchers or to derive missing correspondences
● Involving users in the loop
● Extracting the alignments from the resulting
  (dis)similarity
Matcher composition

● Sequential composition of matchers
Matcher composition

● Using matrices to represent a similarity or
  distance measure between entities to be
  matched
Matcher composition

● Parallel composition of matchers
Similarity aggregation
Compound similarity is concerned with the
aggregation of heterogeneous similarities

  ○ e.g. A single similarity measure composed of the
    similarity obtained from their names, the similarity of
    their superclasses, the similarity of their instances
    and that of their properties
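
One simple way to aggregate is a weighted average, sketched below. The choice of a weighted average, the example scores and the equal weights are all assumptions made for illustration; the slides do not prescribe a particular aggregation function.

```python
def aggregate(similarities: dict[str, float], weights: dict[str, float]) -> float:
    """Weighted average of per-feature similarities in [0, 1]."""
    total_weight = sum(weights[name] for name in similarities)
    weighted_sum = sum(weights[name] * value
                       for name, value in similarities.items())
    return weighted_sum / total_weight


# Hypothetical scores for one pair of classes, one score per basic matcher.
scores = {"names": 0.8, "superclasses": 0.5, "instances": 0.9, "properties": 0.6}
equal_weights = {name: 1.0 for name in scores}

print(round(aggregate(scores, equal_weights), 2))  # 0.7
```

Raising the weight of one matcher (for example names) pulls the compound similarity towards that matcher's score.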
