Semantic Intensity Spectrum and Semantic Integration Algorithms
 


Highlights of the CROSI project on semantic integration and an introduction to semantic alignment algorithms

    Presentation Transcript

    • Semantic Intensity Spectrum
      • At least 25 systems have been developed for ontology alignment and matching.
      • A classification technique for ontology alignment approaches
      • Based on semantic intensity
        • Semantics: the intended meanings of ontological entities
        • Some methods consider only syntactic features (semantically poor)
          • E.g. string distance, string equality
        • Semantics are added via:
          • Meanings of words provided by external lexicons
          • Positions in taxonomies
          • Relations with other ontological entities
          • Logical entailment
          • Classified instance data
      • Project aims & targets
      • Timeline and deliverables
      • Semantic Intensity Spectrum
      • Modular architecture
      • Algorithms
      • CMS
      • Evaluation
      • Lessons learnt & future work
    • Semantic Intensity Spectrum: SIS diagram (http://www.aktors.org/crosi/si-spectrum/)
    • Semantic Intensity Spectrum: Existing systems
      • Duplicated effort in developing largely overlapping algorithms
        • Re-implemented algorithms, e.g. string distance
        • Similar heuristic rules
      • Different performance and different results on the same test sets
    • Semantic Intensity Spectrum: Diversity of existing systems
    • Semantic Intensity Spectrum: A possible solution
      • Combine existing systems to minimise development effort
      • Possibility of combining in a meaningful way
        • Many systems produce output in compatible formats
        • Heterogeneous outputs need to be normalised using heuristic rules, e.g. converting “more general than” into a numeric value
      • Reuse available packages
        • E.g. SecondString for computing string distance
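The normalisation step mentioned above could be sketched as follows. This is a minimal illustration, not the CROSI implementation: the relation labels and the numeric values assigned to them are assumptions chosen only to show the idea of mapping qualitative matcher output onto a common [0, 1] scale.

```python
# Hypothetical mapping from qualitative relations to numeric scores.
# The labels and values are illustrative assumptions, not from the slides.
RELATION_SCORES = {
    "equivalent": 1.0,
    "more general than": 0.5,
    "more specific than": 0.5,
    "disjoint": 0.0,
}


def normalise(result):
    """Map a matcher result (a number or a relation label) to a float in [0, 1]."""
    if isinstance(result, (int, float)):
        # Numeric outputs are clamped to the common scale.
        return max(0.0, min(1.0, float(result)))
    label = result.strip().lower()
    if label not in RELATION_SCORES:
        raise ValueError(f"unknown relation: {result!r}")
    return RELATION_SCORES[label]
```

With this table, `normalise("More general than")` yields 0.5 and a numeric score such as 0.25 passes through unchanged.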
    • A principled architecture
    • A principled architecture: Signature Extraction
      • Ontologies can be captured with a set of ontological signatures
        • Local signatures:
          • Labels, IDs, and URIs
          • Declared properties, property domains and ranges
          • Equivalent and complement classes, inverse and functional properties
          • Instantiated classes
        • Global signatures:
          • Super-, sub-classes, properties
          • Disjoint classes
          • Sibling classes
          • Comments, version information
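One way to picture the local/global split is as two small record types. The sketch below is an assumption made for illustration: the class names, field names and the `global_signature` helper are hypothetical, and only part of each signature is modelled. It derives the taxonomy-related part of a global signature from a simple child-to-parent map.

```python
from dataclasses import dataclass, field


@dataclass
class LocalSignature:
    """Features read directly from an entity's own definition."""
    label: str
    uri: str
    declared_properties: list = field(default_factory=list)
    equivalent_classes: list = field(default_factory=list)


@dataclass
class GlobalSignature:
    """Features that depend on the entity's position in the ontology."""
    superclasses: list = field(default_factory=list)
    subclasses: list = field(default_factory=list)
    siblings: list = field(default_factory=list)


def global_signature(name, parent_of):
    """Derive super-, sub- and sibling classes from a child -> parent map."""
    supers = []
    current = parent_of.get(name)
    while current is not None:
        supers.append(current)
        current = parent_of.get(current)
    subs = sorted(c for c, p in parent_of.items() if p == name)
    parent = parent_of.get(name)
    siblings = []
    if parent is not None:
        siblings = sorted(c for c, p in parent_of.items() if p == parent and c != name)
    return GlobalSignature(superclasses=supers, subclasses=subs, siblings=siblings)
```

The remaining signature items (disjoint classes, comments, version information, instantiated classes) would be further fields filled in from the ontology in the same way.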
    • A principled architecture: Multiple matchers
      • Specialised internal matchers targeting particular signatures
        • Name matchers
          • String distance based matchers
          • WordNet based matchers
        • Class matchers
          • Taxonomy based matchers
          • Definition based matchers
      • Invoking existing ontology matching/alignment systems as external matchers
        • FOAM API
        • INRIA Alignment API
    • CMS design commitments
      • Avoid reinventing the wheel
        • Use existing packages to enhance internal matchers
        • Use existing mapping/alignment systems as external matchers
      • Semantically enriched matchers based on the definition of concepts
        • Propagate similarity along concept hierarchies
        • Refine concept similarity by taking into account the names, domains and ranges of declared properties
        • Compute similarity using WordNet hierarchies
    • String distance
      • Reuse of existing packages
        • SecondString Metrics
          • Jaro, MongeElkan, NeedlemanWunsch, etc.
        • Soundex Metrics
      • Consider only the local names of ontological entities
        • Namespace is ignored
        • Names of super(sub)classes are ignored
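A rough sketch of a name matcher that looks only at local names is given below. It is not the SecondString API (a Java library): a plain Levenshtein distance written in Python is swapped in as a stand-in for the metrics listed above, and the function names and the URI-splitting rule are assumptions.

```python
def local_name(uri):
    """Return the part after '#', or failing that after the last '/'."""
    if "#" in uri:
        return uri.rsplit("#", 1)[1]
    return uri.rstrip("/").rsplit("/", 1)[-1]


def levenshtein(a, b):
    """Classic edit distance, computed one row at a time."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost))
        prev = curr
    return prev[-1]


def name_similarity(uri_a, uri_b):
    """Similarity in [0, 1] between the local names; namespaces are ignored."""
    a, b = local_name(uri_a).lower(), local_name(uri_b).lower()
    longest = max(len(a), len(b))
    if longest == 0:
        return 1.0
    return 1.0 - levenshtein(a, b) / longest
```

Because the namespace is dropped, `http://a.example/onto#Author` and `http://b.example/vocab/author` compare as identical names under this sketch.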
    • WordNet-based algorithms
      • Uses JWNL (Java WordNet Library)
      • Names only
        • Synonyms are retrieved and compared with string equality or string distance
        • Composite names are split and stop words are removed
          • E.g. “has_name” => “name”
      • WordNet hierarchy
        • Calculate the distance between two words
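The name-only step might look roughly like the sketch below. The stop-word list and the tiny synonym table are assumptions standing in for a real WordNet lookup (the slides use JWNL for that), and the camelCase splitting rule is an added convenience not mentioned in the slides.

```python
import re

# Illustrative stop-word list (assumed, not from the slides).
STOP_WORDS = {"has", "is", "of", "the", "a", "an"}

# Toy synonym table standing in for a WordNet lookup.
SYNONYMS = {"author": {"writer", "creator"}}


def tokens(name):
    """Split a composite name on separators and camelCase, then drop stop words."""
    spaced = re.sub(r"(?<=[a-z0-9])(?=[A-Z])", " ", name)
    parts = re.split(r"[\s_\-]+", spaced)
    return [p.lower() for p in parts if p and p.lower() not in STOP_WORDS]


def expand(words):
    """Add the known synonyms of each word to the set."""
    return words | {s for w in words for s in SYNONYMS.get(w, ())}


def names_match(a, b):
    """True when the two names share a token after synonym expansion."""
    return bool(expand(set(tokens(a))) & expand(set(tokens(b))))
```

Here `tokens("has_name")` reduces to `["name"]`, matching the example on the slide, and `names_match("has_author", "writer")` succeeds through the toy synonym table. String equality is used for the comparison; a string distance could be substituted.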
    • WordNet-based algorithms: WordNet hierarchy
      • h: the distance between “Word” and Root
      • h’: the distance between “Word’” and Root
      • H: the distance between the common subsumer of “Word” and “Word’” and Root
      • Similarity between “Word” and “Word’” is computed as 2H/(h+h’)
      [Figure: hierarchy with Root, Common Subsumer, Word and Word’, annotated with H, h and h’]
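The 2H/(h+h’) measure can be tried out on a toy hierarchy. The sketch below assumes a tree with a single root, encoded as a child-to-parent map; WordNet itself allows several hypernyms per sense, which this simplification ignores. Function and variable names are illustrative.

```python
def path_to_root(node, parent_of):
    """List of nodes from `node` up to the root, inclusive."""
    path = [node]
    while node in parent_of:
        node = parent_of[node]
        path.append(node)
    return path


def hierarchy_similarity(word, word_prime, parent_of):
    """Compute 2H / (h + h') for two nodes of a single-rooted tree."""
    p1 = path_to_root(word, parent_of)
    p2 = path_to_root(word_prime, parent_of)
    ancestors2 = set(p2)
    # p1 runs from the word upwards, so the first shared node is the deepest one.
    subsumer = next(n for n in p1 if n in ancestors2)
    h, h_prime = len(p1) - 1, len(p2) - 1
    big_h = len(path_to_root(subsumer, parent_of)) - 1
    if h + h_prime == 0:
        return 1.0  # both words are the root itself
    return 2 * big_h / (h + h_prime)
```

For the map `{"animal": "entity", "dog": "animal", "cat": "animal", "plant": "entity"}`, "dog" and "cat" share the subsumer "animal" (H = 1, h = h’ = 2), giving 0.5, while "dog" and "plant" only share the root and score 0.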
    • Canonical Name
      • C=> A.B.D.C
      • C’=> A’.B’.D’.E’.C’
      • Compute the similarity between C and C’ as well as the respective similarity between every pair of super classes of C and C’
      • Penalise the similarity between C and C’ with those of their super classes
      [Figure: two class hierarchies, one with classes A, B, D and C, the other with A’, B’, D’, E’ and C’]
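The slides do not state how the penalty is calculated, so the following is only one possible reading. It assumes each canonical name is a list of class names from the root down to the class, scores each superclass of C by its best match among the superclasses of C’, and damps the similarity of C and C’ by that context score. The `weight` parameter and the exact-match name comparison are hypothetical choices.

```python
def exact(a, b):
    """Placeholder name matcher: 1.0 on case-insensitive equality, else 0.0."""
    return 1.0 if a.lower() == b.lower() else 0.0


def canonical_similarity(path_a, path_b, name_sim=exact, weight=0.5):
    """Similarity of two classes given their canonical names (root first, class last)."""
    *supers_a, leaf_a = path_a
    *supers_b, leaf_b = path_b
    leaf = name_sim(leaf_a, leaf_b)
    if not supers_a or not supers_b:
        return leaf
    # Best match for every superclass of the first class, averaged.
    best = [max(name_sim(a, b) for b in supers_b) for a in supers_a]
    context = sum(best) / len(best)
    # Assumed penalty: poor superclass agreement scales the leaf score down.
    return leaf * ((1 - weight) + weight * context)
```

With the defaults, two classes named "Author" whose superclass chains agree on one of two names score 0.75 rather than 1.0. Note that this particular scheme is not symmetric in its two arguments.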
    • Structure algorithm: f(name similarity, domain similarity, range similarity)
      [Figure: properties P1(C, B), P2, P3 and P1’(C’, B’), P2’, P3’, shown with the class hierarchies forming the domain and range of P1 and of P1’]
    • Structure algorithm (cont’d)
      • Structure
        • Retrieve declared properties
        • For each property, retrieve its domains and ranges
        • Compare each property’s name, domain and range
      • StructurePlus
        • Also compare the super- and sub-classes of each property’s domain and range
      • When comparing domains and ranges, use existing name-matching techniques
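The Structure steps can be sketched as below. The slides leave f unspecified, so the unweighted mean of the three similarities and the best-match averaging across properties are assumptions, as are the `Property` record and the exact-match name comparison. StructurePlus is not covered.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Property:
    """A declared property with a single domain and range class (a simplification)."""
    name: str
    domain_class: str
    range_class: str


def exact(a, b):
    """Placeholder name matcher: 1.0 on case-insensitive equality, else 0.0."""
    return 1.0 if a.lower() == b.lower() else 0.0


def property_similarity(p, q, name_sim=exact):
    """Assumed f: the mean of name, domain and range similarity."""
    return (
        name_sim(p.name, q.name)
        + name_sim(p.domain_class, q.domain_class)
        + name_sim(p.range_class, q.range_class)
    ) / 3


def structure_similarity(props_a, props_b, name_sim=exact):
    """Average, over the first class's properties, of the best match in the second."""
    if not props_a or not props_b:
        return 0.0
    best = [max(property_similarity(p, q, name_sim) for q in props_b) for p in props_a]
    return sum(best) / len(best)
```

Any name matcher returning a value in [0, 1] can be passed as `name_sim`, which mirrors the slide's point that domains and ranges are compared with existing name-matching techniques.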
    • Implemented matchers powered by existing Java libraries
    • Post alignment
      • Aggregator
        • Weighted average based aggregation
        • Weights are manually set by users
      • Evaluator
        • Nothing is more qualified than a human inspector with domain knowledge and experience
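Weighted-average aggregation with user-supplied weights could be sketched as follows. The matcher names and the rule of skipping matchers that have no weight are assumptions made for the example.

```python
def aggregate(scores, weights):
    """Weighted average of per-matcher scores for one candidate mapping.

    `scores` and `weights` are dicts keyed by matcher name; matchers
    without a weight are ignored.
    """
    used = [m for m in scores if m in weights]
    total = sum(weights[m] for m in used)
    if total == 0:
        raise ValueError("no non-zero weight for the supplied matchers")
    return sum(scores[m] * weights[m] for m in used) / total
```

For example, scores of 0.8, 0.6 and 0.4 from three matchers weighted 2, 1 and 1 aggregate to 0.65. Changing the weights changes the ranking of candidate mappings, which is what running with different weights explores.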
    • CMS: deployment options
      • Run CMS from command line
        • A batch file is provided
      • Invoke CMS as an API
      • Run CMS as a service
        • via JSP interface
    • Demo
      • Ontologies: web directory, small size and simple structure
      • Run with different weights
      • Output to different formats
        • OWL, SKOS, HTML, XML (OAEI)
    • Use of SIS
      • Claimed functionalities of existing systems can be justified against this spectrum
      • A reference for selecting the right mapping techniques for a particular problem
      • A designer’s aid for navigating through different mapping approaches with emphasis on the use of semantics