Ontologies and Similarity

Keynote Talk at Int Conf on Case-based Reasoning

Ontologies and Similarity Presentation Transcript

  • 1. Ontologies and Similarity
    Steffen Staab
    Acknowledgements to Claudia d’Amato, Univ Bari,
    & WeST Team
  • 2. Agenda
    Kris: Broccoli is a vegetable used in stir-fry
    Motivation: What are example semantic applications?
    Foundation: What is an ontology?
    Reality Check: What are typical ontologies?
    Survey: How is similarity measured in ontologies?
    Critique: What should be measured?
    Solution: A preliminary solution
    Conclusion: What to do now?
  • 3. Motivation
    Semantic Applications
    Check out: http://challenge.semanticweb.org/
  • 4. Linked Data
    Cases with Metadata without Frontiers
  • 5. Semantic Search & Browsing: Semantic Portals
    [WWW 2000]
    http://ontoprise.com
  • 6. Faceted Semantic Media Browsing: Semaplorer
    Winner Billion Triples Challenge 2008 [JoWS 2009]
    http://kreuzverweis.com
  • 7. Semantic Desktop
    Additional Semantic Meta Data, e.g. sender, subject
    Access to further PIM tools
  • 8. Mobile Exploration of Linked Data: Mobile Facets
  • 9. Lessons Learned
    Examples + http://challenge.semanticweb.org
    Semantic Boolean search in conjunction with keyword search dominates in
    • Ontology-based applications
    • Linked data applications
    Feast or famine
    Further use of similarity
    • Learning
    • Ontology engineering advice
    Available
    • IR ranking
    • (Textual) similarity
    Needed
    • Semantic ranking
    • Semantic similarity
    [Franz et al 09]
    [stuffhere], BUT
  • 14. What is an Ontology?
    Foundation
  • 15. What is an ontology?
    What for?
    Agreements that make linked data more useful
    Reasoning
    Gruber 1993:
    An ontology is an “explicit specification of a conceptualization”
    Oberle, Guarino, Staab. What is an ontology? Handbook on Ontologies, Springer 2009.
  • 16. Observations in the Real World
  • 17. A Model of the Real World
    [Diagram: individuals Manager(I034820), Researcher(I046758), Employee(I050000), Researcher(I044443), connected by three "knows" edges and one "cooperates" edge]
  • 18. Abstracting from the Individual Model
    [Diagram: classes Manager, Researcher, Employee, Researcher, connected by three "knows" edges and one "cooperates" edge]
  • 19. A Conceptual Model
    Intensional Relations
    Unary
    Manager
    Researcher
    Employee
    Binary
    cooperates
    knows
    Cognitive Bias
    Perception
    Knowledge
    Belief
    The conceptual model captures what is invariant according to one's conceptualization of the world
  • 20. Formal Specification
    What makes it so hard to formally specify ontological commitment?
    Algebraic relations do not work:
    • Defined extensionally
    • E.g. Lecturer1 = {Ashwin, Nirmalie, Steffen, Kris, …}
    • Problem: A new instance would change the ontology, e.g. Lecturer2 = Lecturer1 ∪ {Fernando}
    Intensional relations need to be defined in a higher-order language:
    • Specify the intended models where one may quantify over sets of individuals
    An ontology is a theory (typically in a first-order logical language) where the possible models approximate the intended models “as well as possible”
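The contrast between the two kinds of definition can be sketched in a few lines of Python. This is only an illustration: the attribute name `gives_lectures` is a hypothetical stand-in for whatever condition defines a lecturer, and the slide's open-ended name list is cut to the four names it spells out.

```python
# Extensional definition: the class is nothing but the set of its current instances.
lecturer_v1 = frozenset({"Ashwin", "Nirmalie", "Steffen", "Kris"})
lecturer_v2 = lecturer_v1 | {"Fernando"}

# Adding a single individual yields a different set, hence a different definition.
extension_changed = lecturer_v1 != lecturer_v2


# Intensional definition: the class is a condition that individuals may satisfy.
# "gives_lectures" is an assumed, illustrative attribute.
def is_lecturer(individual: dict) -> bool:
    return bool(individual.get("gives_lectures", False))


people = [
    {"name": "Steffen", "gives_lectures": True},
    {"name": "Fernando", "gives_lectures": True},
    {"name": "Visitor", "gives_lectures": False},
]

# The definition stays the same no matter how many individuals are added.
lecturers = [p["name"] for p in people if is_lecturer(p)]
print(extension_changed, lecturers)
```

The extensional variant has to be edited whenever the data changes, while the predicate is stable, which is the property the slide asks of an ontology.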
  • 23. Conceptualization
    [Diagram labels, in extraction order]
    Reality; Perception; Phenomena; Presentation patterns; State of affairs (twice)
    Conceptualization: relevant invariants across presentation patterns: D, ℜ
    Language L; Interpretations I; Models M_D'(L)
    Ontological commitment K (selects D' ⊂ D and ℜ' ⊂ ℜ)
    Intended models for each I_K(L)
    Ontology; Ontology models; Bad Ontology; ~Good Ontology
    Slide by Nicola Guarino
  • 24. Description Logics: First-order language(s) for ontology
    T-Box
    Describing Relations Intensionally
    Flight ⊑ Service.
    Flight ⊑ ∃to.Airport
    Flight ⊑ ∀to.Airport
    Flight ⊑ ∃from.Airport
    Flight ⊑ ∀from.Airport
    approachedBy ⊇ to⁻¹
    FlightFromDE = Flight ∩ ∃from.(Airport ∩part.{DE})
    A-Box
    Describing Relations Extensionally
    Flight(LH123).
    Flight(BA121).
    Airport(FRA).
    from(LH123,FRA).
    to(LH123,LHR).

    Key Feature: Classes (unary relations) are defined by relations to definitions of other classes
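To make the T-Box/A-Box split concrete, here is a rough Python sketch that stores the A-Box facts from the slide as sets and checks one T-Box condition (every Flight has some `from` and some `to` filler in Airport) against them. It is deliberately naive: it only looks at asserted facts (a closed-world simplification), and the helper name `exists_filler` is made up for this example.

```python
# A-Box facts copied from the slide, stored as plain sets and pairs.
flights = {"LH123", "BA121"}
airports = {"FRA"}
from_rel = {("LH123", "FRA")}
to_rel = {("LH123", "LHR")}


def exists_filler(relation, subject, filler_class):
    """True if `subject` has a successor via `relation` that is asserted in `filler_class`."""
    return any(s == subject and o in filler_class for s, o in relation)


# Flights for which no asserted fact witnesses the existential restriction.
missing_from = {f for f in flights if not exists_filler(from_rel, f, airports)}
missing_to = {f for f in flights if not exists_filler(to_rel, f, airports)}

print(sorted(missing_from))
print(sorted(missing_to))
```

Under description logic semantics these gaps are not errors. A reasoner works under the open-world assumption and would conclude that suitable fillers exist even though they are not named, which is one reason real systems use a reasoner rather than a lookup like this.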
  • 25. Description Logics: First-order language(s) for ontology
    T-Box
    Describing Relations Intensionally
    Flight ⊑ Service.
    Flight ⊑ ∃to.Airport
    Flight ⊑ ∀to.Airport
    Flight ⊑ ∃from.Airport
    Flight ⊑ ∀from.Airport
    domain(to) ⊇ Flight
    FlightFromDE = Flight ∩ ∃from.(Airport ∩part.{DE})
    A-Box
    Describing Relations Extensionally
    Flight(LH123).
    Flight(BA121).
    Airport(FRA).
    from(LH123,FRA).
    to(LH123,LHR).

    • Typically decidable and intractable
    • Pragmatically tractable for 10^5 concepts
    • Often most useful at design time only
  • What are typical Ontologies?
    Reality Check
  • 28. Examples for Ontologies & Thesauri
    Foundational Model of Anatomy
    • 78K classes in FMA 2.0
    • Several translations to OWL for discovering modeling problems ([Noy & Rubin; Bodenreider et al])
    SNOMED CT (Systematized Nomenclature of Medicine -- Clinical Terms)
    • Representation in description logics language EL++
    • 10^6 classes
    Dewey Decimal System
    • Internationally used thesaurus for forming pre-coordinated classes from an inventory of codes
  • Example from Dewey Decimal
    590 Animals (Zoology)
    770 Photography, Computer Art
    597.96 Serpentes
    779.32 Photography of Animals
    779.32796 Photography of Snakes
    Core message of this talk:
    Influencing also non-OWL ontologies/thesauri
    Concepts are defined based on the relationship to the definition of other concepts, affecting similarity
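The Dewey example can be read as string composition: the code for "Photography of Snakes" looks like the base code 779.32 extended by the digits of 597.96 that follow the zoology prefix 59. The sketch below encodes that reading; the function name `precoordinate` and the prefix-stripping rule are assumptions made for illustration, not a statement of the official Dewey number-building rules.

```python
def precoordinate(base: str, source: str, strip_prefix: str) -> str:
    """Append to `base` the digits of `source` that follow `strip_prefix`.

    Assumed rule, inferred from the single example on the slide.
    """
    digits = source.replace(".", "")
    if not digits.startswith(strip_prefix):
        raise ValueError(f"{source} does not start with prefix {strip_prefix}")
    return base + digits[len(strip_prefix):]


# 779.32 Photography of Animals + 597.96 Serpentes
snakes_photo = precoordinate("779.32", "597.96", "59")
print(snakes_photo)  # 779.32796
```

The point for similarity is that the composed code is defined through two other codes, so any measure over the thesaurus has to account for both of them.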
  • 31. How is similarity measured in ontologies?
    Survey
  • 32. Example Ontology
    [Diagram: nodes Airport, Service, Europe, Hub, Flight; airports LHR, FCO, LCY, FRA; regions IT, UK, DE; flight classes FRA-LCY, FRA-LHR, FRA-FCO; edges labelled "part" and "to"]
    Including “invariant” A-Box facts (like Airport(FRA))
  • 33. Similarity Measurement Tasks
    Comparing Classes
    Comparing Objects
    • Based on object features
    • Based on class comparisons
    Comparing Ontologies
    • Lexeme comparisons
    • Graph comparison
    • Considering the semantics of hierarchies
      • isa
      • part
      • Other relations
    Related to
    • Ontology learning
    • Ontology alignment
    Based on
    • Class comparisons
  • Class Comparisons in Materialized Hierarchies
    [Diagram: nodes Airport, Service, Europe, Flight; airports LHR, FCO, LCY, FRA; regions IT, UK, DE; flight classes FRA-LCY, FRA-LHR, FRA-FCO; edges labelled "part" and "to"]
  • 41. Class Comparisons in Materialized Hierarchies
    [Diagram: nodes Airport, Service, Europe, Flight; airports LHR, FCO, LCY, FRA; regions IT, UK, DE; added classes Flight-DE-UK and Flight-DE-IT; flight classes FRA-LCY, FRA-LHR, FRA-FCO; edges labelled "part" and "to"]
    How many yellow concepts?
    • Infinitely many in powerful DL languages
  • 42. Intensional Counting of Path Length
    sim(FRA-LHR, FRA-LCY) ~ 1/Path1 = 1/2
    sim(FRA-LHR, FRA-FCO) ~ 1/Path2 = 1/4
    3 important observations:
    • Most papers investigate dampening, i.e. higher links indicate more dissimilarity
    • Absolute similarity values are mostly irrelevant (as in CBR)
    • Most information in the ontology will be discarded
    [Diagram: hierarchy with Service, Flight, Flight-DE-UK, Flight-DE-IT, FRA-LCY, FRA-LHR, FRA-FCO]
    [Rada et al.'89] ff
  • 45. Intensional Counting of Path Length
    sim(FRA-LHR, FRA-LCY) ~ 1/Path1 = 1/min(2,3) = 1/2
    sim(FRA-LHR, FRA-FCO) ~ 1/Path2 = 1/min(4,2) = 1/2
    [Diagram: hierarchy with Service, Flight, Flight-DE-UK, Flight-DE-IT, FlightFromHub, FlightToHub, FlightFrom+ToHub, FRA-LCY, FRA-LHR, FRA-FCO]
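The path-length values on the two slides above can be reproduced with a breadth-first search over the isa links. The hierarchy below is a reconstruction from the node labels and the stated path lengths (2 and 4 without the hub classes, min(2,3) and min(4,2) with them); the exact edges are an assumption, since the original diagram did not survive extraction.

```python
from collections import deque
from fractions import Fraction

# child -> list of direct superclasses (assumed reconstruction of the slide diagram)
TREE = {
    "Flight": ["Service"],
    "Flight-DE-UK": ["Flight"],
    "Flight-DE-IT": ["Flight"],
    "FRA-LHR": ["Flight-DE-UK"],
    "FRA-LCY": ["Flight-DE-UK"],
    "FRA-FCO": ["Flight-DE-IT"],
}
# Same hierarchy after adding the hub-related classes.
DAG = {
    **TREE,
    "FlightFromHub": ["Flight"],
    "FlightToHub": ["Flight"],
    "FlightFrom+ToHub": ["FlightFromHub", "FlightToHub"],
    "FRA-LHR": ["Flight-DE-UK", "FlightFrom+ToHub"],
    "FRA-LCY": ["Flight-DE-UK", "FlightFromHub"],
    "FRA-FCO": ["Flight-DE-IT", "FlightFrom+ToHub"],
}


def shortest_path(parents, start, goal):
    """Number of isa edges on the shortest undirected path between two classes."""
    adjacency = {}
    for child, supers in parents.items():
        for sup in supers:
            adjacency.setdefault(child, set()).add(sup)
            adjacency.setdefault(sup, set()).add(child)
    queue, seen = deque([(start, 0)]), {start}
    while queue:
        node, dist = queue.popleft()
        if node == goal:
            return dist
        for nxt in adjacency.get(node, ()):
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, dist + 1))
    raise ValueError("classes are not connected")


def sim_path(parents, c, d):
    return Fraction(1, shortest_path(parents, c, d))


print(sim_path(TREE, "FRA-LHR", "FRA-LCY"), sim_path(TREE, "FRA-LHR", "FRA-FCO"))
print(sim_path(DAG, "FRA-LHR", "FRA-LCY"), sim_path(DAG, "FRA-LHR", "FRA-FCO"))
```

Adding three classes that say nothing new about FRA-FCO changes its similarity to FRA-LHR from 1/4 to 1/2, which illustrates how sensitive path counting is to modelling choices.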
  • 46. `Improved´ Intensional Counting of Path Length
    cotopy(C) = {D | isa*(C,D)}
    sim_cotopy(C,D) ~ |cotopy(C) ∩ cotopy(D)| / |cotopy(C) ∪ cotopy(D)|
    sim_cotopy(FRA-LHR, FRA-FCO) ~ 5/9
    Further dampening possible
    [Diagram: hierarchy with Service, Flight, Flight-DE-UK, Flight-DE-IT, FlightFromHub, FlightToHub, FlightFrom+ToHub, FRA-LCY, FRA-LHR, FRA-FCO]
  • 47. `Improved´ Intensional Counting of Path Length - Jaccard
    cotopy(C) = {D | isa*(C,D)}
    sim_cotopy(C,D) ~ |cotopy(C) ∩ cotopy(D)| / |cotopy(C) ∪ cotopy(D)|
    sim_cotopy(FRA-LHR, FRA-FCO) ~ 5/9
    sim_cotopy(FRA-LHR, FRA-LCY) ~ 4/8
    [Diagram: hierarchy with Service, Flight, Flight-DE-UK, Flight-DE-IT, FlightFromHub, FlightToHub, FlightFrom+ToHub, FRA-LCY, FRA-LHR, FRA-FCO]
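The cotopy measure is a Jaccard coefficient over the sets of (reflexive, transitive) superclasses. A minimal sketch follows, assuming a hierarchy reconstructed from the node labels so that it yields the slide's values 5/9 and 4/8; the edge set in `PARENTS` is that assumption, not something the transcript states.

```python
from fractions import Fraction

# child -> direct superclasses (assumed reconstruction of the slide diagram)
PARENTS = {
    "Flight": ["Service"],
    "Flight-DE-UK": ["Flight"],
    "Flight-DE-IT": ["Flight"],
    "FlightFromHub": ["Flight"],
    "FlightToHub": ["Flight"],
    "FlightFrom+ToHub": ["FlightFromHub", "FlightToHub"],
    "FRA-LHR": ["Flight-DE-UK", "FlightFrom+ToHub"],
    "FRA-LCY": ["Flight-DE-UK", "FlightFromHub"],
    "FRA-FCO": ["Flight-DE-IT", "FlightFrom+ToHub"],
}


def cotopy(c):
    """All D with isa*(c, D): the class itself plus every direct or indirect superclass."""
    result, stack = {c}, [c]
    while stack:
        for sup in PARENTS.get(stack.pop(), ()):
            if sup not in result:
                result.add(sup)
                stack.append(sup)
    return result


def sim_cotopy(c, d):
    a, b = cotopy(c), cotopy(d)
    return Fraction(len(a & b), len(a | b))


print(len(cotopy("FRA-LHR")), len(cotopy("FRA-LCY")))
print(sim_cotopy("FRA-LHR", "FRA-FCO"))  # 5/9
print(sim_cotopy("FRA-LHR", "FRA-LCY"))  # 4/8, printed in lowest terms as 1/2
```

`Fraction` reduces 4/8 to 1/2, so the second value matches the slide even though it prints differently.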
  • 48. Intension-based Similarity Measurement
    Strengths
    Works somehow
    Weaknesses
    Both path counting and cotopy suffer heavily from modelling artefacts in the ontology
  • 49. Counting Extensions – Jaccard-like Metrics
    sim(FRA-LHR, FlightToHub) ~ |FRA-LHR ∩ FlightToHub| / |FRA-LHR ∪ FlightToHub| = 3/6
    sim(FRA-LHR, FRA-LCY) ~ |FRA-LHR ∩ FRA-LCY| / |FRA-LHR ∪ FRA-LCY| = 0/4
    Disjointness incompatibility
    [Diagram: hierarchy with Service, Flight, Flight-DE-UK, Flight-DE-IT, FlightFromHub, FlightToHub, FlightFrom+ToHub, FRA-LCY, FRA-LHR, FRA-FCO, and instances LH127, LH123, BA124, BA121, LH345, LH567, AI234]
    [Resnik ‘95-‘99]
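The extension-based variant applies the same Jaccard ratio to sets of instances rather than sets of superclasses. In the sketch below the assignment of flight numbers to classes is hypothetical: the transcript lists the seven instances but not which class each belongs to, so the sets were chosen only to match the cardinalities 3/6 and 0/4.

```python
from fractions import Fraction

# Hypothetical instance assignment, chosen to match the counts on the slide.
EXTENSION = {
    "FRA-LHR": {"LH123", "BA121", "BA124"},
    "FRA-LCY": {"LH127"},
    "FRA-FCO": {"LH345", "LH567", "AI234"},
}
# Assumption: flights to a hub are those to LHR or FCO.
EXTENSION["FlightToHub"] = EXTENSION["FRA-LHR"] | EXTENSION["FRA-FCO"]


def sim_extension(c, d):
    """Jaccard coefficient over the instance sets of two classes."""
    a, b = EXTENSION[c], EXTENSION[d]
    return Fraction(len(a & b), len(a | b))


print(sim_extension("FRA-LHR", "FlightToHub"))  # 3/6, printed as 1/2
print(sim_extension("FRA-LHR", "FRA-LCY"))      # 0/4, printed as 0
```

The second result shows the disjointness incompatibility: two flight classes that are intuitively close get similarity 0 simply because no individual flight belongs to both.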
  • 50. Extension-based Similarity
    Strengths
    Counting extensions seems natural and efficient (Jaccard-like measure)
    Weaknesses
    Disjointness incompatibility
    Classes are similar, but do not share instances:
    • Male – Female
    • Housecat – Lion
    Extensions are uncountable
    Ontologies are supposed to abstract from specific extensions!
    Extensions may be infinite
  • 52. Class-syntax-based Similarity
    Quite frequent in the literature
    Listed here just for the sake of completeness, because…
    Class-syntax-based similarity is
    equivalence unsound
  • 53. What should Similarity Deliver?
    Critique
    [d‘Amato et al 2008]
  • 54. Core criteria for similarity measures – almost unchanged
    Positiveness: ∀C,D: sim(C,D) ≥ 0
    Strong reflexivity: ∀C: sim(C,C) = 1
    Upper bound: ∀C,D: sim(C,D) ≤ 1
    Symmetry: ∀C,D: sim(C,D) = sim(D,C)
    Problem with strong reflexivity:
    FlightFromDEHub = Flight ∩ ∃from.(Hub ∩ part.{DE})
    FromHubAndFromDE = ∃from.Hub ∩ ∃from.part.{DE}
    Reasoning is needed to discover that
    sim(FlightFromDEHub, FromHubAndFromDE) = 1
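The four core criteria can be checked mechanically for any candidate measure over a finite set of concepts. The sketch below does that for a toy Jaccard measure on feature sets; the function name `check_core_criteria` and the feature-set encoding of concepts are assumptions for illustration. Note that it only tests sim(C,C) for syntactically identical concepts, so it cannot detect the problem raised on the slide, where two differently written definitions are equivalent.

```python
from itertools import product


def check_core_criteria(sim, concepts, tol=1e-12):
    """Report which of the four core criteria hold for `sim` on `concepts`."""
    report = {
        "positiveness": True,
        "strong_reflexivity": True,
        "upper_bound": True,
        "symmetry": True,
    }
    for c in concepts:
        if abs(sim(c, c) - 1.0) > tol:
            report["strong_reflexivity"] = False
    for c, d in product(concepts, repeat=2):
        value = sim(c, d)
        if value < -tol:
            report["positiveness"] = False
        if value > 1.0 + tol:
            report["upper_bound"] = False
        if abs(value - sim(d, c)) > tol:
            report["symmetry"] = False
    return report


def jaccard(a, b):
    """Toy similarity: Jaccard coefficient of two non-empty feature sets."""
    return len(a & b) / len(a | b)


concepts = [
    frozenset({"flight", "from_hub"}),
    frozenset({"flight", "to_hub"}),
    frozenset({"flight"}),
]
report = check_core_criteria(jaccard, concepts)
print(report)
```

Passing such a check on a sample is only evidence, not proof, that a measure satisfies the criteria in general.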
  • 55. Additional Ones in Ontologies!
    5. Prevent disjointness incompatibility (seen before)
    6. Equivalence soundness:
    ∀C,D,E: D ≡ E ⇒ sim(C,D) = sim(C,E)
    Example:
    sim(Flight, FlightFromDEHub) =
    sim(Flight, FromHubAndFromDE)
    Proposition: Reflexivity and triangle inequality imply equivalence soundness
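One way to see why the proposition holds is the short derivation below. It rests on assumptions the slide does not spell out: the triangle inequality is taken to apply to the dissimilarity d = 1 − sim, reflexivity is read semantically (equivalent concepts have similarity 1), and symmetry is used as well.

```latex
\begin{align*}
d(X,Y) &:= 1 - sim(X,Y), \qquad
  D \equiv E \;\Rightarrow\; sim(D,E) = 1 \;\Rightarrow\; d(D,E) = d(E,D) = 0 \\
d(C,D) &\le d(C,E) + d(E,D) = d(C,E) \\
d(C,E) &\le d(C,D) + d(D,E) = d(C,D) \\
\Rightarrow\quad d(C,D) &= d(C,E)
  \;\Rightarrow\; sim(C,D) = sim(C,E)
\end{align*}
```

The two inequalities bound each distance by the other, so they must be equal, which is exactly equivalence soundness.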
  • 56. Additional Ones in Ontologies!
    7. Monotonicity
    CL, DL, CU, DU,
    EU, E⊆L
    ∃H such thatCH, EH, DH
     sim(C,D)  sim(C,E)
    U
    L
    C
    D
    E
    My feeling is: we need more!
    (continuity,…)
  • 57. A Preliminary Solution
    Solution
    [d‘Amato et al 2010]
  • 58. Core idea: Combine Cotopy & Extension-based Approaches
    Cotopy-based: Intersection at the Least Common Subsumer
    Extension-based: Count instances (or subclasses)
    Venn diagrams indicate: sim(C,D) > sim(C,E)
    [Venn diagram labels: C, D, gcs(C,D); C, E, gcs(C,E)]
  • 59. Indirect (tentative) Indication of Correctness
    Growing an indexing tree by clustering with the new similarity measure
    Comparing querying time for different ontologies using the original hierarchy and the indexing tree derived from the similarity measure
    Problem: similarity computation too expensive
    [d‘Amato et al 2010]
  • 60. What to do now?
    Conclusion
  • 61. Conclusion
    Conclusion: A call to arms!
    • Semantic applications cover many domains of commercial and social interest
    • Ontologies provide the modeling backbone and are even found in unexpected places
    • Similarity measures for ontologies exist and give back some results
    • Criteria for semantic similarity measures are still in the making
    • There is a lack of theory for ontology-based similarity
    • There is a lack of efficient realization of ontology-based similarity
    Targeted Side Effect:
    Clarification of Some Often Mistaken Use of Terminology around Ontologies
  • 67. Institut WeST – Web Science & Technologies
    Thank You!
    Semantic Web
    Web Retrieval
    Interactive Web
    Multimedia Web
    Software Web
    eGovernment
    eMedia
    eScience
    eOrganizations
    eCitizen