Ontologies and Similarity

Keynote Talk at Int Conf on Case-based Reasoning

Transcript

  • 1. Ontologies and Similarity
    Steffen Staab
    Acknowledgements to Claudia d’Amato, Univ Bari,
    & WeST Team
  • 2. Agenda
    Kris: Broccoli is a vegetable used in stir-fry
    Motivation: What are example semantic applications?
    Foundation: What is an ontology?
    Reality Check: What are typical ontologies?
    Survey: How is similarity measured in ontologies?
    Critique: What should be measured?
    Solution: A preliminary solution
    Conclusion: What to do now?
  • 3. Motivation
    Semantic Applications
    Check out: http://challenge.semanticweb.org/
  • 4. Linked Data
    Cases with Metadata without Frontiers
  • 5. Semantic Search & Browsing: Semantic Portals
    [WWW 2000]
    http://ontoprise.com
  • 6. Faceted Semantic Media Browsing: Semaplorer
    Winner Billion Triples Challenge 2008 [JoWS 2009]
    http://kreuzverweis.com
  • 7. Semantic Desktop
    Additional Semantic Meta Data, e.g. sender, subject
    Access to further PIM tools
  • 8. Mobile Exploration of Linked Data: Mobile Facets
  • 9. Lessons Learned
    Examples + http://challenge.semanticweb.org
    Semantic Boolean Search in Conjunction with Keyword Search dominates in
    • Ontology-based applications
    • Linked data applications
    Feast or famine
    Further use of similarity
    • Learning
    • Ontology engineering advice
    Available
    • IR Ranking
    • (Textual) Similarity
    Needed
    • Semantic Ranking
    • Semantic Similarity
    [Franz et al 09]
    [stuffhere], BUT
  • 14. What is an Ontology?
    Foundation
  • 15. What is an ontology?
    What for?
    Agreements that make linked data more useful
    Reasoning
    Gruber 1993:
    An ontology is an “explicit specification of a conceptualization”
    Oberle, Guarino, Staab. What is an ontology? Handbook on Ontologies, Springer 2009.
  • 16. Observations in the Real World
  • 17. A Model of the Real World
    knows
    knows
    Manager(I034820)
    Researcher(I046758)
    knows
    cooperates
    Employee(I050000)
    Researcher(I044443)
  • 18. Abstracting from the Individual Model
    knows
    knows
    Manager
    Researcher
    knows
    cooperates
    Employee
    Researcher
  • 19. A Conceptual Model
    Intensional Relations
    Unary
    Manager
    Researcher
    Employee
    Binary
    cooperates
    knows
    Cognitive Bias
    Perception
    Knowledge
    Belief
    The conceptual model captures what is invariant according to one‘s conceptualization of the world
  • 20. Formal Specification
    What makes it so hard to formally specify ontological commitment?
    Algebraic Relations do not work:
    • Defined extensionally
    • E.g. Lecturer1 = {Ashwin, Nirmalie, Steffen, Kris, …}
    • Problem: A new instance would change the ontology, e.g. Lecturer2 = Lecturer1 ∪ {Fernando}
    Intensional Relations need to be defined in Higher Order Language:
    • Specify the intended models where one may quantify over sets of individuals
    An ontology is a theory (typically in first order logical language) where the possible models approximate the intended models „as good as possible“
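
The contrast between an extensionally and an intensionally defined relation can be sketched in a few lines of Python. This is only an illustration: the `Person` record, its `teaches` field and the `is_lecturer` predicate are hypothetical names that do not come from the talk.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Person:
    name: str
    teaches: bool  # hypothetical feature standing in for what "Lecturer" means


# Extensional definition: the relation *is* its current set of members.
lecturer1 = frozenset({"Ashwin", "Nirmalie", "Steffen", "Kris"})
lecturer2 = lecturer1 | {"Fernando"}  # one new instance gives a different relation


# Intensional definition: a membership condition that stays the same
# regardless of which individuals happen to exist.
def is_lecturer(p: Person) -> bool:
    return p.teaches


people = [Person("Kris", True), Person("Fernando", True), Person("Visitor", False)]
lecturers_now = {p.name for p in people if is_lecturer(p)}
```

Adding Fernando changes `lecturer1` into a different set, whereas `is_lecturer` is untouched by new individuals; only the set it selects grows.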
  • 23. Conceptualization
    [Figure: from reality to ontology. Labels: Reality; Phenomena; Perception; Presentation patterns; State of affairs; relevant invariants across presentation patterns: D, ℜ; Ontological commitment K (selects D’ ⊂ D and ℜ’ ⊂ ℜ); Language L; Interpretations I; Models M_D’(L); Intended models for each I_K(L); Ontology models; Bad Ontology vs. ~Good Ontology]
    Slide by Nicola Guarino
  • 24. Description Logics: First order language(s) for ontology
    T-Box
    Describing Relations Intensionally
    Flight ⊑ Service.
    Flight ⊑ ∃to.Airport
    Flight ⊑ ∀to.Airport
    Flight ⊑ ∃from.Airport
    Flight ⊑ ∀from.Airport
    approachedBy ⊇ to⁻¹
    FlightFromDE = Flight ∩ ∃from.(Airport ∩ ∃part.{DE})
    A-Box
    Describing Relations Extensionally
    Flight(LH123).
    Flight(BA121).
    Airport(FRA).
    from(LH123,FRA).
    to(LH123,LHR).

    Key Feature: Classes (unary relations) are defined by relations to definitions of other classes
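
How a class such as FlightFromDE is defined through other classes and relations can be sketched as a membership check over the A-Box facts. This is a minimal sketch, not a description logic reasoner: it evaluates the definition under a closed-world reading, and the fact part(FRA, DE) is an assumption added for the example, since the slide does not list it.

```python
# A-Box facts from the slide.
flights = {"LH123", "BA121"}
airports = {"FRA"}
from_rel = {("LH123", "FRA")}
to_rel = {("LH123", "LHR")}

# Assumed fact, not stated on the slide: Frankfurt airport is part of DE.
part_rel = {("FRA", "DE")}


def is_flight_from_de(x: str) -> bool:
    """FlightFromDE = Flight ∩ ∃from.(Airport ∩ ∃part.{DE}), checked on known facts only."""
    return x in flights and any(
        origin in airports and (origin, "DE") in part_rel
        for (flight, origin) in from_rel
        if flight == x
    )
```

With these facts `is_flight_from_de("LH123")` holds and `is_flight_from_de("BA121")` does not, because no origin is recorded for BA121. A real reasoner would treat the latter as unknown rather than false.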
  • 25. Description Logics: First order language(s) for ontology
    T-Box
    Describing Relations Intensionally
    Flight ⊑ Service.
    Flight ⊑ ∃to.Airport
    Flight ⊑ ∀to.Airport
    Flight ⊑ ∃from.Airport
    Flight ⊑ ∀from.Airport
    domain(to) ⊇ Flight
    FlightFromDE = Flight ∩ ∃from.(Airport ∩ ∃part.{DE})
    A-Box
    Describing Relations Extensionally
    Flight(LH123).
    Flight(BA121).
    Airport(FRA).
    from(LH123,FRA).
    to(LH123,LHR).

    • Typically decidable and intractable
    • Pragmatically tractable for 10^5 concepts
    • Often most useful at design time only
  • What are typical Ontologies?
    Reality Check
  • 28. Examples for Ontologies & Thesauri
    Foundational Model of Anatomy
    • 78K classes in FMA 2.0
    • Several translations to OWL for discovering modeling problems ([Noy & Rubin; Bodenreider et al])
    SNOMED CT (Systematized Nomenclature of Medicine -- Clinical Terms)
    • Representation in description logics language EL++
    • 10^6 classes
    Dewey Decimal System
    • Internationally used thesaurus for forming pre-coordinated classes from an inventory of codes
  • Example from Dewey Decimal
    590 Animals (Zoology)
    770 Photography, Computer Art
    597.96 Serpentes
    779.32 Photography of Animals
    779.32796 Photography of Snakes
    Core message of this talk:
    Influencing also non-OWL ontologies/thesauri
    Concepts are defined based on the relationship to the definition of other concepts, affecting similarity
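
The way a pre-coordinated code reuses the definition of another concept can be sketched as follows. The sketch assumes, as the codes on the slide suggest, that 779.32796 is formed by appending to 779.32 the digits of 597.96 that follow the shared zoology prefix 59; the function name `precoordinate` and its parameters are hypothetical.

```python
def precoordinate(base: str, facet: str, facet_prefix: str) -> str:
    """Append the digits of `facet` that follow `facet_prefix` to `base`."""
    if not facet.startswith(facet_prefix):
        raise ValueError(f"{facet} does not start with {facet_prefix}")
    suffix = facet[len(facet_prefix):].replace(".", "")
    return base + suffix


photography_of_animals = "779.32"
serpentes = "597.96"
photography_of_snakes = precoordinate(photography_of_animals, serpentes, "59")
```

Here `photography_of_snakes` comes out as the code shown on the slide, so the class for photographs of snakes inherits part of its meaning from the zoology branch.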
  • 31. How is similarity measured in ontologies?
    Survey
  • 32. Example Ontology
    [Figure: example ontology graph. Node labels: Service, Flight, Airport, Hub, Europe, DE, UK, IT, FRA, LHR, LCY, FCO, FRA-LCY, FRA-LHR, FRA-FCO. Edge labels: part, to.]
    Including „invariant“ A-Box facts (like Airport(FRA))
  • 33. Similarity Measurement Tasks
    Comparing Classes
    Comparing Objects
    • Based on object features
    • Based on class comparisons
    Comparing Ontologies
    Related to
    • Ontology learning
    • Ontology alignment
    Based on
    • Class comparisons
  • Class Comparisons in Materialized Hierarchies
    [Figure: the example ontology as a materialized hierarchy. Node labels: Service, Flight, Airport, Europe, DE, UK, IT, FRA, LHR, LCY, FCO, FRA-LCY, FRA-LHR, FRA-FCO. Edge labels: part, to.]
  • 41. Class Comparisons in Materialized Hierarchies
    [Figure: the same hierarchy with the additional classes Flight-DE-UK and Flight-DE-IT. Node labels: Service, Flight, Airport, Europe, DE, UK, IT, FRA, LHR, LCY, FCO, Flight-DE-UK, Flight-DE-IT, FRA-LCY, FRA-LHR, FRA-FCO. Edge labels: part, to.]
    How many yellow concepts?
    • Infinitely many in powerful DL languages
  • 42. Intensional Counting of Path Length
    $sim(\text{FRA-LHR}, \text{FRA-LCY}) \sim \frac{1}{Path_1} = \frac{1}{2}$
    $sim(\text{FRA-LHR}, \text{FRA-FCO}) \sim \frac{1}{Path_2} = \frac{1}{4}$
    [Figure: hierarchy with nodes Service, Flight, Flight-DE-UK, Flight-DE-IT, FRA-LCY, FRA-LHR, FRA-FCO]
    3 important observations:
    • Most papers investigate dampening, i.e. higher links indicate more dissimilarity
    • Absolute similarity values mostly irrelevant (like in CBR)
    • Most information in the ontology will be discarded
    [Rada et al.'89] ff
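
The two path-length values above can be reproduced with a short breadth-first search. This is a sketch under an assumption about the figure: the parent links below (FRA-LHR and FRA-LCY under Flight-DE-UK, FRA-FCO under Flight-DE-IT) are reconstructed from the class names and the numbers on the slide, and `path_length` and `sim_path` are hypothetical helper names.

```python
from collections import deque
from fractions import Fraction

# isa links, child -> parents (reconstructed from the slide, an assumption).
parents = {
    "Flight": ["Service"],
    "Flight-DE-UK": ["Flight"],
    "Flight-DE-IT": ["Flight"],
    "FRA-LHR": ["Flight-DE-UK"],
    "FRA-LCY": ["Flight-DE-UK"],
    "FRA-FCO": ["Flight-DE-IT"],
}


def neighbours(graph):
    """Turn the child -> parents map into an undirected adjacency map."""
    adj = {}
    for child, ps in graph.items():
        for p in ps:
            adj.setdefault(child, set()).add(p)
            adj.setdefault(p, set()).add(child)
    return adj


def path_length(graph, a, b):
    """Number of isa edges on the shortest path between classes a and b."""
    adj = neighbours(graph)
    seen, queue = {a}, deque([(a, 0)])
    while queue:
        node, dist = queue.popleft()
        if node == b:
            return dist
        for n in adj[node] - seen:
            seen.add(n)
            queue.append((n, dist + 1))
    raise ValueError("classes are not connected")


def sim_path(graph, a, b):
    """Inverse path length; only defined for two distinct classes."""
    return Fraction(1, path_length(graph, a, b))
```

Only the isa edges enter the computation; the definitions in terms of to, from and part are ignored, which is the third observation on the slide.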
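  • 45. Intensional Counting of Path Length
    $sim(\text{FRA-LHR}, \text{FRA-LCY}) \sim \frac{1}{Path_1} = \frac{1}{\min(2,3)} = \frac{1}{2}$
    $sim(\text{FRA-LHR}, \text{FRA-FCO}) \sim \frac{1}{Path_2} = \frac{1}{\min(4,2)} = \frac{1}{2}$
    [Figure: hierarchy with nodes Service, Flight, Flight-DE-UK, Flight-DE-IT, FlightFromHub, FlightToHub, FlightFrom+ToHub, FRA-LCY, FRA-LHR, FRA-FCO]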
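
The effect of the additional hub classes on path length can be sketched by comparing the hierarchy with and without them. The parent links are an assumption reconstructed from the numbers on the slides (FRA-LHR and FRA-FCO under FlightFrom+ToHub, FRA-LCY under FlightFromHub), and the helper names are hypothetical. Paths are measured through a common superclass.

```python
def up_distances(parents, start):
    """isa steps from `start` to each of its superclasses (including itself)."""
    dist, frontier = {start: 0}, [start]
    while frontier:
        nxt = []
        for node in frontier:
            for p in parents.get(node, []):
                if p not in dist:
                    dist[p] = dist[node] + 1
                    nxt.append(p)
        frontier = nxt
    return dist


def shortest_path(parents, a, b):
    """Shortest path between a and b that runs through a common superclass."""
    da, db = up_distances(parents, a), up_distances(parents, b)
    return min(da[x] + db[x] for x in da.keys() & db.keys())


simple = {
    "Flight": ["Service"],
    "Flight-DE-UK": ["Flight"],
    "Flight-DE-IT": ["Flight"],
    "FRA-LHR": ["Flight-DE-UK"],
    "FRA-LCY": ["Flight-DE-UK"],
    "FRA-FCO": ["Flight-DE-IT"],
}

# Same domain, but modelled with extra hub classes (assumed placement).
with_hubs = {
    **simple,
    "FlightFromHub": ["Flight"],
    "FlightToHub": ["Flight"],
    "FlightFrom+ToHub": ["FlightFromHub", "FlightToHub"],
    "FRA-LHR": ["Flight-DE-UK", "FlightFrom+ToHub"],
    "FRA-FCO": ["Flight-DE-IT", "FlightFrom+ToHub"],
    "FRA-LCY": ["Flight-DE-UK", "FlightFromHub"],
}
```

The path between FRA-LHR and FRA-FCO shrinks from 4 to 2 once the hub classes are present, although nothing about the flights themselves has changed.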
  • 46. `Improved´ Intensional Counting of Path Length
    $cotopy(C) = \{ D \mid isa^*(C, D) \}$
    $sim_{cotopy}(C, D) \sim \frac{|cotopy(C) \cap cotopy(D)|}{|cotopy(C) \cup cotopy(D)|}$
    $sim_{cotopy}(\text{FRA-LHR}, \text{FRA-FCO}) \sim \frac{5}{9}$
    Further dampening possible
    [Figure: hierarchy with nodes Service, Flight, Flight-DE-UK, Flight-DE-IT, FlightFromHub, FlightToHub, FlightFrom+ToHub, FRA-LCY, FRA-LHR, FRA-FCO]
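  • 47. `Improved´ Intensional Counting of Path Length - Jaccard
    $cotopy(C) = \{ D \mid isa^*(C, D) \}$
    $sim_{cotopy}(C, D) \sim \frac{|cotopy(C) \cap cotopy(D)|}{|cotopy(C) \cup cotopy(D)|}$
    $sim_{cotopy}(\text{FRA-LHR}, \text{FRA-FCO}) \sim \frac{5}{9}$
    $sim_{cotopy}(\text{FRA-LHR}, \text{FRA-LCY}) \sim \frac{4}{8}$
    [Figure: hierarchy with nodes Service, Flight, Flight-DE-UK, Flight-DE-IT, FlightFromHub, FlightToHub, FlightFrom+ToHub, FRA-LCY, FRA-LHR, FRA-FCO]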
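
The cotopy-based Jaccard values 5/9 and 4/8 can be checked with a small script. The hierarchy is again a reconstruction and therefore an assumption: the parent links were chosen so that they agree with the class names and with both numbers on the slide. `cotopy` collects a class together with all of its superclasses.

```python
from fractions import Fraction

# isa links, child -> parents (reconstructed from the slides, an assumption).
parents = {
    "Flight": ["Service"],
    "Flight-DE-UK": ["Flight"],
    "Flight-DE-IT": ["Flight"],
    "FlightFromHub": ["Flight"],
    "FlightToHub": ["Flight"],
    "FlightFrom+ToHub": ["FlightFromHub", "FlightToHub"],
    "FRA-LHR": ["Flight-DE-UK", "FlightFrom+ToHub"],
    "FRA-FCO": ["Flight-DE-IT", "FlightFrom+ToHub"],
    "FRA-LCY": ["Flight-DE-UK", "FlightFromHub"],
}


def cotopy(c):
    """cotopy(C) = {D | isa*(C, D)}: C and everything it is a subclass of."""
    result, stack = {c}, [c]
    while stack:
        for p in parents.get(stack.pop(), []):
            if p not in result:
                result.add(p)
                stack.append(p)
    return result


def sim_cotopy(c, d):
    """Jaccard coefficient of the two cotopies."""
    cc, cd = cotopy(c), cotopy(d)
    return Fraction(len(cc & cd), len(cc | cd))
```

`Fraction` reduces 4/8 to 1/2, so the second value on the slide appears in lowest terms. Under this measure FRA-LHR is closer to FRA-FCO than to FRA-LCY, purely because of the extra hub classes.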
  • 48. Intension-based Similarity Measurement
    Strengths
    Works somehow
    Weaknesses
    Both path counting and cotopy suffer heavily from modelling artefacts in the ontology
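  • 49. Counting Extensions – Jaccard-like Metrics
    $sim(\text{FRA-LHR}, \text{FlightToHub}) \sim \frac{|\text{FRA-LHR} \cap \text{FlightToHub}|}{|\text{FRA-LHR} \cup \text{FlightToHub}|} = \frac{3}{6}$
    $sim(\text{FRA-LHR}, \text{FRA-LCY}) \sim \frac{|\text{FRA-LHR} \cap \text{FRA-LCY}|}{|\text{FRA-LHR} \cup \text{FRA-LCY}|} = \frac{0}{4}$
    Disjointness incompatibility
    [Figure: hierarchy with nodes Service, Flight, Flight-DE-UK, Flight-DE-IT, FlightFromHub, FlightToHub, FlightFrom+ToHub, FRA-LCY, FRA-LHR, FRA-FCO and the instances LH127, LH123, BA124, BA121, LH345, LH567, AI234]
    [Resnik ‘95-‘99]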
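
The extension-based values 3/6 and 0/4 follow from counting shared instances. In the sketch below the assignment of the seven flight numbers to classes is hypothetical: the slide fixes only the counts, so the sets were chosen to match them.

```python
from fractions import Fraction

# Hypothetical assignment of instances to classes (only the sizes are
# implied by the slide: 3, 1 and 3 instances).
ext = {
    "FRA-LHR": {"LH123", "BA121", "BA124"},
    "FRA-LCY": {"LH127"},
    "FRA-FCO": {"LH345", "LH567", "AI234"},
}
# Assumption: LHR and FCO are hubs, LCY is not.
ext["FlightToHub"] = ext["FRA-LHR"] | ext["FRA-FCO"]


def sim_ext(c, d):
    """Jaccard coefficient of the two class extensions."""
    return Fraction(len(ext[c] & ext[d]), len(ext[c] | ext[d]))
```

`sim_ext("FRA-LHR", "FRA-LCY")` is zero because the two classes share no instance, even though both describe flights from Frankfurt to London. This is the disjointness incompatibility named on the slide.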
  • 50. Extension-based Similarity
    Strengths
    Counting extensions seems natural and efficient (Jaccard-like measure)
    Weaknesses
    Disjointness Incompatibility
    Classes are similar, but do not share instances:
    • Male – Female
    • Housecat – Lion
    Extensions are uncountable
    Ontologies are supposed to abstract from specific extensions!
    Extensions may be infinite
  • 52. Class Syntax-based Similarity
    Quite frequent in the literature
    Listed here just for the sake of completeness, because…
    Class syntax-based similarity is
    equivalence unsound
  • 53. What should Similarity Deliver?
    Critique
    [d‘Amato et al 2008]
  • 54. Core criteria for similarity measures – almost unchanged
    Positiveness: ∀C,D sim(C,D) ≥ 0
    Strong reflexivity: ∀C sim(C,C) = 1
    Upper bound: ∀C,D sim(C,D) ≤ 1
    Symmetry: ∀C,D sim(C,D) = sim(D,C)
    Problem with strong reflexivity:
    FlightFromDEHub = Flight ∩ ∃from.(Hub ∩ ∃part.{DE})
    FromHubAndFromDE = ∃from.Hub ∩ ∃from.∃part.{DE}
    Reasoning is needed to discover that
    sim(FlightFromDEHub, FromHubAndFromDE) = 1
  • 55. Additional Ones in Ontologies!
    5. Prevent Disjointness Incompatibility (seen before)
    6. Equivalence Soundness:
    ∀C,D,E: D ≡ E ⇒ sim(C,D) = sim(C,E)
    Example:
    sim(Flight, FlightFromDEHub) =
    sim(Flight, FromHubAndFromDE)
    Proposition: Reflexivity and triangle inequality imply equivalence soundness
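
One way to see why the proposition holds is sketched below. The slide does not spell out the argument, so two readings are assumed here: the triangle inequality is taken for the dissimilarity $d = 1 - sim$, and strong reflexivity is read up to equivalence, i.e. $sim(D,E) = 1$ whenever $D \equiv E$. Symmetry is used as well.

```latex
% Assumptions: d(X,Y) = 1 - sim(X,Y); D \equiv E implies sim(D,E) = 1,
% hence d(D,E) = d(E,D) = 0 by symmetry.
\begin{align*}
d(C,E) &\le d(C,D) + d(D,E) = d(C,D), \\
d(C,D) &\le d(C,E) + d(E,D) = d(C,E), \\
\text{hence}\quad d(C,D) &= d(C,E)
\quad\text{and}\quad sim(C,D) = sim(C,E).
\end{align*}
```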
  • 56. Additional Ones in Ontologies!
    7. Monotonicity
    C ⊑ L, D ⊑ L, C ⊑ U, D ⊑ U,
    E ⊑ U, E⊆L
    ∃H such that CH, EH, DH
    ⇒ sim(C,D) ≥ sim(C,E)
    [Figure: Venn diagram with the regions U and L and the concepts C, D, E]
    My feeling is: we need more!
    (continuity, …)
  • 57. A Preliminary Solution
    Solution
    [d‘Amato et al 2010]
  • 58. Core idea: Combine Cotopy & Extension-based Approaches
    Cotopy-based: Intersection at the Least Common Subsumer
    Extension-based: Count instances (or subclasses)
    Venn diagrams indicate: sim(C,D) > sim(C,E)
    [Figure: two Venn diagrams, C and D inside gcs(C,D), and C and E inside gcs(C,E)]
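
The intuition behind the Venn diagrams can be sketched with a toy ratio. This is not the measure from [d‘Amato et al 2010]; it only illustrates the idea of counting instances relative to a common subsumer. The function `sim_subsumer`, the instance names and the sizes of the two subsumers are all hypothetical, and the subsumer extensions are given by hand because computing them would require a reasoner.

```python
from fractions import Fraction


def sim_subsumer(ext_c, ext_d, ext_subsumer):
    """Toy measure: share of the common subsumer's instances covered by C and D."""
    return Fraction(len(ext_c | ext_d), len(ext_subsumer))


# Hypothetical, pairwise disjoint extensions.
C = {"c1", "c2"}
D = {"d1", "d2"}
E = {"e1", "e2"}

# Hypothetical common subsumers: a tight one for (C, D), a broad one for (C, E).
gcs_CD = C | D | {"x1"}
gcs_CE = C | E | {f"y{i}" for i in range(16)}

sim_CD = sim_subsumer(C, D, gcs_CD)
sim_CE = sim_subsumer(C, E, gcs_CE)
```

Although C, D and E share no instances, both values are non-zero, and the pair with the tighter common subsumer scores higher, which matches sim(C,D) > sim(C,E) on the slide.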
  • 59. Indirect (tentative) Indication of Correctness
    Growing an indexing tree by clustering with the new similarity measure
    Comparing querying time for different ontologies using the original hierarchy and the indexing tree derived from the similarity measure
    Problem: similarity computation too expensive
    [d‘Amato et al 2010]
  • 60. What to do now?
    Conclusion
  • 61. Conclusion
    Conclusion: A call to arms!
    • Semantic applications cover many domains of commercial and social interest
    • Ontologies provide the modeling backbone and are even found in unexpected places
    • Similarity measures for ontologies exist and give back some results
    • Criteria for semantic similarity measures are still in the making
    • There is a lack of theory for ontology-based similarity
    • There is a lack of efficient realization of ontology-based similarity
    Targeted Side Effect:
    Clarification of Some Often Mistaken Use of Terminology around Ontologies
  • 67. Institut WeST – Web Science & Technologies
    Thank You!
    Semantic Web
    Web Retrieval
    Interactive Web
    Multimedia Web
    Software Web
    eGovernment
    eMedia
    eScience
    eOrganizations
    eCitizen