Where does it break?
        or:
  Why Semantic Web research
         is not just
  “Computer Science as usual”

       Frank van Harmelen
          AI Department
  Vrije Universiteit Amsterdam
But first:
    “the Semantic Web forces us to rethink
     the foundations of many subfields of
     Computer Science”

    “the challenge of the Semantic Web continues to
     break many often silently held and shared
     assumptions underlying decades of research”

         (if you can read this, you must be on steroids)

    “I will try to identify silently held assumptions
     which are no longer true on the Semantic Web,
     prompting a radical rethink of many past results”

                                                         2
Oh no, not more “vision”…

 [Dilbert-style cartoon:
  “Sales are dropping like a rock.”
  “Our plan is to invent some sort of doohicky that everyone wants to buy.”
  “The visionary leadership work is done. How long will your part take?”]




Don’t worry,
 there will be lots of
 technical content

                                                             3
Grand Topics…
 • what are the science challenges in SW?
 • Which implicit traditional assumptions break?
 • Illustrated with 4 such “traditional assumptions”

and also:
 • “Which Semantic Web”?




                                                4
Before we go on:

Which Semantic Web
are we talking about?
Typical SemWeb slide 1:

General idea of Semantic Web
 • Make current web more machine accessible
   (currently all the intelligence is in the user)

 • Motivating use-cases
   – search
   – personalisation
   – semantic linking
   – data integration
   – web services
   – ...
                                                    6
Typical SemWeb slide 2:

General idea of Semantic Web
Do this by:

1. Making data and meta-data
   available on the Web
   in machine-understandable form
   (formalised)

2. Structuring the data and meta-data
   in ontologies

   (These are non-trivial design decisions;
    there are alternatives.)
                                                         7
Which Semantic Web?
 Version 1:
  "Semantic Web as Web of Data" (TBL)

 • recipe:
   expose databases on the web,
   use RDF, integrate
 • meta-data from:
   – expressing DB schema semantics
     in machine-interpretable ways
 • enable integration and unexpected re-use
                                        8
Which Semantic Web?
 Version 2:
  “Enrichment of the current Web”

 • recipe:
   annotate, classify, index
 • meta-data from:
   – automatically producing markup:
     named-entity recognition,
     concept extraction, tagging, etc.
 • enable personalisation, search, browse, ...
                                          9
Which Semantic Web?
 Version 1:
  “Semantic Web as Web of Data”

 Version 2:
  “Enrichment of the current Web”

 • Different use-cases
 • Different techniques
 • Different users

                                    10
Before we go on:

The current state of
the Semantic Web?
What’s up in the Semantic Web?
The 4 hard questions:
Q1: “where does the meta-data come from?”
 • NL technology is delivering on concept-extraction
Q2: “where do the ontologies come from?”
 • many handcrafted ontologies
 • ontology learning remains hard
 • relation extraction remains hard
Q3: “what to do with many ontologies?”
 • ontology mapping/aligning remains VERY hard
Q4: “where’s the ‘Web’ in the Semantic Web?”
 • more attention to social aspects (P2P, FOAF)
 • non-textual media remains hard                12
What’s up in the Semantic Web?
The 4 hard questions:
 • healthy uptake in some areas:
    – knowledge management / intranets
    – data integration (Boeing)
    – life sciences (e-Science)
    – convergence with Semantic Grid
    – cultural heritage
 • emerging applications in search & browse
    – Elsevier, Ilse, MagPie, KIM
 • very few applications in
    – personalisation
    – mobility / context awareness
 • Most applications for companies,
   few applications for the public            13
Semantic Web:

Science or technology?
Semantic Web as Technology
 • better search & browse
 • personalisation
 • semantic linking
 • semantic web services
 • ...
Semantic Web as Science


                             15
4 examples of
  “where does it break?”

old assumptions that no longer hold,
old approaches that no longer work
4 examples of
“where does it break?”

 • Traditional complexity measures
Who cares about decidability?
 • Decidability ≈ completeness:
       guarantee to find an answer,
       or tell you it doesn’t exist,
       given enough run-time & memory
 • Sources of incompleteness:
   – incompleteness of the input data
   – insufficient run-time to wait for the answer
 • Completeness is unachievable
 in practice anyway,
  regardless of the completeness of the algorithm
                                                     18
Who cares about undecidability?
 • Undecidability
   ≠ always guaranteed not to find an answer
 • Undecidability
   = not always guaranteed to find an answer

 • Undecidability may be harmless
   – in many cases;
   – in all cases that matter



                                             19
Who cares about complexity?
 • worst-case: may be exponentially rare
 • asymptotic: ignores constants




                                          20
What to do instead?
 • Practical observations on RDF Schema:
   – Compute full closure of O(10^5) statements
 • Practical observations on OWL:
   – NEXPTIME,
     but fine on many practical cases

 • Do more experimental
   performance profiles
   with realistic data

 • Think hard about
  “average case” complexity….                     21
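The advice here is empirical rather than worst-case. As a minimal illustration (a toy sketch, not the RDF Schema toolchain the talk refers to; the two inference rules and the synthetic data shape are simplifying assumptions), one can forward-chain subClassOf/type rules to a fixpoint and simply measure how it behaves on data of realistic shape:

```python
import time

def rdfs_closure(triples):
    """Naive forward chaining of two RDFS-style rules to a fixpoint:
       (x subClassOf y), (y subClassOf z) -> (x subClassOf z)
       (i type x),       (x subClassOf y) -> (i type y)"""
    closure = set(triples)
    while True:
        supers = {}
        for s, p, o in closure:
            if p == "subClassOf":
                supers.setdefault(s, set()).add(o)
        new = set()
        for s, p, o in closure:
            if p == "subClassOf":
                for z in supers.get(o, ()):
                    new.add((s, "subClassOf", z))
            elif p == "type":
                for y in supers.get(o, ()):
                    new.add((s, "type", y))
        if new <= closure:
            return closure
        closure |= new

# synthetic data: a chain of 50 classes plus 1000 typed instances
data = {(f"C{i}", "subClassOf", f"C{i+1}") for i in range(50)}
data |= {(f"i{j}", "type", f"C{j % 50}") for j in range(1000)}

t0 = time.perf_counter()
full = rdfs_closure(data)
print(len(full), "triples in closure,", round(time.perf_counter() - t0, 2), "s")
```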
4 examples of
“where does it break?”

 • Traditional complexity measures
 • Hard in theory, easy in practice
Example:
    Reasoning with
Inconsistent Knowledge


       This work with
     Zhisheng Huang &
      Annette ten Teije
Knowledge will be inconsistent
Because of:
 • mistreatment of defaults
 • homonyms
 • migration from another formalism
 • integration of multiple sources




                                     24
New formal notions are needed
 • New notions:
   – Accepted: φ follows from the selected relevant part, ¬φ does not
   – Rejected: ¬φ follows, φ does not
   – Overdetermined: both φ and ¬φ follow
   – Undetermined: neither φ nor ¬φ follows

 • Soundness: (only classically justified results)



                                                    25
Basic Idea
    – Start from the query
    – Incrementally select larger parts of the
      ontology that are “relevant” to the query
      (using a selection function), until:
        • you have an ontology subpart that is
          small enough to be consistent and
          large enough to answer the query,
          or
        • the selected subpart is already
          inconsistent
          before it can answer the query
                                                  26
General Framework




   [Figure: nested selections s(T,φ,0) ⊆ s(T,φ,1) ⊆ s(T,φ,2) growing around the query]




                          27
More precisely:
Use selection function s(T,φ,k),
   with s(T,φ,k) ⊆ s(T,φ,k+1)

 • Start with k=0:
   s(T,φ,0) ⊨ φ  or  s(T,φ,0) ⊨ ¬φ ?

 • Increase k, until
   s(T,φ,k) ⊨ φ  or  s(T,φ,k) ⊨ ¬φ

 • Abort when
   – undetermined at maximal k          28
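A minimal sketch of this loop, assuming an abstract entailment oracle and a monotone selection function (both hypothetical interfaces, not a concrete reasoner); it returns the four statuses from the “new formal notions” slide:

```python
def answer(T, phi, s, entails, k_max):
    """Answer query phi over a possibly inconsistent ontology T by
    consulting ever-larger relevant subsets s(T, phi, k).
    `s` must be monotone in k; `entails(S, psi)` stands in for a
    classical entailment check on the selected subset."""
    for k in range(k_max + 1):
        subset = s(T, phi, k)
        pos = entails(subset, phi)
        neg = entails(subset, ("not", phi))   # opaque tag for the negated query
        if pos and neg:
            return "overdetermined"   # selected part already inconsistent w.r.t. phi
        if pos:
            return "accepted"
        if neg:
            return "rejected"
        if subset == T:
            break                     # nothing left to select
    return "undetermined"             # abort at maximal k
```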
Nice general framework, but...
 • which selection function s(T,φ,k) to use?
 • Simple option: syntactic distance
   – put all formulae in clausal form:
     a1 ∨ a2 ∨ … ∨ an
   – distance k=1 if some clausal letters overlap:
     a1 ∨ X ∨ … ∨ an,   b1 ∨ … ∨ X ∨ … ∨ bn
   – distance k if a chain of k overlapping clauses
     is needed:
     a1 ∨ X1 ∨ … ∨ an,   …,   b1 ∨ Xk ∨ … ∨ bn
                                                      29
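A sketch of this symbol-overlap selection, with clauses modelled simply as sets of their letters (the representation is an assumption made for illustration); by construction select(T,φ,k) ⊆ select(T,φ,k+1):

```python
def symbols(clause):
    """A clause is modelled as a frozenset of its propositional letters."""
    return set(clause)

def syntactic_selection(T, phi, k):
    """s(T, phi, k): all clauses of T reachable from the query's symbols
    via a chain of at most k symbol overlaps (syntactic distance)."""
    selected = set()
    frontier = symbols(phi)
    for _ in range(k):
        new = {c for c in T if c not in selected and symbols(c) & frontier}
        if not new:
            break
        selected |= new
        for c in new:
            frontier |= symbols(c)
    return selected

# tiny example: the query shares "X" with one clause, which shares "Y" with another
T = {frozenset({"a1", "X"}), frozenset({"X", "Y"}),
     frozenset({"Y", "b1"}), frozenset({"c1"})}
phi = frozenset({"a1", "X"})
print(len(syntactic_selection(T, phi, 1)), len(syntactic_selection(T, phi, 2)))  # 2 3
```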
Evaluation
 • Ontologies:
    – Transport:          450 concepts
    – Communication:      200 concepts
    – Madcow:              55 concepts

 • Selection functions:
    – symbol-relevance  = axioms overlap by ≥1 symbol
    – concept-relevance ≈ axioms overlap by ≥1 concept

 • Query a random set of subsumption queries:
             Concept1 ⊆ Concept2 ?
                                               30
Evaluation: Lessons




This makes concept-relevance a
  high-quality sound approximation
  (> 90% recall, 100% precision)

                                     31
Works surprisingly well
On our benchmarks,
almost all answers are “intuitive”
 • Not well understood why
 • Theory doesn’t predict that this is easy:
   – paraconsistent logic,
   – relevance logic,
   – multi-valued logic
 • Hypothesis:
   due to “local structure of knowledge”?
                                             32
4 examples of
“where does it break?”

 • Traditional complexity measures
 • Hard in theory, easy in practice
 • Context-specific nature of knowledge
Opinion poll
 left:  the meaning of a sentence is determined only
        by the sentence itself, and is not influenced
        by the surrounding sentences, nor by the
        situation in which the sentence is used
 right: the meaning of a sentence is not determined only
        by the sentence itself, but is also influenced
        by the surrounding sentences, and by the
        situation in which the sentence is used
                                               34
Opinion poll
     left      right

                don’t you see
                what I mean?




                                35
Example:
   Ontology mapping
with community support

      This work with
    Zharko Aleksovski &
       Michel Klein
The general idea
  [Figure: source and target vocabularies are each anchored into shared
   background knowledge; inference over the background knowledge then
   yields the mapping between source and target]


                                            37
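A toy sketch of this anchor-infer-map recipe, assuming flat term lists on both sides and a background ontology reduced to a child → parent dict; the anchoring is plain substring matching (a crude stand-in for the substring + morphology anchoring used in the experiments), and all names below are made up for illustration:

```python
def anchors(term, concepts):
    """Anchor a free-text term to background concepts by substring match."""
    t = term.lower()
    return {c for c in concepts if c.lower() in t or t in c.lower()}

def ancestors(concept, parent):
    """Walk up a (hypothetical) child -> parent subclass hierarchy."""
    seen = set()
    while concept in parent:
        concept = parent[concept]
        seen.add(concept)
    return seen

def propose_mappings(source_terms, target_terms, parent):
    """Propose source-target mappings whenever their anchors are related
    (equal, or one subsumes the other) in the background hierarchy."""
    concepts = set(parent) | set(parent.values())
    found = []
    for s in source_terms:
        for t in target_terms:
            for a in anchors(s, concepts):
                for b in anchors(t, concepts):
                    if a == b or a in ancestors(b, parent) or b in ancestors(a, parent):
                        found.append((s, t, a, b))
    return found

# hypothetical miniature background hierarchy (child -> parent);
# the real experiment used DICE, a DL ontology of ~2300 concepts
bg = {"asthma cardiale": "respiratory failure"}
print(propose_mappings(["Acute respiratory failure"], ["Asthma cardiale"], bg))
```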
Example 1




            38
Example 2




            39
Results
Example matchings discovered
 – OLVG: Acute respiratory failure          ↔ AMC: Asthma cardiale         [cardiale]
 – OLVG: Aspergillus fumigatus              ↔ AMC: Aspergilloom            [cause]
 – OLVG: duodenum perforation               ↔ AMC: Gut perforation         [abnormality]
 – OLVG: HIV                                ↔ AMC: AIDS                    [cause]
 – OLVG: Aorta thoracalis dissectie type B  ↔ AMC: Dissection of artery    [location]
                                                                              40
Experimental results
 • Source & target =
   flat lists of ±1400 ICU terms each
 • Anchoring =
   substring + simple Germanic morphology
 • Background = DICE (2300 concepts in DL)




                                            41
New results:
 • more background knowledge makes
   mappings better
   – DICE (2300 concepts)
   – MeSH (22000 concepts)
   – ICD-10 (11000 concepts)
 • Monotonic improvement of quality
 • Linear increase of cost

  [Figure: bar chart (scale 0–100) of mapping quality for the
   increasing amounts of background knowledge]                42
Distributed/P2P setting


  [Figure: the same anchoring / background-knowledge / inference / mapping
   picture, now in a distributed / P2P setting]
                                            43
So…
 • The OLVG & AMC terms get their meaning
   from the context in which they are being
   used.
 • Different background knowledge would
   have resulted in different mappings.
 • Their semantics is not context-free.
 • See also: S-MATCH by Trento



                                        44
4 examples of
“where does it break?”


 • Traditional complexity measures
 • Hard in theory, easy in practice
 • Context-specific nature of knowledge
 • Logic vs. statistics
Logic vs. statistics
 • DB schemas & integration:
   only logic, no statistics

 • AI has both logic and statistics,
   but kept completely disjoint

 • Find combinations of the two worlds?
   – Statistics in the logic?
   – Statistics to control the logic?
   – Statistics to define the semantics of the logic?
                                                    46
Statistics in the logic? Fuzzy DL
 • (TalksByFrank ⊑ InterestingTalks) ≥ 0.7

 • (Turkey : EuropeanCountry) ≤ 0.2

 • youngPerson = Person ⊓ ∃age.Young
   where Young(x) is a fuzzy membership function,
   equal to 1 up to 10yr and falling to 0 by 30yr

 • veryYoungPerson = Person ⊓ ∃age.very(Young)
                                              47
                                Umberto Straccia
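Spelling out the membership function sketched in the (lost) plot: a left-shoulder function that is 1 up to 10 years and falls linearly to 0 at 30 years. Reading the "very" hedge as Zadeh's concentration (squaring) is an assumption; the slide only shows the curve.

```latex
Young(x) =
\begin{cases}
  1                       & x \le 10\,\mathrm{yr} \\
  \dfrac{30 - x}{30 - 10} & 10\,\mathrm{yr} < x < 30\,\mathrm{yr} \\
  0                       & x \ge 30\,\mathrm{yr}
\end{cases}
\qquad
\mathit{very}(Young)(x) = \bigl(Young(x)\bigr)^{2}
```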
Statistics to control the logic?
 • query: A ⊑ B ?

 • B = B1 ⊓ B2 ⊓ B3:   A ⊑ B1,  A ⊑ B2,  A ⊑ B3 ?

  [Figure: Venn-style picture of A against B1, B2, B3]
                                          48
Statistics to control the logic?
 • Use “Google distance” to decide which ones
   are reasonable to focus on
 • Google distance
   ≈ symmetric conditional probability of co-occurrence
   ≈ estimate of semantic distance
   ≈ estimate of “contribution” to A ⊑ B1 ⊓ B2 ⊓ B3

  [Figure: the same A, B1, B2, B3 picture]
This work by
Riste Gligorov                                        49
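The “Google distance” here is presumably the Normalized Google Distance of Cilibrasi & Vitányi; a sketch of the computation from raw hit counts (the counts below are invented, and querying a real search API is left out):

```python
from math import log

def ngd(f_x, f_y, f_xy, N):
    """Normalized Google Distance from page-hit counts:
    f_x, f_y -- pages containing x (resp. y)
    f_xy     -- pages containing both
    N        -- (estimated) total number of indexed pages"""
    lx, ly, lxy = log(f_x), log(f_y), log(f_xy)
    return (max(lx, ly) - lxy) / (log(N) - min(lx, ly))

# made-up hit counts, only to show the shape of the computation;
# a low distance between A and a conjunct Bi marks A ⊑ Bi as worth proving
print(ngd(f_x=9_000_000, f_y=3_000_000, f_xy=1_200_000, N=10_000_000_000))
```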
Statistics to define semantics?
 • Many peers have many mappings
   on many terms
   to many other peers
 • Mapping is good if results of
   “whispering game” are truthful
 • Punish mappings that contribute to bad
   whispering results
 • Network will converge
   to set of good mappings
   (or at least: consistent)      This work by
                                   Karl Aberer   50
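A toy sketch of the whispering game, assuming peers arranged in a cycle, each holding a term mapping as a plain dict; a term is translated around the cycle, and every mapping on a cycle whose whisper comes back distorted gets its score lowered. The peers, terms, and the crude uniform penalty are all made up for illustration, not the actual feedback mechanism:

```python
def whisper_round(cycle, term, scores, penalty=0.1):
    """Translate `term` through a cycle of peer mappings; if it does not
    return unchanged, punish every mapping that took part."""
    current = term
    for peer, mapping in cycle:
        current = mapping.get(current, current)
    if current != term:
        for peer, _ in cycle:
            scores[peer] = max(0.0, scores[peer] - penalty)
    return current, scores

# hypothetical 3-peer cycle; the third mapping is wrong, so the whisper
# comes back distorted and all three participating mappings are punished
cycle = [("p1", {"fever": "pyrexia"}),
         ("p2", {"pyrexia": "koorts"}),
         ("p3", {"koorts": "cough"})]      # should have mapped back to "fever"
scores = {"p1": 1.0, "p2": 1.0, "p3": 1.0}
print(whisper_round(cycle, "fever", scores))
```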
Statistics to define semantics?
 • Meaning of terms =
   relations to other terms
 • Determined by stochastic process
 • Meaning ≈
   stable state of self-organising system
 • statistics = getting a system to a
                meaning-defining stable state
 • logic      = description of such a stable state

 • Note: meaning is still binary,
   classical truth-value
 • Note: same system may have
  multiple stable states…                   51
4 examples of
 “where does it break?”
 (old assumptions that no longer hold,
  old approaches that no longer work)
 • Traditional complexity measures don’t work:
   completeness, decidability, complexity
 • Sometimes “hard in theory, easy in practice”:
   Q/A over inconsistent ontologies is easy, but why?
 • Meaning dependent on context:
   meaning determined by background knowledge
 • Logic versus statistics:
   statistics in the logic
   statistics to control the logic
   statistics to determine semantics
Final comments
 • These 4 “broken assumptions / old methods”
   were just examples. There are many more
   (e.g. Hayes, Halpin on identity, equality and reference).

 • Notice that they are interlinked, e.g.
    hard theory / easy practice   ↔   complexity
    meaning in context            ↔   logic / statistics

 • Working on these will not be SemWeb work per se,
   but
   – they will be inspired by SemWeb challenges
   – they will help the SemWeb effort (either V1 or V2)

                                                              53
Have fun with the puzzles!




                             54
