Semantic Web research
    anno 2006:
 main streams, popular fallacies,
current status, future challenges

        Frank van...
This is NOT
a Semantic Web
evangelization talk
(I assume
 you are already
 converted)
This is a “topical” talk:

Webster:
“referring to the topics of the day,
of temporary interest”
Semantic Web research anno 2006:
      main streams, popular fallacies,
      current status, future cha...
General idea of Semantic Web
Make current web more machine accessible
 (currently all the intelligence is in the user)
Mot...
General idea of Semantic Web(2)

Do this by:

 Making data and meta-data
  available on the Web
  in machine-understandab...
“machine-understandable form”
     (What it’s like to be a machine)
            alleviates
                              M...
Expressed using the W3C stack
Which Semantic Web?
• Version 1:
  "Semantic Web as Web of Data" (TBL)

• recipe:
  expose databases on the web,
  use RD...
Which Semantic Web?
• Version 2:
  “Enrichment of the current Web”

• recipe:
  Annotate, classify, index
• meta-data from...
Which Semantic Web?
• Version 1:
  “Semantic Web as Web of Data”

• Version 2:
  “Enrichment of the current Web”

• Differ...
Semantic Web research anno 2006:
      main streams, popular fallacies,
      curren...
First: clear up some popular
misunderstandings
False statement No :
“Semantic Web people try to
  enforce meaning from th...
First: clear up some popular
misunderstandings
False statement No :
“The Semantic Web people will require
  everybody to ...
First: clear up some popular
misunderstandings
False statement No :
“The Semantic Web will require users to
  understand ...
First: clear up some popular
misunderstandings
False statement No :
“The Semantic Web people will require us to
  manuall...
Semantic Web research anno 2006:
     main streams, popular fallacies,
     current status, future challenges
     current s...
4 hard questions on the
Semantic Web:
Q1: “where does the meta-data come from?”
• NL technology is delivering on concept-e...
Q1: Where do the ontologies
come from?
Professional bodies, scientific communities,
  companies, publishers, ….

Good old ...
Q1: Where do the ontologies
come from?
• handcrafted
  - music: CDnow (2410/5), MusicMoz (1073/7)
• community efforts
  ...
Q2: Where do the annotations
come from?
- Automated learning
- shallow natural language analysis
- Concept extraction
  Ex...
Q2: Where do the annotations
come from?
• lightweight NLP
  - Dutch language semantic search engine
• exploit existing l...
Q3: What to do with many
ontologies?
• MeSH
  - Medical Subject Headings, National Library of Medicine
  - 22.000 desc...
Q3: What to do with many
ontologies?
• Stitching all this together by hand?
Q3: What to do with many
ontologies?
• Linguistics & structure
• Shared vocabulary
• Instance-based matching
• Share...
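To make the instance-based matching named on this slide concrete, here is a minimal sketch. It assumes two vocabularies have been used to annotate the same collection of items, and it proposes a mapping between two concepts when their instance sets overlap strongly. The concept names, item identifiers, the 0.5 threshold, and the choice of Jaccard overlap are all illustrative assumptions, not something taken from the talk.

```python
from itertools import product


def jaccard(a: set, b: set) -> float:
    """Overlap of two instance sets: |a & b| / |a | b| (0.0 if both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def match_concepts(onto_a: dict, onto_b: dict, threshold: float = 0.5) -> list:
    """Propose (concept_a, concept_b, score) pairs whose instances overlap enough."""
    matches = []
    for (concept_a, items_a), (concept_b, items_b) in product(
        onto_a.items(), onto_b.items()
    ):
        score = jaccard(items_a, items_b)
        if score >= threshold:
            matches.append((concept_a, concept_b, score))
    # Strongest candidate mappings first.
    return sorted(matches, key=lambda m: m[2], reverse=True)


# Hypothetical data: two music vocabularies annotating the same CDs.
vocab_a = {
    "Jazz": {"cd1", "cd2", "cd3"},
    "Classical": {"cd4", "cd5"},
}
vocab_b = {
    "JazzAndBlues": {"cd1", "cd2", "cd3", "cd6"},
    "Orchestral": {"cd4", "cd5"},
    "Pop": {"cd7"},
}

mappings = match_concepts(vocab_a, vocab_b)
for concept_a, concept_b, score in mappings:
    print(f"{concept_a} ~ {concept_b} (overlap {score:.2f})")
```

A sketch like this only works where both vocabularies share instances; the other techniques on the slide (linguistic, structural, background knowledge) address the cases where they do not.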
Where are we now: tools
• Languages are stable
• Tooling is rapidly emerging
  - HP, IBM, Oracle, Adobe, …
  - Parsers...
Where are we now: applications
• healthy uptake in some areas:
  - knowledge management / intranets
  - data-integrati...
Semantic Web research anno 2006:
      main streams, popular fallacies,
      current status, future challenges
           ...
Semantic Web as an integrator
of many different subfields
• Databases
• Natural Language Processing
• Knowledge Representa...
Provocation…
• Ontology research is done……
  - We know how to
    make, maintain & deploy them
  - We have tools & m...
Large open questions
• Ontology learning & mapping
• emerging semantics (social & statistical)
• Semantic Web services
  -...
Changing focus
     centralised,
    formalised,
     complete,
        precise
                    distributed,
         ...
Slide by Carole Goble


  Predicting the future…
            Artificial Intelligence
            Decision making
      OWL
...
Semantic Web research anno 2006: main streams, popular fallacies, current status, future challenges

This keynote at the Cooperative Intelligent Agents Workshop was a good opportunity to give my view on the current state of Semantic Web research: what is it about, what is it not about, what has been achieved, what remains to be done. (Includes the now infamous slide "What's it like to be a machine")

  1. Semantic Web research anno 2006: main streams, popular fallacies, current status, future challenges
     Frank van Harmelen, Vrije Universiteit Amsterdam
  2. This is NOT a Semantic Web evangelization talk (I assume you are already converted)
  3. This is a “topical” talk
     Webster: “referring to the topics of the day, of temporary interest”
  4. Semantic Web research anno 2006: main streams, popular fallacies, current status, future challenges
     Main streams: which Semantic Web are we talking about?
  5. General idea of Semantic Web
     Make current web more machine accessible (currently all the intelligence is in the user)
     Motivating use-cases:
     • Search engines
       - concepts, not keywords
       - semantic narrowing/widening of queries
     • Shopbots
       - semantic interchange, not screenscraping
     • E-commerce
       - negotiation, catalogue mapping, data-integration
     • Web Services
       - need semantic characterisations to find them
     • Navigation
       - by semantic proximity, not hardwired links
  6. General idea of Semantic Web (2)
     Do this by:
     • Making data and meta-data available on the Web in machine-understandable form (formalised)
     • Structuring the data and meta-data in ontologies
     These are non-trivial design decisions. Alternative would be:
  7. “machine-understandable form” (What it’s like to be a machine)
     [Diagram labels: alleviates, META-DATA, <treatment>, <name>, <symptoms>, IS-A, <disease>, <drug>, <drug administration>]
  8. Expressed using the W3C stack
  9. Which Semantic Web?
     • Version 1: "Semantic Web as Web of Data" (TBL)
     • recipe: expose databases on the web, use RDF, integrate
     • meta-data from:
       - expressing DB schema semantics in machine interpretable ways
     • enable integration and unexpected re-use
  10. Which Semantic Web?
     • Version 2: “Enrichment of the current Web”
     • recipe: annotate, classify, index
     • meta-data from:
       - automatically producing markup: named-entity recognition, concept extraction, tagging, etc.
     • enable personalisation, search, browse, ...
  11. Which Semantic Web?
     • Version 1: “Semantic Web as Web of Data”
     • Version 2: “Enrichment of the current Web”
     • Different use-cases
     • Different techniques
     • Different users
  12. Semantic Web research anno 2006: main streams, popular fallacies, current status, future challenges
     Four popular fallacies about the Semantic Web
  13. First: clear up some popular misunderstandings
     False statement No : “Semantic Web people try to enforce meaning from the top”
     They only “enforce” a language. They don’t enforce what is said in that language.
     Compare: HTML was “enforced” from the top, but content is entirely free.
  14. First: clear up some popular misunderstandings
     False statement No : “The Semantic Web people will require everybody to subscribe to a single predefined "meaning" for the terms we use.”
     Of course, meaning is fluid, contextual, etc.
     Lots of work on (semi-)automatically bridging between different vocabularies.
  15. First: clear up some popular misunderstandings
     False statement No : “The Semantic Web will require users to understand the complicated details of formalised knowledge representation.”
     All of this is “under the hood”.
  16. First: clear up some popular misunderstandings
     False statement No : “The Semantic Web people will require us to manually markup all the existing web-pages.”
     Lots of work on automatically producing semantic markup: named-entity recognition, concept extraction, etc.
  17. Semantic Web research anno 2006: main streams, popular fallacies, current status, future challenges
     The current state of Semantic Web
  18. 4 hard questions on the Semantic Web:
     Q1: “where does the meta-data come from?”
     • NL technology is delivering on concept-extraction
     • Socially emerging (learning from tagging)
     Q2: “where do the meta-data-schema come from?”
     • many handcrafted schema
     • hierarchy learning remains hard
     • relation extraction remains hard
     Q3: “what to do with many meta-data schema?”
     • ontology mapping/aligning remains VERY hard
     Q4: “where’s the ‘Web’ in the Semantic Web?”
     • more attention to social aspects (P2P, FOAF)
     • non-textual media remains hard
     • deal with typical Web requirements
  19. Q1: Where do the ontologies come from?
     Professional bodies, scientific communities, companies, publishers, …
     Good old fashioned Knowledge Engineering
     Convert from DB-schema, UML, etc.
     Learning remains very hard…
  20. Q1: Where do the ontologies come from?
     • handcrafted
       - music: CDnow (2410/5), MusicMoz (1073/7)
     • community efforts
       - biomedical: SNOMED (200k), GO (15k)
     • commercial: Emtree (45k+190k)
     • ranging from lightweight (Yahoo) to heavyweight (Cyc)
     • ranging from small (METAR) to large (UNSPSC)
  21. Q2: Where do the annotations come from?
     - Automated learning
     - shallow natural language analysis
     - Concept extraction
     Example: Encyclopedia Britannica on “Amsterdam”: trade, antwerp, europe, amsterdam, netherlands, merchant, center, city, town
  22. Q2: Where do the annotations come from?
     • lightweight NLP
       - Dutch language semantic search engine
     • exploit existing legacy-data
       - Amazon
       - Lab equipment
     • side-effect from user interaction
       - MIT Lab photo-annotator
     • NOT from manual effort
  23. Q3: What to do with many ontologies?
     • MeSH
       - Medical Subject Headings, National Library of Medicine
       - 22.000 descriptions
     • EMTREE
       - Commercial Elsevier, Drugs and diseases
       - 45.000 terms, 190.000 synonyms
     • UMLS
       - Integrates 100 different vocabularies
     • SNOMED
       - 200.000 concepts, College of American Pathologists
     • Gene Ontology
       - 15.000 terms in molecular biology
     • NCI Cancer Ontology:
       - 17,000 classes (about 1M definitions)
  24. Q3: What to do with many ontologies?
     • Stitching all this together by hand?
  25. Q3: What to do with many ontologies?
     • Linguistics & structure
     • Shared vocabulary
     • Instance-based matching
     • Shared background knowledge
  26. Where are we now: tools
     • Languages are stable
     • Tooling is rapidly emerging
       - HP, IBM, Oracle, Adobe, …
       - Parsers
       - Editors
       - Visualisers
       - Large scale storage and querying
       - Portal generation, search
  27. Where are we now: applications
     • healthy uptake in some areas:
       - knowledge management / intranets
       - data-integration
       - life-sciences
       - convergence with Semantic Grid
       - cultural heritage
     • still very few applications in
       - personalisation
       - mobility/context awareness
     • Most applications for companies, few applications for the public
  28. Semantic Web research anno 2006: main streams, popular fallacies, current status, future challenges
     Future directions/challenges
  29. Semantic Web as an integrator of many different subfields
     • Databases
     • Natural Language Processing
     • Knowledge Representation
     • Machine Learning
     • Information Retrieval
     • Agents
     • HCI
     • …
  30. Provocation…
     • Ontology research is done…
       - We know how to make, maintain & deploy them
       - We have tools & methods for editing, storing, inferencing, visualising, etc.
     • … except for two problems:
       - Learning
       - Mapping
     • Natural lang. technology is also done…
       - at least it’s good enough
  31. Large open questions
     • Ontology learning & mapping
     • emerging semantics (social & statistical)
     • Semantic Web services
       - discovery, composition: realistic?
     • non-textual media
       - the semantic gap: text or social?
     • Deployment:
       1. data-integration
       2. search
       3. personalisation
  32. Changing focus
     centralised, formalised, complete, precise → distributed, heterogeneous, open, P2P, approximate, lightweight
     Web 3.0 = Web 2.0 + Semantic Web
  33. Predicting the future… (slide by Carole Goble)
     [Diagram placing technologies by amount of Semantics (“Not much” to “Lots”) and amount of Web (“Not much” to “Lots”). Labels: Artificial Intelligence, Decision making, OWL, SWRL, Knowledge Discovery, Ontology Building, Semantic Web Services, Information linking, NLP, RDF, FOAF, RSS, Social bookmarking, Flexible & extensible Metadata schemas, Collective Intelligence]
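
The "Web of Data" recipe from slide 9 (expose databases on the web, use RDF, integrate) can be illustrated with a small sketch. It does not use a real RDF library: plain Python tuples stand in for RDF triples, and the `http://example.org/` namespace, the table contents, and the helper names `rows_to_triples` and `describe` are all hypothetical. The point is only that once two databases name the same thing with the same identifier, integration reduces to merging their triples.

```python
BASE = "http://example.org/"  # hypothetical namespace, for illustration only


def rows_to_triples(rows: list, key: str) -> set:
    """Expose database rows as (subject, predicate, object) triples.

    The value in column `key` is turned into the subject identifier;
    every other column becomes one predicate/object pair.
    """
    triples = set()
    for row in rows:
        subject = BASE + "book/" + row[key]
        for column, value in row.items():
            if column != key:
                triples.add((subject, BASE + column, value))
    return triples


def describe(graph: set, subject: str) -> dict:
    """Collect everything the merged graph says about one subject."""
    return {predicate: obj for s, predicate, obj in graph if s == subject}


# Two independent (made-up) databases that happen to share ISBNs as keys.
shop_rows = [{"isbn": "0001", "price": 25}]
library_rows = [
    {"isbn": "0001", "title": "Example Title"},
    {"isbn": "0002", "title": "Another Title"},
]

# "Integrate": because both sources use the same subject identifiers,
# merging is a plain set union of their triples.
graph = rows_to_triples(shop_rows, "isbn") | rows_to_triples(library_rows, "isbn")

book = describe(graph, BASE + "book/0001")
print(book)
```

In this toy setting the shop never stored titles and the library never stored prices, yet `describe` returns both for the shared book, which is the kind of unexpected re-use the slide refers to. Real deployments would need the schema-level agreement (or mapping) that the talk identifies as the hard part.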