Semantic Web research anno 2006: main streams, popular fallacies, current status, future challenges

This keynote at the Cooperative Intelligent Agents Workshop was a good opportunity to give my view on the current state of Semantic Web research: what is it about, what is it not about, what has been achieved, what remains to be done. (Includes the now infamous slide "What's it like to be a machine")

Published in: Technology, Education

Transcript

  • 1. Semantic Web research anno 2006: main streams, popular fallacies, current status, future challenges. Frank van Harmelen, Vrije Universiteit Amsterdam
  • 2. This is NOT a Semantic Web evangelization talk (I assume you are already converted)
  • 3. This is a “topical” talk: Webster: “referring to the topics of the day, of temporary interest”
  • 4. Main streams: which Semantic Web are we talking about?
  • 5. General idea of Semantic Web: make the current web more machine accessible (currently all the intelligence is in the user). Motivating use-cases:
      - Search engines: concepts, not keywords; semantic narrowing/widening of queries
      - Shopbots: semantic interchange, not screenscraping
      - E-commerce: negotiation, catalogue mapping, data-integration
      - Web Services: need semantic characterisations to find them
      - Navigation: by semantic proximity, not hardwired links
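
As an illustration of "semantic narrowing/widening of queries" from slide 5, here is a minimal sketch. It assumes a tiny hand-written IS-A hierarchy; the concept names and the `widen`/`narrow` helpers are hypothetical and do not come from the talk.

```python
# Hypothetical IS-A hierarchy: child concept -> parent concept.
IS_A = {
    "aspirin": "painkiller",
    "ibuprofen": "painkiller",
    "painkiller": "drug",
    "antibiotic": "drug",
}


def widen(concept):
    """Return the concept followed by all of its ancestors (more general terms)."""
    result = [concept]
    while concept in IS_A:
        concept = IS_A[concept]
        result.append(concept)
    return result


def narrow(concept):
    """Return the concept plus all of its descendants (more specific terms)."""
    result = [concept]
    for child, parent in IS_A.items():
        if parent == concept:
            result.extend(narrow(child))
    return result


print(widen("aspirin"))              # ['aspirin', 'painkiller', 'drug']
print(sorted(narrow("painkiller")))  # ['aspirin', 'ibuprofen', 'painkiller']
```

A search engine could use `widen` when a query returns too few hits and `narrow` when it returns too many; a keyword-only engine has no hierarchy to do either.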
  • 6. General idea of Semantic Web (2). Do this by:
      - Making data and meta-data available on the Web in machine-understandable form (formalised)
      - Structuring the data and meta-data in ontologies
    Callout: These are non-trivial design decisions. Alternative would be:
  • 7. “Machine-understandable form” (What it’s like to be a machine). Diagram: META-DATA tags <treatment>, <name>, <symptoms>, <disease>, <drug>, <drug administration>, connected by relations labelled IS-A and alleviates.
  • 8. Expressed using the W3C stack
  • 9. Which Semantic Web? Version 1: "Semantic Web as Web of Data" (TBL)
      - recipe: expose databases on the web, use RDF, integrate
      - meta-data from: expressing DB schema semantics in machine interpretable ways
      - enable integration and unexpected re-use
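
The "expose databases, use RDF, integrate" recipe on slide 9 can be sketched roughly as follows. This is a toy model under stated assumptions: plain Python tuples stand in for RDF triples (real RDF would use URIs and an RDF library), and the two small "databases" and the helper names are hypothetical.

```python
# Two hypothetical "databases", each a list of rows.
drug_db = [{"id": "aspirin", "treats": "headache"}]
symptom_db = [{"id": "headache", "type": "symptom"}]


def rows_to_triples(rows, key="id"):
    """Turn each row into (subject, predicate, object) triples."""
    triples = set()
    for row in rows:
        subject = row[key]
        for column, value in row.items():
            if column != key:
                triples.add((subject, column, value))
    return triples


# Once both sources share one data model, integration is a set union.
graph = rows_to_triples(drug_db) | rows_to_triples(symptom_db)


def objects(graph, subject, predicate):
    """Return all objects o such that (subject, predicate, o) is in the graph."""
    return {o for s, p, o in graph if s == subject and p == predicate}


# A query that spans both sources: what kind of thing does aspirin treat?
for target in objects(graph, "aspirin", "treats"):
    print(target, objects(graph, target, "type"))  # headache {'symptom'}
```

Neither source was designed with the other in mind, which is the "unexpected re-use" the slide refers to: the join works because both are expressed in the same triple model.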
  • 10. Which Semantic Web? Version 2: “Enrichment of the current Web”
      - recipe: annotate, classify, index
      - meta-data from: automatically producing markup (named-entity recognition, concept extraction, tagging, etc.)
      - enable personalisation, search, browse, ...
  • 11. Which Semantic Web?
      - Version 1: “Semantic Web as Web of Data”
      - Version 2: “Enrichment of the current Web”
      - Different use-cases
      - Different techniques
      - Different users
  • 12. Four popular fallacies about the Semantic Web
  • 13. First: clear up some popular misunderstandings. False statement: “Semantic Web people try to enforce meaning from the top.” They only “enforce” a language; they don’t enforce what is said in that language. Compare: HTML is “enforced” from the top, but content is entirely free.
  • 14. First: clear up some popular misunderstandings. False statement: “The Semantic Web people will require everybody to subscribe to a single predefined "meaning" for the terms we use.” Of course, meaning is fluid, contextual, etc. Lots of work on (semi-)automatically bridging between different vocabularies.
  • 15. First: clear up some popular misunderstandings. False statement: “The Semantic Web will require users to understand the complicated details of formalised knowledge representation.” All of this is “under the hood”.
  • 16. First: clear up some popular misunderstandings. False statement: “The Semantic Web people will require us to manually markup all the existing web-pages.” Lots of work on automatically producing semantic markup: named-entity recognition, concept extraction, etc.
  • 17. The current state of the Semantic Web
  • 18. 4 hard questions on the Semantic Web:
      - Q1: “where does the meta-data come from?” NL technology is delivering on concept-extraction; socially emerging (learning from tagging).
      - Q2: “where do the meta-data schemas come from?” Many handcrafted schemas; hierarchy learning remains hard; relation extraction remains hard.
      - Q3: “what to do with many meta-data schemas?” Ontology mapping/aligning remains VERY hard.
      - Q4: “where’s the ‘Web’ in the Semantic Web?” More attention to social aspects (P2P, FOAF); non-textual media remains hard; deal with typical Web requirements.
  • 19. Q1: Where do the ontologies come from?
      - Professional bodies, scientific communities, companies, publishers, …
      - Good old fashioned Knowledge Engineering
      - Convert from DB-schema, UML, etc.
      - Learning remains very hard…
  • 20. Q1: Where do the ontologies come from?
      - handcrafted: music: CDnow (2410/5), MusicMoz (1073/7)
      - community efforts: biomedical: SNOMED (200k), GO (15k)
      - commercial: Emtree (45k+190k)
      - ranging from lightweight (Yahoo) to heavyweight (Cyc)
      - ranging from small (METAR) to large (UNSPSC)
  • 21. Q2: Where do the annotations come from?
      - Automated learning
      - Shallow natural language analysis
      - Concept extraction
    Example: Encyclopedia Britannica on “Amsterdam” yields the concepts trade, antwerp, europe, amsterdam, netherlands, merchant, center, city, town.
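
To make "shallow natural language analysis" on slide 21 concrete, here is a deliberately naive sketch: it counts non-stopword terms and returns the most frequent ones. The talk does not say which technique produced the Britannica example, so frequency counting is a stand-in; the stopword list, the `extract_concepts` helper, and the sample sentence (not the Britannica entry) are all assumptions.

```python
import re
from collections import Counter

# Small, hypothetical stopword list; a real system would use a fuller one.
STOPWORDS = {"the", "a", "of", "and", "is", "in", "was", "for", "its", "as", "to"}


def extract_concepts(text, top_n=3):
    """Return the top_n most frequent non-stopword terms in the text."""
    words = re.findall(r"[a-z]+", text.lower())
    counts = Counter(w for w in words if w not in STOPWORDS)
    return [word for word, _ in counts.most_common(top_n)]


# Illustrative sample text, written for this sketch.
sample = (
    "Amsterdam is a city in the Netherlands. "
    "The city was a center of trade, and trade made the city rich."
)
print(extract_concepts(sample, top_n=2))  # ['city', 'trade']
```

Real concept extraction would at least add stemming, multi-word terms, and a mapping from terms to ontology concepts; the point here is only that useful annotations can come from shallow processing rather than manual markup.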
  • 22. Q2: Where do the annotations come from?
      - lightweight NLP: Dutch language semantic search engine
      - exploit existing legacy-data: Amazon, lab equipment
      - side-effect from user interaction: MIT Lab photo-annotator
      - NOT from manual effort
  • 23. Q3: What to do with many ontologies?
      - MeSH: Medical Subject Headings, National Library of Medicine; 22,000 descriptions
      - EMTREE: commercial (Elsevier), drugs and diseases; 45,000 terms, 190,000 synonyms
      - UMLS: integrates 100 different vocabularies
      - SNOMED: 200,000 concepts, College of American Pathologists
      - Gene Ontology: 15,000 terms in molecular biology
      - NCI Cancer Ontology: 17,000 classes (about 1M definitions)
  • 24. Q3: What to do with many ontologies? Stitching all this together by hand?
  • 25. Q3: What to do with many ontologies?
      - Linguistics & structure
      - Shared vocabulary
      - Instance-based matching
      - Shared background knowledge
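
Of the mapping approaches listed on slide 25, instance-based matching is the easiest to sketch: two concepts are proposed as a match when they classify largely the same instances. The example below uses Jaccard overlap as the similarity measure, which is one common choice and an assumption here; the two catalogues, their instance identifiers, and the `match` helper are hypothetical.

```python
# Two hypothetical vocabularies: concept name -> set of instance identifiers.
catalogue_a = {
    "Jazz": {"cd1", "cd2", "cd3"},
    "Classical": {"cd4", "cd5"},
}
catalogue_b = {
    "JazzAndBlues": {"cd1", "cd2", "cd3", "cd6"},
    "Orchestral": {"cd4", "cd5"},
}


def jaccard(x, y):
    """Size of the intersection divided by size of the union (0.0 if both empty)."""
    union = x | y
    return len(x & y) / len(union) if union else 0.0


def match(source, target, threshold=0.5):
    """Map each source concept to its best-overlapping target concept."""
    mapping = {}
    for s_name, s_instances in source.items():
        best = max(target, key=lambda t: jaccard(s_instances, target[t]))
        score = jaccard(s_instances, target[best])
        if score >= threshold:
            mapping[s_name] = (best, score)
    return mapping


print(match(catalogue_a, catalogue_b))
# {'Jazz': ('JazzAndBlues', 0.75), 'Classical': ('Orchestral', 1.0)}
```

Note that the concept names play no role in the result; that is what distinguishes this approach from the linguistic one, and also why it only works when the two vocabularies have been applied to overlapping data.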
  • 26. Where are we now: tools
      - Languages are stable
      - Tooling is rapidly emerging:
          - HP, IBM, Oracle, Adobe, …
          - Parsers
          - Editors
          - Visualisers
          - Large scale storage and querying
          - Portal generation, search
  • 27. Where are we now: applications
      - healthy uptake in some areas: knowledge management / intranets; data-integration; life-sciences; convergence with Semantic Grid; cultural heritage
      - still very few applications in: personalisation; mobility/context awareness
      - most applications for companies, few applications for the public
  • 28. Future directions/challenges
  • 29. Semantic Web as an integrator of many different subfields: Databases, Natural Language Processing, Knowledge Representation, Machine Learning, Information Retrieval, Agents, HCI, …
  • 30. Provocation…
      - Ontology research is done…
          - We know how to make, maintain & deploy them
          - We have tools & methods for editing, storing, inferencing, visualising, etc.
      - … except for two problems: learning and mapping
      - Natural language technology is also done… at least it’s good enough
  • 31. Large open questions
      - Ontology learning & mapping
      - Emerging semantics (social & statistical)
      - Semantic Web services: discovery, composition: realistic?
      - Non-textual media: the semantic gap: text or social?
      - Deployment: 1. data-integration, 2. search, 3. personalisation
  • 32. Changing focus: from centralised, formalised, complete, precise to distributed, heterogeneous, open, P2P, approximate, lightweight. Web 3.0 = Web 2.0 + Semantic Web
  • 33. Predicting the future… (slide by Carole Goble). Diagram with two axes, Semantics and Web, each running from "Not much" to "Lots". Labels placed on it: Artificial Intelligence, Decision making, OWL, SWRL, Knowledge Discovery, Ontology Building, Semantic Web Services, Information linking, NLP, RDF, FOAF, Flexible & extensible Metadata schemas, Social bookmarking, RSS, Collective Intelligence.