Semantic Web research
    anno 2006:
 main streams, popular fallacies,
current status, future challenges

        Frank van Harmelen
   Vrije Universiteit Amsterdam
This is NOT
a Semantic Web
evangelization talk
(I assume
 you are already
 converted)
This is a “topical” talk:

Webster:
“referring to the topics of the day,
of temporary interest”
Semantic Web research anno 2006:
      main streams, popular fallacies,
      current status, future challenges




Which Semantic Web
are we talking about?
General idea of Semantic Web
Make current web more machine accessible
 (currently all the intelligence is in the user)
Motivating use-cases
 Search engines
   • concepts, not keywords
   • semantic narrowing/widening of queries
 Shopbots
   • semantic interchange, not screenscraping
 E-commerce
   • Negotiation, catalogue mapping, data-integration
 Web Services
   • Need semantic characterisations to find them
 Navigation
   • by semantic proximity, not hardwired links
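A minimal sketch of what "semantic narrowing/widening of queries" could look like, assuming a toy concept hierarchy. The concepts and the `PARENT` table are hypothetical and not taken from the talk.

```python
# Hypothetical toy hierarchy (child -> parent); not from the talk.
PARENT = {
    "laptop": "computer",
    "desktop": "computer",
    "computer": "electronics",
    "camera": "electronics",
}


def widen(concept):
    """Return the more general concept, or None at the top of the hierarchy."""
    return PARENT.get(concept)


def narrow(concept):
    """Return the more specific concepts directly below the given one."""
    return sorted(child for child, parent in PARENT.items() if parent == concept)
```

A keyword engine would treat "laptop" and "computer" as unrelated strings; with the hierarchy, a query with too few hits can be widened and one with too many can be narrowed.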
General idea of Semantic Web (2)

Do this by:

 Making data and meta-data
  available on the Web
  in machine-understandable form
  (formalised)
 Structure the data and meta-data in
  ontologies

These are non-trivial design decisions.
Alternative would be:
“machine-understandable form”
     (What it’s like to be a machine)
[Figure: META-DATA tags <treatment>, <name>, <symptoms>, <disease>, <drug> and <drug administration>, with the relation labels IS-A and “alleviates”.]
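The labels on this slide (IS-A, alleviates, drug, treatment) can be read as subject-relation-object statements. A minimal sketch follows, assuming example triples that are NOT given on the slide; only the relation names and tag names come from it. To a machine the tag names are opaque strings, and only the formalised IS-A links let it conclude anything.

```python
# Assumed example statements; the slide shows the labels, not these edges.
TRIPLES = {
    ("oral_administration", "IS-A", "drug_administration"),
    ("drug_administration", "IS-A", "treatment"),
    ("aspirin", "IS-A", "drug"),           # hypothetical instance data
    ("aspirin", "alleviates", "headache"),  # hypothetical instance data
}


def ancestors(term, triples=TRIPLES):
    """Follow IS-A links transitively and return every more general term."""
    found = set()
    frontier = [term]
    while frontier:
        current = frontier.pop()
        for subject, relation, obj in triples:
            if subject == current and relation == "IS-A" and obj not in found:
                found.add(obj)
                frontier.append(obj)
    return found
```

Here `ancestors("oral_administration")` yields both `drug_administration` and `treatment`, a conclusion that is stated nowhere explicitly.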
Expressed using the W3C stack




Which Semantic Web?
 Version 1:
  "Semantic Web as Web of Data" (TBL)


 recipe:
  expose databases on the web,
  use RDF, integrate
 meta-data from:
   • expressing DB schema semantics
      in machine interpretable ways
 enable integration and unexpected re-use
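A rough sketch of the "expose databases, use RDF, integrate" recipe, assuming a made-up `http://example.org/` namespace and hypothetical table and column names. It writes each database row as N-Triples-style lines using plain string formatting rather than an RDF library.

```python
BASE = "http://example.org/"  # placeholder namespace, an assumption


def literal(value):
    """Quote a value as a plain literal (newlines are not handled here)."""
    text = str(value).replace("\\", "\\\\").replace('"', '\\"')
    return f'"{text}"'


def row_to_triples(table, key, row):
    """Turn one database row (a dict) into a list of triple lines."""
    subject = f"<{BASE}{table}/{row[key]}>"
    return [
        f"{subject} <{BASE}schema/{column}> {literal(value)} ."
        for column, value in row.items()
        if column != key
    ]


def integrate(*sources):
    """Integration at this level is just the union of all triples."""
    merged = set()
    for triples in sources:
        merged.update(triples)
    return merged
```

Because every statement is self-describing, merging two exported databases needs no schema negotiation up front. Whether the merged data is meaningful still depends on the two sources agreeing on identifiers and vocabulary.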
Which Semantic Web?
 Version 2:
  “Enrichment of the current Web”

 recipe:
  Annotate, classify, index
 meta-data from:
   • automatically producing markup:
      named-entity recognition,
      concept extraction, tagging, etc.
 enable personalisation, search, browse,..
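As an illustration of automatically produced markup, here is a minimal sketch of the simplest form of named-entity recognition: a dictionary (gazetteer) lookup. The entries in `GAZETTEER` are hypothetical, and real systems are far more sophisticated than this.

```python
import re

# Hypothetical gazetteer: surface form -> entity type.
GAZETTEER = {"Amsterdam": "City", "Netherlands": "Country"}

# One pattern that matches any gazetteer entry as a whole word.
PATTERN = re.compile(r"\b(" + "|".join(map(re.escape, GAZETTEER)) + r")\b")


def annotate(text):
    """Return (surface form, entity type, start offset) for each match."""
    return [
        (match.group(1), GAZETTEER[match.group(1)], match.start())
        for match in PATTERN.finditer(text)
    ]
```

The output is stand-off annotation: the page itself is untouched and the meta-data points into it by offset.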
Which Semantic Web?
 Version 1:
  “Semantic Web as Web of Data”

 Version 2:
  “Enrichment of the current Web”

 Different use-cases
 Different techniques
 Different users

Semantic Web research anno 2006:
      main streams, popular fallacies,
      current status, future challenges




Four popular fallacies
about the Semantic Web
First: clear up some popular
misunderstandings
False statement No :
“Semantic Web people try to
  enforce meaning from the top”

They only “enforce” a language.
They don’t enforce what is said in that language

Compare: HTML is “enforced” from the top,
but its content is entirely free.
First: clear up some popular
misunderstandings
False statement No :
“The Semantic Web people will require
  everybody to subscribe to a single predefined
  "meaning" for the terms we use.”

Of course, meaning is fluid, contextual, etc.

Lots of work on (semi-)automatically
bridging between different vocabularies.
First: clear up some popular
misunderstandings
False statement No :
“The Semantic Web will require users to
  understand the complicated details of
  formalised knowledge representation.”

All of this is “under the hood”.




First: clear up some popular
misunderstandings
False statement No :
“The Semantic Web people will require us to
  manually markup all the existing web-pages.”


Lots of work on automatically producing
semantic markup:

named-entity recognition,
concept extraction, etc.
Semantic Web research anno 2006:
      main streams, popular fallacies,
      current status, future challenges




The current state of
Semantic Web
4 hard questions on the
Semantic Web:
Q1: “where does the meta-data come from?”
 NL technology is delivering on concept-extraction
 Socially emerging (learning from tagging).
Q2: “where do the meta-data schemas
  come from?”
 many handcrafted schemas
 hierarchy learning remains hard
 relation extraction remains hard.
Q3: “what to do with many meta-data schemas?”
 ontology mapping/aligning remains VERY hard.
Q4: “where’s the ‘Web’ in the Semantic Web?”
 more attention to social aspects (P2P, FOAF)
 non-textual media remains hard
 deal with typical Web requirements.
Q1: Where do the ontologies
come from?
Professional bodies, scientific communities,
  companies, publishers, ….

Good old fashioned Knowledge Engineering


Convert from DB-schema, UML, etc.

Learning remains very hard…



Q1: Where do the ontologies
come from?
 handcrafted
   • music: CDnow (2410/5), MusicMoz (1073/7)
 community efforts
   • biomedical: SNOMED (200k), GO (15k)
 commercial: Emtree (45k+190k)
 ranging from lightweight (Yahoo)
  to heavyweight (Cyc)
 ranging from small (METAR)
  to large (UNSPSC)
Q2: Where do the annotations
come from?
- Automated learning
- shallow natural language analysis
- Concept extraction
  Example: Encyclopedia Britannica on “Amsterdam”
  [Figure: network of extracted concepts around “amsterdam”: trade, antwerp, europe, netherlands, merchant, center, city, town.]
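The slide does not say how the concept network for “Amsterdam” was produced. As a stand-in, here is a minimal sketch of one shallow technique, co-occurrence counting: terms that often share a sentence with the target word become its neighbours. The function name and the stopword handling are assumptions made for illustration.

```python
import re
from collections import Counter


def cooccurring_terms(sentences, target, stopwords=frozenset()):
    """Count words that appear in the same sentence as the target word."""
    counts = Counter()
    for sentence in sentences:
        words = set(re.findall(r"[a-z]+", sentence.lower()))
        if target in words:
            # Each sentence contributes at most one count per word.
            counts.update(words - {target} - set(stopwords))
    return counts
```

Calling `cooccurring_terms(sentences, "amsterdam").most_common(8)` on an encyclopedia entry would give a ranked list of candidate neighbours. Turning that list into typed relations is the part the talk describes as still hard.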
Q2: Where do the annotations
come from?
 lightweight NLP
   • Dutch language semantic search engine
 exploit existing legacy-data
   • Amazon
   • Lab equipment
 side-effect from user interaction
   • MIT Lab photo-annotator
 NOT from manual effort
Q3: What to do with many
ontologies?
 MeSH
   • Medical Subject Headings, National Library of Medicine
   • 22,000 descriptions
 EMTREE
   • Commercial (Elsevier), drugs and diseases
   • 45,000 terms, 190,000 synonyms
 UMLS
   • Integrates 100 different vocabularies
 SNOMED
   • 200,000 concepts, College of American Pathologists
 Gene Ontology
   • 15,000 terms in molecular biology
 NCI Cancer Ontology
   • 17,000 classes (about 1M definitions)
Q3: What to do with many
ontologies?
 Stitching all this together by hand?




Q3: What to do with many
ontologies?
 Linguistics & structure


 Shared vocabulary


 Instance-based matching


 Shared background knowledge
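Of the four approaches listed, instance-based matching is the easiest to sketch: two concepts from different ontologies are candidate matches when largely the same items have been classified under both. The Jaccard measure, the 0.5 threshold and the function names below are assumptions chosen for illustration, not the method of any particular system.

```python
def jaccard(a, b):
    """Overlap of two instance sets: |intersection| / |union|."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def match_concepts(onto_a, onto_b, threshold=0.5):
    """Each ontology maps concept name -> set of instance identifiers.

    Returns (concept_a, concept_b, score) for pairs at or above the
    threshold, best matches first.
    """
    matches = []
    for concept_a, instances_a in onto_a.items():
        for concept_b, instances_b in onto_b.items():
            score = jaccard(instances_a, instances_b)
            if score >= threshold:
                matches.append((concept_a, concept_b, score))
    return sorted(matches, key=lambda match: -match[2])
```

This only works when both ontologies have been used to annotate a shared set of instances, which is why it sits alongside the linguistic, shared-vocabulary and background-knowledge approaches rather than replacing them.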
Where are we now: tools
 Languages are stable
 Tooling is rapidly emerging
   • HP, IBM, Oracle, Adobe, …
   • Parsers,
   • Editors,
   • visualisers,
   • large scale storage and querying
   • Portal generation, search


Where are we now: applications
 healthy uptake in some areas:
     knowledge management / intranets
     data-integration
     life-sciences
     convergence with Semantic Grid
     cultural heritage
 still very few applications in
     personalisation
     mobility/context awareness
 Most applications for companies,
  few applications for the public
Semantic Web research anno 2006:
      main streams, popular fallacies,
      current status, future challenges




Future
directions/challenges
Semantic Web as an integrator
of many different subfields
 Databases
 Natural Language Processing
 Knowledge Representation
 Machine Learning
 Information Retrieval
 Agents
 HCI
 ….
Provocation…
 Ontology research is done……
   • We know how to
      make, maintain & deploy them
   • We have tools & methods for
      editing, storing, inferencing, visualising, etc.
 … except for two problems:
   • Learning
   • Mapping
 Natural language technology is also done…
   • at least it’s good enough
Large open questions
 Ontology learning & mapping
 emerging semantics (social & statistical)
 Semantic Web services
   • discovery, composition: realistic?
 non-textual media
   • the semantic gap: text or social?
 Deployment:
  1. data-integration
  2. search
  3. personalisation
Changing focus
 from: centralised, formalised, complete, precise
 to: distributed, heterogeneous, open, P2P, approximate, lightweight

Web 3.0 = Web 2.0 + Semantic Web
Slide by Carole Goble
  Predicting the future…
[Figure: two-axis chart of Semantics (vertical, from “Not much” to “Lots”) against Web (horizontal, from “Not much” to “Lots”). Items plotted: Artificial Intelligence, Decision making, OWL, SWRL, Knowledge Discovery, Ontology Building, Semantic Web Services, Information linking, NLP, Flexible & extensible Metadata schemas, RDF, FOAF, Social bookmarking, RSS, Collective Intelligence.]

A Year of the Servo Reboot: Where Are We Now?
 
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot TakeoffStrategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
 
Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...
 
Advantages of Hiring UIUX Design Service Providers for Your Business
Advantages of Hiring UIUX Design Service Providers for Your BusinessAdvantages of Hiring UIUX Design Service Providers for Your Business
Advantages of Hiring UIUX Design Service Providers for Your Business
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected Worker
 

Semantic Web research anno 2006: main streams, popular fallacies, current status, future challenges

  • 1. Semantic Web research anno 2006: main streams, popular fallacies, current status, future challenges. Frank van Harmelen, Vrije Universiteit Amsterdam
  • 2. This is NOT a Semantic Web evangelization talk (I assume you are already converted)
  • 3. This is a “topical” talk: Webster: “referring to the topics of the day, of temporary interest”
  • 4. Semantic Web research anno 2006: main streams, popular fallacies, current status, future challenges. Main streams: which Semantic Web are we talking about?
  • 5. General idea of Semantic Web: make the current web more machine accessible (currently all the intelligence is in the user). Motivating use-cases: search engines (concepts, not keywords; semantic narrowing/widening of queries); shopbots (semantic interchange, not screenscraping); e-commerce (negotiation, catalogue mapping, data-integration); Web Services (need semantic characterisations to find them); navigation (by semantic proximity, not hardwired links)
  • 6. General idea of Semantic Web (2). Do this by: making data and meta-data available on the Web in machine-understandable form (formalised); structuring the data and meta-data in ontologies. These are non-trivial design decisions. Alternative would be:
  • 7. “machine-understandable form” (What it’s like to be a machine). Diagram labels: META-DATA, alleviates, IS-A, <treatment>, <name>, <symptoms>, <disease>, <drug>, <drug administration>
  • 8. Expressed using the W3C stack
  • 9. Which Semantic Web? Version 1: "Semantic Web as Web of Data" (TBL). Recipe: expose databases on the web, use RDF, integrate. Meta-data from: expressing DB schema semantics in machine interpretable ways. Enable integration and unexpected re-use
  • 10. Which Semantic Web? Version 2: “Enrichment of the current Web”. Recipe: annotate, classify, index. Meta-data from: automatically producing markup (named-entity recognition, concept extraction, tagging, etc.). Enable personalisation, search, browse, ...
  • 11. Which Semantic Web? Version 1: “Semantic Web as Web of Data”. Version 2: “Enrichment of the current Web”. Different use-cases, different techniques, different users
  • 12. Semantic Web research anno 2006: main streams, popular fallacies, current status, future challenges. Popular fallacies: four popular fallacies about the Semantic Web
  • 13. First: clear up some popular misunderstandings. False statement: “Semantic Web people try to enforce meaning from the top”. They only “enforce” a language. They don’t enforce what is said in that language. Compare: HTML was “enforced” from the top, but content is entirely free.
  • 14. First: clear up some popular misunderstandings. False statement: “The Semantic Web people will require everybody to subscribe to a single predefined "meaning" for the terms we use.” Of course, meaning is fluid, contextual, etc. Lots of work on (semi-)automatically bridging between different vocabularies.
  • 15. First: clear up some popular misunderstandings. False statement: “The Semantic Web will require users to understand the complicated details of formalised knowledge representation.” All of this is “under the hood”.
  • 16. First: clear up some popular misunderstandings. False statement: “The Semantic Web people will require us to manually markup all the existing web-pages.” Lots of work on automatically producing semantic markup: named-entity recognition, concept extraction, etc.
  • 17. Semantic Web research anno 2006: main streams, popular fallacies, current status, future challenges. Current status: the current state of Semantic Web
  • 18. 4 hard questions on the Semantic Web. Q1: “where does the meta-data come from?” NL technology is delivering on concept-extraction; socially emerging (learning from tagging). Q2: “where do the meta-data schema come from?” Many handcrafted schema; hierarchy learning remains hard; relation extraction remains hard. Q3: “what to do with many meta-data schema?” Ontology mapping/aligning remains VERY hard. Q4: “where’s the ‘Web’ in the Semantic Web?” More attention to social aspects (P2P, FOAF); non-textual media remains hard; deal with typical Web requirements.
  • 19. Q1: Where do the ontologies come from? Professional bodies, scientific communities, companies, publishers, … Good old fashioned Knowledge Engineering. Convert from DB-schema, UML, etc. Learning remains very hard…
  • 20. Q1: Where do the ontologies come from? Handcrafted: music: CDnow (2410/5), MusicMoz (1073/7). Community efforts: biomedical: SNOMED (200k), GO (15k). Commercial: Emtree (45k+190k). Ranging from lightweight (Yahoo) to heavyweight (Cyc); ranging from small (METAR) to large (UNSPC)
  • 21. Q2: Where do the annotations come from? Automated learning; shallow natural language analysis; concept extraction. Example: Encyclopedia Britannica on “Amsterdam”: trade, antwerp, europe, amsterdam, netherlands, merchant, center, city, town
  • 22. Q2: Where do the annotations come from? Lightweight NLP (Dutch language semantic search engine); exploit existing legacy-data (Amazon, lab equipment); side-effect from user interaction (MIT Lab photo-annotator); NOT from manual effort
  • 23. Q3: What to do with many ontologies? MeSH: Medical Subject Headings, National Library of Medicine, 22,000 descriptions. EMTREE: commercial (Elsevier), drugs and diseases, 45,000 terms, 190,000 synonyms. UMLS: integrates 100 different vocabularies. SNOMED: 200,000 concepts, College of American Pathologists. Gene Ontology: 15,000 terms in molecular biology. NCI Cancer Ontology: 17,000 classes (about 1M definitions)
  • 24. Q3: What to do with many ontologies? Stitching all this together by hand?
  • 25. Q3: What to do with many ontologies? Linguistics & structure; shared vocabulary; instance-based matching; shared background knowledge
  • 26. Where are we now: tools. Languages are stable. Tooling is rapidly emerging: HP, IBM, Oracle, Adobe, …; parsers, editors, visualisers, large scale storage and querying, portal generation, search
  • 27. Where are we now: applications. Healthy uptake in some areas: knowledge management / intranets, data-integration, life-sciences, convergence with Semantic Grid, cultural heritage. Still very few applications in: personalisation, mobility/context awareness. Most applications for companies, few applications for the public
  • 28. Semantic Web research anno 2006: main streams, popular fallacies, current status, future challenges. Future challenges: future directions/challenges
  • 29. Semantic Web as an integrator of many different subfields: Databases, Natural Language Processing, Knowledge Representation, Machine Learning, Information Retrieval, Agents, HCI, …
  • 30. Provocation… Ontology research is done… We know how to make, maintain & deploy them; we have tools & methods for editing, storing, inferencing, visualising, etc. … except for two problems: learning and mapping. Natural language technology is also done… at least it’s good enough
  • 31. Large open questions: ontology learning & mapping; emerging semantics (social & statistical); Semantic Web services (discovery, composition: realistic?); non-textual media (the semantic gap: text or social?); deployment: 1. data-integration, 2. search, 3. personalisation
  • 32. Changing focus: from centralised, formalised, complete, precise to distributed, heterogeneous, open, P2P, approximate, lightweight. Web 3.0 = Web 2.0 + Semantic Web
  • 33. Predicting the future… (slide by Carol Goble). Diagram with two axes, “Semantics” and “Web”, each running from “Not much” to “Lots”. Labels: Artificial Intelligence, Decision making, OWL, SWRL, Knowledge Discovery, Ontology Building, Semantic Web Services, Information linking, NLP, Flexible & extensible Metadata schemas, RDF, FOAF, Social bookmarking, RSS, Collective Intelligence
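
The "Web of Data" recipe on slide 9 (expose databases on the web, use RDF, integrate) could be sketched roughly as follows. This is only an illustration under stated assumptions, not anything shown in the talk: the `http://example.org/` namespace, the `drug` table, its columns, and the helper names `rows_to_triples` and `to_ntriples` are all hypothetical. Rows are turned into subject/predicate/object triples, and integration is modelled as a plain set union of triples that share subject identifiers.

```python
from typing import Dict, Iterable, Set, Tuple

Triple = Tuple[str, str, str]

# Hypothetical namespace chosen for this sketch; not taken from the talk.
BASE = "http://example.org/"


def rows_to_triples(table: str, key: str,
                    rows: Iterable[Dict[str, str]]) -> Set[Triple]:
    """Expose database rows as (subject, predicate, literal) triples.

    The primary-key value becomes part of the subject IRI and every
    other column becomes a predicate derived from the table schema.
    """
    triples: Set[Triple] = set()
    for row in rows:
        subject = f"{BASE}{table}/{row[key]}"
        for column, value in row.items():
            if column == key:
                continue
            triples.add((subject, f"{BASE}{table}#{column}", value))
    return triples


def to_ntriples(triples: Iterable[Triple]) -> str:
    """Serialise triples as N-Triples lines with plain string literals."""
    def escape(text: str) -> str:
        # Escape backslashes first, then double quotes.
        return text.replace("\\", "\\\\").replace('"', '\\"')

    return "\n".join(
        f'<{s}> <{p}> "{escape(o)}" .' for s, p, o in sorted(triples)
    )


if __name__ == "__main__":
    # Two hypothetical sources that describe the same drug differently.
    source_a = rows_to_triples("drug", "id", [{"id": "d1", "name": "aspirin"}])
    source_b = rows_to_triples("drug", "id", [{"id": "d1", "treats": "headache"}])

    # "Integrate": because both sources use the same subject IRI,
    # a set union yields one merged description of the drug.
    merged = source_a | source_b
    print(to_ntriples(merged))
```

The union only integrates anything because both sources happen to agree on identifiers and vocabulary. When they do not, the ontology mapping problem from slides 23 to 25 has to be solved first.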