How the Semantic Web is transforming information access

Talk at the B-Debate meeting "Artificial Intelligence: Dreams, Risks, and Reality", 8 March 2017, Barcelona

Published in: Technology
  1. 1. How the Semantic Web is transforming information access Guus Schreiber VU University Amsterdam
  2. 2. Overview of this talk • My journey • Knowledge engineering in the Web era • Using knowledge for enriched information access • Selection of (research) issues
  3. 3. My journey knowledge engineering • design patterns for problem solving • methodology for knowledge systems • models of domain knowledge • ontology engineering
  4. 4. My journey access to digital heritage
  5. 5. My journey Web standards Chair of: • Web metadata: RDF 1.1 • OWL Web Ontology Language 1.0 • SKOS model for publishing vocabularies on the Web • Deployment & best practices
  6. 6. A few words about Web standardization • Key success factor! • Consensus process actually works – Some of the time at least • Public review – Taking every comment seriously • The danger of over-designing – Principle of minimal commitment
  7. 7. Knowledge engineering in the web era
  8. 8. Key principle: Think large! "Once you have a truly massive amount of information integrated as knowledge, then the human-software system will be superhuman, in the same sense that mankind with writing is superhuman compared to mankind before writing." Doug Lenat (IJCAI Milan 1987: “On the thresholds of knowledge”)
  9. 9. Google Knowledge Graph: 70 billion facts (October 2016)
  10. 10. WordNet(s): semantic networks of language terms
  11. 11. Iconclass: categorizing image scenes
  12. 12. Other frequently-used ontologies • Dublin Core • Friend-of-a-Friend • Schema.org • Library of Congress Subject Headings • .....
  13. 13. Lessons for knowledge engineering 1. Be modest – appreciate what is already available 2. Don’t recreate, but enrich and align 3. Ontologies represent consensus – “creating my own ontology” is a contradiction in terms 4. Beware of ontological over-commitment
  14. 14. The Web: resources and links URL URL Web link
  15. 15. The Semantic Web: typed resources and links URL URL Web link ULAN Henri Matisse Dublin Core creator Painting “Woman with hat” SFMOMA
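The typed resources and links on slide 15 can be written down as subject-predicate-object triples. The following is a minimal sketch in plain Python rather than a real RDF library; the `example.org` URIs and the `Painting` class URI are placeholders invented for illustration, not actual ULAN or SFMOMA identifiers.

```python
# Typed resources and links as (subject, predicate, object) triples.
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
DC_CREATOR = "http://purl.org/dc/terms/creator"  # Dublin Core "creator"

# Placeholder URIs (assumptions for illustration only).
PAINTING = "http://example.org/sfmoma/woman-with-hat"
MATISSE = "http://example.org/ulan/henri-matisse"
PAINTING_CLASS = "http://example.org/vocab/Painting"

triples = {
    (PAINTING, RDF_TYPE, PAINTING_CLASS),  # the resource has a type
    (PAINTING, DC_CREATOR, MATISSE),       # the link has a type too
}


def objects(subject, predicate):
    """Return every object reachable from `subject` via `predicate`."""
    return {o for s, p, o in triples if s == subject and p == predicate}


creators = objects(PAINTING, DC_CREATOR)
```

Because the link itself carries a type, software can ask "who created this painting?" instead of only "what does this page link to?".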
  16. 16. Knowledge representation: graphs
  17. 17. Generating Linked Data (WSSS-12, Leiden)
  18. 18. Using knowledge for information access
  19. 19. Strategies for using knowledge in information retrieval • Enrich corpus used for information retrieval – E.g. add synonyms • Enriched presentation of search results • Custom-tailored search algorithms – Graph search – Recommendation
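The first strategy on slide 19, enriching the corpus with synonyms, might look roughly like the sketch below. The synonym table and the sample documents are hypothetical; in practice the synonyms would come from a vocabulary such as WordNet.

```python
# Hypothetical synonym table: index term -> extra terms to add.
SYNONYMS = {
    "canvas": {"painting"},
    "sculpture": {"statue"},
}


def enrich(text):
    """Return the document's terms plus any known synonyms."""
    terms = set(text.lower().split())
    for term in list(terms):
        terms |= SYNONYMS.get(term, set())
    return terms


def build_index(docs):
    """Map each document id to its enriched term set."""
    return {doc_id: enrich(text) for doc_id, text in docs.items()}


def search(query, index):
    """Return ids of documents sharing at least one term with the query."""
    wanted = set(query.lower().split())
    return [doc_id for doc_id, terms in index.items() if wanted & terms]


docs = {"a": "oil on canvas", "b": "bronze sculpture"}
index = build_index(docs)
hits = search("painting", index)
```

The query "painting" now finds document `a` even though the word never occurs in its original text.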
  20. 20. Enriching metadata with concepts
  21. 21. Visualising piracy events
  22. 22. Sample graph search algorithm From search term (literal) to art work: • Find resources with matching label • Find path from resource to art work – Cost of each step (stop when above cost threshold) – Special treatment of semantics: sameAs, inverseOf, … • Cluster results based on path similarities
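The steps on slide 22 can be sketched as a cost-bounded graph search. This is an illustration under stated assumptions, not the algorithm used in the talk: the toy graph, the unit step costs, the zero cost for `owl:sameAs`, and clustering by identical predicate path are all choices made here for the example.

```python
import heapq
from collections import defaultdict

# Toy graph of (subject, predicate, object) edges; all names are hypothetical.
EDGES = [
    ("ulan:Picasso", "owl:sameAs", "dbpedia:Pablo_Picasso"),
    ("work:1", "dc:creator", "ulan:Picasso"),
    ("work:2", "dc:creator", "dbpedia:Pablo_Picasso"),
    ("work:3", "ex:style", "aat:Cubism"),
    ("ulan:Picasso", "ex:associatedStyle", "aat:Cubism"),
]
LABELS = {"ulan:Picasso": "Pablo Picasso", "aat:Cubism": "Cubism"}
ARTWORKS = {"work:1", "work:2", "work:3"}

# Assumed costs: following sameAs is free, every other step costs 1.
STEP_COST = defaultdict(lambda: 1.0, {"owl:sameAs": 0.0})


def neighbours(node):
    """Yield (predicate, node) pairs; '^' marks a link followed backwards."""
    for s, p, o in EDGES:
        if s == node:
            yield p, o
        elif o == node:
            yield "^" + p, s


def search(term, threshold=2.0):
    """Return art works reachable from label matches, grouped by path."""
    starts = [r for r, label in LABELS.items() if term.lower() in label.lower()]
    heap = [(0.0, start, ()) for start in starts]
    heapq.heapify(heap)
    visited = set()
    clusters = defaultdict(list)
    while heap:
        cost, node, path = heapq.heappop(heap)  # cheapest node first
        if node in visited:
            continue
        visited.add(node)
        if node in ARTWORKS:
            clusters[path].append(node)  # same path => same cluster
            continue
        for pred, nxt in neighbours(node):
            base = pred.lstrip("^")
            new_cost = cost + STEP_COST[base]
            if new_cost > threshold or nxt in visited:
                continue  # stop once the cost threshold is exceeded
            # sameAs links are left out of the path used for clustering
            new_path = path if base == "owl:sameAs" else path + (pred,)
            heapq.heappush(heap, (new_cost, nxt, new_path))
    return dict(clusters)


results = search("picasso")
```

With the default threshold, works created by Picasso form one cluster and the work reached through the shared style forms another; lowering `threshold` to 1.0 drops the longer path.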
  23. 23. Graph search: “Picasso”
  24. 24. Generating a personalized museum tour
  25. 25. Challenge: relation search Picasso, Matisse & Braque
  26. 26. Challenge: event-centered approach => people like narratives
  27. 27. Research issues in knowledge engineering
  28. 28. The myth of a unified ontology • In large virtual collections there are always multiple ontologies – In multiple languages • Every ontology has its own perspective – You can’t just merge them • But: you can use ontologies jointly by defining a limited set of links! • Large body of research on “ontology alignment”
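Using ontologies jointly through a limited set of links, as slide 28 suggests, could be sketched as follows. The two vocabularies, the object ids, and the single equivalence link are hypothetical; the link plays the role that a mapping property such as `skos:exactMatch` would play in real data.

```python
# Two collections, each annotated with its own vocabulary (ids are made up).
COLLECTION_A = {"obj-a1": {"vocA:cubism"}, "obj-a2": {"vocA:fauvism"}}
COLLECTION_B = {"obj-b1": {"vocB:kubisme"}, "obj-b2": {"vocB:barok"}}

# A limited set of alignment links; the vocabularies themselves stay separate.
EXACT_MATCH = [("vocA:cubism", "vocB:kubisme")]


def equivalents(concept):
    """Return the concept plus everything linked to it, in either direction."""
    result = {concept}
    for a, b in EXACT_MATCH:
        if a == concept:
            result.add(b)
        elif b == concept:
            result.add(a)
    return result


def find_objects(concept):
    """Search both collections for the concept or an aligned concept."""
    wanted = equivalents(concept)
    merged = {**COLLECTION_A, **COLLECTION_B}
    return sorted(obj for obj, concepts in merged.items() if concepts & wanted)


cubist_objects = find_objects("vocA:cubism")
```

Neither vocabulary is merged or rewritten; one link is enough to search both collections with a single concept.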
  29. 29. Limitations of categorical thinking • The set theory on which ontology languages are built is inadequate for modelling how people think about categories (Lakoff) – Category boundaries are not hard: cf. art styles – People think of prototypes; see theory of Rosch • We need to make meta-distinctions in hierarchies explicit – organizing class: “furniture” – base-level class: “chair” – domain-specific: “Windsor chair”
  30. 30. Use of crowdsourcing to annotate data • Collecting annotation data on text, images, sounds and videos • Use as gold standard for learning algorithms – Inter-observer agreement – Are experts better annotators than laypersons?
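Inter-observer agreement, mentioned on slide 30, is often summarised with a chance-corrected score. The sketch below computes Cohen's kappa for two annotators; the choice of this measure and the sample labels are assumptions made for illustration, as the slide does not name a specific metric.

```python
from collections import Counter


def cohen_kappa(labels_a, labels_b):
    """Chance-corrected agreement between two annotators on the same items."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("need two non-empty label lists of equal length")
    n = len(labels_a)
    # Fraction of items on which both annotators chose the same label.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Agreement expected if each annotator labelled at random
    # according to their own label frequencies.
    freq_a, freq_b = Counter(labels_a), Counter(labels_b)
    expected = sum(freq_a[label] * freq_b[label] for label in freq_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both annotators always use the same single label
    return (observed - expected) / (1.0 - expected)


# Hypothetical annotations of four images by two annotators.
expert = ["frog", "frog", "woman", "woman"]
layperson = ["frog", "woman", "woman", "woman"]
kappa = cohen_kappa(expert, layperson)
```

A score of 1 means perfect agreement and 0 means agreement no better than chance, which gives a simple way to compare expert and lay annotators.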
  31. 31. Winner EuroITV Competition Best Archives on the Web Award http://waisda.nl
  32. 32. Nichesourcing: finding the right minority vote • Sample annotations (figure axis: annotation quality level): “the frog stands for the rejected lover”; “The frog indicates the profession of the women”; “Sexy ladies!”; “women”
  33. 33. A small selection of issues • Ownership/rights of knowledge sources – Incorporation of public data in private data • Open access • Fair use of (lay/expert) crowd • Explicit knowledge sources support explanation?
