Principles for knowledge engineering on the Web

Keynote, IC3K conference, Paris, 2011

  1. 1. Principles for knowledge engineering on the Web Guus Schreiber VU University Amsterdam Computer Science, Web & Media
  2. 2. Overview of this talk • Semantic Web: the digital heritage case • Knowledge-engineering principles • Challenges for Web KE
  3. 3. My journey: knowledge engineering • design patterns for problem solving • methodology for knowledge systems • models of domain knowledge • ontology engineering
  4. 4. My journey: access to digital heritage
  5. 5. My journey: Web standards • Web metadata: RDF • OWL Web Ontology Language • SKOS model for publishing vocabularies on the Web
  6. 6. SEMANTIC WEB: THE DIGITAL-HERITAGE CASE
  7. 7. The Web: resources and links [Diagram: two URLs connected by a Web link]
  8. 8. The Semantic Web: typed resources and links [Diagram: a Web link between two URLs, typed as Dublin Core creator, connects the painting “Woman with hat” (SFMOMA) to Henri Matisse (ULAN)]
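The idea of typed resources and links on slide 8 can be sketched as a small set of subject-predicate-object triples. This is only an illustration in plain Python: the `example.org` URIs and the `objects` helper are hypothetical, and only the Dublin Core `creator` and `rdf:type` property URIs are real identifiers.

```python
# Real property identifiers.
DC_CREATOR = "http://purl.org/dc/elements/1.1/creator"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"

# Hypothetical resource URIs, invented for this sketch.
PAINTING = "http://example.org/sfmoma/woman-with-hat"
MATISSE = "http://example.org/ulan/henri-matisse"
PAINTING_CLASS = "http://example.org/vocab/Painting"

# Each triple is a typed link: (subject, predicate, object).
triples = {
    (PAINTING, RDF_TYPE, PAINTING_CLASS),  # the resource has a type
    (PAINTING, DC_CREATOR, MATISSE),       # the link has a type
}


def objects(subject, predicate):
    """Return all objects linked from `subject` by `predicate`."""
    return sorted(o for s, p, o in triples if s == subject and p == predicate)


creators = objects(PAINTING, DC_CREATOR)
```

Because the link itself carries a type, a query such as "who created this painting?" becomes a lookup on the predicate rather than a guess about what an untyped hyperlink means.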
  9. 9. Vocabulary interoperability: SKOS
  10. 10. Vocabulary representations • SKOS has been a major success • Easy to understand and create • The publication of LCSH in SKOS set an important example
  11. 11. The myth of a unified vocabulary • In large virtual collections there are always multiple vocabularies – In multiple languages • Every vocabulary has its own perspective – You can’t just merge them • But you can use vocabularies jointly by defining a limited set of links – “Vocabulary alignment” • It is surprising what you can do with just a few links
  12. 12. Example use of vocabulary alignment • Search term: “Tokugawa” • SVCN period: Edo (SVCN is a local in-house ethnology thesaurus) • AAT style/period: Edo (Japanese period), with label Tokugawa (AAT is Getty’s Art & Architecture Thesaurus)
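The Tokugawa example on slide 12 can be sketched as follows. This is a minimal sketch under stated assumptions: the concept identifiers, the object identifier and all function names are hypothetical, and the single alignment link stands in for whatever mapping the actual system uses.

```python
# Labels per concept (identifiers are hypothetical).
LABELS = {
    "aat:edo": {"Edo (Japanese period)", "Tokugawa"},
    "svcn:edo": {"Edo"},
}

# One alignment link between the two vocabularies.
ALIGNMENTS = {("aat:edo", "svcn:edo")}

# A museum object annotated only with the in-house SVCN concept.
ANNOTATIONS = {"svcn-object-1": {"svcn:edo"}}


def concepts_for(term):
    """Find concepts that carry `term` as one of their labels."""
    wanted = term.casefold()
    return {
        concept
        for concept, labels in LABELS.items()
        if wanted in {label.casefold() for label in labels}
    }


def expand(concepts, alignments):
    """Add concepts reachable over one alignment link, in either direction."""
    expanded = set(concepts)
    for a, b in alignments:
        if a in concepts:
            expanded.add(b)
        if b in concepts:
            expanded.add(a)
    return expanded


def search(term, alignments=ALIGNMENTS):
    """Return objects annotated with a concept matching `term`."""
    wanted = expand(concepts_for(term), alignments)
    return sorted(obj for obj, cs in ANNOTATIONS.items() if cs & wanted)


with_links = search("Tokugawa")
without_links = search("Tokugawa", alignments=set())
```

Without the alignment the query finds nothing, because "Tokugawa" is a label only in AAT while the object is annotated only with the SVCN concept. With one link the object is found, which is the sense in which a few links go a long way.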
  13. 13. Enriching metadata with concepts
  14. 14. Learning vocabulary alignments • Example: learning relations between art styles and artists through NLP of art historic texts – “Who are Impressionist painters?”
  15. 15. Semantic search: result clustering based on retrieval path
  16. 16. Research issues • Information retrieval as graph search – more semantics => more paths – finding optimal graph patterns • Vocabulary alignment • Information extraction – recognizing people, locations, … – identity resolution • Multi-lingual resources
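Treating retrieval as graph search, with results clustered by retrieval path (slides 15 and 16), might look roughly like the sketch below. The tiny graph, the identifiers and the `^` notation for inverse edges are assumptions made for illustration; the search is a plain breadth-first traversal that keeps only the shortest path to each node, which is a simplification.

```python
from collections import defaultdict, deque

# Hypothetical metadata graph: (subject, predicate, object).
EDGES = [
    ("artist:monet", "style", "style:impressionism"),
    ("work:1", "creator", "artist:monet"),
    ("work:2", "style", "style:impressionism"),
    ("work:3", "creator", "artist:monet"),
]
ARTWORKS = {"work:1", "work:2", "work:3"}


def neighbours(node):
    """Yield (predicate, node) pairs; '^' marks an edge followed backwards."""
    for s, p, o in EDGES:
        if s == node:
            yield p, o
        if o == node:
            yield "^" + p, s


def search(start, max_depth=2):
    """Breadth-first search from `start`, grouping artworks by path."""
    clusters = defaultdict(set)
    queue = deque([(start, ())])
    seen = {start}
    while queue:
        node, path = queue.popleft()
        if node in ARTWORKS and path:
            clusters[path].add(node)  # a result: file it under its path
            continue
        if len(path) == max_depth:
            continue
        for predicate, nxt in neighbours(node):
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, path + (predicate,)))
    return {path: sorted(works) for path, works in clusters.items()}


clusters = search("style:impressionism")
```

Each cluster key is the sequence of link types that led to the results, so works directly annotated with the style end up separate from works found via an artist associated with the style. Adding more semantics to the graph adds more such paths, which is why choosing good graph patterns becomes a research issue.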
  17. 17. Personalized Rijksmuseum • Interactive user modeling • Recommendations of artworks and art topics
  18. 18. Mobile museum tour
  19. 19. KNOWLEDGE ENGINEERING PRINCIPLES Lessons I learned
  20. 20. Principle 1: Be modest! • Ontology engineers should refrain from developing their own idiosyncratic ontologies • Instead, they should make existing rich vocabularies, thesauri and databases available in an interoperable (Web) format • Initially, only add the originally intended semantics
  21. 21. Principle 2: Think large! "Once you have a truly massive amount of information integrated as knowledge, then the human-software system will be superhuman, in the same sense that mankind with writing is superhuman compared to mankind before writing." Doug Lenat
  22. 22. Principle 3: Develop and use patterns! • Don’t try to be (too) creative • Ontology engineering should not be an art but a discipline • Patterns play a key role in methodology for ontology engineering • See for example patterns developed by the W3C Semantic Web Best Practices group http://www.w3.org/2001/sw/BestPractices/
  23. 23. Principle 4: Don’t recreate, but enrich and align • Techniques: – Learning ontology relations/mappings – Semantic analysis, e.g. OntoClean – Processing of scope notes in thesauri
  24. 24. Principle 5: Beware of ontological over-commitment!
  25. 25. Principle 6: Writing in an ontology language doesn’t make it an ontology! • An ontology is a vehicle for sharing • Papers about your own idiosyncratic “university ontology” should be rejected at conferences • The quality of an ontology does not depend on the number of, for example, OWL constructs used
  26. 26. Principle 7: Required level of formal semantics depends on the domain! • In our semantic search we use three OWL constructs: – owl:sameAs, owl:TransitiveProperty, owl:SymmetricProperty • But cultural heritage is very different from medicine and bioinformatics – Don’t over-generalize on requirements for e.g. OWL
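To give a feel for how little machinery the three constructs named on slide 26 require, here is a naive fixed-point sketch in plain Python. It is not the reasoner used in the talk's system: the `infer` function, the property names `partOf` and `adjacentTo`, and all identifiers in the data are hypothetical.

```python
SAME_AS = "owl:sameAs"


def infer(triples, transitive=frozenset(), symmetric=frozenset()):
    """Naively apply sameAs, transitivity and symmetry until nothing changes."""
    facts = set(triples)
    # owl:sameAs is itself symmetric and transitive.
    symmetric = set(symmetric) | {SAME_AS}
    transitive = set(transitive) | {SAME_AS}
    while True:
        new = set()
        for s, p, o in facts:
            if p in symmetric:
                new.add((o, p, s))
            if p in transitive:
                new.update(
                    (s, p, o2) for s2, p2, o2 in facts if p2 == p and s2 == o
                )
        # sameAs: whatever holds for one identifier holds for the other.
        for a, p, b in facts:
            if p != SAME_AS:
                continue
            for s, q, o in facts:
                if s == a:
                    new.add((b, q, o))
                if o == a:
                    new.add((s, q, b))
        if new <= facts:
            return facts
        facts |= new


# Hypothetical data for illustration.
data = {
    ("aat:edo", SAME_AS, "svcn:edo"),
    ("work:1", "dc:subject", "svcn:edo"),
    ("place:amsterdam", "partOf", "place:netherlands"),
    ("place:netherlands", "partOf", "place:europe"),
    ("place:belgium", "adjacentTo", "place:netherlands"),
}
facts = infer(data, transitive={"partOf"}, symmetric={"adjacentTo"})
```

The loop is quadratic per pass and only suitable for toy data, but it shows that these three constructs amount to a handful of closure rules, which fits the point that the required level of formal semantics depends on the domain.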
  27. 27. CHALLENGES FOR WEB KE
  28. 28. Challenge: Linked Open Data
  29. 29. Availability of government data: http://data.gov.uk
  30. 30. The fight for “standard” semantics Schema.org
  31. 31. Challenge: vocabulary alignment methodology • Multitude of alignment techniques available – Direct syntactic match – Lexical manipulation – Structural, … • Precision and recall vary • Large evaluation initiative – OAEI http://oaei.ontologymatching.org/
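The first two techniques on slide 31, and the way precision and recall shift between them, can be sketched as below. The vocabularies, the reference alignment and the normalisation rules are all invented for this illustration and do not come from the OAEI data sets.

```python
import re


def normalise(label):
    """Lexical manipulation: drop qualifiers in brackets, case, punctuation."""
    label = re.sub(r"\(.*?\)", "", label)
    label = re.sub(r"[^a-z0-9 ]", " ", label.lower())
    return " ".join(label.split())


def align(source, target, key=lambda label: label):
    """Pair concepts whose labels are equal after applying `key`."""
    index = {}
    for concept, label in target.items():
        index.setdefault(key(label), []).append(concept)
    return {
        (s, t)
        for s, label in source.items()
        for t in index.get(key(label), [])
    }


def precision_recall(found, reference):
    """Score a set of proposed links against a reference alignment."""
    correct = len(found & reference)
    precision = correct / len(found) if found else 0.0
    recall = correct / len(reference) if reference else 0.0
    return precision, recall


# Hypothetical vocabularies and reference alignment.
SOURCE = {
    "s:cubism": "Cubism",
    "s:edo": "Edo",
    "s:art-nouveau": "Art Nouveau",
    "s:mercury": "Mercury (element)",
}
TARGET = {
    "t:cubism": "Cubism",
    "t:edo": "Edo (Japanese period)",
    "t:art-nouveau": "art nouveau",
    "t:mercury": "Mercury (planet)",
}
REFERENCE = {
    ("s:cubism", "t:cubism"),
    ("s:edo", "t:edo"),
    ("s:art-nouveau", "t:art-nouveau"),
}

exact = precision_recall(align(SOURCE, TARGET), REFERENCE)
lexical = precision_recall(align(SOURCE, TARGET, key=normalise), REFERENCE)
```

On this toy data the direct syntactic match finds only the identical label, so it is precise but misses most links. Normalising labels recovers the missing links but also pairs the two unrelated "Mercury" concepts, trading some precision for recall.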
  32. 32. Limitations of categorical thinking • The set theory on which ontology languages are built is inadequate for modelling how people think about categories (Lakoff) – Category boundaries are not hard: cf. art styles – People think of prototypes; some examples are very prototypical, others less • We also need to make meta-distinctions explicit – organizing class: “furniture” – base-level class: “chair” – domain-specific: “Windsor chair”
  33. 33. Challenge: new types of search exploiting semantics
  34. 34. Relation search: Picasso, Matisse & Braque
  35. 35. Challenge: combining professional annotations with public “tags”
  36. 36. Challenge: data trust issues • How can a museum trust annotations of outsiders? • Need to adapt techniques from closed world to open world • Ongoing case studies examine reputation assessment, the use of probability theory, …
  37. 37. Challenge: event-centred approach => people like narratives
  38. 38. Extracting piracy events from piracy reports & Web sources
  39. 39. Visualising piracy events
  40. 40. Large-scale experimentation!
  41. 41. TOWARDS WEB SCIENCE
  42. 42. We need to study the Web as a phenomenon • Web dynamics • Collective intelligence • Privacy, trust and security • Linked open data • Universal access
  43. 43. Web for Social Development
  44. 44. Acknowledgements • Long list of people • Projects: MIA, MultimediaN E-Culture, CHOICE, MunCH, CHIP, Agora, PrestoPrime, NoTube, EuropeanaConnect, Poseidon