Pragmatic Approaches to the Semantic Web


Mike Bergman offers his take on what approaches to the semantic Web are working, what are not, and what all of this might say about the semantic Web moving forward. Informed by Structured Dynamics' open source frameworks and client experiences, the main thesis is that the pragmatic contribution of semantic technologies resides more in mindsets, information models and architectures than in 'linked data' as currently practiced.

Published in: Technology, Education

  1. Pragmatic Approaches to the Semantic Web, or, Why Aren’t We in Hyperland Yet? (Michael K. Bergman)
  2. Outline
       • Intro to SD and Me
       • Summary of Main Thesis
       • A Wee Bit of History
       • What is Not Working?
       • Problems with Linked Data
       • What is Working?
       • Some Pragmatic Lessons
       • SD’s Pragmatic Approach
       • Conclusion and Q & A
  3. Structured Dynamics
       • Founded 2008; predecessor Zitgist LLC; two principals
       • Privately held, revenue funded
       • Boutique semantic technology shop
       • Services and consulting:
            • Semantic enterprise adoption
            • Ontology development and mapping
            • Tech transfer and training
       • Development and software:
            • Open source OSF stack
            • Data conversion and migration
            • Client-specific development
  4. Current Products and OSF Stack
       • The pivotal product; Web services middleware that provides distributed data access and federation
       • Drupal-based structured data linkage to structWSF
       • Spreadsheet, JSON and XML authoring and conversion framework
       • Reference set of linking subjects and basis for domain vocabularies
       • An ontology- and entity-driven information extraction and tagging system
  5. SD Locations
  6. Michael Bergman
  7. Summary of Main Thesis
  8. Main Arguments
       • Not against linked data
            • Proponent and explicator since 2006
       • But, linked data is burdensome, not pivotal to interoperability
       • Interoperability requires:
            • Structured data (from any source)
            • Canonical data model (RDF)
            • (Relatively simple) ontologies for world views, schema
            • Curation
  9. A Wee Bit of History
  10. Key Historical Milestones
       • 1945: Memex
       • 1963: Hypertext
       • 1990: Hyperland
       • 2001: Semantic Web
            • Lack of uptake
       • 2006: Linked Data
       • 2010: Revisionist Linked Data
  11. Hyperland
  12. Linked Data
       • “Linked Data is a set of best practices for publishing and deploying instance and class data using the RDF data model, naming the data objects using uniform resource identifiers (URIs), thereby exposing the data for access via the HTTP protocol, while emphasizing data interconnections, interrelationships and context useful to both humans and machine agents.”
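The definition above comes down to statements (triples) whose subjects and predicates are named with URIs. The following is a minimal sketch of that idea in plain Python, without any RDF library. The `example.org` URIs, the `Documentary` class and the helper names (`URI`, `Literal`, `Triple`, `to_ntriples`) are illustrative assumptions, not anything taken from the slides; only the `rdf:type` and `rdfs:label` URIs are standard W3C terms.

```python
from typing import NamedTuple, Union


class URI(NamedTuple):
    value: str


class Literal(NamedTuple):
    value: str


class Triple(NamedTuple):
    subject: URI
    predicate: URI
    obj: Union[URI, Literal]


def escape_literal(text: str) -> str:
    """Escape the characters that N-Triples requires to be escaped in literals."""
    return (
        text.replace("\\", "\\\\")
        .replace('"', '\\"')
        .replace("\n", "\\n")
        .replace("\r", "\\r")
    )


def to_ntriples(triple: Triple) -> str:
    """Serialize one triple as a single N-Triples line."""
    def term(t: Union[URI, Literal]) -> str:
        if isinstance(t, URI):
            return f"<{t.value}>"
        return f'"{escape_literal(t.value)}"'

    return f"{term(triple.subject)} {term(triple.predicate)} {term(triple.obj)} ."


# Standard W3C predicates; everything under example.org is hypothetical.
RDF_TYPE = URI("http://www.w3.org/1999/02/22-rdf-syntax-ns#type")
RDFS_LABEL = URI("http://www.w3.org/2000/01/rdf-schema#label")
HYPERLAND = URI("http://example.org/id/hyperland")

triples = [
    Triple(HYPERLAND, RDF_TYPE, URI("http://example.org/ontology/Documentary")),
    Triple(HYPERLAND, RDFS_LABEL, Literal("Hyperland")),
]

if __name__ == "__main__":
    for t in triples:
        print(to_ntriples(t))
```

Because every name is an HTTP URI, a consumer could in principle dereference it to find more statements about the same thing; that is the "interconnection" part of the definition.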
  13. What is Not Working?
  14. Some Disappointments to Date
       • Full semantic Web vision
       • Widescale adoption of the semantic Web, linked data
       • Lack of intelligent agents
       • Many aspects of the practice of linked data
  15. Problems with Linked Data
  16. Problems with Linked Data
       • Burdensome on publishers
       • Naïve linkages:
            • Overuse of sameAs
            • Lack of accurate alignments
       • (Often) poor data quality
       • Wrong focus
  17. Some Conditions for Interoperability
       • <Interoperability> <needsMapping> <Predicates>
       • <Interoperability> <needsReference> <Nouns>
  18. Many Mappings Should be Approximate
       • skos:broadMatch, skos:related, ore:similarTo, umbel:isAbout, vmf:isInVocabulary, skos:closeMatch, lvont:nearlySameAs, umbel:isLike, umbel:hasCharacteristic, lvont:somewhatSameAs, rdfs:seeAlso, ore:describes, map:narrowerThan, skos:narrower, map:broaderThan, skos:broader, dc:subject, link:uri, foaf:isPrimaryTopicOf
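One way to see why overusing sameAs is a problem, and why approximate predicates help: owl:sameAs is symmetric and transitive, so a chain of sameAs links collapses every member into a single identity, whereas predicates such as skos:closeMatch are deliberately not transitive. The sketch below assumes a hypothetical set of links about Paris (the `ex:` and `geo:` identifiers are invented for illustration) and groups entities by following only exact-identity links.

```python
from collections import defaultdict

# Only owl:sameAs asserts full identity; all other predicates are treated
# as approximate and never merge entities.
EXACT = {"owl:sameAs"}


def identity_clusters(links):
    """Group entities into identity clusters by following exact links only.

    `links` is an iterable of (subject, predicate, object) string tuples.
    Uses a small union-find so that chains of owl:sameAs merge transitively.
    """
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for s, p, o in links:
        root_s, root_o = find(s), find(o)
        if p in EXACT and root_s != root_o:
            parent[root_o] = root_s

    clusters = defaultdict(set)
    for node in list(parent):
        clusters[find(node)].add(node)
    return sorted(sorted(c) for c in clusters.values())


# Hypothetical naive publishing: everything is declared "the same".
naive = [
    ("ex:Paris_city", "owl:sameAs", "geo:Paris"),
    ("geo:Paris", "owl:sameAs", "ex:Paris_metro_area"),
    ("ex:Paris_metro_area", "owl:sameAs", "ex:Ile_de_France"),
]

# The same links with approximate predicates where identity is doubtful.
hedged = [
    ("ex:Paris_city", "owl:sameAs", "geo:Paris"),
    ("geo:Paris", "skos:closeMatch", "ex:Paris_metro_area"),
    ("ex:Paris_metro_area", "skos:broadMatch", "ex:Ile_de_France"),
]

if __name__ == "__main__":
    print(identity_clusters(naive))
    print(identity_clusters(hedged))
```

With the naive links, the city, the metro area and the region all end up in one cluster; with the hedged links, only the two genuinely identical entries are merged and the looser relationships remain available as mappings.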
  19. What is Working?
  20. Successes
       • Siri
       • Bing (Powerset)
       • Google
       • + (Some) linked data
  21. Siri
  22. Bing (Powerset)
  23. Google
       • Statistical NLP
       • Structured results
       • Initial schema (Metaweb)
       • (with Yahoo, Bing and Yandex)
  24. Some Linked Data
       • Some selected knowledge bases:
            • DBpedia
            • GeoNames
            • Freebase (Google)
       • Biomedical community
       • LOD-LAM community
  25. Some Pragmatic Lessons
  26. Some Lessons Learned
       • Structure is good in any form
       • Keep semantic technology in the background
       • Open Web (FYN) likely to be disappointing
       • Ontologies essential for alignments
       • NLP an essential contributor to structure
       • Metadata an essential contributor to characterization, use
       • Linked data is a burden to publishers, places semantic emphasis on wrong part of chain
  27. Seven Pillars
  28. Preserving Existing Assets
       • Relational databases (RDBMs)
       • Distributed structured assets
            • spreadsheets
            • lightweight datastores
       • Web pages and Web sites
       • Existing documents and text
       • Web databases and APIs
       • Other databases (RDF, OO, etc.)
  29. irON Dataset Exchange Framework
       • Simple authoring and dataset creation
       • irON includes an abstract notation and vocabulary for instance records
       • Notations for:
            • Instance records
            • Schema
            • Datasets and metadata
            • Linkages to other schema
       • Serializations available for:
            • XML (irXML)
            • JSON (irJSON)
            • CSV/spreadsheets (commON)
  30. Three irON Serializations: irXML, irJSON, commON
  31. Spreadsheet Correspondence to Triples
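The general correspondence between a spreadsheet and triples can be sketched as follows: each row is a record (the subject), each column header is an attribute (the predicate), and each non-empty cell is a value (the object). This is a generic illustration only; it does not use the actual commON notation, and the `example.org` base URI, the `id` column name and the function name `rows_to_triples` are assumptions made for the example.

```python
import csv
import io


def rows_to_triples(csv_text, id_column="id", base="http://example.org/"):
    """Turn CSV text into (subject, predicate, object) tuples.

    Row identifier -> subject URI, column header -> predicate URI,
    cell value -> literal object. Empty cells produce no triple.
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    triples = []
    for row in reader:
        subject = base + "id/" + row[id_column].strip()
        for column, cell in row.items():
            # Skip the id column, overflow fields (column is None) and blanks.
            if column is None or column == id_column:
                continue
            if cell is None or not cell.strip():
                continue
            triples.append((subject, base + "attr/" + column.strip(), cell.strip()))
    return triples


# A tiny hypothetical sheet; the second row leaves "founded" blank.
SHEET = (
    "id,name,founded\n"
    "sd,Structured Dynamics,2008\n"
    "zitgist,Zitgist LLC,\n"
)

if __name__ == "__main__":
    for triple in rows_to_triples(SHEET):
        print(triple)
```

The same record could equally be authored as JSON or XML and reduced to the same triples, which is the sense in which the serializations are interchangeable.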
  32. More-or-less Interchangeable Formats
  33. SD’s Pragmatic Approach
  34. A Layered Approach
  35. OSF Stack
  36. Conclusion
  37. Summary
       • If you can, do linked data; it is a GOOD THING
       • In any event, expose your data:
            • Structured (use NLP for unstructured)
            • Metadata
            • Definitions
            • Relations (simple)
            • “Semsets” (synonyms, acronyms, spelling variants)
       • Build vocabulary and ontology consortia
       • Build trust and curation communities
       • Semantics essential at the interoperability level, not necessarily publication or data transfer
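A "semset", as described above, gathers the synonyms, acronyms and spelling variants that refer to one concept. A minimal sketch of how such sets might be used for lookup follows; the concept identifiers (`ex:SemanticWeb`, `ex:NLP`), the sample labels and the normalization rule are illustrative assumptions rather than anything specified in the slides.

```python
def normalize(label):
    """Lower-case, treat hyphens as spaces, and collapse runs of whitespace."""
    return " ".join(label.lower().replace("-", " ").split())


def build_semset_index(semsets):
    """Map every normalized label to the set of concepts it may refer to."""
    index = {}
    for concept, labels in semsets.items():
        for label in labels:
            index.setdefault(normalize(label), set()).add(concept)
    return index


def lookup(index, term):
    """Return the sorted concepts whose semset contains the given term."""
    return sorted(index.get(normalize(term), set()))


# Hypothetical semsets: one concept, many surface forms.
SEMSETS = {
    "ex:SemanticWeb": ["semantic Web", "Semantic-Web", "semweb", "Web of data"],
    "ex:NLP": ["natural language processing", "NLP", "natural-language processing"],
}

INDEX = build_semset_index(SEMSETS)

if __name__ == "__main__":
    print(lookup(INDEX, "Semantic  web"))
    print(lookup(INDEX, "nlp"))
    print(lookup(INDEX, "ontology"))
```

Returning a set of concepts rather than a single one leaves room for ambiguous labels, which then become a curation question rather than a silent mismatch.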
  38. Take Aways
       • James Hendler: “A little bit of semantics goes a long way”
       • Leverage linked data, but broaden focus
       • Consider adopting the semantic enterprise as the broader focus
  39. Further Information
  40. More Info and Links
       • Open Semantic Framework (OSF) stack:
       • TechWiki (400 detailed OSF how-to articles):
       • Key ontologies:
            • UMBEL:
            • BIBO:
       • Blogs:
            • Mike Bergman:
            • Fred Giasson:
       • Structured Dynamics:
            • (community indicator systems)