Lessons and requirements from a decade of deployed Semantic Web apps


  1. Lessons and requirements from a decade of deployed Semantic Web apps
     Benjamin Heitmann, Richard Cyganiak, Conor Hayes, Stefan Decker
     Digital Enterprise Research Institute (www.deri.ie), Enabling Networked Knowledge
     Funded by Science Foundation Ireland under Grant No. SFI/08/CE/I1380 (Líon-2)
     © Copyright 2011 Digital Enterprise Research Institute. All rights reserved.
  2. Input for this workshop
     • LEDP workshop CfP calls for:
       • requirements
       • patterns
       • gaps in Linked Data standards + guidelines
     • Where should this input come from?
  3. The Semantic Web: a decade is a long time
     [Figure: timeline from 2001 to 2011]
  4. Choice of methodology?
     • Goal:
       • patterns, requirements and gaps regarding LD
     • Data:
       • 10 years of Semantic Web research
     • Which scientific approach fits?
       • Empirical software engineering
     • Full IEEE transactions journal paper: http://tinyurl.com/semweblessons
  5. Overview
     • Empirical survey
     • Architecture: arch. pattern
     • LD standards: gaps
     • Software Eng. process: shortcomings
     • Software engineering solutions
  6. Empirical survey
     • Sources: 124 apps total
       • Semantic Web Challenge (ISWC): 2003-2009, 101 apps
       • Scripting for SemWeb Challenge (ESWC): 2006-2009, 23 apps
       • includes industry & research apps
     • Checklist (12 questions)
     • Data collection:
       1. own analysis of paper
       2. validation by email
  7. Empirical survey results
     • widespread support for SemWeb specific features
     • clear difference to database-driven apps
     • big uptake of Linked Data principles and eco-system
     • integration requires human intervention
     • top 3 standards: RDF, OWL, SPARQL
     • top 3 vocabularies: FOAF, DC, SIOC
  8. Conceptual architecture
     • Conceptual architecture:
       • describes major design elements of a system (+ relations)
       • domain specific (e.g. the Semantic Web)
       • provides architectural pattern
       • documents community consensus
  9. Components of conceptual architecture
     • starting point: decouple + specialise
     • RDF data handling:
       • Graph access layer (100%)
       • RDF store (88%)
       • Graph query language service (77%)
     • Data integration:
       • Data homogenisation service (74%)
       • Data discovery service (30%)
     • User interface:
       • Graph-based navigation interface (91%)
       • Structured data authoring interface (29%)
  10. LD gaps: publishing/consuming
      • all applications consume RDF
        • 73% import API, 69% export API
        • but: incompatible implementations
        • LD principles in 2006 led to consolidation
      • embedding RDF:
        • web for humans vs. web for machines
        • 2008: introduction of RDFa
  11. LD gaps: beyond open data
      • writing/changing/updating RDF data is difficult
        • 71% of apps do not support data changes
      • Writing to remote RDF store:
        • draft status in 2011: SPARQL Update
      • Restricting access (read/write):
        • no standards
        • no interoperability
        • closest ideas (?): R/W design note, WebID
  12. Software Eng. process shortcomings (1)
      • Integrating noisy RDF data:
        • 60% semi-automatic integration
        • this involves human intervention
        • only 20% use automatic heuristics
        • major part of Semantic Web specific code
      • Distribution of application logic:
        • multiple components and standards
        • queries (41%), rules (52%) or formal vocabularies
        • hard to maintain
  13. Software Eng. process shortcomings (2)
      • Mismatch of data models between components
        • graph versus relational or object oriented (90%)
        • overhead in communication
        • inconsistent round-trip conversion
      • 3 way ORM needed?
      [Figure: three data models: graph-based, object oriented, relational]
  14. Software Eng. solutions (1)
      • More guidelines, best practices and design patterns:
        • current examples:
          – Linked Data principles and publishing guidelines
          – guidelines for naming of URIs
          – Linked Data patterns collection
        • result: more interoperability, more coherent Web of Data
  15. Software Eng. solutions (2)
      • More software libraries (beyond RDF storage!)
        • guidelines can be hardcoded in reusable libraries
        • good libraries can make complicated guidelines easy to use (see HTTP, SSL, SMTP and DNS lookups)
        • current examples:
          – any23, d2r server, Semantic Web Client Library
  16. Software Eng. solutions (3)
      • More software factories:
        • create complete applications
        • requires patterns + libraries
        • or: “opinionated software”
        • components can be customised for domain
      • Interface, homogenisation and data discovery usually made from scratch
      https://developers.facebook.com/docs/beta/opengraph/tutorial/
  17. Summary
      • Empirical survey
      • Architecture: arch. pattern
      • LD standards: gaps
      • Software Eng. process: shortcomings
      • Software engineering solutions
      • Full article: http://tinyurl.com/semweblessons
  18. Appendix: threats to validity
      • Representativeness:
        • only complete applications were part of the challenges (not tools or libraries)
        • apps needed to use real-world data
        • submission of a paper describing the app was required
        • the challenges extend over multiple years, which allows trends to be seen
      • Number of authors who verified checklist (65%):
        • academic email addresses expire quickly
        • we manually tried to find new email addresses
      • No source code was used:
        • source code was not required for challenges due to e.g. IP issues
  19. Table: Impl. details
      2003 2004 2005 2006 2007 2008 2009 overall Programming Java 10% Java 46% Java 48% Java 60% Java 56% Java 50% Java 43% languages Java 66% JS 15% JS 23% PHP 19% C 20% JS 12% PHP 25% PHP 21% PHP 26% PHP 23% JS 13% Jena 18% RAP 15% Sesame 17% RDF libraries — — Sesame 33% Sesame 19% Sesame 12% RDFLib ARC 17% Sesame 23% Jena 8% Jena 9% Lucene 18% 10% Jena 13% RDF 89% RDF 100% RDF 100% RDF 100% RDF 96% RDF 87% RDF 66% SemWeb standards RDF 100% OWL 42% SPARQL SPARQL SPARQL OWL 43% RDFS 37% OWL 66% OWL 30% SPARQL 50% 17% 69% SPARQL OWL 37% RDFS 50% 15% OWL 41% OWL 10% OWL 46% 41% Schemas/ FOAF 30% RSS 20% FOAF 26% FOAF 41% FOAF 34% FOAF 27% vocabularies/ DC 12% — DC 21% FOAF 20% RSS 15% DC 20% DC 15% DC 13% ontologies SWRC 12% DBpedia DC 20% Bibtex 10% SIOC 20% SKOS 15% SIOC 7% 13%
  20. Tables: Data integration and other properties

      Data integration        2003  2004  2005  2006  2007  2008  2009
      manual                   30%   13%    0%   16%    9%    5%    4%
      semi-automatic           70%   31%  100%   47%   58%   65%   61%
      automatic                 0%   25%    0%   11%   13%    4%   19%
      not needed                0%   31%    0%   26%   20%   26%   16%

      Other properties        2003  2004  2005  2006  2007  2008  2009
      Data creation            20%   37%   50%   52%   37%   52%   76%
      Data import              70%   50%   83%   52%   70%   86%   73%
      Data export              70%   56%   83%   68%   79%   86%   73%
      Inferencing              60%   68%   83%   57%   79%   52%   42%
      Decentralised sources    90%   75%  100%   57%   41%   95%   96%
      Multiple owners          90%   93%  100%   89%   83%   91%   88%
      Heterogeneous formats    90%   87%  100%   89%   87%   78%   88%
      Data updates             90%   75%   83%   78%   45%   73%   50%
      Linked Data principles    0%    0%    0%    5%   25%   26%   65%
  21. Table: architectural analysis

      year   number of     graph   RDF    graph-based  data            graph query  structured   data
             applications  access  store  navigation   homogenisation  language     data         discovery
                           layer          interface    service         service      authoring    service
                                                                                    interface
      2003   10            100%    80%    90%          90%             80%          20%          50%
      2004   16            100%    94%    100%         50%             88%          38%          25%
      2005   6             100%    100%   100%         83%             83%          33%          33%
      2006   19            100%    95%    89%          63%             68%          37%          16%
      2007   24            100%    92%    96%          88%             88%          33%          54%
      2008   23            100%    87%    83%          70%             78%          26%          30%
      2009   26            100%    77%    88%          80%             65%          19%          15%
      total  124           100%    88%    91%          74%             77%          29%          30%
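
The "total" row of the architectural analysis table on slide 21 can be cross-checked against its yearly rows. The sketch below is only a consistency check, not the authors' method: it assumes that the total row is the mean of the yearly percentages weighted by the number of applications per year, which the slides do not state. The column order is also an assumption, inferred by matching the total row against the component percentages listed on slide 9. The names `COMPONENTS`, `ROWS` and `weighted_totals` are hypothetical.

```python
# Column order is an assumption: it was inferred by matching the "total"
# row of slide 21 against the component percentages given on slide 9.
COMPONENTS = [
    "graph access layer",
    "RDF store",
    "graph-based navigation interface",
    "data homogenisation service",
    "graph query language service",
    "structured data authoring interface",
    "data discovery service",
]

# year -> (number of applications, percentage of apps per component),
# copied from the yearly rows of the table on slide 21.
ROWS = {
    2003: (10, [100, 80, 90, 90, 80, 20, 50]),
    2004: (16, [100, 94, 100, 50, 88, 38, 25]),
    2005: (6, [100, 100, 100, 83, 83, 33, 33]),
    2006: (19, [100, 95, 89, 63, 68, 37, 16]),
    2007: (24, [100, 92, 96, 88, 88, 33, 54]),
    2008: (23, [100, 87, 83, 70, 78, 26, 30]),
    2009: (26, [100, 77, 88, 80, 65, 19, 15]),
}


def weighted_totals(rows):
    """Return (total number of apps, per-component weighted mean in percent).

    Each yearly percentage is weighted by that year's number of
    applications; the result is rounded to whole percent.
    """
    total_apps = sum(count for count, _ in rows.values())
    totals = []
    for i in range(len(COMPONENTS)):
        weighted_sum = sum(count * pcts[i] for count, pcts in rows.values())
        totals.append(round(weighted_sum / total_apps))
    return total_apps, totals


if __name__ == "__main__":
    total_apps, totals = weighted_totals(ROWS)
    print(f"applications: {total_apps}")
    for name, pct in zip(COMPONENTS, totals):
        print(f"{name}: {pct}%")
```

Under these assumptions the yearly application counts add up to the 124 apps named on slide 6 (101 + 23), and the rounded weighted means agree with the total row of the table.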