Lessons and requirements from a decade of deployed Semantic Web apps
 

    Presentation Transcript

    • Lessons and requirements from a decade of deployed Semantic Web apps
      • Benjamin Heitmann, Richard Cyganiak, Conor Hayes, Stefan Decker
      • Digital Enterprise Research Institute, www.deri.ie (Enabling Networked Knowledge)
      • Funded by Science Foundation Ireland under Grant No. SFI/08/CE/I1380 (Líon-2)
      • © Copyright 2011 Digital Enterprise Research Institute. All rights reserved.
    • Input for this workshop
      • LEDP workshop CfP calls for:
        • requirements
        • patterns
        • gaps in Linked Data standards + guidelines
      • Where should this input come from?
    • The Semantic Web: a decade is a long time (2001 to 2011)
    • Choice of methodology?
      • Goal: patterns, requirements and gaps regarding LD
      • Data: 10 years of Semantic Web research
      • Which scientific approach fits? Empirical software engineering
      • Full IEEE Transactions journal paper: http://tinyurl.com/semweblessons
    • Overview
      • Empirical survey
      • Architecture: arch. pattern
      • LD standards: gaps
      • Software Eng. process: shortcomings
      • Software engineering solutions
    • Empirical survey
      • Sources: 124 apps total
        • Semantic Web Challenge (ISWC): 2003-2009, 101 apps
        • Scripting for SemWeb Challenge (ESWC): 2006-2009, 23 apps
        • includes industry & research apps
      • Checklist (12 questions)
      • Data collection:
        1. own analysis of paper
        2. validation by email
    • Empirical survey results
      • widespread support for SemWeb specific features
      • clear difference to database-driven apps
      • big uptake of Linked Data principles and eco-system
      • integration requires human intervention
      • top 3 standards: RDF, OWL, SPARQL
      • top 3 vocabularies: FOAF, DC, SIOC
    • Conceptual architecture
      • describes major design elements of a system (+ relations)
      • domain specific (e.g. the Semantic Web)
      • provides architectural pattern
      • documents community consensus
    • Components of conceptual architecture
      • starting point: decouple + specialise
      • RDF data handling:
        • Graph access layer (100%)
        • RDF store (88%)
        • Graph query language service (77%)
      • Data integration:
        • Data homogenisation service (74%)
        • Data discovery service (30%)
      • User interface:
        • Graph-based navigation interface (91%)
        • Structured data authoring interface (29%)
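The "decouple + specialise" idea behind these components can be illustrated with a minimal sketch. It is not the architecture of any surveyed application: the class names, the in-memory set standing in for an RDF store, and the prefixed names such as `ex:alice` are all assumptions made for illustration.

```python
from typing import List, Optional, Set, Tuple

Triple = Tuple[str, str, str]  # (subject, predicate, object)


class RDFStore:
    """Persistence component; here only an in-memory set of triples."""

    def __init__(self) -> None:
        self._triples: Set[Triple] = set()

    def add(self, triple: Triple) -> None:
        self._triples.add(triple)

    def triples(self) -> Set[Triple]:
        return set(self._triples)


class GraphAccessLayer:
    """Gives the rest of the application a graph-shaped view of the store."""

    def __init__(self, store: RDFStore) -> None:
        self._store = store

    def match(self, s: Optional[str] = None, p: Optional[str] = None,
              o: Optional[str] = None) -> List[Triple]:
        # None acts as a wildcard for that position of the triple pattern.
        return sorted(
            t for t in self._store.triples()
            if s in (None, t[0]) and p in (None, t[1]) and o in (None, t[2])
        )


class NavigationInterface:
    """User-facing component; it only talks to the graph access layer."""

    def __init__(self, graph: GraphAccessLayer) -> None:
        self._graph = graph

    def describe(self, resource: str) -> List[str]:
        return [f"{p} -> {o}" for _, p, o in self._graph.match(s=resource)]
```

Because the interface depends only on the access layer, the store could be swapped for a persistent one without touching the navigation code.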
    • LD gaps: publishing/consuming
      • all applications consume RDF
      • 73% import API, 69% export API
      • but: incompatible implementations
      • LD principles in 2006 led to consolidation
      • embedding RDF:
        • web for humans vs. web for machines
        • 2008: introduction of RDFa
    • LD gaps: beyond open data
      • writing/changing/updating RDF data is difficult
        • 71% of apps do not support data changes
      • Writing to remote RDF store:
        • draft status in 2011: SPARQL Update
      • Restricting access (read/write):
        • no standards
        • no interoperability
        • closest ideas (?): R/W design note, WebID
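To make "writing to a remote RDF store" concrete, here is a minimal sketch of preparing a SPARQL Update request. It assumes an endpoint that accepts the update string directly in a POST body with the `application/sparql-update` content type; the endpoint URL and the helper names are hypothetical, and the request is only built, never sent.

```python
import urllib.request

# Hypothetical endpoint; replace with the update URL of a real store.
ENDPOINT = "http://example.org/sparql/update"


def build_insert(subject: str, predicate: str, literal: str) -> str:
    """Return an INSERT DATA update that adds one triple with a plain literal."""
    # Escape backslashes first, then double quotes, so the literal stays valid.
    escaped = literal.replace("\\", "\\\\").replace('"', '\\"')
    return f'INSERT DATA {{ <{subject}> <{predicate}> "{escaped}" . }}'


def build_request(update: str, endpoint: str = ENDPOINT) -> urllib.request.Request:
    """Wrap the update string in an HTTP POST request (not sent here)."""
    return urllib.request.Request(
        endpoint,
        data=update.encode("utf-8"),
        headers={"Content-Type": "application/sparql-update"},
        method="POST",
    )
```

Sending the request would be a call to `urllib.request.urlopen(build_request(...))`. Note that nothing in the sketch restricts who may write, which is the access-control gap the slide points out.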
    • Software Eng. process shortcomings (1)
      • Integrating noisy RDF data:
        • 60% semi-automatic integration
        • this involves human intervention
        • only 20% use automatic heuristics
        • major part of Semantic Web specific code
      • Distribution of application logic:
        • multiple components and standards
        • queries (41%), rules (52%) or formal vocabularies
        • hard to maintain
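One way to picture semi-automatic integration is a heuristic that decides the clear cases itself and queues the uncertain ones for a person. The sketch below is an illustration only, not a method from the survey: it compares resource labels with Python's `difflib`, and the thresholds and function names are assumptions.

```python
from difflib import SequenceMatcher

# Assumed thresholds; real systems would tune these on their own data.
MERGE_AT = 0.95   # at or above: treat both labels as the same entity
REVIEW_AT = 0.70  # between the two thresholds: ask a human


def similarity(label_a: str, label_b: str) -> float:
    """String similarity in [0, 1] after trimming and lower-casing."""
    a, b = label_a.strip().lower(), label_b.strip().lower()
    return SequenceMatcher(None, a, b).ratio()


def triage(label_a: str, label_b: str) -> str:
    """Return 'merge', 'review' or 'distinct' for a pair of labels."""
    score = similarity(label_a, label_b)
    if score >= MERGE_AT:
        return "merge"
    if score >= REVIEW_AT:
        return "review"
    return "distinct"
```

With these thresholds, "Tim Berners-Lee" and "Tim Berners Lee" land in the review queue rather than being merged automatically, which is the human intervention step the slide refers to.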
    • Software Eng. process shortcomings (2)
      • Mismatch of data models between components
        • graph versus relational or object oriented (90%)
        • overhead in communication
        • inconsistent round-trip conversion
        • 3-way ORM needed?
      • (diagram: graph-based, object oriented, relational)
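The round-trip problem can be shown with a deliberately naive mapping. In this sketch (function names and sample data are hypothetical) a graph is converted to an object with one value per attribute and back again; a property with two values in the graph loses one of them on the way.

```python
from typing import Dict, Iterable, List, Tuple

Triple = Tuple[str, str, str]  # (subject, predicate, object)


def triples_to_object(subject: str, triples: Iterable[Triple]) -> Dict[str, str]:
    """Map a resource to an object with single-valued attributes."""
    obj: Dict[str, str] = {}
    for s, p, o in triples:
        if s == subject:
            obj[p] = o  # a later value silently overwrites an earlier one
    return obj


def object_to_triples(subject: str, obj: Dict[str, str]) -> List[Triple]:
    """Map the object back to triples."""
    return [(subject, p, o) for p, o in obj.items()]


graph = [
    ("ex:alice", "foaf:name", "Alice"),
    ("ex:alice", "foaf:mbox", "mailto:a@example.org"),
    ("ex:alice", "foaf:mbox", "mailto:alice@example.org"),
]
round_trip = object_to_triples("ex:alice", triples_to_object("ex:alice", graph))
lost = set(graph) - set(round_trip)  # triples that did not survive
```

A mapping that stored lists of values would avoid this particular loss, at the price of an object model that no longer looks like ordinary single-valued attributes.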
    • Software Eng. solutions (1)
      • More guidelines, best practices and design patterns:
        • current examples:
          • Linked Data principles and publishing guidelines
          • guidelines for naming of URIs
          • Linked Data patterns collection
        • result: more interoperability, more coherent Web of Data
    • Software Eng. solutions (2)
      • More software libraries (beyond RDF storage!)
        • guidelines can be hardcoded in reusable libraries
        • good libraries can make complicated guidelines easy to use (see HTTP, SSL, SMTP and DNS lookups)
        • current examples: any23, d2r server, Semantic Web Client Library
    • Software Eng. solutions (3)
      • More software factories:
        • create complete applications
        • requires patterns + libraries
        • or: “opinionated software”
        • components can be customised for domain
      • Interface, homogenisation and data discovery usually made from scratch
      • https://developers.facebook.com/docs/beta/opengraph/tutorial/
    • Summary
      • Empirical survey
      • Architecture: arch. pattern
      • LD standards: gaps
      • Software Eng. process: shortcomings
      • Software engineering solutions
      • Full article: http://tinyurl.com/semweblessons
    • Appendix: threats to validity
      • Representativeness:
        • only complete applications were part of the challenges (not tools or libraries)
        • apps needed to use real-world data
        • submission of a paper describing the app was required
        • the challenges extend over multiple years, which allows trends to be seen
      • Number of authors who verified checklist (65%):
        • academic email addresses expire quickly
        • we manually tried to find new email addresses
      • No source code was used:
        • source code was not required for challenges due to e.g. IP issues
    • Table: Implementation details
      2003 2004 2005 2006 2007 2008 2009 overall
      Programming Java 10% Java 46% Java 48% Java 60% Java 56% Java 50% Java 43% languages Java 66% JS 15% JS 23% PHP 19% C 20% JS 12% PHP 25% PHP 21% PHP 26% PHP 23% JS 13% Jena 18% RAP 15% Sesame 17% RDF libraries — — Sesame 33% Sesame 19% Sesame 12% RDFLib ARC 17% Sesame 23% Jena 8% Jena 9% Lucene 18% 10% Jena 13% RDF 89% RDF 100% RDF 100% RDF 100% RDF 96% RDF 87% RDF 66% SemWeb standards RDF 100% OWL 42% SPARQL SPARQL SPARQL OWL 43% RDFS 37% OWL 66% OWL 30% SPARQL 50% 17% 69% SPARQL OWL 37% RDFS 50% 15% OWL 41% OWL 10% OWL 46% 41% Schemas/ FOAF 30% RSS 20% FOAF 26% FOAF 41% FOAF 34% FOAF 27% vocabularies/ DC 12% — DC 21% FOAF 20% RSS 15% DC 20% DC 15% DC 13% ontologies SWRC 12% DBpedia DC 20% Bibtex 10% SIOC 20% SKOS 15% SIOC 7% 13%
    • Tables: Data integration and other properties

      data integration          2003  2004  2005  2006  2007  2008  2009
      manual                     30%   13%    0%   16%    9%    5%    4%
      semi-automatic             70%   31%  100%   47%   58%   65%   61%
      automatic                   0%   25%    0%   11%   13%    4%   19%
      not needed                  0%   31%    0%   26%   20%   26%   16%

      property                  2003  2004  2005  2006  2007  2008  2009
      Data creation              20%   37%   50%   52%   37%   52%   76%
      Data import                70%   50%   83%   52%   70%   86%   73%
      Data export                70%   56%   83%   68%   79%   86%   73%
      Inferencing                60%   68%   83%   57%   79%   52%   42%
      Decentralised sources      90%   75%  100%   57%   41%   95%   96%
      Multiple owners            90%   93%  100%   89%   83%   91%   88%
      Heterogeneous formats      90%   87%  100%   89%   87%   78%   88%
      Data updates               90%   75%   83%   78%   45%   73%   50%
      Linked Data principles      0%    0%    0%    5%   25%   26%   65%
    • Table: architectural analysis

      component                             2003  2004  2005  2006  2007  2008  2009  total
      number of applications                  10    16     6    19    24    23    26    124
      graph access layer                    100%  100%  100%  100%  100%  100%  100%   100%
      RDF store                              80%   94%  100%   95%   92%   87%   77%    88%
      graph-based navigation interface       90%  100%  100%   89%   96%   83%   88%    91%
      data homogenisation service            90%   50%   83%   63%   88%   70%   80%    74%
      graph query language service           80%   88%   83%   68%   88%   78%   65%    77%
      structured data authoring interface    20%   38%   33%   37%   33%   26%   19%    29%
      data discovery service                 50%   25%   33%   16%   54%   30%   15%    30%
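The total row of the architectural analysis table can be cross-checked from the per-year rows. The sketch below assumes the totals are app-weighted means of the yearly percentages; this is an inference from the numbers, not something the slides state. The figures are copied from the table, and the variable and function names are illustrative.

```python
# Number of surveyed applications per year, 2003 to 2009 (from the table).
APPS = [10, 16, 6, 19, 24, 23, 26]

# Per-year share of applications implementing each component, in percent.
SHARE = {
    "RDF store": [80, 94, 100, 95, 92, 87, 77],
    "graph-based navigation interface": [90, 100, 100, 89, 96, 83, 88],
    "data homogenisation service": [90, 50, 83, 63, 88, 70, 80],
    "graph query language service": [80, 88, 83, 68, 88, 78, 65],
    "structured data authoring interface": [20, 38, 33, 37, 33, 26, 19],
    "data discovery service": [50, 25, 33, 16, 54, 30, 15],
}


def weighted_total(percentages, counts):
    """App-weighted mean of yearly percentages, rounded to a whole percent."""
    weighted_sum = sum(p * n for p, n in zip(percentages, counts))
    return round(weighted_sum / sum(counts))


totals = {name: weighted_total(values, APPS) for name, values in SHARE.items()}
```

Under that assumption the computed values match the total row: 88, 91, 74, 77, 29 and 30 percent over 124 applications. The yearly inputs are themselves rounded, so agreement to the nearest percent is the most that can be expected.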