Initial Usage Analysis of DBpedia's Triple Pattern Fragments

Ruben Verborgh, Postdoctoral Researcher in Semantic Hypermedia at Ghent University – imec, Belgium
Ruben Verborgh, Erik Mannens & Rik Van de Walle
What role will the Semantic Web play for the future generation?
Will it be remotely as important as the Web now is to us?
There used to be no applications because there was no data.
Linked Data more or less solved this chicken-and-egg problem.
There are no applications because data is not queryable.
SPARQL endpoints are unreliable.
Data dumps are not live.
We analyzed DBpedia’s low-cost Triple Pattern Fragments interface between Nov 2014 and Feb 2015.
Over 4M requests were made.
There was 1 minute of downtime.
Web interfaces to triples
Four months of fragments
Extending the analysis
Web interfaces to triples
Web interfaces act as gateways between clients and databases.
[Diagram: Database, Web interface, Client]
The interface hides the database schema.
The interface restricts the kind of queries.
No sane Web developer or admin would give direct database access.
[Diagram: Database, Web interface, Client]
The client must know the database schema.
The client can ask any query.
SPARQL endpoints happily give direct access to the database.
[Diagram: Triple store, SPARQL protocol, Client]
The client must know the database schema.
The client can ask any query.
SPARQL interfaces are expensive, so we have an availability problem.
There are few SPARQL endpoints because hosting them is expensive.
Many of the endpoints that exist suffer from low availability.
You already give data for free. Do you have to pay for query time as well?
Data dumps allow you to set up your own private SPARQL endpoint.
But then we no longer query the Web…
No usage statistics whatsoever.
Not everybody can do this: mobile devices, non-technical users, …
A Triple Pattern Fragments interface acts as a gateway to an RDF source.
[Diagram: RDF source, TPF interface, Client]
The interface hides the database schema.
The interface restricts the kind of queries.
A Triple Pattern Fragments interface acts as a gateway to an RDF source.
Clients can only ask ?s ?p ?o patterns.
Complex SPARQL queries are decomposed client-side.
Low server cost, highly cacheable, but higher bandwidth and query time.
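The division of labour described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `TRIPLES` data, the `ex:` names and the `fragment` and `solve` helpers are all hypothetical, the "server" is an in-memory list rather than an HTTP interface, and paging and metadata are left out. The server answers one triple pattern at a time; the client evaluates a conjunction of patterns by substituting bindings and asking for further fragments.

```python
from typing import Dict, Iterator, List, Optional, Sequence, Tuple

Triple = Tuple[str, str, str]

# Toy data standing in for an RDF source (hypothetical, not real DBpedia data).
TRIPLES: List[Triple] = [
    ("ex:Ghent", "rdf:type", "ex:City"),
    ("ex:Ghent", "ex:country", "ex:Belgium"),
    ("ex:Paris", "rdf:type", "ex:City"),
    ("ex:Paris", "ex:country", "ex:France"),
    ("ex:Belgium", "rdf:type", "ex:Country"),
]


def fragment(s: Optional[str] = None, p: Optional[str] = None,
             o: Optional[str] = None) -> List[Triple]:
    """Server side: answer a single triple pattern; None acts as a wildcard."""
    return [t for t in TRIPLES
            if (s is None or t[0] == s)
            and (p is None or t[1] == p)
            and (o is None or t[2] == o)]


def is_var(term: str) -> bool:
    return term.startswith("?")


def solve(patterns: Sequence[Triple],
          binding: Optional[Dict[str, str]] = None) -> Iterator[Dict[str, str]]:
    """Client side: evaluate a conjunction of patterns, one fragment at a time."""
    binding = binding or {}
    if not patterns:
        yield dict(binding)
        return
    first, rest = patterns[0], patterns[1:]
    # Fill in variables bound by earlier patterns; the rest stay variables.
    terms = [binding.get(t, t) if is_var(t) else t for t in first]
    lookup = [None if is_var(t) else t for t in terms]
    for triple in fragment(*lookup):
        extended = dict(binding)
        consistent = True
        for term, value in zip(terms, triple):
            if is_var(term):
                # A variable used twice in one pattern must match the same value.
                if extended.setdefault(term, value) != value:
                    consistent = False
                    break
        if consistent:
            yield from solve(rest, extended)


# "Which cities are in Belgium?" expressed as two triple patterns.
query = [("?city", "rdf:type", "ex:City"),
         ("?city", "ex:country", "ex:Belgium")]
results = list(solve(query))
print(results)
```

The server function stays trivial while the join logic lives in `solve`, which is where the extra requests (bandwidth and query time) come from. A real client would typically also order the patterns, for instance by estimated result size, before starting.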
Four months of fragments
In mid-October 2014, we started an official TPF interface for DBpedia.
Will this interface be used?
How will clients use it?
Will the availability be sufficient for live application usage?
The server is deployed virtually; availability is monitored externally.
Amazon Elastic Compute Cloud c3.2xlarge machine (8 CPU, 15 GB RAM)
Compressed HDT format as backend
Pingdom for analysis
4.5 million Triple Pattern Fragments of DBpedia 2014@en were requested.
The TPF client library consumed most, followed by crawlers and Chrome.
Turtle is the most popular content type, but it is being surpassed by TriG.
Most requests come from Europe; the US and China follow.
The “type” fragment was popular, but it’s hard to conclude anything.
A quarter of all requests were cached (but we could cache everything).
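Why everything could be cached can be illustrated with a small sketch: each fragment is identified by a URL, so a cache can serve repeated requests without touching the origin server. Everything below is an assumption for illustration only: the `BASE_URL`, the `subject`/`predicate`/`object` parameter names (a real client should discover these through the interface's hypermedia controls) and the `CachingProxy` class are hypothetical, and no network request is made.

```python
from typing import Callable, Dict, List
from urllib.parse import urlencode

BASE_URL = "http://example.org/dbpedia"  # hypothetical fragments endpoint


def fragment_url(subject: str = "", predicate: str = "", object_: str = "") -> str:
    """Build the URL that identifies one triple pattern fragment (assumed scheme)."""
    params = {"subject": subject, "predicate": predicate, "object": object_}
    query = urlencode({k: v for k, v in params.items() if v})
    return f"{BASE_URL}?{query}" if query else BASE_URL


class CachingProxy:
    """Serve repeated fragment URLs from memory instead of asking the origin."""

    def __init__(self, origin: Callable[[str], str]) -> None:
        self.origin = origin
        self.store: Dict[str, str] = {}
        self.requests = 0
        self.hits = 0

    def get(self, url: str) -> str:
        self.requests += 1
        if url in self.store:
            self.hits += 1
        else:
            self.store[url] = self.origin(url)
        return self.store[url]

    @property
    def hit_ratio(self) -> float:
        return self.hits / self.requests if self.requests else 0.0


origin_calls: List[str] = []


def origin(url: str) -> str:
    """Stand-in for the origin server; records every request that reaches it."""
    origin_calls.append(url)
    return f"fragment body for {url}"


proxy = CachingProxy(origin)
type_pattern = fragment_url(predicate="rdf:type")
for url in [type_pattern, type_pattern, fragment_url(subject="ex:Ghent"), type_pattern]:
    proxy.get(url)

print(proxy.hit_ratio, len(origin_calls))
```

In this toy run, two of the four requests never reach the origin. In practice the same effect would come from standard HTTP caches placed in front of the server.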
During four months, the API had 99.9994% availability.
We deeply apologize for that one minute of downtime in November.
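The availability figure follows from simple arithmetic. The sketch below assumes the measurement window runs from 1 November 2014 up to and including 28 February 2015; the exact start and end dates are an assumption, since the slides only name the months.

```python
from datetime import date


def availability_percent(start: date, end: date, downtime_minutes: float) -> float:
    """Percentage of minutes in [start, end) during which the service was up."""
    total_minutes = (end - start).days * 24 * 60
    return 100 * (1 - downtime_minutes / total_minutes)


# Assumed window: 1 Nov 2014 up to (not including) 1 Mar 2015, i.e. 120 days.
pct = availability_percent(date(2014, 11, 1), date(2015, 3, 1), downtime_minutes=1)
print(f"{pct:.4f}%")  # prints 99.9994%
```

One minute out of 172,800 is roughly 0.0006% downtime, which matches the reported number.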
Extending the analysis
We don’t know exactly which clients executed queries.
Was the TPF client used standalone?
As a library of another application?
This is also hard to determine for SPARQL endpoints.
The analysis did not give insight into which queries clients executed.
Good for privacy!
We can try reconstructing SPARQL queries, but maybe clients did something else.
We only know with SPARQL endpoints, not with data dumps or LD documents.
We could learn from the human Web: can clients give explicit feedback?
“This is the query I executed. It took me 5 seconds.”
Potential source of insights, but clients need something to gain.
Will this be representative/truthful?
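What such explicit feedback might look like can be sketched as a small message a client sends after finishing a query. The slides do not define any format, so the `QueryFeedback` class and all of its field names are purely hypothetical.

```python
import json
from dataclasses import asdict, dataclass


@dataclass
class QueryFeedback:
    """Hypothetical feedback record; none of these fields come from a standard."""
    query: str                # the SPARQL query the client evaluated
    duration_seconds: float   # how long the client took in total
    fragments_requested: int  # how many fragments it fetched along the way


def to_json(feedback: QueryFeedback) -> str:
    """Serialize the record so it could be sent to the data publisher."""
    return json.dumps(asdict(feedback), sort_keys=True)


feedback = QueryFeedback(
    query="SELECT ?city WHERE { ?city a <http://example.org/City> }",
    duration_seconds=5.0,
    fragments_requested=12,
)
message = to_json(feedback)
print(message)
```

Whether clients would send such messages, and whether the reported values can be trusted, remains the open question raised above.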
We have a >99.999% available API to the most popular RDF data source.
No more excuses not to build apps.
So where are they?
Is something else holding us back?
We need to think differently about how to build Linked Data apps.
The paradigm of querying a database and waiting for the results does not scale to the Web.
Live data requires new interfaces and new visualizations.
We need developers to build bridges from data to end users.
Now that the chicken-and-egg problem and the availability problems are solved, we need to tackle fundamental questions.
Where are the killer apps the next generation is waiting for?
@RubenVerborgh
ruben.verborgh.org
linkeddatafragments.org