Querying data on the Web: client or server?
Ruben Verborgh
Ghent University – iMinds
The current Semantic Web has many implicit assumptions.
We should be able to answer all queries.
Complexity is more important than availability.
Data servers need to be expensive.
Those assumptions are not necessarily wrong.
They’re also not necessarily the only possible ones.
Some queries are hard to answer.
Availability is a top priority.
Low-cost data servers have potential.
Let’s rethink our assumptions, just to see what’s possible.
Different assumptions lead to a different Semantic Web.
Maybe they bring us closer to the Web We Want.
…but what do we want?
Querying data on the Web: client or server?
The Semantic Web’s assumptions
Client-side query execution
New query opportunities
1. Clients need a different protocol.
The Web for humans offers an HTTP interface to HTML.
[Diagram: client, HTTP, data (HTML)]
The Web for applications offers an HTTP interface to JSON.
[Diagram: client, HTTP, data (JSON)]
The Web for applications offers an HTTP interface to RDF.
[Diagram: client, HTTP, data (RDF)]
The Web for applications offers a SPARQL interface to RDF.
[Diagram: client, HTTP, SPARQL, data (RDF)]
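As a rough illustration of what a SPARQL interface over HTTP can look like: under the SPARQL 1.1 Protocol, a client may send the full query text to the endpoint in a `query` URL parameter. The sketch below only builds such a URL. The endpoint address and the helper name `sparql_get_url` are placeholders chosen for this example, not part of the talk.

```python
from urllib.parse import urlencode, urlsplit, parse_qs

# Hypothetical endpoint address, used only for illustration.
ENDPOINT = "http://example.org/sparql"

def sparql_get_url(endpoint: str, query: str) -> str:
    """Build a GET URL that carries the whole query in the `query` parameter."""
    return endpoint + "?" + urlencode({"query": query})

query = "SELECT ?s WHERE { ?s ?p ?o } LIMIT 10"
url = sparql_get_url(ENDPOINT, query)
print(url)
```

The point of the sketch is that the client, not the server, decides what goes into that parameter.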
Semantic Web clients were perceived as very limited.
Documents need a new language.
Querying needs a new protocol.
…unlike “simple” JSON clients.
1. Clients need a different protocol.
2. Live queries require that protocol.
There are 3 common ways to publish Linked Data.
public SPARQL endpoints
Linked Data documents
downloadable data dumps
Public SPARQL endpoints offer a very powerful interface.
…and that’s not always a good thing.
Clients can ask any query…
…if the endpoint is available.
Hosting an endpoint is costly.
Linked Data documents seem to work like the Web.
Low-cost to host.
Solve queries by traversing links.
Many queries cannot be solved.
Downloadable data dumps have high availability.
Set up your own endpoint.
Data is not live.
You’re not really querying the Web.
1. Clients need a different protocol.
2. Live queries require that protocol.
3. Clients can request any query.
The query language abstracts away the steps needed to solve it.
In SPARQL, asking a simple query is as easy as asking a difficult one.
In contrast to the rest of the Web, clients are in control.
With a JSON interface, the server decides how clients access data.
[Diagram: client, HTTP, data (JSON)]
With a SPARQL interface, clients decide how they access data.
[Diagram: client, HTTP, SPARQL, data (RDF)]
Clients can ask anything, also queries that bring servers down.
The majority of public SPARQL endpoints have less than 95% availability.
That means the endpoint, and thus your application, doesn’t work 1.5 days each month.
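The 1.5-day figure follows from simple arithmetic, sketched here under the assumption of a 30-day month (the function name is made up for this example):

```python
def downtime_days(availability: float, days_in_period: float = 30.0) -> float:
    """Days of downtime in a period, given availability as a fraction (0.95 = 95%)."""
    return (1.0 - availability) * days_in_period

# 5% of a 30-day month is unavailable time.
monthly = downtime_days(0.95)
print(round(monthly, 2))
```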
If you have operational need for SPARQL accessible data, you must have your own infrastructure.
No public endpoints.
Public endpoints are for lookups and discovery; sort of a dataset demo.
—Orri Erling, OpenLink (2014)
SEMANTIC (things we happen to have downloaded from the) WEB
If you want to study a subject on Wikipedia, do you download all 4,614,000 articles first?
1. Clients need a different protocol.
2. Live queries require that protocol.
3. Clients can request any query.
Any fragment of a Linked Data set is called a Linked Data Fragment.
[Diagram: a spectrum from high client effort to high server effort: data dump (selector: all), dereferencing (selector: subject), SPARQL endpoint (selector: SPARQL query)]
Each type of Linked Data Fragment is defined by three characteristics.
selector: What data does it contain?
metadata: What do we know about it?
controls: What can we do next?
SPARQL CONSTRUCT result
  selector: a SPARQL query
  metadata: (none)
  controls: (none)
Linked Data Document
  selector: a specific entity
  metadata: creator, maintainer, …
  controls: links to other LD documents
data dump
  selector: everything
  metadata: number of triples, file size
  controls: (none)
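One way to keep the three characteristics side by side is a small record type. This is only an illustrative sketch: the class and helper names are invented here, and the values are copied from the slides.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass(frozen=True)
class FragmentType:
    name: str
    selector: str   # What data does it contain?
    metadata: str   # What do we know about it?
    controls: str   # What can we do next?

FRAGMENT_TYPES = [
    FragmentType("SPARQL CONSTRUCT result", "a SPARQL query", "(none)", "(none)"),
    FragmentType("Linked Data document", "a specific entity",
                 "creator, maintainer, …", "links to other LD documents"),
    FragmentType("data dump", "everything", "number of triples, file size", "(none)"),
]

def by_name(name: str) -> Optional[FragmentType]:
    """Look up a fragment type by its name; returns None if unknown."""
    return next((ft for ft in FRAGMENT_TYPES if ft.name == name), None)

print(by_name("data dump"))
```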
Can we query fragments that balance client and server effort?
[Diagram: the same spectrum from high client effort to high server effort, now with triple pattern fragments (selector: triple pattern) placed between the data dump and the SPARQL endpoint]
triple pattern fragments
  selector: triple pattern
  metadata: total number of matches
  controls: access to all other fragments
Triple pattern fragments are cheap yet enable efficient querying.
data (first 100)
controls (other fragments)
metadata (total count)
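To make the three parts of a triple pattern fragment concrete, here is a minimal in-memory sketch of what a server might return for one request. It is not the actual server implementation: the names are hypothetical, `None` stands for a variable, and the hypermedia controls are simplified to a single "next page" pointer. The page size of 100 is taken from the slide.

```python
from typing import NamedTuple, Optional

Triple = tuple[str, str, str]
Pattern = tuple[Optional[str], Optional[str], Optional[str]]  # None means "variable"

class Fragment(NamedTuple):
    data: list[Triple]          # one page of matching triples
    total_count: int            # metadata: total number of matches
    next_page: Optional[int]    # control (simplified): where to go next

PAGE_SIZE = 100

def matches(triple: Triple, pattern: Pattern) -> bool:
    """A triple matches when every non-variable position is equal."""
    return all(p is None or p == t for t, p in zip(triple, pattern))

def get_fragment(dataset: list[Triple], pattern: Pattern,
                 page: int = 1, page_size: int = PAGE_SIZE) -> Fragment:
    found = [t for t in dataset if matches(t, pattern)]
    start = (page - 1) * page_size
    has_more = len(found) > start + page_size
    return Fragment(found[start:start + page_size], len(found),
                    page + 1 if has_more else None)

# Hypothetical data: 250 triples with predicate "p", plus one unrelated triple.
dataset = [(f"s{i}", "p", "o") for i in range(250)] + [("x", "q", "y")]
first = get_fragment(dataset, (None, "p", None))
last = get_fragment(dataset, (None, "p", None), page=3)
```

The server only filters and paginates, which is why such an interface is cheap to host.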
Triple pattern fragment servers enable clients to execute queries.
Other APIs exist, but are specific.
Triple patterns work on all datasets.
Combine data, metadata & controls.
How to answer this query using only triple pattern fragments?
SELECT ?person ?city WHERE {
  ?person a dbpedia-owl:Artist.
  ?person dbpedia-owl:birthPlace ?city.
  ?city foaf:name "York"@en.
}
Get the corresponding fragments and read the count metadata.
?person a dbpedia-owl:Artist. (±61,000 matches)
  dbpedia:Aamir_Zaki a dbpedia-owl:Artist.
  dbpedia:Ahmad_Morid a dbpedia-owl:Artist.
  …
?person dbpedia-owl:birthPlace ?city. (±470,000 matches)
  dbpedia:Ganesh_Ghosh …:birthPlace dbpedia:Bengal_Presidency.
  dbpedia:Jacques_L'enfant …:birthPlace dbpedia:Beauce.
  …
?city foaf:name "York"@en. (12 matches)
  dbpedia:York foaf:name "York"@en.
  dbpedia:York,_Ontario foaf:name "York"@en.
  …
Start with the smallest fragment: ?city foaf:name "York"@en. (12 matches)
Start with the first match: dbpedia:York foaf:name "York"@en.
How to answer this query using only triple pattern fragments?
SELECT ?person WHERE {
  ?person a dbpedia-owl:Artist.
  ?person dbpedia-owl:birthPlace dbpedia:York.
  dbpedia:York foaf:name "York"@en.
}
Get the corresponding fragments and read the count metadata.
?person a dbpedia-owl:Artist. (±61,000 matches)
  dbpedia:Aamir_Zaki a dbpedia-owl:Artist.
  dbpedia:Ahmad_Morid a dbpedia-owl:Artist.
  …
?person dbpo:birthPlace dbpedia:York. (75 matches)
  dbpedia:John_Flaxman dbpo:birthPlace dbpedia:York.
  dbpedia:Joseph_Hansom dbpo:birthPlace dbpedia:York.
  …
Start with the smallest fragment: ?person dbpo:birthPlace dbpedia:York. (75 matches)
Start with the first match: dbpedia:John_Flaxman dbpo:birthPlace dbpedia:York.
How to answer this query using only triple pattern fragments?
ASK {
  dbp:John_Flaxman a dbpo:Artist.
  dbp:John_Flaxman dbpo:birthPlace dbp:York.
  dbp:York foaf:name "York"@en.
}
Get the corresponding fragment and read the count metadata.
dbpedia:John_Flaxman a dbpedia-owl:Artist. (1 match)
  dbpedia:John_Flaxman a dbpedia-owl:Artist.
Output the match:
  ?person = dbpedia:John_Flaxman
  ?city = dbpedia:York
Recursively repeat the process for all bindings.
?person dbpo:birthPlace dbpedia:York.
  dbpedia:John_Flaxman dbpo:birthPlace dbpedia:York.
  dbpedia:Joseph_Hansom dbpo:birthPlace dbpedia:York.
  …
?city foaf:name "York"@en.
  dbpedia:York foaf:name "York"@en.
  dbpedia:York,_Ontario foaf:name "York"@en.
  …
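The walkthrough above can be condensed into a short recursive sketch: bind what is already known, pick the pattern whose fragment has the fewest matches, and recurse for each match. This is an illustration under stated assumptions, not the reference client. `fragment` stands in for an HTTP request to a triple pattern fragment server, terms starting with `?` are variables, `rdf:type` is written out instead of `a`, and `DATA` is a tiny made-up subset that makes no claim about real DBpedia content.

```python
from typing import Iterator, Optional

Triple = tuple[str, str, str]

# Toy data for illustration only; not real DBpedia content.
DATA: list[Triple] = [
    ("dbpedia:John_Flaxman", "rdf:type", "dbpedia-owl:Artist"),
    ("dbpedia:Aamir_Zaki", "rdf:type", "dbpedia-owl:Artist"),
    ("dbpedia:Ahmad_Morid", "rdf:type", "dbpedia-owl:Artist"),
    ("dbpedia:John_Flaxman", "dbpedia-owl:birthPlace", "dbpedia:York"),
    ("dbpedia:Joseph_Hansom", "dbpedia-owl:birthPlace", "dbpedia:York"),
    ("dbpedia:Ganesh_Ghosh", "dbpedia-owl:birthPlace", "dbpedia:Bengal_Presidency"),
    ("dbpedia:York", "foaf:name", '"York"@en'),
    ("dbpedia:York,_Ontario", "foaf:name", '"York"@en'),
]

def is_var(term: str) -> bool:
    return term.startswith("?")

def fragment(pattern: Triple) -> list[Triple]:
    """Stand-in for one request to a fragment server: all triples matching the pattern."""
    return [t for t in DATA if all(is_var(p) or p == v for p, v in zip(pattern, t))]

def solve(patterns: list[Triple],
          bindings: Optional[dict[str, str]] = None) -> Iterator[dict[str, str]]:
    bindings = bindings or {}
    if not patterns:
        yield bindings  # every pattern is satisfied: output the match
        return
    # Fill in the variables we already know.
    bound = [tuple(bindings.get(term, term) for term in p) for p in patterns]
    # Read the counts and start with the smallest fragment.
    smallest = min(bound, key=lambda p: len(fragment(p)))
    rest = [p for p in bound if p is not smallest]
    # Recursively repeat the process for every match of that fragment.
    for triple in fragment(smallest):
        extended = dict(bindings)
        extended.update({p: v for p, v in zip(smallest, triple) if is_var(p)})
        yield from solve(rest, extended)

QUERY = [
    ("?person", "rdf:type", "dbpedia-owl:Artist"),
    ("?person", "dbpedia-owl:birthPlace", "?city"),
    ("?city", "foaf:name", '"York"@en'),
]
results = list(solve(QUERY))
print(results)
```

Because `solve` is a generator, each match is handed to the caller as soon as it is found rather than after the whole query finishes.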
This way of querying changes the usual assumptions.
Use the Web’s protocol HTTP.
Don’t be smart; enable intelligence.
Some queries will be hard / slow.
Querying semantic data sources means managing expectations.
[Diagram: data dump, dereferencing, triple pattern fragments, and SPARQL endpoint compared on three axes: client effort versus server effort, availability, and freshness / speed]
Coupling access and processing leads to low availability.
[Figure: one SPARQL server serving many clients]
(a) SPARQL endpoints perform all processing on the server, leading to fast query execution with low data bandwidth, and a rapidly overloaded server.
Enabling clients to query leads to high scalability.
[Figure: one LDF server serving many clients]
(b) LDF servers only support simple requests and can thus handle far higher loads. Clients perform the querying, so they need more (cacheable) data.
Show a sorted list of molecules that match certain characteristics.
endpoint approach
SELECT DISTINCT(?mol) MIN(?name)
WHERE {
  ?mol rdfs:label ?name;
  …
  …
}
ORDER BY ?name
Consequences:
  DISTINCT: keep all results in memory
  MIN: keep all results in memory, blocking
  ORDER BY: keep all results in memory, blocking
Doesn’t matter; we’re waiting anyway.
fragments approach
SELECT ?mol ?name
WHERE {
  ?mol rdfs:label ?name;
  …
  …
}
No blocking operators; streaming matters.
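The difference between a blocking and a streaming operator can be seen in a few lines. In this sketch the molecule identifiers and names are invented, and a log records how many results had to arrive before the first one could be shown.

```python
from typing import Iterable, Iterator

Result = tuple[str, str]  # (molecule, name)

def matches(log: list[str]) -> Iterator[Result]:
    """Yield hypothetical results one by one, logging each arrival."""
    for result in [("mol:3", "water"), ("mol:1", "ethanol"), ("mol:2", "benzene")]:
        log.append(result[0])
        yield result

def streaming(results: Iterable[Result]) -> Iterator[Result]:
    # Non-blocking: each result is passed on as soon as it arrives.
    yield from results

def ordered(results: Iterable[Result]) -> Iterator[Result]:
    # Blocking: sorting needs every result before the first can be emitted.
    yield from sorted(results, key=lambda r: r[1])

stream_log: list[str] = []
sort_log: list[str] = []
first_streamed = next(streaming(matches(stream_log)))
first_sorted = next(ordered(matches(sort_log)))
print(first_streamed, stream_log)
print(first_sorted, sort_log)
```

With a client that receives matches incrementally, dropping the blocking operators means the first molecule can be displayed while the rest are still being fetched.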
The algorithm remains the same when clients use one or multiple triple pattern fragment servers.
Federation also becomes substantially easier.
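A hedged sketch of why federation fits the same algorithm: if a client asks every server for the same triple pattern, concatenates the data, and adds up the counts, the combined result looks like a single fragment to the rest of the query process. The function names and datasets below are hypothetical, `None` stands for a variable, and the summed count is only an estimate when servers hold overlapping triples.

```python
from typing import Optional

Triple = tuple[str, str, str]
Pattern = tuple[Optional[str], Optional[str], Optional[str]]  # None means "variable"

def fragment(dataset: list[Triple], pattern: Pattern) -> list[Triple]:
    """Stand-in for one request to one server."""
    return [t for t in dataset if all(p is None or p == v for p, v in zip(pattern, t))]

def federated_fragment(datasets: list[list[Triple]],
                       pattern: Pattern) -> tuple[list[Triple], int]:
    """Ask every server for the same pattern; merge the data and sum the counts."""
    parts = [fragment(d, pattern) for d in datasets]
    data = [t for part in parts for t in part]
    return data, sum(len(part) for part in parts)

# Two hypothetical servers, each holding part of the data.
server_a = [("ex:alice", "ex:knows", "ex:bob")]
server_b = [("ex:bob", "ex:knows", "ex:carol"), ("ex:bob", "ex:age", "42")]
data, count = federated_fragment([server_a, server_b], (None, "ex:knows", None))
print(count, data)
```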
Avoid the unavailability cascade.
An optimal solution doesn’t exist.
We should look at all APIs.
[Diagram: data dump, dereferencing, triple pattern fragments, SPARQL endpoint]
Servers indicate what they do, enabling clients to query optimally.
“This server supports triple patterns and full-text search on objects.”
“This server supports SPARQL queries with up to 2 joins.”
“This server supports Linked Data documents.”
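A client could act on such self-descriptions roughly as follows. The feature labels and the preference order (most expressive interface first) are assumptions made for this sketch. The slides do not prescribe a vocabulary or a selection strategy.

```python
from typing import Optional

# Hypothetical feature labels, ordered by this sketch's assumed preference.
PREFERENCE = ["sparql", "triple-patterns", "linked-data-documents"]

def choose_interface(advertised: set[str],
                     preference: list[str] = PREFERENCE) -> Optional[str]:
    """Pick the most preferred interface that the server says it supports."""
    for feature in preference:
        if feature in advertised:
            return feature
    return None

print(choose_interface({"triple-patterns", "full-text-search"}))
```

A fuller client would also read limits such as "up to 2 joins" and split its query plan accordingly.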
Different assumptions lead to different trade-offs.
Live querying of public data is possible at low cost, but at slower speeds…
…for now :-)
Let your browser solve a SPARQL query: client.linkeddatafragments.org
Ruben Verborgh
Ghent University – iMinds

Querying data on the Web – client or server?

  • 1. Querying data on the Web:
 client or server? Ruben Verborgh Ghent University – iMinds
  • 2. The current Semantic Web
 has many implicit assumptions. We should be able
 to answer all queries. Complexity is more important
 than availability. Data servers
 need to be expensive.
  • 3. Those assumptions are
 not necessarily wrong. They’re also not necessarily
 the only possible ones.
  • 4. Some queries are
 hard to answer. Availability is a top priority. Low-cost data servers
 have potential. Let’s rethink our assumptions,
 just to see what’s possible.
  • 5. Different assumptions lead
 to a different Semantic Web. Maybe they bring us closer
 to the Web We Want.
  • 6. …but what do we want?
  • 7. Querying data on the Web:
 client or server? The Semantic Web’s assumptions. Client-side query execution. New query opportunities.
  • 8. 1. Clients need a different protocol.
  • 9. The Web for humans offers an HTTP interface to HTML. (Diagram: client, HTTP, HTML data.)
  • 10. The Web for applications offers an HTTP interface to JSON. (Diagram: client, HTTP, JSON data.)
  • 11. The Web for applications offers an HTTP interface to RDF. (Diagram: client, HTTP, RDF data.)
  • 12. The Web for applications offers a SPARQL interface to RDF. (Diagram: client, HTTP and SPARQL, RDF data.)
  • 13. Semantic Web clients were
 perceived as very limited. Documents need a new language. Querying needs a new protocol. …unlike “simple” JSON clients.
  • 14. 1. Clients need a different protocol. 2. Live queries require that protocol.
  • 15. There are 3 common ways
 to publish Linked Data: public SPARQL endpoints, Linked Data documents, downloadable data dumps.
  • 16. Public SPARQL endpoints
 offer a very powerful interface… …and that’s not always a good thing. Clients can ask any query… …if the endpoint is available. Hosting an endpoint is costly.
  • 17. Linked Data documents
 seem to work like the Web. Low-cost to host. Solve queries by traversing links. Many queries cannot be solved.
  • 18. Downloadable data dumps
 have high availability. Set up your own endpoint. Data is not live. You’re not really querying the Web.
  • 19. 1. Clients need a different protocol. 2. Live queries require that protocol. 3. Clients can request any query.
  • 20. The query language abstracts away
 the steps needed to solve it. In SPARQL, asking a simple query
 is as easy as asking a difficult one. In contrast to the rest of the Web,
 clients are in control.
  • 21. With a JSON interface, the server decides how clients access data. (Diagram: client, HTTP, JSON data.)
  • 22. With a SPARQL interface, clients
 decide how they access data. (Diagram: client, HTTP and SPARQL, RDF data.)
  • 23. Clients can ask anything, also
 queries that bring servers down. The majority
 of public SPARQL endpoints
 have less than 95% availability. That means the endpoint,
 and thus your application,
 doesn’t work 1.5 days each month.
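The 1.5-day figure follows directly from the availability percentage. A minimal sketch of the arithmetic, assuming a 30-day month (the function name is only illustrative):

```python
def downtime_days(availability: float, days_in_month: float = 30.0) -> float:
    """Days per month a service is down, given its availability as a fraction (0..1)."""
    return (1.0 - availability) * days_in_month


# 95% availability in a 30-day month is roughly 1.5 days of downtime.
print(round(downtime_days(0.95), 2))
```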
  • 24. If you have operational need
 for SPARQL accessible data,
 you must have your own infrastructure. No public endpoints.
 Public endpoints are for lookups and discovery;
 sort of a dataset demo. —Orri Erling, OpenLink (2014)
  • 25. SEMANTIC things we happen to have
 downloaded from the WEB
  • 26. If you want to study
 a subject on Wikipedia, do you download all
 4,614,000 articles first?
  • 27. 1. Clients need a different protocol. 2. Live queries require that protocol. 3. Clients can request any query.
  • 28. Querying data on the Web:
 client or server? The Semantic Web’s assumptions. Client-side query execution. New query opportunities.
  • 29. Any fragment of a Linked Data set
 is called a Linked Data Fragment. (Diagram: an axis from high client effort to high server effort, with data dump [selector: all], dereferencing [selector: subject] and SPARQL endpoint [selector: SPARQL query].)
  • 30. Each type of Linked Data Fragment
 is defined by three characteristics. Selector: What data does it contain? Metadata: What do we know about it? Controls: What can we do next?
  • 31. Each type of Linked Data Fragment
 is defined by three characteristics. SPARQL CONSTRUCT result. Selector: a SPARQL query. Metadata: (none). Controls: (none).
  • 32. Each type of Linked Data Fragment
 is defined by three characteristics. Linked Data Document. Selector: a specific entity. Metadata: creator, maintainer, … Controls: links to other LD documents.
  • 33. Each type of Linked Data Fragment
 is defined by three characteristics. Data dump. Selector: everything. Metadata: number of triples, file size. Controls: (none).
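The three characteristics can be written down as a small record type. This is only an illustrative sketch: the class and field names are assumptions, and the values are copied from the slides above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FragmentType:
    name: str
    selector: str  # What data does it contain?
    metadata: str  # What do we know about it?
    controls: str  # What can we do next?


# Values taken from slides 31-33; the structure itself is hypothetical.
FRAGMENT_TYPES = [
    FragmentType("SPARQL CONSTRUCT result", "a SPARQL query", "(none)", "(none)"),
    FragmentType("Linked Data document", "a specific entity",
                 "creator, maintainer, …", "links to other LD documents"),
    FragmentType("data dump", "everything", "number of triples, file size", "(none)"),
]

for t in FRAGMENT_TYPES:
    print(f"{t.name}: selector={t.selector}; metadata={t.metadata}; controls={t.controls}")
```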
  • 34. Can we query fragments that
 balance client and server effort? (Diagram: an axis from high client effort to high server effort, with data dump [selector: all], dereferencing [selector: subject], triple pattern fragments [selector: triple pattern] and SPARQL endpoint [selector: SPARQL query].)
  • 35. Triple pattern fragments are cheap
 yet enable efficient querying. Selector: triple pattern. Metadata: total number of matches. Controls: access to all other fragments.
  • 36. data (first 100) controls (other fragments) metadata (total count)
  • 37. Triple pattern fragment servers
 enable clients to execute queries. Other APIs exist, but are specific. Triple patterns work on all datasets. Combine data, metadata & controls.
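To make the combination of data, metadata and controls concrete, here is a minimal in-memory sketch of what one fragment response could carry. It is not a real server or the actual HTTP interface: `get_fragment`, the `Fragment` fields, the page numbering and the `ex:` dataset are all assumptions made for illustration; only the page size of 100 comes from the slides.

```python
from typing import NamedTuple, Optional

Triple = tuple[str, str, str]
PAGE_SIZE = 100  # the slides show "data (first 100)"


class Fragment(NamedTuple):
    data: list[Triple]        # data: one page of matching triples
    total_count: int          # metadata: total number of matches
    next_page: Optional[int]  # control: which page to request next, if any


def matches(pattern: Triple, triple: Triple) -> bool:
    """A pattern term starting with '?' is a variable and matches anything."""
    return all(p.startswith("?") or p == t for p, t in zip(pattern, triple))


def get_fragment(dataset: list[Triple], pattern: Triple, page: int = 0) -> Fragment:
    """Hypothetical stand-in for one request to a triple pattern fragment server."""
    found = [t for t in dataset if matches(pattern, t)]
    start = page * PAGE_SIZE
    has_more = start + PAGE_SIZE < len(found)
    return Fragment(found[start:start + PAGE_SIZE], len(found),
                    page + 1 if has_more else None)


# Toy dataset of 250 invented triples.
dataset = [(f"ex:artist{i}", "rdf:type", "ex:Artist") for i in range(250)]
first = get_fragment(dataset, ("?person", "rdf:type", "ex:Artist"))
print(len(first.data), first.total_count, first.next_page)  # 100 250 1
```

The server-side work is a single pattern match plus a count, which is why such fragments are cheap to serve.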
  • 38. How to answer this query using
 only triple pattern fragments? SELECT ?person ?city WHERE { ?person a dbpedia-owl:Artist. ?person dbpedia-owl:birthPlace ?city. ?city foaf:name "York"@en. }
  • 39. Get the corresponding fragments.
 ?person a dbpedia-owl:Artist. Matches: dbpedia:Aamir_Zaki a dbpedia-owl:Artist. dbpedia:Ahmad_Morid a dbpedia-owl:Artist. …
 ?person dbpedia-owl:birthPlace ?city. Matches: dbpedia:Ganesh_Ghosh …:birthPlace dbpedia:Bengal_Presidency. dbpedia:Jacques_L'enfant …:birthPlace dbpedia:Beauce. …
 ?city foaf:name "York"@en. Matches: dbpedia:York foaf:name "York"@en. dbpedia:York,_Ontario foaf:name "York"@en. …
  • 40. Get the corresponding fragments
 and read the count metadata.
 ?person a dbpedia-owl:Artist. Count: ±61,000. Matches: dbpedia:Aamir_Zaki a dbpedia-owl:Artist. dbpedia:Ahmad_Morid a dbpedia-owl:Artist. …
 ?person dbpedia-owl:birthPlace ?city. Count: ±470,000. Matches: dbpedia:Ganesh_Ghosh …:birthPlace dbpedia:Bengal_Presidency. dbpedia:Jacques_L'enfant …:birthPlace dbpedia:Beauce. …
 ?city foaf:name "York"@en. Count: 12. Matches: dbpedia:York foaf:name "York"@en. dbpedia:York,_Ontario foaf:name "York"@en. …
  • 41. Start with the smallest fragment.
 Start with the first match.
 ?person a dbpedia-owl:Artist. Count: ±61,000.
 ?person dbpedia-owl:birthPlace ?city. Count: ±470,000.
 ?city foaf:name "York"@en. Count: 12, the smallest fragment. First match: dbpedia:York foaf:name "York"@en.
  • 42. How to answer this query using
 only triple pattern fragments? SELECT ?person WHERE { ?person a dbpedia-owl:Artist. ?person dbpedia-owl:birthPlace dbpedia:York. dbpedia:York foaf:name "York"@en. }
  • 43. Get the corresponding fragments.
 ?person a dbpedia-owl:Artist. Matches: dbpedia:Aamir_Zaki a dbpedia-owl:Artist. dbpedia:Ahmad_Morid a dbpedia-owl:Artist. …
 ?person dbpo:birthPlace dbpedia:York. Matches: dbpedia:John_Flaxman dbpo:birthPlace dbpedia:York. dbpedia:Joseph_Hansom dbpo:birthPlace dbpedia:York. …
  • 44. Get the corresponding fragments
 and read the count metadata.
 ?person a dbpedia-owl:Artist. Count: ±61,000. Matches: dbpedia:Aamir_Zaki a dbpedia-owl:Artist. dbpedia:Ahmad_Morid a dbpedia-owl:Artist. …
 ?person dbpo:birthPlace dbpedia:York. Count: 75. Matches: dbpedia:John_Flaxman dbpo:birthPlace dbpedia:York. dbpedia:Joseph_Hansom dbpo:birthPlace dbpedia:York. …
  • 45. Start with the smallest fragment.
 Start with the first match.
 ?person a dbpedia-owl:Artist. Count: ±61,000.
 ?person dbpo:birthPlace dbpedia:York. Count: 75, the smallest fragment. First match: dbpedia:John_Flaxman dbpo:birthPlace dbpedia:York.
  • 46. How to answer this query using
 only triple pattern fragments? ASK { dbp:John_Flaxman a dbpo:Artist. dbp:John_Flaxman dbpo:birthPlace dbp:York. dbp:York foaf:name "York"@en. }
  • 47. Get the corresponding fragment
 and read the count metadata. dbpedia:John_Flaxman a dbpedia-owl:Artist. Count: 1. Match: dbpedia:John_Flaxman a dbpedia-owl:Artist. Output the match: ?person = dbpedia:John_Flaxman, ?city = dbpedia:York.
  • 48. Recursively repeat the process
 for all bindings.
 ?person dbpo:birthPlace dbpedia:York. Matches: dbpedia:John_Flaxman dbpo:birthPlace dbpedia:York. dbpedia:Joseph_Hansom dbpo:birthPlace dbpedia:York. …
 ?city foaf:name "York"@en. Matches: dbpedia:York foaf:name "York"@en. dbpedia:York,_Ontario foaf:name "York"@en. …
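The walkthrough on slides 38 to 48 can be sketched as a short recursive procedure: read the counts, pick the smallest fragment, bind each of its matches, and repeat for the remaining patterns. The code below is a simplified illustration, not the actual client implementation. `fragment` stands in for an HTTP request to a fragment server, the dataset contains only the few triples shown on the slides (with `dbpo:` used for the ontology prefix and `…:birthPlace` written as `dbpo:birthPlace`), and all function names are assumptions.

```python
from typing import Iterator, Optional

Triple = tuple[str, str, str]
Bindings = dict[str, str]

# Only the handful of triples visible on the slides, not the real dataset.
DATA: list[Triple] = [
    ("dbpedia:Aamir_Zaki", "a", "dbpo:Artist"),
    ("dbpedia:Ahmad_Morid", "a", "dbpo:Artist"),
    ("dbpedia:John_Flaxman", "a", "dbpo:Artist"),
    ("dbpedia:Ganesh_Ghosh", "dbpo:birthPlace", "dbpedia:Bengal_Presidency"),
    ("dbpedia:Jacques_L'enfant", "dbpo:birthPlace", "dbpedia:Beauce"),
    ("dbpedia:John_Flaxman", "dbpo:birthPlace", "dbpedia:York"),
    ("dbpedia:Joseph_Hansom", "dbpo:birthPlace", "dbpedia:York"),
    ("dbpedia:York", "foaf:name", '"York"@en'),
    ("dbpedia:York,_Ontario", "foaf:name", '"York"@en'),
]


def is_var(term: str) -> bool:
    return term.startswith("?")


def fragment(pattern: Triple) -> list[Triple]:
    """Hypothetical stand-in for requesting one triple pattern fragment."""
    return [t for t in DATA if all(is_var(p) or p == v for p, v in zip(pattern, t))]


def substitute(pattern: Triple, bindings: Bindings) -> Triple:
    """Replace bound variables in a pattern by their values."""
    return tuple(bindings.get(term, term) for term in pattern)


def solve(patterns: list[Triple], bindings: Optional[Bindings] = None) -> Iterator[Bindings]:
    bindings = bindings or {}
    if not patterns:
        yield bindings  # every pattern is satisfied: output the match
        return
    # Read the count metadata and start with the smallest fragment.
    smallest = min(range(len(patterns)), key=lambda i: len(fragment(patterns[i])))
    pattern = patterns[smallest]
    rest = patterns[:smallest] + patterns[smallest + 1:]
    # For each match, bind its variables and recurse on the remaining patterns.
    for triple in fragment(pattern):
        new = {term: value for term, value in zip(pattern, triple) if is_var(term)}
        yield from solve([substitute(p, new) for p in rest], {**bindings, **new})


QUERY: list[Triple] = [
    ("?person", "a", "dbpo:Artist"),
    ("?person", "dbpo:birthPlace", "?city"),
    ("?city", "foaf:name", '"York"@en'),
]
print(list(solve(QUERY)))
```

On this toy data the only solution binds `?person` to `dbpedia:John_Flaxman` and `?city` to `dbpedia:York`, matching slide 47. The sketch recomputes counts by fetching whole fragments and does not handle a variable that occurs twice in one pattern; a real client would read the count from the fragment metadata instead.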
  • 49. This way of querying
 changes the usual assumptions. Use the Web’s protocol HTTP. Don’t be smart; enable intelligence. Some queries will be hard / slow.
  • 50. Querying semantic datasources
 means managing expectations. (Diagram: data dump, dereferencing, triple pattern fragments and SPARQL endpoint on an axis from high client effort, high availability and low freshness / speed to high server effort, low availability and high freshness / speed.)
  • 51. Querying data on the Web:
 client or server? The Semantic Web’s assumptions. Client-side query execution. New query opportunities.
  • 52. Coupling access and processing
 leads to low availability. (Figure a: one SPARQL server with many clients.) SPARQL endpoints perform all processing on the server, leading to fast query execution with low data bandwidth, and a rapidly overloaded server.
  • 53. Enabling clients to query
 leads to high scalability. (Figure b: one LDF server with many clients.) LDF servers only support simple requests and can thus handle far higher loads. Clients perform the querying, so they need more (cacheable) data.
  • 54. Show a sorted list of molecules
 that match certain characteristics. (Diagram: a list of molecules, with an endpoint approach and a fragment approach.)
  • 55. Show a sorted list of molecules
 that match certain characteristics. Endpoint approach. (Diagram: molecules and a SPARQL endpoint.)
  • 56. Show a sorted list of molecules
 that match certain characteristics. Endpoint approach: SELECT DISTINCT(?mol) MIN(?name) WHERE { ?mol rdfs:label ?name; … … } ORDER BY ?name
  • 57. Show a sorted list of molecules
 that match certain characteristics. Endpoint approach: SELECT DISTINCT(?mol) MIN(?name) WHERE { ?mol rdfs:label ?name; … … } ORDER BY ?name
  • 58. Show a sorted list of molecules
 that match certain characteristics. Endpoint approach. Consequences: DISTINCT: keep all results in memory. MIN: keep all results in memory, blocking. SORT BY: keep all results in memory, blocking. Doesn’t matter; we’re waiting anyway.
  • 59. Show a sorted list of molecules
 that match certain characteristics. Fragments approach: SELECT ?mol ?name WHERE { ?mol rdfs:label ?name; … … } No blocking operators; streaming matters.
  • 60. Show a sorted list of molecules
 that match certain characteristics. Fragments approach. (Diagram: molecules.)
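One way to read "no blocking operators; streaming matters" is that the application can update its sorted list as each result arrives, instead of waiting for a complete sorted answer. The sketch below assumes exactly that division of work, which the slides do not spell out; `sorted_view` and the sample molecules are invented for illustration.

```python
import bisect
from typing import Iterable, Iterator


def sorted_view(results: Iterable[tuple[str, str]]) -> Iterator[list[str]]:
    """Yield the current sorted list of names after each streamed (molecule, name) result.

    Each molecule is listed once, under the smallest name seen so far.
    """
    smallest: dict[str, str] = {}  # molecule -> smallest name seen so far
    names: list[str] = []          # names, kept in sorted order
    for mol, name in results:
        old = smallest.get(mol)
        if old is not None and old <= name:
            continue  # already listed under an equal or smaller name
        if old is not None:
            names.pop(bisect.bisect_left(names, old))  # drop the larger name
        smallest[mol] = name
        bisect.insort(names, name)
        yield list(names)


# Invented sample stream; results arrive in no particular order.
stream = [
    ("ex:m1", "water"),
    ("ex:m2", "ethanol"),
    ("ex:m1", "dihydrogen monoxide"),
    ("ex:m3", "methane"),
]
for view in sorted_view(stream):
    print(view)
```

The user sees a usable, sorted list after the first result, and it keeps improving while the query is still running.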
  • 61. The algorithm remains the same
 when clients use one or multiple
 triple pattern fragment servers. Federation also becomes
 substantially easier. Avoid the unavailability cascade.
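A minimal sketch of why the algorithm is unchanged with several servers: from the client's point of view, a fragment over multiple servers is simply the union of the per-server fragments, and its count is the sum of their counts. The `FragmentServer` class, the function names and the `ex:` triples are hypothetical in-memory stand-ins, not a real API.

```python
from typing import Iterable

Triple = tuple[str, str, str]


def matches(pattern: Triple, triple: Triple) -> bool:
    """A pattern term starting with '?' is a variable and matches anything."""
    return all(p.startswith("?") or p == t for p, t in zip(pattern, triple))


class FragmentServer:
    """In-memory stand-in for one triple pattern fragment server."""

    def __init__(self, triples: Iterable[Triple]) -> None:
        self.triples = list(triples)

    def fragment(self, pattern: Triple) -> list[Triple]:
        return [t for t in self.triples if matches(pattern, t)]


def federated_fragment(servers: list[FragmentServer], pattern: Triple) -> list[Triple]:
    """Union of the fragments from every server; its length is the combined count."""
    return [t for server in servers for t in server.fragment(pattern)]


people = FragmentServer([("ex:alice", "ex:birthPlace", "ex:York")])
places = FragmentServer([("ex:York", "ex:name", '"York"@en')])
print(len(federated_fragment([people, places], ("?s", "?p", "?o"))))  # 2
```

A client-side query procedure that only ever asks for fragments and counts can use `federated_fragment` in place of a single-server request without any other change.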
  • 62. An optimal solution doesn’t exist.
 We should look at all APIs. (Diagram: data dump, dereferencing, triple pattern fragments, SPARQL endpoint.)
  • 63. Servers indicate what they do,
 enabling clients to query optimally. “This server supports triple patterns
 and full-text search on objects.” “This server supports SPARQL queries
 with up to 2 joins.” “This server supports Linked Data documents.”
  • 64. Querying data on the Web:
 client or server? The Semantic Web’s assumptions. Client-side query execution. New query opportunities.
  • 65. Different assumptions
 lead to different trade-offs. Live querying of public data is possible at low cost,
 but at slower speeds… …for now :-)
  • 66. Let your browser
 solve a SPARQL query:
 client.linkeddatafragments.org Ruben Verborgh Ghent University – iMinds