AnzoGraph DB - SPARQL 101

©Cambridge Semantics Inc.
Company Confidential
SPARQL 101
Thomas Cook
Sales Director, AnzoGraph DB
Thomas.Cook@cambridgesemantics.com

Agenda
▪ History lesson: What are the origins of SPARQL?
▪ Semantic Web
▪ Linked Open Data
▪ Knowledge Graphs
▪ What is RDF?
▪ What’s a URI?
▪ What is SPARQL?

Origins of The Semantic Web
“The Semantic Web is an extension of the current web in which
information is given well-defined meaning, better enabling
computers and people to work in cooperation.”
Tim Berners-Lee, James Hendler, Ora Lassila: The Semantic Web, Scientific American, 284(5), pp. 34-43 (2001)

The World's Web Standards Organization

Linked Open Data
https://lod-cloud.net/
Linked Data (name coined in 2006 by Tim Berners-Lee)
The LOD cloud contains 1,239 datasets with 16,147 links (as of March 2019)

DBpedia
https://wiki.dbpedia.org/about
DBpedia stores information drawn from 125 languages.
The DBpedia release consists of about 3 billion pieces of information (RDF triples):
▪ 580 million were extracted from the English edition of Wikipedia
▪ 2.46 billion were extracted from other language editions
DBpedia is an open knowledge graph (OKG) that is available to everyone on the Web.
A knowledge graph is a special kind of database that stores knowledge in a machine-readable form and
provides a means for information to be collected, organized, shared, searched and utilized.
Google uses a similar approach to create the knowledge cards shown in its search results.

DBpedia
http://mappings.dbpedia.org/server/ontology/classes/
The DBpedia Ontology currently contains about 4,233,000 instances.
The table below lists the number of instances for several classes within the ontology:
Instances per class
Class               Instances
Resource (overall)  4,233,000
Place                 735,000
Person              1,450,000
Work                  411,000
Species               251,000
Organisation          241,000

SPARQL Explorer
• http://dbpedia.org/snorql/?query=%23select+distinct+%3FConcept+where+%7B%5B%5D+a+%3FConcept%7D+LIMIT+100%0D%0A%0D%0A%23Select+distinct+*+where+%7B+%3Fs+a+foaf%3APerson+%7D+limit+100%0D%0A%0D%0ASelect+distinct+*+where+%7B+%3Fs+%3Fp+%3Fo+.%0D%0A%0D%0Afilter+%28regex%28%3Fs%2C+%22Pacino%22%2C+%22i%22%29%29%0D%0A%7D+limit+100%0D%0A%0D%0A+
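Decoded, the SPARQL Explorer URL carries the query text below. Lines starting with # are comments, so only the last query actually runs. The foaf: prefix is not declared in the text; this assumes the endpoint predefines it. Stricter SPARQL engines may also expect regex(str(?s), ...), because ?s is bound to URIs rather than strings.

```sparql
#select distinct ?Concept where {[] a ?Concept} LIMIT 100

#Select distinct * where { ?s a foaf:Person } limit 100

Select distinct * where { ?s ?p ?o .

filter (regex(?s, "Pacino", "i"))
} limit 100
```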

What is RDF?
RDF (Resource Description Framework) is the data model of the Semantic Web. That means
that all data in Semantic Web technologies is represented as RDF.
RDF's simple data model and ability to model disparate, abstract concepts have also led to its
increasing use in knowledge management applications unrelated to Semantic Web activity.
At the most atomic level, RDF is made of triples. A “triple” is a single fact:
Subject --Predicate--> Object
E.g. “The Sky is Blue”:
Sky --Color--> Blue
https://en.wikipedia.org/wiki/Resource_Description_Framework

RDF Graph
RDF is not like the tabular data model of relational databases. Nor is it like the
trees of the XML world. Instead, RDF is a graph.
It’s a labeled, directed graph.
Alice --drives--> Tesla
Alice --friend_of--> Bill
Alice --resident_of--> Austin

RDF Serializations – Turtle, N-Triples, RDFa, JSON-LD
<Alice> <drives> <Tesla> .
<Alice> <friend_of> <Bill> .
<Alice> <resident_of> <Austin> .
<Tesla> <color> "blue" .
The same graph, drawn as labeled edges:
Alice --drives--> Tesla
Alice --friend_of--> Bill
Alice --resident_of--> Austin
Tesla --color--> "blue"

3 Types of Nodes
Resource nodes: A resource is anything that can have things said about it. It’s easy to think of a
resource as a thing vs. a value. In a visual representation, resources are represented by ovals.
Literal nodes: The term literal is a fancy word for value. In a visual representation, literals are
represented by rectangles.
Blank nodes: A blank node is a resource that has no URI of its own.

URIs – Uniform Resource Identifiers
How can we uniquely ID resources universally? Add a URL to the start of your ID.
<Alice>
expressed as a full URI would look something more like:
<http://example.com/resource/person#Alice>
And
<drives>
would be more like:
<http://example.com/resource/person#drives>
<http://example.com/resource/person#Alice> <http://example.com/resource/person#drives> <http://example.com/resource/car#Tesla> .
<http://example.com/resource/person#Alice> <http://example.com/resource/person#friend_of> <http://example.com/resource/person#Bill> .
<http://example.com/resource/person#Alice> <http://example.com/resource/person#resident_of> <http://example.com/resource#Austin> .
<http://example.com/resource/car#Tesla> <http://example.com/resource#color> "blue" .

SPARQL PREFIX abbreviation
BEFORE:
<http://example.com/resource/person#Alice> <http://example.com/resource/person#drives> <http://example.com/resource/car#Tesla
<http://example.com/resource/person#Alice> <http://example.com/resource/person#friend_of> <http://example.com/resource/person
<http://example.com/resource/person#Alice> <http://example.com/resource/person#resident_of> <http://example.com/resource#Aus
<http://example.com/resource/car#Tesla> <http://example.com/resource#color> "blue" .
With PREFIX we can get a much shorter representation with abbreviations.
AFTER:
PREFIX tslap: <http://example.com/resource/person#>
PREFIX tslar: <http://example.com/resource#>
PREFIX tslac: <http://example.com/resource/car#>
tslap:Alice tslap:drives tslac:Tesla .
tslap:Alice tslap:friend_of tslap:Bill .
tslap:Alice tslap:resident_of tslar:Austin .
tslac:Tesla tslar:color "blue" .
The last line states the same fact as <Tesla> <color> "blue" . without URIs, but
now universally uniquely identified.
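Prefixes work the same way inside a query. The sketch below asks who drives the Tesla using prefixed names. The tslap: and tslac: prefixes and the example.com namespaces are illustrative assumptions, and the data is assumed to be loaded with these full URIs.

```sparql
# Illustrative namespaces only; substitute the ones your data actually uses.
PREFIX tslap: <http://example.com/resource/person#>
PREFIX tslac: <http://example.com/resource/car#>

# tslap:drives expands to <http://example.com/resource/person#drives>
SELECT ?person
WHERE {
  ?person tslap:drives tslac:Tesla .
}
```

Note that a SPARQL PREFIX declaration takes no trailing dot, unlike the triples that follow it.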

Common Prefixes
PREFIX  Short For:
rdf:    http://www.w3.org/1999/02/22-rdf-syntax-ns#
rdfs:   http://www.w3.org/2000/01/rdf-schema#
owl:    http://www.w3.org/2002/07/owl#
xsd:    http://www.w3.org/2001/XMLSchema#
dc:     http://purl.org/dc/elements/1.1/
foaf:   http://xmlns.com/foaf/0.1/
More common prefixes at http://prefix.cc
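As a rough illustration of these vocabularies in use, the query below lists people and their names. It assumes a dataset that describes people with FOAF (DBpedia is one such dataset); the variable names are arbitrary.

```sparql
PREFIX rdf:  <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX foaf: <http://xmlns.com/foaf/0.1/>

# rdf:type links a resource to its class; "a" is shorthand for it.
SELECT ?person ?name
WHERE {
  ?person rdf:type foaf:Person .
  ?person foaf:name ?name .
}
LIMIT 10
```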

What is SPARQL?
SPARQL stands for:
SPARQL Protocol And RDF Query Language
A query language and a protocol
A SPARQL QUERY:
SELECT …
FROM …
WHERE { … }
GROUP BY …
ORDER BY …
SELECT – identifies the values to return
FROM – selects the dataset to query
WHERE – gives the graph patterns to match
GROUP BY – groups the solutions for aggregation by the given variables
ORDER BY – orders the result set
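To make the skeleton concrete, here is one possible query that uses all five clauses. It counts how often each predicate is used, assuming a named graph called <test1> (an illustrative name) that already holds some triples.

```sparql
# Count the triples per predicate, most frequent first.
SELECT ?p (COUNT(*) AS ?uses)
FROM <test1>
WHERE {
  ?s ?p ?o .
}
GROUP BY ?p
ORDER BY DESC(?uses)
```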

Let’s INSERT some data
INSERT DATA { GRAPH <test1> {
<Alice> <drives> <Tesla> .
<Alice> <friend_of> <Bill> .
<Alice> <resident_of> <Austin> .
<Tesla> <color> "blue" .
}
}

Let’s count how many triples are in the graph
SELECT (count(*) as ?count)
FROM <test1>
WHERE {
?s ?p ?o .
}
RESULT:
count
--------
4

Show all the triples
SELECT ?s ?p ?o
FROM <test1>
WHERE {
?s ?p ?o .
}

Use graph patterns to find data you want
Who drives a Tesla?
SELECT ?s
FROM <test1>
WHERE {
?s <drives> <Tesla> .
}
RESULT:
s
-------
Alice
1 row
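A variable can stand in any position of a pattern, not only the subject. As a small sketch against the same <test1> graph, this query fixes the subject and asks for everything stated about Alice:

```sparql
# What do we know about Alice? ?p and ?o match every predicate/object pair.
SELECT ?p ?o
FROM <test1>
WHERE {
  <Alice> ?p ?o .
}
```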

Use graph patterns to match data in the graph
Who drives a blue car?
SELECT ?s
FROM <test1>
WHERE {
?s <drives> ?car .
?car <color> "blue" .
}
Join operation: the two patterns are joined on the shared variable ?car.
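Joins chain naturally: each extra pattern that reuses a variable adds another join. A sketch, assuming the <test1> graph also holds <friend_of> triples for the driver:

```sparql
# Who is a friend of someone who drives a blue car?
# ?car joins the first two patterns; ?s joins the driver to the friend.
SELECT ?s ?friend
FROM <test1>
WHERE {
  ?s <drives> ?car .
  ?car <color> "blue" .
  ?s <friend_of> ?friend .
}
```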


Every graph pattern in the WHERE clause must match
Who drives a car and what’s the color and year?
SELECT ?s ?color ?year
FROM <test1>
WHERE {
?s <drives> ?car .
?car <color> ?color .
?car <year> ?year .
}
RESULT? No results. Why? No <year> triple exists in our graph, so the third pattern never matches.

Use OPTIONAL for graph patterns that might not exist
SELECT ?s ?color ?year
FROM <test1>
WHERE {
?s <drives> ?car .
?car <color> ?color .
OPTIONAL { ?car <year> ?year . }
}
RESULT:
s | color | year
-------+-------+------
Alice | blue |
1 row
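OPTIONAL leaves a variable unbound when its pattern does not match, and a query can test for that. The sketch below combines OPTIONAL with the standard BOUND function to list only the cars that have no <year> recorded, using the same <test1> graph:

```sparql
# Which driven cars are missing a year?
SELECT ?s ?car
FROM <test1>
WHERE {
  ?s <drives> ?car .
  OPTIONAL { ?car <year> ?year . }
  FILTER (!BOUND(?year))   # keep only rows where the optional part found nothing
}
```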

SPARQL SELECT, like SQL, has several solution modifiers
• ORDER BY: This modifier sorts the result set in a particular order. It sorts query solutions on the
value of one or more variables.
• OFFSET: Using this modifier in conjunction with LIMIT and ORDER BY returns a slice of a sorted
solution set, for example, for paging.
• LIMIT: This modifier restricts the results to return a certain number of solutions.
• GROUP BY: This modifier is used with aggregate functions and specifies the key variables to use
to partition the solutions into groups. For information about AnzoGraph GROUP BY clause
extensions, see Advanced Grouping Sets.
• HAVING: This modifier is used with aggregate functions and further filters the results after
applying the aggregates.
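As a sketch of paging with these modifiers, the query below sorts all triples in <test1> and returns the second page of two rows. The page size of 2 is arbitrary.

```sparql
# Page 2 with a page size of 2: skip the first 2 sorted rows, return the next 2.
SELECT ?s ?p ?o
FROM <test1>
WHERE {
  ?s ?p ?o .
}
ORDER BY ?s ?p
LIMIT 2
OFFSET 2
```

Without ORDER BY the row order is not guaranteed, so paging with LIMIT and OFFSET alone may return overlapping or missing rows.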

Aggregate Functions
The built-in SPARQL aggregate functions:
AVG: Calculates the average value for a numeric expression.
COUNT: Counts the number of times the specified value is bound to the given
variable.
GROUP_CONCAT: Performs a string concatenation of all of the values that are
bound to the given variable.
MAX: Returns the maximum value from the specified set of values.
MIN: Returns the minimum value from the specified set of values.
SAMPLE: Returns an arbitrary value from the specified set of values.
SUM: Adds the specified values.
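A brief sketch that combines several of these pieces, again assuming the <test1> graph: for each subject, count its triples and list its predicates, keeping only subjects with more than one triple.

```sparql
SELECT ?s
       (COUNT(?o) AS ?n)
       (GROUP_CONCAT(STR(?p); separator=", ") AS ?predicates)
FROM <test1>
WHERE {
  ?s ?p ?o .
}
GROUP BY ?s                 # one group per subject
HAVING (COUNT(?o) > 1)      # filter applied after aggregation
```

STR converts each predicate URI to a plain string before GROUP_CONCAT joins them.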

Query Forms
There are four standard SPARQL query forms:
SELECT: Run SELECT queries when you want to find and return all of the data that
matches certain patterns.
CONSTRUCT: Run CONSTRUCT queries when you want to create or transform data
based on the existing data.
ASK: Run ASK queries when you want to know whether a certain pattern exists in the
data. ASK queries return only "true" or "false" to indicate whether a solution exists.
DESCRIBE: Run DESCRIBE queries when you want to view the RDF graph that
describes a particular resource.
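SELECT has been shown throughout; the other three forms could look like the sketches below. These are three separate queries, to be run one at a time, and they assume the <test1> graph with Alice's triples. What DESCRIBE returns is left to the engine.

```sparql
# ASK: does Alice drive a Tesla? Returns true or false.
ASK
FROM <test1>
WHERE { <Alice> <drives> <Tesla> . }

# CONSTRUCT: build new triples that state each friendship in the reverse direction.
CONSTRUCT { ?b <friend_of> ?a . }
FROM <test1>
WHERE { ?a <friend_of> ?b . }

# DESCRIBE: return an RDF graph describing Alice.
DESCRIBE <Alice>
FROM <test1>
```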

info.anzograph@CambridgeSemantics.com
www.anzograph.com
AnzoGraph.com
