SPARQL in the Semantic Web
Jan Carl Beeck Pepper
1
Agenda
• Definitions and Motivation
• SPARQL
2
Definitions and Motivation
• Statement (or triple):
– Small piece of knowledge (a single fact).
– It has Subject-Predicate-Object.
– Ex. Evidence A is_a physiotherapy evidence.
– Ex. <subj0> <pred0> <obj0>
• Subject (resource) and Object (value):
– Names for things in the world.
• Predicate (property):
– Name of a relation that connects two things.
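A statement maps naturally onto a three-field record. The sketch below is only an illustration of the subject-predicate-object shape; the `ex:` identifiers are hypothetical names modelled on the "Evidence A" example, not real IRIs.

```python
from typing import NamedTuple


class Triple(NamedTuple):
    """One statement: subject, predicate, object."""
    subject: str
    predicate: str
    obj: str


# Hypothetical identifiers, loosely following the "Evidence A" example.
fact = Triple("ex:EvidenceA", "ex:is_a", "ex:PhysiotherapyEvidence")


def to_text(triple: Triple) -> str:
    """Render a triple in the <subj> <pred> <obj> notation used above."""
    return f"<{triple.subject}> <{triple.predicate}> <{triple.obj}>"


print(to_text(fact))
```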
3
Definitions and Motivation
• Semantic Web:
– It is built on top of the current Web.
– Besides the HTML constructs, it contains some
“statements” that can be collected by an agent.
– The agent organizes and connects the statements
into a graph format (data integration).
– Automatic data integration on the Web is powerful
and greatly helps information discovery and
retrieval.
4
Definitions and Motivation
• Query language:
– The agent should be able to process common
queries submitted against the statements it has
collected. Without a query interface, the collected
statements would be of little use to us.
5
Definitions and Motivation
• Linked Data:
– A collection of machine-understandable
statements, published on their own rather than
tied to any particular Web site.
• Web of Data:
– A term used interchangeably with the Semantic Web.
6
Definitions and Motivation
• Resource description framework (RDF):
– The building block for the Semantic Web.
– Standard for encoding metadata.
• Metadata: describes the data contained on the Web.
• Machine understandable (also interoperable).
• Domain independent.
– Describes any resources, and the relations between
them, that exist in the real world.
– RDF is for the Semantic Web what HTML has been
for the Web.
7
Definitions and Motivation
• RDF Schema (RDFS):
– A common language, or vocabulary, in which
classes, sub-classes, properties, and the relations
between classes and properties are defined.
– Domain-specific.
– Allows the creation of distributed RDF documents.
8
Definitions and Motivation
• Web Ontology Language (OWL):
– Is the most popular language to use when creating
ontologies.
– Is built upon RDF Schema.
– Has the same purpose as RDF Schema.
• Classes, properties, and their relationships for a specific
application domain.
– Provides the capability to express much more
complex and richer relationships (better
expressiveness).
9
Definitions and Motivation
• Web Ontology Language (OWL):
– Axiom: basic statement (basic piece of
knowledge).
– A collection of axioms is an OWL Ontology.
– Protégé is a free OWL editor.
• IRI:
– Stands for Internationalized Resource Identifier
(like a URI, but may contain Unicode characters).
10
Definitions and Motivation
• Computer Ontology:
– Reflects the structure of the world.
– Is often about structure of the concepts.
– Each statement collected by an agent represents a
piece of knowledge. Therefore, there has to be a
way (a model) to represent knowledge on the
Web. Furthermore, this model of representing
knowledge has to be easily and readily processed
(understood) by machines.
11
Definitions and Motivation
• Computer Ontology:
– An application understands a given ontology when
it can parse the ontology and derive a list of axioms
from it, with all the facts expressed as RDF
statements.
12
Agenda
• Definitions and Motivation
• SPARQL
13
SPARQL
• SPARQL: Querying the Semantic Web.
– Pronounced “sparkle”.
– Stands for SPARQL Protocol and RDF Query
Language.
– Locates specific information on the machine-readable
Web.
– The Web can be viewed as a gigantic database.
14
SPARQL (cont)
• Related concepts:
– RDF data store: is a special database system built
for the storage and retrieval of RDF statements.
• Every record is a short statement in the form of
subject-predicate-object.
• Store RDF statements and retrieve them by using a
query language.
– Triple pattern: a triple in which any or all of the
subject, predicate, and object values can be variables.
• <http://danbri.org/foaf.rdf#danbri> foaf:name ?name.
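To make the idea of a triple pattern concrete, here is a minimal sketch of matching one pattern against a small in-memory list of triples. It is not how a real RDF data store works internally; the function name `match_triple_pattern` and the toy data (the friend and its name) are assumptions made for illustration, and prefixed names such as `foaf:name` are treated as plain strings.

```python
DANBRI = "http://danbri.org/foaf.rdf#danbri"

# Toy data: the friend and its name are made up for this example.
graph = [
    (DANBRI, "foaf:name", "Dan Brickley"),
    (DANBRI, "foaf:knows", "ex:friend1"),
    ("ex:friend1", "foaf:name", "Friend One"),
]


def match_triple_pattern(pattern, graph):
    """Return one {variable: value} binding per triple that fits the pattern."""
    results = []
    for triple in graph:
        binding = {}
        for p, t in zip(pattern, triple):
            if p.startswith("?"):
                # A variable matches any value but must stay consistent
                # if it occurs more than once in the pattern.
                if binding.setdefault(p, t) != t:
                    break
            elif p != t:
                # A constant must match exactly.
                break
        else:
            results.append(binding)
    return results


names = match_triple_pattern((DANBRI, "foaf:name", "?name"), graph)
print(names)
```

With the subject fixed, only Dan's own name is bound to `?name`; replacing the subject with a variable such as `?who` would return one binding per person in the toy graph.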
15
SPARQL (cont)
• Graph pattern: is used to select triples from a
given RDF graph.
– Is a collection of triple patterns.
– { ?who foaf:name ?name.
?who foaf:interest ?interest.
?who foaf:knows ?others. }
– Note: FOAF (Friend of a Friend) is an ontology (a
group of properties that describe a person), itself
published as a collection of RDF statements.
16
SPARQL (cont)
• SPARQL engine:
– Tries to match the triple patterns contained in the
graph pattern against the RDF graph, which is a
collection of triples.
– When a match succeeds, it binds the graph
pattern’s variables to the graph’s nodes; each such
set of variable bindings is called a query solution.
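The matching step can be sketched as a naive nested-loop join: each triple pattern extends the partial solutions produced by the previous ones. This is a teaching sketch under simplifying assumptions (prefixed names as plain strings, made-up `ex:` data, hypothetical function names), not the algorithm of any particular SPARQL engine.

```python
def match_pattern(pattern, triple, binding):
    """Extend `binding` so that `pattern` matches `triple`, or return None."""
    new = dict(binding)
    for p, t in zip(pattern, triple):
        if p.startswith("?"):
            if new.setdefault(p, t) != t:
                return None  # variable already bound to a different value
        elif p != t:
            return None  # constant does not match
    return new


def match_graph_pattern(patterns, graph):
    """Return every query solution for a list of triple patterns."""
    solutions = [{}]
    for pattern in patterns:
        extended = []
        for solution in solutions:
            for triple in graph:
                new = match_pattern(pattern, triple, solution)
                if new is not None:
                    extended.append(new)
        solutions = extended
    return solutions


# Made-up data: only ex:alice has a name, an interest and an acquaintance.
graph = [
    ("ex:alice", "foaf:name", "Alice"),
    ("ex:alice", "foaf:interest", "ex:rdf"),
    ("ex:alice", "foaf:knows", "ex:bob"),
    ("ex:bob", "foaf:name", "Bob"),
    ("ex:bob", "foaf:knows", "ex:alice"),
]

solutions = match_graph_pattern(
    [
        ("?who", "foaf:name", "?name"),
        ("?who", "foaf:interest", "?interest"),
        ("?who", "foaf:knows", "?others"),
    ],
    graph,
)
print(solutions)
```

Because `?who` is shared by all three patterns, `ex:bob` is dropped: he has no `foaf:interest` triple, so no single binding of `?who` satisfies the whole graph pattern for him.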
17
SPARQL (cont)
• SPARQL endpoint:
– Interface that users (humans or applications) can
access to query an RDF data store using the SPARQL
query language.
• Web-based application.
• Set of APIs that can be used by an agent.
– Ex. Joseki Web-based SPARQL.
• http://sparql.org/sparql.html
18
SPARQL (cont)
• SPARQL Query Language:
– SELECT query (most frequently used).
– ASK query.
– DESCRIBE query.
– CONSTRUCT query.
19
SPARQL (cont)
• Structure of a SELECT Query:
– # base directive
BASE <URI>
# list of prefixes
PREFIX pref: <URI>
...
# result description
SELECT...
# graph to search
FROM . . .
# query pattern
WHERE {
...
}
# query modifiers
ORDER BY...
20
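The clause order above can be illustrated by assembling a query string piece by piece. The helper `build_select` and the `example.org` graph are hypothetical, introduced only for this sketch; the FOAF namespace IRI is the standard one.

```python
def build_select(base, prefixes, variables, graph, patterns, order_by=None):
    """Assemble a SELECT query in the order: BASE, PREFIX, SELECT, FROM, WHERE, modifiers."""
    lines = [f"BASE <{base}>"]                       # base directive
    lines += [f"PREFIX {name}: <{iri}>"              # list of prefixes
              for name, iri in prefixes.items()]
    lines.append("SELECT " + " ".join(variables))    # result description
    lines.append(f"FROM <{graph}>")                  # graph to search
    lines.append("WHERE {")                          # query pattern
    lines += [f"  {pattern} ." for pattern in patterns]
    lines.append("}")
    if order_by:
        lines.append(f"ORDER BY {order_by}")         # query modifiers
    return "\n".join(lines)


query = build_select(
    base="http://example.org/data.rdf",  # hypothetical graph location
    prefixes={"foaf": "http://xmlns.com/foaf/0.1/"},
    variables=["?name"],
    graph="http://example.org/data.rdf",
    patterns=["?who foaf:name ?name"],
    order_by="?name",
)
print(query)
```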
SPARQL (cont)
• Ex. Find all the picture formats used by Dan Brickley’s friends (from graph
http://danbri.org/foaf.rdf, in which Dan is identified as #danbri).
21
SPARQL (cont)
• The query finds all the picture formats used by Dan Brickley's friends.
• BASE defines the source file (graph), whose link is: http...
• The first PREFIX defines the foaf ontology of persons, whose link is: http...
• The other PREFIX defines the dc ontology, used for the image format, whose link is: http...
• SELECT * returns every variable, read from the source graph.
• WHERE first matches the term danbri through the foaf:knows
attribute and stores each match in the variable ?friend.
• Then ?friend is linked to a picture through the attribute
foaf:depiction, stored in the variable ?picture.
• Finally ?picture is linked to the name of its image format through the
attribute dc:format, stored in the variable ?imageFormat.
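The query itself appears only as an image in the original deck, so what follows is a reconstruction from the walkthrough above, not the author's exact text. The namespace IRIs are the standard FOAF and Dublin Core ones, filled in here as an assumption; the toy triples and the small matcher are made up to show how the three patterns chain together.

```python
# Reconstructed query text (an assumption based on the walkthrough above).
QUERY = """
BASE <http://danbri.org/foaf.rdf>
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
PREFIX dc: <http://purl.org/dc/elements/1.1/>
SELECT *
FROM <http://danbri.org/foaf.rdf>
WHERE {
  <#danbri> foaf:knows ?friend .
  ?friend foaf:depiction ?picture .
  ?picture dc:format ?imageFormat .
}
"""

DANBRI = "http://danbri.org/foaf.rdf#danbri"

# Made-up data: friend2 has no picture, so it cannot produce a solution.
graph = [
    (DANBRI, "foaf:knows", "ex:friend1"),
    (DANBRI, "foaf:knows", "ex:friend2"),
    ("ex:friend1", "foaf:depiction", "ex:photo1.jpg"),
    ("ex:photo1.jpg", "dc:format", "image/jpeg"),
]


def match_pattern(pattern, triple, binding):
    """Extend `binding` so that `pattern` matches `triple`, or return None."""
    new = dict(binding)
    for p, t in zip(pattern, triple):
        if p.startswith("?"):
            if new.setdefault(p, t) != t:
                return None
        elif p != t:
            return None
    return new


def evaluate(patterns, graph):
    """Join the triple patterns in order, as the WHERE clause describes."""
    solutions = [{}]
    for pattern in patterns:
        solutions = [
            new
            for solution in solutions
            for triple in graph
            for new in [match_pattern(pattern, triple, solution)]
            if new is not None
        ]
    return solutions


solutions = evaluate(
    [
        (DANBRI, "foaf:knows", "?friend"),
        ("?friend", "foaf:depiction", "?picture"),
        ("?picture", "dc:format", "?imageFormat"),
    ],
    graph,
)
print(solutions)
```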
22
SPARQL (cont)
23
SPARQL (cont)
• OPTIONAL keyword: needed because an RDF
data graph is only a semi-structured data
model.
– I.e., two instances of the same class in a given
RDF graph may have different sets of property
instances.
– The example query finds all the people known by Dan
Brickley and shows their name, e-mail, and home
page, whenever any of this information is
available.
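OPTIONAL behaves like a left join: a solution is extended when the optional pattern matches and is kept unchanged when it does not. The sketch below shows that behaviour on made-up data; the helper names `extend` and `optional` and the two friends are assumptions for illustration.

```python
DANBRI = "http://danbri.org/foaf.rdf#danbri"

# Made-up data: each friend has a different set of properties.
graph = [
    (DANBRI, "foaf:knows", "ex:friend1"),
    (DANBRI, "foaf:knows", "ex:friend2"),
    ("ex:friend1", "foaf:name", "Friend One"),
    ("ex:friend1", "foaf:mbox", "mailto:one@example.org"),
    ("ex:friend2", "foaf:name", "Friend Two"),
    ("ex:friend2", "foaf:homepage", "http://example.org/two"),
]


def match_pattern(pattern, triple, binding):
    """Extend `binding` so that `pattern` matches `triple`, or return None."""
    new = dict(binding)
    for p, t in zip(pattern, triple):
        if p.startswith("?"):
            if new.setdefault(p, t) != t:
                return None
        elif p != t:
            return None
    return new


def extend(solutions, pattern, graph):
    """Required pattern: keep only solutions the pattern can extend."""
    out = []
    for solution in solutions:
        for triple in graph:
            new = match_pattern(pattern, triple, solution)
            if new is not None:
                out.append(new)
    return out


def optional(solutions, pattern, graph):
    """Optional pattern: extend when possible, otherwise keep the solution as is."""
    out = []
    for solution in solutions:
        matches = extend([solution], pattern, graph)
        out.extend(matches or [solution])
    return out


solutions = extend([{}], (DANBRI, "foaf:knows", "?person"), graph)
for prop, var in [("foaf:name", "?name"), ("foaf:mbox", "?email"), ("foaf:homepage", "?homepage")]:
    solutions = optional(solutions, ("?person", prop, var), graph)
print(solutions)
```

Both friends appear in the result even though neither has all three properties; variables for missing properties simply stay unbound.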
24
SPARQL (cont)
25
SPARQL (cont)
26
SPARQL (cont)
• Solution modifiers:
– DISTINCT: eliminates duplicate solutions from the
result.
– ORDER BY:
• ASC(): ascending.
• DESC(): descending.
– LIMIT: sets the maximum number of solutions.
– OFFSET: sets the number of solutions to be skipped.
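The modifiers operate on the list of solutions after pattern matching. A minimal sketch, assuming solutions are plain dictionaries and using hypothetical helper names; the modifiers are applied in the order ordering, duplicate removal, then offset and limit.

```python
# Made-up solutions for a single variable ?name.
solutions = [{"?name": n} for n in ["Carol", "Alice", "Bob", "Alice", "Dave"]]


def order_by(solutions, var, descending=False):
    """ORDER BY ASC(var) or DESC(var)."""
    return sorted(solutions, key=lambda s: s[var], reverse=descending)


def distinct(solutions):
    """DISTINCT: drop duplicate solutions, keeping the first occurrence."""
    seen, out = set(), []
    for s in solutions:
        key = tuple(sorted(s.items()))
        if key not in seen:
            seen.add(key)
            out.append(s)
    return out


def slice_solutions(solutions, offset=0, limit=None):
    """OFFSET skips solutions; LIMIT caps how many are returned."""
    end = None if limit is None else offset + limit
    return solutions[offset:end]


# Roughly: SELECT DISTINCT ?name ... ORDER BY ASC(?name) OFFSET 1 LIMIT 2
page = slice_solutions(distinct(order_by(solutions, "?name")), offset=1, limit=2)
print(page)
```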
27
SPARQL (cont)
• FILTER keyword: adds value constraints, built from
functions and operators, that solutions must satisfy.
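A filter can be sketched as a predicate applied to each solution. In this illustration (made-up data, hypothetical helper name) a solution in which the tested variable is unbound raises an error inside the condition and is dropped, which mirrors how a FILTER that evaluates to an error eliminates the solution.

```python
import re

# Made-up solutions; Carol has no ?age binding.
solutions = [
    {"?name": "Alice", "?age": 34},
    {"?name": "Bob", "?age": 17},
    {"?name": "Carol"},
]


def filter_solutions(solutions, condition):
    """Keep the solutions for which `condition` holds."""
    kept = []
    for s in solutions:
        try:
            if condition(s):
                kept.append(s)
        except KeyError:
            # Unbound variable: the condition cannot be evaluated,
            # so the solution is eliminated.
            pass
    return kept


# Roughly: FILTER(?age >= 18)
adults = filter_solutions(solutions, lambda s: s["?age"] >= 18)

# Roughly: FILTER regex(?name, "^b", "i")
b_names = filter_solutions(solutions, lambda s: re.search("^b", s["?name"], re.IGNORECASE))

print(adults, b_names)
```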
28
SPARQL (cont)
• UNION keyword:
– A query expressed by multiple alternative graph
patterns; a solution has to match at least one of
them, and every alternative that matches contributes
its solutions (alternative match).
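UNION can be sketched as evaluating each alternative separately and concatenating the results, so every solution comes from one branch. The data and helper names below are made up for illustration; note that a resource matching both branches appears once per branch.

```python
# Made-up data: alice has both an e-mail and a home page, bob only a home page.
graph = [
    ("ex:alice", "foaf:mbox", "mailto:alice@example.org"),
    ("ex:alice", "foaf:homepage", "http://example.org/alice"),
    ("ex:bob", "foaf:homepage", "http://example.org/bob"),
]


def match_triple_pattern(pattern, graph):
    """Return one {variable: value} binding per triple that fits the pattern."""
    results = []
    for triple in graph:
        binding = {}
        for p, t in zip(pattern, triple):
            if p.startswith("?"):
                if binding.setdefault(p, t) != t:
                    break
            elif p != t:
                break
        else:
            results.append(binding)
    return results


def union(*branches):
    """Concatenate the solutions of each alternative pattern."""
    combined = []
    for branch in branches:
        combined.extend(branch)
    return combined


# Roughly: { ?person foaf:mbox ?contact } UNION { ?person foaf:homepage ?contact }
contacts = union(
    match_triple_pattern(("?person", "foaf:mbox", "?contact"), graph),
    match_triple_pattern(("?person", "foaf:homepage", "?contact"), graph),
)
print(contacts)
```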
29
SPARQL (cont)
• Multiple graphs:
– SPARQL allows us to query any number of named
graphs.
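Querying several named graphs can be pictured as running the same pattern over each graph and recording which graph each solution came from, similar in spirit to a `GRAPH ?g { ... }` pattern. The graph IRIs, the data, and the helper names here are hypothetical.

```python
# A made-up dataset: graph IRI -> list of triples.
dataset = {
    "http://example.org/graphs/alice": [
        ("ex:alice", "foaf:name", "Alice"),
    ],
    "http://example.org/graphs/bob": [
        ("ex:bob", "foaf:name", "Bob"),
        ("ex:bob", "foaf:knows", "ex:alice"),
    ],
}


def match_triple_pattern(pattern, graph):
    """Return one {variable: value} binding per triple that fits the pattern."""
    results = []
    for triple in graph:
        binding = {}
        for p, t in zip(pattern, triple):
            if p.startswith("?"):
                if binding.setdefault(p, t) != t:
                    break
            elif p != t:
                break
        else:
            results.append(binding)
    return results


def match_in_named_graphs(pattern, dataset, graph_var="?g"):
    """Match the pattern in every named graph, binding graph_var to the graph IRI."""
    results = []
    for graph_iri, triples in dataset.items():
        for binding in match_triple_pattern(pattern, triples):
            results.append({graph_var: graph_iri, **binding})
    return results


named = match_in_named_graphs(("?who", "foaf:name", "?name"), dataset)
print(named)
```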
30
SPARQL (cont)
• Construct query:
– Returns a new RDF graph.
• Describe query:
– Returns an RDF graph whose statements are
determined by the query processor.
• Ask query:
– The query processor simply returns a true or false
value.
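Two of these query forms are easy to picture in terms of query solutions: ASK reports whether any solution exists, and CONSTRUCT fills a triple template once per solution. The sketch below assumes the solutions came from a pattern like `?a foaf:knows ?b`; the data, the `ex:isKnownBy` property, and the helper names are made up. DESCRIBE is left out because its result is chosen by the query processor.

```python
# Made-up solutions, as if produced by WHERE { ?a foaf:knows ?b }.
solutions = [
    {"?a": "ex:alice", "?b": "ex:bob"},
    {"?a": "ex:bob", "?b": "ex:carol"},
]


def ask(solutions):
    """ASK: true if the pattern has at least one solution."""
    return len(solutions) > 0


def construct(template, solutions):
    """CONSTRUCT: instantiate each template triple once per solution."""
    graph = []
    for solution in solutions:
        for pattern in template:
            triple = tuple(solution.get(term, term) for term in pattern)
            if any(term.startswith("?") for term in triple):
                continue  # a variable stayed unbound: skip this triple
            if triple not in graph:
                graph.append(triple)  # a graph holds each triple once
    return graph


new_graph = construct([("?b", "ex:isKnownBy", "?a")], solutions)
print(ask(solutions), new_graph)
```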
31
SPARQL (cont)
• Aggregate functions:
– COUNT
– SUM
– MIN/MAX
– AVG
– GROUP_CONCAT
– SAMPLE
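Aggregates are computed per group of solutions. A minimal sketch, assuming solutions grouped by one variable (as GROUP BY would do) and made-up data; `aggregate` is a hypothetical helper. SAMPLE may return any value from the group, and this sketch simply takes the first.

```python
from collections import defaultdict

# Made-up solutions: each row is one friend of ?person.
solutions = [
    {"?person": "ex:alice", "?name": "Bob", "?age": 30},
    {"?person": "ex:alice", "?name": "Carol", "?age": 40},
    {"?person": "ex:bob", "?name": "Alice", "?age": 20},
]


def aggregate(solutions, group_var):
    """Group solutions by group_var and compute each aggregate per group."""
    groups = defaultdict(list)
    for s in solutions:
        groups[s[group_var]].append(s)
    rows = {}
    for key, members in groups.items():
        ages = [m["?age"] for m in members]
        names = [m["?name"] for m in members]
        rows[key] = {
            "COUNT": len(members),
            "SUM": sum(ages),
            "MIN": min(ages),
            "MAX": max(ages),
            "AVG": sum(ages) / len(ages),
            "GROUP_CONCAT": " ".join(names),  # space is the default separator
            "SAMPLE": names[0],               # any member is a valid sample
        }
    return rows


rows = aggregate(solutions, "?person")
print(rows["ex:alice"])
```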
32
SPARQL (cont)
• Other operators and functions:
– NOT EXISTS
– MINUS
– CONCAT(): concatenates strings in query expressions.
– INSERT DATA
– DELETE DATA
– CREATE [SILENT] GRAPH <uri>
– DROP [SILENT] GRAPH <uri>
33
References
• Yu, L. (2011). A Developer’s Guide to the
Semantic Web. Springer. ISBN: 978-3-642-15969-5.
• Huang, J., Abadi, D. J., and Ren, K. (2011). Scalable
SPARQL Querying of Large RDF Graphs.
34
