This document discusses querying cultural heritage data stored as graphs using SPARQL. It provides examples of expressing single statements as triples and triple patterns, and using SPARQL to retrieve and count information. Exercises demonstrate querying for object owners and names, material types, and counting objects by material.
1. Querying
Cultural Heritage Data
Dr. Barry Norton,
Development Manager, ResearchSpace*
* Funded by the Andrew W. Mellon Foundation
* Hosted by the Curatorial Directorate, British Museum
2. Statements and Patterns
• For one edge in a graph:
bm-obj:EOC3130
crm:P52_has_current_owner
bm-id:the-british-museum
3. Statements and Patterns
• For one edge in a graph:
bm-obj:EOC3130
crm:P52_has_current_owner
bm-id:the-british-museum
• We can declare/retrieve one (N)Triple:
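<http://collection.britishmuseum.org/id/object/EOC3130> <http://erlangen-crm.org/current/P52_has_current_owner> <http://collection.britishmuseum.org/id/the-british-museum> .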
4. Statements and Patterns
• For one edge in a graph:
bm-obj:EOC3130
crm:P52_has_current_owner
bm-id:the-british-museum
• We can declare/retrieve one (N)Triple:
• Or write this in Turtle:
@prefix crm: <http://erlangen-crm.org/current/> .
@prefix bm-obj: <http://collection.britishmuseum.org/id/object/> .
@prefix bm-id: <http://collection.britishmuseum.org/id/> .
bm-obj:EOC3130 crm:P52_has_current_owner bm-id:the-british-museum .
5. Statements and Patterns
• For one edge in a graph:
bm-obj:EOC3130
crm:P52_has_current_owner
bm-id:the-british-museum
• We can write this in Turtle:
bm-obj:EOC3130 crm:P52_has_current_owner bm-id:the-british-museum .
• And check for it in SPARQL:
PREFIX crm: <http://erlangen-crm.org/current/>
PREFIX bm-obj: <http://collection.britishmuseum.org/id/object/>
PREFIX bm-id: <http://collection.britishmuseum.org/id/>
ASK {bm-obj:EOC3130 crm:P52_has_current_owner bm-id:the-british-museum}
true
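As a rough illustration of what ASK does, the sketch below treats a graph as a plain Python set of triples and tests for membership. The `graph` set and the `ask` helper are assumptions made for this example; a real triplestore works on full IRIs and indexes, not strings in a set.

```python
# Toy in-memory graph: a set of (subject, predicate, object) triples.
# Prefixed names are kept as plain strings; this is an illustration only.
graph = {
    ("bm-obj:EOC3130", "crm:P52_has_current_owner", "bm-id:the-british-museum"),
}

def ask(graph, s, p, o):
    """Return True if the exact triple is in the graph (hypothetical helper)."""
    return (s, p, o) in graph

result = ask(
    graph,
    "bm-obj:EOC3130",
    "crm:P52_has_current_owner",
    "bm-id:the-british-museum",
)
print(result)
```

An ASK query with no variables is just this kind of yes/no test for a single statement.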
6. Statements and Patterns
• For a set of edges:
bm-obj:EOC3130
bm-id:the-british-museum
crm:P51_has_former_or_current_owner
?
• We can do the work on the client
• Or have the server do it by turning the
triple into a triple pattern:
bm-obj:EOC3130 crm:P51_has_former_or_current_owner ?owner
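To make the idea of a triple pattern concrete, here is a minimal sketch of pattern matching over a toy graph. The `match` function is a hypothetical helper written for this example, and the two `ex:` owners are placeholders, not real British Museum data.

```python
# Toy graph of (subject, predicate, object) triples, prefixed names as strings.
# The "ex:" terms are hypothetical placeholders used only for illustration.
P51 = "crm:P51_has_former_or_current_owner"
graph = {
    ("bm-obj:EOC3130", P51, "bm-id:the-british-museum"),
    ("bm-obj:EOC3130", P51, "ex:owner-a"),
    ("bm-obj:EOC3130", P51, "ex:owner-b"),
    ("ex:another-object", P51, "ex:owner-a"),
}

def match(graph, pattern):
    """Yield one {variable: value} binding per triple that fits the pattern.

    Terms starting with '?' are variables; any other term must match exactly.
    """
    for triple in graph:
        binding = {}
        for term, value in zip(pattern, triple):
            if term.startswith("?"):
                # A repeated variable must bind to the same value each time.
                if binding.setdefault(term, value) != value:
                    break
            elif term != value:
                break
        else:
            yield binding

pattern = ("bm-obj:EOC3130", P51, "?owner")
owners = sorted(b["?owner"] for b in match(graph, pattern))
print(owners)
```

Doing "the work on the client" would mean downloading every triple and running a loop like this locally; sending the pattern to the server lets the triplestore do the matching instead.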
8. Solutions & Exercises
• Why is the answer different?
– Reasoning, carried out by the server
(a triplestore), means that if two things
are related by crm:P52_has_current_owner
then they are also related by
crm:P51_has_former_or_current_owner
• This is part of the work that the server
(triplestore) can do for you
• Exercise: query for the (strictly) former
owners… ?
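One possible way to think about this exercise is sketched below: first apply the sub-property rule described above, then subtract the current owners from the former-or-current owners. The `infer` and `objects` helpers and the `ex:` owners are assumptions for illustration. The query string shows one way the same idea might be written in SPARQL 1.1 using FILTER NOT EXISTS; it is a sketch, not necessarily the solution from the original slides.

```python
P51 = "crm:P51_has_former_or_current_owner"
P52 = "crm:P52_has_current_owner"
SUBPROPERTY_OF = {P52: P51}  # the rule described on the slide

# Asserted triples; the "ex:" owners are hypothetical placeholders.
asserted = {
    ("bm-obj:EOC3130", P52, "bm-id:the-british-museum"),
    ("bm-obj:EOC3130", P51, "ex:owner-a"),
    ("bm-obj:EOC3130", P51, "ex:owner-b"),
}

def infer(graph):
    """Add the super-property triple for every sub-property triple."""
    inferred = set(graph)
    for s, p, o in graph:
        if p in SUBPROPERTY_OF:
            inferred.add((s, SUBPROPERTY_OF[p], o))
    return inferred

def objects(graph, s, p):
    """Return all objects o such that (s, p, o) is in the graph."""
    return {o for (s2, p2, o) in graph if s2 == s and p2 == p}

graph = infer(asserted)
all_owners = objects(graph, "bm-obj:EOC3130", P51)
current = objects(graph, "bm-obj:EOC3130", P52)
former = sorted(all_owners - current)
print(former)

# A possible SPARQL 1.1 formulation of the same set difference.
FORMER_OWNERS_QUERY = """
PREFIX crm: <http://erlangen-crm.org/current/>
PREFIX bm-obj: <http://collection.britishmuseum.org/id/object/>
SELECT ?owner WHERE {
  bm-obj:EOC3130 crm:P51_has_former_or_current_owner ?owner .
  FILTER NOT EXISTS { bm-obj:EOC3130 crm:P52_has_current_owner ?owner }
}
"""
```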
11. Solutions & Exercises
Who are the two (other) one-time owners?
• Since people and institutions (and places) are
treated as concepts, the names of the former
owners are attached using skos:prefLabel
• Exercise: if you didn’t already, include the
names in your query results
12. Solutions & Exercises
If you didn’t already, include the names in
your query results:
Question:
Why are we back at two answers?
13. Answer
• Answer:
– Just as we can add triples together to make a
graph in RDF, so we can add triple patterns
together in SPARQL to make a graph pattern
– By default all triple patterns must be matched,
but we can use the OPTIONAL {} pattern to
allow variation
• Exercise:
– Query for the owners and their names, if they
exist*
* N.B. this bug in the BM data will be fixed soon
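The difference between a required and an OPTIONAL pattern can be sketched in Python as an inner join versus a left join. The data below is hypothetical: the slides do not say which owner lacks a skos:prefLabel, so the choice here is arbitrary, and the label text is a placeholder. The query string shows how OPTIONAL might be used; treat it as a sketch.

```python
# Hypothetical owners and labels; which owner lacks a label is an assumption.
owners = ["bm-id:the-british-museum", "ex:owner-a", "ex:owner-b"]
labels = {
    "bm-id:the-british-museum": "placeholder label 1",
    "ex:owner-a": "placeholder label 2",
}

# Both patterns required: owners without a label drop out of the results.
required = [(o, labels[o]) for o in owners if o in labels]

# Label pattern optional: every owner is kept, the name may be unbound (None).
optional = [(o, labels.get(o)) for o in owners]

print(len(required), len(optional))

OWNERS_WITH_OPTIONAL_NAMES = """
PREFIX crm: <http://erlangen-crm.org/current/>
PREFIX skos: <http://www.w3.org/2004/02/skos/core#>
PREFIX bm-obj: <http://collection.britishmuseum.org/id/object/>
SELECT ?owner ?name WHERE {
  bm-obj:EOC3130 crm:P51_has_former_or_current_owner ?owner .
  OPTIONAL { ?owner skos:prefLabel ?name }
}
"""
```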
16. CSV Exercise
• Type:
• Observe that one can now paste the query
including line breaks*
• Type:
* N.B. for now you should first replace the "s with 's and
change the one occurrence of ecrm: with crm: - we’ll fix this
* N.B. currently the query needs to be simplified as the BBC
data is not loaded – this will be available soon
17. Data Analysis
• One can import this CSV file into many
tools:
– A spreadsheet can be a good way to carry out
basic visualisations
– A scripting environment like (i)python/scipy or
R can allow more analysis before
visualisation, but:
• both languages also have libraries to encapsulate
interaction via SPARQL (rdflib/sparqlwrapper and
SPARQL/RCurl respectively)
• one should decide whether more analysis should
first be carried out using SPARQL…
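As a small example of the scripting route, the sketch below reads query results in CSV form using only Python's standard library. The column names (`object`, `material`) and the rows are hypothetical; the real export will have whatever variables the query selects.

```python
import csv
import io

# Hypothetical CSV export of query results; real column names will differ.
CSV_TEXT = """object,material
ex:object-1,basalt
ex:object-2,wood
ex:object-3,basalt
"""

# DictReader uses the header row as keys, one dict per result row.
rows = list(csv.DictReader(io.StringIO(CSV_TEXT)))

# A simple filter that could equally have been done in SPARQL.
basalt = [row["object"] for row in rows if row["material"] == "basalt"]
print(basalt)
```

For a file on disk, `open(path, newline="")` would take the place of `io.StringIO`.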
18. Exercise
• If you haven’t so far, click on one of the
(HotW) 100 Objects (such as number 70,
Hoa Hakananai'a Easter Island Statue)
having run the main query
• Choose a material and observe the query
for other objects in this material
• Adapt this query to count how many BM
objects are made from basalt
19. Solution & Exercise
• Exercise: Now count the ‘top ten’ materials
and the number of objects for each
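A 'top ten' count is a group, count, sort and limit. The sketch below shows that shape in Python over hypothetical (object, material) pairs, followed by one way it might be written as a SPARQL 1.1 aggregate query. The use of crm:P45_consists_of for material is an assumption about how the data is modelled; check the query shown by the interface for the property actually used.

```python
from collections import Counter

# Hypothetical (object, material) pairs standing in for query results.
pairs = [
    ("ex:object-1", "basalt"),
    ("ex:object-2", "wood"),
    ("ex:object-3", "basalt"),
    ("ex:object-4", "bronze"),
    ("ex:object-5", "basalt"),
    ("ex:object-6", "wood"),
]

counts = Counter(material for _, material in pairs)
basalt_count = counts["basalt"]   # count for a single material
top_ten = counts.most_common(10)  # highest counts first, at most ten entries
print(basalt_count, top_ten)

# A possible SPARQL 1.1 equivalent; the material property is an assumption.
TOP_TEN_MATERIALS_QUERY = """
PREFIX crm: <http://erlangen-crm.org/current/>
SELECT ?material (COUNT(?object) AS ?n) WHERE {
  ?object crm:P45_consists_of ?material
}
GROUP BY ?material
ORDER BY DESC(?n)
LIMIT 10
"""
```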
21. A Last Word
• SPARQLing a ‘native RDF’ database
(often called a ‘triplestore’) is not the only
option before defaulting to programming
• A ‘native graph’ database indexes the
graph in a different way, supporting
traversal-oriented queries