Querying Cultural Heritage

SPARQL queries for cultural heritage data in the CIDOC-CRM ontology with British Museum examples and exercises

Published in: Internet

  1. 1. Querying Cultural Heritage Data. Dr. Barry Norton, Development Manager, ResearchSpace*
     * Funded by the Andrew W. Mellon Foundation
     * Hosted by the Curatorial Directorate, British Museum
  2. 2. Statements and Patterns
     • For one edge in a graph:
         bm-obj:EOC3130 crm:P52_has_current_owner bm-id:the-british-museum
  3. 3. Statements and Patterns
     • For one edge in a graph:
         bm-obj:EOC3130 crm:P52_has_current_owner bm-id:the-british-museum
     • We can declare/retrieve one (N)Triple:
  4. 4. Statements and Patterns
     • For one edge in a graph:
         bm-obj:EOC3130 crm:P52_has_current_owner bm-id:the-british-museum
     • We can declare/retrieve one (N)Triple:
     • Or write this in Turtle:
         @prefix crm: <http://erlangen-crm.org/current/> .
         @prefix bm-obj: <http://collection.britishmuseum.org/id/object/> .
         @prefix bm-id: <http://collection.britishmuseum.org/id/> .
         bm-obj:EOC3130 crm:P52_has_current_owner bm-id:the-british-museum .
  5. 5. Statements and Patterns
     • For one edge in a graph:
         bm-obj:EOC3130 crm:P52_has_current_owner bm-id:the-british-museum
     • We can write this in Turtle:
         bm-obj:EOC3130 crm:P52_has_current_owner bm-id:the-british-museum .
     • And check for it in SPARQL:
         PREFIX crm: <http://erlangen-crm.org/current/>
         PREFIX bm-obj: <http://collection.britishmuseum.org/id/object/>
         PREFIX bm-id: <http://collection.britishmuseum.org/id/>
         ASK {bm-obj:EOC3130 crm:P52_has_current_owner bm-id:the-british-museum}
       Result: true
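The prefix expansion and the ASK check above can be mimicked in a few lines of Python. This is only a sketch using the standard library: the `expand`, `to_ntriple` and `ask` helpers and the in-memory `graph` set are illustrative names, not part of any SPARQL library, and a real triplestore does far more.

```python
# Prefixes taken from the Turtle on the slide.
PREFIXES = {
    "crm": "http://erlangen-crm.org/current/",
    "bm-obj": "http://collection.britishmuseum.org/id/object/",
    "bm-id": "http://collection.britishmuseum.org/id/",
}

def expand(qname):
    """Turn a prefixed name such as 'crm:P52_has_current_owner' into a full IRI."""
    prefix, local = qname.split(":", 1)
    return PREFIXES[prefix] + local

def to_ntriple(s, p, o):
    """Serialise one triple of prefixed names as a single N-Triples line."""
    return " ".join("<%s>" % expand(term) for term in (s, p, o)) + " ."

# A graph is modelled here as a plain set of (subject, predicate, object) tuples.
graph = {("bm-obj:EOC3130", "crm:P52_has_current_owner", "bm-id:the-british-museum")}

def ask(s, p, o):
    """Hypothetical stand-in for SPARQL ASK: is this exact triple in the graph?"""
    return (s, p, o) in graph

print(to_ntriple("bm-obj:EOC3130", "crm:P52_has_current_owner", "bm-id:the-british-museum"))
print(ask("bm-obj:EOC3130", "crm:P52_has_current_owner", "bm-id:the-british-museum"))
```

The first print shows the same statement with every prefixed name written out as a full IRI, which is the form N-Triples requires.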
  6. 6. Statements and Patterns
     • For a set of edges (alongside the known owner bm-id:the-british-museum):
         bm-obj:EOC3130 crm:P51_has_former_or_current_owner ?
     • We can do the work on the client:
     • Or have the server do it by turning the triple into a triple pattern:
         bm-obj:EOC3130 crm:P51_has_former_or_current_owner ?owner
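To make the idea of a triple pattern concrete, here is a minimal sketch of how a pattern with a variable could be matched against a set of triples. The `match` function is a hypothetical helper, and the two `ex:` owners are placeholders rather than real British Museum identifiers.

```python
def match(pattern, graph):
    """Yield one dict of variable bindings per triple that fits the pattern.

    Terms starting with '?' are variables; any other term must match exactly.
    """
    for triple in graph:
        bindings = {}
        for want, have in zip(pattern, triple):
            if want.startswith("?"):
                # A variable used twice must be bound to the same value both times.
                if bindings.setdefault(want, have) != have:
                    break
            elif want != have:
                break
        else:
            yield bindings

# Hypothetical data: 'ex:owner-a' and 'ex:owner-b' are invented for illustration.
graph = {
    ("bm-obj:EOC3130", "crm:P51_has_former_or_current_owner", "bm-id:the-british-museum"),
    ("bm-obj:EOC3130", "crm:P51_has_former_or_current_owner", "ex:owner-a"),
    ("bm-obj:EOC3130", "crm:P51_has_former_or_current_owner", "ex:owner-b"),
    ("bm-obj:EOC3130", "crm:P52_has_current_owner", "bm-id:the-british-museum"),
}

pattern = ("bm-obj:EOC3130", "crm:P51_has_former_or_current_owner", "?owner")
owners = sorted(b["?owner"] for b in match(pattern, graph))
print(owners)
```

Each binding plays the role of one row in a SPARQL SELECT result; the server does this matching so the client does not have to fetch and filter every triple itself.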
  7. 7. Exercise
     Questions:
     • Why is the answer different?
     • Who are the two (other) one-time owners?
  8. 8. Solutions & Exercises
     • Why is the answer different?
       – Reasoning, part of the work done by the server (being a triplestore), means that if two things are related by crm:P52_has_current_owner then they’re also related by crm:P51_has_former_or_current_owner
     • This is part of the work that the server (triplestore) can do for you
     • Exercise: query for the (strictly) former owners…
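The reasoning step described above can be sketched as a tiny sub-property closure. This is an illustration only: the `infer` function and the `SUBPROPERTY_OF` table are hypothetical names, and a real triplestore applies a much larger rule set.

```python
# The one rule used on the slide: current owner implies former-or-current owner.
SUBPROPERTY_OF = {"crm:P52_has_current_owner": "crm:P51_has_former_or_current_owner"}

def infer(graph):
    """Return the graph plus every triple implied by SUBPROPERTY_OF."""
    closed = set(graph)
    frontier = list(graph)
    while frontier:
        s, p, o = frontier.pop()
        parent = SUBPROPERTY_OF.get(p)
        if parent and (s, parent, o) not in closed:
            closed.add((s, parent, o))
            frontier.append((s, parent, o))  # the parent may itself have a parent
    return closed

explicit = {("bm-obj:EOC3130", "crm:P52_has_current_owner", "bm-id:the-british-museum")}
inferred = infer(explicit)
print(("bm-obj:EOC3130", "crm:P51_has_former_or_current_owner", "bm-id:the-british-museum") in inferred)
```

Only the P52 statement was asserted, yet a query on P51 also finds the British Museum, which is why the two answers differ.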
  9. 9. Solution 1/2 • Using specific server functionality:
  10. 10. Solution 2/2 • In pure SPARQL:
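One way to express "strictly former" in pure SPARQL 1.1 is to exclude owners that are also current owners with FILTER NOT EXISTS. The slide's own solution is not reproduced in this transcript, so the query string below is an assumption about its shape, and the Python underneath only simulates the same set difference on hypothetical data (the `ex:` owners are placeholders).

```python
# A possible query (an assumption, not copied from the slide).
FORMER_OWNERS_QUERY = """
PREFIX crm: <http://erlangen-crm.org/current/>
PREFIX bm-obj: <http://collection.britishmuseum.org/id/object/>
SELECT ?owner WHERE {
  bm-obj:EOC3130 crm:P51_has_former_or_current_owner ?owner .
  FILTER NOT EXISTS { bm-obj:EOC3130 crm:P52_has_current_owner ?owner }
}
"""

# Hypothetical answers to the two underlying triple patterns.
former_or_current = {"bm-id:the-british-museum", "ex:owner-a", "ex:owner-b"}
current = {"bm-id:the-british-museum"}

# FILTER NOT EXISTS amounts to a set difference in this simple case.
former_only = sorted(former_or_current - current)
print(former_only)
```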
  11. 11. Solutions & Exercises
      Who are the two (other) one-time owners?
      • Since people and institutions (and places) are treated as concepts, the names of the former owners are attached using skos:prefLabel
      • Exercise: if you didn’t already, include the names in your query results
  12. 12. Solutions & Exercises
      If you didn’t already, include the names in your query results:
      Question: Why are we back at two answers?
  13. 13. Answer
      • Answer:
        – Just as we can add triples together to make a graph in RDF, so we can add triple patterns together in SPARQL to make a graph pattern
        – By default all triple patterns must be matched, but we can use the OPTIONAL {} pattern to allow variation
      • Exercise:
        – Query for the owners and their names, if they exist*
      * N.B. this bug in the BM data will be fixed soon
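The difference between a required pattern and an OPTIONAL {} pattern can be sketched as an inner join versus a left join. The data here is hypothetical: the `ex:` owners and all label strings are invented, with one owner deliberately left without a skos:prefLabel.

```python
# Hypothetical owners and labels; 'ex:owner-b' has no label on purpose.
owners = ["bm-id:the-british-museum", "ex:owner-a", "ex:owner-b"]
pref_label = {
    "bm-id:the-british-museum": "British Museum",
    "ex:owner-a": "Owner A",
}

def required_join(owners, labels):
    """Both patterns must match, so owners without a label drop out."""
    return [(o, labels[o]) for o in owners if o in labels]

def optional_join(owners, labels):
    """Label pattern wrapped in OPTIONAL {}: every owner is kept, label may be None."""
    return [(o, labels.get(o)) for o in owners]

print(len(required_join(owners, pref_label)))
print(len(optional_join(owners, pref_label)))
```

With the required join one owner disappears from the results; with the optional join all owners are returned and the missing name is simply left unbound.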
  14. 14. Solution
  15. 15. Exercise • Take a look here: • Exercise: copy and run this query
  16. 16. CSV Exercise
      • Type:
      • Observe that one can now paste the query including line breaks*
      • Type:
      * N.B. for now you should first replace the "s with 's and change the one occurrence of ecrm: with crm: (we’ll fix this)
      * N.B. currently the query needs to be simplified as the BBC data is not loaded; this will be available soon
  17. 17. Data Analysis
      • One can import this CSV file into many tools:
        – A spreadsheet can be a good way to carry out basic visualisations
        – A scripting environment like (i)python/scipy or R can allow more analysis before visualisation, but:
          • both languages also have libraries to encapsulate interaction via SPARQL (rdflib/sparqlwrapper and SPARQL/RCurl respectively)
          • one should decide whether more analysis should first be carried out using SPARQL…
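As a small illustration of the scripting route, the sketch below reads SPARQL results in CSV form with Python's standard `csv` module. The CSV text, its column names and the `ex:` identifiers are hypothetical; in practice one would open the file saved in the previous exercise instead of using an inline string.

```python
import csv
import io

# Hypothetical results: a header row of variable names, then one row per solution.
CSV_TEXT = """object,material
ex:obj1,basalt
ex:obj2,wood
ex:obj3,basalt
"""

# DictReader maps each row to {variable name: value}.
rows = list(csv.DictReader(io.StringIO(CSV_TEXT)))

# A filter done on the client; the same restriction could be pushed into the query.
basalt_objects = [row["object"] for row in rows if row["material"] == "basalt"]
print(basalt_objects)
```

The filter at the end is exactly the kind of step where one should ask whether SPARQL could have done the work before the data ever reached the script.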
  18. 18. Exercise
      • If you haven’t so far, click on one of the (HotW) 100 Objects (such as number 70, Hoa Hakananai'a Easter Island Statue) having run the main query
      • Choose a material and observe the query for other objects in this material
      • Adapt this query to count how many BM objects are made from basalt
  19. 19. Solution & Exercise • Exercise: Now count the ‘top ten’ materials and the number of objects for each
  20. 20. Solution
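The slide's solution is not included in this transcript. A 'top ten' count is typically written with GROUP BY, ORDER BY and LIMIT; the query string below is a sketch that assumes materials are attached with crm:P45_consists_of, and the Python underneath reproduces the same aggregation on hypothetical (object, material) pairs.

```python
from collections import Counter

# A possible query shape (an assumption, not the slide's own solution).
TOP_MATERIALS_QUERY = """
PREFIX crm: <http://erlangen-crm.org/current/>
SELECT ?material (COUNT(?object) AS ?objects) WHERE {
  ?object crm:P45_consists_of ?material .
}
GROUP BY ?material
ORDER BY DESC(?objects)
LIMIT 10
"""

# Hypothetical data standing in for the matched triples.
pairs = [
    ("ex:obj1", "ex:basalt"), ("ex:obj2", "ex:basalt"), ("ex:obj3", "ex:basalt"),
    ("ex:obj4", "ex:wood"), ("ex:obj5", "ex:wood"),
    ("ex:obj6", "ex:gold"),
]

def top_materials(pairs, n=10):
    """Group by material, count objects, sort descending, keep the first n."""
    return Counter(material for _obj, material in pairs).most_common(n)

print(top_materials(pairs))
```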
  21. 21. A Last Word
      • SPARQLing a ‘native RDF’ database (often called a ‘triplestore’) is not the only option before defaulting to programming
      • A ‘native graph’ database indexes the graph in a different way, supporting traversal-oriented queries
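To give a feel for what a traversal-oriented query does, here is a minimal breadth-first walk over an adjacency index. It is a sketch in plain Python, not the API of any graph database; the `reachable` helper and the `ex:` predicates and places are hypothetical.

```python
from collections import deque

def reachable(start, edges, max_hops):
    """Breadth-first traversal: map each node within max_hops of start to its distance."""
    adjacency = {}
    for s, _p, o in edges:
        adjacency.setdefault(s, []).append(o)  # index edges by their source node
    seen = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if seen[node] == max_hops:
            continue  # do not expand beyond the hop limit
        for nxt in adjacency.get(node, []):
            if nxt not in seen:
                seen[nxt] = seen[node] + 1
                queue.append(nxt)
    return seen

# Hypothetical edges; only the first one comes from the slides.
edges = [
    ("bm-obj:EOC3130", "crm:P52_has_current_owner", "bm-id:the-british-museum"),
    ("bm-id:the-british-museum", "ex:located_in", "ex:london"),
    ("ex:london", "ex:part_of", "ex:england"),
]

print(reachable("bm-obj:EOC3130", edges, max_hops=2))
```

Following edges hop by hop like this is the access pattern a 'native graph' index is built around, whereas a triplestore is organised around matching triple patterns.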
  22. 22. Exercise Double click
  23. 23. Exercise Double click
