Mon norton tut_querying cultural heritage data

  1. 1. Querying Cultural Heritage Data. Dr. Barry Norton, Development Manager, ResearchSpace (funded by the Andrew W. Mellon Foundation; hosted by the Curatorial Directorate, British Museum)
  2. 2. Statements and Patterns • For one edge in a graph: bm-obj:EOC3130 --crm:P52_has_current_owner--> bm-id:the-british-museum
  3. 3. Statements and Patterns • For one edge in a graph: bm-obj:EOC3130 --crm:P52_has_current_owner--> bm-id:the-british-museum • We can declare/retrieve one (N)Triple:
  4. 4. Statements and Patterns • For one edge in a graph: bm-obj:EOC3130 --crm:P52_has_current_owner--> bm-id:the-british-museum • We can declare/retrieve one (N)Triple: <http://collection.britishmuseum.org/id/object/EOC3130> <http://erlangen-crm.org/current/P52_has_current_owner> <http://collection.britishmuseum.org/id/the-british-museum> . • Or write this in Turtle: @prefix crm: <http://erlangen-crm.org/current/> . @prefix bm-obj: <http://collection.britishmuseum.org/id/object/> . @prefix bm-id: <http://collection.britishmuseum.org/id/> . bm-obj:EOC3130 crm:P52_has_current_owner bm-id:the-british-museum .
  5. 5. Statements and Patterns • For one edge in a graph: bm-obj:EOC3130 --crm:P52_has_current_owner--> bm-id:the-british-museum • We can write this in Turtle: bm-obj:EOC3130 crm:P52_has_current_owner bm-id:the-british-museum . • And check for it in SPARQL: PREFIX crm: <http://erlangen-crm.org/current/> PREFIX bm-obj: <http://collection.britishmuseum.org/id/object/> PREFIX bm-id: <http://collection.britishmuseum.org/id/> ASK {bm-obj:EOC3130 crm:P52_has_current_owner bm-id:the-british-museum} • Result: true
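
A minimal Python sketch of how the ASK check on this slide might be sent from a script, using only the standard library. The endpoint address is an assumption for illustration and may not be live, and the helper names `build_query_url` and `parse_ask` are hypothetical. No network call is made here; the response is a canned document in the SPARQL JSON results shape.

```python
import json
from urllib.parse import urlencode, urlsplit, parse_qs

# Assumption: illustrative endpoint address; it may not be live.
ENDPOINT = "http://collection.britishmuseum.org/sparql"

ASK_QUERY = """\
PREFIX crm: <http://erlangen-crm.org/current/>
PREFIX bm-obj: <http://collection.britishmuseum.org/id/object/>
PREFIX bm-id: <http://collection.britishmuseum.org/id/>
ASK { bm-obj:EOC3130 crm:P52_has_current_owner bm-id:the-british-museum }
"""


def build_query_url(endpoint, query):
    """Return a SPARQL-protocol GET URL carrying the query."""
    return endpoint + "?" + urlencode({"query": query})


def parse_ask(body):
    """Extract the boolean from a SPARQL JSON results document."""
    return bool(json.loads(body)["boolean"])


url = build_query_url(ENDPOINT, ASK_QUERY)
# Round-trip check: the query survives URL encoding unchanged.
assert parse_qs(urlsplit(url).query)["query"] == [ASK_QUERY]

# Canned response of the shape an endpoint returns for an ASK query.
print(parse_ask('{"head": {}, "boolean": true}'))
```

To run it against a real server, one could pass `url` to `urllib.request.urlopen` with an `Accept: application/sparql-results+json` header and feed the response body to `parse_ask`.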
  6. 6. Statements and Patterns • For a set of edges: bm-obj:EOC3130 --crm:P51_has_former_or_current_owner--> bm-id:the-british-museum, ?, ? • We can do the work on the client • Or have the server do it by turning the triple into a triple pattern: bm-obj:EOC3130 crm:P51_has_former_or_current_owner ?owner
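
To make the idea of a triple pattern concrete, here is a toy sketch of what "doing the work on the client" amounts to: filtering a list of triples against a pattern in which `None` stands in for a variable such as `?owner`. The `match` helper is hypothetical, and the two `ex:` owners are placeholders, not the real former owners.

```python
def match(pattern, triples):
    """Yield every triple that agrees with the pattern's fixed terms.

    None in a pattern position plays the role of a SPARQL variable.
    """
    for triple in triples:
        if all(p is None or p == t for p, t in zip(pattern, triple)):
            yield triple


OBJ = "bm-obj:EOC3130"
P51 = "crm:P51_has_former_or_current_owner"
P52 = "crm:P52_has_current_owner"

# Toy graph. The two ex: owners are hypothetical placeholders.
triples = [
    (OBJ, P52, "bm-id:the-british-museum"),
    (OBJ, P51, "bm-id:the-british-museum"),
    (OBJ, P51, "ex:former-owner-a"),
    (OBJ, P51, "ex:former-owner-b"),
]

# Equivalent in spirit to: bm-obj:EOC3130 crm:P51_has_former_or_current_owner ?owner
owners = [o for _, _, o in match((OBJ, P51, None), triples)]
print(owners)
```

A server evaluating the pattern does the same filtering over its indexes, so the client receives only the matching bindings instead of the whole graph.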
  7. 7. Exercise. Questions: • Why is the answer different? • Who are the two (other) one-time owners?
  8. 8. Solutions & Exercises • Why is the answer different? – Reasoning: the server (a triplestore) infers that if two things are related by crm:P52_has_current_owner then they are also related by crm:P51_has_former_or_current_owner • This is part of the work that the server (triplestore) can do for you • Exercise: query for the (strictly) former owners…
  9. 9. Solution 1/2 • Using specific server functionality:
  10. 10. Solution 2/2 • In pure SPARQL:
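
The query shown on this slide did not survive in the transcript. One way to express "strictly former" in pure SPARQL 1.1, offered as a sketch rather than as the original solution, is to keep owners matched by the P51 pattern and exclude those also matched by the P52 pattern with `FILTER NOT EXISTS`. The Python below carries that query as a string and parses a canned SELECT response; the `column` helper is hypothetical and the owner IRIs in the sample are placeholders.

```python
import json

FORMER_OWNERS_QUERY = """\
PREFIX crm: <http://erlangen-crm.org/current/>
PREFIX bm-obj: <http://collection.britishmuseum.org/id/object/>
SELECT ?owner WHERE {
  bm-obj:EOC3130 crm:P51_has_former_or_current_owner ?owner .
  FILTER NOT EXISTS { bm-obj:EOC3130 crm:P52_has_current_owner ?owner }
}
"""


def column(body, var):
    """Return the values bound to one variable in SPARQL JSON results."""
    bindings = json.loads(body)["results"]["bindings"]
    return [row[var]["value"] for row in bindings if var in row]


# Canned response; the owner IRIs are hypothetical placeholders.
sample = json.dumps({
    "head": {"vars": ["owner"]},
    "results": {"bindings": [
        {"owner": {"type": "uri", "value": "http://example.org/former-owner-a"}},
        {"owner": {"type": "uri", "value": "http://example.org/former-owner-b"}},
    ]},
})
print(column(sample, "owner"))
```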
  11. 11. Solutions & Exercises Who are the two (other) one-time owners? • Since people and institutions (and places) are treated as concepts, the names of the former owners are attached using skos:prefLabel • Exercise: if you didn’t already, include the names in your query results
  12. 12. Solutions & Exercises If you didn’t already, include the names in your query results: Question: Why are we back at two answers?
  13. 13. Answer • Just as we can add triples together to make a graph in RDF, so we can add triple patterns together in SPARQL to make a graph pattern • By default all triple patterns must be matched, but we can wrap a pattern in OPTIONAL {} so that a solution is kept even when that part does not match • Exercise: query for the owners and their names, if they exist* * N.B. this bug in the BM data will be fixed soon
  14. 14. Solution
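
The solution query itself is missing from the transcript. A plausible sketch, assuming the names hang off each owner via skos:prefLabel as described earlier, wraps the label pattern in `OPTIONAL` so that owners without a label still appear. In SPARQL JSON results an unbound variable is simply absent from the binding, so the hypothetical `rows` helper below fills in `None`. The sample owners and name are placeholders.

```python
import json

OWNERS_WITH_NAMES_QUERY = """\
PREFIX crm: <http://erlangen-crm.org/current/>
PREFIX skos: <http://www.w3.org/2004/02/skos/core#>
PREFIX bm-obj: <http://collection.britishmuseum.org/id/object/>
SELECT ?owner ?name WHERE {
  bm-obj:EOC3130 crm:P51_has_former_or_current_owner ?owner .
  OPTIONAL { ?owner skos:prefLabel ?name }
}
"""


def rows(body):
    """Turn SPARQL JSON results into dicts, using None for unbound variables."""
    doc = json.loads(body)
    variables = doc["head"]["vars"]
    return [
        {v: (b[v]["value"] if v in b else None) for v in variables}
        for b in doc["results"]["bindings"]
    ]


# Canned response; IRIs and the name are hypothetical placeholders.
sample = json.dumps({
    "head": {"vars": ["owner", "name"]},
    "results": {"bindings": [
        {"owner": {"type": "uri", "value": "http://example.org/owner-with-label"},
         "name": {"type": "literal", "value": "Placeholder Name"}},
        {"owner": {"type": "uri", "value": "http://example.org/owner-without-label"}},
    ]},
})
for row in rows(sample):
    print(row["owner"], row["name"])
```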
  15. 15. Exercise • Take a look here: • Exercise: copy and run this query
  16. 16. CSV Exercise • Type: • Observe that one can now paste the query including line breaks* • Type: * N.B. for now you should first replace the "s with 's and change the one occurrence of ecrm: with crm: - we’ll fix this * N.B. currently the query needs to be simplified as the BBC data is not loaded – this will be available soon
  17. 17. Data Analysis • One can import this CSV file into many tools: – A spreadsheet can be a good way to carry out basic visualisations – A scripting environment like (I)Python/SciPy or R allows more analysis before visualisation, but: • both languages also have libraries that wrap SPARQL interaction (rdflib/SPARQLWrapper for Python; SPARQL/RCurl for R) • one should decide whether more of the analysis should first be carried out in SPARQL…
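
As a small illustration of the scripting route, the sketch below counts rows per value of one column in a CSV result using only Python's standard library. The column names and rows are made up for the example and stand in for whatever the exported query actually returns; `count_by` is a hypothetical helper.

```python
import csv
import io
from collections import Counter

# Stand-in for the downloaded file; rows and column names are made up.
SAMPLE_CSV = """\
object,material
ex:object-1,basalt
ex:object-2,gold
ex:object-3,basalt
"""


def count_by(csv_file, column):
    """Count rows per distinct value of one column."""
    return Counter(row[column] for row in csv.DictReader(csv_file))


counts = count_by(io.StringIO(SAMPLE_CSV), "material")
for material, n in counts.most_common(10):
    print(material, n)
```

With a real export, `open("results.csv", newline="")` would replace the `io.StringIO` wrapper. The same grouping can often be pushed into the query itself, which is the trade-off the last bullet on the slide raises.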
  18. 18. Exercise • If you haven’t so far, click on one of the (HotW) 100 Objects (such as number 70, Hoa Hakananai'a Easter Island Statue) having run the main query • Choose a material and observe the query for other objects in this material • Adapt this query to count how many BM objects are made from basalt
  19. 19. Solution & Exercise • Exercise: Now count the ‘top ten’ materials and the number of objects for each
  20. 20. Solution
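
The solution query is not preserved in the transcript. A hedged sketch of a "top ten" aggregation follows, assuming objects are linked to their materials with crm:P45_consists_of (a CIDOC-CRM property; whether the BM data uses it in exactly this way is an assumption). The Python prepares, but does not send, a request for CSV results; the endpoint address and the `csv_request` helper are likewise illustrative.

```python
from urllib.parse import urlencode
from urllib.request import Request

# Assumption: illustrative endpoint address; it may not be live.
ENDPOINT = "http://collection.britishmuseum.org/sparql"

# Assumption: materials are attached to objects via crm:P45_consists_of.
TOP_MATERIALS_QUERY = """\
PREFIX crm: <http://erlangen-crm.org/current/>
SELECT ?material (COUNT(DISTINCT ?object) AS ?objects) WHERE {
  ?object crm:P45_consists_of ?material .
}
GROUP BY ?material
ORDER BY DESC(?objects)
LIMIT 10
"""


def csv_request(endpoint, query):
    """Prepare (but do not send) a GET request asking for CSV results."""
    url = endpoint + "?" + urlencode({"query": query})
    return Request(url, headers={"Accept": "text/csv"})


request = csv_request(ENDPOINT, TOP_MATERIALS_QUERY)
print(request.get_method(), request.get_header("Accept"))
```

Grouping and ordering on the server means only ten rows travel to the client, rather than one row per object.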
  21. 21. A Last Word • SPARQLing a ‘native RDF’ database (often called a ‘triplestore’) is not the only option before defaulting to programming • A ‘native graph’ database indexes the graph in a different way, supporting traversal-oriented queries
  22. 22. Exercise Double click
  23. 23. Exercise Double click