Presentation for agINFRA Hackathon in Athens 12th December 2013

Using SPARQL to locate specific educational material on Open Learn (from the Open University)

Published in: Education

  1. Open Data in Agriculture: Hands-on with data infrastructures that can power your agricultural data products. 12/12/2013, Athens, Greece. Supported by EU projects.
  2. OpenLearn and the SPARQL endpoint
  3. Maths, Computing and Technology Faculty, The Open University, Walton Hall, Milton Keynes MK7 6AA. www.open.ac.uk, mct-research.open.ac.uk. Jane Bromley, David King, David Morse
  4. *open*
  5. Objectives
       • An introduction to the Open University's free material
       • Show available metadata
       • Talk about RDF, the format used for graph databases
       • How to query the material through SPARQL
  6. http://www.open.edu/openlearn/body-mind/the-real-story-behind-cereals
  7. http://www.open.edu/openlearn/nature-environment/good-food-destroying-biodiversity
  8. http://www.open.edu/openlearn/science-maths-technology/science/biofuels/content-section-0
  9. Open Research Online (publications originating from OU researchers), OU Podcasts, Course Descriptions, some KMi datasets, and…
  10. http://data.open.ac.uk/site/datasets.html Available through standard formats (RDF and SPARQL)
  11. RDF: Resource Description Framework
       • one of the basic building blocks of the web of semantic data
       • defines a graph data model, the basis of a graph database
       • the format defines statements comprising a subject, a predicate (property) and an object. Example: the subject is the T-shirt, the predicate is the colour, the object is white.

      The subject -> predicate -> object relationship is called a triple.

      RDF/XML, the XML form of RDF:

          <?xml version="1.0" encoding="UTF-8"?>
          <rdf:RDF
              xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
              xmlns:feature="http://www.linkeddatatools.com/clothing-features#">
            <rdf:Description rdf:about="http://www.linkeddatatools.com/clothes#t-shirt">
              <feature:color rdf:resource="http://www.linkeddatatools.com/colors#white"/>
            </rdf:Description>
          </rdf:RDF>
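
To make the triple idea concrete, here is a minimal sketch that recovers the subject, predicate and object from the T-shirt RDF/XML using only the Python standard library. The function name `triples` is hypothetical, and the code handles only the simple shape shown on the slide (an `rdf:Description` whose properties point at resources); a real RDF parser covers far more of the syntax.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"

# The T-shirt example from the slide, with ASCII hyphens and the quote closed.
RDF_XML = """<?xml version="1.0" encoding="UTF-8"?>
<rdf:RDF
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns:feature="http://www.linkeddatatools.com/clothing-features#">
  <rdf:Description rdf:about="http://www.linkeddatatools.com/clothes#t-shirt">
    <feature:color rdf:resource="http://www.linkeddatatools.com/colors#white"/>
  </rdf:Description>
</rdf:RDF>
"""


def triples(rdf_xml):
    """Return (subject, predicate, object) tuples for simple rdf:Description blocks."""
    root = ET.fromstring(rdf_xml.encode("utf-8"))
    found = []
    for desc in root.findall("{%s}Description" % RDF_NS):
        subject = desc.get("{%s}about" % RDF_NS)
        for prop in desc:
            # ElementTree writes tags as "{namespace}local"; join them into one URI.
            predicate = prop.tag[1:].replace("}", "")
            obj = prop.get("{%s}resource" % RDF_NS)
            found.append((subject, predicate, obj))
    return found


if __name__ == "__main__":
    for s, p, o in triples(RDF_XML):
        print(s, p, o, sep="\n  ")
```

Each child element of the description becomes one triple: the `rdf:about` value is the subject, the element name (namespace plus local name) is the predicate, and the `rdf:resource` value is the object.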
  12. The SPARQL endpoint: http://data.open.ac.uk/query
  13. select distinct ?props from <http://data.open.ac.uk/context/openlearn> where { ?subj ?props ?obj }
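
A query like the one above could be sent to the endpoint from Python roughly as follows. This is a sketch under one stated assumption: that the endpoint accepts the query as a `query` GET parameter and honours an `Accept` header for JSON results, as the SPARQL 1.1 Protocol describes. The slides do not confirm either detail, and `build_request` is a hypothetical helper.

```python
import urllib.parse
import urllib.request

ENDPOINT = "http://data.open.ac.uk/query"  # endpoint URL taken from the slides

QUERY = """
select distinct ?props
from <http://data.open.ac.uk/context/openlearn>
where { ?subj ?props ?obj }
"""


def build_request(query, endpoint=ENDPOINT):
    """Build (but do not send) a GET request carrying the SPARQL query."""
    # Assumption: the endpoint follows the SPARQL 1.1 Protocol ('query' parameter).
    url = endpoint + "?" + urllib.parse.urlencode({"query": query.strip()})
    return urllib.request.Request(
        url, headers={"Accept": "application/sparql-results+json"}
    )


if __name__ == "__main__":
    req = build_request(QUERY)
    print(req.full_url)
    # To run the query for real (needs network access):
    # with urllib.request.urlopen(req) as f:
    #     print(f.read().decode("utf-8"))
```

Keeping request construction separate from the network call makes it easy to inspect the encoded URL before anything is sent.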
  14.
  15. http://www.open.edu/openlearn/science-maths-technology/science/biofuels/content-section-0
  16. http://data.open.ac.uk/page/openlearn/s173_1
  17. How to find agriculturally useful material in OpenLearn?
  18. A three step process:
       1. Find all the subjects and choose those relevant to agriculture
       2. Find all the OpenLearn units that have any of these subjects
       3. Collect the metadata for each of the selected OpenLearn units
  19.
  20. (1130) as of end of October 2013
       http://data.open.ac.uk/topic/psychology
       http://data.open.ac.uk/topic/sociology
       http://data.open.ac.uk/topic/social_care
       http://data.open.ac.uk/topic/educational_practice
       http://data.open.ac.uk/topic/biology
       http://data.open.ac.uk/topic/herbicides
       http://data.open.ac.uk/topic/energyofficial1342688874openlearn_teamadmin
       http://data.open.ac.uk/topic/unitsdefault1330523206frank_siebertzz884926
       http://data.open.ac.uk/topic/pre_course_workdefault1263940536linda_smithlps32
       http://data.open.ac.uk/topic/employmentofficial1342688874richard_howesrh4685
       http://data.open.ac.uk/topic/using_mathsdefault1231080717peter_mcalisterzz298445
       http://data.open.ac.uk/topic/numbersdefault1330523196elizabeth_ellisee944
       http://data.open.ac.uk/topic/nuclearofficial1342688874lucy_hendylmf7
       http://data.open.ac.uk/topic/environmental_science
       http://data.open.ac.uk/topic/audio
       http://data.open.ac.uk/topic/cctv
       http://data.open.ac.uk/topic/social_work
       http://data.open.ac.uk/topic/scotland
       http://data.open.ac.uk/topic/personalisation
       http://data.open.ac.uk/topic/religious_studies
       http://data.open.ac.uk/topic/religion
       …
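
The slides do not show the query used for step 1, so the following is only a guess at how the topic list might be produced. It assumes topics are the objects of `http://purl.org/dc/terms/subject` (the predicate used in the step 2 query) and that results come back in the standard SPARQL JSON results format. `TOPIC_QUERY`, `topics_from_results` and the two-row sample response are all hypothetical.

```python
import json

# Assumed step 1 query: every distinct topic attached to an OpenLearn unit.
TOPIC_QUERY = """
select distinct ?topic
from <http://data.open.ac.uk/context/openlearn>
where { ?olu <http://purl.org/dc/terms/subject> ?topic }
"""


def topics_from_results(results_json):
    """Extract topic URIs from a SPARQL JSON results document."""
    data = json.loads(results_json)
    return [row["topic"]["value"] for row in data["results"]["bindings"]]


# Hand-written sample in the SPARQL JSON results shape (not real endpoint output).
SAMPLE_RESPONSE = json.dumps({
    "head": {"vars": ["topic"]},
    "results": {"bindings": [
        {"topic": {"type": "uri", "value": "http://data.open.ac.uk/topic/biology"}},
        {"topic": {"type": "uri", "value": "http://data.open.ac.uk/topic/herbicides"}},
    ]},
})

if __name__ == "__main__":
    for uri in topics_from_results(SAMPLE_RESPONSE):
        print(uri)
```

Choosing which of the returned topics are relevant to agriculture remains a manual step, as in the presentation.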
  21. Topics relevant to agriculture? 40 topics chosen:
       <http://data.open.ac.uk/topic/agriculture>,
       <http://data.open.ac.uk/topic/environment>,
       <http://data.open.ac.uk/topic/the_environment>,
       <http://data.open.ac.uk/topic/nature_&amp_environment>,
       <http://data.open.ac.uk/topic/environmental_science>,
       <http://data.open.ac.uk/topic/herbicides>,
       <http://data.open.ac.uk/topic/ecology>,
       <http://data.open.ac.uk/topic/genetics>,
       <http://data.open.ac.uk/topic/diversity>,
       <http://data.open.ac.uk/topic/global_warming>,
       <http://data.open.ac.uk/topic/biodiversity>,
       <http://data.open.ac.uk/topic/pollution>,
       <http://data.open.ac.uk/topic/conservation>,
       <http://data.open.ac.uk/topic/the_environment>,
       <http://data.open.ac.uk/topic/climate>,
       <http://data.open.ac.uk/topic/environmental_studies>,
       <http://data.open.ac.uk/topic/climate_change>,
       <http://data.open.ac.uk/topic/sustainability>,
       <http://data.open.ac.uk/topic/biogas>,
       <http://data.open.ac.uk/topic/biofuels>,
       <http://data.open.ac.uk/topic/photosynthesis>,
       <http://data.open.ac.uk/topic/waste_management>,
       <http://data.open.ac.uk/topic/landfill>,
       <http://data.open.ac.uk/topic/economic_growth>,
       <http://data.open.ac.uk/topic/waste>,
       <http://data.open.ac.uk/topic/acid_rain>,
       <http://data.open.ac.uk/topic/weather>,
       <http://data.open.ac.uk/topic/meteorology>,
       <http://data.open.ac.uk/topic/natural_resources>,
       <http://data.open.ac.uk/topic/animals>,
       <http://data.open.ac.uk/topic/ecological_sustainability>,
       <http://data.open.ac.uk/topic/overfishing>,
       <http://data.open.ac.uk/topic/ecosystem>,
       <http://data.open.ac.uk/topic/the_end_of_nature>,
       <http://data.open.ac.uk/topic/survival_of_the_fittest>,
       <http://data.open.ac.uk/topic/barter>,
       <http://data.open.ac.uk/topic/plants>,
       <http://data.open.ac.uk/topic/freshwater>,
       <http://data.open.ac.uk/topic/maps>,
       <http://data.open.ac.uk/topic/food>
       ..
  22. A three step process:
       1. Find all the subjects and choose those relevant to agriculture
       2. Find all the OpenLearn units that have any of these subjects
       3. Collect the metadata for each of the selected OpenLearn units
  23. select distinct ?olu from <http://data.open.ac.uk/context/openlearn> where { ?olu <http://purl.org/dc/terms/subject> ?topic . filter ( ?topic in ( <http://data.open.ac.uk/topic/agriculture>, <http://data.open.ac.uk/topic/environment>, .. .. etc. ) ) }

      → 85 OpenLearn units

      Units are extracts from OU courses, each with multiple pages of material and expected to take many hours of study.
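
With 40 topics, writing the `filter ( ?topic in ( … ) )` clause by hand is tedious, so the step 2 query could be generated from a list of topic URIs. The sketch below shows one possible way; `build_unit_query` is a hypothetical helper, not part of the presenters' scripts, and it does no validation of the URIs it is given.

```python
def build_unit_query(topic_uris, graph="http://data.open.ac.uk/context/openlearn"):
    """Return a SPARQL query selecting units tagged with any of topic_uris."""
    if not topic_uris:
        raise ValueError("at least one topic URI is required")
    # Wrap each URI in angle brackets and separate with commas, one per line.
    topic_list = ",\n      ".join("<{}>".format(uri) for uri in topic_uris)
    return (
        "select distinct ?olu\n"
        "from <{graph}>\n"
        "where {{\n"
        "  ?olu <http://purl.org/dc/terms/subject> ?topic .\n"
        "  filter ( ?topic in (\n"
        "      {topics}\n"
        "  ) )\n"
        "}}"
    ).format(graph=graph, topics=topic_list)


if __name__ == "__main__":
    print(build_unit_query([
        "http://data.open.ac.uk/topic/agriculture",
        "http://data.open.ac.uk/topic/environment",
    ]))
```

Because `in` matches any listed value, a unit is returned when at least one of its subjects is in the chosen set; `distinct` keeps a unit with several matching subjects from appearing more than once.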
  24. http://data.open.ac.uk/openlearn/s250_3
      http://data.open.ac.uk/openlearn/sdk125_1
      http://data.open.ac.uk/openlearn/t123_1
      http://data.open.ac.uk/openlearn/t206_2
      http://data.open.ac.uk/openlearn/t213_1
      http://data.open.ac.uk/openlearn/s173_1
      http://data.open.ac.uk/openlearn/u116_3
      http://data.open.ac.uk/openlearn/s278_19
      http://data.open.ac.uk/openlearn/t306_3
      http://data.open.ac.uk/openlearn/s189_1
      http://data.open.ac.uk/openlearn/s344_1
      http://data.open.ac.uk/openlearn/s324_1
      http://data.open.ac.uk/openlearn/s250_2
      … …
  25. http://data.open.ac.uk/openlearn/s250_2
      http://www.open.edu/openlearn/science-maths-technology/science/environmental-science/social-issues-and-gm-crops/content-section-0
      This unit is an adapted extract from the course Science in context (S250)
  26. A three step process:
       1. Find all the subjects and choose those relevant to agriculture
       2. Find all the OpenLearn units that have any of these subjects
       3. Collect the metadata for each of the selected OpenLearn units
  27. Python script to dump the metadata

        import urllib.parse
        import urllib.request

        # To run: python get_SPARQL_from_OpenData.py
        # Edit this file in two places to choose output format as json or rdf/xml

        def run_SPARQL(course_id):
            ''' returns the metadata description of one OpenLearn unit'''
            # EDIT HERE
            # place course_id in request
            # req = urllib.request.Request('http://data.open.ac.uk/openlearn/{}'.format(course_id), headers={'Accept': 'application/rdf+json'})
            req = urllib.request.Request('http://data.open.ac.uk/openlearn/{}'.format(course_id), headers={'Accept': 'application/rdf+xml'})
            # fire off the query
            f = urllib.request.urlopen(req)
            # pass back the query result having rendered it readable first
            return(f.read().decode('utf-8'))

        if __name__ == '__main__':
            llist = ['a180_2', 'b823_1', 'd837_1', 'dd100_7', 'e500_11', 'k111_1']  # … (remaining ids elided on the slide)
            for course_id in llist:
                print(course_id)
                # run query with chosen course id
                result = run_SPARQL(course_id)
                # EDIT HERE
                # with open('{}.json'.format(course_id), 'w', encoding='utf-8', newline='\n') as f:
                with open('{}.xml'.format(course_id), 'w', encoding='utf-8', newline='\n') as f:
                    f.write(result)
  28. json format

        { "http://data.open.ac.uk/openlearn/s250_2" : {
            "http://purl.org/dc/terms/language" : [ { "type" : "literal" , "value" : "en-gb" , "datatype" : "http://www.w3.org/2001/XMLSchema#string" } ] ,
            "http://data.open.ac.uk/openlearn/ontology/relatesToCourse" : [ { "type" : "uri" , "value" : "http://data.open.ac.uk/course/s250" } ] ,
            "http://purl.org/dc/terms/title" : [ { "type" : "literal" , "value" : "Social issues and GM crops" , "datatype" : "http://www.w3.org/2001/XMLSchema#string" }
            …

      rdf/xml format

        <rdf:RDF
            xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" …
            xmlns:j.0="http://dbpedia.org/property/" …
            xmlns:j.1="http://xmlns.com/foaf/0.1/"
            xmlns:j.3="http://web.resource.org/cc/"
            xmlns:j.2="http://www.w3.org/TR/2010/WD-mediaont-10-20100608/"
            xmlns:j.4="http://purl.org/dc/terms/"
            xmlns:j.5="http://data.open.ac.uk/openlearn/ontology/"
            xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#">
          <j.1:Document rdf:about="http://data.open.ac.uk/openlearn/s250_2">
            <j.2:locator rdf:resource="http://www.open.edu/openlearn/nature-environment/the-environment/environmental-science/social-issues-and-gm-crops/content-section-0"/>
            <j.5:relatesToCourse rdf:resource="http://data.open.ac.uk/course/s250"/>
            <j.4:creator rdf:resource="http://data.open.ac.uk/organization/the_open_university"/>
            <j.4:subject rdf:resource="http://data.open.ac.uk/topic/risk"/>
            <j.4:published rdf:datatype="http://www.w3.org/2001/XMLSchema#dateTime">2011-06-02T23:00:00Z</j.4:published>
            …
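
Once a unit's metadata has been saved in the json format, individual fields can be read with the standard `json` module. The sketch below uses a cut-down copy of the s250_2 record from the slide, with quotes restored and closed off so that it parses. `values_for` is a hypothetical helper, and the layout (subject, then predicate, then a list of value objects) is assumed from the slide excerpt only.

```python
import json

# Cut-down copy of the slide's s250_2 record, completed so it is valid JSON.
SAMPLE = """
{ "http://data.open.ac.uk/openlearn/s250_2" : {
    "http://purl.org/dc/terms/language" : [
      { "type" : "literal", "value" : "en-gb",
        "datatype" : "http://www.w3.org/2001/XMLSchema#string" } ],
    "http://data.open.ac.uk/openlearn/ontology/relatesToCourse" : [
      { "type" : "uri", "value" : "http://data.open.ac.uk/course/s250" } ],
    "http://purl.org/dc/terms/title" : [
      { "type" : "literal", "value" : "Social issues and GM crops",
        "datatype" : "http://www.w3.org/2001/XMLSchema#string" } ]
} }
"""


def values_for(doc, subject, predicate):
    """Return every value recorded for (subject, predicate), or [] if absent."""
    return [obj["value"] for obj in doc.get(subject, {}).get(predicate, [])]


if __name__ == "__main__":
    doc = json.loads(SAMPLE)
    unit = "http://data.open.ac.uk/openlearn/s250_2"
    print(values_for(doc, unit, "http://purl.org/dc/terms/title"))
    print(values_for(doc, unit, "http://data.open.ac.uk/openlearn/ontology/relatesToCourse"))
```

Returning a list rather than a single value reflects the data: a predicate such as `subject` can legitimately have several objects for one unit.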
  29. Summary: A three step process:
       1. Find all subjects/keywords relevant to agriculture
       2. Identify OpenLearn units with these subjects
       3. Collect the metadata for each OpenLearn unit

      All the scripts (and more) are available
  30. Thanks. j.m.bromley@open.ac.uk, David.King@open.ac.uk, David.Morse@open.ac.uk
