Technologies and practices for maintaining and publishing earth science vocabularies

Simon J D Cox, Jonathan Yu, Megan Williams, Fabrizio Giabardo, Dominic Lowe
16 April 2015
LAND AND WATER FLAGSHIP

Presented at BGS, Keyworth on 2015-04-27
Extended version of paper presented at EGU on 2015-04-17


  1. Technologies and practices for maintaining and publishing earth science vocabularies
      Simon J D Cox, Jonathan Yu, Megan Williams, Fabrizio Giabardo, Dominic Lowe
      16 April 2015
      LAND AND WATER FLAGSHIP
  2. Are these the same?
      • “nitrogen”
      • “dissolved nitrogen”
      • “Total nitrogen, water, filtered, milligrams per liter”
      • “Concentration of nitrogen (total) per unit volume of the water body [dissolved plus reactive particulate phase] by oxidation and colorimetric autoanalysis”
      • “Concentration of nitrogen (total) per unit mass of the water body [dissolved plus reactive particulate <GF/F phase] by filtration and high temperature Pt catalytic oxidation”
      • “Concentration (moles or mass) of total nitrogen (i.e. nitrogen in all chemical forms) in suspended particulate material per unit volume of the water column.”
      • “Concentration of nitrogen (total) {'PON'} per unit volume of the water body [particulate 2-10um phase] by filtration, acidification and elemental analysis”
      • “Dissolved total and organic nitrogen concentrations in the water column”
      Publishing earth science vocabularies | Cox, Yu, Williams, Giabardo, Lowe
  3. Why are vocabularies important?
  4. O&M domain specialization
      [UML diagram of OM_Observation]
      • OM_Observation attributes: phenomenonTime, resultTime, validTime [0..1], resultQuality [0..*], parameter [0..*]
      • Classes: GFI_PropertyType, GFI_Feature, GFI_DomainFeature, OM_Process, Any
      • Association roles: observedProperty (1), propertyValueProvider (0..*), featureOfInterest (1), generatedObservation (0..*), procedure (1), result (Range)
      • observed property: Parameter dictionary
      • procedure: Register of sensors, processes & algorithms
      • feature of interest: Feature-type catalogue, Feature service
      • result: format GML, SWE, netCDF, JSON, SQLite ...
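
The slide 4 diagram can be read as an observation record whose key fields point into vocabularies. The following is a minimal Python sketch under that reading, not an implementation of O&M: the class name `Observation`, the snake_case field names, the simplified time types and every `example.org` URI are hypothetical.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Any, Optional


@dataclass(frozen=True)
class Observation:
    """Simplified stand-in for OM_Observation; vocabulary references are URIs."""
    observed_property: str      # term from a parameter dictionary
    feature_of_interest: str    # item from a feature-type catalogue or feature service
    procedure: str              # entry in a register of sensors, processes and algorithms
    phenomenon_time: datetime
    result_time: datetime
    result: Any
    valid_time: Optional[datetime] = None   # optional, [0..1] on the slide


# All URIs and values below are hypothetical placeholders.
obs = Observation(
    observed_property="http://example.org/def/property/total-nitrogen",
    feature_of_interest="http://example.org/feature/site-42",
    procedure="http://example.org/def/procedure/colorimetric-autoanalysis",
    phenomenon_time=datetime(2015, 4, 16, 9, 0),
    result_time=datetime(2015, 4, 17, 12, 0),
    result=1.3,
)


def vocabulary_references(o: Observation) -> dict:
    """Return the three fields that should resolve to published vocabulary terms."""
    return {
        "observedProperty": o.observed_property,
        "featureOfInterest": o.feature_of_interest,
        "procedure": o.procedure,
    }


print(vocabulary_references(obs))
```

Keeping the three references as URIs rather than free-text labels is what lets two datasets agree on whether they measured the same thing.
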
  5. RDF Data Cube 101 - Slices and observations
      [Diagram labels: Cube, Slice, Observation; Dimension d1, d2, d3, d4, d5, d6, d7; Measure m1, m2, …; Attribute a1, a2, …]
      A linked sensor data cube, Lefort, 5th Intl. SSN workshop, 2012
  6. W3C Data Cube ontology
      • Each axis or variable specified as a skos:Concept
      • Values of coded-properties selected from a skos:ConceptScheme
      • Homogeneous observations, common structure definition
      The RDF Data Cube Vocabulary, Cyganiak & Reynolds, W3C Recommendation 2014
  7. What is available?
  8. AGU
  9. Thomson-Reuters
  10. ANZSRC
  11. [no text extracted from slide]
  12. ICS
  13. GSSP
  14. GCMD
  15. Standard ontology of chemicals
  16. Vocabulary formalization
  17. Formalization: RDF – SKOS for basic vocabularies
      chem:sodium a skos:Concept ;
          rdfs:label "sodium"^^xsd:string ;
          skos:broader chem:alkali ;
          skos:exactMatch <http://dbpedia.org/resource/Sodium> ;
          skos:inScheme skos:chemicals ;
          skos:prefLabel "nátrium"@hu , "sodio"@it , "sodium"@fr , "sodium"@en .
  18. RDFS
      [Diagram labels: GeochronEra, TemporalReferenceSystem, component, member, skos:ConceptScheme, skos:Concept, skos:hasTopConcept, skos:narrower; relations shown: subClassOf, subPropertyOf, domain, range]
      Semantic web dead long live semantic web | Simon Cox
  19. Inferencing
      • Entailments and reasoning
      • What does this combination of axioms imply?
      • Is there anything unexpected?
      [Diagram labels: Phanerozoic, Cenozoic, Neogene, Stratigraphic Chart, GeochronEra, TemporalReferenceSystem, Concept, ConceptScheme; relations shown: type, component, member, hasTopConcept, narrower, narrowerTransitive, broaderTransitive]
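
The entailment question on slide 19 can be sketched in a few lines of Python. This assumes the diagram asserts Phanerozoic narrower Cenozoic and Cenozoic narrower Neogene (which matches the geological time scale), and it uses a plain transitive closure in place of a real reasoner. The function name `narrower_transitive` is hypothetical.

```python
from collections import defaultdict

# Asserted skos:narrower statements, using the eras named on the slide.
narrower = [
    ("Phanerozoic", "Cenozoic"),
    ("Cenozoic", "Neogene"),
]


def narrower_transitive(pairs):
    """Compute the transitive closure of the asserted narrower pairs."""
    children = defaultdict(set)
    for parent, child in pairs:
        children[parent].add(child)

    closure = set()
    for start in list(children):
        stack = list(children[start])
        seen = set()
        while stack:
            node = stack.pop()
            if node in seen:
                continue
            seen.add(node)
            closure.add((start, node))
            stack.extend(children.get(node, ()))
    return closure


entailed = narrower_transitive(narrower)
# Statements that were inferred rather than asserted.
inferred_only = entailed - set(narrower)
print(sorted(inferred_only))  # [('Phanerozoic', 'Neogene')]
```

Comparing `inferred_only` against what the vocabulary owner expects is one simple way to look for "anything unexpected".
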
  20. Formalization and encoding process
      Create order within existing Excel spreadsheets
      Every layout is different
  21. Formalization and encoding process
      RDF123
      Every mapping is different
  22. Formalization and encoding process
      Turtle, in text editor …
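
The spreadsheet-to-Turtle step on slides 20 to 22 might look roughly like the sketch below. It is not the authors' RDF123 mapping: the column names (`id`, `label`, `broader`), the `ex:` namespace and the sample rows are hypothetical, and the mapping only fits this one layout, which is the point the slides make.

```python
import csv
import io

# Hypothetical spreadsheet export; real layouts differ from sheet to sheet.
SHEET = """id,label,broader
alkali,alkali metal,
sodium,sodium,alkali
"""

PREFIXES = (
    "@prefix skos: <http://www.w3.org/2004/02/skos/core#> .\n"
    "@prefix ex: <http://example.org/def/chem/> .\n"
)


def escape(text):
    """Escape backslashes and double quotes for a Turtle string literal."""
    return text.replace("\\", "\\\\").replace('"', '\\"')


def rows_to_turtle(csv_text):
    """Map each row to one skos:Concept; the mapping is specific to SHEET's layout."""
    blocks = [PREFIXES]
    for row in csv.DictReader(io.StringIO(csv_text)):
        lines = [
            f"ex:{row['id']} a skos:Concept ;",
            f'    skos:prefLabel "{escape(row["label"])}"@en',
        ]
        if row["broader"]:
            lines[-1] += " ;"
            lines.append(f"    skos:broader ex:{row['broader']}")
        lines[-1] += " ."
        blocks.append("\n".join(lines))
    return "\n".join(blocks) + "\n"


turtle = rows_to_turtle(SHEET)
print(turtle)
```

The sketch assumes every `id` is already a valid Turtle local name. Checking that, and deciding what each column really means, is where the human judgement comes in.
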
  23. People + judgement
  24. Vocabulary distribution
  25. Delivery
      • Physical documents, PDF
      • Tables on web pages
      • Bespoke XML documents
      • RDF documents, OWL documents
      • Web services
      • RESTful web resources, Linked data
      Vocabulary services | Cox & Yu
  26. Publish as linked data
      URI = web-scale foreign-key
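
Treating a URI as a "web-scale foreign key" means a client can follow it and ask for a machine-readable description. The sketch below builds such a request with Python's standard library, without sending it. The concept URI is one that appears later in the slides. The choice of `text/turtle`, and the assumption that the server honours content negotiation, are illustrative only.

```python
from urllib.request import Request

# Concept URI taken from the slides; whether it still resolves is not assumed here.
CONCEPT_URI = "http://registry.it.csiro.au/def/kwa/gcmd/ABRASION"


def linked_data_request(uri, media_type="text/turtle"):
    """Build (but do not send) a GET request that asks for an RDF representation."""
    return Request(uri, headers={"Accept": media_type}, method="GET")


req = linked_data_request(CONCEPT_URI)
print(req.get_method(), req.full_url, req.get_header("Accept"))
# To actually dereference the URI, pass `req` to urllib.request.urlopen.
```
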
  27. Linked vocabularies can be shared and re-used
  28. [no text extracted from slide]
  29. Status and lifecycle
  30. [no text extracted from slide]
  31. Governance issues, design flaws
  32. Governance issues
      What is the best way to re-use existing content already published as linked data?
      Do we fix it for them?
      Do we re-claim it?
      Vocabulary deployment and governance | Cox
  33. Modeling flaws
      GCMD science keywords
      • Same textual definition, same label
      • Different parent, different URI – are they the same concept?
  34. Re-base the URI?
      <http://registry.it.csiro.au/def/kwa/gcmd/ABRASION> a skos:Concept ;
          rdfs:label "ABRASION" ;
          dct:description "Mechanical scraping of a rock surface by friction between rocks and moving particles."@en ;
          owl:sameAs
              <http://gcmdservices.gsfc.nasa.gov/kms/concept/8f57f4b0-5177-4362-81e8-ced75d37d1aa> ,
              <http://gcmdservices.gsfc.nasa.gov/kms/concept/fd29bf77-df38-4b80-8148-8184fa41d843> ,
              <http://gcmdservices.gsfc.nasa.gov/kms/concept/efacd4f6-59ea-4019-8265-8cc81ecc99c0> ,
              <http://gcmdservices.gsfc.nasa.gov/kms/concept/f6e19e2e-555a-4d40-9833-c7513d92c813> ;
          skos:prefLabel "ABRASION"@en .
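
One way to arrive at a re-based record like the one on slide 34 is to group source concepts that share a label and a definition, then mint a single URI per group. The Python sketch below illustrates that idea only. It assumes all four GCMD URIs carry the label and description shown on the slide, it derives the local name by replacing spaces with hyphens (a guess), and the final record is a hypothetical extra added so the grouping has something to separate.

```python
from collections import defaultdict

BASE = "http://registry.it.csiro.au/def/kwa/gcmd/"   # re-based namespace from the slide
GCMD = "http://gcmdservices.gsfc.nasa.gov/kms/concept/"
DEFN = ("Mechanical scraping of a rock surface by friction "
        "between rocks and moving particles.")

# (uri, label, definition). The first four URIs are the ones on the slide;
# the last record is hypothetical.
records = [
    (GCMD + "8f57f4b0-5177-4362-81e8-ced75d37d1aa", "ABRASION", DEFN),
    (GCMD + "fd29bf77-df38-4b80-8148-8184fa41d843", "ABRASION", DEFN),
    (GCMD + "efacd4f6-59ea-4019-8265-8cc81ecc99c0", "ABRASION", DEFN),
    (GCMD + "f6e19e2e-555a-4d40-9833-c7513d92c813", "ABRASION", DEFN),
    ("http://example.org/concept/0000", "EXAMPLE TERM", "Hypothetical definition."),
]


def rebase(recs):
    """Group source URIs that share label and definition under one re-based URI."""
    groups = defaultdict(list)
    for uri, label, definition in recs:
        groups[(label, definition)].append(uri)
    return {
        BASE + label.replace(" ", "-"): sorted(uris)
        for (label, _), uris in groups.items()
    }


same_as = rebase(records)
for new_uri, sources in same_as.items():
    print(new_uri, "owl:sameAs", len(sources), "source URI(s)")
```

Two groups with the same label but different definitions would collide on the minted URI in this sketch. A real workflow would need a rule for that case, and a decision on whether `owl:sameAs` is too strong a claim.
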
  35. Versioning flaws
      NASA SWEET
      http://sweet.jpl.nasa.gov/1.1/time.owl#PLIOCENE
      http://sweet.jpl.nasa.gov/2.0/timeGeologic.owl#Pliocene
      http://sweet.jpl.nasa.gov/2.1/reprTimeGeologicPeriod.owl#Pliocene
      http://sweet.jpl.nasa.gov/2.2/stateTimeGeologic.owl#Pliocene
      http://sweet.jpl.nasa.gov/2.3/stateTimeGeologic.owll#Pliocene
      • Same label, and same place in hierarchy
      • Different URI - are they the same concept?
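
A quick way to surface the versioning problem on slide 35 is to group URIs by their fragment, ignoring case. The sketch below does that for the five SWEET URIs exactly as they appear on the slide. The function name `candidate_groups` is hypothetical, and matching on fragments is a heuristic chosen for illustration, not a rule from the slides.

```python
from collections import defaultdict
from urllib.parse import urldefrag

# URIs exactly as listed on the slide.
SWEET_URIS = [
    "http://sweet.jpl.nasa.gov/1.1/time.owl#PLIOCENE",
    "http://sweet.jpl.nasa.gov/2.0/timeGeologic.owl#Pliocene",
    "http://sweet.jpl.nasa.gov/2.1/reprTimeGeologicPeriod.owl#Pliocene",
    "http://sweet.jpl.nasa.gov/2.2/stateTimeGeologic.owl#Pliocene",
    "http://sweet.jpl.nasa.gov/2.3/stateTimeGeologic.owll#Pliocene",
]


def candidate_groups(uris):
    """Group URIs whose fragment matches case-insensitively.

    A shared fragment only flags candidates; it does not prove the
    concepts are the same.
    """
    groups = defaultdict(list)
    for uri in uris:
        _, fragment = urldefrag(uri)
        groups[fragment.lower()].append(uri)
    return dict(groups)


groups = candidate_groups(SWEET_URIS)
print({name: len(members) for name, members in groups.items()})  # {'pliocene': 5}
```

Five distinct identifiers fall into one candidate group, which is the question the slide poses. Whether they denote one concept still needs an expert decision.
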
  36. Governance issues
      Who is the expert? - Wikipedia??
  37. Collection sub-set?
      <http://registry.it.csiro.au/def/kwa/gcmd/GCMD-keywords-subset_newnames> a skos:Collection ;
          rdfs:label "Subset of GCMD keywords - re-based"^^xsd:string ;
          skos:member
              <http://registry.it.csiro.au/def/kwa/gcmd/ABLATION> ,
              <http://registry.it.csiro.au/def/kwa/gcmd/ABRASION> ,
              <http://registry.it.csiro.au/def/kwa/gcmd/ABLATION-ZONES-ACCUMULATION-ZONES> .
      - Or -
      <http://registry.it.csiro.au/def/kwa/gcmd/GCMD-keywords-subset> a skos:Collection ;
          rdfs:label "Subset of GCMD keywords"^^xsd:string ;
          skos:member
              <http://gcmdservices.gsfc.nasa.gov/kms/concept/8f57f4b0-5177-4362-81e8-ced75d37d1aa> ,
              <http://gcmdservices.gsfc.nasa.gov/kms/concept/95fbaefd-1afe-4887-a1ba-fc338a8109bb> ,
              <http://gcmdservices.gsfc.nasa.gov/kms/concept/99db4dca-4d07-48fd-8ba3-393532d04aa6> ,
              <http://gcmdservices.gsfc.nasa.gov/kms/concept/a994a6f6-cfcd-45d2-95a4-0f8455a9454d> ,
              <http://gcmdservices.gsfc.nasa.gov/kms/concept/efacd4f6-59ea-4019-8265-8cc81ecc99c0> ,
              <http://gcmdservices.gsfc.nasa.gov/kms/concept/fd29bf77-df38-4b80-8148-8184fa41d843> ,
              <http://gcmdservices.gsfc.nasa.gov/kms/concept/f6e19e2e-555a-4d40-9833-c7513d92c813> .
  38. More complex constraints?
      OWL classes vs instances
      cgi-lith-instance:carbonate_rich_mudstone a skos:Concept ;
          rdfs:label "carbonate-rich mudstone" ;
          skos:broader cgi-lith-instance:rock_material ;
          CGI_Lith:ConsolDegree CGI_Lith:consolidated ;
          CGI_Lith:Constituents CGI_Lith:carbonateBearing ;
          CGI_Lith:GeneticCateg CGI_Lith:sedimentary ;
          CGI_Lith:GrainSize CGI_Lith:mud_size ;
          CGI_Lith:ParticleType CGI_Lith:grain .
  39. Summary and conclusions
  40. [Diagram labels: Source Vocabulary (csv, html, txt); Database (triple-store); Formalized vocabulary (skos/rdf); Vocab service; LDR API; SPARQL; SISSVoc]
  41. Summary
      • Term vocabularies can be formalized in RDF (SKOS, OWL) and published as linked data
      • Much content available, but needs converting (‘lifting’) to semantic technologies
      • Excel, RDF123, Text editor, SKOS, LDR and SISSVoc are our enablers (but people are essential)
  42. Applications and published vocabularies
      • GeoSciML vocabularies
          • http://def.seegrid.csiro.au/sissvoc/cgi201211/collection
          • http://resource.geosciml.org/classifier/ics/ischart/
      • Environmental observations vocabularies
          • http://environment.data.gov.au/def/
          • http://registry.it.csiro.au/environment/def
      • Bioregional assessments glossary
          • http://registry.it.csiro.au/test1/ba-glossary
      • Agriculture definitions
          • http://registry.it.csiro.au/agriculture/def
      • Australian Government definitions - AGIFT 2014, ANZSRC 2008 …
          • http://registry.it.csiro.au/agldwg/def
      • CSIRO Keyword aggregator …
          • Coming soon
  43. Thank you
      LAND AND WATER FLAGSHIP
      Environmental Informatics Infrastructure
      Simon J D Cox, Research Scientist
          t +61 3 9252 6342
          e simon.cox@csiro.au
          w people.csiro.au/C/S/Simon-Cox
      Jonathan Yu, Research Engineer
          t +61 3 9252 6440
          e jonathan.yu@csiro.au
          w people.csiro.au/C/S/Jonathan-Yu
  44. [no text extracted from slide]
  45. SISSVoc UI & API for vocabulary query
  46. [no text extracted from slide]
  47. SKOS: a W3C Standard
      Simple Knowledge Organization System
      Focus on the concept rather than the term
      • Web/Linked data principle: Concept is identified by a URI
      • Concept is annotated with text labels (i.e. the traditional ‘term’)
      • Structured using hierarchical relations within a vocabulary
          • broader, narrower
      • Matching relations between vocabularies
          • broadMatch, closeMatch, exactMatch
  48. Delivery
      • Physical documents, PDF
      • Tables on web pages
      • Bespoke XML documents
      • RDF documents, OWL documents
      • Web services
      • RESTful web resources, Linked data
  49. O&M
      [UML diagram of OM_Observation]
      • OM_Observation attributes: phenomenonTime, resultTime, validTime [0..1], resultQuality [0..*], parameter [0..*]
      • Classes: GF_PropertyType, GFI_Feature, OM_Process, Any
      • Association roles and multiplicities as extracted: +observedProperty 1, 0..*; +featureOfInterest 1, 0..*; +procedure 1; +result
      ISO 19156:2011 Geographic Information – Observations and measurements – ed. S Cox
  50. Governance
      Clear roles:
      • Content is determined by the experts
      • Formalization may uncover inconsistencies
      • History and status must be visible
      • No deletions! - retirement or supersession
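
The "no deletions" rule on slide 50 can be illustrated with a small in-memory register that only ever changes a term's status. This is a sketch under stated assumptions, not how the CSIRO registry or SISSVoc work: the class names `Term` and `Register`, the status values `valid`, `retired` and `superseded`, and the `example.org` URIs are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Term:
    uri: str
    status: str = "valid"                      # hypothetical status value
    superseded_by: Optional[str] = None
    history: List[str] = field(default_factory=lambda: ["valid"])


class Register:
    """Terms can be added, retired or superseded; there is no delete operation."""

    def __init__(self) -> None:
        self._terms: Dict[str, Term] = {}

    def add(self, uri: str) -> Term:
        if uri in self._terms:
            raise ValueError(f"{uri} is already registered")
        self._terms[uri] = Term(uri)
        return self._terms[uri]

    def _set_status(self, uri: str, status: str) -> Term:
        term = self._terms[uri]
        term.status = status
        term.history.append(status)   # keep every status change visible
        return term

    def retire(self, uri: str) -> Term:
        return self._set_status(uri, "retired")

    def supersede(self, old_uri: str, new_uri: str) -> Term:
        if new_uri not in self._terms:
            self.add(new_uri)
        old = self._set_status(old_uri, "superseded")
        old.superseded_by = new_uri
        return old

    def get(self, uri: str) -> Term:
        return self._terms[uri]


register = Register()
register.add("http://example.org/def/term/a")
register.add("http://example.org/def/term/b")
register.retire("http://example.org/def/term/b")
old = register.supersede("http://example.org/def/term/a",
                         "http://example.org/def/term/a2")
print(old.status, old.superseded_by, old.history)
```

Because retired and superseded terms stay resolvable, data that cites an old URI keeps its meaning, and a client can follow `superseded_by` to the current term.
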
