Simon J D Cox, Jonathan Yu, Megan Williams, Fabrizio Giabardo, Dominic Lowe
16 April 2015

Technologies and practices for maintaining and publishing earth science vocabularies

Presented at BGS, Keyworth on 2015-04-27
Extended version of paper presented at EGU on 2015-04-17

Published in: Science

  1. Technologies and practices for maintaining and publishing earth science vocabularies
     Simon J D Cox, Jonathan Yu, Megan Williams, Fabrizio Giabardo, Dominic Lowe
     16 April 2015
     LAND AND WATER FLAGSHIP
  2. Are these the same?
     • “nitrogen”
     • “dissolved nitrogen”
     • “Total nitrogen, water, filtered, milligrams per liter”
     • “Concentration of nitrogen (total) per unit volume of the water body [dissolved plus reactive particulate phase] by oxidation and colorimetric autoanalysis”
     • “Concentration of nitrogen (total) per unit mass of the water body [dissolved plus reactive particulate <GF/F phase] by filtration and high temperature Pt catalytic oxidation”
     • “Concentration (moles or mass) of total nitrogen (i.e. nitrogen in all chemical forms) in suspended particulate material per unit volume of the water column.”
     • “Concentration of nitrogen (total) {'PON'} per unit volume of the water body [particulate 2-10um phase] by filtration, acidification and elemental analysis”
     • “Dissolved total and organic nitrogen concentrations in the water column”
     Publishing earth science vocabularies | Cox, Yu, Williams, Giabardo, Lowe
  3. Why are vocabularies important?
  4. O&M domain specialization
     [UML diagram]
     OM_Observation: + phenomenonTime, + resultTime, + validTime [0..1], + resultQuality [0..*], + parameter [0..*]
     Related classes: GFI_PropertyType, GFI_Feature, OM_Process, GFI_DomainFeature, Any
     Association roles: +observedProperty 1, +propertyValueProvider 0..*, +featureOfInterest 1, +generatedObservation 0..*, +procedure 1, +result (Range)
     Where vocabularies are needed:
     • observed property: Parameter dictionary
     • procedure: Register of sensors, processes & algorithms
     • feature of interest: Feature-type catalogue, Feature service
     • result format: GML, SWE, netCDF, JSON, SQLite ...
  5. RDF Data Cube 101 - Slices and observations
     [Diagram: a Cube with Dimensions d1 to d7, Measures m1, m2, … and Attributes a1, a2, …, divided into Slices and Observations]
     Source: A linked sensor data cube, Lefort, 5th Intl. SSN workshop, 2012
  6. W3C Data Cube ontology
     • Each axis or variable specified as a skos:Concept
     • Values of coded-properties selected from a skos:ConceptScheme
     • Homogeneous observations, common structure definition
     Source: The RDF Data Cube Vocabulary, Cyganiak & Reynolds, W3C Recommendation 2014
  7. What is available?
  8. AGU
  9. Thomson-Reuters
  10. ANZSRC
  11. (no text on this slide)
  12. ICS
  13. GSSP
  14. GCMD
  15. Standard ontology of chemicals
  16. Vocabulary formalization
  17. Formalization: RDF – SKOS for basic vocabularies

          chem:sodium a skos:Concept ;
              rdfs:label "sodium"^^xsd:string ;
              skos:broader chem:alkali ;
              skos:exactMatch <http://dbpedia.org/resource/Sodium> ;
              skos:inScheme skos:chemicals ;
              skos:prefLabel "nátrium"@hu , "sodio"@it , "sodium"@fr , "sodium"@en .

  18. RDFS
      [Diagram: GeochronEra, TemporalReference System, component and member are tied to skos:Concept, skos:ConceptScheme, skos:hasTopConcept and skos:narrower through subClassOf, subPropertyOf, domain and range relations]
      Semantic web dead long live semantic web | Simon Cox
  19. Inferencing
      • Entailments and reasoning
      • What does this combination of axioms imply?
      • Is there anything unexpected?
      [Diagram: Phanerozoic, Cenozoic, Neogene and Stratigraphic Chart as instances (type) of GeochronEra, TemporalReference System, Concept and ConceptScheme; asserted links component, member, hasTopConcept and narrower; inferred links narrowerTransitive and broaderTransitive]
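One entailment of the kind named on slide 19 can be sketched in a few lines. In SKOS, skos:narrower is a sub-property of skos:narrowerTransitive, which is transitive, so the transitive links follow from the asserted ones. The sketch below is only an illustration in plain Python, not the reasoner used by the authors. The two asserted links between the slide's labels are an assumption based on the standard geologic time scale.

```python
from collections import defaultdict

# Asserted skos:narrower statements as (broader concept, narrower concept).
# ASSUMPTION: the slide shows these labels; the exact asserted links are
# inferred from the standard geologic time scale, not read from the slide.
ASSERTED_NARROWER = [
    ("Phanerozoic", "Cenozoic"),
    ("Cenozoic", "Neogene"),
]


def narrower_transitive(pairs):
    """Return every (ancestor, descendant) pair implied by the asserted links."""
    children = defaultdict(set)
    for parent, child in pairs:
        children[parent].add(child)

    closure = set()
    for start in list(children):
        stack = list(children[start])
        seen = set()
        while stack:
            node = stack.pop()
            if node in seen:
                continue
            seen.add(node)
            closure.add((start, node))
            # .get() avoids creating new keys in the defaultdict while walking
            stack.extend(children.get(node, ()))
    return closure


def broader_transitive(pairs):
    """broaderTransitive is the inverse of narrowerTransitive."""
    return {(child, parent) for parent, child in narrower_transitive(pairs)}


if __name__ == "__main__":
    for parent, child in sorted(narrower_transitive(ASSERTED_NARROWER)):
        print(f"{parent} skos:narrowerTransitive {child}")
```

The pair (Phanerozoic, Neogene) is never asserted, yet it appears in the result. Comparing the inferred set with what the vocabulary owners expect is one way to answer "is there anything unexpected?".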
  20. Formalization and encoding process
      Create order within existing Excel spreadsheets
      Every layout is different
  21. Formalization and encoding process
      RDF 123
      Every mapping is different
  22. Formalization and encoding process
      Turtle, in text editor …
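The lifting step described on slides 20 to 22 (tidy spreadsheet, then a mapping, then Turtle) can be sketched with the Python standard library. This is not RDF123 and not the authors' mapping. The column names `id`, `label` and `broader`, the `ex:` prefix and the `http://example.org/def/` namespace are all hypothetical, and each `id` is assumed to be a valid Turtle local name.

```python
import csv
import io

# Hypothetical spreadsheet export; a real layout will differ ("every layout is different").
SAMPLE_CSV = "id,label,broader\nsodium,sodium,alkali\nalkali,alkali metal,\n"

HEADER = (
    "@prefix skos: <http://www.w3.org/2004/02/skos/core#> .\n"
    "@prefix ex: <http://example.org/def/> .\n"  # hypothetical namespace
)


def escape(text):
    """Escape backslashes and double quotes for a Turtle string literal."""
    return text.replace("\\", "\\\\").replace('"', '\\"')


def row_to_turtle(row, prefix="ex"):
    """Render one spreadsheet row as a skos:Concept in Turtle."""
    lines = [
        f"{prefix}:{row['id']} a skos:Concept ;",
        f'    skos:prefLabel "{escape(row["label"])}"@en',
    ]
    if row["broader"]:
        lines[-1] += " ;"
        lines.append(f"    skos:broader {prefix}:{row['broader']}")
    lines[-1] += " ."
    return "\n".join(lines)


def csv_to_turtle(text):
    """Convert CSV text with id, label and broader columns into a Turtle document."""
    rows = csv.DictReader(io.StringIO(text))
    body = "\n\n".join(row_to_turtle(row) for row in rows)
    return HEADER + "\n" + body + "\n"


if __name__ == "__main__":
    print(csv_to_turtle(SAMPLE_CSV))
```

The mapping lives in `row_to_turtle`. A different spreadsheet layout needs a different function there, which matches the slides' point that every mapping is different and still needs a person's judgement.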
  23. People + judgement
  24. Vocabulary distribution
  25. Delivery
      • Physical documents, PDF
      • Tables on web pages
      • Bespoke XML documents
      • RDF documents, OWL documents
      • Web services
      • RESTful web resources, Linked data
      Vocabulary services | Cox & Yu
  26. Publish as linked data
      URI = web-scale foreign-key
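The "URI = web-scale foreign-key" idea on slide 26 can be illustrated with a join. Two independently produced datasets that cite the same concept URI can be merged without comparing free-text labels. All data, field names and the `example.org` URI below are hypothetical.

```python
from collections import defaultdict

# Hypothetical concept URI shared by both data providers.
TOTAL_N = "http://example.org/def/property/total-nitrogen"

# Hypothetical records; each provider uses its own local label for the property.
SURVEY_A = [{"site": "A1", "property": TOTAL_N, "local_name": "TN", "value": 0.42}]
SURVEY_B = [{"site": "B7", "property": TOTAL_N, "local_name": "Nitrogen (total)", "value": 0.57}]


def join_on_uri(*datasets, key="property"):
    """Group records from any number of datasets by the concept URI they cite."""
    merged = defaultdict(list)
    for dataset in datasets:
        for record in dataset:
            merged[record[key]].append(record)
    return dict(merged)


if __name__ == "__main__":
    for uri, records in join_on_uri(SURVEY_A, SURVEY_B).items():
        print(uri, [r["site"] for r in records])
```

The local names "TN" and "Nitrogen (total)" would not match as strings, but the shared URI joins the records anyway.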
  27. Linked vocabularies can be shared and re-used
  28. (no text on this slide)
  29. Status and lifecycle
  30. (no text on this slide)
  31. Governance issues, design flaws
  32. Governance issues
      What is the best way to re-use existing content already published as linked data?
      Do we fix it for them? Do we re-claim it?
      Vocabulary deployment and governance | Cox
  33. Modeling flaws
      GCMD science keywords
      • Same textual definition, same label
      • Different parent, different URI – are they the same concept?
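The pattern on slide 33 (same label and definition, different parent and URI) can be found mechanically, even though deciding whether the hits are really one concept remains a human call. The sketch below groups concepts by normalized label and definition. The records and their `example.org` URIs are hypothetical stand-ins, not actual GCMD entries.

```python
from collections import defaultdict

# Hypothetical records shaped like the flaw described on the slide.
CONCEPTS = [
    {"uri": "http://example.org/concept/1", "label": "ABRASION",
     "definition": "Mechanical scraping of a rock surface.", "parent": "EROSION"},
    {"uri": "http://example.org/concept/2", "label": "Abrasion",
     "definition": "Mechanical scraping of a rock surface.", "parent": "COASTAL PROCESSES"},
    {"uri": "http://example.org/concept/3", "label": "ABLATION",
     "definition": "Loss of ice or snow from a glacier.", "parent": "GLACIERS"},
]


def find_duplicate_candidates(concepts):
    """Return {(label, definition): [uris]} for keys shared by more than one URI."""
    groups = defaultdict(list)
    for concept in concepts:
        key = (
            concept["label"].strip().casefold(),
            concept["definition"].strip().casefold(),
        )
        groups[key].append(concept["uri"])
    return {key: sorted(uris) for key, uris in groups.items() if len(uris) > 1}


if __name__ == "__main__":
    for (label, _), uris in find_duplicate_candidates(CONCEPTS).items():
        print(label, "->", uris)
```

The output is a list of candidates for review, not a list of confirmed duplicates.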
  34. Re-base the URI?

          <http://registry.it.csiro.au/def/kwa/gcmd/ABRASION>
              a skos:Concept ;
              rdfs:label "ABRASION" ;
              dct:description "Mechanical scraping of a rock surface by friction between rocks and moving particles."@en ;
              owl:sameAs
                  <http://gcmdservices.gsfc.nasa.gov/kms/concept/8f57f4b0-5177-4362-81e8-ced75d37d1aa> ,
                  <http://gcmdservices.gsfc.nasa.gov/kms/concept/fd29bf77-df38-4b80-8148-8184fa41d843> ,
                  <http://gcmdservices.gsfc.nasa.gov/kms/concept/efacd4f6-59ea-4019-8265-8cc81ecc99c0> ,
                  <http://gcmdservices.gsfc.nasa.gov/kms/concept/f6e19e2e-555a-4d40-9833-c7513d92c813> ;
              skos:prefLabel "ABRASION"@en .

  35. Versioning flaws
      NASA SWEET
      http://sweet.jpl.nasa.gov/1.1/time.owl#PLIOCENE
      http://sweet.jpl.nasa.gov/2.0/timeGeologic.owl#Pliocene
      http://sweet.jpl.nasa.gov/2.1/reprTimeGeologicPeriod.owl#Pliocene
      http://sweet.jpl.nasa.gov/2.2/stateTimeGeologic.owl#Pliocene
      http://sweet.jpl.nasa.gov/2.3/stateTimeGeologic.owl#Pliocene
      • Same label, and same place in hierarchy
      • Different URI - are they the same concept?
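The versioning problem on slide 35 can be surfaced by grouping URIs on their case-folded fragment identifier. The sketch below uses the SWEET URIs listed on the slide, with the file extension in the 2.3 URI assumed to be `.owl`. The grouping rule is only a heuristic for finding candidates. It does not establish that the URIs denote the same concept.

```python
from collections import defaultdict
from urllib.parse import urlsplit

# URIs as listed on the slide (2.3 extension assumed to be ".owl").
SWEET_URIS = [
    "http://sweet.jpl.nasa.gov/1.1/time.owl#PLIOCENE",
    "http://sweet.jpl.nasa.gov/2.0/timeGeologic.owl#Pliocene",
    "http://sweet.jpl.nasa.gov/2.1/reprTimeGeologicPeriod.owl#Pliocene",
    "http://sweet.jpl.nasa.gov/2.2/stateTimeGeologic.owl#Pliocene",
    "http://sweet.jpl.nasa.gov/2.3/stateTimeGeologic.owl#Pliocene",
]


def group_by_fragment(uris):
    """Map each case-folded fragment to a sorted list of (version, uri) pairs.

    ASSUMPTION: the first path segment is the release number, as in the
    URIs above; other URI schemes would need a different rule.
    """
    groups = defaultdict(list)
    for uri in uris:
        parts = urlsplit(uri)
        version = parts.path.strip("/").split("/")[0]
        groups[parts.fragment.casefold()].append((version, uri))
    return {fragment: sorted(items) for fragment, items in groups.items()}


if __name__ == "__main__":
    for fragment, items in group_by_fragment(SWEET_URIS).items():
        print(fragment, [version for version, _ in items])
```

Five releases yield five URIs for one label. None of the URIs is stable across versions, so a user cannot tell from the identifiers alone whether the concept changed.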
  36. Governance issues
      Who is the expert? - Wikipedia??
  37. Collection sub-set?

          <http://registry.it.csiro.au/def/kwa/gcmd/GCMD-keywords-subset_newnames>
              a skos:Collection ;
              rdfs:label "Subset of GCMD keywords - re-based"^^xsd:string ;
              skos:member
                  <http://registry.it.csiro.au/def/kwa/gcmd/ABLATION> ,
                  <http://registry.it.csiro.au/def/kwa/gcmd/ABRASION> ,
                  <http://registry.it.csiro.au/def/kwa/gcmd/ABLATION-ZONES-ACCUMULATION-ZONES> .

      - Or -

          <http://registry.it.csiro.au/def/kwa/gcmd/GCMD-keywords-subset>
              a skos:Collection ;
              rdfs:label "Subset of GCMD keywords"^^xsd:string ;
              skos:member
                  <http://gcmdservices.gsfc.nasa.gov/kms/concept/8f57f4b0-5177-4362-81e8-ced75d37d1aa> ,
                  <http://gcmdservices.gsfc.nasa.gov/kms/concept/95fbaefd-1afe-4887-a1ba-fc338a8109bb> ,
                  <http://gcmdservices.gsfc.nasa.gov/kms/concept/99db4dca-4d07-48fd-8ba3-393532d04aa6> ,
                  <http://gcmdservices.gsfc.nasa.gov/kms/concept/a994a6f6-cfcd-45d2-95a4-0f8455a9454d> ,
                  <http://gcmdservices.gsfc.nasa.gov/kms/concept/efacd4f6-59ea-4019-8265-8cc81ecc99c0> ,
                  <http://gcmdservices.gsfc.nasa.gov/kms/concept/fd29bf77-df38-4b80-8148-8184fa41d843> ,
                  <http://gcmdservices.gsfc.nasa.gov/kms/concept/f6e19e2e-555a-4d40-9833-c7513d92c813> .

  38. More complex constraints?
      OWL classes vs instances

          cgi-lith-instance:carbonate_rich_mudstone a skos:Concept ;
              rdfs:label "carbonate-rich mudstone" ;
              skos:broader cgi-lith-instance:rock_material ;
              CGI_Lith:ConsolDegree CGI_Lith:consolidated ;
              CGI_Lith:Constituents CGI_Lith:carbonateBearing ;
              CGI_Lith:GeneticCateg CGI_Lith:sedimentary ;
              CGI_Lith:GrainSize CGI_Lith:mud_size ;
              CGI_Lith:ParticleType CGI_Lith:grain .

  39. Summary and conclusions
  40. [Workflow diagram: Source Vocabulary (csv, html, txt); Database (triple-store); Formalized vocabulary (skos/rdf); Vocab service: LDR, API, SPARQL, SISSVoc]
  41. Summary
      • Term vocabularies can be formalized in RDF (SKOS, OWL) and published as linked data
      • Much content available, but needs converting (‘lifting’) to semantic technologies
      • Excel, RDF123, Text editor, SKOS, LDR and SISSVoc are our enablers (but people are essential)
  42. Applications and published vocabularies
      • GeoSciML vocabularies
        • http://def.seegrid.csiro.au/sissvoc/cgi201211/collection
        • http://resource.geosciml.org/classifier/ics/ischart/
      • Environmental observations vocabularies
        • http://environment.data.gov.au/def/
        • http://registry.it.csiro.au/environment/def
      • Bioregional assessments glossary
        • http://registry.it.csiro.au/test1/ba-glossary
      • Agriculture definitions
        • http://registry.it.csiro.au/agriculture/def
      • Australian Government definitions - AGIFT 2014, ANZSRC 2008 …
        • http://registry.it.csiro.au/agldwg/def
      • CSIRO Keyword aggregator …
        • Coming soon
  43. Thank you
      LAND AND WATER FLAGSHIP
      Environmental Informatics Infrastructure
      Simon J D Cox, Research Scientist
      t +61 3 9252 6342
      e simon.cox@csiro.au
      w people.csiro.au/C/S/Simon-Cox
      Jonathan Yu, Research Engineer
      t +61 3 9252 6440
      e jonathan.yu@csiro.au
      w people.csiro.au/C/S/Jonathan-Yu
  44. (no text on this slide)
  45. SISSVoc UI & API for vocabulary query
  46. (no text on this slide)
  47. Simplified Knowledge Organization System
      SKOS: a W3C Standard
      Focus on the concept rather than the term
      • Web/Linked data principle: Concept is identified by a URI
      • Concept is annotated with text labels (i.e. the traditional ‘term’)
      • Structured using hierarchical relations within a vocabulary
        • broader, narrower
      • Matching relations between vocabularies
        • broadMatch, closeMatch, exactMatch
  48. Delivery
      • Physical documents, PDF
      • Tables on web pages
      • Bespoke XML documents
      • RDF documents, OWL documents
      • Web services
      • RESTful web resources, Linked data
  49. O&M
      [UML diagram]
      OM_Observation: + phenomenonTime, + resultTime, + validTime [0..1], + resultQuality [0..*], + parameter [0..*]
      Related classes: GF_PropertyType, GFI_Feature, OM_Process, Any
      Association roles: +observedProperty 1 / 0..*, +featureOfInterest 1 / 0..*, +procedure 1, +result
      Source: ISO 19156:2011 Geographic Information – Observations and measurements – ed. S Cox
  50. Governance
      Clear roles:
      • Content is determined by the experts
      • Formalization may uncover inconsistencies
      • History and status must be visible
      • No deletions! - retirement or supersession
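The lifecycle rules on slide 50 (visible history and status, no deletions, retirement or supersession) suggest a register with no delete operation. The sketch below is a hypothetical in-memory model, not the API of the LDR or SISSVoc services. The status names `valid`, `retired` and `superseded` are assumptions chosen for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Entry:
    uri: str
    label: str
    status: str = "valid"                # assumed status names: valid, retired, superseded
    superseded_by: Optional[str] = None
    history: List[str] = field(default_factory=lambda: ["valid"])


class Register:
    """Holds vocabulary entries; deliberately offers no way to delete one."""

    def __init__(self) -> None:
        self._entries: Dict[str, Entry] = {}

    def add(self, uri: str, label: str) -> Entry:
        if uri in self._entries:
            raise ValueError(f"{uri} is already registered")
        entry = Entry(uri, label)
        self._entries[uri] = entry
        return entry

    def get(self, uri: str) -> Entry:
        return self._entries[uri]

    def retire(self, uri: str) -> None:
        """Mark an entry as retired; its URI keeps resolving."""
        entry = self._entries[uri]
        entry.status = "retired"
        entry.history.append("retired")

    def supersede(self, old_uri: str, new_uri: str) -> None:
        """Point a superseded entry at its replacement, which must exist."""
        if new_uri not in self._entries:
            raise KeyError(new_uri)
        entry = self._entries[old_uri]
        entry.status = "superseded"
        entry.superseded_by = new_uri
        entry.history.append("superseded")


if __name__ == "__main__":
    register = Register()
    register.add("http://example.org/def/old-term", "old term")   # hypothetical URIs
    register.add("http://example.org/def/new-term", "new term")
    register.supersede("http://example.org/def/old-term", "http://example.org/def/new-term")
    print(register.get("http://example.org/def/old-term"))
```

Because the old entry stays in the register, data that cites it remains interpretable, and the `history` list keeps each status change visible.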
