Converting GHO to RDF

561 views
478 views

Published on

Published in: Technology, Education
0 Comments
2 Likes
Statistics
Notes
  • Be the first to comment

No Downloads
Views
Total views
561
On SlideShare
0
From Embeds
0
Number of Embeds
3
Actions
Shares
0
Downloads
3
Comments
0
Likes
2
Embeds 0
No embeds

No notes for slide

Converting GHO to RDF

  1. 1. Converting WHO’s GlobalHealth Observatory Data to RDF Amrapali Zaveri, PhD student August  27,  2012 1
  2. 2. Outline• Background• What is the RDF Data Cube Vocabulary?• Semi-automated approach• OntoWikis CSVImport plug-in• RDFized GHO data• Limitations and Future Work 2
  3. 3. Background• Biomedical statistical data • Published as Excel sheets• Advantage • Readable by humans• Disadvantages • Cannot be queried efficiently • Difficult to integrate with other data (in different formats)• Our approach • Converting data into a single data model - RDF • Using the RDF Data Cube Vocabulary* • designed particularly to represent multidimensional statistical data using RDF. *http://www.w3.org/TR/vocab-data-cube/ 3
  4. 4. What is the RDF Data CubeVocabulary? 4
  5. 5. What is the RDF Data CubeVocabulary? • Dimensions • Attributes • Measures • Observations 5
  6. 6. Semi-automated approach• Transforming CSV to RDF in a fully automated way is not feasible. • Dimensions may often be encoded in heading or label of a sheet• Our semi-automatic approach: • As a plug-in in OntoWiki# • a semantic collaboration platform developed by the AKSW research group. • A CSV file is converted into RDF using the RDF Data Cube Vocabulary: http://aksw.org/Projects/ Stats2RDF # Sören Auer, Sebastian Tramp (geb. Dietzold), Jens Lehmann, and Thomas Riechert: OntoWiki: A Tool for Social Semantic Collaboration In: Proceedings of the Workshop on Social and Collaborative Construction of Structured Knowledge CKC 2007 at the 16th International World Wide Web Conference WWW2007 Banff, Canada, May 8, 2007 6
  7. 7. 1. Create Knowledge Base 7
  8. 8. 2. Import a CSV file 8
  9. 9. 3. Define dimensions 9
  10. 10. 4. Define data range 10
  11. 11. 5. Save template, extract triples 11
  12. 12. 6. Re-use template for similar files 12
  13. 13. 7. View resources 13
  14. 14. RDFized GHO datagho:Country rdfs:subClassOf qb:DimensionProperty; rdf:type rdfs:Class; rdfs:label "Country" .gho:Disease rdfs:subClassOf qb:DimensionProperty; rdf:type rdfs:Class; rdfs:label "Disease" .gho: Afghanistan rdf:type ex:Country; rdfs:label "Afghanistan" .gho:Tuberculosis rdf:type ex:Disease; rdfs:label "Tuberculosis" .gho:c1-r6 rdf:type qb:Observation; rdf:value "127"^^xsd:integer; qb:dimension gho:Afghanistan; qb:dimension gho:Tuberculosis . 14
  15. 15. RDFized GHO data• Available at http://gho.aksw.org• 50 datasets• ~ 8 million triples• Paper published at SWJ Call for Dataset descriptions:http://www.semantic-web-journal.net/content/publishing-and-interlinking-global-health-observatory-dataset 15
  16. 16. Limitations and Future Work • Conversion • Coherence • Temporal Comparability • Exploring GHO 16
  17. 17. Thank You! Questions?http://aksw.org/AmrapaliZaverizaveri@informatik.uni-leipzig.de 17

×