2012.10 - Workshop on Semantic Statistics - 1

  1. Leveraging the DDI Model for Linked Statistical Data in the Social, Behavioural, and Economic Sciences
     Workshop on Semantic Statistics, 15.10.2012 – 19.10.2012
     Thomas Bosch, M.Sc. (TUM), postgraduate student, http://boschthomas.blogspot.com
     GESIS - Leibniz Institute for the Social Sciences
  2. Agenda
  3. Why DDI as Linked Data?
     • Currently no such ontology available
     • To increase visibility of data holdings using mainstream Web technologies
     • To open DDI to the Linked Data community
     • To process DDI-RDF by RDF tools
     • To link DDI-RDF to other RDF data
     • To better identify opportunities for merging datasets
     • To enable inferencing
     • To research microdata within the LOD cloud
  4. How was the DDI Ontology developed?
     • DDI subset
         • Covers the most important DDI elements
     • Use cases
         • Experts in the statistics domain formulated the use cases seen as most significant for solving frequent problems
         • Most important use case: discover microdata connected with multiple studies
     • Convert existing DDI-XML documents to DDI-RDF automatically
         • Direct mapping
         • Generic mapping (Bosch and Mathiak, 2011)
  5. Discovery Use Case
     • Which studies are connected with a specific coverage consisting of the 3 dimensions time, country, and subject?
     • What questions with a specific question text are contained in the study questionnaire?
     • What questions are connected with a concept with a specific label?
     • What questions are combined with a variable with an associated coverage consisting of the 3 dimensions time, country, and subject?
     • What concepts are linked to particular variables or questions?
     • What representation does a specific variable have?
     • What codes and what categories are part of this representation?
     • What variable label does a variable with a particular variable name have?
     • What is the maximum value of a certain variable?
     • What are the absolute and relative frequencies of a specific code?
     • What data files contain the entire dataset?
  6. (slide contains no text)
  7. study | coverage
  8. (slide contains no text)
  9. instrument | question | concept
  10. (slide contains no text)
  11. (slide contains no text)
  12. values | value labels
  13. (slide contains no text)
  14. (slide contains no text)
  15. variable | descriptive statistics
  16. (slide contains no text)
  17. (slide contains no text)
  18. logical dataset | dataset | data file
  19. (slide contains no text)
  20. (slide contains no text)
  21. conceptual model
  22. (slide contains no text)
  23. Open Issues
     • DDI Ontology URL and Prefix
     • DC namespace
     • Naming Conventions
     • Cardinalities
     • Consistency Check
     • Universe vs. Coverage
     • DescriptiveStatistics
     • Study Groups
     • Classes
     • Datatype Properties
     • Object Properties
  24. Thank you for your attention!
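
The direct mapping on slide 4 and a few of the discovery questions on slide 5 can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration: the XML element and attribute names are a heavily simplified stand-in for real DDI-XML, the `EX` and `DDI` namespaces and all property names are hypothetical (slide 23 lists the ontology URL and prefix as an open issue), and triples are held as plain Python tuples rather than in an RDF library. It is not the mapping of Bosch and Mathiak (2011).

```python
import xml.etree.ElementTree as ET

# Hypothetical, heavily simplified DDI-like XML (real DDI-XML is namespaced
# and far richer).
DDI_XML = """
<Study id="study1">
  <Title>Example Survey 2010</Title>
  <Coverage time="2010" country="DE" subject="employment"/>
  <Variable id="v1" name="SEX" label="Sex of respondent">
    <Code value="1" category="male" frequency="40"/>
    <Code value="2" category="female" frequency="60"/>
  </Variable>
</Study>
"""

EX = "http://example.org/"        # hypothetical instance namespace
DDI = "http://example.org/ddi#"   # hypothetical vocabulary namespace
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"


def direct_mapping(xml_text):
    """Map each XML element/attribute one-to-one onto (s, p, o) triples."""
    root = ET.fromstring(xml_text)
    study = EX + root.get("id")
    triples = {
        (study, RDF_TYPE, DDI + "Study"),
        (study, DDI + "title", root.findtext("Title")),
    }
    coverage = root.find("Coverage")
    for dim in ("time", "country", "subject"):
        triples.add((study, DDI + dim, coverage.get(dim)))
    for var in root.findall("Variable"):
        v = EX + var.get("id")
        triples.add((study, DDI + "variable", v))
        triples.add((v, DDI + "name", var.get("name")))
        triples.add((v, DDI + "label", var.get("label")))
        for code in var.findall("Code"):
            c = f"{v}/code/{code.get('value')}"
            triples.add((v, DDI + "code", c))
            triples.add((c, DDI + "category", code.get("category")))
            triples.add((c, DDI + "frequency", int(code.get("frequency"))))
    return triples


def match(triples, s=None, p=None, o=None):
    """Return all triples matching the pattern; None acts as a wildcard."""
    return [t for t in triples
            if (s is None or t[0] == s)
            and (p is None or t[1] == p)
            and (o is None or t[2] == o)]


def studies_with_coverage(triples, time, country, subject):
    """Which studies have a coverage with these three dimensions?"""
    wanted = {"time": time, "country": country, "subject": subject}
    studies = {s for s, _, _ in match(triples, p=RDF_TYPE, o=DDI + "Study")}
    return sorted(s for s in studies
                  if all(match(triples, s, DDI + dim, val)
                         for dim, val in wanted.items()))


def variable_label(triples, name):
    """What label does the variable with this name have?"""
    return [label
            for v, _, _ in match(triples, p=DDI + "name", o=name)
            for _, _, label in match(triples, s=v, p=DDI + "label")]


def code_frequencies(triples, variable):
    """Absolute and relative frequency of each code of a variable."""
    absolute = {c: match(triples, s=c, p=DDI + "frequency")[0][2]
                for _, _, c in match(triples, s=variable, p=DDI + "code")}
    total = sum(absolute.values())
    return {c: (n, n / total) for c, n in absolute.items()}


triples = direct_mapping(DDI_XML)
print(studies_with_coverage(triples, "2010", "DE", "employment"))
print(variable_label(triples, "SEX"))
print(code_frequencies(triples, EX + "v1"))
```

In practice the triples would more likely be loaded into an RDF store and the discovery questions expressed as SPARQL queries; the tuple-based `match` helper only mimics a single triple pattern to keep the sketch dependency-free.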
