Bosch, Wackerow: Linked data on the web

DDI-RDF Discovery Vocabulary: A Metadata Vocabulary for Documenting Research and Survey Data

Overview:
- What is DDI?
- Motivation
- Relationships to Vocabularies
- DDI-RDF Discovery Vocabulary
- Conceptual Model

Published in Technology, Education

Transcript

  • 1. DDI-RDF Discovery Vocabulary: A Metadata Vocabulary for Documenting Research and Survey Data
      Linked Data on the Web (LDOW 2013), 14.05.2013
      • Thomas Bosch, GESIS, Germany, thomas.bosch@gesis.org
      • Richard Cyganiak, DERI, Ireland, richard.cyganiak@deri.org
      • Arofan Gregory, Open Data Foundation, USA, agregory@opendatafoundation.org
      • Joachim Wackerow, GESIS, Germany, joachim.wackerow@gesis.org
  • 2. Outline
      • What is DDI?
      • Motivation
      • Relationships to Vocabularies
      • DDI-RDF Discovery Vocabulary
      • Conceptual Model
  • 3. What is DDI?
      • DDI (Data Documentation Initiative)
      • DDI is an established international standard for the documentation and management of data from the social, behavioral, and economic sciences
      • DDI is a data model for describing statistical data
          • data collected for research and official statistics
      • The DDI Alliance
          • international consortium of more than 35 member institutions
          • produces and maintains the DDI
  • 4. What is DDI?
      • DDI supports the entire research data lifecycle
      • Secondary analysis results can be reproduced
  • 5. What is DDI?
      • DDI focuses on the documentation of microdata
      • DDI also supports aggregated data
      • DDI-C (Codebook)
          • general information about a study
          • data dictionary
      • DDI-L (Lifecycle)
          • description of more complex multi-wave studies
          • throughout the data lifecycle
  • 6. What is DDI?
      • Structured, high-quality metadata enable secondary analysis without the need to contact the primary researcher
      • DDI enables the reuse of metadata from existing studies when designing new studies
      • DDI is currently specified using XML Schemas
      • The XML Schemas are organized in multiple modules corresponding to the individual stages of the research data lifecycle
      • The XML Schemas comprise over 800 XML elements
  • 7. Motivation for the DDI Community
      • publish microdata (data sets representing microdata)
      • increase visibility of microdata
      • increase use of microdata
      • discover microdata
      • enable inferencing on microdata
      • harmonize microdata (make microdata comparable)
      • RDF tools can process DDI-RDF
  • 8. Motivation for the LD Community
      • an ontology describing the statistical domain is now available
      • publish microdata
      • publish metadata on microdata
      • metadata about already published but under-documented microdata can be published
      • RDF tools can process DDI-RDF
      • link microdata to other microdata, making the data and the results of research (e.g. publications) more closely connected
  • 9. Relationships to Vocabularies
      • DCMI Metadata Terms
          • are used for citation purposes
      • Simple Knowledge Organization System (SKOS)
          • is used for creating hierarchies of concepts similar to thesauri and classification systems
      • SKOS Extension (XKOS)
          • a vocabulary which extends SKOS to allow for a more complete description of formal statistical classifications
          • planned for publication in 2013 by the DDI Alliance
          • reference: https://github.com/linked-statistics/xkos
  • 10. Relationships to Vocabularies
      • Data Catalog Vocabulary (DCAT)
          • W3C standard for describing catalogs of data sets
          • disco:LogicalDataSet ⊑ dcat:Dataset
          • disco:DataFile ⊑ dcat:Distribution
      • RDF Data Cube Vocabulary
          • W3C standard for representing data cubes, i.e. multidimensional aggregate data
          • disco:aggregation (disco:LogicalDataSet, qb:DataSet)
          • disco:inputVariable (qb:DataSet, disco:Variable)
          • reference: http://www.w3.org/TR/vocab-data-cube/
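The two subclass axioms on slide 10 mean that anything typed as a disco:LogicalDataSet can also be treated as a dcat:Dataset by DCAT-aware tools. The following is a minimal sketch of that inference using plain Python tuples as triples. The disco namespace IRI is an assumption based on the draft URL given in the slides, and the `http://example.org/` resources are hypothetical.

```python
# Namespace IRIs. The disco IRI is an assumption derived from the draft URL.
DISCO = "http://rdf-vocabulary.ddialliance.org/discovery#"
DCAT = "http://www.w3.org/ns/dcat#"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
RDFS_SUBCLASS = "http://www.w3.org/2000/01/rdf-schema#subClassOf"

# The two subclass axioms from the slide as (subject, predicate, object).
AXIOMS = {
    (DISCO + "LogicalDataSet", RDFS_SUBCLASS, DCAT + "Dataset"),
    (DISCO + "DataFile", RDFS_SUBCLASS, DCAT + "Distribution"),
}


def infer_types(triples):
    """Apply the RDFS subclass rule until no new rdf:type triples appear."""
    inferred = set(triples)
    changed = True
    while changed:
        changed = False
        subclass_of = [(s, o) for s, p, o in inferred if p == RDFS_SUBCLASS]
        for s, p, o in list(inferred):
            if p != RDF_TYPE:
                continue
            for sub, sup in subclass_of:
                if o == sub and (s, RDF_TYPE, sup) not in inferred:
                    inferred.add((s, RDF_TYPE, sup))
                    changed = True
    return inferred


# Hypothetical instance data, for illustration only.
EX = "http://example.org/"
data = {
    (EX + "survey2013", RDF_TYPE, DISCO + "LogicalDataSet"),
    (EX + "survey2013.csv", RDF_TYPE, DISCO + "DataFile"),
}
closure = infer_types(data | AXIOMS)
```

After running this, `closure` additionally contains the statements that the hypothetical survey is a dcat:Dataset and its file a dcat:Distribution. A real deployment would more likely rely on an RDFS reasoner than on a hand-written loop.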
  • 11. DDI-RDF Discovery Vocabulary
      • contains only a small subset of DDI-XML, plus additional axioms
      • The conceptual model is derived from use cases which are typical in the statistical community
      • Statistical domain experts have formulated these use cases, which are seen as most significant for solving frequent problems
      • enables publishing and discovering microdata and metadata about microdata (research and survey data) in the Web of Linked Data
  • 12. DDI-RDF Discovery Vocabulary
      • Availability of (meta)data
          • Microdata may be available (typically as CSV files)
          • In most cases, metadata about microdata is NOT available
      • contains major types of metadata of DDI-C and DDI-L
      • Mappings from DDI-XML to DDI-RDF
      • No straightforward mapping from DDI-RDF to DDI-XML
          • enables better support for the LD community
          • partly no corresponding constructs in DDI-XML
      • 26 experts from the statistics and the Linked Data community of 12 different countries have contributed
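Slide 12 notes that microdata is often published as a bare CSV file with no accompanying metadata. As a rough illustration of the kind of description the vocabulary is meant to carry, the sketch below derives a minimal variable list from a CSV header and writes it as N-Triples lines. Several things here are assumptions rather than statements from the slides: the disco namespace IRI, the use of `containsVariable` to link a LogicalDataSet to its Variables (the label appears in the conceptual model, its endpoints are assumed), the IRI pattern for variables, and the helper names `literal` and `describe_csv`.

```python
import csv
import io
from urllib.parse import quote

DISCO = "http://rdf-vocabulary.ddialliance.org/discovery#"  # assumed IRI
SKOS = "http://www.w3.org/2004/02/skos/core#"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"


def literal(text):
    """Quote a string as a plain N-Triples literal."""
    return '"' + text.replace("\\", "\\\\").replace('"', '\\"') + '"'


def describe_csv(csv_text, dataset_iri):
    """Return N-Triples lines describing the columns of a CSV file.

    Each column becomes a disco:Variable whose skos:notation is the
    column name. Linking via disco:containsVariable is an assumption.
    """
    header = next(csv.reader(io.StringIO(csv_text)))
    lines = [f"<{dataset_iri}> <{RDF_TYPE}> <{DISCO}LogicalDataSet> ."]
    for name in header:
        # Hypothetical IRI pattern: one IRI per column under the data set.
        var_iri = f"{dataset_iri}/variable/{quote(name, safe='')}"
        lines.append(f"<{var_iri}> <{RDF_TYPE}> <{DISCO}Variable> .")
        lines.append(f"<{var_iri}> <{SKOS}notation> {literal(name)} .")
        lines.append(f"<{dataset_iri}> <{DISCO}containsVariable> <{var_iri}> .")
    return lines


# Hypothetical two-column file, for illustration only.
triples = describe_csv("id,age\n1,34\n", "http://example.org/survey2013")
```

Printing `"\n".join(triples)` gives a small graph that generic RDF tools can load. Labels, question texts and universes would still have to come from the study documentation, since a CSV header does not contain them.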
  • 13. Conceptual Model: class overview (UML class diagram)
      • Classes: Study, StudyGroup, LogicalDataSet (dcat:Dataset), Variable, Question, Instrument, Questionnaire, Universe (skos:Concept), AnalysisUnit (skos:Concept), and one «union» class
      • Associations: product, inGroup, variable, universe, containsVariable, question, analysisUnit
  • 14. class study-universe (UML class diagram)
      • Study: owl:versionInfo
      • StudyGroup
      • «union»: dcterms:abstract : rdf:langString, dcterms:alternative : rdf:langString, dcterms:available : xsd:dateTime, dcterms:title : rdf:langString, purpose : rdf:langString, subtitle : rdf:langString
      • Universe (skos:Concept): skos:definition : rdf:langString
      • AnalysisUnit (skos:Concept)
      • Associations: analysisUnit, universe, inGroup
  • 15. class variable (UML class diagram)
      • Variable: dcterms:description : rdf:langString, skos:notation : rdfs:Literal, skos:prefLabel : rdf:langString
      • VariableDefinition: dcterms:description : rdf:langString, skos:prefLabel : rdf:langString
      • skos:Concept: skos:definition : rdf:langString, skos:notation : rdfs:Literal, skos:prefLabel : rdf:langString
      • Representation
      • Associations: skos:narrower, skos:broader, representation, concept, basedOn
  • 16. class representation (UML class diagram)
      • skos:Concept: skos:definition : rdf:langString, skos:notation : rdfs:Literal, skos:prefLabel : rdf:langString
      • Representation («union», shown with rdfs:Datatype and skos:ConceptScheme)
      • Associations: skos:hasTopConcept, skos:inScheme, skos:narrower, skos:broader
  • 17. class overview-data-set (UML class diagram)
      • LogicalDataSet (dcat:Dataset): dcterms:title : rdf:langString, isPublic : xsd:boolean
      • DataFile (dcat:Distribution, dcterms:Dataset): caseQuantity : xsd:nonNegativeInteger, dcterms:description : rdf:langString, owl:versionInfo : string
      • DescriptiveStatistics
      • CategoryStatistics: cumulativePercentage : xsd:decimal, frequency : xsd:nonNegativeInteger, percentage : xsd:decimal, weightedCumulativePercentage : xsd:decimal, weightedFrequency : xsd:nonNegativeInteger, weightedPercentage : xsd:decimal
      • SummaryStatistics: invalidCases : xsd:nonNegativeInteger, maximum : xsd:decimal, mean : xsd:decimal, median : xsd:decimal, minimum : xsd:decimal, mode : xsd:decimal, standardDeviation : xsd:decimal, validCases : xsd:nonNegativeInteger, weightedInvalidCases : xsd:nonNegativeInteger, weightedMean : xsd:decimal, weightedMedian : xsd:decimal, weightedMode : xsd:decimal, weightedValidCases : xsd:nonNegativeInteger
      • Associations: statisticsDataFile, dataFile
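The SummaryStatistics attributes on slide 17 are ordinary descriptive statistics over one variable. As a hedged sketch of how such values might be computed before being written out as metadata, the code below derives a few of them from a CSV column using only the Python standard library. The sample data, the helper names `parse_decimal` and `summary_statistics`, the rule that empty or non-numeric cells count as invalid cases, and the choice of the sample (not population) standard deviation are all assumptions made for illustration; the slides do not define them.

```python
import csv
import io
import statistics
from decimal import Decimal, InvalidOperation


def parse_decimal(raw):
    """Return a finite Decimal, or None for empty or non-numeric cells."""
    try:
        value = Decimal((raw or "").strip())
    except InvalidOperation:
        return None
    return value if value.is_finite() else None


def summary_statistics(cells):
    """Compute values keyed by the SummaryStatistics attribute names."""
    parsed = [parse_decimal(cell) for cell in cells]
    valid = [value for value in parsed if value is not None]
    stats = {
        "validCases": len(valid),
        "invalidCases": len(parsed) - len(valid),
    }
    if valid:
        stats["minimum"] = min(valid)
        stats["maximum"] = max(valid)
        stats["mean"] = statistics.mean(valid)
        stats["median"] = statistics.median(valid)
    if len(valid) >= 2:
        # Assumption: sample standard deviation.
        stats["standardDeviation"] = statistics.stdev(valid)
    return stats


# Hypothetical microdata with one empty and one non-numeric cell.
SAMPLE = "id,age\n1,34\n2,51\n3,\n4,NA\n5,27\n6,44\n"
rows = csv.DictReader(io.StringIO(SAMPLE))
age_stats = summary_statistics(row["age"] for row in rows)
```

For the sample above, four cells are valid and two are invalid. `Decimal` is used because the attributes are typed as xsd:decimal in the diagram; the weighted variants would additionally need a weight column, which is omitted here.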
  • 18. class Data Collection (UML class diagram)
      • Question: questionText : rdf:langString, skos:prefLabel : rdf:langString
      • Questionnaire
      • Instrument: dcterms:description : rdf:langString, skos:prefLabel : rdf:langString
      • foaf:Document
      • Representation
      • Associations: externalDocumentation, question, responseDomain
  • 19. Thank you for your attention…
      • Unofficial draft [planned as specification by DDI Alliance by 2013]: http://rdf-vocabulary.ddialliance.org/discovery
      • Specification (current state) on GitHub repository: https://github.com/linked-statistics/disco-spec
      • Scenarios for the DDI-RDF Discovery Vocabulary [in preparation]: http://dx.doi.org/10.3886/DDISemanticWeb02
      • Thomas Bosch, GESIS - Leibniz Institute for the Social Sciences
          • thomas.bosch@gesis.org
          • boschthomas.blogspot.com
          • https://github.com/boschthomas/PhD
  • 20. Acknowledgements
      26 experts from the statistical community and the Linked Data community, coming from 12 different countries, contributed to this work. They participated in the events mentioned below.
      • 1st workshop on Semantic Statistics for Social, Behavioural, and Economic Sciences: Leveraging the DDI Model for the Linked Data Web, at Schloss Dagstuhl - Leibniz Center for Informatics, Germany, in September 2011
      • Working meeting in the course of the 3rd Annual European DDI Users Group Meeting (EDDI11) in Gothenburg, Sweden, in December 2011
      • 2nd workshop on Semantic Statistics for Social, Behavioural, and Economic Sciences: Leveraging the DDI Model for the Linked Data Web, at Schloss Dagstuhl - Leibniz Center for Informatics, Germany, in October 2012
      • Working meeting at GESIS - Leibniz Institute for the Social Sciences in Mannheim, Germany, in February 2013