Course for Doctoral Students
RESEARCH DATA MANAGEMENT AND OPEN DATA
25th July 2015, Social Science Data Archives,
Faculty of Social Sciences, University of Ljubljana
ECPR Summer School 2015
PREPARING DATA AND
DOCUMENTATION FOR DIGITAL
CURATION
Irena Vipavc Brvar, Social Science Data Archives
Content
• Which things should I save and how
• Documentation
• Data
• Metadata (standards)
• What tools are there
SHARING MY RESEARCH
Data should be user-friendly, shareable and usable over the long term.
-> ensure they can be understood and interpreted by any user
This requires clear data description, annotation, contextual information
and documentation.
Documentation
Data documentation might include:
• a survey questionnaire
• an interview schedule
• records of interviewees and their demographic
characteristics in a qualitative study
• variable labels in a table
• published articles that provide background information
• description of the methodology used to collect the data
Source: UK Data Service
What should be captured?
Any useful documentation such as:
• final report, published reports, user guide, working paper,
publications, lab books
Information on dataset structure
• inventory of data files
• relationships between those files
• records, cases...
Variable-level documentation
• labels, codes, classifications
• missing values
• derivations and aggregations
Source: UK Data Service
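The "inventory of data files" item above can be produced automatically. The following is a minimal sketch, not a prescribed procedure: the function names and the choice of columns (relative path, size, SHA-256 checksum) are assumptions made for illustration.

```python
import csv
import hashlib
import io
from pathlib import Path


def build_inventory(folder):
    """Return one row per file: relative path, size in bytes, SHA-256 checksum."""
    root = Path(folder)
    rows = []
    for path in sorted(p for p in root.rglob("*") if p.is_file()):
        rows.append({
            "file": path.relative_to(root).as_posix(),
            "bytes": path.stat().st_size,
            "sha256": hashlib.sha256(path.read_bytes()).hexdigest(),
        })
    return rows


def inventory_as_csv(rows):
    """Render the inventory as CSV text that can be stored next to the data."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=["file", "bytes", "sha256"])
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()
```

Saving the CSV text alongside the deposit lets a later user check that no file is missing or altered.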
What should be captured?
Contextual information about project and data
• background, project history, aims, objectives, hypotheses
• publications based on data collection
Data collection methodology and processes
• data collection process and sampling
• instruments used - questionnaires, showcards, interview schedules
• temporal/geographic coverage
• data validation - cleaning, error checking
• compilation of derived variables
• weighting: factors and variables, weighting process
• secondary data sources used
Data confidentiality, access and use conditions
• anonymisation carried out
• consent conditions/procedures
• access or use conditions of data
Source: UK Data Service
Data-level documentation
Certain types of data file may contain important information
which should be preserved:
• variable/value labels; document metadata; table
relationships and queries in relational databases; GIS data
layers/tables
Some examples:
• SPSS: variable attributes documented in Variable View (label,
code, data type, missing values)
• MS Access: relationships between tables
• ArcGIS: shapefiles (layers) and tables in geodatabase;
metadata created in ArcCatalog
• MS Excel: document properties, worksheet labels (where
multiple)
Source: UK Data Service
Data-level documentation: variable names
All structured, tabular data should have cases/records and variables
adequately documented with names, labels and descriptions.
Variable names might include:
• question number system related to questions in a survey/questionnaire
e.g. Q1a, Q1b, Q2, Q3a
• numerical order system
e.g. V1, V2, V3
• meaningful abbreviations or combinations of abbreviations referring to
meaning of the variable
e.g. oz%=percentage ozone, GOR=Government Office Region,
moocc=mother occupation, faocc=father occupation
• for interoperability across platforms - variable names should be max 8
characters and without spaces
Source: UK Data Service
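The interoperability rule above (max 8 characters, no spaces) is easy to check before depositing. This is a minimal sketch under a stricter rule set that is an assumption of this example, not part of the slide: names must start with a letter and contain only letters, digits or underscores, and must be unique ignoring case. Under that assumption a name such as 'oz%' is flagged.

```python
import re

# Assumed rule set: at most 8 characters, starts with a letter, then only
# letters, digits or underscores (so no spaces or symbols such as "%").
NAME_PATTERN = re.compile(r"[A-Za-z][A-Za-z0-9_]{0,7}")


def check_variable_names(names):
    """Map each problematic name to a short reason; valid names are omitted."""
    problems = {}
    seen = set()
    for name in names:
        if len(name) > 8:
            problems[name] = "longer than 8 characters"
        elif not NAME_PATTERN.fullmatch(name):
            problems[name] = "contains spaces or unsupported characters"
        elif name.lower() in seen:
            problems[name] = "duplicate (ignoring case)"
        seen.add(name.lower())
    return problems
```

For example, `check_variable_names(["Q1a", "V1", "moocc", "mother occupation"])` reports only the last name.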
Data-level documentation: variable labels
Similar principles for variable labels:
• be brief, max. 80 characters
• include unit of measurement where applicable
• reference the question number of a survey or questionnaire
e.g. variable 'q11hexw' with label 'Q11: hours spent taking physical exercise in
a typical week' - the label gives the unit of measurement and a reference to
the question number (Q11)
• Codes of, and reasons for, missing data: avoid blanks, system-missing
or '0' values
e.g. '99=not recorded', '98=not provided (no answer)', '97=not applicable',
'96=not known', '95=error'
• Coding or classification schemes used, with a bibliographic reference
e.g. Standard Occupational Classification 2000 - a list of codes to classify
respondents' jobs; ISO 3166 alpha-2 country codes - an international standard
of 2-letter country codes
Source: UK Data Service
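The label and missing-value guidance above can be sketched in code. The missing-value codes are the ones from the example on this slide; the function names and the idea of returning the reasons alongside the valid values are assumptions made for illustration.

```python
# Missing-value codes taken from the example on the slide.
MISSING_CODES = {
    99: "not recorded",
    98: "not provided (no answer)",
    97: "not applicable",
    96: "not known",
    95: "error",
}

MAX_LABEL_LENGTH = 80


def label_problems(labels):
    """Return names of variables whose label is empty or exceeds 80 characters."""
    return [name for name, label in labels.items()
            if not label.strip() or len(label) > MAX_LABEL_LENGTH]


def split_missing(values):
    """Separate valid values from coded missing values, keeping the reasons."""
    valid = [v for v in values if v not in MISSING_CODES]
    reasons = [MISSING_CODES[v] for v in values if v in MISSING_CODES]
    return valid, reasons
```

Because the codes are explicit, a genuine value of 0 stays in the valid list instead of being mistaken for missing data.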
Data-level documentation: transcripts
Qualitative data/text documents:
• interview transcript speech demarcation (speaker
tags)
• document header with brief details of interview
date, place, interviewer name, interviewee details,
context
Source: UK Data Service
METADATA
Metadata – data about data
Describe your survey using a standard
International standards/schemes
• Data Documentation Initiative (DDI)
• ISO 19115
• Dublin Core
• Metadata Encoding and Transmission Standard (METS)
• Preservation Metadata: Implementation Strategies (PREMIS)
BASIC STRUCTURE OF DDI 2.*
- Section 1.0 ‐ Document Description consists of
bibliographic information that can be considered as the
header whose elements uniquely describe the full
contents of the compliant DDI file.
- Section 2.0 ‐ Study Description consists of information
about the data collection. This section includes
information about who collected and who distributes the
data, about the scope and coverage, sampling (if
relevant), data collection methods and processing,
citation requirements, etc.
Controlled Vocabulary Multilingual
XML
Semantic and technical interoperability
BASIC STRUCTURE OF DDI 2.*
• Section 3.0 ‐ Data Files Description provides
information about the Data file(s).
• Section 4.0 ‐ Variable Description provides a detailed
description of variables, including (when relevant) the
variable type, variable and value labels, literal
questions, computation or imputation methods,
instructions to interviewers, universe, descriptive
statistics, etc.
• Section 5.0 ‐ Other Study Related Materials allows for
the inclusion of other materials related to the study such
as questionnaires, user manuals, computer programs,
interviewer manuals, maps, coding information, etc.
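The five sections above map onto five top-level elements of a DDI 2.* codebook. The following is a minimal sketch using Python's standard library. The element names (`codeBook`, `docDscr`, `stdyDscr`, `fileDscr`, `dataDscr`, `otherMat`) are DDI Codebook names, but the XML namespace and most required elements are deliberately left out, the function names are hypothetical, and the output has not been validated against the DDI schema.

```python
import xml.etree.ElementTree as ET


def _add_title(parent, title):
    """Attach citation/titlStmt/titl with the given title text."""
    citation = ET.SubElement(parent, "citation")
    title_stmt = ET.SubElement(citation, "titlStmt")
    ET.SubElement(title_stmt, "titl").text = title


def ddi_skeleton(title, file_name, variables):
    """Build a bare-bones tree with the five DDI 2.* sections.

    `variables` maps variable names to variable labels.
    """
    codebook = ET.Element("codeBook")

    # Section 1.0: Document Description (describes the DDI file itself)
    doc = ET.SubElement(codebook, "docDscr")
    _add_title(doc, "Codebook for: " + title)

    # Section 2.0: Study Description
    study = ET.SubElement(codebook, "stdyDscr")
    _add_title(study, title)

    # Section 3.0: Data Files Description
    file_dscr = ET.SubElement(codebook, "fileDscr")
    file_txt = ET.SubElement(file_dscr, "fileTxt")
    ET.SubElement(file_txt, "fileName").text = file_name

    # Section 4.0: Variable Description
    data = ET.SubElement(codebook, "dataDscr")
    for name, label in variables.items():
        var = ET.SubElement(data, "var", {"name": name})
        ET.SubElement(var, "labl").text = label

    # Section 5.0: Other Study Related Materials (placeholder entry)
    other = ET.SubElement(codebook, "otherMat")
    ET.SubElement(other, "labl").text = "Questionnaire"

    return codebook
```

Calling `ET.tostring(ddi_skeleton(...), encoding="unicode")` gives the XML text. In practice an authoring tool such as the ones on the following slides would produce a complete, valid file.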
Colectica for Excel
Nesstar Publisher
Nesstar Publisher – a sophisticated authoring environment that can
publish data from a variety of sources (including SPSS, SAS, Excel
etc.). The tool includes a specialised metadata editor, data and
metadata validation routines and metadata templates that provide
standardisation and control.
Easy editing/creation and export
of DDI documented datasets with
no XML experience needed.
Tools to compute/recode/label
new, or existing, variables to be
added to a dataset before
publishing.
Tools to validate metadata and
variables.
The ability to import and export
data to the most common statistical
formats, including delimited files.
The ability to include automatically
generated frequency and summary
statistics for each variable.
Multilingual - Arabic, Chinese,
English, French, Portuguese,
Russian and Spanish and more.
Deposit data to data centre: ADP case

Adversarial Attention Modeling for Multi-dimensional Emotion Regression.pdf
Po-Chuan Chen
 
Operation Blue Star - Saka Neela Tara
Operation Blue Star   -  Saka Neela TaraOperation Blue Star   -  Saka Neela Tara
Operation Blue Star - Saka Neela Tara
Balvir Singh
 

Recently uploaded (20)

Francesca Gottschalk - How can education support child empowerment.pptx
Francesca Gottschalk - How can education support child empowerment.pptxFrancesca Gottschalk - How can education support child empowerment.pptx
Francesca Gottschalk - How can education support child empowerment.pptx
 
Biological Screening of Herbal Drugs in detailed.
Biological Screening of Herbal Drugs in detailed.Biological Screening of Herbal Drugs in detailed.
Biological Screening of Herbal Drugs in detailed.
 
June 3, 2024 Anti-Semitism Letter Sent to MIT President Kornbluth and MIT Cor...
June 3, 2024 Anti-Semitism Letter Sent to MIT President Kornbluth and MIT Cor...June 3, 2024 Anti-Semitism Letter Sent to MIT President Kornbluth and MIT Cor...
June 3, 2024 Anti-Semitism Letter Sent to MIT President Kornbluth and MIT Cor...
 
Chapter 3 - Islamic Banking Products and Services.pptx
Chapter 3 - Islamic Banking Products and Services.pptxChapter 3 - Islamic Banking Products and Services.pptx
Chapter 3 - Islamic Banking Products and Services.pptx
 
The Roman Empire A Historical Colossus.pdf
The Roman Empire A Historical Colossus.pdfThe Roman Empire A Historical Colossus.pdf
The Roman Empire A Historical Colossus.pdf
 
Introduction to AI for Nonprofits with Tapp Network
Introduction to AI for Nonprofits with Tapp NetworkIntroduction to AI for Nonprofits with Tapp Network
Introduction to AI for Nonprofits with Tapp Network
 
Supporting (UKRI) OA monographs at Salford.pptx
Supporting (UKRI) OA monographs at Salford.pptxSupporting (UKRI) OA monographs at Salford.pptx
Supporting (UKRI) OA monographs at Salford.pptx
 
678020731-Sumas-y-Restas-Para-Colorear.pdf
678020731-Sumas-y-Restas-Para-Colorear.pdf678020731-Sumas-y-Restas-Para-Colorear.pdf
678020731-Sumas-y-Restas-Para-Colorear.pdf
 
Digital Tools and AI for Teaching Learning and Research
Digital Tools and AI for Teaching Learning and ResearchDigital Tools and AI for Teaching Learning and Research
Digital Tools and AI for Teaching Learning and Research
 
Mule 4.6 & Java 17 Upgrade | MuleSoft Mysore Meetup #46
Mule 4.6 & Java 17 Upgrade | MuleSoft Mysore Meetup #46Mule 4.6 & Java 17 Upgrade | MuleSoft Mysore Meetup #46
Mule 4.6 & Java 17 Upgrade | MuleSoft Mysore Meetup #46
 
special B.ed 2nd year old paper_20240531.pdf
special B.ed 2nd year old paper_20240531.pdfspecial B.ed 2nd year old paper_20240531.pdf
special B.ed 2nd year old paper_20240531.pdf
 
Model Attribute Check Company Auto Property
Model Attribute  Check Company Auto PropertyModel Attribute  Check Company Auto Property
Model Attribute Check Company Auto Property
 
CACJapan - GROUP Presentation 1- Wk 4.pdf
CACJapan - GROUP Presentation 1- Wk 4.pdfCACJapan - GROUP Presentation 1- Wk 4.pdf
CACJapan - GROUP Presentation 1- Wk 4.pdf
 
Honest Reviews of Tim Han LMA Course Program.pptx
Honest Reviews of Tim Han LMA Course Program.pptxHonest Reviews of Tim Han LMA Course Program.pptx
Honest Reviews of Tim Han LMA Course Program.pptx
 
Thesis Statement for students diagnonsed withADHD.ppt
Thesis Statement for students diagnonsed withADHD.pptThesis Statement for students diagnonsed withADHD.ppt
Thesis Statement for students diagnonsed withADHD.ppt
 
The approach at University of Liverpool.pptx
The approach at University of Liverpool.pptxThe approach at University of Liverpool.pptx
The approach at University of Liverpool.pptx
 
Lapbook sobre os Regimes Totalitários.pdf
Lapbook sobre os Regimes Totalitários.pdfLapbook sobre os Regimes Totalitários.pdf
Lapbook sobre os Regimes Totalitários.pdf
 
2024.06.01 Introducing a competency framework for languag learning materials ...
2024.06.01 Introducing a competency framework for languag learning materials ...2024.06.01 Introducing a competency framework for languag learning materials ...
2024.06.01 Introducing a competency framework for languag learning materials ...
 
Adversarial Attention Modeling for Multi-dimensional Emotion Regression.pdf
Adversarial Attention Modeling for Multi-dimensional Emotion Regression.pdfAdversarial Attention Modeling for Multi-dimensional Emotion Regression.pdf
Adversarial Attention Modeling for Multi-dimensional Emotion Regression.pdf
 
Operation Blue Star - Saka Neela Tara
Operation Blue Star   -  Saka Neela TaraOperation Blue Star   -  Saka Neela Tara
Operation Blue Star - Saka Neela Tara
 

Deposit data to data centre: ADP case

  • 1. Course for Doctoral Students: RESEARCH DATA MANAGEMENT AND OPEN DATA. 25th July 2015, Social Science Data Archives, Faculty of Social Sciences, University of Ljubljana. ECPR Summer School 2015
  • 2. PREPARING DATA AND DOCUMENTATION FOR DIGITAL CURATION. Irena Vipavc Brvar, Social Science Data Archives
  • 3. Content
      • Which things should I save and how
      • Documentation
      • Data
      • Metadata (standards)
      • What tools are there
  • 4. SHARING MY RESEARCH
    Data should be user-friendly, shareable and usable in the long term, so that they can be understood and interpreted by any user. This requires clear data description, annotation, contextual information and documentation.
  • 6. Documentation
    Data documentation might include:
      • a survey questionnaire
      • an interview schedule
      • records of interviewees and their demographic characteristics in a qualitative study
      • variable labels in a table
      • published articles that provide background information
      • a description of the methodology used to collect the data
    Source: UK Data Service
  • 7. What should be captured?
    Any useful documentation such as:
      • final report, published reports, user guide, working paper, publications, lab books
    Information on dataset structure
      • inventory of data files
      • relationships between those files
      • records, cases...
    Variable-level documentation
      • labels, codes, classifications
      • missing values
      • derivations and aggregations
    Source: UK Data Service
  • 8. What should be captured?
    Contextual information about project and data
      • background, project history, aims, objectives, hypotheses
      • publications based on data collection
    Data collection methodology and processes
      • data collection process and sampling
      • instruments used: questionnaires, showcards, interview schedules
      • temporal/geographic coverage
      • data validation: cleaning, error checking
      • compilation of derived variables
      • weighting: factors and variables, weighting process
      • secondary data sources used
    Data confidentiality, access and use conditions
      • anonymisation carried out
      • consent conditions/procedures
      • access or use conditions of data
    Source: UK Data Service
  • 9. Data-level documentation
    Certain types of data file may contain important information which should be preserved:
      • variable/value labels; document metadata; table relationships and queries in relational databases; GIS data layers/tables
    Some examples:
      • SPSS: variable attributes documented in Variable View (label, code, data type, missing values)
      • MS Access: relationships between tables
      • ArcGIS: shapefiles (layers) and tables in geodatabase; metadata created in ArcCatalog
      • MS Excel: document properties, worksheet labels (where multiple)
    Source: UK Data Service
  • 10. Data-level documentation: variable names
    All structured, tabular data should have cases/records and variables adequately documented with names, labels and descriptions.
    Variable names might include:
      • a question number system related to questions in a survey/questionnaire, e.g. Q1a, Q1b, Q2, Q3a
      • a numerical order system, e.g. V1, V2, V3
      • meaningful abbreviations or combinations of abbreviations referring to the meaning of the variable, e.g. oz%=percentage ozone, GOR=Government Office Region, moocc=mother occupation, faocc=father occupation
      • for interoperability across platforms, variable names should be max. 8 characters and without spaces
    Source: UK Data Service
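The interoperability rule for variable names can be checked automatically before deposit. The following is a minimal sketch, assuming the rule is read as "1 to 8 characters, none of them whitespace"; the function name `check_variable_name` is hypothetical, and individual statistical packages may impose further restrictions that are not covered here.

```python
import re

# Assumption: "max 8 characters and without spaces" is read literally.
MAX_NAME_LENGTH = 8


def check_variable_name(name):
    """Return a list of problems found in a variable name (empty if none)."""
    problems = []
    if not name:
        problems.append("name is empty")
    if len(name) > MAX_NAME_LENGTH:
        problems.append("longer than %d characters" % MAX_NAME_LENGTH)
    if re.search(r"\s", name):
        problems.append("contains whitespace")
    return problems


if __name__ == "__main__":
    # Names from the slide plus one deliberately bad name.
    for candidate in ["Q1a", "V1", "moocc", "mother occupation"]:
        print(candidate, "->", check_variable_name(candidate) or "ok")
```

Returning a list of problems rather than a single true/false value makes it easier to report every issue for a name at once.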
  • 11. Data-level documentation: variable labels
    Similar principles for variable labels:
      • be brief, max. 80 characters
      • include the unit of measurement where applicable
      • reference the question number of a survey or questionnaire, e.g. variable 'q11hexw' with label 'Q11b: hours spent taking physical exercise in a typical week': the label gives the unit of measurement and a reference to the question number (Q11b)
    Codes of, and reasons for, missing data:
      • avoid blanks, system-missing or '0' values, e.g. '99=not recorded', '98=not provided (no answer)', '97=not applicable', '96=not known', '95=error'
    Coding or classification schemes used, with a bibliographic reference:
      • e.g. Standard Occupational Classification 2000, a list of codes to classify respondents' jobs; ISO 3166 alpha-2 country codes, an international standard of 2-letter country codes
    Source: UK Data Service
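The label and missing-data guidance above can be sketched in a few lines. This is an illustration only: the code table is copied from the slide's example, while the helper names (`check_label`, `recode_missing`, `summarise_missing`) and the choice of 99 as the fallback code for blanks are assumptions made for the example.

```python
MAX_LABEL_LENGTH = 80

# Missing-value codes taken from the example on the slide.
MISSING_CODES = {
    99: "not recorded",
    98: "not provided (no answer)",
    97: "not applicable",
    96: "not known",
    95: "error",
}


def check_label(label):
    """True if the label respects the 80-character guideline."""
    return len(label) <= MAX_LABEL_LENGTH


def recode_missing(values, fallback=99):
    """Replace blanks (None or empty string) with an explicit missing code.

    Assumption: blanks are treated as 'not recorded' unless told otherwise.
    """
    return [fallback if v is None or v == "" else v for v in values]


def summarise_missing(values):
    """Count how often each documented missing code occurs."""
    counts = {}
    for v in values:
        if v in MISSING_CODES:
            reason = MISSING_CODES[v]
            counts[reason] = counts.get(reason, 0) + 1
    return counts


if __name__ == "__main__":
    raw = [3, None, "", 97, 5]
    clean = recode_missing(raw)
    print(clean)
    print(summarise_missing(clean))
    print(check_label("Q11b: hours spent taking physical exercise in a typical week"))
```

Keeping the code table in one place means the same dictionary can be exported alongside the data as part of the variable-level documentation.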
  • 12. Data-level documentation: transcripts
    Qualitative data/text documents:
      • interview transcript speech demarcation (speaker tags)
      • document header with brief details of interview date, place, interviewer name, interviewee details, context
    Source: UK Data Service
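A consistent transcript layout is easy to generate from structured input. The sketch below shows one possible layout, with a header block followed by speaker-tagged turns; the function name `format_transcript`, the field names, the speaker tags "I" and "R", and all values are hypothetical and are not prescribed by the slide.

```python
def format_transcript(header, turns):
    """Build a plain-text transcript.

    header: dict of header fields (date, place, interviewer, ...)
    turns:  list of (speaker_tag, text) tuples in spoken order
    """
    lines = ["%s: %s" % (field, value) for field, value in header.items()]
    lines.append("")  # blank line separates the header from the dialogue
    for speaker, text in turns:
        lines.append("%s: %s" % (speaker, text))
    return "\n".join(lines)


if __name__ == "__main__":
    # Illustrative values only.
    header = {
        "Interview date": "2015-07-25",
        "Place": "Ljubljana",
        "Interviewer": "Interviewer A",
        "Interviewee": "Respondent 01 (details anonymised)",
    }
    turns = [
        ("I", "How do you store your research data?"),
        ("R", "Mostly on a shared network drive."),
    ]
    print(format_transcript(header, turns))
```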
  • 13. METADATA
    Metadata: data about data
    Describe your survey using international standards/schemes:
      • Data Documentation Initiative (DDI)
      • ISO 19115
      • Dublin Core
      • Metadata Encoding and Transmission Standard (METS)
      • Preservation Metadata Maintenance Activity (PREMIS)
  • 14. BASIC STRUCTURE OF DDI 2.*
      • Section 1.0 - Document Description consists of bibliographic information that can be considered as the header whose elements uniquely describe the full contents of the compliant DDI file.
      • Section 2.0 - Study Description consists of information about the data collection. This section includes information about who collected and who distributes the data, about the scope and coverage, sampling (if relevant), data collection methods and processing, citation requirements, etc.
    Features: controlled vocabulary; multilingual; XML; semantic and technical interoperability
  • 15. BASIC STRUCTURE OF DDI 2.*
      • Section 3.0 - Data Files Description provides information about the data file(s).
      • Section 4.0 - Variable Description provides a detailed description of variables, including (when relevant) the variable type, variable and value labels, literal questions, computation or imputation methods, instructions to interviewers, universe, descriptive statistics, etc.
      • Section 5.0 - Other Study Related Materials allows for the inclusion of other materials related to the study such as questionnaires, user manuals, computer programs, interviewer manuals, maps, coding information, etc.
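The five sections described on the two slides above correspond to the top-level elements of a DDI Codebook (2.x) XML document: `docDscr`, `stdyDscr`, `fileDscr`, `dataDscr` and `otherMat` under a `codeBook` root. The following is a minimal sketch that builds such a skeleton with Python's standard library. It is not schema-valid as it stands: the namespace and many elements and attributes are deliberately omitted, and the function names, titles, file name and variable are illustrative assumptions.

```python
import xml.etree.ElementTree as ET


def add_title(parent, title):
    """Attach citation/titlStmt/titl under a section element."""
    citation = ET.SubElement(parent, "citation")
    titl_stmt = ET.SubElement(citation, "titlStmt")
    ET.SubElement(titl_stmt, "titl").text = title


def build_codebook(study_title, file_name, variables):
    """Return a bare-bones codeBook element with the five DDI 2.x sections.

    variables: dict mapping variable name -> variable label
    """
    root = ET.Element("codeBook")

    # Section 1.0: Document Description (describes the DDI file itself)
    doc = ET.SubElement(root, "docDscr")
    add_title(doc, "DDI documentation for: " + study_title)

    # Section 2.0: Study Description (describes the data collection)
    study = ET.SubElement(root, "stdyDscr")
    add_title(study, study_title)

    # Section 3.0: Data Files Description
    file_dscr = ET.SubElement(root, "fileDscr")
    file_txt = ET.SubElement(file_dscr, "fileTxt")
    ET.SubElement(file_txt, "fileName").text = file_name

    # Section 4.0: Variable Description
    data = ET.SubElement(root, "dataDscr")
    for name, label in variables.items():
        var = ET.SubElement(data, "var", name=name)
        ET.SubElement(var, "labl").text = label

    # Section 5.0: Other Study Related Materials
    other = ET.SubElement(root, "otherMat")
    ET.SubElement(other, "labl").text = "Questionnaire"

    return root


if __name__ == "__main__":
    codebook = build_codebook(
        "Example study",  # hypothetical title
        "example_data.csv",  # hypothetical file name
        {"q11hexw": "Q11b: hours spent taking physical exercise in a typical week"},
    )
    print(ET.tostring(codebook, encoding="unicode"))
```

In practice a dedicated DDI editor would produce the full, validated document; the sketch only makes the section structure concrete.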
  • 19. Nesstar Publisher
    Nesstar Publisher is a sophisticated authoring environment that can publish data from a variety of sources (including SPSS, SAS, Excel etc.). The tool includes a specialised metadata editor, data and metadata validation routines, and metadata templates that provide standardisation and control.
      • Easy editing/creation and export of DDI-documented datasets, with no XML experience needed.
      • Tools to compute/recode/label new or existing variables to be added to a dataset before publishing.
      • Tools to validate metadata and variables.
      • The ability to import and export data to the most common statistical formats, including delimited files.
      • The ability to include automatically generated frequency and summary statistics for each variable.
      • Multilingual: Arabic, Chinese, English, French, Portuguese, Russian, Spanish and more.