Brislinger, Recker: Keeping data re-usable in the EVS


Alive and kicking! Keeping data re-usable in the European Values Study:

- Data and information flow in the EVS project
- Principles and workflows for managing data and documentation in survey projects

Published in: Business, Technology, Education

  1. Alive and kicking! Keeping data re-usable in the European Values Study
     IASSIST Cologne, May
     Evelyn.Brislinger@gesis.org
     GESIS, Data Archive for the Social Sciences
  2. Overview
     - Data and information flow in the EVS project
     - Principles and workflows for managing data and documentation in survey projects
  3. GESIS Data Archive
     - Basis
       - Interplay between Principal Investigators (PI) and Data Archive
       - Agreement on submission of data and information packages
     - Goals
       - Ease access to data for a broad user community
       - Provide metadata for discovery, understanding, and good use of data
       - Preserve data and metadata for re-use and replications
     - Holdings
       - Studies, study series, and complex survey programs such as ISSP, Eurobarometer, ALLBUS, European Values Study (EVS), or election studies
  4. Data and information created in a survey project
     - Total stock of data and documentation created
     - Data and documentation submitted to an archive
     - Further information necessary for the project (?)
     - Selection processes
     - Management solutions for structuring data and information
  5. Example: European Values Study (EVS)
     - Cross-national, longitudinal research program
     - 9-year-period, 4 waves
     - 49 countries, 125 national surveys
     - Harmonization and integration process
       - National surveys
       - Waves 1981/1990/1999/2008
       - Longitudinal data File 1981-2008 (LdF)
       - Integrated Values Surveys EVS/WVS (IVS)
     - Number of files, size of files
     - Atlas of European
  6. Collaboration of actors involved (EVS 2008)
     - National team: data created, processed, documented
     - Central team: data standardized, harmonized, integrated
     - Principal Investigators
     - Data Archive: data checked, documented, preserved, released
     - Secondary users: data re-used, analyses replicated, results reported
  7. Users: analyze and evaluate outcomes
     - Questions: check trend questions and original questions (ZACAT-Online Study Catalogue)
     - Data: analyze data, report errors, monitor error reporting (GESIS Data Catalogue)
     - Publications: replicate analysis of other projects (EVS Repository)
     - … and detect peculiarities in questions or problems in data
  8. Peculiarities in question text spotted?
     - EVS 2008 data lifecycle: Project Design, Questionnaire Design, Questionnaire Translation, Data Collection, Data Documentation, Data Processing
     - Check question and translation: master/field questionnaire, methodological questionnaire, report ‘Translation History’
     - Check source of question: trend question from EVS and WVS, questions borrowed from other surveys
     - Identify consequences for: countries sharing/adopting affected language, languages belonging to a family, further languages used in a country
  9. Data error detected?
     - Standardization and harmonization process: check comparability of surveys, questions, variables → cumulate data and document each step
     - Data files along the process: Original data file, National data, Wave 1999 …, Wave 2008, Longitudinal data File 1981-2008, Integrated Values Surveys EVS/WVS
     - Error detected
     - Retrace data processing steps across surveys: check data, syntax files, and documentation → update data and highlight problems for next wave
  10. Experiences from EVS project: data and information created
      - Designated communities: Principal Investigator/Project; Secondary user
      - Data and information packages: Project package; Archive package
      - Selection processes: within project; between project and archive
      - Diagram labels: Total stock, Project, Archive
  11. Communicating with the future: Activity on two levels
      - Macro level: defining workflows, file and information paths on which necessary information is passed on
      - Micro level: organizing information so that it is re-usable (RDM, metadata, systematic file structures)
  12. Begin by identifying principles for structuring and documenting files in the project (Research Data Management)
      - A tidy house, a tidy mind!
      - Select: which information is relevant to whom?
      - Reference, don’t duplicate files whenever possible
      - Identify and capture “kinship relations”
      - Capture process knowledge
      - classes, itineraries
      - Make changes traceable
      - versioning, document revisions & annotations, minutes, protocols
  13. The magic wand (and not the one used by the sorcerer’s apprentice)
      - Follow principles of good research data management (RDM)
      - Use metadata to document process and content information
      - Use standards wherever possible (e.g. DDI, Dublin Core, ISO codes, file naming conventions, etc.)
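A file naming convention of the kind mentioned on this slide could be enforced with a small parser. The sketch below is only an illustration: the pattern `<study>_<wave>_<country>_<kind>_v<major>-<minor>-<patch>.<ext>`, the class `FileName`, and the function `parse_file_name` are hypothetical and are not taken from the EVS project. The country field is assumed to be a two-letter ISO 3166-1 code.

```python
import re
from dataclasses import dataclass

# Hypothetical convention (an assumption, not the EVS rule):
#   <study>_<wave>_<country>_<kind>_v<major>-<minor>-<patch>.<ext>
NAME_PATTERN = re.compile(
    r"^(?P<study>[A-Z]+)_(?P<wave>\d{4})_(?P<country>[A-Z]{2})_"
    r"(?P<kind>[a-z]+)_v(?P<major>\d+)-(?P<minor>\d+)-(?P<patch>\d+)"
    r"\.(?P<ext>[a-z0-9]+)$"
)


@dataclass(frozen=True)
class FileName:
    study: str
    wave: int
    country: str   # assumed to be an ISO 3166-1 alpha-2 code
    kind: str      # e.g. "data" or "questionnaire" (illustrative values)
    version: tuple # (major, minor, patch)
    ext: str

    def render(self) -> str:
        """Build the file name string from the structured fields."""
        major, minor, patch = self.version
        return (
            f"{self.study}_{self.wave}_{self.country}_{self.kind}"
            f"_v{major}-{minor}-{patch}.{self.ext}"
        )


def parse_file_name(name: str) -> FileName:
    """Parse a file name, rejecting names that break the convention."""
    match = NAME_PATTERN.match(name)
    if match is None:
        raise ValueError(f"file name does not follow the convention: {name!r}")
    return FileName(
        study=match["study"],
        wave=int(match["wave"]),
        country=match["country"],
        kind=match["kind"],
        version=(int(match["major"]), int(match["minor"]), int(match["patch"])),
        ext=match["ext"],
    )
```

Because the version is part of the name, a revised file gets a new name instead of overwriting the old one, which keeps changes traceable.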
  14. [Diagram: metadata model for project documents]
      - Entities: Document, Date created, Date modified, Language, English, Version, Format, Resource, Rights, Actor, Name, Collection
      - Relations: isA, isPartOf, creates, modifies, hasDate, hasModifier, hasCreator, hasRole, hasLanguage, hasVersion, hasFormat, hasIdentifier, hasAccessRights
      - Properties: dc:creator, dc:created, dc:modified, dc:identifier, dc:format, dc:provenance, dc:description, dc:language, dc:accessRights, dc:collection, …
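The properties listed on this slide could be recorded per document as a simple key-value record. The following is a minimal sketch under stated assumptions: the property names are copied from the slide as plain strings, all values are invented for illustration, and the choice of which properties are mandatory (`REQUIRED`) is hypothetical rather than an EVS or Dublin Core rule.

```python
# Property names are copied from the slide; every value is hypothetical.
record = {
    "dc:identifier": "example-doc-0001",
    "dc:creator": "National team (example)",
    "dc:created": "2008-01-01",
    "dc:modified": "2008-06-01",
    "dc:format": "application/pdf",
    "dc:language": "en",
    "dc:accessRights": "project-internal",
    "dc:collection": "field-questionnaires",
    "dc:description": "Field questionnaire, example entry",
    "dc:provenance": "Received from national team (example)",
}

# Assumption: the project treats these properties as mandatory.
REQUIRED = (
    "dc:identifier",
    "dc:creator",
    "dc:created",
    "dc:format",
    "dc:language",
)


def missing_properties(rec: dict) -> list:
    """Return the required property names that are absent or empty in rec."""
    return [name for name in REQUIRED if not rec.get(name)]
```

A check like `missing_properties(record)` could run whenever a document enters the project repository, so that incomplete descriptions are caught while the people who know the context are still available.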
  15. Managing information flows in a collaborative, long-term project
      - Which paths does information (data, documentation, other contextual material) take from producers to users?
      - Two models helped us clarify processes and paths, as well as identify helpful terminology and concepts
        - Project life cycle
        - Open Archival Information System (OAIS) reference model (CCSDS 2012)
      - Reference: CCSDS (2012). Reference Model for an Open Archival Information System (OAIS). Recommended Practice.
  16. Project life cycle: Data flow during creation of a survey
      - Life cycle stages: Project Design, Questionnaire Design, Questionnaire Translation, Data Collection, Data Documentation, Data Processing, Data Dissemination
      - Project Repository: Ingest, Data processing and enhancement, Data Management, Temporary Storage, Access (project-internal use, PIs)
      - Guidelines
  17. Project and Data Archive as distributed system
      - Project Repository (content provider): Ingest, Data processing and enhancement, Data Management, Temporary Storage, Access (project-internal use, PIs)
      - Data Archive (preservation service provider): Ingest, Data Management, Archival Storage (long-term), Access, Preservation Planning, Administration
      - Actors: Principal Investigators, Secondary Users (future)
      - Life cycle stages: Project Design, Questionnaire Design, Questionnaire Translation, Data Collection, Data Documentation, Data Processing, Data Dissemination
      - Package labels in the diagram: PIP, SIP, AIP, DIP
      - PIP = Project Information Package, SIP = Submission Information Package, AIP = Archival Information Package, DIP = Dissemination Information Package
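The package types named on this slide can be read as stages on a path from project to secondary user. The sketch below assumes a strictly linear sequence PIP, SIP, AIP, DIP, which is a simplification made for illustration: the names `PackageType`, `NEXT_STAGE`, `advance`, and `path_from` are hypothetical, and in practice several packages of one type may be combined or split when the next type is created.

```python
from enum import Enum


class PackageType(Enum):
    PIP = "Project Information Package"
    SIP = "Submission Information Package"
    AIP = "Archival Information Package"
    DIP = "Dissemination Information Package"


# Assumption: a simplified, linear flow from project to secondary user.
NEXT_STAGE = {
    PackageType.PIP: PackageType.SIP,  # selection within the project
    PackageType.SIP: PackageType.AIP,  # ingest at the data archive
    PackageType.AIP: PackageType.DIP,  # access for secondary users
}


def advance(package_type: PackageType) -> PackageType:
    """Return the package type that follows the given one."""
    if package_type not in NEXT_STAGE:
        raise ValueError(f"{package_type.name} is the final stage")
    return NEXT_STAGE[package_type]


def path_from(start: PackageType) -> list:
    """List every stage from start to the final package type."""
    path = [start]
    while path[-1] in NEXT_STAGE:
        path.append(NEXT_STAGE[path[-1]])
    return path
```

Writing the transitions down explicitly makes the two selection points visible: what moves from project packages into a submission, and what the archive turns into a package for long-term storage.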
  18. Staying Alive! Where we are going from here
      - Developing a guideline for projects
        - structuring and annotating of information on the micro level
        - issues to discuss with an Archive (preservation service provider)
      - Testing our model
        - implementing our ideas in smaller projects with the aim of making the results available to other projects
  19. Thank you for your attention!
      Evelyn Brislinger | Astrid Recker
      GESIS – Leibniz Institute for the Social Sciences, Data |