Imac 090924


Published in: Education
  • Slide note: Which data to ‘deposit’, and why?

    1. Project 3TU.Datacentrum. Im@c, September 24th 2009. Jeroen Rombouts, MSc, Project manager 3TU.Datacentrum
    2. Presentation outline: Why care about research data? What do data producers have to say?
    3. Why care? 1/3 (diagram; labels: Research, Manuscript, Publication, Data, Metadata, Repository, Library)
    4. Why care? 2/3
        Risks of current research data management:
        • Physical decay of storage media;
        • Loss of descriptive (meta)data;
        • Loss of ‘rendering’ capabilities (contemporary applications for viewing and analysing data).
        Reasons for long-term preservation and access:
        • Data value (cost-intensive, valorisation, continuous datasets);
        • Research quality (verification, knowledge transfer, sharing).
    5. Why care? 3/3
        Project setting:
        • Plan of the National Science Foundation regarding preservation of digital scientific output (2006);
        • OAIS reference model (2002 by CCSDS) becomes ISO standard (2009);
        • KNAW starts Dutch data repository for humanities and social sciences: DANS (Data Archiving and Networked Services) (2005);
        • No initiatives for engineering and science in the Netherlands.
    6. The 3TU.Datacentrum 1/8
        Project description:
        • Builds on two previous projects:
            • E-Archiving – digital depot
            • Darelux – Data Archiving River Environment Luxemburg
        • Time frame of 3 years (2008-2010):
            • Financed mainly by 3TU.Federation
            • Datasets from TUD, TU/e and UT, later other science data
        • Goal: long-term access to research data.
    7. The 3TU.Datacentrum 2/8
        Tasks:
        • Implement and run ‘data-archive’ (facilitate data producers);
            • Collect, preserve, publish and provide access to data - ( ß ):
        • Data management consultancy;
            • Select and develop formats, metadata, tools, etc.
        Collaboration:
        • With DANS, SURF, Koninklijke Bibliotheek and others:
            • “DRIVER-II” (EU-7FP), Demonstrator for Enhanced Publications;
            • “Waardevolle Data & Diensten” (SURFshare), identify added value of data repository for data producers.
        • Partner in DataCite consortium with TIB Hannover, ETH Zurich, INIST (France), British Library, DTU Copenhagen, NRC-CISTI (Canada), California Digital Library.
    8. The 3TU.Datacentrum 3/8
        Which data to preserve? And why?
        • Data of ‘enhanced publications’ (underlying data and visualisations linked to publications). Increases publication value (stronger basis, more citations, …);
        • Data generated by ‘hard to repeat’ processes, e.g. high cost, (environmental) observations, complex or continuous experiments, …;
        • Data collected with public funding. Conditions set by funding organisations or publishers such as Nature Publishing Group, NWO, governmental organisations, universities, …;
        • Preferably open access data with potential for reuse (verification, new research, …). Increases visibility, efficiency and quality of research efforts.
    9. The 3TU.Datacentrum 4/8
        • Technical infrastructure (server, platform, websites, formats & models)
        • Dataset Darelux (2.0) repository /resource: study-CITG /view/ html
        • Dataset Flame (BagIt)
        • Dataset Wind speed/Solar radiation
        • Datasets ‘on the way’: NNV Survey ‘job market physicists’, Enhanced Publication ‘combustion’, Waterlab, Biotechnology, Remote sensing, ‘Tire noise’
    10. Related ‘results’ 5/8
        • Partner in DataCite consortium with TIB Hannover, ETH Zurich, INIST (France), British Library, DTU Copenhagen, NRC-CISTI (Canada), California Digital Library. “to support researchers by providing methods for them to locate, identify, and cite research datasets with confidence”;
        • Founding member COAR: Confederation of Open Access Repositories (October);
        • Provide input for “Nota Wetenschappelijke informatievoorziening” (OC&W), “Toekomst voor ons digitaal geheugen” (NCDD);
        • Partner in “Nationale Coalitie Digitale Duurzaamheid”;
        • Coordinating “Forum onderzoeksdata”.
    11. The 3TU.Datacentrum 6/8
    12. The 3TU.Datacentrum 8/8
        The benefits for data producers and data consumers:
        • Increased visibility of research output (metadata in repository networks, assigning DOIs, facilitating increased citation rates for ‘enhanced publications’, ...);
        • Improved quality of datasets (quality assurance for multi-user setup, checks on ingest, …);
        • Provides (long-term) preservation of, and accessibility to, valuable research data;
        • Distribution of research data for reuse, including administration and usage statistics;
        • Provides advice on data management, rights, formats, metadata, etc.
    13. What do data producers say? 1/2
        • Nobody needs my data
        • Data transfer not needed, every PhD does own project
        • Our datasets are confidential
        • Interesting but not for me
        • Only for long term continuous data
        • Datasets are stored by publisher
        • No time!
        • Our research is once only
    14. What do data producers say? 2/2
        • Surprising our university had no facility for data preservation
        • Transfer of data between PhDs can be improved
        • Would like to publish data
        • Good opportunity to share datasets we bought
        • Very useful, essential metadata often missing
        • Much to improve in reuse of data
        • When can I store my datasets?
    15. What do data producers say? 3/3
        Workshop statements (June 4th 2009):
        • At the start of research projects time and resources must be allocated for data preservation;
        • (Controlled) sharing of pre-publication data must be possible;
        • The ‘source’ of every data set must be traceable;
        • Quality of research data is responsibility of data producer;
        • Data transformations should be stored with data in order to make review possible.
    16. Questions? Suggestions?
        • Nature News Special on Data Sharing (September 2009)
        • Toekomst voor ons digitaal geheugen
    17. Resources
        • The 3TU.Datacentrum project
        • “Unavailability of online supplementary scientific information from articles published in major journals” doi:10.1096/fj.05-4784lsf
        • “Going, Going, Gone: Lost Internet References” doi:10.1126/science.1088234
        • “Sharing Detailed Research Data Is Associated with Increased Citation Rate” doi:10.1371/journal.pone.0000308
        • “To share or not to share” /data-publication
        • “NSF’s Cyberinfrastructure Vision for 21st century Discovery”
        • “SURF Direct” Digitale rechten – onderzoeksdata (Dutch)
        • Nature News Special on Data Sharing (September 2009)
        • Toekomst voor ons digitaal geheugen