Making Data Sharing Happen
Flash talk for Beyond the PDF 2, Amsterdam, 2013

    Presentation Transcript

    • Making It Happen: Sustainable Data Preservation and Use. March 19, 2013. Anita de Waard, VP Research Data Collaborations, Elsevier RDS, a.dewaard@elsevier.com
    • “What aspects/tools/capabilities/frameworks are related to this idea?”
        • There are many different research databases, both generic (Dryad, Dataverse, …) and specific (NIF, IEDA, PDB, …)
        • There are many systems for creating/sharing workflows (Taverna, MyExperiment, Vistrails, Workflow4Ever, etc.)
        • There are many e-lab notebooks (LabGuru, LabArchives, LaBlog, etc.)
        • There are scores of projects, committees, standards, bodies, grants, initiatives and conferences for discussing and connecting all of this (KEfED, Pegasus, PROV, RDA, Science Gateways, Codata, BRDI, Earthcube, etc.)
        • You can make a living out of this ;-)! (and many of us do…)
    • …but this is what scientists do: Using antibodies and squishy bits, grad students experiment and enter details into their lab notebook. The PI then tries to make sense of this, and writes a paper. End of story.
    • Why save research data?
        A. Data Preservation:
            – Preserve record of scientific process, provenance
            – Enable reproducible research
        B. Data Use:
            – Use results obtained by others
            – Do better science!
            – Improve interdisciplinary work
        C. Sustainable Models:
            – Technology transfer; societal/industrial development
            – Reward scientists for data creation (credit/attribution)
            – Long-term archiving
    • Where The Data Goes Now:
        • Sources: > 50 My Papers; 2 M scientists; 2 M papers/year
        • A small portion of data (1-2%?) is stored in small, topic-focused data repositories: PDB: 88,3 k; PetDB: 1,5 k; SedDB: 0.6 k; MiRB: 25k; TAIR: 72,1 k
        • Some data (8%?) is stored in large, generic data repositories: Dryad: 7,631 files; Dataverse: 0.6 M; Datacite: 1.5 M
        • The majority of data (90%?) is stored on local hard drives
    • Key Needs (marked on the same data-flow diagram as the previous slide):
        • DEVELOP SUSTAINABLE MODELS
        • INCREASE DATA PRESERVATION
    • Objections (and rebuttals) to data sharing:
        • Objection: “Our lab notebooks are all on paper – it’s how we do things”
          Rebuttal: Graft tools closely on scientists’ daily practice.
        • Objection: “I need to see a direct benefit of any effort I put in.”
          Rebuttal: Create tools to allow better insight into one’s own and others’ results.
        • Objection: “I don’t really trust anyone else’s data – and don’t think they’ll trust mine”
          Rebuttal: Create a social networking context and allow the data owner to provide granular access control.
        • Objection: “I am afraid other people might scoop my discoveries”
          Rebuttal: The reward system moves from a competition to a ‘shared mission’.
    • From insular ‘CoSI-Factories’… [Diagram: separate labs, each with its own Prepare, Observe, Analyze, Ponder and Communicate steps]
    • …to shared experimental repositories: Across labs, experiments: track reagents and how they are used. [Diagram: each lab's Prepare, Analyze and Communicate steps feed shared Observations]
    • …to shared experimental repositories: Compare outcome of interactions with these entities. [Diagram: each lab's Prepare, Analyze and Communicate steps feed shared Observations]
    • …to shared experimental repositories: Build a ‘virtual reagent spectrogram’ by comparing how different entities interacted in different experiments. [Diagram: each lab's Prepare, Analyze and Communicate steps feed shared Observations, with an added Think step]
    • Some examples:
        • Grafting tools on workflow: create tailored metadata collection tools on mini-tablets in labs to replace paper notebook
        • Direct rewards: through ‘PI-Dashboard’: allow immediate access/analysis of shared data: new science!
        • Data sharing rewards: Data Rescue Challenge: collect and reward stories/practices of data preservation/use in Earth/Lunar Science
        • Improve data use: With NIF/Eagle-I: add antibodies as key ‘entities’ to paper, link to AB repository consortium
    • How do we make data use happen:
        • We are creating repositories of shared experiments: you are part of a greater whole!
        • Collect and share stories and practices re. data use and sustainable systems: “What gets to them?”
        • Develop system of rewards for data sharing: enable demonstrably better science!
        • Work with grant agencies, repositories (generic/specific, institutional, cross-national) to integrate and annotate existing datasets and enable cross-use
        • Collectively pioneer long-term funding options; support/develop ‘shared mission’ funding challenges