Looking for Data: Finding New Science

Keynote for STM innovations seminar 2014: http://www.stm-assoc.org/events/stm-innovations-seminar-u-s-2014/


    Presentation Transcript

    • Looking for Data: Finding New Science. Anita de Waard, VP Research Data Collaborations, a.dewaard@elsevier.com, http://researchdata.elsevier.com/
    • Why should science publishers care about Research Data?
        • Funding bodies:
            – Demonstrate impact
            – Guarantee permanence, discoverability
            – Avoid fraud
            – Avoid double funding
            – Serve general public
        • Research Management/Library:
            – Generate, track outputs
            – Comply with mandates
            – Ensure availability
        • Phil Bourne, (then) Associate Vice Chancellor, UCSD, 4/13: “We need to think about the university as a digital enterprise.”
        • Mike Huerta, Assoc. Director, NLM: “Today, the major public product of science are concepts, written down in papers. But tomorrow, data will be the main product of science…. We will require scientists to track and share their data at least as well, if not better, than they are sharing their ideas today.”
        • Researchers:
            – Derive credit
            – Comply with mandates
            – Discover and use
            – Cite/acknowledge
        • Nathan Urban, PI Urban Lab, CMU, 3/13: “If we can share our data, we can write a paper that will knock everybody’s socks off!”
        • Barbara Ransom, NSF Program Director Earth Sciences: “We’re not going to spend any more money for you to go out and get more data! We want you first to show us how you’re going to use all the data we paid y’all to collect in the past!”
    • Research data management today: Using antibodies and squishy bits, grad students experiment and enter details into their lab notebooks. The PI then tries to make sense of their slides and writes a paper. End of story.
    • Most of biology is quite insular. [Diagram: two separate workflows, each with the steps Prepare, Observe, Analyze, Ponder, Communicate]
    • But it is also VERY complicated (image: http://en.wikipedia.org/wiki/File:Duck_of_Vaucanson.jpg):
        • Interspecies variability: A specimen is not a species
        • Gene expression variability: Knowing genes is not knowing how they are expressed
        • Microbiome: An animal is an ecosystem
        • Systems biology: A whole is more than the sum of its parts
        • Male researchers stress out rodents!
        • Reductionist science does not work for living systems! Statistics to the rescue!
    • What if the research data was connected? Across labs and experiments: track reagents and how they are used. [Diagram labels: Prepare, Analyze, Communicate; Observations]
    • What if the research data was connected? Compare outcome of interactions with these entities. [Diagram labels: Prepare, Analyze, Communicate; Observations]
    • What if the research data was connected? Build a ‘virtual reagent spectrogram’ by comparing how different entities interacted in different experiments. [Diagram labels: Prepare, Analyze, Communicate; Observations; Think]
    • Maslow Hierarchy of Research Data Needs:
        • Useful
        • Trusted
        • Reproducible
        • Discoverable
        • Comprehensible
        • Archived
        • Accessible
        • Preserved in digital format
    • 1: Urban Legend. How can we make a standard neuroscience wet lab store and share their data?
        • Incorporate structured workflows into the daily practice of a typical electrophysiology lab (the Urban Lab at CMU)
            – What does it take?
            – Where are points of conflict?
        • 1-year pilot, funded by Elsevier RDS:
            – CMU: Shreejoy Tripathy, manage/user test
            – Elsevier: development, UI, project management
        • Next steps: NIH grant to scale up to 4 labs
    • Urban Legend Components (de Waard, A., Burton, S. et al., 2013)
    • Data Entry App:
    • Data dashboard (e.g. SDB140225c4_onbeam_CC)
    • 2: Moonrocks. How can we scale up data curation? Pilot project with IEDA:
        • Build a database for lunar geochemistry
        • Leapfrog & improve curation time
        • Write joint report on processes, costs and challenges
        • 1-year pilot, funded by Elsevier
        • Next step: NSF grant on schemas > spreadsheets
    • Moonrocks Data Import: pushing data curation to the researcher
    • 3: How do we improve how data (and software) are published?
        • E.g. with the Virtual Microscope
        • Or Interactive Plots
        • Or Executable Papers
    • Let’s support the needs of research data!
        • Experimental Metadata: Workflows, Samples, Settings, Reagents, Organisms, etc.
        • Record Metadata: DOI, Date, Author, Institute, etc.
        • Processed Data: Mathematically/computationally processed data: correlations, plots, etc.
        • Raw Data: Direct outputs from equipment: images, traces, spectra, etc.
        • Methods and Equipment: Reagents, settings, manufacturer’s details, etc.
        • Validation: Approval, Reproduction, Selection, Quality Stamp
        • [Diagram axes: More curation, More usable]
    • Thank you! Anita de Waard, a.dewaard@elsevier.com, http://researchdata.elsevier.com/ Collaborations and discussions gratefully acknowledged:
        • CMU: Nathan Urban, Shreejoy Tripathy, Shawn Burton, Ed Hovy
        • UCSD: Brian Schottlaender, David Minor, Declan Fleming, Ilya Zaslavsky
        • NIF: Maryann Martone, Anita Bandrowski
        • OHSU: Melissa Haendel, Nicole Vasilevsky
        • Columbia/IEDA: Kerstin Lehnert, Leslie Hsu
        • MIT: Micah Altman