
Towards reusable experiments: making metadata while you measure


Slides from my short talk at INCF 2013 (neuroinformatics annual meeting) in Stockholm. I talk about the realities of data sharing and a proposal to make it easier through the adoption of electronic lab notebooks. The project is a collaboration between Carnegie Mellon University and Elsevier Research Data Services.

Published in: Technology, Business
  • Tangible benefits of data sharing – more people can collaborate on the same project – which should lead to more productivity and better science = “Nature paper”
  • Walk through the pieces one by one; also mention that this is very much a work in progress
  • Transcript of "Towards reusable experiments: making metadata while you measure"

    1. Towards Reusable Experiments: Making Metadata While You Measure Shreejoy Tripathy PhD student, Carnegie Mellon Email: stripat3@gmail.com Twitter: @neuronJoy
    2. Lots of great tools for data sharing…
    3. Barriers to data sharing • Social – “What’s in it for me? How will I get credit?” – “It’s my data, not yours” – “The benefit to me isn’t worth the time I put into it” – “What if I get scooped?” • Methodological – “How do I share data? What do I share?” – “Going back and annotating my files to share is super time-consuming” – Specifying file formats, data standards – Building FTP servers and nice user interfaces
    4. Project idea • How can we make a standard neuroscience wet lab more data-sharing savvy? • Incorporate structured workflows into the daily practice of a typical electrophysiology lab (the Urban Lab at CMU) – What does it take? – Where are points of conflict?
    5. Key insights/motivations 1. Effective data sharing includes raw data files + experimental metadata (typically stored in a lab notebook) SDB_MC_12_voltages.mat
    6. Key insights/motivations 1. Share raw data files + experimental metadata 2. You know the most about an experiment when you’re performing it
    7. Key insights/motivations 1. Share raw data files + experimental metadata 2. You know the most about an experiment when you’re performing it 3. Improved data practices should make labs more productive
    8. Project schematic
    9. Project schematic
    10. Metadata data app • Electronic lab notebook models sequential slice-electrophysiology workflow – Replaces pen-and-paper lab notebook
    11. Metadata data entry • Electronic lab notebook allows structured data entry Animal Strain
    12. Metadata data entry • Electronic lab notebook allows structured data entry (e.g., dropdown menus) – Allows incorporation of semantic ontologies • Important to strike a balance between structure and flexibility MGI:3719486
    13. Metadata data entry MGI:3719486 • Electronic lab notebook facilitates entry of new content, like registration of recorded neurons to brain atlas
    14. Data integration • Syncing of metadata app and electrophysiology data acquisition via server – Each trace of experimental data annotated with metadata • IGOR Pro-specific for now; support for pClamp and other acquisition packages to be added as needed
    15. Data dashboard (web-based)
    16. Data dashboard (future steps) • Use collected metadata to sort experiments – Like mouse strain, neuron type, animal age • Enable in-browser analyses – Track provenance of analyzed data back to raw data
    17. Next steps • Use built tools – Populate data server with many experiments • Is use of e-notebook too prohibitive? – If yes, continue to iterate – What can we ask now that we couldn’t before? • It is much easier to ask exploratory questions, like – How is the cell type that Shawn records different from the one that Matt records? • Exposing data to neuroscience databases – NIF, INCF Dataspace, neuroelectro.org • How adaptable are these solutions for use in other labs? • Who is going to pay for this?
    18. Acknowledgements • Carnegie Mellon – Shreejoy Tripathy – Nathan Urban – Shawn Burton – Rick Gerkin – Santosh Chandrasekaran – Matthew Geramita • Elsevier Research Data Services – Anita de Waard – Mark Harviston – Jez Alder – Sarah Tyrchniewicz – David Marques – (funding!)
    19. Next steps • Roll out updated app to experimentalists • Populate database with the contents of many experiments • Flesh out Data dashboard functionality • Investigate the new things that we can achieve given these tools
    20. Effective data sharing is… • Not just the experimental data file – But also the experimental metadata: what was done? What does this variable mean? This is usually stored in PHYSICAL lab notebooks, understandable only by the experimenter • Effective data sharing – someone who is not the person who collected the data can understand the experiment and data
    21. App user testing • “I don’t like the way the app forces me through a specific workflow, I want to enter experimental data when I see fit” • “I’m not opposed to the idea of dropdowns, but I want more flexibility, more text fields” • “When I use a lab notebook, I only write down the absolute minimum. Can the app’s fields be prepopulated with the results of an old experiment?”
    22. What is effective data sharing? • Effective data sharing – someone who is not the person who collected the data can understand the experiment and data – i.e., datasets should be more or less self-describing – >90% of data sharing use cases are an experimentalist sharing data with a future version of herself or with a labmate
    23. Neuroinformatics successes don’t come from large-scale multi-lab data sharing • NeuroSynth • NeuroElectro?
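
The slides describe the idea of annotating each recorded trace with structured metadata (dropdown-style fields tied to ontology identifiers, plus free text) and then sorting experiments by that metadata. A minimal sketch of what such a record might look like follows. It is not the app's actual data model: the class, the field names, the file names, the `strain-A` label, and all example values are assumptions made for illustration. Only the identifier `MGI:3719486` and the experimenter names come from the slides.

```python
from dataclasses import dataclass, asdict
import json

# Hypothetical controlled vocabulary for the "Animal Strain" dropdown.
# The identifier is the one shown on the slides; the label is a placeholder.
STRAIN_TERMS = {"strain-A": "MGI:3719486"}


@dataclass
class TraceMetadata:
    """Metadata attached to one recorded trace (all field names are assumed)."""
    trace_file: str
    experimenter: str
    strain_label: str
    neuron_type: str
    animal_age_days: int
    notes: str = ""  # free-text field, for flexibility alongside dropdowns

    def __post_init__(self):
        # Dropdown-style validation: only known vocabulary terms are accepted.
        if self.strain_label not in STRAIN_TERMS:
            raise ValueError(f"unknown strain: {self.strain_label!r}")

    @property
    def strain_id(self) -> str:
        """Ontology identifier behind the human-readable strain label."""
        return STRAIN_TERMS[self.strain_label]

    def to_json(self) -> str:
        """Serialize the record so it can travel alongside the raw data file."""
        record = asdict(self)
        record["strain_id"] = self.strain_id
        return json.dumps(record, sort_keys=True)


def select(records, **criteria):
    """Return the records whose attributes equal every given criterion."""
    return [r for r in records
            if all(getattr(r, key) == value for key, value in criteria.items())]


# Made-up example records, not real experimental data.
records = [
    TraceMetadata("trace_001.mat", "Shawn", "strain-A", "type-1", 21),
    TraceMetadata("trace_002.mat", "Matt", "strain-A", "type-2", 30),
]

for rec in select(records, experimenter="Shawn"):
    print(rec.to_json())
```

Storing the identifier next to the label is one way to keep a dataset "more or less self-describing", and a filter like `select` is the kind of query a dashboard could use to compare what different experimenters recorded.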