
Share and Reuse: how data sharing can take your research to the next level

Talk given at CIC, McGill, Montreal in December 2015

Published in: Science

  1. 1. Share and Reuse: how data sharing can take your research to the next level • Chris Gorgolewski, Department of Psychology, Stanford University
  2. 2. HOW CAN YOU BENEFIT FROM DATA SHARING?
  3. 3. NKI Enhanced • 329 subjects (will reach 1000) – Representative sample: young and old, some with mental health history • 1 hour's worth of MRI (3T) scanning: – MPRAGE (TR = 1900 ms; voxel size = 1 mm isotropic) – 3x resting state scans (645 ms, 1400 ms, and 2500 ms) – Diffusion Tensor Imaging (137 directions; voxel size = 2 mm isotropic) – Visual Checkerboard and Breath Holding manipulations
  4. 4. fcon_1000.projects.nitrc.org/indi/enhanced/
  5. 5. Human Connectome Project • > 500 subjects (will reach 1200) – Young and healthy (22-35 yrs) – 200 twins! • 1 hour's worth of MRI scanning: – State of the art sequences – high temporal and spatial resolution – Resting-state fMRI (R-fMRI) – Task-evoked fMRI (T-fMRI) • Working Memory • Gambling • Motor • Language • Social Cognition • Relational Processing • Emotion Processing – Diffusion MRI (dMRI) – MEG and EEG – 7T coming soon
  6. 6. Human Connectome Project • Rich phenotypical data – Cognition, personality, substance abuse etc. • Genotyping! (not yet available) • Methodological developments – Fine-tuned sequences – Innovative field inhomogeneity corrections – New preprocessing techniques • Ready-to-use preprocessed data
  7. 7. humanconnectome.org
  8. 8. The Global Alzheimer's Association Interactive Network GAAIN.org
  9. 9. SchizConnect.org
  10. 10. NeuroImage Data Sharing special issue Free for the next three months!
  11. 11. FCP/INDI Usage Survey (survey courtesy of Stan Colcombe & Cameron Craddock) • FCP/INDI data usage: – Master's thesis research: 11.94% – Doctoral dissertation research: 38.81% – Teaching resource (projects or examples): 13.43% – Pilot data for grant applications: 16.42% – Research intended for publication: 76.12% – Independent study (e.g., teach self about analysis): 37.31% • Respondents: FCP/INDI users; 10% response rate
  12. 12. Growth of the reuse of OpenfMRI datasets
  13. 13. Motivation • Share your stat maps! vs. institutions scientists
  14. 14. Data sharing saves money $878,988 cost of reacquiring data for each of the reuses of OpenfMRI datasets
  15. 15. Data sharing fears • Fear of being scooped • Fear of someone finding a mistake • Misconceptions about the ownership of the data
  16. 16. Studies sharing data have higher statistical quality Wicherts JM, Bakker M, Molenaar D (2011) Willingness to Share Research Data Is Related to the Strength of the Evidence and the Quality of Reporting of Statistical Results. PLoS ONE 6(11): e26828. doi: 10.1371/journal.pone.0026828
  17. 17. Neuroimaging data sharing hierarchy Poldrack and Gorgolewski, 2014
  18. 18. Just coordinates? • Databases such as Neurosynth or BrainMap rely on peak coordinates reported in papers (only strong effects)
  19. 19. Are we throwing money away?
  20. 20. Baby steps • Everything is a question of cost and benefit – If we keep the cost low, even a small benefit (or just the conviction that data sharing is GOOD) will suffice
  21. 21. NeuroVault.org simple data sharing • Minimize the cost! • We just want your statistical maps with a minimal description (DOI) – If you want, you can add more metadata, but you don’t have to • We streamline the login process (Google, Facebook)
  22. 22. NeuroVault.org Gorgolewski, et al., submitted
  23. 23. Benefits - visualisation
  24. 24. Benefits - decoding
  25. 25. Benefits - other • Private collections • Multiple contributors to one collection • Sharable persistent URLs • Viewer embeddable on your lab's website or your private blog • Improved exposure of your research • Improved reusability of your results • Long-term storage in Stanford Digital Repository
  26. 26. Using NeuroVault… • Improves collaboration • Makes your paper more attractive • Shows you care about transparency • Takes only five minutes • Gives you a warm and fuzzy feeling that you helped future meta-analyses
  27. 27. Validation and gains in sensitivity
  28. 28. NeuroVault for developers • RESTful API (field tested by Neurosynth) • Source code available on GitHub
  29. 29. What is NIDM-Results?
  30. 30. Neuroimaging data sharing hierarchy Poldrack and Gorgolewski, 2014
  31. 31. MAKING DATASHARING COUNT Credit where credit’s due
  32. 32. Quality control • Complex datasets require elaborate descriptions
  33. 33. How can we appropriately reward the extra effort and risk associated with sharing data?
  34. 34. Solution – data papers • Authors get recognizable credit for their work. – Even smaller contributors such as RAs can be included. • Acquisition methods are described in detail. • Quality of metadata is being controlled by peer review.
  35. 35. Gorgolewski, Milham, and Margulies, 2013
  36. 36. Where to publish data papers? • Neuroinformatics (Springer) • GigaScience (BGI, BioMed Central) • Scientific Data (Nature Publishing Group) • F1000Research (Faculty of 1000) • Data in Brief (Elsevier) • Journal of Open Psychology Data (Ubiquity Press)
  37. 37. What makes a good data paper? • Clear and accurate description of the acquisition protocol. • Good data organization. • Ease of access to data. • Data quality description. • Fair credit attribution.
  38. 38. How to improve the impact of your dataset? • Provide preprocessed data. • Reach out to your peers… – …and people outside of your field (ML) • Build a community around the data.
  39. 39. StudyForrest.org
  40. 40. Repositories • Field specific – OpenfMRI.org (task based fMRI) – FCP/INDI (resting state fMRI) – COINS • Field agnostic – DataVerse (Harvard) – Figshare (only small datasets) – DataDryad (fees may apply)
  41. 41. OpenfMRI • Will host any MRI dataset • No fees • Curated and uncurated datasets • Recommended by many journals (including Scientific Data)
  42. 42. Quality Assessment Protocol preprocessed-connectomes-project.github.io/quality-assessment-protocol
  43. 43. Prepare in advance • Make sure your consent form includes data sharing • Decide which database you want to send your data to in advance – Organize your data according to their requirements • Work on anonymized data as much as you can
  44. 44. Brain Imaging Data Structure bids.neuroimaging.io
  45. 45. Ultimate consent form • Inform participants about your intention to share data • Explain the benefits • Discuss the risks open-brain-consent.readthedocs.org
  46. 46. If I haven’t convinced you yet • Why share data: – It’s the ethical thing to do (Brakewood and Poldrack 2013). – The journal might require it (PLoS). – Your funders might require it (NIH). – A track record of data sharing can improve your chances of getting your next grant.
  47. 47. Sharing data is related to higher citation rate Piwowar, Day & Fridsma (2007); Piwowar & Vision (2013)
  48. 48. Acknowledgements Russell A. Poldrack, Jean-Baptiste Poline, Yannick Schwartz, Tal Yarkoni, Michael Milham, Daniel Margulies, Gael Varoquaux, Joseph Wexler, Gabriel Rivera, Camille Maumet, Vanessa Sochat, Thomas Nichols, MPI CBS Resting state group, Poldrack Lab, INCF Data Sharing Task Force
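
Slide 28 mentions that NeuroVault exposes a RESTful API. The following is a minimal sketch of how a script might list the statistical maps in one collection. The base URL, the `collections/<id>/images/` path, and the `results` and `file` keys of the JSON response are assumptions made for illustration and are not taken from the slides; check the NeuroVault documentation before relying on them. The helper names are hypothetical.

```python
import json
from urllib.parse import urljoin
from urllib.request import urlopen

# Assumed base URL of the API (not stated in the slides).
API_BASE = "https://neurovault.org/api/"


def collection_images_url(collection_id: int) -> str:
    """Build the (assumed) URL that lists the images of one collection."""
    return urljoin(API_BASE, f"collections/{collection_id}/images/")


def fetch_page(url: str) -> dict:
    """Download one page of a JSON response and decode it."""
    with urlopen(url) as response:
        return json.load(response)


def image_files(page: dict) -> list:
    """Collect file URLs from one decoded page.

    Assumes the page holds a "results" list whose items may carry a
    "file" key; items without that key are skipped.
    """
    return [item["file"] for item in page.get("results", []) if "file" in item]
```

A call such as `image_files(fetch_page(collection_images_url(some_id)))` would then return the file URLs for that collection, where `some_id` is a placeholder for a real collection identifier. Keeping URL construction and response parsing in separate functions lets the parsing be checked without network access.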