Public Sharing of Research Datasets: A Pilot Study of Associations

Presented at ASIST & ISSI Pre-Conference
Symposium on Informetrics and Scientometrics on Nov 7, 2009

http://www.sois.uwm.edu/MetricsPreCon/program.html


Transcript

  • 1. Public sharing of research datasets: a pilot study of associations Heather Piwowar and Wendy Chapman Department of Biomedical Informatics University of Pittsburgh
  • 2. http://www.flickr.com/photos/vroomvroommm/3457772539 data data
  • 3. http://www.flickr.com/photos/75166820@N00/5318468/ stale
  • 4. http://www.flickr.com/photos/ryanr/142455033/ sounds great
  • 5. http://www.flickr.com/photos/faerie-dust/2315927946/ but not easy
  • 6. http://www.flickr.com/photos/sunrise/35819369/ http://www.flickr.com/photos/fboyd/2156630044/ persuade
  • 7. http://www.flickr.com/photos/mesh/14102209/ does it work?
  • 8. Prior work has focused on surveys and studies of intention. Our aim: measure associations between observed data sharing behaviour and environmental variables aim
  • 9. Funder, Journal, Investigator, Institution, Study: is research data shared after publication? aim
  • 10. Funder, Journal, Investigator, Institution, Study: is research data shared after publication? aim
  • 11. http://en.wikipedia.org/wiki/DNA_microarray http://en.wikipedia.org/wiki/Image:Heatmap.png http://commons.wikimedia.org/wiki/File:DNA_double_helix_vertikal.PNG microarray data
  • 12. microarray data
  • 13. Ochsner et al. (2008). Much room for improvement in deposition rates of expression microarray datasets. Nature Methods, 5(12), 991. Manually reviewed 20 journals for 2007: 400 studies, of which 200 shared their microarray data. data sample
  • 14. Journal impact factor; Funder mandates; Journal mandates; Investigator “experience”. Is research data shared after publication? variables
  • 15. Funder mandates variables
  • 16. Funder mandates NIH 2003 Data Sharing Requirement Requires a data sharing plan for studies funded after October 2003 that receive more than $500 000 in direct funding per year variables
  • 17. Funder mandates Assumed the data sharing requirement was applicable if the NIH grant numbers associated with the PubMed entry had $750 000 in total funding in any year since 2004, plus an NIH grant number with a leading “1” or “2” since 2004 variables
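
The applicability rule on slide 17 could be sketched roughly as follows. This is an illustration under stated assumptions, not the authors' script: the `GrantYear` record layout and the function name are hypothetical, funding is assumed to be summed across all grants linked to the paper within a fiscal year, and "had $750 000" is read as "at least $750 000". In NIH grant numbers the leading digit is the application type code (1 for a new award, 2 for a competing renewal).

```python
from collections import defaultdict
from dataclasses import dataclass

FUNDING_THRESHOLD_USD = 750_000  # total-funding cutoff quoted on the slide
FIRST_YEAR = 2004                # "since 2004" on the slide


@dataclass(frozen=True)
class GrantYear:
    """One fiscal year of one NIH grant linked to a paper (hypothetical layout)."""
    grant_number: str       # full number; the leading digit is the application type
    fiscal_year: int
    total_funding_usd: int


def mandate_assumed_applicable(records):
    """Return True if the slide's two conditions both hold for a paper's grants."""
    recent = [r for r in records if r.fiscal_year >= FIRST_YEAR]

    # Assumption: funding is pooled per fiscal year across the paper's grants.
    funding_by_year = defaultdict(int)
    for r in recent:
        funding_by_year[r.fiscal_year] += r.total_funding_usd
    reaches_threshold = any(
        total >= FUNDING_THRESHOLD_USD for total in funding_by_year.values()
    )

    # A leading "1" (new) or "2" (competing renewal) since 2004.
    has_new_or_renewal = any(
        r.grant_number.startswith(("1", "2")) for r in recent
    )
    return reaches_threshold and has_new_or_renewal
```

Whether funding should be pooled across grants or checked per grant is not stated on the slide; switching between the two readings only changes how `funding_by_year` is keyed.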
  • 18. Journal mandates variables
  • 19. Journal mandates Piwowar and Chapman. A review of journal policies for sharing research data. International Conference on Electronic Publishing (ELPUB) 2008 Journal Policy Strength: Strong, Weak, or None variables
  • 20. Author experience variables
  • 21. Author experience Publication history and impact variables
  • 22. Author experience “experience and impact” proxy: • years since first publication • h-index estimate • a-index estimate Scriptable, to allow scaling up to thousands of authors? variables
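
As a minimal sketch of the two citation-based proxies named on slide 22, assuming the usual definitions (h-index: the largest h such that h papers have at least h citations each; a-index: the mean citation count of the papers in that h-core). The function names are illustrative, and the input is assumed to be one citation count per paper for a single author.

```python
def h_index(citations):
    """Largest h such that h papers have at least h citations each."""
    ranked = sorted(citations, reverse=True)
    h = 0
    for rank, count in enumerate(ranked, start=1):
        if count >= rank:
            h = rank
        else:
            break
    return h


def a_index(citations):
    """Mean citation count over the h most-cited papers (the h-core)."""
    h = h_index(citations)
    if h == 0:
        return 0.0
    core = sorted(citations, reverse=True)[:h]
    return sum(core) / h
```

Both functions need only a list of integers, which is what makes this kind of proxy easy to script across thousands of authors.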
  • 23. Author experience Author publication history variables
  • 24. Author experience Citation counts variables
  • 25. Author experience Author name disambiguation Author-ity web service: Torvik & Smalheiser. (2009). Author Name Disambiguation in MEDLINE. ACM Transactions on Knowledge Discovery from Data, 3(3):11. variables
  • 26. Author experience PubMed + PubMed Central + Author-ity to compute pubmedi citation estimates ➡ not comprehensive account of publication accomplishments ➡ for aggregate analysis: free, open, scriptable, flexible, reproducible. variables
  • 27. Author experience For each first and last author, we used the first principal component of: • years since first publication • pubmedi h-index estimate • pubmedi a-index estimate variables
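
One way to collapse the three measures on slide 27 into a single score is sketched below, using only the Python standard library. It is an illustration rather than the authors' code: it assumes each column is standardized first (so the principal component comes from the correlation matrix) and finds the dominant eigenvector by power iteration. The function names and the `iterations` parameter are hypothetical.

```python
import math
import statistics


def standardize(column):
    """Center a column and scale it to unit sample standard deviation."""
    mean = statistics.mean(column)
    sd = statistics.stdev(column)  # raises ZeroDivisionError below if sd == 0
    return [(x - mean) / sd for x in column]


def first_principal_component(rows, iterations=500):
    """Return (loadings, scores) for the first PC of the standardized columns.

    rows: one tuple per author, e.g. (years_since_first_pub, h_est, a_est).
    """
    columns = [standardize(list(col)) for col in zip(*rows)]
    n, p = len(rows), len(columns)

    # Correlation matrix of the standardized columns.
    corr = [
        [sum(a * b for a, b in zip(columns[i], columns[j])) / (n - 1)
         for j in range(p)]
        for i in range(p)
    ]

    # Power iteration: repeated multiplication converges to the
    # eigenvector with the largest eigenvalue.
    vec = [1.0 / math.sqrt(p)] * p
    for _ in range(iterations):
        nxt = [sum(corr[i][j] * vec[j] for j in range(p)) for i in range(p)]
        norm = math.sqrt(sum(x * x for x in nxt))
        vec = [x / norm for x in nxt]

    # Orient the component so that larger inputs give larger scores.
    if sum(vec) < 0:
        vec = [-x for x in vec]

    scores = [sum(vec[j] * columns[j][i] for j in range(p)) for i in range(n)]
    return vec, scores
```

The resulting score per author can then stand in for the three correlated measures as a single "experience" variable in a regression.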
  • 28. Journal impact factor; Funder mandates; Journal mandates; Investigator “experience”. Is research data shared after publication? variables
  • 29. Univariate odds ratios Multivariate logistic regression stats
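
For a binary variable, a univariate odds ratio of the kind named on slide 29 can be computed from a 2x2 table as sketched here. The function and argument names are hypothetical, and the confidence interval uses the standard log-odds (Woolf) approximation, which is an assumption about method rather than something the slides state.

```python
import math


def odds_ratio(shared_with, not_shared_with, shared_without, not_shared_without,
               z=1.96):
    """Odds ratio and approximate 95% CI for data sharing given a binary factor.

    shared_with / not_shared_with: study counts where the factor is present.
    shared_without / not_shared_without: study counts where it is absent.
    """
    a, b, c, d = shared_with, not_shared_with, shared_without, not_shared_without
    if min(a, b, c, d) <= 0:
        raise ValueError("all four cells must be positive for this estimate")

    ratio = (a * d) / (b * c)
    # Standard error of log(OR) under the Woolf approximation.
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(ratio) - z * se)
    upper = math.exp(math.log(ratio) + z * se)
    return ratio, lower, upper
```

An interval that excludes 1 is the usual reading of "statistically significant" for a single variable; the multivariate logistic regression then estimates each variable's odds ratio while adjusting for the others.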
  • 30. http://www.flickr.com/photos/paperpariah/3002687604/ results
  • 31. Legend: Not statistically significant / Statistically significant. Journal impact factor; Funder mandates; Journal mandates; Investigator “experience”. Is research data shared after publication? results
  • 32. Funder mandates 33% results
  • 33. Journal impact factor; Journal mandates. Strength of journal data sharing policy is strongly correlated with impact factor results
  • 34. Investigator “experience” results
  • 35. Investigator “experience” results
  • 36. Investigator “experience” results
  • 37. Investigator “experience” results
  • 38. Investigator “experience” results
  • 39. Investigator “experience” results
  • 40. http://www.flickr.com/photos/vlastula/300102949/ • Association does not imply causation • Only one datatype • Small sample, limited variables • Dataset contains disproportionate number of high-impact studies limitations
  • 41. • NIH data sharing plan applies to a minority of NIH microarray studies • NIH data sharing plan does not seem to increase frequency of data sharing • More experienced investigators are more likely to share data prelim conclusions
  • 42. http://www.flickr.com/photos/krcla/2069243613/ PhD dissertation! • More samples • More variables next steps
  • 43. http://www.flickr.com/photos/cogdog/123072/ Spin-off projects: • Quantify usefulness of pubmedi h-index • Study the patterns and prevalence of data reuse future
  • 44. Dept of Biomedical Informatics at U of Pittsburgh; NLM for training grant funding; Open science online community and those who release their articles, datasets and photos openly; Dr Wendy Chapman for her support and feedback thanks
  • 45. Journal mandates variables
  • 46. Journal mandates Policy strength categorization: None: No applicable mention of data sharing. Weak: Request or unenforceable requirement. Strong: Require data deposit accession number as a condition of publication variables
  • 47. http://www.flickr.com/photos/myklroventine/892446624/ I post my data, code, and statistical scripts at http://www.dbmi.pitt.edu/piwowar Share yours too! open science