Sharing Detailed Research Data
      is Associated with
   Increased Citation Rate

              Heather Piwowar
       R...
Sharing research data




    PAST MEDICAL HISTORY:
    Past medical history showed she had
    superficial phlebitis time...
Shared data benefits science
•    Verify
•    Understand
•    Extend
•    Explore
•    Combine
•    Synergize
•    Train
•...
But… costly for authors
•    Find
•    Organize
•    Document
•    Deidentify
•    Format
•    Decide
•    Ask
•    Submit...
So what’s in it for them?
Carrot.
A currency of value?
Citations.
  $50!


 Do trials which share their data
      receive...
Methods
Cancer Microarray Trials
   Ntzani and Ioannidis identified 85 trials published 1999-2003


Citations
   ISI Web o...
Results:
                    Eligible trials
•  85 trials
•  41 (48%) made data available
•  Various locations:
  –  Lab w...
Results:
                                Big picture
           85 clinical trials used 
          These 85 trials were ci...
Results:
                         Distribution of citation counts




From: Piwowar HA, Day RS, Fridsma DB (2007) Sharing ...
Results:
                                Multivariate regression




From: Piwowar HA, Day RS, Fridsma DB (2007) Sharing D...
Limitations
•  Outliers
  –  Subset analysis of lower profile papers


•  Complex timing
  –  Additional analysis of citat...
Data sharing help on the way
•  Free, centralized databases
  –  SMD, GEO, ArrayExpress
•  Standards
  –  MIAME, CONSORT
•...
Conclusions
•  70% increase in citation impact for trials
   which make data available

•  Result holds for lower-profile ...
For more information
•  Participate in the discussion on this paper
   at PLoS ONE

•  Check out blogs on Open Access, Ope...
Thank you
•  Peter Suber’s blog: “Open Access News”
•  Wikipedia: “Open Data”
•  Nature Editorial: May 3, 2007

I support ...
Upcoming SlideShare
Loading in …5
×

PLoS ONE Piwowar: Sharing Detailed Research Data Is Associated with Increased Citation Rate

1,732
-1

Published on

Heather A Piwowar, Roger S Day, Douglas B Fridsma (2007) Sharing Detailed Research Data Is Associated with Increased Citation Rate PLoS ONE 2: 3. e308

Abstract: Sharing research data provides benefit to the general scientific community, but the benefit is less obvious for the investigator who makes his or her data available. We examined the citation history of 85 cancer microarray clinical trial publications with respect to the availability of their data. The 48% of trials with publicly available microarray data received 85% of the aggregate citations. Publicly available data was significantly (p = 0.006) associated with a 69% increase in citations, independently of journal impact factor, date of publication, and author country of origin using linear regression. This correlation between publicly available data and increased literature impact may further motivate investigators to share their detailed research data.

0 Comments
1 Like
Statistics
Notes
  • Be the first to comment

No Downloads
Views
Total Views
1,732
On Slideshare
0
From Embeds
0
Number of Embeds
0
Actions
Shares
0
Downloads
8
Comments
0
Likes
1
Embeds 0
No embeds

No notes for slide

PLoS ONE Piwowar: Sharing Detailed Research Data Is Associated with Increased Citation Rate

  1. 1. Sharing Detailed Research Data is Associated with Increased Citation Rate Heather Piwowar Roger Day and Douglas Fridsma University of Pittsburgh Published in PLoS ONE, March 27 2007 Funded by NLM Training Grant Presented at NLM Trainee Conference, June 27 2007
  2. 2. Sharing research data PAST MEDICAL HISTORY: Past medical history showed she had superficial phlebitis times two in the past, had non-insulin dependent diabetes mellitus for four years. She had been hypothyroid for three years. HISTORY OF PRESENT ILLNESS: The patient is a 58-year-old female, … http://upload.wikimedia.org/wikipedia/commons/7/76/PeptideMSMS.jpg; http://en.wikipedia.org/wiki/Image:Helices.png; http://en.wikipedia.org/wiki/ Image:Heatmap.png; http://en.wikipedia.org/wiki/Image:Microarray2.gif; http://zellig.cpmc.columbia.edu/medlee/demo/; htp://www.plosone.org/article/fetchArticle.action?articleURI=info:doi/10.1371/journal.pone.0000441
  3. 3. Shared data benefits science •  Verify •  Understand •  Extend •  Explore •  Combine •  Synergize •  Train •  Reduce
  4. 4. But… costly for authors •  Find •  Organize •  Document •  Deidentify •  Format •  Decide •  Ask •  Submit •  Answer questions •  Worry about mistakes being found •  Worry about data being misinterpreted •  Worry about being scooped •  Forgo money and IP and prestige???
  5. 5. So what’s in it for them? Carrot. A currency of value? Citations. $50! Do trials which share their data receive more citations?
  6. 6. Methods Cancer Microarray Trials Ntzani and Ioannidis identified 85 trials published 1999-2003 Citations ISI Web of Science Citation Index, citations from 2004-2005 Data availability Publisher and lab websites, microarray databases, WayBack Internet Archive, Oncomine Statistics Multivariate linear regression
  7. 7. Results: Eligible trials •  85 trials •  41 (48%) made data available •  Various locations: –  Lab websites (28) –  Publisher websites (4) –  SMD (6) –  GEO (6) –  GEDP (2) •  6239 total citations
  8. 8. Results: Big picture 85 clinical trials used These 85 trials were cited microarrays to study cancer 6239 times between 1999-2003 during 2004-2005 41 (48%) of these trials Trials which shared made their microarray data received data publicly available 5334 (85%) on the internet of these citations Number of trials Number of citations
  9. 9. Results: Distribution of citation counts From: Piwowar HA, Day RS, Fridsma DB (2007) Sharing Detailed Research Data Is Associated with Increased Citation Rate. PLoS ONE 2(3): e308. doi:10.1371/journal.pone.0000308
  10. 10. Results: Multivariate regression From: Piwowar HA, Day RS, Fridsma DB (2007) Sharing Detailed Research Data Is Associated with Increased Citation Rate. PLoS ONE 2(3): e308. doi:10.1371/journal.pone.0000308
  11. 11. Limitations •  Outliers –  Subset analysis of lower profile papers •  Complex timing –  Additional analysis of citations within 24 months •  Association does not imply causation –  Could be common cause
  12. 12. Data sharing help on the way •  Free, centralized databases –  SMD, GEO, ArrayExpress •  Standards –  MIAME, CONSORT •  Tools –  De-id, caBIG •  Community –  Journals, Funders, Organizations, Blogs
  13. 13. Conclusions •  70% increase in citation impact for trials which make data available •  Result holds for lower-profile publications •  Hopefully a motivation for authors to share data and thus maximize its usefulness
  14. 14. For more information •  Participate in the discussion on this paper at PLoS ONE •  Check out blogs on Open Access, Open Data, Open Notebook Science –  Peter Suber’s Open Access News blog –  Wikipedia: “Open Data” –  Nature Editorial: May 3, 2007 •  Contact Heather Piwowar for further discussion and enthusiasm! hpiwowar@alumni.pitt.edu
  15. 15. Thank you •  Peter Suber’s blog: “Open Access News” •  Wikipedia: “Open Data” •  Nature Editorial: May 3, 2007 I support Open Data and share my literature, code, and data whenever possible. Long term research interest: data reuse as an underutilized informatics resource Questions?
  1. A particular slide catching your eye?

    Clipping is a handy way to collect important slides you want to go back to later.

×