On the reproducibility of science. Melissa Haendel. Beyond the PDF2, 20 March 2013. @firstname.lastname@example.org
Do we know if the infrastructure is actually broken? The science cycle (slide from Gully Burns).
This is a broken data story. The science cycle. Image: http://www.joinchangenation.org/blog/post/roadblocks-on-the-pathway-to-citizenship
Journal guidelines for methods are often poor and space is limited. “All companies from which materials were obtained should be listed.” (A well-known journal.) Reproducibility is dependent, at a minimum, on using the same resources. But…
An experiment in reproducibility. Hypothesis: Antibodies in the published literature are not uniquely identifiable. Gather journal articles from 5 domains (immunology, cell biology, neuroscience, developmental biology, general biology) and 3 impact factor levels (high, medium, low). Sample: 28 journals, 119 papers, 454 antibodies (408 commercial, 46 non-commercial). Identifying questions: Is the antibody identifiable on the vendor site? Is the catalog number reported? Is the source organism reported? Is the antibody target identifiable?
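The four identifying questions can be turned into a simple per-antibody checklist. The following is a minimal sketch, not the method used in the study: the record type `AntibodyRecord`, the function names, and the rule that an antibody counts as identifiable only when all four questions are answered "yes" are assumptions made for illustration, and the toy records are invented.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class AntibodyRecord:
    """One antibody mention, with a yes/no answer per identifying question."""
    found_on_vendor_site: bool
    catalog_number_reported: bool
    source_organism_reported: bool
    target_identifiable: bool
    commercial: bool = True


def is_identifiable(rec: AntibodyRecord) -> bool:
    # ASSUMPTION: all four questions must be "yes". The slides do not state
    # how the answers were combined, so this rule is only illustrative.
    return all((
        rec.found_on_vendor_site,
        rec.catalog_number_reported,
        rec.source_organism_reported,
        rec.target_identifiable,
    ))


def percent_identifiable(records: List[AntibodyRecord]) -> float:
    """Share of records passing the checklist, as a percentage."""
    if not records:
        return 0.0
    passed = sum(is_identifiable(r) for r in records)
    return 100.0 * passed / len(records)


# Hypothetical toy records, NOT data from the study.
toy = [
    AntibodyRecord(True, True, True, True),
    AntibodyRecord(True, False, True, True),
    AntibodyRecord(False, False, False, True, commercial=False),
    AntibodyRecord(True, True, True, True, commercial=False),
]

commercial_only = [r for r in toy if r.commercial]
print(percent_identifiable(toy))
print(percent_identifiable(commercial_only))
```

Splitting the records by the `commercial` flag, or by domain and impact factor level, would give the kind of per-group percentages summarized on the slides that follow.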
The data shows… Approximately half of antibodies are not uniquely identifiable in 119 publications. [Bar chart: percent identifiable (axis 0% to 60%) for commercial antibodies (n=408) and non-commercial antibodies (n=46).]
Unique identification of commercial antibodies varies across discipline and impact factor. [Bar chart: percent identifiable (axis 0% to 100%) by discipline (Immunology, Neuroscience, Dev Bio, Cell Bio, General Bio) and impact factor (High, Medium, Low); sample sizes as extracted: n=124, n=94, n=87, n=95, n=56.] In some domains high impact journals have worse reporting, and in others it is the opposite.
Scientists really do put their data in cardboard boxes.
What are we going to do about it? Promote better reporting guidelines in journals; include reviewing guidelines; provide tools to reference research resources with unique and persistent IDs/URIs; train librarians and other data stewards to apply data standards.