Research data and scholarly publications: Going from casual acquaintances to something more
Todd Vision, Dept of Biology, University of North Carolina at Chapel Hill and the U.S. National Evolutionary Synthesis Center
ALPSP, September 2011
Abort, Retry, Fail? Data and the scholarly literature
Peer-to-peer ‘sharing’ fails. Wicherts and colleagues requested data from 141 articles in American Psychological Association journals. “6 months later, after … 400 emails, [sending] detailed descriptions of our study aims, approvals of our ethical committee, signed assurances not to share data with others, and even our full resumes…” only 27% of authors complied. Wicherts, J.M., Borsboom, D., Kats, J., & Molenaar, D. (2006). The poor availability of psychological research data for reanalysis. American Psychologist, 61, 726-728.
[Figure: information content versus time. Labels: time of publication; specific details; general details; retirement or career change; accident; death (Michener et al. 1997)]
Bumpus HC (1898) The Elimination of the Unfit as Illustrated by the Introduced Sparrow, Passer domesticus. Biological Lectures from the Marine Biological Laboratory: 209-226.
n=3824 Source: Publishing Research Consortium, http://publishingresearch.net
Taxonomy of data archiving benefits. Modified from Beagrie et al. (2009) Keeping Research Data Safe 2
Joint Data Archiving Policy (JDAP) Data are important products of the scientific enterprise, and they should be preserved and usable for decades in the future.  As a condition for publication, data supporting the results in the article should be deposited in an appropriate public archive. Authors may elect to embargo access to the data for a period up to a year after publication.  Exceptions may be granted at the discretion of the editor, especially for sensitive information. Whitlock, M. C., M. A. McPeek, M. D. Rausher, L. Rieseberg, and A. J. Moore. 2010. Data Archiving. American Naturalist. 175(2):145-146.
The long tail of orphan data in “small science” (after B. Heidorn). “Most of the bytes are at the high end, but most of the datasets are at the low end” – Jim Gray. [Figure: volume versus rank frequency of datatype; specialized repositories (e.g. GenBank, PDB) at the high-volume end, orphan data in the tail]
Smit E (2011) Abelard and Héloise:  Why Data and Publications Belong Together. D-Lib Magazine doi:10.1045/january2011-smit
The End: To make data archiving and reuse a standard part of research and publishing.
The Means: Enable low-burden data archiving at the time of manuscript submission. Promote researcher benefits from data archiving. Promote responsible data reuse. Empower journals, societies & publishers in shared governance. Ensure sustainability and long-term preservation.
The Scope: Data underlying peer-reviewed articles in basic and applied biosciences.
[Diagram: manuscript and data submission workflows]
Integrated: Submit manuscript; Submit data; Prompt author; Manuscript metadata; Review passcode; Peer review; Acceptance notification; Curation; Data DOI; Production; Article metadata; Curation; Article publication; Data publication; Article DOI/final metadata harvested.
Non-integrated: Submit data; Data DOI; Author includes data DOI; Article publication; Article DOI/final metadata harvested.
Dryad relative to Supplementary Online Materials. * A few publisher SOM sites are exceptions to the general rule. ** Practices differ among publishers, see Smit (2011), doi:10.1045/january2011-smit
Article citation: Wu D, Wu M, Halpern A, Rusch DB, Yooseph S, Frazier M, Venter JC, Eisen JA (2011) Stalking the fourth domain in metagenomic data: searching for, discovering, and interpreting novel, deep branches in phylogenetic trees of phylogenetic marker genes. PLoS ONE 6(3): e18011. doi:10.1371/journal.pone.0018011
Data citation: Wu D, Wu M, Halpern A, Rusch DB, Yooseph S, Frazier M, Venter JC, Eisen JA (2011) Data from: Stalking the fourth domain in metagenomic data: searching for, discovering, and interpreting novel, deep branches in phylogenetic trees of phylogenetic marker genes. Dryad Digital Repository. doi:10.5061/dryad.8384
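The pairing above suggests a simple pattern: the data citation reuses the article's authors, year and title, prefixes the title with "Data from:", and names the repository and the data DOI. A minimal sketch of that pattern follows, inferred only from the single example on this slide; the function name and parameters are hypothetical and this is not an official Dryad formatter.

```python
def data_citation(authors: str, year: int, article_title: str, data_doi: str) -> str:
    """Build a data citation in the style of the example above.

    Assumption: the pattern (authors, year, 'Data from:' + article title,
    repository name, data DOI) is inferred from one slide example only.
    """
    return (
        f"{authors} ({year}) Data from: {article_title}. "
        f"Dryad Digital Repository. doi:{data_doi}"
    )


citation = data_citation(
    authors="Rebbeck CA, Leroi AM, Burt A",
    year=2011,
    article_title="Example article title",   # placeholder, not a real title
    data_doi="10.5061/dryad.EXAMPLE",         # placeholder, not a real DOI
)
print(citation)
```

Keeping the article title inside the data citation lets a reader trace the dataset back to the paper even when only one of the two DOIs is at hand.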
Rebbeck CA, Leroi AM, Burt A (2011) Mitochondrial capture by a transmissible cancer. Science 331, 303
Number of data packages
20 papers from Delsuc and  Douzery going back to 2002
By now, downloaded >1000X
Fulfilling the role of a journal
Does sharing imply that it need be altruistic? For a set of 85 cancer microarray clinical trials, 48% had publicly available data. These received 85% of the article citations, independent of journal impact factor, publication date, and author nationality. Piwowar H, et al. (2007) Sharing Detailed Research Data Is Associated with Increased Citation Rate. PLoS ONE 2(3): e308.
Data policies among bioscience journals IF=3.6 IF=6.0 IF=4.5 n=70 Piwowar HA, Chapman WW (2008) A review of journal policies for sharing research data. Presented at ELPUB2008, Nature Precedings hdl:10101/npre.2008.1700.1
The value proposition
For researchers: Increase the impact of, and citations to, published research. Preserve and make data available to verify published results, to refine methodologies, and to repurpose. Free researchers from the burden of data preservation and access.
For journals, publishers and societies: Free journals from the burden of managing supplemental data. Increase the discoverability, impact, and integrity of articles. Increase their value to the community they serve.
For funders: A cost-effective mechanism to make research more accessible. Leverage existing investments in order to enable new science.
Sustainability and governance
Business model: Long-term preservation requires a long-term organization. In Dryad’s case, a membership-based nonprofit. Revenue is received from a broad array of ‘customers’, including journals, societies, publishers, and researchers.
Deposit charges: Paid upfront, when the majority of costs are incurred. Ensure free access to the data in perpetuity. Allow revenue to naturally scale with costs (i.e. volume of deposits). Distribute costs fairly among stakeholders.
Governance: 12-member Board of Directors nominated and elected by the Membership. The Membership serves in an advisory capacity, and is a community of practice.
Costs
Moderate economies of scale are required. At 10K packages/yr, <$50/deposit, depending on curation.
What are the costs for SOM? Journal of Clinical Investigation: $300 flat fee. Ecological Archives: $250 <10Mb, more fees beyond that. FASEB: $100 per file.
Beagrie N, Eakin-Richards L, Vision TJ (2009) Business models and cost estimation: Dryad repository case study. iPRES 2010
Proposed payment plans
Journal-based: annual fee based on all research articles published/yr (~$25/per*); covers any deposits from the journal (even from prior yrs).
Voucher-based: pay in advance for some number of deposits (<$50/per deposit).
Pay-as-you-go: be invoiced retrospectively for deposits (>$50/per deposit).
Author-pays: author pays online at time of deposit; journal can still facilitate archiving through submission integration.
*These are rates for Members, which include a 10% discount
What is the return on investment? A rigorous framework is lacking, but we can look at comparators. Marginal cost of data archiving: $50/article is <2% of publication costs (>$2.5K), and 0.2% of grant costs/article (~$25K). Is the data worth 2% of the research investment? Using DNA microarray data in GEO as a model: 2,711 submissions in 2007; data reused by 3rd parties in >1,150 articles. Vision (2011) Open data and social contract of scientific publishing. BioScience, 60(5):330-330. Piwowar H, Vision TJ, Whitlock MC (2011) Data archiving is a good investment. Nature 473:285
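The comparator arithmetic on this slide can be checked in a few lines. The sketch below takes the slide's rounded figures at face value: ">$2.5K" and ">1,150" are treated as lower bounds and "~$25K" as a point estimate. The variable and function names are illustrative only.

```python
# Figures quoted on the slide, used here as round illustrative numbers.
ARCHIVING_COST = 50.0        # marginal archiving cost, $ per article
PUBLICATION_COST = 2_500.0   # slide says ">$2.5K", so a lower bound
GRANT_COST = 25_000.0        # slide says "~$25K" of grant cost per article
GEO_SUBMISSIONS_2007 = 2_711
GEO_REUSE_ARTICLES = 1_150   # slide says ">1,150", so a lower bound


def percent_of(part: float, whole: float) -> float:
    """Return part expressed as a percentage of whole."""
    return 100.0 * part / whole


pub_share = percent_of(ARCHIVING_COST, PUBLICATION_COST)
grant_share = percent_of(ARCHIVING_COST, GRANT_COST)
reuse_ratio = GEO_REUSE_ARTICLES / GEO_SUBMISSIONS_2007

print(f"Archiving as share of publication cost: {pub_share:.1f}%")
print(f"Archiving as share of grant cost per article: {grant_share:.1f}%")
print(f"Third-party reuse articles per GEO submission: {reuse_ratio:.2f}")
```

Because the publication cost is a lower bound, the 2% share is an upper bound, which is why the slide writes "<2%".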
http://datadryad.org http://blog.datadryad.org http://datadryad.org/wiki http://code.google.com/p/dryad dryad-users@nescent.org      @datadryad       Dryad
A very incomplete list of contributors. JDAP: M. Whitlock. DryadUS: R. Scherle, E. Feinstein, J. Greenberg, H. Piwowar, P. Schaeffer. DryadUK: B. Hole, Max Wilkinson, D. Shotton. Sustainability planning: N. Beagrie, L. Eakin-Richards

Why You Should Replace Windows 11 with Nitrux Linux 3.5.0 for enhanced perfor...
SOFTTECHHUB
 
GridMate - End to end testing is a critical piece to ensure quality and avoid...
GridMate - End to end testing is a critical piece to ensure quality and avoid...GridMate - End to end testing is a critical piece to ensure quality and avoid...
GridMate - End to end testing is a critical piece to ensure quality and avoid...
ThomasParaiso2
 
Encryption in Microsoft 365 - ExpertsLive Netherlands 2024
Encryption in Microsoft 365 - ExpertsLive Netherlands 2024Encryption in Microsoft 365 - ExpertsLive Netherlands 2024
Encryption in Microsoft 365 - ExpertsLive Netherlands 2024
Albert Hoitingh
 

Recently uploaded (20)

20240605 QFM017 Machine Intelligence Reading List May 2024
20240605 QFM017 Machine Intelligence Reading List May 202420240605 QFM017 Machine Intelligence Reading List May 2024
20240605 QFM017 Machine Intelligence Reading List May 2024
 
UiPath Test Automation using UiPath Test Suite series, part 6
UiPath Test Automation using UiPath Test Suite series, part 6UiPath Test Automation using UiPath Test Suite series, part 6
UiPath Test Automation using UiPath Test Suite series, part 6
 
Essentials of Automations: The Art of Triggers and Actions in FME
Essentials of Automations: The Art of Triggers and Actions in FMEEssentials of Automations: The Art of Triggers and Actions in FME
Essentials of Automations: The Art of Triggers and Actions in FME
 
GraphRAG is All You need? LLM & Knowledge Graph
GraphRAG is All You need? LLM & Knowledge GraphGraphRAG is All You need? LLM & Knowledge Graph
GraphRAG is All You need? LLM & Knowledge Graph
 
Removing Uninteresting Bytes in Software Fuzzing
Removing Uninteresting Bytes in Software FuzzingRemoving Uninteresting Bytes in Software Fuzzing
Removing Uninteresting Bytes in Software Fuzzing
 
Alt. GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using ...
Alt. GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using ...Alt. GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using ...
Alt. GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using ...
 
Artificial Intelligence for XMLDevelopment
Artificial Intelligence for XMLDevelopmentArtificial Intelligence for XMLDevelopment
Artificial Intelligence for XMLDevelopment
 
Free Complete Python - A step towards Data Science
Free Complete Python - A step towards Data ScienceFree Complete Python - A step towards Data Science
Free Complete Python - A step towards Data Science
 
DevOps and Testing slides at DASA Connect
DevOps and Testing slides at DASA ConnectDevOps and Testing slides at DASA Connect
DevOps and Testing slides at DASA Connect
 
Uni Systems Copilot event_05062024_C.Vlachos.pdf
Uni Systems Copilot event_05062024_C.Vlachos.pdfUni Systems Copilot event_05062024_C.Vlachos.pdf
Uni Systems Copilot event_05062024_C.Vlachos.pdf
 
みなさんこんにちはこれ何文字まで入るの?40文字以下不可とか本当に意味わからないけどこれ限界文字数書いてないからマジでやばい文字数いけるんじゃないの?えこ...
みなさんこんにちはこれ何文字まで入るの?40文字以下不可とか本当に意味わからないけどこれ限界文字数書いてないからマジでやばい文字数いけるんじゃないの?えこ...みなさんこんにちはこれ何文字まで入るの?40文字以下不可とか本当に意味わからないけどこれ限界文字数書いてないからマジでやばい文字数いけるんじゃないの?えこ...
みなさんこんにちはこれ何文字まで入るの?40文字以下不可とか本当に意味わからないけどこれ限界文字数書いてないからマジでやばい文字数いけるんじゃないの?えこ...
 
A tale of scale & speed: How the US Navy is enabling software delivery from l...
A tale of scale & speed: How the US Navy is enabling software delivery from l...A tale of scale & speed: How the US Navy is enabling software delivery from l...
A tale of scale & speed: How the US Navy is enabling software delivery from l...
 
Enchancing adoption of Open Source Libraries. A case study on Albumentations.AI
Enchancing adoption of Open Source Libraries. A case study on Albumentations.AIEnchancing adoption of Open Source Libraries. A case study on Albumentations.AI
Enchancing adoption of Open Source Libraries. A case study on Albumentations.AI
 
Goodbye Windows 11: Make Way for Nitrux Linux 3.5.0!
Goodbye Windows 11: Make Way for Nitrux Linux 3.5.0!Goodbye Windows 11: Make Way for Nitrux Linux 3.5.0!
Goodbye Windows 11: Make Way for Nitrux Linux 3.5.0!
 
GraphSummit Singapore | The Future of Agility: Supercharging Digital Transfor...
GraphSummit Singapore | The Future of Agility: Supercharging Digital Transfor...GraphSummit Singapore | The Future of Agility: Supercharging Digital Transfor...
GraphSummit Singapore | The Future of Agility: Supercharging Digital Transfor...
 
Epistemic Interaction - tuning interfaces to provide information for AI support
Epistemic Interaction - tuning interfaces to provide information for AI supportEpistemic Interaction - tuning interfaces to provide information for AI support
Epistemic Interaction - tuning interfaces to provide information for AI support
 
20240609 QFM020 Irresponsible AI Reading List May 2024
20240609 QFM020 Irresponsible AI Reading List May 202420240609 QFM020 Irresponsible AI Reading List May 2024
20240609 QFM020 Irresponsible AI Reading List May 2024
 
Why You Should Replace Windows 11 with Nitrux Linux 3.5.0 for enhanced perfor...
Why You Should Replace Windows 11 with Nitrux Linux 3.5.0 for enhanced perfor...Why You Should Replace Windows 11 with Nitrux Linux 3.5.0 for enhanced perfor...
Why You Should Replace Windows 11 with Nitrux Linux 3.5.0 for enhanced perfor...
 
GridMate - End to end testing is a critical piece to ensure quality and avoid...
GridMate - End to end testing is a critical piece to ensure quality and avoid...GridMate - End to end testing is a critical piece to ensure quality and avoid...
GridMate - End to end testing is a critical piece to ensure quality and avoid...
 
Encryption in Microsoft 365 - ExpertsLive Netherlands 2024
Encryption in Microsoft 365 - ExpertsLive Netherlands 2024Encryption in Microsoft 365 - ExpertsLive Netherlands 2024
Encryption in Microsoft 365 - ExpertsLive Netherlands 2024
 

Research data and scholarly publications: going from casual acquaintances to something more

  • 1. Research data and scholarly publications: Going from casual acquaintances to something more Todd Vision Dept of Biology, University of North Carolina at Chapel Hill and the U.S. National Evolutionary Synthesis Center ALPSP, September 2011 Abort, Retry, Fail? Data and the scholarly literature
  • 2.
  • 3.
  • 4. Peer-to-peer ‘sharing’ fails Wicherts and colleagues requested data from 141 articles in American Psychological Association journals. “6 months later, after … 400 emails, [sending] detailed descriptions of our study aims, approvals of our ethical committee, signed assurances not to share data with others, and even our full resumes…” only 27% of authors complied Wicherts, J.M., Borsboom, D., Kats, J., & Molenaar, D. (2006). The poor availability of psychological research data for reanalysis. American Psychologist, 61, 726-728.
  • 5. Time of publication Specific details General details Retirement or career change Information Content Accident Death Time (Michener et al. 1997)
  • 6. Bumpus HC (1898) The Elimination of the Unfit as Illustrated by the Introduced Sparrow, Passer domesticus. Biological Lectures from the Marine Biological Laboratory: 209-226.
  • 7.
  • 8. n=3824 Source: Publishing Research Consortium, http://publishingresearch.net
  • 9.
  • 10. Taxonomy of data archiving benefits Modified from Beagrie et al. (2009) Keeping Research Data Safe 2
  • 11. Joint Data Archiving Policy (JDAP) Data are important products of the scientific enterprise, and they should be preserved and usable for decades in the future. As a condition for publication, data supporting the results in the article should be deposited in an appropriate public archive. Authors may elect to embargo access to the data for a period up to a year after publication. Exceptions may be granted at the discretion of the editor, especially for sensitive information. Whitlock, M. C., M. A. McPeek, M. D. Rausher, L. Rieseberg, and A. J. Moore. 2010. Data Archiving. American Naturalist. 175(2):145-146.
  • 12. The long tail of orphan data in “small science” after B. Heidorn “Most of the bytes are at the high end, but most of the datasets are at the low end” – Jim Gray Specialized repositories (e.g. GenBank, PDB) Volume Orphan data Rank frequency of datatype
  • 13. Smit E (2011) Abelard and Héloise: Why Data and Publications Belong Together. D-Lib Magazine doi:10.1045/january2011-smit
  • 14. The End To make data archiving and reuse a standard part of research and publishing. The Means Enable low-burden data archiving at the time of manuscript submission. Promote researcher benefits from data archiving. Promote responsible data reuse. Empower journals, societies & publishers in shared governance. Ensure sustainability and long-term preservation. The Scope Data underlying peer-reviewed articles in basic and applied biosciences.
  • 16. Integrated Submit manuscript Prompt author Manuscript metadata
  • 17. Integrated Submit manuscript Submit data Prompt author Manuscript metadata
  • 18. Integrated Submit manuscript Submit data Prompt author Manuscript metadata Review passcode Peer review
  • 19. Integrated Submit manuscript Submit data Prompt author Manuscript metadata Review passcode Peer review Acceptance notification Curation Data DOI Production
  • 20. Integrated Submit manuscript Submit data Prompt author Manuscript metadata Review passcode Peer review Acceptance notification Curation Data DOI Production Article metadata Curation
  • 21. Integrated Submit manuscript Submit data Prompt author Manuscript metadata Review passcode Peer review Acceptance notification Curation Data DOI Production Article metadata Curation Article Publication Data publication Article DOI/final metadata harvested
  • 22.
  • 23. Non-integrated Integrated Submit manuscript Submit data Prompt author Manuscript metadata Review passcode Peer review Submit data Acceptance notification Curation Data DOI Production Article metadata Curation Article Publication Data publication Article DOI/final metadata harvested
  • 24. Non-integrated Integrated Submit manuscript Submit data Prompt author Manuscript metadata Review passcode Peer review Submit data Acceptance notification Curation Data DOI Production Author includes data DOI Data DOI Article metadata Curation Article Publication Data publication Article DOI/final metadata harvested
  • 25. Non-integrated Integrated Submit manuscript Submit data Prompt author Manuscript metadata Review passcode Peer review Submit data Acceptance notification Curation Data DOI Production Author includes data DOI Data DOI Article metadata Curation Article Publication Data publication Article publication DOI/final metadata harvested Article DOI/final metadata harvested
  • 26. Dryad relative to Supplementary Online Materials * A few publisher SOM sites are exceptions to the general rule ** Practices differ among publishers, see Smit (2011), doi:10.1045/january2011-smit
  • 27. Article citation Wu D, Wu M, Halpern A, Rusch DB, Yooseph S, Frazier M, Venter JC, Eisen JA (2011) Stalking the fourth domain in metagenomic data: searching for, discovering, and interpreting novel, deep branches in phylogenetic trees of phylogenetic marker genes. PLoS ONE 6(3): e18011. doi:10.1371/journal.pone.0018011 Data citation Wu D, Wu M, Halpern A, Rusch DB, Yooseph S, Frazier M, Venter JC, Eisen JA (2011) Data from: Stalking the fourth domain in metagenomic data: searching for, discovering, and interpreting novel, deep branches in phylogenetic trees of phylogenetic marker genes. Dryad Digital Repository. doi:10.5061/dryad.8384
  • 28. Rebbeck CA, Leroi AM, Burt A (2011) Mitochondrial capture by a transmissible cancer. Science 331, 303
  • 29.
  • 30. Number of data packages
  • 31. 20 papers from Delsuc and Douzery going back to 2002
  • 33. Fulfilling the role of a journal
  • 34.
  • 35. Does sharing imply that it need be altruistic? For a set of 85 cancer microarray clinical trials 48% had publicly available data These received 85% of the article citations Independent of journal impact factor, publication date, author nationality Piwowar H, et al. (2007) Sharing Detailed Research Data Is Associated with Increased Citation Rate. PLoS ONE 2(3): e308.
  • 36. Does sharing imply that it need be altruistic? For a set of 85 cancer microarray clinical trials 48% had publicly available data These received 85% of the article citations Independent of journal impact factor, publication date, author nationality Piwowar H, et al. (2007) Sharing Detailed Research Data Is Associated with Increased Citation Rate. PLoS ONE 2(3): e308.
  • 37. Data policies among bioscience journals IF=3.6 IF=6.0 IF=4.5 n=70 Piwowar HA, Chapman WW (2008) A review of journal policies for sharing research data. Presented at ELPUB2008, Nature Precedings hdl:10101/npre.2008.1700.1
  • 38. The value proposition For researchers Increase the impact of, and citations to, published research. Preserve and make data available to verify published results, to refine methodologies, and to repurpose. Free researchers from the burden of data preservation and access. For journals, publishers and societies Free journals from the burden of managing supplemental data Increase the discoverability, impact, and integrity of articles Increase their value to the community they serve. For funders A cost-effective mechanism to make research more accessible Leverage existing investments in order to enable new science
  • 39. Sustainability and governance Business model Long-term preservation requires a long-term organization In Dryad’s case, a membership-based nonprofit Revenue received from a broad array of ‘customers’, including journals, societies, publishers, and researchers Deposit charges Paid upfront, when the majority of costs are incurred Ensure free access to the data in perpetuity Allow revenue to naturally scale with costs (i.e. volume of deposits) Distribute costs fairly among stakeholders Governance 12-member Board of Directors nominated, elected by Membership Membership serves in advisory capacity, and is a community of practice
  • 40. Costs Moderate economies of scale are required At 10K packages/yr, <$50/deposit, depending on curation What are the costs for SOM? Journal of Clinical Investigation: $300 flat fee Ecological Archives: $250 <10Mb, more fees beyond that FASEB: $100 per file Beagrie N, Eakin-Richards L, Vision TJ (2009) Business models and cost estimation: Dryad repository case study. iPRES 2010
  • 41. Proposed payment plans Journal-based annual fee based on all research articles published/yr (~$25/per*) covers any deposits from the journal (even from prior yrs) Voucher-based pay in advance for some number of deposits (<$50/per deposit) Pay-as-you-go: be invoiced retrospectively for deposits (>$50/per deposit) Author-pays Author pays online at time of deposit Journal can still facilitate archiving through submission integration *These are rates for Members, which include a 10% discount
  • 42. What is the return on investment? A rigorous framework is lacking But we can look at comparators Marginal cost of data archiving $50/article is <2% of publication costs (>$2.5K) And 0.2% of grant costs/article (~$25K) Is the data worth 2% of the research investment? Using DNA microarray data in GEO as a model 2,711 submissions in 2007 Data reused by 3rd parties in >1,150 articles Vision (2011) Open data and social contract of scientific publishing. BioScience, 60(5):330-330 Piwowar H, Vision TJ, Whitlock MC (2011) Data archiving is a good investment. Nature 473:285
  • 43.
  • 44. http://datadryad.org http://blog.datadryad.org http://datadryad.org/wiki http://code.google.com/p/dryad dryad-users@nescent.org @datadryad Dryad
  • 45. A very incomplete list of contributors JDAP: M. Whitlock DryadUS: R. Scherle, E. Feinstein, J. Greenberg, H. Piwowar, P. Schaeffer DryadUK: B. Hole, Max Wilkinson, D. Shotton Sustainability planning: N. Beagrie, L. Eakin-Richards
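
The cost comparison on slide 42 can be checked with a few lines of arithmetic. The sketch below is only an illustration: the dollar figures are the ones quoted on the slide, while the variable names are assumptions introduced here and are not from the talk.

```python
# Figures quoted on slide 42; variable names are illustrative only.
deposit_cost = 50.0          # marginal archiving cost per article, USD
publication_cost = 2_500.0   # lower bound on publication cost per article, USD
grant_cost = 25_000.0        # approximate grant cost per article, USD

# Share of each budget that one data deposit would represent.
share_of_publication = deposit_cost / publication_cost
share_of_grant = deposit_cost / grant_cost

print(f"{share_of_publication:.1%} of publication costs")
print(f"{share_of_grant:.1%} of grant costs per article")
```

Because the slide gives the publication cost as a lower bound (more than $2.5K), the first ratio is an upper bound, which matches the slide's "<2%".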

Editor's Notes

  1. .
  2. .
  3. .
  4. This is a riskier workflow – it is more dependent on the author to make sure the publication contains a link to the data.
  5. Getting authors and journals to do this sensibly on the article side is not easy. This is a relatively good example – but actual practice is all over the map. Sometimes in acknowledgements, sometimes in main text, sometimes in a standardized data availability section set up by the journal. One interesting area of agreement at the Data Citation workshop Micah mentioned was that the original article should list the data citation in the reference list, for better indexing. At any rate, there is much room for standardization, and awareness-raising among both authors and journals.
  6. For funders, we have estimated that each publication costs 16K UK pounds worth of NSF funding. For another repository we have studied (GEO), 2711 data sets submitted in 2007 made substantive contributions to more than 1150 published articles in 2007-2010 alone, which would cost >18M UKP in original research grants.
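
The figure in note 6 appears to follow from multiplying the estimated funding per publication by the number of articles that reused GEO data. A minimal sketch of that calculation, assuming this is how the estimate was derived (the note does not spell it out) and using hypothetical variable names:

```python
# Inputs taken from note 6; the derivation itself is an assumption.
cost_per_publication_gbp = 16_000   # estimated funding per publication, UK pounds
reusing_articles = 1_150            # lower bound on articles reusing 2007 GEO data

# Implied grant funding needed to produce the same articles from scratch.
implied_grant_value_gbp = cost_per_publication_gbp * reusing_articles

print(f"Implied value: {implied_grant_value_gbp / 1e6:.1f}M UKP")
```

Since the article count is a lower bound, the product is a lower bound as well, consistent with the ">18M UKP" stated in the note.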