Supporting scientific communities by publishing data

Describes how the Dryad data repository http://datadryad.org/ is a model of scientific data sharing and preservation, and suggests how libraries and librarians may use Dryad to promote data sharing. Presented at Joint OpenAIRE/LIBER workshop ‘Dealing with Data - what’s the role for the library?’, Ghent (Belgium), 28 May 2013.

Usage Rights

CC Attribution-NonCommercial License

  • Dryad is a repository that holds data associated with scientific publications (mostly articles, but also theses and book chapters). Scope: initially evolutionary biology and ecology; now also medicine, public health, and indeed “any scholarly publication”. Content policy: Dryad is an international repository of data underlying peer-reviewed scientific and medical literature, particularly data for which no specialized repository exists. What makes Dryad unique is that all material in Dryad is associated with a scholarly publication. We preserve and make public any information that the researcher used to generate graphs or statistical results in the article. We also have “nontraditional” and unusual data, like the amount of force used by a shark to bite different types of prey, and how people perceive simulated monsters.
  • Non-proprietary formats are preferred, of course. Yes, we take PDFs (and many authors submit files in this format), but we would rather they didn’t. Our curators encourage depositors to submit reusable formats.
  • New home page on the redesigned website. Stats are from late April 2013.
  • This is the landing page for a data package. Dryad displays the cover for integrated partner journals; clicking on the cover takes the user to the article. This is an article from August 2012. This page lists all the files associated with an article, and gives the number of times each has been downloaded. Note that these authors have included a README file with context about the data and instructions on how to use it. Below the list of files, Dryad displays both the article citation and a suggested citation for the data (see next slide).
  • There is evidence for widespread data reuse. This 2009 data package has been downloaded over 4 thousand times. Some of this download activity is known to be from use of data in the classroom. Note that every data package page in Dryad has a link to the original journal article. IF TIME PERMITS: Dryad has the potential to transform the way research data are communicated and preserved. The credibility and effectiveness of the research enterprise are due in large part to the social contract behind scholarly publishing. Researchers disclose their work to their peers in return for professional credit. In so doing, they also expose their findings to be confirmed or refuted, and enable other researchers to build upon their results. Dryad seeks to extend this social contract to research data by providing a model for how a disciplinary repository can motivate researchers to disclose the data that is of the greatest value for scientific reuse (that associated with publications) and realize the manifold benefits of free access to scientific data in perpetuity.
  • This slide shows the number of downloads for all data packages in Dryad and for the 4 journals with the most data packages in Dryad. It was prepared from data generated on 2013-4-24.
  • Dryad is distinguished by the close association of data deposition with the process and business of scholarly publishing, and by using article publication as a model for how researchers can benefit from data sharing infrastructure. Metadata: article metadata is the core of the metadata for data sets.
  • Curation by librarians and information science students at the University of North Carolina School of Information and Library Science. Open data, open source, open organization: a new nonprofit membership organization made up of members (journals, publishers, libraries, as well as data producers, funders, etc.).
  • There is a role for liaison librarians, or hybrid roles where librarians have collegial relationships with scientists and can play a part in suggesting resources. But the first step is to ask about data, explicitly focusing on data management when you can. Don’t wait for researchers to ask. Ask them: Where is it? What are your plans for preserving it and/or making it available? Tell them: your data is important and deserves to be preserved. You can suggest that researchers archive their data in a public repository, like Dryad. Point out how authors can see view and download stats (but not the identity of users); many authors are surprised at how many times their data has been downloaded. A picture is worth a thousand words: find data from the researcher’s field in Dryad and show how many times it’s been viewed or downloaded. Scenarios: Younger scientists can gain additional visibility from making their data files available and citable, and by adding a section for data to their CVs; consider how this post-doc's CV highlights data in Dryad on his list of publications. Research teams and established academics can archive data from years of published research. & Delsuc) YOU TOO can deposit YOUR data in Dryad: yes, we have Information Science data! (Piwowar data)
  • Talk to authors about the benefits of data sharing. Visibility: Making your data available online (and linking it back to the publication) provides a new pathway for others to learn about your work through topical searches. There is a demonstrated correlation between the availability of a dataset linked to a journal article and that article's citation rank. Citability: all data packages and data files you deposit will receive persistent, resolvable Digital Object Identifiers (DOIs) that can be used in a citation as well as listed on your CV. When you re-use your data, or when others use your data, your original data publication can be cited in the same way that an article would be, gaining you academic credit. Workload reduction: if you receive individual requests for data, you can simply direct them to the items in Dryad. Preservation: your data files will be safely archived in perpetuity. We will keep the files intact and migrate them to new formats as old formats become obsolete; you will not have to worry whether Excel 2003 files will open in Excel 2023! Impact: you will garner citations through the reuse of your data, you can monitor the use of your data through Dryad's usage statistics (available for each data file in the repository), and you may gain opportunities for collaboration, etc.
  • There are now many sources of instructions for data management: find the one that’s best for your audience, or create one!
  • Promote the use of DOIs even for less formal communications, including social media, because we are as interested in altmetrics for data as we are in data citations. Reinforce the importance of putting the data DOI in the published article (many journals have no policies about this, so authors need to make sure that the DOI appears in their article). While the inclusion of DOIs on internal paperwork may not appear very significant, with more and more CV-type information now being posted online by researchers on LinkedIn, Twitter, and other social media services, over time this will make a difference (both in terms of metrics and, hopefully, in terms of cultural change).
  • A little bit about Dryad’s citation philosophy:
  • Every landing page for data in Dryad provides suggested citations for both the article and its data. Eagle eyes will note on this example that the author list for the data is not the same as for the article, at the authors’ request. An independent data citation can provide an incentive for members of collaborations who want to get primary credit for their contributions. So this is how we recommend authors and journals cite data in Dryad. But are they doing so?
  • Authors and journals are not yet sure how and where to include the link to the underlying data in the “data-sharing” article. Here’s a not-so-good example: this citation was at the end of a list of “References and Notes”. The data statement has no URL or DOI, making it difficult for others to locate the data.
  • A digression, to report results of recent work by a student at the University of North Carolina School of Information Science, Metadata Research Center (MRC), who looked at randomly selected articles associated with data in Dryad, from our partner journals and from after May 2011. There is no consistency in how DOIs are included from one journal to the next, but some of the journals do have an internal consistency. Standards for data citation are still evolving. Journals have yet to agree on where to place data citations, and authors are just starting to become familiar with the concept. We have work to do here, and this is a dynamic arena.
  • One more idea for libraries: extend the role of librarians whenever possible to explicitly focus on the research data life cycle and scholarly communications. In many settings, a librarian’s existing collaborative role with researchers is a good starting point for raising questions about data preservation and sharing. This is a natural role for us: we are already involved with researchers at the right time and in the right place, and we have the expertise and opportunity, so let’s use it. New opportunities and emerging roles are arising all the time. What other ideas do you have? As someone said last week at the Data Publication meetings in Oxford, “libraries are old, but data sharing is not”.
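
The notes above describe DOIs as persistent and resolvable, and encourage using them in articles, CVs, and social media. As a minimal sketch of what "resolvable" means in practice, the hypothetical helper below turns a DOI written in the `doi:` style into a link through the public https://doi.org/ resolver. The function name `doi_to_url` is an assumption made for illustration and is not part of any Dryad tool; the two example DOIs are the ones cited on slide 14 of this deck.

```python
def doi_to_url(doi: str) -> str:
    """Turn 'doi:10.x/y' or '10.x/y' into a link through the doi.org resolver.

    Hypothetical helper for illustration only. DOIs containing unusual
    characters may also need percent-encoding, which this sketch skips.
    """
    cleaned = doi.strip()
    # Drop an optional 'doi:' prefix, in any letter case.
    if cleaned.lower().startswith("doi:"):
        cleaned = cleaned[len("doi:"):].strip()
    # Every DOI begins with the directory indicator '10.'.
    if not cleaned.startswith("10."):
        raise ValueError(f"not a DOI: {doi!r}")
    return "https://doi.org/" + cleaned


# Example DOIs taken from slide 14 of this presentation.
examples = [
    "doi:10.1890/0012-9623-90.2.205",
    "doi:10.1016/j.tree.2010.11.006",
]
for doi in examples:
    print(doi_to_url(doi))
```

A link in this form can be pasted into a CV, a grant proposal, or a social media post, which is the habit the notes recommend building.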

Supporting scientific communities by publishing data Presentation Transcript

  • 1. Supporting scientific communities by publishing data
    Dryad Digital Repository
    Peggy Schaeffer
    OpenAIRE/LIBER Workshop, May 28, 2013, Ghent, Belgium
  • 2. Outline
    • Introduction to Dryad
      – Joint Data Archiving Policy (JDAP)
      – How Dryad works with journals and publishers
    • How librarians can use Dryad
    • Ideas for librarians to support research data management & publication
    DataDryad.org
  • 3. An international repository of data underlying scientific and medical publications
    datadryad.org
  • 4. Dryad welcomes data in any format
  • 5. Joint Data Archiving Policy
    < Journal > requires, as a condition for publication, that data supporting the results in the paper should be archived in an appropriate public archive, such as <list of approved archives>. Data are important products of the scientific enterprise, and they should be preserved and usable for decades in the future. Authors may elect to have the data publicly available at time of publication, or, if the technology of the archive allows, may opt to embargo access to the data for a period up to a year after publication. Exceptions may be granted at the discretion of the editor, especially for sensitive information such as human subject data or the location of endangered species.
  • 6. Researchers and journals are using Dryad for archiving
  • 7. …and using the data for research
  • 8. DataDryad.org 8
  • 9. Journals benefit when data is reused
    Journal                           Integration date   Data packages   Data downloads   Ave. downloads per package
    All Journals                                         2765            127,396          46
    Molecular Ecology                 2009-11-29         615             23,604           38
    Evolution                         2010-5-4           380             12,524           33
    American Naturalist               2009-8-29          208             11,195           54
    Journal of Evolutionary Biology   2010-7-12          205             5,729            28
    A “Data Package” is all of the data files for a journal article. All Dryad data packages link to the associated journal article.
  • 10. What makes Dryad unique
    1. Aligns data deposition process with the process and business of scholarly publishing;
    2. Article publication as a model for how researchers can benefit from data sharing;
    3. Motivates researchers to disclose data of the greatest value for scientific reuse;
    4. Article metadata = foundation of metadata for associated data.
  • 11. What makes Dryad unique
    5. Data files in Dryad are curated;
    6. All contents are freely available via CC0 waiver;
    7. Dryad is an open source enterprise, built on DSpace, with open development & open documentation; and
    8. Dryad is a nonprofit organization responsive to and managed by its stakeholders.
  • 12. How can librarians use Dryad
    • Ask scientists about their data!
    • Refer grant writers to Dryad; they can build it in to a data management plan
    • Show Dryad to researchers
    • Add a link to Dryad when you offer resources for researchers
    • Demo Dryad to show what making data files public looks like.
  • 13. Data sharing: advantages to authors
    • Visibility
    • Citability
    • Workload reduction
    • Preservation
    • Impact and opportunity
  • 14. Other ways librarians can support scientific data management
    Consult and share best practice guidelines:
    • Some Simple Guidelines for Effective Data Management, Borer ET, Seabloom EW, Jones MB, Schildhauer M (2009). Bulletin of the Ecological Society of America 90(2), 205-214. doi:10.1890/0012-9623-90.2.205.
    • Data archiving in ecology and evolution: best practices, MC Whitlock (2010). Trends in Ecology & Evolution, 26(2), p. 61-65. doi:10.1016/j.tree.2010.11.006.
  • 15. Use & promote the use of good data citations with DOIs
    • Data citation conventions are evolving
    • Authors, journals and publishers need to see good models of data citation
      – Articles
      – CVs
      – Grant proposals
    • Help make data citation and data DOIs familiar
    • Use DOIs in social media
  • 16. Dryad’s citation philosophy:
    • Cite both the article and the data: they are both useful research products
    • But limit data citations to one “data package” per article; this eliminates most concerns about the size/granularity of data files
  • 17. DataDryad.org 17
  • 18. 18DataDryad.org
  • 19. Checking citations to the data in the data-sharing article
    For 338 articles associated with Dryad data:
      -- 253 did include a DOI for the data (75%)
      -- 85 did not (25%)
    • where the DOIs were located:
      -- dedicated section (Data accessibility): n = 148
      -- in or near article header: n = 43
      -- in-text (Methods, Acknowledgments): n = 71
      -- in References: n = 28 (but: 17 are not actual full citations in the style of the other citations).
  • 20. Encourage opportunities for libraries and librarians
    • Scientific data positions at academic libraries, universities, & research institutes
    • Scholarly communications officers
    • Collaborative roles with researchers can extend to consulting on data sharing
    • Other emerging roles
    • ?????
  • 21. To learn more:
    • Repository: http://datadryad.org
    • News: http://blog.datadryad.org
    • Documentation: http://wiki.datadryad.org
    • Twitter: @datadryad
    Peggy Schaeffer, pschaeffer@datadryad.org
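
The figures on slides 9 and 19 can be rechecked with a few lines of arithmetic. The sketch below is a reader's check, not part of the original presentation: it recomputes the average downloads per data package and the share of articles that include a data DOI, using only numbers transcribed from those two slides. All variable and function names are hypothetical and chosen for illustration.

```python
# Slide 9: (journal, data packages, data downloads, average shown on the slide).
slide9 = [
    ("All Journals", 2765, 127396, 46),
    ("Molecular Ecology", 615, 23604, 38),
    ("Evolution", 380, 12524, 33),
    ("American Naturalist", 208, 11195, 54),
    ("Journal of Evolutionary Biology", 205, 5729, 28),
]


def average_downloads(packages: int, downloads: int) -> int:
    """Average downloads per data package, rounded to a whole number."""
    return round(downloads / packages)


for journal, packages, downloads, reported in slide9:
    computed = average_downloads(packages, downloads)
    print(f"{journal}: computed {computed}, slide reports {reported}")

# Slide 19: articles with and without a DOI for the data.
with_doi, without_doi = 253, 85
total = with_doi + without_doi
share_with = round(100 * with_doi / total)
share_without = round(100 * without_doi / total)
print(f"{total} articles: {share_with}% with a data DOI, {share_without}% without")

# Slide 19: where the DOIs were located.
locations = {
    "dedicated section (Data accessibility)": 148,
    "in or near article header": 43,
    "in-text (Methods, Acknowledgments)": 71,
    "in References": 28,
}
location_total = sum(locations.values())
print(f"DOI locations counted: {location_total}")
```

The location counts add up to more than the 253 articles that include a DOI. One plausible reading, which the slide itself does not state, is that some articles give the DOI in more than one place.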