
Ucmp 20150407


A description of services provided by the California Digital Library (CDL) to researchers at the University of California

Published in: Education

  1. Perry Willett, Stephen Abrams, University of California Curation Center. CDL services for UC researchers. Museum of Paleontology, UC Berkeley, April 7, 2015
  2. CDL Services and the Research Cycle
  3. Sharing your data is good for science: reproducibility; integrity; enables collaboration and synergy; minimizes needless duplication of effort (© Universal Pictures)
  4. Sharing your data is good for scientists: get credit for your work; higher impact factor. “Papers with publicly available microarray data received more citations than similar papers that did not make their data available, even after controlling for many variables known to influence citation rate”
  5. … and you have to (and should want to): funders (are starting to) require it; journals require it; disciplinary best practice (increasingly) expects it. “To do otherwise should come to be regarded as scientific malpractice” – Royal Society, 2014
  6. what can I do? adopt the growing body of good practices: 10 aspirational goals
  7. goal 10: plan ahead. implicit (non-)decisions can have significant consequences
  8. goal 10: plan ahead. a data management plan describes your intentions during and after your research project
  9. goal 9: be preservation-friendly from the start. prefer formats that are … standard rather than customized; open source rather than proprietary; commonly-used rather than obscure; self-describing rather than opaque; text rather than binary
  10. goal 8: assign an identifier to your data. DOIs [digital object identifiers] provide unambiguous reference, persistent access, and citation metrics
  11. goal 7: get an identifier for yourself. ORCIDs [open researcher and contributor identifiers] provide unambiguous reference and citation metrics
  12. goal 6: describe and document. what would you want to know about someone else’s data? who? what? when? where? how? why? …?
  13. goal 5: upload to a repository. professional, pro-active management: replication; fixity monitoring; media refresh; technology watch; disaster recovery/business continuity; …
  14. goal 4: use a license with the most permissive terms. allows simplest reuse; used by Dash; custom data use agreement should be avoided
  15. goal 3: publish, so your data are available to collaborators, colleagues, and community
  16. goal 2: cite yourself and others. add data citations to your CV and publications; track usage of your data products through alt-metrics
  17. goal 1: preserve your code. everything just said about data applies equally well to code
  18. data preservation 101: plan; format; identify (your data); identify (yourself); describe; upload; license; publish; cite; code
  19. features: • datasets! • open to Berkeley researchers (faculty members, grad students) • no cost (to you) • assistance in describing your dataset • easy drag-and-drop to assemble your files • DOIs • catalog of datasets from other UC researchers
  20. for more information …
  21. for more information … also, a good paper to review: Goodman, Pepe, Blocker, Borgman, Cranmer et al. (2014) “Ten simple rules for the care and feeding of scientific data” PLOS Computational Biology 10(4):e1003542, doi:10.1371/journal.pcbi.1003542 … and ask your local librarian
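
The slide on uploading to a repository lists fixity monitoring among the things a managed repository does for you. As a rough illustration of the idea only (not a description of how CDL or any particular repository implements it), the Python sketch below records a SHA-256 checksum for every file in a dataset folder and later reports files that have gone missing or changed. The function names `sha256_of`, `build_manifest`, and `check_fixity` are hypothetical, chosen for this example.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root: Path) -> dict[str, str]:
    """Map each file under root (by relative path) to its checksum."""
    return {
        p.relative_to(root).as_posix(): sha256_of(p)
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def check_fixity(root: Path, manifest: dict[str, str]) -> list[str]:
    """Compare the files under root against a previously saved manifest."""
    current = build_manifest(root)
    problems = []
    for name, expected in manifest.items():
        if name not in current:
            problems.append(f"missing: {name}")
        elif current[name] != expected:
            problems.append(f"changed: {name}")
    return problems
```

In practice the manifest would be stored alongside (or apart from) the data and the check rerun on a schedule; an empty list from `check_fixity` means every recorded file is still present and bit-for-bit unchanged. Files added after the manifest was built are not reported by this sketch.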