Data Publication at CDL for IDCC14


Talk for IDCC14 workshop on Data Publication, 24 February 2014.

Published in: Technology

  1. 1. Data Publication Etcetera at the CDL. Carly Strasser & John Kratz, California Digital Library. @carlystrasser #IDCC14, February 2014
  2. 2. Zooming out
  3. 3. Back in the day… (Image: Wikimedia Commons)
  4. 4. Back in the day… Curie Newton Da Vinci Darwin
  5. 5. Research has changed: Better
  6. 6. So much data! Such Internet! So many tools! (Images: Flickr by John Jobby; Wikimedia)
  7. 7. Research has changed: Worse
  8. 8. Digital data (Images: Flickr by US Army Environmental Command; Flickr by deltaMike; Flickr by DW0825; Flickr by Flickmor; courtesy of WHOI; C. Strasser)
  9. 9. Digital data + Complex workflows
  10. 10. “Reproducibility Crisis” “Digital Dark Age” “Erosion of Trust”
  11. 11. Can we fix science? … the way we communicate our … All of the science. Early & often. Transparently & openly.
  12. 12. Zooming out Zooming in
  13. 13. Data Publication @
  14. 14. John Kratz, CLIR Postdoc (Image: Flickr by lindyjb)
  15. 15. “Data Publication”
  16. 16. What does “data publication” mean? Props to Sarah Callaghan & colleagues!
  17. 17. What does “data publication” mean? Data are 1. Available 2. Citable 3. Trustworthy*
  18. 18. What does “data publication” mean? Data are 1. Available 2. Citable 3. Trustworthy* *peer reviewed? certified?
  19. 19. Available | Citable | Trustworthy “Email me!” CC-0 on web Publish means to “make public”. You should not have to email the author. The data doesn’t have to be open access.
  20. 20. Available | Citable | Trustworthy Simple case… Data citations should be in reference list. Five-element citation: author, year, title, publisher, identifier Boettiger C, Dushoff J, Weitz JS (2009). Data from: Fluctuation domains in adaptive evolution. Theoretical Population Biology. Published in Dryad. doi:10.5061/dryad.j8n0p7vc
  21. 21. Available | Citable | Trustworthy More complicated… Deep data citation: what if you want to cite a subset? Dynamic data: how to create a reliable citation when a dataset is changing?
  22. 22. Available | Citable | Trustworthy: Technical vs. scientific. Sometimes consider impact and/or novelty. Guidelines provided. (Image: Flickr by Percival Lowell)
  23. 23. What does a data publication look like? 1. Data as supplemental material: data published alongside a traditional journal article. Available + citable. Review varies. Potential issues with long-term availability. (Image: Flickr by subsetsum)
  24. 24. What does a data publication look like? 2. Data paper: data + descriptive “data paper”. Most require data be in a trusted repository. All have a component of peer review. Examples: • Standalone journals: Nature Scientific Data, Geoscience Data Journal, Ecological Archives • Journals that publish data papers: GigaScience, F1000 Research, Internet Archaeology (Image: Flickr by subsetsum)
  25. 25. What does a data publication look like? 3. Standalone data: data published without a related journal article. Rich metadata (structured or unstructured). Examples: • Open Context • NASA PDS Peer Review Data • figshare (but no validation) (Image: Flickr by subsetsum)
  26. 26. “Available” “Paper” “Publish” “Sharing” “Peer review” “Publication” “Article”
  27. 27. What do researchers think of data publication? (Images: World Bank Photo Collection, from Flickr; C. Strasser; Flickr by Sandia Labs)
  28. 28. • Publishing • Sharing • Citation • Peer review • Trustworthiness. Share with researchers!
  29. 29. Survey Demographics (N=274). Academic, Govt, Other. 79% US | 21% Not. Type of researcher: PI, Postdoc, Grad student, Other. Discipline: Bio, Envi/earth Sci, Archaeology, Math/physics, Info Sci, Other.
  30. 30. In the meantime…
  31. 31. UCSF
  32. 32. For all UCs •  Use institutional credentials to log in •  Enter metadata & deposit data •  Get identifier •  Optional PDF download •  Landing page is the publication
  33. 33. Focus on solving simple bits first: Easy sharing • Citable datasets. Data publishing, data sharing.
  34. 34. Big thanks to John Kratz, CLIR Postdoc Website Email Tweet Slides Survey @carlystrasser
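
As a rough illustration of the five-element data citation from slide 20 (author, year, title, publisher, identifier), the sketch below assembles a citation string in Python. The class name DataCitation, its field names, the missing_elements helper, and the output punctuation are all assumptions made for this example, not a format prescribed in the talk. The sample values are taken from the Dryad example on the slide; the journal name shown there is left out because it is not one of the five elements.

```python
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class DataCitation:
    """Hypothetical container for the five citation elements named on slide 20."""
    authors: str
    year: int
    title: str
    publisher: str
    identifier: str

    def missing_elements(self) -> list[str]:
        # Report which of the five elements are empty, so an incomplete
        # citation can be caught before it goes into a reference list.
        return [f.name for f in fields(self)
                if not str(getattr(self, f.name)).strip()]

    def format(self) -> str:
        # Assumed layout: Authors (Year). Title. Publisher. Identifier
        return (f"{self.authors} ({self.year}). {self.title}. "
                f"{self.publisher}. {self.identifier}")


# Values copied from the example citation on slide 20.
citation = DataCitation(
    authors="Boettiger C, Dushoff J, Weitz JS",
    year=2009,
    title="Data from: Fluctuation domains in adaptive evolution",
    publisher="Dryad",
    identifier="doi:10.5061/dryad.j8n0p7vc",
)

print(citation.format())
```

Keeping the five elements as separate fields, rather than one free-text string, makes it easy to check that none is missing and to re-render the same citation in whatever style a journal asks for.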