Research Life Cycle for GeoData 2014

Presentation on challenges for research data management and the data life cycle, for GeoData meeting in Boulder, 18 June 2014.

Published in Science, Technology, Education

Transcript

  • 1. The Research Data Life Cycle From Flickr by Velo Steve Carly Strasser California Digital Library GeoData 18 June 2014
  • 2. Why don’t people share data? Is data management being taught? Do attitudes about sharing differ among disciplines? What role can libraries play in data education? How can we promote storing data in repositories? What barriers to sharing can we eliminate? NSF funded DataNet Project Office of Cyberinfrastructure
  • 3. Enable data sharing Encourage new incentives Think about code sharing Work with libraries, publishers and researchers Explore new tools to help change system Build tools
  • 4. From Flickr by gsagostinho Outreach Education Assistance You’re doing it wrong!
  • 5. Back in the day… Da Vinci Curie Newton classicalschool.blogspot.com Darwin
  • 6. Research has changed Better
  • 7. From wikimedia Such Internet! So many tools! From Flickr by John Jobby So much data!
  • 8. Research has changed Worse
  • 9. Digital data From Flickr by Flickmor From Flickr by US Army Environmental Command From Flickr by DW0825 C. Strasser Courtesy of WHOI From Flickr by deltaMike
  • 10. Digital data + Complex workflows
  • 11. From Flickr by ~Minnea~ Reproducibility Data management Documentation
  • 12. “Reproducibility Crisis” “Digital Dark Age” “Erosion of Trust”
  • 13. “I own my data and you can’t have it.” “Let me do my work.” “I’m already too busy.” “This takes away from research time.”
  • 14. h/t Ted Hart, NEON
  • 15. Data can’t be owned. You can be the Guardian Steward Caretaker
  • 16. Plan Collect Assure Describe Preserve Discover Integrate Analyze The Data Life Cycle
  • 17. Discussion topics End game Stakeholders & responsibilities Compliance Costs Follow-up Peer review Concrete steps
  • 18. Liz Lyon: Dealing with Data 2008 UK funder expectations 2009 2009-10 DMPs: A Short History
  • 19. Federal Funding Accountability and Transparency Act 2006 Across the Pond… 2010 2010 – present DMPs: A Short History
  • 20. … “Federal agencies investing in research and development (more than $100 million in annual expenditures) must have clear and coordinated policies for increasing public access to research products.” Feb 2013
  • 21. From Calisphere, Courtesy of UC Riverside, California Museum of Photography What do researchers think?
  • 22. They don’t know about policies. John Kratz, CLIR/DLF Postdoc at CDL
  • 23. They aren’t taught data management. Quality control and quality assurance The proper way to name computer files Types of files and software to use Metadata generation Workflows Protecting data Databases and data archiving Data re-use Meta-analysis Data sharing Reproducibility Notebook protocols (lab or field) Strasser & Hampton 2013. “Undergraduates & Ecological Data Management Training in the US”. DOI:10.1890/ES12-00139.1
  • 24. They aren’t taught data management. [Bar chart: “In Curriculum?”, axis 0 to 70, labels BAS and RU]
  • 25. No one reads it anyway. It’s an unfunded mandate. I wrote it the night before. They aren’t concerned.
  • 26. What does success look like? DMPs… • are flexible • are useful and used • result in easily discoverable data • linked to open data • are created in partnership with institutional service providers • are used as a/n (automated) compliance tool • are part of the workflow of research • include digital and non-digital materials (where relevant)
  • 27. “Community-driven” But what if community doesn’t care (yet)? “Generic, work for everyone” But community-specific standards
  • 28. Current DMP tools FromFlickrbymhlradio
  • 29. Step-by-step wizard for generating DMP Create | edit | re-use | share | save | generate Open to community DMPonline: dmponline.dcc.ac.uk
  • 30. Step-by-step wizard for generating DMP Create | edit | re-use | share | save | generate Open to community DMPTool: dmptool.org
  • 31. IEDA Data Management Plan Tool
  • 32. dmptool.org
  • 33. We want templates!
  • 34. Plan Collect Assure Describe Preserve Discover Integrate Analyze The Data Life Cycle
  • 35. Scientists are still bad at data management.
  • 36. From Flickr by iowa_spirit_walker • Cost • Confusion about standards • Lack of training • Fear of lost rights or benefits • No incentives
  • 37. Data are being recognized as first class products of research From Flickr by Richard Moross NSF bio-sketches can include data Data Publication Data Citation
  • 38. Journals Funders Peers From Flickr by Eva Rinaldi Celebrity and Live Music Photographer
  • 39. science source notebook content access data government knowledge FromFlickrbycdsessums
  • 40. Plan Collect Assure Describe Preserve Discover Integrate Analyze The Data Life Cycle
  • 41. “Data Publication”
  • 42. John Kratz, CLIR Postdoc
  • 43. What does “data publication” mean? 1. Available 2. Citable 3. Trustworthy* Data are *peer reviewed? certified? Props to Sarah Callaghan & colleagues
  • 44. Available | Citable | Trustworthy Publish means to “make public”. You should not have to email the author. The data doesn’t have to be open access. “Email me!” CC-0 on web
  • 45. Simple case… Data citations should be in reference list. Five-element citation: author, year, title, publisher, identifier Available | Citable | Trustworthy Boettiger C, Dushoff J, Weitz JS (2009). Data from: Fluctuation domains in adaptive evolution. Theoretical Population Biology. Published in Dryad. doi:10.5061/dryad.j8n0p7vc
  • 46. More complicated… Deep data citation: what if you want to cite a subset? Dynamic data: how to create a reliable citation when a dataset is changing? Available | Citable | Trustworthy
  • 47. Technical VS. Scientific Sometimes consider impact and/or novelty Guidelines provided Available | Citable | Trustworthy From Flickr by Percival Lowell
  • 48. 1. Data as supplemental material Data published alongside a traditional journal article. Available + citable. Review varies. Potential issues with long-term availability. What does a data publication look like? From Flickr by subsetsum
  • 49. 2. Data paper: Data + descriptive “data paper” Most require data be in a trusted repository. All have a component of peer review. Examples: • Standalone journals: Nature Scientific Data, Geoscience Data Journal, Ecological Archives • Journals that publish data papers: GigaScience, F1000 Research, Internet Archaeology What does a data publication look like? From Flickr by subsetsum
  • 50. 3. Standalone data Data published without a related journal article. Rich metadata (structured or unstructured) Examples: • Open Context • NASA PDS Peer Review Data • figshare (but no validation) What does a data publication look like? From Flickr by subsetsum
  • 51. “Publish” “Paper” “Peer review” “Sharing” “Available” “Article” “Publication”
  • 52. From Flickr by Sandia Labs C. Strasser C. Strasser World Bank Photo Collection From Flickr What do researchers think of data publication?
  • 53. We have our work cut out for us.
  • 54. Okay, I’ll share it. Where do I put it?
  • 55. Repositories for data General content Non-institutional Publishers/for-profits Other Institutional Discipline-specific Repository choices…
  • 56. Institutional Discipline-specific •  All data associated with a paper •  Tells a story •  Clearinghouse for researcher’s works •  Some of data for a given paper •  Discoverable •  Integrated systems •  Collection policies ?   Both Which should a researcher use? Which is more important? Depends Repository choices…
  • 57. Simplify data deposit for UC researchers Branded for campus Merritt underneath the hood
  • 58. dash.berkeley.edu
  • 59. github.com/cdluc3/dash/wiki
  • 60. From Flickr by dotpolka Hard work Shifting norms Exciting times
  • 61. Website: carlystrasser.net Email: carlystrasser@gmail.com Twitter: @carlystrasser Slides: slideshare.net/carlystrasser
  • 62. From Flickr by dotpolka Hard work Shifting norms Exciting times
  • 63. Website: carlystrasser.net Email: carlystrasser@gmail.com Twitter: @carlystrasser Slides: slideshare.net/carlystrasser
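
Slide 45 lists five elements for a data citation: author, year, title, publisher, identifier. The sketch below shows one way those elements might be held together and rendered as a reference-list entry. It is only an illustration, not anything shown in the talk: the `DataCitation` class, its field names, and the output layout are all assumptions, and real citation styles vary by journal and repository. The values come from the Dryad example on that slide.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DataCitation:
    """Hypothetical container for the five elements named on slide 45."""

    authors: str     # e.g. "Boettiger C, Dushoff J, Weitz JS"
    year: int
    title: str
    publisher: str   # the repository that holds the data
    identifier: str  # a persistent identifier such as a DOI

    def render(self) -> str:
        # The punctuation and ordering here are an assumed layout,
        # not a published citation style.
        return (
            f"{self.authors} ({self.year}). {self.title}. "
            f"{self.publisher}. {self.identifier}"
        )


# Values taken from the example citation on slide 45.
example = DataCitation(
    authors="Boettiger C, Dushoff J, Weitz JS",
    year=2009,
    title="Data from: Fluctuation domains in adaptive evolution",
    publisher="Dryad",
    identifier="doi:10.5061/dryad.j8n0p7vc",
)

print(example.render())
```

Keeping the identifier as its own field, rather than burying it in free text, is what would let a reference list link to the dataset directly, which is the point of the "Citable" criterion on slide 43.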