
DMPTool Webinar 8: Data Curation Profiles and the DMPTool (presented by Jake Carlson)

Published in: Technology, Education

  1. 1. Logistics for Webinar • You must call in for audio: 866-740-1260, access code 9870179# • Participants muted • Ask questions in chat any time • 20 minutes for Q&A • Recording & slides, schedule of webinars: blog.dmptool.org/webinar-series • DMPTool Webinar Series 8: Data Curation Profiles & the DMPTool • Sponsored by IMLS • 13 August 2013
  2. 2. • 28 May: Introduction to the DMPTool • 4 June: Learning about data management: Resources, tools, materials • 18 June: Customizing the DMPTool for your institution • 25 June: Environmental Scan: Who's important at your campus • 9 July: Promoting institutional services; EZID Outreach Made Simple! • 16 July: Health Sciences & DMPTool – Lisa Federer, UCLA • 30 July: Digital humanities and the DMPTool – Miriam Posner, UCLA • 13 Aug: Data curation profiles and the DMPTool – Jake Carlson, Purdue • 27 Aug: Talking points for meeting with institutional stakeholders • 10 Sep: Tools and resources that work with/complement the DMPTool • Beyond funder requirements: more extensive DMPs • Case studies 1 – How librarians have successfully used the tool • Case studies 2 – How librarians have successfully used the tool • Outreach Kit Introduction • Certification program introduction • blog.dmptool.org/webinar-series
  3. 3. Data Curation Profiles & the DMPTool • Jake Carlson, Associate Professor of Library Science / Data Services Specialist, Purdue University Libraries
  4. 4. Road Map • History / Background of the DCP Toolkit • Comparing the DMP and the DCP • Case Study in using the DCP
  5. 5. “Investigating Data Curation Profiles across Research Domains” • Awarded in 2007 to Purdue Libraries and the Graduate School of Library and Information Science at UIUC • Goals of the project: – To understand the practices, attitudes and needs of researchers in managing and sharing their data. – To identify possible roles for librarians to facilitate data sharing and curation. – To develop a tool for librarians to gather information on researcher needs for their data.
  6. 6. Interview areas: 20 faculty, 12 disciplines Agronomy & Soil Science (Purdue & UIUC), Anthropology (UIUC), Biochemistry (Purdue), Biology (Purdue), Civil Engineering (Purdue), Earth & Atmospheric Sciences (Purdue & UIUC), Electrical & Computer Engineering (Purdue), Food Science (Purdue), Geology (UIUC), Horticulture & Plant Science (Purdue & UIUC), Kinesiology (UIUC), Speech and Hearing (UIUC)
  7. 7. What we asked … • Research Data Lifecycle (story of the data) • Characteristics of the Data • Data Management / Storage • Data Dissemination and Sharing • Data Preservation and Repositories • Roles for Libraries and Librarians
  8. 8. Prioritize your needs for the following types of services (n=19) • The ability to cite this dataset in my publications • The ability for researchers within my discipline to easily find this dataset • The ability for researchers outside of my discipline to easily find this dataset • The ability for people to easily discover this dataset using Google • Source: Witt, M. (2009, May 18). Eliciting Faculty Requirements for Research Data Repositories. 4th Int’l Conference on Open Repositories, Georgia Tech, Atlanta, GA.
  9. 9. Prioritize your needs for the following types of services (n=19) • The ability for me to submit this dataset to a repository myself • The process of submitting this dataset to a repository is automated • The ability to make these data accessible in multiple formats • The ability of the repository to provide version control for the data • Source: Witt, M. (2009, May 18). Eliciting Faculty Requirements for Research Data Repositories. 4th Int’l Conference on Open Repositories, Georgia Tech, Atlanta, GA.
  10. 10. An interview-based tool for gathering: • Information about a particular data set. • What a researcher is doing to manage / curate the data set. • What a researcher would like to do with the data. http://datacurationprofiles.org
  11. 11. DCP Sections • Information about the Data and its Context – Overview of the Research • Focus • Intended Audience • Funding – Data Kinds and Stages • Data Narrative (data lifecycle) • Target Data for Sharing • Use/re-use Value • Contextual Narrative
  12. 12.
      | Data Stage | Output | Typical File Size | Format | Other / Notes |
      |---|---|---|---|---|
      | Primary Data | | | | |
      | Raw | Sensor data | 100k in 1 file per day | Proprietary to the sensor | FTP downloads are mostly automated. |
      | Processing Stage 1 | Sensor data – open/accessible format | Roughly 6kb | .csv / .xls | Data are formatted into .csv before being reformatted into a MySQL database. |
      | Processed | Data vectors | 800 records per intersection per day | SQL / .xls | Data are extracted from the MySQL database for analysis purposes. |
      | Analyzed | Charts / graphs | | .xls / .emf | Charts and graphs used for interpretation. |
      | Published | Charts / graphs | | .ppt | Data are presented via PowerPoint. |
      | Ancillary Data | | | | |
      | Image | Stills taken from video | | .gif / .jpg / .ppt | Images generated from video. |
  13. 13. More DCP Sections • Information about Needs – Intellectual Property – Organization and description of data – Ingest – Access – Discovery – Tools – Interoperability – Measuring Impact – Data Management – Preservation
  14. 14. Context • The DMPTool is focused on a specific context: developing a data management plan for submission to a funding agency. • The DCP Toolkit is focused on a broad context: understanding the researcher’s data and needs well enough to respond.
  15. 15. Timing • The DMPTool is for use in the “Planning Stages” of the Data Lifecycle. • The DCP Toolkit is for use in the “Active Data Stages” of the Data Lifecycle.
  16. 16. “The Research Lifecycle” model developed by the University of Virginia Library’s Scientific Data Consulting Group.
  17. 17. Structure • The DMPTool’s structure is based on the specific elements of the funding agency’s data management plan. • The DCP Toolkit is modular in nature. Questions and sections can be changed.
  18. 18. Level of Investment • Generating a DMP using the DMPTool is a short-term investment. • Generating a DCP is a longer-term investment, but with a potentially large payoff.
  19. 19. Sharable Output • Data management plans are intended to be submitted to a funding agency, not to be shared publicly. • Data curation profiles are intended to be shared with others.
  20. 20. http://docs.lib.purdue.edu/dcp
  21. 21. • Both tools seek to help researchers identify and address needs in managing and curating data. • In particular, both tools aim to foster the creation of data that are discoverable, accessible, well-described and usable by others.
  22. 22. “The Research Lifecycle” model developed by the University of Virginia Library’s Scientific Data Consulting Group.
  23. 23. • Both tools can be used to help librarians connect with researchers about their data. • Both organizations recognize and support the roles of librarians in providing services to support the data lifecycle.
  24. 24. Case Study: Water Quality Field Station • With Marianne Bracke, Agricultural Sciences Information Specialist, Associate Professor of Library Science, Purdue University Libraries
  25. 25. The Water Quality Field Station • Located on a 991-acre farm facility northwest of Purdue; opened in 1992. • Used to identify agricultural practices that minimize movement of agricultural chemicals into water supplies. • Informs the development of new and more ecologically balanced technologies for crop production.
  26. 26. Graduate Students • Graduate students are on the front lines of data. • Sharing data locally, between graduate students, was challenging.
  27. 27. Project Steps • Identify: Utilize Data Curation Profiles to collect information about current data gathering, workflow and documentation. • Analyze: Identify common issues and needs as observed in the Data Curation Profiles. • Assess: Produce a report with recommendations and possible approaches to addressing issues and needs.
  28. 28. Identify • 6 interviews with graduate students conducted in the summer of 2011. • Developed Data Curation Profiles from these interviews. • Reviewed DCPs for needs.
  29. 29. Analyze • There is a lack of clear and shared expectations on how data should be documented, described and organized. – Locally: practice varies by individual, depending on circumstance, previous training / experience, intended use, etc. – Discipline: there is a lack of standards specifically for Agronomy data.
  30. 30. Analyze • Data are not being generated or processed in ways that could facilitate sharing externally, or even locally at Purdue or within the lab. • Inheriting data from previous graduate students was common and potentially problematic. • Many graduate students who had received data reported some problems understanding or making use of the data.
  31. 31. Analyze • Graduate students stated that they lack the knowledge and skills to document, describe, organize and manage their data. • These activities tend to be done in relative isolation from the lab, or even the advisor. • Physical lab notebooks are still the primary means of documentation / provenance.
  32. 32. Assess
  33. 33. DMP & DCP Connections • Developing a DMP may uncover issues that merit further investigation through a DCP. • Uncovering data management issues through a DCP could inform data management planning.
  34. 34. Another Case Study with DCPs http://www.dlib.org/dlib/july13/wright/07wright.html
  35. 35. Thanks! Any Questions? • Jake Carlson, Associate Professor of Library Science / Data Services Specialist, Purdue University Libraries • jakecarlson@purdue.edu
  36. 36. blog.dmptool.org/webinar-series • From Flickr by Jeff Keacher • In 2 weeks: Talking Points for Meeting with Stakeholders • Presenter: Dan Phipps • Tuesday 27 Aug @ 10am PT
  37. 37. blog.dmptool.org/webinar-series/
  38. 38. Questions? • Email: uc3@ucop.edu, jakecarlson@purdue.edu • Twitter: @TheDMPTool • Blog: blog.dmptool.org • Facebook: Facebook.com/DMPTool