Chen RDAP11 NSF Data Management Plan Case Studies

Eric Chen, Cornell; NSF Data Management Plan Case Studies; RDAP11 Summit

The 2nd Research Data Access and Preservation (RDAP) Summit
An ASIS&T Summit
March 31-April 1, 2011 Denver, CO
In cooperation with the Coalition for Networked Information

  • Source material from Gail Steinhart, who could not be here in person today. Outline: background (brief) on why we’re doing this; the Cornell context and the structure we came up with to support these requirements; early results; observations (challenges, questions).
  • Before the NSF DMP requirement was announced, a group of us had already been exploring how we could cooperate to support the needs of data-driven science (our other head start). The focus of the DRSG has been twofold: assessment of researchers’ needs with respect to CI and data management (through a series of interviews), and pilot projects on support for data-driven science. While not specifically focused on data management planning, it had already brought together many of the right groups on campus that would make up the cross-institution response.
  • NSF announced the DMP requirement, which led to the question of how to respond to researchers’ need to create a plan.
  • Like many of our peers, Cornell has many groups that provide data management services: central campus IT, research computing, and library computing, to name a few. The conversation started with the question of how a researcher would navigate the various service groups to create a data management plan. How do we present them with a single point of contact so they don’t have to navigate something that looks like this? At Cornell, this is what it could look like for a researcher attempting to piece together the services they need to develop a robust data management plan (and we have left out some smaller, more specialized service providers: SRI, CBSU, CCTEC…).
  • Solution: a new VO, meant to address two main issues: (1) distribution of resources, making that appear seamless to researchers by providing a single point of contact for assistance with data management; (2) identifying and filling gaps in services in a coordinated way. This comes out of a proposal we submitted to the Vice Provost for Research and the University Librarian; it is available on the website.
  • Here is an overview of the virtual organization. At the top we have the sponsors of the group, which include the Vice Provost for Research, the University Librarian, and the Faculty Advisory Board. The VO itself comprises a management group, made up of the management council (reps from major service providers; these are people who can allocate staff and resources to get work done) and a staff coordinator (that’s me!) to hold it all together. Implementation teams are charged by the management group to get work done for the RDMSG. Then, of course, there are all of the existing service providers, which may do some work related to the RDMSG but also provide services outside of it. One thing to note about this new organization is that there is no new money: participants have all accepted this structure and the work that comes with it as consistent with their mission and purpose.
  • Think of it as a concierge service for data management. Wikipedia, on hotel concierges: “a concierge is often expected to "achieve the impossible", dealing with any request a guest may have, no matter how strange, relying on an extensive list of contacts with local merchants and service providers.”
  • Just to give you an idea of some of the things this new group has been working on: a survey, to better understand the potential impact on existing services and to identify service gaps; a website (still early on that) and a single point-of-contact email; a small and growing pool of consultants to field help requests (these are people with particular subject or IT expertise); emails to the ‘help’ email address generate a help ticket, with a triage process to route help requests to a qualified consultant; we ran three information sessions for prospective PIs and other interested staff, with total attendance >100 and slides and video of one session on the web; we convened the faculty advisory board; and we agreed to establish implementation teams to work in a handful of areas that support the RDMSG.
  • We haven’t had a chance to really look at all of the information yet, but can share a few preliminary results. Left: gives a sense of who responded to the survey. Right: significant interest in support for data management planning (and some uncertainty).
  • “Not sure” is the most consistent winner when PIs are asked which approach(es) they would use to share data.
  • One kind of thing we can get out of this exercise: most respondents weren’t sure about using the library’s IR (several commented that they don’t know what it is). Look at “yes” (blue) by size bins. The IR has a limit of ~50MB per upload, and without special intervention from a sysadmin, files are uploaded one at a time by the contributor. Several researchers are planning to use the IR for some sizable data collections, so we anticipate managing expectations with respect to services and redirecting users to more appropriate services (which could become a problem when redirecting from free to fee services).
  • FAQs (bold are the big winners), persistently asked questions, and comments. A lot of basic questions and a lot of uncertainty.
  • Lack of detail: will NSF provide any additional guidance, feedback, or examples? We have to advise without having seen good and bad DMPs. We would like to hear more on this from NSF (note that directorate guidelines are often nearly as vague as the umbrella policy). Longer term needs: it is a challenge for us to come up with business models and cost structures to support these. Princeton has one, but it addresses only simple storage, not preservation. Business models for preservation are an active area of research and complicated (see the blue ribbon panel report), so this is a tough problem. Interim solutions: it is good to have movement in this area, and if it takes a policy or requirement to make that happen, that’s good. We have to balance requirements with good decision making (which can take some time). Concern: mediocre solutions adopted in the interim can be difficult to migrate away from or reverse engineer.
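The notes on RDMSG activities describe emails to a help address generating tickets, with a triage step that routes each request to a consultant with the relevant subject or IT expertise. A minimal sketch of that routing idea follows, assuming simple keyword matching. The `Consultant`, `Ticket`, and `triage` names, the expertise tags, and the fallback to the staff coordinator are all illustrative assumptions and do not describe the system Cornell actually used.

```python
import re
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Consultant:
    name: str
    expertise: frozenset  # hypothetical topic tags, e.g. {"metadata"}


@dataclass
class Ticket:
    sender: str
    subject: str
    body: str
    assigned_to: Optional[str] = None


def triage(ticket: Ticket, consultants: List[Consultant],
           fallback: str = "staff coordinator") -> Ticket:
    """Assign the ticket to the consultant whose tags best match its text.

    Assumption: routing is by keyword overlap; if nothing matches,
    the ticket goes to a fallback contact for manual triage.
    """
    text = (ticket.subject + " " + ticket.body).lower()
    words = set(re.findall(r"[a-z]+", text))
    best_name, best_score = fallback, 0
    for consultant in consultants:
        score = len(words & consultant.expertise)
        if score > best_score:
            best_name, best_score = consultant.name, score
    ticket.assigned_to = best_name
    return ticket


if __name__ == "__main__":
    pool = [
        Consultant("metadata librarian", frozenset({"metadata", "repository"})),
        Consultant("HPC analyst", frozenset({"storage", "backup", "computing"})),
    ]
    request = Ticket("pi@example.edu", "DMP question",
                     "How should I describe metadata for my repository deposit?")
    print(triage(request, pool).assigned_to)
```

In practice a person would likely review each assignment; the fallback contact stands in for that manual step.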
  • Transcript

    • 1. Research Data Management Service Group
      Cross-institution response to NSF Data Management Plan
      Eric Chen
      Analyst Consultant for Data-Driven Science
      Cornell University
      March 31st, 2011
    • 2. Responding to NSF DMP
      Cross-institution response
      Current activities
      Next steps
    • 3.
    • 4. NSF DMP
    • 5. How to respond?
      Existing data management service providers
      Cornell University Library
      Center for Advanced Computing
      Institutional Review Board
      Office of Sponsored Programs
      Vice Provost for Research
      Cornell Inst. for Social & Econ. Research
      DISCOVER Research Service Group
      Weill Cornell Medical College IT
    • 6. RDMSG
      Research Data Management Service Group (RDMSG)
      Virtual organization
      Comprises existing campus data management service providers
    • 7. RDMSG
      Sponsors and advisors: Vice Provost for Research, University Librarian, Faculty Advisory Board
      RDMSG Virtual Organization
      Management Group: Management Council, Staff Coordinator
      Implementation teams: Services assessment, Outreach and training, others as appropriate
      Service Providers (others as appropriate)
    • 8.
      • Data management planning
      • Storage and backup
      • Metadata
      • Data analysis
      • Collaboration tools
      • High performance computing
      • Privacy and confidentiality
      • Intellectual property
      • Data publication
    • 9. RDMSG Activities
      • Survey of active NSF PIs
      • RDMSG website,
      • Consultant pool
      • Ticketing and triage system
      • Information sessions on NSF DMPs
      • Faculty advisory board
      • Implementation teams:
        • Outreach and training
        • Documentation
        • Intellectual property and copyright
        • Survey report
        • Sharing and access
        • Long-term preservation
    • 10. Preliminary Survey Results
      Directorate of most recent award
      DMP support?
    • 11. Sharing strategies
      Disciplinary data center
      Institutional repository
      CAC disk farm
    • 12. Institutional Repository
      Institutional repository
      Intent to use IR by amount of data
    • 13. Information Sessions
      3 Sessions about NSF DMP requirement
      “common-sense” interpretation
      Over 100 participants from 60 different campus affiliations
      Over 30 questions
    • 14. Information Sessions
      What counts as data? Does that include raw data?
      I publish my findings in journals, so I already share my data.
      How do we budget for data storage and access beyond the end of a grant?
      For how long should data be archived?
      What about the additional burden on reviewers?
      What if your research plans change so that the original DMP is no longer relevant?
    • 15. Next steps
      • Lack of detail makes support difficult
      • Finite award period vs. longer term needs
      • Potential for interim / less than optimal solutions
    • 16. Questions?