Chen RDAP11 NSF Data Management Plan Case Studies

Eric Chen, Cornell; NSF Data Management Plan Case Studies; RDAP11 Summit

The 2nd Research Data Access and Preservation (RDAP) Summit
An ASIS&T Summit
March 31-April 1, 2011 Denver, CO
In cooperation with the Coalition for Networked Information
http://asist.org/Conferences/RDAP11/index.html

  • Source material from Gail Steinhart, who could not be here in person today. Outline: background (brief) on why we’re doing this; Cornell context and the structure we came up with to support these requirements; early results; observations (challenges, questions).
  • Before the NSF DMP requirement was announced, a group of us had already been exploring how we could cooperate to support the needs of data-driven science (our other head start). The focus of the DRSG has been twofold: assessment of researchers’ needs with respect to CI and data mgmt (a series of interviews), and pilot projects on support for data-driven science. While not specifically focused on data mgmt planning, it had already brought together many of the right groups on campus that would make up the cross-institution response.
  • NSF announced the DMP requirement, which led to the question of how to respond to researchers’ need to create a plan.
  • Like many of our peers, Cornell has many groups that provide data management services: central campus IT, research computing, and library computing, to name a few. The conversation started with the question of how a researcher would navigate the various service groups to create a data management plan. How do we present them with a single point of contact so they don’t have to navigate something that looks like this? At Cornell, this is what it could look like for a researcher attempting to piece together the services they need to develop a robust data management plan. (We have also left out some smaller, more specialized service providers: SRI, CBSU, CCTEC…)
  • Solution: a new virtual organization (VO), meant to address two main issues: (1) distribution of resources, making that appear seamless to researchers by providing a single point of contact for assistance with data management; (2) identifying and filling gaps in services in a coordinated way. This comes out of a proposal we submitted to the Vice Provost for Research and the University Librarian; it is available on the website.
  • Here is an overview of the virtual organization. At the top we have the sponsors of the group, which include the Vice Provost for Research, the University Librarian, and the Faculty Advisory Board. The VO itself comprises a management group: the management council (reps from major service providers; these are people who can allocate staff and resources to get work done) and a staff coordinator (that’s me!) to hold it all together. Implementation teams are charged by the management group to get work done for the RDMSG. Then, of course, there are all of the existing service providers, which may do some work related to the RDMSG but also provide services outside of it. One thing to note about this new organization is that there is no new money: participants have all accepted this structure and the work that comes with it as consistent with their mission and purpose.
  • Think of it as a concierge service for data management. Wikipedia, on hotel concierges: “a concierge is often expected to "achieve the impossible", dealing with any request a guest may have, no matter how strange, relying on an extensive list of contacts with local merchants and service providers.”
  • Just to give you an idea of some of the things this new group has been working on: a survey, to better understand the potential impact on existing services and to identify service gaps; a website (still early on that) and a single point of contact email; a small and growing pool of consultants to field help requests (these are people with particular subject or IT expertise); emails to the ‘help’ email address generate a help ticket, with a triage process to route help requests to a qualified consultant; we ran three information sessions for prospective PIs and other interested staff, with total attendance >100 and slides and video of one session on the web; we convened the faculty advisory board; and we agreed to establish implementation teams to work in a handful of areas that support the RDMSG.
  • We haven’t had a chance to really look at all of the information yet, but can share a few preliminary results. Left: gives a sense of who responded to the survey. Right: significant interest in support for data mgmt planning (and some uncertainty).
  • “Not sure” is the most consistent winner when PIs are asked which approach(es) they would use to share data.
  • One kind of thing we can get out of this exercise: most respondents weren’t sure about using the library’s IR (several commented that they don’t know what it is). Look at “yes” (blue) by size bins. The IR has a limit of ~50MB per upload, and without special intervention from a sysadmin, files are uploaded one at a time by the contributor. Several researchers are planning to use the IR for some sizable data collections, so we anticipate managing expectations with respect to services and redirecting users to more appropriate services (which could become a problem when redirecting from free to fee services).
  • FAQs (bold are big winners), persistently asked Qs, and comments. A lot of basic questions and a lot of uncertainty.
  • Lack of detail: will NSF provide any additional guidance, feedback, examples? We have to advise without having seen good and bad DMPs. We would like to hear more on this from NSF (note that directorate guidelines are often nearly as vague as the umbrella policy). Longer term needs: it is a challenge for us to come up with business models / cost structures to support them. Princeton has one, but it addresses only simple storage, not preservation. Business models for preservation are an active and complicated area of research (see the blue ribbon panel report), so this is a tough problem. Interim solutions: it is good to have movement in this area; if it takes a policy/requirement to make that happen, that’s good. We have to balance requirements with good decision making (which can take some time). Concern: interim adoption of mediocre solutions, which can be difficult to migrate from or reverse engineer.
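
The notes on RDMSG activities mention that emails to the help address generate a ticket and that a triage process routes each request to a qualified consultant. The talk does not say how that routing works, so the following is only a minimal sketch under assumed rules: consultants are matched by keyword overlap with the request text, and unmatched requests fall back to the staff coordinator. All class names, function names, and keywords here are hypothetical.

```python
import re
from dataclasses import dataclass
from typing import List, Optional, Set


@dataclass
class Consultant:
    name: str
    expertise: Set[str]  # hypothetical keywords, e.g. {"metadata", "storage"}


@dataclass
class Ticket:
    sender: str
    body: str
    assigned_to: Optional[str] = None


def triage(ticket: Ticket, consultants: List[Consultant],
           fallback: str = "staff coordinator") -> Ticket:
    """Assign the ticket to the consultant whose expertise keywords overlap
    most with the words of the request; use the fallback if nobody matches.
    This matching rule is an assumption, not the RDMSG's documented process."""
    words = set(re.findall(r"[a-z]+", ticket.body.lower()))
    best = max(consultants, key=lambda c: len(c.expertise & words), default=None)
    if best is None or not (best.expertise & words):
        ticket.assigned_to = fallback
    else:
        ticket.assigned_to = best.name
    return ticket
```

A request such as "How do we budget for data storage beyond the end of a grant?" would go to whichever consultant lists "storage" among their keywords, while a question that matches nobody stays with the coordinator for manual routing.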

Presentation Transcript

  • Research Data Management Service Group
    Cross-institution response to NSF Data Management Plan
    Eric Chen
    Analyst Consultant for Data-Driven Science
    Cornell University
    March 31st, 2011
  • Responding to NSF DMP
    Background
    Cross-institution response
    Current activities
    Next steps
  • NSF DMP
  • How to respond?
    Existing data management service providers
    Cornell University Library
    Center for Advanced Computing
    Institutional Review Board
    Office of Sponsored Programs
    Vice Provost for Research
    Cornell Inst. for Social & Econ. Research
    DISCOVER Research Service Group
    Cornell IT
    Weill Cornell Medical College IT
  • RDMSG
    Research Data Management Service Group (RDMSG)
    Virtual organization
    Comprises existing campus data management service providers
  • RDMSG
    Sponsors and advisors: Vice Provost for Research; University Librarian; Faculty Advisory Board
    RDMSG Virtual Organization
    Management Group: Management Council; Staff Coordinator
    Implementation teams: Services assessment; Outreach and training; others as appropriate
    Service Providers: CISER; CIT; CAC; CUL; others as appropriate
    • Data management planning
    • Storage and backup
    • Metadata
    • Data analysis
    • Collaboration tools
    • High performance computing
    • Privacy and confidentiality
    • Intellectual property
    • Data publication
    http://www.flickr.com/photos/21709799@N03/3798486667/
  • RDMSG Activities
    • Survey of active NSF PIs
    • RDMSG website, rdmsg-help@cornell.edu
    • Consultant pool
    • Ticketing and triage system
    • Information sessions on NSF DMPs
    • Faculty advisory board
    • Implementation teams:
    • Outreach and training
    • Documentation
    • Intellectual property and copyright
    • Survey report
    • Sharing and access
    • Long-term preservation
  • Preliminary Survey Results
    Directorate of most recent award
    DMP support?
  • Sharing strategies
    Disciplinary data center
    Journals
    Institutional repository
    CAC disk farm
    Custom
  • Institutional Repository
    Institutional repository
    Intent to use IR by amount of data
  • Information Sessions
    3 Sessions about NSF DMP requirement
    “common-sense” interpretation
    Over 100 participants from 60 different campus affiliations
    Over 30 questions
  • Information Sessions
    What counts as data? Does that include raw data?
    I publish my findings in journals, so I already share my data.
    How do we budget for data storage and access beyond the end of a grant?
    For how long should data be archived?
    What about the additional burden on reviewers?
    What if your research plans change so that the original DMP is no longer relevant?
  • Next steps
    • Lack of detail makes support difficult
    • Finite award period vs. longer term needs
    • Potential for interim / less than optimal solutions
  • Questions?
    RDMSG-HELP@cornell.edu
    http://data.research.cornell.edu/