Guidelines for OSTP Data Access Plans


This slide deck provides an overview and resources for responding to the OSTP memo "Increasing Access to the Results of Federally Funded Scientific Research," issued by John P. Holdren in February 2013. It offers resources and information that agencies, foundations, and research projects can use to achieve public access to scientific data in digital formats.

  • Current archives/collections/repositories already meeting public access requirements regarding data. NACDA, NACJD, SAMHDA: examples of long-term sustainability. NAHDAP, SAMHDA, DSDR: examples of sharing of confidential data. NACJD: example of depository/researcher compliance (holding 10% of funding to PI). LGBT, MET: unique infrastructure and dissemination. Research Connections: reports and data dissemination; audiences including policymakers.

    1. 1. Guidelines and Resources for OSTP Data Access Plans ICPSR September 2013
    2. 2. The OSTP Memo Guidelines for Response • Released February 2013, this memo directs funding agencies with an annual R&D budget over $100 million to develop a public access plan for disseminating the results of their research • ICPSR stresses that standards and guidelines for many of the requirements currently exist • The slides to follow provide an overview of the access plan elements including guidelines and resources on how to respond to meet digital data requirements in the memo
    3. 3. The OSTP Memo – A Review • Released February 22, 2013 • A concern for investment: “Policies that mobilize these publications and data for re-use through preservation and broader public access also maximize the impact and accountability of the Federal research investment.” • Federal agencies with over $100 M annually in R&D expenditures to develop plans to support increased public access to the results of research funded by the Federal Government • Plans to contain eight points
    4. 4. The Eight Points of the Plan 1. Strategy for leveraging existing archives 2. Strategy to improve the public’s ability to locate and access digital data 3. Approach to optimize search, archival, and dissemination features that encourage innovation in accessibility & interoperability and ensure long-term stewardship 4. A plan to notify awardees & researchers of their obligations 5. Strategy for measuring and enforcing compliance with the plan 6. Identification of resources within the existing agency budget to implement plan 7. Timeline for implementation 8. Identification of special circumstances that prevent the agency from meeting memo objectives
    5. 5. Data Portion of Memo - 13 Elements • The portion of the memo describing objectives for public access to data stresses 13 elements for a public access plan • The elements are also summarized online within ICPSR’s Web site:
    6. 6. Maximize Access "Maximize access, by the general public and without charge, to digitally formatted scientific data created with Federal funds" • Increasing access to research data prevents the duplication of effort, provides accountability and verification of research results, and increases opportunities for innovation and collaboration. • Finding and accessing data in repositories requires descriptive metadata ("data about data") in standard, machine-actionable form. Metadata help search engines find data, and help researchers understand the context of data collections. • Standards already exist: see Data Documentation Initiative –
    7. 7. Maximize Access cont. • Access also involves knowing how to interpret the data. Incomplete data limit reuse. Obsolete data formats can be unreadable. – Repositories 'curate' or enhance data to make them complete, self-explanatory, and usable for future researchers. This includes adding descriptive labels, correcting coding errors, gathering documentation, and standardizing the final versions of files. This is called "data curation." – Like museums that curate art or artifacts for study and understanding now and in the future, data archives curate data with the same goals. • Data curation is crucial to maximizing access. Resources for curating data: – ICPSR's Guide to Social Science Data Preparation and Archiving – UK Data Archive's Managing and Sharing Data guide.
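To illustrate the point that descriptive metadata in machine-actionable form helps people and search tools find data, here is a minimal sketch in Python. The record layout, field names, and sample titles are hypothetical and are not taken from the Data Documentation Initiative standard or from any ICPSR catalog; a real repository would follow a published metadata schema.

```python
import json

# Hypothetical catalog: each record is "data about data" in a structured form.
CATALOG = [
    {"title": "Example Household Health Survey",
     "keywords": ["health", "households"], "year": 2010},
    {"title": "Example Crime Victimization Study",
     "keywords": ["crime", "justice"], "year": 2012},
]


def find_datasets(catalog, term):
    """Return titles whose title or keywords contain the search term."""
    term = term.lower()
    return [
        record["title"]
        for record in catalog
        if term in record["title"].lower()
        or any(term in keyword.lower() for keyword in record["keywords"])
    ]


if __name__ == "__main__":
    # Structured records serialize cleanly, so other software can read them.
    print(json.dumps(CATALOG[0], indent=2))
    print(find_datasets(CATALOG, "health"))
```

Because every record uses the same fields, a search tool can match on keywords without anyone opening the data files themselves.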
    8. 8. Protect Confidentiality and Privacy • It is critically important to protect the identities of research subjects. • Disclosure risk is a term that is often used for the possibility that a data record from a study could be linked to a specific person. • Concerns about disclosure risk have grown as more datasets have become available online, and it has become easier to link research datasets with publicly available external databases.
    9. 9. Protect Confidentiality and Privacy cont. Protecting the confidentiality of research subjects is not a valid reason to withhold data. Infrastructure, including virtual and physical data enclaves, already exists: • Restricted-Use Data are made available for research purposes to investigators who agree to stringent conditions for the use of the data and its physical safekeeping. • Enclave Data are datasets that present especially acute disclosure risks. They can be accessed only on-site in ICPSR's physical data enclave in Ann Arbor. Investigators must be approved. Their notes and analytic output are reviewed by ICPSR staff.
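The idea of disclosure risk described above, that a record might be linked to a specific person, can be sketched with a simple uniqueness check. This is a k-anonymity-style screen chosen for illustration, not a description of how ICPSR assesses risk. The function name, the field names, and the sample records are all hypothetical.

```python
from collections import Counter


def risky_records(records, quasi_identifiers, k=2):
    """Return records whose quasi-identifier combination is shared by fewer than k records.

    A combination that occurs only once could, in principle, be matched
    against an external database that carries the same fields plus a name.
    """
    def key(record):
        return tuple(record[field] for field in quasi_identifiers)

    counts = Counter(key(record) for record in records)
    return [record for record in records if counts[key(record)] < k]


# Hypothetical survey records; no real data.
SAMPLE = [
    {"id": 1, "region": "A", "birth_year": 1970, "sex": "F"},
    {"id": 2, "region": "A", "birth_year": 1970, "sex": "F"},
    {"id": 3, "region": "A", "birth_year": 1985, "sex": "M"},
]

if __name__ == "__main__":
    flagged = risky_records(SAMPLE, ["region", "birth_year", "sex"])
    print([record["id"] for record in flagged])
```

Records flagged this way are candidates for coarsening, suppression, or release only under restricted-use or enclave conditions.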
    10. 10. Preserve Intellectual Property Rights and Commercial Interests Original research may be both commercially valuable and proprietary. There are several approaches to managing these interests, including: – Tailor copyright and patent licenses, such as through Creative Commons licenses – Establish an embargo period or otherwise delay dissemination.
    11. 11. Balance Demands of Long-term Preservation and Access • Preserving digital data requires much more than storing files on a server, desktop, or in the cloud! • Digital preservation is the active and ongoing management of digital content to lengthen its lifespan and guard against loss, including physical deterioration, format obsolescence, and hardware and software failure.
    12. 12. Balance Demands of Long-term Preservation and Access cont. • Not all data are worth preserving indefinitely; less valuable or easily producible data may be preserved for shorter periods. • Establish selection and appraisal guidelines that make it clear what to save or discard. – Selection criteria consider factors like availability, confidentiality, copyright, quality, file format, and financial commitment.
    13. 13. Use of Data Management Plans • Data management plans describe how researchers will provide for long-term preservation of, and access to, scientific data in digital formats. • Data management plans provide opportunities for researchers to manage and curate their data more actively from project inception to completion. • See ICPSR's resource: Guidelines for Effective Data Management Plans
    14. 14. Include Cost of Data Management in Funding Proposals • Data management services carry real costs, ranging from personnel to storage to software. • Just as maintenance costs are routinely built into physical infrastructure development, data management costs should be built into data development. • Long-term access to data requires durable institutions that plan on a scale of decades and even generations. • Cost resources: – DataONE's "Provide budget information for your data management plan" – UK Data Archive's Costing Tool: Data Management Planning.
    15. 15. Evaluate Data Management Plans & Ensure Compliance • Plans help researchers prepare for working with and preserving data, help repositories get ready to accession data and provide access, and help agencies understand the community's needs for archiving and access. Evaluation helps refine plans so they are realistic and attainable. • If data management plans are to be a standard component of funding applications, funding recipients should be held accountable for deviations from the originally stated plans.
    16. 16. Promote Public Deposit of Data • Public deposit of data helps to ensure the long-term accessibility and preservation of the data. • It removes the burden of ongoing maintenance and care (and user support) from the researcher and provides a stable system to which data can be entrusted. • Many sustainable online repositories are already available to host and archive research data. These may include discipline-specific repositories, archives administered by funding agencies, or institutional repositories. • Databib, a searchable directory of over 500 research data repositories, can help locate relevant repositories by subject area.
    17. 17. Private-sector Cooperation to Improve Access Encourage cooperation with the private sector to improve data access and compatibility. Issues to consider: • What funding structures will be in place to ensure that both organizations involved are benefiting from the partnership? • Will the partnership require any rights to be transferred to the private organization? • How does private-sector cooperation affect access restrictions and intellectual property concerns?
    18. 18. Mechanisms for Identification & Attribution of Data • Properly citing data encourages the replication of scientific results, improves research standards, guarantees persistent reference, and gives proper credit to data producers. • Citing data is straightforward. Each citation must include the basic elements that allow a unique dataset to be identified over time: title, author, date, version, and persistent identifier. • Resources: ICPSR's Data Citations page, IASSIST's Quick Guide to Data Citation, DataCite.
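The five citation elements listed above (title, author, date, version, and persistent identifier) can be captured in a small structure. The sketch below is one possible arrangement in Python, not an official ICPSR or DataCite format; the class name, the ordering of elements in the output string, and the sample values (including the identifier) are hypothetical.

```python
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class DataCitation:
    """The basic elements that let a dataset be identified over time."""
    author: str
    date: str
    title: str
    version: str
    persistent_id: str

    def format(self) -> str:
        # Element order here is an illustrative choice, not a standard.
        return (f"{self.author} ({self.date}). {self.title}. "
                f"{self.version}. {self.persistent_id}")


def missing_elements(citation):
    """List the names of any citation elements left blank."""
    return [f.name for f in fields(citation)
            if not getattr(citation, f.name).strip()]


if __name__ == "__main__":
    # Placeholder values only; this is not a real dataset or identifier.
    example = DataCitation(
        author="Example, A.",
        date="2013",
        title="Example Household Survey",
        version="V1",
        persistent_id="doi:10.0000/example",
    )
    print(example.format())
    print(missing_elements(example))
```

Checking for blank elements before publishing a citation is a cheap way to catch a missing version or identifier, the two elements most often omitted.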
    19. 19. Data Stewardship Workforce Development In coordination with other agencies and the private sector, support training, education, and workforce development related to scientific data management, analysis, storage, preservation, and stewardship. Recent data stewardship workforce development in the United States has included: • Digital Preservation Outreach and Education, from the Library of Congress • Digital Preservation Management tutorial, from Cornell University, ICPSR, and MIT • DigCCurr, from the University of North Carolina
    20. 20. Data Stewardship Workforce Development cont. ICPSR hosts data stewardship courses as part of its Summer Program in Quantitative Methods of Social Research. These include: • Curating and Managing Research Data for Re-Use • Assessing and Mitigating Disclosure Risk: Essentials for Social Science • Providing Social Science Data Services: Strategies for Design and Operation
    21. 21. Long-term Support for Repository Development • ICPSR advocates long-term funding for specialized, long-lived, trustworthy, and sustainable repositories that can mediate between the needs of scientific disciplines and data preservation requirements. • As digital data management becomes an increasingly important part of scientific research, funding agencies must contribute to the developing ecosystem of services and technologies that support access to and preservation of data. • For more information, including various long-term funding models, see ICPSR’s 2013 position paper – “The Price of Keeping Knowledge”
    22. 22. Visit ICPSR Archives/Repositories already Meeting Public Access Requirements
    23. 23. ICPSR’s Data Management & Curation Site
    24. 24. ICPSR’s Guidelines for OSTP Data Access Plan Page
    25. 25. ICPSR – a 50-Year History of Providing Access to Research Data Established in 1962, ICPSR maintains and shares over 8,600 research datasets and hosts 16 public-access specialized collections of data funded by various government agencies and foundations. Our mission: ICPSR advances and expands social and behavioral research, acting as a global leader in data stewardship and providing rich data resources and responsive educational opportunities for present and future generations.
    26. 26. The Concept of Data Curation • Curation, from the Latin "to care," is the process that ICPSR uses to add value to data, maximize access, and ensure long-term preservation. • Data curation is akin to work performed by an art or museum curator. – Data are organized, described, cleaned, enhanced, and preserved for public use, much like the work done on paintings or rare books to make the works accessible to the public now and in the future. • Through curation, ICPSR provides meaningful and enduring access to data.
    27. 27. ICPSR’s Data Management & Curation Goals • Quality: Data at ICPSR are enhanced with meaningful information to make them complete, self-explanatory, and usable for future researchers. • Access: Sought by over 730 member institutions and indexed by all the major search engines, ICPSR data are easily discoverable and widely accessible to the public. • Citation: By providing standardized and well-recognized data citations, ICPSR ensures that data producers receive credit for their archived data. • Preservation: For over 50 years, ICPSR has preserved its data resources for the long term, guarding against deterioration, accidental loss, and digital obsolescence. • Confidentiality: Stringent protections are in place for securing and distributing sensitive data. • Educational Support: ICPSR has a long tradition of supporting training in quantitative methods, scientific data management, and resources for instruction.
    28. 28. Copies of these Slides & Use • Feel free to share them, present them, and cite them! • Find copies of these slides on
    29. 29. Get More Information • Visit ICPSR’s Data Management & Curation site: ment/index.jsp • Contact us: – – (734) 647-2200