Altman RDAP11 Policy-based Data Management

Micah Altman, Harvard; Policy-based Data Management

The 2nd Research Data Access and Preservation (RDAP) Summit
An ASIS&T Summit
March 31-April 1, 2011 Denver, CO
In cooperation with the Coalition for Networked Information
http://asist.org/Conferences/RDAP11/index.html

  • This work “Trustworthy Repositories, Organizations & Infrastructure”, by Micah Altman (http://redistricting.info) is licensed under the Creative Commons Attribution-Share Alike 3.0 United States License. To view a copy of this license, visit http://creativecommons.org/licenses/by-sa/3.0/us/ or send a letter to Creative Commons, 171 Second Street, Suite 300, San Francisco, California, 94105, USA.

Altman RDAP11 Policy-based Data Management: Presentation Transcript

  • Policy Based Digital Preservation: SafeArchive & The Dataverse Network®
    • Micah Altman, Institute for Quantitative Social Science, Harvard University
    • Prepared for the Research Data Access and Preservation Summit, ASIS&T, March 2011
  • Collaborators*
    • Leonid Andreev, Ed Bachman, Adam Buchbinder, Ken Bollen, Bryan Beecher, Steve Burling, Kevin Condon, Jonathan Crabtree, Merce Crosas, Gary King, Patrick King, Tom Lipkis, Freeman Lo, Jared Lyle, Marc Maynard, Nancy McGovern, Lois Timms-Ferrara, Akio Sone, Bob Treacy
    • Research Support
      • Thanks to the Library of Congress (PA#NDP03-1), the National Science Foundation (DMS-0835500, SES 0112072), IMLS (LG-05-09-0041-09), the Harvard University Library, the Institute for Quantitative Social Science, the Harvard-MIT Data Center, and the Murray Research Archive.
    * And co-conspirators
  • Related Work
    • Reprints available from: http://maltman.hmdc.harvard.edu
    • Altman, M., and J. Crabtree, 2011. “Using the SafeArchive System: TRAC-Based Auditing of LOCKSS”, Proceedings of Archiving 2011. (Forthcoming)
    • Altman, M., Beecher, B., and Crabtree, J.; with L. Andreev, E. Bachman, A. Buchbinder, S. Burling, P. King, M. Maynard. 2009. "A Prototype Platform for Policy-Based Archival Replication." Against the Grain . 21(2): 44-47.
    • Altman, M., Adams, M., Crabtree, J., Donakowski, D., Maynard, M., Pienta, A., & Young, C. 2009. "Digital preservation through archival collaboration: The Data Preservation Alliance for the Social Sciences." The American Archivist . 72(1): 169-182
    • Crosas, M. 2011, “The Dataverse Network: An Open-Source Application for Sharing, Discovering and Preserving Data”, D-Lib Magazine 17(1/2).
    • King, G. 2007. "An Introduction to the Dataverse Network as an Infrastructure for Data Sharing", Sociological Methods and Research 36(2): 173-199
    • Gutmann, M., Abrahamson, M., Adams, M.O., Altman, M., Arms, C., Bollen, K., Carlson, M., Crabtree, J., Donakowski, D., King, G., Lyle, J., Maynard, M., Pienta, A., Rockwell, R., Timms-Ferrara, L., Young, C. 2009. "From Preserving the Past to Preserving the Future: The Data-PASS Project and the challenges of preserving digital social science data", Library Trends 57(3): 315-33
  • SafeArchive: TRAC-Based Management of LOCKSS
    • Facilitating collaborative replication and preservation with technology…
    • Collaborators declare explicit non-uniform resource commitments
    • Policy records commitments, storage network properties
    • Storage layer provides replication, integrity, freshness, versioning
    • SafeArchive software provides monitoring, auditing, and provisioning
    • Content is harvested through HTTP (LOCKSS) or OAI-PMH
    • Integration of LOCKSS, The Dataverse Network, TRAC
  • Adding Policy to LOCKSS
    • LOCKSS (Lots of Copies Keep Stuff Safe)
      • Widely used in library community
      • Self-contained OSS replication system, low maintenance, inexpensive
      • Harvests resources via web-crawling, OAI-PMH, database queries,…
      • Maintains copies through secure p2p protocol
      • Zero trust & self repairing
    • What does SafeArchive add?
      • Auditing – easily monitor number of copies of content in network
      • Provisioning – ensure sufficient copies and distribution
      • Collaboration – coordinate across partners, monitor resource commitments
      • Provide restoration guarantees
      • Integrate with Dataverse Network digital repository
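
The auditing and provisioning bullets above can be illustrated with a minimal sketch. It is not SafeArchive's implementation: the `Policy` fields, the `audit` function, and the host, region, and collection names are all hypothetical, chosen only to show how a declared policy on copy count and distribution might be checked against what a storage network reports.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Policy:
    """Hypothetical replication policy (field names are assumptions)."""
    min_copies: int   # minimum number of hosts holding a verified copy
    min_regions: int  # minimum number of distinct regions among those hosts


def audit(policy, holdings, host_region):
    """Return {collection: [problems]} for collections that violate the policy.

    holdings:    collection name -> set of hosts holding a verified copy
    host_region: host name -> region label
    """
    findings = {}
    for collection, hosts in holdings.items():
        problems = []
        if len(hosts) < policy.min_copies:
            problems.append(f"copies: have {len(hosts)}, need {policy.min_copies}")
        regions = {host_region[h] for h in hosts}
        if len(regions) < policy.min_regions:
            problems.append(f"regions: have {len(regions)}, need {policy.min_regions}")
        if problems:
            findings[collection] = problems
    return findings


# Illustrative data only.
policy = Policy(min_copies=3, min_regions=2)
host_region = {"h1": "east", "h2": "east", "h3": "west"}
holdings = {"survey-a": {"h1", "h2", "h3"}, "survey-b": {"h1"}}
findings = audit(policy, holdings, host_region)
print(findings)
```

Here `survey-a` satisfies the policy, while `survey-b` is reported as short on both copies and regions; a provisioning step could use such findings to decide where additional replicas are needed.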
  • Why this tool?
    • To help institutions make commitments aligned with their policies and incentives, and
    • To automatically execute and monitor those commitments and policies
    • (Self-interest… Support Data-PASS partnership agreements and transfer protocols)
    • This tool provides a targeted vertical slice of functionality through the policy stack…
  • Another Why…
    • [Figure; recoverable label: R.I.P.]
  • SafeArchive Components
    • [Diagram; legend: Current, Planned]
  • SafeArchive Auditing & Reports
    • [Figure: Example Fragments]
  • SafeArchive: TRAC Alignment
    • SafeArchive audits provide evidence for compliance with policies on:
      • archival storage & preservation (B4)
      • independent audit mechanisms (B2)
      • appropriate system infrastructure (C1)
      • disaster planning and recovery (C3)
    • SafeArchive supports embedded policy documentation:
      • Organizational infrastructure (A1-4)
      • Collection policies (B2.5, 2.7, 5.2)
      • System configuration (C1.7-1.10)
  • SafeArchive: Schematizing Policy and Behavior
    • “The repository system must be able to identify the number of copies of all stored digital objects, and the location of each object and their copies.”
    • [Diagram labels: Policy, Schematization, Behavior (Operationalization)]
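
As a rough illustration of how the quoted requirement might be operationalized, the sketch below builds an inventory of copies and locations from raw observations. It assumes a hypothetical input format, a list of `(object_id, location)` pairs, and hypothetical function names; it does not reflect SafeArchive's actual schema.

```python
from collections import defaultdict


def copy_inventory(observations):
    """Group (object_id, location) observations into {object_id: sorted locations}.

    Duplicate observations of the same copy are counted once.
    """
    locations = defaultdict(set)
    for object_id, location in observations:
        locations[object_id].add(location)
    return {obj: sorted(locs) for obj, locs in locations.items()}


def unidentified(stored_objects, inventory):
    """Objects with no known copy location, i.e. where the requirement is unmet."""
    return sorted(obj for obj in stored_objects if not inventory.get(obj))


# Illustrative data only.
observations = [
    ("doc-1", "cache-a"),
    ("doc-1", "cache-b"),
    ("doc-2", "cache-a"),
    ("doc-1", "cache-a"),  # repeated report of the same copy
]
inventory = copy_inventory(observations)
print(inventory)   # {'doc-1': ['cache-a', 'cache-b'], 'doc-2': ['cache-a']}
print(unidentified({"doc-1", "doc-2", "doc-3"}, inventory))  # ['doc-3']
```

The number of copies of an object is simply the length of its location list, so one structure answers both halves of the requirement.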
  • The Dataverse Network®
    • For Scholars
      • Brand it like your own website.
      • Upload any type of data.
      • Establish a persistent data citation
      • Facilitate data discovery
      • Provide live analysis
      • Receive permanent storage space
    • For Organizations
      • Used by archives, libraries, journals, schools
      • Enable contributors to upload data
      • Organize studies by collections
      • Search across a universe of data
      • Control access and terms of use
      • Federate with catalogs and partners: OAI-PMH, LOCKSS, Z39.50, DDI
  • Dataverse Network – Designed for Research Data
  • Policy Support in the Dataverse Network
    • Access Control
      • Roles: access, curation, administration
      • Authenticate by: user, group, network, proxy
    • Workflow Policies
      • Built-in Versioning and Deaccessioning
      • Curatorial Review
        • Review of changes prior to release of new version
        • Review of new virtual archives
    • Legal Policies
      • Terms of use: accounts, uploads, downloads
      • Hierarchical terms: network, archive, study
      • Access request workflow
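
The hierarchical terms of use listed above (network, archive, study) can be sketched as a simple lookup. This assumes, for illustration only, that terms accumulate from the outermost level inward so that a user must accept every level that defines terms for an action; the data layout and the `applicable_terms` function are hypothetical and not the Dataverse Network's API.

```python
# Levels ordered from outermost to innermost, following the slide.
LEVELS = ("network", "archive", "study")


def applicable_terms(action, terms_by_level):
    """Collect the terms that apply to an action, outermost level first.

    terms_by_level: level name -> {action name -> terms text}
    Levels that define no terms for the action are skipped.
    """
    collected = []
    for level in LEVELS:
        text = terms_by_level.get(level, {}).get(action)
        if text:
            collected.append((level, text))
    return collected


# Illustrative data only.
terms = {
    "network": {
        "download": "Cite the data.",
        "upload": "You hold the rights to deposit.",
    },
    "study": {"download": "No redistribution."},
}
print(applicable_terms("download", terms))
# [('network', 'Cite the data.'), ('study', 'No redistribution.')]
```

An alternative design would let an inner level override an outer one; which behavior applies is a policy decision the sketch leaves open.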
  • Archival Collaboration through Shared Infrastructure: Data-PASS
    • Data-PASS is a broad-based partnership of social science data archives.
    • Data-PASS partners collaborate to:
      • identify and promote good archival practices
      • seek out at-risk research data
      • mutually safeguard collections
      • build preservation infrastructure
    • Data-PASS uses the Dataverse Network:
      • Creates federated catalog
      • Manages content for some partners
      • Provides simple way for organizations to participate in partnership
    • Data-PASS uses SafeArchive:
      • Collaboration through mutual replication of partner content
      • Supports legal transfer agreements
  • Where Do Policies Fit in Organizational Decisions?
    • [Diagram labels: NSDA, LOCKSS, META-ARCHIVE, DATA-PASS, SAFE, DVN, IRODS]
  • Ideal integration of policy and technology?
    • Expressed in domain/business language
    • Translated to a formal schematization
    • Automatically measured by technology
    • Directly controls procedures & actions to achieve compliance
    • Verifiable translation from business domain policy
    • Where do we go from here?
      • Combine the flexibility of iRODS and the semantic level of TRAC
      • Self-documenting infrastructure
      • Formal verifiable translation of policy to schema, and schema to action
      • Make good policy easy to implement!
    Policy: a set of rules and objectives, expressed in a high-level domain, that controls actions at a lower level
  • Contact Us
    • Micah Altman
    • maltman.hmdc.harvard.edu
    • SafeArchive
    • safearchive.org
    • The Dataverse Network ™
    • thedata.org