Building Communities of “Trust”
Micah Altman, Institute for Quantitative Social Science, Harvard University
Prepared for Private LOCKSS Networks: Community-based Approaches to Distributed Digital Preservation
Educopia, October 2010

Collaborators*
Margaret Adams, George Alter, Ed Bachman, Adam Buchbinder, Ken Bollen, Bryan Beecher, Steve Burling, Darrell Donakowski, Gary King, Patrick King, Bill Lefurgy, Jared Lyle, Marc Maynard, Amy Pienta, Lois Timms-Ferrara.
* And co-conspirators

Research Support
Thanks to the Library of Congress (PA#NDP03-1), the National Science Foundation (DMS-0835500, SES 0112072), IMLS (LG-05-09-0041-09), the Harvard University Library, the Institute for Quantitative Social Science, the Harvard-MIT Data Center, and the Murray Research Archive.

Related Work
Reprints available from: http://maltman.hmdc.harvard.edu
- Altman, M., Beecher, B., and Crabtree, J.; with L. Andreev, E. Bachman, A. Buchbinder, S. Burling, P. King, M. Maynard. (2009). "A Prototype Platform for Policy-Based Archival Replication." Against the Grain. 21(2): 44-47.
- Altman, M., Adams, M., Crabtree, J., Donakowski, D., Maynard, M., Pienta, A., & Young, C. (2009). "Digital preservation through archival collaboration: The Data Preservation Alliance for the Social Sciences." The American Archivist. 72(1): 169-182.
- Gutmann, M., Abrahamson, M., Adams, M.O., Altman, M., Arms, C., Bollen, K., Carlson, M., Crabtree, J., Donakowski, D., King, G., Lyle, J., Maynard, M., Pienta, A., Rockwell, R., Timms-Ferrara, L., Young, C. (2009). "From Preserving the Past to Preserving the Future: The Data-PASS Project and the challenges of preserving digital social science data." Library Trends. 57(3): 315-33.
- Altman, M. (2009). "Transformative Effects of NDIIPP, the case of the Henry A. Murray Archive." Library Trends. 57(3): 338-35.

Structuring Collaboration for Preservation
- Risks. How can virtual organizations reduce preservation risks?
- Trust. What trust relationships should virtual organizations establish among members?
- Evaluation. How should the virtual organization and relationships be evaluated?

Conjectures
Organizations reduce preservation risk by:
- Providing systematic redundancy across diverse …
  - Technical approaches: software, hardware, formats
  - Institutional environments: funding models, legal regime
  - Institutional control: curation, deaccessioning
- Enhancing preservation readiness:
  - Awareness of risks and risk management approaches
  - Awareness & use of best practices
  - Active exercise of cataloging information, licensing terms, APIs
Trust and evaluation should be based on:
- Linking policy objectives to explicitly-defined roles, actions, and expected outcomes
- Continuous evaluation and monitoring based on organizational incentives, capacity, & commitments

One tool… SAFE-Archive: Policy-Based Replication & Auditing
Facilitating collaborative replication and preservation with technology…
- Collaborators declare explicit non-uniform resource commitments
- Policy records commitments, storage network properties
- Storage layer provides replication, integrity, freshness, versioning
- SAFE-Archive software provides monitoring, auditing, and provisioning
- Content is harvested through HTTP (LOCKSS) or OAI-PMH
- Integration of LOCKSS, The Dataverse Network, TRAC

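The division of labor on this slide (collaborators declare commitments, a policy records them, software audits the result) can be illustrated with a minimal sketch. This is not SAFE-Archive's actual schema or API: the names `Commitment`, `Policy`, and `audit`, the host names, and all sizes are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Commitment:
    """Storage a collaborator has promised to the network (hypothetical schema)."""
    host: str
    capacity_gb: int


@dataclass(frozen=True)
class Policy:
    """Minimum number of verified replicas required per collection."""
    min_replicas: int


def audit(policy, collection_sizes_gb, placements, commitments):
    """Return a list of human-readable policy violations (empty if compliant).

    placements maps a collection name to the set of hosts holding a verified copy.
    """
    problems = []
    capacity = {c.host: c.capacity_gb for c in commitments}
    used = {host: 0 for host in capacity}

    for name, size in sorted(collection_sizes_gb.items()):
        hosts = placements.get(name, set())
        if len(hosts) < policy.min_replicas:
            problems.append(
                f"{name}: {len(hosts)} replicas, policy requires {policy.min_replicas}"
            )
        for host in sorted(hosts):
            if host not in capacity:
                problems.append(f"{name}: held by {host}, which has made no commitment")
            else:
                used[host] += size

    for host in sorted(used):
        if used[host] > capacity[host]:
            problems.append(
                f"{host}: {used[host]} GB stored exceeds {capacity[host]} GB committed"
            )
    return problems


# Illustrative data only: three hosts with non-uniform commitments.
commitments = [
    Commitment("archive-a", 500),
    Commitment("archive-b", 200),
    Commitment("archive-c", 100),
]
sizes = {"survey-1": 150, "survey-2": 80}
placements = {
    "survey-1": {"archive-a", "archive-b", "archive-c"},
    "survey-2": {"archive-a", "archive-b"},
}
report = audit(Policy(min_replicas=3), sizes, placements, commitments)
for line in report:
    print(line)
```

In this example the audit flags one under-replicated collection and two hosts storing more than they committed, which is the kind of mismatch between declared policy and actual state that monitoring is meant to surface.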
Storage Layers Other than LOCKSS
System: LOCKSS
  Risks: Single implementation; Small installed base; Small development community; Scalability
  Advantages: Designed for preservation; Fault-tolerant; Minimal trust model; Harvesting functions
System: iRODS
  Risks: Single implementation; Small installed base; Small development community; Complexity of rules system; No integrity built in (use ACE?)
  Advantages: Flexible rules; Scalable
System: GNUnet, Freenet, Tahoe-LAFS
  Risks: Complexity of integration; No support for versioning
  Advantages: Fault tolerant; Moderate installed base; Multiple implementations
System: CrashPlan, SpiderOak, Mozy
  Risks: Closed source; Difficult to integrate with; Licensing fees
  Advantages: Multiple implementations; Extensive target storage support; Extensive reporting; Commercial support

Why this tool?
- To help institutions make commitments aligned with their policies and incentives
- To automatically execute and monitor those commitments and policies
- To support Data-PASS partnership agreements and transfer protocols
This tool provides a thin slice of functionality through the entire policy stack…

Another Why…
R.I.P.

Organizational Support
NSDA, PLN, EDUCOPIA, DATA-PASS, SAFE

Risk Management
- Risk Identification
- Vulnerability Analysis
- Process, Systems, Institutional Controls
- Detection
- Verification
- Diversification
- Replication
- Insurance

Sustainability
- Economic models
- Advocacy
- Outreach
- Mission
- Strategic planning
- Strategic collaboration
- Transparency

Note on “distributed”:
- “Distributed” -> multiple autonomous systems + communication channels
- Distributed systems are often associated with heterogeneous communication costs
- “Distributed” ≠ {Replicated, Fault tolerant, Diversified}

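The note that “distributed” does not imply replicated or diversified can be made concrete with a small sketch. The `Node` record and the three checks are illustrative assumptions, not definitions from the talk; in particular, "diversified" is taken here to mean that copies span more than one institution and more than one software stack, and fault tolerance is not modeled at all.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Node:
    """One autonomous storage system (fields are illustrative assumptions)."""
    name: str
    institution: str
    software: str
    holds: frozenset  # names of the collections stored on this node


def is_distributed(nodes):
    """True when more than one autonomous system takes part."""
    return len(nodes) > 1


def replica_count(nodes, collection):
    """Number of nodes that hold a copy of the collection."""
    return sum(1 for n in nodes if collection in n.holds)


def is_diversified(nodes, collection):
    """True when copies span more than one institution and software stack."""
    holders = [n for n in nodes if collection in n.holds]
    institutions = {n.institution for n in holders}
    stacks = {n.software for n in holders}
    return len(institutions) > 1 and len(stacks) > 1


# Distributed but not replicated: each node holds a different collection.
sharded = [
    Node("n1", "Institution A", "stack-x", frozenset({"c1"})),
    Node("n2", "Institution B", "stack-y", frozenset({"c2"})),
]
# Replicated but not diversified: same institution, same software.
mirrored = [
    Node("n1", "Institution A", "stack-x", frozenset({"c1"})),
    Node("n2", "Institution A", "stack-x", frozenset({"c1"})),
]

print(is_distributed(sharded), replica_count(sharded, "c1"))
print(replica_count(mirrored, "c1"), is_diversified(mirrored, "c1"))
```

Both example networks count as distributed, yet the first keeps a single copy of each collection and the second keeps two copies that share every institutional and technical risk.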
When Describing Mitigation Strategies
- Describe threat category and source
- Describe domain over which mitigation is applied
- Describe what is being monitored or verified

THREAT MODELS (Category: Source)
Technical
- Media failure: natural, human error, malice
- Media obsolescence: natural
- Format obsolescence: natural
- Software infrastructure: human error, malice
- Network infrastructure: natural, human error, malice
External Institution
- Third party attacks: human error, malice
- Loss of funding: natural, human error, malice
- Change of legal regime: natural
Internal Institution
- Curatorial modification: human error, malice
- Loss of institutional knowledge: natural, human error, malice
- Mission change: human error
- Ingest incomplete: human error, malice
- Acquisition failure: natural, human error, malice

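One way to apply the three "describe" rules is to record each mitigation against the threat table and ask which threat sources it leaves uncovered. The sketch below encodes the table as read from the slide; the `Mitigation` record, the helper names, and the example fixity audit are hypothetical additions for illustration.

```python
from dataclasses import dataclass

# Threat table transcribed from the slide: category -> threat -> sources.
THREAT_MODEL = {
    "Technical": {
        "Media failure": {"natural", "human error", "malice"},
        "Media obsolescence": {"natural"},
        "Format obsolescence": {"natural"},
        "Software infrastructure": {"human error", "malice"},
        "Network infrastructure": {"natural", "human error", "malice"},
    },
    "External Institution": {
        "Third party attacks": {"human error", "malice"},
        "Loss of funding": {"natural", "human error", "malice"},
        "Change of legal regime": {"natural"},
    },
    "Internal Institution": {
        "Curatorial modification": {"human error", "malice"},
        "Loss of institutional knowledge": {"natural", "human error", "malice"},
        "Mission change": {"human error"},
        "Ingest incomplete": {"human error", "malice"},
        "Acquisition failure": {"natural", "human error", "malice"},
    },
}


@dataclass(frozen=True)
class Mitigation:
    """A mitigation described by threat and source, domain, and what is verified."""
    name: str
    threat: str
    sources: frozenset
    domain: str
    monitored: str


def threat_sources(threat):
    """Look up the sources listed for a threat, whatever its category."""
    for threats in THREAT_MODEL.values():
        if threat in threats:
            return threats[threat]
    raise KeyError(threat)


def uncovered_sources(threat, mitigations):
    """Sources of a threat that none of the given mitigations address."""
    covered = set()
    for m in mitigations:
        if m.threat == threat:
            covered |= m.sources
    return threat_sources(threat) - covered


# Hypothetical example: a periodic fixity audit across all replicas.
fixity_audit = Mitigation(
    name="Periodic fixity audit",
    threat="Media failure",
    sources=frozenset({"natural", "human error"}),
    domain="all replicas in the network",
    monitored="checksums of stored files",
)
print(uncovered_sources("Media failure", [fixity_audit]))
```

Stating the sources explicitly makes the gap visible: this example audit, as described, says nothing about deliberate tampering, so "malice" remains uncovered for media failure.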
Trust is an Overloaded Term
- Individual character: Mensch-like behavior
- “Trusted systems”
- Provenance of content
- Fault tolerant systems
- Cryptographic privacy/integrity guarantees
- Good inter-institutional relationships
- Good institutional reputation
- Statistical reliability

Evaluation Levels
- Do documented policies & procedures exist?
  SAS 70 Type I: point-in-time; controls in operation; documented/presented; suitable for control objective
- Are operations consistent with policies and procedures?
  SAS 70 Type II: tests of control effectiveness over time
- Do policies and procedures reflect appropriate/good/best practices in place to obtain objectives?
  FISMA Certification: evaluates objectives, threats, vulnerabilities, recommends controls
- Are objectives, goals, mission, values consistent?
  Examples: CRL, Charity Navigator
- Does the institution have the fitness to honor commitments?
  Examples: CRL, Standard & Poor’s, Moody’s

Information Security Control Selection Process
- System Analysis
- Threat Modeling
- Vulnerability Identification
- Analysis: likelihood, impact, mitigating controls
- Institute Selected Controls
- Testing and Auditing

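The analysis step of the control selection process weighs likelihood, impact, and mitigating controls. A minimal sketch of one common way to combine them follows. The scoring formula (likelihood times impact, discounted by control effectiveness) is an assumption, not a method given in the talk, and every number below is invented for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Risk:
    """One threat with rough estimates (all values here are illustrative)."""
    threat: str
    likelihood: float                    # estimated chance per year, 0..1
    impact: float                        # relative loss if it occurs, 0..10
    control_effectiveness: float = 0.0   # fraction of risk removed, 0..1

    def residual(self):
        """Risk left over after mitigating controls are applied."""
        return self.likelihood * self.impact * (1.0 - self.control_effectiveness)


def rank(risks):
    """Order risks from highest to lowest residual score."""
    return sorted(risks, key=lambda r: r.residual(), reverse=True)


risks = [
    Risk("Media failure", 0.30, 6, control_effectiveness=0.9),
    Risk("Loss of funding", 0.10, 9, control_effectiveness=0.2),
    Risk("Format obsolescence", 0.05, 4),
]

for r in rank(risks):
    print(f"{r.threat}: {r.residual():.2f}")
```

With these made-up estimates, a likely but well-controlled technical threat ranks below a less likely institutional one, which is why the controls already in place belong in the analysis alongside likelihood and impact.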
What Can We Learn from Open Source Dev?
- Most OSS projects have limited success, at best
  - Most fail/expire
  - Most have single/small group of developers
- If you build it, users may come
  - Developers may come if people who use your tool also develop it
- Incentives
  - Ego
  - Reputation
  - Linked to job incentives
- Structure
  - Have a leader (or small cabal) at any point in time
  - Transparency
  - Governance is linked to participation

Knowledge Goods
Software Best Practice Preserved Digital Content Storage Provisioning Funding (Thin Market) Acquisition Pool Clients

More Questions
Policy and evaluation
- What policies should members adopt regarding the use of collaboratives in their preservation strategy?
- How should members document the ways in which collaboratives support their preservation strategy?
- When a preservation strategy relies on a collaborative, how should evaluators approach assessment of the collaborative?
Examination of risks
- Which preservation risks are collaboratives/virtual organizations in the best position to mitigate?
- What additional risks do virtual organizations and collaboratives create?
- How do characteristics of a collaborative, such as geographical diversity, affect its ability to reduce preservation risks for its members?
How do we define “Trust” in …
- preservation partners
- preservation technologies and components
- preservation collaborations
- Who is trusting whom to do what? And what happens if they don’t?
Trust but Verify
- How can collaborations balance trust and risk?
- What evidence is required to substantiate trust? Audit reports? MOUs? Contracts?

Contact Us
- Micah Altman: maltman.hmdc.harvard.edu
- Jonathan Crabtree: www.irss.unc.edu/odum/jsp/content_node.jsp?nodeid=522
- Nancy McGovern: www.icpsr.org/icpsrweb/ICPSR/staff/mcgovern.jsp

Editor's Notes

  • #2 This work “Trustworthy Repositories, Organizations & Infrastructure”, by Micah Altman (http://redistricting.info) is licensed under the Creative Commons Attribution-Share Alike 3.0 United States License. To view a copy of this license, visit http://creativecommons.org/licenses/by-sa/3.0/us/ or send a letter to Creative Commons, 171 Second Street, Suite 300, San Francisco, California, 94105, USA.