Auditing PLN’s: Preliminary Results


In this presentation we summarize lessons learned from trial audits of several production distributed digital preservation networks. These audits were conducted using the open-source SafeArchive system. The presentation shows the importance of designing auditing systems to provide diagnostic information that can be used to explain non-confirmations of audited policies.

  • This work by Micah Altman, with the exception of images explicitly accompanied by a separate “source” reference, is licensed under the Creative Commons Attribution-Share Alike 3.0 United States License. To view a copy of this license, visit or send a letter to Creative Commons, 171 Second Street, Suite 300, San Francisco, California, 94105, USA.

    1. 1. Prepared for PLN 2012, UNC Chapel Hill, October 2012. Auditing PLNs: Preliminary Results and Next Steps. Micah Altman, Director of Research, MIT Libraries; Non-Resident Senior Fellow, The Brookings Institution. Jonathan Crabtree, Assistant Director of Computing and Archival Research, H.W. Odum Institute for Research in Social Science, UNC
    2. 2. Collaborators*: • Nancy McGovern • Tom Lipkis & the LOCKSS team. Research support: thanks to the Library of Congress, the National Science Foundation, IMLS, the Sloan Foundation, the Harvard University Library, the Institute for Quantitative Social Science, and the Massachusetts Institute of Technology. (* And co-conspirators.) Auditing PLNs
    3. 3. Related Work. Reprints available from: • M. Altman, J. Crabtree, “Using the SafeArchive System: TRAC-Based Auditing of LOCKSS”, Proceedings of Archiving 2011, Society for Imaging Science and Technology. • Altman, M., Beecher, B., & Crabtree, J. (2009). “A Prototype Platform for Policy-Based Archival Replication”, Against the Grain, 21(2), 44-47.
    4. 4. Preview: • Why audit? • Theory & practice – Round 0: Setting up the Data-PASS PLN – Round 1: Self-audit – Round 2: Compliance (almost) – Round 3: Auditing other networks • What’s next?
    5. 5. Why audit?
    6. 6. Short Answer: Why the heck not? “Don’t believe in anything you hear, and only half of what you see” - Lou Reed. “Trust, but verify.” - Ronald Reagan
    7. 7. Slightly Longer Answer: Things go wrong – physical & hardware failure; software failure; insider & external attacks; organizational failure; media failure; curatorial error.
    8. 8. Full Answer: It’s our responsibility.
    9. 9. OAIS Model Responsibilities: • Accept appropriate information from information producers. • Obtain sufficient control of the information to ensure long-term preservation. • Determine which groups should become the Designated Community (DC) able to understand the information. • Ensure that the preserved information is independently understandable to the DC. • Ensure that the information can be preserved against all reasonable contingencies. • Ensure that the information can be disseminated as authenticated copies of the original, or as traceable back to the original. • Make the preserved data available to the DC.
    10. 10. OAIS Basic Implied Trust Model: • The organization is axiomatically trusted to identify designated communities. • The organization is engineered with the goals of: collecting appropriate authentic documents; reliably delivering authentic documents, in understandable form, at a future time. • Success depends upon: reliability of storage systems (e.g., LOCKSS network, Amazon Glacier); reliability of organizations (MetaArchive, Data-PASS, Digital Preservation Network); document contents and properties (formats, metadata, semantics, provenance, authenticity).
    11. 11. Reflections on the OAIS Trust Model: • A specific bundle of trusted properties. • Complete neither instrumentally nor ultimately.
    12. 12. Trust Engineering Approaches: • Incentive-based approaches: rewards, penalties, incentive-compatible mechanisms. • Modeling and analysis: statistical quality control & reliability estimation, threat modeling and vulnerability assessment. • Portfolio theory: diversification (financial, legal, technical…), hedges. • Over-engineering approaches: safety margins, redundancy. • Informational approaches: transparency (release of information needed to directly evaluate compliance); cryptographic signatures, fingerprints, common knowledge, non-repudiation. • Social engineering: recognized practices; shared norms; social evidence; reduce provocations; remove excuses. • Regulatory approaches: disclosure; review; certification; audits; regulations & penalties. • Security engineering: increase effort (harden target to reduce vulnerability; increase technical/procedural controls); increase risk (surveillance, detection, likelihood of response); reduce reward (deny benefits, disrupt markets, identify property, remove/conceal targets); design patterns (minimal privileges, separation of privileges).
    13. 13. Audit [aw-dit]: An independent evaluation of records and activities to assess a system of controls. Fixity mitigates risk only if used for auditing.
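The distinction on this slide – fixity as a mitigation only when it feeds an audit – can be made concrete with a small sketch. This is not SafeArchive or LOCKSS code; the function names and the choice of SHA-256 are illustrative assumptions:

```python
import hashlib

def fixity(path, algorithm="sha256", chunk_size=1 << 20):
    """Compute a fixity digest for a file, reading in chunks so large
    archival objects do not need to fit in memory."""
    h = hashlib.new(algorithm)
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()

def audit(path, recorded_digest, algorithm="sha256"):
    """An audit is the *comparison*: the current digest checked against an
    independently recorded one. A digest that is never compared to an
    external record mitigates nothing."""
    return fixity(path, algorithm) == recorded_digest
```

The point of the sketch is that `fixity` alone only produces a number; `audit` is where risk is actually detected.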
    14. 14. Functions of Storage Auditing• Detect corruption/deletion of content• Verify compliance with storage/replication policies• Prompt repair actions
    15. 15. Bit-Level Audit Design Choices• Audit regularity and coverage: on-demand (manually); on object access; on event; randomized sample; scheduled/comprehensive• Fixity check & comparison algorithms• Auditing scope: integrity of object; integrity of collection; integrity of network; policy compliance; public/transparent auditing• Trust model• Threat model
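One of the regularity/coverage options listed above is a randomized sample: check a fraction of objects each cycle so that, over many scheduled cycles, coverage approaches comprehensive. A minimal sketch of that scheduling choice, with hypothetical names (not drawn from any real auditing system):

```python
import random

def sample_for_audit(object_ids, fraction=0.1, seed=None):
    """Choose a random subset of objects to fixity-check this audit cycle.
    A fixed seed makes a cycle's sample reproducible for diagnostics."""
    rng = random.Random(seed)
    ids = list(object_ids)
    k = max(1, int(len(ids) * fraction))  # always audit at least one object
    return rng.sample(ids, k)
```

The trade-off this encodes: sampling bounds per-cycle cost, at the price of a delay (several cycles, in expectation) before any given corrupted object is examined.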
    16. 16. Repair. Auditing mitigates risk only if used for repair. Key design elements: • Repair granularity • Repair trust model • Repair latency: detection to start of repair; repair duration • Repair algorithm
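The "repair trust model" element above asks: when replicas disagree, which copy do you trust as the repair source? One simple model is majority voting with a quorum. This is a hedged sketch of that idea only – LOCKSS's actual polling protocol is considerably more sophisticated – and all names here are hypothetical:

```python
from collections import Counter

def plan_repairs(replica_digests, quorum=2):
    """Given {replica_name: digest} for one object, treat the digest held
    by a quorum-sized majority as trusted, and return it together with the
    replicas that need repair. With no quorum, escalate rather than repair."""
    counts = Counter(replica_digests.values())
    digest, votes = counts.most_common(1)[0]
    if votes < quorum:
        return None, []  # no trusted version; a human should look at this
    return digest, sorted(r for r, d in replica_digests.items() if d != digest)
```

Refusing to repair without a quorum illustrates repair latency as a deliberate choice: a wrong automatic repair can propagate corruption faster than no repair at all.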
    17. 17. LOCKSS Auditing & Repair: decentralized, peer-to-peer, tamper-resistant replication & repair. Regularity: scheduled. Algorithms: bespoke, peer-reviewed, tamper-resistant. Scope: collection integrity; collection repair. Trust model: publisher is canonical source of content; changed content treated as new; replication peers are untrusted. Main threat models: media failure; physical failure; curatorial error; external attack; insider threats; organizational failure. Key auditing limitations: correlated software failure; lack of policy auditing and public/transparent auditing.
    18. 18. Auditing & Repair: TRAC-aligned policy auditing as an overlay network. Regularity: scheduled; manual. Fixity algorithms: relies on underlying replication system. Scope: collection integrity; network integrity; network repair; high-level (e.g., TRAC) policy auditing. Trust model: external auditor, with permissions to collect metadata/log information from the replication network; replication network is untrusted. Main threat models: software failure; policy implementation failure (curatorial error; insider threat); organizational failure; media/physical failure through underlying replication system. Key auditing limitations: relies on the underlying replication system, (currently) LOCKSS, for fixity checks and repair.
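The "high-level policy auditing" row describes checking an observed network map against declared policy, rather than checking bits directly. A minimal sketch of that comparison, assuming (hypothetically) that policy is expressed as a minimum replica count per collection – real TRAC-aligned policies cover much more than replica counts:

```python
def check_replication_policy(network_state, policy):
    """Compare an observed map of the network against policy.

    network_state: {collection: set of replica names observed holding it}
    policy:        {collection: minimum required replica count}
    Returns {collection: (observed, required)} for each non-conformance.
    """
    failures = {}
    for collection, minimum in policy.items():
        replicas = network_state.get(collection, set())
        if len(replicas) < minimum:
            failures[collection] = (len(replicas), minimum)
    return failures
```

Returning the observed/required pair, not just a boolean, reflects the theme of this talk: an audit should carry enough diagnostic information to explain a non-confirmation.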
    19. 19. Theory vs. Practice. Round 0: Setting up the Data-PASS PLN. “Looks ok to me” - PHB motto
    20. 20. Theory: Expose content (through OAI+DDI+HTTP) → Install LOCKSS (on 7 servers) → Harvest content (through OAI plugin) → Set up PLN configurations (through OAI plugin) → LOCKSS magic → Done.
    21. 21. Practice (Year 1): • OAI plugin extensions required: non-DC metadata; large metadata; alternate authentication method; save metadata record; support for OAI sets; non-fatal error handling. • OAI provider required: authentication extensions; performance handling for delivery; performance handling for errors; metadata validation. • PLN configuration required: stabilization around LOCKSS versions; coordination around plugin repository; coordination around AU definition.
    22. 22. Theory vs. Practice. Round 1: Self-Audit. “A mere matter of implementation” - PHB motto
    23. 23. Theory: Gather information from each replica → Integrate information into a map of network state → Compare current network state to policy → If policy not met, add a replica and repeat; if met, success.
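The loop on this slide can be sketched as a small control function. This is a schematic of the flowchart, not the SafeArchive implementation; the gather/integrate/compare/add-replica callables are assumed to be supplied by the surrounding system:

```python
def audit_until_compliant(gather, integrate, compare, add_replica, max_rounds=10):
    """The self-audit loop from the flowchart: map the network, compare
    the map to policy, and add replicas until policy is met (True) or we
    give up after max_rounds (False)."""
    for _ in range(max_rounds):
        state = integrate(gather())       # gather from each replica, build map
        failures = compare(state)         # non-conformances against policy
        if not failures:
            return True                   # success
        add_replica(failures)             # attempt remediation, then re-audit
    return False
```

As the "Practice" slides go on to show, the hard parts hide inside `gather` and `integrate` (permissions, lagged and incomplete information), and `add_replica` turns out not to fix most failures.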
    24. 24. Implementation
    25. 25. Practice (Year 2): • Gathering information required: permissions; reverse-engineering UIs (with help); network magic. • Integrating information required: heuristics for lagged information; heuristics for incomplete information; heuristics for aggregated information. • Comparing the map to policy: a mere matter of implementation. • Adding replicas: uh-oh, most policies failed, and adding replicas wasn’t going to resolve most issues.
    26. 26. Theory vs. Practice. Round 2: Compliance (almost). “How do you spell ‘backup’? R-E-C-O-V-E-R-Y” -
    27. 27. Practice (and adjustment) makes perfekt? • Timings (e.g., crawls, polls): understand; tune; parameterize heuristics and reporting; track trends over time. • Collections: change partitioning to AUs at source; extend mapping to AUs in plugin; extend reporting/policy framework to group AUs. • Diagnostics: when things go wrong, information to inform adjustment.
    28. 28. Theory vs. Practice. Round 3: Auditing Other PLNs. “In theory, theory and practice are the same – in practice, they differ.” -
    29. 29. Theory: Gather information from each replica → Integrate information into a map of network state → Compare current network state to policy → If policy met, success. If not: if AU sizes and polling intervals have not yet been adjusted, adjust them; otherwise add a replica; then repeat.
    30. 30. Practice (Year 3): • 100% of what? • Diagnostic inference.
    31. 31. 100% of what?• No: Of LOCKSS boxes?• No: Of AU’s?• Almost: Of policy overall• Yes: Of policy for specific collection• Maybe: Of files?• Maybe: Of bits in a file?
    32. 32. What you see: Boxes X, Y, Z all agree on AU A. What you can conclude: Boxes X, Y, Z have the same content; the content is good. Assumption: failures on file harvest are independent; the number of harvested files is large.
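The inference on this slide leans on the independence assumption: unanimous agreement is only strong evidence if replicas could not all have gone wrong the same way together. A tiny numeric sketch of why, using a made-up per-replica failure probability (the numbers are purely illustrative):

```python
def p_agreement_masks_failure(p_fail, n_replicas):
    """If each replica independently harvests a bad copy with probability
    p_fail, the chance that *all* replicas hold the same bad content (so
    unanimous agreement is misleading) is at most p_fail ** n_replicas.
    Correlated failures (e.g., a shared software bug) break this bound."""
    return p_fail ** n_replicas
```

This is exactly why the LOCKSS slide lists "correlated software failure" as a key auditing limitation: a bug shared by every box makes the failures dependent, and the exponential bound no longer applies.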
    33. 33. What you see: Boxes X, Y, Z don’t agree. What can you conclude?
    34. 34. Hypothesis 1: Disagreement is real, but doesn’t really matter. Non-substantive AU differences (arising from dynamic elements in AUs that have no bearing on the substantive content): 1.1 Individual URLs/files that are dynamic and non-substantive (e.g., logo images, plugins, Twitter feeds) cause content changes (this is common in the GLN). 1.2 Dynamic content embedded in substantive content (e.g., a customized per-client header page embedded in the PDF for a journal article). Hypothesis 2: Disagreement is real, but doesn’t really matter in the longer run (even if disagreement persists over the long run!). 2.1 Temporary AU differences: versions of objects temporarily out of sync (e.g., if harvest frequency << source update frequency, but harvest times across boxes vary significantly). 2.2 Objects temporarily missing (e.g., recently added objects are picked up by some replicas but not by others). Hypothesis 3: Disagreement is real, and matters. Substantive AU differences: 3.1 Content corruption (e.g., from corruption in storage, or during transmission/harvesting). 3.2 Objects persistently missing from some replicas (e.g., because of a permissions issue at the provider; technical failures during harvest; plugin problems). 3.3 Versions of objects persistently missing/out of sync on some replicas (e.g., harvest frequency > source update frequency, leading to different AUs harvesting different versions of the content). Note that later “agreement” signifies that a particular version was verified, not that all versions have been replicated and verified. Hypothesis 4: AUs really do agree, but we think they don’t. 4.1 Appearance of disagreement caused by incomplete diagnostic information: poll data are missing as a result of system reboot, daemon updates, or another cause. 4.2 Poll data are lagging, from different periods: polls fail, but contain information about agreement that is ignored.
    36. 36. Design Challenge: • Create more sophisticated auditing algorithms, and • Instrument PLN data collection, such that observed behavior allows us to distinguish between hypotheses 1-4.
    37. 37. Approaches to Design Challenge [Tom Lipkis’s Talk]
    38. 38. What’s Next? “It’s tough to make predictions, especially about the future” - attributed to Woody Allen, Yogi Berra, Niels Bohr, Vint Cerf, Winston Churchill, Confucius, Disreali [sic], Freeman Dyson, Cecil B. DeMille, Albert Einstein, Enrico Fermi, Edgar R. Fiedler, Bob Fourer, Sam Goldwyn, Allan Lamport, Groucho Marx, Dan Quayle, George Bernard Shaw, Casey Stengel, Will Rogers, M. Taub, Mark Twain, Kerr L. White, and others.
    39. 39. Short Term: • Complete round 3 data collection • Refinements of current auditing algorithms: more tunable parameters (yeah?!); better documentation; simple health metrics • Reports and dissemination
    40. 40. Longer Term: • Health metrics, diagnostics, decision support • Additional audit standards • Support additional replication networks • Audit other policy sets
    41. 41. Bibliography (Selected): • B. Schneier, 2012. Liars and Outliers, John Wiley & Sons. • H.M. Gladney, J.L. Bennett, 2003. “What Do We Mean by Authentic?”, D-Lib Magazine 9(7/8). • K. Thompson, 1984. “Reflections on Trusting Trust”, Communications of the ACM 27(8), August 1984, pp. 761-763. • David S.H. Rosenthal, Thomas S. Robertson, Tom Lipkis, Vicky Reich, Seth Morabito, 2005. “Requirements for Digital Preservation Systems: A Bottom-Up Approach”, D-Lib Magazine 11(11), November 2005. • OAIS, Reference Model for an Open Archival Information System (OAIS). CCSDS 650.0-B-1, Blue Book, January 2002.
    42. 42. Questions? E-mail: Micah_altman@alumni.brown.edu Web: micahaltman.com Twitter: @drmaltman