Taming the Monster
Digital Preservation Planning and Implementation Tools

Dorothea Salo
One System, One Library
2 June 2011

Photo: “Happy Easter, to my Peeps”
http://www.flickr.com/photos/76074333@N00/449028423/
WorldIslandInfo.com / CC-BY 2.0
Why is this so scary?

Photo: “Happy Easter, to my Peeps”
http://www.flickr.com/photos/76074333@N00/449028423/
WorldIslandInfo.com / CC-BY 2.0
Isn’t this just as scary?

Photo: “News Paper Origami Dragon Monster”
http://www.flickr.com/photos/epsos/3777343342/
epSos.de / CC-BY 2.0
Yet we persevere.

Photo: “News Paper Origami Dragon Monster”
http://www.flickr.com/photos/epsos/3777343342/
epSos.de / CC-BY 2.0
DIGITAL IS NO DIFFERENT.

Photo: “559 - The Matrix - Seamless Texture”
http://www.flickr.com/photos/zooboing/4335531915/
Patrick Hoesly / CC-BY 2.0
Many of the same ideas apply...
• Planning and policy
• Risk assessment
• Risk management
  • (knowing that we can’t save everything)
• Materials quality matters!
• Problem discovery and remediation
• Crisis management
• Chief problems: staff, $$$, organizational commitment
Photo: “Where I Teach”
http://www.flickr.com/photos/eklektikos/2541408630/
Todd Ehlers / CC-BY 2.0
Planning and assessment tools

Photo: “Happy Easter, to my Peeps”
http://www.flickr.com/photos/76074333@N00/449028423/
WorldIslandInfo.com / CC-BY 2.0
Scene-setting

• Rosenthal, David, et al. “Requirements for Digital Preservation Systems: A Bottom-Up Approach.”
  • http://www.dlib.org/dlib/november05/rosenthal/11rosenthal.html
• If you’re new to this, or trying to find your feet, this is the best short introduction I know.
  • The list of threats is outstanding.

Photo: “Bottoms Up! - Duck; San Anton Gardens, Malta”
http://www.flickr.com/photos/foxypar4/3123113762/
John Haslam / CC-BY 2.0
TRAC
• “Trustworthy Repositories Audit & Certification: Criteria and Checklist”
• Despite the name, covers a LOT more than the technology!
  • Budget
  • Staffing
  • “designated communities”
• CRL will audit you, if you like
  • (don’t, unless you’re really serious!)
• http://catalog.crl.edu/record=b2212602~S1
DRAMBORA
• Digital Repository Audit Method Based on Risk Assessment
• A “self-test,” if you will.
  • DRAMBORA is equally good as a pre- or post-test.
• Personally, I prefer DRAMBORA to TRAC, especially for those just starting out.
• http://www.repositoryaudit.eu/
  • (registration required for toolkit access)
Coping with file formats

Photo: “Happy Easter, to my Peeps”
http://www.flickr.com/photos/76074333@N00/449028423/
WorldIslandInfo.com / CC-BY 2.0
The one acronym you need to know: FITS
• “File Information Tool Set”
  • (you need to know this; otherwise it’s hard to Google)
• Wrapper for several file-format detector software packages
• Intended to be baked into other software
• It’s early days yet!
  • (This means you can’t always trust what the tools tell you, especially when they’re telling you about errors.)
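
To make the “wrapper for several detectors” idea concrete, here is a minimal Python sketch. It is not FITS and does not use the FITS API; the two toy detectors, the `MAGIC` table, and the `identify` function are all made up for illustration. The point it tries to show is why a wrapper helps: when detectors disagree, you find out.

```python
import mimetypes
from collections import Counter

# Hypothetical, tiny signature table; real detectors know far more formats.
MAGIC = {
    b"%PDF-": "application/pdf",
    b"\x89PNG\r\n\x1a\n": "image/png",
    b"\xff\xd8\xff": "image/jpeg",
}

def detect_by_extension(name, data):
    """Toy detector 1: trust the file name."""
    guess, _encoding = mimetypes.guess_type(name)
    return guess

def detect_by_magic(name, data):
    """Toy detector 2: trust the first few bytes."""
    for signature, mime in MAGIC.items():
        if data.startswith(signature):
            return mime
    return None

def identify(name, data, detectors=(detect_by_extension, detect_by_magic)):
    """Run every detector and report the most common answer plus any disagreement."""
    votes = Counter()
    for detector in detectors:
        answer = detector(name, data)
        if answer:
            votes[answer] += 1
    if not votes:
        return {"format": None, "conflict": False, "votes": {}}
    best, _count = votes.most_common(1)[0]
    return {"format": best, "conflict": len(votes) > 1, "votes": dict(votes)}
```

A file named `scan.pdf` whose bytes start with the PNG signature comes back with `conflict` set, which is exactly the case where you should not trust any single tool.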
What’s this file?

• wotsit.org “The Programmer’s File and Data Resource”
• Directory of file extensions
• When in doubt: open in a browser or text editor and see what you get.
  • N.b.: Microsoft Word is NOT a text editor!
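
If you would rather not open a mystery file in anything at all, a rough equivalent of “see what you get” is to look at its first few bytes. The sketch below is an illustration only; the `peek` function and its 64-byte default are assumptions, and the text-or-binary guess is a crude heuristic.

```python
import codecs

def peek(data: bytes, limit: int = 64) -> dict:
    """Summarize the first `limit` bytes of a file's contents."""
    head = data[:limit]
    try:
        # An incremental decoder tolerates a multi-byte character cut off at the end.
        codecs.getincrementaldecoder("utf-8")().decode(head, final=False)
        looks_like_text = b"\x00" not in head
    except UnicodeDecodeError:
        looks_like_text = False
    # Show printable ASCII as-is and everything else as a dot.
    preview = "".join(chr(b) if 32 <= b < 127 else "." for b in head)
    return {"looks_like_text": looks_like_text, "preview": preview}
```

In practice you would feed it something like `open(path, "rb").read(64)`. XML, HTML, and CSV tend to show up as readable text; zip-based and other binary formats tend to show a short signature followed by dots.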
Solving the geographic distribution problem

Photo: “Happy Easter, to my Peeps”
http://www.flickr.com/photos/76074333@N00/449028423/
WorldIslandInfo.com / CC-BY 2.0
What problem, now?
• The “all your eggs in one basket” problem.
  • If all your bits are on one server, and the server room is flooded, or your town is nuked: oops.
• Not the same as backups!
  • Don’t get me wrong, backups are important!
  • Backups are SHORT-TERM, and usually LOCAL. Geographic distribution (plus associated auditing) is intended for the long term.
  • Don’t forget auditing!
Photo: “Nido”
http://www.flickr.com/photos/italintheheart/3679974298/
Jorge Elías / CC-BY 2.0
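
What “auditing” can look like at its simplest: record a checksum for every file once, then recompute and compare on a schedule. This is a minimal sketch under assumptions, not any particular tool’s method; `build_manifest` and `audit` are hypothetical names, and SHA-256 is just one reasonable choice of checksum.

```python
import hashlib
from pathlib import Path

def sha256_of(path: Path) -> str:
    """Checksum a file in chunks so large files do not have to fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()

def build_manifest(root: Path) -> dict:
    """Map each file's path (relative to root) to its checksum."""
    return {
        path.relative_to(root).as_posix(): sha256_of(path)
        for path in sorted(root.rglob("*"))
        if path.is_file()
    }

def audit(root: Path, manifest: dict) -> dict:
    """Return only the problems: files that are missing or whose bytes changed."""
    problems = {}
    for relative, expected in manifest.items():
        path = root / relative
        if not path.is_file():
            problems[relative] = "missing"
        elif sha256_of(path) != expected:
            problems[relative] = "changed"
    return problems
```

Keep the manifest somewhere other than the storage it describes; a manifest that drowns with the server room audits nothing.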
LOCKSS
• Lots of Copies Keep Stuff Safe!
  • (There is also Portico, but Portico only works with e-journal content.)
  • Open-source software that handles replication and (some) auditing.
• “Private LOCKSS network”
  • A group of institutions agrees to build a LOCKSS network just for the stuff they’re interested in.
  • ASERL does this for ETDs. Many institutions (including UW-Madison) participate in a PLN for govdocs.
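
Why lots of copies help with auditing, in toy form: if several sites hold the same object, the copies can be compared and the odd one out repaired from the others. The sketch below is only an illustration of that idea with a simple majority vote. It is not the LOCKSS protocol or its software, and `find_damaged` and the node names are invented.

```python
import hashlib
from collections import Counter

def find_damaged(copies):
    """copies maps a node name to the bytes that node holds for one object."""
    digests = {node: hashlib.sha256(data).hexdigest() for node, data in copies.items()}
    tally = Counter(digests.values())
    winner, count = tally.most_common(1)[0]
    if count <= len(copies) / 2:
        # No strict majority: nobody can be trusted, so flag every copy.
        return {"consensus": None, "suspect": sorted(copies)}
    suspect = sorted(node for node, digest in digests.items() if digest != winner)
    return {"consensus": winner, "suspect": suspect}
```

With only two copies a disagreement tells you something is wrong but not which copy is right, which is one argument for more than two.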
“The cloud”
• Typical cloud-based storage services make NO promises they won’t lose your stuff.
  • And for large quantities of data, bandwidth can become an issue.
  • And can they look at your stuff? Should they be able to?
• Some early movers in this market fading
  • Iron Mountain had to kill their service.
• DuraCloud
  • trying to finesse this issue by negotiating tougher SLAs with cloud-storage providers
Photo: “Sky View From Humboldt Park”
http://www.flickr.com/photos/purpleslog/2589612577/
Purple Slog / CC-BY 2.0
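
To put a number on “bandwidth can become an issue,” here is a back-of-the-envelope calculation in Python. The figures are assumptions picked for illustration (decimal terabytes, a link that sustains its full rated speed), so real transfers will be slower.

```python
def transfer_days(terabytes: float, megabits_per_second: float) -> float:
    """Days needed to move the data over a link running flat out."""
    bits = terabytes * 1e12 * 8                      # decimal terabytes to bits
    seconds = bits / (megabits_per_second * 1e6)     # link speed in bits per second
    return seconds / 86_400                          # seconds in a day

# 10 TB over a 100 Mbit/s link: 8e13 bits / 1e8 bits per second = 800,000 seconds,
# which is a little over 9 days.
print(round(transfer_days(10, 100), 1))
```

The same arithmetic applies in reverse when you need everything back in a hurry.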
Repository and digital-library platforms

Photo: “Happy Easter, to my Peeps”
http://www.flickr.com/photos/76074333@N00/449028423/
WorldIslandInfo.com / CC-BY 2.0
Friendly word of advice:

PICK SOFTWARE LAST.

Photo: “Briana Calderon; future educator of america.”
http://www.flickr.com/photos/46132085@N03/4703617843/
Arielle Calderon / CC-BY 2.0
Another friendly word of advice:

DON’T CHASE THE SHINY.

Photo: “Sparkle Texture”
http://www.flickr.com/photos/abbylanes/3214921616/
Abby Lane / CC-BY 2.0
Digital-library software
• Is almost always VERY BAD at digital preservation!
  • (most packages don’t even try!)
  • So if a file gets corrupted on the server, or whatever... no warnings, no restore, nothing. Also, provenance? Who needs provenance? Event tracking? What’s that?
• I’m not saying don’t use it. I’m saying that it doesn’t solve this problem.
  • In fact, if you’re using this software, you need to solve this problem FOR IT.
Photo: “National DIGITAL Library”
http://www.flickr.com/photos/schex/193912573/
Jesse Schexnayder / CC-BY 2.0
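
One hedged picture of solving the problem FOR your software: keep your own event log next to the platform, so every ingest and every fixity check leaves a trace. Everything named below (`record_event`, `check_fixity`, `history`, the event fields) is hypothetical and deliberately minimal; it is not modeled on any specific package or metadata standard.

```python
import hashlib
from datetime import datetime, timezone

def record_event(log, object_id, event_type, outcome, detail=""):
    """Append one timestamped event to the log and return it."""
    event = {
        "object": object_id,
        "type": event_type,
        "outcome": outcome,
        "detail": detail,
        "when": datetime.now(timezone.utc).isoformat(),
    }
    log.append(event)
    return event

def check_fixity(log, object_id, data, expected_sha256):
    """Compare the bytes against the recorded checksum and log the result."""
    actual = hashlib.sha256(data).hexdigest()
    ok = actual == expected_sha256
    record_event(log, object_id, "fixity check", "pass" if ok else "fail", detail=actual)
    return ok

def history(log, object_id):
    """Everything that has ever happened to one object, oldest first."""
    return [event for event in log if event["object"] == object_id]
```

A failed check is the warning the platform never gave you, and the history is the beginning of provenance.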
Examples

• ContentDM: http://contentdm.com/
• Omeka: http://omeka.org/
• Greenstone: http://greenstone.org/
Institutional-repository software

• Is SHOCKINGLY bad at digital preservation!
  • (Though sometimes better than most DL software.)
• Examples
  • Hosted/commercial: Digital Commons (BePress), ContentDM, DigiTool
  • If you go hosted, you’d better ask about their digital-preservation practices!
  • Open-source: EPrints, DSpace, Fedora
Photo: “IMG_0668”
http://www.flickr.com/photos/12967790@N00/66531124
Robert / CC-BY 2.0
A new approach: curation microservices

Photo: “Happy Easter, to my Peeps”
http://www.flickr.com/photos/76074333@N00/449028423/
WorldIslandInfo.com / CC-BY 2.0
Do we really need THE BLOB?

Photo: “giant crystal blob”
http://www.flickr.com/photos/a_of_doom/527905701/
A of DooM / CC-BY 2.0
How about a jigsaw puzzle instead?
• Break the digital-preservation problem down into parts.
• Code up each part, making sure that it plays nicely with other parts.
  • lots of nice APIs!
  • which means other software can adopt/adapt microservices as well!
• Put parts together as you need them.
Photo: “Lapsana Apogonoides Puzzle”
http://www.flickr.com/photos/gdesigneralex/2313092112/
gdesigneralex / CC-BY 2.0
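
The jigsaw idea in miniature: each piece does one job, every piece takes a record and returns a record, and you assemble only the pieces you need. This is a sketch of the pattern under assumptions, not anyone’s actual microservices; the three services and the `pipeline` helper are invented for illustration.

```python
import hashlib
import mimetypes
from functools import reduce

# Each "service" takes a record (a dict) and returns a new record with one thing added.
def add_checksum(record):
    return {**record, "sha256": hashlib.sha256(record["data"]).hexdigest()}

def add_size(record):
    return {**record, "size": len(record["data"])}

def add_media_type(record):
    guess, _encoding = mimetypes.guess_type(record["name"])
    return {**record, "media_type": guess}

def pipeline(*services):
    """Fit the chosen pieces together into one callable."""
    def run(record):
        return reduce(lambda current, service: service(current), services, record)
    return run

# Put parts together as you need them.
full_ingest = pipeline(add_checksum, add_size, add_media_type)
quick_inventory = pipeline(add_size)
```

Because the pieces share one small interface, another program can borrow `add_checksum` without taking the rest.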
California Digital Library

• Pioneering this approach
• Has open-sourced code for microservices
• Has added microservices together to build its “Merritt” storage/repository service
Escaping the silos: Fedora Commons

Photo: “Happy Easter, to my Peeps”
http://www.flickr.com/photos/76074333@N00/449028423/
WorldIslandInfo.com / CC-BY 2.0
What is Fedora Commons?
• Blueprints and foundation, not the whole house (analogy credit to Peter Gorman)
• You build the house you want!
• Or you build condominiums on the same foundation.
  • Need different user interfaces for different materials?
  • Need different structures and behaviors?
  • No problem! Fedora can handle that.
• (have I run this analogy into the ground yet?)
We had this...




                 Diagram courtesy of Peter Gorman.
We are building this.




                 Diagram courtesy of Peter Gorman.
E-records management

Photo: “Happy Easter, to my Peeps”
http://www.flickr.com/photos/76074333@N00/449028423/
WorldIslandInfo.com / CC-BY 2.0
Axioms
• Records management is about policy and procedures.
  • If your policy doesn’t fit with their procedures, guess what wins? Choose battles wisely.
• There is never enough storage space.
• Nobody cares until there’s a crisis.
• Software will not save you... but it might help!
Photo: “The Never Ending Math Problem”
http://www.flickr.com/photos/acidwashphotography/2967752733/
d3 Dan / CC-BY 2.0
Duke Data Accessioner

• Accessioning tool for digital data
  • use case: J. Important Scholar dumps her hard drive on your desk, expects you to cope
• File migrator, metadata manager, GUI, plugins (e.g. for file-format detection)
• Bit rough, but in production use.
  • http://library.duke.edu/uarchives/about/tools/data-accessioner.html
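
For a feel of what the “file migrator” part of an accessioning tool has to do, here is a minimal sketch: copy everything off the donated drive, check that each copy matches its source, and keep a manifest. It is not the Data Accessioner’s code or behavior; the `accession` function, the manifest fields, and the choice of SHA-256 are assumptions made for illustration.

```python
import hashlib
import shutil
from pathlib import Path

def digest(path: Path) -> str:
    """SHA-256 of a file (reads it whole, which is fine for a sketch)."""
    return hashlib.sha256(path.read_bytes()).hexdigest()

def accession(source: Path, dest: Path) -> list:
    """Copy every file under source into dest, verifying each copy."""
    manifest = []
    for original in sorted(p for p in source.rglob("*") if p.is_file()):
        relative = original.relative_to(source)
        target = dest / relative
        target.parent.mkdir(parents=True, exist_ok=True)
        before = digest(original)
        shutil.copy2(original, target)   # copy2 also tries to keep timestamps
        if digest(target) != before:
            raise IOError(f"copy of {relative} does not match its source")
        manifest.append({
            "path": relative.as_posix(),
            "sha256": before,
            "bytes": target.stat().st_size,
        })
    return manifest
```

The manifest is the part worth keeping forever: it is your evidence of what arrived and what it looked like on day one.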
Archivematica

• Soup-to-nuts records management and digital preservation tool.
  • Evaluation and accessioning all the way through preservation actions. (Oddly, they seem to be missing disposal... but they’re in alpha, so...)
• Open source
  • Runs on a Linux server; RMs and archivists log in to GUI application remotely.
• Normally I hate and fear silos, but this one is smartly built on microservices.
Practical E-Records
• Weblog by Chris Prom and protégés
• Tool evaluations, conference-session writeups, essays on praxis
• Best reading out there for the do-it-yourselfer
• If you’re not reading it, why not?
• http://e-records.chrisprom.com/
Last thoughts

Photo: “Happy Easter, to my Peeps”
http://www.flickr.com/photos/76074333@N00/449028423/
WorldIslandInfo.com / CC-BY 2.0
If you can’t do everything... that’s okay. Who can?

Image: “Confused”
http://www.flickr.com/photos/kristiand/3223044657/
Kristian D. / CC-BY 2.0
DO SOMETHING.

Photo: “Came hame háááá!”
http://www.flickr.com/photos/kristiand/3223044657/
Guirí R. Reyes / CC-BY 2.0
The worst threat? INACTION.

Photo: “Fatty’s role model”
http://www.flickr.com/photos/cloudzilla/4910616774/
cloudzilla / CC-BY 2.0
Thank you!

This presentation is available under a Creative Commons 3.0 United States license.

Photo: “Happy Easter, to my Peeps”
http://www.flickr.com/photos/76074333@N00/449028423/
WorldIslandInfo.com / CC-BY 2.0

Taming the Monster: Digital Preservation Planning and Implementation Tools

  • 1. Taming the Monster Digital Preservation Planning and Implementation Tools Dorothea Salo Photo: ā€œHappy Easter, to my Peepsā€ http://www.ļ¬‚ickr.com/photos/76074333@N00/449028423/ One System, One Library WorldIslandInfo.com / CC-BY 2.0 2 June 2011
  • 2. Why is this so scary? Photo: ā€œHappy Easter, to my Peepsā€ http://www.ļ¬‚ickr.com/photos/76074333@N00/449028423/ WorldIslandInfo.com / CC-BY 2.0
  • 3. Isnā€™t this just as scary? Photo: ā€œNews Paper Origami Dragon Monsterā€ http://www.ļ¬‚ickr.com/photos/epsos/3777343342/ epSos.de / CC-BY 2.0
  • 4. Yet we persevere. Photo: ā€œNews Paper Origami Dragon Monsterā€ http://www.ļ¬‚ickr.com/photos/epsos/3777343342/ epSos.de / CC-BY 2.0
  • 5. DIGITAL IS NO DIFFERENT. Photo: ā€œ559 - The Matrix - Seamless Textureā€ http://www.ļ¬‚ickr.com/photos/zooboing/4335531915/ Patrick Hoesly / CC-BY 2.0
  • 6. Many of the same ideas apply... ā€¢ Planning and policy ā€¢ Risk assessment ā€¢ Risk management ā€¢ (knowing that we canā€™t save everything) ā€¢ Materials quality matters! ā€¢ Problem discovery and remediation ā€¢ Crisis management ā€¢ Chief problems: staļ¬€, $$$, organizational commitment Photo: ā€œWhere I Teachā€ http://www.ļ¬‚ickr.com/photos/eklektikos/2541408630/ Todd Ehlers / CC-BY 2.0
  • 7. Planning and assessment tools Photo: ā€œHappy Easter, to my Peepsā€ http://www.ļ¬‚ickr.com/photos/76074333@N00/449028423/ WorldIslandInfo.com / CC-BY 2.0
  • 8. Scene-setting ā€¢ Rosenthal, David. ā€œRequirements for Digital Preservation: a Bottom-Up Approach.ā€ ā€¢ http://www.dlib.org/dlib/november05/rosenthal/ 11rosenthal.html ā€¢ If youā€™re new to this, or trying to ļ¬nd your feet, this is the best short introduction I know. ā€¢ The list of threats is outstanding. Photo: ā€œBottoms Up! - Duck; San Anton Gardens, Maltaā€ http://www.ļ¬‚ickr.com/photos/foxypar4/3123113762/ John Haslam / CC-BY 2.0
  • 9. TRAC ā€¢ ā€œTrusted Repository Audit Checklistā€ ā€¢ Despite the name, covers a LOT more than the technology! ! ā€¢ Budget ā€¢ Staļ¬ƒng ā€¢ ā€œdesignated communitiesā€ ā€¢ CRL will audit you, if you like ā€¢ (donā€™t, unless youā€™re really serious!) ā€¢ http://catalog.crl.edu/record=b2212602~S1
  • 10. DRAMBORA ā€¢ Digital Repository Audit Method Based on Risk Assessment ā€¢ A ā€œself-test,ā€ if you will. ā€¢ DRAMBORA is equally good as a pre- or post-test. ā€¢ Personally, I prefer DRAMBORA to TRAC, ! especially for those just starting out. ā€¢ http://www.repositoryaudit.eu/ ā€¢ (registration required for toolkit access)
  • 11. Coping with ļ¬le formats Photo: ā€œHappy Easter, to my Peepsā€ http://www.ļ¬‚ickr.com/photos/76074333@N00/449028423/ WorldIslandInfo.com / CC-BY 2.0
  • 12. The one acronym you need to know: FITS ā€¢ ā€œFile Information Tool Setā€ ā€¢ (you need to know this; otherwise itā€™s hard to Google) ā€¢ Wrapper for several ļ¬le-format detector software packages ā€¢ Intended to be baked into other software ā€¢ Itā€™s early days yet! ā€¢ (This means you canā€™t always trust what the tools tell you, especially when theyā€™re telling you about errors.)
  • 13. Whatā€™s this ļ¬le? ā€¢ wotsit.org ā€œThe Programmerā€™s File and Data Resourceā€ ā€¢ Directory of ļ¬le extensions ā€¢ When in doubt: open in a browser or text editor and see what you get. ā€¢ N.b.: Microsoft Word is NOT a text editor!
  • 14. Solving the geographic distribution problem Photo: ā€œHappy Easter, to my Peepsā€ http://www.ļ¬‚ickr.com/photos/76074333@N00/449028423/ WorldIslandInfo.com / CC-BY 2.0
  • 15. What problem, now? ā€¢ The ā€œall your eggs in one basketā€ problem. ā€¢ If all your bits are on one server, and the server room is ļ¬‚ooded, or your town is nukedā€”oops. ā€¢ Not the same as backups! ā€¢ Donā€™t get me wrong, backups are important! ā€¢ Backups are SHORT-TERM, and usually LOCAL. Geographic distribution (plus associated auditing) is intended for the long term. ā€¢ Donā€™t forget auditing! Photo: ā€œNidoā€ http://www.ļ¬‚ickr.com/photos/italintheheart/3679974298/ Jorge ElĆ­as / CC-BY 2.0
  • 16. LOCKSS ā€¢ Lots of Copies Keeps Stuļ¬€ Safe! ā€¢ (There is also Portico, but Portico only works with eā€‘journal content.) ā€¢ Open-source software that handles replication and (some) auditing. ā€¢ ā€œPrivate LOCKSS networkā€ ā€¢ A group of institutions agrees to build a LOCKSS network just for the stuļ¬€ theyā€™re interested in. ā€¢ ASERL does this for ETDs. Many institutions (including UW-Madison) participate in a PLN for govdocs.
  • 17. “The cloud” • Typical cloud-based storage services make NO promises they won’t lose your stuff. • And for large quantities of data, bandwidth can become an issue. • And can they look at your stuff? Should they be able to? • Some early movers in this market are fading • Iron Mountain had to kill their service. • DuraCloud • trying to finesse this issue by negotiating tougher SLAs with cloud-storage providers Photo: “Sky View From Humboldt Park” http://www.flickr.com/photos/purpleslog/2589612577/ Purple Slog / CC-BY 2.0
  • 18. Repository and digital-library platforms Photo: “Happy Easter, to my Peeps” http://www.flickr.com/photos/76074333@N00/449028423/ WorldIslandInfo.com / CC-BY 2.0
  • 19. Friendly word of advice: PICK SOFTWARE LAST. Photo: “Briana Calderon; future educator of america.” http://www.flickr.com/photos/46132085@N03/4703617843/ Arielle Calderon / CC-BY 2.0
  • 20. Another friendly word of advice: DON’T CHASE THE SHINY. Photo: “Sparkle Texture” http://www.flickr.com/photos/abbylanes/3214921616/ Abby Lane / CC-BY 2.0
  • 21. Digital-library software • Is almost always VERY BAD at digital preservation! • (most packages don’t even try!) • So if a file gets corrupted on the server, or whatever... no warnings, no restore, nothing. Also, provenance? Who needs provenance? Event tracking? What’s that? • I’m not saying don’t use it. I’m saying that it doesn’t solve this problem. • In fact, if you’re using this software, you need to solve this problem FOR IT. Photo: “National DIGITAL Library” http://www.flickr.com/photos/schex/193912573/ Jesse Schexnayder / CC-BY 2.0
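One way to "solve this problem for it" is to keep fixity checks and an event history outside the digital-library software. Here is a minimal sketch, assuming an append-only JSON Lines log; the field names are hypothetical and only loosely inspired by the kind of event record that preservation metadata calls for, not taken from any standard schema or from the packages named in this talk.

```python
import hashlib
import json
from datetime import datetime, timezone
from pathlib import Path


def record_fixity_event(file_path: Path, expected_sha256: str, log_path: Path) -> dict:
    """Check one file against its expected checksum and log the outcome.

    Reads the whole file into memory, which is fine for a sketch but not
    for very large files.
    """
    actual = hashlib.sha256(file_path.read_bytes()).hexdigest()
    event = {
        "event_type": "fixity check",  # hypothetical field names throughout
        "object": str(file_path),
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "outcome": "pass" if actual == expected_sha256 else "fail",
        "expected_sha256": expected_sha256,
        "actual_sha256": actual,
    }
    # Append one JSON object per line so the history is never overwritten.
    with log_path.open("a", encoding="utf-8") as log:
        log.write(json.dumps(event) + "\n")
    return event
```

A "fail" outcome is the warning the slide says is missing, and the accumulated log doubles as a rough event-tracking record for each object.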
  • 22. Examples • ContentDM: http://contentdm.com/ • Omeka: http://omeka.org/ • Greenstone: http://greenstone.org/
  • 23. Institutional-repository software • Is SHOCKINGLY bad at digital preservation! • (Though sometimes better than most DL software.) • Examples • Hosted/commercial: Digital Commons (BePress), ContentDM, DigiTool • If you go hosted, you’d better ask about their digital-preservation practices! • Open-source: EPrints, DSpace, Fedora Photo: “IMG_0668” http://www.flickr.com/photos/12967790@N00/66531124 Robert / CC-BY 2.0
  • 24. A new approach: curation microservices Photo: “Happy Easter, to my Peeps” http://www.flickr.com/photos/76074333@N00/449028423/ WorldIslandInfo.com / CC-BY 2.0
  • 25. Do we really need THE BLOB? Photo: “giant crystal blob” http://www.flickr.com/photos/a_of_doom/527905701/ A of DooM / CC-BY 2.0
  • 26. How about a jigsaw puzzle instead? • Break the digital-preservation problem down into parts. • Code up each part, making sure that it plays nicely with other parts. • lots of nice APIs! • which means other software can adopt/adapt microservices as well! • Put parts together as you need them. Photo: “Lapsana Apogonoides Puzzle” http://www.flickr.com/photos/gdesigneralex/2313092112/ gdesigneralex / CC-BY 2.0
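The jigsaw idea can be sketched as small functions that each do one job and share one simple interface, so they can be put together as needed. This is a toy illustration, not the actual code of any microservices project; every name here (`Service`, `add_checksum`, `identify_format`, `pipeline`) is hypothetical.

```python
import hashlib
from typing import Callable

# A "service" takes a record (a dict describing one object) and returns an
# updated copy. Sharing this one signature is what lets the pieces combine.
Service = Callable[[dict], dict]


def add_checksum(record: dict) -> dict:
    """Fixity piece: attach a SHA-256 digest of the content."""
    return {**record, "sha256": hashlib.sha256(record["content"]).hexdigest()}


def add_size(record: dict) -> dict:
    """Inventory piece: attach the content length in bytes."""
    return {**record, "size_bytes": len(record["content"])}


def identify_format(record: dict) -> dict:
    """Characterization piece: a very rough guess from leading magic bytes."""
    content = record["content"]
    if content.startswith(b"%PDF"):
        fmt = "pdf"
    elif content.startswith(b"\x89PNG"):
        fmt = "png"
    else:
        fmt = "unknown"
    return {**record, "format": fmt}


def pipeline(*services: Service) -> Service:
    """Put the chosen pieces together, in order, into one service."""
    def run(record: dict) -> dict:
        for service in services:
            record = service(record)
        return record
    return run


if __name__ == "__main__":
    ingest = pipeline(add_checksum, add_size, identify_format)
    print(ingest({"name": "report.pdf", "content": b"%PDF-1.4 example"}))
```

Because `pipeline` itself returns a `Service`, assembled pieces can be reused as parts of larger assemblies, which is the property that lets other software adopt or adapt individual pieces.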
  • 27. California Digital Library • Pioneering this approach • Has open-sourced code for microservices • Has added microservices together to build its “Merritt” storage/repository service
  • 28. Escaping the silos: Fedora Commons Photo: “Happy Easter, to my Peeps” http://www.flickr.com/photos/76074333@N00/449028423/ WorldIslandInfo.com / CC-BY 2.0
  • 29. What is Fedora Commons? • Blueprints and foundation, not the whole house (analogy credit to Peter Gorman) • You build the house you want! • Or you build condominiums on the same foundation. • Need different user interfaces for different materials? • Need different structures and behaviors? • No problem! Fedora can handle that. • (have I run this analogy into the ground yet?)
  • 30. We had this... Diagram courtesy of Peter Gorman.
  • 31. We are building this. Diagram courtesy of Peter Gorman.
  • 32. E-records management Photo: “Happy Easter, to my Peeps” http://www.flickr.com/photos/76074333@N00/449028423/ WorldIslandInfo.com / CC-BY 2.0
  • 33. Axioms • Records management is about policy and procedures. • If your policy doesn’t fit with their procedures, guess what wins? Choose battles wisely. • There is never enough storage space. • Nobody cares until there’s a crisis. • Software will not save you... but it might help! Photo: “The Never Ending Math Problem” http://www.flickr.com/photos/acidwashphotography/2967752733/ d3 Dan / CC-BY 2.0
  • 34. Duke Data Accessioner • Accessioning tool for digital data • use case: J. Important Scholar dumps her hard drive on your desk, expects you to cope • File migrator, metadata manager, GUI, plugins (e.g. for file-format detection) • Bit rough, but in production use. • http://library.duke.edu/uarchives/about/tools/data-accessioner.html
  • 35. Archivematica • Soup-to-nuts records management and digital preservation tool. • Evaluation and accessioning all the way through preservation actions. (Oddly, they seem to be missing disposal... but they’re in alpha, so...) • Open source • Runs on a Linux server; RMs and archivists log in to GUI application remotely. • Normally I hate and fear silos, but this one is smartly built on microservices.
  • 36. Practical E-Records • Weblog by Chris Prom and protégés • Tool evaluations, conference-session writeups, essays on praxis • Best reading out there for the do-it-yourselfer • If you’re not reading it, why not? • http://e-records.chrisprom.com/
  • 37. Last thoughts Photo: “Happy Easter, to my Peeps” http://www.flickr.com/photos/76074333@N00/449028423/ WorldIslandInfo.com / CC-BY 2.0
  • 38. If you can’t do everything... that’s okay. Who can? Image: “Confused” http://www.flickr.com/photos/kristiand/3223044657/ Kristian D. / CC-BY 2.0
  • 39. DO SOMETHING. Photo: “Came hame háááá!” http://www.flickr.com/photos/kristiand/3223044657/ Guirí R. Reyes / CC-BY 2.0
  • 40. The worst threat? INACTION. Photo: “Fatty’s role model” http://www.flickr.com/photos/cloudzilla/4910616774/ cloudzilla / CC-BY 2.0
  • 41. Thank you! This presentation is available under a Creative Commons 3.0 United States license. Photo: “Happy Easter, to my Peeps” http://www.flickr.com/photos/76074333@N00/449028423/ WorldIslandInfo.com / CC-BY 2.0