Digital Preservation at Norfolk Record Office
A report prepared by Pawel Jaskulski (Digital Preservation trainee)
for Gary Tuson (County Archivist).
March 11th, 2016
Executive Summary
Digital Preservation strategy at Norfolk Record Office evolved from the archive’s own
active interest in the emerging domain and from the anticipated need to integrate
the accessioning of digitally born archives into the regular archival processing
framework.
The launch of the Norfolk Sound Archive in 2003 and Norfolk Record Office’s (NRO)
participation in the Skills for the Future programme signalled a strong commitment to
building expertise within the field of digital technologies. Senior archivists, with support
from Norfolk County Hall’s ICT services, have over time developed a bitstream
preservation capability with elements of a ‘parsimonious approach’ to digital
preservation.
The archive approved the first version of its Digital Preservation Policy in 2007, in
which it addressed the need to advance its digital preservation strategy to include
format migration pathways and clearly defined accessioning workflows. NRO’s
influential role within the East of England Regional Archive Council led to a regional pilot
project (currently at its proof of concept stage) employing a cloud-hosted instance of
Archivematica connected to a cloud storage system provided by Arkivum.
Acknowledgement
I would like to thank my line manager Ian Palfrey (Senior Archivist/Collection
Management) and Gary Tuson (County Archivist) for guiding me in the process of
completing this report.
Contents
1. Introduction
2.1 Background
2.2 Wider Regional Context
2.3 NRO Digital Preservation Policy
2.4 Requirements
2.5 Parsimonious Approach – Bitstream Preservation
2.6 Towards Logical Preservation
3. Conclusions
4. Recommendations
References
Appendices
1. Introduction
The aim of this report is to survey approaches to digital preservation at Norfolk
Record Office in order to make recommendations for future improvements. Digital
preservation has now become one of the key priorities for the NRO service plan.
The archive has been involved in the Skills for the Future traineeship scheme since
2014, signalling its commitment to filling the skills gaps within the sector. This, together
with its contribution to collaborative working within the East of England Regional
Archive Council (EERAC), demonstrates NRO’s strong dedication to developing a
digital preservation strategy for the region and within the institutional context of a local
authority archive service.
The report will trace the evolution of interest in digital preservation, from its original
concerns about the need to develop a strategic plan for preservation of electronic
records to the latest developments with EERAC and its current pilot project.
2.1 Background
Norfolk Record Office collects and preserves unique archives relating to the history
of Norfolk and makes them accessible to as wide a range of people as possible. It is
a joint service of Norfolk County Council and the District Councils of Norfolk and is
democratically accountable via the joint Norfolk Records Committee.1 NRO is located
at The Archive Centre in Norwich with additional services operating from Norfolk
Heritage Centre at Norwich Millennium Library and King's Lynn Borough Archive.
In April 2003 the record office launched the Norfolk Sound Archive with the purpose of
collecting, preserving and providing public access to sound recordings relevant to life in
Norfolk. Its remit is to:
• Provide information on holdings and access to original recordings
• Preserve sound recordings
• Locate existing sound recordings that are worth preserving for the future
• Provide support and training for on-going and new oral history projects
• Promote the use of sound recordings, particularly within education
• Link to organisations that also hold collections of sound recordings relating to
Norfolk and that are carrying out oral history work in the county2
Overall, NRO incorporates three repositories: Norfolk Record Office, Norfolk Sound
Archive and King’s Lynn, with a variety of holdings and record types in its custody.3
2.2 Wider Regional Context
The creation of Regional Archives Councils for the nine English Regions in 1999 led
to the publication of a series of regional archive strategy documents. The report
created for East of England Regional Archive Council (EERAC) in 2003 sets out the
aims of the Preserving the Present for the Future project:
This project encompasses a whole range of issues concerning records
management and the preservation of electronic records and aims to ensure
that contemporary records in all forms – public and private - are both properly
managed now and preserved for the future. It will involve creating the
infrastructure and building confidence to turn theoretical knowledge into
practical action across the Region. It will also seek to identify collaborative
solutions for the preservation of digital data.4
1
Norfolk County Council has two Joint Committees:
http://www.norfolk.gov.uk/Council_and_Democracy/Our_budget_and_council_tax/Statement_of_accounts/NCC152976, accessed 09/03/2016
2
Norfolk Sound Archive has a well-established digitisation workflow based on the work of the British Library
Sound Archive.
3
Norfolk Record Office Archive Collections http://www.archives.norfolk.gov.uk/Archive-Collections/index.htm;
Additionally, Norfolk Sound Archive collections feature in Directory of UK Sound Collections:
http://www.bl.uk/projects/uk-sound-directory, accessed 09/03/2016
With the archive sector development strategy clearly stated, the next important step
forward was the East of England Digital Preservation Regional Pilot Project, which took
place between August 2004 and March 2005.5 Although NRO was not directly involved
in the project, it participated as an observer.6 The test bed project aimed to improve
understanding of the processes and costs of preserving digitised material by
assessing the feasibility of outsourcing specialist services on a regional basis to the UK
Data Archive based at the University of Essex. Among the recommendations were
two lessons learned from the project that remain relevant within the context of digital
preservation at a local authority archive:
- Need to consider further modelling of costs and benefits linked with the three
scenarios of: in-house provision, working through consortia and contracting
out;
- Need to develop and clarify the OAIS model to make it more intelligible and
better aligned with the more conventional terminology applied to archive
administration and records management.7
2.3 NRO Digital Preservation Policy
In July 2007 the NRO approved a Digital Preservation Policy. It is currently
undergoing a revision and will be supported by related documents: Digital Records
Accessioning Checklist and Advice to Creators of Digital Records.8 The policy
recognises the urgent need to develop an OAIS-compliant digital
preservation strategy, especially in view of the Norfolk Sound Archive expanding
its activities. Key points from the policy are:
1.6 The NRO (and NSA) expect to receive for appraisal and preservation an
increasing number of ‘born-digital’ records, mainly, in the first instance, from
private organisations, groups and individuals, which due to their functionality
or quantity cannot be preserved by printing out hard copies. This expectation
is based on contact with depositors and with colleagues in other local
government archive services.
4
Chris Pickford, Eastern Promise – A Strategy for Archival Development in the East of England, EERAC 2003, p.
33
5
Digital Archives Regional Pilot profile on UK Data Archive http://www.data-archive.ac.uk/about/projects/darp,
accessed 09/03/2016
6
The Chairman of EERAC at the time was John Alban, Norfolk County Archivist, who co-wrote the foreword to the Report of the
East of England Digital Preservation Regional Pilot Project, MLA East of England and EERAC 2006,
http://www.data-archive.ac.uk/media/1680/DARP_finalreport.pdf, accessed 08/03/2016
7
Ibid., p. 43
8
See appendices A and B for draft versions. The Digital Preservation Policy is an internal document and as such
is not publicly available or published online at the moment. An important point of reference for Digital Preservation
Policy is the Archive Collecting Policy http://www.archives.norfolk.gov.uk/view/NCC098771 accessed 09/03/2016
1.7 Digital records, i.e., the type of data, format of carrier and associated
hardware and software needs, are likely to be varied and in many cases
cannot be anticipated.
The appraisal section outlines the scope of digital records collecting policy:
3.2 The NRO will not necessarily accept custody of every digital record offered to
it. Records which in effect we cannot use or whose use imposes
unacceptable costs or conditions on the NRO might not be selected for
retention. Factors which may affect this are:
• the existence of adequate metadata
• the removal or neutralisation of security features
• the provision of a free licensed copy of the original software, if necessary,
to access and maintain the record
The document also sets great store by open (non-proprietary) standards that are ‘to
be employed as far as possible for preservation purposes. Where those standards
are absent, file formats accepted as the industry standard should be used, e.g., TIFF
files.’
2.4 Requirements
A quick survey of digitally born archives received by NRO gives an insight into the
requirements for digital preservation. The total volume of data held within the archive
amounted to 87.8 GB in November 2015, with the majority of the records consisting of
raster and vector images, text documents, and audio and video files, as
shown below (please refer also to Appendix D):
Figure 1: Collection Profile of all Digitally Born Archives at the NRO
Norfolk Sound Archive’s digital assets will soon need 3 TB of storage, and various
digitisation projects occupy similar space on shared network drives.
Improving the accessioning system with respect to digitally born archives is the main
driver for establishing a digital preservation strategy. The aim is to build digital
preservation capabilities so as to be able to choose digital records over their
paper-based equivalents (hard copies/printouts).
2.5 Parsimonious Approach - Bitstream Preservation
NRO has been collecting digitally born archives since 1997 and has gradually
introduced different elements of the ‘parsimonious’ approach to digital preservation,
resulting in a systematic method initiated by the Senior Archivist (Collection
Management).9 All digital accessions were processed manually in the same manner:
- One month quarantine
- Virus scan performed
- Integrity checks conducted by generating checksums
- Digital objects extracted from removable media and transferred to a secure
and designated network drive location
- Technical metadata generated (file count, file sizes, directory and file listings)
- File format identified by creating profiling reports with DROID
- Top level descriptions created in CALM cataloguing system
All metadata generated in the above process is stored alongside the digital records
within the top level directory to which digital objects were transferred.
This process is time-consuming and relies on more than one tool, each creating metadata in
separate files. It will be difficult to sustain in the long term.
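As an illustration only, the technical metadata steps listed above (checksums, file count, file sizes and file listings) could be gathered into a single manifest by one short script. The sketch below is not NRO's actual workflow: the function names, the CSV column names and the choice of SHA-512 are assumptions made for the example, and it does not replace virus scanning or DROID profiling.

```python
import csv
import hashlib
from pathlib import Path


def sha512_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-512 hex digest of a file, read in chunks."""
    digest = hashlib.sha512()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def write_manifest(accession_dir: Path, manifest_path: Path) -> dict:
    """Write one CSV row per file: relative path, size in bytes, checksum.

    Returns a summary with the file count and the total size in bytes.
    The column names are hypothetical, chosen for this example.
    """
    files = sorted(p for p in accession_dir.rglob("*") if p.is_file())
    # Leave out the manifest itself if it is stored inside the accession.
    files = [p for p in files if p.resolve() != manifest_path.resolve()]
    total_bytes = 0
    with manifest_path.open("w", newline="", encoding="utf-8") as out:
        writer = csv.writer(out)
        writer.writerow(["relative_path", "size_bytes", "sha512"])
        for path in files:
            size = path.stat().st_size
            total_bytes += size
            relative = path.relative_to(accession_dir).as_posix()
            writer.writerow([relative, size, sha512_of(path)])
    return {"file_count": len(files), "total_bytes": total_bytes}
```

Calling `write_manifest(Path("accession"), Path("accession/manifest.csv"))` (both paths are placeholders) would keep the listing, sizes and checksums in one file rather than in several tool-specific outputs.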
2.6 Towards Logical Preservation
The service plan for 2014/2015 identifies the lack of compliance with the OAIS
standard, a consideration which has been gaining significance on the NRO Risk
Register.10 With greater understanding of the OAIS functional model and further
research into currently available digital preservation systems (Preservica, Rosetta
etc.), NRO decided to explore Archivematica as the preferred solution, since it offers
normalisation for preservation and access purposes. It acts as an Archival
Information Package creator, managing the workflow from transfer, through ingest to
archival storage and dissemination.
9
Norfolk Record Office Service Plan for 2014-2015 enclosed in Norfolk Records Committee meeting agenda from
Thursday 1 May 2014 http://www.norfolk.gov.uk/download/norfrec010514agendapdf, p. 43, accessed
06/03/2016; Tim Gollins, Parsimonious preservation: preventing pointless processes!,
http://www.nationalarchives.gov.uk/documents/information-management/parsimonious-preservation.pdf,
accessed 06/03/2016
10
Ibid., p. 38
The software was first installed in a test environment (Ubuntu 14.04.4 LTS running
on an HP Linux-compatible machine) and used to process a sample dataset comprising
file formats similar to those accessioned by NRO in the past. After initial results it
became apparent that, in order to fully assess Archivematica’s capabilities, it needed to
be deployed in a production environment, connected with appropriate storage
systems.11
The software is commonly referred to as a pipeline, since in itself it is not a repository
(storage system), nor an access system but a processing pipeline connecting them,
supporting pre-ingest and ingest activities of a digital repository.
Arkivum, a company specialising in digital data archiving, has offered a fully hosted service
(cloud storage integrated with Archivematica) since January 2016.12 In order to reduce
costs and utilise economies of scale, NRO suggested a pilot project to EERAC
members in February 2016, demonstrating why Archivematica is the preferred
system for the ingest of digital records.13 In summary the main advantages are:
- It is OAIS compliant and supports PREMIS metadata schema together with
METS and Dublin Core standards14
- The Archival Information Package is structured in accordance with Library of
Congress BagIt specification15
- It uses The National Archives’ file format registry PRONOM
- It is open source and under active development, with a substantial user
community from across the heritage, arts and academic sectors.16
- It runs a series of configurable micro-processes provided by open source tools
integrated within Archivematica, which can be replaced as technologies
change, fulfilling the requirement of OAIS’ Manage System Configuration
function17
It was discussed with EERAC members that working together could entail shared
resources: cloud-based infrastructure with digital preservation software accessible
through a browser (Archivematica) and linked cloud storage (provided by Arkivum).
The main focus of the project is to evaluate Archivematica as a digital preservation
tool, but it will also look at integrating it with AtoM, an access and cataloguing system
supporting digital objects and developed by the same organisation that created
Archivematica: Artefactual Systems.18
11
Norfolk County Council’s ICT security restrictions prohibited integration with the system.
12
http://arkivum.com
13
Presentation is available on SlideShare: http://www.slideshare.net/PaweJaskulski1/archivematica-and-local-authority-archive-services accessed 09/03/2016
14
PREMIS http://www.loc.gov/standards/premis/, METS http://www.loc.gov/standards/mets/, Dublin Core
http://dublincore.org/, accessed 11/03/2013
15
E-Ark Report on Available Formats and Restrictions: http://www.eark-project.com/resources/project-deliverables/7-e-ark-d41-report-on-available-formats-and-restrictions/file, p. 24, accessed 10/03/2016
16
Archivematica users group forum: https://groups.google.com/forum/#!forum/archivematica
17
Reference Model for an Open Archival Information System (OAIS). Magenta Book. Issue 2. June 2012:
http://public.ccsds.org/publications/archive/650x0m2.pdf, p. 4-12, accessed 11/03/2016
3. Conclusion
The Digital Preservation Policy should also include advocacy and consultancy
aimed at promoting digital archiving in Norfolk, especially amongst community
archives and other likely donors and depositors. Data must be capable of re-use, with
sufficient metadata (sufficiently detailed documentation of chain of custody,
creators, rights and technical provenance: the software and operating system the
files were created in, their file formats etc.).19 Advice on best practice in
electronic record keeping should be embedded within the policy to foster better
understanding of digital preservation concepts among electronic records creators,
donors and depositors.
4. Recommendations
With NRO continuing to explore Archivematica and its application to the archival
processing workflow, this report recommends:
- The NRO improves its Preservation Planning by compiling an Action Plan for
all file formats received within digital accessions. The Action Plan will inform
staff what must be done to normalise a digital object at Ingest into
preservation and/or dissemination formats. This will inform the format migration
strategy. For example, TIFF is the currently preferred preservation format for
images as it is less prone to data loss than other raster image file formats.
NRO is interested in exploring the PNG file format as a preservation format for
certain types of digital records.
- The NRO improves its Preservation Planning by identifying and compiling a
list of Significant Properties per type (text, audio, etc.) to help staff decide
whether a format migration has produced acceptable results, retaining its
authenticity and evidential value.20
- The designated community of the NRO is the general public, which demands
a robust access system and intellectual property rights management
procedures. The NRO should review its designated community, identify
any sub-sets or new communities whose access requirements are specialised
or more complex and demanding, and determine how these can be met.
- The NRO considers how it will add the timely delivery of digital records to
users in its facilities (search room and online access).
- Although not an immediate priority, NRO audits its other digital assets. Please
see Appendix C for suggested actions.
18
The current cataloguing system used by EERAC members is CALM. If the project is successful it would require
migration of the records to a new system supporting digital objects like Access to Memory (AtoM):
https://www.accesstomemory.org. This concern is also shared by the ARCW Digital Preservation Working Group:
http://www.nationalarchives.gov.uk/documents/Cloud-Storage-casestudy_Wales_2015.pdf, accessed 09/03/2016
19
Jenny Mitcham, Preservation of Digital Objects at the Archaeology Data Service; in: Janet Delve and David
Anderson, Preserving Complex Digital Objects, Facet Publishing 2014, p. 50-51
20
Margaret Hedstrom, Christopher A. Lee, Significant properties of digital objects: definitions, applications,
implications, https://www.ils.unc.edu/callee/sigprops_dlm2002.pdf accessed 09/03/2016
Additionally, with a view to NRO and EERAC continuing their collaborative work as a
consortium, this report suggests:
- Archives associated within EERAC occupy disparate geographical locations,
which could translate to a network of disparate storage locations, improving
security and supporting a disaster recovery plan. Assuming that each archive has
its own ICT infrastructure, or is willing to develop it, that would fulfil the first
requirement of the NDSA Levels of Digital Preservation.21
- EERAC could continue its work towards a Distributed Digital Preservation
model, in which the members own the preservation infrastructure and expertise
rather than outsourcing this core service to external vendors, as in the
MetaArchive Cooperative example.22
- EERAC agrees on best practices and standards to support interoperability and
sustainability of the project. It is important to refer to standards for trusted
digital repositories (DRAMBORA, Data Seal of Approval or TRAC) in order to help
concentrate efforts on achievable goals.23
21
Megan Phillips et al., The NDSA Levels of Digital Preservation: An Explanation and Uses,
http://www.digitalpreservation.gov/documents/NDSA_Levels_Archiving_2013.pdf, accessed 10/03/2016
22
Adrian Brown, Practical Digital Preservation, Facet 2013, p. 103-106; Also
https://educopia.org/presentations/long-term-preservation-strategies-architecture-views-implementers, accessed
09/03/2016
23
Main Certification Standards include: peer-reviewed self-assessment Data Seal of Approval Assessment
http://datasealofapproval.org, DRAMBORA Digital Repository Audit Method Based on Risk Assessment
http://www.repositoryaudit.eu and TRAC Trustworthy Repositories Audit and Certification
https://www.crl.edu/sites/default/files/d6/attachments/pages/trac_0.pdf, accessed 11/03/2016
References
Kilian Amrhein and Marco Klindt, One Core Preservation System for All your Data. No Exceptions!,
https://opus4.kobv.de/opus4-zib/files/5663/iprespaper-finaledit.pdf, accessed 11/03/2016
Philip C. Bantin, Strategies for Managing Electronic Records: A New Archival Paradigm? An
Affirmation of Our Archival Traditions?, http://www.indiana.edu/~libarch/ER/macpaper12.pdf,
accessed 11/03/2016
Adrian Brown, Practical Digital Preservation, Facet 2013
Edward M. Corrado and Heather Lea Moulaison, Digital Preservation for Libraries, Archives, and
Museums, Rowman & Littlefield 2014
Tim Gollins, Parsimonious preservation: preventing pointless processes!,
http://www.nationalarchives.gov.uk/documents/information-management/parsimonious-preservation.pdf,
accessed 06/03/2016
Margaret Hedstrom, Christopher A. Lee, Significant properties of digital objects: definitions,
applications, implications, https://www.ils.unc.edu/callee/sigprops_dlm2002.pdf accessed 09/03/2016
Helen Heslop, An Approach to the Preservation of Digital Records, National Archives of Australia
2002, http://www.naa.gov.au/Images/An-approach-Green-Paper_tcm16-47161.pdf, accessed
11/03/2016
Sarah Higgins, The DCC Curation Lifecycle Model, The International Journal of Digital Curation,
Volume 3, Issue 1, 2008, p. 134-140
Kirnn Kaur, Report on testing of cost models and further analysis of cost parameters, APARSEN 2013
https://rd-alliance.org/system/files/filedepot/113/APARSEN-REP-D32_2-01-1_0.pdf, accessed
11/03/2016
Anna Kugler, Hannes Kulovits, From TIFF to JPEG 2000?, D-Lib Magazine, Volume 15, Issue 11/12,
2009, http://www.dlib.org/dlib/november09/kulovits/11kulovits.html, accessed 11/03/2016
Brian F. Lavoie, The Open Archival Information System Reference Model: Introductory Guide (DPC
Technology Watch), OCLC and DPC 2004, http://www.dpconline.org/docs/lavoie_OAIS.pdf and its 2nd
2014 edition http://www.dpconline.org/component/docman/doc_download/1359-dpctw14-02,
accessed 11/03/2016
Jenny Mitcham, Preservation of Digital Objects at the Archaeology Data Service; in: Janet Delve and
David Anderson, Preserving Complex Digital Objects, Facet Publishing 2014
Jenny Mitcham, Chris Awre, Julie Allinson, Richard Green and Simon Wilson, Filling the Digital
Preservation Gap. A Jisc Research Data Spring project. Phase One report - July 2015,
https://dx.doi.org/10.6084/m9.figshare.1481170.v1, accessed 11/03/2016
Jenny Mitcham, Chris Awre, Julie Allinson, Richard Green and Simon Wilson, Filling the Digital
Preservation Gap. A Jisc Research Data Spring project Phase Two report - February 2016,
https://dx.doi.org/10.6084/m9.figshare.2073220.v1, accessed 11/03/2016
David Pearson and Colin Webb, Defining File Format Obsolescence: A Risk Journey, The
International Journal of Digital Curation, Volume 3, Issue 1, 2008, p. 89-106
Megan Phillips et al., The NDSA Levels of Digital Preservation: An Explanation and Uses,
http://www.digitalpreservation.gov/documents/NDSA_Levels_Archiving_2013.pdf, accessed
10/03/2016
Chris Pickford, Eastern Promise – A Strategy for Archival Development in the East of England, EERAC
2003
Jeff Rothenberg, Ensuring Longevity of Digital Information, CLIR 1999,
http://www.clir.org/pubs/archives/ensuring.pdf, accessed 11/03/2016
Jeff Rothenberg, Preserving Authentic Digital Information, CLIR 2000,
http://www.clir.org/pubs/reports/pub92/rothenberg.html, accessed 11/03/2016
Bronwen Sprout et al., Archivematica As a Service: COPPUL's Shared Digital Preservation Platform,
http://summit.sfu.ca/system/files/iritems1/15519/CJILS39.2-9-Sprout.pdf, accessed 11/03/2016
Adam Tovell and James Knight, Directory of UK Sound Collections, British Library 2015:
http://www.bl.uk/britishlibrary/~/media/subjects%20images/sound/directory%20of%20uk%20sound%20collections.pdf,
p. 250-294, accessed 11/03/2016
Colin Webb, David Pearson and Paul Koerbin, 'Oh, you wanted us to preserve that?!' Statements of
Preservation Intent for the National Library of Australia's Digital Collections, D-Lib Magazine, Volume
19, Issue 1/2, 2013 http://www.dlib.org/dlib/january13/webb/01webb.html, accessed 05/03/2016
Geoffrey Yeo, Trust and context in cyberspace, Archives and Records, Volume 34, Issue 2, Routledge
2013, p. 214-234
Archives and Records Council Wales Digital Preservation Working Group Case Study:
http://www.nationalarchives.gov.uk/documents/Cloud-Storage-casestudy_Wales_2015.pdf, accessed
09/03/2016
East of England Digital Preservation Regional Pilot Project, MLA East of England and EERAC 2006,
http://www.data-archive.ac.uk/media/1680/DARP_finalreport.pdf, accessed 08/03/2016
Reference Model for an Open Archival Information System (OAIS). Magenta Book. Issue 2.,
Consultative Committee for Space Data Systems 2012,
http://public.ccsds.org/publications/archive/650x0m2.pdf, accessed 11/03/2016
Trustworthy Repositories Audit & Certification: Criteria and Checklist, OCLC and CRL 2007,
https://www.crl.edu/sites/default/files/d6/attachments/pages/trac_0.pdf, accessed 11/03/2016
Appendix A Digital Records Accessioning Checklist (Draft Version January 2016)
26 January 2016
Accessioning Digital Records Guide
Accessioning Digital Records does not differ from the standard Accession Procedure. Please
refer to it first and follow the steps as described.
Accessioning Digital Records corresponds to the Ingest entity within the OAIS functional model.
For the purpose of this guide this is broken down into Transfer and Appraisal, since donors
and depositors do not always perform selection and arrangement prior to transferring digitally
born archives. If the digital material is in need of preservation (converting to preservation file
formats) this should also precede Ingest.
(1.1.1) At the point of contact negotiate technical details for the delivery of digitally born
archives. Please refer to (and explain to the donor/depositor) the Advice to Creators of Digital
Records document.
(1.1.2) Obtain Intellectual Property Rights and permission to manipulate and/or destroy
(delete) digital material for the purpose of preservation, appraisal (in line with the NRO
Collecting Policy) and access. Permission is needed to authorise the procedures necessary
to meet preservation objectives. For example: creating a new version of the archived item so
that it can be rendered by current technologies, discarding material that does not meet the
criteria of the NRO Collecting Policy, or creating access copies to be published online on the NRO
website.
(1.2.1) Run a virus scan on the received digital records on a designated Digital Preservation
workstation (quarantined work area) that accepts all incoming digital records accessions.
This will happen automatically if the digital content is being sent as a file transfer over
the internet or another network (FTP transfer, e-mail, via downloadable link etc.).
(1.2.2) If the digital records are received on removable media (DVD, CD-ROM, USB Memory
Stick etc.) connect them to the designated DP workstation to run a virus scan.
(1.2.3) Make sure to update the Removable Media Inventory if these are to be retained in the
Cold Store. Number all carriers consecutively based on their accession number, e.g., ACC
2012/33 RM 2 of 5 (where RM stands for Removable Media).
(1.3.1) Irrespective of how the digital records are delivered (an accession might comprise
both file transfers and removable media as a hybrid/mixed accession), two copies of the
entire dataset comprising the accession should be created: one that respects original order
and preserves all digital material as it was deposited/donated, and a second, working copy,
created in exactly the same manner but intended for an archivist to perform appraisal (also
known as digital curation).
As mentioned, both datasets should be structured as originally delivered. In the case of a
mixed accession including different removable media, a separate directory should be
created for each carrier of digital material. These should be named according to the
convention:
<Sequential Number 000>_<Type of Carrier>_<Label if Applicable>
[Workflow diagram: Transfer, Appraisal, Preservation, Ingest, Archival Storage, Access]
For example:
001_Floppy-Disc_Accounts-1993
and all placed within a removableMedia folder located at the top level directory you are
working in.
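The naming convention above can be applied consistently with a small helper. This is a sketch only: the function name is hypothetical, and replacing spaces with hyphens is an assumption inferred from the single example given (001_Floppy-Disc_Accounts-1993).

```python
import re


def carrier_dir_name(sequence: int, carrier_type: str, label: str = "") -> str:
    """Build '<Sequential Number 000>_<Type of Carrier>_<Label if Applicable>'."""

    def slug(text: str) -> str:
        # Turn runs of whitespace or underscores into single hyphens, so that
        # underscores only ever separate the three parts of the name.
        return re.sub(r"[\s_]+", "-", text.strip())

    parts = [f"{sequence:03d}", slug(carrier_type)]
    if label.strip():
        parts.append(slug(label))
    return "_".join(parts)


# Example: the first carrier of an accession, a floppy disc labelled "Accounts 1993"
print(carrier_dir_name(1, "Floppy Disc", "Accounts 1993"))
```

The label is optional, matching "Label if Applicable" in the convention; a carrier without a label yields a two-part name such as 012_CD-ROM.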
(1.3.2) Review the content of the accession and create an intellectual arrangement. This is the
Appraisal stage: it is desirable to decide at this point how to structure the digital records into
a Submission Information Package. If some of the files fall outside the scope of the collection
policy they should be deleted now, subject to the donor’s/depositor’s permission. Please refer to
the File Formats Action List document to identify superfluous objects.
(1.3.3) As part of the above process you should ensure that all changes are recorded.
Create a metadata folder, and inside it a metadata.csv file based on an Excel spreadsheet template
(coming soon), next to the submissionDocumentation folder within the directory storing the digital
objects you are working on. Provide descriptive information at the appropriate level, complying
with the Dublin Core metadata standard.
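Until the template is available, the following sketch shows one way a metadata.csv file could be produced. The column layout is an assumption, not the NRO template: a `filename` column followed by `dc.`-prefixed Dublin Core elements is the pattern generally used for Archivematica metadata import, but it should be checked against the final template and the Archivematica documentation. The function name and the sample record are hypothetical.

```python
import csv
from pathlib import Path

# Assumed columns: a path column plus a handful of Dublin Core elements.
FIELDNAMES = ["filename", "dc.title", "dc.creator", "dc.date", "dc.description"]


def write_metadata_csv(working_dir: Path, records: list[dict]) -> Path:
    """Create <working_dir>/metadata/metadata.csv with one row per record."""
    metadata_dir = working_dir / "metadata"
    metadata_dir.mkdir(parents=True, exist_ok=True)
    target = metadata_dir / "metadata.csv"
    with target.open("w", newline="", encoding="utf-8") as out:
        # restval="" leaves a cell empty when a record lacks that element.
        writer = csv.DictWriter(out, fieldnames=FIELDNAMES, restval="")
        writer.writeheader()
        writer.writerows(records)
    return target


# Hypothetical record describing one directory of the accession
example = [{
    "filename": "objects/001_Floppy-Disc_Accounts-1993",
    "dc.title": "Accounts 1993",
    "dc.date": "1993",
}]
```

Calling `write_metadata_csv(Path("ACC-working-copy"), example)` (a placeholder path) writes the header row and one descriptive row, leaving the unused elements blank.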
(1.4) Create Disk Images for removable media using BitCurator, if it is important to preserve
functionality or particular features of the physical carrier of digital content (for example menu
of a DVD Video or interactive CD-ROM).
(1.5.1) Collect Fixity Information – check whether fixity information is delivered by the
donor/depositor and verify that the files are not corrupted.
(1.5.2) Generate Fixity Information – check checksums before and after transfer whenever
data are being copied from one storage system (physical discs, CD-ROM, DVD, USB
memory stick etc.) to another (Local File System, server storage, shared drives etc.).
Check file count and file size.
Tool: Fsum Frontend (http://fsumfe.sourceforge.net/index.php?page=usage)
To create a .sha2 checksum file for all files within a directory:
1. Select « Generate check file » in the menu.
2. Select the folder containing your files.
3. To choose where the result is saved, select the second option
from the drop-down menu: “1 file in any place”. Otherwise the program creates
the file within the directory containing your files, using the first option: “1 file in tree
root”.
4. Select the format SHA2 512 (used by Archivematica) and click the « Generate »
button.
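The before-and-after check described in (1.5.2) can also be scripted. The sketch below is an alternative illustration, not part of the Fsum Frontend procedure: it computes SHA-512 digests for the source and the copied directory and reports any differences. The function names and the shape of the report are assumptions made for the example.

```python
import hashlib
from pathlib import Path


def tree_checksums(root: Path) -> dict[str, str]:
    """Map each file's path relative to root to its SHA-512 hex digest."""
    checksums = {}
    for path in sorted(root.rglob("*")):
        if path.is_file():
            digest = hashlib.sha512()
            with path.open("rb") as handle:
                for chunk in iter(lambda: handle.read(1 << 20), b""):
                    digest.update(chunk)
            checksums[path.relative_to(root).as_posix()] = digest.hexdigest()
    return checksums


def compare_trees(source: Path, destination: Path) -> dict[str, list[str]]:
    """Report files missing from, added to, or altered in the copy."""
    before = tree_checksums(source)
    after = tree_checksums(destination)
    shared = before.keys() & after.keys()
    return {
        "missing": sorted(set(before) - set(after)),
        "unexpected": sorted(set(after) - set(before)),
        "changed": sorted(name for name in shared if before[name] != after[name]),
    }
```

A transfer can be treated as intact when all three lists are empty; comparing `len(before)` and `len(after)` also covers the file count check mentioned above.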
(1.6) Scan for sensitive and personal information (credit card details, addresses, phone
numbers etc.)
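As a rough illustration of what such a scan involves, the sketch below looks for candidate payment card numbers in extracted text and keeps only those that pass the Luhn check digit test. It is a minimal example under stated assumptions: the regular expression and function names are hypothetical, it covers card numbers only (addresses and phone numbers need their own patterns), and it is not a substitute for a dedicated scanning tool or for an archivist's review.

```python
import re

# 13 to 19 digits, optionally separated by single spaces or hyphens.
CARD_CANDIDATE = re.compile(r"\b(?:\d[ -]?){12,18}\d\b")


def luhn_valid(number: str) -> bool:
    """Return True if the digits in `number` pass the Luhn checksum."""
    digits = [int(ch) for ch in number if ch.isdigit()]
    total = 0
    for index, digit in enumerate(reversed(digits)):
        if index % 2 == 1:  # double every second digit from the right
            digit *= 2
            if digit > 9:
                digit -= 9
        total += digit
    return len(digits) >= 13 and total % 10 == 0


def find_card_numbers(text: str) -> list[str]:
    """Return substrings that look like card numbers and pass the Luhn test."""
    return [
        match.group(0)
        for match in CARD_CANDIDATE.finditer(text)
        if luhn_valid(match.group(0))
    ]
```

The Luhn test filters out most arbitrary digit runs (reference numbers, identifiers), so hits are worth a manual look before deciding on access restrictions.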
(1.7) Identify file formats using DROID (http://www.nationalarchives.gov.uk/information-management/manage-information/policy-process/digital-continuity/file-profiling-tool-droid/)
(1.8) Include a digital copy of your email correspondence with the donor/depositor and any other
documentation related to the transfer within the submissionDocumentation folder, which should
be created within the top directory you are working on.
With a view to using Archivematica in the future, all metadata generated throughout the
accessioning process (checksums, DROID results etc.) should be put into a
submissionDocumentation folder next to the metadata folder that will contain the metadata.csv file
created by an archivist according to the Excel template.
(1.9) Assess the overall size of the accession in bytes and convert the size figure to an
easily readable format (e.g. MB, GB, TB).
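The conversion in (1.9) is simple arithmetic, sketched below. The use of decimal units (1 GB = 1,000,000,000 bytes) is an assumption; if binary units (multiples of 1024) are preferred, the divisor and unit labels would need to change. The function name is hypothetical.

```python
def human_size(size_bytes: int) -> str:
    """Convert a byte count to a readable string using decimal (SI) units."""
    units = ["B", "kB", "MB", "GB", "TB", "PB"]
    size = float(size_bytes)
    for unit in units:
        # Stop at the first unit that keeps the figure below 1000,
        # or at the largest unit available.
        if size < 1000 or unit == units[-1]:
            if unit == "B":
                return f"{int(size)} {unit}"
            return f"{size:.1f} {unit}"
        size /= 1000


print(human_size(87_800_000_000))  # an accession total of 87.8 billion bytes
```

Recording the exact byte count alongside the rounded figure avoids ambiguity between decimal and binary units.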
Glossary
Fixity information - hash, message digest, checksum, manifest file
Bit Rot - On magnetic media the binary digits are (essentially) represented by individual
particles of magnetic material whose polarity represents either 1 or 0. Sometimes
interference or just general degradation of the media can cause these particles to flip,
reversing the meaning of a particular bit. On optical media, physical damage or decay of
the dyes used in writable DVDs and CDs has similar effects. There is usually a degree of
error correction built in, but eventually errors can build up and corrupt data
irretrievably.
Disk Image - copy of the bitstream that is read off the disk through the computer’s
input/output equipment. The standard forensics software that creates a disk image also
generates a cryptographic hash of the entire disk image.
Appendix B
Advice to Depositors of Digital Records
Transfer of Intellectual Property Rights
Archives may not be able to assign their limited resources to the task of preserving
data whose value is unknown, but at the same time there is a need to
preserve ‘valuable’ datasets. This is why we ask for permission to manipulate
and/or destroy digital content donated to us, so that we can make the best use of our
resources and prioritise accepting deposits in accordance with our collecting policy (see
our website: http://www.archives.norfolk.gov.uk/view/NCC098771). Permission to
destroy is needed in order to perform preservation tasks as well as to ensure that
Norfolk Record Office meets the requirements of its collecting policy.
Please be aware that by signing the accession form you give Norfolk Record Office
the authority to process, migrate and destroy the data for the purpose of
preservation. This means that original data carriers (the removable media on which the
data were stored, such as USB memory sticks, CDs, floppy discs etc.) may be discarded.
Norfolk Record Office preserves the collections donated to it so that they are accessible
to the general public. Please let us know if the digital records contain any sensitive or confidential
information, so that public access can be restricted for a suitable period of time.
Metadata
In order to ensure that the digital records are accessible in the future we must collect
all necessary information required to open and view digital files. To the best of your
knowledge please provide us with information about:
- What software (including version) was used to create, open, read, edit and
save the file/s;
- What operating system was used (including version; for example Windows
XP Service Pack 2003, Mac OS X 10.6.8 etc.);
- For what purpose the data were generated/created, and around what time.
Please complete the Digital Files and Removable Media Inventory form (Excel
spreadsheet template), listing the content of your deposit and, whenever possible, its
size on disk.
For Current Records managers
Preferred Deposit Format
In order to ensure long-term sustainability of access to the records, it is
recommended that records managers use current preservation formats. If it is within
the means and resources of your organisation, please export the data that you want to
preserve and deposit with NRO in the following formats.
Media Type        File Formats
Text              PDF/A: Portable Document Format (Archival; ISO 19005-3 compliant)
Image (Raster)    TIFF: Uncompressed Baseline Tagged Image File Format v.6 (no LZW compression)
                  PNG: Portable Network Graphics (lossless compression)
Image (Vector)    SVG: Scalable Vector Graphics File
Sound             WAV: Broadcast Wave Format
For Existing Digital Records
Accepted Deposit Format
Media Type        File Formats
Text              DOCX: MS Word Open XML Document (created in MS Office 2007 and above)
                  XLSX: MS Excel Open XML Document (created in MS Office 2007 and above)
                  PPTX: MS PowerPoint Open XML Document (created in MS Office 2007 and above)
                  ODT: OpenDocument Text Document (created in OpenOffice)
                  ODS: OpenDocument Spreadsheet (created in OpenOffice)
                  ODP: OpenDocument Presentation (created in OpenOffice)
                  PDF/A: Portable Document Format (Archival)
                  TXT: Plain Text File (ANSI or UTF-8 encoded)
                  RTF: Rich Text Format File
                  XML: Extensible Markup Language Data File
                  CSV: Comma Separated Values File
Image (Raster)    TIFF: Tagged Image Format File
                  PNG: Portable Network Graphic
Image (Vector)    SVG: Scalable Vector Graphics File
Sound             WAV: Waveform Audio File Format
                  AIFF: Audio Interchange File Format
                  MP3: Moving Picture Experts Group Layer 3 compression
                  FLAC: Free Lossless Audio Codec File
                  OGG: Ogg Vorbis Audio File
Video*            MPEG-1/2: Moving Picture Experts Group
                  AVI: Audio Video Interleave File (uncompressed)
                  MOV: Quicktime Movie (uncompressed)
                  MP4: Moving Picture Experts Group (with H.264 encoding)
Email             EML: Electronic Mail Format
3D Graphics       OBJ: Wavefront Object files
DROID
If you are depositing a large amount of data, comparable to a system migration, please
run DROID before submitting your deposit
(http://www.nationalarchives.gov.uk/documents/information-management/droid-how-to-use-it-and-interpret-results.pdf).
Appendix C
Digital Audit Questionnaire
The aim of this questionnaire is to identify the requirements for future storage and
preservation of any digital material in the possession of NRO. This will mainly
encompass:
- Digitally born records being deposited with NRO or already held by NRO
- Outputs of digitisation projects (both images and sounds)
- Electronic records generated by NRO itself (organisational records such as office
administration, email correspondence etc.)
It is important that all employees take part in this exercise (needs assessment) so that
the scope of the necessary actions can be fully understood.
PAST ARCHIVE SERVICE ACTIVITIES
1. Are you aware of any important digital material that must be kept (digital
files/electronic records like text documents, spreadsheets, scanned images or digital
photographs) within your department that is being stored on a network drive, an
external hard drive or any type of removable media (DVDs, CDs, floppy disks,
memory cards, USB memory sticks etc.)?
2. If yes, please provide details below:
Type of Storage (network drive, CD, DVD, external HDD, floppy disk etc.)
Type of content (spreadsheets, word documents, PDFs, digital images/photographs, scanned documents)
Volume (size on disk in either MB, GB or TB; if small put less than 1MB)
3. Email – do you know of any emails that you have sent or received that should
have been kept for future reference? In this situation would you normally print
off the email and file the printout? As an alternative, would you consider
printing the email to a PDF file and saving it to a designated network drive
location?
Click here to enter text.
CURRENT ARCHIVE SERVICE ACTIVITIES
1. In your everyday tasks at work do you work with digital material (files)?
Click here to enter text.
2. What are they?
Click here to enter text.
3. Do you think they are important enough that they would need to be preserved
over time for future access?
Click here to enter text.
4. How strongly would you identify the need to do so? In other words what is the value
of the digital material that you produce/work on? Does it need to be kept by NRO? If
yes, for how long?
Click here to enter text.
5. If you’ve answered yes to the above, can you estimate the volume of digital material
that is being produced (the amount of data that needs to be kept)?
Small (can be specified in MB), Medium (can be specified in GB), Large (can be
specified in TB, PB)
Click here to enter text.
Thank you
Appendix D
A survey of digitally born archives received by the Norfolk Record Office, compiled with The National Archives’ DROID profiling tool, identified 107 different file formats.
File Formats per Type
Image (Raster)                    64%
Miscellaneous                     10%
Word Processor                     8%
Text (Mark-up)                     7%
Email                              6%
Page Description                   2%
Text (Structured)                  2%
Image (Raster), Aggregate          1%
Presentation                       0%
Image (Vector)                     0%
Audio                              0%
Video                              0%
Spreadsheet                        0%
Audio, Video                       0%
Text (Unstructured)                0%
Dataset                            0%
Aggregate                          0%
Database                           0%
Image (Vector), Text (Mark-up)     0%
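A per-type breakdown like the one in this appendix can be tallied from a CSV listing. The Python sketch below is a hedged illustration: it assumes one row per counted item and a column naming its type, and both the column name used in the example (TYPE_CATEGORY) and the function name are hypothetical. It does not describe DROID's actual export layout or how the figures in this appendix were produced.

```python
import csv
from collections import Counter


def share_by_category(handle, column: str) -> dict[str, int]:
    """Return the whole-number percentage of rows falling in each category."""
    reader = csv.DictReader(handle)
    # Skip rows where the category column is missing or empty.
    counts = Counter(row[column] for row in reader if row.get(column))
    total = sum(counts.values())
    # most_common() orders categories from largest to smallest share.
    return {name: round(100 * n / total) for name, n in counts.most_common()}
```

In use, handle would be an open CSV file, for example one opened with open(path, newline=""). Percentages are rounded to whole numbers, so a category with very few rows can show as 0% while still being present.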

Digital Preservation at Norfolk Record Office

  • 1. Digital Preservation at Norfolk Record Office A report prepared by Pawel Jaskulski (Digital Preservation trainee) for Gary Tuson (County Archivist). March 11th, 2016
  • 2. Executive Summary Digital Preservation strategy at Norfolk Record Office evolved from the archive’s own active interest in the emerging domain and from the anticipated necessity of integrating accessioning digitally born archives procedure within regular archival processing framework. The launch of Norfolk Sound Archive in 2003 and Norfolk Record Office’s (NRO) participation in the Skills for the Future programme signalled strong commitment to build expertise within the field of digital technologies. Senior archivists, with support from Norfolk County Hall’s ICT services, have developed over time bitstream preservation capability with elements of a ‘parsimonious approach’ to digital preservation. The archive approved its first version of a Digital Preservation Policy in 2007, in which it addressed the need for advancing its digital preservation strategy to involve format migration pathways and clearly defined accessioning workflows. NRO’s influential role within East of England Regional Archive Council led to a regional pilot project (currently at its proof of concept stage) employing cloud hosted instance of Archivematica connected with a cloud storage system provided by Arkivum. Acknowledgement I would like to thank my line manager Ian Palfrey (Senior Archivist/Collection Management) and Gary Tuson (County Archivist) for guiding me in the process of completing this report.
  • 3. Contents 1. Introduction 1 2.1 Background 2 2.2 Wider Regional Context 2 2.3 NRO Digital Preservation Policy 3 2.4 Requirements 4 2.5 Parsimonious Approach – Bitstream Preservation 6 2.6 Towards Logical Preservation 6 3. Conclusions 8 4. Recommendations 8 References 10 Appendices 12
  • 4. 1 1. Introduction The aim of this report is to survey approaches to digital preservation at Norfolk Record Office in order to make recommendations for future improvements. Digital preservation has now become one of the key priorities for the NRO service plan. The archive has been involved in the Skills for the Future traineeship scheme since 2014, signalling its commitment to filling the skills gaps within the sector. This, as well as its contribution to collaborative working within East of England Regional Archive Council (EERAC) demonstrates NRO’s strong dedication to developing digital preservation strategy for the region and within institutional context of a local authority archive service. The report will trace the evolution of interest in digital preservation, from its original concerns about the need to develop a strategic plan for preservation of electronic records to the latest developments with EERAC and its current pilot project.
  • 5. 2 2.1 Background Norfolk Record Office collects and preserves unique archives relating to the history of Norfolk and makes them accessible to as wide a range of people as possible. It is a joint service of Norfolk County Council and the District Councils of Norfolk and is democratically accountable via the joint Norfolk Records Committee.1 NRO is located at The Archive Centre in Norwich with additional services operating from Norfolk Heritage Centre at Norwich Millennium Library and King's Lynn Borough Archive. In April 2003 the record office launched the Norfolk Sound Archive with a purpose to collect, preserve and provide public access to sound recordings relevant to life in Norfolk. Its remit includes: • Provide information on holdings and access to original recordings • Preserve sound recordings • Locate existing sound recordings that are worth preserving for the future • Provide support and training for on-going and new oral history projects • Promote the use of sound recordings, particularly within education • Links to organizations who also hold collections of sound recordings relating to Norfolk and who are carrying out oral history work in the county2 Overall NRO incorporates three repositories: Norfolk Record Office, Norfolk Sound Archive and King’s Lynn with variety of holdings and record types under its custody.3 2.2 Wider Regional Context The creation of Regional Archives Councils for the nine English Regions in 1999 led to the publication of a series of regional archive strategy documents. The report created for East of England Regional Archive Council (EERAC) in 2003 sets out the aims of the Preserving the Present for the Future project: This project encompasses a whole range of issues concerning records management and the preservation of electronic records and aims to ensure that contemporary records in all forms – public and private - are both properly managed now and preserved for the future. 
It will involve creating the infrastructure and building confidence to turn theoretical knowledge into 1 Norfolk County Council has two Joint Committees: http://www.norfolk.gov.uk/Council_and_Democracy/Our_budget_and_council_tax/Statement_of_accounts/NCC1 52976, accessed 09/03/2016 2 Norfolk Sound Archive has a well-established digitisation workflow based on the work of the British Library Sound Archive. 3 Norfolk Record Office Archive Collections http://www.archives.norfolk.gov.uk/Archive-Collections/index.htm; Additionally, Norfolk Sound Archive collections feature in Directory of UK Sound Collections: http://www.bl.uk/projects/uk-sound-directory, accessed 09/03/2016
  • 6. 3 practical action across the Region. It will also seek to identify collaborative solutions for the preservation of digital data.4 With the archive sector development strategy clearly stated the next important step forward was the East of England Digital Preservation Regional Pilot Project that took place between August 2004 - March 2005.5 Although NRO was not directly involved in the project, it participated as an observer.6 The test bed project was aimed at better understanding of the processes and costs of preserving digitalised material by assessing feasibility of outsourcing specialist services on a regional basis to the UK Data Archive based at the University of Essex. Among the recommendations were two lessons learned from the project that remain relevant within the context of digital preservation at a local authority archive: - Need to consider further modelling of costs and benefits linked with the three scenarios of: in-house provision, working through consortia and contracting out; - Need to develop and clarify the OAIS model to make it more intelligible and better aligned, with more conventional terminology applied to archive administration and records management.7 2.3 NRO Digital Preservation Policy In July 2007 the NRO approved a Digital Preservation Policy. It is currently undergoing a revision and will be supported by related documents: Digital Records Accessioning Checklist and Advice to Creators of Digital Records.8 The policy recognises the urgent need to address developing an OAIS compliant digital preservation strategy, especially with the view of Norfolk Sound Archive expanding its activities. Key points from the policy are: 1.6 The NRO (and NSA) expect to receive for appraisal and preservation an increasing number of ‘born-digital’ records, mainly, in the first instance, from private organisations, groups and individuals, which due to their functionality or quantity cannot be preserved by printing out hard copies. 
This expectation is based on contact with depositors and with colleagues in other local government archive services. 4 Chris Pickford, Eastern Promise – A Strategy for Archival Development in the East of England, EERAC 2003, p. 33 5 Digital Archives Regional Pilot profile on UK Data Archive http://www.data-archive.ac.uk/about/projects/darp, accessed 09/03/2016 6 Chairman of EERAC at a time was John Alban, Norfolk County Archivist, who co-wrote foreword to Report of the East of England Digital Preservation Regional Pilot Project, MLA East of England and EERAC 2006, http://www.data-archive.ac.uk/media/1680/DARP_finalreport.pdf, accessed 08/03/2016 7 Ibid., p. 43 8 See appendices A and B for draft versions. The Digital Preservation Policy is an internal document and as such is not publicly available or published online at the moment. An important point of reference for Digital Preservation Policy is the Archive Collecting Policy http://www.archives.norfolk.gov.uk/view/NCC098771 accessed 09/03/2016
  • 7. 4 1.7 Digital records, i.e., the type of data, format of carrier and associated hardware and software needs, are likely to be varied and in many cases cannot be anticipated. The appraisal section outlines the scope of digital records collecting policy: 3.2 The NRO will not necessarily accept custody of every digital record offered to it. Records which in effect we cannot use or whose use imposes unacceptable costs or conditions on the NRO might not be selected for retention. Factors which may affect this are:  the existence of adequate metadata  the removal or neutralisation of security features  the provision of a free licensed copy of the original software, if necessary, to access and maintain the record The document also set great store on open (non-proprietary) standards that are ‘to be employed as far as possible for preservation purposes. Where those standards are absent, file formats accepted as the industry standard should be used, e.g., TIFF files.’ 2.4 Requirements A quick survey of digitally born archives received by NRO gives an insight into the requirements for digital preservation. The total volume of data held within the archive amounted to 87.8 GB in November 2015 with majority of the records consisting of various formats: raster and vector images, text documents, audio and video files as shown below (please refer also to appendix D):
  • 8. 5 Figure 1: Collection Profile of all Digitally Born Archives at the NRO
  • 9. Norfolk Sound Archives’ digital assets will soon need 3 TB of storage, and various digitisation projects occupy similar space on shared network drives. Improving the accessioning system for digitally born archives is the main driver for establishing a digital preservation strategy. The aim is to build digital preservation capabilities so that the NRO can choose digital records over their paper-based equivalents (hard copies/printouts).

2.5 Parsimonious Approach - Bitstream Preservation

NRO has been collecting digitally born archives since 1997 and has gradually introduced different elements of the ‘parsimonious’ approach to digital preservation, resulting in a systematic method initiated by the Senior Archivist (Collection Management).9 All digital accessions were processed manually in the same manner:
- One month quarantine
- Virus scan performed
- Integrity checks conducted by generating checksums
- Digital objects extracted from removable media and transferred to a secure and designated network drive location
- Technical metadata generated (file count, file sizes, directory and file listings)
- File format identified by creating profiling reports with DROID
- Top level descriptions created in CALM cataloguing system

All metadata generated in the above process is stored alongside the digital records within the top level directory to which the digital objects were transferred. This process is time-consuming and relies on more than one tool, each creating metadata in separate files. It will be difficult to sustain in the long term.

2.6 Towards Logical Preservation

The service plan for 2014/2015 identifies the lack of compliance with the OAIS standard, a consideration that has been gaining significance on the NRO Risk Register.10 With greater understanding of the OAIS functional model and further research into currently available digital preservation systems (Preservica, Rosetta etc.), NRO decided to explore Archivematica as the preferred solution, since it offers

9 Norfolk Records Office Service Plan for 2014-2015 enclosed in Norfolk Record Committee meeting agenda from Thursday 1 May 2014 http://www.norfolk.gov.uk/download/norfrec010514agendapdf, p. 43, accessed 06/03/2016; Tim Gollins, Parsimonious preservation: preventing pointless processes!, http://www.nationalarchives.gov.uk/documents/information-management/parsimonious-preservation.pdf, accessed 06/03/2016
10 Ibid., p. 38
  • 10. normalisation for preservation and access purposes. It acts as an Archival Information Package creator, managing the workflow from transfer, through ingest, to archival storage and dissemination. The software was first installed in a test environment (Ubuntu 14.04.4 LTS running on an HP Linux-compatible machine) and used to process a sample dataset comprising file formats similar to those accessioned by NRO in the past. After the initial results it became apparent that, in order to fully assess Archivematica’s capabilities, it needs to be deployed in a production environment connected with appropriate storage systems.11

The software is commonly referred to as a pipeline: in itself it is neither a repository (storage system) nor an access system, but a processing pipeline connecting the two and supporting the pre-ingest and ingest activities of a digital repository. Arkivum, a company specialising in digital data archiving, has offered a fully hosted service (cloud storage integrated with Archivematica) since January 2016.12 In order to reduce costs and utilise economies of scale, NRO suggested a pilot project to EERAC members in February 2016, demonstrating why Archivematica is the preferred system for digital records’ ingest.13 In summary the main advantages are:
- It is OAIS compliant and supports the PREMIS metadata schema together with the METS and Dublin Core standards14
- The Archival Information Package is structured in accordance with the Library of Congress BagIt specification15
- It uses The National Archives’ file format registry PRONOM
- It is open source and under active development, with a substantial user community from across the heritage, arts and academic sectors.16
- It runs a series of configurable micro-processes provided by open source tools integrated within Archivematica, which can be replaced as technologies change, fulfilling the requirement of OAIS’ Manage System Configuration function17

It was discussed with EERAC members that working together could entail shared resources: cloud-based infrastructure with digital preservation software accessible through a browser (Archivematica) and linked cloud storage (provided by Arkivum). The main focus of the project is to evaluate Archivematica as a digital preservation tool, but it will also look at integrating it with AtoM, an access and cataloguing system

11 Norfolk County Council’s ICT security restriction prohibited integration with the system.
12 http://arkivum.com
13 Presentation is available on SlideShare: http://www.slideshare.net/PaweJaskulski1/archivematica-and-local-authority-archive-services accessed 09/03/2016
14 PREMIS http://www.loc.gov/standards/premis/, METS http://www.loc.gov/standards/mets/, Dublin Core http://dublincore.org/, accessed 11/03/2013
15 E-Ark Report on Available Formats and Restrictions: http://www.eark-project.com/resources/project-deliverables/7-e-ark-d41-report-on-available-formats-and-restrictions/file, p. 24, accessed 10/03/2016
16 Archivematica users group forum: https://groups.google.com/forum/#!forum/archivematica
17 Reference Model for an Open Archival Information System (OAIS). Magenta Book. Issue 2. June 2012: http://public.ccsds.org/publications/archive/650x0m2.pdf, p. 4-12, accessed 11/03/2016
  • 11. supporting digital objects and developed by the same organisation that created Archivematica: Artefactual Systems.18

3. Conclusion

Part of the Digital Preservation policy should also be advocacy and consultancy aimed at promoting digital archiving in Norfolk, especially amongst community archives and other likely donors and depositors. Data must be capable of re-use, which requires sufficient metadata: documentation detailed enough to cover chain of custody, creators, rights and technical provenance (what software and operating system the files were created with, in what file format, etc.).19 Advice on best practice in electronic record keeping should be embedded within the policy to foster better understanding of digital preservation concepts among electronic records creators, donors and depositors.

4. Recommendations

With NRO continuing to explore Archivematica and its applications to the archival processing workflow, this report recommends:
- The NRO improves its Preservation Planning by compiling an Action Plan for all file formats received within digital accessions. The Action Plan will inform staff what must be done to normalise a digital object at Ingest into a preservation and/or dissemination format. This will inform the format migration strategy. For example, TIFF is the currently preferred preservation format for images as it is less prone to data loss than other raster image file formats. NRO is interested in exploring the PNG file format as a preservation format for certain types of digital records.
- The NRO improves its Preservation Planning by identifying and compiling a list of Significant Properties per type (text, audio, etc.) to help staff decide whether a format migration has produced acceptable results, with the migrated object retaining its authenticity and evidential value.20
- The designated community of the NRO is the general public, which demands a robust access system and intellectual property rights management procedures. The NRO should review its designated community and identify

18 The current cataloguing system used by EERAC members is CALM. If the project is successful it would require migration of the records to a new system supporting digital objects like Access to Memory (AtoM): https://www.accesstomemory.org. A concern shared also by ARCW Digital Preservation Working Group: http://www.nationalarchives.gov.uk/documents/Cloud-Storage-casestudy_Wales_2015.pdf, accessed 09/03/2016
19 Jenny Mitcham, Preservation of Digital Objects at the Archaeology Data Service; in: Janet Delve and David Anderson, Preserving Complex Digital Objects, Facet Publishing 2014, p. 50-51
20 Margaret Hedstrom, Christopher A. Lee, Significant properties of digital objects: definitions, applications, implications, https://www.ils.unc.edu/callee/sigprops_dlm2002.pdf accessed 09/03/2016
  • 12. any sub-sets or new communities whose access requirements are specialised or more complex and demanding, and how these can be met.
- The NRO considers how it will provide timely delivery of digital records to users in its facilities (search room and online access).
- Although not an immediate priority, the NRO audits its other digital assets. Please see Appendix C for suggested actions.

Additionally, with a view to NRO and EERAC continuing collaborative work as a consortium, this report suggests:
- Archives associated within EERAC occupy disparate geographical locations, which could translate into a network of disparate storage locations, improving security and supporting a disaster recovery plan. Assuming that each archive has its own ICT infrastructure, or is willing to develop it, that would fulfil the first requirement of the NDSA Levels of Digital Preservation.21
- EERAC could continue its work towards a Distributed Digital Preservation model, in which the members own their preservation infrastructure and expertise rather than outsourcing this core service to external vendors, following the MetaArchive Cooperative example.22
- EERAC agrees on best practices and standards to support interoperability and sustainability of the project. It is important to refer to standards for trusted digital repositories (DRAMBORA, Data Seal of Approval or TRAC) in order to concentrate efforts on achievable goals.23

21 Megan Phillips et al., The NDSA Levels of Digital Preservation: An Explanation and Uses, http://www.digitalpreservation.gov/documents/NDSA_Levels_Archiving_2013.pdf, accessed 10/03/2016
22 Adrian Brown, Practical Digital Preservation, Facet 2013, p. 103-106; Also https://educopia.org/presentations/long-term-preservation-strategies-architecture-views-implementers, accessed 09/03/2016
23 Main Certification Standards include: peer-reviewed self-assessment Data Seal of Approval Assessment http://datasealofapproval.org, DRAMBORA Digital Repository Audit Method Based on Risk Assessment http://www.repositoryaudit.eu and TRAC Trustworthy Repositories Audit and Certification https://www.crl.edu/sites/default/files/d6/attachments/pages/trac_0.pdf, accessed 11/03/2016
  • 13. References

Kilian Amrhein and Marco Klindt, One Core Preservation System for All your Data. No Exceptions!, https://opus4.kobv.de/opus4-zib/files/5663/iprespaper-finaledit.pdf, accessed 11/03/2016

Philip C. Bantin, Strategies for Managing Electronic Records: A New Archival Paradigm? An Affirmation of Our Archival Traditions?, http://www.indiana.edu/~libarch/ER/macpaper12.pdf, accessed 11/03/2016

Adrian Brown, Practical Digital Preservation, Facet 2013

Edward M. Corrado and Heather Lea Moulaison, Digital Preservation for Libraries, Archives, and Museums, Rowman & Littlefield 2014

Tim Gollins, Parsimonious preservation: preventing pointless processes!, http://www.nationalarchives.gov.uk/documents/information-management/parsimonious-preservation.pdf, accessed 06/03/2016

Margaret Hedstrom, Christopher A. Lee, Significant properties of digital objects: definitions, applications, implications, https://www.ils.unc.edu/callee/sigprops_dlm2002.pdf, accessed 09/03/2016

Helen Heslop, An Approach to the Preservation of Digital Records, National Archives of Australia 2002, http://www.naa.gov.au/Images/An-approach-Green-Paper_tcm16-47161.pdf, accessed 11/03/2016

Sarah Higgins, The DCC Curation Lifecycle Model, The International Journal of Digital Curation, Volume 3, Issue 1, 2008, p. 134-140

Kirnn Kaur, Report on testing of cost models and further analysis of cost parameters, APARSEN 2013, https://rd-alliance.org/system/files/filedepot/113/APARSEN-REP-D32_2-01-1_0.pdf, accessed 11/03/2016

Anna Kugler, Hannes Kulovits, From TIFF to JPEG 2000?, D-Lib Magazine, Volume 15, Issue 11/12, 2009, http://www.dlib.org/dlib/november09/kulovits/11kulovits.html, accessed 11/03/2016

Brian F. Lavoie, The Open Archival Information System Reference Model: Introductory Guide (DPC Technology Watch), OCLC and DPC 2004, http://www.dpconline.org/docs/lavoie_OAIS.pdf and its 2nd 2014 edition http://www.dpconline.org/component/docman/doc_download/1359-dpctw14-02, accessed 11/03/2016

Jenny Mitcham, Preservation of Digital Objects at the Archaeology Data Service; in: Janet Delve and David Anderson, Preserving Complex Digital Objects, Facet Publishing 2014

Jenny Mitcham, Chris Awre, Julie Allinson, Richard Green and Simon Wilson, Filling the Digital Preservation Gap. A Jisc Research Data Spring project. Phase One report - July 2015, https://dx.doi.org/10.6084/m9.figshare.1481170.v1, accessed 11/03/2016

Jenny Mitcham, Chris Awre, Julie Allinson, Richard Green and Simon Wilson, Filling the Digital Preservation Gap. A Jisc Research Data Spring project. Phase Two report - February 2016, https://dx.doi.org/10.6084/m9.figshare.2073220.v1, accessed 11/03/2016
  • 14. David Pearson and Colin Webb, Defining File Format Obsolescence: A Risk Journey, The International Journal of Digital Curation, Volume 3, Issue 1, 2008, p. 89-106

Megan Phillips et al., The NDSA Levels of Digital Preservation: An Explanation and Uses, http://www.digitalpreservation.gov/documents/NDSA_Levels_Archiving_2013.pdf, accessed 10/03/2016

Chris Pickford, Eastern Promise – A Strategy for Archival Development in the East of England, EERAC 2003

Jeff Rothenberg, Ensuring Longevity of Digital Information, CLIR 1999, http://www.clir.org/pubs/archives/ensuring.pdf, accessed 11/03/2016

Jeff Rothenberg, Preserving Authentic Digital Information, CLIR 2000, http://www.clir.org/pubs/reports/pub92/rothenberg.html, accessed 11/03/2016

Bronwen Sprout et al., Archivematica As a Service: COPPUL's Shared Digital Preservation Platform, http://summit.sfu.ca/system/files/iritems1/15519/CJILS39.2-9-Sprout.pdf, accessed 11/03/2016

Adam Tovell and James Knight, Directory of UK Sound Collections, British Library 2015: http://www.bl.uk/britishlibrary/~/media/subjects%20images/sound/directory%20of%20uk%20sound%20collections.pdf, p. 250-294, accessed 11/03/2016

Colin Webb, David Pearson and Paul Koerbin, 'Oh, you wanted us to preserve that?!' Statements of Preservation Intent for the National Library of Australia's Digital Collections, D-Lib Magazine, Volume 19, Issue 1/2, 2013, http://www.dlib.org/dlib/january13/webb/01webb.html, accessed 05/03/2016

Geoffrey Yeo, Trust and context in cyberspace, Archives and Records, Volume 34, Issue 2, Routledge 2013, p. 214-234

Archives and Records Council Wales Digital Preservation Working Group Case Study: http://www.nationalarchives.gov.uk/documents/Cloud-Storage-casestudy_Wales_2015.pdf, accessed 09/03/2016

East of England Digital Preservation Regional Pilot Project, MLA East of England and EERAC 2006, http://www.data-archive.ac.uk/media/1680/DARP_finalreport.pdf, accessed 08/03/2016

Reference Model for an Open Archival Information System (OAIS). Magenta Book. Issue 2., Consultative Committee for Space Data Systems 2012, http://public.ccsds.org/publications/archive/650x0m2.pdf, accessed 11/03/2016

Trustworthy Repositories Audit & Certification: Criteria and Checklist, OCLC and CRL 2007, https://www.crl.edu/sites/default/files/d6/attachments/pages/trac_0.pdf, accessed 11/03/2016
  • 15. Appendix A

Digital Records Accessioning Checklist (Draft Version January 2016)
26 January 2016

Accessioning Digital Records Guide

Accessioning Digital Records does not differ from the standard Accession Procedure. Please refer to it first and follow the steps as described.

Accessioning Digital Records corresponds to the Ingest entity within the OAIS functional model. For the purpose of this guide it is broken down into Transfer and Appraisal, since donors and depositors do not always perform selection and arrangement prior to transferring digitally born archives. If the digital material is in need of preservation (converting to preservation file formats), this should also precede Ingest.

[Workflow diagram: Transfer, Appraisal, Preservation, Ingest, Archival Storage, Access]

(1.1.1) At the point of contact negotiate technical details for the delivery of digitally born archives. Please refer to (and explain to the donor/depositor) the Advice to Creators of Digital Records document.

(1.1.2) Obtain Intellectual Property Rights and permission to manipulate and/or destroy (delete) digital material for the purpose of preservation, appraisal (in line with NRO Collecting Policy) and access. Permission is needed to authorise the procedures necessary to meet preservation objectives, for example: creating a new version of the archived item so that it can be rendered by current technologies, discarding material that does not meet the criteria of the NRO Collecting Policy, or creating access copies to be published online on the NRO website.

(1.2.1) Run a virus scan on the received digital records on a designated Digital Preservation workstation (quarantined work area) that accepts all incoming digital records accessions. This will happen automatically if the digital content is sent as a file transfer over the internet or another network (FTP transfer, e-mail, via downloadable link etc.).

(1.2.2) If the digital records are received on removable media (DVD, CD-ROM, USB Memory Stick etc.), connect them to the designated DP workstation to run the virus scan.

(1.2.3) Make sure to update the Removable Media Inventory if the media are to be retained in the Cold Store. Number all carriers consecutively based on their accession number, e.g., ACC 2012/33 RM 2 of 5 (where RM stands for Removable Media).

(1.3.1) Irrespective of how the digital records are delivered (an accession might comprise both file transfers and removable media as a hybrid/mixed accession), two copies of the entire dataset comprising the accession should be created. The first respects original order and preserves all digital material as it was deposited/donated. The second is a working copy, created in exactly the same manner, but intended for an archivist to perform appraisal (AKA digital curation). Both datasets should be structured as originally delivered. In the case of a mixed accession that includes different removable media, a separate directory should be created for each carrier of digital material. These should be named according to the convention:

<Sequential Number 000>_<Type of Carrier>_<Label if Applicable>
  • 16. For example: 001_Floppy-Disc_Accounts-1993, with all such directories placed within a removableMedia folder located at the top level directory you are working in.

(1.3.2) Review the content of the accession and create intellectual arrangements. This is the Appraisal stage: it is desirable to decide at this point how to structure the digital records into a Submission Information Package. If some of the files fall outside the scope of the collection policy they should be deleted now, subject to the donor’s/depositor’s permission. Please refer to the File Formats Action List document to identify superfluous objects.

(1.3.3) As part of the above process you should ensure that all the changes are recorded. Create a metadata folder, and a metadata.csv file inside it using the Excel spreadsheet template (coming soon), next to the submissionDocumentation folder within the directory storing the digital objects you are working on. Provide descriptive information at the appropriate level, complying with the Dublin Core metadata standard.

(1.4) Create Disk Images for removable media using BitCurator if it is important to preserve functionality or particular features of the physical carrier of digital content (for example the menu of a DVD Video or an interactive CD-ROM).

(1.5.1) Collect Fixity Information: check whether fixity information is delivered by the donor/depositor and verify that the files are not corrupted.

(1.5.2) Generate Fixity Information: check checksums before and after transfer whenever data are copied from one storage system (physical discs, CD-ROM, DVD, USB memory stick etc.) to another (local file system, server storage, shared drives etc.). Check file count and file size.

Tool: Fsum Frontend (http://fsumfe.sourceforge.net/index.php?page=usage)

To create a checksum file .sha2 for all files within a directory:
1. Select in the menu « Generate check file ».
2. Select the folder containing your files.
3. To choose the location where the result is saved, select the second option from the drop-down menu: “1 file in any place”. Otherwise the program uses the first option, “1 file in tree root”, and creates the file within the directory containing your files.
4. Select the format SHA2 512 (used by Archivematica) and click the button « Generate ».

(1.6) Scan for sensitive and personal information (credit card details, addresses, phone numbers etc.)

(1.7) Identify file formats using DROID (http://www.nationalarchives.gov.uk/information-management/manage-information/policy-process/digital-continuity/file-profiling-tool-droid/)
  • 17. (1.8) Include a digital copy of your email correspondence with the donor/depositor, and any other documentation related to the transfer, within the submissionDocumentation folder, which should be created within the top directory you are working on. With a view to using Archivematica in the future, all metadata generated throughout the accessioning process (checksums, DROID results etc.) should be put into the submissionDocumentation folder, next to the metadata folder that will contain the metadata.csv file created by an archivist according to the Excel template.

(1.9) Assess the overall size of the accession in bytes and convert the figure to an easily readable unit (e.g. MB, GB, TB).

Glossary

Fixity information - hash, message digest, checksum, manifest file

Bit Rot - On magnetic media the binary digits are (essentially) represented by individual particles of magnetic material whose polarity represents either 1 or 0. Sometimes interference or general degradation of the media can cause these particles to flip, reversing the meaning of a particular bit. On optical media, physical damage or decay of the dyes used in writable DVDs and CDs has similar effects. There is usually a degree of error correction built in, but eventually the errors can build up and corrupt data irretrievably.

Disk Image - copy of the bitstream that is read off the disk through the computer’s input/output equipment. The standard forensics software that creates a disk image also generates a cryptographic hash of the entire disk image.
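Steps (1.5.2) and (1.9) of the checklist could also be scripted rather than carried out in a GUI tool. The following is only a minimal sketch in Python, not part of the NRO procedure: the function names (`build_manifest`, `compare_manifests`, `total_size`, `human_readable`) are hypothetical, and it assumes SHA-512 digests (the format the checklist selects in Fsum Frontend) and binary multiples (1 KB = 1024 bytes) for the readable size.

```python
import hashlib
from pathlib import Path


def sha512_of(path: Path, chunk_size: int = 1024 * 1024) -> str:
    """Return the SHA-512 hex digest of one file, read in chunks."""
    digest = hashlib.sha512()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root: Path) -> dict[str, str]:
    """Map each file path (relative to root) to its SHA-512 digest."""
    return {
        p.relative_to(root).as_posix(): sha512_of(p)
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def compare_manifests(before: dict[str, str], after: dict[str, str]) -> list[str]:
    """List files that are missing, unexpected or altered after a copy."""
    problems = []
    for name in sorted(set(before) | set(after)):
        if name not in after:
            problems.append(f"missing: {name}")
        elif name not in before:
            problems.append(f"unexpected: {name}")
        elif before[name] != after[name]:
            problems.append(f"checksum mismatch: {name}")
    return problems


def total_size(root: Path) -> tuple[int, int]:
    """Return (file count, total size in bytes) for everything under root."""
    files = [p for p in root.rglob("*") if p.is_file()]
    return len(files), sum(p.stat().st_size for p in files)


def human_readable(size_in_bytes: int) -> str:
    """Convert a byte count to B/KB/MB/GB/TB (assumption: 1 KB = 1024 bytes)."""
    size = float(size_in_bytes)
    for unit in ("B", "KB", "MB", "GB", "TB"):
        if size < 1024 or unit == "TB":
            return f"{size:.1f} {unit}"
        size /= 1024
```

In use, a manifest would be built on the source carrier before copying and again on the destination afterwards; an empty list from `compare_manifests` indicates that file count and checksums agree. The resulting figures could then be saved in the submissionDocumentation folder alongside the other accessioning metadata.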
  • 18. Appendix B

Advice to Depositors of Digital Records

Transfer of Intellectual Property Rights

Archives may not be able to assign their limited resources to the task of preserving data whose value is unknown, but at the same time there is a need to preserve ‘valuable’ datasets. This is why we ask for permission to manipulate and/or destroy digital content donated to us, so that we can ensure the best use of our resources and prioritise accepting deposits in accordance with our collecting policy (see our website: http://www.archives.norfolk.gov.uk/view/NCC098771). Permission to destroy is needed in order to perform preservation tasks as well as to ensure that Norfolk Record Office meets the requirements of its collecting policy.

Please be aware that by signing the accession form you give Norfolk Record Office the authority to process, migrate and destroy the data for the purpose of preservation. This means that original data carriers (the removable media on which the data were stored, such as USB Memory Sticks, CDs, Floppy Discs etc.) may be discarded.

Norfolk Record Office preserves collections donated to it so that they are accessible to the general public. Please let us know if the digital records contain any sensitive or confidential information, so that public access can be restricted for a suitable period of time.

Metadata

In order to ensure that the digital records are accessible in the future we must collect all the information required to open and view digital files. To the best of your knowledge please provide us with information about:
- What software (including version) was used to create, open, read, edit and save the file/s;
- What operating system was used (including version; for example Windows XP Service Pack 2003, Mac OS X 10.6.8 etc.);
- For what purpose the data were generated/created and around what time.

Please complete the Digital Files and Removable Media Inventory form (Excel spreadsheet template), which will list the content of your deposit and, whenever possible, its size on disk.

For Current Records managers

Preferred Deposit Format

In order to ensure long-term sustainability of access to the records it is recommended that records managers use current preservation formats. If it is within the means and resources of your organisation, export the data that you want to preserve and deposit with NRO in the following formats.
  • 19. Media Type        File Formats
Text              PDF/A: Portable Document Format (Archival; ISO 19005-3 compliant)
Image (Raster)    TIFF: Uncompressed Baseline Tagged Image File Format v.6 (No LZW compression)
                  PNG: Portable Network Graphics (lossless compression)
Image (Vector)    SVG: Scalable Vector Graphics File
Sound             WAV: Broadcast Wave Format

For Existing Digital Records

Accepted Deposit Format

Media Type        File Formats
Text              DOCX: MS Word Open XML Document (created in MS Office 2007 and above)
                  XLSX: MS Excel Open XML Document (created in MS Office 2007 and above)
                  PPTX: MS PowerPoint Open XML Document (created in MS Office 2007 and above)
                  ODT: OpenDocument Text Document (created in OpenOffice)
                  ODS: OpenDocument Spreadsheet (created in OpenOffice)
                  ODP: OpenDocument Presentation (created in OpenOffice)
                  PDF/A: Portable Document Format (Archival)
                  TXT: Plain Text File (ANSI or UTF-8 encoded)
                  RTF: Rich Text Format File
                  XML: Extensible Markup Language Data File
                  CSV: Comma Separated Values File
Image (Raster)    TIFF: Tagged Image Format File
                  PNG: Portable Network Graphic
Image (Vector)    SVG: Scalable Vector Graphics File
Sound             WAV: Waveform Audio File Format
                  AIFF: Audio Interchange File Format
                  MP3: Moving Picture Experts Group Layer 3 compression
                  FLAC: Free Lossless Audio Codec File
                  OGG: Ogg Vorbis Audio File
Video*            MPEG-1/2: Moving Picture Experts Group
                  AVI: Audio Video Interleave File (uncompressed)
                  MOV: Quicktime Movie (uncompressed)
                  MP4: Moving Picture Experts Group (with H.264 encoding)
Email             EML: Electronic Mail Format
3D Graphics       OBJ: Wavefront Object files

DROID

If you are depositing a large amount of data, equivalent to a system migration, please run DROID before submitting your deposit (http://www.nationalarchives.gov.uk/documents/information-management/droid-how-to-use-it-and-interpret-results.pdf).
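As an illustration of how the Accepted Deposit Format table might be applied to an incoming file listing, here is a rough Python sketch. It is an assumption-laden first pass, not the NRO's method: it sorts files by extension only, whereas DROID identifies formats by signature, and an extension cannot confirm properties such as PDF/A conformance or uncompressed video. The names `ACCEPTED_EXTENSIONS`, `media_type_for` and `triage` are hypothetical, and the variants `.tif`, `.aif`, `.mpg` and `.mpeg` are assumed spellings that do not appear in the table.

```python
from collections import defaultdict
from pathlib import Path

# Extensions taken from the Accepted Deposit Format table above.
# Assumption: .tif, .aif, .mpg and .mpeg are common variants added for
# illustration; they are not listed in the table itself.
ACCEPTED_EXTENSIONS = {
    "Text": {".docx", ".xlsx", ".pptx", ".odt", ".ods", ".odp",
             ".pdf", ".txt", ".rtf", ".xml", ".csv"},
    "Image (Raster)": {".tiff", ".tif", ".png"},
    "Image (Vector)": {".svg"},
    "Sound": {".wav", ".aiff", ".aif", ".mp3", ".flac", ".ogg"},
    "Video": {".mpg", ".mpeg", ".avi", ".mov", ".mp4"},
    "Email": {".eml"},
    "3D Graphics": {".obj"},
}


def media_type_for(path: Path) -> str:
    """Return the table's media type for a file, or 'Not listed'."""
    suffix = path.suffix.lower()
    for media_type, extensions in ACCEPTED_EXTENSIONS.items():
        if suffix in extensions:
            return media_type
    return "Not listed"


def triage(paths) -> dict[str, list[str]]:
    """Group file paths by media type so unlisted formats stand out."""
    groups = defaultdict(list)
    for path in paths:
        groups[media_type_for(Path(path))].append(str(path))
    return dict(groups)
```

Files that end up under "Not listed" would be the ones to discuss with the depositor or to check against a DROID profile before the deposit is accepted.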
  • 20. Appendix C

Digital Audit Questionnaire

The aim of this questionnaire is to identify the requirements for future storage and preservation of any digital material in the possession of NRO. This will mainly encompass:
- Digitally born records being deposited with NRO or already held by NRO
- Outputs of digitisation projects (both images and sounds)
- Electronic records generated by NRO itself (organisational records like office administration, email correspondence etc.)

It is important that all employees take part in this exercise (needs assessment) in order to fully understand the scope of the necessary actions to be taken.

PAST ARCHIVE SERVICE ACTIVITIES

1. Are you aware of any important digital material that must be kept (digital files/electronic records like text documents, spreadsheets, scanned images or digital photographs) within your department that is being stored on a network drive, an external hard drive or any type of removable media (DVDs, CDs, floppy disks, memory cards, USB memory sticks etc.)?

2. If yes, can you provide details below:
- Type of Storage (network drive, CD, DVD, external HDD, floppy disk etc.)
- Type of content (spreadsheets, word documents, PDFs, digital images/photographs, scanned documents)
- Volume (size on disk in either MB, GB or TB; if small put less than 1MB)

3. Email: do you know of any emails that you have sent or received that should have been kept for future reference? In this situation would you normally print off the email and file the printout? Would you consider, as an alternative, printing the email to a PDF file and saving it to a designated network drive location?
  • 21. Click here to enter text.

CURRENT ARCHIVE SERVICE ACTIVITIES

1. In your everyday tasks at work do you work with digital material (files)?
Click here to enter text.

2. What are they?
Click here to enter text.

3. Do you think they are important to an extent that they would need to be preserved over time for future access?
Click here to enter text.

4. How strongly would you identify the need to do so? In other words, what is the value of the digital material that you produce/work on? Does it need to be kept by NRO? If yes, for how long?
Click here to enter text.

5. If you’ve answered yes to the above, can you estimate the volume of digital material that is being produced (the amount of data that needs to be kept)? Small (can be specified in MB), Medium (can be specified in GB), Large (can be specified in TB, PB)
Click here to enter text.

Thank you
  • 22. Appendix D

A survey of digitally born archives received by the Norfolk Record Office, compiled with The National Archives’ DROID profiling tool, identified 107 various file formats.

[Chart: File Formats per Type]
Image (Raster) 64%
Miscellaneous 10%
Word Processor 8%
Text (Mark-up) 7%
Email 6%
Page Description 2%
Text (Structured) 2%
Image (Raster), Aggregate 1%
Presentation 0%
Image (Vector) 0%
Audio 0%
Video 0%
Spreadsheet 0%
Audio, Video 0%
Text (Unstructured) 0%
Dataset 0%
Aggregate 0%
Database 0%
Image (Vector), Text (Mark-up) 0%