Finding the balance ipres2013

The Koninklijke Bibliotheek (KB) digitizes the national collection
of the Netherlands. Digitization produces multiple versions of a
publication: a digital access file, a digital master file, back-ups of
the digital versions, and the physical original. This quickly
increases the need for storage capacity and raises questions such as:
Should all versions be stored? Must every version be preserved to
ensure permanent access, or only some, and if so which ones and how?
Based on the collection care plan and the content strategy, a
differentiated storage policy has been set up to establish a relation
between the physical object and its digital counterpart(s). The method
assigns a value to different collection lots and is used to determine
how to apply collection care efficiently.


  1. Finding the Balance. An attempt at modeling differentiated storage for digitized collections: finding the balance between storage, costs and preservation of digitized publications. Trudie Stoutjesdijk, September 5th 2013
  2. How to find the balance: Digitization
     • Multiple versions of a publication
     • Which versions should be stored?
     • Which representation is the object of preservation?
     • How can we reduce the need for storage?
  3. Agenda
     • Who we are
     • What we have
     • Finding the balance
  4. Who we are
     • National Library
     • Strategic Plan 2010-2013
     • We offer everyone access to everything published in and about the Netherlands
     • We improve the national information infrastructure
     • We guarantee long-term storage of digital information
     • We maintain, present and strengthen our collection
  5. What we have
     1. Collection Development Programme
     2. Collection Care Plan
     3. Storage Management
     4. Digital Preservation System
  6. What we have: 1. Collection development programme (2010-2013)
     • Collect and preserve everything published in and about the Netherlands
     • Transition from printed to digital format is a key priority
     • Collect 50% of all Dutch born-digital publications
     • Harvest 10.000 websites
     • Digitization of all the books, periodicals and newspapers since 1470 (60 M pages before 2014)
  7. What we have: 1. Collection development programme: Digitization
     10% of all books, periodicals and newspapers (since 1470) digitized before 2014.
     Output:
     • Digital objects in JPEG2000
     • Different versions of an object: master, access, back-up, physical publication
     Rapid increase in the number of items and in the total cost of storage.
  8. What we have: 2. Collection Care Plan
     Integrated, efficient and effective collection care for both physical and digital collections, based on the following principles:
     • Integrated collection care for digital files and physical objects
     • Value assessment of collections
     • Risk identification
     • Differentiated levels of collection care
     • Care redirected from the most valuable collections to those where the biggest loss of value is expected
  9. What we have: 2. Collection Care
     Differentiated collection care based on a rational selection tool: value assessment
     • Divide the collections into different collection lots or categories
     • Describe collection units
     • Establish the definition of every criterion
     • Rate every collection unit
     • Calculate the average value
     Result: the level and duration of collection care

     Primary criteria       Secondary criteria
     Informational value    Use
     Aesthetic value        Completeness
     Historical value       Condition
     Social value           Provenance
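The "rate every collection unit, calculate the average value" steps on slide 9 can be sketched in a few lines. This is only an illustration: the 1-5 rating scale, the choice to average primary and secondary criteria separately, and the example ratings are all assumptions, not part of the KB's published method.

```python
from statistics import mean

# Criteria named on slide 9. The 1-5 rating scale used below is an assumption.
PRIMARY = ("informational", "aesthetic", "historical", "social")
SECONDARY = ("use", "completeness", "condition", "provenance")

def average_value(ratings, criteria):
    """Return the mean rating of one collection unit over the given criteria."""
    return mean(ratings[c] for c in criteria)

# Hypothetical ratings for a single collection unit (illustrative numbers only).
unit = {
    "informational": 5, "aesthetic": 2, "historical": 4, "social": 3,
    "use": 4, "completeness": 3, "condition": 2, "provenance": 3,
}

# Averaging the two groups separately is an assumption of this sketch.
primary_score = average_value(unit, PRIMARY)
secondary_score = average_value(unit, SECONDARY)
print(primary_score, secondary_score)
```

Keeping the two groups apart makes it easy to drive storage decisions from the secondary values (use, condition) alone, which is how the later slides apply them.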
  10. What we have: 3. Storage strategy
      Hierarchical storage management (HSM)
      • Using several tiers defining different levels of storage quality
      • Based on different needs
      • Use more than one type of media (HDD, magnetic tape)
  11. What we have: 4. Digital Preservation System
      • e-Depot system (DIAS) at the end of its natural life
      • New Digital Preservation System (DPS)
      • 2012: migration from DIAS to the new DPS
      • 2013: new ingest workflows for born-digital publications
      • Next step: new ingest workflows for all the digitized collections
  12. How to find the balance
      It is impossible to preserve all the versions at the highest preservation level.
      The value assessment provides insight into:
      - The level and duration of collection care
      - The relation between the physical object and its digital counterparts
      - The relation between the state of the physical object and the necessity of preservation imaging and sustainable storage
  13. Finding the balance: Differentiated storage model for digitized collections
      A differentiated storage policy has been applied to the digitized collections, based on the following secondary values:
      • Use
        - The availability of digital content for the customer
      • Condition
        - The vulnerability of the physical resources
        - Sustainability of digital storage
      In anticipation of the results of the value assessment we tried to identify classification levels.
  14. Finding the balance: Differentiated storage model for digitized collections
      Collection Care: Classification levels

      Preservation level             1.   2.                               3.                  4.                    5.
      Representation available?
      - Digital master               No   No                               Master light        Preservation master   Preservation master
      - Access file                  No   Yes                              Yes                 Yes                   Yes
      - Physical original            No   Yes                              Yes                 Yes                   Yes
      Preservation copy available?   No   No                               Physical original   Preservation master   Physical original; preservation master
      Effort of conservation / preservation care
      - Active                       -    -                                Physical original   Preservation master   Physical original and digital master
      - Passive                      -    Physical original; access file   Master light        Physical original     -
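For readers who want to work with the five classification levels programmatically, the table on slide 14 can be encoded as a small lookup structure. The sketch below is one possible encoding, not anything the KB describes: the class name, field names and the helper `stored_digital_files` are hypothetical, and only the cell values come from the slide.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

@dataclass(frozen=True)
class PreservationLevel:
    """One column of the slide 14 table (field names are hypothetical)."""
    level: int
    digital_master: Optional[str]      # None where the slide says "No"
    access_file: bool
    physical_original: bool
    preservation_copy: Tuple[str, ...]
    active_care: Tuple[str, ...]
    passive_care: Tuple[str, ...]

LEVELS = {
    1: PreservationLevel(1, None, False, False, (), (), ()),
    2: PreservationLevel(2, None, True, True, (), (),
                         ("physical original", "access file")),
    3: PreservationLevel(3, "master light", True, True,
                         ("physical original",), ("physical original",),
                         ("master light",)),
    4: PreservationLevel(4, "preservation master", True, True,
                         ("preservation master",), ("preservation master",),
                         ("physical original",)),
    5: PreservationLevel(5, "preservation master", True, True,
                         ("physical original", "preservation master"),
                         ("physical original", "digital master"), ()),
}

def stored_digital_files(level):
    """List the digital representations that exist at a given level."""
    entry = LEVELS[level]
    files = []
    if entry.digital_master is not None:
        files.append(entry.digital_master)
    if entry.access_file:
        files.append("access file")
    return files

print(stored_digital_files(2))  # level 2 keeps no digital master
```

A frozen dataclass keeps the table read-only, which suits a policy lookup that should not change at run time.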
  15. Finding the balance: Differentiated storage model for digitized collections
      Collection Care: Classification levels
      Level 1:
      - Lowest imaginable level.
      - For use only.
      - Contains no representations, and there is nothing to preserve.
      - Example: the reference collection, which is being transformed from physical to digital.
      Preservation level 1:
      - Digital master: No
      - Access file: No
      - Physical original: No
      - Preservation copy available? No
      - Active conservation / preservation care: none
      - Passive conservation / preservation care: none
  16. Finding the balance: Differentiated storage model for digitized collections
      Collection Care: Classification levels
      Level 2:
      - Digitized for use.
      - Contains publications that can be digitized more than once.
      - Condition is good and will remain so under the current circumstances.
      - No need for a digital master unless decay strikes.
      - Example: all foreign titles of the Google project.
      Preservation level 2:
      - Digital master: No
      - Access file: Yes
      - Physical original: Yes
      - Preservation copy available? No
      - Active conservation / preservation care: none
      - Passive conservation / preservation care: physical original; access file
  17. Finding the balance: Differentiated storage model for digitized collections
      Collection Care: Classification levels
      Level 3:
      - Digitization for use.
      - Contains objects that represent multiple values.
      - Physical object is in fairly good condition and can be digitized repeatedly.
      - No need for a preservation image.
      - Active preservation: physical original.
      - Example: large parts of the special collection (18th century).
      Preservation level 3:
      - Digital master: Master light
      - Access file: Yes
      - Physical original: Yes
      - Preservation copy available? Physical original
      - Active conservation / preservation care: physical original
      - Passive conservation / preservation care: master light
  18. Finding the balance: Differentiated storage model for digitized collections
      Collection Care: Classification levels
      Level 4:
      - For use and preservation.
      - Objects with high information value but hardly any value as an object.
      - The material can be fragile; digitization can sometimes be done only once.
      - Maintenance of the physical object may not be possible in the future.
      - Create a high-quality preservation master.
      - Example: Metamorfoze, the national programme for the Preservation of Paper Heritage.
      Preservation level 4:
      - Digital master: Preservation master
      - Access file: Yes
      - Physical original: Yes
      - Preservation copy available? Preservation master
      - Active conservation / preservation care: preservation master
      - Passive conservation / preservation care: physical original
  19. Finding the balance: Differentiated storage model for digitized collections
      Collection Care: Classification levels
      Level 5:
      - For use and preservation.
      - Contains fragile, precious objects.
      - Physical object represents primary values that might not be reflected in the digital master.
      - Can only be digitized once.
      - High-quality digital master.
      - Example: Bookbinding of William the Silent.
      Preservation level 5:
      - Digital master: Preservation master
      - Access file: Yes
      - Physical original: Yes
      - Preservation copy available? Physical original; preservation master
      - Active conservation / preservation care: physical original and digital master
      - Passive conservation / preservation care: none
  20. Finding the balance: Differentiated storage model for digitized collections
      Digitized collections and storage costs
      Currently the output of the digitization process is a digital master and a digital access file.
  21. Finding the balance: Differentiated storage model for digitized collections
      Total costs by type of publication

      Books              Storage/year   Digitization / page
      Master             € 0,01         € 0,72
      Access file        € 0,008        € 0,56
      Master & Access    € 0,02         € 1,28

      Newspapers         Storage        Digitization
      Master             € 0,02         € 1,08
      Access file        € 0,01         € 0,93
      Master & Access    € 0,05         € 2,01

      Journals           Storage        Digitization
      Master             € 0,01         € 0,77
      Access file        € 0,009        € 0,61
      Master & Access    € 0,01         € 1,38

      Costs based on TCO storage & digitization
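As a rough aid for reading the cost table on slide 21, the digitization figures can be scaled to a batch of pages. The sketch below only does arithmetic on the numbers from the slide. Treating the newspaper and journal figures as per-page costs (the slide labels only the books column "per page"), the function name, and the batch size of 1,000 pages are assumptions, and the result is not the KB's own savings estimate.

```python
from decimal import Decimal

# Digitization cost in euros, copied from slide 21 with decimal commas
# written as points. Assumption: all three figures are per page.
DIGITIZATION_COST = {
    "books": {"master": Decimal("0.72"), "access": Decimal("0.56"),
              "both": Decimal("1.28")},
    "newspapers": {"master": Decimal("1.08"), "access": Decimal("0.93"),
                   "both": Decimal("2.01")},
    "journals": {"master": Decimal("0.77"), "access": Decimal("0.61"),
                 "both": Decimal("1.38")},
}

def digitization_cost(kind, representation, pages):
    """Total digitization cost in euros for a number of pages (hypothetical helper)."""
    return DIGITIZATION_COST[kind][representation] * pages

# Hypothetical batch of 1,000 book pages.
both = digitization_cost("books", "both", 1000)
access_only = digitization_cost("books", "access", 1000)
print(both, access_only)
```

Decimal is used instead of float so that the per-page amounts add up exactly, as they do in the digitization column of the slide.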
  22. Finding the balance: Differentiated storage model for digitized collections
      Classification levels & cost savings
      Applying the five-level classification model reduces the storage costs of digitized publications at two levels:
      • Level 2 will not contain digital master files. This could reduce the costs by 30-40%.
      • Level 3: a digital master light will be created. A master light could require lower image quality than a preservation master, which could reduce the size of a digitized publication and therefore its storage costs.
  23. Finding the balance: Differentiated storage model for digitized collections
      Alternatives for cost saving
      New digital master or digital access files are needed when:
      • the access file no longer meets the requirements of the user,
      • technology offers new opportunities, possibly better and smaller digital masters,
      • the decay of the physical original appears to be stronger than expected...
      Rescanning and/or conversion?
  24. Finding the balance: Differentiated storage model for digitized collections
      Rescanning: i.e. re-digitization of (parts of) the collection.
      • Level 1 has no objects.
      • Level 2: when decay increases.
      • Level 3: has 2 digital copies; decay / obsolescence.
      • Levels 4 & 5: rescanning is undesirable / impossible.
      Conversion: generate a digital access file from the digital master.
      • Can offer a solution for levels 4 and 5 (vulnerable physical collections).
      Conversion on the fly: generate a digital access file on demand.
      • Suitable for levels 3, 4 and 5; access files don't need to be stored.
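The options slide 24 lists per preservation level can be summarised as a simple lookup. This is one reading of the slide, not a rule set published by the KB: the function name and the option labels are hypothetical, and conversion is listed only for levels 4 and 5 because those are the levels the slide names.

```python
def refresh_options(level):
    """Return the options slide 24 names for obtaining new digital files."""
    options = {
        1: [],                                   # level 1 has no objects
        2: ["rescan"],                           # when decay increases
        3: ["rescan", "convert on the fly"],     # two digital copies exist
        4: ["convert", "convert on the fly"],    # rescanning undesirable / impossible
        5: ["convert", "convert on the fly"],
    }
    if level not in options:
        raise ValueError(f"unknown preservation level: {level}")
    return options[level]

for lvl in range(1, 6):
    print(lvl, refresh_options(lvl))
```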
  25. Finding the balance: Differentiated storage model for digitized collections
      Conversion and/or on-the-fly conversion
      Pros
      • Appropriate and efficient method for permanent storage of and access to the collections
      • Good solution for the collections at levels 4 & 5
      • Probably cost saving on production
      • Cost saving on storage
      Cons
      • A system-intensive activity that could create a bottleneck in the delivery to the end user
      • Insufficient knowledge about the technique
      • No insight into the costs
      The Research Department has started research on conversion.
  26. Finding the balance: Differentiated storage model for digitized collections
      Wrap up: we tried to realise a model.
      Lessons learned:
      • Value assessment helps to gain insight into the value of collections.
      • The USE and CONDITION of collections help to find a balance between permanent access and costs.
      • Transparency of the costs.
      • Rescanning is not feasible for publications that are in a vulnerable state.
      • Conversion might seem preferable to rescanning.
      • Investigation of the conversion / on-the-fly conversion technique is necessary to gain insight into the benefits of this method, in particular with respect to applicability, performance and efficiency.
  27. Thank you! Trudie Stoutjesdijk, September 5th 2013
