Permanent access to digital material
Presentation by Barbara Sierman (National Library of the Netherlands) at the conference "Biblioteques patrimonials: conservant el futur, construint el passat" ("Heritage libraries: preserving the future, building the past"),
organised by the Biblioteca de l’Ateneu Barcelonès on 24 November 2010

Usage Rights

© All Rights Reserved

  • Thank you for inviting me to this beautiful city and this beautiful library. I congratulate you on your birthday and wish you another 150 years of library activities, which may be quite different from what you did in the past. I also brought you a small present: a chocolate likeness of the most famous person who arrives in the Netherlands from Spain every year to bring presents to children and grown-ups. But now back to the topic of this presentation.
  • Digital Preservation Activities in Europe
  • Digitization projects and number of pages:
    • Dutch parliamentary papers 1814-1995: 2.500.000
    • Dutch Print Online (1781-1800): 2.100.000
    • Databank of Digital Daily newspapers: 8.000.000
    • The Memory of the Netherlands
    • Metamorfoze
    • Various smaller projects
    • Google Book project (to come): 160.000 books (18th-19th century)
    • Digitization of periodicals 1840-1950: 1.500.000
  • This year, 2010, the Digital Universe will grow almost as fast, to 1.2 million petabytes, or 1.2 zettabytes. (There’s a word we haven’t had to use until now.) While much of the digital content we create is simply not that important (not much different from the paper magazines and newspapers that we throw away, or the telephone conversations, receipts, bad pictures, etc., that we never save), the amount of data that does require permanent or longer-term preservation for a multitude of reasons is increasing exponentially. But as we peer into the future, we see the greatest challenges are related not to how to store the information we want to keep, but rather to: reducing the cost to store all of this content; reducing the risk (and even greater cost) of losing all of this content; extracting all of the value out of the content that we save. IDC data shows that nearly 75% of our digital world is a copy; in other words, only 25% is unique.
  • (source: http://www.webarchive.org.uk/ukwa/statistics)
  • The way that material is published changes rapidly. When we started with the e-Depot, our e-journals consisted mainly of PDF files. Over the years, as the PDF format offered more possibilities, the PDF files contained more embedded material. Now we also receive supplemental files in all kinds of file formats. You all know the developments around e-books, with their own proprietary reading devices like Amazon’s Kindle or the Nook. During the Driver project we investigated a new form of scholarly publication, the enhanced publication, where not only the text of the article is stored but also related material like datasets, websites, videos, etc. We can expect this to become more complicated, and maybe we, as a library, need to reconsider whether our definitions of “publication” are still valid.
  • While much of the digital content we create is simply not that important (not much different from the paper magazines and newspapers that we throw away, or the telephone conversations, receipts, bad pictures, etc., that we never save), the amount of data that does require permanent or longer-term preservation for a multitude of reasons is increasing exponentially. But as we peer into the future, we see the greatest challenges are related not to how to store the information we want to keep, but rather to: reducing the cost to store all of this content; reducing the risk (and even greater cost) of losing all of this content; extracting all of the value out of the content that we save. IDC data shows that nearly 75% of our digital world is a copy; in other words, only 25% is unique. (©2010 IDC)
  • Although there are many similarities with analogue material like books or journals (authenticity, collection-building requirements, etc.), there is also a major difference with digital material: the interpreter you need to translate the bits into meaningful information. There also lies the risk:
  • This is the updated version of the OAIS model, the basic conceptual model for nearly all organisations involved in digital preservation. It offers a shared set of terminology and concepts.
  • A simplified OAIS model. It may look familiar from your own organisation, with the difference that these activities are not isolated but highly influence each other. I’ll explain how we do that at the KB.
  • The KB recently started a new programme with the central theme of truly becoming a digital library. Although we have been involved in digital preservation for more than a decade, this reorganisation of activities is intended to lead to an integrated approach to digital and analogue material. Digital material will be the preferred format when a publication arrives both digitally and physically.
  • International: 20 international publishers (STM), 10.000 journal titles, open access and restricted access. National: agreement with the Dutch Publishers Association, 16 academic repositories. Web archiving: 3000 websites, 4 TB.
  • Note: not all boxes have the same colour (is that not the case in the original?!). This is the official new organisation. With regard to digital preservation, my personal opinion is that this overview would also need a kind of controller function to relate the activities and to judge whether the separate activities lead to the goal of well-preserved material. Currently the working processes are being designed, and they should help to achieve this. This slide could come later, when explaining that DP is a matter for the whole organisation, and could also be linked to RAC. Identifying departments that contribute to the activity of permanent access will be helpful in identifying and streamlining activities and in writing job descriptions; it might even save resources and define clear responsibilities.
  • DP within the organisation is still developing. The e-Depot was part of Collection Management, but in practice it did not integrate with the paper world and remained separate. In the new organisation we aim for more integration: aligning policies with each other (to be discussed later: choices in acquisition, with digital preferred). Digital preservation should not be an isolated activity but part of the whole organisation. To show you some examples, I take the new organisation chart of the KB.
  • The process will not be discussed from A to Z, but from Z to A. It is easier to understand why certain process steps take place once you know what the requirements of the system are. That is why we start with access and end with delivery. In this presentation the path of the digital file is discussed in four phases: access and availability; storage; processing; delivery and distribution. For each of these processes I will draw a comparison with the physical library. For the e-Depot I will zoom in on each phase to clarify what happens to the digital file at that point.
  • A policy is a commitment of the organization towards digital preservation. KB: policies in progress, not ready yet.
  • Recently a Spanish translation of the PREMIS model was launched!
  • These are local initiatives, but the topic has the attention of the European Commission and the Alliance for Permanent Access, and new European projects related to DP always have a training work package to disseminate knowledge and research results.
  • As long-term preservation is a long-term commitment, it is important that the organization in charge realizes this and manages the costs well. Although it is not easy to predict the costs of digital preservation, some important publications can support organizations. To name just two: the LIFE project (UCL and the British Library, a UK project), a tool to support the prediction of costs over the life cycle of digital objects, and the Blue Ribbon Task Force, which helps to set the focus: those who pay for preservation are not necessarily the ones who benefit; digital preservation is a derived activity, and access is what matters.
  • Digital material is entrusted to organizations. The question for the organization is: am I doing the right things? And for society: is this organization trustworthy? Self-audits can help: DRAMBORA to identify risks, and TRAC to do a self-audit and get a feeling for it. This is being prepared to become an ISO standard with an audit body. Funding bodies will require this in the future.
  • Digital preservation is too complicated to do on your own. Collaboration will lead to better results. Valuable networking: national / international.
  • The web is full of material of interest for people involved in taking care of digital material. The websites of large libraries and archives, and of European projects or projects in the United States like NDIIPP, offer an overwhelming amount of material. No one can do it on their own; there are basic rules to follow, but there are no strict guidelines yet. Please help to develop these guidelines, like the set of guidelines for books written down some centuries ago. As William Dougherty, the Executive Director of Network Infrastructure & Services, Virginia Tech [University], wrote in an article about digital preservation in the Journal of Academic Librarianship: “There is no better group to be in charge of developing and promulgating standards for the future of digital materials than librarians.” Thank you.

Permanent access to digital material: Presentation Transcript

  • Permanent access to digital material
    • Barbara Sierman 24-11-2010
  • Congratulations! 150
  • Digital Material at (National) Libraries
    • Born digital deposit
    • Mass digitization
    • Web harvesting
  • Some figures: digitization
    • Europeana: based on billions of digitized objects in libraries, museums and archives
    • Google Books Initiative: over 10 million books in Google Books Search
    • KB-NL: several projects, over 14 million pages, plus Google digitization: 160.000 books
  • Some figures: webarchiving
    • Deposit laws
    • Domain crawls
    • Selections of national sites
    • Special projects: elections, football matches, Olympic Games
    • Examples:
      • UK Web Archive: 8000 archived websites, 7,5 TB (October 2010)
      • KB-NL: 3000 selected sites, 4 TB
  • Developments in born-digital material
    • E-journals
    • E-books
    • Enhanced publications
    • Websites
    • Etc.
  • Growing amount of data
  • How to keep this accessible?
  • How to preserve books for eternity? (1527) Shelfmark: KB 133 F 2
  • Digital material is different
  • The OAIS model
  • The digital activities in general
  • The digital library at the KB
    • Strategic priorities 2010-2013
    • We offer everyone access to everything published
    • in and about the Netherlands.
    • We improve the national information infrastructure.
    • We guarantee long-term storage of digital information.
    • We maintain, present and strengthen our collections.
  • A bit of the history…
    • Involved in digital preservation since the 1990s
    • 1990-2003: pilots (Nedlib)
    • 2002: archiving agreement with Elsevier
    • 2003: IBM / DIAS system
    • 2010: requirements for new system finalized; reorganising the organisation
    • 2013: successor of DIAS system
  • What do we preserve?
    • International e-depot
    • National collection
    • Web archiving (separate, not yet long-term storage)
    • Digitization projects (separate, not yet long-term storage)
  • [Figure: KB organisation chart. Units shown: System management; Collection Building; User Services; Collection Mgt.; Metadata; Ingest; Consultancy & Advice; Board of Governors; Collections; User Services; Document Processing; Digitization; Product Support; Corporate Communication; Finance & Corporate Services; Information Technology - Management; Human Resources; Policy Support; Building & Facilities; Operations; Online Services; Marketing Communication; Collection Care; Marketing & Services; Finance; Innovative Projects; Innovation & Development; Director General; Research; Information Technology - Development; Corporate Secretary; Project Management Office]
  • The KB digitization projects
    • Long term storage of the master files
    • Project: # pages
      • Dutch parliamentary papers 1814-1995: 2.500.000
      • Dutch Print Online (1781-1800): 2.100.000
      • Databank of Digital Daily newspapers: 8.000.000
      • The Memory of the Netherlands
      • Metamorfoze
      • Various smaller projects
      • Google Book project (to come): 160.000 books (18th-19th century)
      • Digitization of periodicals 1840-1950: 1.500.000
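As a rough consistency check, the page counts listed for the individual projects can be added up and compared with the "over 14 million pages" quoted earlier for KB-NL. The sketch below is a minimal illustration in Python. The pairing of each figure with a project follows the order on the slide and is an assumption, projects listed without a page count are left out, and the names `pages_per_project` and `total_pages` are hypothetical.

```python
# Page counts per project as listed on the slide.
# Assumption: each figure belongs to the project named just before it.
# The Google Book project is counted in books, not pages, so it is excluded.
pages_per_project = {
    "Dutch parliamentary papers 1814-1995": 2_500_000,
    "Dutch Print Online (1781-1800)": 2_100_000,
    "Databank of Digital Daily newspapers": 8_000_000,
    "Digitization of periodicals 1840-1950": 1_500_000,
}


def total_pages(counts):
    """Return the sum of all page counts in the mapping."""
    return sum(counts.values())


total = total_pages(pages_per_project)
print(f"{total:,} pages")  # prints "14,100,000 pages"
```

Under these assumptions the listed projects add up to 14.1 million pages, which agrees with the rounded figure on the earlier slide.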
  • DIAS: current workflow [Figure: workflow diagram. Labels shown: Pre-process; Batch Builder; Error recovery; e-Post Office; Supplier; Identifier; Object; Metadata; CMS/DIAS; Catalogue; End user. Stages: INGEST, ARCHIVAL STORAGE, ACCESS]
  • Improvements needed
    • Why? Because the world changed…
    • More standards, PREMIS metadata
    • More heterogeneous material; workflows not dedicated to one type
    • More variety in file formats: checks
    • More storage needed
    • More quality control
    • In production: 2013
  • General aspects – developments
    • Preservation Policies
    • Metadata
    • Staff
    • Costs
    • Audit and certification
    • Research
  • Policies for digital preservation
    • Ideally: available before starting
    • Reality: many organisations work without…
    • Why is this not desirable?
      • Starting points are unclear
      • Unclear what to expect as a user or as society
      • No leading principles to do actions
      • Organizational involvement unclear
  • Metadata
      • How to find the object (descriptive metadata)
      • Characteristics of the object (preservation metadata)
      • Technical information
      • Access restrictions: rights, DRM etc.
      • What is needed to render the object
      • What happened before with the object (provenance)
      • The context of the object
      • Evidence of its trustworthiness
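The kinds of metadata listed on this slide can be pictured as fields of one record per digital object. The following is a minimal sketch in Python, not the PREMIS data model and not the KB's schema: the class names `ObjectRecord` and `ProvenanceEvent`, all field names, and the functions `make_record` and `verify` are hypothetical. Using a SHA-256 checksum as one piece of "evidence of trustworthiness" is also an assumption made for illustration.

```python
from dataclasses import dataclass, field
import hashlib


@dataclass
class ProvenanceEvent:
    """One thing that happened to the object (provenance)."""
    date: str
    action: str
    agent: str


@dataclass
class ObjectRecord:
    identifier: str               # how to find the object
    title: str                    # descriptive metadata
    file_format: str              # technical information
    size_bytes: int               # technical information
    access_rights: str            # access restrictions: rights, DRM etc.
    rendering_requirements: list  # what is needed to render the object
    context: str                  # the context of the object
    sha256: str                   # assumed evidence of trustworthiness
    provenance: list = field(default_factory=list)


def make_record(identifier, title, content, file_format, ingest_date,
                access_rights="restricted", rendering_requirements=None,
                context="", agent="ingest workflow"):
    """Build a record for `content` (bytes) and log the ingest event."""
    record = ObjectRecord(
        identifier=identifier,
        title=title,
        file_format=file_format,
        size_bytes=len(content),
        access_rights=access_rights,
        rendering_requirements=list(rendering_requirements or []),
        context=context,
        sha256=hashlib.sha256(content).hexdigest(),
    )
    record.provenance.append(ProvenanceEvent(ingest_date, "ingest", agent))
    return record


def verify(record, content):
    """Return True if `content` still matches the stored checksum."""
    return hashlib.sha256(content).hexdigest() == record.sha256


content = b"%PDF-1.4 example bytes"
record = make_record("obj-0001", "Example article", content,
                     "application/pdf", ingest_date="2010-11-24",
                     rendering_requirements=["PDF 1.4 viewer"])
print(verify(record, content))          # True
print(verify(record, content + b"x"))   # False
```

Keeping the checksum and the provenance events in the same record means a later check can show both that the bits are unchanged and what was done to the object in the meantime.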
  • Origin of these metadata
    • Some information added
    • Some information extracted: JHOVE, DROID, FIDO
    • Information stored in central places
      • File format registries
      • Rendering information
    • Less manually, more automatically!
    Universal digital format registry
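Automatic extraction of format information usually starts from byte signatures ("magic numbers") at the beginning of a file. The sketch below illustrates only that general idea in Python, with a handful of well-known signatures. It is not how JHOVE, DROID or FIDO are implemented, which work from far larger signature sets and also validate files; the names `SIGNATURES` and `identify` are hypothetical.

```python
# A tiny signature table: (leading bytes, format name).
# Real identification tools use far larger, maintained signature registries.
SIGNATURES = [
    (b"%PDF-", "PDF"),
    (b"\x00\x00\x00\x0cjP  \r\n\x87\n", "JPEG 2000 (JP2)"),
    (b"\x89PNG\r\n\x1a\n", "PNG"),
    (b"PK\x03\x04", "ZIP container"),
]


def identify(data: bytes) -> str:
    """Return the name of the first signature that matches the start of data."""
    for magic, name in SIGNATURES:
        if data.startswith(magic):
            return name
    return "unknown"


print(identify(b"%PDF-1.4\n..."))           # PDF
print(identify(b"\x89PNG\r\n\x1a\n\x00"))   # PNG
print(identify(b"plain text"))              # unknown
```

A match on the leading bytes only suggests a format. Whether the file is actually well-formed has to be checked separately, which is one reason the slide asks for more automated checks.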
  • Staff requirements
    • New profession: learning in practice / by doing
    • Developments underway for curriculum:
    • National initiatives, for example:
      • nestor in Germany
      • JISC, UK
      • University of North Carolina, USA
    • Support by European Commission / APA
    • Self-learning: training material on the web
    • KB collaboration with Leiden University
  • Costs
    • LIFE (Lifecycle Information for E-Literature)
    • Blue Ribbon Task Force on sustainable digital preservation and access
  • Audit and certification
    • TRAC
    • DRAMBORA (risk identification)
    • RAC: development of ISO standard, published this year
  • Research in the KB-NL (2010)
    • File formats
      • PDF
      • JP2(K)
    • Preservation policies
    • Preservation planning
    • Emulation
    • Enhanced publications
    • From research to practice: new e-Depot, reorganisation
  • Participation of KB in European projects
    • Finished: Planets, PARSE.Insight, DRIVER I and II
    • Current or starting soon: KEEP, APARSEN, SCAPE
    • Founding member of Open Planets Foundation
  • To summarize:
    • Permanent access to digital material is a challenge
    • No organization can do it on its own
    • Collaboration is needed
    • Welcome to the community for permanent access!