Data Archiving and Networked Services




Towards a durable data
infrastructure for
historical census data
CEDAR Symposium
1 March 2013

rene.van.horik@dans.knaw.nl




DANS is an institute of KNAW and NWO
Outline
1. Background on digital preservation
  1.   Threats and strategies
  2.   OAIS-reference model
  3.   APARSEN Network of Excellence on Digital Preservation
2. Reporting on “HisTel” project
Digital Preservation
• Threats to future access to digital information
   –   Format obsolescence
   –   Not possible to render object
   –   Operating system obsolescence
   –   Hardware failure
   –   What is it?
• Strategies
   – Technology preservation strategy
   – Technology emulation strategy
   – Digital information migration strategy
Reference Model for an Open Archival
  Information System (OAIS)
• ISO 14721:2012
• OAIS = An Archive that has accepted the responsibility
  to preserve information and make it available for a
  Designated Community
• Open = Developed in open forums
• Framework for Long Term Preservation Strategies
• Framework for describing and comparing architectures
• Expands consensus on elements and processes for LT
  preservation and access
Designated Community
• An identified group of potential Consumers who
  should be able to understand a particular set of
  information. The Designated Community may be
  composed of multiple user communities. A
  Designated Community is defined by the Archive
  and this definition may change over time.
Key terms in digital preservation
 • Providing Evidence / Trust
 • Gap-analysis / Gap-management
 • Representation information
Co-funded by the European Union under FP7-ICT-2009-6



Approach of APARSEN
[Figure: four streams set against four themes: Trust, Sustainability, Usability, Access]
• Stream 1: Integration
• Stream 2: Technical research
• Stream 3: Non-technical research
• Stream 4: Sustainable uptake

Why PIs are crucial in digital preservation
Simon Lambert, STFC
Webinar, Feb 2013
aparsen.eu  #APARSEN



Network of Excellence




Modelling Scholarly Research




Understanding the Information Requirements of Arts and
Humanities Scholarship
Agiatis Benardou, Panos Constantopoulos, Costis Dallas,
Dimitris Gavrilis
doi:10.2218/ijdc.v5i1.141
Source: Past, present and future of historical information science. Onno Boonstra, Leen Breure and
Peter Doorn. 2006
http://www.dans.knaw.nl/en/content/categorieen/publicaties/past-present-and-future-historical-information-science
State of the art: HisTel
• Goal: to create a durable infrastructure to provide
  access to historical statistics
• DANS-EASY = Back-office
• High priority for:
   – Persistent identifiers for tables
   – Automatic log-in / Single sign-on
   – License update
• Front-office?
   – Not decided yet
   – www.volkstellingen.nl is outdated
Issues:
• How to cope with new versions / adjustments of
  datasets?
• Which services should be developed on top of the
  digital archive? (e.g. Image viewer? GIS-tool?)
• Which digital objects should be archived?
  (TabLinker, TabExtractor, Benford’s Law Script?,
  Open Annotation Ontology, etc.)
• Which Representation Information is required?
• Designated Community?
