I Say Emulate; He Says Migrate

  Are emulation or migration feasible
       preservation strategies?

                          National Library of Australia
                          Prepared by:
                          Andrew Stawowczyk Long
                          Presented by:
                          David Pearson
Archiving the Web
• Many institutions actively harvest the web
• Collecting scale varies
• Preservation practices are not well understood or
  implemented
• Collecting intent may differ depending on the
  institution



Web Archives
• Type
  •   Text oriented
  •   Multimedia (video/audio) oriented
  •   Picture oriented
  •   Databases
  •   Combination of all types
• Storage
  • Uncompressed
  • Compressed (WARC)
  • Combination
Web Objects and Elements
• Challenge: Web archives may contain any type of digital object
• Common objects
    • HTML/XML and related (htm, html, xml, css, etc.)
    • Images (raster images – JPEG, GIF, PNG)
    • Media
         • Audio files (au, wav, aiff, midi, mp3)
         • Video files (mov, mpg, wmv, rm)
• Other objects
    • File Archives (usually compressed – zip, tar, gz, arc, sit)
    • Images (raster images – bmp, tiff)
    • Images (vector images - SVG)
    • Text files (txt, csv, rtf)
    • Document files
         • PDF
         • Microsoft Word, Excel, PowerPoint
Comparative statistics of NLA web collections

                    PANDORA (selective)    .au Domain Harvests
  Files             73 million             2.3 billion
  Size              3.26 TB                78.75 TB

  Domain Harvest    2005           2006           2007           2008
  Unique files      185 million    596 million    516 million    1 billion
  Hosts crawled     811,523        1,046,038      1,247,614      3,038,658
  Size              6.69 TB        19.04 TB       18.47 TB       34.55 TB
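
The totals above can be turned into a rough average file size per collection. The sketch below is only an illustration of that arithmetic: it assumes decimal units (1 TB = 10**12 bytes, 1 KB = 1000 bytes), which the slide does not state, and the function name is made up for the example.

```python
# Totals copied from the slide. Decimal units are an assumption.
BYTES_PER_TB = 10**12

collections = {
    "PANDORA (selective)": {"files": 73_000_000, "size_tb": 3.26},
    ".au Domain Harvests": {"files": 2_300_000_000, "size_tb": 78.75},
}


def mean_file_size_kb(files: int, size_tb: float) -> float:
    """Average size per file in kilobytes (1 KB = 1000 bytes)."""
    return size_tb * BYTES_PER_TB / files / 1000


for name, c in collections.items():
    avg = mean_file_size_kb(c["files"], c["size_tb"])
    print(f"{name}: roughly {avg:.1f} KB per file")
```

Both collections work out to a few tens of kilobytes per file on average, which fits archives made up mostly of HTML pages and small images.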
What are we preserving?
                          Preservation Intent

• Preservation of:
   •   Physical media?
   •   Bit-stream (logical form of data)?
   •   Action (rendering data into something useful to user)?
   •   User experience?
• Important Considerations
   • Creator’s perceived intent
   • Institution’s preservation intent




         Based on Heslop and Davis (2002)
What are we preserving?
                            Properties

• Object Properties
  (Properties regarded as important vary depending on the
  intention of the collecting institution)

   •   Derived from file format
   •   High-level – e.g. layout, formatting
   •   Measured – identified directly by computer
   •   Intended – set by the collecting body




Possible Preservation Actions 1
• Emulation
    The original environment is recreated on contemporary hardware using
      specialised software (an emulator) and the original software.

• Renderers
    Specialised software that operates in the contemporary environment
      and is used to access (render) the original files. It is similar
      to emulation.




Possible Preservation Actions 2
• Migration
   Original file formats are migrated (converted) to
   another format that is supported by current
   hardware/software,
   e.g. MS Word 3.0 to MS Word 2008.




Possible Preservation Actions 3
                     Not long-term sustainable

• Technological Museum
  Collect and maintain the original hardware and software


• Take No Action
  Do nothing




Digital Preservation
                             Preliminaries
• Collection objects need to be correctly recognised and
  identified
• Preservation intent(s) need to be defined
• High-level preservation actions need to be defined (e.g. shall
  we use emulation or migration?)
• Practical-level preservation actions need to be defined

     Object Format + Preservation Intent = Appropriate Action



  Dilemma:
  How can data be properly migrated if the preservation intent(s) are
  unknown or not defined?
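
The rule "Object Format + Preservation Intent = Appropriate Action" can be read as a lookup keyed on both values. The sketch below is only an illustration of that idea: the table entries, the intent labels and the function name are hypothetical and do not represent NLA policy. It also shows the dilemma in code form: with no intent defined, no action can be chosen.

```python
from typing import Optional

# Hypothetical policy table. The formats, intents and actions listed here
# are illustrative assumptions, not an actual institutional policy.
POLICY = {
    ("html", "user experience"): "emulation",
    ("doc", "action"): "migration",
    ("jpeg", "action"): "migration",
}


def choose_action(object_format: str, intent: Optional[str]) -> str:
    """Return the high-level action for a format and a preservation intent."""
    if intent is None:
        # The dilemma from the slide: without a defined intent there is
        # nothing to look up.
        return "undecided: preservation intent not defined"
    key = (object_format.lower(), intent.lower())
    return POLICY.get(key, "undecided: no policy for this combination")


print(choose_action("HTML", "user experience"))
print(choose_action("doc", None))
```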
Tools Required for Emulation
• Emulators
    • Fast, stable, flexible, extendable
• Licensed Operating Systems
• Various drivers
• Web browsers
• Browser plug-ins
• Other programs as required (e.g. Java, Adobe Acrobat
  Reader)


Tools Required in Migration
•   Format identifiers
•   Format converters
•   Link updaters
•   QA automatons




CAMiLEON project – Migration on Request Tool
XENA
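
As a rough illustration of what a format identifier does, the sketch below matches the first bytes of a file against a handful of well-known signatures ("magic numbers"). It is a deliberately simplified stand-in, not how any of the tools named in this presentation are implemented. The signature list is a small sample and the function names are invented for the example.

```python
# A few widely documented file signatures. Real identification tools use
# far larger signature sets and also look beyond the first bytes.
SIGNATURES = [
    (b"\x89PNG\r\n\x1a\n", "png"),
    (b"\xff\xd8\xff", "jpeg"),
    (b"GIF87a", "gif"),
    (b"GIF89a", "gif"),
    (b"%PDF-", "pdf"),
    (b"PK\x03\x04", "zip"),
]


def identify(header: bytes) -> str:
    """Return a format label for the leading bytes of a file, or 'unknown'."""
    for magic, label in SIGNATURES:
        if header.startswith(magic):
            return label
    return "unknown"


def identify_file(path: str) -> str:
    """Read the first 16 bytes of a file and identify its format."""
    with open(path, "rb") as handle:
        return identify(handle.read(16))


print(identify(b"%PDF-1.4 example"))
```

Files with no signature in the list come back as "unknown", and plain-text formats such as HTML or CSV have no fixed leading bytes at all, so a check this simple will always leave some files unrecognised.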
Project Tests
           General Testing Environment

• Large slice of uncompressed PANDORA
  archive (random selection)
• The whole Domain Harvest archive has not been
  included in tests (WARC files)
• Multiple hardware combinations
• Multiple OS combinations
• Multiple Web Browsers


Project Tests
                          Material Sample

Testing the industrial-scale tools
• PANDORA slice
  • 861 GB
  • 18,019,172 files
  • 2,379,326 folders
Testing object properties
• Smaller slice of the PANDORA slice
  • 20 objects of each selected type
     • Audio, HTML, images, PDF, video, ZIP, MS documents
Project Tests
                              Methodology
• Large sample testing (861 GB, 18,019,172 files)
       • Attempt to identify objects in the sample using DROID
       • Attempt to migrate JPEG images to PNG and update links


• Small sample testing
       • Select smaller sub-sample, with objects mostly created before the year 2000
       • Identify objects in the sample
       • View and experience selected objects in contemporary environments using
         various platforms, OS and browsers
       • View and experience selected objects in old environments using
         emulations on various platforms, using different OS and browsers
       • Migrate selected objects and review them in various environments
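
The "update links" step of the JPEG-to-PNG test can be sketched as a rewrite of src and href attributes that point at migrated files. This is a minimal illustration under stated assumptions, not the tool used in the project: it assumes simple quoted attribute values, leaves URLs with query strings untouched, and the function name and the set of migrated paths are hypothetical.

```python
import re
from typing import Set

# Matches src/href attributes whose quoted value ends in .jpg or .jpeg.
_LINK = re.compile(
    r"""(?P<attr>\b(?:src|href)\s*=\s*)(?P<q>["'])"""
    r"""(?P<stem>[^"']+?)(?P<ext>\.jpe?g)(?P=q)""",
    re.IGNORECASE,
)


def update_links(html: str, migrated: Set[str]) -> str:
    """Point links at the .png copy, but only for files actually migrated."""

    def swap(match: re.Match) -> str:
        url = match.group("stem") + match.group("ext")
        if url not in migrated:
            return match.group(0)  # leave links to unmigrated files alone
        quote = match.group("q")
        return f"{match.group('attr')}{quote}{match.group('stem')}.png{quote}"

    return _LINK.sub(swap, html)


page = '<img src="img/logo.jpg"><a href="photo.JPEG">x</a><img src="other.jpg">'
print(update_links(page, {"img/logo.jpg", "photo.JPEG"}))
```

Restricting the rewrite to an explicit set of migrated paths keeps a failed conversion from producing a broken link.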


Project Tests
                              Tools tested
• Common
   • DROID
   • JHOVE
   • TrID
   • File Identifier
   • Lister (developed in-house)
   • OS
       – MS Win XP Pro
       – MS Win 3.1
       – MS Win 98SE
       – Ubuntu 9.04
   • Web Browsers
       – MS IE 7
       – Firefox 3
       – Arachne 1.2
       – Mosaic 2
       – Netscape 4
• Emulation
   • QEMU
   • Bochs
   • MS Virtual PC (not exactly an emulator)
   • Dioscuri
• Migration
   • ImageMagick
   • MediaCoder
   • Swf>>avi
   • OpenOffice Tools
   • XENA
Project Tests
                       Control – Current Environment

• Properties observed in selected files
  Object Basic Characteristics (based on Emulation Project by KB)
      1. Content : the text, images, etc. from the object
      2. Structure : the cohesion between different parts of the object
      3. Context : the meaning of the object.
      4. Appearance : the way an object is presented to the user.
      5. Behaviour : the interaction of the object with the user or system.

E.g. for HTML pages:
  • Rendering of text, images, media files
       •   Font, layout, colours, contrast, brightness, animation smoothness, sound quality, etc.
  • Object dependencies
  • Mouse & keyboard behaviour
  • Data extraction

Project Tests
                    Emulated Environments
• Hardware
   • Dell Optiplex GX620, P4, 4.4GHz x 3.39GHz, 3.5 GB RAM
   • Power Mac G4

EMULATORS:
• Bochs
   • Host:         WinXP Pro v2002 SP3
                   Ubuntu 9.04
   • Client:       Win 3.1, MS DOS 6.2
                   WinXP Pro SP2
• Dioscuri 0.4.0
   • Host:         WinXP Pro v2002 SP3
   • Client:       Win3.1, MS DOS 6.2

Project Tests
                  Emulated Environments

• Qemu
   • Host:      MS WinXP Pro v2002 SP3
   • Clients:   MS Win98SE
                MS Win 3.1
                MS DOS 6.2
                Ubuntu 9.04


   • Host:      Ubuntu 9.04
   • Clients:   MS WinXP Pro SP2, P4, 12.92GHz, 256Mb RAM
                MS Win98SE
                MS Win 3.1
• Microsoft Virtual PC
   • Host:      MS WinXP Pro v2002 SP3
   • Clients:   MS Win 3.1
                MS Win98SE
Tests - Summary
                             Emulation


• Setting up emulators was relatively simple.
• Additional software (especially for working with disk images)
  proved to be extremely useful.
• Licensing was at times a big obstacle (e.g. it is impossible to
  emulate a Macintosh environment legally).
• Many dependencies exist. Making programs work correctly is a
  complex task.
   • e.g. Windows XP requires internet or over-the-phone activation after 30 days




Tests – Summary
                                       Emulation

• All emulators
    • Some of the DLL libraries in Win 3.1 were incompatible with the 16-bit Netscape and Mosaic
      programs
• Bochs 2.3.7 for Windows
    • Extremely slow in GUI environments
    • No full-screen mode, which limits the end-user experience
• Dioscuri
    • Sluggish at times
    • Did not accept some of the disk images created in WinImage
• Qemu 0.9.0 for Windows and Linux
    • Much faster but still sluggish at times
    • Win98SE couldn't run in hi-res, hi-colour mode
• Microsoft Virtual PC
    • Relatively fast (it is virtualisation software on a PC) but still sluggish at times
Tests - Summary
             Migration Environment




•Dell Optiplex GX620
•MS Windows XP Pro v2002 SP3
•Networked drive with PANDORA sample




Tests - Summary
                                Migration

• Available tools are imperfect and slow.
   • e.g. DROID took more than two weeks to examine slightly over 18 million
     files, and many of them were not recognised

• It is very difficult to examine the contents of container
  formats (e.g. avi or rm).
• Network connections need to be as fast as possible.
• It is difficult to make an informed decision about
  migration without a clearly defined preservation intent.
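
The DROID figure can be put into perspective with some back-of-the-envelope arithmetic. The sketch below assumes the run took exactly 14 days (the slide says "more than two weeks", so the real rate was lower) and that throughput scales linearly on the same hardware. Both are simplifying assumptions, and the extrapolation to the 2.3 billion files of the .au Domain Harvests is illustrative only.

```python
SECONDS_PER_DAY = 86_400


def files_per_second(files: int, days: float) -> float:
    """Average identification rate over a run of the given length."""
    return files / (days * SECONDS_PER_DAY)


def days_needed(files: int, rate: float) -> float:
    """Days required to process a number of files at a constant rate."""
    return files / rate / SECONDS_PER_DAY


# 18,019,172 files in (at least) 14 days: an upper bound on the rate.
rate = files_per_second(18_019_172, 14)
print(f"At most {rate:.1f} files per second")

# Linear extrapolation to the 2.3 billion files of the domain harvests.
print(f"At least {days_needed(2_300_000_000, rate):.0f} days for 2.3 billion files")
```

Under these assumptions a single run over the domain harvests would take years rather than weeks.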



Tests - General Comments

• No proven methods exist
  • Real-world testing is needed
  • Most documented approaches are ad hoc - no
    commodity solutions
• Tools are few and inadequate




Tests - General Comments


• Preservation policies, especially about
  preservation intent, are needed
• Significant resources are needed to practically
  tackle the problem




Andrew Stawowczyk Long
Strategist
Digital Preservation Standards
NLA
anlong@nla.gov.au

David Pearson
Director (Acting)
Web Archiving and Digital Preservation Branch
NLA
dapearso@nla.gov.au




                 Project Report is due end of October 2009


Just digitise it - Daniel Wilksch of the Public Records Office VictoriaJust digitise it - Daniel Wilksch of the Public Records Office Victoria
Just digitise it - Daniel Wilksch of the Public Records Office Victoria
 

I say emulate

  • 1. I Say Emulate; He Says Migrate
      Are emulation or migration feasible preservation strategies?
      National Library of Australia
      Prepared by: Andrew Stawowczyk Long
      Presented by: David Pearson
  • 2. Archiving the Web
      • Many institutions actively harvest the web
      • Collecting scale varies
      • Preservation practices are not well understood or implemented
      • Collecting intent may differ depending on the institution
  • 3. Web Archives
      • Type
          • Text oriented
          • Multimedia (video/audio) oriented
          • Picture oriented
          • Databases
          • Combination of all types
      • Storage
          • Uncompressed
          • Compressed (WARC)
          • Combination
  • 4. Web Objects and Elements
      • Challenge: Web archives may contain any type of digital object
      • Common objects
          • HTML/XML and related (htm, html, xml, css, etc.)
          • Images (raster images – JPEG, GIF, PNG)
          • Media
              • Audio files (au, wav, aiff, midi, mp3)
              • Video files (mov, mpg, wmv, rm)
      • Other objects
          • File archives (usually compressed – zip, tar, gz, arc, sit)
          • Images (raster images – bmp, tiff)
          • Images (vector images – SVG)
          • Text files (txt, csv, rtf)
          • Document files
              • PDF
              • Microsoft Word, Excel, PowerPoint
  • 5. Comparative statistics of NLA web collections

      Collection             Files          Size
      PANDORA (selective)    73 million     3.26 TB
      .au Domain Harvests    2.3 billion    78.75 TB

      Domain Harvest    2005           2006           2007           2008
      Unique files      185 million    596 million    516 million    1 billion
      Hosts crawled     811,523        1,046,038      1,247,614      3,038,658
      Size              6.69 TB        19.04 TB       18.47 TB       34.55 TB
  • 6. What are we preserving? Preservation Intent
      • Preservation of:
          • Physical media?
          • Bit-stream (logical form of data)?
          • Action (rendering data into something useful to the user)?
          • User experience?
      • Important considerations
          • Creator's perceived intent
          • Institution's preservation intent
      Based on Heslop and Davis (2002)
  • 7. What are we preserving? Properties
      • Object properties (the properties regarded as important vary with the intention of the collecting institution)
          • Derived from file format
          • High-level – e.g. layout, formatting or web
          • Measured – identified directly by computer
          • Intended – set by the collecting body
  • 8. Possible Preservation Actions 1
      • Emulation
          The original environment is recreated on contemporary hardware using specialised software (an emulator) together with the original software.
      • Renderers
          Specialised software that runs in the contemporary environment and is used to access (render) the original files. The approach is similar to emulation.
  • 9. Possible Preservation Actions 2
      • Migration
          Original file formats are migrated (converted) to another format that is supported by current hardware and software, e.g. MS Word 3.0 to MS Word 2008.
  • 10. Possible Preservation Actions 3 (not long-term sustainable)
      • Technological museum
          Collect and maintain the original hardware and software
      • Take no action
          Do nothing
  • 11. Digital Preservation Preliminaries
      • Collection objects need to be correctly recognised and identified
      • Preservation intent(s) need to be defined
      • High-level preservation actions need to be defined (e.g. shall we use emulation or migration?)
      • Practical-level preservation actions need to be defined
      Object Format + Preservation Intent = Appropriate Action
      Dilemma: how to properly migrate data if preservation intent(s) are unknown or not defined
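
The relation "Object Format + Preservation Intent = Appropriate Action" can be read as a lookup keyed on both inputs. The sketch below is only an illustration of that reading: the `POLICY` table, its entries, and the function name `choose_action` are hypothetical and are not taken from the NLA project.

```python
from typing import Optional

# Hypothetical policy table: (object format, preservation intent) -> action.
# The entries are illustrative assumptions, not NLA policy.
POLICY = {
    ("image/jpeg", "content"): "migrate",
    ("image/jpeg", "user experience"): "emulate",
    ("text/html", "content"): "migrate",
    ("text/html", "user experience"): "emulate",
}


def choose_action(object_format: str, intent: Optional[str]) -> str:
    """Return a high-level preservation action, or report why none can be chosen."""
    if intent is None:
        # Mirrors the dilemma on the slide: with no defined intent
        # there is no principled basis for choosing an action.
        return "undecided: preservation intent not defined"
    return POLICY.get(
        (object_format, intent),
        "undecided: no policy for this combination",
    )


print(choose_action("image/jpeg", "content"))
print(choose_action("text/html", None))
```

The point of the sketch is the failure case: the format alone is never enough to select an action.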
  • 12. Tools Required for Emulation
      • Emulators
          • Fast, stable, flexible, extendable
      • Licensed operating systems
      • Various drivers
      • Web browsers
      • Browser plug-ins
      • Other programs as required (e.g. Java, Adobe Acrobat Reader)
  • 13. Tools Required in Migration
      • Format identifiers
      • Format converters
      • Link updaters
      • QA automatons
      CAMiLEON project – Migration on Request tool
      XENA
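
As a rough illustration of what a format identifier does, the sketch below matches the leading bytes of a file against a short table of well-known signatures. It is a simplified assumption-laden example: the `SIGNATURES` table and the `identify` function are hypothetical, and production tools rely on far larger signature registries with offsets and other rules.

```python
# Illustrative signature table: leading bytes -> format label.
# This is a small hand-picked subset chosen for the example.
SIGNATURES = [
    (b"\xff\xd8\xff", "JPEG"),
    (b"\x89PNG\r\n\x1a\n", "PNG"),
    (b"GIF87a", "GIF"),
    (b"GIF89a", "GIF"),
    (b"%PDF-", "PDF"),
    (b"PK\x03\x04", "ZIP"),
]


def identify(data: bytes) -> str:
    """Return a format label based on the first bytes, or 'unknown'."""
    for magic, label in SIGNATURES:
        if data.startswith(magic):
            return label
    return "unknown"


print(identify(b"%PDF-1.4 example"))
print(identify(b"plain text with no signature"))
```

A leading-bytes check says nothing about what a container holds (a ZIP signature, for instance, does not reveal the files inside), so this kind of identification is only a first step.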
  • 14. Project Tests: General Testing Environment
      • Large slice of the uncompressed PANDORA archive (random selection)
      • The whole Domain Harvest archive (WARC files) has not been included in the tests
      • Multiple hardware combinations
      • Multiple OS combinations
      • Multiple web browsers
  • 15. Project Tests: Material Sample
      • Testing the industrial-scale tools
          • PANDORA slice
              • 861 GB
              • 18,019,172 files
              • 2,379,326 folders
      • Testing object properties
          • Smaller sub-sample of the PANDORA slice
          • 20 objects of each selected type: audio, HTML, images, PDF, video, zip, MS documents
  • 16. Project Tests: Methodology
      • Large sample testing (861 GB, 18,019,172 files)
          • Attempt to identify objects in the sample using DROID
          • Attempt to migrate JPEG images to PNG and update links
      • Small sample testing
          • Select a smaller sub-sample, with objects mostly created before the year 2000
          • Identify objects in the sample
          • View and experience selected objects in contemporary environments using various platforms, OS and browsers
          • View and experience selected objects in old environments using emulations on various platforms, with different OS and browsers
          • Migrate selected objects and review them in various environments
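
Migrating JPEG images to PNG leaves every page that referenced the old file names pointing at nothing, which is why the link-update step matters. The sketch below shows one minimal way such a step could look, assuming links appear as quoted `src` or `href` attributes; the function name `update_links` is hypothetical and this is not the tool used in the project.

```python
import re

# Matches src="..." or href="..." values that end in .jpg or .jpeg
# (case-insensitive). Unquoted attributes, URLs with query strings and
# references inside CSS or scripts are deliberately not handled here.
_LINK = re.compile(
    r'''(?P<attr>\b(?:src|href)\s*=\s*)(?P<q>["'])(?P<path>[^"']*?)\.jpe?g(?P=q)''',
    re.IGNORECASE,
)


def update_links(html: str) -> str:
    """Rewrite quoted .jpg/.jpeg references so they point at .png files."""
    return _LINK.sub(
        lambda m: f"{m['attr']}{m['q']}{m['path']}.png{m['q']}",
        html,
    )


page = '<a href="index.html"><img src="photos/cat.JPG"></a>'
print(update_links(page))
```

A regular expression is used only to keep the example short. On real archived pages a proper HTML parser, plus a check that the migrated file actually exists, would be the safer choice.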
  • 17. Project Tests: Tools Tested
      • Common
          • DROID
          • JHOVE
          • TrID
          • File Identifier
          • Lister (developed in-house)
      • Migration
          • ImageMagick
          • MediaCoder
          • Swf>>avi
          • OpenOffice Tools
          • XENA
      • Emulation
          • QEMU
          • Bochs
          • MS Virtual PC (not exactly an emulator)
          • Dioscuri
      • OS
          – MS Win XP Pro
          – MS Win 3.1
          – MS Win 98SE
          – Ubuntu 9.04
      • Web browsers
          – MS IE 7
          – Firefox 3
          – Arachne 1.2
          – Mosaic 2
          – Netscape 4
  • 18. Project Tests: Control – Current Environment
      • Properties observed in selected files
      Object basic characteristics (based on the Emulation Project by KB)
          1. Content: the text, images, etc. from the object
          2. Structure: the cohesion between different parts of the object
          3. Context: the meaning of the object
          4. Appearance: the way an object is presented to the user
          5. Behaviour: the interaction of the object with the user or system
      E.g. for HTML pages:
          • Rendering of text, images, media files
          • Font, layout, colours, contrast, brightness, animation smoothness, sound quality, etc.
          • Object dependencies
          • Mouse and keyboard behaviour
          • Data extraction
  • 19. Project Tests: Emulated Environments
      • Hardware
          • Dell Optiplex GX620, P4, 4.4GHz x 3.39GHz, 3.5 GB RAM
          • Power Mac G4
      Emulators:
      • Bochs
          • Host: WinXP Pro v2002 SP3; Ubuntu 9.04
          • Client: Win 3.1, MS DOS 6.2; WinXP Pro SP2
      • Dioscuri 0.4.0
          • Host: WinXP Pro v2002 SP3
          • Client: Win 3.1, MS DOS 6.2
  • 20. Project Tests: Emulated Environments (continued)
      • Qemu
          • Host: MS WinXP Pro v2002 SP3
              • Clients: MS Win98SE, MS Win 3.1, MS DOS 6.2, Ubuntu 9.04
          • Host: Ubuntu 9.04
              • Clients: MS WinXP Pro SP2 (P4, 12.92GHz, 256 MB RAM), MS Win98SE, MS Win 3.1
      • Microsoft Virtual PC
          • Host: MS WinXP Pro v2002 SP3
              • Clients: MS Win 3.1, MS Win98SE
  • 21. Tests – Summary: Emulation
      • Setting up emulators was relatively simple
      • Additional software (especially for working with disk images) proved to be extremely useful
      • Licensing was at times a big obstacle (e.g. it is impossible to emulate a Macintosh environment legally)
      • Many dependencies exist, and making programs work correctly is a complex task
          • e.g. Windows XP requires internet or over-the-phone activation after 30 days
  • 22. Tests – Summary: Emulation (continued)
      • All
          • Some of the DLL libraries in Win 3.1 were incompatible with the 16-bit Netscape and Mosaic programs
      • Bochs 2.3.7 for Windows
          • Extremely slow in GUI environments
          • No full-screen mode; limited end-user experience
      • Dioscuri
          • Sluggish at times
          • Did not accept some of the disk images created in WinImage
      • Qemu 0.9.0 for Windows and Linux
          • Much faster but still sluggish at times
          • Win98SE could not run in hi-res, hi-colour mode
      • Microsoft Virtual PC
          • Relatively fast (it is virtualisation software on a PC) but still sluggish at times
  • 23. Tests – Summary: Migration Environment
      • Dell Optiplex GX620
      • MS Windows XP Pro v2002 SP3
      • Networked drive with the PANDORA sample
  • 24. Tests – Summary: Migration
      • Available tools are imperfect and slow
          • e.g. DROID took more than two weeks to examine slightly over 18 million files, and many of them were not recognised
      • It is very difficult to examine the contents of container formats (e.g. avi or rm)
      • Network connections need to be as fast as possible
      • It is difficult to make an informed decision about migration without a clearly defined preservation intent
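
The DROID figure can be turned into a rough throughput estimate. The sketch below treats "more than two weeks" as exactly 14 days, so the computed rate is an upper bound on what was observed. The extrapolation to the 2.3 billion files of the .au Domain Harvests is an illustration only: it assumes the same rate on a single machine and ignores the extra work of unpacking WARC files.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def files_per_second(files: int, days: float) -> float:
    """Average identification rate over a run of the given length."""
    return files / (days * SECONDS_PER_DAY)


def days_needed(files: int, rate: float) -> float:
    """Days required to process `files` at `rate` files per second."""
    return files / rate / SECONDS_PER_DAY


sample_files = 18_019_172        # PANDORA slice, from the slides
domain_files = 2_300_000_000     # .au Domain Harvests, from the slides

rate = files_per_second(sample_files, 14)   # 14 days is a lower bound on the run
print(f"{rate:.1f} files per second")
print(f"{days_needed(domain_files, rate):.0f} days for the domain harvests")
```

Under these assumptions the rate works out to roughly 15 files per second, and the domain harvests would take on the order of five years, which puts the "imperfect and slow" remark in concrete terms.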
  • 25. Tests – General Comments
      • No proven methods exist; real-world testing is needed
      • Most documented approaches are ad hoc, with no commodity solutions
      • Tools are few and inadequate
  • 26. Tests – General Comments (continued)
      • Preservation policies, especially on preservation intent, are needed
      • Significant resources are needed to tackle the problem in practice
  • 27. Contacts
      Andrew Stawowczyk Long
      Strategist, Digital Preservation Standards, NLA
      anlong@nla.gov.au

      David Pearson
      Director (Acting), Web Archiving and Digital Preservation Branch, NLA
      dapearso@nla.gov.au

      Project report is due at the end of October 2009