Cochrane von Suchodoletz File Creation, Rendering and Formats

File Creation, Rendering and Formats
Euan Cochrane and Dirk von Suchodoletz

Published in: Technology

  1. File Creation, Rendering and Formats
     Euan Cochrane, Archives New Zealand & Dirk von Suchodoletz, University of Freiburg
     Future Perfect 2012, 26 March 2012, Wellington, New Zealand
  2. Contents
     Euan
     • Files, formats and their relationships to creating applications
     • Files, formats and their relationships to rendering applications
     Dirk
     • Maintaining the ability to use older rendering applications
     Euan
     • Context and conclusions
  3. Digital Preservation
     • What is digital preservation?
       – Maintaining the full information content of digital objects [across time]
       – Maintaining the ability to render digital objects [across time]
       – “The goal of digital preservation is the accurate rendering of authenticated content over time”
     • What is a file format?
       – “[pre-defined/particular] way that information is encoded for storage in a computer file”
  4. File Creation and Formats
     • In 2007, over 90% of HTML documents did not conform to standards
     • Microsoft Office 2007 (and possibly 2010) create ODS files differently to most open source office suites
     • Microsoft Office 2007 and 2010 create Microsoft Office 97-2003 formatted files differently to Microsoft Office 97-2003
  5. Format Standards are Often Ambiguous or not Available
     • The JPEG standard specifies an end of image marker but not an end of file marker
       – Different apps write them differently
     • LibreOffice 3.5 (14 February 2012) now “supports” Visio file import. This support is based on reverse engineering, as the format standard is not publicly available. It is not complete
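
The JPEG point on this slide can be illustrated with a small check. This is a minimal sketch, not part of the original talk: it assumes the file starts with the JPEG start-of-image marker (bytes FF D8) and treats the last occurrence of the end-of-image marker (bytes FF D9) as the end of the image. The function name `trailing_bytes_after_eoi` is hypothetical. Anything an application wrote after that marker is counted as trailing data, which is where files of the same image from different applications can differ.

```python
SOI = b"\xff\xd8"  # JPEG start-of-image marker
EOI = b"\xff\xd9"  # JPEG end-of-image marker


def trailing_bytes_after_eoi(data: bytes) -> int:
    """Return how many bytes follow the last end-of-image marker.

    Assumption: the last FF D9 pair in the file is the real EOI marker.
    If the trailing data itself happens to contain FF D9, the count
    reported here will be too low.
    """
    if not data.startswith(SOI):
        raise ValueError("data does not start with a JPEG SOI marker")
    position = data.rfind(EOI)
    if position == -1:
        raise ValueError("no JPEG EOI marker found")
    return len(data) - (position + len(EOI))
```

A result of 0 means the file ends exactly at the end-of-image marker; any other value means the creating application appended something that the marker alone does not account for.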
  6. “Rendering Matters” Research
     • Compared the rendering of ~100 files on old software running on old hardware (the “control”) to:
       1. LibreOffice version 3.3.0
       2. Microsoft Office 2007
       3. WordPerfect Office X5
       4. Control software running on emulated hardware
  7. Summary Research Results
     • [The choice of] Rendering [Environment] Matters
     • MS-Office 2007 was a better rendering tool for the old files than either LibreOffice or WordPerfect Office
     • The use of particular attributes/features in office files is inconsistent but most are used at least once
     • At least one “odd”/rare attribute/feature is included in most office files
  8. Original Environments (OE)
     • Original creating application is the best candidate to render documents properly
     • Proprietary format knowledge is embedded in the application
     • One environment renders all objects of a certain type
     • Keeping original software (and hardware) environments has an impact on preservation and access workflows
  9. Components of Access through OE
     • Emulators for different computer architectures
     • Software archive of all required applications, operating systems, and additional components like fonts and codecs
     • Workflows on object ingest
     • Access systems for end users
  10. Emulators
     • Wide range available for all relevant computer architectures
     • Many are open source
     • Not yet DP (digital preservation) aware: long-term availability to be secured
     • DP community should seek more influence
  11. Software Archive
     • Preserve the relevant software components and operational knowledge
  12. Necessary Workflows
     • The Freiburg digital preservation group leads the state-sponsored, two-year bwFLA project
     • The bwFLA project provides access to complex, interactive digital objects
     • Provide extended ingest workflows with a feedback loop
  13. Extended Ingest Workflow
     • Make use of the donor's expertise to collect complete information and components
       – Extend the software archive if necessary
       – Add necessary technical metadata
       – Record knowledge on object handling
     • Let the donor check and sign off the rendering results
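
The ingest steps on this slide could be tracked with a simple record. The following is a minimal sketch under stated assumptions, not the bwFLA implementation: the class `IngestRecord`, its field names, and the helper functions are all hypothetical and chosen only to mirror the bullet points (software archive check, technical metadata, handling notes, donor sign-off).

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class IngestRecord:
    """Hypothetical record of one object moving through extended ingest."""
    object_id: str
    required_software: list[str]
    technical_metadata: dict[str, str] = field(default_factory=dict)
    handling_notes: list[str] = field(default_factory=list)
    donor_signed_off: bool = False


def missing_software(record: IngestRecord, archive: set[str]) -> list[str]:
    """List required components that the software archive does not hold yet."""
    return [name for name in record.required_software if name not in archive]


def ingest_complete(record: IngestRecord, archive: set[str]) -> bool:
    """Ingest is done once the archive is complete and the donor has signed off."""
    return not missing_software(record, archive) and record.donor_signed_off
```

The feedback loop shows up as repeated calls: while `missing_software` returns anything, the archive is extended; once the donor has checked the rendering, `donor_signed_off` is set and `ingest_complete` becomes true.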
  14. Access Workflows
     • Provide a reading room system or extension
       – Pre-configure the emulator to the OE required by the object
       – Prepare the inclusion of the object into the original environment
       – Automate the startup of the OSE
       – Provide the user with information and hints on how to interact with the OE, and automate parts of this
       – (Dis)allow, to a certain degree, saving results from the original environments or capturing certain states (e.g. using screenshots)
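
The access steps above can be sketched as a lookup from an object's format to a pre-configured environment. This is an illustrative sketch only: the registry `ENVIRONMENTS`, the class `OriginalEnvironment`, the function `plan_session`, and every emulator, image, and file name in it are hypothetical placeholders, not real products or the project's actual design.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class OriginalEnvironment:
    """Hypothetical description of one pre-configured original environment."""
    emulator: str
    disk_image: str
    application: str
    user_hints: str


# Placeholder registry: format key -> environment able to render it.
ENVIRONMENTS = {
    "example-wordprocessor-v5": OriginalEnvironment(
        emulator="example-x86-emulator",
        disk_image="images/example-os-with-wordprocessor.img",
        application="EXAMPLE.EXE",
        user_hints="Use the arrow keys to scroll through the document.",
    ),
}


def plan_session(format_key: str, object_path: str, allow_save: bool = False) -> dict:
    """Build a session plan covering the access steps for one object."""
    env = ENVIRONMENTS.get(format_key)
    if env is None:
        raise LookupError(f"no original environment registered for {format_key!r}")
    return {
        "emulator": env.emulator,        # pre-configured emulator
        "disk_image": env.disk_image,    # original environment to boot
        "inject_object": object_path,    # object made available inside the OE
        "autostart": env.application,    # automated startup
        "hints": env.user_hints,         # guidance shown to the user
        "allow_save": allow_save,        # whether results may be saved
    }
```

Keeping the registry as plain data means a reading room system can add environments without changing the planning logic; a missing entry is surfaced as an error instead of silently falling back to a different rendering environment.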
  15. Access System
     • Many components already exist, developed by past DP projects
     • Next step: Make them a usable “product”
  16. Reading Room Access System
     • Make emulation accessible to standard users, such as those in memory institutions
     • Robust platform, extension to standard reading room systems
     • Unified access to a wide range of different emulators + preconfigured environments
  17. Context and Conclusions
     • Making decisions about preservation strategies
     • When to Normalise?
     • Variation in format implementation doesn’t matter if you maintain a compatible rendering environment
     • Variation in rendering across environments doesn’t matter if you maintain the “right” rendering environment
     • There are practical options for maintaining rendering environments
  18. Thank you
