Digitisation Infrastructure - June 2007


The presentation looks at some of the key capabilities required, whether at a campus-wide, regional or national level, to make sure that digitisation happens effectively and as rapidly as possible, and offers value for money in the medium and long term.

Published in: Education, Technology

  1. Publishing Cultural Heritage
     Alastair Dunning, Digitisation Programme Manager, JISC (Joint Information Systems Committee)
     a.dunning@jisc.ac.uk, 0203 006 6065
     UCL Presentation, 19th June
  2. JISC Digitisation Programme
       • Manager for 8 projects, part of a 16-project programme to digitise UK cultural heritage. For example:
           • British Newspapers 1620-1900
           • Pre-Raphaelite Art
           • Images from Scott Polar Research Institute
           • Nineteenth-Century Pamphlets
           • 20th-century Government Cabinet Papers
           • http://www.jisc.ac.uk/digitisation
       • Started April 2007, finishing March 2009
  3. Digitisation is easy
     http://homepage.mac.com/xcia0069/lizzie-innes/index.htm
  4. Growth of Digitisation
       • Possibilities of the Internet inspired rapid data capture of precious objects all over the world
       • But maybe this started out as a reactive cottage industry?
           • Museums, Libraries and Archives rushing to digitise material and dump it on the web
       • How long does this material last on the Internet? Is it good quality? Can people locate it? Can they use it?
       • The quantity of material and the issue of long-term digitisation affect published material. Added pressure is supplied by the Google digitisation programme
       • … Digitisation is difficult
  5. Need for an infrastructure
       • To address the issues raised in the previous slide
           • How long does this material last on the Internet? Is it good quality? Can users locate it? Can they use it?
       • Illustrations from the British model; other countries' models may be different
       • Demonstration that mass digitisation is complex, involving multiple players and technologies
       • Good infrastructure allows publication of cultural heritage to happen quickly; to show value for money; to be usable; to be easily accessible by educational communities and the general public
  6. Data capture
       • To convert the physical to digital
           • Flat scanners, robotic scanners, 3D scanners, direct capture via digital camera, remote controlled camera, conversion via medium (e.g. microfilm), reel-to-digital, millions of typists
       • To cope with all kinds of material (newspapers, stained glass, banners, posters, maps, census, reports, grey literature, artefacts, film, audio …)
       • Need to have a keen idea of priorities for digitisation
       • Ensure competition but not redundancy (keep machines working; keep staff in place)
       • Requires research on the success of methodologies, and dialogue with other subject areas (e.g. the sciences)
  7. If you don’t have a range of options for data capture – cultural heritage won’t get digitised
     University of Southampton Robotic Scanner – Details at http://www.soton.ac.uk/mediacentre/news/2004/nov/04_181.shtml
  8. Standards and Formats
       • Which file formats ensure high-quality, long-term use?
           • Images – TIFF, but also JPEG2000, PNG
           • Text – XML (and flavours thereof), but also RTF, Word
           • Sound – WAV, AIFF, MP3, Ogg (formats and wrappers)
           • Film – MJPEG, MPEG4, AVI, Quicktime, Flash (ditto)
       • Normally developed internationally, but local variations occur
       • Co-ordination, certification, co-operation, involvement and decisiveness at national and international levels
       • As with all parts of infrastructure, research and innovation
       • If you don’t have this – see current mess over video!
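As a hedged illustration of working with the image formats named on this slide, the following minimal sketch guesses whether a file's leading bytes look like TIFF, PNG or JPEG2000. The helper name `sniff_format` and the `SIGNATURES` table are hypothetical, introduced only for this example; the byte signatures are the published ones for each format, and a real ingest workflow would use a dedicated format-identification tool instead.

```python
# Hypothetical helper (not from the presentation): guess an image
# master's format from the first bytes of the file.
SIGNATURES = {
    b"II*\x00": "TIFF",                              # little-endian TIFF
    b"MM\x00*": "TIFF",                              # big-endian TIFF
    b"\x89PNG\r\n\x1a\n": "PNG",                     # PNG signature
    b"\x00\x00\x00\x0cjP  \r\n\x87\n": "JPEG2000",   # JP2 signature box
}


def sniff_format(header: bytes) -> str:
    """Return a format name if the header starts with a known signature."""
    for magic, name in SIGNATURES.items():
        if header.startswith(magic):
            return name
    return "unknown"
```

In use, one would read the first dozen or so bytes of a file in binary mode and pass them to `sniff_format`; anything reported as "unknown" would then be checked by hand.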
  9. Metadata
       • Requires the sophisticated knowledge of experts who know the digital objects (e.g. newspapers, sound recordings, census reports)
       • As before, international co-ordination, certification and co-operation to develop international schema and vocabularies
       • These are required at subject level, format level, technical level and preservation level. For example:
           • Dublin Core, MODS – generic resource description
           • VRA4 – digital image description, including technical details
           • METS – wraps together different information on a digital object
           • PREMIS – preservation metadata over the long term
       • If you don’t have this – trust and authenticity, interoperability, resource discovery are severely hindered
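To make the idea of generic resource description more concrete, here is a minimal sketch that builds a simple Dublin Core record with Python's standard library. The function name `dublin_core_record`, the `record` wrapper element and all field values are placeholders assumed for illustration; none of them come from the presentation or from any JISC project.

```python
import xml.etree.ElementTree as ET

# Namespace of the Dublin Core Metadata Element Set, version 1.1.
DC_NS = "http://purl.org/dc/elements/1.1/"
ET.register_namespace("dc", DC_NS)


def dublin_core_record(fields: dict) -> ET.Element:
    """Wrap simple name/value pairs as dc: elements in one record.

    The 'record' wrapper element is an assumption made for this sketch.
    """
    record = ET.Element("record")
    for name, value in fields.items():
        child = ET.SubElement(record, f"{{{DC_NS}}}{name}")
        child.text = value
    return record


# Placeholder values, invented purely for illustration.
record = dublin_core_record({
    "title": "Example newspaper issue",
    "type": "Text",
    "format": "image/tiff",
    "date": "1850-01-01",
})
xml_text = ET.tostring(record, encoding="unicode")
```

Richer schemas such as MODS or METS would carry far more structure; the point here is only that even a handful of agreed element names gives a delivery or preservation service something consistent to index.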
  10. Data Delivery
       • I.e. the people that build websites
       • Complex engagement between commercial (Google, ProQuest, Thomson Gale, JSTOR) and non-commercial suppliers (universities, museums etc.)
       • Huge range of potential business models
           • Institutional subscription, Personal subscription
           • Pay-per-view, Google Ads
           • Open Access
           • Mixed model
       • But no definitive answers about which are the more successful
  11. Data Delivery – What is required
       • Ability to regularly serve up websites and data
       • Systems to deliver a range of digital content (e.g. newspapers, audio, posters, artefacts)
       • Low overheads and year-on-year costs
       • Good understanding of end-users
       • Working in partnership with other content providers
       • Commitment to innovation and good practice
       • If you don’t have this – the wheel will be constantly reinvented, users will be driven away, material will be siloed
  12. Preservation Facilities
       • Digital objects become obsolete with time. Experts are required to ensure this does not happen
           • Expertise in handling digital assets (content and all metadata) in the long term, and preferably also the hardware and media that hold such content
           • Must be trusted and reliable
           • Good relationship with data delivery providers
           • Continual research – why, what and how to preserve?
       • Without this, digital data will be lost, endangering the entire investment made in digitisation
  13. Preservation Facilities – Case Study
       • A good example from the late 1990s
       • Orphaned archaeological data rescued from obsolescence
       • CDs, floppy discs, PCs, databases, Word files, CAD files all left
       • But lack of metadata meant not all data could be retrieved
       • http://ahds.ac.uk/creating/case-studies/newham/
  14. Digitisation Infrastructure
       • Network capabilities
       • Authentication
       • Tools Development
       • Usability testing
       • Copyright clearing houses
       • Consultants
       • Trained expert staff
       • Suitable courses
       • Data capture
       • Standards, Formats
       • Metadata
       • Data Delivery
       • Preservation
       • And of course Money
       • Skill is in making sure these pieces fit together