Born Digital Archives



Presentation at the LIFE-SHARE Project's Digitisation Collaboration Colloquium, in Sheffield, March 2011. Uploaded with permission.


  1. 1. Managing born-digital archives & collaborative trans-Atlantic working
     Simon Wilson, Digital Archivist (AIMS Project), Hull History Centre
  2. 2. outline
     Look at AIMS Project - trying to create a model for managing born-digital archives
     How the project has coped with collaborative working across 3 time zones
     View looking down one aisle in the History Centre; there are 12km of shelves in total
  3. 3. AIMS Project
     An inter-Institutional Model for Stewardship
     To process born-digital collections
     To use Hydra, a Fedora repository-based solution
     To disseminate the results & lessons learnt
     Identify commonality across 4 partners - not to create a single path
  4. 4. manuscripts
     The message and the medium are inseparable
     Preserve the medium & message remains legible
     The items are usually unique and irreplaceable and held by us because they are historical
     Detail from Letters Patent exempting St Andrew Priory from Dissolution, 9 Sep 1536 (U DDCA2/29/119)
  5. 5. born-digital files
     The message and the medium are different
     Both are threatened by obsolescence
     The files are usually copies with the creator keeping the originals which may still be in use
     A 2GB pen drive is capable of holding more than 900 photographs
  6. 6. our starting point
     Fedora repository was already installed and being used at the University
     Archives had a few born-digital items - but these were not in Fedora
     Some partners already had TB of born-digital material in their repository
     Screenshot of University Repository
  7. 7. challenges faced
     Software – new versions every 12-18 months
     Each new version brings new headaches about backward compatibility
     Don’t want to become a museum with hundreds of old software titles
     Look to convert material to suitable open formats
     Screenshot of WordPerfect 5.1 (for DOS) released in 1989 from
  8. 8. challenges faced
     Hardware - series of steps, eg portable media
     1978 - 5¼ disk
     1987 - 3½ floppy disk
     1994 - Zip drive
     2000 - USB drive
     Don’t want to become a museum of hardware – do need to read some formats – eg floppy disks
     IBM 5150 PC, introduced in Aug 1981, purchase price $1565 excluding disk drives
  9. 9. challenges faced
     Professional – how do we preserve, convert, catalogue and describe this material?
     We now have over 30,000 born-digital files – from just 3 deposits
     Expect to have over 1m files within 5 years and a cataloguing backlog measured in TB
  10. 10. depositors
      "Every outline, script and novel draft has flown back and forth without ever existing as hard copy until (in the case of the scripts) printed and handed to the actors."
      The relationship is more critical than with paper archives – need to ask questions we haven’t asked before
      Stephen Gallagher
  11. 11. hybrid collections
      Paper and born digital material - catalogue based on content not format
      Paper archives offer a sense of discovery
      Notebooks - snippets of dialogue etc for different work all intermingled
      With born-digital material information is dispersed between multiple files
      Acc 2008 box 15, Chimera file 2 and an Amstrad disc
  12. 12. scale of the task
      Archives have had to adapt to changing situations and phenomena:
      • the Paperless office
      • the Y2K problem
      • Social Media is the biggest challenge yet
      Every day: 55m tweets, 4.3m photos added, 1bn bits content added, 250bn emails sent
  15. 15. what have we lost?
      Open Planets Foundation estimate there is 100GB data for each individual on the planet and that the rate of data creation doubles every 18 months
      Archive services:
      • Some are collecting
      • Some are managing
      • Some aren’t doing either
      How much information has already been lost?
  16. 16. discovery & access
      "Every 2 days.... we create as much information as we did from the dawn of civilization up until 2003"
      Eric Schmidt, Google CEO (Aug 2010)
      If... we manage to preserve the born-digital archives, how do we allow users to discover and access all of this material?
      Can’t “keep everything” and hope that Google will create an algorithm to enable meaningful access
  17. 17. collaborative trans-Atlantic working
  18. 18. collaborative working
      Four partners in four institutions in four very distinct locations
      Range of experiences (including none) ensures that project tools and guidelines are relevant and appropriate for novices and experts alike
      Needed to find ways to work together
  19. 19. collaborative tools
      Collab site is a secure space (at UVa), useful for reference documents but not collaborative working
      Google docs enables multiple editors to work on the same document – add comments, seek clarification etc from any location
      UVaCollab screenshot and Google docs logo
  20. 20. virtual team
      Skype - easy to use, shows importance of actual (rather than email) conversations, also builds sense of team
      Kept to 1 hr duration to keep focus
      Jira/Duraspace - digital archivists write tickets for the development work
      Introduction to this process was done face to face
      Skype screenshot and Duraspace logo (
  21. 21. conclusions
      The nature (and format) of archives is undergoing a fundamental change on our watch
      We need to act now to collect born-digital archives before it is too late
      There are useful tools and free software that support collaborative working
      Two banks of servers at Facebook by Darren Mckeeman
  22. 22. contact ...
      Simon Wilson
      Digital Archivist
      Hull History Centre
      Tel 01482 317506
      Portrait of Claude-Henri Watalet blogging, after Jean-Baptiste Greuze
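
The growth figures quoted in the slides can be put side by side with a little arithmetic. The sketch below is illustrative only and is not part of the presentation: the function names are hypothetical, and it assumes steady exponential growth, which the slides do not state. It takes the "doubles every 18 months" estimate from the "what have we lost?" slide and the forecast of 30,000 files growing to over 1m within 5 years from the "challenges faced" slide.

```python
import math


def growth_factor(months, doubling_months=18.0):
    """Multiplier after `months` if volume doubles every `doubling_months`.

    Assumes smooth exponential growth (an assumption, not a claim from the slides).
    """
    return 2.0 ** (months / doubling_months)


def implied_doubling_months(start, end, months):
    """Doubling time (in months) implied by growing from `start` to `end` in `months`."""
    return months * math.log(2) / math.log(end / start)


if __name__ == "__main__":
    # Five years at the quoted 18-month doubling rate.
    print(f"5-year growth at 18-month doubling: x{growth_factor(60):.1f}")

    # Doubling time implied by the forecast of 30,000 files reaching 1m in 5 years.
    months = implied_doubling_months(30_000, 1_000_000, 60)
    print(f"Implied doubling time for the file forecast: {months:.1f} months")
```

Under these assumptions, 18-month doubling gives roughly a tenfold increase over five years, while the file-count forecast implies a doubling time of about a year, so the archive expects its holdings to grow faster than the general estimate.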