Building Archivable Websites


Presentation for Stanford Drupal Camp on how and why to build archivable websites.

Published in: Internet, Technology, Design

  1. Building Archivable Websites. Nicholas Taylor, Web Archiving Service Manager, Digital Library Systems and Services. Drupal Camp, April 19, 2014
  2. Why Build ARCHIVABLE WEBSITES? “Frosted Spiders' Web” by Jess Wood under CC BY 2.0
  3. future users are users, too “a connection between past and future” by Gioia De Antoniis under CC BY-NC-ND 2.0
  4. maintain web usability “Broken Web Connections? Welcome to 2009...” by Paul:Ritchie under CC BY-NC-ND 2.0
  5. improve temporal web usability Internet Archive: “Wayback Machine”
  6. improve temporal web usability Internet Archive: “Wayback Machine”
  7. recover your lost website “Warrick”
  8. refer to earlier website versions “The Iraq War: Wikipedia Historiography” by STML under CC BY-SA 2.0
  9. institutional history Internet Archive Wayback Machine: “Stanford University Homepage”
  10. websites are cultural artifacts “The World Wide Web project”
  11. facilitate compliance
  12. optimize for other crawlers “SEO on a railway platform” by superboreen under CC BY-NC-ND 2.0
  13. How to IMPROVE ARCHIVABILITY “metal web” by paul:74 under CC BY-NC-SA 2.0
  14. follow web standards and accessibility guidelines “Web Standards Fortune Cookie” by Matt Herzberger under CC BY-SA 2.0
  15. use a site map, transparent links, and contiguous navigation “Card sorting” by Manchester Library under CC BY-SA 2.0
  16. maintain stable URLs and redirect when necessary “San Francisco-Oakland Bay Bridge 1442a” by Don Barrett under CC BY-NC-ND 2.0
  17. use semantically-meaningful URLs “”
  18. be careful w/ robot exclusion rules “drupal/robots.txt at 7.x”
  19. minimize reliance on external assets necessary for presentation Internet Archive Wayback Machine: “Stanford Department of English”
  20. minimize reliance on external assets necessary for presentation “Stanford Department of English”
  21. serve reusable assets from a single, common location Google Images: “stanford university seal”
  22. specify HTTP response headers for caching and content encoding “time capsule on Alcatraz” by inajeep under CC BY 2.0
  23. embed metadata, especially character encoding “Keep the Packaging!” by davidd under CC BY 2.0
  24. use durable data formats “Lascaux cave painting” by Christine McIntosh under CC BY-ND 2.0
  25. prefer responsive design over user-agent personalization “«Responsive web design» - 217/366” by Roger Ferrer Ibáñez under CC BY-NC-SA 2.0
  26. examine your site in the Internet Archive Wayback Machine Internet Archive Wayback Machine: “Welcome to A Multidimensional Perception ~/*= & PCGuru”
  27. Web Archiving TOOLS AND SERVICES “giant mechanical spider & crowd” by mjtmail (tiggy) under CC BY 2.0
  28. Heritrix Wikimedia Commons: “File:Heritrix-screenshot.png”
  29. Wget Wikimedia Commons: “File:Wget_1.13.4.png”
  30. HTTrack “HTTrack Website Copier”
  31. Wayback “Internet Archive Wayback Machine”
  32. Web Archiving Integration Layer “Web Archiving Integration Layer”
  33. Memento “Memento”
  34. assess archivability w/ Archive Ready “Archive Ready”
  35. thank you! “stanford dish at sunset” by Dan under CC BY-NC-SA 2.0 Nicholas Taylor
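
The slide on robot exclusion rules (“be careful w/ robot exclusion rules”) can be illustrated with a small check of which paths a robots.txt file would hide from an archival crawler. This is only a sketch under stated assumptions: the robots.txt content below is a made-up sample loosely modeled on a CMS default, the path list is invented, and the user-agent token "ia_archiver" is assumed here as a stand-in for an archival crawler. It uses Python's standard urllib.robotparser module and makes no network requests.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content, for illustration only.
SAMPLE_ROBOTS_TXT = """\
User-agent: *
Disallow: /admin/
Disallow: /misc/
Disallow: /themes/
"""


def blocked_paths(robots_txt, user_agent, paths):
    """Return the paths that the given rules forbid user_agent to fetch."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return [p for p in paths if not parser.can_fetch(user_agent, p)]


# Invented paths: two pages plus the scripts and stylesheets they depend on.
PATHS = ["/", "/about", "/misc/drupal.js", "/themes/site/style.css", "/admin/"]

# "ia_archiver" is an assumed crawler token, not taken from the slides.
blocked = blocked_paths(SAMPLE_ROBOTS_TXT, "ia_archiver", PATHS)
print(blocked)
```

In this sample the pages themselves stay crawlable, but the script and stylesheet paths are excluded, so an archived copy could be captured without the assets it needs for presentation. Running the same check against a site's real robots.txt and a list of its asset paths is one way to spot that situation.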