NYARC Web Archiving Program: Archivists Round Table of Metropolitan New York

This presentation was given to the Archivists Round Table of Metropolitan New York on June 15, 2015. It introduced the New York Art Resources Consortium (NYARC), a collaboration among the research libraries of the Brooklyn Museum, the Frick Collection, and the Museum of Modern Art, and its web archiving program for specialist art historical resources. Speakers shared lessons learned from web archiving that inform the acquisition, quality assurance, management, and long-term preservation of especially dynamic and ephemeral born-digital resources.

NYARC Web Archiving Program: Archivists Round Table of Metropolitan New York

  1. 1. QUALITY ASSURANCE ADMINISTRATION NEED + VISION PRESERVATION QUESTIONS New York Art Resources Consortium Web Archiving Program Deborah Kempe Principal Investigator, Two-Year Mellon Grant “Making the Black Hole Gray” Sumitra Duncan NYARC Web Archiving Coordinator Celeste Brewer Seth Persons Molly Seegers NYARC Web Archiving Interns Karl-Rainer Blumenthal National Digital Stewardship Resident The Archivists Round Table of Metropolitan New York, Inc., June 15, 2015
  3. 3. NEED + VISION Why? The Digital Black Hole What? Deliverables How? Radical Collaboration Live Wayback Art ephemera Art Analog Art Ephemera
  4. 4. NEED + VISION Why? The Digital Black Hole What? Deliverables How? Radical Collaboration Live Wayback
  5. 5. NEED + VISION Why? The Digital Black Hole What? Deliverables How? Radical Collaboration Live Wayback “The Web dwells in a never-ending present. It is — elementally — ethereal, ephemeral, unstable, and unreliable.” -Jill Lepore, The Cobweb: Can the Internet be archived? The New Yorker
  6. 6. NEED + VISION Why? The Digital Black Hole What? Deliverables How? Radical Collaboration Live Wayback
  7. 7. NEED + VISION Why? The Digital Black Hole What? Deliverables How? Radical Collaboration Live Wayback “The Web dwells in a never-ending present. It is — elementally — ethereal, ephemeral, unstable, and unreliable.” -Jill Lepore, The Cobweb: Can the Internet be archived? The New Yorker
  8. 8. NEED + VISION Why? The Digital Black Hole What? Deliverables How? Radical Collaboration Live Wayback
  9. 9. NEXT STEPS Why? The Digital Black Hole What? Deliverables How? Radical Collaboration Live Wayback GOALS FOR NYARC ● Rich and substantial digital resources in art history ● Seamless integration with other research materials ● APIs and judicious metadata ● Permanence and long-term preservation ● Scalable ● Extensible ● Sustainable AND FOR THE GREATER GOOD ● Networked collections ● Federal initiatives ● New approaches made possible through innovative thinking ● Tools for new research
  11. 11. ADMINISTRATION Live Wayback Staffing & Partnerships Workflow Elements Harvesting Access & Discovery
  12. 12. ADMINISTRATION Live Wayback Staffing & Partnerships Workflow Elements Harvesting Access & Discovery
  13. 13. ADMINISTRATION Live Wayback Staffing & Partnerships Workflow Elements Harvesting Access & Discovery
  14. 14. ADMINISTRATION Live Wayback Staffing & Partnerships Workflow Elements Harvesting Access & Discovery
  15. 15. ADMINISTRATION Live Wayback Staffing & Partnerships Workflow Elements Harvesting Access & Discovery
  16. 16. ADMINISTRATION Live Wayback Staffing & Partnerships Workflow Elements Harvesting Access & Discovery
  17. 17. ADMINISTRATION Live Wayback Staffing & Partnerships Workflow Elements Harvesting Access & Discovery
  18. 18. ADMINISTRATION Live Wayback Staffing & Partnerships Workflow Elements Harvesting Access & Discovery www.nyarc.org/webarchive
  19. 19. ADMINISTRATION Live Wayback Staffing & Partnerships Workflow Elements Harvesting Access & Discovery Discovery via Primo
  21. 21. QUALITY ASSURANCE Live Wayback Context Process Challenges Successes Test Crawl Adjust Scope Full Crawl of Site QA Process
  22. 22. QUALITY ASSURANCE Live Wayback Context Process Challenges Successes Patch Crawls ● capture embedded content that was missed in initial crawl ● patch URLs into overall crawl structures seamlessly
  23. 23. QUALITY ASSURANCE Live Wayback Context Process Challenges Successes Running patch crawls is time intensive and iterative. Each and every web page must be checked for missing URLs.
  24. 24. QUALITY ASSURANCE Live Wayback Context Process Challenges Successes Archive-It is not always able to tell when a URL is missing from a web page.
  25. 25. QUALITY ASSURANCE Live Wayback Context Process Challenges Successes Functionality of the web page must be checked in order to determine the extent of missing URLs.
  26. 26. QUALITY ASSURANCE Live Wayback Context Process Challenges Successes Testing the web page reveals unknown missing URLs and activates them for patch crawls.
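
As a rough illustration of the checking step described in slides 23 to 26, the sketch below lists the embedded resource URLs that a page's HTML references and reports those absent from a set of captured URLs. It is a minimal sketch under stated assumptions, not the NYARC or Archive-It workflow: the names `EmbeddedURLCollector` and `missing_urls`, the tag list, and the `example.org` addresses are all hypothetical, and it uses only Python's standard library.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin

# Tag/attribute pairs that commonly point at embedded resources.
# This list is an illustrative assumption and is not exhaustive.
EMBED_ATTRS = {
    "img": "src",
    "script": "src",
    "link": "href",
    "source": "src",
    "video": "src",
    "iframe": "src",
    "embed": "src",
}


class EmbeddedURLCollector(HTMLParser):
    """Collects absolute URLs of resources referenced in static HTML."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.urls = set()

    def handle_starttag(self, tag, attrs):
        wanted = EMBED_ATTRS.get(tag)
        if wanted is None:
            return
        for name, value in attrs:
            if name == wanted and value:
                # Resolve relative references against the page URL.
                self.urls.add(urljoin(self.base_url, value))


def missing_urls(page_html, base_url, captured_urls):
    """Return referenced URLs that are not in the captured set, sorted."""
    collector = EmbeddedURLCollector(base_url)
    collector.feed(page_html)
    return sorted(collector.urls - set(captured_urls))


if __name__ == "__main__":
    # Hypothetical page and capture list, for demonstration only.
    html = (
        '<html><head><link href="/style.css"></head>'
        '<body><img src="images/a.jpg">'
        '<script src="https://cdn.example.org/app.js"></script></body></html>'
    )
    captured = {"https://museum.example.org/style.css"}
    for url in missing_urls(html, "https://museum.example.org/exhibitions/", captured):
        print(url)
```

A static comparison like this only finds URLs written into the HTML. URLs requested by scripts at run time stay invisible until someone exercises the page, which is the point slides 25 and 26 make about testing functionality by hand.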
  27. 27. QUALITY ASSURANCE Context Process Challenges Successes
  28. 28. QUALITY ASSURANCE Context Process Challenges Successes 2 months later... Unfortunately, by the time they engineered a solution, the content had already changed. However, we will be able to have successful crawls in the future.
  29. 29. QUALITY ASSURANCE Context Process Challenges Successes Large, Content-Rich Websites
  30. 30. QUALITY ASSURANCE Context Process Challenges Successes 4 views of 755 exhibitions + 2 views of 1,206 artworks + 47 museum collections + other sections of the website = 5,473 clicks × 2 (both live and archived sites) = 10,946 clicks
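
The tally on slide 30 can be checked with a few lines of arithmetic. This is only a sketch that uses the figures exactly as printed on the slide; the variable names are illustrative. Note that the three itemized counts alone come to 5,479, slightly more than the 5,473 subtotal shown, so the subtotal presumably rests on counts that differ a little from those listed. The doubling step (live plus archived site) is consistent with the stated total.

```python
# Figures as printed on slide 30.
exhibition_clicks = 4 * 755   # 4 views of each of 755 exhibitions
artwork_clicks = 2 * 1206     # 2 views of each of 1,206 artworks
collection_clicks = 47        # 47 museum collections

# Sum of the itemized counts, before "other sections of the website".
itemized = exhibition_clicks + artwork_clicks + collection_clicks

slide_subtotal = 5473             # subtotal stated on the slide
slide_total = slide_subtotal * 2  # live site and archived site

print(itemized)     # 5479
print(slide_total)  # 10946
```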
  31. 31. QUALITY ASSURANCE Live Wayback Context Process Challenges Successes Video content hosted by a third party may not play back even though the URLs have been patch crawled.
  32. 32. QUALITY ASSURANCE Live Wayback Context Process Challenges Successes Flash files (.swf) are triggered by time-based processes that require manual intervention (and patience). These files can conceal access to more files, including other .swf, .jpeg, and .html files.
  33. 33. QUALITY ASSURANCE Live Wayback Context Process Challenges Successes Web archiving art resources has unique complications due to the use of images, video, dynamic content, and complex site structures. Dynamic content is often hidden behind PHP or JavaScript files that the crawler is unable to access. Sometimes the crawler is able to collect all of the necessary files, but the viewing interface still fails to display.
  34. 34. QUALITY ASSURANCE Live Wayback Context Process Challenges Successes
  35. 35. QUALITY ASSURANCE Live Wayback Context Process Challenges Successes
  37. 37. PRESERVATION Live Wayback Why preservation? Measuring success Remaining challenges
  38. 38. PRESERVATION Live Wayback Why preservation? Measuring success Remaining challenges
  39. 39. PRESERVATION Live Wayback Why preservation? Measuring success Remaining challenges
  40. 40. PRESERVATION Live Wayback Why preservation? Measuring success Remaining challenges
  41. 41. PRESERVATION Live Wayback Why preservation? Measuring success Remaining challenges
  42. 42. PRESERVATION Live Wayback Why preservation? Measuring success Remaining challenges <warc> <xml>
  43. 43. PRESERVATION Live Wayback Why preservation? Measuring success Remaining challenges Control + Integrity + Security + Accessibility
  44. 44. PRESERVATION Live Wayback Why preservation? Measuring success Remaining challenges Control + Integrity + Security + Accessibility. Levels of Preservation (NDSA, 2013): Storage and Geographic Location; File Fixity and Data Integrity; Information Security; Metadata & File Formats
  45. 45. PRESERVATION Live Wayback Why preservation? Measuring success Remaining challenges Levels of Preservation (NDSA, 2013): Storage and Geographic Location; File Fixity and Data Integrity; Information Security; Metadata; File Formats 3 421
  46. 46. PRESERVATION Live Wayback Why preservation? Measuring success Remaining challenges Levels of Preservation (NDSA, 2013): Storage and Geographic Location; File Fixity and Data Integrity; Information Security; Metadata; File Formats 3 421
  47. 47. PRESERVATION Live Wayback Why preservation? Measuring success Remaining challenges Levels of Preservation (NDSA, 2013): Storage and Geographic Location; File Fixity and Data Integrity; Information Security; Metadata; File Formats 3 421
  48. 48. PRESERVATION Live Wayback Why preservation? Measuring success Remaining challenges Levels of Preservation (NDSA, 2013): Storage and Geographic Location; File Fixity and Data Integrity; Information Security; Metadata; File Formats 3 421
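
The "File Fixity and Data Integrity" category on these slides is commonly addressed by recording a checksum for each stored file and recomputing it later. The sketch below shows one way to do that with SHA-256 from Python's standard library. It is an assumption-laden illustration, not a description of NYARC's actual practice: the function names, the `*.warc.gz` file pattern, and the choice of SHA-256 are all hypothetical.

```python
import hashlib
from pathlib import Path


def sha256_of(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(directory, pattern="*.warc.gz"):
    """Map each matching file name in the directory to its checksum."""
    return {p.name: sha256_of(p) for p in sorted(Path(directory).glob(pattern))}


def changed_files(directory, manifest, pattern="*.warc.gz"):
    """Return names from the manifest that are now missing or altered."""
    current = build_manifest(directory, pattern)
    return sorted(name for name in manifest if current.get(name) != manifest[name])
```

In use, `build_manifest` would run once when files are received and its result stored alongside them; `changed_files` can then be run on a schedule, with an empty list meaning every recorded file still matches its checksum.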
  49. 49. PRESERVATION Live Wayback Why preservation? Measuring success Remaining challenges <warc> <xml>
  50. 50. PRESERVATION Live Wayback Why preservation? Measuring success Remaining challenges
  51. 51. PRESERVATION Live Wayback Why preservation? Measuring success Remaining challenges
  53. 53. Acknowledgements Presentation template and relationship diagram courtesy of Karl-Rainer Blumenthal. Andy Goldsworthy artwork courtesy of degine.blogspot.com. An Art Resource in New York: The Collective Collection of the NYARC Art Museum Libraries, by Brian Lavoie, Ph.D., Senior Research Scientist, and Günter Waibel, Program Officer, OCLC Programs and Research, c2008. http://www.oclc.org/content/dam/research/publications/library/2008/2008-02.pdf http://hyperallergic.com/211250/moma-is-archiving-its-exhibition-websites-before-they-expire/ Icons from The Noun Project (https://thenounproject.com) ● “Vortex” by Eli Ratner ● “Light-bulb” by Ian Mawle ● “Architect” by Luis Prado THANK YOU!
