Archiving the Deepwater Horizon Oil Spill

Presentation for Spring 2011 International Internet Preservation Consortium

  • Gave subject experts access to WAS: # site nominations: 0. Gave subject experts access to external site nomination tool: # nominations: 6. Pulled librarian-nominated sites from Delicious: 400+.
  • Archiving the Deepwater Horizon Oil Spill

    1. Archiving the Deepwater Horizon Oil Spill
       Tracy Seneca
       California Digital Library
       http://was.cdlib.org
    2. Archive Scope
       • 527 sites
       • 10,402 captures
       • May 5 to present
       • Tapering to less frequent captures of key sites, about 200 captures per month
       • 76 million+ documents
       • 2 TB
    3.
    4. Archive Selection & Context
       Planned archives
       • Advance subject expertise
       • Time for evaluation
       • Time for QA
       • Focus on comprehensive capture
       • Traditional collection development
       • Control over scale
       Event archives
       • Act quickly
       • No one is the expert
       • Collaboration required
       • Every efficiency matters
       • Frequent shallow captures / rapidly changing sites
       • Massive scale
    5. 3 Challenges
       • Site selection
       • Site / capture management
       • Quality assurance
    6. Getting Volunteers
       • Tried bringing volunteers into service
       • “Add to WAS” browser button
       • Tried external nomination tool
       • TAP INTO WHAT USERS ARE ALREADY DOING
    7. LSU tags relevant sites in Delicious; CDL imports Delicious JSON feed into WAS
       • ~50% Delicious
       • ~45% 1 curator
       • ~5% everything else
    8. Site Management - From:
       • Fixed table
       • Not enough control
       • Few batch actions
    9. To
    10. To (2)
    11.
    12.
    13. Collection Observations
       • Of ~350 sites from the Hurricane Katrina archive, only about 120 were initially relevant to the oil spill
       • Different responding organizations
       • The relevant sites:
         • Political offices / government agencies in the region
         • News sources in the region
         • Environmental organizations
    14.
    15. Reminders
       • Use the tools you build
       • At larger scale than your users
       • Take advantage of existing workflows
       • Collection building drives innovation
    16. Next Steps
       • Release public archive
       • Review with Louisiana State University librarians
       Web Archiving Service
       http://was.cdlib.org
       www.facebook.com/webarchiving
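
Slide 7 describes CDL importing a Delicious JSON feed of librarian-tagged sites into WAS. The slides do not show how that import works, so the following is only a minimal sketch of the general idea: read a bookmark feed, drop malformed and duplicate entries, and produce a list of candidate seed URLs. The feed shape (a JSON array with "u" for the URL, "d" for the title, and "t" for tags), the sample data, and the function name `seeds_from_feed` are all assumptions made for illustration, not the actual WAS implementation.

```python
import json
from urllib.parse import urlsplit

# Hypothetical sample in the shape of a Delicious-style bookmark feed.
# The field names "u" (URL), "d" (title), and "t" (tags) are assumptions.
SAMPLE_FEED = """
[
  {"u": "http://www.example.org/oil-spill/", "d": "Example response page", "t": ["oilspill", "gulf"]},
  {"u": "http://www.example.org/oil-spill/", "d": "Duplicate bookmark", "t": ["oilspill"]},
  {"u": "http://news.example.com/gulf", "d": "Example news coverage", "t": ["news"]},
  {"u": "not a url", "d": "Malformed entry", "t": ["oilspill"]}
]
"""


def seeds_from_feed(feed_text, required_tag=None):
    """Return unique, well-formed seed URLs from a bookmark feed, in feed order."""
    seen = set()
    seeds = []
    for entry in json.loads(feed_text):
        url = entry.get("u", "").strip()
        parts = urlsplit(url)
        # Skip anything that is not an absolute http(s) URL.
        if parts.scheme not in ("http", "https") or not parts.netloc:
            continue
        # Optionally keep only bookmarks carrying a given tag.
        if required_tag is not None and required_tag not in entry.get("t", []):
            continue
        # Keep the first bookmark for each URL and ignore later duplicates.
        if url in seen:
            continue
        seen.add(url)
        seeds.append({"url": url, "title": entry.get("d", "")})
    return seeds


if __name__ == "__main__":
    for seed in seeds_from_feed(SAMPLE_FEED, required_tag="oilspill"):
        print(seed["url"], "-", seed["title"])
```

Filtering on a tag mirrors the workflow on the slide, where curators mark relevant sites in a tool they already use and the archive picks up only what they tagged.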
