Web archiving collaborations: a presentation for colleagues working in the Libraries of the Metropolitan Museum of Art

These slides were used to support a presentation on web archiving collaborations for colleagues working in the Libraries of the Metropolitan Museum of Art.

Published in: Education

  1. Web archiving collaborations at Columbia University Libraries
     Anna Perricci, Columbia University Libraries
     Metropolitan Museum of Art (August 19, 2014)
  2. Web Resources Archiving Collaboration
     Many thanks to the Mellon Foundation
     Building collaborations among
     • The web archiving community
     • Other research libraries
     • Users and potential users of web archives
     • Website creators
  3. Incentives grants to advance web archiving tools
     Image source: http://imgur.com/gallery/vG7KE48
  4. Incentive awards projects
     • Warcbase: Building a Scalable Web Archiving Platform on HBase and Hadoop (Jimmy Lin, University of Maryland)
     • Archiving Transactions Towards Uninterruptible Web Service (Zhiwu Xie and Edward A. Fox, Virginia Tech University)
  5. Incentive awards projects
     • Visualizing Digital Collections of Web Archives (Michele Weigle, Old Dominion University)
     • Tools for Managing Seed URLs (Michael Nelson, Old Dominion University)
  6. Incentive awards projects
     • Perma.cc: Mitigating the Pervasive Problem of Link Rot in Scholarly Works and Preserving Online Content (Kim Dulin, The Harvard Library Innovation Lab)
     • Free Law Project: Providing free access to primary legal materials, developing legal research tools, and supporting academic research on legal corpora
  7. Building an efficient and scalable national framework for collecting web content
     Image source: http://imgur.com/gallery/1m5MBKf
  8. Designated space for collaborative collecting
  9. Collaborative Architecture, Urbanism and Sustainability Web Archive (CAUSEWAY)
     https://archive-it.org/collections/4638
  10. Collaboration with music librarians
  11. Contemporary composers—the perfect storm?
  12. Contemporary Composers Web Archive
     Selectors
     • Borrow Direct Music Librarians Group: music librarians at Brown, Columbia, Cornell, Dartmouth, Harvard, Johns Hopkins, Princeton, and Yale universities, MIT, and the universities of Chicago and Pennsylvania
     Cataloging expertise
     • Russell Merritt (cataloger specializing in music resources)
     • Kate Harcourt (Director of Original and Special Materials Cataloging)
     • Alex Thurman (Web Resources Collection Coordinator)
  13. Contemporary Composers Web Archive
     https://archive-it.org/collections/4019
  14. Quality Assurance
  15. Creating MARC records for web archives
     • Creating MARC records for archived websites is standard practice at CUL
       – MARC records make web archives discoverable in CLIO (Columbia Libraries Information Online)
     • Collection-level and seed-level records
     • Will use the Archive-It interface to make Dublin Core records
  16. Patron view of record in CLIO
  17. Cataloger’s view of record in CLIO
  18. Anticipating wider use of MARC records
     • Records have been released to WorldCat
     • Collaborators on cataloging were attentive to which fields will ordinarily be stripped out when a MARC record is imported to another institution’s OPAC
  19. CCWA MARC records
     • So far, a sample of 10 records has taught us…
     • Positive feedback from music librarians
     • Next we will add another 44 records for the archived sites in CCWA
  20. Project tracking
  21. Use cases
  22. Who are the web archives for? Are they being used? Could we encourage more effective use?
  23. http://hrwa.cul.columbia.edu
  24. Using the Human Rights Web Archive & learning from human rights scholars’ work (publications, citations)
  25. Citations scraped from articles published in 2010 in select scholarly journals
  26. Isolating URLs from list of citations (approximately 10% of citations scraped have URLs in them)
  27. Best Practices for site creators: working with website creators
     Image source: http://imgur.com/gallery/NWJ12Pl
  28. Open issues: division and maintenance of cooperative efforts (communication, software and more)
  29. Process over next 16 months
     • Further planning (revision as needed) and user interviews
     • Maintain group communication
     • Ongoing growth (scale of collecting and distribution of effort)
     • Present shared costs and sustainability models (currently in development)
     • 3-5 year plan for Borrow Direct collaborations (collections strategy, finances, workflows and governance)
     • If collaboration persists, identify themes for further collecting
     • Catalog resources to high standards
     • Quality Assurance and ongoing evaluation
  30. Web archiving initiatives focusing on art resources
     An initiative designed to address the “urgent need to document the dynamic web-based versions of auction catalogues, catalogues raisonnés, and scholarly research projects, as well as artist, gallery, and museum websites” (http://www.nyarc.org/content/web-archiving)
     Artists Files Special Interest Group
  31. Questions?
     Image source: http://imgur.com/gallery/qoCqQoh
  32. Resources that came up in the Q & A
     • Internet Archive "Save a Page" Plug-In for Chrome: https://github.com/lintool/chrome-archive-this-page
     • SAA Web Archiving Roundtable: http://webarchivingrt.wordpress.com/
  33. Thanks!
     Anna Perricci
     alp2198@columbia.edu
     @AnnaPerricci
     Columbia University Libraries
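
Slides 25 and 26 describe scraping citations from journal articles and isolating the URLs they contain. The slides do not say how this was done, so the following is only a minimal sketch of one possible approach, assuming the citations are already available as plain-text strings. The function names, the regular expression, and the sample citations are all hypothetical and are not taken from the presentation.

```python
import re

# Assumption: a URL is any run starting with http:// or https:// that
# ends at whitespace, an angle bracket, a parenthesis, or a square bracket.
URL_PATTERN = re.compile(r"https?://[^\s<>()\[\]]+", re.IGNORECASE)


def isolate_urls(citations):
    """Return (citation, urls) pairs for citations that contain a URL."""
    results = []
    for citation in citations:
        # Trailing sentence punctuation is usually not part of the URL.
        urls = [u.rstrip(".,;:") for u in URL_PATTERN.findall(citation)]
        if urls:
            results.append((citation, urls))
    return results


def share_with_urls(citations):
    """Return the fraction of citations that contain at least one URL."""
    if not citations:
        return 0.0
    return len(isolate_urls(citations)) / len(citations)


# Hypothetical sample data, for illustration only.
sample_citations = [
    "Doe, J. (2009). Annual report. http://example.org/report.pdf.",
    "Smith, A. (2008). A printed monograph. City: Example Press.",
]

for citation, urls in isolate_urls(sample_citations):
    print(urls)
print(share_with_urls(sample_citations))
```

A share computed this way could be compared against the roughly 10% figure reported on slide 26, though the result depends on how the citations were scraped and on how strictly a URL is defined.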