These slides were used to support a presentation on web archiving collaborations for colleagues working in the Libraries of the Metropolitan Museum of Art.
Web Archiving Collaborations at Columbia University Libraries
1. Web archiving collaborations at Columbia University Libraries
Anna Perricci
Columbia University Libraries
Metropolitan Museum of Art (August 19, 2014)
2. Web Resources Archiving Collaboration
Many thanks to the Mellon Foundation
Building collaborations among
• The web archiving community
• Other research libraries
• Users and potential users of web archives
• Website creators
3. Incentives grants to advance web archiving tools
Image source: http://imgur.com/gallery/vG7KE48
4. Incentive awards projects
Warcbase: Building a Scalable Web Archiving Platform on HBase and Hadoop. (Jimmy Lin, University of Maryland)
Archiving Transactions Towards Uninterruptible Web Service (Zhiwu Xie and Edward A. Fox, Virginia Tech University)
5. Incentive awards projects
Visualizing Digital Collections of Web Archives (Michele Weigle, Old Dominion University)
Tools for Managing Seed URLs (Michael Nelson, Old Dominion University)
6. Incentive awards projects
Perma.cc: Mitigating the Pervasive Problem of Link Rot in Scholarly Works and Preserving Online Content (Kim Dulin, The Harvard Library Innovation Lab)
Free Law Project: Providing free access to primary legal materials, developing legal research tools, and supporting academic research on legal corpora
7. Building an efficient and scalable national framework for collecting web content
Image source: http://imgur.com/gallery/1m5MBKf
12. Contemporary Composers Web Archive
Selectors
• Borrow Direct Music Librarians Group: music librarians at Brown, Columbia, Cornell, Dartmouth, Harvard, Johns Hopkins, Princeton, and Yale universities, MIT, and the universities of Chicago and Pennsylvania
Cataloging expertise
• Russell Merritt (cataloger specializing in music resources)
• Kate Harcourt (Director of Original and Special Materials Cataloging)
• Alex Thurman (Web Resources Collection Coordinator)
15. Creating MARC records for web archives
• Creating MARC records for archived websites is standard practice at CUL
– MARC records make web archives discoverable in CLIO (Columbia Libraries Information Online)
• Collection level and seed level records
• Will use Archive-It interface to make Dublin Core records
18. Anticipating wider use of MARC records
• Records have been released to WorldCat
• Collaborators on cataloging were attentive to which fields will ordinarily be stripped out when a MARC record is imported to another institution’s OPAC
19. CCWA MARC records
• So far, a sample of 10 records has taught us…
• Positive feedback from music librarians
• Next: another 44 records for the archived sites in CCWA will be added soon
26. Isolating URLs from list of citations (approximately 10% of citations scraped have URLs in them)
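The isolation step named on this slide could be sketched in Python roughly as follows. This is a minimal sketch under stated assumptions, not the script used in the project: the regular expression, the function name `isolate_urls`, and the sample citations are all hypothetical and chosen only for illustration.

```python
import re

# Hypothetical pattern: an http(s) URL runs until whitespace or a closing
# bracket/quote character. Real citation data may need a stricter pattern.
URL_PATTERN = re.compile(r"https?://[^\s<>\"')\]]+")


def isolate_urls(citations):
    """Return (citation, urls) pairs for citations that contain a URL."""
    results = []
    for citation in citations:
        # Trailing punctuation usually belongs to the sentence, not the URL.
        urls = [u.rstrip(".,;:") for u in URL_PATTERN.findall(citation)]
        if urls:
            results.append((citation, urls))
    return results


# Made-up citations, for illustration only.
sample = [
    "Doe, J. (2010). A printed monograph. Example Press.",
    "Roe, R. (2012). An online report. Retrieved from http://example.org/report.pdf.",
    "Poe, P. (2013). Project site (https://example.com/project), accessed 2014.",
]

with_urls = isolate_urls(sample)
# Share of citations that contain at least one URL.
share = len(with_urls) / len(sample)
for citation, urls in with_urls:
    print(urls)
```

The `share` value gives the proportion of citations containing a URL, which is the kind of figure the slide reports for the scraped list.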
27. Best Practices for site creators: working with website creators
Image source: http://imgur.com/gallery/NWJ12Pl
28. Open issues: division and maintenance of cooperative efforts (communication, software and more)
29. Process over next 16 months
• Further planning (revision as needed) and user interviews
• Maintain group communication
• Ongoing growth (scale of collecting and distribution of effort)
• Present shared costs and sustainability models (currently in development)
• 3-5 year plan for Borrow Direct collaborations (collections strategy, finances, workflows and governance)
• If collaboration persists, identify themes for further collecting
• Catalog resources to high standards
• Quality Assurance and ongoing evaluation
30. Web archiving initiatives focusing on art resources
An initiative designed to address the “urgent need to document the dynamic web-based versions of auction catalogues, catalogues raisonnés, and scholarly research projects, as well as artist, gallery, and museum websites” (http://www.nyarc.org/content/web-archiving)
Artists Files Special Interest Group
32. Resources that came up in the Q & A
• Internet Archive "Save a Page" Plug-In for Chrome https://github.com/lintool/chrome-archive-this-page
• SAA Web Archiving Roundtable http://webarchivingrt.wordpress.com/
33. Thanks!
Anna Perricci
alp2198@columbia.edu
@AnnaPerricci
Columbia University Libraries