1. Content Working Group
2015 NDSA Web Archiving
Survey Report Highlights
Nicholas Taylor (@nullhandle)
Web Archiving Service Manager
Stanford University Libraries
Archives 2016: Web Archiving Roundtable
August 3, 2016
2. Content Working Group
NDSA Web Archiving Survey Working Group
Jefferson Bailey
Internet Archive / Archive-It
Edward McCain
University of Missouri
Abbey Potter
Library of Congress
Abbie Grotke
Library of Congress
Christie Moffatt
National Library of Medicine
Nicholas Taylor
Stanford University Libraries
3. Content Working Group
NDSA Web Archiving survey background
2011
• 78 respondents
• program info
• tools/services
• access
• policies
2013
• 92 respondents
• program info
• staff, metrics, skills,
content concerns
• tools/services
• access/discovery
• new options
• policies
• embargo, social
media, robots.txt,
resources
2015
• 106 respondents
• program info
• areas of progress,
collaboration
interests + barriers
• tools/services
• replication targets
• access/discovery
• researchers
• policies
9. Content Working Group
shift from pilot to production (all responses)
0.00%
10.00%
20.00%
30.00%
40.00%
50.00%
60.00%
70.00%
80.00%
90.00%
Planning Pilot Production Discontinued
2011 2013 2015
10. Content Working Group
shift from pilot to production (longitudinal responses)
0
2
4
6
8
10
12
14
16
18
2011 2013 2015
Planning Pilot Production
11. Content Working Group
increased perceptions of progress
Significant
progress
29%
Some
progress
43%
About the
same
15%
Slightly
worse off
1%
Much
worse off
2%
Other
10%
2013
Significant
progress
49%
Some
progress
22%
About the
same
19%
Slightly
worse off
0%
Much
worse off
0%
Other
10%
2015
12. Content Working Group
perceived progress on data capture, appraisal, vision
40.00%
17.65%
34.12%
5.88%
45.88%
37.65%
58.82%
22.35%
10.59%
12.94%
24.71%
16.47%
0.00%
10.00%
20.00%
30.00%
40.00%
50.00%
60.00%
70.00%
13. Content Working Group
low perceived progress on access, metadata, QA
15.85%
28.05%
24.39% 24.39%
12.20%
9.76%
12.20%
36.59%
21.95%
29.27%
37.80%
52.44%
0.00%
10.00%
20.00%
30.00%
40.00%
50.00%
60.00%
14. Content Working Group
Archiving Focus
“Ant Farm Media Van v.08 (Time Capsule) in Bellewether at Southern Exposure” by Steve Rhodes under CC BY-NC-SA 2.0
15. Content Working Group
more archiving of own content
0.00%
10.00%
20.00%
30.00%
40.00%
50.00%
60.00%
70.00%
80.00%
90.00%
100.00%
Own content 3rd-party content Both Other
2011 2013 2015
16. Content Working Group
access, cost, quality are top valued metrics
0.00%
10.00%
20.00%
30.00%
40.00%
50.00%
60.00%
70.00%
Access + use Cost Data volume Institutional
buy-in
Loss Quality Risk
Management
Other
2013 2015
17. Content Working Group
tools, appraisal, QA are top valued skills
0.00%
10.00%
20.00%
30.00%
40.00%
50.00%
60.00%
70.00%
80.00%
2013 2015
18. Content Working Group
collaboration interest in QA, capture, metadata
40.91%
52.27%
31.82%
27.27%
51.14%
64.77%
46.59%
3.41%
0.00%
10.00%
20.00%
30.00%
40.00%
50.00%
60.00%
70.00%
Risk
management
Capture
optimization
Collection
development
APIs Metadata
standards
QA Tool
development
Other
20. Content Working Group
more using multiple archiving solutions
25.40%
60.32%
14.29%
19.51%
63.41%
15.85%
4.81%
63.29%
30.38%
0.00%
10.00%
20.00%
30.00%
40.00%
50.00%
60.00%
70.00%
Local External Both
2011 2013 2015
21. Content Working Group
flat data transfer from service provider
19.15%
80.85%
20.29%
79.71%
20.27%
79.73%
0.00%
10.00%
20.00%
30.00%
40.00%
50.00%
60.00%
70.00%
80.00%
90.00%
Transfer data Do not transfer data
2011 2013 2015
22. Content Working Group
transferring data locally + to other services
58.82%
47.06%
5.88%
0.00%
10.00%
20.00%
30.00%
40.00%
50.00%
60.00%
70.00%
Local External Both
23. Content Working Group
trust provider + building locally for data transfer
0.00%
10.00%
20.00%
30.00%
40.00%
50.00%
60.00%
Trust data capture
service provider
Building local
infrastructure
No place to
store/maintain it
Not sure what we’d do
with it once we got it
Other
2011 2013 2015
25. Content Working Group
more conditional handling of robots.txt
0.00%
10.00%
20.00%
30.00%
40.00%
50.00%
60.00%
70.00%
80.00%
Always Never Sometimes/it depends Don't know
2011 2013 2015
26. Content Working Group
policies of other orgs key for policy-making
0.00%
10.00%
20.00%
30.00%
40.00%
50.00%
60.00%
70.00%
ARL Code of
Best Practices
Oakland
Archive Policy
Section 108
Study Group
Legal counsel Statutory
authority
Policies of
other orgs
Previous
NDSA surveys
Other
2013 2015
28. Content Working Group
overall picture
• generally positive perceptions of progress
• more moves to production
• growing proportion of universities
• growing proportion of focus on own content
• staffing largely remains fractional w/ minor growth
• majority use external service but hints of hybrid
approaches
• more comfort w/ conditional policy approaches