Rethinking Web Archiving Quality Assurance for Impact, Scalability, and Sustainability
Nicholas Taylor (@nullhandle)
Web Archiving Service Manager
Stanford University Libraries
Archives 2016
209 - Balancing Quality of Life and Quality Assurance
August 4, 2016
QA panelists
Dory Bower
Government Publishing Office
Lori Donovan
Internet Archive / Archive-It
Dallas Pillen
Bentley Historical Library
Nicholas Taylor
Stanford University Libraries
Alex Thurman
Columbia University Libraries
balancing QA + quality of life?
“Tab Tatham "junk. balance scales."” by ▓▒░ TORLEY ░▒▓ under CC BY-SA 2.0
overheard re: QA @ SAA 2015
• “we set and forget; I’m just glad we’re doing something”
• “did more QA at the beginning but, well, I don’t really look at the reports any more”
• “steady, ongoing QA is challenging”
• “occasionally I set aside a lunch hour to do some QA”
• “my strategy right now is to let the big schools figure it out”
2015 SAA WebArchRT discussion
• if you could only apply 3 QA practices to your web archives, which 3?
• do you apply different QA practices to web archives created for different use cases?
• how do you ensure that staff time allocated to QA is best spent?
quality assurance in the lifecycle
Archive-It: “The Web Archiving Life Cycle Model”
quality assurance, expansively
typical QA
• parsing robots.txt
• scoping rules
• object count limits
• test crawling
• inspecting archived site
• reviewing reports
• patch crawling
and more
• seed selection
• assessing live site
• capture tool selection
• crawl scheduling
• crawl duration limits
• monitoring crawl
• archivability advocacy
• training
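Of the typical QA steps listed above, parsing robots.txt is one that lends itself to a small script. What follows is a minimal sketch, not anything shown in the talk: it uses Python's standard `urllib.robotparser` to check whether a seed URL is disallowed for a given crawler. The function name `seed_is_blocked`, the user agent `example-archive-bot`, and the example.org URLs are all hypothetical.

```python
from urllib.robotparser import RobotFileParser


def seed_is_blocked(robots_txt: str, seed_url: str, user_agent: str) -> bool:
    """Return True if robots.txt disallows user_agent from fetching seed_url."""
    parser = RobotFileParser()
    # parse() takes an iterable of lines, so no network request is made here.
    parser.parse(robots_txt.splitlines())
    return not parser.can_fetch(user_agent, seed_url)


if __name__ == "__main__":
    # Hypothetical robots.txt content and crawler name, for illustration only.
    robots = "User-agent: *\nDisallow: /private/\n"
    agent = "example-archive-bot"
    for url in ("https://example.org/",
                "https://example.org/private/page.html"):
        status = "blocked" if seed_is_blocked(robots, url, agent) else "allowed"
        print(url, status)
```

Passing the robots.txt text in as a string keeps the check independent of how the file was retrieved, so it can be run against a live site or against a copy saved during a test crawl.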
3rd highest desired skill
NDSA: “2015 NDSA Web Archiving Survey”
low perceived programmatic progress
NDSA: “2015 NDSA Web Archiving Survey”
greatest collaboration interest
[bar chart categories: Policy + Risk Management; Capture Configuration; Collaborative Collection Dev; Input on APIs + Standards; Metadata Standards; QA Techniques + Strategies; Tool Dev; Other]
NDSA: “2015 NDSA Web Archiving Survey”
RETHINKING QA AT STANFORD
“stanford13” by Paradoxotaur under CC BY-SA 2.0
web archiving at Stanford
• 7 Archive-It accounts
• Heritrix, Webrecorder
• local preservation, discovery, access
• program manager, curators, students
• tens of collections
• thousands of seeds
Internet Archive: “Stanford University Homepage”
quality assurance goals
• maximize impact + efficiency of QA efforts
• enable diverse, distributed, + approachable contributions
• calibrate investments in quality based on tool capabilities
“Goals” by Eric Peacock under CC BY-NC-SA 2.0
capture, behavior, appearance
appearance
behavior
capture
NYARC: “I. Introduction - NYARC Documentation”
in practice
care more about…
• report data
• crawl finishing
• 4xx, 5xx, complete robots.txt block
• plausible duration
• plausible object counts
• scoping out extraneous content
• new seeds
care less about…
• visual inspection
• reviewing every capture
• appearance fidelity
• behavior fidelity
• partial content out of scope
• partial content blocked by robots.txt
• ongoing seeds
• ongoing seeds
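The "care more about" column above amounts to a handful of checks on report data, which could be sketched roughly as follows. This is an illustration only, not the workflow Stanford uses: the `SeedReport` fields, the `triage` function, and every threshold value are hypothetical, and real crawl reports (from Archive-It or Heritrix, for example) would need to be mapped onto this shape.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class SeedReport:
    """Hypothetical per-seed summary distilled from a crawl report."""
    seed: str
    finished: bool          # did the crawl run to completion?
    status_code: int        # HTTP status returned for the seed itself
    robots_blocked: bool    # seed completely blocked by robots.txt
    duration_s: float       # crawl duration in seconds
    object_count: int       # number of objects captured


def triage(report: SeedReport,
           min_duration_s: float = 5.0,
           max_duration_s: float = 86_400.0,
           min_objects: int = 10,
           max_objects: int = 100_000) -> List[str]:
    """Return a list of flags worth a human look; an empty list means pass."""
    flags = []
    if not report.finished:
        flags.append("crawl did not finish")
    if 400 <= report.status_code < 600:
        flags.append(f"seed returned HTTP {report.status_code}")
    if report.robots_blocked:
        flags.append("seed completely blocked by robots.txt")
    if not (min_duration_s <= report.duration_s <= max_duration_s):
        flags.append("implausible duration")
    if not (min_objects <= report.object_count <= max_objects):
        flags.append("implausible object count")
    return flags


if __name__ == "__main__":
    report = SeedReport("https://example.org/", finished=True, status_code=404,
                        robots_blocked=False, duration_s=2.0, object_count=1)
    for flag in triage(report):
        print(report.seed, "-", flag)
```

Because the checks only read report data, they can be run over every seed after each crawl, leaving visual inspection for the seeds that get flagged.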
more next from Lori, Alex, Dallas, Dory
“Olympic Relay Handoff” by Dr. Mark Kubert under CC BY-NC-ND 2.0
