Not All Mementos Are Created Equal: Measuring The Impact Of Missing Mementos

Slides presented by Justin F. Brunelle at Digital Preservation 2014 in London.

Published in: Science

  1. Not All Mementos are Created Equal: Measuring the Impact of Missing Resources • Justin F. Brunelle, Mat Kelly, Hany SalahEldeen, Michele C. Weigle, Michael L. Nelson • Old Dominion University • {jbrunelle, mkelly, hany, mweigle, mln}@cs.odu.edu
  2. Goal: Automatically measure the quality of the archives • 20% missing
  3. Goal: Automatically measure the quality of the archives • 14% missing
  4. Goal: Automatically measure the quality of the archives • 28% missing
  5. Goal: Automatically measure the quality of the archives • 7% missing
  6. “Live” XKCD • Missing 17% of embedded resources • Looks complete
  7. “Live” XKCD • Take three resources: logo, main comic, navigation strip • Relative importance? • All present in “Live” XKCD
  8. Damaging XKCD • Created a local memento • Removed the logo and navigation strip • Now missing 29% of embedded resources • Human assessment: looks OK
  9. Damaging XKCD • From our local memento • Removed the main comic • Now missing 24% of embedded resources • Human assessment: not a usable memento
  10. Damaging XKCD • From our local memento • Removed the main comic • Now missing 24% of embedded resources • Human assessment: not a usable memento • Percent of missing embedded resources is not a suitable metric for memento quality
  11. Image Importance • Size (as percentage of all pixels)
  12. Image Importance • Size • Position (in viewport?)
  13. Image Importance • Size • Position • Centrality (in the vertical or horizontal center?)
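
Slides 11 to 13 name three signals of image importance: size, position, and centrality. A minimal sketch of how such signals might be computed and combined follows. The `Box` type, the function names, the equal default weights, and the additive combination are all assumptions made for illustration; the slides do not state the actual weighting.

```python
from dataclasses import dataclass


@dataclass
class Box:
    """Axis-aligned rectangle in page pixels, origin at the top-left (hypothetical type)."""
    x: float
    y: float
    width: float
    height: float


def size_fraction(img: Box, page: Box) -> float:
    """Size signal: share of all page pixels covered by the image."""
    return (img.width * img.height) / (page.width * page.height)


def in_viewport(img: Box, viewport: Box) -> bool:
    """Position signal: does any part of the image overlap the viewport?"""
    return (img.x < viewport.x + viewport.width
            and img.x + img.width > viewport.x
            and img.y < viewport.y + viewport.height
            and img.y + img.height > viewport.y)


def is_central(img: Box, viewport: Box) -> bool:
    """Centrality signal: does the image cross the vertical or horizontal center line?"""
    cx = viewport.x + viewport.width / 2
    cy = viewport.y + viewport.height / 2
    crosses_cx = img.x <= cx <= img.x + img.width
    crosses_cy = img.y <= cy <= img.y + img.height
    return crosses_cx or crosses_cy


def importance(img: Box, page: Box, viewport: Box,
               w_size: float = 1.0, w_pos: float = 1.0, w_center: float = 1.0) -> float:
    """Assumed additive combination of the three signals; weights are placeholders."""
    return (w_size * size_fraction(img, page)
            + w_pos * float(in_viewport(img, viewport))
            + w_center * float(is_central(img, viewport)))


# Illustrative layout only: a large centered image and a small corner logo.
page = Box(0, 0, 1000, 2000)
viewport = Box(0, 0, 1000, 800)
comic = Box(200, 200, 600, 400)
logo = Box(0, 0, 100, 50)

print(importance(comic, page, viewport) > importance(logo, page, viewport))
```

Under these assumed weights the large centered image scores higher than the corner logo, which matches the human assessment on slides 8 to 10.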
  14. Missing CSS • Damage is not limited to images • When CSS is missing, content shifts left
  15. Missing CSS • Partitioned the snapshot into vertical thirds • Determined the background color • Pixel-by-pixel comparison
  16. Missing CSS • Calculated the amount of content in each vertical third • If ≥80% of the content is in the left third and CSS is missing, the CSS is important • Only performed if stylesheets are missing
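
The check on slides 14 to 16 can be sketched roughly as below. This is an illustration under stated assumptions, not the authors' implementation: the snapshot is modeled as a plain list of pixel rows, the background is taken to be the most common pixel value, and "content" means any non-background pixel. The function names and the `threshold` parameter are hypothetical; only the 80% cutoff and the "stylesheet must be missing" precondition come from the slides.

```python
from collections import Counter


def content_share_by_third(pixels):
    """Return the share of non-background pixels in the left, middle, and right third.

    `pixels` is a list of rows, each row a list of hashable pixel values.
    Assumption: the background color is the most frequent pixel value.
    """
    width = len(pixels[0])
    background = Counter(p for row in pixels for p in row).most_common(1)[0][0]
    counts = [0, 0, 0]
    for row in pixels:
        for x, p in enumerate(row):
            if p != background:  # pixel-by-pixel comparison against the background
                counts[min(3 * x // width, 2)] += 1
    total = sum(counts)
    if total == 0:
        return [0.0, 0.0, 0.0]
    return [c / total for c in counts]


def css_is_important(pixels, stylesheet_missing, threshold=0.8):
    """Flag missing CSS as important when content piles up in the left third."""
    if not stylesheet_missing:  # the check only runs when a stylesheet is missing
        return False
    return content_share_by_third(pixels)[0] >= threshold


# Toy 4x9 snapshots: 0 is background, 1 is content.
shifted = [[1, 1, 0, 0, 1, 0, 0, 0, 0]] + [[1, 1, 0, 0, 0, 0, 0, 0, 0] for _ in range(3)]
centered = [[0, 0, 0, 1, 1, 1, 0, 0, 0] for _ in range(4)]

print(css_is_important(shifted, stylesheet_missing=True))
print(css_is_important(centered, stylesheet_missing=True))
```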
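  17. Percent Missing vs. Weighted Damage
      • $M_M$ = percent of embedded resources missing: $M_M = \frac{\text{EmbeddedResourcesMissing}}{\text{TotalEmbeddedResources}}$
      • $D_M$ = damage rating of missing embedded resources: $D_M = \frac{D_{M_{\text{Actual}}}}{D_{M_{\text{Potential}}}}$
      • $D_{M_{\text{Potential}}} = \sum_{i=1}^{n_{[I|MM]}} \frac{D_{[I|MM]}(i)}{n_{[I|MM]}} + \sum_{i=1}^{n_{[C]}} \frac{D_{[C]}(i)}{n_{[C]}}$
      • $I$ = Image, $MM$ = MultiMedia, $C$ = CSS
  18. Calculated Damage
      • $M_M$ = percent of embedded resources missing: $M_M = \frac{\text{EmbeddedResourcesMissing}}{\text{TotalEmbeddedResources}}$
      • $D_M$ = damage rating of missing embedded resources: $D_M = \frac{D_{M_{\text{Actual}}}}{D_{M_{\text{Potential}}}}$
      • $D_{M_{\text{Potential}}} = \sum_{i=1}^{n_{[I|MM]}} \frac{D_{[I|MM]}(i)}{n_{[I|MM]}} + \sum_{i=1}^{n_{[C]}} \frac{D_{[C]}(i)}{n_{[C]}}$
      • Examples: $M_M = 0.29$, $D_M = 0.36$; and $M_M = 0.24$, $D_M = 0.41$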
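
To make the two metrics concrete, here is a small sketch that computes $M_M$ and $D_M$ for a toy page. The potential-damage function follows the formula on slides 17 and 18. The actual-damage function is an assumption: the slides do not define it, so this sketch takes it to be the same per-class average restricted to resources that are missing. All names and the per-resource damage values are hypothetical and do not reproduce the slide's XKCD numbers.

```python
def percent_missing(missing_count, total_count):
    """M_M: fraction of embedded resources that are missing."""
    return missing_count / total_count


def _class_damage(resources, only_missing):
    """Sum D(i)/n over one resource class; n counts every resource in the class."""
    if not resources:
        return 0.0
    n = len(resources)
    return sum(d / n for d, missing in resources if missing or not only_missing)


def potential_damage(media, css):
    """D_M potential: damage if every image/multimedia and CSS resource were missing."""
    return _class_damage(media, only_missing=False) + _class_damage(css, only_missing=False)


def actual_damage(media, css):
    """ASSUMPTION: same sums as the potential, restricted to missing resources."""
    return _class_damage(media, only_missing=True) + _class_damage(css, only_missing=True)


def damage_rating(media, css):
    """D_M = actual damage / potential damage."""
    return actual_damage(media, css) / potential_damage(media, css)


# Each resource is (damage value D(i), is_missing). Values are made up.
media = [(0.6, True), (0.3, False), (0.1, False)]
css = [(1.0, False)]

m_m = percent_missing(missing_count=1, total_count=len(media) + len(css))
d_m = damage_rating(media, css)
print(m_m, round(d_m, 2))
```

The point of the ratio is that one missing but heavily weighted resource can move $D_M$ independently of $M_M$, which only counts resources.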
  19. What do Web users think?
  20. Setting up the Turk Test • Amazon Mechanical Turk workers (turkers) represent real web users • Two legs of the experiment: (1) manually damaged memento vs. live resource, with 10 manually damaged mementos and resources; (2) real memento vs. real memento, with 100 URI-Rs and one memento per year
  24. Quantifying Turker Response • 5 turkers for each comparison • Assume $D_A < D_B$ (i.e., A is less damaged) • Measure turker agreement, one vote (Y) each from Turkers 1 to 5 • Result: Image A 5, Image B 0; split 5-0
  25. Quantifying Turker Response • 5 turkers for each comparison • Assume $D_A < D_B$ (i.e., A is less damaged) • Measure turker agreement, one vote (Y) each from Turkers 1 to 5 • Result: Image A 4, Image B 1; split 4-1
  26. Quantifying Turker Response • 5 turkers for each comparison • Assume $D_A < D_B$ (i.e., A is less damaged) • Measure turker agreement, one vote (Y) each from Turkers 1 to 5 • Result: Image A 0, Image B 5; split 0-5
  27. Quantifying Turker Response • 5 turkers for each comparison • Assume $D_A < D_B$ (i.e., A is less damaged) • Measure turker agreement, one vote (Y) each from Turkers 1 to 5 • Result: Image A 0, Image B 5; split 0-5 • No agreement!
  28. Quantifying Turker Response • 5 turkers for each comparison • Assume $D_A < D_B$ (i.e., A is less damaged) • Measure turker agreement, one vote (Y) each from Turkers 1 to 5 • Result: Image A 3, Image B 2; split 3-2
  29. Quantifying Turker Response • 5 turkers for each comparison • Assume $D_A < D_B$ (i.e., A is less damaged) • Measure turker agreement: defined only by 4-1 and 5-0 splits • Result: Image A 3, Image B 2; split 3-2 • Split decision → No agreement!
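
The agreement rule built up across slides 24 to 29 is small enough to state as code. In this sketch the function names and the `"A"`/`"B"` vote encoding are assumptions, as is reading a Y vote as "this image is less damaged". The rule itself is from the slides: with $D_A < D_B$, only 4-1 and 5-0 splits in favor of A count as agreement.

```python
def tally(votes):
    """Count votes for image A and image B from a list like ["A", "B", ...]."""
    votes_a = sum(1 for v in votes if v == "A")
    votes_b = sum(1 for v in votes if v == "B")
    return votes_a, votes_b


def classify_split(votes_a, votes_b, turkers=5):
    """Return (split label, agrees) assuming D_A < D_B, i.e. A is less damaged.

    Agreement is defined only by 4-1 and 5-0 splits; 3-2 is a split decision
    and any majority for B contradicts the damage rating.
    """
    if votes_a + votes_b != turkers:
        raise ValueError("expected exactly one vote per turker")
    return f"{votes_a}-{votes_b}", votes_a >= 4


for votes in (["A"] * 5, ["A"] * 4 + ["B"], ["A"] * 3 + ["B"] * 2, ["B"] * 5):
    print(classify_split(*tally(votes)))
```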
  30. Turk Results • Compared damage ($D_M$) and percent missing ($M_M$) • M0: Manually damaged mementos • D: Internet Archive mementos • M: Percent missing in Internet Archive mementos • $D_M$ vs. Live: 78.9% true positives • $M_M$ vs. Live: 47.2% true positives • Worse than a 50/50 chance! • $D_M$ vs. $D_M$: 58.4% true positives
  31. Damage in the Internet Archive • 1,000 URI-Rs from Bitly • 1,000 URI-Rs from Archive-It • Remove non-HTML representations • 1,861 URI-Rs remaining • Sample 1 memento per year from the Internet Archive • Measure damage
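
The per-year sampling step on slide 31 could look like the sketch below. The slide does not say which memento is chosen within a year, so picking the earliest capture is an assumption, and `one_memento_per_year` is a hypothetical helper that works on capture datetimes already obtained from the archive.

```python
from datetime import datetime


def one_memento_per_year(memento_datetimes):
    """Map each year to one capture datetime.

    ASSUMPTION: the earliest capture in each year is used as the sample.
    """
    sample = {}
    for dt in sorted(memento_datetimes):
        sample.setdefault(dt.year, dt)  # keep the first (earliest) capture seen per year
    return sample


# Made-up capture times for a single URI-R.
captures = [datetime(2010, 5, 1), datetime(2010, 1, 3), datetime(2012, 7, 9)]
print(one_memento_per_year(captures))
```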
  32. Damage in the Internet Archive • Measured Internet Archive mementos • Damage generally improves over time • Despite missing more resources over time
  33. Conclusions • $D_M$ is a better measure of memento quality than $M_M$ • On average, the Internet Archive is improving its quality over time • The Internet Archive is also missing more embedded resources over time • Improved damage weighting (58.4% correct can be improved) • Measure cumulative temporal damage ratings • E.g., a logo that never changes for 10 years and is used by 100 mementos is more important than one used in a single memento.
