SlideShare a Scribd company logo
1 of 41
Download to read offline
When Should I Make 
Preservation Copies of Myself? 
Charles L. Cartledge and Michael L. Nelson 
Old Dominion University 
Department of Computer Science 
Norfolk, VA 23529 USA 
JCDL 2014 
London, UK 
September 9, 2014
Unsupervised Small-World (USW) 
has multiple areas of interest 
2
Preservation via benign neglect 
3 
Handwritten on the back of the photo: 
“Josie McClure picture taken Feb 30, 1907 at Poteau, I.T. 
Fifteen years of age When this was taken weighed 140 lbs.” 
(cultural context needed to make sense of the annotation!)
Will Josie last 100+ years as a web object 
(WO) in Flickr, Photobucket, et al.?
Crowd sourcing preservation 
• “Everyone is a curator …” 
– Crowd sourced activity 
– Unscheduled 
– Willing to wait a long time 
• Enlist humans in creation 
and maintenance – 
opposite of benign neglect 
5 
Frank McCown, Michael L. Nelson, and Herbert Van de Sompel, Everyone is 
a Curator: Human-Assisted Preservation for ORE Aggregations, Proceedings 
of the DigCCurr 2009 http://arxiv.org/abs/0901.4571 
See also: http://ws-dl.blogspot.com/2013/10/2013-10-23-preserve-me-if-you-can-using.html
Emergent behavior: flocking boids 
• Craig Reynolds – basis of herd 
and flock behavior in 
computer animations 
– 3 rules 
• Collision avoidance 
• Velocity matching 
• Flock centering 
– No central control, everything 
based on local knowledge only 
• Simple rules produce 
complex, emergent behavior 
6 
Collision avoidance 
Velocity matching 
Craig W. Reynolds, Computer Animation with Scripts and Actors, ACM SIGGRAPH, vol. 16, ACM, 
1982, pp. 289 - 296. 
Images http://www.red3d.com/cwr/boids/ Flock centering
USW interpretation of flocking 
7 
Craig Reynolds’ “boids” USW interpretation 
Collision 
avoidance 
Velocity 
matching 
Flock 
centering 
Each WO has a unique URI 
Matching number of copies/family members 
Move with friends to new hosts
WOs wandering in the USW graph 
• Wandering WO is 
“introduced” to an 
existing WO 
• If a connection is not 
made, then an 
attempt is made to 
another existing WO 
• Process is repeated 
until a connection is 
made 
• No global 
knowledge 
– No omnipotent 
enforcer 
– No omnipresent 
monitor 
• No repositories 
8
USW WO “friendship” links 
• WOs have 
“friendship” 
links to other 
WOs 
• Different than 
HTML 
navigational 
links (i.e., 
<link> instead 
of <a>) 
9
USW WO “families” 
A family is a 
set of copies 
of the same 
WO 
10
USW hosts 
Family 
members live 
on different 
hosts 
11 
Host #1 
Host #2 
Host #3
WO roles & responsibilities 
• Hierarchy of family WOs 
– Progenitor – initial WO 
– Copies – more recent WO copies 
– Each WO is timestamped with creation time 
• WO roles 
– Active maintainer – eldest WO charged with 
making copies and related housekeeping 
– Passive maintainer – all other WOs 
• Order of precedence 
– If progenitor is accessible then it is the active 
maintainer 
– If declared active maintainer is accessible then it is 
the active maintainer 
– Otherwise, WO declares itself active maintainer 
• If family is disconnected then multiple active 
maintainers are possible until reconnection then the 
eldestWO declares itself active maintainer 
12 
Progenitor 
Copies
Active and passive maintenance 
activities 
13 
Active 
Passive 
Active Active 
Passive Passive 
Progenitor 
Is lost 
X • Active maintainer (the WO with earliest timestamp) – currently 
charged with making copies and related housekeeping 
• Passive maintainer – all other WOs 
Progenitor 
returns 
Progenitor 
declares 
act. as copy. 
Time
Progenitor is lost 
14 
Active 
Passive 
Active Active 
Passive Passive 
Progenitor 
Is lost 
X • Active maintainer – currently charged with making copies and 
related housekeeping 
• Passive maintainer – all other WOs 
Progenitor 
returns 
Progenitor 
declares 
act. as copy. 
Time
A new active maintainer 
15 
Active 
Passive 
Active Active 
Passive Passive 
Progenitor 
Is lost 
X • Active maintainer – currently charged with making copies and 
related housekeeping 
• Passive maintainer – all other WOs 
Progenitor 
returns 
Progenitor 
declares 
act. as copy. 
Time
Progenitor returns and assumes active 
maintainer role 
16 
Active 
Passive 
Active Active 
Passive Passive 
Progenitor 
Is lost 
X • Active maintainer – currently charged with making copies and 
related housekeeping 
• Passive maintainer – all other WOs 
Progenitor 
returns 
Progenitor 
declares 
act. as copy. 
Time
Progenitor has made copies 
17 
Time 
Copy 
declares 
active 
Copies 
created 
Replacement 
created 
Excess copies Excess deleted
A copy is disconnected from the 
family 
18 
Time 
Copy 
declares 
active 
Copies 
created 
Replacement 
created 
Excess copies Excess deleted
Two active maintainers make 
copies 
19 
Time 
Copy 
declares 
active 
Copies 
created 
Replacement 
created 
Excess copies Excess deleted
Disconnected copy is 
reconnected to the progenitor 
20 
Time 
Copy 
declares 
active 
Copies 
created 
Replacement 
created 
Excess copies Excess deleted
Family has too many copies 
21 
Time 
Copy 
declares 
active 
Copies 
created 
Replacement 
created 
Excess copies Excess deleted 
• Copy management policies 
– Active: explicit removal 
– Passive: “natural attrition” 
• Equivalent of Reynolds’ 
velocity matching, making 
and monitoring copies
Parameters 
• csoft = minimum number of preservation copies desired by a 
web object 
– e.g., csoft = 3 
• chard = maximum number of preservation copies desired by 
a web object 
– e.g., chard = 5 
• hmax = maximum number of hosts 
– e.g., hmax = 1000 
• hcap = host capacity for web objects 
– e.g., hcap = 5 
• nmax = maximum number of web objects 
– e.g., nmax = 500
Three USW copying policies 
• Least aggressive – one at a time to chard 
• Moderately aggressive – as quickly as possible to 
csoft and then one at a time chard 
• Most aggressive – as quickly as possible to chard 
• Constraints: 
– WOs can only take action when woken up by interactive 
users or other WOs (i.e., mostly they lie dormant 
waiting for crowd sourced preservation) 
– Copying continues until WOs can no longer find hosts 
that are not full 
23
Reading tree ring graphs 
24 
WOs preservation status Hosts utilization status 
None 
< Csoft 
Csoft <= N < Chard 
N == Chard 
0% 
< 25% < 75% 
< 50 % > 75%
Least aggressive (t = 1) 
25
Least aggressive (t = 10) 
26
Least aggressive (t = 50) 
27
Least aggressive (t = 100) 
28 
 A full YouTube video is available at: http://youtu.be/sHJGYphqtK4
Least aggressive (final) 
29 
• Results 
– System 
stabilized 
– Host capacity 
limited 
– Some WOs 
without any 
copies 
– Some hosts 
unused 
• “Least aggressive” is 
not an effective 
policy
Which policy to choose? 
30 
• Moderately aggressive 
results in an additional 
18% of WOs meeting 
their preservation goals 
and makes more 
efficient use of limited 
host resources sooner 
• Most aggressive results 
in almost the same 
percentage of WOs 
meeting their goals, but 
with slightly more hosts 
having unused capacity
How does policy affect message 
exchange? 
Moderately Least aggressive aggressive Most aggressive 
Number of messages is constant, but amortized over different time scales
Conclusions 
• Based on simulations: 
– Be aggressive when making copies! 
– Moderately aggressive copying was approximately 
the same as aggressive copying 
• Aggressive achieves steady state faster 
• But moderately aggressive distributes WOs over hosts 
more equally 
– Moderately aggressive vs. aggressive comes down 
to “go fast” vs. “spread the load”
Video URLs 
• USW video 
– http://youtu.be/JnCMenp73YQ 
• Least Aggressive 
– http://youtu.be/sHJGYphqtK4 
• Moderately Aggressive 
– https://www.youtube.com/watch?v=pVI-VhPh7KQ 
• Most Aggressive 
– https://www.youtube.com/watch?v=eIXz8Njh-QM 
• “Death Star” message histogram 
– https://www.youtube.com/watch?v=X3EShyjFoc4 
• “Traditional” message histogram 
– https://www.youtube.com/watch?v=9CcCup3Td-Q 
33
Backup slides 
34
Some WO reference 
implementation details 
35 
 Sawood Alam, HTTP Mailbox - Asynchronous RESTful Communication, Master's thesis, Old 
Dominion University, Norfolk, VA, 2013. 
 Carl Lagoze, Herbert Van de Sompel, Pete Johnston, Michael Nelson, Robert Sanderson, and 
Simeon Warner, ORE User Guide - Resource Map Implementation in Atom, Tech. report, Open 
Archives Initiative, 2004. 
 Sawood Alam, Charles L. Cartledge, and Michael L. Nelson, Support for Various HTTP Methods 
on the Web, Tech. Report arXiv:1405.2330 (2014). 
WO memory: simulated via “edit” 
service 
Direct WO to WO communication: 
simulated via the HTTP Mailbox
Preserve Me Viz! with new 
connections 
• New friend 
connections 
• New copy 
locations 
36
Preserve Me “Basic” on a copy 
• Differences between 
active and passive 
maintainers. 
• Active maintainer is 
responsible for making 
copies. 
• Passive maintainer 
sends alerts to the 
active maintainer 
• Passive maintainer 
may assume active 
maintainer role if 
active is not available. 
37
A USW instrumented splash page 
… 
<link rel="resourcemap" type="application/atom+xml;type=entry" 
href="http://arxiv.cs.odu.edu/rems/arxiv-0704-3647v1.xml" /> 
<link rel="aggregation" href="http://arxiv.cs.odu.edu/rems/arxiv-0704- 
3647v1.xml#aggregation" /> 
<script src="http://www.cs.odu.edu/~salam/wsdl/uswdo/work/preserveme.js"></script> 
… 
38
USW algorithm popup 
39 
• Written in 
JavaScript 
• Relies on domain 
services 
– Copy -> creates 
copy of a WO 
– Edit -> update 
own REM 
• Uses 
communications 
mechanism based 
on Sawood Alam’s 
master’s thesis
USW copies: famine to feast 
40
Final states for copying policies and 
named conditions 
41

More Related Content

Viewers also liked

More Archives, More Better
More Archives, More Better More Archives, More Better
More Archives, More Better Michael Nelson
 
Storytelling for Summarizing Collections in Web Archives
Storytelling for Summarizing Collections in Web ArchivesStorytelling for Summarizing Collections in Web Archives
Storytelling for Summarizing Collections in Web ArchivesMichael Nelson
 
Web Archiving: A Brief Introduction
Web Archiving: A Brief IntroductionWeb Archiving: A Brief Introduction
Web Archiving: A Brief IntroductionSawood Alam
 
Why We Need Multiple Archives
Why We Need Multiple ArchivesWhy We Need Multiple Archives
Why We Need Multiple ArchivesMichael Nelson
 
Resurrecting My Revolutionsing Social Link Neighborhood in Bringing Context t...
Resurrecting My Revolutionsing Social Link Neighborhood in Bringing Context t...Resurrecting My Revolutionsing Social Link Neighborhood in Bringing Context t...
Resurrecting My Revolutionsing Social Link Neighborhood in Bringing Context t...Michael Nelson
 
Using Web Archives to Enrich the Live Web Experience Through Storytelling
Using Web Archives to Enrich the Live Web Experience Through StorytellingUsing Web Archives to Enrich the Live Web Experience Through Storytelling
Using Web Archives to Enrich the Live Web Experience Through StorytellingYasmin AlNoamany, PhD
 
Software as a Well-Formed Research Object
Software as a Well-Formed Research ObjectSoftware as a Well-Formed Research Object
Software as a Well-Formed Research ObjectYasmin AlNoamany, PhD
 
Old Dominion University Computer Science IIPC New Member
Old Dominion University Computer Science IIPC New Member Old Dominion University Computer Science IIPC New Member
Old Dominion University Computer Science IIPC New Member Michael Nelson
 
Evaluating the Temporal Coherence of Archived Pages
Evaluating the Temporal Coherence of Archived PagesEvaluating the Temporal Coherence of Archived Pages
Evaluating the Temporal Coherence of Archived PagesMichael Nelson
 
Profiling Web Archives
Profiling Web ArchivesProfiling Web Archives
Profiling Web ArchivesMichael Nelson
 
We Need Multiple, Independent Web Archives
We Need Multiple, Independent Web ArchivesWe Need Multiple, Independent Web Archives
We Need Multiple, Independent Web ArchivesMichael Nelson
 
Assessing the Quality of Web Archives
Assessing the Quality of Web ArchivesAssessing the Quality of Web Archives
Assessing the Quality of Web ArchivesMichael Nelson
 
Combining Storytelling and Web Archives
Combining Storytelling and Web ArchivesCombining Storytelling and Web Archives
Combining Storytelling and Web ArchivesMichael Nelson
 
Summarizing archival collections using storytelling techniques
Summarizing archival collections using storytelling techniquesSummarizing archival collections using storytelling techniques
Summarizing archival collections using storytelling techniquesMichael Nelson
 
Web Archiving Activities of ODU’s Web Science and Digital Library Research G...
Web Archiving Activities of ODU’s Web Science and Digital Library Research G...Web Archiving Activities of ODU’s Web Science and Digital Library Research G...
Web Archiving Activities of ODU’s Web Science and Digital Library Research G...Michael Nelson
 
OAI-ORE: The Open Archives Initiative Object Reuse and Exchange Project
OAI-ORE:  The Open Archives Initiative  Object Reuse and Exchange ProjectOAI-ORE:  The Open Archives Initiative  Object Reuse and Exchange Project
OAI-ORE: The Open Archives Initiative Object Reuse and Exchange ProjectMichael Nelson
 
Why Care About the Past?
Why Care About the Past?Why Care About the Past?
Why Care About the Past?Michael Nelson
 

Viewers also liked (17)

More Archives, More Better
More Archives, More Better More Archives, More Better
More Archives, More Better
 
Storytelling for Summarizing Collections in Web Archives
Storytelling for Summarizing Collections in Web ArchivesStorytelling for Summarizing Collections in Web Archives
Storytelling for Summarizing Collections in Web Archives
 
Web Archiving: A Brief Introduction
Web Archiving: A Brief IntroductionWeb Archiving: A Brief Introduction
Web Archiving: A Brief Introduction
 
Why We Need Multiple Archives
Why We Need Multiple ArchivesWhy We Need Multiple Archives
Why We Need Multiple Archives
 
Resurrecting My Revolutionsing Social Link Neighborhood in Bringing Context t...
Resurrecting My Revolutionsing Social Link Neighborhood in Bringing Context t...Resurrecting My Revolutionsing Social Link Neighborhood in Bringing Context t...
Resurrecting My Revolutionsing Social Link Neighborhood in Bringing Context t...
 
Using Web Archives to Enrich the Live Web Experience Through Storytelling
Using Web Archives to Enrich the Live Web Experience Through StorytellingUsing Web Archives to Enrich the Live Web Experience Through Storytelling
Using Web Archives to Enrich the Live Web Experience Through Storytelling
 
Software as a Well-Formed Research Object
Software as a Well-Formed Research ObjectSoftware as a Well-Formed Research Object
Software as a Well-Formed Research Object
 
Old Dominion University Computer Science IIPC New Member
Old Dominion University Computer Science IIPC New Member Old Dominion University Computer Science IIPC New Member
Old Dominion University Computer Science IIPC New Member
 
Evaluating the Temporal Coherence of Archived Pages
Evaluating the Temporal Coherence of Archived PagesEvaluating the Temporal Coherence of Archived Pages
Evaluating the Temporal Coherence of Archived Pages
 
Profiling Web Archives
Profiling Web ArchivesProfiling Web Archives
Profiling Web Archives
 
We Need Multiple, Independent Web Archives
We Need Multiple, Independent Web ArchivesWe Need Multiple, Independent Web Archives
We Need Multiple, Independent Web Archives
 
Assessing the Quality of Web Archives
Assessing the Quality of Web ArchivesAssessing the Quality of Web Archives
Assessing the Quality of Web Archives
 
Combining Storytelling and Web Archives
Combining Storytelling and Web ArchivesCombining Storytelling and Web Archives
Combining Storytelling and Web Archives
 
Summarizing archival collections using storytelling techniques
Summarizing archival collections using storytelling techniquesSummarizing archival collections using storytelling techniques
Summarizing archival collections using storytelling techniques
 
Web Archiving Activities of ODU’s Web Science and Digital Library Research G...
Web Archiving Activities of ODU’s Web Science and Digital Library Research G...Web Archiving Activities of ODU’s Web Science and Digital Library Research G...
Web Archiving Activities of ODU’s Web Science and Digital Library Research G...
 
OAI-ORE: The Open Archives Initiative Object Reuse and Exchange Project
OAI-ORE:  The Open Archives Initiative  Object Reuse and Exchange ProjectOAI-ORE:  The Open Archives Initiative  Object Reuse and Exchange Project
OAI-ORE: The Open Archives Initiative Object Reuse and Exchange Project
 
Why Care About the Past?
Why Care About the Past?Why Care About the Past?
Why Care About the Past?
 

Similar to When Should I Make Preservation Copies of Myself?

From Workflows to Transparent Research Objects and Reproducible Science Tales
From Workflows to Transparent Research Objects and Reproducible Science TalesFrom Workflows to Transparent Research Objects and Reproducible Science Tales
From Workflows to Transparent Research Objects and Reproducible Science TalesBertram Ludäscher
 
Advancing Research at London's Global University
Advancing Research at London's Global UniversityAdvancing Research at London's Global University
Advancing Research at London's Global Universityinside-BigData.com
 
PacMin @ AMPLab All-Hands
PacMin @ AMPLab All-HandsPacMin @ AMPLab All-Hands
PacMin @ AMPLab All-Handsfnothaft
 
RLG Partnership Update Webinar Slides
RLG Partnership Update Webinar SlidesRLG Partnership Update Webinar Slides
RLG Partnership Update Webinar SlidesOCLC Research
 
Reactive Streams: Handling Data-Flow the Reactive Way
Reactive Streams: Handling Data-Flow the Reactive WayReactive Streams: Handling Data-Flow the Reactive Way
Reactive Streams: Handling Data-Flow the Reactive WayRoland Kuhn
 
Transcribe NLS: Crowdsourcing at the National Library of Scotland
Transcribe NLS: Crowdsourcing at the National Library of ScotlandTranscribe NLS: Crowdsourcing at the National Library of Scotland
Transcribe NLS: Crowdsourcing at the National Library of Scotlandtarastar
 
Cosi Opac Tweaks
Cosi   Opac TweaksCosi   Opac Tweaks
Cosi Opac Tweaksdaveyp
 
As bibliotecas do mundo conectadas. A um mundo conectado!
As bibliotecas do mundo conectadas. A um mundo conectado!As bibliotecas do mundo conectadas. A um mundo conectado!
As bibliotecas do mundo conectadas. A um mundo conectado!OCLC LAC
 
Stream Reasoning: State of the Art and Beyond
Stream Reasoning: State of the Art and BeyondStream Reasoning: State of the Art and Beyond
Stream Reasoning: State of the Art and BeyondEmanuele Della Valle
 
2015 09 emc lsug
2015 09 emc lsug2015 09 emc lsug
2015 09 emc lsugChris Dwan
 
Tutorial 6 (web graph attributes)
Tutorial 6 (web graph attributes)Tutorial 6 (web graph attributes)
Tutorial 6 (web graph attributes)Kira
 
Agile Data Science: Hadoop Analytics Applications
Agile Data Science: Hadoop Analytics ApplicationsAgile Data Science: Hadoop Analytics Applications
Agile Data Science: Hadoop Analytics ApplicationsRussell Jurney
 
Cassandra Meetup Boston - How Table "Shape" Affects Performance
Cassandra Meetup Boston - How Table "Shape" Affects PerformanceCassandra Meetup Boston - How Table "Shape" Affects Performance
Cassandra Meetup Boston - How Table "Shape" Affects PerformanceDan Foody
 
Using Containers and HPC to Solve the Mysteries of the Universe by Deborah Bard
Using Containers and HPC to Solve the Mysteries of the Universe by Deborah BardUsing Containers and HPC to Solve the Mysteries of the Universe by Deborah Bard
Using Containers and HPC to Solve the Mysteries of the Universe by Deborah BardDocker, Inc.
 
Nelson, Michael: Summarizing Archival Collections Using Storytelling Techniques
Nelson, Michael: Summarizing Archival Collections Using Storytelling TechniquesNelson, Michael: Summarizing Archival Collections Using Storytelling Techniques
Nelson, Michael: Summarizing Archival Collections Using Storytelling TechniquesReynolds Journalism Institute (RJI)
 
2011 ebooks rsk charleston
2011 ebooks rsk charleston2011 ebooks rsk charleston
2011 ebooks rsk charlestonrdoit4
 

Similar to When Should I Make Preservation Copies of Myself? (20)

From Workflows to Transparent Research Objects and Reproducible Science Tales
From Workflows to Transparent Research Objects and Reproducible Science TalesFrom Workflows to Transparent Research Objects and Reproducible Science Tales
From Workflows to Transparent Research Objects and Reproducible Science Tales
 
Advancing Research at London's Global University
Advancing Research at London's Global UniversityAdvancing Research at London's Global University
Advancing Research at London's Global University
 
PacMin @ AMPLab All-Hands
PacMin @ AMPLab All-HandsPacMin @ AMPLab All-Hands
PacMin @ AMPLab All-Hands
 
RLG Partnership Update Webinar Slides
RLG Partnership Update Webinar SlidesRLG Partnership Update Webinar Slides
RLG Partnership Update Webinar Slides
 
Reactive Streams: Handling Data-Flow the Reactive Way
Reactive Streams: Handling Data-Flow the Reactive WayReactive Streams: Handling Data-Flow the Reactive Way
Reactive Streams: Handling Data-Flow the Reactive Way
 
Transcribe NLS: Crowdsourcing at the National Library of Scotland
Transcribe NLS: Crowdsourcing at the National Library of ScotlandTranscribe NLS: Crowdsourcing at the National Library of Scotland
Transcribe NLS: Crowdsourcing at the National Library of Scotland
 
The opac and the web
The opac and the webThe opac and the web
The opac and the web
 
Cosi Opac Tweaks
Cosi   Opac TweaksCosi   Opac Tweaks
Cosi Opac Tweaks
 
Final Johnson Research Libraries and Computational Research
Final Johnson Research Libraries and Computational ResearchFinal Johnson Research Libraries and Computational Research
Final Johnson Research Libraries and Computational Research
 
Radcliffe
RadcliffeRadcliffe
Radcliffe
 
As bibliotecas do mundo conectadas. A um mundo conectado!
As bibliotecas do mundo conectadas. A um mundo conectado!As bibliotecas do mundo conectadas. A um mundo conectado!
As bibliotecas do mundo conectadas. A um mundo conectado!
 
Stream Reasoning: State of the Art and Beyond
Stream Reasoning: State of the Art and BeyondStream Reasoning: State of the Art and Beyond
Stream Reasoning: State of the Art and Beyond
 
2015 09 emc lsug
2015 09 emc lsug2015 09 emc lsug
2015 09 emc lsug
 
Tutorial 6 (web graph attributes)
Tutorial 6 (web graph attributes)Tutorial 6 (web graph attributes)
Tutorial 6 (web graph attributes)
 
Agile Data Science: Hadoop Analytics Applications
Agile Data Science: Hadoop Analytics ApplicationsAgile Data Science: Hadoop Analytics Applications
Agile Data Science: Hadoop Analytics Applications
 
Cassandra Meetup Boston - How Table "Shape" Affects Performance
Cassandra Meetup Boston - How Table "Shape" Affects PerformanceCassandra Meetup Boston - How Table "Shape" Affects Performance
Cassandra Meetup Boston - How Table "Shape" Affects Performance
 
Using Containers and HPC to Solve the Mysteries of the Universe by Deborah Bard
Using Containers and HPC to Solve the Mysteries of the Universe by Deborah BardUsing Containers and HPC to Solve the Mysteries of the Universe by Deborah Bard
Using Containers and HPC to Solve the Mysteries of the Universe by Deborah Bard
 
2014 bangkok-talk
2014 bangkok-talk2014 bangkok-talk
2014 bangkok-talk
 
Nelson, Michael: Summarizing Archival Collections Using Storytelling Techniques
Nelson, Michael: Summarizing Archival Collections Using Storytelling TechniquesNelson, Michael: Summarizing Archival Collections Using Storytelling Techniques
Nelson, Michael: Summarizing Archival Collections Using Storytelling Techniques
 
2011 ebooks rsk charleston
2011 ebooks rsk charleston2011 ebooks rsk charleston
2011 ebooks rsk charleston
 

More from Michael Nelson

Web Archiving in the Year eaee1902f186819154789ee22ca30035
Web Archiving in the Year eaee1902f186819154789ee22ca30035Web Archiving in the Year eaee1902f186819154789ee22ca30035
Web Archiving in the Year eaee1902f186819154789ee22ca30035Michael Nelson
 
Uncertainty in replaying archived Twitter pages
Uncertainty in replaying archived Twitter pagesUncertainty in replaying archived Twitter pages
Uncertainty in replaying archived Twitter pagesMichael Nelson
 
Web Archives at the Nexus of Good Fakes and Flawed Originals
Web Archives at the Nexus of Good Fakes and Flawed OriginalsWeb Archives at the Nexus of Good Fakes and Flawed Originals
Web Archives at the Nexus of Good Fakes and Flawed OriginalsMichael Nelson
 
Web Archives at the Nexus of Good Fakes and Flawed Originals
Web Archives at the Nexus of Good Fakes and Flawed OriginalsWeb Archives at the Nexus of Good Fakes and Flawed Originals
Web Archives at the Nexus of Good Fakes and Flawed OriginalsMichael Nelson
 
Blockchain Can Not Be Used To Verify Replayed Archived Web Pages
Blockchain Can Not Be Used To Verify Replayed Archived Web PagesBlockchain Can Not Be Used To Verify Replayed Archived Web Pages
Blockchain Can Not Be Used To Verify Replayed Archived Web PagesMichael Nelson
 
Blockchain Can Not Be Used To Verify Replayed Archived Web Pages
Blockchain Can Not Be Used To Verify Replayed Archived Web PagesBlockchain Can Not Be Used To Verify Replayed Archived Web Pages
Blockchain Can Not Be Used To Verify Replayed Archived Web PagesMichael Nelson
 
Weaponized Web Archives: Provenance Laundering of Short Order Evidence
Weaponized Web Archives: Provenance Laundering of Short Order Evidence Weaponized Web Archives: Provenance Laundering of Short Order Evidence
Weaponized Web Archives: Provenance Laundering of Short Order Evidence Michael Nelson
 
Weaponized Web Archives: Provenance Laundering of Short Order Evidence
Weaponized Web Archives: Provenance Laundering of Short Order Evidence Weaponized Web Archives: Provenance Laundering of Short Order Evidence
Weaponized Web Archives: Provenance Laundering of Short Order Evidence Michael Nelson
 
Weaponized Web Archives: Provenance Laundering of Short Order Evidence
Weaponized Web Archives: Provenance Laundering of Short Order Evidence Weaponized Web Archives: Provenance Laundering of Short Order Evidence
Weaponized Web Archives: Provenance Laundering of Short Order Evidence Michael Nelson
 

More from Michael Nelson (9)

Web Archiving in the Year eaee1902f186819154789ee22ca30035
Web Archiving in the Year eaee1902f186819154789ee22ca30035Web Archiving in the Year eaee1902f186819154789ee22ca30035
Web Archiving in the Year eaee1902f186819154789ee22ca30035
 
Uncertainty in replaying archived Twitter pages
Uncertainty in replaying archived Twitter pagesUncertainty in replaying archived Twitter pages
Uncertainty in replaying archived Twitter pages
 
Web Archives at the Nexus of Good Fakes and Flawed Originals
Web Archives at the Nexus of Good Fakes and Flawed OriginalsWeb Archives at the Nexus of Good Fakes and Flawed Originals
Web Archives at the Nexus of Good Fakes and Flawed Originals
 
Web Archives at the Nexus of Good Fakes and Flawed Originals
Web Archives at the Nexus of Good Fakes and Flawed OriginalsWeb Archives at the Nexus of Good Fakes and Flawed Originals
Web Archives at the Nexus of Good Fakes and Flawed Originals
 
Blockchain Can Not Be Used To Verify Replayed Archived Web Pages
Blockchain Can Not Be Used To Verify Replayed Archived Web PagesBlockchain Can Not Be Used To Verify Replayed Archived Web Pages
Blockchain Can Not Be Used To Verify Replayed Archived Web Pages
 
Blockchain Can Not Be Used To Verify Replayed Archived Web Pages
Blockchain Can Not Be Used To Verify Replayed Archived Web PagesBlockchain Can Not Be Used To Verify Replayed Archived Web Pages
Blockchain Can Not Be Used To Verify Replayed Archived Web Pages
 
Weaponized Web Archives: Provenance Laundering of Short Order Evidence
Weaponized Web Archives: Provenance Laundering of Short Order Evidence Weaponized Web Archives: Provenance Laundering of Short Order Evidence
Weaponized Web Archives: Provenance Laundering of Short Order Evidence
 
Weaponized Web Archives: Provenance Laundering of Short Order Evidence
Weaponized Web Archives: Provenance Laundering of Short Order Evidence Weaponized Web Archives: Provenance Laundering of Short Order Evidence
Weaponized Web Archives: Provenance Laundering of Short Order Evidence
 
Weaponized Web Archives: Provenance Laundering of Short Order Evidence
Weaponized Web Archives: Provenance Laundering of Short Order Evidence Weaponized Web Archives: Provenance Laundering of Short Order Evidence
Weaponized Web Archives: Provenance Laundering of Short Order Evidence
 

Recently uploaded

Remote patient monitoring :Health care transformation
Remote patient monitoring :Health care transformationRemote patient monitoring :Health care transformation
Remote patient monitoring :Health care transformationfahad Alotaibiu
 
Understanding Nutrition, 16th Edition pdf
Understanding Nutrition, 16th Edition pdfUnderstanding Nutrition, 16th Edition pdf
Understanding Nutrition, 16th Edition pdfHabibouKarbo
 
Combining Asynchronous Task Parallelism and Intel SGX for Secure Deep Learning
Combining Asynchronous Task Parallelism and Intel SGX for Secure Deep LearningCombining Asynchronous Task Parallelism and Intel SGX for Secure Deep Learning
Combining Asynchronous Task Parallelism and Intel SGX for Secure Deep Learningvschiavoni
 
Production technology of Brinjal -Solanum melongena
Production technology of Brinjal -Solanum melongenaProduction technology of Brinjal -Solanum melongena
Production technology of Brinjal -Solanum melongenajana861314
 
FBI Profiling - Forensic Psychology.pptx
FBI Profiling - Forensic Psychology.pptxFBI Profiling - Forensic Psychology.pptx
FBI Profiling - Forensic Psychology.pptxPayal Shrivastava
 
Advances in AI-driven Image Recognition for Early Detection of Cancer
Advances in AI-driven Image Recognition for Early Detection of CancerAdvances in AI-driven Image Recognition for Early Detection of Cancer
Advances in AI-driven Image Recognition for Early Detection of CancerLuis Miguel Chong Chong
 
Environmental acoustics- noise criteria.pptx
Environmental acoustics- noise criteria.pptxEnvironmental acoustics- noise criteria.pptx
Environmental acoustics- noise criteria.pptxpriyankatabhane
 
CDS Fundamentals of digital communication system UNIT 1 AND 2.pdf
CDS Fundamentals of digital communication system UNIT 1 AND 2.pdfCDS Fundamentals of digital communication system UNIT 1 AND 2.pdf
CDS Fundamentals of digital communication system UNIT 1 AND 2.pdfshubhangisonawane6
 
Role of Gibberellins, mode of action and external applications.pptx
Role of Gibberellins, mode of action and external applications.pptxRole of Gibberellins, mode of action and external applications.pptx
Role of Gibberellins, mode of action and external applications.pptxjana861314
 
6.1 Pests of Groundnut_Binomics_Identification_Dr.UPR
6.1 Pests of Groundnut_Binomics_Identification_Dr.UPR6.1 Pests of Groundnut_Binomics_Identification_Dr.UPR
6.1 Pests of Groundnut_Binomics_Identification_Dr.UPRPirithiRaju
 
Interpreting SDSS extragalactic data in the era of JWST
Interpreting SDSS extragalactic data in the era of JWSTInterpreting SDSS extragalactic data in the era of JWST
Interpreting SDSS extragalactic data in the era of JWSTAlexander F. Mayer
 
ESSENTIAL FEATURES REQUIRED FOR ESTABLISHING FOUR TYPES OF BIOSAFETY LABORATO...
ESSENTIAL FEATURES REQUIRED FOR ESTABLISHING FOUR TYPES OF BIOSAFETY LABORATO...ESSENTIAL FEATURES REQUIRED FOR ESTABLISHING FOUR TYPES OF BIOSAFETY LABORATO...
ESSENTIAL FEATURES REQUIRED FOR ESTABLISHING FOUR TYPES OF BIOSAFETY LABORATO...Chayanika Das
 
Efficient Fourier Pricing of Multi-Asset Options: Quasi-Monte Carlo & Domain ...
Efficient Fourier Pricing of Multi-Asset Options: Quasi-Monte Carlo & Domain ...Efficient Fourier Pricing of Multi-Asset Options: Quasi-Monte Carlo & Domain ...
Efficient Fourier Pricing of Multi-Asset Options: Quasi-Monte Carlo & Domain ...Chiheb Ben Hammouda
 
Observation of Gravitational Waves from the Coalescence of a 2.5–4.5 M⊙ Compa...
Observation of Gravitational Waves from the Coalescence of a 2.5–4.5 M⊙ Compa...Observation of Gravitational Waves from the Coalescence of a 2.5–4.5 M⊙ Compa...
Observation of Gravitational Waves from the Coalescence of a 2.5–4.5 M⊙ Compa...Sérgio Sacani
 
Think Science: What Are Eclipses, by Craig Bobchin
Think Science: What Are Eclipses, by Craig BobchinThink Science: What Are Eclipses, by Craig Bobchin
Think Science: What Are Eclipses, by Craig BobchinNathan Cone
 
Environmental Acoustics- Speech interference level, acoustics calibrator.pptx
Environmental Acoustics- Speech interference level, acoustics calibrator.pptxEnvironmental Acoustics- Speech interference level, acoustics calibrator.pptx
Environmental Acoustics- Speech interference level, acoustics calibrator.pptxpriyankatabhane
 
𝗧𝗖𝗢 (𝙩𝙧𝙖𝙣𝙨-𝗰𝘆𝗰𝗹𝗼𝗼𝗰𝘁𝗲𝗻𝗲) 𝗗𝗲𝗿𝗶𝘃𝗮𝘁𝗶𝘃𝗲𝘀: 𝗧𝗵𝗲 𝗙𝗮𝘀𝘁𝗲𝘀𝘁 𝗖𝗹𝗶𝗰𝗸 𝗥𝗲𝗮𝗰𝘁𝗶𝗼𝗻 𝗥𝗲𝗮𝗴𝗲𝗻𝘁𝘀
𝗧𝗖𝗢 (𝙩𝙧𝙖𝙣𝙨-𝗰𝘆𝗰𝗹𝗼𝗼𝗰𝘁𝗲𝗻𝗲) 𝗗𝗲𝗿𝗶𝘃𝗮𝘁𝗶𝘃𝗲𝘀: 𝗧𝗵𝗲 𝗙𝗮𝘀𝘁𝗲𝘀𝘁 𝗖𝗹𝗶𝗰𝗸 𝗥𝗲𝗮𝗰𝘁𝗶𝗼𝗻 𝗥𝗲𝗮𝗴𝗲𝗻𝘁𝘀𝗧𝗖𝗢 (𝙩𝙧𝙖𝙣𝙨-𝗰𝘆𝗰𝗹𝗼𝗼𝗰𝘁𝗲𝗻𝗲) 𝗗𝗲𝗿𝗶𝘃𝗮𝘁𝗶𝘃𝗲𝘀: 𝗧𝗵𝗲 𝗙𝗮𝘀𝘁𝗲𝘀𝘁 𝗖𝗹𝗶𝗰𝗸 𝗥𝗲𝗮𝗰𝘁𝗶𝗼𝗻 𝗥𝗲𝗮𝗴𝗲𝗻𝘁𝘀
𝗧𝗖𝗢 (𝙩𝙧𝙖𝙣𝙨-𝗰𝘆𝗰𝗹𝗼𝗼𝗰𝘁𝗲𝗻𝗲) 𝗗𝗲𝗿𝗶𝘃𝗮𝘁𝗶𝘃𝗲𝘀: 𝗧𝗵𝗲 𝗙𝗮𝘀𝘁𝗲𝘀𝘁 𝗖𝗹𝗶𝗰𝗸 𝗥𝗲𝗮𝗰𝘁𝗶𝗼𝗻 𝗥𝗲𝗮𝗴𝗲𝗻𝘁𝘀Tokyo Chemicals Industry (TCI)
 
Anatomy of Duodenum- detailed ppt presentation for science students
Anatomy of Duodenum- detailed ppt presentation for science studentsAnatomy of Duodenum- detailed ppt presentation for science students
Anatomy of Duodenum- detailed ppt presentation for science studentsgargipatil3210
 

Recently uploaded (20)

Remote patient monitoring :Health care transformation
Remote patient monitoring :Health care transformationRemote patient monitoring :Health care transformation
Remote patient monitoring :Health care transformation
 
Understanding Nutrition, 16th Edition pdf
Understanding Nutrition, 16th Edition pdfUnderstanding Nutrition, 16th Edition pdf
Understanding Nutrition, 16th Edition pdf
 
Combining Asynchronous Task Parallelism and Intel SGX for Secure Deep Learning
Combining Asynchronous Task Parallelism and Intel SGX for Secure Deep LearningCombining Asynchronous Task Parallelism and Intel SGX for Secure Deep Learning
Combining Asynchronous Task Parallelism and Intel SGX for Secure Deep Learning
 
Battery Reasearch Reagents from TCI Chemicals
Battery Reasearch Reagents from TCI ChemicalsBattery Reasearch Reagents from TCI Chemicals
Battery Reasearch Reagents from TCI Chemicals
 
Bioenergetics and the role of ATP to drive the beats of life.
Bioenergetics and the role of ATP to drive the beats of life.Bioenergetics and the role of ATP to drive the beats of life.
Bioenergetics and the role of ATP to drive the beats of life.
 
Production technology of Brinjal -Solanum melongena
Production technology of Brinjal -Solanum melongenaProduction technology of Brinjal -Solanum melongena
Production technology of Brinjal -Solanum melongena
 
FBI Profiling - Forensic Psychology.pptx
FBI Profiling - Forensic Psychology.pptxFBI Profiling - Forensic Psychology.pptx
FBI Profiling - Forensic Psychology.pptx
 
Advances in AI-driven Image Recognition for Early Detection of Cancer
Advances in AI-driven Image Recognition for Early Detection of CancerAdvances in AI-driven Image Recognition for Early Detection of Cancer
Advances in AI-driven Image Recognition for Early Detection of Cancer
 
Environmental acoustics- noise criteria.pptx
Environmental acoustics- noise criteria.pptxEnvironmental acoustics- noise criteria.pptx
Environmental acoustics- noise criteria.pptx
 
CDS Fundamentals of digital communication system UNIT 1 AND 2.pdf
CDS Fundamentals of digital communication system UNIT 1 AND 2.pdfCDS Fundamentals of digital communication system UNIT 1 AND 2.pdf
CDS Fundamentals of digital communication system UNIT 1 AND 2.pdf
 
Role of Gibberellins, mode of action and external applications.pptx
Role of Gibberellins, mode of action and external applications.pptxRole of Gibberellins, mode of action and external applications.pptx
Role of Gibberellins, mode of action and external applications.pptx
 
6.1 Pests of Groundnut_Binomics_Identification_Dr.UPR
6.1 Pests of Groundnut_Binomics_Identification_Dr.UPR6.1 Pests of Groundnut_Binomics_Identification_Dr.UPR
6.1 Pests of Groundnut_Binomics_Identification_Dr.UPR
 
Interpreting SDSS extragalactic data in the era of JWST
Interpreting SDSS extragalactic data in the era of JWSTInterpreting SDSS extragalactic data in the era of JWST
Interpreting SDSS extragalactic data in the era of JWST
 
ESSENTIAL FEATURES REQUIRED FOR ESTABLISHING FOUR TYPES OF BIOSAFETY LABORATO...
ESSENTIAL FEATURES REQUIRED FOR ESTABLISHING FOUR TYPES OF BIOSAFETY LABORATO...ESSENTIAL FEATURES REQUIRED FOR ESTABLISHING FOUR TYPES OF BIOSAFETY LABORATO...
ESSENTIAL FEATURES REQUIRED FOR ESTABLISHING FOUR TYPES OF BIOSAFETY LABORATO...
 
Efficient Fourier Pricing of Multi-Asset Options: Quasi-Monte Carlo & Domain ...
Efficient Fourier Pricing of Multi-Asset Options: Quasi-Monte Carlo & Domain ...Efficient Fourier Pricing of Multi-Asset Options: Quasi-Monte Carlo & Domain ...
Efficient Fourier Pricing of Multi-Asset Options: Quasi-Monte Carlo & Domain ...
 
Observation of Gravitational Waves from the Coalescence of a 2.5–4.5 M⊙ Compa...
Observation of Gravitational Waves from the Coalescence of a 2.5–4.5 M⊙ Compa...Observation of Gravitational Waves from the Coalescence of a 2.5–4.5 M⊙ Compa...
Observation of Gravitational Waves from the Coalescence of a 2.5–4.5 M⊙ Compa...
 
Think Science: What Are Eclipses, by Craig Bobchin
Think Science: What Are Eclipses, by Craig BobchinThink Science: What Are Eclipses, by Craig Bobchin
Think Science: What Are Eclipses, by Craig Bobchin
 
Environmental Acoustics- Speech interference level, acoustics calibrator.pptx
Environmental Acoustics- Speech interference level, acoustics calibrator.pptxEnvironmental Acoustics- Speech interference level, acoustics calibrator.pptx
Environmental Acoustics- Speech interference level, acoustics calibrator.pptx
 
𝗧𝗖𝗢 (𝙩𝙧𝙖𝙣𝙨-𝗰𝘆𝗰𝗹𝗼𝗼𝗰𝘁𝗲𝗻𝗲) 𝗗𝗲𝗿𝗶𝘃𝗮𝘁𝗶𝘃𝗲𝘀: 𝗧𝗵𝗲 𝗙𝗮𝘀𝘁𝗲𝘀𝘁 𝗖𝗹𝗶𝗰𝗸 𝗥𝗲𝗮𝗰𝘁𝗶𝗼𝗻 𝗥𝗲𝗮𝗴𝗲𝗻𝘁𝘀
𝗧𝗖𝗢 (𝙩𝙧𝙖𝙣𝙨-𝗰𝘆𝗰𝗹𝗼𝗼𝗰𝘁𝗲𝗻𝗲) 𝗗𝗲𝗿𝗶𝘃𝗮𝘁𝗶𝘃𝗲𝘀: 𝗧𝗵𝗲 𝗙𝗮𝘀𝘁𝗲𝘀𝘁 𝗖𝗹𝗶𝗰𝗸 𝗥𝗲𝗮𝗰𝘁𝗶𝗼𝗻 𝗥𝗲𝗮𝗴𝗲𝗻𝘁𝘀𝗧𝗖𝗢 (𝙩𝙧𝙖𝙣𝙨-𝗰𝘆𝗰𝗹𝗼𝗼𝗰𝘁𝗲𝗻𝗲) 𝗗𝗲𝗿𝗶𝘃𝗮𝘁𝗶𝘃𝗲𝘀: 𝗧𝗵𝗲 𝗙𝗮𝘀𝘁𝗲𝘀𝘁 𝗖𝗹𝗶𝗰𝗸 𝗥𝗲𝗮𝗰𝘁𝗶𝗼𝗻 𝗥𝗲𝗮𝗴𝗲𝗻𝘁𝘀
𝗧𝗖𝗢 (𝙩𝙧𝙖𝙣𝙨-𝗰𝘆𝗰𝗹𝗼𝗼𝗰𝘁𝗲𝗻𝗲) 𝗗𝗲𝗿𝗶𝘃𝗮𝘁𝗶𝘃𝗲𝘀: 𝗧𝗵𝗲 𝗙𝗮𝘀𝘁𝗲𝘀𝘁 𝗖𝗹𝗶𝗰𝗸 𝗥𝗲𝗮𝗰𝘁𝗶𝗼𝗻 𝗥𝗲𝗮𝗴𝗲𝗻𝘁𝘀
 
Anatomy of Duodenum- detailed ppt presentation for science students
Anatomy of Duodenum- detailed ppt presentation for science studentsAnatomy of Duodenum- detailed ppt presentation for science students
Anatomy of Duodenum- detailed ppt presentation for science students
 

When Should I Make Preservation Copies of Myself?

  • 1. When Should I Make Preservation Copies of Myself? Charles L. Cartledge and Michael L. Nelson Old Dominion University Department of Computer Science Norfolk, VA 23529 USA JCDL 2014 London, UK September 9, 2014
  • 2. Unsupervised Small-World (USW) has multiple areas of interest 2
  • 3. Preservation via benign neglect 3 Handwritten on the back of the photo: “Josie McClure picture taken Feb 30, 1907 at Poteau, I.T. Fifteen years of age When this was taken weighed 140 lbs.” (cultural context needed to make sense of the annotation!)
  • 4. Will Josie last 100+ years as a web object (WO) in Flickr, Photobucket, et al.?
  • 5. Crowd sourcing preservation • “Everyone is a curator …” – Crowd sourced activity – Unscheduled – Willing to wait a long time • Enlist humans in creation and maintenance – opposite of benign neglect 5 Frank McCown, Michael L. Nelson, and Herbert Van de Sompel, Everyone is a Curator: Human-Assisted Preservation for ORE Aggregations, Proceedings of the DigCCurr 2009 http://arxiv.org/abs/0901.4571 See also: http://ws-dl.blogspot.com/2013/10/2013-10-23-preserve-me-if-you-can-using.html
  • 6. Emergent behavior: flocking boids • Craig Reynolds – basis of herd and flock behavior in computer animations – 3 rules • Collision avoidance • Velocity matching • Flock centering – No central control, everything based on local knowledge only • Simple rules produce complex, emergent behavior 6 Collision avoidance Velocity matching Craig W. Reynolds, Computer Animation with Scripts and Actors, ACM SIGGRAPH, vol. 16, ACM, 1982, pp. 289 - 296. Images http://www.red3d.com/cwr/boids/ Flock centering
  • 7. USW interpretation of flocking 7 Craig Reynolds’ “boids” USW interpretation Collision avoidance Velocity matching Flock centering Each WO has a unique URI Matching number of copies/family members Move with friends to new hosts
  • 8. WOs wandering in the USW graph • Wandering WO is “introduced” to an existing WO • If a connection is not made, then an attempt is made to another existing WO • Process is repeated until a connection is made • No global knowledge – No omnipotent enforcer – No omnipresent monitor • No repositories 8
  • 9. USW WO “friendship” links • WOs have “friendship” links to other WOs • Different than HTML navigational links (i.e., <link> instead of <a>) 9
  • 10. USW WO “families” A family is a set of copies of the same WO 10
  • 11. USW hosts Family members live on different hosts 11 Host #1 Host #2 Host #3
  • 12. WO roles & responsibilities • Hierarchy of family WOs – Progenitor – initial WO – Copies – more recent WO copies – Each WO is timestamped with creation time • WO roles – Active maintainer – eldest WO charged with making copies and related housekeeping – Passive maintainer – all other WOs • Order of precedence – If progenitor is accessible then it is the active maintainer – If declared active maintainer is accessible then it is the active maintainer – Otherwise, WO declares itself active maintainer • If family is disconnected then multiple active maintainers are possible until reconnection then the eldestWO declares itself active maintainer 12 Progenitor Copies
  • 13. Active and passive maintenance activities 13 Active Passive Active Active Passive Passive Progenitor Is lost X • Active maintainer (the WO with earliest timestamp) – currently charged with making copies and related housekeeping • Passive maintainer – all other WOs Progenitor returns Progenitor declares act. as copy. Time
  • 14. Progenitor is lost 14 Active Passive Active Active Passive Passive Progenitor Is lost X • Active maintainer – currently charged with making copies and related housekeeping • Passive maintainer – all other WOs Progenitor returns Progenitor declares act. as copy. Time
  • 15. A new active maintainer 15 Active Passive Active Active Passive Passive Progenitor Is lost X • Active maintainer – currently charged with making copies and related housekeeping • Passive maintainer – all other WOs Progenitor returns Progenitor declares act. as copy. Time
  • 16. Progenitor returns and assumes active maintainer role 16 Active Passive Active Active Passive Passive Progenitor Is lost X • Active maintainer – currently charged with making copies and related housekeeping • Passive maintainer – all other WOs Progenitor returns Progenitor declares act. as copy. Time
  • 17. Progenitor has made copies 17 Time Copy declares active Copies created Replacement created Excess copies Excess deleted
  • 18. A copy is disconnected from the family 18 Time Copy declares active Copies created Replacement created Excess copies Excess deleted
  • 19. Two active maintainers make copies 19 Time Copy declares active Copies created Replacement created Excess copies Excess deleted
  • 20. Disconnected copy is reconnected to the progenitor 20 Time Copy declares active Copies created Replacement created Excess copies Excess deleted
  • 21. Family has too many copies 21 Time Copy declares active Copies created Replacement created Excess copies Excess deleted • Copy management policies – Active: explicit removal – Passive: “natural attrition” • Equivalent of Reynolds’ velocity matching, making and monitoring copies
  • 22. Parameters • csoft = minimum number of preservation copies desired by a web object – e.g., csoft = 3 • chard = maximum number of preservation copies desired by a web object – e.g., chard = 5 • hmax = maximum number of hosts – e.g., hmax = 1000 • hcap = host capacity for web objects – e.g., hcap = 5 • nmax = maximum number of web objects – e.g., nmax = 500
  • 23. Three USW copying policies • Least aggressive – one at a time to chard • Moderately aggressive – as quickly as possible to csoft and then one at a time chard • Most aggressive – as quickly as possible to chard • Constraints: – WOs can only take action when woken up by interactive users or other WOs (i.e., mostly they lie dormant waiting for crowd sourced preservation) – Copying continues until WOs can no longer find hosts that are not full 23
  • 24. Reading tree ring graphs 24 WOs preservation status Hosts utilization status None < Csoft Csoft <= N < Chard N == Chard 0% < 25% < 75% < 50 % > 75%
  • 28. Least aggressive (t = 100) 28  A full YouTube video is available at: http://youtu.be/sHJGYphqtK4
  • 29. Least aggressive (final) 29 • Results – System stabilized – Host capacity limited – Some WOs without any copies – Some hosts unused • “Least aggressive” is not an effective policy
  • 30. Which policy to choose? 30 • Moderately aggressive results in an additional 18% of WOs meeting their preservation goals and makes more efficient use of limited host resources sooner • Most aggressive results in almost the same percentage of WOs meeting their goals, but with slightly more hosts having unused capacity
  • 31. How does policy affect message exchange? Moderately Least aggressive aggressive Most aggressive Number of messages is constant, but amortized over different time scales
  • 32. Conclusions • Based on simulations: – Be aggressive when making copies! – Moderately aggressive copying was approximately the same as aggressive copying • Aggressive achieves steady state faster • But moderately aggressive distributes WOs over hosts more equally – Moderately aggressive vs. aggressive comes down to “go fast” vs. “spread the load”
  • 33. Video URLs • USW video – http://youtu.be/JnCMenp73YQ • Least Aggressive – http://youtu.be/sHJGYphqtK4 • Moderately Aggressive – https://www.youtube.com/watch?v=pVI-VhPh7KQ • Most Aggressive – https://www.youtube.com/watch?v=eIXz8Njh-QM • “Death Star” message histogram – https://www.youtube.com/watch?v=X3EShyjFoc4 • “Traditional” message histogram – https://www.youtube.com/watch?v=9CcCup3Td-Q 33
  • 35. Some WO reference implementation details 35  Sawood Alam, HTTP Mailbox - Asynchronous RESTful Communication, Master's thesis, Old Dominion University, Norfolk, VA, 2013.  Carl Lagoze, Herbert Van de Sompel, Pete Johnston, Michael Nelson, Robert Sanderson, and Simeon Warner, ORE User Guide - Resource Map Implementation in Atom, Tech. report, Open Archives Initiative, 2004.  Sawood Alam, Charles L. Cartledge, and Michael L. Nelson, Support for Various HTTP Methods on the Web, Tech. Report arXiv:1405.2330 (2014). WO memory: simulated via “edit” service Direct WO to WO communication: simulated via the HTTP Mailbox
  • 36. Preserve Me Viz! with new connections • New friend connections • New copy locations 36
  • 37. Preserve Me “Basic” on a copy • Differences between active and passive maintainers. • Active maintainer is responsible for making copies. • Passive maintainer sends alerts to the active maintainer • Passive maintainer may assume active maintainer role if active is not available. 37
  • 38. A USW instrumented splash page … <link rel="resourcemap" type="application/atom+xml;type=entry" href="http://arxiv.cs.odu.edu/rems/arxiv-0704-3647v1.xml" /> <link rel="aggregation" href="http://arxiv.cs.odu.edu/rems/arxiv-0704- 3647v1.xml#aggregation" /> <script src="http://www.cs.odu.edu/~salam/wsdl/uswdo/work/preserveme.js"></script> … 38
  • 39. USW algorithm popup 39 • Written in JavaScript • Relies on domain services – Copy -> creates copy of a WO – Edit -> update own REM • Uses communications mechanism based on Sawood Alam’s master’s thesis
  • 40. USW copies: famine to feast 40
  • 41. Final states for copying policies and named conditions 41