Hiberlink: Prototypes of pro-active approaches to support the archiving of web references for scholarly communications
1. Prototypes of pro-active approaches to support the
archiving of web references for scholarly
communications
Richard Wincewicz (1), Peter Burnhill (1) & Herbert Van de Sompel (2)
(1) EDINA, University of Edinburgh; (2) Los Alamos National Laboratory
2. The Project Team
2013 – 2015, funded by the Andrew W. Mellon Foundation
• Los Alamos National Laboratory:
Research Library: Herbert Van de Sompel, Harihar Shankar, [Martin Klein, Rob Sanderson]
• University of Edinburgh:
Language Technology Group: Claire Grover, Beatrice Alex, Colin Matheson, Richard Tobin, [Ke “Adam” Zhou]
EDINA*: Peter Burnhill, Muriel Mewissen (Project Manager), Tim Stickland, Richard Wincewicz, [Neil Mayo]
* Centre for Service Delivery & Digital Expertise
5. Reference Rot
Links to Web at Large resources are subject to
Reference Rot. This is a combination of two factors:
• Link Rot: Link stops working
• e.g. HTTP 404 “Not Found”
• Content Drift: Linked content changes over time
• Possibly to the extent that it is no longer
representative of the content that was initially
referenced
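The two factors can be told apart mechanically, at least roughly. The sketch below is an illustration and not part of the Hiberlink tooling: it assumes you recorded a content digest when the reference was made, and it treats any digest change as drift. That is a crude proxy, since the slide's definition concerns whether the content is still representative, not whether it is byte-identical. The names `LinkState` and `classify_reference` are hypothetical.

```python
from enum import Enum
from typing import Optional


class LinkState(Enum):
    OK = "ok"
    LINK_ROT = "link rot"
    CONTENT_DRIFT = "content drift"


def classify_reference(
    status: Optional[int],
    original_digest: str,
    current_digest: Optional[str],
) -> LinkState:
    """Classify a reference from a fresh fetch of its URI.

    status is the HTTP status code, or None if the request failed
    outright (DNS error, timeout). Digests are any stable content
    hash; comparing them is an assumption made for this sketch.
    """
    # Link rot: the link no longer works (e.g. HTTP 404 "Not Found").
    if status is None or status >= 400:
        return LinkState.LINK_ROT
    # Content drift: the link works but the content has changed.
    if current_digest != original_digest:
        return LinkState.CONTENT_DRIFT
    return LinkState.OK
```

A page with dynamic content would be flagged as drifted on every check under this rule, so a real implementation would need a softer similarity measure.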
7. Articles that Link to Articles & to Web At Large Resources
(PMC)
Martin Klein et al. (2014) Scholarly context not found
http://dx.doi.org/10.1371/journal.pone.0115253
8. Articles that Link to Articles & to Web At Large Resources
(Elsevier)
Martin Klein et al. (2014) Scholarly context not found
http://dx.doi.org/10.1371/journal.pone.0115253
9. Articles with URI References (PMC)
Articles 479,194
with URI references 399,005
with URI references to articles 240,857
with URI references to Web at Large 156,160
Martin Klein et al. (2014) Scholarly context not found
http://dx.doi.org/10.1371/journal.pone.0115253
10. Link Rot (PMC)
Martin Klein et al. (2014) Scholarly context not found
http://dx.doi.org/10.1371/journal.pone.0115253
11. Link Rot (Elsevier)
Martin Klein et al. (2014) Scholarly context not found
http://dx.doi.org/10.1371/journal.pone.0115253
12. Links from arXiv, Elsevier, PMC to TLD Targets
Martin Klein et al. (2014) Scholarly context not found. In: PLOS ONE
http://dx.doi.org/10.1371/journal.pone.0115253
13. Grey is Link Rot – Referenced Content Not Accessible
Martin Klein et al. (2014) Scholarly context not found. In: PLOS ONE
http://dx.doi.org/10.1371/journal.pone.0115253
14. Grey is Not Archived - Referenced Content Lost
Martin Klein et al. (2014) Scholarly context not found. In: PLOS ONE
http://dx.doi.org/10.1371/journal.pone.0115253
15. Content Drift – http://dl00.org
Snapshots shown: 2000, 2004, 2005, 2008
(a) Dynamic content: values on the webpage change over time
(b) Static content, but very different (often unrelated) web pages
17. Create Snapshots of Referenced Resources
Various web archives support on-demand creation of
snapshots of URIs (manual, API):
archive.today
Internet Archive
perma.cc
webcitation.org
When creating snapshots, maintain:
Original URI
Snapshot URI
Date/Time of snapshot
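The three items to maintain map naturally onto a small record type. The following is a minimal sketch, not the project's data model: `SnapshotRecord` and `save_request_url` are hypothetical names, the URL prefix is the Internet Archive's public "Save Page Now" form, and the example values are the ones used in the link-decoration slide later in this deck. The slides give only a date, so midnight UTC is an assumption.

```python
from dataclasses import dataclass
from datetime import datetime, timezone

# Internet Archive "Save Page Now" prefix; requesting this URL asks the
# archive to capture the page. No network call is made in this sketch.
SAVE_PAGE_NOW = "https://web.archive.org/save/"


@dataclass(frozen=True)
class SnapshotRecord:
    original_uri: str      # the URI as referenced by the author
    snapshot_uri: str      # where the archived copy lives
    captured_at: datetime  # when the snapshot was taken (UTC)


def save_request_url(original_uri: str) -> str:
    """Build the URL that would request an on-demand snapshot."""
    return SAVE_PAGE_NOW + original_uri


record = SnapshotRecord(
    original_uri="http://www.stanford.edu",
    snapshot_uri="http://archive.is/FAy6o",
    # Time of day is assumed; the slides state only the date.
    captured_at=datetime(2014, 8, 15, tzinfo=timezone.utc),
)
```

Keeping all three fields together matters because each one alone has a failure mode: the original URI can rot, the snapshot's archive can go offline, and the date is only useful alongside the original URI.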
18. Create Snapshots of Referenced Resources
Snapshots can be created at various stages. The closer to
the moment of referencing, the more faithfully the snapshot
reflects the content the author actually referenced.
Stage              Actor                            Snapshot Quality
Preparation        Author/reference tool            best
Submission/Issue   Editor/manuscript system         good
Publication        Aggregator/publisher platform    ok
Post-publication   Librarian/IR, journal archive    better than nothing
19. Authoring - Zotero Plugin Demonstrator
Richard Wincewicz (2014) Prototype Hiberlink plugin for Zotero for pro-active
archiving and temporal references
https://www.youtube.com/v/ZYmi_Ydr65M%26vq
24. Publication - HiberActive Service Demonstrator
Martin Klein et al. (2014) HiberActive: Pro-Active Archiving of web references from scholarly
articles
Open Repositories 2014 http://www.slideshare.net/martinklein0815/hiberactive
25. Reference Resources Robustly
When referencing resources include:
Original URI – Allows the user to revisit the URI as it
is at the time of reading, if the URI is still operational
Snapshot URI – Allows the user to visit the snapshot,
if one was created, and if the web archive in which it
was created is still operational
Date/Time – Together with the original URI, allows the user
to visit a snapshot created around that Date/Time in
any web archive around the world (using the Memento
infrastructure)
(2015) Robust Links - Motivation
http://robustlinks.mementoweb.org/about/
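To make the Date/Time case concrete: under the Memento protocol (RFC 7089), a client sends the original URI to a TimeGate together with an `Accept-Datetime` header, and the TimeGate redirects to the snapshot closest to that moment. The sketch below only builds the request and does not send it. It assumes the Time Travel aggregator's TimeGate address shown in the constant, which may change, and `timegate_request` is a hypothetical helper name.

```python
from datetime import datetime, timezone
from email.utils import format_datetime
from typing import Dict, Tuple

# Assumed endpoint: the Memento Time Travel aggregator's TimeGate.
TIMEGATE = "http://timetravel.mementoweb.org/timegate/"


def timegate_request(original_uri: str, when: datetime) -> Tuple[str, Dict[str, str]]:
    """Return (url, headers) for a Memento datetime negotiation request.

    `when` must be timezone-aware; it is converted to UTC because
    Accept-Datetime values are expressed in GMT.
    """
    when_utc = when.astimezone(timezone.utc)
    headers = {"Accept-Datetime": format_datetime(when_utc, usegmt=True)}
    return TIMEGATE + original_uri, headers


url, headers = timegate_request(
    "http://www.stanford.edu",
    datetime(2014, 8, 15, tzinfo=timezone.utc),
)
# headers["Accept-Datetime"] is "Fri, 15 Aug 2014 00:00:00 GMT"
```

Passing `url` and `headers` to any HTTP client and following the redirect should land on a snapshot near the requested date, provided some archive holds one.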
26. Reference Resources Actionably
When referencing resources, use Link Decorations to convey
Original URI, Snapshot URI, Date/Time
<a href="http://www.stanford.edu"
data-versionurl="http://archive.is/FAy6o"
data-versiondate="2014-08-15">
<a href="http://www.stanford.edu"
data-versiondate="2014-08-15">
<a href="http://archive.is/FAy6o"
data-originalurl="http://www.stanford.edu"
data-versiondate="2014-08-15">
Herbert Van de Sompel et al. (2015) Robust Links - Link Decorations
http://robustlinks.mementoweb.org/spec/
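The three decoration patterns on this slide can be generated from the same three pieces of information. The sketch below is one possible helper, not code from the Robust Links project: `robust_link` and its parameters are hypothetical, and the link text "Stanford" is made up for illustration. Only the attribute names (`data-versionurl`, `data-originalurl`, `data-versiondate`) come from the slide.

```python
from html import escape
from typing import Optional


def robust_link(
    text: str,
    original_uri: str,
    version_date: str,
    snapshot_uri: Optional[str] = None,
    href_is_snapshot: bool = False,
) -> str:
    """Render an <a> element carrying Robust Links decorations.

    - snapshot_uri given, href_is_snapshot False: href is the original
      URI, data-versionurl is the snapshot.
    - snapshot_uri omitted: href is the original URI, date only.
    - href_is_snapshot True: href is the snapshot, data-originalurl is
      the original URI.
    """
    if href_is_snapshot:
        if snapshot_uri is None:
            raise ValueError("a snapshot URI is required when it is used as href")
        attrs = [("href", snapshot_uri), ("data-originalurl", original_uri)]
    else:
        attrs = [("href", original_uri)]
        if snapshot_uri is not None:
            attrs.append(("data-versionurl", snapshot_uri))
    attrs.append(("data-versiondate", version_date))
    rendered = " ".join(
        f'{name}="{escape(value, quote=True)}"' for name, value in attrs
    )
    return f"<a {rendered}>{escape(text)}</a>"


link = robust_link(
    "Stanford",  # hypothetical link text
    "http://www.stanford.edu",
    "2014-08-15",
    snapshot_uri="http://archive.is/FAy6o",
)
```

Escaping attribute values keeps the output well-formed when a referenced URI contains `&` or quotation marks.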
27. Robust Links Using Link Decorations, JavaScript,
Memento API
Demo - http://robustlinks.mementoweb.org/demo/uri_references_js.html
robustlinks.js - https://github.com/mementoweb/robustlinks
28. Activate Robust Links
Currently, articles carry no Link Decorations, but they do
have a publication date:
Express the article publication date in an actionable
manner (‘datePublished’ or ‘dateModified’
Schema.org properties) in HTML pages that contain
URI references
Tailor robustlinks.js to exclude links to articles
Inject robustlinks.js in HTML pages that contain URI
references
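For the publication date to be actionable, a script has to be able to find it in the page. The sketch below shows one way that lookup might work, using Python's standard `html.parser`. It is not how robustlinks.js does it: it assumes the date is exposed as microdata through an `itemprop` attribute with the value in `content` or `datetime`, and it simply takes the first `datePublished` or `dateModified` it encounters. The class and function names are hypothetical, and the date in the usage line is made up.

```python
from datetime import date
from html.parser import HTMLParser
from typing import Optional


class PublicationDateParser(HTMLParser):
    """Collect the first Schema.org datePublished/dateModified value."""

    def __init__(self) -> None:
        super().__init__()
        self.published: Optional[date] = None

    def handle_starttag(self, tag, attrs):
        attr = dict(attrs)
        if self.published is not None:
            return
        if attr.get("itemprop") in ("datePublished", "dateModified"):
            # <meta content="..."> and <time datetime="..."> are the
            # two forms assumed here.
            value = attr.get("content") or attr.get("datetime")
            if value:
                # Keep only the YYYY-MM-DD part of an ISO 8601 value.
                self.published = date.fromisoformat(value[:10])


def publication_date(html: str) -> Optional[date]:
    parser = PublicationDateParser()
    parser.feed(html)
    return parser.published


found = publication_date('<meta itemprop="datePublished" content="2015-03-01">')
```

The resulting date can then stand in for the missing snapshot Date/Time when looking up archived copies of each referenced URI.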
29. Users Follow Robust Links into Web
Archives
The combination of the referenced URI and the article
publication date:
Leads users to a snapshot in a web archive, created
as close as possible to the article publication date
Addresses link rot
Addresses content drift
30. Create Archive Copies
When ingesting new content into the platform:
Parse for URI references
Create snapshots in web archives of select URIs
For these URIs, use Link Decorations in HTML to
convey:
• original URI
• snapshot URI
• snapshot Date/Time
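The ingest steps above can be sketched as a single pass over the HTML. This is an illustration under several assumptions, not the HiberActive implementation: links to articles are recognised by a hypothetical host list (`ARTICLE_HOSTS`), the regular expression only matches anchors whose sole attribute is `href`, and the actual archiving is delegated to a caller-supplied function that returns a snapshot URI and date, or `None` if no snapshot was made.

```python
import re
from html import escape
from typing import Callable, Optional, Tuple
from urllib.parse import urlparse

# Assumption: hosts treated as "links to articles" and therefore skipped.
ARTICLE_HOSTS = {"doi.org", "dx.doi.org"}

# Simplification: matches only <a href="..."> with no other attributes.
ANCHOR = re.compile(r'<a\s+href="([^"]+)"\s*>')

# A snapshotter takes a URI and returns (snapshot_uri, version_date) or None.
Snapshotter = Callable[[str], Optional[Tuple[str, str]]]


def is_web_at_large(uri: str) -> bool:
    host = (urlparse(uri).hostname or "").lower()
    return host not in ARTICLE_HOSTS


def decorate_references(html: str, snapshot: Snapshotter) -> str:
    """Add Link Decorations to web-at-large anchors that got a snapshot."""

    def rewrite(match: "re.Match[str]") -> str:
        uri = match.group(1)
        if not is_web_at_large(uri):
            return match.group(0)  # leave article links untouched
        result = snapshot(uri)
        if result is None:
            return match.group(0)  # no snapshot was created
        snapshot_uri, version_date = result
        return (
            f'<a href="{uri}" '
            f'data-versionurl="{escape(snapshot_uri, quote=True)}" '
            f'data-versiondate="{escape(version_date, quote=True)}">'
        )

    return ANCHOR.sub(rewrite, html)


def stub_snapshotter(uri: str) -> Optional[Tuple[str, str]]:
    # Stand-in for a real archive call; returns the slide's example values.
    if uri == "http://www.stanford.edu":
        return ("http://archive.is/FAy6o", "2014-08-15")
    return None
```

A production version would use a real HTML parser instead of a regular expression and would record failures so that unsnapshotted references can be retried.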
31. Users Follow Robust Links into Web
Archives
The Link Decorations:
Lead users to the created snapshot, if the web
archive is operational
Lead users to a snapshot in any web archive, created
as close as possible to the snapshot Date/Time
Addresses link rot
Addresses content drift
32. Prototypes of pro-active approaches to support the
archiving of web references for scholarly
communications
Richard Wincewicz (1), Peter Burnhill (1) & Herbert Van de Sompel (2)
(1) EDINA, University of Edinburgh; (2) Los Alamos National Laboratory
http://hiberlink.org #hiberlink