@hvdsomp
Thor Conference, Rome, Italy, November 15 2017
Herbert Van de Sompel
Los Alamos National Laboratory
@hvdsomp
Achieving Link Integrity for Managed Collections
Photo by Eric Sieverts
Hyperlinks in Theory
Hyperlinks in Reality
Link Rot
Hyperlinks in Reality
Content Drift
http://dl00.org in 2000, 2004, 2005, 2008
Content Drift
http://icecube.wisc.edu/ on May 8 2009 (left) and August 27 2009 (right)
No Content Drift
http://www.ifa.hawaii.edu/~cowie/k_table.html on June 9 1997 (left) and March 2016 (right)
The Web, All Hyperlinks Subject to Link Rot, Content Drift
The Web, All Hyperlinks Subject to Reference Rot
• Reference Rot hinders our ability to follow links as they were
intended when they were put in place:
• Link rot: A link stops working altogether
• Content drift: The linked content changes over time and may
eventually no longer be representative of the content that was
originally linked
Creating Pockets of Persistence
• How to maintain the integrity of links?
• This challenge exists for the entire web. Some communities with
well-managed collections care about addressing it because they consider
it a quality-of-service issue:
• Scholarly communication
• Cultural heritage
• Legal publications
• Government communication
• Journalism
• Wikipedia
• …
• What can these communities do to create Pockets of Persistence?
A Managed Collection Desires Reliable Outlinks
Links to another Managed Collection
Links to Web at Large Resources
Exploring Link Rot & Content Drift
<Intermezzo - Hiberlink Study re Reference Rot in STM Articles>
PubMed Central Corpus
PMC articles published 1997-2012          PMC
  Total                                   479,194
  With links to articles                  240,857
  With links to web-at-large resources    156,160
Links                                     PMC
  To articles                             744,678
  To web-at-large resources               480,853
Links to Articles & to Web At Large Resources - PMC
Martin Klein, Herbert Van de Sompel, et al. (2014) Scholarly context not found. In: PLOS ONE
https://doi.org/10.1371/journal.pone.0115253
<Intermezzo - Hiberlink Study re Reference Rot in STM Articles>
Exploring Link Rot & Content Drift
Link Rot Occurs When B Moves to C
Introduce PID(B)
Link to PID(B) ; HTTP Redirect from PID(B) to B
When B moves to C: HTTP Redirect from PID(B) to C
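The PID pattern on these slides can be sketched as a small lookup table that answers with an HTTP redirect. This is only an illustration: the class name `PidResolver`, the `example.org` URIs, and the choice of status code 302 are assumptions, not something the slides specify.

```python
from typing import Dict, Tuple


class PidResolver:
    """Toy resolver (hypothetical): maps a persistent identifier to a current location."""

    def __init__(self) -> None:
        self._locations: Dict[str, str] = {}

    def register(self, pid: str, location: str) -> None:
        # Registering the same PID again simply overwrites the old location.
        self._locations[pid] = location

    def resolve(self, pid: str) -> Tuple[int, Dict[str, str]]:
        # Return an HTTP-like (status, headers) pair for a request to the PID.
        location = self._locations.get(pid)
        if location is None:
            return 404, {}
        return 302, {"Location": location}


resolver = PidResolver()
# Initially PID(B) redirects to B.
resolver.register("https://pid.example.org/B", "https://old.example.org/B")
# B moves to C: only the resolver entry changes, links to PID(B) stay valid.
resolver.register("https://pid.example.org/B", "https://new.example.org/C")
print(resolver.resolve("https://pid.example.org/B"))
```

The point of the sketch is that the linking page never changes; only the custodian's mapping does.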
Core Assumption: PID(B) Will Be Used for Linking
Herbert Van de Sompel, Martin Klein, and Shawn Jones (2016) Persistent URIs Must Be Used
to Be Persistent. In: WWW2016. http://arxiv.org/abs/1602.09102
A Disconcerting Observation
• When classifying links extracted from PMC as linking to articles, we
assumed that filtering on http://dx.doi.org/* would do the trick
• But we found many publisher URLs, e.g. http://link.springer.com/article/*
• For example:
• http://link.springer.com/article/10.1007%2Fs00799-014-0108-0
• Instead of:
• http://dx.doi.org/10.1007/s00799-014-0108-0
• We used CrossRef’s Reverse Domain Lookup to classify these
extracted links as linking to articles
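The difference between the two link forms above can be illustrated with a small classifier. This is a sketch under stated assumptions, not the method used in the study (which relied on CrossRef's Reverse Domain Lookup): the regular expression for DOIs is a common heuristic, and the function names are hypothetical.

```python
import re
from typing import Optional
from urllib.parse import unquote, urlsplit

# Hosts of the DOI resolver (assumption: both the old and the current host count).
DOI_HOSTS = {"doi.org", "dx.doi.org"}
# Heuristic DOI shape: "10.", a registrant code, a slash, then a suffix.
DOI_PATTERN = re.compile(r"10\.\d{4,9}/[^\s?#]+")


def uses_doi_resolver(url: str) -> bool:
    """True if the link goes through the DOI resolver, i.e. uses the persistent URI."""
    host = (urlsplit(url).hostname or "").lower()
    return host in DOI_HOSTS


def extract_doi(url: str) -> Optional[str]:
    """Try to find a DOI embedded in the (percent-decoded) path of any URL."""
    path = unquote(urlsplit(url).path)
    match = DOI_PATTERN.search(path)
    return match.group(0) if match else None


publisher_link = "http://link.springer.com/article/10.1007%2Fs00799-014-0108-0"
pid_link = "http://dx.doi.org/10.1007/s00799-014-0108-0"
for link in (publisher_link, pid_link):
    print(link, uses_doi_resolver(link), extract_doi(link))
```

Both links carry the same DOI, but only the second one actually uses the persistent URI for linking.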
URI References - PMC
Herbert Van de Sompel, Martin Klein, and Shawn Jones (2016) Persistent URIs Must Be Used
to Be Persistent. In: WWW2016. http://arxiv.org/abs/1602.09102
Cartoon by Patrick Hochstenbach
http://signposting.org
<Intermezzo – Signposting the Scholarly Web>
Signposting the Scholarly Web
• Proposal:
Use typed links to address some long-standing problems regarding
scholarly resources on the web, by interlinking them using
appropriate relation types
• Focus on a limited set of patterns to support uniformly:
• Conveying a Persistent Identifier
• Expressing the web boundary of a scholarly resource
• Making bibliographic metadata discoverable
• Conveying an Author Identifier
• Conveying a license that applies to a resource
• Conveying a resource type
HTTP Links
Mark Nottingham (2017) RFC8288: Web Linking
http://tools.ietf.org/rfc/rfc8288.txt
HTTP Links Are Used
curl -I http://dbpedia.org/data/Reykjavik
HTTP/1.1 200 OK
Date: Thu, 27 Oct 2016 04:43:28 GMT
Content-Type: application/rdf+xml; charset=UTF-8
Content-Length: 1210
Link: <http://creativecommons.org/licenses/by-sa/3.0>; rel="license",
  <http://dbpedia.org/data/Reykjavik>; rel="alternate"; type="text/n3",
  <http://dbpedia.org/resource/Reykjavik>; rel="describes",
  <http://mementoarchive.lanl.gov/dbpedia/timegate/http://dbpedia.org/data/Reykjavik>; rel="timegate"
For PIDs: Use cite-as Relation Type
Van de Sompel, H., Nelson, M., Bilder, G., Kunze, J., and Warner, S. (2017) “cite-as”: A Link Relation
to Convey a Preferred URI for Referencing https://datatracker.ietf.org/doc/draft-vandesompel-citeas/
For PIDs: Use cite-as Relation Type
• The target URI (PID) of the cite-as link can be picked up by
applications, e.g.:
• reference managers can pick up the PID of an object when the
user saves it while on the landing page or on one of the constituent
resources
• publication pipelines can pick up the PID by looking up (HTTP
HEAD) URIs referenced in a paper to determine whether a PID
exists for them
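How an application might pick up the PID can be sketched as follows. The parser below is a simplified reading of the HTTP Link header syntax and does not cover every case allowed by RFC 8288; the function names and the sample header value are assumptions made for illustration. In a real pipeline the header value would come from the response to an HTTP HEAD request.

```python
import re
from typing import Dict, List, Optional

# One link-value: <target> followed by zero or more ;name=value parameters.
LINK_RE = re.compile(r'<([^>]*)>((?:\s*;\s*[\w-]+\s*=\s*(?:"[^"]*"|[^\s";,]+))*)')
PARAM_RE = re.compile(r';\s*([\w-]+)\s*=\s*(?:"([^"]*)"|([^\s";,]+))')


def parse_link_header(value: str) -> List[Dict[str, str]]:
    """Split a Link header value into dicts with a 'target' key plus its parameters."""
    links = []
    for target, params in LINK_RE.findall(value):
        link = {"target": target}
        for name, quoted, bare in PARAM_RE.findall(params):
            link[name.lower()] = quoted or bare
        links.append(link)
    return links


def find_cite_as(value: str) -> Optional[str]:
    """Return the target of the first link whose rel includes 'cite-as', if any."""
    for link in parse_link_header(value):
        if "cite-as" in link.get("rel", "").lower().split():
            return link["target"]
    return None


# Hypothetical Link header as a landing page might return it.
header = (
    '<https://doi.org/10.1371/journal.pone.0115253>; rel="cite-as", '
    '<https://creativecommons.org/licenses/by/4.0/>; rel="license"'
)
print(find_cite_as(header))
```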
Cartoon by Patrick Hochstenbach
http://signposting.org
</Intermezzo – Signposting the Scholarly Web>
PID Alternative - When B Moves to C: HTTP Redirect from B to C
• Custodian of C needs to hold on to domain of B
• Custodian of C needs to establish redirection patterns; often those
are rather simple rules
• No need to ensure that links are made to PID(B); the URI in the browser
address bar (initially B, later C) is just fine
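The "rather simple rules" mentioned above could look like the following sketch, which rewrites old URIs on the retained domain of B to their new home at C. The domain names, the path layout, and the function name `redirect_target` are all hypothetical.

```python
import re
from typing import List, Optional, Tuple

# (pattern on the old URI, template for the new URI); example rules only.
REDIRECT_RULES: List[Tuple[re.Pattern, str]] = [
    (re.compile(r"^http://old\.example\.org/papers/(\d+)\.html$"),
     r"https://new.example.org/articles/\1"),
]


def redirect_target(old_uri: str) -> Optional[str]:
    """Return the new location for an old URI, or None if no rule applies."""
    for pattern, template in REDIRECT_RULES:
        match = pattern.match(old_uri)
        if match:
            # Fill the captured groups into the template.
            return match.expand(template)
    return None


print(redirect_target("http://old.example.org/papers/42.html"))
```

A web server on the old domain would answer matching requests with an HTTP redirect to the returned URI.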
Exploring Link Rot & Content Drift
Content Drift Occurs when B Changes over Time
• Is not really considered an issue because:
• the objects that receive PIDs were typically static, e.g. scientific
papers
• when a (substantially) new version of an object is published,
typically a new PID is assigned
• But:
• how to verify that the retrieved version of an object is indeed the
referenced version of the object?
• Requires:
• archiving objects in trusted archive(s)
• ability to retrieve objects from the archive(s)
Archived Articles
David Rosenthal (2013) Patio Perspectives at ANADP II: Preserving the Other Half
http://blog.dshr.org/2013/11/patio-perspectives-at-anadp-ii.html
Too few
Too low risk
How to Audit Whether a PID-identified Object is Archived
http://thekeepers.org
Journal-, volume-, and issue-centric
Global audit by DOI?
Contrast: All Web-Archived Versions of David’s Blog Post
Global audit by HTTP URI
Uses Memento infrastructure
http://timetravel.mementoweb.org
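A global audit by HTTP URI amounts to asking the Memento infrastructure about one original URI. The sketch below only builds the request URLs and sends nothing; the `/timemap/link/` and `/api/json/` path patterns reflect my understanding of the Time Travel service and should be treated as assumptions to verify against its documentation.

```python
from datetime import datetime

TIMETRAVEL = "http://timetravel.mementoweb.org"


def timemap_url(uri: str) -> str:
    """URL that should list all known Mementos of `uri` (assumed path pattern)."""
    return f"{TIMETRAVEL}/timemap/link/{uri}"


def closest_memento_api_url(uri: str, when: datetime) -> str:
    """URL that should describe the Mementos closest to `when` (assumed path pattern)."""
    return f"{TIMETRAVEL}/api/json/{when.strftime('%Y%m%d%H%M%S')}/{uri}"


post = "http://blog.dshr.org/2013/11/patio-perspectives-at-anadp-ii.html"
print(timemap_url(post))
print(closest_memento_api_url(post, datetime(2013, 11, 20)))
```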
Exploring Link Rot & Content Drift
Scholarly Context Adrift
Shawn Jones, Herbert Van de Sompel, et al. (2016) Scholarly context adrift. In: PLOS ONE
https://doi.org/10.1371/journal.pone.0167475
How to Assess Content Drift?
Step 1: Find Pre/Post Mementos
Step 2: Select Representative Mementos
Text Similarity Measures
• Compute aggregate text similarity scores (values between 0...100)
for:
• Simhash
• Jaccard
• Sørensen-Dice
• Cosine
• If the aggregate score is 100, we decide that the Pre/Post
Mementos are representative
• We find 137K URI references out of 480K that have representative
Mementos
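Three of the four measures can be sketched on word tokens as shown below, scaled to 0...100. The tokenization, the omission of Simhash, and the reading of "aggregate score is 100" as "every measure equals 100" are assumptions made for this illustration; the slides do not state how the study implemented or aggregated the scores.

```python
import math
import re
from collections import Counter
from typing import List


def tokens(text: str) -> List[str]:
    """Lowercased word tokens (an assumed, simplistic tokenization)."""
    return re.findall(r"\w+", text.lower())


def jaccard(a: str, b: str) -> float:
    sa, sb = set(tokens(a)), set(tokens(b))
    if not sa and not sb:
        return 100.0
    return 100.0 * len(sa & sb) / len(sa | sb)


def sorensen_dice(a: str, b: str) -> float:
    sa, sb = set(tokens(a)), set(tokens(b))
    if not sa and not sb:
        return 100.0
    return 100.0 * 2 * len(sa & sb) / (len(sa) + len(sb))


def cosine(a: str, b: str) -> float:
    ca, cb = Counter(tokens(a)), Counter(tokens(b))
    if not ca and not cb:
        return 100.0
    dot = sum(ca[t] * cb[t] for t in ca.keys() & cb.keys())
    norm = math.sqrt(sum(v * v for v in ca.values()) * sum(v * v for v in cb.values()))
    return 100.0 * dot / norm if norm else 0.0


def is_representative(pre: str, post: str) -> bool:
    """Pre/Post Mementos count as representative if every measure is (numerically) 100."""
    return all(score(pre, post) >= 100.0 - 1e-9
               for score in (jaccard, sorensen_dice, cosine))


print(jaccard("a b c", "a b d"), sorensen_dice("a b c", "a b d"), cosine("a b c", "a b d"))
```

The same functions could be reused for Step 4, comparing the representative Memento with the live web version.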
Step 3: Dereference Live Web Version of URI
Step 4: Representative Memento vs. Live Version
Content Drift - PMC
Shawn Jones, Herbert Van de Sompel, et al. (2016) Scholarly context adrift. In: PLOS ONE
https://doi.org/10.1371/journal.pone.0167475
Reference Rot for Links to Web at Large is Severe
• Link Rot and Content Drift are severe
• Cannot retrieve originally linked content from the live web
• Can potentially retrieve originally linked content from web archives
• But the archival coverage is too poor, a result of incidental
archiving
URI References without Representative Mementos - PMC
Shawn Jones, Herbert Van de Sompel, et al. (2016) Scholarly context adrift. In: PLOS ONE
https://doi.org/10.1371/journal.pone.0167475
Impact of Archival Gap on Links from Managed Collections
Martin Klein, Herbert Van de Sompel, et al. (2014) Scholarly context not found. In: PLOS ONE
https://doi.org/10.1371/journal.pone.0115253
Links from managed collections to domains. Grey: linked content not archived
Uncertainty Regarding the Future of B when A Links to It
Custodian of A Takes a Snapshot of B when Linking to It
Taking a Snapshot of B: Automation is Key
• Web archive APIs for on-demand archiving
• perma.cc, Internet Archive, archive.is, webcitation
• Amber for WordPress & Drupal archives resources linked in a page
• http://amberlink.org/
• Hiberlink’s experimental Zotero extension archives bookmarked
URLs
• http://hiberlink.org/zotero.html
• Hiberlink’s experimental HiberActive archives all URLs referenced in
a newly submitted paper
• https://www.slideshare.net/martinklein0815/hiberactive
site2cite
http://site2cite
Custodian of A Links to Snapshot of B
• Typical practice for linking to snapshots:
<a href="URL of snapshot of B">
• Problems with this practice:
o Impossible to visit the original URI, if desired
o Requires the permanent existence/uptime of the archive that
holds the snapshot
- One link rot problem replaced by another
http://robustlinks.mementoweb.org/about/
Permanent Existence/Uptime of Archives?
Capture of http://webcitation.org dated July 17 2013
https://archive.today/eAETp
Permanent Existence/Uptime of Archives?
Remnant of discontinued web archive http://mummify.it captured on February 14 2014
https://web.archive.org/web/20140214233752/https://www.mummify.it/
Permanent Existence/Uptime of Archives?
http://www.themoscowtimes.com/news/article/russia-bans-wayback-machine-internet-archive-over-
islamic-state-video/510074.html
Permanent Existence/Uptime of Archives?
http://web.archive.org/web/20121101043952/http://vogin.nl on March 6 2017 at 15:59 CET
Custodian of A Links to Snapshot of B, Decorates the Link
• Desired practice for linking to captures is to decorate the link so it
provides a variety of options:
<a href="URL of snapshot of B"
data-originalurl="B"
data-versiondate="datetime of snapshot of B">
• Supports:
o Revisiting the original URL
o Finding snapshots in any web archive (via original URL)
o Finding a temporally appropriate snapshot in any web archive
(via original URL & snapshot datetime)
o Automatically accessing a temporally appropriate snapshot in
any web archive (Memento protocol using original URL &
snapshot datetime)
http://robustlinks.mementoweb.org/spec/
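The decoration described above can be generated mechanically when a snapshot is taken. The helper below is a minimal sketch: the function name `robust_link`, the link text, and the use of an ISO 8601 date (rather than a full datetime) for `data-versiondate` are assumptions; the Robust Links specification is the authority on the allowed formats.

```python
from datetime import date
from html import escape


def robust_link(snapshot_url: str, original_url: str,
                version_date: date, text: str) -> str:
    """Build an anchor that points to the snapshot and carries the original URL and date."""
    return (
        f'<a href="{escape(snapshot_url, quote=True)}" '
        f'data-originalurl="{escape(original_url, quote=True)}" '
        f'data-versiondate="{version_date.isoformat()}">'
        f'{escape(text)}</a>'
    )


link = robust_link(
    "https://web.archive.org/web/20121101043952/http://vogin.nl",
    "http://vogin.nl",
    date(2012, 11, 1),
    "VOGIN",
)
print(link)
```

Because the original URL and the date travel with the link, a reader or script can still look for another snapshot if the referenced archive disappears.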
Robust Links: Link Decoration in Action
See Robust Links at work in: Van de Sompel H. & Nelson, M.L. (2015)
Reminiscing about 15 years of interoperability efforts. D-Lib Magazine.
https://doi.org/10.1045/november2015-vandesompel
JavaScript makes the link decorations actionable
Robust Links JavaScript
https://github.com/mementoweb/robustlinks
Recap - A Managed Collection Desires Reliable Outlinks
Takeaways
• When it comes to links to managed collections, the custodian of the
linking collection relies on the custodians of the linked collections to
preserve link integrity.
• PIDs and HTTP redirects are managed by the custodians of the linked
collections.
Takeaways
• When it comes to links to web-at-large resources, the custodian of a
linking collection cannot rely on the custodians of those linked
resources to maintain link integrity.
• The creation of Mementos and Robust Links is managed by the custodian
of the collection that links to web-at-large resources.

More Related Content

What's hot

Persistent Identification: Easier Said than Done
Persistent Identification: Easier Said than DonePersistent Identification: Easier Said than Done
Persistent Identification: Easier Said than DoneHerbert Van de Sompel
 
Discovering Scholarly Orphans Using ORCID
Discovering Scholarly Orphans Using ORCIDDiscovering Scholarly Orphans Using ORCID
Discovering Scholarly Orphans Using ORCIDMartin Klein
 
To the Rescue of the Orphans of Scholarly Communication
To the Rescue of the Orphans of Scholarly CommunicationTo the Rescue of the Orphans of Scholarly Communication
To the Rescue of the Orphans of Scholarly CommunicationMartin Klein
 
Persistent Identifiers and the Web: The Need for an Unambiguous Mapping
Persistent Identifiers and the Web: The Need for an Unambiguous MappingPersistent Identifiers and the Web: The Need for an Unambiguous Mapping
Persistent Identifiers and the Web: The Need for an Unambiguous MappingHerbert Van de Sompel
 
Researcher Pod: Scholarly Communication Using the Decentralized Web
Researcher Pod: Scholarly Communication Using the Decentralized WebResearcher Pod: Scholarly Communication Using the Decentralized Web
Researcher Pod: Scholarly Communication Using the Decentralized WebHerbert Van de Sompel
 
The Web of Data is Our Opportunity
The Web of Data is Our OpportunityThe Web of Data is Our Opportunity
The Web of Data is Our OpportunityRichard Wallis
 
Wikipedia and Libraries: Island Hopping the Data Archipelago
Wikipedia and Libraries: Island Hopping the Data ArchipelagoWikipedia and Libraries: Island Hopping the Data Archipelago
Wikipedia and Libraries: Island Hopping the Data ArchipelagoMaximilian Klein
 
LD4L OCLC Data Strategy
LD4L OCLC Data StrategyLD4L OCLC Data Strategy
LD4L OCLC Data StrategyRichard Wallis
 
The Web of Data is Our Oyster
The Web of Data is Our OysterThe Web of Data is Our Oyster
The Web of Data is Our OysterRichard Wallis
 
Prototypes of pro-active approaches to support the archiving of web reference...
Prototypes of pro-active approaches to support the archiving of web reference...Prototypes of pro-active approaches to support the archiving of web reference...
Prototypes of pro-active approaches to support the archiving of web reference...EDINA, University of Edinburgh
 
Web Driven Revolution For Library Data
Web Driven Revolution For Library DataWeb Driven Revolution For Library Data
Web Driven Revolution For Library DataRichard Wallis
 
Consuming Linked Data SemTech2010
Consuming Linked Data SemTech2010Consuming Linked Data SemTech2010
Consuming Linked Data SemTech2010Juan Sequeda
 
Reference Rot and Link Decoration
Reference Rot and Link DecorationReference Rot and Link Decoration
Reference Rot and Link DecorationMartin Klein
 
Linked Data Snowball, or Why We Need Reconciliation
Linked Data Snowball, or Why We Need ReconciliationLinked Data Snowball, or Why We Need Reconciliation
Linked Data Snowball, or Why We Need ReconciliationRobert Sanderson
 
Contextual Computing - Knowledge Graphs & Web of Entities
Contextual Computing - Knowledge Graphs & Web of EntitiesContextual Computing - Knowledge Graphs & Web of Entities
Contextual Computing - Knowledge Graphs & Web of EntitiesRichard Wallis
 
Impact of URI Canonicalization on Memento Count
Impact of URI Canonicalization on Memento Count Impact of URI Canonicalization on Memento Count
Impact of URI Canonicalization on Memento Count Mat Kelly
 
Evolving the Web into a Global Dataspace – Advances and Applications
Evolving the Web into a Global Dataspace – Advances and ApplicationsEvolving the Web into a Global Dataspace – Advances and Applications
Evolving the Web into a Global Dataspace – Advances and ApplicationsChris Bizer
 

What's hot (20)

Reminiscing about interoperability
Reminiscing about interoperabilityReminiscing about interoperability
Reminiscing about interoperability
 
PID Signposting Pattern
PID Signposting PatternPID Signposting Pattern
PID Signposting Pattern
 
Persistent Identification: Easier Said than Done
Persistent Identification: Easier Said than DonePersistent Identification: Easier Said than Done
Persistent Identification: Easier Said than Done
 
Discovering Scholarly Orphans Using ORCID
Discovering Scholarly Orphans Using ORCIDDiscovering Scholarly Orphans Using ORCID
Discovering Scholarly Orphans Using ORCID
 
To the Rescue of the Orphans of Scholarly Communication
To the Rescue of the Orphans of Scholarly CommunicationTo the Rescue of the Orphans of Scholarly Communication
To the Rescue of the Orphans of Scholarly Communication
 
Persistent Identifiers and the Web: The Need for an Unambiguous Mapping
Persistent Identifiers and the Web: The Need for an Unambiguous MappingPersistent Identifiers and the Web: The Need for an Unambiguous Mapping
Persistent Identifiers and the Web: The Need for an Unambiguous Mapping
 
The Web We Want
The Web We WantThe Web We Want
The Web We Want
 
Researcher Pod: Scholarly Communication Using the Decentralized Web
Researcher Pod: Scholarly Communication Using the Decentralized WebResearcher Pod: Scholarly Communication Using the Decentralized Web
Researcher Pod: Scholarly Communication Using the Decentralized Web
 
The Web of Data is Our Opportunity
The Web of Data is Our OpportunityThe Web of Data is Our Opportunity
The Web of Data is Our Opportunity
 
Wikipedia and Libraries: Island Hopping the Data Archipelago
Wikipedia and Libraries: Island Hopping the Data ArchipelagoWikipedia and Libraries: Island Hopping the Data Archipelago
Wikipedia and Libraries: Island Hopping the Data Archipelago
 
LD4L OCLC Data Strategy
LD4L OCLC Data StrategyLD4L OCLC Data Strategy
LD4L OCLC Data Strategy
 
The Web of Data is Our Oyster
The Web of Data is Our OysterThe Web of Data is Our Oyster
The Web of Data is Our Oyster
 
Prototypes of pro-active approaches to support the archiving of web reference...
Prototypes of pro-active approaches to support the archiving of web reference...Prototypes of pro-active approaches to support the archiving of web reference...
Prototypes of pro-active approaches to support the archiving of web reference...
 
Web Driven Revolution For Library Data
Web Driven Revolution For Library DataWeb Driven Revolution For Library Data
Web Driven Revolution For Library Data
 
Consuming Linked Data SemTech2010
Consuming Linked Data SemTech2010Consuming Linked Data SemTech2010
Consuming Linked Data SemTech2010
 
Reference Rot and Link Decoration
Reference Rot and Link DecorationReference Rot and Link Decoration
Reference Rot and Link Decoration
 
Linked Data Snowball, or Why We Need Reconciliation
Linked Data Snowball, or Why We Need ReconciliationLinked Data Snowball, or Why We Need Reconciliation
Linked Data Snowball, or Why We Need Reconciliation
 
Contextual Computing - Knowledge Graphs & Web of Entities
Contextual Computing - Knowledge Graphs & Web of EntitiesContextual Computing - Knowledge Graphs & Web of Entities
Contextual Computing - Knowledge Graphs & Web of Entities
 
Impact of URI Canonicalization on Memento Count
Impact of URI Canonicalization on Memento Count Impact of URI Canonicalization on Memento Count
Impact of URI Canonicalization on Memento Count
 
Evolving the Web into a Global Dataspace – Advances and Applications
Evolving the Web into a Global Dataspace – Advances and ApplicationsEvolving the Web into a Global Dataspace – Advances and Applications
Evolving the Web into a Global Dataspace – Advances and Applications
 

Similar to Achieving Link Integrity for Managed Collections

En toen was er niets meer ....
En toen was er niets meer ....En toen was er niets meer ....
En toen was er niets meer ....voginip
 
COAR Venice 2017 Next Generation Repository Session: What can be done, right ...
COAR Venice 2017 Next Generation Repository Session: What can be done, right ...COAR Venice 2017 Next Generation Repository Session: What can be done, right ...
COAR Venice 2017 Next Generation Repository Session: What can be done, right ...Andrea Bollini
 
COAR Venice 2017 Next Generation Repository Session: What can be done, right ...
COAR Venice 2017 Next Generation Repository Session: What can be done, right ...COAR Venice 2017 Next Generation Repository Session: What can be done, right ...
COAR Venice 2017 Next Generation Repository Session: What can be done, right ...4Science
 
IBM Connections REST-API Waltz
IBM Connections REST-API WaltzIBM Connections REST-API Waltz
IBM Connections REST-API WaltzHenning Schmidt
 
Schema.org where did that come from?
Schema.org where did that come from?Schema.org where did that come from?
Schema.org where did that come from?Richard Wallis
 
Whowas: History of resources at APNIC
Whowas: History of resources at APNICWhowas: History of resources at APNIC
Whowas: History of resources at APNICAPNIC
 
Semantic Pipes and Semantic Mashups
Semantic Pipes and Semantic MashupsSemantic Pipes and Semantic Mashups
Semantic Pipes and Semantic Mashupsgiurca
 
TIB AV-Portal: Semantic Content Mining with Semi-Automatic Metadata Editing. ...
TIB AV-Portal: Semantic Content Mining with Semi-Automatic Metadata Editing. ...TIB AV-Portal: Semantic Content Mining with Semi-Automatic Metadata Editing. ...
TIB AV-Portal: Semantic Content Mining with Semi-Automatic Metadata Editing. ...LIBER Europe
 
“Library 2.0: Let's get connected!”
“Library 2.0: Let's get connected!”“Library 2.0: Let's get connected!”
“Library 2.0: Let's get connected!”bridgingworlds2008
 
Oracle Big Data Spatial & Graph 
Social Media Analysis - Case Study
Oracle Big Data Spatial & Graph 
Social Media Analysis - Case StudyOracle Big Data Spatial & Graph 
Social Media Analysis - Case Study
Oracle Big Data Spatial & Graph 
Social Media Analysis - Case StudyMark Rittman
 
AWS Finland meetup 2017 October
AWS Finland meetup 2017 OctoberAWS Finland meetup 2017 October
AWS Finland meetup 2017 OctoberRolf Koski
 
Ensuring the Integrity (& Continuity) of Our Record of Scholarship
Ensuring the Integrity (& Continuity) of Our Record of ScholarshipEnsuring the Integrity (& Continuity) of Our Record of Scholarship
Ensuring the Integrity (& Continuity) of Our Record of ScholarshipEDINA, University of Edinburgh
 
Social Connections 12. We hired hackers to hack us
Social Connections 12. We hired hackers to hack usSocial Connections 12. We hired hackers to hack us
Social Connections 12. We hired hackers to hack usRobert Farstad
 
We hired hackers to hack us; A case study about cloud-based authentication an...
We hired hackers to hack us; A case study about cloud-based authentication an...We hired hackers to hack us; A case study about cloud-based authentication an...
We hired hackers to hack us; A case study about cloud-based authentication an...LetsConnect
 
Hiberlink: Prototypes of pro-active approaches to support the archiving of we...
Hiberlink: Prototypes of pro-active approaches to support the archiving of we...Hiberlink: Prototypes of pro-active approaches to support the archiving of we...
Hiberlink: Prototypes of pro-active approaches to support the archiving of we...EDINA, University of Edinburgh
 
Announcing the Connections Cloud Catalog: How to Get new Apps fresh out of th...
Announcing the Connections Cloud Catalog: How to Get new Apps fresh out of th...Announcing the Connections Cloud Catalog: How to Get new Apps fresh out of th...
Announcing the Connections Cloud Catalog: How to Get new Apps fresh out of th...LetsConnect
 
IBM Connections 6 Component Pack
IBM Connections 6 Component PackIBM Connections 6 Component Pack
IBM Connections 6 Component PackLetsConnect
 
LOCAH Project and Considerations of Linked Data Approaches
LOCAH Project and Considerations of Linked Data ApproachesLOCAH Project and Considerations of Linked Data Approaches
LOCAH Project and Considerations of Linked Data ApproachesAdrian Stevenson
 

Similar to Achieving Link Integrity for Managed Collections (20)

En toen was er niets meer ....
En toen was er niets meer ....En toen was er niets meer ....
En toen was er niets meer ....
 
COAR Venice 2017 Next Generation Repository Session: What can be done, right ...
COAR Venice 2017 Next Generation Repository Session: What can be done, right ...COAR Venice 2017 Next Generation Repository Session: What can be done, right ...
COAR Venice 2017 Next Generation Repository Session: What can be done, right ...
 
COAR Venice 2017 Next Generation Repository Session: What can be done, right ...
COAR Venice 2017 Next Generation Repository Session: What can be done, right ...COAR Venice 2017 Next Generation Repository Session: What can be done, right ...
COAR Venice 2017 Next Generation Repository Session: What can be done, right ...
 
IBM Connections REST-API Waltz
IBM Connections REST-API WaltzIBM Connections REST-API Waltz
IBM Connections REST-API Waltz
 
Signposting Overview
Signposting OverviewSignposting Overview
Signposting Overview
 
Schema.org where did that come from?
Schema.org where did that come from?Schema.org where did that come from?
Schema.org where did that come from?
 
Whowas: History of resources at APNIC
Whowas: History of resources at APNICWhowas: History of resources at APNIC
Whowas: History of resources at APNIC
 
Semantic Pipes and Semantic Mashups
Semantic Pipes and Semantic MashupsSemantic Pipes and Semantic Mashups
Semantic Pipes and Semantic Mashups
 
TIB AV-Portal: Semantic Content Mining with Semi-Automatic Metadata Editing. ...
TIB AV-Portal: Semantic Content Mining with Semi-Automatic Metadata Editing. ...TIB AV-Portal: Semantic Content Mining with Semi-Automatic Metadata Editing. ...
TIB AV-Portal: Semantic Content Mining with Semi-Automatic Metadata Editing. ...
 
“Library 2.0: Let's get connected!”
“Library 2.0: Let's get connected!”“Library 2.0: Let's get connected!”
“Library 2.0: Let's get connected!”
 
Oracle Big Data Spatial & Graph 
Social Media Analysis - Case Study
Oracle Big Data Spatial & Graph 
Social Media Analysis - Case StudyOracle Big Data Spatial & Graph 
Achieving Link Integrity for Managed Collections

  • 1. @hvdsomp Thor Conference, Rome, Italy, November 15 2017 Herbert Van de Sompel Los Alamos National Laboratory @hvdsomp Achieving Link Integrity for Managed Collections Photo by Eric Sieverts
  • 2. Hyperlinks in Theory
  • 3. Hyperlinks in Reality
  • 4. Hyperlinks in Reality
  • 5. Link Rot
  • 6. Link Rot
  • 7. Hyperlinks in Reality
  • 8. Content Drift
  • 9. Content Drift
  • 10. Content Drift: http://dl00.org in 2000, 2004, 2005, 2008
  • 11. Content Drift: http://icecube.wisc.edu/ on May 8 2009 (left) and August 27 2009 (right)
  • 12. No Content Drift: http://www.ifa.hawaii.edu/~cowie/k_table.html on June 9 1997 (left) and March 2016 (right)
  • 13. The Web, All Hyperlinks Subject to Link Rot, Content Drift
  • 14. The Web, All Hyperlinks Subject to Reference Rot
      • Reference Rot hinders our ability to follow links as they were intended when they were put in place:
          • Link rot: a link stops working altogether
          • Content drift: the linked content changes over time and may eventually no longer be representative of the content that was originally linked
  • 15. Creating Pockets of Persistence
      • How to maintain the integrity of links?
      • This challenge exists for the entire web. Some communities with well-managed collections care about addressing it because they consider it a Quality of Service issue:
          • Scholarly communication
          • Cultural heritage
          • Legal publications
          • Government communication
          • Journalism
          • Wikipedia
          • …
      • What can these communities do to create Pockets of Persistence?
  • 16. A Managed Collection Desires Reliable Outlinks
  • 17. Links to another Managed Collection
  • 18. Links to Web at Large Resources
  • 19. Exploring Link Rot & Content Drift
  • 20. <Intermezzo - Hiberlink Study re Reference Rot in STM Articles>
  • 21. PubMed Central Corpus (PMC articles published 1997-2012)
      Articles                                   PMC
      Total                                      479,194
      With links to articles                     240,857
      With links to web-at-large resources       156,160
      Links                                      PMC
      To articles                                744,678
      To web-at-large resources                  480,853
  • 22. Links to Articles & to Web At Large Resources - PMC. Martin Klein, Herbert Van de Sompel, et al. (2014) Scholarly context not found. In: PLOS ONE https://doi.org/10.1371/journal.pone.0115253
  • 23. <Intermezzo - Hiberlink Study re Reference Rot in STM Articles>
  • 24. Exploring Link Rot & Content Drift
  • 25. Link Rot Occurs when B moves to C
  • 26. Introduce PID(B)
  • 27. Link to PID(B); HTTP Redirect from PID(B) to B
  • 28. When B moves to C: HTTP Redirect from PID(B) to C
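
The PID mechanism on slides 26 to 28 can be sketched as a lookup table that answers every request for PID(B) with a redirect to the object's current location. This is only an illustration, not how any particular resolver is implemented: the class name `PidResolver`, the example.org URLs, and the choice of a 302 status code are all assumptions.

```python
class PidResolver:
    """Toy resolver: maps a persistent identifier to its current location."""

    def __init__(self):
        self._locations = {}

    def register(self, pid, location):
        # Called when PID(B) is first minted, and again when the object moves.
        self._locations[pid] = location

    def resolve(self, pid):
        # Return (status, headers) the way an HTTP server would answer.
        location = self._locations.get(pid)
        if location is None:
            return 404, {}
        return 302, {"Location": location}


# Hypothetical identifiers, used for illustration only.
resolver = PidResolver()
resolver.register("https://pid.example.org/b", "https://b.example.org/paper")
before_move = resolver.resolve("https://pid.example.org/b")

# B moves to C: only the resolver entry changes; links to PID(B) stay valid.
resolver.register("https://pid.example.org/b", "https://c.example.org/paper")
after_move = resolver.resolve("https://pid.example.org/b")
```

The point of the sketch is that the linking collection never has to update its links; the custodian of the linked object maintains the table.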
  • 29. Core Assumption: PID(B) Will Be Used for Linking
  • 30. Herbert Van de Sompel, Martin Klein, and Shawn Jones (2016) Persistent URIs Must Be Used to Be Persistent. In: WWW2016. http://arxiv.org/abs/1602.09102
  • 31. A Disconcerting Observation
      • When classifying links extracted from PMC as linking to articles, we assumed that filtering on http://dx.doi.org/* would do the trick
      • But we found a lot of e.g. http://link.springer.com/article/*
      • For example: http://link.springer.com/article/10.1007%2Fs00799-014-0108-0
      • Instead of: http://dx.doi.org/10.1007/s00799-014-0108-0
      • We used CrossRef's Reverse Domain Lookup to classify these extracted links as linking to articles
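
The study classified these links with CrossRef's Reverse Domain Lookup. The sketch below is a different and much simpler heuristic, shown only to illustrate the problem: it recovers a DOI when one happens to be embedded, percent-encoded, in a landing-page URL. The function names and the regular expression are assumptions, and the approach fails for publishers whose URLs do not contain the DOI.

```python
import re
from urllib.parse import unquote, urlsplit

# A DOI name: "10.", a registrant code, a slash, then a suffix.
DOI_PATTERN = re.compile(r"10\.\d{4,9}/\S+")


def extract_doi(url):
    """Return the DOI embedded in a URL path, or None if there is none."""
    path = unquote(urlsplit(url).path)  # turns %2F back into "/"
    match = DOI_PATTERN.search(path)
    return match.group(0) if match else None


def to_doi_link(url, resolver="http://dx.doi.org/"):
    """Rewrite a landing-page URL as a DOI link when a DOI can be found."""
    doi = extract_doi(url)
    return resolver + doi if doi else url


landing = "http://link.springer.com/article/10.1007%2Fs00799-014-0108-0"
persistent = to_doi_link(landing)
```

A URL without an embedded DOI is returned unchanged, so the function can be applied to every extracted link without losing any.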
  • 32. URI References - PMC. Herbert Van de Sompel, Martin Klein, and Shawn Jones (2016) Persistent URIs Must Be Used to Be Persistent. In: WWW2016. http://arxiv.org/abs/1602.09102
  • 33. <Intermezzo – Signposting the Scholarly Web> Cartoon by Patrick Hochstenbach. http://signposting.org
  • 34. Signposting the Scholarly Web
      • Proposal: Use typed links to address some long-standing problems regarding scholarly resources on the web, by interlinking them using appropriate relation types
      • Focus on a limited set of patterns to support uniformly:
          • Conveying a Persistent Identifier
          • Expressing the web boundary of a scholarly resource
          • Making bibliographic metadata discoverable
          • Conveying an Author Identifier
          • Conveying a license that applies to a resource
          • Conveying a resource type
  • 35. HTTP Links. Mark Nottingham (2017) RFC8288: Web Linking http://tools.ietf.org/rfc/rfc8288.txt
  • 36. HTTP Links
  • 37. HTTP Links
  • 38. HTTP Links Are Used
      curl -I http://dbpedia.org/data/Reykjavik
      HTTP/1.1 200 OK
      Date: Thu, 27 Oct 2016 04:43:28 GMT
      Content-Type: application/rdf+xml; charset=UTF-8
      Content-Length: 1210
      Link: <http://creativecommons.org/licenses/by-sa/3.0> ; rel="license",
        <http://dbpedia.org/data/Reykjavik> ; rel="alternate"; type="text/n3",
        <http://dbpedia.org/resource/Reykjavik>; rel="describes",
        <http://mementoarchive.lanl.gov/dbpedia/timegate/http://dbpedia.org/data/Reykjavik> ; rel="timegate"
  • 39. For PIDs: Use cite-as Relation Type. Van de Sompel, H., Nelson, M., Bilder, G., Kunze, J., and Warner, S. (2017) "cite-as": A Link Relation to Convey a Preferred URI for Referencing https://datatracker.ietf.org/doc/draft-vandesompel-citeas/
  • 40. For PIDs: Use cite-as Relation Type. Van de Sompel, H., Nelson, M., Bilder, G., Kunze, J., and Warner, S. (2017) "cite-as": A Link Relation to Convey a Preferred URI for Referencing https://datatracker.ietf.org/doc/draft-vandesompel-citeas/
  • 41. For PIDs: Use cite-as Relation Type
      • The target URI (PID) of the cite-as link can be picked up by applications, e.g.:
          • reference managers can pick up the PID of an object when the user saves it while on the landing page, one of the constituent resources
          • publication pipelines can pick up the PID by looking up (HTTP HEAD) URIs referenced in a paper to determine whether a PID exists for them
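
How an application might pick up a typed link from an HTTP `Link` header can be sketched as follows. The parser is deliberately simplified (it does not handle quoted parameter values that contain commas or semicolons), the function names are assumptions, and the `cite-as` header value uses a made-up DOI purely for illustration. The `EXAMPLE` value is the DBpedia header shown on slide 38.

```python
import re

# One link-value: <target> followed by ;param=value pairs (RFC 8288 syntax,
# simplified: quoted values containing "," or ";" are not handled).
LINK_VALUE = re.compile(r'<([^>]*)>((?:\s*;\s*[^;,]+)*)')
PARAM = re.compile(r';\s*([\w-]+)\s*=\s*"?([^";,]*)"?')


def parse_link_header(value):
    """Return a list of (target_uri, params_dict) tuples."""
    links = []
    for target, raw_params in LINK_VALUE.findall(value):
        params = {k.lower(): v.strip() for k, v in PARAM.findall(raw_params)}
        links.append((target, params))
    return links


def targets_for(value, rel):
    """Return all target URIs whose rel attribute contains the given type."""
    return [
        target
        for target, params in parse_link_header(value)
        if rel in params.get("rel", "").split()
    ]


EXAMPLE = (
    '<http://creativecommons.org/licenses/by-sa/3.0> ; rel="license", '
    '<http://dbpedia.org/data/Reykjavik> ; rel="alternate"; type="text/n3", '
    '<http://dbpedia.org/resource/Reykjavik>; rel="describes", '
    '<http://mementoarchive.lanl.gov/dbpedia/timegate/'
    'http://dbpedia.org/data/Reykjavik> ; rel="timegate"'
)

# Hypothetical header a landing page might return to convey its PID.
CITE_AS = '<https://doi.org/10.1234/example>; rel="cite-as"'
pid = targets_for(CITE_AS, "cite-as")
```

A reference manager or publication pipeline would obtain the header value from an HTTP HEAD response and then call `targets_for(value, "cite-as")`; an empty list means no PID was advertised.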
  • 42. </Intermezzo – Signposting the Scholarly Web> Cartoon by Patrick Hochstenbach. http://signposting.org
  • 43. PID Alternative - When B Moves to C: HTTP Redirect from B to C
  • 44. PID Alternative - When B Moves to C: HTTP Redirect from B to C
      • Custodian of C needs to hold on to domain of B
      • Custodian of C needs to establish redirection patterns; often those are rather simple rules
      • No problem with establishing links to PID(B); the URI in the browser address bar (initially B, later C) is just fine
  • 45. Exploring Link Rot & Content Drift
  • 46. Content Drift Occurs when B Changes over Time
  • 47. Content Drift Occurs when B Changes over Time
      • Is not really considered an issue because:
          • the objects that receive PIDs were typically static, e.g. scientific papers
          • when a (substantially) new version of an object is published, typically a new PID is assigned
      • But:
          • how to verify that the retrieved version of an object is indeed the referenced version of the object?
      • Requires:
          • archiving objects in trusted archive(s)
          • ability to retrieve objects from the archive(s)
  • 48. Archived Articles: too few, too low risk. David Rosenthal (2013) Patio Perspectives at ANADP II: Preserving the Other Half http://blog.dshr.org/2013/11/patio-perspectives-at-anadp-ii.html
  • 49. How to Audit Whether a PID-identified Object is Archived. http://thekeepers.org is Journal, Volume, Issue centric. Global audit by DOI?
  • 50. Contrast: All Web-Archived Versions of David's Blog Post. Global audit by HTTP URI. Uses Memento infrastructure. http://timetravel.mementoweb.org
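
The Memento protocol (RFC 7089) lets a client ask for the archived version of a URI closest to a given moment by sending an `Accept-Datetime` request header. The sketch below only builds that header; the function name is an assumption, and the TimeGate to which the request would be sent is left out because the slide does not specify one.

```python
from datetime import datetime, timezone
from email.utils import format_datetime


def memento_request_headers(when):
    """Build headers asking a TimeGate for the capture closest to `when`."""
    if when.tzinfo is None:
        raise ValueError("datetime must be timezone-aware")
    # RFC 7089 uses an RFC 1123 date expressed in GMT.
    utc = when.astimezone(timezone.utc)
    return {"Accept-Datetime": format_datetime(utc, usegmt=True)}


headers = memento_request_headers(
    datetime(2016, 10, 27, 4, 43, 28, tzinfo=timezone.utc)
)
```

Because the request is keyed on the original HTTP URI rather than on an archive-specific URL, the same headers can be reused against any Memento-compliant archive or aggregator.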
  • 51. Exploring Link Rot & Content Drift
  • 52. Scholarly Context Adrift. Shawn Jones, Herbert Van de Sompel, et al. (2016) Scholarly context adrift. In: PLOS ONE https://doi.org/10.1371/journal.pone.0167475
  • 53. How to Assess Content Drift?
  • 54. Step 1: Find Pre/Post Mementos
  • 55. Step 2: Select Representative Mementos
  • 56. Text Similarity Measures
      • Compute aggregate text similarity scores (values between 0 and 100) for:
          • Simhash
          • Jaccard
          • Sørensen-Dice
          • Cosine
      • If the aggregate score is 100, we decide that the Pre/Post Mementos are representative
      • We find 137K URI references out of 480K that have representative Mementos
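
Three of the four measures named on this slide can be sketched on whitespace-separated tokens, each scaled to the 0 to 100 range. This is a simplified illustration, not the study's implementation: Simhash is omitted, the tokenization is naive, and taking the mean of the three scores as the "aggregate" is an assumption.

```python
import math
from collections import Counter


def tokens(text):
    return text.lower().split()


def jaccard(a, b):
    sa, sb = set(tokens(a)), set(tokens(b))
    if not sa and not sb:
        return 100.0
    return 100.0 * len(sa & sb) / len(sa | sb)


def sorensen_dice(a, b):
    sa, sb = set(tokens(a)), set(tokens(b))
    if not sa and not sb:
        return 100.0
    return 100.0 * 2 * len(sa & sb) / (len(sa) + len(sb))


def cosine(a, b):
    ca, cb = Counter(tokens(a)), Counter(tokens(b))
    if not ca and not cb:
        return 100.0
    dot = sum(ca[t] * cb[t] for t in ca)
    norm_a = sum(v * v for v in ca.values())
    norm_b = sum(v * v for v in cb.values())
    if norm_a == 0 or norm_b == 0:
        return 0.0
    # One square root over the product keeps identical texts at exactly 100.
    return 100.0 * dot / math.sqrt(norm_a * norm_b)


def aggregate(a, b):
    # Assumption: the aggregate is the mean of the individual scores.
    return (jaccard(a, b) + sorensen_dice(a, b) + cosine(a, b)) / 3


def is_representative(pre_text, post_text):
    """Pre/Post Mementos count as representative only if they fully agree."""
    return aggregate(pre_text, post_text) == 100.0
```

Requiring a score of exactly 100 is strict by design: the Pre and Post Mementos bracket the reference date, so if they are textually identical the content can be assumed unchanged in between.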
  • 57. Step 3: Dereference Live Web Version of URI
  • 58. Step 4: Representative Memento vs. Live Version
  • 59. Content Drift - PMC. Shawn Jones, Herbert Van de Sompel, et al. (2016) Scholarly context adrift. In: PLOS ONE https://doi.org/10.1371/journal.pone.0167475
  • 60. Reference Rot for Links to Web at Large is Severe
      • Link Rot and Content Drift are severe
      • Cannot retrieve originally linked content from the live web
      • Can potentially retrieve originally linked content from web archives
      • But the archival coverage is too poor, a result of incidental archiving
  • 61. URI References without Representative Mementos - PMC. Shawn Jones, Herbert Van de Sompel, et al. (2016) Scholarly context adrift. In: PLOS ONE https://doi.org/10.1371/journal.pone.0167475
  • 62. Impact of Archival Gap on Links from Managed Collections. Martin Klein, Herbert Van de Sompel, et al. (2014) Scholarly context not found. In: PLOS ONE https://doi.org/10.1371/journal.pone.0115253 Figure: Links from Managed Collections to Domains. Grey: Linked Content not Archived
  • 63. Uncertainty Regarding the Future of B when A Links to It
  • 64. Custodian of A Takes a Snapshot of B when Linking to It
  • 65. Taking a Snapshot of B: Automation is Key
      • Web archive APIs for on-demand archiving
          • perma.cc, Internet Archive, archive.is, webcitation
      • Amber for WordPress & Drupal archives resources linked in a page
          • http://amberlink.org/
      • Hiberlink's experimental Zotero extension archives bookmarked URLs
          • http://hiberlink.org/zotero.html
      • Hiberlink's experimental HiberActive archives all URLs referenced in a newly submitted paper
          • https://www.slideshare.net/martinklein0815/hiberactive
  • 66. site2cite http://site2cite
  • 67. Custodian of A Links to Snapshot of B
      • Typical practice for linking to snapshots: <a href="URL of snapshot of B">
      • Problems with this practice:
          • Impossible to visit the original URI, if desired
          • Requires the permanent existence/uptime of the archive that holds the snapshot: one link rot problem replaced by another
      • http://robustlinks.mementoweb.org/about/
  • 68. Permanent Existence/Uptime of Archives? Capture of http://webcitation.org dated July 17 2013 https://archive.today/eAETp
  • 69. Permanent Existence/Uptime of Archives? Remnant of discontinued web archive http://mummify.it captured on February 14 2014 https://web.archive.org/web/20140214233752/https://www.mummify.it/
  • 70. Permanent Existence/Uptime of Archives? http://www.themoscowtimes.com/news/article/russia-bans-wayback-machine-internet-archive-over-islamic-state-video/510074.html
  • 71. Permanent Existence/Uptime of Archives? http://web.archive.org/web/20121101043952/http://vogin.nl on March 6 2017 at 15:59 CET
  • 72. Custodian of A Links to Snapshot of B, Decorates the Link
      • Desired practice for linking to captures is to decorate the link so it provides a variety of options: <a href="URL of snapshot of B" data-originalurl="B" data-versiondate="datetime of snapshot of B">
      • Supports:
          • Revisiting the original URL
          • Finding snapshots in any web archive (via original URL)
          • Finding a temporally appropriate snapshot in any web archive (via original URL & snapshot datetime)
          • Automatically accessing a temporally appropriate snapshot in any web archive (Memento protocol using original URL & snapshot datetime)
      • http://robustlinks.mementoweb.org/spec/
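
Generating a decorated link of this kind is straightforward to automate. The sketch below renders an anchor carrying the two `data-` attributes named on the slide; the function name and the link text are assumptions, and the example reuses the Internet Archive capture of http://vogin.nl from slide 71, whose timestamp corresponds to 1 November 2012.

```python
from html import escape


def robust_link(snapshot_url, original_url, version_date, text):
    """Render an anchor carrying the two Robust Links data attributes."""
    return (
        f'<a href="{escape(snapshot_url, quote=True)}" '
        f'data-originalurl="{escape(original_url, quote=True)}" '
        f'data-versiondate="{escape(version_date, quote=True)}">'
        f"{escape(text)}</a>"
    )


anchor = robust_link(
    "http://web.archive.org/web/20121101043952/http://vogin.nl",
    "http://vogin.nl",
    "2012-11-01",
    "VOGIN",  # hypothetical link text
)
```

Keeping the original URL and the snapshot date alongside the snapshot URL means the link remains usable if the archive that holds the snapshot disappears.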
  • 73. Robust Links: Link Decoration in Action. See Robust Links at work in: Van de Sompel, H. & Nelson, M.L. (2015) Reminiscing about 15 years of interoperability efforts. D-Lib Magazine. https://doi.org/10.1045/november2015-vandesompel JavaScript makes the link decorations actionable. Robust Links JavaScript: https://github.com/mementoweb/robustlinks
  • 74. Recap - A Managed Collection Desires Reliable Outlinks
  • 75. Takeaways
      • When it comes to links to managed collections, the custodian of the linking collection relies on the custodians of the linked collections to preserve link integrity.
      • PIDs and HTTP redirects are managed by the custodians of the linked collections.
  • 76. Takeaways
      • When it comes to links to web at large resources, the custodian of a linking collection cannot rely on the custodians of those linked resources to maintain link integrity.
      • Creation of Mementos and Robust Links is managed by the custodian of the collection that links to web at large resources.
  • 77. Herbert Van de Sompel, Los Alamos National Laboratory, @hvdsomp. Achieving Link Integrity for Managed Collections. Photo by Eric Sieverts

Editor's Notes

  1. Previously, archival status (14-day window) as proxy