Link checking, the 404 problem (and the other 403) (Richard Cross, Nottingham Trent University)

How much time should libraries spend validating links from resource lists to electronic and online materials? How much effort should go into validating the copyright status, and assuring the quality, of materials selected by academics from the web? This session will highlight some of the key issues raised by the business of 'link checking' at Nottingham Trent University, discuss the pros and cons of different assurance methods, and suggest some possible developments in the functionality of Aspire that could support and streamline this area of work.


  • 1. 25 June 2013. Enhancing life-long learning, teaching and research through information resources and services
  • 2. Link checking, the 404 problem (and the other 403). Richard Cross, Resource Discovery and Innovation Team Manager, Nottingham Trent University Library. Talis Aspire User Group | Nottingham Conference Centre | June 2013
  • 3. Linking it all up • Problem? What problem? • The link validation process at Nottingham Trent University • Issues and challenges in the review of link-to materials • Some possible Aspire developments that could help
  • 4. The 404 problem? • Error 404: standard response from a web site when the requested page URL cannot be found
  • 5. The other 403? • The tangle of issues raised by the challenge of ensuring reliable, problem-free, seamless, deep-linking to online resources • Broken links, inaccessible material, poor quality metadata, copyright blindness, unclear provenance – all raise reputational issues for the resource list service
  • 6. Aspire, metadata and linking • Aspire’s ability to ‘recognise’ online materials exists – for understandable reasons – along a continuum • [Continuum diagram] DOI-based, Aspire recogniser supported: strongest metadata and link-to extraction. Web stuff: weakest metadata and link-to extraction
  • 7. One of the biggest challenges? Managing the things academics find on the interwebnet… • Link-to materials which are not selected from library-mediated discovery systems can be the most challenging to manage
  • 8. Reasons to be fearful • Ease of Google discovery • Poor web site design • Search engine bot indexing • Copyright infringing hosts • Lack of copyright knowledge • Resistance to accept the utility of copyright • Time-poor academics • Quick-and-dirty resource selection • Lack of real-time guidance
  • 9. The link validation process at Nottingham Trent University
  • 10. Link checking and the library review process [Workflow diagram: Acquisitions, Link checking, Digitisation] • Improve and validate metadata (matching as closely as possible the requirements of the LLR Harvard citation style) • Validate URLs to be: persistent, location-independent, authentication-aware • Review OpenURL resolution (to test match full-text outcomes) • Check for copyright-compliance, and provider appropriateness
  • 11. What’s the link-checking workflow? • Metadata librarians routinely deal with link-checking processes • Exceptions and queries are referred to review meetings • Scenarios are recorded for future reference - to routinise the processing of future occurrences • Suspected infringements of copyright are recorded • End of review reports are shared with Liaison Librarians, either for reference or for further action
  • 12. Clear evidence of added-value in process • Links to materials that would otherwise break now persist • Access to content is available on-campus and off-campus • OpenURL resolution to full-text outcomes is improved • Metadata is enriched, better matching citation quality • Without additional effort on their part, lecturer’s intention is met • List quality and reputation is improved
  • 13. Q: Where is your institution on the resource list link-checking continuum? From ‘We don’t check anything – it’s the academic’s responsibility to get this right’ to ‘We check everything – we quality assure for our students’
  • 14. Issues: for librarians involved in Review • Scale: some lists can include 100+ links • Link validation process is largely manual, and requires a significant investment of staff resources • Time: ‘link checking’ staff have a large number of other commitments • Steep learning curve for processing link-to items • Application has no memory of previous decisions; no ‘knowledge base’
  • 15. Issues: for librarians working with academics • Practice of some list authors suggests training opportunity: resource selection; copyright awareness; resource discovery • Lists with large numbers of link-to items can take more time to validate than academics appreciate • Large proportions of free-on-the-web materials on resource lists might not match fee-paying student expectation?
  • 16. Challenging link-to content
  • 17. University of the Wild West. ‘Link to’ examples. Chrisard Ross, Remorse Discovery and Litigation Team Manager
  • 18. YouTube • Not posted by content owner • No indication of use rights • Clear copyright infringement [Strapline: Prefixing the duper with super]
  • 19. YouTube • YouTube can and do take action against copyright infringement and intellectual property theft [Strapline: Prefixing the duper with super]
  • 20. Don’t go there… • Site hosting full-text of movie scripts, ownership of which is held by film studios • Academic discovered ‘repository’ through Google search [Strapline: Popping the ‘x’ into ecellent]
  • 21. …it’s not safe • Domain itself is not web safe (access is blocked by YouWooWoo’s WebSense monitor) [Strapline: Popping the ‘x’ into ecellent]
  • 22. ‘Our VLE filestore is Google optimised. Who knew?’ • Article loaded into a VLE module at ‘an institution somewhere on the planet’ • Found and indexed by Google • Found and bookmarked by academic • Provenance, clearance, status - unknown [Strapline: Putting you and you into fablos]
  • 23. ‘But if I point students to Scribd, it’s all free…’ • Unmediated content hosts provide access to full-text material • Copyright is self-assigned by the uploader • Indexed by Google, found by academic [Strapline: Pouring the ‘we’ between ‘yes’ and ‘can’]
  • 24. Q: Do you face challenges managing link-to content on your lists? From ‘Never. Our academics only select from our discovery systems’ to ‘All the time. We can empathise with the experiences of YouWooWoo’
  • 25. How could Aspire help? Some ideas for discussion…
  • 26. Highlighting copyright awareness • One-time copyright responsibility tick-box on bookmarking • Include help and support links in cases of uncertainty
  • 27. Highlighting copyright awareness • Allied to a tenancy-level amber and red list of domains • Additional text prompts and action depending on match • Possible draft/publish options settable based on domain
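
The amber and red domain list proposed on slide 27 could be matched against a bookmarked URL along the following lines. This is only a sketch of the idea, not anything Aspire provides: the function names, the rating labels and the placeholder domains are all assumptions made for illustration.

```python
from urllib.parse import urlsplit

# Hypothetical tenancy-level lists; the domains are placeholders, not real ratings.
RED_DOMAINS = {"scripts.example"}
AMBER_DOMAINS = {"video.example", "docs.example"}


def domain_matches(host, listed):
    """True if host is the listed domain itself or one of its subdomains."""
    return host == listed or host.endswith("." + listed)


def rate_url(url, red=RED_DOMAINS, amber=AMBER_DOMAINS):
    """Return 'red', 'amber' or 'none' for the host part of a bookmarked URL."""
    host = urlsplit(url).hostname or ""
    if any(domain_matches(host, listed) for listed in red):
        return "red"
    if any(domain_matches(host, listed) for listed in amber):
        return "amber"
    return "none"
```

A bookmarking tool could use the returned rating to decide which text prompt to show, or whether an item starts in draft rather than published state. Matching on the dotted suffix keeps an unrelated host such as `notvideo.example` from being caught by a rule for `video.example`.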
  • 28. Managing copyright awareness • Reports on: domain popularity; addition of items from new domains (with drill through to items) • Potential for Tenancy-level domain Knowledge Base (guidance on how to approach material found on, e.g., Scribd, YouTube, et al)
  • 29. Promoting library resources • Matching keyword searching with library discovery systems • Proposing alternate resources from other lists in the Tenancy • Matching to preferred search indexes (Amazon, Google Scholar, OpenLibrary)
  • 30. Managing links – reporting on 404s • As a hosted solution, Aspire’s ability to check availability of authentication-protected resources is limited • Aspire could report on ‘Webpage’ or ‘Website’ type items (on reasonable assumption that materials are openly accessible) • Then either Aspire or customer could run checks to report on 404s (and other error responses)
  • 31. Managing links – reporting on domains • Aspire could report out on links by domain (e.g. ‘Give me all details for all Tenancy items linking to’) • Extending the logic of the existing ‘All Journal Article Items’ report
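
The checks suggested on slides 30 and 31 might look something like the sketch below, which a customer could run over an exported list of openly accessible URLs. It uses only the Python standard library; the function names, outcome labels and report shape are assumptions for illustration, and nothing here reflects an actual Aspire report or API. Some servers reject `HEAD` requests, so a real checker would probably need a `GET` fallback.

```python
from collections import defaultdict
from urllib.error import HTTPError
from urllib.parse import urlsplit
from urllib.request import Request, urlopen


def fetch_status(url, timeout=10):
    """Return the HTTP status code for url, or None if no response arrived."""
    request = Request(url, method="HEAD", headers={"User-Agent": "link-check-sketch"})
    try:
        with urlopen(request, timeout=timeout) as response:
            return response.status
    except HTTPError as error:
        return error.code  # 4xx and 5xx responses are raised as exceptions
    except (OSError, ValueError):
        return None  # DNS failure, timeout, refused connection or malformed URL


def describe(status):
    """Map a status code to a coarse outcome label (labels are illustrative)."""
    if status is None:
        return "no response"
    if status == 404:
        return "not found"
    if status in (401, 403):
        return "access restricted"
    if status >= 400:
        return "other error"
    return "ok"


def report_by_domain(urls, fetch=fetch_status):
    """Group every link that did not come back 'ok' under its host name."""
    report = defaultdict(list)
    for url in urls:
        outcome = describe(fetch(url))
        if outcome != "ok":
            host = urlsplit(url).hostname or "(no host)"
            report[host].append((url, outcome))
    return dict(report)
```

The `fetch` parameter is there so the grouping logic can be exercised without network access, for example by passing a dictionary's `get` method that maps URLs to canned status codes. A 401 or 403 from a check like this does not necessarily mean the link is broken, which is the limitation slide 30 notes for authentication-protected resources.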
  • 32. Control over links displayed in item record • Extend the recently released option to select preferred ‘Online resource’ link from the item record • Retain URIs in item metadata in full view, but have the option to select ones not to display – OpenURL link, dx DOI link, deep-link URL based on best discovery link (default to show all)
  • 33. How could Aspire help? Some ideas for discussion…
  • 34. Questions or comments? NTU Resource Lists. Richard Cross, Resource Discovery and Innovation Team Manager, Nottingham Trent University