Being a Good Data Provider

Usage Rights

CC Attribution-NonCommercial-NoDerivs License


  • Being a Good Data Provider
    • Alastair Dunning, JISC Programme Manager - Digitisation
    • a.dunning AT , 0203 006 6065
    • March 2011, Oxford
    • This presentation is intended to give some brief advice for those publishing digital content (digital images, cultural heritage, scholarly information etc.) on the Internet
  • Outline
    • Being a Good Data Provider: A simple thing gets complex
    • Cool URIs
    • Being Friends with Google, Is Google Enough?
    • International Portals
    • Geographies
    • Re-Use and APIs
    • Licensing
  • Cool URIs
    • URI (Uniform Resource Identifier) refers to the "generic set of all names/addresses that are short strings that refer to resources", whereas URL (Uniform Resource Locator) is "an informal term (no longer used in technical specifications) associated with popular URI schemes: http, ftp, mailto, etc."
    • Keep them stable, memorable and consistent – develop a short URI policy
  • Cool URIs
    • Where do URIs get quoted? – Often taken out of their environment
      • Publicity material – expensive to reprint
      • Academic Citations – damages scholarly trust (plus citation guidelines?)
      • Bookmarks within browser or on social bookmarking sites
      • Emails (so keep URIs under 76 characters and avoid underscores)
      • Blogs and other URIs
      • By search engines – loss will inhibit resource discovery
      • Guesswork – users make guesses at URIs – use redirects and good 404 pages
    • Good Example – BBC website
    • Bad Example …
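A short URI policy can be checked mechanically. The sketch below is one possible reading of the advice above, not a definitive rule set: the 76-character limit and the underscore rule come from the slide, while the check for technology-specific file extensions is an added, hypothetical house rule. The function name and the example.org URIs are illustrative only.

```python
from urllib.parse import urlsplit

MAX_LENGTH = 76  # slide: URIs quoted in emails should be under 76 characters
# Hypothetical house rule (not from the slides): extensions that expose the
# technology behind a page tend to change when the technology changes.
FRAGILE_EXTENSIONS = (".php", ".asp", ".jsp", ".cgi")


def uri_policy_problems(uri: str) -> list[str]:
    """Return the policy problems found in a URI (empty list if none)."""
    problems = []
    if len(uri) >= MAX_LENGTH:
        problems.append("too long for email")
    if "_" in uri:
        problems.append("contains underscore")
    if urlsplit(uri).path.lower().endswith(FRAGILE_EXTENSIONS):
        problems.append("technology-specific extension")
    return problems


if __name__ == "__main__":
    for candidate in (
        "http://example.org/plays/hamlet/1976",
        "http://example.org/show_item.php",
    ):
        print(candidate, uri_policy_problems(candidate))
```

Running a check like this over every published URI before launch is cheaper than reprinting publicity material later.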
  • Item level not collection level
    • Users may have no interest in the general resource but plenty of interest in a particular item
    • Designing Shakespeare – Shakespeare performed in London & Stratford, 1960 – 2000, 1000s of plays
    • Researchers & teachers interested in general resource
    • Actors interested in specific performances. Needed stable URIs for cast lists and photos
  • Being friends with Google
    • No need to explain the importance of exposing content and metadata to Google – many users have Google as their principal springboard for digital information
    • Even if using authentication, expose metadata
    • Make sure your database is easily queried by robots like Google
    • Optimisation is complex and depends on a good communications process
      • Use established URIs – Ensure your website is trusted
      • Get incoming links from other trusted sources – this drives up traffic via Google and via the original sites themselves
    • Strategic Content Alliance / Netskills training and documentation
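One way to act on "even if using authentication, expose metadata" is to let crawlers reach the metadata pages while keeping them out of the protected content. The sketch below uses Python's standard `urllib.robotparser` to test a robots.txt file before publishing it. The `/records/` and `/fulltext/` paths are hypothetical, and the split between them is an assumption about how a site might be organised.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical layout: /records/ holds open metadata pages,
# /fulltext/ holds content that sits behind authentication.
ROBOTS_TXT = """\
User-agent: *
Disallow: /fulltext/
Allow: /records/
"""


def crawler_can_fetch(robots_txt: str, url: str, agent: str = "Googlebot") -> bool:
    """Report whether the given robots.txt text lets a crawler fetch the URL."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)


if __name__ == "__main__":
    for url in (
        "http://example.org/records/123",
        "http://example.org/fulltext/123",
    ):
        print(url, crawler_can_fetch(ROBOTS_TXT, url))
```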
  • Being friends with Google
    • Give distinctive <title> to each page – helps with clarity on Google
    • Use Google Sitemaps to upload details of your pages
    • Google Analytics can help with measuring web usage
    • Google Maps, Google Scholar?
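A sitemap is an XML file listing the pages you want indexed. As a minimal sketch, assuming your item URIs and last-modified dates can be read out of your database, the file can be generated with the standard library. The function name and the example.org URIs are illustrative; the namespace is the one defined by the public sitemaps.org protocol.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(pages: list[tuple[str, str]]) -> str:
    """Build sitemap XML from (absolute URL, YYYY-MM-DD last-modified) pairs."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for loc, lastmod in pages:
        url = ET.SubElement(urlset, "url")
        ET.SubElement(url, "loc").text = loc
        ET.SubElement(url, "lastmod").text = lastmod
    return ET.tostring(urlset, encoding="unicode")


if __name__ == "__main__":
    print(build_sitemap([
        ("http://example.org/items/1", "2011-03-01"),
        ("http://example.org/items/2", "2011-03-02"),
    ]))
```

Listing item-level URIs, not just the collection home page, matches the earlier point that users often want a particular item.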
  • Is Google everything?
    • Recommendation by peers and other respected persons gets resources used
    • Marketing a resource requires an integrated strategy that involves both technical and ‘academic’ integration
    • A workshop will be held in this area for all JISC projects in this programme
    Source – Lesly Huxley et al (2007): Gathering evidence: Current ICT use and future needs for arts and humanities researchers
  • Is Google everything?
    • How is your collection integrated into the library catalogue?
    • How does your resource fit in with other resources?
    Source – Mark Greengrass et al (2007): RePAH: A User Requirements Analysis for Portals in the Arts and Humanities
    “Resource discovery and use would be increased by separate collections being aggregated logically based on their content” Recommendation 3 – Daisy Abbott (2008): Digital Repositories and Archives Inventory
  • Working with Aggregators
    • CultureGrid
      • UK aggregator of cultural heritage material
      • Large-scale harvest of digital resources
      • Works well for images and multimedia
      • CultureGrid then exposes metadata to Europeana
    • WorldCat
      • Bibliographic data - both digital and not digital
      • Metadata exposed via Registry of Digital Masters
      • Requires membership – so best done via institution
  • Aggregators
    • Other options
      • Archives Hub
      • Connected Histories, British History 1500 – 1900
      • JISC Historic Books, JISC MediaHub
    • Other options exist and will emerge, particularly within specific subject fields and areas of interest.
    • Key is to have easily exposable or transferable metadata
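"Easily exposable or transferable" usually means the metadata can be written out in a format an aggregator already understands. The slides do not name a format, so the sketch below assumes simple Dublin Core element names as one common choice; the `record` wrapper element, the field list, and the sample record are hypothetical.

```python
import xml.etree.ElementTree as ET

# Assumption: Dublin Core elements are used as the exchange vocabulary.
DC_NS = "http://purl.org/dc/elements/1.1/"
ET.register_namespace("dc", DC_NS)

DC_FIELDS = ("title", "creator", "date", "identifier", "rights")


def record_to_dc_xml(record: dict[str, str]) -> str:
    """Serialise the Dublin Core fields present in a record as XML."""
    root = ET.Element("record")  # hypothetical wrapper element
    for field in DC_FIELDS:
        if field in record:
            ET.SubElement(root, f"{{{DC_NS}}}{field}").text = record[field]
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    print(record_to_dc_xml({
        "title": "Example item",
        "identifier": "http://example.org/items/1",
        "rights": "http://creativecommons.org/licenses/by-nc-nd/3.0/",
    }))
```

Keeping the export separate from the website code means the same records can be handed to more than one aggregator.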
  • Geographies
    • “80% of data has a geographical component” … possibly
    • Lists, text and words can be confusing to navigate
    • Maps have a simplicity which many, but not all, find engaging
    • Examples - BL Sound Archive, Population Reports online, Flickr
    • It’s about visualising your data in different ways … time is also a powerful metaphor
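If items carry a latitude and longitude, putting them on a map is mostly a matter of exporting them in a format mapping tools can read. As a sketch, assuming each item is a dictionary with `title`, `uri`, `lat` and `lon` keys (hypothetical field names), the records can be written as a GeoJSON FeatureCollection. Note that GeoJSON orders coordinates as longitude, then latitude.

```python
import json


def items_to_geojson(items: list[dict]) -> str:
    """Convert items with lat/lon fields into a GeoJSON FeatureCollection."""
    features = []
    for item in items:
        features.append({
            "type": "Feature",
            # GeoJSON positions are [longitude, latitude]
            "geometry": {"type": "Point", "coordinates": [item["lon"], item["lat"]]},
            "properties": {"title": item["title"], "uri": item["uri"]},
        })
    return json.dumps({"type": "FeatureCollection", "features": features})


if __name__ == "__main__":
    sample = [{"title": "Example recording", "uri": "http://example.org/items/1",
               "lat": 51.752, "lon": -1.258}]
    print(items_to_geojson(sample))
```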
  • Geographies
  • Application Programming Interfaces (API)
    • “The best use of your data will be thought of by someone else”
    • Separating data from its interface
    • Publishing each strand of metadata as a separate URI
    • Allows others to build interfaces over your data (and edit / annotate your data, if you want)
    • Requires a certain amount of technical knowledge to set up, as well as institutional belief
    • Good example –
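To illustrate "separating data from its interface" and "publishing each strand of metadata as a separate URI", the sketch below resolves paths such as `/items/42` and `/items/42/cast` to JSON. It is a toy resolver under stated assumptions, not a production API: the `ITEMS` store, the path scheme and the sample record are all hypothetical, and a real service would sit behind a web framework.

```python
import json

# Hypothetical in-memory store standing in for the collection database.
ITEMS = {
    "42": {"title": "Example performance", "year": 1976,
           "cast": ["Actor A", "Actor B"]},
}


def resolve(path: str) -> tuple[int, str]:
    """Return (HTTP status, JSON body) for /items/<id> or /items/<id>/<field>."""
    parts = [p for p in path.split("/") if p]
    if len(parts) in (2, 3) and parts[0] == "items" and parts[1] in ITEMS:
        item = ITEMS[parts[1]]
        if len(parts) == 2:
            return 200, json.dumps(item)  # the whole record
        if parts[2] in item:
            # one strand of metadata, addressable on its own
            return 200, json.dumps({parts[2]: item[parts[2]]})
    return 404, json.dumps({"error": "not found"})


if __name__ == "__main__":
    for path in ("/items/42", "/items/42/cast", "/items/99"):
        print(path, resolve(path))
```

Because the data is returned without any page layout, someone else can build a different interface over the same URIs.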
  • Licensing
    • A different challenge for re-use – making sure people know what they can do with your content
    • Licensing in – clearing third party rights
    • Licensing out – what can your users do
      • Possibilities – re-use in educational context, remashing (including editing, cropping, rearranging), commercial use, anything, attribution
      • Various existing licences – worth exploring Creative Commons
      • Other options may be required for third-party material
    • Clarity over this is essential to avoid user confusion and legal ramifications
    • But all JISC projects must indicate what can be done to their content
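One lightweight way to indicate what can be done with content is to put a licence link on every item page, marked up so that software as well as people can find it. The sketch below assumes HTML pages and uses the `rel="license"` link convention; the helper name is hypothetical, and the Creative Commons licence shown is only an example of a choice a project might make.

```python
from html import escape


def licence_link(name: str, url: str) -> str:
    """Return an HTML link that marks the target as the page's licence."""
    return f'<a rel="license" href="{escape(url, quote=True)}">{escape(name)}</a>'


if __name__ == "__main__":
    print(licence_link(
        "CC Attribution-NonCommercial-NoDerivs",
        "https://creativecommons.org/licenses/by-nc-nd/3.0/",
    ))
```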
  • In Summary
    • Irrespective of the type of content ...
    • Cool URIs
    • Being Friends with Google, Is Google Enough?
    • International Portals
    • Geographies
    • Re-Use and APIs
    • Licensing