Being a Good Data Provider, by Alastair Dunning

Making sure your content is licensed and discoverable

A presentation from the JISC Programme Meeting for its Content Programme 2011: http://www.jisc.ac.uk/whatwedo/programmes/digitisation/econtent11.aspx




Usage Rights

CC Attribution-NonCommercial License


    Presentation Transcript

    • Being a Good Data Provider
      • Alastair Dunning, JISC Programme Manager - Digitisation
      • a.dunning AT jisc.ac.uk, 0203 006 6065
      • November 2011, Oxford
      • This presentation is intended to give some brief advice for those publishing digital content (digital images, cultural heritage, scholarly information etc.) on the Internet
    • Outline
      • Being a Good Data Provider: A simple thing gets complex
      • Cool URIs
      • Being Friends with Google, Is Google Enough?
      • International Portals
      • Geographies
      • Re-Use and APIs
      • Licensing
    • Cool URIs
      • http://www.ariadne.ac.uk/issue31/web-focus/
      • URI (Uniform Resource Identifier) refers to the "generic set of all names/addresses that are short strings that refer to resources" whereas URL (Uniform Resource Locator) is "an informal term (no longer used in technical specifications) associated with popular URI schemes: http, ftp, mailto, etc."
      • Keep them stable, memorable and consistent – develop a short URI policy
    • Cool URIs
      • Where do URIs get quoted? – Often taken out of their environment
        • Publicity material – expensive to reprint
        • Academic Citations – damages scholarly trust (plus citation guidelines?)
        • Bookmarks within browser or on social bookmarking sites
        • Emails (therefore fewer than 76 characters, avoid underscores)
        • Blogs and other URIs
        • By search engines – loss will inhibit resource discovery
        • Guesswork – users guess at URIs, often when running queries against a database – use redirects and good 404 pages
      • Good Example – BBC website
      • Bad Example …
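The two concrete rules on this slide (fewer than 76 characters for email, no underscores) are easy to check automatically as part of a short URI policy. A minimal sketch in Python, assuming those two rules plus a basic "is it an absolute URI" check; the function name `uri_policy_problems` and the `example.org` addresses are hypothetical and not taken from the presentation.

```python
from urllib.parse import urlparse

# From the slide: URIs quoted in emails should be fewer than 76 characters.
MAX_LENGTH = 76


def uri_policy_problems(uri: str) -> list[str]:
    """Return the policy problems found in a URI (an empty list means it passes)."""
    problems = []
    if len(uri) >= MAX_LENGTH:
        problems.append(f"too long for email ({len(uri)} characters)")
    if "_" in uri:
        problems.append("contains an underscore")
    # Assumption (not on the slide): a citable URI should be absolute.
    parsed = urlparse(uri)
    if not parsed.scheme or not parsed.netloc:
        problems.append("not an absolute URI")
    return problems


if __name__ == "__main__":
    for candidate in ("http://example.org/plays/hamlet-1965",
                      "http://example.org/cast_list"):
        print(candidate, uri_policy_problems(candidate))
```

A check like this could run over every published URI before launch, so that problems are caught before the addresses appear in print or in citations.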
    • Item level not collection level
      • Users may have no interest in the general resource but plenty of interest in a particular item
      • Designing Shakespeare – Shakespeare performed in London & Stratford, 1960 – 2000, 1000s of plays
      • Researchers & teachers interested in general resource
      • Actors interested in specific performances. Needed stable URIs for cast lists and photos
    • Being friends with Google
      • No need to explain the importance of exposing content and metadata to Google – many users have Google as their principal springboard for digital information
      • Even if using authentication, expose metadata
      • Make sure your database is easily queried by robots like Google
      • Optimisation is complex and depends on a good communications process
        • Use established URIs – Ensure your website is trusted
        • Get incoming links from other trusted sources – this drives up traffic via Google and via the original sites themselves
      • Strategic Content Alliance / Netskills training and documentation
    • Being friends with Google
      • Give distinctive <title> to each page – helps with clarity on Google
      • Use Google Sitemaps to upload details of your pages
      • Google Analytics can help with measuring web usage
      • Google Maps, Google Scholar?
      • http://www.google.com/publicsector
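A sitemap is a plain XML list of page addresses, so it can be generated directly from the list of item URIs in a collection. A minimal sketch using only the Python standard library and the public sitemaps.org schema; the function name `build_sitemap` and the `example.org` URIs are hypothetical, and submitting the resulting file to a search engine is a separate step not shown here.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(uris: list[str]) -> str:
    """Serialise a list of page URIs as a sitemaps.org <urlset> document."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for uri in uris:
        url = ET.SubElement(urlset, "url")
        ET.SubElement(url, "loc").text = uri  # one <loc> per item-level page
    return ET.tostring(urlset, encoding="unicode")


if __name__ == "__main__":
    print(build_sitemap([
        "http://example.org/items/1",
        "http://example.org/items/2",
    ]))
```

Listing item-level pages, rather than only the collection home page, fits the earlier point that users are often interested in a particular item.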
    • Is Google everything?
      • Recommendation by peers and other respected persons gets resources used
      • Marketing a resource needs an integrated strategy, involving both technical and ‘academic’ integration
      • Workshop will be held in this area for all JISC projects in this programme
      Source – Lesly Huxley et al (2007): Gathering evidence: Current ICT use and future needs for arts and humanities researchers
    • Is Google everything?
      • How is your collection integrated into the library catalogue?
      • How does your resource fit in with other resources?
      Source – Mark Greengrass et al (2007): RePAH: A User Requirements Analysis for Portals in the Arts and Humanities
      “Resource discovery and use would be increased by separate collections being aggregated logically based on their content” – Recommendation 3, Daisy Abbott (2008): Digital Repositories and Archives Inventory
    • Working with Aggregators
      • CultureGrid - http://www.culturegrid.org.uk/
        • UK aggregator of cultural heritage material
        • Large-scale harvest of digital resources
        • Works well for images and multimedia
        • CultureGrid then exposes metadata to Europeana
      • WorldCat - http://www.worldcat.org/librarians/default.jsp
        • Bibliographic data - both digital and not digital
        • Metadata exposed via Registry of Digital Masters
        • Requires membership – so best done via institution
    • Aggregators
      • Other options
        • Archives Hub, http://archiveshub.ac.uk/
        • Connected Histories, British History 1500 – 1900
        • JISC Historic Books, JISC MediaHub
      • Other options exist and will emerge, particularly within specific subject fields and areas of interest.
      • Key is to have easily exposable or transferable metadata
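"Easily exposable or transferable metadata" usually means records that can be written out in a common, documented format. A minimal sketch, assuming (this is not stated in the presentation) that simple Dublin Core elements are an acceptable exchange format; the wrapper element `record`, the function name and the sample values are all hypothetical, and each aggregator will publish its own requirements.

```python
import xml.etree.ElementTree as ET

# Namespace of the Dublin Core Metadata Element Set, version 1.1.
DC_NS = "http://purl.org/dc/elements/1.1/"
ET.register_namespace("dc", DC_NS)


def record_to_dc_xml(record: dict[str, str]) -> str:
    """Write a flat metadata record as Dublin Core elements inside a wrapper.

    Keys are assumed to be Dublin Core element names such as
    "title", "identifier" or "rights".
    """
    root = ET.Element("record")  # hypothetical wrapper element
    for field, value in record.items():
        ET.SubElement(root, f"{{{DC_NS}}}{field}").text = value
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    print(record_to_dc_xml({
        "title": "Example cast list",
        "identifier": "http://example.org/items/1",
        "rights": "CC Attribution-NonCommercial",
    }))
```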
    • Geographies
      • “80% of data has a geographical component” … possibly
      • Lists, text and words can be confusing to navigate
      • Maps have a simplicity which many, but not all, find engaging
      • Examples - BL Sound Archive, Population Reports online, Flickr
      • It’s about visualising your data in different ways … time is also a powerful metaphor
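Putting items on a map mostly comes down to attaching coordinates to each record and exporting them in a format that mapping tools can read. A minimal sketch that converts records to GeoJSON, assuming each record already carries `lat` and `lon` values; the field names, function name and sample record are hypothetical.

```python
import json


def records_to_geojson(records: list[dict]) -> str:
    """Convert records with lat/lon fields into a GeoJSON FeatureCollection."""
    features = [
        {
            "type": "Feature",
            # GeoJSON orders coordinates as [longitude, latitude].
            "geometry": {"type": "Point", "coordinates": [r["lon"], r["lat"]]},
            "properties": {"title": r["title"], "uri": r["uri"]},
        }
        for r in records
    ]
    return json.dumps({"type": "FeatureCollection", "features": features}, indent=2)


if __name__ == "__main__":
    print(records_to_geojson([
        {"title": "Example recording", "uri": "http://example.org/items/1",
         "lat": 51.75, "lon": -1.26},
    ]))
```

Keeping the item URI in the properties means each point on the map can link back to a stable item-level page.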
    • Geographies
    • Application Programming Interfaces (API)
      • “The best use of your data will be thought of by someone else”
      • Separating data from its interface
      • Publishing each strand of metadata as a separate URI
      • Allows others to build interfaces over your data (and edit / annotate your data, if you want)
      • Requires a certain amount of technical knowledge to set up, and institutional belief
      • Good example – http://www.vam.ac.uk/api
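Separating data from its interface means that each record can be requested at its own URI and comes back as structured data rather than as a web page. A minimal sketch of that idea in plain Python, with an in-memory dictionary standing in for the collection database; the `/api/items/` path, the `ITEMS` store and the function name are hypothetical and do not describe any particular institution's API.

```python
import json

# Hypothetical in-memory store standing in for the collection database.
ITEMS = {
    "1": {"title": "Example item", "creator": "Unknown"},
}

PREFIX = "/api/items/"


def get_item(path: str) -> tuple[int, str]:
    """Resolve a path such as /api/items/1 to an (HTTP status, JSON body) pair."""
    if not path.startswith(PREFIX):
        return 404, json.dumps({"error": "not found"})
    item = ITEMS.get(path[len(PREFIX):])
    if item is None:
        return 404, json.dumps({"error": "not found"})
    # Include the record's own URI so that clients can cite or link to it.
    return 200, json.dumps(dict(item, uri=path))


if __name__ == "__main__":
    print(get_item("/api/items/1"))
    print(get_item("/api/items/99"))
```

Because the function returns data only, the same records can feed the project's own website and any interface that someone else builds later.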
    • Licensing
      • A different challenge for re-use – making sure people know what they can do with your content
      • Licensing in – clearing third party rights
      • Licensing out – what can your users do
        • Possibilities – re-use in educational context, remashing (including editing, cropping, rearranging), commercial use, anything, attribution
        • Various existing licences – worth exploring Creative Commons
        • Other options may be required for third-party material
      • Clarity over this is essential to avoid user confusion and legal ramifications
      • But all JISC projects must indicate what can be done to their content
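One way to indicate what can be done with content is to put a licence statement on every item page, with a link marked `rel="license"` so that software as well as people can find it. A minimal sketch, assuming a Creative Commons Attribution-NonCommercial licence purely as an example; the function name and the sample title and holder are hypothetical, and the right licence for a project depends on its own rights clearance.

```python
from html import escape

# Creative Commons Attribution-NonCommercial 3.0 Unported, used only as an example.
CC_BY_NC = "https://creativecommons.org/licenses/by-nc/3.0/"


def licence_statement(title: str, holder: str, licence_uri: str = CC_BY_NC) -> str:
    """Build an HTML licence statement with a machine-readable rel="license" link."""
    return (
        f"<p>{escape(title)} by {escape(holder)} is licensed under "
        f'<a rel="license" href="{escape(licence_uri)}">{escape(licence_uri)}</a>.</p>'
    )


if __name__ == "__main__":
    print(licence_statement("Example cast list", "Example Project"))
```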
    • Discovery Principles
      • Since this presentation was written, many of its points have been embedded in the Technical Principles for the Discovery Ecosystem
      • Sounds nerdy.
      • Isn’t.
      • Okay maybe it is.
      • But vital for projects with data to expose to users
    • In Summary
      • Irrespective of the type of content ...
      • Cool URIs
      • Being Friends with Google, Is Google Enough?
      • International Portals
      • Geographies
      • Re-Use and APIs
      • Licensing