Being a Good Data Provider



  1. 1. Being a Good Data Provider Alastair Dunning JISC Programme Manager - Digitisation a.dunning AT , 0203 006 6065 March 2011, Oxford This presentation is intended to give some brief advice for those publishing digital content (digital images, cultural heritage, scholarly information etc.) on the Internet
  2. 2. Outline <ul><li>Being a Good Data Provider: A simple thing gets complex </li></ul><ul><li>Cool URIs </li></ul><ul><li>Being Friends with Google, Is Google Enough? </li></ul><ul><li>International Portals </li></ul><ul><li>Geographies </li></ul><ul><li>Re-Use and APIs </li></ul><ul><li>Licensing </li></ul>
  3. 3. Cool URIs <ul><li>URI ( Uniform Resource Identifier ) refers to the &quot;generic set of all names/addresses that are short strings that refer to resources&quot; whereas URL ( Uniform Resource Locator ) is &quot;an informal term (no longer used in technical specifications) associated with popular URI schemes: http, ftp, mailto, etc.&quot; </li></ul><ul><li>Keep them stable, memorable and consistent – develop a short URI policy </li></ul>
  4. 4. Cool URIs <ul><li>Where do URIs get quoted? – Often taken out of their environment </li></ul><ul><ul><li>Publicity material – expensive to reprint </li></ul></ul><ul><ul><li>Academic Citations – damages scholarly trust (plus citation guidelines?) </li></ul></ul><ul><ul><li>Bookmarks within browser or on social bookmarking sites </li></ul></ul><ul><ul><li>Emails (therefore less than 76 characters, avoid underscores) </li></ul></ul><ul><ul><li>Blogs and other URIs </li></ul></ul><ul><ul><li>By search engines – loss will inhibit resource discovery </li></ul></ul><ul><ul><li>Guesswork – users make guesses at URIs – use redirects and good 404 pages </li></ul></ul><ul><li>Good Example – BBC website </li></ul><ul><li>Bad Example … </li></ul>
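A short URI policy can be checked automatically. The following is a minimal Python sketch, not a definitive rule set: the 76-character limit and the underscore rule come from the slide above, while the lower-case and no-query-string rules, the function name `policy_problems` and the `example.org` addresses are assumptions added for illustration.

```python
from urllib.parse import urlsplit

MAX_EMAIL_SAFE_LENGTH = 76  # the slide asks for URIs of less than 76 characters


def policy_problems(uri: str) -> list[str]:
    """Return the (illustrative) policy rules that a URI breaks."""
    problems = []
    parts = urlsplit(uri)
    if len(uri) >= MAX_EMAIL_SAFE_LENGTH:
        problems.append("too long for email")
    if "_" in uri:
        problems.append("contains underscore")
    # Assumption: lower-case paths are easier to remember and to guess.
    if parts.path != parts.path.lower():
        problems.append("mixed case")
    # Assumption: query strings tend to expose the underlying software.
    if parts.query:
        problems.append("query string")
    return problems


if __name__ == "__main__":
    for candidate in (
        "http://example.org/plays/hamlet-1964",
        "http://example.org/Plays/show_item.php?id=42",
    ):
        print(candidate, policy_problems(candidate))
```

Running a check like this over every published URI before launch is cheaper than reprinting publicity material afterwards.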
  5. 5. Item level not collection level <ul><li>Users may have no interest in the general resource but plenty of interest in a particular item </li></ul><ul><li>Designing Shakespeare – Shakespeare performed in London &amp; Stratford, 1960 – 2000, 1000s of plays </li></ul><ul><li>Researchers &amp; teachers interested in general resource </li></ul><ul><li>Actors interested in specific performances, so stable URIs were needed for cast lists and photos </li></ul>
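Item-level URIs stay stable only if retired addresses keep working. The sketch below shows one possible way to combine permanent redirects, tolerance of guessed URIs and a helpful 404 page. All paths, the `REDIRECTS` table and the `resolve` function are hypothetical; a real site would usually do this in its web server or framework configuration.

```python
# Hypothetical current item-level paths.
KNOWN_PATHS = {"/performances/1234", "/performances/1234/photos"}

# Hypothetical map from retired paths to their stable replacements.
REDIRECTS = {
    "/show.php?id=1234": "/performances/1234",
    "/photos/old/1234.html": "/performances/1234/photos",
}

NOT_FOUND_PAGE = "/not-found"  # a 404 page that offers search and browse links


def resolve(path: str) -> tuple[int, str]:
    """Return an HTTP status code and the path the user should end up on."""
    if path in KNOWN_PATHS:
        return 200, path
    if path in REDIRECTS:
        return 301, REDIRECTS[path]  # permanent redirect keeps old citations alive
    # Tolerate common guesses: a trailing slash or different capitalisation.
    guess = path.rstrip("/").lower()
    if guess in KNOWN_PATHS:
        return 301, guess
    return 404, NOT_FOUND_PAGE


if __name__ == "__main__":
    for p in ("/performances/1234", "/show.php?id=1234", "/Performances/1234/", "/nothing"):
        print(p, resolve(p))
```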
  6. 6. Being friends with Google <ul><li>No need to explain the importance of exposing content and metadata to Google – many users have Google as their principal springboard for digital information </li></ul><ul><li>Even if using authentication, expose metadata </li></ul><ul><li>Make sure your database is easily queried by robots like Google </li></ul><ul><li>Optimisation is complex and depends on a good communications process </li></ul><ul><ul><li>Use established URIs – Ensure your website is trusted </li></ul></ul><ul><ul><li>Get incoming links from other trusted sources – this drives up traffic via Google and via the original sites themselves </li></ul></ul><ul><li>Strategic Content Alliance / Netskills training and documentation </li></ul>
  7. 7. Being friends with Google <ul><li>Give a distinctive &lt;title&gt; to each page – helps with clarity on Google </li></ul><ul><li>Use Google Sitemaps to upload details of your pages </li></ul><ul><li>Google Analytics can help with measuring web usage </li></ul><ul><li>Google Maps, Google Scholar? </li></ul>
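A sitemap is a plain XML list of page addresses. The sketch below builds one with Python's standard library, using the `urlset`, `url` and `loc` elements and the namespace defined by the sitemaps.org protocol. The function name `build_sitemap` and the `example.org` addresses are assumptions for illustration; a production sitemap would normally be generated from the collection database.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(urls):
    """Return a sitemap document (as a string) listing the given page URLs."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for url in urls:
        entry = ET.SubElement(urlset, "url")
        ET.SubElement(entry, "loc").text = url  # special characters are escaped on output
    return ET.tostring(urlset, encoding="unicode")


if __name__ == "__main__":
    print(build_sitemap([
        "http://example.org/performances/1234",
        "http://example.org/performances/1235",
    ]))
```

Listing item-level pages, and not only the home page, is what lets a search engine reach individual records.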
  8. 8. Is Google everything? <ul><li>Recommendation by peers and other respected persons gets resources used </li></ul><ul><li>Marketing a resource needs an integrated strategy, involving both technical and ‘academic’ integration </li></ul><ul><li>Workshop will be held in this area for all JISC projects in this programme </li></ul>Source – Lesly Huxley et al (2007): Gathering evidence: Current ICT use and future needs for arts and humanities researchers
  9. 9. Is Google everything? <ul><li>How is your collection integrated into the library catalogue? </li></ul><ul><li>How does your resource fit in with other resources? </li></ul>Source – Mark Greengrass et al (2007): RePAH: A User Requirements Analysis for Portals in the Arts and Humanities “Resource discovery and use would be increased by separate collections being aggregated logically based on their content” Recommendation 3 – Daisy Abbott (2008): Digital Repositories and Archives Inventory
  10. 10. Working with Aggregators <ul><li>CultureGrid - </li></ul><ul><ul><li>UK aggregator of cultural heritage material; large-scale harvest of digital resources </li></ul></ul><ul><ul><li>Works well for images and multimedia </li></ul></ul><ul><ul><li>CultureGrid then exposes metadata to Europeana </li></ul></ul><ul><li>WorldCat - </li></ul><ul><ul><li>Bibliographic data - both digital and non-digital </li></ul></ul><ul><ul><li>Metadata exposed via Registry of Digital Masters </li></ul></ul><ul><ul><li>Requires membership – so best done via institution </li></ul></ul>
  11. 11. Aggregators <ul><li>Other options </li></ul><ul><ul><li>Archives Hub </li></ul></ul><ul><ul><li>Connected Histories, British History 1500 – 1900 </li></ul></ul><ul><ul><li>JISC Historic Books, JISC MediaHub </li></ul></ul><ul><li>Other options exist and will emerge, particularly within specific subject fields and areas of interest. </li></ul><ul><li>Key is to have easily exposable or transferable metadata </li></ul>
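One common way to make metadata transferable is to map internal fields onto a shared vocabulary such as Dublin Core. The sketch below is only an illustration of that idea: the Dublin Core element namespace is real, but the `record` wrapper element, the `to_dublin_core` function and the sample item are assumptions, and each aggregator will specify its own required format.

```python
import xml.etree.ElementTree as ET

DC_NS = "http://purl.org/dc/elements/1.1/"  # Dublin Core Metadata Element Set
ET.register_namespace("dc", DC_NS)

# Subset of Dublin Core elements used in this sketch.
DC_FIELDS = ("title", "creator", "date", "identifier", "rights")


def to_dublin_core(item: dict) -> str:
    """Serialise the Dublin Core fields of an item as a small XML record."""
    record = ET.Element("record")  # hypothetical wrapper element
    for field in DC_FIELDS:
        if field in item:
            ET.SubElement(record, f"{{{DC_NS}}}{field}").text = str(item[field])
    return ET.tostring(record, encoding="unicode")


if __name__ == "__main__":
    print(to_dublin_core({
        "title": "Hamlet, production photograph",
        "date": 1964,
        "identifier": "http://example.org/performances/1234/photos",
        "internal_shelfmark": "not exported",  # fields outside DC_FIELDS are skipped
    }))
```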
  12. 12. Geographies <ul><li>“80% of data has a geographical component” … possibly </li></ul><ul><li>Lists, text and words can be confusing to navigate </li></ul><ul><li>Maps have a simplicity which many, but not all, find engaging </li></ul><ul><li>Examples - BL Sound Archive, Population Reports online, Flickr </li></ul><ul><li>It’s about visualising your data in different ways … time is also a powerful metaphor </li></ul>
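If records carry coordinates, they can be exported in a format that mapping tools read directly. The sketch below converts records to GeoJSON, which stores a point as longitude followed by latitude. The record fields (`title`, `year`, `lat`, `lon`), the function name and the sample data are assumptions for illustration; keeping the year in the properties leaves room for a timeline view as well.

```python
import json


def to_geojson(items):
    """Convert records with lat/lon fields into a GeoJSON FeatureCollection."""
    features = []
    for item in items:
        features.append({
            "type": "Feature",
            # GeoJSON order is [longitude, latitude], not the more familiar lat/lon.
            "geometry": {"type": "Point", "coordinates": [item["lon"], item["lat"]]},
            "properties": {"title": item["title"], "year": item["year"]},
        })
    return {"type": "FeatureCollection", "features": features}


if __name__ == "__main__":
    sample = [{"title": "Hamlet", "year": 1964, "lat": 51.5, "lon": -0.12}]
    print(json.dumps(to_geojson(sample), indent=2))
```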
  13. 13. Geographies
  14. 14. Application Programming Interfaces (API) <ul><li>“The best use of your data will be thought of by someone else” </li></ul><ul><li>Separating data from its interface </li></ul><ul><li>Publishing each strand of metadata as a separate URI </li></ul><ul><li>Allows others to build interfaces over your data (and edit / annotate your data, if you want) </li></ul><ul><li>Requires a certain amount of technical knowledge to set up, and institutional belief </li></ul><ul><li>Good example – </li></ul>
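Separating data from its interface can be as simple as returning records as JSON at predictable URIs, with each strand of metadata addressable on its own. The sketch below shows the idea as a plain function, without a web server. The `/items/...` path scheme, the `get` function and the sample record are hypothetical; this is one possible design under those assumptions, not a description of any particular project's API.

```python
import json

# Hypothetical data store: one record keyed by item identifier.
ITEMS = {
    "1234": {"title": "Hamlet", "venue": "Example Theatre", "year": 1964},
}


def get(path: str) -> tuple[int, str]:
    """Answer a GET request with an HTTP status code and a JSON body."""
    parts = [p for p in path.split("/") if p]
    if len(parts) >= 2 and parts[0] == "items" and parts[1] in ITEMS:
        record = ITEMS[parts[1]]
        if len(parts) == 2:
            return 200, json.dumps(record)  # whole record: /items/1234
        if len(parts) == 3 and parts[2] in record:
            # One strand of metadata at its own URI: /items/1234/title
            return 200, json.dumps({parts[2]: record[parts[2]]})
    return 404, json.dumps({"error": "not found"})


if __name__ == "__main__":
    for p in ("/items/1234", "/items/1234/title", "/items/9999"):
        print(p, get(p))
```

Because the function returns data only, anyone can build a map, timeline or search page over the same URIs without touching the original website.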
  15. 15. Licensing <ul><li>A different challenge for re-use – making sure people know what they can do with your content </li></ul><ul><li>Licensing in – clearing third party rights </li></ul><ul><li>Licensing out – what can your users do </li></ul><ul><ul><li>Possibilities – re-use in educational context, remashing (including editing, cropping, rearranging), commercial use, anything, attribution </li></ul></ul><ul><ul><li>Various existing licences – worth exploring Creative Commons </li></ul></ul><ul><ul><li>Other options may be required for third-party material </li></ul></ul><ul><li>Clarity over this is essential to avoid user confusion and legal ramifications </li></ul><ul><li>But all JISC projects must indicate what can be done with their content </li></ul>
  16. 16. In Summary <ul><li>Irrespective of the type of content ... </li></ul><ul><li>Cool URIs </li></ul><ul><li>Being Friends with Google, Is Google Enough? </li></ul><ul><li>International Portals </li></ul><ul><li>Geographies </li></ul><ul><li>Re-Use and APIs </li></ul><ul><li>Licensing </li></ul>