British Library Labs Presentation at Ed Tech Hackathon 2013


BL Labs Ed Tech Hackathon 2013 Presentation

Published in: Education, Technology

  1. British Library Labs. Saturday 26th October 2013, 1000 – 1100 (15 min slot). Ed Tech Hackathon 2013 (Apps for Learning and Teaching), Central Working Canteen, Google Campus, 4–5 Bonhill Street, London, EC2A 4BX, UK. Mr Mahendra Mahey, British Library Labs Project Manager
  2. British Library Labs is about… Encouraging scholars and developers to do research and development with and across British Library digital collections and data. #bl_labs
  3. What activities does Labs do?
     • Encourage researchers / developers to do interesting things with BL digital content (+other) with and across collections (a data driven approach)
     • Competitions and events (hack events)
     • Creating an environment where scholars / developers can work intensively with the Library’s digital collections (winners will be resident)
     • Work with your ideas
     • Help develop tools and services to support digital scholarship
     • Case studies for the library and research communities
  4. How Labs works… [Diagram labels: idea; BL Digital Collection / Data; Other Digital Collection; Software; Competitions; Events; Contact; BL Labs; Publications; Tools and services to support Digital Scholarship]
  5. British Library Digital Collections. Over 600 digital collections and rising… Filter is needed…
     • Copyright cleared for research and commercial use or non commercial
     • Curated (Is there someone who knows about the collection?)
     • Collection / Item Level Metadata available?
     • Where is it?
     [Diagram labels: Digital but not online – various storage devices; Digital and online; Available only in Reading Rooms due to ©; Available on site only at the moment due to ©; Available only onsite at the moment; Hack Events, In residence]
  6. Some data / digital collections: Datasets, Books / Text, Images / Music, Maps, Sounds, Multimedia. Examples: Resonance FM 10 year Community Arts Radio Show; Text-mining of electronic journals; Book ordering and anonymised reader data; UK Web Archive Data; 19th Century Books; Environmental Sounds; British National Bibliography
  7. Example Research Methods
     • Corpus analysis tools
     • Visualisations
     • Location based searching
     • Geotagging
     • Annotation
     • APIs for datasets e.g. Metadata, Images
     • Crowdsourcing / Human Computation
     • Natural Language Processing
     • Transcribing
  8. Ideas from first competition
     • Text mining tool in the reading rooms
     • Curatorial… repackaging metadata for teaching and learning in a CMS e.g. Drupal
     • Visualising large collections of sound at a glance (thumbnails)
     • Using sheet music and OMR software
     • Working to re-use a radio archive
     • The winners are…
  9. Mixing the Library: The Disc Jockey & the Digital Collection. Dan Norton completed a PhD at the University of Dundee and is an Artist in Residence at Hangar, Centre for Art and Research, Barcelona. His idea is to build a ‘mixing’ interface for interacting with BL digital collections and wider, developed from the DJ's model of interaction with information. Collection ‘stalks’ are made of ‘items’. Each ‘item’ is a URL. The order of the ‘items’ can be ‘shuffled’ and sent to the ‘left’ or ‘right’ channels. [Labels on Dan Norton’s prototype ‘mixing’ interface: Annotation; Preview ‘item’; ‘Play back’ of ‘items’ (Blue) and annotations (Yellow); Selected ‘left’ channel ‘item’; Selected ‘right’ channel ‘item’]
  10. The Sample Generator for Digitised Texts. Pieter Francois is a Postdoctoral Researcher at the University of Oxford. The ‘Sample Generator’ connects one or more major catalogues or collections of digitized texts through metadata. In this example, the ‘Sample Generator’ searches across 1.8 million bibliographic records from the 19th Century for items about ‘Travel Routes’ and where possible (digitised items permitting) provides unbiased digital ‘samples’ for further research.
     • Step 1, British Library Labs Sample Generator form: Search terms ‘Travel Routes’; From 1888 To 1899; Synonyms: Account, Tour, Adventure, Visit, Journey, Expedition, Excursion, Trip, Holiday, Guide, Plan, Route; Digitally available content only (ticked); Distribution of items in catalogue; Sample Size 8; Generate (generates a randomised unbiased sample)
     • Step 2, Generated sample URL (unique & citable after creation): Terms used: ‘Travel Routes’ from ‘1888-1899’, sample size ‘8’. Set created on ‘16/10/2013’ by ‘Pieter Francois’. Works in the sample: Work 1, Work 2, Work 3, Work 4, Work 5, Work 6, Work 7, Work 8
     • Step 3, Travel route extracted from ‘Work 1’ for further research: the researcher carries out research on works in the generated sample. Here it is used as the basis for generating travel routes.
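The sampling step on the Sample Generator slide can be sketched roughly as follows. This is a minimal illustration, not Pieter Francois's implementation: the `Record` fields, the `generate_sample` function and the title-substring matching are all assumptions made for the example. It filters catalogue records by date range, search terms or synonyms, and digital availability, then draws a uniform random sample of the requested size.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class Record:
    """Hypothetical catalogue record; field names are assumptions."""
    identifier: str
    title: str
    year: int
    digitised: bool


def generate_sample(records, terms, year_from, year_to, sample_size,
                    seed=None, digitised_only=True):
    """Return a uniform random sample of the records that match the query."""
    lowered = [term.lower() for term in terms]
    matches = [
        record for record in records
        if year_from <= record.year <= year_to
        and (record.digitised or not digitised_only)
        and any(term in record.title.lower() for term in lowered)
    ]
    # A fixed seed makes the same sample reproducible later.
    rng = random.Random(seed)
    # Never ask for more items than actually matched.
    return rng.sample(matches, min(sample_size, len(matches)))
```

Every matching record has the same chance of being drawn, which is one simple reading of "randomised unbiased sample". Storing the query parameters and the seed alongside the result would be one way to make a generated sample repeatable and therefore citable.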
  11. Next Competition
     • Starts 11 November 2013 and ends around April 2014
     • Submit idea, engage during this period to formulate a good idea
     • First prize £3000 and residency (expenses paid) and we will work with you to make your shiny thing between May and October 2014
     • Work with us anyway and our content at Data / Hack events: 12 Dec 2013, 13 January 2014, 12 February 2014, 10 March 2014
  12. Data / Items brought
     • British National Bibliography in RDF Triples
     • Digitised books from the 17th, 18th, 19th and 20th Centuries
     • Image metadata
     • 10 x USB Sticks
     • 1 x 500 GB hard drive
  13. British National Bibliographic Data
     • (part of – download here, SPARQL end point
     • 2.8 Million individual records
     • Available as Linked Open Data, Basic RDF/XML and Marc21
     • On USB
     • Hard Drive
     Ideas: Augmenting Author records – London Review of Books; Combining with other data sources?
  14. 19th Century Digitised Books
     • 65,000 digitised volumes. Many rare or inaccessible books published in the 17th, 18th, 19th and 20th Centuries, including philosophy, history, poetry and literature, travel
     • 25 million pages (OCR text available, 75% accuracy) on hard drive as .txt, .json and metadata as .xml (50 GB) (metadata as .tsv on USB stick), items identified by unique numbers
     • 420,000 images / illustrations available on Flickr (around 70% and counting) (use their API) and on hard drive (100 GB – 20 mins? – illustrations and covers)
     • See Mechanical Curator on Tumblr
     • For images: Jigsaw, crowdsourcing metadata, image recognition (machine learning)
     • For text: dirty data, cleaning up exercise, with educational purpose?
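As a starting point for the "cleaning up exercise" on dirty OCR text, a first pass might look like the sketch below. The function name `clean_ocr_text` and the three rules are assumptions for illustration, not a BL Labs tool. It only tidies layout (hyphenated line breaks, hard line wraps, runs of spaces) and does not correct misrecognised characters.

```python
import re


def clean_ocr_text(raw: str) -> str:
    """Tidy layout noise in OCR text; does not fix misread characters."""
    # Rejoin words split across a line break: "jour-\nney" -> "journey".
    # Caveat: genuine compounds broken at the hyphen are merged too.
    text = re.sub(r"(\w)-\s*\n\s*(\w)", r"\1\2", raw)
    # Replace single line breaks with spaces, keeping blank lines
    # (two consecutive newlines) as paragraph breaks.
    text = re.sub(r"(?<!\n)\n(?!\n)", " ", text)
    # Collapse runs of spaces and tabs into one space.
    text = re.sub(r"[ \t]+", " ", text)
    return text.strip()
```

For example, `clean_ocr_text("A jour-\nney  to\nthe  north.\n\nNext page")` gives `"A journey to the north.\n\nNext page"`. Comparing cleaned and raw pages side by side could also serve the educational purpose the slide hints at.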
  15. Image Metadata
     • .CSV files on USB stick and hard drive
     • Contains links to images
     • Re-purpose metadata and images?
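One way to begin re-purposing the image metadata is to pull the image links out of a CSV file. The sketch below assumes a header row and a column called `image_url`; the real column names in the BL files are not given in the slides, so treat both the column name and the `image_links` function as hypothetical.

```python
import csv
import io


def image_links(csv_text, link_column="image_url"):
    """Return the non-empty values of one column from CSV text.

    The default column name is an assumption; pass the real header
    name once the actual files have been inspected.
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    links = []
    for row in reader:
        # Short rows yield None for missing fields, so guard with "or".
        value = (row.get(link_column) or "").strip()
        if value:
            links.append(value)
    return links
```

Reading from a file instead of a string only needs `open(path, newline="", encoding="utf-8").read()` in place of the in-memory text.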
  16. What next? Speak to me: 0207 412 7324 Email me: or Labs Website: Twitter: @BL_Labs Hash Tag: #bl_labs Jiscmail: Blog: