British Library Labs Presentation at Elpub 2014, June 20, 2014


Keynote presentation given at Elpub 2014 on June 20 about the Digital Scholarship department and the work of the Digital Research Team and British Library Labs.

Published in: Education, Technology

  1. Putting data to use for researchers: How the British Library's Digital Scholarship department is putting data to use for researchers through its Digital Research Team and British Library Labs project. Mahendra Mahey, Manager of British Library Labs. Keynote speech, 18th International Conference on Electronic Publishing (Elpub), Friday 20 June 2014, 0930 – 1030 (EST), Alexander Technological Education Institute of Thessaloniki, Greece.
  2. Overview (#bl_labs)
     • The British Library and a typical scholar
     • The Nature of Digital and the Digital Scholar
     • The British Library supporting Digital Scholarship
     • Experiences of the Digital Research Team and British Library Labs project in supporting digital scholarship
     • Conclusions
  3. The British Library: St Pancras, London, UK. Many books are stored 5 storeys below the building. Inside the British Library: space for 1200 readers, around 400,000 visitors per year. Storage at Boston Spa: uses low oxygen and robots.
  4. British Library Collections: > 150 million items (image: King’s Library)
     • > 0.8 m serial titles
     • > 8 m stamps
     • > 14 m books
     • > 3 m sound recordings
     • > 4 m maps
     • > 1.6 m musical scores
     • > 0.3 m manuscripts
     • > 60 m patents
  5. Our Scholar in Humanities…
     • Pieter Francois, postdoctoral researcher at the University of Oxford: travel routes in the 19th century
     • Bob Nicholson, History Lecturer at Edge Hill University, specialising in the Victorian period
  6. The Nature of Digital: data broken down, recombined and duplicated. Image: Tower of Babble, book sculpture by Brian Dettmer.
  7. The Digital Scholar (Digital, Networked, Open): not necessarily a recognised academic or someone who posts online, just a specialist. It is someone who employs digital, networked and open approaches to demonstrate their specialism. From Digital Scholar: How technology is transforming scholarly practice, Martin Weller, Bloomsbury Academic, 2011, page 4.
  8. Digital Humanities: “The emergence of the new digital humanities isn’t an isolated academic phenomenon. The institutional and disciplinary changes are part of a larger cultural shift, inside and outside the academy, a rapid cycle of emergence and convergence in technology and culture.” Steven E Jones, Emergence of the Digital Humanities (2013). Image: Father Roberto Busa (1913-2011).
  9. “Reading individual works is as irrelevant as describing the architecture of a building from a single brick, or the layout of a city from a single church.” - Franco Moretti
  10. Example Digital research methods (has some examples from researchers)
     • Corpus analysis tools
     • Text Mining
     • Visualisations
     • Location based searching
     • Geotagging
     • Annotation
     • Natural Language Processing
     • Using Application Programming Interfaces for datasets, e.g. Metadata, Images
     • Transcribing
     • Crowdsourcing / Human Computation
  11. Digitisation at the British Library
  12. Digitised Books: 250,000 books being digitised with Google; 68,000 volumes digitised with Microsoft; 17th, 18th and 19th century. Image taken from page 344, Volume 2, Cassell's Illustrated History of the Russo-Turkish War, etc. by OLLIER, Edmund. Otto, King of Greece. Image taken from page 10, "The Greece of the Greeks", PERDICARIS, G. A.
  13. Digitised Newspapers: newspapers stored at Colindale (now closed)
  14. Digitised Manuscripts
  15. Not just text… Moving Image Collections
  16. Digitisation - Transforming access
     • Spreading the value of collections, content and expertise
     • Connecting as much as collecting, e.g. social media
     • Encouraging others to integrate our materials into their services – and vice versa
  17. Challenges of Digital access (British Library)
     • online and open
     • online behind paywall
     • only on site due to ©
     • only in Reading Rooms due to ©
     • not online – various storage devices
  18. Digital Scholarship Department: …become a leading centre of digital scholarship… internationally recognised for innovation and collaboration in support of research and learning…
     • The Digital Research Team – Digital Curators
     • The British Library Labs project
  19. What is a Digital Curator?
     • Explore how digital technologies are re/shaping research and how this informs how the library does its business.
     • Support staff across the library to identify the opportunities that digital tools and collections afford in modern scholarship and to gain the skills to engage confidently in this area.
     • Partner with libraries and institutions to enable innovation in digital scholarship.
     • No specific collection but rather expertise in digital scholarship, broadly defined.
     Digital Curators: James Baker, Nora McGregor, Stella Wisdom, Aquiles Alencar-Brayner
  20. Training Library Staff: Digital Scholarship Training Programme
     • Foundations in working with Digital Objects: From Images to A/V
     • Data Visualisation for Analysis in Scholarly Research
     • Information Integration: Mash-ups, APIs and The Semantic Web
     • Behind the Screen: Basics of the Web
     • What is Digital Scholarship?
     • Digital Collections at British Library
     • Digitisation at British Library
     • Text Encoding Initiative & Annotation
     • Geo-referencing and Digital Mapping
     • Crowdsourcing in Libraries, Museums and Cultural Heritage Institutions
  21. Opening up Digital content
     • Picturing Canada: Mapping a Collection:
  22. Crowdsourcing Digitised Maps
  23. Creative with Wildlife Sounds: Sound Edit Wildlife Films Competition 2013. 'Dave's Wild Life' by Samuel de Ceccatty won first prize!
  24. The Big Data Experiment
     • Microsoft Azure
     • University College London’s Computing and Digital Humanities department
     • Recommender engine for BL Public domain content
  25. Technology Strategy Board Competition Winner
     • Competition with Technology Strategy Board
     • Focus on understanding the value and impact of making the British Library’s Digital Content and data open / in the public domain
     • Peter Balman will develop an analytics dashboard for the Library showing what is happening to our public domain content
     Challenge details:
  26. Computer Games: Off the Map Competition 2013. Pudding Lane Productions, 6 second-year students from De Montfort University, Leicester, won first prize. Off the Map Gothic 2014!
  27. Funded by the Andrew W. Mellon Foundation
  28. Stakeholders involved in Labs (diagram labels): BL Labs; British Library; Digital Scholarship; Digital Research; Access & Reuse Group; ©; Developers / Technical Staff; Curators / Researchers; Digital Content; Universities & wider, e.g. companies, start-ups, independent scholars etc.; Researchers; Developers; United Kingdom; The World.
  29. What is Labs… (diagram labels): BL Labs; Audience: Researchers, Developers; Research question / idea; Competition; Contact; Events; Meetings and visits; Experimenting with our digital collections; BL Digital Collection / Data; Other Digital Collection / Data; Data Driven; Outputs from engagement: Data, Open Software, Publications, Tools & services to support Digital Scholarship, Case Studies.
  30. Labs audience (diagram)
     • Researchers: specific domain knowledge; ability to question, review or interpret domain in potentially meaningful way
     • Developers: skill and / or capability to realise that potential (*human and / or hardware)
     • Overlap ("Good things happen here"): people of interest
     • Upskill: desired outcome of BL Labs activities
  31. Sample Labs Digital Collections: British National Bibliography; UK Web Archive Data; Text-mining of electronic journals; Book ordering and anonymised reader data. Coming soon!!
     • Copyright cleared for research use
     • Curated (Is there someone who knows the ‘story’ about the collection?)
     • Collection / Item Level Metadata available? (What state is it in and does it need cleaning?)
     • Where is it?
  32. Engaging with Labs: Labs Data Cards, Ideas, Projects. Labs Hack and Data days:
     1. Brainstorm ideas & group
     2. Consider and choose
     3. Work late and show what has been done
  33. The winners of the Labs 2013 competition: Pieter Francois (left) and Dan Norton (right) each received a cheque for £2000 in November 2013 as winners of the first British Library Labs Competition 2013. Two entries were chosen in June 2013. They both worked in residence with Labs from July to October 2013 to complete their projects.
  34. Sample Generator: representative samples
     • Pieter Francois
     • Focus on European travel in the 19th Century
     • Uses statistical methods to support text analysis
     • Tool produces representative samples of texts based on search criteria
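The Sample Generator slide (34) says only that the tool uses statistical methods to produce representative samples of texts. As an illustration of what "representative" can mean, here is a minimal sketch of proportional stratified sampling. It is not the Sample Generator's actual method; the record layout, the `decade` field and the function name `representative_sample` are all assumptions made for this example.

```python
import random
from collections import defaultdict


def representative_sample(records, key, sample_size, seed=0):
    """Draw a sample whose strata shares roughly match the full set.

    Illustrative only: proportional stratified sampling is one common way
    to build a representative sample; the slides do not say which method
    the Sample Generator uses.
    """
    rng = random.Random(seed)  # fixed seed so the draw is repeatable

    # Group records by stratum (for example, decade of publication).
    strata = defaultdict(list)
    for record in records:
        strata[key(record)].append(record)

    sample = []
    for members in strata.values():
        # Each stratum contributes in proportion to its size. Rounding means
        # the total can differ slightly from sample_size in general.
        share = round(sample_size * len(members) / len(records))
        sample.extend(rng.sample(members, min(share, len(members))))
    return sample


# Hypothetical catalogue: 60 titles from the 1850s and 40 from the 1860s.
catalogue = [{"id": i, "decade": 1850 if i < 60 else 1860} for i in range(100)]
sample = representative_sample(catalogue, key=lambda r: r["decade"], sample_size=10)
```

With this hypothetical catalogue the sample keeps the 60/40 split between decades, which is the property a researcher would want before generalising from the sample to the whole search result.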
  35. Pieter Francois
  36. Mixing the Library: The Disc Jockey & the Digital Collection. Prototype design (diagram labels): Annotation; Preview ‘item’; Selected ‘right’ channel ‘item’; Selected ‘left’ channel ‘item’; Collection ‘stalks’ made of ‘items’. Each ‘item’ is a URL. The order of the ‘items’ can be ‘shuffled’ and sent to the ‘left’ or ‘right’ channels. ‘Play back’ of ‘items’ (Blue) and annotations (Yellow). Living Lab: Library of the Future, see: Basic functioning prototype:
  37. Curatorial for Library metadata: Geo location; Timeline; Slide show; India Office Select materials
  38. Winners of 2014 Competition
     • Victorian Meme Machine: Bob Nicholson of Edge Hill University. Blog posting: YouTube:
     • Text to Image Linking Tool (TILT): Anna Gerber and Desmond Schmidt from Queensland University. Blog: YouTube:
  39. Bob Nicholson
  40. Story of one digital collection: What can 68,000 books tell us? Image: Artwork by Alicia Martin
  41. Extracting Images from OCR: image snipped out algorithmically from ALTO XML. Image taken from page 207 of 'London and its Environs. A picturesque survey of the metropolis and the suburbs ... Translated by Henry Frith. With ... illustrations'. XML excerpt shown on the slide (truncated): <?xml version="1.0" encoding="UTF-8" ?> - <mets:mets xmlns:xsi="http://ww Schema-instance" xmlns:mets="http://w" xsi:schemaLocation= " METS/ standards/mets/ver sion18/mets.xsd info:lc/xmlns/premi s-v2
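Slide 41 describes snipping images out of page scans using the layout information in ALTO XML. The sketch below shows one plausible way to locate such regions, assuming the OCR output marks pictures with ALTO `Illustration` blocks carrying `HPOS`, `VPOS`, `WIDTH` and `HEIGHT` attributes. The sample document is invented and heavily simplified, the function names are hypothetical, and this is not necessarily how the Labs team did it.

```python
import xml.etree.ElementTree as ET

# Invented, heavily simplified ALTO page used only for illustration.
# Real files contain full layout and OCR text, and the namespace
# depends on the ALTO version in use.
SAMPLE_ALTO = """<alto xmlns="http://www.loc.gov/standards/alto/ns-v2#">
  <Layout>
    <Page ID="P207" WIDTH="2000" HEIGHT="3000">
      <PrintSpace>
        <TextBlock ID="TB1" HPOS="100" VPOS="100" WIDTH="1800" HEIGHT="400"/>
        <Illustration ID="IL1" HPOS="250" VPOS="600" WIDTH="1500" HEIGHT="1100"/>
      </PrintSpace>
    </Page>
  </Layout>
</alto>"""


def local_name(tag):
    """Strip the '{namespace}' prefix ElementTree puts on element names."""
    return tag.rsplit("}", 1)[-1]


def illustration_boxes(alto_xml):
    """Return (id, left, top, right, bottom) for every Illustration block."""
    root = ET.fromstring(alto_xml)
    boxes = []
    for element in root.iter():
        if local_name(element.tag) != "Illustration":
            continue
        # Assumes coordinates are in pixels; ALTO files declare their
        # measurement unit, so other units would need converting first.
        left = round(float(element.get("HPOS")))
        top = round(float(element.get("VPOS")))
        width = round(float(element.get("WIDTH")))
        height = round(float(element.get("HEIGHT")))
        boxes.append((element.get("ID"), left, top, left + width, top + height))
    return boxes


boxes = illustration_boxes(SAMPLE_ALTO)
```

Each returned box is in (left, top, right, bottom) order, which is the form most image libraries expect when cropping a region out of the page scan.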
  42. Face Recognition of 19th Century Faces: the face-recognition algorithm worked better for female faces than for male faces
  43. The Mechanical Curator
     • #similar_to_77576796197_published_date
     • #similar_to_77576796197_slantyness
     • #similar_to_77576796197_bubblyness_x
     • #similar_to_77576796197_bubblyness_y
     • #new_train_of_thought
     Image from ‘A Lost Estate’, by Mary E. Mann, Volume: 02, Page: 91, 1889, London, Bentley & Son
  44. Flickr Commons – 1,020,418 images!
     • Each image has a URL
     • Some metadata, but you can add tags!
     • Flickr has an API so researchers and developers can build apps and query the data
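Slide 44 notes that Flickr's API lets researchers and developers query the images. As a rough sketch of what such a query looks like, the snippet below builds a request URL for the `flickr.photos.search` method of Flickr's REST API. The API key and account identifier are placeholders you would supply yourself, and the helper name `build_search_url` is made up for this example; no request is actually sent.

```python
from urllib.parse import urlencode

FLICKR_REST_ENDPOINT = "https://api.flickr.com/services/rest/"


def build_search_url(api_key, user_id, tag, per_page=50):
    """Build a flickr.photos.search URL for one account's photos with a tag."""
    params = {
        "method": "flickr.photos.search",
        "api_key": api_key,    # placeholder: issued by Flickr to each developer
        "user_id": user_id,    # placeholder: identifier of the account to search
        "tags": tag,
        "per_page": per_page,
        "format": "json",
        "nojsoncallback": 1,   # ask for plain JSON rather than a JSONP wrapper
    }
    return FLICKR_REST_ENDPOINT + "?" + urlencode(params)


# Placeholder credentials; substitute real values before fetching the URL.
url = build_search_url("YOUR_API_KEY", "ACCOUNT_ID", "map")
```

Fetching the resulting URL with any HTTP client would return a JSON document listing matching photos, which an app could then page through or combine with the user-added tags.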
  45. Flickr in numbers
     • 163,000,000 !!! image views since launch (December 13th, 2013 to June 10th)
     • Almost all images seen at least 5 times
     • 90,699 tags added
     • 18,567 images favourited
     • Labs involved with 2 potential research projects & 4 grassroots crowdsourcing efforts
  46. Tagging a million images - Metadata games and other projects: games will probably be developed using Flickr sets. Cardiff University’s Lost Visions Project.
  47. Risks of releasing the images: Funny Books for Boys and Girls. Struwelpeter. Good-for-nothing Boys and Girls. Troublesome Children. King Nutcracker and Poor Reinhold.
  48. Opportunities – increasing traffic to Library services
     • You can purchase a ‘High Res’ Copy
     • View in the Library Item Viewer
     • Download .pdf
     • All illustrations in book
     • Other illustrations in books published in same year
     • View the item in the Library Catalogue
     • Tags auto generated
     • User generated Tag
     • Grouping for image
  49. Flickr coverage in the media!
  50. Creative Uses: Jura’s Sound Skateboard
  51. Burning Man: David Normal, creating light boxes around the Burning Man, using the British Library’s Flickr images
  52. Other Labs stories…
     • Augmenting news metadata
     • Opening up over 100,000 Playbills
     • 3D printed objects representing statistical data with possibly embedded USBs and RFID chips
     • , place for all our open data and digital collections
     • Content next to parallel compute power, analysis at scale
     • Funding till 2017!
  53. Conclusions
     • Huge appetite for openly available digital content
     • There needs to be a continuous dynamic interaction with data and the researchers to formulate and reformulate research questions
     • Working with Digital Scholars creates new opportunities
     • Content and service providers, researchers and technical people need to talk to each other to create the new tools, services and data needed to facilitate new discoveries
     • Don’t be afraid to experiment and make mistakes too!
  54. Acknowledgements
     • Ben O’Steen - Labs Technical Lead
     • Digital Curator Team: Stella Wisdom - Digital Curator; Nora McGregor - Digital Curator; James Baker - Digital Curator
     • Digital Scholarship Head: Adam Farquhar - Head of Digital Scholarship (wrote Labs proposal)
  55. Email Labs
     • Let us know your ideas for engaging with Labs!
     • Questions?