Digital Scholarship at the British Library: Collecting, Collaboration and Research by Stella Wisdom
1. Digital Scholarship at the British Library:
Collecting, Collaboration and Research
Talk for Bath Spa University
Digital Writing Research Group
20th March 2017
Stella Wisdom, Digital Curator
@miss_wisdom
Blog: http://britishlibrary.typepad.co.uk/digital-scholarship/
2. www.bl.uk 2
The British Library is the
national library of the UK
We receive a copy of every
publication produced in the UK and
Ireland
From 6 April 2013, legal deposit
covers e-books, e-journals and
other types of electronic
publication
Plus other material that is made
available to the public in the UK on
handheld media such as CD-
ROMs and microfilm, on the web
(including websites) and by
download from a website.
http://www.bl.uk/aboutus/legaldeposit/
3. www.bl.uk 3
Over 150 Million items
are stored in London and in
Yorkshire
If you saw 5 items a day
it would take you 80,000
years to see the whole
collection
Digitisation is crucial for
opening up access to
this content and collections
http://www.bl.uk/aboutus/quickinfo/facts/
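The 80,000-year figure can be checked with simple arithmetic. The sketch below assumes exactly 150 million items and 365 viewing days a year (both simplifications, since the slide says "over 150 million"); the function name is illustrative only.

```python
# Rough check of the "80,000 years" claim, under the assumptions stated above.
ITEMS = 150_000_000      # "Over 150 million items", taken as exactly 150 million
ITEMS_PER_DAY = 5        # viewing rate quoted on the slide
DAYS_PER_YEAR = 365      # assumption: no days off, leap years ignored


def years_to_view(items, per_day, days_per_year=DAYS_PER_YEAR):
    """Return how many years it takes to view `items` at `per_day` items a day."""
    return items / per_day / days_per_year


years = years_to_view(ITEMS, ITEMS_PER_DAY)
print(f"{years:,.0f} years")
```

Under these assumptions the result is a little over 82,000 years, which is consistent with the rounded figure quoted on the slide.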
4. www.bl.uk 4
The UK Web Archive
http://www.webarchive.org.uk
• Three collections:
– Open Archive (since 2004)
– Legal Deposit Archive (since 2013)
– JISC Historical Archive (1996-2013)
• Statistics:
– Over eight billion resources
– Over 160TB compressed data
• Goals:
– Preserve UK web history
– Support access
– Enable research
5. www.bl.uk 5
The Conservative Party deleted speeches and press releases
published on its website between 2000 and the 2010 general election.
http://www.bbc.co.uk/news/uk-politics-24924185
http://www.theguardian.com/politics/2013/nov/13/conservative-party-archive-speeches-internet
7. www.bl.uk 7
The Wendy Cope Archive
http://www.wired.co.uk/news/archive/2011-05/10/british-library-digital-archives
8. www.bl.uk 8
Digital Scholarship is using
computational methods either
to answer existing research
questions or to challenge existing
theoretical paradigms….
Geotagging
Data Visualisation
Data Mining
Georeferencing
Digital Mapping
Crowdsourcing
Text mining
Collaboration
9. www.bl.uk 9
Meet the Digital Research Team
We support researchers in the innovative
use of British Library's digital collections and
data through:
• Working behind the scenes to get content
in digital form and online
• Offering digital research support and
guidance
• Supporting collaborative projects
• Running events, competitions, and awards
11. www.bl.uk 11
Datasets
data.bl.uk
As part of its work to open its data to wider use, the British
Library is making copies of some of its datasets available for
research and creative purposes.
This site is a 'beta', and is in the early stages of development. If
you have questions or feedback about this site or our open
data work, please email digitalresearch@bl.uk.
We'd also love to hear what you've done or made with the data.
12. www.bl.uk 12
Story of one digital collection…
What can 68,000
books tell us?
Image: Artwork by Alicia Martin
13. www.bl.uk 13
Microsoft Partnership Digitisation
2006-8
• 68,000 volumes (47,000+ titles) published in the 19th
century mostly in English
• Excluded authors active 1850-1901 and who died after
1936
• Output: 25 million pages
• Digitised content is public domain
14. www.bl.uk 14
Extracting Images from OCR
<?xml version="1.0" encoding="UTF-8" ?>
<mets:mets xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xmlns:mets="http://www.loc.gov/METS/"
xsi:schemaLocation="http://www.loc.gov/METS/ http://www.loc.gov/standards/mets/version18/mets.xsd info:lc/xmlns/premis-v2
Image snipped out algorithmically from ALTO XML
Image taken from page 207 of 'London and its Environs. A
picturesque survey of the metropolis and the suburbs ...
Translated by Henry Frith. With ... illustrations'
ALTO XML
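As a rough illustration of how images can be located from OCR output, the sketch below reads an ALTO XML page and returns a bounding box for each Illustration block. It is not the algorithm used for this collection. It assumes coordinates are given in pixels (ALTO also allows other measurement units), and the sample document and function name are made up for the example.

```python
import xml.etree.ElementTree as ET


def illustration_boxes(alto_xml):
    """Return a (left, top, right, bottom) box for each Illustration block."""
    root = ET.fromstring(alto_xml)
    boxes = []
    for elem in root.iter():
        # Compare the local tag name so any ALTO namespace version is accepted.
        if elem.tag.rsplit("}", 1)[-1] != "Illustration":
            continue
        left = float(elem.get("HPOS", 0))
        top = float(elem.get("VPOS", 0))
        width = float(elem.get("WIDTH", 0))
        height = float(elem.get("HEIGHT", 0))
        boxes.append((left, top, left + width, top + height))
    return boxes


# Hypothetical, heavily simplified ALTO page used only for this example.
SAMPLE = """<alto xmlns="http://www.loc.gov/standards/alto/ns-v2#">
  <Layout>
    <Page ID="P207" WIDTH="2000" HEIGHT="3000">
      <PrintSpace>
        <TextBlock ID="TB1" HPOS="200" VPOS="100" WIDTH="1600" HEIGHT="300"/>
        <Illustration ID="IL1" HPOS="250" VPOS="500" WIDTH="1500" HEIGHT="1200"/>
      </PrintSpace>
    </Page>
  </Layout>
</alto>"""

print(illustration_boxes(SAMPLE))
```

Each box could then be used to crop the matching page scan, for example with an imaging library's crop function that accepts left, top, right and bottom coordinates.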
20. www.bl.uk 20
David Normal created light boxes around the
Burning Man, using the British Library’s Flickr images
The Crossroads of Curiosity Installation
at Burning Man Festival
21. www.bl.uk 21
The Crossroads of Curiosity Installation at the British Library
June to November 2015
The installation featured an “augmented reality” self-guided tour enabling viewers
to explore the meaning and origins of the painting’s symbols using Blippar.
www.crossroadsofcuriosity.com
http://www.bl.uk/events/the-crossroads-of-curiosity-installation
22. www.bl.uk 22
Hey There, Young Sailor
written and directed by Ling Low with visual art by Lyn Ong.
Inspired by the works of early cinema pioneer Georges Méliès, the video draws on 19th
century images from the British Library's Flickr collection.
The video was commissioned by Malaysian indie folk band The Impatient Sisters
https://youtu.be/bcOP1E5bRE0
26. www.bl.uk 26
Rob Sherman – On My Wife’s Back
HMS Terror and HMS Erebus were lost on the ill-fated Franklin expedition in
search of the Northwest Passage in 1845
27. www.bl.uk 27
Rob Sherman – On My Wife’s Back
Isaak Scinbank, the ‘Arctic Angler’ of Milldale
http://onmywifesback.tumblr.com/post/100107314408/scinbank
28. www.bl.uk 28
Rob Sherman – On My Wife’s Back
http://britishlibrary.typepad.co.uk/collectioncare/2014/11/the-salmon-book-conservation-in-reverse-.html
30. www.bl.uk 30
Rob Sherman – On My Wife’s Back
Events & Music
https://soundcloud.com/the-british-library/sets/songs-from-on-my-wifes-back
31. www.bl.uk 31
Sarah Cole, Poetic Places
Creative-Entrepreneur-In-Residence
http://www.poeticplaces.uk/
32. www.bl.uk 32
What is Poetic Places?
• A free, native app for Android and iOS devices.
• Brings poetic depictions of places into the physical world,
helping people to encounter literature and heritage in
relevant locations, accompanied by materials drawn from
cultural heritage collections.
• Brings literature and heritage into everyday life in
unexpected moments. Serendipitous discovery; not tours.
• Browse the poems and places without being in situ.
37. www.bl.uk 37
The Off the Map Competition
• A new type of collaboration
• Explores how British Library digital collections
can be used in creative ways
• Engagement with new audiences
• Opportunity for students in the UK to
showcase their talents to industry
40. www.bl.uk 40
2014 winning team: Gothulus Rift, University of South Wales
Created a Fonthill Abbey inspired game called Nix using Oculus Rift
YouTube flythrough: http://youtu.be/8ESieZO4VHw
41. www.bl.uk 41
The original handwritten manuscript of the
story, ‘Alice’s Adventures Under Ground’,
which was first told to Alice Liddell by Lewis
Carroll in 1862.
42. www.bl.uk 42
2015 Winning Game:
“The Wondering Lands of Alice”
Team Off our Rockers, De Montfort University in Leicester
YouTube flythrough: https://youtu.be/7bwx4uUnbV4
45. www.bl.uk 45
The Tempest
Shakespeare was inspired to write The Tempest when he read of the fate of the Sea-
Adventure, a ship taking English colonists to North America which was wrecked off the
coast of Bermuda in 1609. The Bermudas were then the most feared place on earth for
sea travellers, who had heard stories about the islands being inhabited by devils.
Map of Bermuda as
published in Gerhard
Mercator and
Jodocus Hondius'
world atlas of 1633.
Maps K.Top 123
46. www.bl.uk 46
Off the Map 2016 1st Place:
“The Tempest” by Team Quattro, De Montfort University, Leicester
YouTube flythrough: https://youtu.be/0lzpEFgpk3Y
47. www.bl.uk 47
A Midsummer Night’s Dream
From Boydell's Collection of Prints illustrating Shakespeare's works
http://www.bl.uk/collection-items/boydells-collection-of-prints-illustrating-shakespeares-works
48. www.bl.uk 48
Off the Map 2016 2nd Place:
‘Midsummer’ by Tom Battey, London College of Communication
YouTube flythrough: https://youtu.be/sz-IKvp62NI
51. www.bl.uk 51
Three themes:
1. Illusions, magic and
impersonators
2. Outdoor places of entertainment –
Fairground, travelling shows
3. Indoor places of entertainment –
Music Hall and Pantomime
52. www.bl.uk 52
Maskelyne & Cooke's entertainment at
the Egyptian Hall, 1873
http://www.bl.uk/catalogues/evanion/Record.aspx?EvanID=024-000004866
'Will, the Witch, and the Watch'
or 'The Mystic Freaks of
Gyges', was written in 1872
and performed at the Egyptian
Hall in 1873.
Digital humanities scholars use computational methods either to answer existing research questions or to challenge existing theoretical paradigms, generating new questions and pioneering new approaches…. Activities might include bringing text-analytic techniques, GIS, commons-based peer collaboration, and interactive games and multimedia into the traditional arts and humanities disciplines.
Set up in 2010, the team was formed to focus on the changing research landscape in the digital realm. Team members are now embedded in collection areas and, as you’ll see later, join the Library explicitly as part of major digitisation projects.
Main activities:
Getting content in digital form and online
Collaborations, Competitions & Awards
Digital research support and guidance
The work of Labs is really about a number of stories, stories about digital collections and about researchers wanting to ask fascinating research questions about them. Let’s now tell you a story about one collection and the intended and unintended consequences of working with it.
60 seconds
The Library digitised 68,000 predominantly 19th-century books from our collections a few years ago (around 2.7% of the physical total from that period). You can view them via our catalogue or read them on your <click> iPad using the Historical Books app developed by BiblioLabs.
There are 22 million individual page images, along with full-text scans of these images, all of which contain an untold quantity of useful data such as names of people, places, historical events and dates.
The digitised content is public domain, with no restrictions on use by Microsoft.
So the question became then, what next? What can 68,000 books tell us?
60 seconds
Scanning the books for text had a fortunate side effect: the OCR software tries to detect not only the text on the page but also where the images might be. There had already been some interest in the images from the research community, and it seemed easy to extract them.
As part of the Labs competition, Matt Prior attended one of our hack events and, when examining our book data, was very interested in the images from the books.
Meanwhile, the algorithm that Ben had written to snip the images from the OCR scans was still churning away. How many were there going to be? The Mechanical Curator could publish them every hour, but was there somewhere we could put them all for people to browse when they wanted? Importantly, if we did put them somewhere, could we get people to help us add descriptions to the individual images, making them infinitely more discoverable?
Using an algorithm by Ben O’Steen, we snipped out images from the digitised books and put them on Flickr on December 13 2013. There were over a million. The problem was that we knew which books they came from (author/dates), but we didn’t have any information about the images themselves. Since we released them on Flickr, people have started tagging them and using them in very creative ways.
Hosting them internally was not an option, and there was not sufficient metadata to put them on Wikipedia. Flickr seemed the obvious option: it is a platform that can support high usage, does not require metadata, allows tagging and is free for public domain images.
He speaks about his project, how he came across the images and what he did with them.
How he learnt about the images: it was pure serendipity.
Taking images out of the context of books creates potential to reinvent them in a new context.
http://youtu.be/3AOa98RsA2Q
http://www.youtube.com/watch?feature=player_detailpage&v=3AOa98RsA2Q#t=48
Make sure subtitles are on.
This is a surprising use of the images we put onto Flickr. Once a year in the summer, tens of thousands of participants gather in Nevada's Black Rock Desert to create Black Rock City, dedicated to community, art, self-expression, and self-reliance. They depart one week later, having left no trace whatsoever. [This year it took place from August 25 to September 1 in Nevada, USA; the show ends with the burning of a wooden effigy of a man! <click>]
American artist David Normal used images from our Flickr Commons collection to create a set of collages called "Crossroads of Curiosity". The finished paintings based on these collages were presented in full colour as lightboxes at this year's Burning Man Festival, the theme for which was "Caravansary". They were displayed around the base of the effigy of the Burning Man in the heart of the festival.
Aims developed quickly at project start
Refined over project, flexible mindset
Last point: to achieve this chose (needed) to use DIY app platform…