1
@mahendra_mahey @BL_Labs @BL_DigiSchol #bldigital https://goo.gl/EvKGfa
mahendra.mahey@bl.uk
British Library Labs
What is British Library Labs and what have we learned over the last four years?
1320-1420 & 1600-1615, 10 May 2017
Learning the Lessons of working with the British Library’s Digital Content and Data for your research
British Library data and collections and discussions and feedback on ideas, challenges and issues
College of Arts and Law, University of Birmingham, UK
https://goo.gl/EvKGfa
Mahendra Mahey
Manager of British Library Labs
2
The British Library
Inside the British Library
Space for 1200 readers, around 400,000 visitors per year
Building 37 uses low oxygen and robots
Reading room and delivery to London
Document Supply and Storage at Boston Spa
Stockton-on-Tees
Authors’ right to payment each time their books
are borrowed from public libraries.
St Pancras, London, UK
Many books are stored 4 storeys below the building
UK Legal Deposit Library – Reference only
3
Living Knowledge Vision (2015 – 2023)
Custodianship Research Business
Culture Learning International
To make our intellectual heritage accessible to everyone,
for research, inspiration and enjoyment and be the most open, creative
and innovative institution of its kind by 2023.
Document:http://goo.gl/h41wW7 Speech:https://goo.gl/Py9uHK
Roly Keating (Chief Executive Officer of the British Library)
4
Collections – not just books!
> 180*million items
> 0.8* m serial titles
> 8* m stamps
> 14* m books
> 6* m sound recordings
> 4* m maps
> 1.6* m musical scores
> 0.3* m manuscripts
> 60* m patents
King’s Library *Estimates
5
http://www.bl.uk/projects/british-library-labs
Funded by the Andrew W. Mellon Foundation
7
Wider…not just Researchers
Researchers
https://goo.gl/WutNyi
Artists
http://goo.gl/nNKhQ2
Librarians
Curators
https://goo.gl/9NWZUW
Software Developers
https://goo.gl/7QQ5Tf
Archivists
https://goo.gl/x7b4tg
Educators
https://goo.gl/qh01Mi
8
Digital research methods
Digital Scholarship
Visualisations
Application Programming Interfaces (APIs)
for datasets e.g. Metadata, Images
Transcribing
Annotation
Location based searching & Geo-tagging
Corpus analysis, Text Mining &
Natural Language Processing
Crowdsourcing
Human Computation
9
How are we doing this?
10
Competition
Awards
Projects
Tell us your ideas of what to do with our digital content
Show us what you have already done with our digital
content in research, artistic, commercial and learning and
teaching categories
Talk to us about working on collaborative projects
11
Why are we doing this?
12
Why are we doing this?
• Working closely with and listening to those who want
to use our digital collections and data for their work
• We can learn how we are and should be supporting
them (shapes the problems we work on):
– Access to digital collections?
– Advice, guidance, technical support, training
– Services, Tools and Processes?
– Many more reasons…
• Where are the gaps between what
users want & what we can give?
• How do we build the bridges to overcome the gaps?
• How do we help users to navigate their way through
the Library to what they want to do?
https://goo.gl/esqpRb
https://goo.gl/6CwCeE
https://goo.gl/62JnQT
13
Born digital
Data all around us!
Knowledge Quarter London
89 knowledge organisations (as of 07/07/17) within 1 mile radius of
Kings Cross, http://www.knowledgequarter.london
http://www.turing.ac.uk (Headquartered at the British Library)
UK Web Archive and e-legal deposit (2013)
http://www.webarchive.org.uk/ukwa/
14
#bldigital
1-2 %* digitised
* estimate
Digitisation
Partnerships
Commercial & Other Organisations
Amount
increasing rapidly
Bias in digitisation
http://goo.gl/bR9UJL
Sample Generator
15
Playbills, Books, Newspapers
(includes OCR)
Digital collections and Datasets
British National
Bibliography
http://bnb.data.bl.uk
http://sounds.bl.uk
http://dml.city.ac.uk/
Music (Recordings & Sheet) & Sounds
http://goo.gl/frSMJt
Broadcast News (TV and Radio)
http://goo.gl/cwThHw
http://goo.gl/pBkisZ
http://goo.gl/E8aRyQ
Usage data
EThOS
Web Archive
Images, Manuscripts & Maps
http://www.qdl.qa/
Qatar Digital Library
http://idp.bl.uk/
International
Dunhuang
Project
Maps
http://www.bl.uk/maps/
Hebrew Manuscripts
http://goo.gl/4sbCp9
Flickr &
Wikimedia Commons
https://goo.gl/LZRmaZ
16
Open Cultural Heritage Datasets
Collection Guides (173 as of 04/05/17)
https://www.bl.uk/collection-guides/
Datasets about our collections
Bibliographic datasets relating to our published and
archival holdings
Datasets for content mining
Content suitable for use in text and data mining
research
Datasets for image analysis
Image collections suitable for large-scale
image-analysis-based research
Datasets from UK Web Archive
Data and API services available for accessing UK Web
Archive
Digital mapping
Geospatial data, cartographic applications, digital aerial
photography and scanned historic map materials
https://data.bl.uk
Download collections as zips, no API
Each dataset has a Digital Object Identifier (DOI)
Discussion list:
http://www.jiscmail.ac.uk/CULTURAL-HERITAGE-DATASETS
17
What did people
actually do?
18
Example pattern of research for Labs
• Finding invisible / well hidden
things in ‘messy’ historical data
• Unearthing / unlocking hidden
histories & data to stimulate
new research
• Celebrating hidden histories /
data creatively through events,
art & performance
https://goo.gl/vJ291F
https://goo.gl/mcpa8B
https://goo.gl/Ql0Bwz
Not the British Library!
19
https://goo.gl/oUNj5N
https://goo.gl/ImAUv4
Finding things in ‘messy’
Optical Character Recognised (OCR) text
Mrs Folly
• Clean up some manually
• Get human ‘ground truth’
• Write code to find things
reliably in it automatically
• Try code on messy content
• Tweak if necessary
• Digital ‘lasso’ around content
• Human sift through
Mrs Folly
An example pattern of research
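The pattern above (write code to find things in messy OCR text, throw a digital ‘lasso’ around candidates, then let a human sift through them) can be sketched in Python. This is only an illustration, not the code Labs used: it assumes the OCR text is a plain string and uses the standard-library `difflib` for approximate matching. The function name `lasso`, the 0.8 threshold and the sample sentence are all hypothetical.

```python
import difflib
import re


def lasso(ocr_text, target, threshold=0.8, context=30):
    """Return (score, snippet) pairs for word windows that roughly match `target`.

    Hypothetical helper: slides a window with as many words as `target`
    over the text and keeps every window that is similar enough.
    """
    words = list(re.finditer(r"\S+", ocr_text))
    n = len(target.split())
    hits = []
    for i in range(len(words) - n + 1):
        start, end = words[i].start(), words[i + n - 1].end()
        candidate = ocr_text[start:end]
        score = difflib.SequenceMatcher(
            None, candidate.lower(), target.lower()
        ).ratio()
        if score >= threshold:
            # Keep some surrounding text so a human can judge the hit.
            snippet = ocr_text[max(0, start - context):end + context]
            hits.append((round(score, 2), snippet))
    return hits


# Invented sample imitating OCR noise (digit for letter, long s read as 'f').
sample = "the part of Mrs Fo1ly was performed laft night; Mrs. Folly fang well"

for score, snippet in lasso(sample, "Mrs Folly"):
    print(score, "|", snippet)
```

A loose threshold casts a wider lasso and leaves more for the human sift; a strict one misses badly damaged words. Tuning it against a small hand-corrected ‘ground truth’ sample is the "tweak if necessary" step.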
20
Code: Machine Learning / Reading
• Labs sometimes use Machine Learning / Reading
techniques
• Analogies to how humans read / learn
• Machines acquire ‘knowledge’ / data, use that
knowledge / data to make sense / identify patterns
• Labs doing this on a case by case basis so methods
can vary
• Need computational & human effort
• Legalities of Text and Data mining being ‘ironed’
out with publishers, on-going…Often a misunderstood …
• Perhaps we need a metaphor from history…
https://goo.gl/gXmVQL
https://goo.gl/gDQEAz
https://goo.gl/k68fTf
© £
21
Smell of soup & Machine Learning
Thanks to Memo Akten (@memotv on twitter) for the inspiration!
https://goo.gl/toq4Bo
Nasreddin, 13th Century Turkish Sufi
http://web2.uvcs.uvic.ca/elc/studyzone/330/reading/smell1.htm
22
Victorian Meme Machine (2014)
https://goo.gl/HMqDt3
Bob Nicholson
http://victorianhumour.tumblr.com/
Bob Nicholson interviewed on
BBC Radio 4 Making History Programme:
http://goo.gl/fmV9ep
And telling jokes to the public:
http://goo.gl/xIDRhz
Bob obtained further funding from his university
Looking for more collaborations
https://www.youtube.com/watch?v=-GRgj7Q5OM0
Rob Walker, Victorian Mother-in-law Jokes
Victorian Comedy Night, 7 Nov 2016
Learnt about access paths
to digital collections
23
Katrina Navickas (2015)
Political Meetings Mapper
http://politicalmeetingsmapper.co.uk
https://goo.gl/Qq78Oa
Labs Symposium 2015
https://goo.gl/BSA3be
Interview 2015
The Chartist Newspaper
http://goo.gl/vOLSnH
Chartist Monster Meeting
Chartists Walking Tour and
Re-enactment London
Learnt that domain knowledge
reduces noise
24
Black Abolitionist Performances & their
Presence in Britain (2016) – Hannah-Rose Murray
Frederick
Douglass
Ellen
Craft
Josiah
Henson
Ida B
Wells
A Performance by
Joe Williams &
Martelle Edinborough
http://frederickdouglassinbritain.com/
Started to implement
Machine Learning Techniques
25
Data-mining verse in 18th Century newspapers
BL Labs Project 16-17, Jennifer Batt
https://goo.gl/5Akthd
Slides courtesy Jennifer Batt
Jennifer Batt @ the BL on World Poetry Day
26
What thoj' among ourrelves, with too much Heat, or t
W: fweutimes.wongle, wvhen we Ihould debate, W –
(A confequential Ill which Freedom drawvs, fl t
A bad Efficf, but from a noble Caufe) t
We can with univeifal Zcal advance, to
To cutb the faithlefs Arrogancccof V rance. hi
Dublin Journal, 10-14 September, 1745
Slides courtesy Jennifer Batt
27
Verse: 81% lines begin
with initial capital
Prose: 52% lines begin
with initial capital
Westminster Journal 3 March 1745
Slides courtesy Jennifer Batt
Started to refine
Machine Learning Techniques
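The capitalisation statistic above suggests a simple feature for telling verse from prose. The sketch below is an assumption about how such a feature could be computed, not the project's actual method. It looks at the first alphabetic character of each line, so stray OCR punctuation at the start of a line is skipped. The 0.7 cutoff is a hypothetical value chosen only because it falls between the 52% and 81% figures quoted, and the prose sample is invented.

```python
def initial_capital_ratio(lines):
    """Fraction of non-empty lines whose first letter is upper case."""
    considered = 0
    capitalised = 0
    for line in lines:
        stripped = line.strip()
        if not stripped:
            continue
        considered += 1
        # Skip leading punctuation such as "(" before testing the case.
        first_alpha = next((ch for ch in stripped if ch.isalpha()), None)
        if first_alpha is not None and first_alpha.isupper():
            capitalised += 1
    return capitalised / considered if considered else 0.0


def looks_like_verse(lines, cutoff=0.7):
    """Hypothetical rule of thumb: mostly capitalised line starts => verse."""
    return initial_capital_ratio(lines) >= cutoff


# OCR lines quoted from the Dublin Journal slide.
verse = [
    "What thoj' among ourrelves, with too much Heat, or t",
    "W: fweutimes.wongle, wvhen we Ihould debate, W –",
    "(A confequential Ill which Freedom drawvs, fl t",
    "A bad Efficf, but from a noble Caufe) t",
    "We can with univeifal Zcal advance, to",
    "To cutb the faithlefs Arrogancccof V rance. hi",
]
# Invented prose lines for contrast.
prose = [
    "Yesterday arrived a Mail from Holland, which",
    "brings advice that the Allied army is in",
    "motion, and that the French have retired",
    "towards Lille. The Dutch troops continue",
]

print(initial_capital_ratio(verse), looks_like_verse(verse))
print(initial_capital_ratio(prose), looks_like_verse(prose))
```

A single feature like this is noisy on its own, since about half of prose lines also start with a capital in the sample quoted above, so in practice it would be one input among several.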
28
Psychiatrist’s Journey
into 19th Century Newspapers (2016)
• Dr Surendra P Singh, Consultant Psychiatrist
• To identify weekly, monthly, yearly and
longitudinal trends in suicide reporting in
terms of gender, status, sites, locations and
health in OCR text of 19th Century
Newspapers
• Used ‘R’ Open Source Stats
Package to collect ‘Suicide’ corpus
• Looking for collaborators to work on this
dataset
Use off-the-shelf tools
and remote access pathways
29
Use of Overproof
OCR Correction?
Re-OCR with
ABBYY FineReader?
https://www.abbyy.com/en-gb/
http://overproof.projectcomputing.com/
RE-OCR
30
Virtual Infrastructure for OCR text
OCR text ‘scraped’ from
digitised newspapers
and put in cloud
Jupyter notebook
Write Python code and see results
in web browser
http://jupyter.org
Access available for researchers ‘in residence’
31
Other experiments with images
32
Worked better for female faces than men’s
Press
http://mechanicalcurator.tumblr.com
Posts image every 30 minutes
http://www.flickr.com/photos/britishlibrary/
1,020,418 images
need tagging!
Creative uses of images
Face recognition
Algorithms based on photos
Mechanical Curator
with an algorithmic brain
(Circles, Squares and Slanty etc)
http://goo.gl/qPPgxX
Wikimedia
Flickr Commons
Individual URL & API
Snipping out images
from 65,000 Digitised Books*
>600,000,000* views
>20,000,000* tags
https://goo.gl/FgZ4HM
Work @ BL by Ben O’Steen, Labs
and Digital Research Team
*Matt Prior - http://goo.gl/j29Tnx
Since Dec 2013
Tumblr
*Estimates
33
Tagging, Tagging, Tagging…
34
Tagging a million images
Iterative Crowdsourcing
http://goo.gl/j6fxac
Cardiff University’s
Lost Visions Project
http://www.metadatagames.org/
Metadata Games
James Heald
Mario Klingemann
Chico 45
Use computational methods
Human Tagger
Top British Library Flickr Commons Taggers
18 hard core taggers
How to reward and keep motivated?
Average for ‘crowd’ is 1 tag per person
Mobile games for ‘Ships’, ‘Covers’ and ‘Portraits’ Interface for tagging
35
Adam Crymble (2015)
Crowdsource Arcade
http://goo.gl/LBfJ4W
http://goo.gl/OH9pOZ
https://goo.gl/7z0j8p
30 mins talk
Labs Symposium (2015)
https://goo.gl/SSRsdd
5 min interview (2015)
http://goo.gl/0APpE8
Game Jam
Using Arcade Games
to help Tag images
36
Special Jury’s Prize (2015)
James Heald – Wikimedia and Map work
https://goo.gl/WYZCB2
http://goo.gl/HNQq5e
https://goo.gl/VPgffL
https://commons.wikimedia.org/
https://goo.gl/djtm1b
Labs Symposium (2015)
Geotagging maps
54,000 Maps
Found in Flickr 1 million
Human & Computational Tagging
& Community engagement
Geo-referencing work
37
SherlockNet: Competition Winner 2016
Karen Wang, Luda Zhao and Brian Do
Using Convolutional Neural Networks to Automatically Tag and Caption
the British Library Flickr Commons 1 million Image Collection
12 categories
>20 million tags added
>100,000 captions
bit.ly/sherlocknet
Pooled surrounding
OCR text on page
from similar images
Used Microsoft COCO (photographs) &
British Museum Prints and Drawings
collections as training sets.
Tags Captions
38
Artistic / Creative Works
http://goo.gl/dM8ieA
Mario Klingemann (2015)
Code Artist / Curator
https://www.youtube.com/watch?v=Q3SBxO34Zlc
David Normal 2014 and 2015
Collages/Paintings & Lightboxes
http://goo.gl/bNxGZZ
Kris Hoffman (2016)
Animation for Fashion Week 2016
https://goo.gl/QilqqT
Jiayi Chong 2016 - Animation tool
https://www.facebook.com/RealmlandStory/
Paul Rand Pierce 2016
Graphic Novel on Facebook
A Hat on the Ground Spells trouble
Tragic Looking Women
44 Men who Look 44
(Notice the direction faces)
39
Imaginary Cities – BL Labs Project 16-17
Michael Takeo Magruder
https://goo.gl/4ARwTy
An artistic exploration seeking to create provocative fictional cityscapes for the Information Age
from the British Library’s digital collection of historic urban maps
40
Learning & Teaching
The PhD Abstracts Collections in FLAX:
Learning Academic English with Electronic Theses Online Service (EThOS)
https://goo.gl/fOwHAe
Shaoqun Wu, Alannah Fitzgerald, Ian H. Witten and Chris Mansfield
41
Learning & Teaching
Library Carpentry
James
Baker
https://goo.gl/25cq99
And many more!
42
Commercial: Poetic Places (2016)
http://www.poeticplaces.uk/
Sarah Cole
43
Commercial:
Curating Digital Collections Go Mobile (2016)
http://www.biblioboard.com
Mitchell Davis (BiblioLabs)
See it in the Foyer!
44
#bldigital
1-2 %* digitised
* estimate
Digitisation
Partnerships
Commercial & Other Organisations
Amount
increasing rapidly
Bias in digitisation
http://goo.gl/bR9UJL
Sample Generator
45
Have you got X?
https://upload.wikimedia.org/wikipedia/commons/5/50/Real_wuerzburg.jpg
Looking for Physical Content in the British Library
46
Have you got X digitised?
http://www.yorkmix.com/wp-content/uploads/2014/04/mr-simms-sweet-shoppe-york.jpg
Looking for Digitised Content in the BL
47
• Digitisation costs time & resources & access can depend
on restrictions imposed by funders …
• Still…over 600 Digital Collections!!
But not all found through Google or even online!
• Need lots of engagement; the dialogue is either:
– you are ‘lucky’ & we have the digital content
/ data relevant to your research
– we don’t have exactly what you’re looking for,
but is there anything of interest? Let’s talk…
• Artists find this dialogue easier and we tend to
attract researchers with ‘fuzzier’ research boundaries
• Access easier for openly licensed content
• More challenging for on-site, in-copyright, data
protected, old content media and contemporary material
https://goo.gl/qpCLlk
So little digitised? Type of Engagement?
© £ 
https://goo.gl/Y5zCXg
©
https://goo.gl/wMTS3Z
48
only in
Reading
Rooms due
to ©
only on
site due to
© or
ethical etc
not online /
available –
various storage
devices,
personal data
online
and open
British Library
online
behind
paywall
Challenges of access to Digital Collections
49
The Story of the Digital Collection…
Digital
Collection
Curator
Who paid for the digitisation?
Who did the digitisation?
Technology used
Born digital?
Published
Unpublished
Where is it?
Can it still be accessed?
Generates income
Reputational Risk
Legalities
Political
Ego (all)
Surprises (e.g. gaps)
Metadata
Old format not supported
What media was the
digitisation done from?
Documentation
No Metadata
Messy Metadata
Still there?
Good to know the background of a
Digital collection if you want to use it for research and make conclusions…
50
Open Licensed Digital Content?
20% Openly
Licensed
Around 15%*
available online
Working through to make more open…
Though some collections will always only be available onsite due to ©
Breakdown by collection*
Manuscripts 59%
Books 9%
Maps and Views 7%
Newspapers 3%
Archives and Records 3%
Paintings, Prints and Drawings 2%
*Based on number of digitisation projects
Largest proportion of funding
Public / Private Partnership
20%* Openly Licensed
80%* Available onsite
*Estimates
51
How do we give access to
onsite-only
Digital Collections
(80% of our Digital Collections)?
52
READING
ROOM
ON
SITE
NOT
ONLINE
OPEN
British Library
£
Labs Residency Model
Challenges of access to Digital Collections
53
Accessing digital collections onsite
OPEN
£
• Have to be ‘onsite’
• Need to be security cleared for some collections
– Hence ‘Researcher in Residence Model’
• Permission required (depending on ‘story’ of collection)
• Content could be on various media formats
(not always online)
• 5 - 20 % re-use of material for non commercial research for
some collections
• We are learning ‘pathways’ so that this becomes ‘everyday’ to
provide onsite access to some digital collections in the future
54
Lessons Learned & Challenges…(1)
• Start with a conversation, our data isn’t all on Google
(yet!) & not easy to find. Need to create and embrace
serendipity & opportunities for use by talking!
• Need to have several conversations with several
stakeholders & sometimes tap into their tacit knowledge,
which isn’t always written down, to progress
ideas. https://goo.gl/XaHYT9
• Often misunderstandings because
of jargon & different meaning of words.
• Expectations change when researchers
actually see the data, systems &
experience the ‘culture’ of the organisation.
• Opening & using digital collections occasionally requires
letting go of the emotional & psychological connection to them
https://goo.gl/OYAsmK
?
https://goo.gl/ytmWnu
55
Lessons Learned & Challenges…(2)
• Embrace dirty data, it may never be perfect!
• Be careful about drawing conclusions from (or trusting)
‘black box’ software & techniques (e.g. sentiment analysis)
• We tend to work with researchers who can be ‘flexible’
with their research questions & are willing to embrace
challenges.
• Many researchers have the domain knowledge but
lack technical / digital skills to use Digital Research
methods. Should they be teamed up with those that
want to solve problems or get trained?
• Huge appetite to use digital content & data
(e.g. Flickr Commons stats).
https://goo.gl/mcpa8B
https://goo.gl/i5GVfI
https://goo.gl/yQ5s4U
https://goo.gl/kwcK8J
https://goo.gl/wMTS3Z
56
Labs mindset…
1. Start a conversation, generate positive energy
and try to support ideas
2. Start with small experiments, but think big!
3. Fail faster (don’t be afraid)!
4. Reject perfectionism!
5. Good enough is sometimes…good enough!
6. Celebrate the uses of digital collections
https://goo.gl/noASfl
57
https://goo.gl/SUOO0J
The Magic of Openness!
• If digitised / digital collections are
not used, what is the point of
digitising / keeping them
(i.e. apart from preservation)?
• Opening up our digital collections
offers new ways for the Library’s
content to be re-discovered,
remixed, re-imagined and
‘re-energised’
• Generates plenty of examples to
inspire use by others
58
Hey there Young Sailor!
Ling Low 2016 – Hey there Young Sailor
https://www.youtube.com/watch?v=bcOP1E5bRE0
VIMEO.COM/SWEETANDLOWFILMS
@SWEETNLOWFILMS ON INSTAGRAM
@SWEETNLOWLING ON TWITTER
The Impatient Sisters
59
The Future of BL Labs
• Continue to engage with researchers
• Learn what they want to do
• Collect evidence of demand
• Develop Business Model and Support
process to make ‘Business as Usual’ at
the British Library
• Help to create pathway to developing
a ‘Digital Research Suite’ at the
British Library by 2019
http://www.library.pitt.edu/digital-scholarship-services
https://goo.gl/W4TjGt
60
Taking a peek at our Open Data
A digitised book…
61
002819694
62
63
64
Optically Character Recognised (OCR)
generated Text
Scanned Page
Image on Flickr
Commons
https://goo.gl/AC43vs
65
OCR XML Generated by ABBYY FineReader
66
Taking a peek at our on-site only
accessible data
A digitised newspaper
67
Accessing digitised newspapers
onsite at the BL
1
Windows 7
External access possible through Citrix Server
Results of digitisation exist on Windows file shares!
68
Accessing digitised newspapers
onsite at the BL (JISC 1)
2
12 Volumes, each with terabytes of data
69
Accessing digitised newspapers
onsite at the BL
3
70
Accessing digitised newspapers
onsite at the BL
4
71
Accessing digitised newspapers
onsite at the BL
5
72
Accessing digitised newspapers
onsite at the BL
6
73
Accessing digitised newspapers
onsite at the BL
7
74
Accessing digitised newspapers
onsite at the BL
8
75
Accessing digitised newspapers
onsite at the BL
9
76
Accessing digitised newspapers
onsite at the BL
10
77
Accessing digitised newspapers
onsite at the BL
11
78
Accessing digitised newspapers
onsite at the BL
12
79
Accessing digitised newspapers
onsite at the BL
13
Accessing original ‘master’ image (not
cropped or post processed)
Or ‘service’ copy (post processed)
and results of OCR available as ALTO XML
80
Accessing digitised newspapers
onsite at the BL
14a
Accessing original ‘master’ image
(not cropped or post processed) in .TIFF format
81
Accessing digitised newspapers
onsite at the BL
Accessing original
‘master’ image
(not cropped or post
processed)
14b
82
Accessing digitised newspapers
onsite at the BL
15a
Accessing ‘service’ Copy (post processed)
and results of OCR available as ALTO XML
83
Accessing digitised newspapers
onsite at the BL
Accessing ‘service’
Copy (post processed)
15b
84
Accessing digitised newspapers
onsite at the BL
15c
Accessing OCR as ALTO XML
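As a rough sketch of what working with the ALTO XML involves, the Python below pulls the recognised text out line by line using only the standard library. In ALTO, each recognised word is a `String` element carrying its text in a `CONTENT` attribute, grouped under `TextLine` elements. Which ALTO version the newspaper files use is not stated here, so the code ignores the XML namespace rather than assuming one. The sample document and its words are invented for illustration.

```python
import xml.etree.ElementTree as ET


def local_name(tag):
    """Strip any '{namespace}' prefix so the code works across ALTO versions."""
    return tag.rsplit("}", 1)[-1]


def alto_lines(xml_text):
    """Return the OCR text of an ALTO document as a list of lines."""
    root = ET.fromstring(xml_text)
    lines = []
    for elem in root.iter():
        if local_name(elem.tag) == "TextLine":
            words = [
                child.get("CONTENT", "")
                for child in elem
                if local_name(child.tag) == "String"
            ]
            lines.append(" ".join(words))
    return lines


# Invented sample; the namespace shown is one of several ALTO versions.
SAMPLE = """<alto xmlns="http://www.loc.gov/standards/alto/ns-v2#">
  <Layout><Page><PrintSpace><TextBlock>
    <TextLine>
      <String CONTENT="Chartist" WC="0.91"/><SP/><String CONTENT="Meeting" WC="0.64"/>
    </TextLine>
    <TextLine>
      <String CONTENT="at" WC="0.99"/><SP/><String CONTENT="Manchefter" WC="0.42"/>
    </TextLine>
  </TextBlock></PrintSpace></Page></Layout>
</alto>"""

print(alto_lines(SAMPLE))
```

Where the files include the optional `WC` (word confidence) attribute, it can be read the same way with `child.get("WC")` to flag low-confidence words for manual checking.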
85
Accessing digitised newspapers
through Gale Interface (subscription)
1
86
Accessing digitised newspapers
through Gale Interface (subscription)
2
87
Explore or Imagine Our Data!
• CSV of Metadata
https://data.bl.uk/digbks/dig19cbooks-mdata-csv.csv
• 19th Century Books - Book Metadata - 01/09/2013.
https://data.bl.uk/digbks/db21.html
• Digitised Books - Flickr Tag History - Dec 2013 to March 2016.
TSV
https://data.bl.uk/digbks/db15.html
• Digitised Hebrew Manuscripts - Metadata
https://data.bl.uk/hebrewmanuscripts/heb1.html
• Digitised Hebrew Manuscripts: Or 2210 - Or 2364
https://data.bl.uk/hebrewmanuscripts/heb8.html
• Theatrical playbills from Britain and Ireland (OCR text only)
https://data.bl.uk/playbills/pb2.html
• Portraits of actors, views of theatres and playbills (covering
1750 - 1821 in a single volume)
https://data.bl.uk/singlesheet/por1.html
• Volumes of Lysons Collectanea (Amusements), comprising
broadsides, cuttings, advertisements on amusements, 1660-1840.
https://data.bl.uk/singlesheet/ad1.html
https://data.bl.uk
• Have a look at the data.
• Data Quality
• Issues
Or share an idea you have for
what to do with the data!
http://labs.bl.uk/Ideas+for+Labs
Smaller datasets
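One way to "have a look at the data" and its quality is to count how often each metadata column is empty. The sketch below assumes a comma-separated file with a header row. The column names and sample rows are invented, so check the real header of the downloaded CSV before relying on any particular field.

```python
import csv
import io
from collections import Counter


def missing_value_counts(csv_file):
    """Return (row_count, {column: number_of_empty_cells}) for an open CSV file."""
    reader = csv.DictReader(csv_file)
    missing = Counter({name: 0 for name in reader.fieldnames or []})
    total = 0
    for row in reader:
        total += 1
        for name, value in row.items():
            # name is None only for surplus cells beyond the header; skip those.
            if name is not None and not (value or "").strip():
                missing[name] += 1
    return total, dict(missing)


# Invented rows standing in for a metadata file; real column names will differ.
SAMPLE_CSV = (
    "identifier,title,date,place\n"
    "A1,An invented title,1868,London\n"
    "A2,Another invented title,,\n"
    "A3,,1850,Edinburgh\n"
)

total, missing = missing_value_counts(io.StringIO(SAMPLE_CSV))
print(total, missing)
```

For a downloaded file, something like `open("dig19cbooks-mdata-csv.csv", newline="", encoding="utf-8")` could be passed in instead of the `StringIO` object; the UTF-8 encoding is an assumption and may need adjusting.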

More Related Content

What's hot

DH Project Management
DH Project ManagementDH Project Management
DH Project Managementlabsbl
 
20130629 If you build it, will they visit [ala lita lightning talk]
20130629 If you build it, will they visit [ala lita lightning talk]20130629 If you build it, will they visit [ala lita lightning talk]
20130629 If you build it, will they visit [ala lita lightning talk]Frederick Zarndt
 
Engaging and Supporting Researchers who want to use the British Library’s Dig...
Engaging and Supporting Researchers who want to use the British Library’s Dig...Engaging and Supporting Researchers who want to use the British Library’s Dig...
Engaging and Supporting Researchers who want to use the British Library’s Dig...labsbl
 
BL Labs at ArtLab 2016
BL Labs at ArtLab 2016BL Labs at ArtLab 2016
BL Labs at ArtLab 2016labsbl
 
British Library Labs Roadshow 2016 UCL 24 Feb 2016
British Library Labs Roadshow 2016 UCL 24 Feb 2016British Library Labs Roadshow 2016 UCL 24 Feb 2016
British Library Labs Roadshow 2016 UCL 24 Feb 2016labsbl
 
What You Need To Know Before Gamifying Your Library
What You Need To Know Before Gamifying Your Library What You Need To Know Before Gamifying Your Library
What You Need To Know Before Gamifying Your Library Bohyun Kim
 
Library Book Search
Library Book SearchLibrary Book Search
Library Book SearchErin Sees
 

British Library Labs Roadshow 2017 at the University of Birmingham

  • 1. 1 @mahendra_mahey @BL_Labs @BL_DigiSchol #bldigital https://goo.gl/EvKGfa mahendra.mahey@bl.uk British Library Labs What is British Library Labs and what have we learned over the last four years? 1320-1420 & 1600-1615, 10 May 2017 Learning the Lessons of working with the British Library’s Digital Content and Data for your research British Library data and collections and discussions and feedback on ideas, challenges and issues College of Arts and Law, University of Birmingham, UK https://goo.gl/EvKGfa Mahendra Mahey Manager of British Library Labs
  • 2. 2 @mahendra_mahey @BL_Labs @BL_DigiSchol #bldigital https://goo.gl/EvKGfa mahendra.mahey@bl.uk The British Library Inside the British Library Space for 1200 readers, around 400,000 visitors per year Building 37 uses low oxygen and robots Reading room and delivery to London Document Supply and Storage at Boston Spa Stockton-on-Tees Author right to payment each time their books are borrowed from public libraries. St Pancras, London, UK Many books are stored 4 stories below the building UK Legal Deposit Library – Reference only
  • 3. 3 @mahendra_mahey @BL_Labs @BL_DigiSchol #bldigital https://goo.gl/EvKGfa mahendra.mahey@bl.uk Living Knowledge Vision (2015 – 2023) Custodianship Research Business Culture Learning International To make our intellectual heritage accessible to everyone, for research, inspiration and enjoyment and be the most open, creative and innovative institution of its kind by 2023. Document:http://goo.gl/h41wW7 Speech:https://goo.gl/Py9uHK Roly Keating (Chief Executive Officer of the British Library) To make our intellectual heritage accessible to everyone, for research, inspiration and enjoyment and be the most open, creative and innovative institution of its kind by 2023.
  • 4. 4 @mahendra_mahey @BL_Labs @BL_DigiSchol #bldigital https://goo.gl/EvKGfa mahendra.mahey@bl.uk Collections – not just books! > 180*million items > 0.8* m serial titles > 8* m stamps > 14* m books > 6* m sound recordings > 4* m maps > 1.6* m musical scores > 0.3* m manuscripts > 60* m patents King’s Library *Estimates
  • 5. 5 @mahendra_mahey @BL_Labs @BL_DigiSchol #bldigital https://goo.gl/EvKGfa mahendra.mahey@bl.uk http://www.bl.uk/projects/british-library-labs Funded by the Andrew W. Mellon Foundation
  • 6. 6 @mahendra_mahey @BL_Labs @BL_DigiSchol #bldigital https://goo.gl/EvKGfa mahendra.mahey@bl.uk http://www.bl.uk/projects/british-library-labs Funded by the Andrew W. Mellon Foundation
  • 7. 7 @mahendra_mahey @BL_Labs @BL_DigiSchol #bldigital https://goo.gl/EvKGfa mahendra.mahey@bl.uk Wider…not just Researchers Researchers https://goo.gl/WutNyi Artists http://goo.gl/nNKhQ2 Librarians Curators https://goo.gl/9NWZUW Software Developers https://goo.gl/7QQ5Tf Archivists https://goo.gl/x7b4tg Educators https://goo.gl/qh01Mi
  • 8. 8 @mahendra_mahey @BL_Labs @BL_DigiSchol #bldigital https://goo.gl/EvKGfa mahendra.mahey@bl.uk Digital research methods Digital Scholarship Visualisations Application Programming Interfaces (APIs) for datasets e.g. Metadata, Images Transcribing Annotation Location based searching & Geo-tagging Corpus analysis, Text Mining & Natural Language Processing Crowdsourcing Human Computation
  • 9. 9 @mahendra_mahey @BL_Labs @BL_DigiSchol #bldigital https://goo.gl/EvKGfa mahendra.mahey@bl.uk How are we doing this?
  • 10. 10 @mahendra_mahey @BL_Labs @BL_DigiSchol #bldigital https://goo.gl/EvKGfa mahendra.mahey@bl.uk Competition Awards Projects Tell us your ideas of what to do with our digital content Show us what you have already done with our digital content in research, artistic, commercial and learning and teaching categories Talk to us about working on collaborative projects
  • 11. 11 @mahendra_mahey @BL_Labs @BL_DigiSchol #bldigital https://goo.gl/EvKGfa mahendra.mahey@bl.uk Why are we doing this?
  • 12. 12 @mahendra_mahey @BL_Labs @BL_DigiSchol #bldigital https://goo.gl/EvKGfa mahendra.mahey@bl.uk Why are doing this? • Working closely with and listening to those who want use our digital collections and data for their work • We can learn how we are and should be supporting them (shapes the problems we work on): – Access to digital collections? – Advice, guidance, technical support, training – Services, Tools and Processes? – Many more reasons… • Where are the gaps between what users want & what we can give? • How do we build the bridges to overcome the gaps? • How do we help users to navigate their way through the Library to what they want to do? https://goo.gl/esqpRb https://goo.gl/6CwCeE https://goo.gl/62JnQT
  • 13. 13 @mahendra_mahey @BL_Labs @BL_DigiSchol #bldigital https://goo.gl/EvKGfa mahendra.mahey@bl.uk Born digital Data all around us! / Knowledge Quarter London 89 knowledge organisations (as of 07/07/17) within 1 mile radius of Kings Cross, http://www.knowledgequarter.london http://www.turing.ac.uk (Headquartered at the British Library) UK Web Archive and e-legal deposit (2013) http://www.webarchive.org.uk/ukwa/ Born digital Data all around us!
  • 14. 14 @mahendra_mahey @BL_Labs @BL_DigiSchol #bldigital https://goo.gl/EvKGfa mahendra.mahey@bl.uk #bldigital 1-2 %* digitised * estimate Digitisation Partnerships Commercial & Other Organisations Amount increasing rapidly Bias in digitisation http://goo.gl/bR9UJL Sample Generator
  • 15. 15 @mahendra_mahey @BL_Labs @BL_DigiSchol #bldigital https://goo.gl/EvKGfa mahendra.mahey@bl.uk Playbills, Books, Newspapers (includes OCR) Digital collections and Datasets British National Bibliography http://bnb.data.bl.uk http://sounds.bl.ukhttp://dml.city.ac.uk/ Music (Recordings & Sheet) & Sounds http://goo.gl/frSMJt Broadcast News (TV and Radio) http://goo.gl/cwThHw http://goo.gl/pBkisZhttp://goo.gl/E8aRyQ Usage data EtHOS Web ArchiveImages, Manuscripts & Maps http://www.qdl.qa/ Qatar Digital Library http://idp.bl.uk/ International Dunhuang Project Maps http://www.bl.uk/maps/ Hebrew Manuscripts http://goo.gl/4sbCp9 Flickr & Wikimedia Commons https://goo.gl/LZRmaZ
  • 16. 16 @mahendra_mahey @BL_Labs @BL_DigiSchol #bldigital https://goo.gl/EvKGfa mahendra.mahey@bl.uk Open Cultural Heritage Datasets Collection Guides (173 as of 04/05/17) https://www.bl.uk/collection-guides/ Datasets about our collections Bibliographic datasets relating to our published and archival holdings Datasets for content mining Content suitable for use in text and data mining research Datasets for image analysis Image collections suitable for large-scale image- analysis-based research Datasets from UK Web Archive Data and API services available for accessing UK Web Archive Digital mapping Geospatial data, cartographic applications, digital aerial photography and scanned historic map materials https://data.bl.uk Download collections as zips, no API Each dataset has a Digital Object Identifier (DOI) Discussion list: http://www.jiscmail.ac.uk/CULTURAL-HERITAGE-DATASETS
  • 17. 17 @mahendra_mahey @BL_Labs @BL_DigiSchol #bldigital https://goo.gl/EvKGfa mahendra.mahey@bl.uk What did people actually do?
  • 18. 18 @mahendra_mahey @BL_Labs @BL_DigiSchol #bldigital https://goo.gl/EvKGfa mahendra.mahey@bl.uk Example pattern of research for Labs • Finding invisible / well hidden things in ‘messy’ historical data • Unearthing / unlocking hidden histories & data to stimulate new research • Celebrating hidden histories / data creatively through events, art & performance https://goo.gl/vJ291F https://goo.gl/mcpa8B https://goo.gl/Ql0Bwz Not the British Library!
  • 19. 19 @mahendra_mahey @BL_Labs @BL_DigiSchol #bldigital https://goo.gl/EvKGfa mahendra.mahey@bl.uk https://goo.gl/oUNj5N https://goo.gl/ImAUv4 Finding things in ‘messy’ Optical Character Recognised (OCR) text Mrs Folly • Clean up some manually • Get human ‘ground truth’ • Write code to find things reliably in it automatically • Try code on messy content • Tweak if necessary • Digital ‘lasso’ around content • Human sift through Mrs Folly An example pattern of research
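The pattern on the slide above (hand-correct a sample, treat it as ground truth, write code to find things, try it on messy text, tweak) can be sketched in Python. This is only an illustrative sketch, not the code Labs researchers used: the function names, the sample lines and the regular expression are hypothetical, and only the name "Mrs Folly" comes from the slide.

```python
import re


def find_matches(lines, pattern):
    """Return the indexes of the lines in which the pattern occurs (case-insensitive)."""
    regex = re.compile(pattern, re.IGNORECASE)
    return {index for index, line in enumerate(lines) if regex.search(line)}


def precision_recall(found, truth):
    """Compare automatically found line indexes with human 'ground truth' indexes."""
    true_positives = len(found & truth)
    precision = true_positives / len(found) if found else 0.0
    recall = true_positives / len(truth) if truth else 0.0
    return precision, recall


# Hypothetical messy OCR lines; a human has marked lines 0 and 1 as mentions.
ocr_lines = ["Mrs Folly sang", "Mrs Fo11y danced", "The weather", "Mr Folly?"]
ground_truth = {0, 1}

# First attempt: a plain search misses the OCR error and catches a false hit.
first_try = precision_recall(find_matches(ocr_lines, r"Folly"), ground_truth)

# Tweaked pattern: tolerate 'l' read as '1' and require the title 'Mrs'.
second_try = precision_recall(
    find_matches(ocr_lines, r"Mrs\s+Fo[l1]{2}y"), ground_truth
)
```

Lines the tweaked pattern still returns form the digital 'lasso'; a human then sifts through that smaller set.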
  • 20. 20 @mahendra_mahey @BL_Labs @BL_DigiSchol #bldigital https://goo.gl/EvKGfa mahendra.mahey@bl.uk Code: Machine Learning / Reading • Labs sometimes use Machine Learning / Reading techniques • Analogies to how humans read / learn • Machines acquire ‘knowledge’ / data, use that knowledge / data to make sense / identify patterns • Labs doing this on a case by case basis so methods can vary • Need computational & human effort • Legalities of Text and Data mining being ‘ironed’ out with publishers, on-going…Often a misunderstood … • Perhaps we need a metaphor from history… https://goo.gl/gXmVQL https://goo.gl/gDQEAz https://goo.gl/k68fTf © £
  • 21. 21 @mahendra_mahey @BL_Labs @BL_DigiSchol #bldigital https://goo.gl/EvKGfa mahendra.mahey@bl.uk Smell of soup & Machine Learning Thanks to Memo Akten (@memotv on twitter) for the inspiration! https://goo.gl/toq4Bo Nasreddin, 13th Century Turkish Sufi http://web2.uvcs.uvic.ca/elc/studyzone/330/reading/smell1.htm
  • 22. 22 @mahendra_mahey @BL_Labs @BL_DigiSchol #bldigital https://goo.gl/EvKGfa mahendra.mahey@bl.uk Victorian Meme Machine (2014) https://goo.gl/HMqDt3 Bob Nicholson http://victorianhumour.tumblr.com/ Bob Nicholson interviewed on BBC Radio 4 Making History Programme: http://goo.gl/fmV9ep And telling jokes to the public: http://goo.gl/xIDRhz Bob obtained further funding from his university Looking for more collaborations https://www.youtube.com/watch?v=-GRgj7Q5OM0 Rob Walker, Victorian Mother-in-law Jokes Victorian Comedy Night, 7 Nov 2016 Learnt about access paths to digital collections
  • 23. 23 @mahendra_mahey @BL_Labs @BL_DigiSchol #bldigital https://goo.gl/EvKGfa mahendra.mahey@bl.uk Katrina Navickas (2015) Political Meetings Mapper http://politicalmeetingsmapper.co.uk https://goo.gl/Qq78Oa Labs Symposium 2015 https://goo.gl/BSA3be Interview 2015 The Chartist Newspaper http://goo.gl/vOLSnH Chartist Monster Meeting Chartists Walking Tour and Re-enactment London Learnt that domain knowledge reduces noise
  • 24. 24 @mahendra_mahey @BL_Labs @BL_DigiSchol #bldigital https://goo.gl/EvKGfa mahendra.mahey@bl.uk Black Abolitionist Performances & their Presence in Britain (2016) – Hannah-Rose Murray Frederick Douglass Ellen Craft Josiah Henson Ida B Wells A Performance by Joe Williams & Martelle Edinborough http://frederickdouglassinbritain.com/ Started to implement Machine Learning Techniques
  • 25. 25 @mahendra_mahey @BL_Labs @BL_DigiSchol #bldigital https://goo.gl/EvKGfa mahendra.mahey@bl.uk Data-mining verse in 18th Century newspapers BL Labs Project 16-17, Jennifer Batt https://goo.gl/5Akthd Slides courtesy Jennifer Batt Jennifer Batt @ the BL on World Poetry Day
  • 26. 26 @mahendra_mahey @BL_Labs @BL_DigiSchol #bldigital https://goo.gl/EvKGfa What thoj' among ourrelves, with too much Heat, or t W: fweutimes.wongle, wvhen we Ihould debate, W – (A confequential Ill which Freedom drawvs, fl t A bad Efficf, but from a noble Caufe) t We can with univeifal Zcal advance, to To cutb the faithlefs Arrogancccof V rance. hi Dublin Journal, 10-14 September, 1745 Slides courtesy Jennifer Batt
  • 27. 27 @mahendra_mahey @BL_Labs @BL_DigiSchol #bldigital https://goo.gl/EvKGfa Verse: 81% lines begin with initial capital Prose: 52% lines begin with initial capital Westminster Journal 3 March 1745 Slides courtesy Jennifer Batt Started to refine Machine Learning Techniques
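The figures on the slide above (81% of verse lines begin with an initial capital against 52% of prose lines) suggest a simple feature for separating verse from prose in OCR text. The following Python sketch shows how such a feature could be computed. It is an assumption about the general approach, not Jennifer Batt's actual code: the function names are hypothetical, and the 0.7 threshold is an arbitrary cut-off chosen between the two reported percentages.

```python
def initial_capital_ratio(lines):
    """Return the fraction of non-empty lines whose first character is an uppercase letter."""
    non_empty = [line.strip() for line in lines if line.strip()]
    if not non_empty:
        return 0.0
    capitalised = sum(1 for line in non_empty if line[0].isupper())
    return capitalised / len(non_empty)


def looks_like_verse(lines, threshold=0.7):
    """Flag a block of lines as possible verse.

    The threshold is a hypothetical value lying between the 81% (verse)
    and 52% (prose) figures quoted on the slide; it would need tuning
    against human-checked samples.
    """
    return initial_capital_ratio(lines) >= threshold
```

A single feature like this will misfire on messy OCR, so in practice it would be one signal among several, with a human checking the candidates it returns.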
  • 28. 28 @mahendra_mahey @BL_Labs @BL_DigiSchol #bldigital https://goo.gl/EvKGfa mahendra.mahey@bl.uk Psychiatrist’s Journey into 19th Century Newspapers (2016) • Dr Surendra P Singh, Consultant Psychiatrist • To identify weekly, monthly, yearly and longitudinal trends in suicide reporting in terms of gender, status, sites, locations and health in OCR text of 19th Century Newspapers • Used ‘R’ Open Source Stats Package to collect ‘Suicide’ corpus • Looking for collaborators to work on this dataset Use off-the-shelf tools and remote access pathways
  • 29. 29 @mahendra_mahey @BL_Labs @BL_DigiSchol #bldigital https://goo.gl/EvKGfa mahendra.mahey@bl.uk Use of Overproof OCR Correction? Re-OCR with ABBYY FineReader? https://www.abbyy.com/en-gb/ http://overproof.projectcomputing.com/ RE-OCR
  • 30. 30 @mahendra_mahey @BL_Labs @BL_DigiSchol #bldigital https://goo.gl/EvKGfa mahendra.mahey@bl.uk Virtual Infrastructure for OCR text OCR text ‘scraped’ from digitised newspapers and put in cloud Jupyter notebook Write python code and results in web browser http://jupyter.org Access available for researchers ‘in residence’
  • 31. 31 @mahendra_mahey @BL_Labs @BL_DigiSchol #bldigital https://goo.gl/EvKGfa mahendra.mahey@bl.uk Other experiments with images
  • 32. 32 @mahendra_mahey @BL_Labs @BL_DigiSchol #bldigital https://goo.gl/EvKGfa Worked better for female faces than men’s Press http://mechanicalcurator.tumblr.com Posts image every 30 minutes http://www.flickr.com/photos/britishlibrary/ 1,020,418 images need tagging! Creative uses of images Face recognition Algorithms based on photos Mechanical Curator with an algorithmic brain (Circles, Squares and Slanty etc) http://goo.gl/qPPgxX Wikimedia Flickr Commons Individual URL & API Snipping out images from 65,000 Digitised Books* >600,000,000* views >20,000,000* tags https://goo.gl/FgZ4HM Work @ BL by Ben O’Steen, Labs and Digital Research Team *Matt Prior - http://goo.gl/j29Tnx Since Dec 2013 Tumblr *Estimates
  • 33. 33 @mahendra_mahey @BL_Labs @BL_DigiSchol #bldigital https://goo.gl/EvKGfa mahendra.mahey@bl.uk Tagging, Tagging, Tagging…
  • 34. 34 @mahendra_mahey @BL_Labs @BL_DigiSchol #bldigital https://goo.gl/EvKGfa mahendra.mahey@bl.uk Tagging a million images Iterative Crowdsourcing http://goo.gl/j6fxac Cardiff University’s Lost Visions Project http://www.metadatagames.org/ Metadata Games James Heald Mario Klingemann Chico 45 Use computational methods Human Tagger Top British Library Flickr Commons Taggers 18 hard core taggers How to reward and keep motivated? Average for ‘crowd’ is 1 tag per person Mobile games for ‘Ships’, ‘Covers’ and ‘Portraits’ Interface for tagging
  • 35. 35 @mahendra_mahey @BL_Labs @BL_DigiSchol #bldigital https://goo.gl/EvKGfa mahendra.mahey@bl.uk Adam Crymble (2015) Crowdsource Arcade http://goo.gl/LBfJ4W http://goo.gl/OH9pOZ https://goo.gl/7z0j8p 30 mins talk Labs Symposium (2015) https://goo.gl/SSRsdd 5 min interview (2015) http://goo.gl/0APpE8 Game Jam Using Arcade Games to help Tag images
  • 36. 36 @mahendra_mahey @BL_Labs @BL_DigiSchol #bldigital https://goo.gl/EvKGfa mahendra.mahey@bl.uk Special Jury’s Prize (2015) James Heald – Wikimedia and Map work https://goo.gl/WYZCB2 http://goo.gl/HNQq5e https://goo.gl/VPgffL https://commons.wikimedia.org/ https://goo.gl/djtm1b Labs Symposium (2015) Geotagging maps 54,000 Maps Found in Flickr 1 million Human & Computational Tagging & Community engagement Geo-referencing work
  • 37. 37 @mahendra_mahey @BL_Labs @BL_DigiSchol #bldigital https://goo.gl/EvKGfa mahendra.mahey@bl.uk SherlockNet: Competition Winner 2016 Karen Wang, Luda Zhao and Brian Do Using Convolutional Neural Networks to Automatically Tag and Caption the British Library Flickr Commons 1 million Image Collection 12 categories >20 million tags added >100,000 captions bit.ly/sherlocknet Pooled surrounding OCR text on page from similar images Used Microsoft COCO (photographs) & British Museum Prints and Drawings collections as training sets. Tags Captions
  • 38. 38 @mahendra_mahey @BL_Labs @BL_DigiSchol #bldigital https://goo.gl/EvKGfa mahendra.mahey@bl.uk Artistic / Creative Works http://goo.gl/dM8ieA Mario Klingemann (2015) Code Artist / Curator https://www.youtube.com/watch?v=Q3SBxO34Zlc David Normal 2014 and 2015 Collages/Paintings & Lightboxes http://goo.gl/bNxGZZ Kris Hoffman (2016) Animation for Fashion Week 2016 https://goo.gl/QilqqT Jiayi Chong 2016 - Animation tool https://www.facebook.com/RealmlandStory/ Paul Rand Pierce 2016 Graphic Novel on Facebook A Hat on the Ground Spells trouble Tragic Looking Women 44 Men who Look 44 (Notice the direction faces)
  • 39. 39 @mahendra_mahey @BL_Labs @BL_DigiSchol #bldigital https://goo.gl/EvKGfa mahendra.mahey@bl.uk Imaginary Cities – BL Labs Project 16-17 Michael Takeo Magruder https://goo.gl/4ARwTy An artistic exploration seeking to create provocative fictional cityscapes for the Information Age from the British Library’s digital collection of historic urban maps
  • 40. 40 @mahendra_mahey @BL_Labs @BL_DigiSchol #bldigital https://goo.gl/EvKGfa mahendra.mahey@bl.uk Learning & Teaching The PhD Abstracts Collections in FLAX: Learning Academic English with Electronic Theses Online Service (EThOS) https://goo.gl/fOwHAe Shaoqun Wu, Alannah Fitzgerald, Ian H. Witten and Chris Mansfield
  • 41. 41 @mahendra_mahey @BL_Labs @BL_DigiSchol #bldigital https://goo.gl/EvKGfa mahendra.mahey@bl.uk Learning & Teaching Library Carpentry James Baker https://goo.gl/25cq99 And many more!
  • 42. 42 @mahendra_mahey @BL_Labs @BL_DigiSchol #bldigital https://goo.gl/EvKGfa mahendra.mahey@bl.uk Commercial: Poetic Places (2016) http://www.poeticplaces.uk/ Sarah Cole
  • 43. 43 @mahendra_mahey @BL_Labs @BL_DigiSchol #bldigital https://goo.gl/EvKGfa mahendra.mahey@bl.uk Commercial: Curating Digital Collections Go Mobile (2016) http://www.biblioboard.com Mitchell Davis (BiblioLabs) See it in the Foyer!
  • 44. 44 @mahendra_mahey @BL_Labs @BL_DigiSchol #bldigital https://goo.gl/EvKGfa mahendra.mahey@bl.uk #bldigital 1-2 %* digitised * estimate Digitisation Partnerships Commercial & Other Organisations Amount increasing rapidly Bias in digitisation http://goo.gl/bR9UJL Sample Generator
  • 45. 45 @mahendra_mahey @BL_Labs @BL_DigiSchol #bldigital https://goo.gl/EvKGfa mahendra.mahey@bl.uk Have you got X? https://upload.wikimedia.org/wikipedia/commons/5/50/Real_wuerzburg.jpg Looking for Physical Content in the British Library
  • 46. 46 @mahendra_mahey @BL_Labs @BL_DigiSchol #bldigital https://goo.gl/EvKGfa mahendra.mahey@bl.uk Have you got X digitised? http://www.yorkmix.com/wp-content/uploads/2014/04/mr-simms-sweet-shoppe-york.jpg Looking for Digitised Content in the BL
  • 47. 47 @mahendra_mahey @BL_Labs @BL_DigiSchol #bldigital https://goo.gl/EvKGfa mahendra.mahey@bl.uk • Digitisation costs time & resources & access can depend on restrictions imposed by funders … • Still…over 600 Digital Collections!! But not all found through Google or even online! • Need lots of engagement, dialogue is either: – you are ‘lucky’ & we have the digital content / data relevant to your research – we don’t have exactly what you’re looking for, but is there anything of interest? Let’s talk… • Artists find this dialogue easier and we tend to attract researchers with ‘fuzzier’ research boundaries • Access easier for openly licensed content • More challenging for on-site, in-copyright, data protected, old content media and contemporary material https://goo.gl/qpCLlk So little digitised? Type of Engagement? © £ https://goo.gl/Y5zCXg © https://goo.gl/wMTS3Z
  • 48. 48 @mahendra_mahey @BL_Labs @BL_DigiSchol #bldigital https://goo.gl/EvKGfa mahendra.mahey@bl.uk only in Reading Rooms due to © only on site due to © or ethical etc not online / available – various storage devices, personal data online and open British Library online behind paywall Challenges of access to Digital Collections
  • 49. 49 @mahendra_mahey @BL_Labs @BL_DigiSchol #bldigital https://goo.gl/EvKGfa mahendra.mahey@bl.uk The Story of the Digital Collection… Digital Collection Curator Who paid for the digitisation? Who did the digitisation? Technology used Born digital? Published Unpublished Where is it? Can it still be accessed? Generates income Reputational Risk Legalities Political Ego (all) Surprises (e.g. gaps) Metadata Old format not supported What media was the digitisation done from? Documentation No Metadata Messy Metadata Still there? Good to know the background of a Digital collection if you want to use it for research and make conclusions…
  • 50. 50 @mahendra_mahey @BL_Labs @BL_DigiSchol #bldigital https://goo.gl/EvKGfa mahendra.mahey@bl.uk Open Licensed Digital Content? 20% Openly Licensed Around 15%* available online Working through to make more open… Though some collections will always only be available onsite due to © Breakdown by collection* Manuscripts 59% Books 9% Maps and Views 7% Newspapers 3% Archives and Records 3% Paintings, Prints and Drawings 2% *Based on number of digitisation projects Largest proportion of funding Public / Private Partnership 20%* Openly Licensed 80%* Available onsite *Estimates
  • 51. 51 @mahendra_mahey @BL_Labs @BL_DigiSchol #bldigital https://goo.gl/EvKGfa mahendra.mahey@bl.uk How do we give access to onsite-only Digital Collections (80% of our Digital Collections)?
  • 52. 52 @mahendra_mahey @BL_Labs @BL_DigiSchol #bldigital https://goo.gl/EvKGfa mahendra.mahey@bl.uk READING ROOM ON SITE NOT ONLINE OPEN British Library £ Labs Residency Model Challenges of access to Digital Collections
  • 53. 53 @mahendra_mahey @BL_Labs @BL_DigiSchol #bldigital https://goo.gl/EvKGfa mahendra.mahey@bl.uk Accessing digital collections onsite OPEN £ • Have to be ‘onsite’ • Need to be security cleared for some collections – Hence ‘Researcher in Residence Model’ • Permission required (depending on ‘story’ of collection) • Content could be on various media formats (not always online) • 5 - 20 % re-use of material for non commercial research for some collections • We are learning ‘pathways’ so that this becomes ‘everyday’ to provide onsite access to some digital collections in the future
  • 54. 54 @mahendra_mahey @BL_Labs @BL_DigiSchol #bldigital https://goo.gl/EvKGfa mahendra.mahey@bl.uk Lessons Learned & Challenges…(1) • Start with a conversation, our data isn’t all on Google (yet!) & not easy to find. Need to create and embrace serendipity & opportunities for use by talking! • Need to have several conversations with several stakeholders & tap into their tacit knowledge that isn’t always written down sometimes to progress ideas. https://goo.gl/XaHYT9 • Often misunderstandings because of jargon & different meaning of words. • Expectations change when researchers actually see the data, systems & experience the ‘culture’ of the organisation. • Opening & using digital collections occasionally requires a need to let go of the emotional & psychological connection to them https://goo.gl/OYAsmK ? https://goo.gl/ytmWnu
  • 55. 55 @mahendra_mahey @BL_Labs @BL_DigiSchol #bldigital https://goo.gl/EvKGfa mahendra.mahey@bl.uk Lessons Learned & Challenges…(2) • Embrace dirty data, it may never be perfect! • Careful of making conclusions (trust) based on ‘black box’ software & techniques (e.g. sentiment analysis) • We tend to work with researchers who can be ‘flexible’ with their research questions & are willing to embrace challenges. • Many researchers have the domain knowledge but lack technical / digital skills to use Digital Research methods. Should they be teamed up with those that want to solve problems or get trained? • Huge appetite to use digital content & data (e.g. Flickr Commons stats). https://goo.gl/mcpa8B https://goo.gl/i5GVfI https://goo.gl/yQ5s4U https://goo.gl/kwcK8J https://goo.gl/wMTS3Z
  • 56. 56 @mahendra_mahey @BL_Labs @BL_DigiSchol #bldigital https://goo.gl/EvKGfa mahendra.mahey@bl.uk Labs mindset… 1. Start a conversation, generate positive energy and try to support ideas 2. Start with small experiments, but think big! 3. Fail faster (don’t be afraid)! 4. Reject perfectionism! 5. Good enough is sometimes…good enough! 6. Celebrate the uses of digital collections https://goo.gl/noASfl
  • 57. 57 @mahendra_mahey @BL_Labs @BL_DigiSchol #bldigital https://goo.gl/EvKGfa mahendra.mahey@bl.uk https://goo.gl/SUOO0J The Magic of Openness! • If digitised / digital collections are not used, what is the point of digitising / keeping them (i.e. apart from preservation)? • Opening up our digital collections offers new ways for the Library’s content to be re-discovered, remixed, re-imagined and ‘re- energised’ • Generates plenty of examples to inspire use by others
  • 58. 58 @mahendra_mahey @BL_Labs @BL_DigiSchol #bldigital https://goo.gl/EvKGfa mahendra.mahey@bl.uk Hey there Young Sailor! Ling Low 2016 – Hey there Young Sailor https://www.youtube.com/watch?v=bcOP1E5bRE0 VIMEO.COM/SWEETANDLOWFILMS @SWEETNLOWFILMS ON INSTAGRAM @SWEETNLOWLING ON TWITTER The Impatient Sisters
  • 59. 59 @mahendra_mahey @BL_Labs @BL_DigiSchol #bldigital https://goo.gl/EvKGfa mahendra.mahey@bl.uk The Future of BL Labs • Continue to engage with researchers • Learn what they want to do • Collect evidence of demand • Develop Business Model and Support process to make ‘Business as Usual’ at the British Library • Help to create pathway to developing a ‘Digital Research Suite’ at the British Library by 2019 http://www.library.pitt.edu/digital-scholarship-services https://goo.gl/W4TjGt
  • 60. 60 @mahendra_mahey @BL_Labs @BL_DigiSchol #bldigital https://goo.gl/EvKGfa mahendra.mahey@bl.uk Taking a peek at our Open Data A digitised book…
  • 61. 61 @mahendra_mahey @BL_Labs @BL_DigiSchol #bldigital https://goo.gl/EvKGfa mahendra.mahey@bl.uk 002819694
  • 64. 64 @mahendra_mahey @BL_Labs @BL_DigiSchol #bldigital https://goo.gl/EvKGfa mahendra.mahey@bl.uk Optically Character Recognised (OCR) generated Text Scanned Page Image on Flickr Commons https://goo.gl/AC43vs
  • 65. 65 @mahendra_mahey @BL_Labs @BL_DigiSchol #bldigital https://goo.gl/EvKGfa mahendra.mahey@bl.uk OCR XML Generated by ABBYY FineReader
  • 66. 66 @mahendra_mahey @BL_Labs @BL_DigiSchol #bldigital https://goo.gl/EvKGfa mahendra.mahey@bl.uk Taking a peek at our on-site only accessible data A digitised newspaper
  • 67. 67 @mahendra_mahey @BL_Labs @BL_DigiSchol #bldigital https://goo.gl/EvKGfa mahendra.mahey@bl.uk Accessing digitised newspapers onsite at the BL 1 Windows 7 External access possible through Citrix Server Results of digitisation exist on Windows file shares!
  • 68. 68 @mahendra_mahey @BL_Labs @BL_DigiSchol #bldigital https://goo.gl/EvKGfa mahendra.mahey@bl.uk Accessing digitised newspapers onsite at the BL (JISC 1) 2 12 Volumes, each with terabytes of data
  • 69. 69 @mahendra_mahey @BL_Labs @BL_DigiSchol #bldigital https://goo.gl/EvKGfa mahendra.mahey@bl.uk Accessing digitised newspapers onsite at the BL 3
  • 70. 70 @mahendra_mahey @BL_Labs @BL_DigiSchol #bldigital https://goo.gl/EvKGfa mahendra.mahey@bl.uk Accessing digitised newspapers onsite at the BL 4
  • 71. 71 @mahendra_mahey @BL_Labs @BL_DigiSchol #bldigital https://goo.gl/EvKGfa mahendra.mahey@bl.uk Accessing digitised newspapers onsite at the BL 5
  • 72. 72 @mahendra_mahey @BL_Labs @BL_DigiSchol #bldigital https://goo.gl/EvKGfa mahendra.mahey@bl.uk Accessing digitised newspapers onsite at the BL 6
  • 73. 73 @mahendra_mahey @BL_Labs @BL_DigiSchol #bldigital https://goo.gl/EvKGfa mahendra.mahey@bl.uk Accessing digitised newspapers onsite at the BL 7
  • 74. 74 @mahendra_mahey @BL_Labs @BL_DigiSchol #bldigital https://goo.gl/EvKGfa mahendra.mahey@bl.uk Accessing digitised newspapers onsite at the BL 8
  • 75. 75 @mahendra_mahey @BL_Labs @BL_DigiSchol #bldigital https://goo.gl/EvKGfa mahendra.mahey@bl.uk Accessing digitised newspapers onsite at the BL 9
  • 76. 76 @mahendra_mahey @BL_Labs @BL_DigiSchol #bldigital https://goo.gl/EvKGfa mahendra.mahey@bl.uk Accessing digitised newspapers onsite at the BL 10
  • 77. 77 @mahendra_mahey @BL_Labs @BL_DigiSchol #bldigital https://goo.gl/EvKGfa mahendra.mahey@bl.uk Accessing digitised newspapers onsite at the BL 11
  • 78. 78 @mahendra_mahey @BL_Labs @BL_DigiSchol #bldigital https://goo.gl/EvKGfa mahendra.mahey@bl.uk Accessing digitised newspapers onsite at the BL 12
  • 79. 79 @mahendra_mahey @BL_Labs @BL_DigiSchol #bldigital https://goo.gl/EvKGfa mahendra.mahey@bl.uk Accessing digitised newspapers onsite at the BL 13 Accessing original ‘master’ image (not cropped or post processed) Or ‘service’ copy (post processed) and results of OCR available as ALTO XML
  • 80. 80 @mahendra_mahey @BL_Labs @BL_DigiSchol #bldigital https://goo.gl/EvKGfa mahendra.mahey@bl.uk Accessing digitised newspapers onsite at the BL 14a Accessing original ‘master’ image (not cropped or post processed) in .TIFF format
  • 81. 81 @mahendra_mahey @BL_Labs @BL_DigiSchol #bldigital https://goo.gl/EvKGfa mahendra.mahey@bl.uk Accessing digitised newspapers onsite at the BL Accessing original ‘master’ image (not cropped or post processed) 14b
  • 82. 82 @mahendra_mahey @BL_Labs @BL_DigiSchol #bldigital https://goo.gl/EvKGfa mahendra.mahey@bl.uk Accessing digitised newspapers onsite at the BL 15a Accessing ‘service’ Copy (post processed) and results of OCR available as ALTO XML
  • 83. 83 @mahendra_mahey @BL_Labs @BL_DigiSchol #bldigital https://goo.gl/EvKGfa mahendra.mahey@bl.uk Accessing digitised newspapers onsite at the BL Accessing ‘service’ Copy (post processed) 15b
  • 84. 84 @mahendra_mahey @BL_Labs @BL_DigiSchol #bldigital https://goo.gl/EvKGfa mahendra.mahey@bl.uk Accessing digitised newspapers onsite at the BL 15c Accessing OCR as ALTO XML
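Once the ALTO XML for a newspaper page has been reached, the recognised words can be read back into plain text lines. The sketch below uses only the Python standard library. It relies on the ALTO convention that each `TextLine` element holds `String` elements whose `CONTENT` attribute carries a recognised word. The function names are hypothetical, and because the ALTO version (and so the XML namespace) of the Library's files is not stated in the slides, the code matches elements by local name rather than assuming a namespace.

```python
import xml.etree.ElementTree as ET


def local_name(tag):
    """Strip any '{namespace}' prefix from an element tag."""
    return tag.rsplit("}", 1)[-1]


def alto_text_lines(xml_string):
    """Return one string per ALTO TextLine, joining its String CONTENT values with spaces."""
    root = ET.fromstring(xml_string)
    lines = []
    for element in root.iter():
        if local_name(element.tag) != "TextLine":
            continue
        words = [
            child.get("CONTENT", "")
            for child in element
            if local_name(child.tag) == "String"
        ]
        lines.append(" ".join(word for word in words if word))
    return lines
```

For a file on disk, the contents could be read with `open(path, encoding="utf-8").read()` and passed to `alto_text_lines`; the encoding is an assumption and should be checked against the XML declaration of the actual files.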
  • 85. 85 @mahendra_mahey @BL_Labs @BL_DigiSchol #bldigital https://goo.gl/EvKGfa mahendra.mahey@bl.uk Accessing digitised newspapers through Gale Interface (subscription) 1
  • 86. 86 @mahendra_mahey @BL_Labs @BL_DigiSchol #bldigital https://goo.gl/EvKGfa mahendra.mahey@bl.uk Accessing digitised newspapers through Gale Interface (subscription) 2
  • 87. 87 @mahendra_mahey @BL_Labs @BL_DigiSchol #bldigital https://goo.gl/EvKGfa mahendra.mahey@bl.uk Explore or Imagine Our Data! • CSV of Metadata https://data.bl.uk/digbks/dig19cbooks-mdata-csv.csv • 19th Century Books - Book Metadata - 01/09/2013. https://data.bl.uk/digbks/db21.html • Digitised Books - Flickr Tag History - Dec 2013 to March 2016. TSV https://data.bl.uk/digbks/db15.html • Digitised Hebrew Manuscripts - Metadata https://data.bl.uk/hebrewmanuscripts/heb1.html • Digitised Hebrew Manuscripts: Or 2210 - Or 2364 https://data.bl.uk/hebrewmanuscripts/heb8.html • Theatrical playbills from Britain and Ireland (OCR text only) https://data.bl.uk/playbills/pb2.html • Portraits of actors, views of theatres and playbills (covering 1750 - 1821 in a single volume) https://data.bl.uk/singlesheet/por1.html • Volumes of Lysons Collectanea (Amusements), comprising broadsides, cuttings, advertisements on amusements.1660- 1840. https://data.bl.uk/singlesheet/ad1.html https://data.bl.uk • Have a look at the data. • Data Quality • Issues Or an idea you have thought of what to do with the data! http://labs.bl.uk/Ideas+for+Labs Smaller datasets
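A first look at one of the metadata CSV files listed above might start with a small Python script like the one below. It is a minimal sketch using the standard `csv` module. The function name is hypothetical, and the column names in the usage note are placeholders, since the actual headers of the British Library CSV are not given in the slides.

```python
import csv
from collections import Counter


def summarise_metadata(file_obj, column):
    """Return (field names, row count, value counts for one column) from CSV rows.

    file_obj can be an open text file or any iterable of CSV lines.
    Blank or missing values are counted under the label '(blank)',
    which is a quick way to spot data quality issues.
    """
    reader = csv.DictReader(file_obj)
    counts = Counter()
    rows = 0
    for row in reader:
        rows += 1
        value = (row.get(column) or "").strip()
        counts[value or "(blank)"] += 1
    return reader.fieldnames, rows, counts
```

After downloading a CSV, it could be opened with `open(path, newline="", encoding="utf-8")` and passed in together with whichever column is of interest; both the encoding and the column name need checking against the real file.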

Editor's Notes

  1. 25 Seconds (68 Words) My name is Mahendra Mahey and I work on a project called British Library Labs. We are based at the British Library in London, in the Digital Scholarship department and we work closely with the Digital Research team there. It’s been running for three years now and is funded by the Andrew W. Mellon Foundation.
  2. 140 seconds The British Library is the national library of the UK and one of the largest research libraries in the world. The Library moved to a new purpose-built building in 1997 <click>, the largest of its kind built in the UK in the 20th century. Many frequently used items are stored 5 stories below the main building at St Pancras in London and many might not know that part of the building is meant to look like a ship on a journey to discovery!<click>. <click to switch off> The building can seat 1,200 researchers at any one time across 5 reading rooms. <click>Medium and long term requested items are held at Boston Spa in Yorkshire in a low oxygen warehouse, using robots to retrieve items. In total, the library has 625 km of shelving, growing by 12 km every year. Whilst we acquire items through purchase or gifts, much of the collection has been built up through legal deposit. That is, by law, a copy of every UK and Ireland print publication must be given to the British Library by its publishers. Around 3 million items are added per year. In 2013, legal deposit was extended to cover non-print material, which means by law we take in digitally published items as well, including regular mass crawls of the entire UK web domain as well as ebooks, ejournals etc.
  3. 85 seconds The picture you can see is inside the main building in London, it’s the King’s Library – King George the Third’s personal library! Sometimes known as the ‘stack’, I walk past this everyday and I sometimes forget that the collections the British Library have are truly staggering! We currently estimate them to exceed <click>150 million items, representing every age of written civilisation and every known language. Our archives now contain the earliest surviving printed book in the world, the Diamond Sutra, written in Chinese and dating from 868 AD…. So some big numbers… Over …<click>14 million books <click>60 million patents <click>8 million stamps <click>4 million maps <click>3 million sound recordings <click>1.6 million music scores <click>over .3 million manuscripts <click>0.8 million serials titles (which are of course made up of many many volumes/editions), this is where a lot of our content is, just in case you thought the numbers didn’t add up!
  4. 33 Seconds (100 Words) In a nutshell the project encourages researchers, artists, entrepreneurs, educators and anyone else, <Click> to ‘experiment’ with our digital collections and data. We are particularly interested in those who have questions which focus on the potential to find and create NEW things through access to the digital content. For example, being able to ask a question across thousands of digitised books or newspapers using computational techniques would not be feasible using manual methods. Let’s look at a clear example. <Click>
  6. https://goo.gl/WutNyi http://goo.gl/nNKhQ2 https://goo.gl/9NWZUW https://goo.gl/7QQ5Tf https://goo.gl/x7b4tg https://upload.wikimedia.org/wikipedia/commons/a/a2/Interactive_whiteboard_at_CeBIT_2007.jpg
  7. Get clearer annotation image and transcription (perhaps TILT)
  8. 6 Seconds (20 Words) So <Click> ‘how’ do we try and engage those who might be interested in the BL’s digital collections and data? <Click>
  9. 17 Seconds (53 Words) <Click>The British Library is one of the largest libraries in the world <Click> with an estimated 180 million physical items, with only a small proportion being digitised. <Click>We estimate this is around 1-2%, but no one really knows exactly how much. However, increasingly more items are being stored as ‘born’ digital, such as the UK Web Archive<Click>
  10. Have a balance of multimedia: broadcast news and radio; sounds (Save our Sounds); books and newspapers; images; BNB; Qatar Digital Library; Hebrew manuscripts
  11. 21 Seconds (65 Words) Katrina Navickas was particularly interested in the <Click>Chartist Movement, a group campaigning for the vote for working people. <Click>They were the biggest popular movement for democracy in 19th century British history, as this early picture of a huge monster meeting at Kennington Common shows. <Click>She wanted to use a combination of manual and computational methods to explore our Digitised Newspapers to find out when and where they met and plot the meetings on a map, <Click>hopefully unearthing new history.
  12. 970 files from a selection of 19th century newspaper titles from the BL corpus for us to correct using the overProof post-OCR correction software. The best way to measure the improvement made by the correction process is to compare the OCR'ed text and the automatically corrected text with a perfect correction made by a human (known as the "ground truth"). Hannah-Rose's 5 small human-corrected samples are shown as green dots. These are not only smaller than the other files, but their raw error rate is much lower at 13.3%. OverProof was measured as reducing this to 5.4%, a removal of almost 60% of errors. The red dotted line indicates the correction "break-even" point: the further under the line, the better the quality of the document after correction. In the graph below, the grey line shows the distribution of files across error rates before correction and the green line after correction.
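The "almost 60%" figure in the note above follows from the two error rates it quotes: the share of errors removed is the drop in error rate divided by the starting rate. A short Python check of that arithmetic, with a hypothetical function name:

```python
def relative_error_reduction(rate_before, rate_after):
    """Return the share of errors removed, given error rates before and after correction."""
    if rate_before <= 0:
        raise ValueError("rate_before must be positive")
    return (rate_before - rate_after) / rate_before


# Error rates quoted in the note: 13.3% before correction, 5.4% after.
reduction = relative_error_reduction(13.3, 5.4)
print(f"{reduction:.1%}")  # prints 59.4%
```

The same calculation applied to each file would place it above or below the break-even line: a negative result means the correction made the text worse.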
  13. Posts small illustrations taken almost at random from the digitised book corpus to a Tumblr blog. This experiment with undirected engagement was a by-product of work to uncover the hidden wealth of illustrations within the digitised pages.
  14. 27 Seconds (82 Words) Adam Crymble <Click>wanted to harness the power of playing fun games on arcade machines to help with crowdsourcing the tagging of un-described images. He particularly wanted to engage a younger audience in crowdsourcing. <Click>On the right you can see a replica 1980s arcade machine we built <Click>and on the bottom left some tagging games that were developed through a ‘Games Jam’ for the machine. <Click>. Let’s take a closer look at two of the games…<Click>
  15. 18 Seconds (56 Words) Indexing the BL 1 million & Mapping the Maps was led by James Heald in collaboration with others. <Click>They produced an index of 1 million 'Mechanical Curator collection' images on <Click>Wikimedia Commons from a collection of largely un-described images. <Click>This gave rise to finding 50,000 maps within the collection, partially through a map-tag-a-thon. <Click>These are now being geo-referenced. <Click>
  16. An educational research study involving the University of Waikato in New Zealand, Concordia University in Canada, and Queen Mary University of London into the development and evaluation of domain-specific language corpora derived from PhD abstracts in the Electronic Theses Online Service (EThOS) at the British Library. The corpora were built using the interactive FLAX (Flexible Language Acquisition, flax.nzdl.org) open-source software for uptake in English for Specific Academic Purposes (ESAP) programmes. Alannah and Chris should be there to accept the award; get them up on the stage, take photos, and then they can speak for 2 minutes.
  17. A series of courses, methodologies and tools to introduce programming to Library staff based on British Library data. Get James up on stage (and others if they are there) and take a picture. Get James to speak for 5-6 minutes.
  18. Poetic Places is a free app for iOS and Android devices which was launched in March 2016. It was created by Sarah Cole of TIME/IMAGE whilst Creative Entrepreneur-in-Residence at the British Library, funded by CreativeWorks London. (UK Entry) Poetic Places brings poetic depictions of places into the everyday world, helping users to encounter poems in the locations described by the literature, accompanied by contextualising historical narratives and relevant audiovisual materials. These materials are primarily drawn from open archive collections, including the British Library Flickr collection. Please come up, Sarah. Sarah gets her award and speaks for 2-3 minutes.
  19. As a direct result of its collaborative work with the British Library, BiblioLabs has developed BiblioBoard, an e-Content delivery platform, and online curatorial and multimedia publishing tools to support it. These tools make it simple for subject area experts to create multimedia exhibits for the web and mobile devices without any technical expertise. The curatorial output is available via a responsive web site as well as through native apps for mobile devices. This unified interface incorporates viewers for PDF, ePub, images, documents, video and audio files, allowing users to access content without having to link out to other sites to view disparate media formats. Please come up, Mitchell. Mitchell gets his award and speaks for 5 minutes.
  20. 17 Seconds (53 Words) <Click>The British Library is one of the largest libraries in the world <Click> with an estimated 180 million physical items, of which only a small proportion has been digitised. <Click>We estimate this is around 1-2%, but no one really knows exactly how much. However, more and more items are being stored as ‘born digital’, such as the UK Web Archive. <Click>
  21. <click>The British Library faces many challenges of access to our digital collections! <click> Sometimes digital content is only available onsite due to license restrictions, <click>or even only on a specific computer in a reading room! Technically there are very few reasons why digital content can’t be online, <click> though it might be too big or might not have been transferred from other digital storage media. <click>Sometimes access is through a paywall. Finally, <click>some content is in the happy sunny place: online, open and freely available. The real reasons why there are challenges to accessing digital content are of course human. They require different approaches from the Library and may often involve an honest, open dialogue and negotiation with the publishers. The Labs project has tried to address this problem by creating a ‘residency model’ for researchers to work intensively with a digital collection on-site, so as not to infringe access conditions. I will say more about this later.
  22. <click>The British Library faces many challenges of access to our digital collections! <click> Sometimes digital content is only available onsite due to license restrictions, <click>or even only on a specific computer in a reading room! Technically there are very few reasons why digital content can’t be online, <click> though it might be too big or might not have been transferred from other digital storage media. <click>Sometimes access is through a paywall. Finally, <click>some content is in the happy sunny place: online, open and freely available. The real reasons why there are challenges to accessing digital content are of course human. They require different approaches from the Library and may often involve an honest, open dialogue and negotiation with the publishers. The Labs project has tried to address this problem by creating a ‘residency model’ for researchers to work intensively with a digital collection on-site, so as not to infringe access conditions. I will say more about this later.