1.
www.bl.uk
Operationalising AI at a national library
Dr. Mia Ridge, @mia_out
Digital Curator, British Library
digitalresearch@bl.uk
@BL_DigiSchol @LivingWMachines
Museums + AI Network, New York, September 2019
2.
The British Library is the
national library of the UK.
By law we receive a copy of every
publication produced in the UK
and Ireland.
We have up to 200 million
items, including: 14m books; 8m
stamps; 310,000 manuscript
volumes; 4m maps; websites;
television and radio news.
About 5% is digitised or born-
digital.
3.
The British Library's Digital Scholarship team
Our mission is to enable the use of the British Library’s digital
collections for research, inspiration, creativity, and enjoyment.
Digital Research
Team
Endangered
Archives
Living with
Machines
BL Labs
Connect and
share
Support digital
scholars
Agents for
change
Invest in our
staff
Innovate and
collaborate
4.
Enabling a shift from pages to datasets
https://www.flickr.com/photos/nasacommons/9467783474
https://www.flickr.com/photos/statelibraryqueensland/8808
https://www.flickr.com/photos/statelibraryqueensland/8808717962
Scale, Perspective, Speed, Complexity
5.
Our Partners Our Funders
Living with Machines
Rethinking the impact of
technology on the lives of ordinary
people during the Industrial
Revolution
@LivingWMachines
https://livingwithmachines.ac.uk
6.
Living with Machines aims to:
• Generate new historical perspectives on the effects of the
mechanisation of labour on the lives of ordinary people during the
long nineteenth century.
• Support the wider academic and cultural heritage sector in using
digital methods to answer historical questions.
• Create new tools and code that can be reused and built upon in
future projects.
• Develop new computational techniques for working with historical
research questions.
• Enrich the British Library’s data holdings for the benefit of all.
• Advance public awareness of how digital research in the
humanities can enhance understanding of history.
7.
We’re working on questions like…
• How do we encourage ‘radical collaboration’ between disciplines
and organisations?
• How do we integrate crowdsourcing and machine learning across
shared timelines?
• How can we chart and
understand
representativeness,
genre balance and bias
in sources?
• Were machines seen
as autonomous agents
in the 19thC, able to
bring about change?
8.
Sources include…
• Full-text: newspapers, trade and postal
directories, autobiographies, journals and
diaries, novels, Parliamentary papers
• Tabular: census records; birth, marriage and death (BMD) records
• Visual: Ordnance Survey maps, Goad fire
insurance maps, images in serials/text as
image
• Mostly British Library collections but we’re
negotiating access to other collections and
derived data
9.
Benefits for the British Library / GLAMs
• Enhance Turing partnership through research collaboration
• Enhance BL reputation as a leading digital innovator in the
library sector
• Improve working with large scale digitisation, digital content, and
data
• Better incorporate learnings and outcomes of research projects
• Improve digital workflows and processes
• Improve workflows for ingesting or enhancing metadata
• Grow digital collections
• Increase understanding of and ability to apply advanced
methods
• Increase awareness of data science and digital history
• Provide a coherent model for mixed-rights access to items and
datasets
• Digital content and data in the cultural programme (exhibitions)
10.
Challenges in operationalising AI: copyright
• Legal exemptions for some forms of text mining require either
internal infrastructure to support data science and AI at scale or
negotiating rights issues for newspapers digitised by a third party
• Resolving data protection (GDPR) questions
• Having funds for digitisation is wonderful but it highlights the impact
of our 'safe' 1878 date on scholarship
• Technical and usability challenges for publishing (partial) datasets
and derived datasets within a complex rights environment
11.
Challenges in operationalising AI: scale
• Data storage and processing at terabyte scale quickly becomes
expensive
• New workflows for digitised images
• Supporting academics in selecting digitisation at scale
• Encouraging and enabling the project to work with complexity at
scale
12.
Challenges in operationalising AI: operational
• Managing impact on related teams who are being asked to answer
new types of queries and provide different types of support,
sometimes with more urgency than usual
• How do we integrate AI-generated metadata at scale into strategic
systems?
• What impact does the provision of many millions of very detailed
‘entity recognition’ annotations have on discovery services?
• What public-facing infrastructure can we create? Do all outputs
(statistical models vs metadata) need to be equally sustainable?
13.
Challenges in operationalising AI: interdisciplinary
• Aligning GLAM and academic data science goals, outcomes,
timelines and reward structures
• Turning academic outputs (working papers, peer reviewed
publications, software models) into training materials (tutorials with
related datasets, workshops, blog posts) for BL staff, library users
and other scholars
• Turning research findings and methods into a small exhibition
• Integrating participation through crowdsourcing and work in local
libraries with academic research processes
14.
Data science in the library?
‘you need the right team and the right
mindset. The latter requires a cultural shift
that prioritizes and rewards experimentation,
measurement, and testing throughout your
organization’
Google, ‘Everything a marketer needs to know about
machine learning’
15.
Thank you!
Questions?
Dr Mia Ridge, Digital Curator, British Library
@mia_out @BL_DigiSchol @LivingWMachines
http://www.livingwithmachines.ac.uk