This document outlines an agenda and presentation for a workshop on developing digital humanities project ideas using the British Library's digital collections and data. The presentation covers introducing BL Labs and digital humanities projects, developing project proposals, getting feedback, and examples of past successful projects. Attendees are invited to introduce potential project ideas. The presentation then provides an overview of the British Library's vast physical and growing digital collections, as well as examples of digital experiments and challenges involved in working with digitized text and metadata. Participants are encouraged to consider how to showcase collection data and assess technical and legal feasibility when developing proposals.
More than just books - British Library Labs Presentation given at MSc Compute... (labsbl)
The British Library: More than just books
Exploring new ideas and methods to better understand the cultural and historic heritage held by the Library.
MSc CGE: Games Industry Seminar Series 2013-14
Computing, Room NAB 314, New Academic Building,
29 St James Street, Goldsmiths University of London
Mahendra Mahey
Manager of British Library Labs
Tuesday 4th of February 2014, 1400 - 1415
The document discusses access to digitized newspaper collections at the British Library. It notes that some newspapers are available through a commercial Gale interface by subscription, while others like the JISC 1 collection containing 12 volumes and 80TB of data must be accessed onsite. For onsite access, researchers need security clearance and there are various challenges like mixed OCR quality and the need to understand the "story" of each collection's digitization. Examples are provided of the Burney collection which contains over 1 million digitized newspaper pages with varying OCR quality accessible through a web interface.
Digital Magical Mystery Tour - British Library (labsbl)
This document summarizes a talk given by Mahendra Mahey on the British Library's digital collections and how they are used for projects. It provides information on the British Library Labs program, which funds and supports projects utilizing the Library's digital content. Examples are given of different types of projects, including research projects analyzing digitized newspapers, music collections, and other materials, as well as artistic and educational projects. Tips are provided on accessing and making use of the Library's digital collections and data.
British Library Labs Roadshow - Sussex Humanities Lab (labsbl)
Presentation given by Mahendra Mahey, Manager of British Library Labs, on Friday 5th of May 2017 at Sussex Humanities Lab, as part of the BL Labs Roadshow 2017.
British Library Labs Roadshow 2017 at the University of Birmingham (labsbl)
Presentation given by Mahendra Mahey, Manager of British Library Labs at the College of Arts and Law, the University of Birmingham on Wednesday 10th of May, 2017.
The document discusses British Library Labs (BL Labs), a digital research initiative at the British Library. BL Labs is funded by the Andrew W. Mellon Foundation and has been running since 2013. It aims to engage researchers, artists, entrepreneurs and educators in using the Library's digital collections. The summary provides an overview of BL Labs' activities, including competitions and awards to encourage uses of digital content, and digital research support to help researchers. It also discusses challenges around access to the Library's digital collections, of which only a small percentage are openly available online.
A hands-on data exploration & challenge to become a derived data-set author o... (labsbl)
Mahendra Mahey, manager of British Library Labs (BL Labs), will examine some of the BL’s digital collections/data & discuss challenges he has had in making the BL's cultural heritage data available openly or onsite at the British Library.
Mahendra will invite delegates to explore data-sets at their leisure, setting a challenge for those who are interested and skilled in exploring data, finding patterns and grouping data. They could become data-set authors/creators of derived data-sets, based on pre-existing digital collections/data provided on the day or already available on https://data.bl.uk.
The workshop will conclude with reflections from the delegates, possibly highlighting a number of derived data-sets generated by participants on the day that could potentially be published on https://data.bl.uk. If selected, these new derived data-sets will be attributed with the creators' / authors' details and each will have its own cite-able Digital Object Identifier (DOI). These new data-sets would then be available for reuse by any researcher in the world.
GUIDANCE FOR THIS WORKSHOP
We strongly recommend you come to this workshop with a suitable device, such as a laptop, pre-installed with appropriate tools to analyse different kinds of data-sets, e.g. Microsoft Excel may work with smaller data-sets such as metadata (see other data exploration tools below). If you don't have one and would still like to attend, please ask to 'pair up' with someone who has already signed up and is willing to share.
Other data exploration tools include: Notepad++ (e.g. for viewing text and XML); OpenRefine (e.g. for cleaning data); Tableau Public (e.g. for visualising data); Google Fusion Tables (e.g. for visualising geo-spatial data); spaCy (e.g. for text and data mining); RStudio (an open-source statistical package); MATLAB (a data analysis tool); and NLTK (for natural language processing).
Please note that this workshop is NOT about training you to use any of these tools; they are simply tools you may already be familiar with and can use to explore and find patterns in our data.
Datatypes you may be examining in this workshop could include: .ZIP, .PDF, .TXT, .CSV, .TSV, .XLS, .XLSX, RDF, .nt, XML (TEI, ALTO and bespoke), .JSON, .JPG, .JPEG, .TIFF and .WARC
Please ensure you are able to read these files on your device before the workshop if you are interested in exploring them during our session.
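One way to check this ahead of time is with a short script. The following is only a minimal sketch in Python and is not part of the workshop materials: the function name `can_read` is hypothetical, it assumes text files are UTF-8 encoded, and it covers only the types the Python standard library can open directly (.ZIP, .CSV, .TSV, .JSON, .XML and .TXT). Other types listed above, such as .XLSX, .TIFF or .WARC, would need dedicated tools or additional libraries.

```python
import csv
import json
import zipfile
import xml.etree.ElementTree as ET
from pathlib import Path


def can_read(path: Path) -> bool:
    """Return True if the file opens with a standard-library reader.

    Hypothetical helper for illustration: returns False both for
    unreadable files and for types this sketch does not cover.
    """
    suffix = path.suffix.lower()
    try:
        if suffix == ".zip":
            # Listing the archive members confirms the ZIP directory is valid.
            with zipfile.ZipFile(path) as archive:
                archive.namelist()
        elif suffix in {".csv", ".tsv"}:
            delimiter = "\t" if suffix == ".tsv" else ","
            with path.open(newline="", encoding="utf-8") as handle:
                # Reading the first row is enough to prove the file opens.
                next(csv.reader(handle, delimiter=delimiter), None)
        elif suffix == ".json":
            with path.open(encoding="utf-8") as handle:
                json.load(handle)
        elif suffix == ".xml":
            # Covers TEI, ALTO and bespoke XML alike: only well-formedness is checked.
            ET.parse(str(path))
        elif suffix == ".txt":
            with path.open(encoding="utf-8") as handle:
                handle.read(1024)
        else:
            return False  # type not covered by this sketch
    except (OSError, ValueError, zipfile.BadZipFile, ET.ParseError, csv.Error):
        return False
    return True


if __name__ == "__main__":
    # Check every file in the current folder and report the result.
    for item in sorted(Path(".").iterdir()):
        if item.is_file():
            print(f"{item.name}: {'ok' if can_read(item) else 'not checked or unreadable'}")
```

Run from the folder holding the downloaded files, the script prints one line per file. A result of "not checked or unreadable" only means this sketch could not open the file, so it is worth trying the relevant tool from the list above before the session.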
Slides for session: http://goo.gl/
URL for specific data: http://
Mahendra Mahey tweets at @BL_Labs & @mahendra_mahey
Building Better GLAM Labs - Keynote Presentation at Simon Fraser University (labsbl)
The document describes the British Library Labs, a department within the British Library focused on enabling use of the British Library's digital collections through experimentation and innovation. It provides details on the Labs' activities, including supporting digital scholars, developing digital research methods, and growing an international community of over 50 GLAM (Galleries, Libraries, Archives, Museums) Labs. Challenges addressed include exploring large digital collections at scale, discovering new ways to access and analyze cultural heritage data, and helping navigate users through the Library's resources and processes.
The British Library Labs project encourages researchers and developers to use British Library digital collections and data for research and development. It does this through competitions, events like hackathons, and residencies where winners can work intensively with collections. The Labs project aims to support digital scholarship by providing tools, services, and case studies. It has highlighted projects like an interface for "mixing" collections based on a DJ model and a "sample generator" to search collections and provide randomized samples of works for research.
Supporting the Digital Scholar: Experiences from the British Library Labs (labsbl)
The document summarizes the British Library Labs project, which supports digital scholarship. It discusses how the Library works with digital scholars and researchers, providing digitized collections and expertise. Examples include text analysis tools developed using newspaper archives, creative competitions, and crowdsourcing projects tagging images and georeferencing maps. The Labs project aims to open up more collections, support new research methods, and engage researchers in experimenting with digital collections.
Building Better GLAM Labs - Keynote at University of Victoria, Victoria, BC, ... (labsbl)
The document discusses the British Library Labs, a department within the British Library that supports innovative projects using digitized and born digital cultural heritage collections. It provides an overview of the Labs' history, activities, and lessons learned. The Labs engages with researchers, artists, educators and entrepreneurs through competitions, projects, workshops and other events to support over 150 projects annually. It emphasizes that engagement starts with building relationships with people, not just focusing on technology.
British Library Labs Presentation at Ed Tech Hackathon 2013 - hackathoncentra... (labsbl)
The document provides information about the British Library Labs project, which encourages researchers and developers to work with the British Library's digital collections and data. It describes upcoming hackathon and competition events that will allow participants to develop tools and services using collections like 19th century books, images, and bibliographic data provided by the British Library Labs. Contact information is provided for those interested in learning more or getting involved in developing ideas and projects.
British Library Labs - Bodleian - University of Oxford (labsbl)
The British Library holds over 150 million items in its collection and is exploring new digital methods to make this cultural heritage more accessible. The presentation discusses the Library's support for digital scholarship through initiatives like British Library Labs, which funds projects to experiment with digital collections. Examples are provided of Labs projects including tools to sample representative texts and mix digital media items. The goal is to engage more researchers through open data and competitions while better understanding how digital tools can unlock new discoveries within the Library's collections.
BL Labs Presentation at Open Science Infrastructures for Big Cultural Data (labsbl)
The document provides information about a presentation given by Mahendra Mahey, Manager of BL Labs, about the British Library Labs and how it supports access to and use of the Library's digital collections. It discusses the Library's collections, both physical and digital, challenges around accessing digital content, and how the Labs aims to help researchers navigate accessing collections through exploration, query-focused support and wrap-up phases. It also shares examples of open digital datasets and guidance on finding datasets.
British Library Labs - Open University Presentation - 3 April 2014, 1100-1200 (labsbl)
The document summarizes the experiences of the British Library Labs in supporting digital scholarship. It discusses how the British Library works with digital scholars and researchers, providing various resources and tools. The British Library Labs team collaborates with scholars on projects involving digital collections and aims to make more content openly available online through platforms like Flickr. The Labs also runs competitions for researchers to develop tools and applications using library collections.
Building Better GLAM Labs - Opening talk at Museum Big Data Conference - UCL ... (labsbl)
The document outlines Mahendra Mahey's presentation on exploring the use of big data in Galleries, Libraries, Archives and Museums (GLAM) digital labs. Some key points include:
- Mahendra will give a talk on using big data in GLAM digital labs at the Qatar National Library on April 30, 2019.
- BL Labs at the British Library works with researchers, artists, and others to experiment with digitized and born digital collections.
- Engagement with potential users is important for GLAM institutions to explore uses of their digital content and data.
British Library Labs Presentation to City University London (labsbl)
This document outlines presentations given by Mahendra Mahey, Manager of British Library Labs, about digital experiments and opportunities at the British Library. It discusses past competition winners who developed tools like a text-to-image linking tool and Victorian meme machine. It also provides information about upcoming competitions and awards for developing projects using the Library's digital collections, and gives tips for entering the competitions. Finally, it briefly describes some of the digital collections and datasets available through the British Library Labs.
British Library Labs Presentation at Edge Hill University (labsbl)
The document discusses the British Library Labs, which explores innovative experiments and future opportunities using the Library's digital collections and data. It provides information on past competition winners who developed tools for text analysis, visualization, and more. It encourages new ideas for competitions and awards, provides access to some of the Library's datasets, and discusses collaborations between Labs and researchers.
BL Labs Presentation at Liverpool John Moores University (labsbl)
The document discusses access to digitized newspapers at the British Library. It describes how digitized newspapers can be accessed on-site through a Windows file share and Citrix server. It provides screenshots showing the folder structure containing terabytes of newspaper image and text files. Researchers can access original master images, processed service copies, and OCR text files in XML format. Digitized newspapers can also be accessed through a subscription-based interface with Gale Cengage. The British Library is exploring virtual infrastructure and machine learning to improve access to and analysis of digitized newspaper collections.
The British Library is the national library of the UK and by law receives a copy of every publication produced in the UK and Ireland, with over 150 million physical items stored in its collections that are growing by 3 million per year; the library is working to expand its digital collections and support for digital scholarship through initiatives like the UK Web Archive, digitization projects, and collaborations with digital scholars.
British Library Labs Presentation Hertfordshire (labsbl)
The document discusses the British Library Labs, which funds innovative experiments exploring the library's digital collections and data. It provides information on upcoming competitions and awards for projects utilizing BL data, as well as examples of past winning projects. Guidelines are offered for the competitions. Details are also given on available digital datasets and resources through Labs, including a mini network-attached storage device containing various collection samples available on-site for experimentation.
Presentation given to visitors from the University of Sunderland on the 10th of February, 2014 about BL Labs at the British Library in the Panizzi Room.
The document advertises the British Library Labs Symposium 2020 funded by the Andrew W. Mellon Foundation and British Library, encourages exploring the library's digital collections through various websites, and lists an immersive theater performance called "To those born later" taking place at the Eliot Room in the Knowledge Centre with tickets costing £13 or concessions.
7th BL Labs Symposium (2019): 12_Digital Research team projects update (labsbl)
(1) The British Library's Digital Scholarship team aims to enable the use of the library's digital collections for research, inspiration, creativity, and enjoyment.
(2) The team is cross-disciplinary and supports the creation and innovative use of the library's digital collections.
(3) Recent projects include making Arabic manuscripts searchable through handwriting recognition software, digitizing South Asian printed books from 1713-1914, and exploring optical character recognition for languages like Bengali.
Mahendra Mahey, BL Labs Manager, British Library
--
This Award recognises an artistic or creative endeavour that has used the Library’s digital content to inspire, amaze and provoke.
Maja Maricevic, Head of Higher Education and Science, British Library
--
This Award recognises a current member of staff, or team, who has played a key role in an innovative project using the Library’s digital content or data.
7th BL Labs Symposium (2019): 08_An update on the ‘Living with machines’ project (labsbl)
Mia Ridge, Digital Curator and Co-Investigator for Living with machines, British Library
The 'Living with machines' project is a collaboration between the British Library and the Alan Turing Institute for Data Science and Artificial Intelligence.
7th BL Labs Symposium (2019): 06_An overview of digital preservation at the B... (labsbl)
Maureen Pennock, Head of Digital Preservation, British Library
An overview of the challenges of preserving an ever-growing and complex set of digital collections and a presentation of the work of the Flashback project.
7th BL Labs Symposium (2019): 05_The Research Awardlabsbl
James Perkins, Research & Postgraduate Development Manager, British Library
This Award recognises a project or activity which demonstrates the development of new knowledge, research methods or tools, using the Library’s digital content.
7th BL Labs Symposium (2019): 04_The story of the GLAM Labs community and how...labsbl
Sophie-Carolin Wagner, Project Manager, Austrian National Library Labs, Austrian National Library
A report on the work to develop a global community of Galleries, Libraries, Archives and Museums (GLAM) Labs and the creation of a handbook for professionals wanting to set up, maintain and ensure digital innovation Labs thrive in their organisations.
7th BL Labs Symposium (2019): 01_Welcome and Introductionlabsbl
The British Library Labs has been running since 2013 and has supported over 160 projects in 6 years. It works with researchers, artists, and others to run competitions, awards, projects and other engagements exploring digital collections from the British Library and other GLAM institutions. A GLAM Lab is a space in a gallery, library, archive or museum to experiment and innovate with digitized and born-digital collections and data. The keynote speaker at this event will be Armand Leroi, an evolutionary biologist and author who has presented several documentary series on science and biology for BBC and Channel 4.
Mahendra Mahey, BL Labs Manager, British Library
This Award celebrates quality learning experiences created for learners of any age and ability that use the Library's digital content.
The document discusses digital collections at the British Library. It provides information on accessing and working with the Library's digital content, including over 720 digital collections that are either openly licensed and available online or available onsite. It also discusses challenges of access, engagement with researchers, the story behind digitization of collections, and support available through the British Library Labs for working with digital collections.
Introduction to BL Labs and Reading 35,000 Books: The UCD Contagion Project ...labsbl
Presentation given by Mahendra Mahey at the Reading 35,000 Books: The UCD Contagion
Project and the British Library Digital Corpus event on 20 February 2019
BL Labs Presentation to the British Library Development Teamlabsbl
The document describes the British Library Labs, including the challenges it addresses, who it works with, its projects, and use of digital collections. Some key points:
- BL Labs was founded in 2013 and addresses challenges like return on investment for digitization and how digital collections can be more openly used.
- It works with researchers, artists, librarians and more on over 200 projects involving sounds, maps, books and other collections.
- Projects follow a pattern of finding new things in data, unlocking hidden history, and celebrating discoveries. BL Labs aims to better understand use and impact of open data.
Presentation given by Mahendra Mahey, Manager of BL Labs, 1400 - 1430, 2 July 2018
London Psychology Librarians Group Meeting
Dickins Room, Conference Centre,
British Library
Experiences and lessons learned through British Library Labs How have we eng...labsbl
Presentation by Mahendra Mahey, Manager of BL Labs.
1100 - 1130, Thursday, 17th May 2018, Part of Plenary Session ‘Cultural Innovation: experiences from the field’,
CAMP iC4: A Breeding Ground for Useful Innovation,
BASE Milano, Via Bergognone, 34, Milan, Italy
Presentation to the National Science Library of the Chinese Academy of Scienceslabsbl
1100 - 1300, Thursday, 26th April 2018,
British Library Labs and Digital Scholarship at the British Library, Harley Room, British Library, St Pancras, London.
Presentation to the National Science Library of the Chinese Academy of Sciences
by Mahendra Mahey Manager of BL Labs
The Work of British Library Labs and Digital Scholarship: Insights from British Library Labs and an emerging role for Libraries
Working with the British Library’s Digital Collections & Data - Insights from...labsbl
Keynote presentation given by Mahendra Mahey at the Research Data Management in Digital Humanities International Conference, 17-18 April, 2018, Doha, UCL Qatar, room 1D02. Entitled: Working with the British Library’s Digital Collections & Data: Insights from British Library Labs and an emerging role for Libraries (Keynote speech)
The document provides information about the British Library Labs and its digital collections. It discusses how the British Library Labs supports digital research through online queries, proposals, and on-site support. It notes that while the library has a large collection, only around 1-2% is digitized due to the significant costs involved. Digitized content comes from various digitization projects, but not all content is discoverable online. The document emphasizes that understanding the "story" behind each digital collection is important for research.
1. 1
mahendra.mahey@bl.uk & labs@bl.uk
http://www.bl.uk/projects/british-library-labs
Funded by the Andrew W. Mellon Foundation
Experiment with our
Digital Collections
Mahendra Mahey
Manager of BL Labs
Project Management and DH projects
1500 - 1630, Monday 18 December, 2017
CHASE AHDA Winter School 2018
The Open University in London/Futurelearn,
1-11 Hawley Crescent, Camden Town,
London, NW1 8NP
2. 2
Breakdown of session
• Introductions
• Background to BL Labs and DH Projects
• Developing project ideas as proposals, tips and challenges
• Feedback and suggestions
3. 3
What is Project Management?
1) Can range from informal and small scale focusing largely
on common sense flexible approaches to management…
2) To large scale formal approaches using methodologies
such as PRINCE2 (PRojects IN Controlled Environments),
AGILE, SCRUM and tools such as MS Project
• BL Labs uses both
• Focus on the first one to help develop your own ideas
• Not a session on Project Management!
5. 5
The British Library
Inside the British Library
Space for 1200 readers, around 500,000 visitors per year
Building 37 uses low oxygen and robots
Boston Spa also has a Reading room and provides delivery of items to London
Many items stored at Document Supply and Storage centre 48 hours away
Stockton-on-Tees
Author right to payment each time their books
are borrowed from public libraries.
St Pancras, London, UK
Many books are stored 4 stories below the building
UK Legal Deposit Library – Reference only
Founded in 1973 though origins stem back to British Museum Library 1759
Boston-Spa
6. 6
BL Labs supports…
Researchers
https://goo.gl/WutNyi
Artists
http://goo.gl/nNKhQ2
Librarians
Curators
https://goo.gl/9NWZUW
Software Developers
https://goo.gl/7QQ5Tf
Archivists
https://goo.gl/x7b4tg
Educators
https://goo.gl/qh01Mi
Anyone
interested in our
digital collections
and data
7. 7
Physical Collections – not just books!
> 180*million items
> 0.8* m serial titles
> 8* m stamps
> 14* m books
> 6* m sound recordings
> 4* m maps
> 1.6* m musical scores
> 0.3* m manuscripts
> 60* m patents
King George IV bequeathed Library
*Estimates
9. 9
Knowledge Quarter London
80 knowledge organisations (as of 07/12/17) within 1 mile radius of
Kings Cross, http://www.knowledgequarter.london
http://www.turing.ac.uk (Headquartered at the British Library)
UK Web Archive and e-legal deposit (2013)
http://www.webarchive.org.uk/ukwa/
Born digital
Data all around us at
Kings Cross!
11. 11
#bldigital
1-2 %* digitised
* estimate
Digitisation
Partnerships
Commercial & Other Organisations
Amount
increasing rapidly
Bias in digitisation
So learn the story behind
the digital collection
http://goo.gl/bR9UJL
Sample Generator
12. 12
Playbills, Books, Newspapers
(includes Optical Character Recognition (OCR))
Digital collections and Datasets
British National
Bibliography
http://bnb.data.bl.uk
http://sounds.bl.uk
http://dml.city.ac.uk/
Music (Recordings & Sheet) & Sounds
http://goo.gl/frSMJt
Broadcast News (TV and Radio)
http://goo.gl/cwThHw
http://goo.gl/pBkisZ
http://goo.gl/E8aRyQ
Usage data
EThOS
Web Archive
Images, Manuscripts & Maps
http://www.qdl.qa/
Qatar Digital Library
http://idp.bl.uk/
International
Dunhuang
Project
Maps
http://www.bl.uk/maps/
Hebrew Manuscripts
http://goo.gl/4sbCp9
Flickr &
Wikimedia Commons
https://goo.gl/LZRmaZ
13. 13
Finding Open Cultural Heritage Datasets
Collection Guides (183 as of 05/12/17)
https://www.bl.uk/collection-guides/
Datasets about our collections
Bibliographic datasets relating to our published and
archival holdings
Datasets for content mining
Content suitable for use in text and data mining
research
Datasets for image analysis
Image collections suitable for large-scale image-
analysis-based research
Datasets from UK Web Archive
Data and API services available for accessing UK Web
Archive
Digital mapping
Geospatial data, cartographic applications, digital aerial
photography and scanned historic map materials
https://data.bl.uk
Download collections as zips, no API
Each dataset has a Digital Object Identifier (DOI)
can be referenced for research
Not all discoverable via
search engines!
14. 14
Explore or Imagine Our Data!
• CSV of Metadata
https://data.bl.uk/digbks/dig19cbooks-mdata-csv.csv
• 19th Century Books - Book Metadata - 01/09/2013.
https://data.bl.uk/digbks/db21.html
• Digitised Books - Flickr Tag History - Dec 2013 to March 2016.
TSV
https://data.bl.uk/digbks/db15.html
• Digitised Hebrew Manuscripts - Metadata
https://data.bl.uk/hebrewmanuscripts/heb1.html
• Digitised Hebrew Manuscripts: Or 2210 - Or 2364
https://data.bl.uk/hebrewmanuscripts/heb8.html
• Theatrical playbills from Britain and Ireland (OCR text only)
https://data.bl.uk/playbills/pb2.html
• Portraits of actors, views of theatres and playbills (covering
1750 - 1821 in a single volume)
https://data.bl.uk/singlesheet/por1.html
• Volumes of Lysons Collectanea (Amusements), comprising
broadsides, cuttings, advertisements on amusements.1660-
1840.
https://data.bl.uk/singlesheet/ad1.html
https://data.bl.uk
• Have a look at the data.
• Data Quality
• Issues
Or an idea you have thought of
what to do with the data!
http://labs.bl.uk/Ideas+for+Labs
Smaller datasets
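A first look at data quality can be as simple as counting empty cells per column in a metadata CSV. The sketch below is only an illustration: the column names and sample rows are invented stand-ins, not the actual layout of any file on data.bl.uk, and a real file would be read from disk after download.

```python
import csv
import io
from collections import Counter


def missing_value_report(csv_text):
    """Count rows and empty cells per column in CSV content given as text."""
    reader = csv.DictReader(io.StringIO(csv_text))
    fieldnames = reader.fieldnames or []
    # Start every column at zero so complete columns still appear in the report.
    missing = Counter({name: 0 for name in fieldnames})
    rows = 0
    for row in reader:
        rows += 1
        for name in fieldnames:
            # Short rows yield None for absent cells; treat those as empty too.
            if not (row[name] or "").strip():
                missing[name] += 1
    return rows, dict(missing)


# Hypothetical sample standing in for a downloaded metadata file.
sample = """identifier,title,date,place
001,A Tour in Wales,1810,London
002,Poems,,Dublin
003,,1856,
"""

rows, report = missing_value_report(sample)
print(rows, report)
```

Columns with many gaps are a prompt to ask about the 'story' of the collection before building an idea on them.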
15. 15
Openly Licensed Digital Content?
15% Openly
Licensed
Around 80%*
available online
Working through to make more open…
Though some collections will always only be available onsite due to
various reasons including legal, ethical etc
Breakdown by collection*
Manuscripts 59%
Books 9%
Maps and Views 7%
Newspapers 3%
Archives and Records 3%
Paintings, Prints and Drawings 2%
*Based on number of digitisation projects (693 as of 08/12/17)
Largest proportion of funding
Public / Private Partnership
15 %* Openly Licensed – most online
85 %* Available onsite only at the moment
*Estimates
16. 16
The Story of the Digital Collection…
Digital
Collection
Curator
Who paid for the digitisation?
Who did the digitisation?
Technology used
Born digital?
Published
Unpublished
Where is it?
Can it still be accessed?
Generates income
Reputational risk in using?
Legalities
Politics when digitised
Personalities involved
Surprises (e.g. gaps)
Descriptive information
Old format not supported
What media was the
digitisation done from?
Is there any background documentation?
No Descriptive information
Inconsistent descriptive information
Still there?
Good to know the background ‘Story’ of a Digital Collection
if you want to use it for research and make conclusions…
17. 17
Competition
Awards
Projects
Tell us your ideas of what to do with our digital content
Show us what you have already done with our
digital content in research, artistic, commercial
and learning and teaching categories
Talk to us about working on collaborative projects
18. 18
Example Pattern of Research
1, 2, 3
1. Find / identify new things in messy stuff
2. Unlock hidden history / data
3. Celebrate by telling new stories!
19. 19
Finding / identifying invisible / well hidden
things in ‘messy’ historical data
https://goo.gl/mcpa8B
Not the British Library!
Example Pattern of Research 1
Some of the challenges we face at the Library
20. 20
Unearthing / unlocking
hidden histories & data
to stimulate new research
https://goo.gl/vJ291F
It’s an
18th Century Poem!
Example Pattern of Research 2
21. 21
Celebrating hidden histories / data
creatively through events, art, performance
and story telling
https://goo.gl/Ql0Bwz
Re-enacting, re-discovering history
Example Pattern of Research 3
23. 23
https://goo.gl/oUNj5N
https://goo.gl/ImAUv4
Finding things in ‘messy’
Optical Character Recognised (OCR) text
Mrs Folly
• Clean up some manually
• Get human ‘ground truth’
• Write computer code (sometimes
it’s machine learning) to find
things reliably in it ‘automatically’
• Try code on messy content
• Tweak if necessary
• Digital ‘lasso’ around content
• Human sift through
Mrs Folly
An example pattern of research
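The 'digital lasso' step above can be sketched with approximate string matching from the Python standard library. This is one possible approach, not the code used in any Labs project: the function name `lasso`, the 0.75 threshold and the sample lines are all assumptions chosen for illustration.

```python
import difflib


def lasso(lines, target, threshold=0.75):
    """Return (line_number, window, score) for word windows that resemble target.

    A sliding window of as many words as the target is compared with the
    target, so OCR errors such as 'Fo11y' for 'Folly' can still be caught.
    """
    target_norm = target.lower()
    width = len(target.split())
    hits = []
    for number, line in enumerate(lines, start=1):
        words = line.split()
        for start in range(max(len(words) - width + 1, 0)):
            window = " ".join(words[start:start + width])
            score = difflib.SequenceMatcher(None, target_norm, window.lower()).ratio()
            if score >= threshold:
                hits.append((number, window, round(score, 2)))
    return hits


# Invented lines imitating noisy OCR output.
ocr_lines = [
    "The part of Mrs Folly by Mrs Smith",
    "Mrs Fo11y will appear",
    "nothing here",
]

for hit in lasso(ocr_lines, "Mrs Folly"):
    print(hit)
```

A lower threshold widens the lasso and leaves more for the human sift; a higher one misses badly garbled mentions. Checking the hits against a small human 'ground truth' sample is how the threshold would be tuned.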
25. 25
Machine Learning / Reading
Analogies to how humans read / learn
Machines acquire ‘knowledge’ / data, use that
knowledge / data to make sense / identify patterns
https://goo.gl/k68fTf
https://goo.gl/gXmVQL Can you see the bird?
26. 26
Need to stress still requires computational
& human effort…
https://goo.gl/gDQEAz
Labs doing this on a case by case basis
so methods can vary
Machine Learning / Reading still
requires ‘Human Effort’!
27. 27
Legalities of Machine Learning /
Text and Data mining
https://goo.gl/toq4Bo
Legalities of Machine Learning / Text and Data
mining still up for discussion…Often misunderstood
Is it the same as humans reading and looking for
patterns…just a bit quicker?
28. 28
Victorian Meme Machine (2014)
https://goo.gl/HMqDt3
Bob Nicholson
http://victorianhumour.tumblr.com/
Bob Nicholson interviewed on
BBC Radio 4 Making History Programme:
http://goo.gl/fmV9ep
And telling jokes to the public:
http://goo.gl/xIDRhz
Bob obtained further funding from his university
Looking for more collaborations
https://www.youtube.com/watch?v=-GRgj7Q5OM0
Rob Walker, Victorian Mother-in-law Jokes
Victorian Comedy Night, 7 Nov 2016
Learnt about access paths
to digital collections
29. 29
Katrina Navickas (2015)
Political Meetings Mapper
http://politicalmeetingsmapper.co.uk
https://goo.gl/Qq78Oa
Labs Symposium 2015
https://goo.gl/BSA3be
Interview 2015
The Chartist Newspaper
http://goo.gl/vOLSnH
Chartist Monster Meeting
Chartists Walking Tour and
Re-enactment London
Learnt that domain knowledge
reduces noise
31. 31
What thoj' among ourrelves, with too much Heat, or t
W: fweutimes.wongle, wvhen we Ihould debate, W –
(A confequential Ill which Freedom drawvs, fl t
A bad Efficf, but from a noble Caufe) t
We can with univeifal Zcal advance, to
To cutb the faithlefs Arrogancccof V rance. hi
Dublin Journal, 10-14 September, 1745 Slides courtesy Jennifer Batt
32. 32
Verse: 81% lines begin with
initial capital
Prose: 52% lines begin with
initial capital
Westminster Journal 3 March 1745
Slides courtesy Jennifer Batt
Started to refine
Machine Learning Techniques
Jennifer Batt @ the BL on World Poetry Day
‘40,000’ things found…
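The initial-capital figures above suggest a very simple feature for telling verse from prose. The sketch below computes that share for a block of OCR lines. The 0.7 cut-off is an assumption picked between the 52% and 81% figures for illustration, not a value from the project, and a real classifier would combine several features.

```python
def initial_capital_share(lines):
    """Share of non-blank lines whose first character is an upper-case letter."""
    candidates = [line.strip() for line in lines if line.strip()]
    if not candidates:
        return 0.0
    capitals = sum(1 for line in candidates if line[0].isupper())
    return capitals / len(candidates)


def looks_like_verse(lines, threshold=0.7):
    """Crude guess: verse if enough lines start with a capital (threshold is hypothetical)."""
    return initial_capital_share(lines) >= threshold


# The six OCR lines from the Dublin Journal slide, errors left as they are.
dublin_journal = [
    "What thoj' among ourrelves, with too much Heat, or t",
    "W: fweutimes.wongle, wvhen we Ihould debate, W –",
    "(A confequential Ill which Freedom drawvs, fl t",
    "A bad Efficf, but from a noble Caufe) t",
    "We can with univeifal Zcal advance, to",
    "To cutb the faithlefs Arrogancccof V rance. hi",
]

print(initial_capital_share(dublin_journal), looks_like_verse(dublin_journal))
```

Five of the six lines start with a capital letter (the third starts with a bracket), so the block is flagged as likely verse even though almost every word is misread.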
33. 33
Use of Overproof
OCR Correction?
Re-OCR with
ABBYY FineReader?
https://www.abbyy.com/en-gb/
http://overproof.projectcomputing.com/
RE-OCR
Cleaning up OCR Text – significant improvement
(depending on original image quality)
34. 34
Virtual Infrastructure for OCR text
OCR text ‘scraped’ from
digitised newspapers
and put in internal cloud
Jupyter notebook
Write Python code and see results
in web browser
http://jupyter.org
Access available for researchers ‘in residence’
https://www.docker.com/
http://dhbox.org/
35. 35
BL Labs Competition Entry Process
• Think of a project which uses the British Library’s Digital
Collections or Data
• Examine our data and discuss idea
• Propose mini project
• Proposals assessed and successful ones worked on
• 3 examples from 2014, 2015, 2016 given
36. 36
Elements of Proposal
(https://goo.gl/K85hTQ)
• Title and Summary
• Research Question(s)
• How it showcases digital collections / data
• Methods (text mining, visualisations, statistical analysis)
• Evidence of how you have or will develop the skills, knowledge and
expertise to successfully carry out the project
• Evidence that the idea is achievable on a technical, curatorial and legal basis
• Plan
• Risk assessment* (new suggestion)
39. 39
Showcasing digital collections / data
• How does your idea showcase digital collections / data
• Have you seen the digital collections and data?
• Do you know the ‘story’ of the collection?
• What state is it in?
• Have you done some initial experiments?
• Will it require cleaning, e.g. using tools like OpenRefine?
• Reality check in terms of what you can actually achieve with the data
will determine idea and scope
40. 40
Methods (text mining, visualisations,
statistical analysis)
• Think of what is going to be required to implement methods, e.g. skills,
time and other resources
• Plan accordingly
• Tools required / software / hardware
41. 41
Evidence of how you have or will develop the
skills, knowledge and expertise to successfully
carry out the project
• List skills, presentations, publications etc.
• Are there gaps?
• How are they going to be filled?
42. 42
Evidence that the idea is achievable on a
technical, curatorial and legal basis
• Technical factors
• Is the project technically feasible?
• Whether the technical skills required to complete the project and who
will be required to implement them have been clearly identified.
• Legal factors
• Whether the legal terms of use for the digital collections identified have
been checked and compliance demonstrated in the proposal.
• Whether the idea contains information that ensures the project does not
in any way infringe intellectual property rights or any other rights of any
third party.
43. 43
Evidence that the idea is achievable on a
technical, curatorial and legal basis
• Curatorial factors
• Can it be demonstrated that the digital content is available, accessible
and can be realistically used for the project?
• Background research for people connected to the collection / the story
of the collection
• Whether any extra work required to make the digital content usable for the
project has been clearly identified (where appropriate).
44. 44
Plan
• Define period of time X and Y
• Activity described here (e.g. what, when and by who)
• Break down into manageable chunks / units
• Can run in parallel
• Build in reasonable review points and lag.
• How will it be monitored?
• It’s a plan, it can change!
45. 45
Risk
• Have a view of assessing risks
• Risk / Mitigation / Likelihood / Impact
• Use Low, Medium and High for Likelihood and Impact
Risk: Insufficient support from UK research councils.
Mitigation: Build compelling case. Carry out research to gauge demand and commitment to resourcing. Adapt model according to our findings.
Likelihood (after mitigation): M
Impact: H
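A risk register like the one above can be kept as plain data and sorted so the most serious items come first. This is a minimal sketch under stated assumptions: scoring a risk as likelihood times impact is a common convention rather than something the slide prescribes, and the second risk in the example is invented.

```python
from dataclasses import dataclass

# Numeric weights for Low, Medium and High (an assumed scale).
LEVELS = {"L": 1, "M": 2, "H": 3}


@dataclass
class Risk:
    description: str
    mitigation: str
    likelihood: str  # L, M or H, judged after mitigation
    impact: str      # L, M or H

    def score(self):
        """Likelihood times impact, so H/H scores 9 and L/L scores 1."""
        return LEVELS[self.likelihood] * LEVELS[self.impact]


def prioritise(risks):
    """Return risks ordered from highest to lowest score."""
    return sorted(risks, key=lambda risk: risk.score(), reverse=True)


register = [
    # Hypothetical extra entry, added only to show the ordering.
    Risk("OCR quality too poor for text mining",
         "Test on a sample first", "L", "M"),
    # The example from the slide.
    Risk("Insufficient support from UK research councils",
         "Build compelling case; gauge demand; adapt model", "M", "H"),
]

for risk in prioritise(register):
    print(risk.score(), risk.description)
```

Reviewing the register at each planned review point keeps it in step with a plan that is expected to change.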
46. 46
Labs mindset…
1. Start a conversation, generate positive energy
and try to support ideas
2. Start with small experiments, but think big.
3. Fail faster (don’t be afraid) and persevere.
4. Reject perfectionism! Good enough is
sometimes…good enough!
5. Celebrate the uses of digital collections
https://goo.gl/noASfl
Editor's Notes
140 seconds
The British Library is the national library of the UK and one of the largest research libraries in the world. The Library moved to a new purpose-built building in 1997 <click> the largest of its kind that was built in the UK in the 20th century. Many frequently used items are stored 5 stories below the main building at St Pancras in London and many might not know that part of the building is meant to look like a ship on a journey to discovery!<click>. <click to switch off>
The building can sit 1,200 researchers at any one time across 5 reading rooms.
<click>Medium and long term requested items are held at Boston Spa in Yorkshire in a low oxygen warehouse, using robots to retrieve items. In total, the library has 625 km of shelving, growing by 12 km every year.
Whilst we acquire items through purchase or gifts, much of the collection has been built up through legal deposit. That is, by law, a copy of every UK and Ireland print publication must be given to the British Library by its publishers. Around 3 million items are added per year. In 2013, legal deposit was extended to cover non-print material which means by law we take in digitally published items as well, which means regular mass crawls of the entire UK web domain as well as ebooks, ejournals etc.
85 seconds
The picture you can see is inside the main building in London, it’s the King’s Library – King George the Third’s personal library! Sometimes known as the ‘stack’, I walk past this every day and I sometimes forget that the collections the British Library has are truly staggering! We currently estimate them to exceed <click>150 million items, representing every age of written civilisation and every known language. Our archives now contain the earliest surviving printed book in the world, the Diamond Sutra, written in Chinese and dating from 868 AD….
So some big numbers…
Over …<click>14 million books
<click>60 million patents
<click>8 million stamps
<click>4 million maps
<click>3 million sound recordings
<click>1.6 million music scores
<click>over .3 million manuscripts
<click>0.8 million serials titles (which are of course made up of many many volumes/editions), this is where a lot of our content is, just in case you thought the numbers didn’t add up!
6 Seconds (20 Words)
So <Click> ‘how’ do we try and engage those who might be interested in the BL’s digital collections and data? <Click>
17 Seconds (53 Words)
<Click>The British Library is one of the largest libraries in the world <Click> with an estimated 180 million physical items, with only a small proportion being digitised. <Click>We estimate this is around 1-2%, but no one really knows exactly how much. However, more and more items are being stored as ‘born’ digital, such as the UK Web Archive<Click>
Have balance of Multimedia
Broadcast news and radio, sounds (Save our Sounds)
Books and newspapers
Images
BNB
Qatar Digital library
Hebrew manuscripts
21 Seconds (65 Words)
Katrina Navickas was particularly interested in the <Click>Chartist Movement, a group campaigning for the vote for working people. <Click>They were the biggest popular movement for democracy in 19th century British history, as this early picture of a huge monster meeting at Kennington Common shows. <Click>She wanted to use a combination of manual and computational methods to explore our Digitised Newspapers to find out when and where they met and plot them on a map, <Click>hopefully unearthing new history.
970 files from a selection of 19th century newspaper titles from the BL corpus for us to correct using the overProof post-OCR correction software
The best way to measure the improvement made by the correction process is to compare the OCR'ed text and the automatically corrected text with a perfect correction made by a human (known as the "ground truth").
Hannah-Rose's 5 small human-corrected samples are shown as green dots. These are not only smaller than the other files, but their raw error rate is much lower at 13.3%. OverProof was measured as reducing this to 5.4%, a removal of almost 60% of errors.
The red dotted-line indicates the correction "break-even" point: the further under the line, the better the quality of the document after correction.
In the graph below, the grey line shows distribution of files across error rates before correction and the green line after correction.
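The comparison against ground truth described in these notes can be sketched in a few lines. The code below measures a character-level error rate with edit distance and then works out the share of errors removed. Treating the error rate as character-level is an assumption made for illustration; the notes do not say how overProof's figures were calculated.

```python
def edit_distance(a, b):
    """Levenshtein distance: fewest insertions, deletions and substitutions."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            current.append(min(
                previous[j] + 1,                       # delete from a
                current[j - 1] + 1,                    # insert into a
                previous[j - 1] + (char_a != char_b),  # substitute, or match
            ))
        previous = current
    return previous[-1]


def error_rate(text, ground_truth):
    """Edit distance to the human-corrected text, per ground-truth character."""
    if not ground_truth:
        raise ValueError("ground truth must not be empty")
    return edit_distance(text, ground_truth) / len(ground_truth)


def relative_reduction(before, after):
    """Share of the original errors that correction removed."""
    return (before - after) / before


# One wrong character out of five in a long-s misreading.
print(error_rate("Caufe", "Cause"))
# The figures quoted in the notes: 13.3% before correction, 5.4% after.
print(round(relative_reduction(13.3, 5.4), 3))
```

With the quoted figures the reduction is (13.3 - 5.4) / 13.3, roughly 0.594, which matches the "almost 60%" in the notes.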