Digitisation and Digital Humanities - what is the role of Libraries?
1.
Digitisation and Digital Humanities:
what is the role of Libraries?
Clemens Neudecker (@cneudecker)
Berlin State Library
8 April 2021
2.
Staatsbibliothek zu Berlin – Preußischer Kulturbesitz (SBB)
• Established 1661 as library of the
King of Prussia
• Largest research library in Germany
• Approximately 12m volumes,
23m media objects in total
• Part of the legal entity
Stiftung Preußischer Kulturbesitz
• https://staatsbibliothek-berlin.de/
4.
Digitization @ SBB
• Since 2007: in-house Digitization Center
• Approx. 1.7M images annual production
• Up to 80 concurrent digitization projects
• 20 different book scanners, scan robots, etc.
• Operation in two shifts with 24 operators
• Digitisation-on-demand service
• KITODO open source digitisation
workflow management system
5.
Digital Collections
• Main portal for digitised collections
• Currently around 180,000 digitised
documents available online
• Documents published before 1920 are
licensed as public domain
• IIIF API compatible
• Full image resolution is provided
• Full text (via OCR) and keyword search for
about 20% of the digitised content
• Downloads for images, OCR, metadata
• https://digital.staatsbibliothek-berlin.de/
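The IIIF compatibility listed above means images can be requested through the URL pattern defined by the IIIF Image API: {base}/{identifier}/{region}/{size}/{rotation}/{quality}.{format}. The following is a minimal sketch of building such a URL. The base URL and identifier are hypothetical placeholders, since the slides do not give the actual SBB endpoint or identifier scheme, and the `max` size keyword assumes Image API 2.1 or later.

```python
from urllib.parse import quote


def iiif_image_url(base_url, identifier, region="full", size="max",
                   rotation="0", quality="default", fmt="jpg"):
    """Build an image request URL following the IIIF Image API pattern.

    The identifier is percent-encoded so that characters such as "/"
    do not break the path structure.
    """
    encoded_id = quote(identifier, safe="")
    return (f"{base_url.rstrip('/')}/{encoded_id}/"
            f"{region}/{size}/{rotation}/{quality}.{fmt}")


# Hypothetical base URL and identifier, for illustration only.
url = iiif_image_url("https://example.org/iiif", "PPN000000000-00000001",
                     size="800,")
print(url)
```

Requesting `size="800,"` asks the server to scale the image to a width of 800 pixels, which is usually enough for previews. The default `max` corresponds to the full image resolution mentioned on the slide.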
6.
ZEFYS – digitized newspapers
• Digitized historical newspapers have their own portal ZEFYS
• About 200 newspaper titles and roughly 10m pages digitized
• GDR Press Portal gives access to main newspapers from the GDR
(after authentication which is necessary due to copyright)
• ZEFYS got hacked in February 2021 - but is now being reconstructed
with a new technology stack
• No full text search (yet) but approx. 5m pages already have OCR
• Currently two major newspaper digitization projects from microfilm
• https://zefys.staatsbibliothek-berlin.de/
7.
DDB Newspaper Portal
• Uniform access and UI for digitised
newspapers in Germany
• Key features
• Title list
• Calendar
• Keyword search
• Advanced features
• Citation & Persistence
• Named Entities
• Corpus Building
• https://pro.deutsche-digitale-bibliothek.de/deutsches-zeitungsportal
8.
Qurator.ai
• Leverage state-of-the-art AI/ML for
digitized cultural heritage curation
• Development of AI/ML pipeline:
• Binarization
• Layout analysis
• OCR
• Postcorrection
• Named Entity Recognition and
Named Entity Linking
• Image Similarity and Search
• https://qurator.ai
• https://github.com/qurator-spk
9.
OCR-D
• Provide the technical and organisational
framework for the OCR processing of the
German VD digitization initiatives
(documents printed in Germany from 1600
– 1900)
• Open & collaborative development:
• Specifications & Guidelines
https://ocr-d.de/en/dev
• Open source tools https://github.com/OCR-D
• Community https://gitter.im/OCR-D/Lobby
• https://ocr-d.de
10.
SoNAR (IDH)
• Examine and evaluate approaches for an
advanced research environment for
Historical Network Analysis
• Extract person names and relations from
databases & digitized newspapers
• Transform entities with relations into a
historical social network graph
• Create intuitive visualizations and
interfaces for querying and analyzing the
social network graph
• https://sonar.fh-potsdam.de
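The step from extracted person names to a social network graph can be illustrated with a small sketch. It assumes a simple co-occurrence model in which two persons are linked whenever they are mentioned in the same document. This is an illustrative simplification and not necessarily the method used in SoNAR, and the function name and sample data are invented for the example.

```python
from collections import Counter
from itertools import combinations


def build_cooccurrence_graph(docs):
    """Return weighted edges between persons mentioned in the same document.

    docs maps a document id to the list of person names found in it.
    The result maps an alphabetically sorted name pair to the number of
    documents in which both names occur.
    """
    edges = Counter()
    for names in docs.values():
        # De-duplicate names per document so each pair counts once per document.
        for a, b in combinations(sorted(set(names)), 2):
            edges[(a, b)] += 1
    return edges


# Invented sample data, for illustration only.
docs = {
    "page-1": ["Person A", "Person B", "Person C"],
    "page-2": ["Person B", "Person A"],
}
graph = build_cooccurrence_graph(docs)
for (a, b), weight in graph.most_common():
    print(f"{a} -- {b}: {weight}")
```

An edge list of this kind can be loaded into a graph database or visualization tool for querying. A real pipeline would also need to disambiguate names before linking them.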
11.
SBB LAB
• Experimental playground
• Provision of (open) datasets
• Documentation of public APIs
• Presentation of innovative prototypes
using SBB collections
• Events (Hackathons, Transcribathons)
• Digital Researcher Residency
(planned)
• https://lab.sbb.berlin/
12.
Thank you for your attention!
Questions?
Clemens Neudecker (@cneudecker)
Berlin State Library
8 April 2021