eScriptorium: An Open Source Platform for Historical Document Analysis
Daniel Stökl Ben Ezra
Peter Stokes
Marc Bui
Ben Kiessling
Robin Tissot
eScriptorium
• Blog: http://escripta.hypotheses.org
• UI Code: https://gitlab.inria.fr/scripta/escriptorium
• AI Code: https://github.com/mittagessen/kraken
Funded by: PSL IRIS Scripta, H2020 Resilience, MENESR, DIM STCN Ile de France,
EquipEx Biblissima+ [indirectly: Mellon, MCC]
eScriptorium Universe
Scripta PSL: eScriptorium
LectauRep, INRIA, ANF, openITI, North-Eastern, U Maryland, ERC, Vietnamica, EPHE, U-Bib Heidelberg ?, National Library of Israel ?, ENC, Sorbonne Université, DIM STCN, Observatoire de Paris, IRHT, H2020 Resilience, Biblissima+, TGIR Huma-Num
• manuscriptologIA
High Performance Computing Cluster at mesoPSL
current
• Import: IIIF, pdf, imgfiles (jpg, png, …), alto, PageXML,
trained segmentation or transcription models (user definable architectures)
• Ergonomic UI for manual segmentation, transcription and (soon) annotation.
4 panels (facsimile, segmentation, transcription, text-annotation)
↓ Metadata imported via IIIF
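To make the IIIF import bullet concrete, here is a minimal sketch of how page images and metadata can be read from a IIIF manifest. It assumes the IIIF Presentation API 2.x layout (sequences, canvases, images). The sample manifest, its example.org URLs and the function names are hypothetical, and this is not eScriptorium's own importer.

```python
import json

# Hypothetical, heavily trimmed IIIF Presentation 2.x manifest (not a real document).
# Real manifests may use language maps or lists for label/value; plain strings are assumed here.
SAMPLE_MANIFEST = json.loads("""
{
  "@id": "https://example.org/iiif/ms-1/manifest",
  "label": "Example manuscript",
  "metadata": [
    {"label": "Shelfmark", "value": "MS 1"},
    {"label": "Date", "value": "14th century"}
  ],
  "sequences": [
    {
      "canvases": [
        {
          "label": "f. 1r",
          "images": [
            {"resource": {"@id": "https://example.org/iiif/ms-1/f1r/full/full/0/default.jpg"}}
          ]
        },
        {
          "label": "f. 1v",
          "images": [
            {"resource": {"@id": "https://example.org/iiif/ms-1/f1v/full/full/0/default.jpg"}}
          ]
        }
      ]
    }
  ]
}
""")


def list_pages(manifest):
    """Return (canvas label, image URL) pairs in manifest order."""
    pages = []
    for sequence in manifest.get("sequences", []):
        for canvas in sequence.get("canvases", []):
            for image in canvas.get("images", []):
                pages.append((canvas.get("label", ""), image["resource"]["@id"]))
    return pages


def read_metadata(manifest):
    """Flatten the manifest's label/value metadata pairs into a dict."""
    return {entry["label"]: entry["value"] for entry in manifest.get("metadata", [])}


pages = list_pages(SAMPLE_MANIFEST)
metadata = read_metadata(SAMPLE_MANIFEST)
```

Each canvas corresponds to one page image, which is why a single manifest URL is enough to bring in a whole manuscript together with its descriptive metadata.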
Ergonomic transcription e.g. of vertical or oblique lines
BL ms Add. 27296
Transcription font size automatically adapted to manuscript line
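One plausible way to adapt the font size to a manuscript line, sketched under stated assumptions: measure the length of the line's baseline polyline and pick the size at which the transcribed text roughly fills it. The average character width of 0.5 em, the size limits and the function names are illustrative assumptions; the slides do not say how eScriptorium actually computes this.

```python
import math


def baseline_length(points):
    """Length in pixels of a baseline given as a list of (x, y) points."""
    return sum(math.dist(a, b) for a, b in zip(points, points[1:]))


def fit_font_size(text, points, char_width_em=0.5, min_px=8.0, max_px=72.0):
    """Estimate a font size (px) so that `text` spans the baseline.

    char_width_em is an assumed average glyph width relative to the font size;
    the result is clamped to [min_px, max_px].
    """
    if not text:
        return min_px
    size = baseline_length(points) / (len(text) * char_width_em)
    return max(min_px, min(max_px, size))


# An oblique line: the baseline runs from (0, 0) to (300, 400), i.e. 500 px long.
size = fit_font_size("a" * 20, [(0, 0), (300, 400)])
```

Because the length is measured along the baseline rather than along the x axis, the same estimate applies to vertical or oblique lines.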
current
• Import: IIIF, pdf, imgfiles (jpg, png, …), alto, PageXML,
trained segmentation or transcription models (user definable architectures)
• Ergonomic UI for manual segmentation, transcription and (soon) annotation.
4 panels (facsimile, segmentation, transcription, text-annotation)
• Automatic segmentation (lines, semantic lines and regions, also overlapping)
based on user-defined ontologies.
• Automatic transcription according to the principles set by the user.
• Export: alto 4(!), PageXML, txt, imgfiles (jpg, png, …)
trained segmentation or transcription models
• Powerful and growing API
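As an illustration of what can be done with an ALTO 4 export, the sketch below reads the text of each line with Python's standard library. The sample document is a hypothetical minimal ALTO file written for this example; a real export will carry more elements and attributes, and the function name is illustrative.

```python
import xml.etree.ElementTree as ET

# Namespace of the ALTO version 4 schema.
ALTO_NS = {"alto": "http://www.loc.gov/standards/alto/ns-v4#"}

# Hypothetical minimal ALTO document (not an actual eScriptorium export).
SAMPLE_ALTO = """
<alto xmlns="http://www.loc.gov/standards/alto/ns-v4#">
  <Layout>
    <Page ID="p1" WIDTH="1000" HEIGHT="1500">
      <PrintSpace>
        <TextBlock ID="b1">
          <TextLine ID="l1">
            <String CONTENT="first line"/>
          </TextLine>
          <TextLine ID="l2">
            <String CONTENT="second"/>
            <SP/>
            <String CONTENT="line"/>
          </TextLine>
        </TextBlock>
      </PrintSpace>
    </Page>
  </Layout>
</alto>
"""


def read_lines(alto_xml):
    """Return (line ID, text) pairs, joining the String elements of each TextLine."""
    root = ET.fromstring(alto_xml.strip())
    lines = []
    for line in root.iterfind(".//alto:TextLine", ALTO_NS):
        words = [s.get("CONTENT", "") for s in line.iterfind("alto:String", ALTO_NS)]
        lines.append((line.get("ID"), " ".join(words)))
    return lines


lines = read_lines(SAMPLE_ALTO)
```

The same loop works whether a line is stored as one String element or as several word-level ones, so it does not depend on how finely the transcription was tokenised.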
Segmentation and Transcription
Demonstration
↑ User definable
segmentation ontology
Locate illuminations through layout segmentation
Automatic segmentation result of ms specific model
Ergonomic correction
Jbaiter Mirador textoverlay plugin
eScriptorium (near) FUTURE
Scripta PSL: eScriptorium
LectauRep, INRIA, ANF, openITI, North-Eastern, U Maryland, ERC, Vietnamica, EPHE, U-Bib Heidelberg ?, National Library of Israel ?, ENC, Sorbonne Université, DIM STCN, Observatoire de Paris, IRHT, H2020 Resilience, Biblissima+, TGIR Huma-Num
• Search
• Trainable reading order
• Prototype for text annotation (NE, ecdotic) with TEI export
• Prototype for image annotation (e.g. Digipal / Archetype)
• Customizable virtual keyboard
• Vertical interface for Chinese
• Automatic text alignment
• Additional simplified interface
• Improved project management
• Crowdsourcing interface
• manuscriptologIA
High Performance Computing Cluster at mesoPSL
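The idea behind automatic text alignment can be sketched with a generic character-level alignment between an automatic transcription and an existing reference text. This uses Python's difflib as a stand-in technique; the slides do not describe the method planned for eScriptorium, and the function name and sample strings are hypothetical.

```python
from difflib import SequenceMatcher


def align(automatic, reference):
    """Split both strings into aligned chunks.

    Returns (tag, automatic_chunk, reference_chunk) triples, where tag is one of
    'equal', 'replace', 'delete' or 'insert' as defined by difflib.
    """
    matcher = SequenceMatcher(None, automatic, reference, autojunk=False)
    return [
        (tag, automatic[i1:i2], reference[j1:j2])
        for tag, i1, i2, j1, j2 in matcher.get_opcodes()
    ]


# Hypothetical strings: a noisy automatic transcription and a clean reference text.
chunks = align("in the begining god creatd", "in the beginning god created")
for tag, auto_chunk, ref_chunk in chunks:
    print(f"{tag:8} {auto_chunk!r} -> {ref_chunk!r}")
```

Chunks tagged 'equal' anchor the reference text to positions in the automatic transcription; the remaining chunks mark where the two disagree.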
Transcription created automatically without
specific transcription BnF syr 341
Judeo-Arabic+Hebrew, Ox. Bodl. Pococke 295,
Maimonides, Mishnah Commentary
Greek papyri (with WÜ, HD, B)
eScriptorium used for Dead Sea Scroll Glyph alignment
Automatic letter level alignment
Images of Dead Sea Scrolls by Shay Halevy, courtesy Israel Antiquities Authority
p. 3558:
Please stay tuned for upcoming workshops
Contact: daniel.stoekl@ephe.psl.eu, peter.stokes@ephe.psl.eu
https://escripta.hypotheses.org
Many thanks to
Bibliothèque nationale de France
National Library of Israel (Ktiv!)
Bayerische Staatsbibliothek München
Biblioteca Apostolica Vaticana
Bodleian Library, Oxford
Cambridge University Library
Israel Antiquities Authority, Jerusalem
Staatsbibliothek Berlin, Preußischer Kulturbesitz
Intro tutorial: https://lectaurep.hypotheses.org/documentation/prendre-en-main-escriptorium
