Handwritten Text Recognition:
Key concepts
PD Dr. Roger Labahn
Computational Intelligence Technology Lab
Mathematical Optimization Group
Institute for Mathematics
University of Rostock
co:op Convention | READ Kickoff 19.01.2016
Handwritten Text Recognition: Key concepts
Introduction
Concepts – Problems – Tasks
Recognition & Training
Interpretation – Decoding
Epilog
Framework – Workflow
• Application: keyword search, transcription, ...
OUT: textual information (words, positions, ...) with alternatives & confidences
⇑
• HTR-Engine
⇑
IN: writing images (lines, words, table cells, form fields, ...)
• Layout Analysis: ..., text blocks
Alternative recognition strategies
Topological methods
• learn & read graphical substructures of writings
• arcs, lines, curves, holes, ...
HMM based methods
• Hidden Markov Models
• learn & read states while traversing the writing
RNN based methods
• Recurrent Neural Networks
Recognition Engine – C I T lab & MoU partner PLANET
Decoding ⇒ textual output
• textual interpretation of recognition results
• matching external requirements / knowledge (dictionaries, language model, ...)
⇑
Recognition ⇒ recognition matrix
• recognition information from image information
• processing a standardized writing image
⇑
Writing preprocessing ⇒ standardized writing
• corrections & normalizations
• e.g. baseline, slant, height, ...
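The three stages above form a simple pipeline. A minimal data-flow sketch in Python; the function names and dummy data are hypothetical and do not describe the actual CITlab/PLANET interfaces:

```python
# Minimal data-flow sketch of the engine described above (hypothetical names,
# dummy data -- not the CITlab/PLANET interfaces).
import numpy as np

def preprocess(line_image):
    """Stage 1: corrections & normalizations (baseline, slant, height, ...)."""
    return line_image                       # stub: pretend the image is already normalized

def recognize(normalized_image, channels):
    """Stage 2: produce a recognition (confidence) matrix with one probability
    distribution over the character channels per writing position."""
    positions = normalized_image.shape[1] // 8            # e.g. one output column per 8 pixels
    matrix = np.random.rand(positions, len(channels))
    return matrix / matrix.sum(axis=1, keepdims=True)     # rows sum to 1

def decode(conf_matrix, channels, dictionary):
    """Stage 3: match the matrix against permissible expressions
    (stubbed here; a dynamic-programming sketch follows in the decoding section)."""
    return dictionary[0], 1.0 / len(dictionary)           # dummy result, dummy confidence

line = np.random.rand(64, 800)                            # stand-in for a writing-line image
conf = recognize(preprocess(line), ".BDadlou ")
print(decode(conf, ".BDadlou ", ["Bad Doberan", "Rostock"]))
```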
Introduction
Concepts – Problems – Tasks
Segmentation
Context
Language
HTR
Recognition & Training
Interpretation – Decoding
Epilog
Segmentation ? NONE !
(classical) OCR = Optical Character Recognition
• Reading single characters ⇒ sub-images per character ! ?
• B a d ␣ D o ??? a n
Segmentation-free Reading
• processing the entire writing image: word ... line ...
• scanning information ⇒ data sequence (signal) / character sequence
• the sequence grows as the scan proceeds: B, BB, BB., BB.a, BB.ad, BB.ad␣, BB.ad␣., BB.ad␣.D, ... (collapse sketched below)
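A minimal sketch of how such a raw per-position sequence can be collapsed into text. The assumptions are mine, not stated on the slide: '.' marks positions where no character is recognized, and repeated neighbouring symbols belong to the same character (as in CTC-style decoding of recurrent networks):

```python
# Collapse a raw per-position character sequence into text.
# Assumptions (mine, not from the slide): '.' is a "no character" output, and
# repeated neighbouring symbols belong to one character (CTC-style decoding).
def collapse(raw, none_char="."):
    out, prev = [], None
    for ch in raw:
        if ch != prev and ch != none_char:
            out.append(ch)
        prev = ch
    return "".join(out)

print(collapse("BB.ad␣DDolo.auu"))   # -> "Bad␣Doloau": a free reading, still imperfect
```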
Image context is essential !
Single segment without context
• u ?? OR n ??
• virtually not (sufficiently) readable
Character sequence without context
• ???
• virtually not (sufficiently) explainable
Language context is essential !
Free reading – no restrictions on possible reading results
• BB.ad␣DDolo.auu
• application: figures & general numbers, ...
Comparison against a dictionary or keyword
• task: read a German city name from a given list ! / find the name Bad Doberan !
• Bad Doberan
• goal: optimal (best possible) correspondence between the writing / reading result and the dictionary entry / keyword
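One toy way to realize such a comparison is edit distance between the free-reading result and every dictionary entry. The actual engine matches against the confidence matrix itself (see Interpretation – Decoding), so treat this purely as an illustration:

```python
# Toy dictionary comparison: Levenshtein (edit) distance between the
# free-reading string and every permissible word.  '␣' is written as a plain
# space here.  Illustration only; the real matching works on the confidence matrix.
def edit_distance(a, b):
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                   # delete ca
                           cur[j - 1] + 1,                # insert cb
                           prev[j - 1] + (ca != cb)))     # substitute
        prev = cur
    return prev[-1]

cities = ["Bad Doberan", "Bad Sülze", "Rostock"]
reading = "BB.ad DDolo.auu"
print(min(cities, key=lambda c: edit_distance(reading, c)))   # -> "Bad Doberan"
```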
OCR ? HTR !
new paradigm – new concepts ⇒ new term !
• HTR: Handwritten Text Recognition
• ATR: Automatic Text Recognition
• ... ???
Introduction
Concepts – Problems – Tasks
Recognition & Training
Feature extraction
Writing processing
Neural Network
Parameter training
Interpretation – Decoding
Epilog
From pixel values to features
original grey image
Filtering
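The filter images themselves are not reproduced in this extract. As a generic illustration of filtering a grey-value line image (an example kernel of my choosing, not necessarily one used by the engine):

```python
# Generic illustration of "filtering" a grey-value image; the kernel is an
# example of mine, not necessarily a filter used by the actual engine.
import numpy as np
from scipy.ndimage import convolve

grey = np.random.rand(64, 800)                  # stand-in for an original grey line image
sobel_x = np.array([[-1, 0, 1],
                    [-2, 0, 2],
                    [-1, 0, 1]], dtype=float)   # emphasizes vertical stroke edges
feature_map = convolve(grey, sobel_x, mode="nearest")
print(feature_map.shape)                        # same spatial size, one feature value per pixel
```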
Collect & remember context !
Writing processing
• scanning in different directions ⇒ data sequences (signals)
Information memory
• neural networks with complex neurons (cells)
• recurrent connections ⇒ memory
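In the simplest recurrent cell, "recurrent connections ⇒ memory" reads as follows (a plain RNN recurrence, written in my notation; the actual networks use the more complex cells of the next slide):

```latex
% Plain recurrent cell: the hidden state h_t carries memory of all earlier
% positions because it feeds back into itself.
h_t = \tanh\bigl(W\,x_t + U\,h_{t-1} + b\bigr), \qquad
y_t = \operatorname{softmax}\bigl(V\,h_t\bigr)
```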
Complex cells – memory by recurrent connections
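One widely used complex cell is the LSTM cell (used, for instance, in the Graves & Schmidhuber architecture referenced below). Assuming this is the cell family meant here, its gates control what is stored, kept, and emitted:

```latex
% LSTM cell in the usual notation (assumption: this is the "complex cell" meant).
i_t = \sigma(W_i x_t + U_i h_{t-1} + b_i)                                % input gate
f_t = \sigma(W_f x_t + U_f h_{t-1} + b_f)                                % forget gate
o_t = \sigma(W_o x_t + U_o h_{t-1} + b_o)                                % output gate
c_t = f_t \odot c_{t-1} + i_t \odot \tanh(W_c x_t + U_c h_{t-1} + b_c)   % cell memory
h_t = o_t \odot \tanh(c_t)                                               % cell output
```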
Hierarchical Neural Networks
From feature input to network output
(Figure from GRAVES, SCHMIDHUBER: Offline Handwriting Recognition with Multidimensional Recurrent Neural Networks)
Parameter training: Machine Learning
Theory
• objective: optimally adapt parameters in cells & along network connections
• idea: train the network with learning data samples
• optimization: minimize the error (network output vs. sample target) over the training data
Practice: impression of large application cases
• 10⁴ network cells
• 10⁶ trainable parameters
• 10⁴ learning data samples (writing images)
• 150 training epochs, each processing every sample once
• 4 weeks of training from scratch
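The optimization bullet, written as a single objective (notation mine):

```latex
% Minimize the total error between network output and sample target over the
% training set \{(x^{(k)}, t^{(k)})\}_{k=1}^{N}:
\theta^{\ast} \;=\; \arg\min_{\theta} \sum_{k=1}^{N}
    E\bigl(\mathrm{net}_{\theta}(x^{(k)}),\, t^{(k)}\bigr)
```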
Learning data ...
• ... labeled training samples: ground truth
  HTR: writing images with correct text
• ... the more the better ... BUT:
  start with a realistic (reasonable) number, improve while working
• ... represent all project data ... BUT:
  start with HTR (networks) from similar collections & corpora
• ... contribute to general HTR engine improvement:
  put into the network repository for specific application cases
Introduction
Concepts – Problems – Tasks
Recognition & Training
Interpretation – Decoding
Network output
Decoding
Epilog
Channel probabilities
Pre-conditions
• (abstract) alphabet of (abstract) characters
• text composed of exactly these characters
• alphabet characters ⇐⇒ network output neurons (channels)
• example: digits, uppercase letters, lowercase letters, special characters (e.g. ␣, -)
• much more general: any symbol unit learnable from training data
• current (large) application case: up to 150 character channels
• independent of (natural) language, reading/writing direction, understanding
Network output
probability of a (character) channel at a writing (image) position
Confidence Matrix – recognition / perception matrix
(matrix image not reproduced; channels shown: . B D a d l o u ␣)
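A toy confidence matrix over exactly these channels; the numbers are invented for illustration and do not come from a real network:

```python
# Toy confidence matrix: one row per writing position, one column per channel,
# each row a probability distribution.  Numbers invented for illustration.
import numpy as np

channels = [".", "B", "D", "a", "d", "l", "o", "u", "␣"]
conf = np.array([
    [0.05, 0.80, 0.05, 0.02, 0.02, 0.02, 0.02, 0.01, 0.01],   # position 0: probably "B"
    [0.70, 0.10, 0.05, 0.05, 0.03, 0.03, 0.02, 0.01, 0.01],   # position 1: probably "."
    [0.05, 0.02, 0.02, 0.75, 0.06, 0.04, 0.03, 0.02, 0.01],   # position 2: probably "a"
    [0.05, 0.02, 0.02, 0.05, 0.78, 0.03, 0.02, 0.02, 0.01],   # position 3: probably "d"
])
assert np.allclose(conf.sum(axis=1), 1.0)                     # rows are distributions
print("".join(channels[i] for i in conf.argmax(axis=1)))      # -> "B.ad"
```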
Expression matching
• restrict to
  permissible words ⇒ dictionary
  keyword construct(s) ⇒ regular expression
• consider
  character confidences ⇒ probability measure
  or their negative logarithms ⇒ distance measure
Algorithmic method
• compare the confidence matrix against every permissible expression
• use an extremely fast algorithm: Dynamic Programming
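A minimal dynamic-programming sketch of this matching, with negative log probabilities as the distance measure. It assumes each character of a word covers one or more consecutive writing positions; real decoders additionally handle a "no character" channel, alternatives, and pruning:

```python
# Match candidate words against a confidence matrix by dynamic programming,
# using -log(probability) as the distance.  Simplified sketch, not the actual
# decoder: no "no character" channel, no alternatives, no pruning.
import math

def word_distance(word, conf, channels):
    idx = [channels.index(c) for c in word]
    T, J, INF = len(conf), len(word), float("inf")
    prev = [INF] * J                        # prev[j]: best cost with the last position assigned to character j
    for t in range(T):
        cur = [INF] * J
        for j in range(min(t, J - 1) + 1):  # character j cannot start before position j
            emit = -math.log(conf[t][idx[j]])
            if t == 0:
                best = 0.0 if j == 0 else INF
            else:
                best = min(prev[j], prev[j - 1]) if j > 0 else prev[0]
            cur[j] = emit + best
        prev = cur
    return prev[-1]                         # every character used, every position explained

channels = [".", "B", "D", "a", "d"]
conf = [                                    # toy matrix: 4 positions, rows sum to 1
    [0.05, 0.80, 0.05, 0.05, 0.05],
    [0.70, 0.10, 0.05, 0.10, 0.05],
    [0.05, 0.05, 0.05, 0.80, 0.05],
    [0.05, 0.05, 0.05, 0.05, 0.80],
]
for candidate in ["Bad", "Dad"]:
    print(candidate, word_distance(candidate, conf, channels))   # "Bad" gets the smaller distance
```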
Decoding
Objective – Result
• permissible expression(s) with the best match to the recognition output
• best match ⇐⇒ maximal probability ⇐⇒ minimal distance (see the identity below)
• best alternatives ranked by measure (probability / distance)
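Why maximal probability and minimal distance coincide: with per-character confidences p_1, ..., p_n and distances d_i = -log p_i,

```latex
\max \prod_{i=1}^{n} p_i
  \;\Longleftrightarrow\;
\min \Bigl(-\log \prod_{i=1}^{n} p_i\Bigr)
  \;=\; \min \sum_{i=1}^{n} (-\log p_i)
  \;=\; \min \sum_{i=1}^{n} d_i .
```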
Practice: impression of actual application cases
• timings cover decoding only, on pre-processed lines
• searching 1 keyword in 10,500 lines (433 pages): 2–3 sec on average
• reading 1 page against an 11,650-word dictionary: 8–9 sec on average
Dynamic Programming
Introduction
Concepts – Problems – Tasks
Recognition & Training
Interpretation – Decoding
Epilog
Results
from C I T lab’s contribution to ICDAR’s HTRtS-2015 contest
WER = 0%, CER = 0%
  reference:  who has with or without right the temporary possession of it : and
  hypothesis: who has with or without right the temporary possession of it : and
WER = 17%, CER = 4%
  reference:  operation of this act is spent upon Titius only ,
  hypothesis: operation of this act isspeut upon Titius only ,
WER = 67%, CER = 52%
  reference:  of the said first issue : the amount of such second consequently gap/ to the
  hypothesis: of the and put feet the without of such ; said uitrquunity be the
WER = 80%, CER = 17%
  reference:  for a simple personal Injury the Offender ’ s punish=
  hypothesis: For on simple personal injury the offenders punish .
(Figure caption, cropped in the source: examples of test line images of increasing difficulty; the reference transcript and the CITlab system hypothesis are displayed, in this order, below each image, with the corresponding WER and CER shown on the right. The cropped remarks note that lines with crossed-out words can still be transcribed, and that a transcript with a large WER but a low CER can be more useful than one with a lower WER and a higher CER.)
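WER and CER above are the usual word and character error rates: the word-level, respectively character-level, edit distance between hypothesis and reference, normalized by the reference length:

```latex
% S = substitutions, D = deletions, I = insertions against the reference;
% N = number of words (for WER) or characters (for CER) in the reference.
\mathrm{WER} = \frac{S_w + D_w + I_w}{N_w}, \qquad
\mathrm{CER} = \frac{S_c + D_c + I_c}{N_c}
```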
(Figure from SÁNCHEZ, TOSELLI, ROMERO, VIDAL: ICDAR2015 Competition HTRtS: Handwritten Text Recognition on the tranScriptorium Dataset)
Thanks ...
C I T lab Group – URO MoU Partner
PLANET intelligent systems GmbH
EU Funding
Recognition & Enrichment of Archival Documents
... for your attention!