The document presents a hierarchical deep supervised network (DSN) for binarizing degraded document images. DSN extends convolutional neural networks to extract hierarchical features and classify pixels as text or background. It achieves state-of-the-art performance on several document image binarization datasets without pre- or post-processing. Subsequent research has adapted DSN to other document types like music scores and paychecks, and worked to reduce its computational complexity.
Seal of Good Local Governance (SGLG) 2024Final.pptx
Binarization of degraded document images based on hierarchical deep supervised network
1. Binarization of Degraded Document
Images Based on Hierarchical Deep
Supervised Network
Quang Nhat Vo, Soo Hyung Kim, Hyung Jeong
Yang, and Gueesang Lee
Pattern Recognition 74 (2018) 568–586
Presented by:
Tarik Reza Toha
#1017052013
2. • Problem definition
– What is the research problem?
• Motivation
– Why have the authors done the research?
• Solution approach
– How have the authors solved the problem?
– Be detail on this.
• Subsequent advancements
– What are the subsequent research studies and how have
they further advanced the solution of the problem?
2
Outline
3. 3
Digital Archiving
• Historical documents represent valuable cultural
heritages that need to be protected and preserved
• Automatic analysis of historical-document images
involves:
– Layout analysis
– Text-line and word segmentation
– Optical character recognition (OCR)
4. • Binary image representation is preferred for document
analysis
– Each pixel is labeled as “text” (1) or “background” (0)
• Binarization of degraded document images is complicated
– Non-uniform intensity
– Complex background
– Bleed through
• Existing solutions use unsupervised approaches and low-
level features
– Difficult to differentiate the text from the non-text components
4
Binary Degraded Document Images
5. • Global binarization algorithms
– Extracted labeling information is applied to the entire
document images
• Otsu et al., compute a threshold
– minimize the within-class variance
– maximize the between-class variance
• Clustering-based approaches separate the text
through learning of the unsupervised models
• Work well with simple backgrounds and a
uniform intensity
5
Existing Binarization Methods
It cannot be used
for degraded
documents
6. • Local binarization algorithms
– Predict based on its neighborhood information
• Image binarization is a classification problem
– Unsupervised-classification algorithms
– Supervised learning-based approaches
• parameter-free nature
• no need for pre- or post-processing
– Deep neural network-based approaches
6
Existing Binarization Methods (contd.)
7. 7
Existing Binarization Methods (contd.)
Still noises
and disconnected
strokes exist
Howe’s method vs Vo’s method on DIBCO 2011 dataset
8. • Hierarchical deep supervised network (DSN)
– Learns different feature levels from image data itself
to classify foreground and background from degraded
document images
• DSN extends traditional convolutional neural
network (CNN) to extract different feature levels
8
Main Contribution
22. • The binarization of degraded document images is
a challenging problem in terms of document
analysis
– DSN is a hierarchical architecture of deep supervised
network that incorporates side layers to improve the
training convergence
• Future work
– Handle the weak information
– Adaptation to music score and paycheck
– Reduce the number of convolutional layers
22
Conclusion