Binarization of Degraded Document
Images Based on Hierarchical Deep
Supervised Network
Quang Nhat Vo, Soo Hyung Kim, Hyung Jeong
Yang, and Gueesang Lee
Pattern Recognition 74 (2018) 568–586
Presented by:
Tarik Reza Toha
#1017052013
• Problem definition
– What is the research problem?
• Motivation
– Why have the authors done the research?
• Solution approach
– How have the authors solved the problem?
– Be detailed on this.
• Subsequent advancements
– What are the subsequent research studies and how have
they further advanced the solution of the problem?
2
Outline
3
Digital Archiving
• Historical documents are valuable cultural heritage
that needs to be protected and preserved
• Automatic analysis of historical-document images
involves:
– Layout analysis
– Text-line and word segmentation
– Optical character recognition (OCR)
• Binary image representation is preferred for document
analysis
– Each pixel is labeled as “text” (1) or “background” (0)
• Binarization of degraded document images is complicated by:
– Non-uniform intensity
– Complex backgrounds
– Bleed-through
• Existing solutions use unsupervised approaches and
low-level features
– These make it difficult to differentiate text from non-text components
4
Binary Degraded Document Images
• Global binarization algorithms
– A single labeling rule, extracted from the whole image,
is applied to every pixel of the document image
• Otsu's method computes a threshold that
– minimizes the within-class variance
– equivalently, maximizes the between-class variance
• Clustering-based approaches separate the text
by learning unsupervised models
• These methods work well only with simple backgrounds
and uniform intensity
5
Existing Binarization Methods
Global methods cannot be used
for degraded
documents
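The global thresholding idea above can be illustrated with a minimal sketch of Otsu's criterion. This is not code from the paper or the slides; the function names `otsu_threshold` and `binarize_global` are hypothetical, and the sketch assumes an 8-bit grayscale image with dark text on a lighter background.

```python
import numpy as np

def otsu_threshold(gray):
    """Return the gray level that maximizes the between-class variance.

    Assumes `gray` is a uint8 array (values 0..255).
    """
    hist = np.bincount(gray.ravel(), minlength=256).astype(np.float64)
    prob = hist / hist.sum()
    levels = np.arange(256)
    w0 = np.cumsum(prob)            # weight of the class {0..t}
    mu = np.cumsum(prob * levels)   # cumulative first moment up to t
    mu_total = mu[-1]
    w1 = 1.0 - w0                   # weight of the class {t+1..255}
    valid = (w0 > 0) & (w1 > 0)     # skip thresholds that leave a class empty
    sigma_b = np.zeros(256)
    sigma_b[valid] = (mu_total * w0[valid] - mu[valid]) ** 2 / (w0[valid] * w1[valid])
    return int(np.argmax(sigma_b))

def binarize_global(gray):
    """Apply one threshold to the whole image: text = 1, background = 0."""
    t = otsu_threshold(gray)
    return (gray <= t).astype(np.uint8)  # assumption: ink is darker than paper
```

Because one threshold is shared by every pixel, a stain or shadow that is darker than faint ink elsewhere on the page ends up on the wrong side of it, which is why this family of methods struggles with degraded documents.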
• Local binarization algorithms
– Predict each pixel's label from its neighborhood information
• Image binarization is a classification problem
– Unsupervised-classification algorithms
– Supervised learning-based approaches
• parameter-free nature
• no need for pre- or post-processing
– Deep neural network-based approaches
6
Existing Binarization Methods (contd.)
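As a rough illustration of local binarization, the sketch below thresholds each pixel against the mean of its own neighborhood. It is a generic local-mean rule chosen for brevity, not a specific method from the paper or the slides; `local_mean_binarize`, `window`, and `offset` are hypothetical names and parameters.

```python
import numpy as np

def local_mean_binarize(gray, window=15, offset=10.0):
    """Label a pixel as text (1) if it is darker than its local mean minus `offset`.

    `window` is the side length of the square neighborhood (odd values expected).
    """
    gray = gray.astype(np.float64)
    r = window // 2
    k = 2 * r + 1
    h, w = gray.shape
    # Replicate border pixels so every pixel has a full k x k neighborhood.
    padded = np.pad(gray, r, mode="edge")
    # Integral image with a leading zero row and column for O(1) window sums.
    integral = np.pad(padded, ((1, 0), (1, 0)), mode="constant").cumsum(0).cumsum(1)
    window_sum = (integral[k:k + h, k:k + w] - integral[:h, k:k + w]
                  - integral[k:k + h, :w] + integral[:h, :w])
    local_mean = window_sum / (k * k)
    return (gray < local_mean - offset).astype(np.uint8)
```

Since the threshold follows the local background level, a slow change in illumination across the page does not shift the decision the way it does for a single global threshold.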
7
Existing Binarization Methods (contd.)
Noise and
disconnected strokes
still remain
Howe’s method vs. Vo’s method on the DIBCO 2011 dataset
• Hierarchical deep supervised network (DSN)
– Learns features at different levels directly from the image
data to separate foreground from background in degraded
document images
• DSN extends the traditional convolutional neural
network (CNN) to extract features at different levels
8
Main Contribution
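The slides do not give the training objective, so the following is only a sketch of the general deep-supervision idea: each side output receives its own loss against the ground-truth map, in addition to the loss on the final (fused) prediction. The function names, the use of binary cross-entropy, and the uniform default weights are assumptions for illustration, not the authors' formulation.

```python
import numpy as np

def binary_cross_entropy(prob, target, eps=1e-7):
    """Mean per-pixel binary cross-entropy between a probability map and a 0/1 map."""
    prob = np.clip(prob, eps, 1.0 - eps)  # avoid log(0)
    return float(-np.mean(target * np.log(prob) + (1 - target) * np.log(1 - prob)))

def deeply_supervised_loss(side_probs, fused_prob, target, side_weights=None):
    """Sum of weighted side-output losses plus the loss of the fused output.

    side_probs   : list of probability maps, one per side layer (hypothetical)
    fused_prob   : probability map of the final prediction (hypothetical)
    side_weights : optional per-side weights; defaults to 1.0 each (assumption)
    """
    if side_weights is None:
        side_weights = [1.0] * len(side_probs)
    side_loss = sum(w * binary_cross_entropy(p, target)
                    for w, p in zip(side_weights, side_probs))
    return side_loss + binary_cross_entropy(fused_prob, target)
```

Supervising intermediate layers directly gives them a gradient signal that does not have to pass through the whole network, which is the usual argument for why side layers help training convergence.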
9
Proposed Architecture
Demonstration of the DSN model for dense prediction
10
Proposed Architecture (contd.)
Diagram of the proposed DSN-based document binarization model
11
Global Threshold
12
Global Prediction
13
Experimental Evaluation
Samples of the generated training image patches and
ground-truth binary maps
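One simple way to generate aligned training patches and ground-truth maps is a sliding window over each image, sketched below. The patch size, stride, and the name `extract_patches` are illustrative assumptions; the slides do not state the settings used in the paper.

```python
import numpy as np

def extract_patches(image, ground_truth, patch_size=64, stride=32):
    """Cut aligned (image patch, ground-truth patch) pairs on a regular grid.

    patch_size and stride are hypothetical defaults, not values from the paper.
    """
    if image.shape[:2] != ground_truth.shape[:2]:
        raise ValueError("image and ground truth must have the same height and width")
    h, w = image.shape[:2]
    pairs = []
    for top in range(0, h - patch_size + 1, stride):
        for left in range(0, w - patch_size + 1, stride):
            rows = slice(top, top + patch_size)
            cols = slice(left, left + patch_size)
            pairs.append((image[rows, cols], ground_truth[rows, cols]))
    return pairs
```

Using the same slices for both arrays keeps every patch pixel matched to its label, and a stride smaller than the patch size produces overlapping patches, which increases the number of training samples from a small set of pages.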
14
DIBCO 2011 Dataset
15
DIBCO 2013 Dataset
16
DIBCO 2013 Dataset (contd.)
17
DIBCO 2014 Dataset
18
H-DIBCO 2016 Dataset
19
Network Structure Analysis
20
Other Types of Documents
Korean historical document image
Chinese historical document image
21
Failure Analysis
• The binarization of degraded document images is
a challenging problem in document
analysis
– DSN is a hierarchical deep supervised network
architecture that incorporates side layers to improve
training convergence
• Future work
– Handle weak information
– Adapt the method to music scores and paychecks
– Reduce the number of convolutional layers
22
Conclusion
Thank you
Questions are welcome!
23
