Handwritten and Machine Printed Text Separation in Document Images using the Bag of Visual Words Paradigm

Konstantinos Zagoris (1,2), Ioannis Pratikakis (2), Apostolos Antonacopoulos (1), Basilis Gatos (3), Nikos Papamarkos (2)

(1) Pattern Recognition and Image Analysis (PRImA) Research Lab, School of Computing, Science and Engineering, University of Salford, Greater Manchester, UK
(2) Department of Electrical and Computer Engineering, Democritus University of Thrace, Xanthi, Greece
(3) Institute of Informatics and Telecommunications, National Center for Scientific Research “Demokritos”, Athens, Greece

Current state-of-the-art
Three (3) main approaches
• Text Line Level
• Word Level
• Character Level

Disadvantages
• Different Page Segmentation Algorithms
• Incompatible Feature Set

Bag of Visual Words Model (BoVWM)
• Inspired by Information Retrieval theory
• The content of an image is described by a set of “visual words”
• A “visual word” is expressed by a group of local features
• The best-known local feature is the Scale-Invariant Feature Transform (SIFT)
• Codebook creation
• A codebook is defined by the set of clusters
• A “visual word” is the vector that represents the centre of a cluster
• The codebook is analogous to a dictionary

Bag of Visual Words Model (BoVWM)
• Each visual entity is described by a BoVWM descriptor
• Each SIFT point belongs to a “visual word”
• That “visual word” is the one whose cluster centre is closest to the point under a distance function (Euclidean, Manhattan)
• The descriptor reflects the frequency with which each visual word appears in the image
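
The assignment and counting steps above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names, the toy 2-D features (real SIFT descriptors have 128 dimensions), and the choice to normalise the counts into relative frequencies are all assumptions.

```python
import math
from collections import Counter


def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def manhattan(a, b):
    """Manhattan (city-block) distance between two equal-length vectors."""
    return sum(abs(x - y) for x, y in zip(a, b))


def nearest_word(feature, codebook, distance=euclidean):
    """Index of the codebook centre closest to the feature."""
    return min(range(len(codebook)), key=lambda i: distance(feature, codebook[i]))


def bovw_descriptor(features, codebook, distance=euclidean):
    """Relative frequency of each visual word among the given features.

    Normalising by the number of features is an assumption; raw counts
    would satisfy the slide's wording equally well.
    """
    if not features:
        return [0.0] * len(codebook)
    counts = Counter(nearest_word(f, codebook, distance) for f in features)
    return [counts[i] / len(features) for i in range(len(codebook))]


# Hypothetical toy data: three visual words and four local features.
codebook = [(0.0, 0.0), (10.0, 10.0), (0.0, 10.0)]
features = [(1.0, 1.0), (9.0, 9.5), (0.5, 0.2), (1.0, 9.0)]
descriptor = bovw_descriptor(features, codebook)
```

Passing `distance=manhattan` switches the distance function without changing the rest of the sketch.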

Proposed Method
Original Image → Page Segmentation → Block Descriptor Extraction (BoVW model) → Classification → Final Result

Page Segmentation
Original Image → Locally Adaptive Binarisation Method [1] → Adaptive Run Length Smoothing Algorithm [2] → Final Result

1. B. Gatos, I. Pratikakis, and S. Perantonis. Adaptive degraded document image binarization. Pattern Recognition, 39(3):317–327, 2006.
2. N. Nikolaou, M. Makridis, B. Gatos, N. Stamatopoulos, and N. Papamarkos. Segmentation of historical machine-printed documents using adaptive run length smoothing and skeleton segmentation paths. Image and Vision Computing, 28(4):590–604, 2010.
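
To give a feel for run length smoothing, here is a sketch of the classic fixed-threshold horizontal variant applied to one row of a binary image. It is deliberately not the adaptive algorithm of [2], whose details are not given in the slides; the function name and the `max_gap` parameter are assumptions.

```python
def rlsa_horizontal(row, max_gap):
    """Plain (non-adaptive) horizontal run length smoothing of one row.

    Background runs (0) of at most max_gap pixels that lie between two
    foreground pixels (1) are filled, merging nearby strokes into blocks.
    """
    out = list(row)
    last_fg = None  # index of the most recent foreground pixel
    for i, px in enumerate(row):
        if px == 1:
            if last_fg is not None:
                gap = i - last_fg - 1
                if 0 < gap <= max_gap:
                    for j in range(last_fg + 1, i):
                        out[j] = 1
            last_fg = i
    return out


# Hypothetical row: a gap of 2 is filled, a gap of 4 is kept.
row = [0, 1, 0, 0, 1, 0, 0, 0, 0, 1, 0]
smoothed = rlsa_horizontal(row, max_gap=2)
```

Applying the same function to every row (and, transposed, to every column) yields connected regions that can serve as candidate text blocks.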

Block Descriptor Extraction
• This step creates the block descriptor using the BoVW model
• Codebook properties
• It must be small enough to ensure a low computational cost, yet large enough to provide sufficiently high discrimination performance
• For the clustering stage, the k-means algorithm is employed for its simplicity and speed
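
Codebook creation with k-means can be sketched as below. This is a bare-bones version for illustration only: the function names, the random initialisation, the fixed iteration cap, and the toy 2-D features are assumptions, and the codebook size `k` is the knob that trades computational cost against discrimination.

```python
import random


def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def mean(points):
    """Component-wise mean of a non-empty list of vectors."""
    n = len(points)
    return tuple(sum(coords) / n for coords in zip(*points))


def kmeans_codebook(features, k, iterations=20, seed=0):
    """Cluster the features and return the k cluster centres (visual words)."""
    rng = random.Random(seed)
    centres = rng.sample(features, k)  # initialise from k distinct features
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for f in features:
            idx = min(range(k), key=lambda i: squared_distance(f, centres[i]))
            clusters[idx].append(f)
        # Keep the old centre if a cluster ends up empty.
        new_centres = [mean(c) if c else centres[i] for i, c in enumerate(clusters)]
        if new_centres == centres:
            break  # assignments are stable
        centres = new_centres
    return centres


# Hypothetical toy features forming two well-separated groups.
features = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0),
            (10.0, 10.0), (10.0, 11.0), (11.0, 10.0)]
codebook = kmeans_codebook(features, k=2)
```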

Block Descriptor Extraction
An example text block (figures: initial SIFT keypoints, final SIFT keypoints)
• The SIFT keypoints are calculated on the greyscale version
• Keypoints whose position in the binary image does not coincide with a foreground pixel are rejected
• Each remaining local feature is assigned a Visual Word from the Codebook
• A Visual Word Descriptor is formed from the occurrences of each Visual Word of the Codebook in this particular block
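
The keypoint rejection step can be sketched as a simple mask lookup. The sketch assumes keypoints are given as (x, y) coordinates, the binary image is a list of rows with foreground pixels equal to 1, and coordinates are rounded to the nearest pixel; none of these conventions is stated in the slides.

```python
def filter_keypoints(keypoints, binary_image, foreground=1):
    """Keep only keypoints that fall on a foreground pixel of the binary image."""
    height = len(binary_image)
    width = len(binary_image[0]) if height else 0
    kept = []
    for x, y in keypoints:
        col, row = int(round(x)), int(round(y))
        inside = 0 <= row < height and 0 <= col < width
        if inside and binary_image[row][col] == foreground:
            kept.append((x, y))
    return kept


# Hypothetical 3x4 binary block and four candidate keypoints.
binary_image = [
    [0, 0, 0, 0],
    [0, 1, 1, 0],
    [0, 0, 1, 0],
]
keypoints = [(1.2, 1.0), (3.0, 0.0), (2.0, 2.1), (0.0, 2.0)]
kept = filter_keypoints(keypoints, binary_image)
```

Only the descriptors of the surviving keypoints would then be assigned to visual words.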

Decision System
• A classifier decides whether the block contains handwritten text, machine-printed text, or neither (noise)
• Based on Support Vector Machines (SVMs)
• Conventional approaches: one-against-one, one-against-others
• Two SVMs are trained with the Radial Basis Function (RBF) kernel
• The first (SVM1) separates handwritten text from everything else
• The second (SVM2) separates machine-printed text from everything else

Decision System Algorithm
[Figure: two panels, SVM1 (Handwritten Text) and SVM2 (Machine-printed Text), each showing a sample and a support vector, with the distances labelled D1 and D2]
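
The extracted slides name the two SVMs and the quantities D1 and D2 but do not spell out how the outputs are combined. The rule below is therefore a hypothetical illustration, not the authors' algorithm: it assumes D1 and D2 are signed decision values (positive means the SVM accepts the block), thresholds them at zero, and breaks a double acceptance by the larger value.

```python
def classify_block(d1, d2):
    """Combine two one-against-rest SVM outputs into a three-way label.

    d1: signed decision value of SVM1 (handwritten vs. everything else)
    d2: signed decision value of SVM2 (machine-printed vs. everything else)
    The thresholds and the tie-break are assumptions made for this sketch.
    """
    handwritten = d1 > 0
    machine = d2 > 0
    if handwritten and not machine:
        return "handwritten"
    if machine and not handwritten:
        return "machine-printed"
    if not handwritten and not machine:
        return "noise"
    # Both SVMs accept the block: prefer the more confident one.
    return "handwritten" if d1 >= d2 else "machine-printed"


# Hypothetical decision values for four blocks.
labels = [classify_block(d1, d2)
          for d1, d2 in [(1.3, -0.4), (-0.8, 0.9), (-0.5, -0.2), (0.3, 1.1)]]
```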

Examples
[Figure: original image and the output of the proposed method]

Evaluation Datasets
• 103 modified document images from the IAM Handwriting Database
• 100 representative images selected from the index cards of the UK Natural History Museum’s card archive (PRImA-NHM)
• The ground truth files adhere to the Page Analysis and Ground-truth Elements (PAGE) format framework
• http://datasets.primaresearch.org

Evaluation
The F-measure of each method.

Method / Dataset                                          IAM      PRImA-NHM
Upper Bound (Proposed Segmentation)                       0.9887   0.7985
Proposed Method (Proposed Segmentation and BoVW)          0.9886   0.7689
Gabor Filters (Proposed Segmentation and Gabor Filters)   0.7921   0.5702
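
For reference, the F-measure is the harmonic mean of precision and recall. The sketch below assumes the balanced form (F1) and uses made-up counts; the slides do not state which unit (pixels, blocks, regions) is counted, so the inputs here are purely illustrative and unrelated to the reported scores.

```python
def f_measure(true_positives, false_positives, false_negatives):
    """Balanced F-measure (F1): harmonic mean of precision and recall."""
    if true_positives == 0:
        return 0.0
    precision = true_positives / (true_positives + false_positives)
    recall = true_positives / (true_positives + false_negatives)
    return 2 * precision * recall / (precision + recall)


# Hypothetical counts: precision 0.9, recall 0.75.
score = f_measure(true_positives=90, false_positives=10, false_negatives=30)
```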

Page Segmentation Problems
• Binarization failures
• Noise overlapping with text
• Handwritten text overlapping with machine-printed text

Thank You!
