Holistic Recognition of Printed Arabic Script Ligatures
  Akram El-Korashy
  Supervised by: Dr. Faisal Shafait
Deutsches Forschungszentrum für Künstliche Intelligenz (DFKI)
Kaiserslautern, Germany
Outline
● Introduction
   ○ Segmentation-free OCR for Arabic scripts
● Approaches Used
   ○ Feature Extraction and the Shape Context method
   ○ Machine Learning Techniques (Hierarchical
      classification, Spectral Hashing, Random Forests)
● Improvements and Methodology
   ○ Shape Context weaknesses
   ○ New Features (dots, sizes, pixel-level matching)
   ○ Classification Methodology
● Experiments and Results
● Conclusion and Summary

Akram El-Korashy, Segmentation-free OCR, 14.08.12   1

Segmentation-free OCR for Arabic scripts
● Nastalique writing: Classify ligatures instead
  of individual characters.
● Over 20,000 valid ligatures in the Urdu
  language.
● Preprocessing becomes easier, but feature
  extraction and classification become harder.

Feature Extraction, Shape Context method
● Feature categories: distribution of points,
  transformation methods, structural analysis.

● Nabocr: Shape Context feature vector.

● Contour Extraction.

● Shape Context is a shape descriptor
  proposed by Belongie et al.

Feature Extraction, Shape Context method (continued)
● 4 histograms from 4 quadrants.

● Each histogram is
  a sum of point histograms.

● Each point histogram is computed over distance and orientation.

● Histogram: bins over ranges of these values.
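The bullets above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the Nabocr implementation: the function names, the bin counts (3 distance bins, 4 orientation bins), the normalisation by mean distance, and the choice of the bounding-box centre to split the quadrants are all hypothetical.

```python
import math

def point_histogram(points, idx, n_dist=3, n_angle=4):
    """Histogram of where the other points lie relative to points[idx]."""
    px, py = points[idx]
    offsets = [(x - px, y - py) for j, (x, y) in enumerate(points) if j != idx]
    dists = [math.hypot(dx, dy) for dx, dy in offsets]
    mean_dist = sum(dists) / len(dists)  # assumed normalisation
    hist = [0] * (n_dist * n_angle)
    for (dx, dy), d in zip(offsets, dists):
        if d == 0:
            d_bin = 0
        else:
            # Log-spaced distance bins relative to the mean distance.
            d_bin = min(n_dist - 1, max(0, math.floor(math.log2(d / mean_dist)) + 2))
        angle = math.atan2(dy, dx) % (2 * math.pi)
        a_bin = min(n_angle - 1, int(angle / (2 * math.pi) * n_angle))
        hist[d_bin * n_angle + a_bin] += 1
    return hist

def quadrant_descriptor(points, n_dist=3, n_angle=4):
    """Four summed histograms, one per quadrant around the bounding-box centre."""
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    cx = (min(xs) + max(xs)) / 2
    cy = (min(ys) + max(ys)) / 2
    quads = [[0] * (n_dist * n_angle) for _ in range(4)]
    for i, (x, y) in enumerate(points):
        q = (1 if x >= cx else 0) + (2 if y >= cy else 0)
        h = point_histogram(points, i, n_dist, n_angle)
        quads[q] = [a + b for a, b in zip(quads[q], h)]
    # Concatenate the four quadrant histograms into one feature vector.
    return [v for quad in quads for v in quad]

# Toy contour: the four corners of a square (needs at least two points).
contour = [(0, 0), (4, 0), (4, 4), (0, 4)]
features = quadrant_descriptor(contour)
```

With these settings the feature vector has 4 x 3 x 4 = 48 entries, and its entries sum to n(n-1) for n contour points, because every point counts each of the others exactly once.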

Hierarchical Classification
● Decomposing a classification problem into a
  set of smaller problems.
● Useful with large numbers of categories.

● Improves the efficiency of recognition.
● Can help improve accuracy.

● Independent set of features for each branch.

Spectral Hashing
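● Fast nearest-neighbour (NN) search technique
● Maps a feature vector to a binary code that is:
     ○ easily computed
     ○ made of a small number of bits
     ○ similarity-preserving (similar vectors get similar codes)


● Calculating the binary code:
     ○ directions of maximum variance (PCA)
     ○ sinusoidal eigenfunctions along each direction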
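A minimal sketch of the binary-code computation, assuming a deliberately simplified setting: only the single direction of maximum variance is used (found by power iteration), and bit k is the sign of the eigenfunction sin(pi/2 + k*pi/(hi-lo) * t) evaluated on the projection t. Full Spectral Hashing selects eigenfunctions across several PCA directions; the function names and the toy data here are hypothetical.

```python
import math

def principal_direction(rows, iters=50):
    """Mean and top PCA direction via power iteration on the covariance matrix."""
    n, dim = len(rows), len(rows[0])
    mean = [sum(r[j] for r in rows) / n for j in range(dim)]
    centred = [[r[j] - mean[j] for j in range(dim)] for r in rows]
    cov = [[sum(c[a] * c[b] for c in centred) / n for b in range(dim)]
           for a in range(dim)]
    v = [1.0] * dim
    for _ in range(iters):
        w = [sum(cov[a][b] * v[b] for b in range(dim)) for a in range(dim)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    return mean, v

def project(x, mean, direction):
    """Coordinate of x along the principal direction."""
    return sum((xi - m) * d for xi, m, d in zip(x, mean, direction))

def binary_code(p, lo, hi, n_bits):
    """Bit k is 1 where sin(pi/2 + k*pi/(hi-lo) * t) is positive, with t = p - lo."""
    t = p - lo
    return [1 if math.sin(math.pi / 2 + k * math.pi / (hi - lo) * t) > 0 else 0
            for k in range(1, n_bits + 1)]

# Toy training set lying along the x axis.
train = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (3.0, 0.0), (4.0, 0.0)]
mean, direction = principal_direction(train)
proj = [project(r, mean, direction) for r in train]
lo, hi = min(proj), max(proj)
codes = [binary_code(p, lo, hi, n_bits=2) for p in proj]
```

Nearby vectors receive the same code (a query at (0.2, 0.0) hashes like the first training point), while the two ends of the range differ in the first bit.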

Random Forests
● Ensemble Classifier




Ensemble learning combines the predictions of different classifiers (decision trees) by collecting
   independent votes from each tree and calculating the majority vote to give a prediction.
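The voting step described above can be sketched as follows. The three "trees" are hypothetical stand-ins (plain functions over a dictionary of made-up features), used only to show how independent votes are collected and combined.

```python
from collections import Counter

def majority_vote(votes):
    """Return the label that received the most votes."""
    return Counter(votes).most_common(1)[0][0]

def forest_predict(trees, sample):
    """Collect one vote per tree and return the majority label."""
    return majority_vote([tree(sample) for tree in trees])

# Hypothetical trees: each one looks at a different (made-up) feature.
trees = [
    lambda f: "1" if f["components"] == 1 else "3+",
    lambda f: "1" if f["width"] < 10 else "2",
    lambda f: "2" if f["width"] >= 10 else "1",
]
sample = {"components": 2, "width": 8}
prediction = forest_predict(trees, sample)
```

Here the votes are "3+", "1" and "1", so the forest predicts "1" even though one tree disagrees.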

Shape context weaknesses
● Scale invariance (size information is lost)

● Missing representation of dots

● Confusion between ligatures
  that differ only in their dots.



New Features
● Sizes of connected components

● Locations of connected components

     ○ above, below,
       or interleaving

     ○ Grid location
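A small sketch of how sizes and relative locations of connected components could be extracted from a binary ligature image. It assumes 4-connectivity, treats the largest component as the main body, and decides above/below/interleaving from row extents only; these choices and the function names are illustrative, not taken from the slides.

```python
from collections import deque

def connected_components(bitmap):
    """Group 4-connected ink pixels; returns one list of (row, col) per component."""
    h, w = len(bitmap), len(bitmap[0])
    seen = [[False] * w for _ in range(h)]
    comps = []
    for r in range(h):
        for c in range(w):
            if bitmap[r][c] and not seen[r][c]:
                seen[r][c] = True
                queue, pixels = deque([(r, c)]), []
                while queue:
                    y, x = queue.popleft()
                    pixels.append((y, x))
                    for ny, nx in ((y + 1, x), (y - 1, x), (y, x + 1), (y, x - 1)):
                        if 0 <= ny < h and 0 <= nx < w and bitmap[ny][nx] and not seen[ny][nx]:
                            seen[ny][nx] = True
                            queue.append((ny, nx))
                comps.append(pixels)
    return comps

def relative_position(part, body):
    """'above', 'below' or 'interleaving', comparing row ranges (row 0 is the top)."""
    part_rows = [y for y, _ in part]
    body_rows = [y for y, _ in body]
    if max(part_rows) < min(body_rows):
        return "above"
    if min(part_rows) > max(body_rows):
        return "below"
    return "interleaving"

bitmap = [
    [0, 1, 0, 0, 0],  # dot above the stroke
    [0, 0, 0, 0, 0],
    [1, 1, 1, 1, 1],  # main stroke
    [1, 0, 0, 0, 0],
    [0, 0, 0, 0, 0],
    [0, 0, 1, 0, 0],  # dot below the stroke
]
comps = connected_components(bitmap)
sizes = [len(c) for c in comps]
body = max(comps, key=len)
positions = [relative_position(c, body) for c in comps if c is not body]
```

For this toy image the component sizes are 1, 6 and 1, and the two small components are reported as "above" and "below" the main body.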


New Features (continued)
● Pixel-level properties:

     ○ weights of regions
     ○ fill ratio


● Length, Width, Aspect Ratio

     ○ Invariance to scanning resolution
     ○ Setting reference size
     ○ Histogram of widths and heights
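The size-related features above can be illustrated with a short sketch. Dividing by a reference height to obtain resolution-independent sizes is an assumption about what "setting reference size" means here, and the function name and dictionary keys are hypothetical.

```python
def shape_features(bitmap, reference_height):
    """Aspect ratio, fill ratio and sizes relative to a reference height."""
    rows = [r for r, line in enumerate(bitmap) if any(line)]
    cols = [c for c in range(len(bitmap[0])) if any(line[c] for line in bitmap)]
    height = rows[-1] - rows[0] + 1
    width = cols[-1] - cols[0] + 1
    ink = sum(sum(line) for line in bitmap)  # number of ink pixels
    return {
        "aspect_ratio": width / height,
        "fill_ratio": ink / (width * height),        # ink share of the bounding box
        "relative_height": height / reference_height,
        "relative_width": width / reference_height,
    }

bitmap = [
    [0, 0, 0, 0, 0, 0],
    [0, 1, 1, 1, 1, 0],
    [0, 1, 0, 0, 1, 0],
    [0, 0, 0, 0, 0, 0],
]
feats = shape_features(bitmap, reference_height=4)
```

Scanning the same glyph at twice the resolution doubles both its pixel height and the reference height, so the relative sizes and both ratios stay the same.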

Classification Methodology
● Experiment set "1"
     ○ Spectral Hashing, reduction of number of
       comparisons


● Experiment set "2"
     ○ Random Forests, hierarchy by recognizing the number of
       characters


● Experiment "3"
     ○ Random Forests, classification of alphabet symbols

Classification Methodology (continued)
● Spectral Hashing (sunvid project):

     ○ Training Dataset (~80,000 samples)

     ○ Test Dataset (~20,000 samples)

     ○ Different combinations of number of bits, number of
       tables, tolerance bits (training different hash
       structures in parallel)
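The interplay of bits, tables and tolerance bits can be sketched as a candidate lookup. This is not the sunvid code: it assumes that "tolerance bits" means the maximum Hamming distance at which a bucket is still probed, and that candidates from all tables are merged; every name below is hypothetical.

```python
from itertools import combinations

def neighbours(code, tolerance):
    """All bit tuples within Hamming distance <= tolerance of code."""
    out = [tuple(code)]
    for r in range(1, tolerance + 1):
        for flips in combinations(range(len(code)), r):
            out.append(tuple(b ^ 1 if i in flips else b for i, b in enumerate(code)))
    return out

def build_table(codes):
    """Map each binary code to the indices of the training samples that have it."""
    table = {}
    for idx, code in enumerate(codes):
        table.setdefault(tuple(code), []).append(idx)
    return table

def candidates(query_codes, tables, tolerance):
    """Union over all tables of the training indices found in the probed buckets."""
    found = set()
    for code, table in zip(query_codes, tables):
        for near in neighbours(code, tolerance):
            found.update(table.get(near, []))
    return found

# One table holding the 3-bit codes of four training samples.
table = build_table([(0, 0, 0), (0, 0, 1), (1, 1, 1), (1, 1, 0)])
exact = candidates([(0, 0, 0)], [table], tolerance=0)
loose = candidates([(0, 0, 0)], [table], tolerance=1)
```

Only the returned candidates need to be compared with the query by 1-NN, which is where the reduction in comparisons comes from. Under this reading, a 7-bit code with 2 tolerance bits probes 1 + 7 + 21 = 29 buckets per table.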

Classification Methodology (continued)
● Random Forests (python milk):

     ○ Number of decision trees: 101
     ○ 70% of the attributes
     ○ 70% of the training samples

     ○ Reduced training dataset (~20,000 samples)
     ○ Test dataset of ~18,000 samples
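The sampling settings above can be sketched without any machine-learning library. The slide does not say whether samples are drawn with or without replacement, or whether attributes are chosen per tree or per split; the sketch assumes sampling without replacement, once per tree, and it does not call milk itself. The function name and parameters are hypothetical.

```python
import random

def draw_tree_inputs(n_samples, n_attributes, n_trees=101, frac=0.7, seed=0):
    """For each tree, pick the sample rows and attribute columns it is trained on."""
    rng = random.Random(seed)
    plans = []
    for _ in range(n_trees):
        rows = rng.sample(range(n_samples), round(frac * n_samples))
        cols = sorted(rng.sample(range(n_attributes), round(frac * n_attributes)))
        plans.append((rows, cols))
    return plans

# Small stand-in sizes; the deck's reduced training set has ~20,000 samples.
plans = draw_tree_inputs(n_samples=20, n_attributes=10)
```

Each of the 101 trees then sees a different 70% slice of the data, which is what keeps the votes of the ensemble from being identical.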



Classification Methodology (continued)
● [Diagram: input → Random Forest classifier → 1-character classifier | 2-character classifier | 3+ character classifier]

     ○ New feature vector

     ○ Classifying based on number of characters



     ○ Classifying the Alphabet Symbols
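The two-level scheme on this slide can be sketched as a simple dispatch: a first classifier predicts the number of characters, and its answer selects the branch classifier that produces the final label. The classifiers below are hypothetical placeholders (plain functions over a made-up "width" feature), not trained models.

```python
def classify_ligature(features, count_classifier, branch_classifiers):
    """Route the sample by predicted character count, then classify within the branch."""
    branch = count_classifier(features)            # "1", "2" or "3+"
    label = branch_classifiers[branch](features)   # branch-specific classifier
    return branch, label

# Placeholder classifiers, for illustration only.
def count_classifier(f):
    if f["width"] < 10:
        return "1"
    return "2" if f["width"] < 20 else "3+"

branch_classifiers = {
    "1": lambda f: "alphabet-symbol",
    "2": lambda f: "two-character-ligature",
    "3+": lambda f: "long-ligature",
}
result = classify_ligature({"width": 7}, count_classifier, branch_classifiers)
```

Each branch can use its own classifier and its own features, for example a 1-NN classifier for the single-character branch and a different model for longer ligatures.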



Experiments and Results
● Spectral Hashing Results "1"
     ○ Effect of changing the number of tables
     ○ 7-bit binary code, 2 tolerance bits




Experiments and Results (continued)
● Spectral Hashing Results "1"

    Accuracy    Best Reduction    Hash (bits, tables, tolerance)
    81.5%       37538 (47.2%)     7, 9, 1
    81%         31553 (39.7%)     7, 7, 1
    80.5%       23975 (30.1%)     8, 9, 1
    79.5%       20736 (26.1%)     7, 4, 1
    78%         18737 (23.6%)     8, 7, 1
    76%         15392 (19.4%)     7, 3, 1



Experiments and Results (continued)
● Spectral Hashing Results "1"
● Significant reduction rates
     ○ Comparisons reduced to 19% of the training set for a
       loss of about 6% in accuracy.
     ○ Comparisons reduced to 24% for a loss of about 4% in
       accuracy.

     ○ Comparisons reduced to 47.2% with no accuracy loss.
     ○ Observation: accuracy is slightly higher than 1-NN when
       comparisons are reduced to 57.6%.
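The trade-off can be worked through from the figures in the Spectral Hashing results table. The sketch below copies the accuracy and reduction percentages from that table and treats the 81.5% row as the no-loss baseline, as the third bullet suggests; the variable names are illustrative.

```python
# (accuracy %, share of the training set still compared %), from the results table.
rows = [
    (81.5, 47.2), (81.0, 39.7), (80.5, 30.1),
    (79.5, 26.1), (78.0, 23.6), (76.0, 19.4),
]
best_accuracy = rows[0][0]
# For each row: share compared, speed-up factor, accuracy drop in percentage points.
summary = [
    (share, round(100 / share, 2), round(best_accuracy - acc, 1))
    for acc, share in rows
]
```

Comparing 19.4% of the training set means roughly 5 times fewer distance computations for a drop of 5.5 percentage points, and 23.6% means roughly 4 times fewer for a drop of 3.5 points; the slide rounds these drops to 6% and 4%.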

Experiments and Results (continued)
● Random Forest Results "2"

●    Accuracy of 78.7% for 1, 2, 3, 4+ labels
●    Accuracy of 45.4% for 1, 2, 3, 4, 5+ labels
●    Accuracy of 20.7% for 1, 2, 3, 4, 5, 6+ labels
●    Accuracy drops even further with finer partitioning




Experiments and Results (continued)
● Random Forest Results "2"
● Confusion matrix for 1, 2, 3+: alphabet
  symbols can be separately classified.
    Test label \ Result    1       2      3+       Recall
    1                      1131    88     14       91.9%
    2                      16      94     531      17.2%
    3+                     7       2      16627    99.9%
    % true positives       98%     51%    96.8%
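The percentages in the confusion matrix can be recomputed from its counts. The sketch below takes rows as test labels and columns as predicted labels, as in the table. The per-column figures (98%, 51%, 96.8%) and the 3+ recall are reproduced; for rows 1 and 2 the counts as transcribed give recalls of about 91.7% and 14.7% rather than the 91.9% and 17.2% shown, which may reflect rounding or a transcription difference in the source.

```python
labels = ["1", "2", "3+"]
confusion = [  # rows: test label, columns: predicted label
    [1131, 88, 14],
    [16, 94, 531],
    [7, 2, 16627],
]
# Recall: correct predictions divided by the row total (all samples of that class).
recall = {lab: confusion[i][i] / sum(confusion[i]) for i, lab in enumerate(labels)}
# Precision: correct predictions divided by the column total (all predictions of that class).
precision = {
    lab: confusion[i][i] / sum(row[i] for row in confusion)
    for i, lab in enumerate(labels)
}
total = sum(sum(row) for row in confusion)
accuracy = sum(confusion[i][i] for i in range(len(labels))) / total
```

The counts add up to 18,510 test samples, in line with the test set of ~18,000 samples mentioned for the Random Forests experiments.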
Experiments and Results (continued)
● Alphabet symbols

     ○ Accuracy of 80.34% for Random Forests "3"

     ○ Accuracy of 98.74% for 1-NN classifier

     ○ A 1-NN classifier can therefore be used for recognition
       within class 1.

     ○ Over 30% of ligatures are individual characters.

Conclusion and Summary
● The feature vector can be improved.

● Spectral Hashing improves the efficiency of 1-NN:
  significant reduction in the number of comparisons

● Random Forests: can be used to separate
  the 1-character alphabet symbols.

● Useful for overall performance improvement
  on real text data.

Questions?
Future Work

Thank You

Search space reduction for holistic ligature recognition in Urdu Nastaliq script (Bachelor thesis presentation)

  • 1. Holistic Recognition of Printed Arabic Script Ligatures
       Akram El-Korashy
       Supervised by: Dr. Faisal Shafait
       Deutsches Forschungszentrum für Künstliche Intelligenz (DFKI), Kaiserslautern, Germany
  • 2. Outline
       ● Introduction
          ○ Segmentation-free OCR for Arabic scripts
       ● Approaches Used
          ○ Feature Extraction, and the Shape Context method
          ○ Machine Learning Techniques (Hierarchical classification, Spectral Hashing, Random Forests)
       ● Improvements and Methodology
          ○ Shape Context weaknesses
          ○ New Features (dots, sizes, pixel-level matching)
          ○ Classification Methodology
       ● Experiments and Results
       ● Conclusion and Summary
       Akram El-Korashy, Segmentation-free OCR, 14.08.12
  • 4. Segmentation-free OCR for Arabic scripts
       ● Nastalique writing: classify ligatures instead of individual characters.
       ● Over 20,000 valid ligatures in the Urdu language.
       ● Preprocessing becomes easier, but feature extraction and classification become harder.
  • 6. Feature Extraction, Shape Context method
       ● Distribution of points, transformation methods, structural analysis.
       ● Nabocr: Shape Context feature vector.
       ● Contour extraction.
       ● Shape Context is a shape descriptor proposed by Belongie et al.
  • 7. Feature Extraction, Shape Context method
       ● 4 histograms from 4 quadrants.
       ● Each histogram is a sum of point histograms.
       ● Distance, orientation.
       ● Histogram: bins of ranges.
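The histogram construction on this slide can be sketched as follows. This is a minimal illustration, not the Nabocr implementation: the bin counts (5 log-distance bins, 12 orientation bins), the log-distance scaling, and the choice to split the quadrants around the centroid of the contour points are all assumptions, and the function names are hypothetical.

```python
import math

def point_histogram(ref, points, n_r=5, n_theta=12):
    """Log-polar histogram of where the other points lie relative to `ref`."""
    hist = [0] * (n_r * n_theta)
    others = [p for p in points if p != ref]
    if not others:
        return hist
    dists = [math.hypot(p[0] - ref[0], p[1] - ref[1]) for p in others]
    r_max = max(dists)
    for p, d in zip(others, dists):
        # Distance bin on a log scale (assumed), clamped to the last bin.
        r_bin = min(n_r - 1, int(n_r * math.log1p(d) / math.log1p(r_max)))
        # Orientation bin from the angle of the offset vector.
        theta = math.atan2(p[1] - ref[1], p[0] - ref[0]) % (2 * math.pi)
        t_bin = min(n_theta - 1, int(n_theta * theta / (2 * math.pi)))
        hist[r_bin * n_theta + t_bin] += 1
    return hist

def quadrant_features(points, n_r=5, n_theta=12):
    """Sum the point histograms per quadrant and concatenate the 4 sums."""
    cx = sum(x for x, _ in points) / len(points)
    cy = sum(y for _, y in points) / len(points)
    quads = [[0] * (n_r * n_theta) for _ in range(4)]
    for p in points:
        # Quadrant index relative to the centroid (assumed split point).
        q = (1 if p[0] >= cx else 0) + (2 if p[1] >= cy else 0)
        h = point_histogram(p, points, n_r, n_theta)
        quads[q] = [a + b for a, b in zip(quads[q], h)]
    return [v for quad in quads for v in quad]

# Toy contour: the four corners of a square.
square = [(0, 0), (4, 0), (4, 4), (0, 4)]
features = quadrant_features(square)
```

With these assumed bin counts the feature vector has 4 × 5 × 12 = 240 entries.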
  • 9. Hierarchical Classification
       ● Decomposing a classification problem into a set of smaller problems.
       ● Useful with large numbers of categories.
       ● Efficiency of recognition.
       ● Can help improve accuracy.
       ● Independent set of features for each branch.
  • 11. Spectral Hashing
       ● Fast nearest-neighbor (NN) technique.
       ● Feature vector into a binary code:
          ○ easily computed
          ○ small number of bits
          ○ similarity mapping
       ● Calculating the binary code:
          ○ maximum variance directions (PCA)
          ○ sinusoidal eigenfunctions
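A rough sketch of the binary-code step is given below. It assumes the feature vectors have already been projected onto their principal directions (the PCA step is not shown), and it follows the general spectral hashing idea of thresholding the lowest-frequency sinusoids along each direction. The function names are hypothetical, and this is not the sunvid code.

```python
import math

def train_modes(projected, n_bits):
    """Pick the n_bits (dimension, mode) pairs with the lowest frequencies."""
    dims = len(projected[0])
    lo = [min(v[d] for v in projected) for d in range(dims)]
    hi = [max(v[d] for v in projected) for d in range(dims)]
    candidates = []
    for d in range(dims):
        span = hi[d] - lo[d]
        if span == 0:
            continue  # a constant direction carries no information
        for k in range(1, n_bits + 1):
            # The frequency grows with the mode k and shrinks with the range.
            candidates.append(((k / span) ** 2, d, k))
    candidates.sort()
    return lo, hi, [(d, k) for _, d, k in candidates[:n_bits]]

def encode(vec, lo, hi, modes):
    """Binary code: one bit per selected sinusoid, thresholded at zero."""
    bits = 0
    for d, k in modes:
        omega = k * math.pi / (hi[d] - lo[d])
        y = math.sin(math.pi / 2 + omega * (vec[d] - lo[d]))
        bits = (bits << 1) | (1 if y > 0 else 0)
    return bits

def hamming(a, b):
    """Number of differing bits between two codes."""
    return bin(a ^ b).count("1")

# Toy data, assumed to be PCA-projected already.
train = [(0.0, 0.0), (2.0, 0.2), (8.0, 0.8), (10.0, 1.0)]
lo, hi, modes = train_modes(train, n_bits=2)
codes = [encode(v, lo, hi, modes) for v in train]
```

In this toy set, nearby points receive the same code, while the two extremes differ in one bit.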
  • 13. Random Forests
       ● Ensemble classifier: ensemble learning combines the predictions of different classifiers (decision trees) by collecting independent votes from each tree and calculating the majority vote to give a prediction.
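The majority vote can be illustrated with a few lines of Python. The three "trees" here are plain functions standing in for trained decision trees, and the feature names (`width`, `dots`) and labels are made up for the example.

```python
from collections import Counter

def forest_predict(trees, sample):
    """Return the label that receives the most votes from the trees."""
    votes = Counter(tree(sample) for tree in trees)
    label, _ = votes.most_common(1)[0]
    return label

# Hypothetical stand-ins for trained decision trees.
trees = [
    lambda s: "1-char" if s["width"] < 10 else "3+-char",
    lambda s: "1-char" if s["dots"] <= 1 else "2-char",
    lambda s: "2-char" if s["width"] > 30 else "1-char",
]

prediction = forest_predict(trees, {"width": 8, "dots": 2})
```

Two of the three trees vote "1-char" for this sample, so that label wins.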
  • 15. Shape Context weaknesses
       ● Scale invariance.
       ● Missing representation of dots.
       ● Confusion between ligatures that vary only in dots.
  • 17. New Features
       ● Sizes of connected components
       ● Locations of connected components
          ○ above, below, or interleaving
          ○ Grid location
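One way to obtain component sizes and relative locations is sketched below. The rules are assumptions made for illustration: the largest component is treated as the main body, 8-connectivity is used, and a smaller component counts as "above" or "below" only if it lies entirely outside the body's row range. The function names are hypothetical.

```python
from collections import deque

def connected_components(img):
    """List of components, each a list of (row, col) foreground pixels."""
    rows, cols = len(img), len(img[0])
    seen = [[False] * cols for _ in range(rows)]
    comps = []
    for r in range(rows):
        for c in range(cols):
            if img[r][c] and not seen[r][c]:
                seen[r][c] = True
                queue, comp = deque([(r, c)]), []
                while queue:
                    y, x = queue.popleft()
                    comp.append((y, x))
                    # Visit the 8 neighbours of the current pixel.
                    for dy in (-1, 0, 1):
                        for dx in (-1, 0, 1):
                            ny, nx = y + dy, x + dx
                            if (0 <= ny < rows and 0 <= nx < cols
                                    and img[ny][nx] and not seen[ny][nx]):
                                seen[ny][nx] = True
                                queue.append((ny, nx))
                comps.append(comp)
    return comps

def describe_components(img):
    """Size of the main body plus (size, location) of every other component."""
    comps = sorted(connected_components(img), key=len, reverse=True)
    if not comps:
        return 0, []
    body = comps[0]
    top = min(r for r, _ in body)
    bottom = max(r for r, _ in body)
    others = []
    for comp in comps[1:]:
        c_top = min(r for r, _ in comp)
        c_bottom = max(r for r, _ in comp)
        if c_bottom < top:
            where = "above"
        elif c_top > bottom:
            where = "below"
        else:
            where = "interleaving"
        others.append((len(comp), where))
    return len(body), others

img = [
    [0, 1, 0, 0, 0],
    [0, 0, 0, 0, 0],
    [1, 1, 1, 1, 0],
    [1, 0, 0, 1, 0],
    [0, 0, 0, 0, 0],
    [0, 0, 1, 1, 0],
]
body_size, dots = describe_components(img)
```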
  • 18. New Features
       ● Pixel-level properties:
          ○ weights of regions
          ○ fill ratio
       ● Length, width, aspect ratio
          ○ Invariance to scanning resolution
          ○ Setting reference size
          ○ Histogram of widths and heights
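The size-related features can be sketched as below for a binary image given as a list of rows. The reference height of 32 pixels and the function name `box_features` are assumptions used only to show how scaling to a reference size removes the dependence on scanning resolution.

```python
def box_features(img, reference_height=32):
    """Bounding-box features of the foreground pixels of a binary image."""
    pixels = [(r, c) for r, row in enumerate(img)
              for c, v in enumerate(row) if v]
    if not pixels:
        return None
    top = min(r for r, _ in pixels)
    bottom = max(r for r, _ in pixels)
    left = min(c for _, c in pixels)
    right = max(c for _, c in pixels)
    height = bottom - top + 1
    width = right - left + 1
    # Scaling to a reference height makes the width comparable across
    # scans made at different resolutions (reference size is assumed).
    scale = reference_height / height
    return {
        "aspect_ratio": width / height,
        "fill_ratio": len(pixels) / (width * height),
        "scaled_width": width * scale,
    }

img = [
    [0, 0, 0, 0],
    [0, 1, 1, 1],
    [0, 1, 0, 1],
]
feats = box_features(img)
```

Aspect ratio and fill ratio are already scale-free; only absolute lengths need the reference size.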
  • 20. Classification Methodology
       ● Experiment set "1"
          ○ Spectral Hashing, reduction of the number of comparisons
       ● Experiment set "2"
          ○ Random Forests, hierarchy by recognizing the number of characters
       ● Experiment "3"
          ○ Random Forests, classification of alphabet symbols
  • 21. Classification Methodology
       ● Spectral Hashing (sunvid project):
          ○ Training dataset (~80,000 samples)
          ○ Test dataset (~20,000 samples)
          ○ Different combinations of number of bits, number of tables, tolerance bits (training different hash structures in parallel)
  • 22. Classification Methodology
       ● Random Forests (python milk):
          ○ Number of decision trees: 101
          ○ 70% of the attributes
          ○ 70% of the training samples
          ○ Reduced training dataset (~20,000 samples)
          ○ Test dataset of ~18,000 samples
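The per-tree subsets implied by these settings could be drawn as in the sketch below. It does not use the milk API. It assumes that each tree gets its own 70% subset of samples and of attributes, drawn without replacement; the slide does not say whether that is how the library applies the two fractions. The function name is hypothetical.

```python
import random

def draw_tree_subsets(n_samples, n_attributes, n_trees=101,
                      fraction=0.7, seed=0):
    """For each tree, pick which sample rows and attribute columns it sees."""
    rng = random.Random(seed)
    k_samples = max(1, round(fraction * n_samples))
    k_attrs = max(1, round(fraction * n_attributes))
    subsets = []
    for _ in range(n_trees):
        # Sampling without replacement is an assumption of this sketch.
        rows = rng.sample(range(n_samples), k_samples)
        attrs = sorted(rng.sample(range(n_attributes), k_attrs))
        subsets.append((rows, attrs))
    return subsets

# Toy sizes: 20 training samples with 10 attributes each.
subsets = draw_tree_subsets(n_samples=20, n_attributes=10)
```

An odd number of trees such as 101 avoids ties in a two-class vote, although ties remain possible with more classes.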
  • 23. Classification Methodology
       ● Input
          ○ New feature vector → Random Forest classifier
          ○ Classifying based on number of characters: 1-character classifier, 2-character classifier, 3+ character classifier
          ○ Classifying the alphabet symbols
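The two-level flow on this slide can be sketched as a simple dispatch. All classifiers below are hypothetical stubs; in the experiments the first level is a Random Forest that predicts the number of characters, and each branch then has its own classifier.

```python
def classify_ligature(features, count_classifier, branch_classifiers):
    """Route a feature vector to the classifier for its character count."""
    branch = count_classifier(features)  # expected: "1", "2" or "3+"
    label = branch_classifiers[branch](features)
    return branch, label

# Stub first-level classifier: guesses the character count from the width.
def count_stub(features):
    if features["width"] < 12:
        return "1"
    return "2" if features["width"] < 25 else "3+"

# Stub branch classifiers, one per character-count class.
branches = {
    "1": lambda f: "alphabet-symbol",
    "2": lambda f: "two-character-ligature",
    "3+": lambda f: "long-ligature",
}

result = classify_ligature({"width": 9}, count_stub, branches)
```

Each branch only has to separate the classes within its own group, which is the reduction of the search space the hierarchy is meant to provide.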
  • 25. Experiments and Results
       ● Spectral Hashing Results "1"
          ○ Effect of changing the number of tables
          ○ 7-bit binary code, 2 tolerance bits
  • 26. Experiments and Results
       ● Spectral Hashing Results "1"

         Accuracy   Best reduction    Hash (bits, tables, tolerance)
         81.5%      37538 (47.2%)     7, 9, 1
         81%        31553 (39.7%)     7, 7, 1
         80.5%      23975 (30.1%)     8, 9, 1
         79.5%      20736 (26.1%)     7, 4, 1
         78%        18737 (23.6%)     8, 7, 1
         76%        15392 (19.4%)     7, 3, 1
  • 27. Experiments and Results
       ● Spectral Hashing Results "1"
       ● Significant reduction rates
          ○ Reduction down to 19% for a difference of 6% in accuracy.
          ○ Reduction down to 24% for a difference of 4% in accuracy.
          ○ Reduction down to 47.2% for no accuracy loss.
          ○ Observation: accuracy slightly higher than 1-NN for reduction down to 57.6%.
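The "reduction" figures refer to how many training samples a query still has to be compared against. The sketch below shows the idea with precomputed binary codes. It assumes that a sample becomes a candidate if any table places it within the tolerance in Hamming distance; this may differ from how the sunvid hash structures combine tables. A real implementation would look codes up in hash buckets rather than scan them linearly.

```python
def hamming(a, b):
    """Number of differing bits between two integer codes."""
    return bin(a ^ b).count("1")

def candidates(query_codes, tables, tolerance):
    """Indices of samples within `tolerance` bits in at least one table."""
    keep = set()
    for q, table in zip(query_codes, tables):
        for i, code in enumerate(table):
            if hamming(q, code) <= tolerance:
                keep.add(i)
    return keep

def nearest(query_vec, query_codes, train_vecs, tables, tolerance):
    """1-NN restricted to the candidates; also returns the fraction compared."""
    idx = candidates(query_codes, tables, tolerance)

    def sqdist(i):
        return sum((a - b) ** 2 for a, b in zip(query_vec, train_vecs[i]))

    best = min(idx, key=sqdist, default=None)
    return best, len(idx) / len(train_vecs)

# Toy data: 4 training vectors, a single table of 3-bit codes (made up).
train_vecs = [(0, 0), (1, 0), (9, 9), (10, 10)]
tables = [[0b000, 0b001, 0b110, 0b111]]
best, fraction = nearest((0.9, 0.1), [0b000], train_vecs, tables, tolerance=1)
```

Here only half of the training set is compared, which corresponds to a reduction down to 50% in the terminology of the slide.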
  • 28. Experiments and Results
       ● Random Forest Results "2"
       ● Accuracy of 78.7% for 1, 2, 3, 4+ labels
       ● Accuracy of 45.4% for 1, 2, 3, 4, 5+ labels
       ● Accuracy of 20.7% for 1, 2, 3, 4, 5, 6+ labels
       ● Even worse with more partitioning
  • 29. Experiments and Results
       ● Random Forest Results "2"
       ● Confusion matrix for 1, 2, 3+: alphabet symbols can be separately classified.

         test label / result       1       2      3+    Recall
         1                      1131      88      14    91.9%
         2                        16      94     531    17.2%
         3+                        7       2   16627    99.9%
         % true positives        98%     51%   96.8%
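The row and column percentages of such a matrix can be recomputed from the counts, as in the sketch below (rows are test labels, columns are results). With the counts as transcribed here, the column values come out at about 98.0%, 51.1% and 96.8%, matching the slide, while the row values come out at about 91.7%, 14.7% and 99.9%. The first two differ from the printed 91.9% and 17.2%, so some counts or percentages may have been transcribed or rounded differently.

```python
labels = ["1", "2", "3+"]
matrix = [
    [1131, 88, 14],   # test label 1
    [16, 94, 531],    # test label 2
    [7, 2, 16627],    # test label 3+
]

def recall_per_row(m):
    """Share of each test label that was assigned the correct result."""
    return [m[i][i] / sum(m[i]) for i in range(len(m))]

def precision_per_column(m):
    """Share of each predicted result that was actually correct."""
    return [m[j][j] / sum(row[j] for row in m) for j in range(len(m))]

def accuracy(m):
    """Share of all test samples on the diagonal."""
    correct = sum(m[i][i] for i in range(len(m)))
    return correct / sum(sum(row) for row in m)

recall = recall_per_row(matrix)
precision = precision_per_column(matrix)
overall = accuracy(matrix)
```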
  • 30. Experiments and Results
       ● Alphabet symbols
          ○ Accuracy of 80.34% for Random Forests ("3")
          ○ Accuracy of 98.74% for the 1-NN classifier
          ○ The 1-NN classifier can be used for recognition under class 1.
          ○ Over 30% of ligatures are individual characters.
  • 32. Conclusion and Summary
       ● The feature vector can be improved.
       ● Spectral Hashing improves the efficiency of 1-NN: significant reduction in the number of comparisons.
       ● Random Forests can be used to separate the 1-character alphabet symbols.
       ● Useful for overall performance improvement on real text data.