AOCR HMM Presentation

  1. AOCR: Arabic Optical Character Recognition
     Abdel Rahman Ghareeb Kasem, Adel Salah Abu Sereea, Mahmoud Abdel Moneim Abdel Moneim, Mahmoud Mohammed Abdel Wahab
  2. Main contents
     - Introduction to AOCR
     - Feature extraction
     - Preprocessing
     - AOCR system implementation
     - Experimental results
     - Conclusion & future directions
     - Applications
  3. Main contents (same agenda as slide 2)
  4. Introduction
     - Why AOCR?
     - What is OCR?
     - What is the problem in AOCR?
     - What is the solution?
       - Pre-segmentation
       - Auto-segmentation
  5. Main contents (same agenda as slide 2)
  6. Preprocessing
     - Image rotation
     - Segmentation
       - Line segmentation
       - Word segmentation
     - Image enhancement
  7. Preprocessing, 1. Image rotation: the problem of a tilted image
  8. Preprocessing, 1. Processing the rotated image
  9. Processing the rotated image: rotate by -1 degree
  10. Processing the rotated image: rotate by -2 degrees
  11. Processing the rotated image: rotate by -3 degrees
  12. Processing the rotated image: rotate by -4 degrees
  13. Processing the rotated image: rotate by -4 degrees
  14. Processing the rotated image: threshold effect (rows are cleared to zero against a threshold of 0.2 * the mean projection value)
  15. Processing the rotated image: grayscale vs. black/white in the rotation process (original image, grayscale result, black/white result)
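
The rotation slides only show the effect of trying successive angles; a minimal sketch of that angle search is given below, assuming a binary page array (ink = 1) and using the 0.2 * mean projection threshold from slide 14 to decide how "clear" the background rows are. The function names and the angle range are illustrative, not taken from the original system.

    import numpy as np
    from scipy import ndimage

    def clear_row_count(binary_page, factor=0.2):
        """Count rows whose ink projection falls below factor * mean projection."""
        profile = binary_page.sum(axis=1).astype(float)
        return int((profile < factor * profile.mean()).sum())

    def estimate_skew(binary_page, angles=np.arange(-5.0, 5.25, 0.25)):
        """Return the rotation angle (degrees) that maximizes the number of clear rows."""
        best_angle, best_score = 0.0, -1
        for angle in angles:
            rotated = ndimage.rotate(binary_page.astype(np.uint8), angle, reshape=False, order=0)
            score = clear_row_count(rotated > 0)
            if score > best_score:
                best_angle, best_score = angle, score
        return best_angle

    # Usage (assumed): deskewed = ndimage.rotate(page, estimate_skew(page), reshape=False)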
  16. Preprocessing
      - Process rotated image
      - Segmentation
        - Line segmentation
        - Word segmentation
      - Image enhancement
  17. Preprocessing, 2. Segmentation
      - What is the segmentation process?
      - Why do we need segmentation in Arabic OCR?
      - What algorithm is used for segmentation?
  18. Preprocessing, 2. Segmentation: line-level segmentation
  19. Preprocessing, 2. Segmentation: line-level segmentation (continued)
  20. Preprocessing, 2. Segmentation: word-level segmentation
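
The segmentation slides are figures only; a common projection-profile implementation of line segmentation, consistent with the clear-row idea above, is sketched here as an assumption about the method (word segmentation would repeat the same idea on the vertical projection of each line).

    import numpy as np

    def segment_lines(binary_page, min_gap=2):
        """Split a binary page (ink = 1) into line images using the horizontal projection.

        Rows with a zero projection are background; runs of ink rows become lines, and
        gaps narrower than min_gap stay inside a line (dots above/below letters).
        """
        ink_rows = binary_page.sum(axis=1) > 0
        lines, start, gap = [], None, 0
        for y, has_ink in enumerate(ink_rows):
            if has_ink:
                start = y if start is None else start
                gap = 0
            elif start is not None:
                gap += 1
                if gap > min_gap:                      # the gap is wide enough: close the line
                    lines.append(binary_page[start:y - gap + 1])
                    start, gap = None, 0
        if start is not None:                          # the last line runs to the page bottom
            lines.append(binary_page[start:])
        return lines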
  21. Preprocessing, 2. Segmentation
  22. Preprocessing
      - Process rotated image
      - Segmentation
        - Line segmentation
        - Word segmentation
      - Image enhancement
  23. Preprocessing, 3. Image enhancement
  24. Preprocessing, 3. Image enhancement: noise reduction by morphological operations
  25. A very important note: apply the image-enhancement operations to the small segmented images, not to the large page image. Sample text shown both ways: بسم الله الرحمن الرحيم الله أكبر الله أكبر الله أكبر لا إله الا الله والله أكبر (once as one large image, marked X, and once split into small images).
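
Slide 24 names morphological operations without detail; a typical choice, given here as an assumption, is a small binary opening (removes isolated specks) followed by a closing (fills pinholes in strokes), applied to each small segmented image as slide 25 recommends.

    import numpy as np
    from scipy import ndimage

    def denoise_word(binary_word):
        """Clean a small binary word image (ink = True) with morphological opening + closing."""
        selem = np.ones((2, 2), dtype=bool)            # small element keeps thin Arabic strokes
        cleaned = ndimage.binary_opening(binary_word, structure=selem)
        return ndimage.binary_closing(cleaned, structure=selem)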
  26. Main contents (same agenda as slide 2)
  27. Feature extraction (sample word image: الله اكبر)
  28. Feature selection: we select features such that they are
      - suitable for the HMM technique (i.e. window-scanning based features),
      - suitable for word-level recognition (not character-level),
      - able to retain as much information as possible,
      - able to achieve high accuracy with small processing time.
  29. Satisfaction of the previous points: each feature is designed to work with the slice technique, producing a feature vector (n1, n2, ..., n7) per slice of the word image (example word: محمد رسول الله).
  30. The features deal with whole words, not single characters, since the algorithm is based on a segmentation-free concept. We avoid structural features because they are hard to implement and require large processing time.
  31. To achieve high accuracy with the lowest processing time, we use simple features and apply overlap between slices to smooth the extracted data (example word: الصلاة, with overlapping slices). A sketch of the slicing step follows.
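
To make the slice technique concrete, the sketch below scans a word or line image column-wise with a window wider than the step, so consecutive slices overlap; the two-pixel window and one-pixel step match the examples on the following slides, and each feature below is a function applied to one slice.

    import numpy as np

    def iter_slices(binary_image, window=2, step=1):
        """Yield vertical slices of width `window`, advancing by `step` columns (overlap = window - step)."""
        width = binary_image.shape[1]
        for x in range(0, width - window + 1, step):
            yield binary_image[:, x:x + window]

    def extract_sequence(binary_image, slice_feature, window=2, step=1):
        """Build the HMM observation sequence: one feature vector per slice."""
        return np.array([slice_feature(s) for s in iter_slices(binary_image, window, step)])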
  32. (1) Background count: calculate the vertical distances (in pixels) of the background regions, where each background region is bounded by two foreground regions (example word: النجاح).
  33. Example: the feature vector of the selected slice is (d1, d2, d3), the background distances, using a two-pixel window with a one-pixel overlap.
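
A sketch of the background-count feature for one binary slice (ink = 1). Collapsing the slice to a single column and padding the output to a fixed number of regions are assumptions added so that every observation vector has the same length.

    import numpy as np

    def background_count(slice_img, max_regions=3):
        """Heights of background runs enclosed between two foreground runs, padded to max_regions."""
        column = slice_img.max(axis=1)          # a row is foreground if any pixel in it is ink
        distances, run, seen_ink = [], 0, False
        for value in column:
            if value:                           # foreground row
                if seen_ink and run > 0:        # a background run just ended between two ink runs
                    distances.append(run)
                seen_ink, run = True, 0
            elif seen_ink:                      # background row below some ink
                run += 1
        distances = distances[:max_regions]
        return np.array(distances + [0] * (max_regions - len(distances)), dtype=float)

    # Two ink runs separated by a 2-row gap -> [2. 0. 0.]
    print(background_count(np.array([[1, 1], [0, 0], [0, 0], [1, 1], [1, 0], [0, 0]])))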
  34. Feature figure
  35. (2) Baseline count: calculate the number of black pixels above the baseline (as a positive value) and the number of black pixels below the baseline (as a negative value) in each slice.
  36. Example: after thinning, the feature vector is (X1, X2), where X1 is the number of black pixels above the baseline and X2 the number below it, using a two-pixel window with a one-pixel overlap.
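
A sketch of the baseline-count feature, assuming the baseline row of the line image is already known (here estimated, as an assumption, from the row with the most ink) and the slice has been thinned; the signs follow the slide (positive above, negative below).

    import numpy as np

    def find_baseline(line_img):
        """Assumed baseline estimate: the row of the binary line image with the most ink."""
        return int(np.argmax(line_img.sum(axis=1)))

    def baseline_count(slice_img, baseline_row):
        """(black pixels above baseline, -black pixels below baseline) for one binary slice."""
        above = int(slice_img[:baseline_row, :].sum())
        below = int(slice_img[baseline_row + 1:, :].sum())
        return np.array([above, -below], dtype=float)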
  37. Feature figure
  38. (3) Centroid: for each slice we compute its centroid (cx, cy), so the feature vector contains a sequence of centroids. Example: feature vector (Cx, Cy), two-pixel window with a one-pixel overlap.
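
A sketch of the centroid feature; the fallback to the geometric centre of an empty slice is an assumption needed to keep the observation sequence well defined.

    import numpy as np

    def centroid(slice_img):
        """Centroid (cx, cy) of the ink pixels (ink = 1) in one slice."""
        ys, xs = np.nonzero(slice_img)
        if len(xs) == 0:                          # empty slice: fall back to the slice centre
            h, w = slice_img.shape
            return np.array([(w - 1) / 2.0, (h - 1) / 2.0])
        return np.array([xs.mean(), ys.mean()])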
  39. (4) Cross count: for each slice we count the number of crossings from background (white) to foreground (black). Example: feature vector = 2, two-pixel window with a one-pixel overlap.
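
A sketch of the cross-count feature; collapsing the slice to one column before counting white-to-black transitions is an assumption about how the slides count crossings.

    import numpy as np

    def cross_count(slice_img):
        """Number of background-to-foreground transitions in one binary slice (ink = 1)."""
        column = slice_img.max(axis=1)
        padded = np.concatenate(([0], column))    # so a stroke touching the top edge counts once
        return np.array([float((np.diff(padded) > 0).sum())])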
  40. (5) Euclidean distance: we find the average foreground pixel in the regions above and below the baseline, then measure the Euclidean distance from the baseline to each average point, with a positive value for the point above and a negative value for the point below.
  41. Example: after thinning, with a one-pixel window and no overlap, the feature vector is (D1, D2): the Euclidean distance above the baseline (D1) and the Euclidean distance below the baseline (D2).
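
A sketch of the Euclidean-distance feature, again assuming a known baseline row and a thinned slice; measuring each distance from the baseline point in the centre column of the slice to the mean ink point above or below it is an interpretation of the slide, not a confirmed detail.

    import numpy as np

    def euclidean_distance_feature(slice_img, baseline_row):
        """(D1, -D2): distances from the baseline to the mean ink point above / below it."""
        centre_x = (slice_img.shape[1] - 1) / 2.0

        def signed_distance(region, row_offset, sign):
            ys, xs = np.nonzero(region)
            if len(ys) == 0:
                return 0.0
            dy = ys.mean() + row_offset - baseline_row
            dx = xs.mean() - centre_x
            return sign * float(np.hypot(dy, dx))

        d1 = signed_distance(slice_img[:baseline_row, :], 0, +1)
        d2 = signed_distance(slice_img[baseline_row + 1:, :], baseline_row + 1, -1)
        return np.array([d1, d2])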
  42. Feature figure
  43. (6) Horizontal histogram: for each slice we compute its horizontal histogram (the horizontal sum of each row in the slice). Example: four-pixel window with a one-pixel overlap. (A combined sketch of the horizontal and vertical histograms follows the vertical-histogram slide below.)
  44. Feature figure
  45. (7) Vertical histogram: for each slice we compute its vertical histogram (the vertical sum of each column). Example: feature vector (X1, X2), two-pixel window with a one-pixel overlap.
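
The horizontal and vertical histograms (features 6 and 7) are plain row and column sums over the slice; the combined sketch referenced at slide 43 follows, with the vector length fixed by the window size chosen for each feature.

    import numpy as np

    def horizontal_histogram(slice_img):
        """Row sums of a binary slice: one value per row (feature 6)."""
        return slice_img.sum(axis=1).astype(float)

    def vertical_histogram(slice_img):
        """Column sums of a binary slice: one value per column, e.g. (X1, X2) for a 2-pixel window (feature 7)."""
        return slice_img.sum(axis=0).astype(float)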
  46. Feature figure
  47. (8) Weighted vertical histogram: exactly like the previous feature, except that each row of the image is multiplied by a weight, where the weight vector applied to the whole image has a triangular shape.
  48. Example: the weight vector runs between 1 and -1; the feature vector is (X1, X2), using a two-pixel window with a one-pixel overlap.
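
A sketch of the weighted vertical histogram. The slides only state that the weight vector is triangular and show the endpoint values 1 and -1, so the exact profile used here (peaking at +1 in the middle row and falling to -1 at the top and bottom edges) is an assumption.

    import numpy as np

    def weighted_vertical_histogram(slice_img):
        """Column sums of the slice after weighting each row by a triangular profile."""
        height = slice_img.shape[0]
        if height < 2:
            return slice_img.sum(axis=0).astype(float)
        half = (height - 1) / 2.0
        weights = 1.0 - 2.0 * np.abs(np.arange(height) - half) / half   # -1 ... +1 ... -1
        return (slice_img * weights[:, None]).sum(axis=0)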
  49. Feature figure
  50. Main contents (same agenda as slide 2)
  51. Implementation of HMM-based AOCR using HTK
      - Data preparation
      - Creating monophone HMMs
      - Recognizer evaluation
  52. Data preparation
      - The task grammar
      - The dictionary
      - Recording the data
      - Creating the transcription files
      - Coding the data
  53. The task grammar
      - Isolated AOCR grammar  -> mini project
      - Connected AOCR grammar -> final project
  54. Isolated AOCR grammar
      $name = a1 | a2 | a3 | a4 | a5 | ... | a28 | a29;
      ( SENT-START <$name> SENT-END )
      where a1 -> ا, a2 -> ب, a3 -> ت, a4 -> ث, ..., a29 -> space
  55. Connected AOCR grammar
      $name = a1 | a2 | a3 | a4 | a5 | ... | a124 | a125;
      ( SENT-START <$name> SENT-END )
      where a1 -> ا, a2 -> ـا, a11 -> ــبــ, a23 -> ـجـ, a124 -> لله, a125 -> ـــــــ
  56. Why a grammar? It defines the word network: Start -> (a1 | a2 | a3 | ... | a124 | a125) -> End.
  57. How is it created? HParse creates it: grammar --HParse--> word net (wdnet).
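
The grammars above use the standard HTK grammar notation, where the angle brackets around $name mean one or more repetitions, and HParse compiles the grammar into the word network of slide 56. The sketch below writes a connected grammar of the same shape and invokes HParse; the symbol count comes from the slides, while the file names are assumptions.

    import subprocess

    def write_connected_grammar(path="gram", n_symbols=125):
        """Write an HTK task grammar allowing any sequence of the symbols a1 .. a125."""
        names = " | ".join("a%d" % i for i in range(1, n_symbols + 1))
        with open(path, "w") as f:
            f.write("$name = %s;\n" % names)
            f.write("( SENT-START <$name> SENT-END )\n")

    write_connected_grammar("gram")
    subprocess.run(["HParse", "gram", "wdnet"], check=True)   # grammar -> word network (wdnet)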
  58. The dictionary: our dictionary is limited.
  59. The dictionary
  60. Recording the data: feature extraction acts as a transformer that turns the 2-D signal (the image) into a 1-D vector, which is stored as a .wav file.
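
Slides 60 and 80 indicate that the 2-D image is turned into a 1-D signal and stored as a .wav file so that HTK's continuous front end can read it. A minimal sketch of that packing step follows; the flattening order, the int16 scaling, and the sample rate are assumptions, since the slides do not specify them.

    import wave

    import numpy as np

    def save_feature_wave(feature_sequence, path, sample_rate=16000):
        """Flatten a (n_slices x n_features) feature sequence into a mono 16-bit PCM .wav file."""
        signal = np.asarray(feature_sequence, dtype=float).ravel()
        peak = float(np.abs(signal).max()) or 1.0
        samples = np.round(signal / peak * 32767).astype("<i2")   # little-endian int16
        with wave.open(path, "wb") as w:
            w.setnchannels(1)
            w.setsampwidth(2)
            w.setframerate(sample_rate)
            w.writeframes(samples.tobytes())

    # save_feature_wave(features, "S0001.wav")   # then HCopy codes it into S0001.mfc (slide 64)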
  61. Creating the transcription files
      - Word-level MLF
      - Phone-level MLF
  62. Word-level MLF:
      #!MLF!#
      "*/1.lab"
      فصل
      .
      "*/2.lab"
      في الفرق بين الخالق والمخلوق
      .
      "*/3.lab"
      وما ابراهيم وآل ابراهيم الحنفاء والأنبياء فهم
      .
      "*/4.lab"
      يعلمون انه لا بد من الفرق بين الخالق والمخلوق
      .
      (Source text: فصل في الفرق بين الخالق والمخلوق وما ابراهيم وآل ابراهيم الحنفاء والأنبياء فهم يعلمون انه لا بد من الفرق بين الخالق والمخلوق)
  63. Phone-level MLF (shown alongside the word-level MLF of slide 62):
      #!MLF!#
      "*/1.lab"
      a74 a51 a88
      .
      "*/2.lab"
      a74 a108 a123 a1 a86 a75 a38 a77 a123 ...
  64. Coding the data: HCopy converts the waveform files (S0001.wav, S0002.wav, S0003.wav, etc.) into MFCC files (S0001.mfc, S0002.mfc, S0003.mfc, etc.), driven by a configuration file and a script file.
  65. Creating monophone HMMs
      - Creating flat-start monophones
      - Re-estimation
  66. Creating monophone HMMs: the first step in HMM training is to define a prototype model. The parameters of this model are not important; its purpose is to define the model topology.
  67. The prototype
      ~o <VecSize> 39 <MFCC_0_D_A>
      ~h "proto"
      <BeginHMM>
        <NumStates> 5
        <State> 2
          <Mean> 39
            0.0 0.0 0.0 ...
          <Variance> 39
            1.0 1.0 1.0 ...
        <State> 3
          <Mean> 39
            0.0 0.0 0.0 ...
          <Variance> 39
            1.0 1.0 1.0 ...
        <State> 4
          <Mean> 39
            0.0 0.0 0.0 ...
          <Variance> 39
            1.0 1.0 1.0 ...
        <TransP> 5
          0.0 1.0 0.0 0.0 0.0
          0.0 0.6 0.4 0.0 0.0
          0.0 0.0 0.6 0.4 0.0
          0.0 0.0 0.0 0.7 0.3
          0.0 0.0 0.0 0.0 0.0
      <EndHMM>
  68. Initialization process: HCompV takes the prototype (proto) and produces the initialized proto and the vFloors file in hmm0.
  69. Initialized prototype
      ~o <VecSize> 39 <MFCC_0_D_A>
      ~h "proto"
      <BeginHMM>
        <NumStates> 5
        <State> 2
          <Mean> 39
            -5.029420e+000 1.948325e+000 -5.192460e+000 ...
          <Variance> 39
            1.568812e+001 1.038746e+001 2.110239e+001 ...
        <State> 3
          <Mean> 39
            -5.029420e+000 1.948325e+000 -5.192460e+000 ...
          <Variance> 39
            1.568812e+001 1.038746e+001 2.110239e+001 ...
        <State> 4
          <Mean> 39
            -5.029420e+000 1.948325e+000 -5.192460e+000 ...
          <Variance> 39
            1.568812e+001 1.038746e+001 2.110239e+001 ...
        <TransP> 5
          0.0 1.0 0.0 0.0 0.0
          0.0 0.6 0.4 0.0 0.0
          0.0 0.0 0.6 0.4 0.0
          0.0 0.0 0.0 0.7 0.3
          0.0 0.0 0.0 0.0 0.0
      <EndHMM>
  70. vFloors contents
      ~v varFloor1
      <Variance> 39
        1.568812e-001 1.038746e-001 2.110239e-001 ...
  71. Creating initialized models: the initialized prototype is copied for every symbol (a1, a2, ..., a125) into the hmmdefs file, headed by ~o <VecSize> 39 <MFCC_0_D_A>.
  72. Creating the macros file: the macros file combines the global options header (~o <VecSize> 39 <MFCC_0_D_A>) with the vFloors file.
  73. Re-estimation process: HERest takes the initialized hmmdefs and macros (built from the HCompV-initialized proto), the training MFC files, the phone-level transcription, and the monophones list, and produces re-estimated hmmdefs and macros.
  74. Recognition process: HVite takes the trained models, the test files, the word network (wnet), and the dictionary (dict), and outputs the recognized words.
  75. Recognizer evaluation: HResults compares the recognized transcription against the reference transcription and reports the accuracy.
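
Slides 64-75 chain HCopy, HCompV, HERest, HVite, and HResults; a minimal end-to-end sketch of those calls is given below. The option sets follow the usual HTKBook tutorial style rather than the authors' exact scripts, and every file name (config, the .scp script files, phones0.mlf, monophones, hmm0/hmm1, wdnet, dict, testref.mlf) is an assumption.

    import subprocess

    def run(cmd):
        print(" ".join(cmd))
        subprocess.run(cmd, check=True)

    # 1. Code the waveform data into feature files (slide 64).
    run(["HCopy", "-T", "1", "-C", "config", "-S", "codetrain.scp"])

    # 2. Flat-start initialization: global mean/variance and vFloors into hmm0 (slide 68).
    run(["HCompV", "-C", "config", "-f", "0.01", "-m", "-S", "train.scp", "-M", "hmm0", "proto"])

    # 3. Embedded re-estimation with HERest (slide 73), normally repeated several times.
    run(["HERest", "-C", "config", "-I", "phones0.mlf", "-t", "250.0", "150.0", "1000.0",
         "-S", "train.scp", "-H", "hmm0/macros", "-H", "hmm0/hmmdefs", "-M", "hmm1", "monophones"])

    # 4. Recognition with HVite against the word network and dictionary (slide 74).
    run(["HVite", "-H", "hmm1/macros", "-H", "hmm1/hmmdefs", "-S", "test.scp",
         "-i", "recout.mlf", "-w", "wdnet", "dict", "monophones"])

    # 5. Score the recognized transcription against the reference (slide 75).
    run(["HResults", "-I", "testref.mlf", "monophones", "recout.mlf"])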
  76. Main contents (same agenda as slide 2)
  77. Experimental results
  78. 1- Main problem
      1-1 Requirements:
      - Connected character recognition.
      - Multiple sizes.
      - Multiple fonts.
      - Handwriting.
  79. 1-2 Variables:
      - Tool.
      - Method used to train and test.
      - Model parameters.
      - Feature parameters.
  80. Tool: how can it operate with images? Data input to HTK:
      - Discrete: input images (failed).
      - Continuous: input a continuous waveform (succeeded).
  81. 2- Isolated character recognition
      - 2-1 Single size (16), single font (Simplified Arabic Fixed).
      - 2-2 Multi-size character recognition.
      - 2-3 Variable-length character recognition.
  82. 2-1 Single size (16), single font (Simplified Arabic Fixed)
      - Best method.
      - Best number of states.
      - Best window size.
  83. Best method: a model for each character (35 models) vs. a model for each character in each position (116 models)
      (vertical histogram, 11 states, window = 2.5)
      No. of models: 35 -> accuracy 99.14 %;  116 -> accuracy 100 %
  84. Best number of states
      (vertical histogram, number of models = 35, window = 2 pixels)
      No. of states: 3 -> accuracy 96.55 %;  11 -> accuracy 99.14 %
  85. Best window size
      (2-D histogram, number of models = 124, 11 states)
  86. 2-2 Multi-size character recognition
      Sizes (12-14-16): (2-D histogram, number of models = 124, 11 states)
  87. 2-3 Variable-length character recognition
      Training with different lengths: the vertical histogram gives higher accuracy than the 2-D histogram
      (vertical histogram, number of models = 35, window = 2 pixels).
  88. Make a model for the dash. Training:
      - Train with the characters (without dash) and a dash model.
      - Train with different lengths and a dash model.
      - Train with different lengths and a dash model, and if a character has a dash at its end, define it as a character model followed by a dash model (the correct way).
  89. Make a model for the dash. Testing:
      - Vertical histogram: failed to recognize the dash model with all methods (it is recognized as a space).
      - 2-D histogram: for window size = 2.6, accuracy = 100 %.
  90. 3- Connected character recognition
      - 3-1 Single size (16), single font (Simplified Arabic Fixed).
      - 3-2 Parameter optimization.
      - 3-3 Multi-size character recognition.
      - 3-4 Fusion by feature concatenation.
  91. 3-1 Single size (16), single font (Simplified Arabic Fixed)
      Best method (from a simple experiment on 10 words): the correct way to do word recognition is to train the character models on whole words or lines.
      Assumptions:
      - Training data: 25 pages (495 lines).
      - Simplified Arabic Fixed (font size = 16).
      - Images: 300 dpi, black and white.
      - Testing data: 4 pages (74 lines).
      - Feature properties: window = 2 * frame.
  92. Vertical histogram:
  93. 2-D histogram:
  94. 3-2 Parameter optimization
      - Line level vs. word level.
      - Optimum number of mixtures.
      - Optimum number of states.
      - Optimum initial transition probabilities.
      - Optimum window vs. frame ratio.
  95. Line level vs. word level
      Assumptions:
      - Simplified Arabic Fixed (font size = 16).
      - Testing data: same as the training data.
      - Feature type: vertical histogram, window = 2 * frame.
      - Images: 300 dpi, black and white.
      Accuracy: word level 85.36 %;  line level 84.99 %
  96. Conclusion: we will concentrate on line segmentation instead of word segmentation because of:
      - The disadvantages of word segmentation:
        - The window size is limited by the small word size.
        - Accuracy decreases as the number of mixtures increases.
      - The simplicity of line segmentation compared to word segmentation in preprocessing.
  97. Optimum number of mixtures
      One-dimensional features:
      - Training data: 495 lines.
      - Testing data: same as the training data.
      - Feature type: vertical histogram, window = 2 * frame, window size = 6.5 pixels.
  98. Two-dimensional features:
      - Training data: 495 lines.
      - Testing data: same as the training data.
      - Feature type: 2-D histogram, window = 2 * frame, window size = 5.33 pixels, N = 4.
  99. Optimum number of states
      One-dimensional features.
  100. Two-dimensional features
      Assumptions: as previous.
      Results: 8 states -> accuracy 92.52 %;  11 states -> accuracy 95.02 %
  101. Optimum initial transition probabilities
      - Almost equally likely probabilities: failed.
      - Random probabilities: very bad.
      - Each state may either stay in itself or move to the next state only, with the self-transition probability higher than the probability of moving to the next state: succeeded.
      Example transition matrix rows:
      0 1 0 0 0
      0 0.7 0.3 0 0
      0 0 0.6 0.4 0
      ... and so on.
  102. Optimum window vs. frame ratio
      Assumptions: as previous (2-D feature).
      Results: overlapping ratio 0.4 -> 91.70 %;  0.5 -> 93.92 %;  0.6 -> 92.52 %
  103. Maximum accuracy for all features:
      Feature type         Max. accuracy
      Vertical histogram   96.97 %
      2-D histogram        95.96 %
      Euclidean distance   87.16 %
      Cross count          91.51 %
      Weighted histogram   95.75 %
      Baseline count       89.70 %
      Background count     91.61 %
  104. 3-3 Multi-size character recognition
      Resizing the test data only:
      - Training data: Simplified Arabic Fixed, font size = 16.
      - Testing data: Simplified Arabic Fixed, font sizes 12-16-18 (after resizing), 60 lines.
      - Feature type: vertical histogram.
      Results: font size 14 -> 79.74 %;  16 -> 96.97 %;  18 -> 76.21 %
  105. Resizing the training and test data:
      - Training data: Simplified Arabic Fixed, font sizes 14-16-18 (after resizing), (324 * 3) lines.
      - Testing data: (324 * 3) lines, same as the training data.
      - Feature type: vertical histogram.
      Accuracy = 92.15 %
  106. 3-4 Feature concatenation: concatenates the vertical histogram and the 2-D histogram.
      Scale (vertical histogram):  no scale    4        4
      Window size:                 5           4.2      5.57
      Accuracy:                    84.09 %     69.02 %  77.17 %
  107. Main contents (same agenda as slide 2)
  108. Future work
      Improving the printed-text system:
      - Database: increase its size to support multiple sizes and multiple fonts.
      - Preprocessing improvements:
        - Improve the image enhancement to solve the problem of noisy pages.
        - Develop a robust system for the problems that depend on the nature of the input pages (removing frames, borders, pictures, tables, etc.).
        - Search for new features and combine them to improve the accuracy.
  109. Training and testing improvements:
      - Tying the models.
      - Using the adaptation supported by the HTK tool, which may improve the multi-size (size-independent) system.
      - Using the tri-phone technique to solve the problems of overlapping.
      Improve the response time (implement all preprocessing programs in C++).
      Increase the accuracy by feature fusion.
  110. Build a multi-language (language-independent) system.
      Develop the handwriting system, especially since HMMs can attack this problem efficiently.
      Develop the on-line system.
  111. Main contents (same agenda as slide 2)
  112. Applications: automatic form recognition, e.g. bank cheque reading.
      Sample cheque form (Banque Misr): cheque no.: ..........; payee name: .................; amount in figures: ................; amount in words: ..................; signature: ...................
  113. Digital libraries: where all books, magazines, newspapers, etc. can be stored as softcopies on PCs and CDs.
  114. Transcription of historical archives and the "non-death" of paper: where we can store all archived papers and documents as softcopy files.
