Recognition of Printed Bilingual (Odia and English)
Scripts and Numbers
Conference on
VLSI Design, Signal Processing, Image Processing, Communications & Embedded Systems
VSPICE, 2020
Prangya Paramita Pradhan
Department of Instrumentation &
Electronics Engineering
College of Engineering and Technology
Bhubaneswar
Debashree Brahma
Department of Instrumentation &
Electronics Engineering
College of Engineering and Technology
Bhubaneswar
Department of Instrumentation & Electronics Engineering, CET
Bhubaneswar, VSPICE, 2020
Outline:
• Introduction
• Bilingual script identification system
• Different features of English and Odia scripts
• Proposed method
• Experimental setup
• Conclusion
• Future work
• References
Introduction
• What is bilingual script identification?
• Necessity of bilingual script identification
Figure 1. A typical bilingual document in Roman and Odia scripts
Bilingual script identification system
Skew correction: The detected skew angle is corrected by rotating the entire document in the opposite direction.
Line segmentation: The skew-corrected document is segmented into lines.
Word segmentation: After the lines are separated, the individual words in each line must be separated.
Classification: Image classification analyses the properties of various image features and organizes the data into categories.
Figure 2. A typical bilingual script identification system for English and Odia scripts. Stages shown: input text (scanned image); preprocessing (binarization, skew detection and correction); line and word segmentation; classification (which type of word is it? English words, Odia words, or bilingual words); output document image as characters.
Different features of English and Odia scripts
Odia script
• 12 vowels and 38 consonants, with matras and yuktaakshara
• Matras are used in the upper zone and the lower zone
• The script is cursive in nature
• The basic characters are at the same level
English script
• 26 upper-case and 26 lower-case letters
• The script is straight and slanted in nature
• The basic characters are not at the same level
Figure 3. Scripts occupy three different zones
Figure 4. Vertical strokes in English and Odia scripts
Figure 5. English characters are at different levels
Experimental setup
Preprocessing:
• Text binarization
• Skew correction
Step 1: Estimate the skew angle
Step 2: Correct the skew angle
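The two preprocessing operations can be illustrated with a minimal sketch. The slides do not say how the threshold is chosen or how the skew angle is estimated, so the fixed `threshold`, the helper names, and the sign convention of the rotation are all assumptions made for illustration; the skew angle is taken as already estimated.

```python
import math

def binarize(gray, threshold=128):
    """Map a grayscale image (rows of 0-255 ints) to 1 = ink (ON), 0 = background (OFF).
    The fixed threshold is an assumption, not a value from the slides."""
    return [[1 if px < threshold else 0 for px in row] for row in gray]

def rotate(binary, angle_deg):
    """Rotate a binary image about its centre using nearest-neighbour inverse mapping."""
    h, w = len(binary), len(binary[0])
    cy, cx = (h - 1) / 2, (w - 1) / 2
    a = math.radians(angle_deg)
    cos_a, sin_a = math.cos(a), math.sin(a)
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            # For every output pixel, look up the source pixel it came from.
            sx = cos_a * (x - cx) + sin_a * (y - cy) + cx
            sy = -sin_a * (x - cx) + cos_a * (y - cy) + cy
            ix, iy = round(sx), round(sy)
            if 0 <= ix < w and 0 <= iy < h:
                out[y][x] = binary[iy][ix]
    return out

def deskew(binary, skew_deg):
    """Correct a known skew by rotating the whole image in the opposite direction."""
    return rotate(binary, -skew_deg)
```

Pixels that rotate out of the frame are dropped in this sketch; a practical version would pad the image before rotating.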
Figure 7. (a) and (b): document images with skews of -4.7 degrees and 5.3 degrees, respectively; (c) and (d): the corresponding skew-corrected images
Line segmentation
• Scan down from the starting row to find the first row that contains an ON pixel
• Find the next row that contains only OFF pixels; call it R1
• Find the next row that contains an ON pixel; call it R2
• The spacing between R1 and R2 separates the first line from the next
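The row scan described above can be sketched as follows. This is an illustrative reading of the bullets, not the authors' code: the image is assumed to be a binary list of rows (1 = ON, 0 = OFF), and the function name and return format are hypothetical.

```python
def segment_lines(binary):
    """Return (top, bottom) row ranges, bottom exclusive, one per text line."""
    lines, start = [], None
    for r, row in enumerate(binary):
        has_ink = any(row)
        if has_ink and start is None:
            start = r                      # first row of a line containing an ON pixel
        elif not has_ink and start is not None:
            lines.append((start, r))       # r is the first all-OFF row (R1 in the slides)
            start = None
    if start is not None:                  # text runs to the bottom edge of the image
        lines.append((start, len(binary)))
    return lines

page = [
    [0, 0, 0],
    [1, 1, 0],
    [0, 1, 0],
    [0, 0, 0],
    [0, 0, 0],
    [1, 0, 1],
    [0, 0, 0],
]
print(segment_lines(page))
```

Each returned range can then be cropped out and passed on to word segmentation.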
Figure 8. Experimental output for line segmentation
Word segmentation
• Scan the line image column by column, each column from top to bottom
• Find the distance between adjacent characters
• The largest distances between two characters mark the boundaries between words
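One way to turn these bullets into code is to measure the blank-column gaps in a line and treat the wide ones as word boundaries. The slides do not state how "maximum distance" becomes a cut-off, so the `min_word_gap` parameter below is an assumption, as are the function names.

```python
def ink_runs(line_img):
    """(start, end) column spans, end exclusive, of consecutive columns containing ink."""
    width = len(line_img[0])
    runs, start = [], None
    for c in range(width):
        ink = any(row[c] for row in line_img)   # scan this column top to bottom
        if ink and start is None:
            start = c
        elif not ink and start is not None:
            runs.append((start, c))
            start = None
    if start is not None:
        runs.append((start, width))
    return runs

def segment_words(line_img, min_word_gap):
    """Merge ink runs separated by fewer than min_word_gap blank columns into one word."""
    words = []
    for start, end in ink_runs(line_img):
        if words and start - words[-1][1] < min_word_gap:
            words[-1] = (words[-1][0], end)     # narrow gap: still the same word
        else:
            words.append((start, end))          # wide gap: a new word begins
    return words

line = [[1, 0, 1, 0, 0, 0, 1, 1, 0, 1]]
print(segment_words(line, min_word_gap=3))
```

In practice the gap threshold would depend on font size, so it might be derived from the distribution of gaps in the line rather than fixed.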
Character segmentation
• For each word, scan from left to right and identify the consecutive OFF and ON pixels.
• The transitions between OFF and ON pixels, taken in order, segment the word into characters.
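A minimal sketch of this left-to-right scan, assuming a binary word image stored as a list of rows (1 = ON, 0 = OFF). The function name is hypothetical, and the sketch does not handle touching characters or conjuncts, which need more than a column scan.

```python
def segment_characters(word_img):
    """Split a word image into (start, end) column spans, end exclusive,
    using OFF->ON and ON->OFF transitions of the column profile."""
    width = len(word_img[0])
    spans, start, prev_on = [], None, False
    for c in range(width):
        on = any(row[c] for row in word_img)
        if on and not prev_on:
            start = c                    # OFF -> ON: a character begins
        elif not on and prev_on:
            spans.append((start, c))     # ON -> OFF: the character ends
        prev_on = on
    if prev_on:                          # last character touches the right edge
        spans.append((start, width))
    return spans

word = [
    [1, 1, 0, 1, 0],
    [0, 1, 0, 1, 1],
]
print(segment_characters(word))
```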
Figure 9. A line segmented into words
Figure 10. Character segmentation
Figure 6. Flow chart for word identification. Starting from the segmented words (English, Odia, or bilingual?), the chart identifies the matras and passes through these decision nodes: Are matras present? How does the number of vertical strokes compare with the number of characters? Are the templates “ra” or “re” matched? Are ‘s’, ‘c’, ‘g’, ‘x’ matched? Is none of the vertical strokes at the beginning of the character, and are the characters at the same level? Outcomes: English, Odia, bilingual.
Proposed identification method for Odia and English scripts
Step 1: Identify the matras. If a matra is present, the word is Odia or bilingual. Otherwise, the word may be Odia, English, or bilingual.
Step 2: Identify the vertical strokes in the word. (The vertical stroke feature is obtained by identifying the columns with the maximum number of ON pixels.) If the number of vertical strokes is greater than the number of characters in the word, it is an English or a mixed English-Odia word.
Step 3: Identify a mixed English-Odia word by noting a matra at the word end or a match for ‘ra’ or ‘re’.
Step 4: If the number of full vertical strokes is less than the number of characters in the word, and a vertical stroke is present at the beginning of a basic character or the characters in the word are not at the same level, the word is decided to be English or bilingual. In that situation, if there is a matra or a match for ‘ra’ or ‘re’ at the word end, the word is decided to be bilingual.
Step 6: For the words with fewer full vertical strokes in Step 2, search for the English characters that have no vertical stroke. Template matching using the correlation method is applied to identify these letters. If one such character is present, the corresponding word is decided to be English.
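The steps above can be read as a small rule set. The sketch below is one possible reading, not the authors' implementation: the `WordFeatures` fields and `classify_word` are hypothetical names, each feature is assumed to be computed beforehand, and the test for a matra at the word end is simplified to "any matra". The `correlation` helper shows the kind of normalized correlation score that template matching could use; the slides do not specify the exact formula or the acceptance threshold.

```python
import math
from dataclasses import dataclass

def correlation(patch, template):
    """Normalized correlation of two equal-sized binary images, in [-1, 1]."""
    xs = [p for row in patch for p in row]
    ys = [t for row in template for t in row]
    if len(xs) != len(ys):
        raise ValueError("patch and template must have the same size")
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    num = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    den = math.sqrt(sum((x - mx) ** 2 for x in xs) * sum((y - my) ** 2 for y in ys))
    return num / den if den else 0.0

@dataclass
class WordFeatures:                      # hypothetical container for precomputed features
    has_matra: bool                      # Step 1
    n_strokes: int                       # full vertical strokes in the word (Step 2)
    n_chars: int                         # characters in the word
    stroke_at_start: bool                # Step 4: stroke at the beginning of a character
    same_level: bool                     # Step 4: characters share one level
    ends_with_ra_or_re: bool             # Steps 3 and 4: template match at the word end
    has_strokeless_english_char: bool    # Step 6: template match for a strokeless letter

def classify_word(f):
    english_like = (
        f.n_strokes > f.n_chars                  # Step 2
        or f.stroke_at_start                     # Step 4
        or not f.same_level                      # Step 4
        or f.has_strokeless_english_char         # Step 6
    )
    odia_evidence = f.has_matra or f.ends_with_ra_or_re   # Steps 1, 3, 4
    if english_like:
        return "bilingual" if odia_evidence else "English"
    return "Odia"
```

Under this reading, `has_strokeless_english_char` would be set when `correlation` between a character image and one of the stored letter templates exceeds a chosen threshold; that threshold is an assumption left to the implementer.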
Table 1. Number of characters in a word and average number of vertical strokes per word in English script
Number of characters in a word | Average number of vertical strokes
1 | 1
2 | 2
3 | 6.5
4 | 5
5 | 7.3
6 | 7.4
7 | 10.5
8 | 10.4
9 | 11
10 | 14
Table 2. Number of characters in a word and average number of vertical strokes per word in Odia script
Number of characters in a word | Average number of vertical strokes
1 | 0.5
2 | 1
3 | 1.75
4 | 2.5
5 | 2.9
6 | 2.1
7 | 4
8 | 2
9 | 4
Comparing both outputs
Figure 11. Comparing the vertical strokes for both scripts (panels: Odia script, English script)
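The comparison can be made concrete with a short sketch that counts vertical strokes in a binary word image and checks the averages from Table 1 and Table 2 against the word length. The stroke counter is an assumption about how "columns with the maximum number of ON pixels" might be detected: the `min_fraction` parameter and the grouping of adjacent tall columns into one stroke are illustrative choices, not taken from the slides. The two dictionaries copy the table values.

```python
def count_vertical_strokes(word_img, min_fraction=0.8):
    """Count groups of adjacent columns whose ON-pixel count reaches
    min_fraction of the image height (an assumed definition of a full stroke)."""
    height, width = len(word_img), len(word_img[0])
    strokes, in_stroke = 0, False
    for c in range(width):
        on_count = sum(row[c] for row in word_img)
        tall = on_count >= min_fraction * height
        if tall and not in_stroke:
            strokes += 1                 # a new stroke starts at this column
        in_stroke = tall
    return strokes

# Average vertical strokes per word length, copied from Table 1 and Table 2.
ENGLISH_AVG = {1: 1, 2: 2, 3: 6.5, 4: 5, 5: 7.3, 6: 7.4, 7: 10.5, 8: 10.4, 9: 11, 10: 14}
ODIA_AVG = {1: 0.5, 2: 1, 3: 1.75, 4: 2.5, 5: 2.9, 6: 2.1, 7: 4, 8: 2, 9: 4}

# In the tabulated averages, English words have at least as many strokes as
# characters, while Odia words have fewer strokes than characters.
english_at_least_chars = all(s >= n for n, s in ENGLISH_AVG.items())
odia_below_chars = all(s < n for n, s in ODIA_AVG.items())
print(english_at_least_chars, odia_below_chars)
```

These are averages, so individual words can still fall on the wrong side of the comparison, which is why the method combines the stroke count with the matra, level, and template checks.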
Conclusion and future scope:
The performance of the proposed method may be studied in more detail, including the case of varying font sizes.
The method may be extended to numerals occurring in bilingual scripts.
The performance of the proposed method in ambiguous cases, such as ‘I’ in Roman script and the Odia punctuation mark ‘-’, is to be improved.
References
[1] D. Dhanya, A. G. Ramakrishnan, and P. B. Pati, “Script identification in printed bilingual documents,” Sadhana, vol. 27, part 1, pp. 73-82, February 2002.
[2] B. Chaudhuri, U. Pal, and M. Mitra, “Automatic recognition of printed Oriya script,” Sadhana, vol. 27, pp. 23-34, February 2002.
[3] U. Pal and B. Chaudhuri, “Indian script character recognition: a survey,” Elsevier, vol. 37, pp. 1887-1889, September 2004.
[4] P. B. Pati, S. S. R. Nishikanta, and A. G. Ramakrishnan, “Gabor filters for document analysis in Indian bilingual documents,” Proceedings of International Conference IEEE, pp. 123-126, 2004.
[5] U. Pal and B. B. Chaudhuri, “Script line separation from Indian multi-script documents,” Proceedings of the Fifth International Conference on, ICDAR, pp. 406-409, January 1999.
[6] S. Mori, C. Y. Suen, and K. Yamamoto, “Historical review of OCR research and development,” IEEE, vol. 22, January 1992.
[7] S. N. Srihari and J. J. Hull, “On-line and off-line handwriting recognition: a comprehensive survey,” IEEE, Computer Society Washington.
[8] S. Wood, X. Yao, K. , and L. Dang, “Language identification from printed text independent of segmentation,” Proc. of Int'l. Conf. on Image Processing, January 1995.
[9] D. Dhanya and A. Ramakrishnan, “Script identification in printed bilingual documents,” Springer-Verlag Berlin Heidelberg, vol. 2423, pp. 13-24, 2002.
[10] B. V. Dhandra, Malikarjun, Hangarge, and V. S. Malemathl, “Separation of English numeral from the multi lingual document text image,” IEEE-ICSCN, International Conference on Signal Processing, Communications and Networking at MIT, February 2007.
Department of Instrumentation & Electronics Engineering, CET
Bhubaneswar,VSPICE,2020
1
21
1 22
Department of Instrumentation & Electronics Engineering, CET
Bhubaneswar,VSPICE,2020
1 23
1 24
1 25
1 26
Segment the words
English or odia or bilingual?
Identify the matras
English or bilingual
Are matras
Present?
Are no.of vertical
strokes≤ no.of
chars ?
Are the
templates
“ra” or “re ”
matched?
English
yes
no
odia or
bilingual
Are no.of
vertical
strokes≤
no.of chars
?
yes
NO
Are ‘s,’
‘c’,’g’,’x’
are
matched
Is none of the
vertical strokes at
the beginning of
the character and
the characters are
in the same level?
yes
yes
no
yes
no
biningual
no
no
yes
Odia
odia
bilingual
1 27

More Related Content

Similar to PPP-DB-FINAL PPT.pdf

An exhaustive font and size invariant classification scheme for ocr of devana...
An exhaustive font and size invariant classification scheme for ocr of devana...An exhaustive font and size invariant classification scheme for ocr of devana...
An exhaustive font and size invariant classification scheme for ocr of devana...ijnlc
 
IRJET - A Survey on Recognition of Strike-Out Texts in Handwritten Documents
IRJET - A Survey on Recognition of Strike-Out Texts in Handwritten DocumentsIRJET - A Survey on Recognition of Strike-Out Texts in Handwritten Documents
IRJET - A Survey on Recognition of Strike-Out Texts in Handwritten DocumentsIRJET Journal
 
Classification and Identification of Telugu Aksharas using Moment Invariants ...
Classification and Identification of Telugu Aksharas using Moment Invariants ...Classification and Identification of Telugu Aksharas using Moment Invariants ...
Classification and Identification of Telugu Aksharas using Moment Invariants ...Srikanth Chintakindi
 
Devnagari handwritten numeral recognition using geometric features and statis...
Devnagari handwritten numeral recognition using geometric features and statis...Devnagari handwritten numeral recognition using geometric features and statis...
Devnagari handwritten numeral recognition using geometric features and statis...Vikas Dongre
 
Segmentation of Handwritten Text in Gurmukhi Script
Segmentation of Handwritten Text in Gurmukhi ScriptSegmentation of Handwritten Text in Gurmukhi Script
Segmentation of Handwritten Text in Gurmukhi ScriptCSCJournals
 
Recognition of Words in Tamil Script Using Neural Network
Recognition of Words in Tamil Script Using Neural NetworkRecognition of Words in Tamil Script Using Neural Network
Recognition of Words in Tamil Script Using Neural NetworkIJERA Editor
 
Dimensionality Reduction and Feature Selection Methods for Script Identificat...
Dimensionality Reduction and Feature Selection Methods for Script Identificat...Dimensionality Reduction and Feature Selection Methods for Script Identificat...
Dimensionality Reduction and Feature Selection Methods for Script Identificat...ITIIIndustries
 
Engineering_drawing an overview of engineering drawing
Engineering_drawing an overview of engineering drawingEngineering_drawing an overview of engineering drawing
Engineering_drawing an overview of engineering drawinganggawirya1
 
Sample of Engineering Drawing SLide,.ppt
Sample of Engineering Drawing SLide,.pptSample of Engineering Drawing SLide,.ppt
Sample of Engineering Drawing SLide,.pptanggawirya1
 
Technical reports Exam 2014
Technical reports Exam 2014 Technical reports Exam 2014
Technical reports Exam 2014 Magdi Saadawi
 
Malayalam Word Sense Disambiguation using Machine Learning Approach
Malayalam Word Sense Disambiguation using Machine Learning ApproachMalayalam Word Sense Disambiguation using Machine Learning Approach
Malayalam Word Sense Disambiguation using Machine Learning ApproachIRJET Journal
 
A bidirectional text transcription of braille for odia, hindi, telugu and eng...
A bidirectional text transcription of braille for odia, hindi, telugu and eng...A bidirectional text transcription of braille for odia, hindi, telugu and eng...
A bidirectional text transcription of braille for odia, hindi, telugu and eng...eSAT Journals
 
Chapter 01 Introduction drawing.ppt
Chapter 01 Introduction drawing.pptChapter 01 Introduction drawing.ppt
Chapter 01 Introduction drawing.pptRajanBagale4
 
Script Identification In Trilingual Indian Documents
Script Identification In Trilingual Indian DocumentsScript Identification In Trilingual Indian Documents
Script Identification In Trilingual Indian DocumentsCSCJournals
 
MAHI: Machine And Human Interface
MAHI: Machine And Human InterfaceMAHI: Machine And Human Interface
MAHI: Machine And Human InterfaceCSCJournals
 
An Empirical Study on Identification of Strokes and their Significance in Scr...
An Empirical Study on Identification of Strokes and their Significance in Scr...An Empirical Study on Identification of Strokes and their Significance in Scr...
An Empirical Study on Identification of Strokes and their Significance in Scr...IJMER
 
Devnagari document segmentation using histogram approach
Devnagari document segmentation using histogram approachDevnagari document segmentation using histogram approach
Devnagari document segmentation using histogram approachVikas Dongre
 

Similar to PPP-DB-FINAL PPT.pdf (20)

An exhaustive font and size invariant classification scheme for ocr of devana...
An exhaustive font and size invariant classification scheme for ocr of devana...An exhaustive font and size invariant classification scheme for ocr of devana...
An exhaustive font and size invariant classification scheme for ocr of devana...
 
IRJET - A Survey on Recognition of Strike-Out Texts in Handwritten Documents
IRJET - A Survey on Recognition of Strike-Out Texts in Handwritten DocumentsIRJET - A Survey on Recognition of Strike-Out Texts in Handwritten Documents
IRJET - A Survey on Recognition of Strike-Out Texts in Handwritten Documents
 
Classification and Identification of Telugu Aksharas using Moment Invariants ...
Classification and Identification of Telugu Aksharas using Moment Invariants ...Classification and Identification of Telugu Aksharas using Moment Invariants ...
Classification and Identification of Telugu Aksharas using Moment Invariants ...
 
Devnagari handwritten numeral recognition using geometric features and statis...
Devnagari handwritten numeral recognition using geometric features and statis...Devnagari handwritten numeral recognition using geometric features and statis...
Devnagari handwritten numeral recognition using geometric features and statis...
 
Segmentation of Handwritten Text in Gurmukhi Script
Segmentation of Handwritten Text in Gurmukhi ScriptSegmentation of Handwritten Text in Gurmukhi Script
Segmentation of Handwritten Text in Gurmukhi Script
 
Recognition of Words in Tamil Script Using Neural Network
Recognition of Words in Tamil Script Using Neural NetworkRecognition of Words in Tamil Script Using Neural Network
Recognition of Words in Tamil Script Using Neural Network
 
Dimensionality Reduction and Feature Selection Methods for Script Identificat...
Dimensionality Reduction and Feature Selection Methods for Script Identificat...Dimensionality Reduction and Feature Selection Methods for Script Identificat...
Dimensionality Reduction and Feature Selection Methods for Script Identificat...
 
Engineering_drawing an overview of engineering drawing
Engineering_drawing an overview of engineering drawingEngineering_drawing an overview of engineering drawing
Engineering_drawing an overview of engineering drawing
 
Sample of Engineering Drawing SLide,.ppt
Sample of Engineering Drawing SLide,.pptSample of Engineering Drawing SLide,.ppt
Sample of Engineering Drawing SLide,.ppt
 
Chapter.ppt
Chapter.pptChapter.ppt
Chapter.ppt
 
Chapter.ppt
Chapter.pptChapter.ppt
Chapter.ppt
 
Technical reports Exam 2014
Technical reports Exam 2014 Technical reports Exam 2014
Technical reports Exam 2014
 
Malayalam Word Sense Disambiguation using Machine Learning Approach
Malayalam Word Sense Disambiguation using Machine Learning ApproachMalayalam Word Sense Disambiguation using Machine Learning Approach
Malayalam Word Sense Disambiguation using Machine Learning Approach
 
A bidirectional text transcription of braille for odia, hindi, telugu and eng...
A bidirectional text transcription of braille for odia, hindi, telugu and eng...A bidirectional text transcription of braille for odia, hindi, telugu and eng...
A bidirectional text transcription of braille for odia, hindi, telugu and eng...
 
Chapter 01 Introduction drawing.ppt
Chapter 01 Introduction drawing.pptChapter 01 Introduction drawing.ppt
Chapter 01 Introduction drawing.ppt
 
Script Identification In Trilingual Indian Documents
Script Identification In Trilingual Indian DocumentsScript Identification In Trilingual Indian Documents
Script Identification In Trilingual Indian Documents
 
MAHI: Machine And Human Interface
MAHI: Machine And Human InterfaceMAHI: Machine And Human Interface
MAHI: Machine And Human Interface
 
An Empirical Study on Identification of Strokes and their Significance in Scr...
An Empirical Study on Identification of Strokes and their Significance in Scr...An Empirical Study on Identification of Strokes and their Significance in Scr...
An Empirical Study on Identification of Strokes and their Significance in Scr...
 
Ijetcas14 399
Ijetcas14 399Ijetcas14 399
Ijetcas14 399
 
Devnagari document segmentation using histogram approach
Devnagari document segmentation using histogram approachDevnagari document segmentation using histogram approach
Devnagari document segmentation using histogram approach
 

Recently uploaded

VIP Call Girls Service Hitech City Hyderabad Call +91-8250192130
VIP Call Girls Service Hitech City Hyderabad Call +91-8250192130VIP Call Girls Service Hitech City Hyderabad Call +91-8250192130
VIP Call Girls Service Hitech City Hyderabad Call +91-8250192130Suhani Kapoor
 
The Most Attractive Pune Call Girls Budhwar Peth 8250192130 Will You Miss Thi...
The Most Attractive Pune Call Girls Budhwar Peth 8250192130 Will You Miss Thi...The Most Attractive Pune Call Girls Budhwar Peth 8250192130 Will You Miss Thi...
The Most Attractive Pune Call Girls Budhwar Peth 8250192130 Will You Miss Thi...ranjana rawat
 
KubeKraft presentation @CloudNativeHooghly
KubeKraft presentation @CloudNativeHooghlyKubeKraft presentation @CloudNativeHooghly
KubeKraft presentation @CloudNativeHooghlysanyuktamishra911
 
APPLICATIONS-AC/DC DRIVES-OPERATING CHARACTERISTICS
APPLICATIONS-AC/DC DRIVES-OPERATING CHARACTERISTICSAPPLICATIONS-AC/DC DRIVES-OPERATING CHARACTERISTICS
APPLICATIONS-AC/DC DRIVES-OPERATING CHARACTERISTICSKurinjimalarL3
 
result management system report for college project
result management system report for college projectresult management system report for college project
result management system report for college projectTonystark477637
 
Introduction to IEEE STANDARDS and its different types.pptx
Introduction to IEEE STANDARDS and its different types.pptxIntroduction to IEEE STANDARDS and its different types.pptx
Introduction to IEEE STANDARDS and its different types.pptxupamatechverse
 
Introduction and different types of Ethernet.pptx
Introduction and different types of Ethernet.pptxIntroduction and different types of Ethernet.pptx
Introduction and different types of Ethernet.pptxupamatechverse
 
MANUFACTURING PROCESS-II UNIT-2 LATHE MACHINE
MANUFACTURING PROCESS-II UNIT-2 LATHE MACHINEMANUFACTURING PROCESS-II UNIT-2 LATHE MACHINE
MANUFACTURING PROCESS-II UNIT-2 LATHE MACHINESIVASHANKAR N
 
HARDNESS, FRACTURE TOUGHNESS AND STRENGTH OF CERAMICS
HARDNESS, FRACTURE TOUGHNESS AND STRENGTH OF CERAMICSHARDNESS, FRACTURE TOUGHNESS AND STRENGTH OF CERAMICS
HARDNESS, FRACTURE TOUGHNESS AND STRENGTH OF CERAMICSRajkumarAkumalla
 
247267395-1-Symmetric-and-distributed-shared-memory-architectures-ppt (1).ppt
247267395-1-Symmetric-and-distributed-shared-memory-architectures-ppt (1).ppt247267395-1-Symmetric-and-distributed-shared-memory-architectures-ppt (1).ppt
247267395-1-Symmetric-and-distributed-shared-memory-architectures-ppt (1).pptssuser5c9d4b1
 
Call Girls Service Nashik Vaishnavi 7001305949 Independent Escort Service Nashik
Call Girls Service Nashik Vaishnavi 7001305949 Independent Escort Service NashikCall Girls Service Nashik Vaishnavi 7001305949 Independent Escort Service Nashik
Call Girls Service Nashik Vaishnavi 7001305949 Independent Escort Service NashikCall Girls in Nagpur High Profile
 
Call for Papers - Educational Administration: Theory and Practice, E-ISSN: 21...
Call for Papers - Educational Administration: Theory and Practice, E-ISSN: 21...Call for Papers - Educational Administration: Theory and Practice, E-ISSN: 21...
Call for Papers - Educational Administration: Theory and Practice, E-ISSN: 21...Christo Ananth
 
UNIT-III FMM. DIMENSIONAL ANALYSIS
UNIT-III FMM.        DIMENSIONAL ANALYSISUNIT-III FMM.        DIMENSIONAL ANALYSIS
UNIT-III FMM. DIMENSIONAL ANALYSISrknatarajan
 
Structural Analysis and Design of Foundations: A Comprehensive Handbook for S...
Structural Analysis and Design of Foundations: A Comprehensive Handbook for S...Structural Analysis and Design of Foundations: A Comprehensive Handbook for S...
Structural Analysis and Design of Foundations: A Comprehensive Handbook for S...Dr.Costas Sachpazis
 
MANUFACTURING PROCESS-II UNIT-5 NC MACHINE TOOLS
MANUFACTURING PROCESS-II UNIT-5 NC MACHINE TOOLSMANUFACTURING PROCESS-II UNIT-5 NC MACHINE TOOLS
MANUFACTURING PROCESS-II UNIT-5 NC MACHINE TOOLSSIVASHANKAR N
 
Call Girls in Nagpur Suman Call 7001035870 Meet With Nagpur Escorts
Call Girls in Nagpur Suman Call 7001035870 Meet With Nagpur EscortsCall Girls in Nagpur Suman Call 7001035870 Meet With Nagpur Escorts
Call Girls in Nagpur Suman Call 7001035870 Meet With Nagpur EscortsCall Girls in Nagpur High Profile
 
Software Development Life Cycle By Team Orange (Dept. of Pharmacy)
Software Development Life Cycle By  Team Orange (Dept. of Pharmacy)Software Development Life Cycle By  Team Orange (Dept. of Pharmacy)
Software Development Life Cycle By Team Orange (Dept. of Pharmacy)Suman Mia
 
High Profile Call Girls Nagpur Meera Call 7001035870 Meet With Nagpur Escorts
High Profile Call Girls Nagpur Meera Call 7001035870 Meet With Nagpur EscortsHigh Profile Call Girls Nagpur Meera Call 7001035870 Meet With Nagpur Escorts
High Profile Call Girls Nagpur Meera Call 7001035870 Meet With Nagpur EscortsCall Girls in Nagpur High Profile
 
Top Rated Pune Call Girls Budhwar Peth ⟟ 6297143586 ⟟ Call Me For Genuine Se...
Top Rated  Pune Call Girls Budhwar Peth ⟟ 6297143586 ⟟ Call Me For Genuine Se...Top Rated  Pune Call Girls Budhwar Peth ⟟ 6297143586 ⟟ Call Me For Genuine Se...
Top Rated Pune Call Girls Budhwar Peth ⟟ 6297143586 ⟟ Call Me For Genuine Se...Call Girls in Nagpur High Profile
 

Recently uploaded (20)

VIP Call Girls Service Hitech City Hyderabad Call +91-8250192130
VIP Call Girls Service Hitech City Hyderabad Call +91-8250192130VIP Call Girls Service Hitech City Hyderabad Call +91-8250192130
VIP Call Girls Service Hitech City Hyderabad Call +91-8250192130
 
The Most Attractive Pune Call Girls Budhwar Peth 8250192130 Will You Miss Thi...
The Most Attractive Pune Call Girls Budhwar Peth 8250192130 Will You Miss Thi...The Most Attractive Pune Call Girls Budhwar Peth 8250192130 Will You Miss Thi...
The Most Attractive Pune Call Girls Budhwar Peth 8250192130 Will You Miss Thi...
 
KubeKraft presentation @CloudNativeHooghly
KubeKraft presentation @CloudNativeHooghlyKubeKraft presentation @CloudNativeHooghly
KubeKraft presentation @CloudNativeHooghly
 
APPLICATIONS-AC/DC DRIVES-OPERATING CHARACTERISTICS
APPLICATIONS-AC/DC DRIVES-OPERATING CHARACTERISTICSAPPLICATIONS-AC/DC DRIVES-OPERATING CHARACTERISTICS
APPLICATIONS-AC/DC DRIVES-OPERATING CHARACTERISTICS
 
result management system report for college project
result management system report for college projectresult management system report for college project
result management system report for college project
 
Introduction to IEEE STANDARDS and its different types.pptx
Introduction to IEEE STANDARDS and its different types.pptxIntroduction to IEEE STANDARDS and its different types.pptx
Introduction to IEEE STANDARDS and its different types.pptx
 
Introduction and different types of Ethernet.pptx
Introduction and different types of Ethernet.pptxIntroduction and different types of Ethernet.pptx
Introduction and different types of Ethernet.pptx
 
MANUFACTURING PROCESS-II UNIT-2 LATHE MACHINE
MANUFACTURING PROCESS-II UNIT-2 LATHE MACHINEMANUFACTURING PROCESS-II UNIT-2 LATHE MACHINE
MANUFACTURING PROCESS-II UNIT-2 LATHE MACHINE
 
DJARUM4D - SLOT GACOR ONLINE | SLOT DEMO ONLINE
DJARUM4D - SLOT GACOR ONLINE | SLOT DEMO ONLINEDJARUM4D - SLOT GACOR ONLINE | SLOT DEMO ONLINE
DJARUM4D - SLOT GACOR ONLINE | SLOT DEMO ONLINE
 
HARDNESS, FRACTURE TOUGHNESS AND STRENGTH OF CERAMICS
HARDNESS, FRACTURE TOUGHNESS AND STRENGTH OF CERAMICSHARDNESS, FRACTURE TOUGHNESS AND STRENGTH OF CERAMICS
HARDNESS, FRACTURE TOUGHNESS AND STRENGTH OF CERAMICS
 
247267395-1-Symmetric-and-distributed-shared-memory-architectures-ppt (1).ppt
247267395-1-Symmetric-and-distributed-shared-memory-architectures-ppt (1).ppt247267395-1-Symmetric-and-distributed-shared-memory-architectures-ppt (1).ppt
247267395-1-Symmetric-and-distributed-shared-memory-architectures-ppt (1).ppt
 
Call Girls Service Nashik Vaishnavi 7001305949 Independent Escort Service Nashik
Call Girls Service Nashik Vaishnavi 7001305949 Independent Escort Service NashikCall Girls Service Nashik Vaishnavi 7001305949 Independent Escort Service Nashik
Call Girls Service Nashik Vaishnavi 7001305949 Independent Escort Service Nashik
 
Call for Papers - Educational Administration: Theory and Practice, E-ISSN: 21...
Call for Papers - Educational Administration: Theory and Practice, E-ISSN: 21...Call for Papers - Educational Administration: Theory and Practice, E-ISSN: 21...
Call for Papers - Educational Administration: Theory and Practice, E-ISSN: 21...
 
UNIT-III FMM. DIMENSIONAL ANALYSIS
UNIT-III FMM.        DIMENSIONAL ANALYSISUNIT-III FMM.        DIMENSIONAL ANALYSIS
UNIT-III FMM. DIMENSIONAL ANALYSIS
 
Structural Analysis and Design of Foundations: A Comprehensive Handbook for S...
Structural Analysis and Design of Foundations: A Comprehensive Handbook for S...Structural Analysis and Design of Foundations: A Comprehensive Handbook for S...
Structural Analysis and Design of Foundations: A Comprehensive Handbook for S...
 
MANUFACTURING PROCESS-II UNIT-5 NC MACHINE TOOLS
MANUFACTURING PROCESS-II UNIT-5 NC MACHINE TOOLSMANUFACTURING PROCESS-II UNIT-5 NC MACHINE TOOLS
MANUFACTURING PROCESS-II UNIT-5 NC MACHINE TOOLS
 
Call Girls in Nagpur Suman Call 7001035870 Meet With Nagpur Escorts
Call Girls in Nagpur Suman Call 7001035870 Meet With Nagpur EscortsCall Girls in Nagpur Suman Call 7001035870 Meet With Nagpur Escorts
Call Girls in Nagpur Suman Call 7001035870 Meet With Nagpur Escorts
 
Software Development Life Cycle By Team Orange (Dept. of Pharmacy)
Software Development Life Cycle By  Team Orange (Dept. of Pharmacy)Software Development Life Cycle By  Team Orange (Dept. of Pharmacy)
Software Development Life Cycle By Team Orange (Dept. of Pharmacy)
 
High Profile Call Girls Nagpur Meera Call 7001035870 Meet With Nagpur Escorts
High Profile Call Girls Nagpur Meera Call 7001035870 Meet With Nagpur EscortsHigh Profile Call Girls Nagpur Meera Call 7001035870 Meet With Nagpur Escorts
High Profile Call Girls Nagpur Meera Call 7001035870 Meet With Nagpur Escorts
 
Top Rated Pune Call Girls Budhwar Peth ⟟ 6297143586 ⟟ Call Me For Genuine Se...
Top Rated  Pune Call Girls Budhwar Peth ⟟ 6297143586 ⟟ Call Me For Genuine Se...Top Rated  Pune Call Girls Budhwar Peth ⟟ 6297143586 ⟟ Call Me For Genuine Se...
Top Rated Pune Call Girls Budhwar Peth ⟟ 6297143586 ⟟ Call Me For Genuine Se...
 

PPP-DB-FINAL PPT.pdf

  • 1. Recognition of Printed Bilingual (Odia and English) Scripts and Numbers Conference on VLSI Design, Signal Processing, Image Processing, Communications & Embedded Systems VSPICE,2020 Prangya Paramita Pradhan Department of Instrumentation & Electronics Engineering College of Engineering and Technology Bhubaneswar Debashree Brahma Department of Instrumentation & Electronics Engineering College of Engineering and Technology Bhubaneswar Department of Instrumentation & Electronics Engineering, CET Bhubaneswar,VSPICE,2020 Department of Instrumentation & Electronics Engineering, CET Bhubaneswar,VSPICE,2020
  • 2. Outlines:  Introduction  Bilingual scripts identification system  Different features of English and Odia scripts  Proposed method  Experimental setup  Conclusion  Future work  References Department of Instrumentation & Electronics Engineering, CET Bhubaneswar,VSPICE,2020 1 2
  • 3.  What is bilingual script identification?  Necessity of bilingual script identification Figure1. A typical bilingual documents in Roman and Odia scripts Introduction Department of Instrumentation & Electronics Engineering, CET Bhubaneswar,VSPICE,2020 1 3
  • 4. Skew correction: The detected skew angle can be corrected by rotating the entire document in opposite direction. Line segmentation: The skew corrected document is segmented into lines Word segmentation: After line is separated, it is necessary to differentiate, the individual words. Classification:Image classification analyses the properties of various image features and organizes data into same categories. Bilingual Scripts identification system Department of Instrumentation & Electronics Engineering, CET Bhubaneswar,VSPICE,2020 1 4
  • 5. Figure2. A typical bilingual script identification system for both English and Odia scripts Input text (scanned image) PREPROCESSING (BINARIZATION, SKEW DETECTION & CORRECTION) LINE AND WORD SEGMENTATION ENGLISH WORDS ODIA WORDS BILINGUAL WORDS WHICH TYPE OF WORD IS IT? CLASSIFICATION Output document image as character Department of Instrumentation & Electronics Engineering, CET Bhubaneswar,VSPICE,2020 1 5
  • 6. Odia scripts  12 vowels and 38 consonants having matras and yuktaakshara  Matras are used in upper zone and lower zone  Scripts are cursive in nature  The basic characters are in same level English scripts  26upper case &26 lower case  Scripts are Straight and slant in nature  Basic characters are not in same level Different Features of English and Odia scripts Department of Instrumentation & Electronics Engineering, CET Bhubaneswar,VSPICE,2020 1 6
  • 7. Figure3. Scripts occupies three different zones Figure4.Showing Vertical strokes in English and odia scripts Figure5. English scripts are in different levels Department of Instrumentation & Electronics Engineering, CET Bhubaneswar,VSPICE,2020 1 7
  • 8. Experimental setup. Preprocessing: text binarization, followed by skew correction. Step 1: Estimate the skew angle. Step 2: Correct the skew angle.
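
The slides do not state how the skew angle in Step 1 is estimated. A minimal sketch, assuming a projection-profile search (a common technique swapped in here for illustration, not necessarily the authors' method): for each candidate angle, shift every ON pixel back along the assumed slope and keep the angle that concentrates the ink into the fewest rows. The names `profile_score` and `estimate_skew`, the candidate range, and the sign convention (positive angle means rows increase with columns) are all assumptions.

```python
import math
from typing import List, Sequence, Tuple

Image = List[List[int]]  # 1 = ON (ink) pixel, 0 = OFF (background)

def profile_score(points: Sequence[Tuple[int, int]], angle_deg: float) -> int:
    """Sharpness of the row profile after undoing a skew of angle_deg."""
    slope = math.tan(math.radians(angle_deg))
    rows = {}
    for r, c in points:
        level = round(r - c * slope)  # row the pixel would land on if deskewed
        rows[level] = rows.get(level, 0) + 1
    # The sum of squares is largest when the ink is concentrated in few rows.
    return sum(n * n for n in rows.values())

def estimate_skew(img: Image, candidates: Sequence[float]) -> float:
    """Return the candidate angle (degrees) with the sharpest row profile."""
    points = [(r, c) for r, row in enumerate(img) for c, v in enumerate(row) if v]
    return max(candidates, key=lambda a: profile_score(points, a))

# Synthetic page: one 1-pixel-thick text line drawn with a 5 degree skew.
width, height = 100, 40
page = [[0] * width for _ in range(height)]
for c in range(width):
    page[20 + round(c * math.tan(math.radians(5)))][c] = 1

angle = estimate_skew(page, candidates=range(-10, 11))
print(angle)  # 5 for this synthetic page
```

Step 2 would then rotate the page by the estimated angle in the opposite direction; the rotation itself is left out of this sketch.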
  • 9. Figure 7. (a) and (b): document images with skews of -4.7 degrees and 5.3 degrees, respectively; (c) and (d): the corresponding skew-corrected images.
  • 10. Line segmentation: Scan the rows from the top and find the first row that contains an ON pixel. Find the next row that contains only OFF pixels and call it R1. Find the next row that contains an ON pixel and call it R2. The spacing between R1 and R2 separates the first line from the next one.
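
The row scan described above can be sketched as follows, assuming the binarized page is a list of rows in which 1 marks an ON (ink) pixel and 0 an OFF pixel. The function name `segment_lines` and the sample page are illustrative only.

```python
from typing import List, Tuple

Image = List[List[int]]  # 1 = ON (ink) pixel, 0 = OFF (background)

def segment_lines(img: Image) -> List[Tuple[int, int]]:
    """Return (first_row, last_row), inclusive, for each text line."""
    lines = []
    start = None  # first row of the line currently being scanned
    for r, row in enumerate(img):
        has_ink = any(row)
        if has_ink and start is None:
            start = r                     # a new text line begins
        elif not has_ink and start is not None:
            lines.append((start, r - 1))  # blank row closes the line
            start = None
    if start is not None:                 # line runs to the bottom edge
        lines.append((start, len(img) - 1))
    return lines

page = [
    [0, 0, 0, 0],
    [1, 1, 0, 1],
    [0, 1, 1, 0],
    [0, 0, 0, 0],
    [0, 0, 0, 0],
    [1, 0, 0, 1],
]
print(segment_lines(page))  # [(1, 2), (5, 5)]
```

This sketch treats any fully blank row as a line boundary, so it relies on the page having been skew-corrected first.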
  • 11. Line segmentation: Figure 8. Experimental output for line segmentation.
  • 12. Word segmentation: Scan the image vertically from top to bottom. Find the distance between the characters. The maximum distance between two characters marks the boundary between words. Character segmentation: For each word, scan from left to right and identify the consecutive OFF and ON pixels. The transitions between OFF and ON pixels segment the word into characters.
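
A minimal sketch of the column scan for one segmented line, under two assumptions that the slides do not state: characters are separated by at least one fully blank column, and a gap of `min_word_gap` or more blank columns counts as a word boundary. The names `ink_runs`, `group_words`, and `min_word_gap` are hypothetical.

```python
from typing import List, Tuple

Image = List[List[int]]  # 1 = ON (ink) pixel, 0 = OFF (background)
Run = Tuple[int, int]    # (first_column, last_column), inclusive

def ink_runs(line: Image) -> List[Run]:
    """Column ranges that contain at least one ON pixel (candidate characters)."""
    width = len(line[0])
    runs, start = [], None
    for c in range(width):
        has_ink = any(row[c] for row in line)
        if has_ink and start is None:
            start = c                    # OFF -> ON transition
        elif not has_ink and start is not None:
            runs.append((start, c - 1))  # ON -> OFF transition
            start = None
    if start is not None:
        runs.append((start, width - 1))
    return runs

def group_words(runs: List[Run], min_word_gap: int) -> List[List[Run]]:
    """Group character runs into words; wide gaps start a new word."""
    words: List[List[Run]] = []
    for run in runs:
        if words and run[0] - words[-1][-1][1] - 1 < min_word_gap:
            words[-1].append(run)        # small gap: same word
        else:
            words.append([run])          # first run or wide gap: new word
    return words

line = [[1, 1, 0, 1, 0, 0, 0, 1, 1, 0, 1]]
runs = ink_runs(line)
print(runs)                   # [(0, 1), (3, 3), (7, 8), (10, 10)]
print(group_words(runs, 3))   # [[(0, 1), (3, 3)], [(7, 8), (10, 10)]]
```

Touching or overlapping characters are not separated by this sketch, since no blank column lies between them.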
  • 13. Figure 9. A line segmented into words. Figure 10. Character segmentation.
  • 14. Figure 6. Flow chart for word identification. Start: segment the words (English, Odia, or bilingual?), then identify the matras. Decision nodes: Are matras present? Is the number of vertical strokes ≤ the number of characters? Is the number of vertical strokes > the number of characters? Are the templates "ra" or "re" matched? Are 's', 'c', 'g', 'x' matched? Is none of the vertical strokes at the beginning of a character, and are the characters at the same level? Outcomes: English, Odia, or bilingual.
  • 15. Proposed identification method for Odia and English scripts. Step 1: Identify the matras. If a matra is present, the word is Odia or bilingual; otherwise, the word is Odia, English, or bilingual. Step 2: Identify the vertical strokes in the word (the vertical stroke feature is obtained by identifying the columns with the maximum number of ON pixels). If the number of vertical strokes is greater than the number of characters in the word, it is an English or a mixed English-Odia word. Step 3: Identify a mixed English-Odia word by noting a matra at the word end or a match for 'ra' or 're'. Step 4: If the number of full vertical strokes is less than the number of characters in the word, and a vertical stroke is present at the beginning of the basic characters or the characters in the word are not at the same level, the word is decided to be English or bilingual. In that situation, if there is a matra or a match for 'ra' or 're' at the word end, the word is decided to be bilingual.
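
The vertical stroke feature of Step 2 can be sketched as below. The slides only say that strokes are the columns with the maximum number of ON pixels; the `min_fraction` cutoff (a column counts as a stroke when at least 80% of its pixels are ON) and the merging of adjacent stroke columns into one stroke are assumptions, as are the names `count_vertical_strokes` and `step2_label`.

```python
from typing import List

Image = List[List[int]]  # 1 = ON (ink) pixel, 0 = OFF (background)

def count_vertical_strokes(word: Image, min_fraction: float = 0.8) -> int:
    """Count runs of adjacent columns that are almost entirely ON."""
    height, width = len(word), len(word[0])
    strokes = 0
    in_stroke = False
    for c in range(width):
        on_pixels = sum(row[c] for row in word)
        is_stroke = on_pixels >= min_fraction * height
        if is_stroke and not in_stroke:
            strokes += 1  # a new stroke starts at this column
        in_stroke = is_stroke
    return strokes

def step2_label(word: Image, n_chars: int) -> str:
    """Apply only the Step 2 comparison; later steps refine the result."""
    if count_vertical_strokes(word) > n_chars:
        return "English or mixed"
    return "undecided"

word = [
    [1, 1, 0, 1, 1],
    [1, 0, 0, 1, 1],
    [1, 0, 0, 1, 1],
    [1, 0, 0, 1, 1],
    [1, 0, 0, 1, 1],
]
print(count_vertical_strokes(word))  # 2
print(step2_label(word, n_chars=1))  # English or mixed
```

A thick stroke spanning several columns counts once here, which keeps the count comparable to the number of characters.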
  • 16. Step 6: For the words found in Step 2 to have fewer full vertical strokes, search for English characters that have no vertical stroke. Template matching using the correlation method is applied to identify these letters. If one such character is present, the corresponding word is decided to be English.
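
A minimal sketch of correlation-based template matching, assuming each segmented character has already been scaled to the template size. The normalized correlation coefficient, the 0.8 acceptance threshold, and the tiny 3x3 templates are illustrative assumptions; the slides do not give the template images or the threshold.

```python
import math
from typing import Dict, List, Optional

Glyph = List[List[int]]  # 1 = ON (ink) pixel, 0 = OFF (background)

def correlation(a: Glyph, b: Glyph) -> float:
    """Normalized correlation coefficient between two same-size glyphs."""
    xs = [p for row in a for p in row]
    ys = [p for row in b for p in row]
    if len(xs) != len(ys):
        raise ValueError("glyph and template must have the same size")
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    num = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    den = math.sqrt(sum((x - mx) ** 2 for x in xs) * sum((y - my) ** 2 for y in ys))
    return num / den if den else 0.0  # flat images carry no shape information

def best_match(glyph: Glyph, templates: Dict[str, Glyph],
               threshold: float = 0.8) -> Optional[str]:
    """Name of the best-correlated template, or None if below threshold."""
    scores = {name: correlation(glyph, t) for name, t in templates.items()}
    name = max(scores, key=scores.get)
    return name if scores[name] >= threshold else None

# Hypothetical toy templates for two letters without a full vertical stroke.
templates = {
    "c": [[1, 1, 1], [1, 0, 0], [1, 1, 1]],
    "x": [[1, 0, 1], [0, 1, 0], [1, 0, 1]],
}
print(best_match([[1, 1, 1], [1, 0, 0], [1, 1, 1]], templates))  # c
print(best_match([[0, 0, 0], [0, 1, 0], [0, 0, 0]], templates))  # None
```

A word would be labelled English under Step 6 as soon as `best_match` returns a name for any of its characters.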
  • 17. Table 1. Number of characters in a word and average number of vertical strokes per word in English script
      Number of characters in a word      1     2     3     4     5     6     7     8     9     10
      Average number of vertical strokes  1     2     6.5   5     7.3   7.4   10.5  10.4  11    14
  • 18. Table 2. Number of characters in a word and average number of vertical strokes per word in Odia script
      Number of characters in a word      1     2     3     4     5     6     7     8     9
      Average number of vertical strokes  0.5   1     1.75  2.5   2.9   2.1   4     2     4
  • 19. Comparing both outputs: Figure 11. Comparison of the vertical strokes for both scripts (Odia script and English script).
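
The values in Table 1 and Table 2 can be compared directly by dividing each average stroke count by the word length. The short check below uses only the numbers from the two tables; the helper name `strokes_per_char` is illustrative.

```python
# Average number of vertical strokes, keyed by characters per word.
english = {1: 1, 2: 2, 3: 6.5, 4: 5, 5: 7.3, 6: 7.4, 7: 10.5, 8: 10.4, 9: 11, 10: 14}
odia = {1: 0.5, 2: 1, 3: 1.75, 4: 2.5, 5: 2.9, 6: 2.1, 7: 4, 8: 2, 9: 4}

def strokes_per_char(table):
    """Average vertical strokes per character for each word length."""
    return {n: strokes / n for n, strokes in table.items()}

english_ratio = strokes_per_char(english)
odia_ratio = strokes_per_char(odia)

print(min(english_ratio.values()))  # 1.0
print(max(odia_ratio.values()))     # 0.625
```

In these averages, English words have at least one vertical stroke per character while Odia words stay below one, which is consistent with comparing the stroke count against the character count in Step 2.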
  • 20. Conclusion and future scope: The performance of the proposed method may be studied in more detail, including the case of varying font sizes. The method may be extended to other numerical values in bilingual scripts. The performance of the proposed method in ambiguous cases, such as 'I' in Roman script and the Odia punctuation mark '-', is to be improved.
  • 27. Flow chart for word identification. Start: segment the words (English, Odia, or bilingual?), then identify the matras. Decision nodes: Are matras present? Is the number of vertical strokes ≤ the number of characters? Are the templates "ra" or "re" matched? Are 's', 'c', 'g', 'x' matched? Is none of the vertical strokes at the beginning of a character, and are the characters at the same level? Outcomes: English, Odia, or bilingual.