This document summarizes research on recognizing Tamil handwritten characters using optical character recognition (OCR) techniques. It discusses various approaches that have been used for pre-processing images, segmenting characters, extracting features, and classifying characters. For pre-processing, techniques like binarization, noise removal, normalization, and skew correction are discussed. For segmentation, methods like projection-based, smearing, and graph-based techniques are mentioned. Feature extraction approaches include statistical, structural, and hybrid methods. Classification is done using techniques such as k-nearest neighbors, neural networks, and support vector machines. The document also provides two tables summarizing accuracy levels achieved by different studies on Tamil character recognition and the limitations of those studies.
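The pre-processing stage summarized above typically starts with binarization. A minimal sketch in plain Python, assuming a fixed global threshold and a dark-ink-on-light-background convention (the surveyed papers mostly use adaptive methods such as Otsu's, so this is an illustrative simplification):

```python
def binarize(gray, threshold=128):
    """Map a grayscale pixel grid to binary: 1 = ink (dark), 0 = background."""
    return [[1 if px < threshold else 0 for px in row] for row in gray]

# Tiny 2x2 example: dark pixels become 1, light pixels become 0.
print(binarize([[0, 200], [255, 100]]))  # [[1, 0], [0, 1]]
```

A real Tamil OCR pipeline would follow this with the noise removal, normalization, and skew correction steps the survey describes.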
Comparative study of two methods for Handwritten Devanagari Numeral Recognition (IOSR Journals)
Abstract: In this paper, two different methods for numeral recognition are proposed and their results are compared. The objective is to provide an efficient and reliable method for the recognition of handwritten numerals. The first method employs a grid-based feature extraction and recognition algorithm: features of the image are extracted using the grid technique, and this feature set is then compared with the feature sets of database images for classification. The second method uses the Image Centroid Zone and Zone Centroid Zone algorithms for feature extraction, and the resulting features are fed to an artificial neural network for recognition of the input image. Machine text recognition is an important research area because of its applications in domains such as banks, post offices, and hospitals.
Keywords: Handwritten Numeral Recognition, Grid Technique, ANN, Feature Extraction, Classification.
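The grid/zoning feature idea used by both methods can be illustrated with a pixel-density-per-zone sketch (plain Python; the 2x2 grid and plain density features are illustrative assumptions, simpler than the actual Image Centroid Zone and Zone Centroid Zone algorithms, which also measure distances from zone centroids):

```python
def zone_densities(img, zones=2):
    """Split a binary image into zones x zones blocks; return ink density per block."""
    h, w = len(img), len(img[0])
    zh, zw = h // zones, w // zones
    feats = []
    for zr in range(zones):
        for zc in range(zones):
            block = [img[r][c]
                     for r in range(zr * zh, (zr + 1) * zh)
                     for c in range(zc * zw, (zc + 1) * zw)]
            feats.append(sum(block) / len(block))
    return feats

img = [[1, 1, 0, 0],
       [1, 1, 0, 0],
       [0, 0, 0, 0],
       [0, 0, 1, 1]]
print(zone_densities(img))  # [1.0, 0.0, 0.0, 0.5]
```

The resulting feature vector is what would be compared against database feature sets (method one) or fed to an ANN (method two).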
Signature recognition using clustering techniques dissertation (Dr. Vinayak Bharadi)
This document summarizes Vinayak Ashok Bharadi's dissertation on signature recognition using clustering techniques. It introduces the topic, outlines the problem definition and steps in signature recognition. It then discusses several preprocessing techniques, feature extraction methods like global features, grid and texture information, vector quantization, Walsh coefficients, and successive geometric centers. The document presents results and concludes by discussing the application of clustering techniques to signature recognition.
Devnagari handwritten numeral recognition using geometric features and statis... (Vikas Dongre)
This paper presents a Devnagari numeral recognition method based on statistical discriminant functions. Seventeen geometric features based on pixel connectivity, lines, line directions, holes, image area, perimeter, eccentricity, solidity, orientation, etc. are used to represent the numerals. Five discriminant functions, viz. linear, quadratic, diaglinear, diagquadratic, and Mahalanobis distance, are used for classification. 1500 handwritten numerals are used for training and another 1500 for testing. Experimental results show that the linear, quadratic, and Mahalanobis discriminant functions provide better results. The outputs of these three discriminants are fed to a majority-voting combination classifier, which is found to offer better results than the individual classifiers.
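The majority-voting combination classifier described above can be sketched directly (plain Python; the label values below are made up for illustration):

```python
from collections import Counter

def majority_vote(per_classifier_labels):
    """Combine several classifiers' predictions by per-sample majority vote.

    per_classifier_labels: one list of predicted labels per classifier.
    """
    return [Counter(sample).most_common(1)[0][0]
            for sample in zip(*per_classifier_labels)]

# Three classifiers (e.g. linear, quadratic, Mahalanobis) voting on two samples.
linear    = [3, 7]
quadratic = [3, 1]
mahal     = [5, 7]
print(majority_vote([linear, quadratic, mahal]))  # [3, 7]
```

With an odd number of voters, at least a plurality always exists; ties fall back to the first label seen, which a production system would want to break more carefully.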
A Review on Geometrical Analysis in Character Recognition (iosrjce)
IOSR Journal of Computer Engineering (IOSR-JCE) is a double blind peer reviewed International Journal that provides rapid publication (within a month) of articles in all areas of computer engineering and its applications. The journal welcomes publications of high quality papers on theoretical developments and practical applications in computer technology. Original research papers, state-of-the-art reviews, and high quality technical notes are invited for publications.
Text-Image Separation in Document Images using Boundary/Perimeter Detection (IDES Editor)
Document analysis plays an important role in office automation, especially in intelligent signal processing. The proposed system consists of two modules: block segmentation and block identification. First, a document is segmented into several non-overlapping blocks using a novel recursive segmentation technique; then the features embedded in each segmented block are extracted. Two kinds of features are extracted: connected components and image boundary/perimeter features. Documents with text inside images posed limitations in earlier reported literature; this is handled here by applying an additional pass of run-length smearing on the extracted image that contains text. The proposed scheme is independent of the type and language of the document.
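The run-length smearing pass mentioned in the abstract can be sketched for a single row (plain Python; the smearing threshold c=3 is an arbitrary illustrative value, where real systems tune it to the expected inter-character gap):

```python
def rlsa_row(row, c=3):
    """Run-Length Smearing: fill gaps of background (0) pixels of length <= c
    between foreground (1) pixels, merging nearby components into one block."""
    out = row[:]
    last_one = None
    for i, v in enumerate(row):
        if v == 1:
            if last_one is not None and 0 < i - last_one - 1 <= c:
                for j in range(last_one + 1, i):
                    out[j] = 1
            last_one = i
    return out

# The 2-pixel gap is smeared shut; the 4-pixel gap (> c) is preserved.
print(rlsa_row([1, 0, 0, 1, 0, 0, 0, 0, 1]))  # [1, 1, 1, 1, 0, 0, 0, 0, 1]
```

Applying this horizontally and vertically and intersecting the results is the classic way RLSA turns scattered glyphs into solid text blocks.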
Successive Geometric Center Based Dynamic Signature Recognition (Dr. Vinayak Bharadi)
The document summarizes research on signature recognition using successive geometric centers, grid, and texture features. It discusses extracting features from dynamic signatures captured using a digitizer tablet. Successive geometric centers are extracted from segmented regions of the signature at different depths. Grid features provide pixel density information across a segmented grid. Texture features capture pressure pattern transitions. The features are evaluated for signature recognition and verification performance based on metrics like true acceptance and rejection rates. The goal is to analyze the proposed method and improve over existing systems.
A Survey of Modern Character Recognition Techniques (ijsrd.com)
This document summarizes several modern techniques for handwritten character recognition. It discusses common feature extraction methods like statistical, structural and global transformation features. It then summarizes several papers that have proposed different techniques for handwritten character recognition, including using associative memory nets, moment invariants with support vector machines, neural networks, hidden markov models, gradient features, and multi-scale neural networks. The document concludes that neural networks are commonly used for training, and that feature extraction methods continue to be improved, but handwritten character recognition remains an active area of research.
Fingerprint image enhancement is the key process in IAFIS systems. In order to reduce the false identification ratio and to supply good fingerprint images to IAFIS systems for exact identification, fingerprint images are generally enhanced. A filtering process tries to filter out noise from the input image and to emphasize the low, high, and directional spatial frequency components of the image. This paper presents an experimental summary of enhancing fingerprint images using Gabor filters. The frequency, width, and window-domain filter ranges are fixed; only the orientation angle is varied, taking the values 0, π/2, π/4, and 3π/4 radians. The experimental results show that the Gabor filter enhances the fingerprint image better than other filtering methods and supports feature extraction.
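A real-valued Gabor kernel at the four orientations the paper varies can be generated as follows (plain Python sketch; the sigma and wavelength values are illustrative assumptions, since the paper fixes its own frequency and width ranges):

```python
import math

def gabor_kernel(size, theta, sigma=2.0, lam=4.0):
    """Real part of a Gabor kernel: Gaussian envelope times an oriented cosine."""
    half = size // 2
    kernel = []
    for y in range(-half, half + 1):
        row = []
        for x in range(-half, half + 1):
            xr = x * math.cos(theta) + y * math.sin(theta)  # rotated coordinate
            row.append(math.exp(-(x * x + y * y) / (2 * sigma ** 2))
                       * math.cos(2 * math.pi * xr / lam))
        kernel.append(row)
    return kernel

# One kernel per orientation tested in the paper: 0, pi/2, pi/4, 3pi/4.
bank = [gabor_kernel(7, t)
        for t in (0.0, math.pi / 2, math.pi / 4, 3 * math.pi / 4)]
```

Convolving the fingerprint image with each kernel in the bank responds most strongly where ridge orientation matches the kernel's angle, which is why orientation is the natural parameter to sweep.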
Devnagari document segmentation using histogram approach (Vikas Dongre)
Document segmentation is one of the critical phases in machine recognition of any language. Correct segmentation of individual symbols decides the accuracy of the character recognition technique. It is used to decompose an image of a sequence of characters into sub-images of individual symbols by segmenting lines and words. Devnagari is the most popular script in India; it is used for writing the Hindi, Marathi, Sanskrit, and Nepali languages, and Hindi is the third most popular language in the world. Devnagari documents consist of vowels, consonants, and various modifiers, so proper segmentation of Devnagari words is challenging. A simple histogram-based approach to segment Devnagari documents is proposed in this paper, and various challenges in the segmentation of Devnagari script are also discussed.
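The histogram (projection-profile) approach the paper proposes can be sketched for line segmentation (plain Python; real Devnagari segmentation must additionally handle the shirorekha header line and modifiers, which this sketch ignores):

```python
def segment_lines(binary):
    """Find text lines as maximal runs of rows whose ink histogram is non-zero."""
    profile = [sum(row) for row in binary]   # horizontal projection histogram
    lines, start = [], None
    for i, count in enumerate(profile):
        if count > 0 and start is None:
            start = i                        # a line begins
        elif count == 0 and start is not None:
            lines.append((start, i - 1))     # the line ended at the previous row
            start = None
    if start is not None:
        lines.append((start, len(profile) - 1))
    return lines

binary = [[0, 0], [1, 0], [1, 1], [0, 0], [0, 1]]
print(segment_lines(binary))  # [(1, 2), (4, 4)]
```

Running the same logic on column sums of each detected line strip yields word and character boundaries.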
A Review of Optical Character Recognition System for Recognition of Printed Text (iosrjce)
Automatic dominant region segmentation for natural images (csandit)
Image segmentation divides an image into homogeneous regions. An efficient semantics-based image retrieval system divides the image into regions separated by color, texture, or both; features are extracted from the segmented regions and annotated automatically, and relevant images are retrieved from the database based on the keywords of the segmented regions. In this paper, automatic image segmentation is proposed to obtain the dominant region of input natural images. Dominant regions are segmented, and the results are recorded in comparison with the JSEG algorithm.
Online signature recognition using sectorization of complex Walsh (Dr. Vinayak Bharadi)
This document discusses online signature recognition using sectorization of the complex Walsh plane. It proposes extracting features from the intermediate transforms of signatures by plotting CAL and SAL functions on the complex Walsh plane. The plane is divided into blocks and mean values in each block are calculated to form feature vectors. Both unimodal and multi-algorithmic techniques are explored. Soft biometric features are also added. The Kekre transform is found to perform best, achieving 98.68% performance index for column density-based vectors. Future work could involve designing better classifiers and generating new hybrid wavelets from different transforms.
This document summarizes and reviews several techniques for image mining, including feature extraction, image clustering, and object recognition algorithms. It discusses color, texture, and edge feature extraction techniques and evaluates their precision and recall. It also describes the block truncation algorithm for image recognition and the cascade feature extraction approach. The key techniques - color moments, block truncation coding, and cascade classifiers - are evaluated based on experimental recall and precision results. Overall, the document provides an overview of different image mining techniques and evaluates their effectiveness.
This document summarizes image indexing and its features. It discusses that image indexing is used to retrieve similar images from a database based on extracted features like color, shape, and texture. Color features can be represented by models like RGB, HSV, and color histograms. Shape features include global properties like roundness and local features like edge segments. Texture is described using statistical, structural, and spectral approaches. Texture feature extraction methods discussed include standard wavelets, Gabor wavelets, and extracting features like entropy and standard deviation. The paper provides an overview of the different features used for image indexing and classification.
International Journal of Engineering Research and Applications (IJERA) is an open access online peer reviewed international journal that publishes research and review articles in the fields of Computer Science, Neural Networks, Electrical Engineering, Software Engineering, Information Technology, Mechanical Engineering, Chemical Engineering, Plastic Engineering, Food Technology, Textile Engineering, Nano Technology & science, Power Electronics, Electronics & Communication Engineering, Computational mathematics, Image processing, Civil Engineering, Structural Engineering, Environmental Engineering, VLSI Testing & Low Power VLSI Design etc.
Cursive Handwriting Recognition System using Feature Extraction and Artif... (IRJET Journal)
The document describes a system for recognizing cursive handwriting using feature extraction and an artificial neural network. It involves preprocessing scanned images, segmenting them into individual characters, extracting features from the characters using a diagonal scanning method, and classifying the characters using a neural network. This approach provides higher recognition accuracy compared to conventional methods. The key steps are preprocessing images, segmenting into characters, extracting 54 features from each character by moving along diagonals in a grid, and training a neural network classifier on the extracted features.
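The diagonal scanning idea can be illustrated on a single zone (plain Python; the described system extracts 54 features over many zones of a grid, while here one 2x2 zone gives 2n-1 = 3 diagonal averages, an illustrative simplification):

```python
def diagonal_features(zone):
    """Average the pixel values along each anti-diagonal of a square zone."""
    n = len(zone)
    feats = []
    for d in range(2 * n - 1):                 # one anti-diagonal per d
        vals = [zone[r][d - r] for r in range(n) if 0 <= d - r < n]
        feats.append(sum(vals) / len(vals))
    return feats

print(diagonal_features([[1, 0],
                         [0, 1]]))  # [1.0, 0.0, 1.0]
```

Concatenating such per-zone averages over the whole character grid produces the fixed-length vector the neural network is trained on.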
Nowadays, character recognition has gained a lot of attention in the field of pattern recognition due to its application in various fields. It is one of the most successful applications of automatic pattern recognition. Research in OCR is popular for its application potential in banks, post offices, office automation, etc. HCR is useful in cheque processing in banks, in almost all kinds of form-processing systems, in handwritten postal address resolution, and more. This paper presents a simple and efficient approach to the implementation of OCR and the translation of scanned images of printed text into machine-encoded text. It makes use of different image analysis phases followed by image detection via pre-processing and post-processing. The paper also describes scanning the entire document (the same as segmentation in our case) and recognizing individual characters from the image irrespective of their position, size, and font style; it deals with recognition of symbols from the English language, which is internationally accepted.
This document presents a new image compression method, a Haar-wavelet-based joint compression method using adaptive fractal image compression (DWT+AFIC). It combines the discrete wavelet transform with an existing adaptive fractal image compression technique to improve the compression ratio and reconstructed image quality compared to previous fractal image compression methods. The document introduces fractal image compression and its limitations, describes the proposed DWT+AFIC method alongside five other compression techniques, and provides simulation results on test images showing that DWT+AFIC achieves higher peak signal-to-noise ratios and compression ratios than the other methods. It concludes that DWT+AFIC decreases encoding time while increasing the compression ratio and maintaining reconstructed image quality.
A Comprehensive Study On Handwritten Character Recognition System (iosrjce)
A comparative study on content based image retrieval methods (IJLT EMAS)
Content-based image retrieval (CBIR) is a method of finding images in a huge image database according to a person's interests. "Content-based" here means that the search involves analyzing the actual content present in the image. As databases of images grow day by day, researchers are searching for better image retrieval techniques that maintain good efficiency. This paper presents the visual features and various ways of retrieving images from a huge image database.
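A minimal version of color-feature-based retrieval is a normalized intensity histogram compared with an L1 distance (plain Python; the bin count and distance metric are illustrative choices, and real CBIR systems use per-channel color histograms plus the shape and texture features the paper surveys):

```python
def histogram(pixels, bins=4):
    """Quantize 0-255 intensities into a normalized histogram (sums to 1)."""
    counts = [0] * bins
    for p in pixels:
        counts[min(p * bins // 256, bins - 1)] += 1
    return [c / len(pixels) for c in counts]

def l1_distance(h1, h2):
    """Smaller distance = more similar images; 0 means identical histograms."""
    return sum(abs(a - b) for a, b in zip(h1, h2))

query = histogram([0, 64, 128, 192])
print(query)                      # [0.25, 0.25, 0.25, 0.25]
print(l1_distance(query, query))  # 0.0
```

Retrieval then amounts to ranking every database image by its histogram's distance from the query histogram.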
In this paper, we introduce a system for automatic recognition of Amazigh characters based on the Random Forest method, applied to unconstrained images captured by mobile phone terminals. After some pre-processing of the image, the text is segmented into lines and then into characters. In the feature extraction stage, the input data is represented as a vector of primitives of the zoning, diagonal, and horizontal types, along with Gabor filters and Zernike moments. These features are linked to pixel densities and are extracted from binary images. In the classification stage, we examine four classification methods with two different classifier types, namely support vector machines (SVM) and the Random Forest method. After verification tests, the learning and recognition system based on the Random Forest shows good performance on a base of 100 sample images.
Recognition of basic kannada characters in scene images using euclidean distance (IAEME Publication)
The document describes a proposed methodology for recognizing basic Kannada characters in scene images using zone-wise horizontal and vertical profile-based features and a Euclidean distance classifier. The methodology involves preprocessing images, extracting 30 features from 15 horizontal and vertical zones, constructing a knowledge base from training images, and recognizing characters in test images by calculating distances between their features and the knowledge base. The method was evaluated on 460 Kannada character images with variations, achieving an average recognition accuracy of 91%.
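The Euclidean distance classification step can be sketched as nearest-mean matching against the knowledge base (plain Python; the 2-D feature vectors and the labels "ka"/"ga" are made-up stand-ins for the paper's 30-dimensional zone profiles):

```python
import math

def euclidean(a, b):
    """Straight-line distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def nearest_class(feature, knowledge_base):
    """Return the label whose stored mean feature vector is closest."""
    return min(knowledge_base,
               key=lambda lbl: euclidean(feature, knowledge_base[lbl]))

# Hypothetical 2-D knowledge base (real vectors would have 30 components).
kb = {"ka": [0.0, 0.0], "ga": [1.0, 1.0]}
print(nearest_class([0.1, 0.2], kb))  # ka
```

This is a 1-nearest-neighbor scheme over class means: cheap to build from training images, with no iterative training at all.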
This document discusses text detection and character recognition from images. It begins with an introduction and then discusses the aims, objectives, motivation and problem statement. It reviews relevant literature on segmentation and recognition techniques. The document then describes the methodology used, including preprocessing, segmentation using vertical projections and connected components, and recognition using pixel counting, projections, template matching, Fourier descriptors and heuristic filters. It presents results from four experiments comparing different segmentation and recognition methods. The discussion analyzes results and limitations. The conclusion finds that segmentation works best with connected components while recognition works best with template matching, Fourier descriptors and heuristic filters.
Content Based Image Retrieval Approach Based on Top-Hat Transform And Modifie... (cscpconf)
In this paper, a robust approach is proposed for content-based image retrieval (CBIR) using texture analysis techniques. The proposed approach includes three main steps. In the first, shape detection is performed based on the Top-Hat transform to detect and crop the object part of the image. The second step is a texture feature representation algorithm using color local binary patterns (CLBP) and local variance features. Finally, to retrieve the images most closely matching the query, the log-likelihood ratio is used. The performance of the proposed approach is evaluated using the Corel and Simplicity image sets and compared with other well-known approaches in terms of precision and recall, showing the superiority of the proposed approach. Low noise sensitivity, rotation invariance, shift invariance, gray-scale invariance, and low computational complexity are some of its other advantages.
BrailleOCR: An Open Source Document to Braille Converter Application (pijush15)
This presentation is about an open-source application, BrailleOCR, that helps convert scanned documents to Braille and thus helps the visually impaired.
What is the use of this application in real life? BrailleOCR is currently the only app that integrates optical character recognition and Braille translation together. This app will eventually help convert a lot of important documents to Braille. The project site for this project is given here:
IJCA Paper: http://www.ijcaonline.org/archives/volume68/number16/11664-7254
Project site: https://code.google.com/p/brailleocr/
The app uses a four-step process. Initially, we have a scanned RGB image. The first step, pre-processing, converts the RGB image to grayscale. The second step performs character recognition using the Tesseract engine. Since recognition may introduce errors, the third step, post-processing, corrects the errors from the previous step. The final and most important step is Braille conversion.
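The first (pre-processing) step, RGB-to-grayscale conversion, can be sketched with the standard luminosity weights (plain Python; BrailleOCR's actual implementation details are not given in the summary, so the 0.299/0.587/0.114 weights are the common ITU-R BT.601 assumption, not a confirmed detail of the app):

```python
def to_grayscale(rgb):
    """Convert an RGB pixel grid to grayscale via the luminosity method."""
    return [[round(0.299 * r + 0.587 * g + 0.114 * b) for (r, g, b) in row]
            for row in rgb]

print(to_grayscale([[(255, 255, 255), (0, 0, 0)]]))  # [[255, 0]]
```

The grayscale image is what gets binarized and handed to the Tesseract recognition step.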
A bidirectional text transcription of braille for odia, hindi, telugu and eng... (eSAT Journals)
This document describes a bidirectional text transcription system for converting Braille documents in Odia, Hindi, Telugu, and English to text and vice versa using image processing techniques on an FPGA. It discusses prior work on Braille recognition and text-to-speech systems. The proposed algorithm segments Braille cells from image inputs and uses a modified database to map dot patterns to letters/words in each language based on number of dots. Letter patterns are rearranged for Hindi, Telugu, and English. The database is tested and dumped onto an FPGA for hardware implementation and bidirectional conversion between Braille and text in multiple languages.
Automatic Logo Extraction From Document Images (IJCI JOURNAL)
Logo extraction plays an important role in logo-based document image retrieval. Here we present a method for automatic logo extraction from document images that works for scanned documents containing a logo. The proposed method uses morphological operations for logo extraction. It supports extraction of a logo with its gray-level and color information.
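A basic morphological operation of the kind the method relies on is binary erosion with a 3x3 structuring element (plain Python sketch; the paper's actual sequence of morphological operations for isolating the logo region is not specified in the summary above):

```python
def erode(img):
    """3x3 binary erosion: a pixel survives only if its whole neighborhood is 1.
    Shrinks foreground regions; paired with dilation it forms opening/closing."""
    h, w = len(img), len(img[0])
    out = [[0] * w for _ in range(h)]
    for r in range(1, h - 1):
        for c in range(1, w - 1):
            if all(img[r + dr][c + dc]
                   for dr in (-1, 0, 1) for dc in (-1, 0, 1)):
                out[r][c] = 1
    return out

# A 4x4 block of ones erodes to its 2x2 interior (4 surviving pixels).
print(sum(map(sum, erode([[1] * 4 for _ in range(4)]))))  # 4
```

Erosion removes thin text strokes while large solid regions such as logos survive, which is why morphology is a natural fit for logo extraction.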
This document provides a comprehensive review of existing works in offline handwritten character recognition. It discusses the three major stages of any character recognition system: preprocessing, feature extraction, and classification. For preprocessing, it describes techniques like binarization, filtering, and morphological operations that are used to improve image quality. For feature extraction, it discusses various methods used to represent characters, including global transformations, statistical representations, and geometrical/topological features. Wavelet transforms are highlighted as a commonly used feature extraction technique. Finally, it provides an overview of literature on methods used in each stage of offline handwritten character recognition systems.
This document provides a review of existing methods for handwritten character recognition based on geometrical properties. It begins by classifying character recognition as either printed or handwritten, and describes the phases a character recognition system typically includes: image acquisition, preprocessing, segmentation, feature extraction, and classification. Preprocessing steps like binarization, noise removal, normalization, and morphological operations are discussed. Feature extraction methods focused on include statistical, global, and structural features; geometrical features involving lines, loops, strokes, and their directions are highlighted. Classification algorithms mentioned are neural networks, SVM, k-nearest neighbor, and genetic algorithms. The literature review provides examples of character recognition research using geometrical features like horizontal/vertical line analysis and directional features.
Improvement of the Recognition Rate by Random Forest (IJERA Editor)
In this paper, we introduce a system for automatic recognition of characters based on the Random Forest method, applied to unconstrained images captured by mobile phone terminals. After some pre-processing of the image, the text is segmented into lines and then into characters. In the feature extraction stage, the input data is represented as a vector of primitives of the zoning, diagonal, and horizontal types, along with Zernike moments. These features are linked to pixel densities and are extracted from binary images. In the classification stage, we examine four classification methods with two different classifier types, namely the multi-layer perceptron (MLP) and the Random Forest method. After verification tests, the learning and recognition system based on the Random Forest shows good performance on a base of 100 sample images.
Devnagari document segmentation using histogram approachVikas Dongre
Document segmentation is one of the critical phases in machine recognition of any language. Correct
segmentation of individual symbols decides the accuracy of character recognition technique. It is used to
decompose image of a sequence of characters into sub images of individual symbols by segmenting lines and
words. Devnagari is the most popular script in India. It is used for writing Hindi, Marathi, Sanskrit and
Nepali languages. Moreover, Hindi is the third most popular language in the world. Devnagari documents
consist of vowels, consonants and various modifiers. Hence proper segmentation of Devnagari word is
challenging. A simple histogram based approach to segment Devnagari documents is proposed in this paper.
Various challenges in segmentation of Devnagari script are also discussed.
A Review of Optical Character Recognition System for Recognition of Printed Textiosrjce
IOSR Journal of Computer Engineering (IOSR-JCE) is a double blind peer reviewed International Journal that provides rapid publication (within a month) of articles in all areas of computer engineering and its applications. The journal welcomes publications of high quality papers on theoretical developments and practical applications in computer technology. Original research papers, state-of-the-art reviews, and high quality technical notes are invited for publications.
Automatic dominant region segmentation for natural imagescsandit
Image segmentation divides an image into different homogeneous regions. An efficient semantic-based image retrieval system divides the image into different regions separated by color or texture, sometimes both. Features are extracted from the segmented regions and annotated automatically, and relevant images are retrieved from the database based on the keywords of the segmented region. In this paper, automatic image segmentation is proposed to obtain the dominant region of input natural images. Dominant regions are segmented, and the results are recorded in comparison with the JSEG algorithm.
Online signature recognition using sectorization of complex walshDr. Vinayak Bharadi
This document discusses online signature recognition using sectorization of the complex Walsh plane. It proposes extracting features from the intermediate transforms of signatures by plotting CAL and SAL functions on the complex Walsh plane. The plane is divided into blocks and mean values in each block are calculated to form feature vectors. Both unimodal and multi-algorithmic techniques are explored. Soft biometric features are also added. The Kekre transform is found to perform best, achieving 98.68% performance index for column density-based vectors. Future work could involve designing better classifiers and generating new hybrid wavelets from different transforms.
This document summarizes and reviews several techniques for image mining, including feature extraction, image clustering, and object recognition algorithms. It discusses color, texture, and edge feature extraction techniques and evaluates their precision and recall. It also describes the block truncation algorithm for image recognition and the cascade feature extraction approach. The key techniques - color moments, block truncation coding, and cascade classifiers - are evaluated based on experimental recall and precision results. Overall, the document provides an overview of different image mining techniques and evaluates their effectiveness.
This document summarizes image indexing and its features. It discusses that image indexing is used to retrieve similar images from a database based on extracted features like color, shape, and texture. Color features can be represented by models like RGB, HSV, and color histograms. Shape features include global properties like roundness and local features like edge segments. Texture is described using statistical, structural, and spectral approaches. Texture feature extraction methods discussed include standard wavelets, Gabor wavelets, and extracting features like entropy and standard deviation. The paper provides an overview of the different features used for image indexing and classification.
International Journal of Engineering Research and Applications (IJERA) is an open access online peer reviewed international journal that publishes research and review articles in the fields of Computer Science, Neural Networks, Electrical Engineering, Software Engineering, Information Technology, Mechanical Engineering, Chemical Engineering, Plastic Engineering, Food Technology, Textile Engineering, Nano Technology & science, Power Electronics, Electronics & Communication Engineering, Computational mathematics, Image processing, Civil Engineering, Structural Engineering, Environmental Engineering, VLSI Testing & Low Power VLSI Design etc.
Cursive Handwriting Recognition System using Feature Extraction and Artif...IRJET Journal
The document describes a system for recognizing cursive handwriting using feature extraction and an artificial neural network. It involves preprocessing scanned images, segmenting them into individual characters, extracting features from the characters using a diagonal scanning method, and classifying the characters using a neural network. This approach provides higher recognition accuracy compared to conventional methods. The key steps are preprocessing images, segmenting into characters, extracting 54 features from each character by moving along diagonals in a grid, and training a neural network classifier on the extracted features.
Nowadays character recognition has gained a lot of attention in the field of pattern recognition due to its applications in various fields. It is one of the most successful applications of automatic pattern recognition. Research in OCR is popular for its application potential in banks, post offices, office automation, etc. Handwritten character recognition (HCR) is useful in cheque processing in banks, almost all kinds of form processing systems, handwritten postal address resolution and many more. This paper presents a simple and efficient approach to the implementation of OCR and the translation of scanned images of printed text into machine-encoded text. It makes use of different image analysis phases followed by image detection via pre-processing and post-processing. This paper also describes scanning the entire document (the same as segmentation in our case) and recognizing individual characters from the image irrespective of their position, size and font style, and it deals with recognition of symbols from the English language, which is internationally accepted.
This document presents a new method for image compression called Haar Wavelet Based Joint Compression Method Using Adaptive Fractal Image Compression (DWT+AFIC). It combines discrete wavelet transform with an existing adaptive fractal image compression technique to improve compression ratio and reconstructed image quality compared to previous fractal image compression methods. The document introduces fractal image compression and its limitations, describes the proposed DWT+AFIC method and 5 other compression techniques, provides simulation results on test images showing DWT+AFIC achieves higher peak signal to noise ratios and compression ratios than other methods, and concludes DWT+AFIC decreases encoding time while increasing compression ratio and maintaining reconstructed image quality.
A Comprehensive Study On Handwritten Character Recognition Systemiosrjce
A comparative study on content based image retrieval methodsIJLT EMAS
Content-based image retrieval (CBIR) is a method of finding images in a huge image database according to a person's interests. Content-based here means that the search involves analysing the actual content present in the image. As databases of images grow day by day, researchers are searching for better retrieval techniques that maintain good efficiency. This paper presents the visual features and various ways of retrieving images from a huge image database.
In this paper we introduce a system for automatic recognition of Amazigh characters, based on the Random Forest method, in unconstrained images captured by mobile phone terminals. After some preprocessing of the image, the text is segmented into lines and then into characters. In the feature extraction stage, the input data are represented as a vector of primitives of the zoning, diagonal, and horizontal types, together with Gabor filters and Zernike moments. These features are related to pixel densities and are extracted from binary images. In the classification stage, we examine four classification methods with two different classifier types, namely support vector machines (SVM) and the Random Forest method. In validation tests, the learning and recognition system based on the Random Forest showed good performance on a base of 100 image samples.
Recognition of basic kannada characters in scene images using euclidean disIAEME Publication
The document describes a proposed methodology for recognizing basic Kannada characters in scene images using zone-wise horizontal and vertical profile-based features and an Euclidean distance classifier. The methodology involves preprocessing images, extracting 30 features from 15 horizontal and vertical zones, constructing a knowledge base from training images, and recognizing characters in test images by calculating distances between their features and the knowledge base. The method was evaluated on 460 Kannada character images with variations, achieving an average recognition accuracy of 91%.
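As a rough illustration of the pipeline above, the sketch below reduces each zone to a single density value (a simplification of the 30 profile-based features described in the paper) and classifies with a nearest-Euclidean-distance rule; the zone layout and feature definition here are illustrative assumptions, not the paper's exact ones.

```python
import math

def zone_features(binary, n_zones):
    """Split the rows into n_zones horizontal bands and use each band's
    foreground-pixel density as one feature (a simplification of the
    paper's horizontal/vertical profile features)."""
    band = len(binary) // n_zones
    width = len(binary[0])
    feats = []
    for z in range(n_zones):
        zone = binary[z * band:(z + 1) * band]
        pixels = sum(sum(row) for row in zone)
        feats.append(pixels / (band * width))
    return feats

def nearest(features, knowledge_base):
    """Return the label whose stored feature vector is closest to the
    query features in Euclidean distance."""
    return min(
        knowledge_base,
        key=lambda label: math.dist(features, knowledge_base[label]),
    )

# Hypothetical two-feature knowledge base for two characters.
kb = {"ka": [0.8, 0.2], "pa": [0.1, 0.9]}
print(nearest([0.7, 0.3], kb))  # ka
```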
This document discusses text detection and character recognition from images. It begins with an introduction and then discusses the aims, objectives, motivation and problem statement. It reviews relevant literature on segmentation and recognition techniques. The document then describes the methodology used, including preprocessing, segmentation using vertical projections and connected components, and recognition using pixel counting, projections, template matching, Fourier descriptors and heuristic filters. It presents results from four experiments comparing different segmentation and recognition methods. The discussion analyzes results and limitations. The conclusion finds that segmentation works best with connected components while recognition works best with template matching, Fourier descriptors and heuristic filters.
Content Based Image Retrieval Approach Based on Top-Hat Transform And Modifie...cscpconf
In this paper a robust approach is proposed for content-based image retrieval (CBIR) using texture analysis techniques. The proposed approach includes three main steps. In the first, shape detection based on the Top-Hat transform detects and crops the object part of the image. The second step includes a texture feature representation algorithm using color local binary patterns (CLBP) and local variance features. Finally, to retrieve the images most closely matching the query, the log-likelihood ratio is used. The performance of the proposed approach is evaluated on the Corel and Simplicity image sets and compared with some other well-known approaches in terms of precision and recall, which shows the superiority of the proposed approach. Low noise sensitivity, rotation invariance, shift invariance, gray-scale invariance and low computational complexity are some of its other advantages.
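CLBP builds on the plain local binary pattern operator. A minimal grayscale LBP for one pixel is sketched below (the color extension and the local variance features used in the paper are omitted): each of the 8 neighbours is thresholded against the centre pixel and the resulting bits are packed into one code.

```python
def lbp_code(img, y, x):
    """Basic 8-neighbour local binary pattern: threshold each neighbour
    against the centre pixel and pack the bits, starting at the
    top-left neighbour and going clockwise."""
    c = img[y][x]
    neighbours = [
        img[y - 1][x - 1], img[y - 1][x], img[y - 1][x + 1],
        img[y][x + 1],     img[y + 1][x + 1], img[y + 1][x],
        img[y + 1][x - 1], img[y][x - 1],
    ]
    code = 0
    for bit, n in enumerate(neighbours):
        if n >= c:
            code |= 1 << bit
    return code

patch = [
    [5, 9, 1],
    [4, 6, 7],
    [2, 3, 8],
]
print(lbp_code(patch, 1, 1))  # 26 (bits set for the neighbours 9, 7, 8)
```

A histogram of these codes over a region is the usual LBP texture descriptor.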
BrailleOCR: An Open Source Document to Braille Converter Applicationpijush15
This presentation is actually about an Open Source application, BrailleOCR that helps to convert scanned documents to Braille and thus helps the Visually Impaired.
What is the use of this application in real life? Well, BrailleOCR is currently the only app that integrates Optical Character Recognition and Braille translation. This app will eventually help convert a lot of important documents to Braille. The project site for this project is given here
IJCA Paper: http://www.ijcaonline.org/archives/volume68/number16/11664-7254
Project site: https://code.google.com/p/brailleocr/
The app uses a four-step process. Initially, we have a scanned image, which is an RGB image. The first, pre-processing step converts the RGB image to grayscale. The second step performs character recognition using the Tesseract engine. The recognition step may have errors, so the third, post-processing step corrects them. The final and most important step is the Braille conversion step.
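The pre-processing step can be sketched as a plain luminance-weighted RGB-to-grayscale conversion, as below; the recognition itself is handled by the Tesseract engine and is not reimplemented here. The BT.601 weights are a common choice, not necessarily the ones BrailleOCR uses.

```python
def to_grayscale(rgb):
    """Convert an RGB image (nested lists of (r, g, b) tuples) to
    grayscale using the ITU-R BT.601 luminance weights."""
    return [
        [round(0.299 * r + 0.587 * g + 0.114 * b) for (r, g, b) in row]
        for row in rgb
    ]

img = [[(255, 0, 0), (0, 255, 0)], [(0, 0, 255), (255, 255, 255)]]
print(to_grayscale(img))  # [[76, 150], [29, 255]]
```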
A bidirectional text transcription of braille for odia, hindi, telugu and eng...eSAT Journals
This document describes a bidirectional text transcription system for converting Braille documents in Odia, Hindi, Telugu, and English to text and vice versa using image processing techniques on an FPGA. It discusses prior work on Braille recognition and text-to-speech systems. The proposed algorithm segments Braille cells from image inputs and uses a modified database to map dot patterns to letters/words in each language based on number of dots. Letter patterns are rearranged for Hindi, Telugu, and English. The database is tested and dumped onto an FPGA for hardware implementation and bidirectional conversion between Braille and text in multiple languages.
Automatic Logo Extraction from Document ImagesIJCI JOURNAL
Logo extraction plays an important role in logo-based document image retrieval. Here we present a method for automatic logo extraction from document images that works for scanned documents containing a logo. The proposed method uses morphological operations for logo extraction. It supports extraction of a logo with its gray-level and color information.
This document provides a comprehensive review of existing works in offline handwritten character recognition. It discusses the three major stages of any character recognition system: preprocessing, feature extraction, and classification. For preprocessing, it describes techniques like binarization, filtering, and morphological operations that are used to improve image quality. For feature extraction, it discusses various methods used to represent characters, including global transformations, statistical representations, and geometrical/topological features. Wavelet transforms are highlighted as a commonly used feature extraction technique. Finally, it provides an overview of literature on methods used in each stage of offline handwritten character recognition systems.
This document provides a review of existing methods for handwritten character recognition based on geometrical properties. It begins by classifying character recognition as either printed or handwritten, and describes the different phases a character recognition system typically includes: image acquisition, preprocessing, segmentation, feature extraction, and classification. Preprocessing steps like binarization, noise removal, normalization and morphological operations are discussed. Feature extraction methods focused on include statistical, global and structural features. Geometrical features involving lines, loops, strokes and their directions are highlighted. Classification algorithms mentioned are neural networks, SVM, k-nearest neighbor, and genetic algorithms. The literature review provides examples of character recognition research using geometrical features like horizontal/vertical line analysis and directional feature
Improvement of the Recognition Rate by Random ForestIJERA Editor
In this paper we introduce a system for automatic character recognition, based on the Random Forest method, in unconstrained images captured by mobile phone terminals. After some preprocessing of the image, the text is segmented into lines and then into characters. In the feature extraction stage, the input data are represented as a vector of primitives of the zoning, diagonal, and horizontal types, together with Zernike moments. These features are related to pixel densities and are extracted from binary images. In the classification stage, we examine four classification methods with two different classifier types, namely the multi-layer perceptron (MLP) and the Random Forest method. In validation tests, the learning and recognition system based on the Random Forest showed good performance on a base of 100 image samples.
Improvement of the recognition rate by random forestYoussef Rachidi
In this paper we introduce a system for automatic character recognition, based on the Random Forest method, in unconstrained images captured by mobile phone terminals. After some preprocessing of the image, the text is segmented into lines and then into characters. In the feature extraction stage, the input data are represented as a vector of primitives of the zoning, diagonal, and horizontal types, together with Zernike moments. These features are related to pixel densities and are extracted from binary images. In the classification stage, we examine four classification methods with two different classifier types, namely the multi-layer perceptron (MLP) and the Random Forest method. In validation tests, the learning and recognition system based on the Random Forest showed good performance on a base of 100 image samples.
This document provides a comprehensive review of offline handwritten character recognition. It discusses the various phases of a character recognition system, including image acquisition, preprocessing, segmentation, feature extraction, and classification. Preprocessing techniques like noise removal, binarization, and size normalization are described. Common feature extraction methods like statistical, transform, and structural features are also summarized. Finally, the document analyzes different classification approaches used in character recognition, such as neural networks, support vector machines, and multiple classifier methods.
Handwritten Character Recognition: A Comprehensive Review on Geometrical Anal...iosrjce
The document discusses four different methods for Bangla handwritten digit recognition. Method 1 uses preprocessing techniques like binarization, noise reduction, and segmentation followed by feature extraction and classification with a CNN. It achieves 94% accuracy. Method 2 also uses a CNN called MathNET with data augmentation, achieving 97% accuracy. Method 3 uses preprocessing, HOG feature extraction, and an SVM classifier, achieving 97.08% accuracy. Method 4 develops a dataset, performs data augmentation, uses a multi-layer CNN model with ensembling, and achieves 96.788% accuracy even on noisy images. The methods demonstrate high and improving recognition accuracy for Bangla handwritten digits.
Feature Extraction and Feature Selection using Textual Analysisvivatechijri
After pre-processing the images in character recognition systems, the images are segmented based on certain characteristics known as “features”. The feature space identified for character recognition, however, ranges across a huge dimensionality. To solve this problem of dimensionality, feature selection and feature extraction methods are used. In this paper we discuss the different techniques for feature extraction and feature selection, and how these techniques are used to reduce the dimensionality of the feature space to improve the performance of text categorization.
Segmentation and recognition of handwritten digit numeral string using a mult...ijfcstjournal
In this paper, the use of a Multi-Layer Perceptron (MLP) neural network model is proposed for recognizing unconstrained offline handwritten numeral strings. The numeral strings are segmented and isolated numerals are obtained using a connected component labeling (CCL) algorithm. The structural part of the models has been modeled using a multilayer perceptron neural network. This paper also presents a new technique to remove slope and slant from handwritten numeral strings, to normalize the size of text images, and to classify with supervised learning methods. Experimental results on a database of 102 numeral string patterns written by 3 different people show that a recognition rate of 99.7% is obtained on independent digits contained in the numeral strings, including both skewed and slanted data.
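The CCL step described above can be sketched with a breadth-first flood fill; this minimal version uses 4-connectivity and is an illustration, not the paper's exact algorithm.

```python
from collections import deque

def label_components(binary):
    """4-connected component labelling by breadth-first flood fill.
    Returns a label image (0 = background) and the component count."""
    h, w = len(binary), len(binary[0])
    labels = [[0] * w for _ in range(h)]
    current = 0
    for sy in range(h):
        for sx in range(w):
            if binary[sy][sx] and not labels[sy][sx]:
                current += 1  # found an unlabelled foreground pixel
                queue = deque([(sy, sx)])
                labels[sy][sx] = current
                while queue:
                    y, x = queue.popleft()
                    for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                        if (0 <= ny < h and 0 <= nx < w
                                and binary[ny][nx] and not labels[ny][nx]):
                            labels[ny][nx] = current
                            queue.append((ny, nx))
    return labels, current

# Two separate digit-like blobs in one "numeral string" image.
img = [
    [1, 1, 0, 1],
    [0, 1, 0, 1],
    [0, 0, 0, 0],
]
labels, count = label_components(img)
print(count)  # 2
```

Each label then yields a bounding box containing one isolated numeral.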
A Survey On Thresholding Operators of Text Extraction In VideosCSCJournals
The document describes a technique for text detection in video frames that uses morphological operations. It involves four phases: 1) image enhancement to improve edge detection, 2) edge extraction using morphological operations, 3) labeling connected text regions, and 4) extracting text using Hough transforms. The technique is evaluated using quantitative metrics like boundary precision-recall and volume precision-recall to measure the accuracy of detected text boundaries and regions compared to ground truth. Experimental results show the technique can accurately detect and extract text from video frames.
A Survey On Thresholding Operators of Text Extraction In VideosCSCJournals
Video indexing is an important problem that has interested the visual information communities in image processing. The detection and extraction of scene and caption text from unconstrained, general-purpose video is an important research problem in the context of content-based retrieval and summarization. In this paper, a technique is presented for detecting text in video frames. Finding the textual contents in images is a challenging and promising research area in information technology; consequently, text detection and recognition in multimedia has become one of the most important fields in computer vision, due to its valuable uses in a variety of recent technical applications. The work in this paper uses morphological operations to extract text appearing in video frames, with preprocessing to differentiate text from background where the two are highly similar. Experimental results show that the resulting image contains only text. The evaluation criteria are applied to this result and to the images obtained with different operators.
Hangul Recognition Using Support Vector MachineEditor IJCATR
The recognition of Hangul images is more difficult than that of Latin script: Hangul is arranged in two dimensions while Latin runs only from left to right, so it must be recognized from the structural arrangement. The current research creates a system to convert Hangul images into Latin text, to be used as learning material for reading Hangul. In general, the image recognition system is divided into three steps. The first step is preprocessing, which includes binarization, segmentation through a connected-component labeling method, and thinning with Zhang-Suen to reduce pattern information. The second is extracting the features of every single image, identified through the chain code method. The third is recognition using a Support Vector Machine (SVM) with several kernels. The system works on letter images and Hangul word recognition. It covers 34 letters, each of which has 15 different patterns; the 510 patterns are divided into 3 data scenarios. The highest result achieved is 94.7%, using the SVM polynomial and radial basis function kernels. The recognition result is influenced by the amount of training data. The Hangul word recognition process applies to type-2 Hangul words with 6 different patterns, where the difference between patterns comes from the change of font type. The fonts chosen for training data are Batang, Dotum, Gaeul, Gulim and Malgun Gothic; Arial Unicode MS is used to test the data. The lowest accuracy, 69%, is achieved with the SVM radial basis function kernel, while the linear and polynomial kernels both give 72%.
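A minimal sketch of the chain code idea, assuming the boundary pixels have already been ordered (for example by contour tracing after thinning): each step between consecutive pixels is encoded as one of Freeman's 8 direction codes.

```python
# 8-direction Freeman codes: 0 = east, then counter-clockwise,
# with y growing downwards in image coordinates.
DIRECTIONS = {
    (0, 1): 0, (-1, 1): 1, (-1, 0): 2, (-1, -1): 3,
    (0, -1): 4, (1, -1): 5, (1, 0): 6, (1, 1): 7,
}

def chain_code(boundary):
    """Encode an ordered list of boundary pixels (y, x) as Freeman
    chain codes between consecutive pixels."""
    codes = []
    for (y0, x0), (y1, x1) in zip(boundary, boundary[1:]):
        codes.append(DIRECTIONS[(y1 - y0, x1 - x0)])
    return codes

# A small L-shaped stroke traced downward, then to the right.
stroke = [(0, 0), (1, 0), (2, 0), (2, 1), (2, 2)]
print(chain_code(stroke))  # [6, 6, 0, 0]
```

The resulting code sequence (or a histogram of it) serves as the shape feature fed to the SVM.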
An effective approach to offline arabic handwriting recognitionijaia
Segmentation is the most challenging part of the Arabic handwriting recognition, due to the unique
characteristics of Arabic writing that allow the same shape to denote different characters. In this paper,
an off-line Arabic handwriting recognition system is proposed. The processing details are presented in
three main stages. Firstly, the image is skeletonized to one pixel thin. Secondly, transfer each diagonally
connected foreground pixel to the closest horizontal or vertical line. Finally, these orthogonal lines are
coded as vectors of unique integer numbers; each vector represents one letter of the word. In order to
evaluate the proposed techniques, the system has been tested on the IFN/ENIT database, and the
experimental results show that our method is superior to those methods currently available.
Artificial Neural Network For Recognition Of Handwritten Devanagari CharacterIOSR Journals
1) The document discusses recognizing handwritten Devanagari characters using artificial neural networks and zone-based feature extraction.
2) It proposes extracting features from images by dividing them into zones and calculating average pixel distances to the image and zone centroids.
3) This zone-based feature vector is then input to a feedforward neural network for character recognition.
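The Image Centroid Zone (ICZ) idea can be sketched as below; the Zone Centroid Zone (ZCZ) variant is analogous, measuring distances to each zone's own centroid instead. The zone grid and normalization here are illustrative assumptions, not the paper's exact parameters.

```python
import math

def centroid(points):
    """Mean (y, x) position of a list of pixel coordinates."""
    ys = sum(y for y, _ in points) / len(points)
    xs = sum(x for _, x in points) / len(points)
    return ys, xs

def icz_features(binary, n_rows, n_cols):
    """Image Centroid Zone features: for each zone, the average distance
    of its foreground pixels to the centroid of the whole image."""
    pixels = [(y, x) for y, row in enumerate(binary)
              for x, v in enumerate(row) if v]
    cy, cx = centroid(pixels)
    zh = len(binary) // n_rows
    zw = len(binary[0]) // n_cols
    feats = []
    for zy in range(n_rows):
        for zx in range(n_cols):
            zone = [(y, x) for (y, x) in pixels
                    if zy * zh <= y < (zy + 1) * zh
                    and zx * zw <= x < (zx + 1) * zw]
            if zone:
                feats.append(sum(math.dist((y, x), (cy, cx))
                                 for y, x in zone) / len(zone))
            else:
                feats.append(0.0)  # empty zone contributes zero
    return feats

feats = icz_features([[1, 1], [1, 1]], 2, 2)
print([round(f, 3) for f in feats])  # [0.707, 0.707, 0.707, 0.707]
```

The concatenated ICZ and ZCZ vectors are what the feedforward network receives as input.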
https://jst.org.in/index.html
Our journal fosters a sense of community among researchers, scholars, and academics. It provides a space for intellectual exchange, allowing scholars to engage with each other's work, provide constructive feedback, and build upon existing knowledge.
Optical Character Recognition (OCR) based RetrievalBiniam Asnake
The document outlines research works on optical character recognition (OCR) systems, including both global and local (Amharic language) research. It discusses several local studies from 1997-2011 focused on developing OCR for printed, typewritten and handwritten Amharic text. The studies explored various preprocessing, segmentation, recognition algorithms and achieved recognition accuracy rates ranging from 15-99% depending on the type of Amharic text and techniques used. Future research directions included improving techniques for formatted text, different font styles and improving accuracy.
This document summarizes a research paper that reviews different methods for scene text detection and the challenges associated with it. The paper begins with an introduction that describes the overall process of automated scene text detection systems. It then provides a literature review of various text detection methods proposed in previous research, which can be categorized as connected component based methods or texture based methods. Some example methods are described. The paper discusses challenges in scene text detection, such as variable imaging conditions, complex backgrounds, and a wide range of text sizes and fonts. Finally, it discusses performance metrics like precision, recall, and f-measure that are used to evaluate scene text detection methods based on a standard dataset.
Enhancement and Segmentation of Historical Recordscsandit
Document Analysis and Recognition (DAR) aims to extract automatically the information in a document and also to aid human comprehension. The automatic processing of degraded historical documents is an application of the document image analysis field that is confronted with many difficulties due to storage conditions and the complexity of the script. The main interest of enhancement of historical documents is to remove undesirable artifacts that appear in the background and to highlight the foreground, so as to enable automatic recognition of documents with high accuracy. This paper addresses pre-processing and segmentation of ancient scripts as an initial step towards automating the task of an epigraphist in reading and deciphering inscriptions. Pre-processing involves enhancement of degraded ancient document images through four different spatial filtering methods for smoothing or sharpening, namely Median, Gaussian blur, Mean and Bilateral filters, with different mask sizes. This is followed by binarization of the enhanced image to highlight the foreground information, using the Otsu thresholding algorithm. In the second phase, segmentation is carried out using Drop Fall and Water Reservoir approaches to obtain sampled characters, which can be used in later stages of OCR. The system showed good results when tested on nearly 150 samples of degraded epigraphic images of varying quality, giving the best enhanced output with a 4x4 mask for the Median filter, a 2x2 mask for Gaussian blur, and 4x4 masks for the Mean and Bilateral filters. The system can effectively sample characters from enhanced images, giving a segmentation rate of 85%-90% for both the Drop Fall and Water Reservoir techniques.
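The Otsu binarization step mentioned above can be sketched as an exhaustive search for the threshold that maximizes the between-class variance of the gray-level histogram:

```python
def otsu_threshold(gray):
    """Otsu's method: choose the threshold maximising the between-class
    variance over the grey-level histogram (values assumed in 0..255)."""
    hist = [0] * 256
    for row in gray:
        for v in row:
            hist[v] += 1
    total = sum(hist)
    total_sum = sum(i * hist[i] for i in range(256))
    best_t, best_var = 0, -1.0
    w_bg = sum_bg = 0
    for t in range(256):
        w_bg += hist[t]            # background weight so far
        if w_bg == 0:
            continue
        w_fg = total - w_bg        # foreground weight
        if w_fg == 0:
            break
        sum_bg += t * hist[t]
        mean_bg = sum_bg / w_bg
        mean_fg = (total_sum - sum_bg) / w_fg
        var = w_bg * w_fg * (mean_bg - mean_fg) ** 2
        if var > best_var:
            best_var, best_t = var, t
    return best_t

# Bimodal image: dark strokes (~10) on a light background (~200).
img = [[10, 10, 200, 200], [10, 200, 200, 200]]
print(otsu_threshold(img))  # 10
```

Pixels at or below the returned threshold are foreground here, since inscriptions are darker than the background.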
Design and Description of Feature Extraction Algorithm for Old English FontIRJET Journal
This document describes a proposed feature extraction algorithm for recognizing Old English font characters. It consists of four main stages: data collection, preprocessing, feature extraction, and recognition using a minimum distance classifier. For preprocessing, binarization and noise removal are performed. The feature extraction algorithm extracts 20 features from each character image by dividing it into 16x16 zones and identifying black pixels in each zone. Classification is done using a minimum distance classifier to match features to class means. The method achieved a 79% recognition rate on the test data.
Similar to A Survey on Tamil Handwritten Character Recognition using OCR Techniques
ANALYSIS OF LAND SURFACE DEFORMATION GRADIENT BY DINSAR cscpconf
The progressive development of Synthetic Aperture Radar (SAR) systems has diversified the exploitation of the images generated by these systems in different geoscience applications. Detection and monitoring of surface deformations, produced by various phenomena, have benefited from this evolution and have been realized with interferometry (InSAR) and differential interferometry (DInSAR) techniques. Nevertheless, spatial and temporal decorrelation of the interferometric pairs used strongly limits the precision of the analysis results of these techniques. In this context, we propose in this work a methodological approach for detecting and analysing surface deformation with differential interferograms, to show the limits of this technique according to noise quality and level. The detectability model is generated from deformation signatures, by simulating a linear fault merged into image pairs from the ERS1/ERS2 sensors acquired over a region of the Algerian south.
4D AUTOMATIC LIP-READING FOR SPEAKER'S FACE IDENTIFCATIONcscpconf
A novel trajectory-guided, concatenative approach for synthesizing high-quality video renderings from real image samples is proposed. The automated lip-reading seeks, in a library of real image samples, the sequence closest to the HMM-predicted trajectory under the video data. The object trajectory is obtained by projecting the face patterns into a KDA feature space. The approach identifies a speaker's face by synthesizing the identity surface of a subject's face from a small sample of patterns that sparsely cover the view sphere. A KDA algorithm is used to discriminate the lip-reading images; the low-dimensional fundamental lip feature vector is then reduced using the 2D-DCT. The dimensionality of the mouth area set is reduced with PCA to obtain the eigen-lips approach proposed by [33]. The subjective performance results of the cost function under the automatic lip-reading model did not illustrate the superior performance of the method.
MOVING FROM WATERFALL TO AGILE PROCESS IN SOFTWARE ENGINEERING CAPSTONE PROJE...cscpconf
Universities offer software engineering capstone course to simulate a real world-working environment in which students can work in a team for a fixed period to deliver a quality product. The objective of the paper is to report on our experience in moving from Waterfall process to Agile process in conducting the software engineering capstone project. We present the capstone course designs for both Waterfall driven and Agile driven methodologies that highlight the structure, deliverables and assessment plans.To evaluate the improvement, we conducted a survey for two different sections taught by two different instructors to evaluate students’ experience in moving from traditional Waterfall model to Agile like process. Twentyeight students filled the survey. The survey consisted of eight multiple-choice questions and an open-ended question to collect feedback from students. The survey results show that students were able to attain hands one experience, which simulate a real world-working environment. The results also show that the Agile approach helped students to have overall better design and avoid mistakes they have made in the initial design completed in of the first phase of the capstone project. In addition, they were able to decide on their team capabilities, training needs and thus learn the required technologies earlier which is reflected on the final product quality
PROMOTING STUDENT ENGAGEMENT USING SOCIAL MEDIA TECHNOLOGIEScscpconf
This document discusses using social media technologies to promote student engagement in a software project management course. It describes the course and objectives of enhancing communication. It discusses using Facebook for 4 years, then switching to WhatsApp based on student feedback, and finally introducing Slack to enable personalized team communication. Surveys found students engaged and satisfied with all three tools, though less familiar with Slack. The conclusion is that social media promotes engagement but familiarity with the tool also impacts satisfaction.
A SURVEY ON QUESTION ANSWERING SYSTEMS: THE ADVANCES OF FUZZY LOGICcscpconf
In real world computing environment with using a computer to answer questions has been a human dream since the beginning of the digital era, Question-answering systems are referred to as intelligent systems, that can be used to provide responses for the questions being asked by the user based on certain facts or rules stored in the knowledge base it can generate answers of questions asked in natural , and the first main idea of fuzzy logic was to working on the problem of computer understanding of natural language, so this survey paper provides an overview on what Question-Answering is and its system architecture and the possible relationship and
different with fuzzy logic, as well as the previous related research with respect to approaches that were followed. At the end, the survey provides an analytical discussion of the proposed QA models, along or combined with fuzzy logic and their main contributions and limitations.
DYNAMIC PHONE WARPING – A METHOD TO MEASURE THE DISTANCE BETWEEN PRONUNCIATIONS cscpconf
Human beings generate different speech waveforms while speaking the same word at different times. Also, different human beings have different accents and generate significantly varying speech waveforms for the same word. There is a need to measure the distances between various words which facilitate preparation of pronunciation dictionaries. A new algorithm called Dynamic Phone Warping (DPW) is presented in this paper. It uses dynamic programming technique for global alignment and shortest distance measurements. The DPW algorithm can be used to enhance the pronunciation dictionaries of the well-known languages like English or to build pronunciation dictionaries to the less known sparse languages. The precision measurement experiments show 88.9% accuracy.
INTELLIGENT ELECTRONIC ASSESSMENT FOR SUBJECTIVE EXAMS cscpconf
In education, the use of electronic (E) examination systems is not a novel idea, as Eexamination systems have been used to conduct objective assessments for the last few years. This research deals with randomly designed E-examinations and proposes an E-assessment system that can be used for subjective questions. This system assesses answers to subjective questions by finding a matching ratio for the keywords in instructor and student answers. The matching ratio is achieved based on semantic and document similarity. The assessment system is composed of four modules: preprocessing, keyword expansion, matching, and grading. A survey and case study were used in the research design to validate the proposed system. The examination assessment system will help instructors to save time, costs, and resources, while increasing efficiency and improving the productivity of exam setting and assessments.
TWO DISCRETE BINARY VERSIONS OF AFRICAN BUFFALO OPTIMIZATION METAHEURISTICcscpconf
African Buffalo Optimization (ABO) is one of the most recent swarms intelligence based metaheuristics. ABO algorithm is inspired by the buffalo’s behavior and lifestyle. Unfortunately, the standard ABO algorithm is proposed only for continuous optimization problems. In this paper, the authors propose two discrete binary ABO algorithms to deal with binary optimization problems. In the first version (called SBABO) they use the sigmoid function and probability model to generate binary solutions. In the second version (called LBABO) they use some logical operator to operate the binary solutions. Computational results on two knapsack problems (KP and MKP) instances show the effectiveness of the proposed algorithm and their ability to achieve good and promising solutions.
DETECTION OF ALGORITHMICALLY GENERATED MALICIOUS DOMAINcscpconf
In recent years, many malware writers have relied on Dynamic Domain Name Services (DDNS) to maintain their Command and Control (C&C) network infrastructure to ensure a persistence presence on a compromised host. Amongst the various DDNS techniques, Domain Generation Algorithm (DGA) is often perceived as the most difficult to detect using traditional methods. This paper presents an approach for detecting DGA using frequency analysis of the character distribution and the weighted scores of the domain names. The approach’s feasibility is demonstrated using a range of legitimate domains and a number of malicious algorithmicallygenerated domain names. Findings from this study show that domain names made up of English characters “a-z” achieving a weighted score of < 45 are often associated with DGA. When a weighted score of < 45 is applied to the Alexa one million list of domain names, only 15% of the domain names were treated as non-human generated.
GLOBAL MUSIC ASSET ASSURANCE DIGITAL CURRENCY: A DRM SOLUTION FOR STREAMING C...cscpconf
The document proposes a blockchain-based digital currency and streaming platform called GoMAA to address issues of piracy in the online music streaming industry. Key points:
- GoMAA would use a digital token on the iMediaStreams blockchain to enable secure dissemination and tracking of streamed content. Content owners could control access and track consumption of released content.
- Original media files would be converted to a Secure Portable Streaming (SPS) format, embedding watermarks and smart contract data to indicate ownership and enable validation on the blockchain.
- A browser plugin would provide wallets for fans to collect GoMAA tokens as rewards for consuming content, incentivizing participation and addressing royalty discrepancies by recording
IMPORTANCE OF VERB SUFFIX MAPPING IN DISCOURSE TRANSLATION SYSTEMcscpconf
This document discusses the importance of verb suffix mapping in discourse translation from English to Telugu. It explains that after anaphora resolution, the verbs must be changed to agree with the gender, number, and person features of the subject or anaphoric pronoun. Verbs in Telugu inflect based on these features, while verbs in English only inflect based on number and person. Several examples are provided that demonstrate how the Telugu verb changes based on whether the subject or pronoun is masculine, feminine, neuter, singular or plural. Proper verb suffix mapping is essential for generating natural and coherent translations while preserving the context and meaning of the original discourse.
EXACT SOLUTIONS OF A FAMILY OF HIGHER-DIMENSIONAL SPACE-TIME FRACTIONAL KDV-T...cscpconf
In this paper, based on the definition of conformable fractional derivative, the functional
variable method (FVM) is proposed to seek the exact traveling wave solutions of two higherdimensional
space-time fractional KdV-type equations in mathematical physics, namely the
(3+1)-dimensional space–time fractional Zakharov-Kuznetsov (ZK) equation and the (2+1)-
dimensional space–time fractional Generalized Zakharov-Kuznetsov-Benjamin-Bona-Mahony
(GZK-BBM) equation. Some new solutions are procured and depicted. These solutions, which
contain kink-shaped, singular kink, bell-shaped soliton, singular soliton and periodic wave
solutions, have many potential applications in mathematical physics and engineering. The
simplicity and reliability of the proposed method is verified.
AUTOMATED PENETRATION TESTING: AN OVERVIEWcscpconf
The document discusses automated penetration testing and provides an overview. It compares manual and automated penetration testing, noting that automated testing allows for faster, more standardized and repeatable tests but has limitations in developing new exploits. It also reviews some current automated penetration testing methodologies and tools, including those using HTTP/TCP/IP attacks, linking common scanning tools, a Python-based tool targeting databases, and one using POMDPs for multi-step penetration test planning under uncertainty. The document concludes that automated testing is more efficient than manual for known vulnerabilities but cannot replace manual testing for discovering new exploits.
CLASSIFICATION OF ALZHEIMER USING fMRI DATA AND BRAIN NETWORKcscpconf
Since the mid of 1990s, functional connectivity study using fMRI (fcMRI) has drawn increasing
attention of neuroscientists and computer scientists, since it opens a new window to explore
functional network of human brain with relatively high resolution. BOLD technique provides
almost accurate state of brain. Past researches prove that neuro diseases damage the brain
network interaction, protein- protein interaction and gene-gene interaction. A number of
neurological research paper also analyse the relationship among damaged part. By
computational method especially machine learning technique we can show such classifications.
In this paper we used OASIS fMRI dataset affected with Alzheimer’s disease and normal
patient’s dataset. After proper processing the fMRI data we use the processed data to form
classifier models using SVM (Support Vector Machine), KNN (K- nearest neighbour) & Naïve
Bayes. We also compare the accuracy of our proposed method with existing methods. In future,
we will other combinations of methods for better accuracy.
VALIDATION METHOD OF FUZZY ASSOCIATION RULES BASED ON FUZZY FORMAL CONCEPT AN...cscpconf
The document proposes a new validation method for fuzzy association rules based on three steps: (1) applying the EFAR-PN algorithm to extract a generic base of non-redundant fuzzy association rules using fuzzy formal concept analysis, (2) categorizing the extracted rules into groups, and (3) evaluating the relevance of the rules using structural equation modeling, specifically partial least squares. The method aims to address issues with existing fuzzy association rule extraction algorithms such as large numbers of extracted rules, redundancy, and difficulties with manual validation.
PROBABILITY BASED CLUSTER EXPANSION OVERSAMPLING TECHNIQUE FOR IMBALANCED DATAcscpconf
In many applications of data mining, class imbalance is noticed when examples in one class are
overrepresented. Traditional classifiers result in poor accuracy of the minority class due to the
class imbalance. Further, the presence of within class imbalance where classes are composed of
multiple sub-concepts with different number of examples also affect the performance of
classifier. In this paper, we propose an oversampling technique that handles between class and
within class imbalance simultaneously and also takes into consideration the generalization
ability in data space. The proposed method is based on two steps- performing Model Based
Clustering with respect to classes to identify the sub-concepts; and then computing the
separating hyperplane based on equal posterior probability between the classes. The proposed
method is tested on 10 publicly available data sets and the result shows that the proposed
method is statistically superior to other existing oversampling methods.
CHARACTER AND IMAGE RECOGNITION FOR DATA CATALOGING IN ECOLOGICAL RESEARCHcscpconf
Data collection is an essential, but manpower intensive procedure in ecological research. An
algorithm was developed by the author which incorporated two important computer vision
techniques to automate data cataloging for butterfly measurements. Optical Character
Recognition is used for character recognition and Contour Detection is used for imageprocessing.
Proper pre-processing is first done on the images to improve accuracy. Although
there are limitations to Tesseract’s detection of certain fonts, overall, it can successfully identify
words of basic fonts. Contour detection is an advanced technique that can be utilized to
measure an image. Shapes and mathematical calculations are crucial in determining the precise
location of the points on which to draw the body and forewing lines of the butterfly. Overall,
92% accuracy were achieved by the program for the set of butterflies measured.
SOCIAL MEDIA ANALYTICS FOR SENTIMENT ANALYSIS AND EVENT DETECTION IN SMART CI...cscpconf
Smart cities utilize Internet of Things (IoT) devices and sensors to enhance the quality of the city
services including energy, transportation, health, and much more. They generate massive
volumes of structured and unstructured data on a daily basis. Also, social networks, such as
Twitter, Facebook, and Google+, are becoming a new source of real-time information in smart
cities. Social network users are acting as social sensors. These datasets so large and complex
are difficult to manage with conventional data management tools and methods. To become
valuable, this massive amount of data, known as 'big data,' needs to be processed and
comprehended to hold the promise of supporting a broad range of urban and smart cities
functions, including among others transportation, water, and energy consumption, pollution
surveillance, and smart city governance. In this work, we investigate how social media analytics
help to analyze smart city data collected from various social media sources, such as Twitter and
Facebook, to detect various events taking place in a smart city and identify the importance of
events and concerns of citizens regarding some events. A case scenario analyses the opinions of
users concerning the traffic in three largest cities in the UAE
SOCIAL NETWORK HATE SPEECH DETECTION FOR AMHARIC LANGUAGEcscpconf
The anonymity of social networks makes it attractive for hate speech to mask their criminal
activities online posing a challenge to the world and in particular Ethiopia. With this everincreasing
volume of social media data, hate speech identification becomes a challenge in
aggravating conflict between citizens of nations. The high rate of production, has become
difficult to collect, store and analyze such big data using traditional detection methods. This
paper proposed the application of apache spark in hate speech detection to reduce the
challenges. Authors developed an apache spark based model to classify Amharic Facebook
posts and comments into hate and not hate. Authors employed Random forest and Naïve Bayes
for learning and Word2Vec and TF-IDF for feature selection. Tested by 10-fold crossvalidation,
the model based on word2vec embedding performed best with 79.83%accuracy. The
proposed method achieve a promising result with unique feature of spark for big data.
GENERAL REGRESSION NEURAL NETWORK BASED POS TAGGING FOR NEPALI TEXTcscpconf
This article presents Part of Speech tagging for Nepali text using General Regression Neural
Network (GRNN). The corpus is divided into two parts viz. training and testing. The network is
trained and validated on both training and testing data. It is observed that 96.13% words are
correctly being tagged on training set whereas 74.38% words are tagged correctly on testing
data set using GRNN. The result is compared with the traditional Viterbi algorithm based on
Hidden Markov Model. Viterbi algorithm yields 97.2% and 40% classification accuracies on
training and testing data sets respectively. GRNN based POS Tagger is more consistent than the
traditional Viterbi decoding technique.
Exploiting Artificial Intelligence for Empowering Researchers and Faculty, In...Dr. Vinod Kumar Kanvaria
Exploiting Artificial Intelligence for Empowering Researchers and Faculty,
International FDP on Fundamentals of Research in Social Sciences
at Integral University, Lucknow, 06.06.2024
By Dr. Vinod Kumar Kanvaria
A review of the growth of the Israel Genealogy Research Association Database Collection for the last 12 months. Our collection is now passed the 3 million mark and still growing. See which archives have contributed the most. See the different types of records we have, and which years have had records added. You can also see what we have for the future.
How to Build a Module in Odoo 17 Using the Scaffold MethodCeline George
Odoo provides an option for creating a module by using a single line command. By using this command the user can make a whole structure of a module. It is very easy for a beginner to make a module. There is no need to make each file manually. This slide will show how to create a module using the scaffold method.
LAND USE LAND COVER AND NDVI OF MIRZAPUR DISTRICT, UPRAHUL
This Dissertation explores the particular circumstances of Mirzapur, a region located in the
core of India. Mirzapur, with its varied terrains and abundant biodiversity, offers an optimal
environment for investigating the changes in vegetation cover dynamics. Our study utilizes
advanced technologies such as GIS (Geographic Information Systems) and Remote sensing to
analyze the transformations that have taken place over the course of a decade.
The complex relationship between human activities and the environment has been the focus
of extensive research and worry. As the global community grapples with swift urbanization,
population expansion, and economic progress, the effects on natural ecosystems are becoming
more evident. A crucial element of this impact is the alteration of vegetation cover, which plays a
significant role in maintaining the ecological equilibrium of our planet.Land serves as the foundation for all human activities and provides the necessary materials for
these activities. As the most crucial natural resource, its utilization by humans results in different
'Land uses,' which are determined by both human activities and the physical characteristics of the
land.
The utilization of land is impacted by human needs and environmental factors. In countries
like India, rapid population growth and the emphasis on extensive resource exploitation can lead
to significant land degradation, adversely affecting the region's land cover.
Therefore, human intervention has significantly influenced land use patterns over many
centuries, evolving its structure over time and space. In the present era, these changes have
accelerated due to factors such as agriculture and urbanization. Information regarding land use and
cover is essential for various planning and management tasks related to the Earth's surface,
providing crucial environmental data for scientific, resource management, policy purposes, and
diverse human activities.
Accurate understanding of land use and cover is imperative for the development planning
of any area. Consequently, a wide range of professionals, including earth system scientists, land
and water managers, and urban planners, are interested in obtaining data on land use and cover
changes, conversion trends, and other related patterns. The spatial dimensions of land use and
cover support policymakers and scientists in making well-informed decisions, as alterations in
these patterns indicate shifts in economic and social conditions. Monitoring such changes with the
help of Advanced technologies like Remote Sensing and Geographic Information Systems is
crucial for coordinated efforts across different administrative levels. Advanced technologies like
Remote Sensing and Geographic Information Systems
9
Changes in vegetation cover refer to variations in the distribution, composition, and overall
structure of plant communities across different temporal and spatial scales. These changes can
occur natural.
A workshop hosted by the South African Journal of Science aimed at postgraduate students and early career researchers with little or no experience in writing and publishing journal articles.
हिंदी वर्णमाला पीपीटी, hindi alphabet PPT presentation, hindi varnamala PPT, Hindi Varnamala pdf, हिंदी स्वर, हिंदी व्यंजन, sikhiye hindi varnmala, dr. mulla adam ali, hindi language and literature, hindi alphabet with drawing, hindi alphabet pdf, hindi varnamala for childrens, hindi language, hindi varnamala practice for kids, https://www.drmullaadamali.com
Executive Directors Chat Leveraging AI for Diversity, Equity, and InclusionTechSoup
Let’s explore the intersection of technology and equity in the final session of our DEI series. Discover how AI tools, like ChatGPT, can be used to support and enhance your nonprofit's DEI initiatives. Participants will gain insights into practical AI applications and get tips for leveraging technology to advance their DEI goals.
116 Computer Science & Information Technology (CS & IT)
Researchers have proposed many approaches to character recognition, and a number of them are surveyed in this paper. In addition, the challenges and issues that still prevail in this area, including directions for future research, are also surveyed.
This paper is organized as follows: pre-processing techniques are surveyed in Section 2. Section 3 illustrates the various segmentation techniques available for offline character recognition. Feature extraction methods are explained in Section 4. Section 5 explains the various classification approaches available. Section 6 presents a comprehensive study of existing approaches, and the final section provides an overall view of these approaches, the scope of our research in this area, and the conclusion.
2. PRE-PROCESSING
There are numerous tasks to be completed before performing character recognition. A handwritten document must be scanned and converted into a suitable format for processing. Pre-processing consists of several sub-processes that clean the document image and make it suitable for carrying out the recognition process accurately. The sub-processes involved in pre-processing are illustrated below:
• Binarization
• Noise reduction
• Normalization
• Skew correction, thinning and slant removal
2.1 Binarization
Binarization is a method of transforming a gray-scale image into a black and white image through thresholding [17][9]. Alternatively, Otsu's method may be used to perform histogram-based thresholding [16][14] to obtain the binarized image automatically. Otsu's method has been extended to multi-level thresholding, called the Multi-Otsu method [24]. Most researchers use thresholding concepts to separate the foreground image from the background image [18][21][4]. In this method, the threshold is fixed by taking a value that separates the foreground and background gray levels. A histogram-based thresholding approach can also be used to convert a gray-scale image into a two-tone image. In contrast, an adaptive binarization method can be used to exploit the local gray-value contrast of the image, which helps to extract text information from low-quality documents. Another approach, the two-level global binarization technique, produces its output using a global thresholding technique [24].
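To make the histogram-based idea concrete, here is a minimal NumPy sketch of Otsu's method. This is an illustrative reconstruction, not code from any of the surveyed systems; the function names and the convention that dark ink maps to 1 are ours. The threshold is the gray level that maximizes the between-class variance of the histogram.

```python
import numpy as np

def otsu_threshold(gray):
    """Pick the gray level that maximizes the between-class variance."""
    hist = np.bincount(gray.ravel(), minlength=256).astype(float)
    total = hist.sum()
    sum_total = np.dot(np.arange(256), hist)
    best_t, best_var = 0, -1.0
    weight_bg, sum_bg = 0.0, 0.0
    for t in range(256):
        weight_bg += hist[t]          # pixels at or below t
        sum_bg += t * hist[t]
        weight_fg = total - weight_bg
        if weight_bg == 0 or weight_fg == 0:
            continue
        mean_bg = sum_bg / weight_bg
        mean_fg = (sum_total - sum_bg) / weight_fg
        var = weight_bg * weight_fg * (mean_bg - mean_fg) ** 2
        if var > best_var:
            best_var, best_t = var, t
    return best_t

def binarize(gray):
    """Two-tone output: dark ink becomes 1, light background 0."""
    return (gray <= otsu_threshold(gray)).astype(np.uint8)
```

Because the threshold is derived from the histogram alone, the same code works unchanged on any scanned gray-scale page.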
2.2 Noise Removal
Digital images are prone to many types of noise. Noise in a document image is often due to poorly photocopied pages. Median filtering [18], the Wiener filtering method [19], and morphological operations can be applied to remove noise [16]. Median filters are used to replace the intensity values of the character image [24], whereas Gaussian filters can be used to smooth the image [10].
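A 3x3 median filter of the kind cited above can be sketched directly in NumPy (the function name is ours); each pixel is replaced by the median of its neighbourhood, which removes isolated salt-and-pepper specks without blurring edges the way a mean filter would.

```python
import numpy as np

def median_filter3(img):
    """3x3 median filter: replace each pixel by the median of its
    neighbourhood, suppressing isolated salt-and-pepper noise."""
    padded = np.pad(img, 1, mode='edge')
    h, w = img.shape
    # stack the nine shifted views and take the per-pixel median
    stack = np.stack([padded[i:i + h, j:j + w]
                      for i in range(3) for j in range(3)])
    return np.median(stack, axis=0).astype(img.dtype)
```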
2.3 Normalization
Normalization is the process of converting a randomly sized image into a standard size. The RoI-extraction method [20] is used to obtain a single structural element from the image. Bicubic interpolation [16], linear size normalization [14], and the Java Image class [12] normalization techniques can be used to produce standard-sized images. In many works, input images are normalized to a size of 50 x 50 after finding the bounding box of each handwritten image [25][11][17] to ease further processing.
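The bounding-box-plus-resize normalization described above can be sketched as follows, assuming a binary image with ink = 1. Nearest-neighbour sampling stands in for the bicubic or linear interpolation used in the cited works, and the function name is ours.

```python
import numpy as np

def normalize_char(binary, size=50):
    """Crop a binary character (ink = 1) to its bounding box and
    rescale it to size x size with nearest-neighbour sampling."""
    rows = np.any(binary, axis=1)
    cols = np.any(binary, axis=0)
    r0, r1 = np.where(rows)[0][[0, -1]]
    c0, c1 = np.where(cols)[0][[0, -1]]
    crop = binary[r0:r1 + 1, c0:c1 + 1]
    # map each output coordinate back to a source coordinate
    ri = np.arange(size) * crop.shape[0] // size
    ci = np.arange(size) * crop.shape[1] // size
    return crop[np.ix_(ri, ci)]
```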
2.4 Skew correction, Thinning and Slant removal
Thinning is a pre-process that yields a single-pixel-wide image so the handwritten character can be recognized easily. It is applied repeatedly, leaving only pixel-wide linear representations of the image characters. The cumulative scalar product (CSP) [18] of windowed text blocks with Gabor filters has been used for thinning. Morphology-based thinning algorithms [25][11] and other thinning algorithms [8][3][17] have also been used to obtain better symbol representations and to thin the character images. Skeletonization is the process of paring a pattern down to as few pixels as possible without affecting its general shape. Hilditch's algorithm has been used for removing unwanted pixels from the images (skeletonization) [3][5][10][1]. Skew is inevitably introduced into the incoming document image during scanning. Normalization [15] and Fourier spectrum [23] techniques are used to correct the slant, stroke angle, width, and vertical scaling.
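The surveyed papers rely on Hilditch's and morphology-based thinning algorithms; as a compact stand-in, the closely related Zhang-Suen scheme below illustrates the general idea of iteratively peeling deletable boundary pixels until a one-pixel-wide skeleton remains. This is a sketch of Zhang-Suen, not of the cited algorithms.

```python
import numpy as np

def zhang_suen_thin(img):
    """Zhang-Suen thinning on a binary 0/1 array: repeatedly delete
    boundary pixels until only a one-pixel-wide skeleton is left."""
    img = img.copy().astype(np.uint8)
    changed = True
    while changed:
        changed = False
        for step in (0, 1):
            to_delete = []
            for r in range(1, img.shape[0] - 1):
                for c in range(1, img.shape[1] - 1):
                    if img[r, c] == 0:
                        continue
                    # neighbours P2..P9, clockwise from the pixel above
                    p = [img[r-1, c], img[r-1, c+1], img[r, c+1],
                         img[r+1, c+1], img[r+1, c], img[r+1, c-1],
                         img[r, c-1], img[r-1, c-1]]
                    b = sum(p)                       # non-zero neighbours
                    a = sum(1 for k in range(8)      # 0 -> 1 transitions
                            if p[k] == 0 and p[(k + 1) % 8] == 1)
                    if step == 0:
                        cond = p[0]*p[2]*p[4] == 0 and p[2]*p[4]*p[6] == 0
                    else:
                        cond = p[0]*p[2]*p[6] == 0 and p[0]*p[4]*p[6] == 0
                    if 2 <= b <= 6 and a == 1 and cond:
                        to_delete.append((r, c))
            for r, c in to_delete:   # delete simultaneously per sub-pass
                img[r, c] = 0
            changed = changed or bool(to_delete)
    return img
```

The two alternating sub-passes remove south-east and north-west boundary pixels respectively, which keeps the erosion symmetric and preserves connectivity.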
3. SEGMENTATION
Segmentation is a process used to split document images into lines, words, and characters. Segmenting handwritten documents is more complex than segmenting typewritten documents. Histogram profiles and connected component analysis [20][6] are used for line segmentation. In the segmentation process, paragraph spacing is checked to identify paragraphs, and the histogram of the image is used to detect the width of the horizontal lines [2]. A spatial space detection technique has been used for word segmentation. Histogram methods have been used both to detect the width of words [25] and to convert the image to glyphs [21]. The vertical histogram profile method [16][17] has been used to find the spacing within lines and thus identify word boundaries. The region probe algorithm [8] has been used to extract individual characters from the image. Segmentation approaches are categorized as projection-based, smearing, grouping, Hough-based, graph-based, and Cut Text Minimization (CTM). The modified cross-counting technique [10], histogram profiles [20], and connected component analysis are also found in the surveyed literature for handling the line and character segmentation problem.
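The horizontal projection-profile idea behind several of the cited line-segmentation methods can be sketched as follows, assuming a binary page with ink = 1; rows whose profile is zero separate consecutive text lines. The function name is ours.

```python
import numpy as np

def segment_lines(binary):
    """Split a binary page (ink = 1) into text lines using the
    horizontal projection profile: ink-free rows separate lines."""
    profile = binary.sum(axis=1)     # ink pixels per row
    lines, start = [], None
    for r, count in enumerate(profile):
        if count > 0 and start is None:
            start = r                # a new line begins
        elif count == 0 and start is not None:
            lines.append((start, r))  # the current line ends
            start = None
    if start is not None:            # line touching the bottom edge
        lines.append((start, len(profile)))
    return lines                     # list of (top, bottom) row ranges
```

The same logic applied to the vertical profile within a line band yields word boundaries, as described above.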
4. FEATURE EXTRACTION
Feature extraction techniques can be broadly grouped into three classes, namely statistical features, structural features, and hybrid features. Statistical techniques use quantitative measurements for feature extraction, whereas structural techniques use qualitative measurements. In the hybrid approach, these two techniques are combined and used for recognition.
4.1 Structural Technique
The Scale Invariant Feature Transform (SIFT) [20] is used to transform the character image into a set of local features. Using this approach, 128-dimensional SIFT features (interest points) are identified from the character image. An image is first converted to a two-tone image and then converted into a frame; the frame points obtained yield the feature vectors. The Normalized Feature Vector (NFV) [22] obtains prototypes from these vectors, i.e. lines and arcs.
4. 118 Computer Science & Information Technology (CS & IT)
4.2 Statistical Technique
In the zone-based method, the normalized characters are divided into non-interleaving zones, and the pixel density calculated for each zone is used to represent the features [16][12][11][13]. In the binary variation encoding approach [4], the heights and widths of the character pixels are counted; once the top row and full width are reached, the process halts, a binary flag is set, and features are extracted accordingly. In the Gabor channel method [6], images are divided into nine non-overlapping blocks of equal size, which gives 24 responses for the blocks passing through each channel; the means and standard deviations of the responses are used as features.
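The zone-based pixel-density feature described above can be sketched as follows; the 5 x 5 zone grid is our choice for illustration, since the cited works vary in zone count.

```python
import numpy as np

def zone_density_features(binary, zones=(5, 5)):
    """Divide a normalized binary character into a grid of zones and
    use the ink density (mean pixel value) of each zone as a feature."""
    h, w = binary.shape
    zr, zc = zones
    feats = []
    for i in range(zr):
        for j in range(zc):
            cell = binary[i*h//zr:(i+1)*h//zr, j*w//zc:(j+1)*w//zc]
            feats.append(cell.mean())
    return np.array(feats)           # zr * zc feature vector
```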
All images are scaled to the same height and width using the bilinear interpolation technique [6][5]. Unwanted portions are corrected using the Sobel edge detection algorithm [6]. The RoI-extraction approach [8] applies morphological closing to the image, resulting in a closed image; it casts off the leftmost, rightmost, uppermost, and bottommost blocks outside the limit. Time-domain features include the normalized x-y coordinates, first and second derivatives, curvature, aspect, curliness, and lineness. Frequency-domain features are computed along the stroke using a sliding Hamming window, and the real and imaginary parts of the lowest coefficient are added to the feature vector [9]. Structural features such as end points, fork points, holes, and the length, shape, and curvature of individual strokes are derived using an octal graph [7]. The eight-neighbour adjacency method is used for tracing the boundaries of the images: the approach scans until it finds the boundary of the binary image. The Fourier descriptor [24] is then used to find the coefficients and obtain the total number of boundaries, and this set of invariant descriptors is given as input to a neural network for further classification.
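A minimal sketch of the boundary-based Fourier descriptors mentioned above, assuming the boundary has already been traced into an (N, 2) array of (x, y) points; dropping the DC term and normalizing by the first harmonic magnitude gives the translation and scale invariance that makes the descriptors useful as classifier inputs. The function name and coefficient count are ours.

```python
import numpy as np

def fourier_descriptors(boundary, n_coeffs=8):
    """Invariant Fourier descriptors of a closed boundary given as an
    (N, 2) array of (x, y) points."""
    z = boundary[:, 0] + 1j * boundary[:, 1]   # boundary as complex signal
    F = np.fft.fft(z)
    F[0] = 0                       # drop DC term -> translation invariance
    mags = np.abs(F)
    if mags[1] != 0:
        mags = mags / mags[1]      # normalize -> scale invariance
    return mags[1:n_coeffs + 1]
```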
4.3 Hybrid Technique
The Hough transform [1] has been used to detect horizontal and vertical lines, after which branch points and positions are analyzed with a simple algorithm. Bilinear interpolation [5] has also been used to extract features such as slant and strip. A zone-based hybrid feature extraction approach [12] has been used to extract zone-centroid and image-centroid based distance metric features.
5. CLASSIFICATION
The extracted features are given as input to the classification process. A bag of keypoints produced by the feature extraction approaches is used for classification. Several approaches are used to classify the character features in existing systems, such as the k-nearest neighbour approach, fuzzy systems, neural networks, discriminative classifiers, unsupervised classifiers, and so on.
The k-nearest neighbour approach [20] has been used as a classifier to recognize character sets with good accuracy, with the recognition result obtained using histogram equalization. In the self-organizing map (SOM) approach [2][6][25], the weight of each node is calculated using the Euclidean distance method [1]. The input characters are then compared with all vectors whose weights were calculated through the SOM; if they match, the output is given as a recognized vector, called the Best Matching Unit (BMU). Global features are used if this approach fails to classify [21].
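The k-nearest-neighbour classification step can be sketched as follows, using Euclidean distance and majority voting as in the cited approach; the function name and the toy data in the usage are ours.

```python
import numpy as np

def knn_predict(train_X, train_y, x, k=3):
    """Classify feature vector x by majority vote among its k nearest
    training samples under Euclidean distance."""
    dists = np.linalg.norm(train_X - x, axis=1)
    nearest = np.argsort(dists)[:k]          # indices of k closest samples
    labels, counts = np.unique(train_y[nearest], return_counts=True)
    return labels[np.argmax(counts)]         # majority label wins
```

In a character recognizer, `train_X` would hold the feature vectors (e.g. zone densities) of labelled training characters and `x` the feature vector of the character to recognize.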
In the multilayer perceptron with two hidden layers approach [11], a neural network is designed
with all weights initialised to random numbers between -1 and 1. The characters are compared
with the network output to find the best recognition result. Two hidden layers are used for better
performance; if the number of hidden layers is increased further, the performance is reduced
[19][3][24].
Computer Science & Information Technology (CS & IT) 119
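A forward pass through such a network, with weights drawn uniformly from [-1, 1] as described above, can be sketched as follows; the function name, sigmoid activation and layer sizes are assumptions, not details taken from [11].

```python
import math
import random

def mlp_forward(x, layer_sizes, seed=0):
    """Forward pass of a fully connected MLP with sigmoid activations.
    Weights are initialised uniformly in [-1, 1]; layer_sizes lists the
    width of every layer, e.g. [input, hidden1, hidden2, output]."""
    rng = random.Random(seed)
    out = x
    for i in range(len(layer_sizes) - 1):
        nxt = []
        for _ in range(layer_sizes[i + 1]):
            # Weighted sum of the previous layer's activations
            s = sum(out[k] * rng.uniform(-1, 1) for k in range(layer_sizes[i]))
            nxt.append(1.0 / (1.0 + math.exp(-s)))   # sigmoid
        out = nxt
    return out
```

In practice the weights would of course be trained (e.g. by backpropagation) rather than left random; the sketch only shows the two-hidden-layer topology.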
In some works, a Hidden Markov Model has been used to support the handwritten character
recognition model [18][9]; it yields the most probable result. In the Nearest Neighbour classifier
approach [12][11], the features of the training samples are computed and stored, and the input
test samples are compared against these stored samples. In some works, the Support Vector
Machine, a binary classifier, has been used: the features are separated into two classes by a
hyperplane, where the margin refers to the distance between the hyperplane and the closest data
points, and the classifier selects the hyperplane that maximizes this margin. When instances must
be classified into more than two classes (the multi-class problem), the one-against-all and
one-against-one algorithms are used [14][17][21].
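The one-against-one decision rule can be illustrated with a simple voting sketch; the pairwise classifiers here are hypothetical stand-ins for trained binary SVMs.

```python
from collections import Counter

def one_vs_one_predict(binary_clfs, x):
    """Multi-class decision by one-against-one voting.
    binary_clfs maps each class pair (a, b) to a function that,
    given a sample x, returns either a or b.  The class collecting
    the most pairwise votes wins."""
    votes = Counter(clf(x) for clf in binary_clfs.values())
    return votes.most_common(1)[0][0]

# Toy stand-ins: three classes separated by thresholds on a 1-D feature.
clfs = {
    (0, 1): lambda x: 0 if x < 1.0 else 1,
    (0, 2): lambda x: 0 if x < 1.5 else 2,
    (1, 2): lambda x: 1 if x < 2.0 else 2,
}
```

With k classes, one-against-one trains k(k-1)/2 binary classifiers, whereas one-against-all trains only k but must handle imbalanced splits.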
Fuzzy logic approaches have also been used to identify pattern primitives and label two-tone
images, which are then classified against prototype characters [23][22]. This approach finds the
upper and lower confidence limits of the characters using a statistical classifier [5], and these are
stored as the classification result. In the hierarchical neural network approach [10], two sets of
networks are used: the first set is trained to classify multiple groups; in the second set, each
group is further divided into multiple groups, from which the final result is classified.
6. COMPREHENSIVE STUDY
Table 1 shows the comprehensive study that has been made of the different OCRs available for
offline character recognition.
Table 2 lists the accuracy obtained by the various OCRs, along with their appreciations and
limitations in character recognition.
Table 2. Appreciations, accuracy and limitations of the surveyed OCRs, with their font size, input
format and output format. ("_" marks a cell left empty in the original table.)

1 [20]
Appreciation: Experiments used six thousand training and two thousand testing images of 20
selected characters.
Accuracy: 87%.
Limitation: Results achieved for 20 selected characters only; problems in recognizing abnormal
writing and similar-shaped characters.
Format: Image as input; SIFT features identified, then a codebook (CB) is generated.

2 [2]
Appreciation: Sentence samples from 5 persons, but only the first 8 letters were used; this also
reduced the time and processing power needed to run the experiment.
Accuracy: Not specified (100% was obtainable).
Limitation: Recognition done only for 8 characters; accuracy not reported properly.
Format: TIF, JPG or GIF image segmented into paragraphs, paragraphs into lines, lines into
words, and words into image glyphs.

3 [19]
Appreciation: 5 different handwritten samples taken for five scripts (Tamil, Urdu, Telugu,
Devanagari and English); 10 characters taken from each language for recognition.
Accuracy: Single hidden layer 96.53%; double hidden layer 94%.
Limitation: Only 10 characters considered for recognition; five samples of each letter collected
as input.
Format: _

4 [18]
Appreciation: 100 people (aged 14 to 40) each wrote four pages of Tamil text; lists of ten words
were taken from each of the four documents for testing; the 40 search words written by each
writer form the test set used for recognition.
Accuracy: Test 90%; Sample 80.7%.
Limitation: Only 40 words tested.
Format: Document.

5 [21]
Appreciation: 25 of 40 users selected for the sample set and another 15 for the test set; skew
detection checks angle orientations between +/-15 degrees.
Accuracy: Sample set 100%; test set 97%; speed 0.1 s.
Limitation: Minimal set of samples (top three letters); problems in recognizing abnormal
writing, similar-shaped characters and joined letters.
Format: TIF, JPG or BMP.

6 [16]
Appreciation: Results show that the algorithm works well for all the characters.
Accuracy: Avg. 82.04% (62.8% for 3 characters, 98.9% for 12 characters, 82.04% for all 34
characters).
Limitation: 34 characters, 6048 samples; recognition errors due to abnormal writing and
similar-shaped characters.
Format: 32x32, 48x48 and 64x64 images.

7 [4]
Appreciation: 10 characters involved in the recognition.
Accuracy: Not given.
Limitation: Only 10 characters considered; accuracy not specified (not given as 100%).
Format: Image as input, text as output.

8 [14]
Appreciation: Six frequently used Tamil fonts with different printed styles tested; initially 216
instances taken for training and gradually incremented.
Accuracy: SVM 92.5%; six font variations (86%-100%).
Limitation: No handwritten characters in the process; only six frequently used Tamil fonts.
Format: Regular, bold, bold italic and italic printed text as input.

9 [12]
Appreciation: 2000 Tamil numeral samples (1000 testing, 1000 training) collected from 200
people; 10 sample characters considered; gives a good accuracy rate for Kannada as well.
Accuracy: SVM 93.9%; NN 94.9%.
Limitation: Only 10 characters considered for recognition.
Format: BMP input.

10 [6]
Appreciation: 67 symbols considered, with 200 training samples, 800 testing samples and 100
text lines; the SOM model captures the invariant features of the Tamil script.
Accuracy: 98.50%.
Limitation: Recognition done only for 67 characters; problems in recognizing abnormal
writing, similar-shaped characters and joined letters.
Format: 250x250 pixel input font.

11 [25]
Appreciation: Very few samples.
Accuracy: Not specified.
Limitation: Very few samples.
Format: TIF, JPG or GIF input.

12 [8]
Appreciation: Applied to 250 "Thirukkural" couplets.
Accuracy: 92% to 94% (50 kurals 94.1%, 100 kurals 94.3%, 150 kurals 92.5%, 200 kurals
94.2%, 250 kurals 92.5%).
Limitation: Accuracy rate for handwritten characters not specified.
Format: _

13 [9]
Appreciation: Tested on many handwritten documents of different individuals, Olaichuvadi
(palm-leaf manuscripts), and scanned and machine-printed documents; more compatible with
other Indian scripts.
Accuracy: 93% to 98% (25 words 94.41%, 50 words 96.52%, 100 words 98.48%, 150 words
95.07%, 200 words 93%).
Limitation: The examples given are only a few handwritten scanned documents; continuous
writing (no space between characters) and sliding characters are not considered.
Format: _

14 [11]
Appreciation: 1500 Tamil numeral samples collected from 150 people; 10 sample characters
considered.
Accuracy: 96%.
Limitation: Only 10 characters considered for recognition.
Format: BMP input.

15 [23]
Appreciation: Tested and recognized 2500 samples of 28 characters; also recognized characters
sliding by 30 degrees.
Accuracy: 76-94%.
Limitation: Only 28 characters tested; problems with more curves and joined letters.
Format: Input size 15 pixels in height and 30 pixels in width.
The following challenges, identified from the various papers, may provide lively interest to
researchers carrying out work in this area:
• Only a limited number of Tamil characters is recognized (at most 40 characters have been
identified)
• A minimal number of testing samples from various handwritten documents has been
considered
• Abnormal writing and similar-shaped characters are difficult to identify
• Font variation and sliding letters are not dealt with
• Low accuracy rate in recognition
• Old handwritten character sets available on palm leaves and in historical documents,
mother documents and stone carvings have not been dealt with until now
• Distorted characters on damaged palm leaves and damaged documents have not been
considered till now
7. CONCLUSION
A lot of research work exists in the survey of Tamil handwritten recognition. However, there is
no standard solution that identifies all Tamil characters with reasonable accuracy. Various
methods have been used in each phase of the recognition process, but each approach provides a
solution only for a few character sets. Challenges still prevail in the recognition of normal as
well as abnormal writing, slanting characters, similar-shaped characters, joined characters,
curves and so on.
In this paper, we have presented various aspects of each phase of the offline Tamil character
recognition process. Researchers have used minimal character sets, and coverage is not given to
different writing styles and font-size issues. The following key challenges can be explored
further by researchers:
• Curves in the Tamil characters
• Very large character set
• Complex letter structure
• Significant variation in writing styles due to complex letter structures
• Increased number of strokes and holes
• Mixed words (English and Tamil)
• Extreme font variability
• Difficulties faced in viewing angles, shadows and unique fonts
REFERENCES
[1] Akshay Apte and Harshad Gado, “Tamil character recognition using structural features”, 2010
[2] Banumathi P and Nasira G.M, “Handwritten Tamil Character Recognition using Artificial neural
networks”, International Conference on Process Automation, Control and Computing (PACC),
page(s): 1 – 5, 2011
[3] Bhattacharya U, Ghosh S.K and Parui S.K, “A Two Stage Recognition Scheme for Handwritten
Tamil Characters”, Ninth International Conference on Document Analysis and Recognition, Vol: 1
page(s): 511 – 515, 2007
[4] Bremananth R and Prakash A, “Tamil Numerals Identification”, International Conference on
Advances in Recent Technologies in Communication and Computing, page(s): 620 – 622, 2009
[5] Hewavitharana S and Fernando H.C, “A Two Stage Classification Approach to Tamil Handwritten
Recognition”, Tamil Internet, California, USA, 2002
[6] Indra Gandhi R and Iyakutti K, “An attempt to Recognize Handwritten Tamil Character using
Kohonen SOM”, Int. J. of Advanced Networking and Applications, Volume: 01 Issue: 03 Pages:
188-192, 2009
[7] Jagadeesh Kannan R and Prabhakar R, ”An improved Handwritten Tamil Character Recognition
System using Octal Graph”, Int. J. of Computer Science, ISSN 1549-3636, Vol 4 (7): 509-516, 2008
[8] Jagadeesh Kumar R and Prabhakar R, “Accuracy Augmentation of Tamil OCR Using Algorithm
Fusion”, Int. J. of Computer Science and Network Security, VOL.8 No.5, May 2008
[9] Jagadeesh Kumar R, Prabhakar R and Suresh R.M, “Off-line Cursive Handwritten Tamil Characters
Recognition”, International Conference on Security Technology, page(s): 159 – 164, 2008
[10] Paulpandian T and Ganapathy V, “Translation and scale Invariant Recognition of Handwritten Tamil
characters using Hierarchical Neural Networks”, Circuits and Systems, IEEE Int. Sym. , vol.4, 2439 –
2441, 1993
[11] Rajashekararadhya S.V and Vanaja Ranjan P, “Efficient Zone based Feature Extraction Algorithm for
Handwritten Numeral Recognition of Four Popular south Indian Scripts”. Int. J. of Theoretical and
Applied Information Technology, pages: 1171 – 1181, 2008
[12] Rajashekararadhya S.V and Vanaja Ranjan P, “Zone-Based Hybrid Feature Extraction Algorithm for
Handwritten Numeral Recognition of two popular Indian Script”, World Congress on Nature &
Biologically Inspired Computing, page(s): 526 – 530, 2009.
[13] Rajashekararadhya S.V, Vanaja Ranjan P and Manjunath Aradhya V. N, “Isolated Handwritten
Kannada and Tamil Numeral Recognition a Novel Approach”, First International Conference on
Emerging Trends in Engineering and Technology, page(s): 1192 – 1195, 2008
[14] Ramanathan R, Ponmathavan S, Thaneshwaran L, Arun S. Nair and Valliappan N, “Tamil font
Recognition Using Gabor and Support vector machines”, International Conference on Advances in
Computing, Control, & Telecommunication Technologies, page(s): 613 – 615, 2009
[15] Sarveswaran K and Ratnaweera, “An Adaptive Technique for Handwritten Tamil Character
Recognition”, International Conference on Intelligent and Advanced Systems, page(s): 151 – 156,
2007
[16] Shanthi N and Duraiswami K, “A Novel SVM-based Handwritten Tamil character recognition
system”, Springer, Pattern Analysis & Applications, Vol-13, No. 2, 173-180, 2010
[17] Shanthi N and Duraiswami K, “Performance Comparison of Different Image size for Recognizing
unconstrained Handwritten Tamil character using SVM”, Journal of Computer Science Vol-3 (9):
Pages 760-764, 2007
[18] Sigappi A.N, Palanivel S and Ramalingam V, “Handwritten Document Retrieval System for Tamil
Language”, Int. J of Computer Application, Vol-31, 2011
[19] Stuti Asthana, Farha Haneef and Rakesh K Bhujade, “Handwritten Multiscript Numeral Recognition
using Artificial Neural Networks”, Int. J. of Soft Computing and Engineering ISSN: 2231-2307,
Volume-1, Issue-1, March 2011
[20] Subashini A and Kodikara N.D, “A Novel SIFT-based Codebook Generation for Handwritten Tamil
character Recognition”, 6th IEEE Int. Conf. on Industrial and Information Systems (ICIIS), Page(s):
261 – 264, 2011
[21] Suresh Kumar C and Ravichandran T, “Handwritten Tamil Character Recognition using RCS
algorithms”, Int. J. of Computer Applications, (0975 – 8887) Volume 8– No.8, October 2010
[22] Suresh R.M, Arumugam S and Ganesan L, “Fuzzy Approach to Recognize Handwritten Tamil
Characters”, Third International Conference, Proc. on Computational Intelligence and Multimedia
Applications, page(s): 459 – 463, 1999
[23] Suresh R.M, “Printed and Handwritten Tamil Characters Recognition using Fuzzy Technique”, Pro.
of the Int. Multi Conference of Engineers and Computer Scientists, Vol I, 19-2, March, 2008
[24] Sutha J and RamaRaj N, “Neural network based offline Tamil handwritten character recognition
System”, International Conference on Computational Intelligence and Multimedia Applications,
Vol: 2, page(s): 446 – 450, 2007
[25] Venkatesh J and Suresh Kumar C, “Tamil Handwritten Character Recognition using Kohonen's Self
Organizing Map”, Int. J. of Computer Science and Network Security, VOL.9 No.12, December 2009