Document Analysis and Recognition (DAR) aims to automatically extract the information in a document that was originally intended for human comprehension. The automatic processing of degraded historical documents is an application of the document image analysis field that is confronted with many difficulties due to storage conditions and the complexity of the script. The main interest in enhancing historical documents is to remove undesirable artifacts that appear in the background and to highlight the foreground, so as to enable automatic recognition of documents with high accuracy. This paper addresses pre-processing and segmentation of ancient scripts as an initial step toward automating the task of an epigraphist in reading and deciphering inscriptions. Pre-processing involves enhancement of degraded ancient document images, achieved through four spatial filtering methods for smoothing or sharpening, namely the Median, Gaussian blur, Mean, and Bilateral filters, applied with different mask sizes. This is followed by binarization of the enhanced image to highlight the foreground information, using the Otsu thresholding algorithm. In the second phase, segmentation is carried out using the Drop Fall and Water Reservoir approaches to obtain isolated characters, which can be used in later stages of OCR. The system showed good results when tested on nearly 150 samples of epigraphic images with varying degradation, producing the best enhanced output with a 4x4 mask for the Median filter, a 2x2 mask for Gaussian blur, and 4x4 masks for the Mean and Bilateral filters. The system can effectively extract characters from enhanced images, giving a segmentation rate of 85%-90% for both the Drop Fall and Water Reservoir techniques.
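As a rough illustration of the enhancement-plus-binarization stage described above, here is a minimal sketch using OpenCV (the library choice is an assumption; the paper does not name an implementation, and OpenCV's median filter requires an odd aperture, so odd kernel sizes stand in for the reported 4x4 and 2x2 masks):

```python
import cv2

img = cv2.imread("inscription.png", cv2.IMREAD_GRAYSCALE)  # assumed input file

# Four spatial filters for smoothing/sharpening; kernel sizes are illustrative.
median    = cv2.medianBlur(img, 5)                 # stands in for the 4x4 mask
gaussian  = cv2.GaussianBlur(img, (3, 3), 0)       # stands in for the 2x2 mask
mean      = cv2.blur(img, (4, 4))
bilateral = cv2.bilateralFilter(img, 5, 75, 75)

# Otsu thresholding on an enhanced image to separate the foreground strokes.
_, binary = cv2.threshold(median, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)
cv2.imwrite("binary.png", binary)
```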
Improved wolf algorithm on document images detection using optimum mean techn...journalBEEI
Detecting text in handwriting in historical documents provides high-level features for the challenging problem of handwriting recognition. Such handwriting often contains noise, faint or incomplete strokes, strokes with gaps, and competing lines when embedded in a table or form, making it unsuitable for local line-following algorithms or associated binarization schemes. In this paper, a method based on an optimum threshold value, named the Optimum Mean method, is presented. The Wolf method fails to detect thin text in non-uniform input images; the proposed method overcomes this problem by deriving a maximum threshold value from the optimum mean. Based on the evaluation, the proposed method obtained a higher F-measure (74.53) and PSNR (14.77) and the lowest NRM (0.11) compared to the Wolf method. In conclusion, the proposed method is successful and effective in solving the Wolf method's problem, producing a high-quality output image.
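For context, a minimal NumPy/OpenCV sketch of the classic Wolf-Jolion local threshold that this paper modifies is shown below; the window size and k value are common defaults, not values from the paper:

```python
import cv2
import numpy as np

def wolf_threshold(img, win=15, k=0.5):
    """Classic Wolf-Jolion local threshold (the baseline this paper improves)."""
    g = img.astype(np.float64)
    mean = cv2.boxFilter(g, -1, (win, win))          # local mean
    sqmean = cv2.boxFilter(g * g, -1, (win, win))
    std = np.sqrt(np.maximum(sqmean - mean ** 2, 0)) # local standard deviation
    R = max(std.max(), 1e-6)                         # max local std over image
    M = g.min()                                      # darkest grey level
    T = mean - k * (1 - std / R) * (mean - M)
    return ((g > T) * 255).astype(np.uint8)          # text maps to black (0)
```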
Segmentation - based Historical Handwritten Word Spotting using document-spec...Konstantinos Zagoris
Many word spotting strategies for modern documents are not directly applicable to historical handwritten documents due to the variety of writing styles and intense degradation. In this paper, a new method that permits effective word spotting in handwritten documents is presented, relying upon document-specific local features that take into account texture information around representative keypoints. Experimental work on two historical handwritten datasets using standard evaluation measures shows the improved performance achieved by the proposed methodology.
Text extraction using document structure features and support vector machinesKonstantinos Zagoris
This document presents a bottom-up text localization technique that uses document structure features and support vector machines. The technique detects and extracts text from document images through several steps: preprocessing, locating and merging blocks, extracting features from blocks, and using support vector machines trained on the features to classify blocks as containing text or not. The technique uses a flexible feature descriptor based on structural elements that can adapt to different document image types. Experimental results on document images show a 98.5% success rate in classifying blocks.
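A toy scikit-learn sketch of the final classification step (the block features here are random stand-ins; the actual structural-element descriptor is specific to the paper):

```python
import numpy as np
from sklearn.svm import SVC

# Stand-in features: one row per block. Real features would come from the
# paper's structural-element descriptor.
rng = np.random.default_rng(0)
X = rng.random((200, 8))
y = (X[:, 0] > 0.5).astype(int)        # 1 = text block, 0 = non-text block

clf = SVC(kernel="rbf").fit(X, y)      # train the block classifier
print(clf.predict(X[:5]))              # classify the first few blocks
```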
Segmentation and recognition of handwritten digit numeral string using a mult...ijfcstjournal
In this paper, the use of a Multi-Layer Perceptron (MLP) neural network model is proposed for recognizing unconstrained offline handwritten numeral strings. The numeral strings are segmented, and isolated numerals are obtained using a connected component labeling (CCL) algorithm. The structural part of the models has been modeled using a multilayer perceptron neural network. This paper also presents a new technique to remove slope and slant from handwritten numeral strings, normalize the size of the text images, and classify them with supervised learning methods. Experimental results on a database of 102 numeral string patterns written by 3 different people show that a recognition rate of 99.7% is obtained on the individual digits contained in the numeral strings, including both skewed and slanted data.
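A hedged sketch of the CCL isolation step using OpenCV's connected-component analysis (the area threshold is illustrative):

```python
import cv2

def isolate_digits(gray):
    """Return isolated digit images from a numeral-string image, left to right."""
    _, binary = cv2.threshold(gray, 0, 255,
                              cv2.THRESH_BINARY_INV + cv2.THRESH_OTSU)
    n, _, stats, _ = cv2.connectedComponentsWithStats(binary, connectivity=8)
    boxes = [tuple(stats[i][:4]) for i in range(1, n)   # label 0 = background
             if stats[i][4] > 20]                       # drop speckle noise
    boxes.sort(key=lambda b: b[0])                      # order left to right
    return [binary[y:y + h, x:x + w] for x, y, w, h in boxes]
```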
Texture features based text extraction from images using DWT and K-means clus...Divya Gera
Text extraction from different kinds of images: document, caption, and scene text images. The discrete wavelet transform was used to extract horizontal, vertical, and diagonal features, and k-means clustering was used to cluster the features into text and background clusters. For simple images k = 2 was used (text and background clusters), while for complex images k = 3 was used (text, complex background, and simple background clusters).
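A minimal sketch of the DWT-plus-k-means idea, assuming PyWavelets and scikit-learn; treating the cluster with the highest mean detail energy as text is an assumption:

```python
import numpy as np
import pywt
from sklearn.cluster import KMeans

def text_mask(gray, k=2):
    """Cluster wavelet detail energies into text vs. background regions."""
    _, (cH, cV, cD) = pywt.dwt2(gray.astype(float), "haar")
    feats = np.stack([np.abs(cH), np.abs(cV), np.abs(cD)], axis=-1)
    flat = feats.reshape(-1, 3)
    labels = KMeans(n_clusters=k, n_init=10).fit_predict(flat)
    # Assume the cluster with the highest mean detail energy is the text.
    text = int(np.argmax([flat[labels == i].mean() for i in range(k)]))
    return (labels == text).reshape(cH.shape)
```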
IRJET- Object Detection using Hausdorff DistanceIRJET Journal
This document proposes a new object recognition system using Hausdorff distance. The system aims to improve on existing methods like YOLO that struggle with small objects and can capture garbage data. The document outlines preprocessing steps like noise cancellation, representing shapes as point sets, and extracting features. It then describes using Hausdorff distance and shape context to find the best match between input and reference shapes. Testing on datasets showed encouraging results for recognizing handwritten digits.
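The core matching measure is easy to sketch with SciPy's directed Hausdorff distance (the point sets are assumed to come from an earlier edge-extraction step):

```python
import numpy as np
from scipy.spatial.distance import directed_hausdorff

def hausdorff(a, b):
    """Symmetric Hausdorff distance between two (N, 2) point sets."""
    return max(directed_hausdorff(a, b)[0], directed_hausdorff(b, a)[0])

# The reference shape with the smallest distance to the input point set
# would be taken as the best match.
square = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
shifted = square + 0.1
print(hausdorff(square, shifted))      # small distance -> similar shapes
```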
GEOMETRIC CORRECTION FOR BRAILLE DOCUMENT IMAGEScsandit
The Braille system has been used by visually impaired people for reading. The shortage of Braille books has created a need for conversion of Braille to text. This paper addresses the geometric correction of Braille document images. Owing to the standard measurements of Braille cells, identification of Braille characters can be achieved by a simple cell-overlapping procedure. However, the standard measurements vary in a scaled document, and fitting the cells becomes difficult if the document is tilted. This paper proposes a line-fitting algorithm for identifying the tilt (skew) angle. The horizontal and vertical scale factors are identified based on the ratio of the distance between characters to the distance between dots. These are used in a geometric transformation matrix for correction. Rotation correction is done prior to scale correction, which aids in increased accuracy. The results for various Braille documents are tabulated.
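A minimal OpenCV sketch of the correction order described above (rotation before scaling); the skew angle and scale factors are assumed to come from the line-fitting and distance-ratio steps:

```python
import cv2

def correct_braille(img, skew_deg, sx, sy):
    """Rotate by the detected skew angle first, then apply scale correction."""
    h, w = img.shape[:2]
    M = cv2.getRotationMatrix2D((w / 2, h / 2), skew_deg, 1.0)
    rotated = cv2.warpAffine(img, M, (w, h))        # rotation correction first
    return cv2.resize(rotated, None, fx=sx, fy=sy)  # then scale correction
```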
Single Image Super-Resolution Using Analytical Solution for L2-L2 Algorithmijtsrd
This paper presents a unified approach to single image super-resolution, which consists of recovering a high-resolution image from a blurred, decimated, and noisy version. Single image super-resolution is also known as image enhancement or image scaling up. Four main steps are used: input image, down-sampling of the image, an analytical solution, and L2 regularization. The approach handles the decimation and blurring operators through their particular properties in the frequency domain, which leads to a fast super-resolution method. An analytical solution is obtained and implemented for the L2 regularization, i.e., the L2-L2 optimized algorithm, with the aim of reducing the computational cost of existing methods. Simulation results were obtained on different images and different priors using an advanced machine learning technique, and the results were compared with the existing method. Varsha Patil and Meharunnisa SP, "Single Image Super-Resolution Using Analytical Solution for L2-L2 Algorithm," International Journal of Trend in Scientific Research and Development (IJTSRD), ISSN 2456-6470, Volume 2, Issue 4, June 2018. URL: http://www.ijtsrd.com/papers/ijtsrd15635.pdf http://www.ijtsrd.com/engineering/electronics-and-communication-engineering/15635/single-image-super-resolution-using-analytical-solution-for-l2-l2-algorithm/varsha-patil
An effective and robust technique for the binarization of degraded document i...eSAT Publishing House
Textual information in images constitutes a very rich source of high-level semantics for retrieval and indexing. In this paper, a new approach is proposed using Cellular Automata (CA) which strives towards identifying scene text in natural images. Initially, a binary edge map is calculated. Then, taking advantage of the CA's flexibility, the transition rules are varied and applied in four consecutive steps, resulting in a four-time-step CA evolution. Finally, a post-processing technique based on edge projection analysis is employed on high-density edge images to eliminate possible false positives. Evaluation results indicate considerable performance gains without sacrificing text detection accuracy.
This document discusses methods for restoring content from degraded document images that have been damaged by termite bites. It analyzes different preprocessing techniques for removing noise caused by termite bites, including global thresholding, local thresholding, Gaussian filtering, and Gabor filtering. The proposed method applies Canny edge detection, hole filling, and gradient estimation to identify meaningful edges and thresholds gradient magnitudes to remove termite-bite noise. Experimental results show that local methods like Niblack and Sauvola do not work well, while Gaussian and Gabor filtering improve the image but do not fully remove the noise. The proposed technique achieves better results by detecting edges and filling holes before gradient-based thresholding.
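One plausible reading of that pipeline, sketched with OpenCV and SciPy (the thresholds are illustrative, and the final noise-removal rule is an interpretation of the summary, not the paper's exact procedure):

```python
import cv2
import numpy as np
from scipy import ndimage

def restore(gray):
    edges = cv2.Canny(gray, 50, 150)                 # meaningful edges
    filled = ndimage.binary_fill_holes(edges > 0)    # hole filling
    gx = cv2.Sobel(gray, cv2.CV_64F, 1, 0)
    gy = cv2.Sobel(gray, cv2.CV_64F, 0, 1)
    mag = np.hypot(gx, gy)                           # gradient estimation
    # Keep filled regions with strong gradients as content (black on white);
    # everything else is treated as termite-bite noise and cleared.
    return np.where(filled & (mag > mag.mean()), 0, 255).astype(np.uint8)
```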
Cursive Handwriting Recognition System using Feature Extraction and Artif...IRJET Journal
The document describes a system for recognizing cursive handwriting using feature extraction and an artificial neural network. It involves preprocessing scanned images, segmenting them into individual characters, extracting features from the characters using a diagonal scanning method, and classifying the characters using a neural network. This approach provides higher recognition accuracy compared to conventional methods. The key steps are preprocessing images, segmenting into characters, extracting 54 features from each character by moving along diagonals in a grid, and training a neural network classifier on the extracted features.
Self-Directing Text Detection and Removal from Images with SmoothingPriyanka Wagh
This document presents a method for self-directing text detection and removal from images using smoothing and exemplar-based inpainting (TLES+EBI). It introduces the problem of existing text detection systems requiring technical skills and outlines objectives to improve automaticity and visually plausible region filling. The methodology applies L0 gradient minimization smoothing to text detection, followed by exemplar-based inpainting for hole filling. Experimental results show smoothing improves detection rates while exemplar-based inpainting decreases MSE and increases PSNR compared to other methods. The document concludes the approach achieves better text detection and visually plausible region filling.
Script Identification for printed document images at text-line level using DC...IOSR Journals
This document summarizes a research paper that proposes a script identification approach for Indian scripts at the text-line level of printed documents. The approach uses visual appearance-based recognition by extracting features from text lines using Discrete Cosine Transform (DCT) and Principal Component Analysis (PCA). These features are classified using a Modified K-Nearest Neighbor algorithm. The method achieves 95% accuracy in recognizing 11 major Indian languages and discusses the importance of script identification in multilingual environments like India for applications such as optical character recognition.
Finding similarities between structured documents as a crucial stage for gene...Alexander Decker
This document discusses methods for classifying structured documents by finding similarities between them. It describes how pre-processing steps like thresholding and size normalization are used. A key step is tilting documents based on detecting reference lines using clustering. Features are then extracted from the tilted images, like reference lines and logos, to classify documents. The proposed method calculates tilt angle based on the largest cluster of connected line pixels to properly orient documents for classification.
A NOVEL DATA DICTIONARY LEARNING FOR LEAF RECOGNITIONsipij
Automatic leaf recognition via image processing has become greatly important for a number of professionals, such as botanical taxonomists, environmental protectors, and foresters. Learning an over-complete leaf dictionary is an essential step for leaf image recognition, but large leaf image dimensions and large numbers of training images hinder the fast construction of a complete leaf data dictionary. In this work an efficient approach is applied to construct an over-complete leaf data dictionary from a set of large-dimension images based on sparse representation. In the proposed method a new cropped-contour method is used to crop the training images. The experiments test the correlation between the sparse representation and the data dictionary, with a focus on computing time.
A system was developed that is able to retrieve specific documents from a document collection. In this system the query is given as text by the user and then transformed into an image. Appropriate features were selected in order to capture the general shape of the query and ignore details due to noise or different fonts. In order to demonstrate the effectiveness of our system, we used a collection of noisy documents and compared our results with those of a commercial OCR package.
Fast and Scalable Semi Supervised Adaptation For Video Action RecognitionIJSRED
This document summarizes a research paper on developing a fast and scalable semi-supervised method for video action recognition using conditional random fields. The proposed method, called semi-supervised virtual evidence boosting (sVEB), simultaneously performs feature selection and semi-supervised training of CRFs using both labeled and unlabeled data. sVEB reduces the amount of labeling required during training while maintaining high accuracy, making it suitable for real-world applications. Experimental results on synthetic and activity recognition tasks demonstrate that sVEB outperforms other training techniques in terms of accuracy while being more computationally efficient.
Handwritten and Machine Printed Text Separation in Document Images using the ...Konstantinos Zagoris
In a number of types of documents, ranging from forms to archive documents and books with annotations, machine printed and handwritten text may be present in the same document image, giving rise to significant issues within a digitisation and recognition pipeline. It is therefore necessary to separate the two types of text before applying different recognition methodologies to each. In this paper, a new approach is proposed which strives towards identifying and separating handwritten from machine printed text using the Bag of Visual Words paradigm (BoVW). Initially, blocks of interest are detected in the document image. For each block, a descriptor is calculated based on the BoVW. The final characterization of the blocks as Handwritten, Machine Printed or Noise is made by a Support Vector Machine classifier. The promising performance of the proposed approach is shown by using a consistent evaluation methodology which couples meaningful measures along with a new dataset.
IRJET- Comparative Study of PCA, KPCA, KFA and LDA Algorithms for Face Re...IRJET Journal
This document compares the performance of four face recognition algorithms - PCA, KPCA, KFA, and LDA - on three standard datasets: AT&T, Yale, and UMIST. It finds that KFA generally achieves the highest recognition rates, particularly for the AT&T and Yale datasets which involve changes in facial expressions and lighting. The Yale dataset, with its variations, yields the best results overall for KFA and LDA. The UMIST dataset, with its profile images, produces lower recognition rates across algorithms due to less similarity between training and test images.
Document image binarization is performed in the preprocessing stage of document analysis; it aims to segment the foreground text from the document background. A fast and accurate document image binarization technique is important for the ensuing document image processing tasks such as optical character recognition (OCR). Though document image binarization has been studied for many years, the thresholding of degraded document images is still an unsolved problem due to the high inter/intra variation between the text stroke and the document background across different document images. The handwritten text within degraded documents often shows a certain amount of variation in terms of stroke width, stroke brightness, stroke connection, and document background. In addition, historical documents are often degraded by bleed-through and by different types of imaging artifacts. These different types of document degradation tend to induce thresholding errors and make degraded document image binarization a big challenge to most state-of-the-art techniques. The proposed method is simple, robust, and capable of handling different types of degraded document images with minimum parameter tuning. It makes use of an adaptive image contrast that combines the local image contrast and the local image gradient adaptively and is therefore tolerant to the text and background variation caused by different types of document degradation. In particular, the proposed technique addresses the over-normalization problem of the local maximum-minimum algorithm. At the same time, the parameters used in the algorithm can be adaptively estimated.
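A minimal sketch of the adaptive contrast map described above: a weighted blend of the local image contrast and the local gradient (max minus min). In the paper the weight is estimated adaptively from the image; the fixed alpha here is an illustrative simplification:

```python
import cv2
import numpy as np

def adaptive_contrast(gray, win=3, alpha=0.5, eps=1e-6):
    g = gray.astype(np.float64)
    kernel = np.ones((win, win), np.uint8)
    local_max = cv2.dilate(g, kernel)                # local maximum
    local_min = cv2.erode(g, kernel)                 # local minimum
    contrast = (local_max - local_min) / (local_max + local_min + eps)
    gradient = (local_max - local_min) / 255.0       # normalized local gradient
    return alpha * contrast + (1 - alpha) * gradient
```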
Segmentation of unhealthy region of plant leaf using image processing techniq...eSAT Publishing House
1. This document discusses various image segmentation techniques that can be used to segment diseased regions of plant leaves for disease identification.
2. Common segmentation techniques discussed include K-means clustering, Fuzzy C-means clustering, Penalized Fuzzy C-means, and unsupervised segmentation.
3. After segmentation, texture and color features are extracted from the diseased regions to identify the plant disease using classification methods. The choice of segmentation technique depends on factors like noise levels and boundary definitions in the image.
Data reduction techniques for high dimensional biological dataeSAT Journals
Abstract
High dimensional biological datasets have been growing rapidly in recent years. Extracting knowledge from and analyzing high dimensional biological data is one of the key challenges, in which variety and veracity are the two distinguishing characteristics. The question that arises is how to perform dimensionality reduction for this heterogeneous data, how to develop a high performance platform to efficiently analyze high dimensional biological data, and how to find the useful information in this data. To discuss this issue in depth, this paper begins with a brief introduction to the data analytics available for biological data, followed by a discussion of big data analytics and a survey of various data reduction methods for biological data. We propose a dense clustering algorithm for standard high dimensional biological data.
Keywords: Big Data Analytics, Dimensionality Reduction
Representation and recognition of handwritten digits using deformable templatesAhmed Abd-Elwasaa
This document presents a case study on using deformable templates for recognizing handwritten digits. Deformable templates match unknown images to known templates by deforming the contours of templates to fit the edge strengths of unknown images. The dissimilarity measure is derived from the deformation needed for the match. This technique achieved recognition rates up to 99.25% on a dataset of 2,000 handwritten digits. Statistical and structural features are extracted to represent characters for classification using deformable templates.
Cursive Handwriting Segmentation Using Ideal Distance Approach IJECEIAES
Offline cursive handwriting recognition is a major challenge due to the huge variety in handwriting, such as slanted handwriting, spacing between words, the size and direction of letters, the style of writing letters, and contour similarity between some letters. Cursive handwriting recognition involves several steps: preprocessing, morphology, segmentation, letter feature extraction, and recognition. Segmentation is a crucial process in handwriting recognition, since the success of the segmentation step determines the level of recognition success. This paper proposes a segmentation algorithm that segments cursive handwriting into letters. These letters form words using a method that determines the intersection cutting point of the cursive handwriting image with an ideal image distance. The ideal distance of the cursive handwriting image is an ideal segmentation point distance chosen to avoid cutting into another letter's section. The width and height of the images are used to determine the accurate segmentation point. There were 999 cursive handwriting input images, taken from 25 researchers, used in this study. The images used were obtained from the preprocessing step, i.e., images with slope correction. This study used a Support Vector Machine (SVM) to recognize the cursive handwriting. The experiments show that the proposed segmentation algorithm is able to segment the images precisely, achieving 97% success in recognizing the cursive handwriting.
Hybrid Multilevel Thresholding and Improved Harmony Search Algorithm for Segm...IJECEIAES
This paper proposes a new method for image segmentation: hybrid multilevel thresholding with an improved harmony search algorithm. The improved harmony search algorithm is a method for finding vector solutions with increased accuracy. The proposed method generates a random candidate solution, whose quality is then evaluated through the Otsu objective function, and the operator continues to evolve the candidate solutions until the optimal solution is found. The datasets used in this study are the retina, tongue, lenna, baboon, and cameraman images. The experimental results show that this method produces high performance as measured by peak signal-to-noise ratio (PSNR) analysis. The PSNR for the retinal images averaged 40.342 dB, while the tongue images averaged 35.340 dB. The lenna, baboon, and cameraman images produced averages of 33.781 dB, 33.499 dB, and 34.869 dB respectively. Object recognition and identification processes are expected to use this method to produce a high degree of accuracy.
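A short sketch of the Otsu objective used to score candidate threshold vectors (the harmony-search loop that evolves the candidates is omitted here):

```python
import numpy as np

def otsu_objective(hist, thresholds):
    """Between-class variance for a set of thresholds (higher is better)."""
    p = hist / hist.sum()
    levels = np.arange(len(hist))
    mu_total = (p * levels).sum()
    bounds = [0] + sorted(thresholds) + [len(hist)]
    score = 0.0
    for lo, hi in zip(bounds[:-1], bounds[1:]):
        w = p[lo:hi].sum()                       # class probability mass
        if w > 0:
            mu = (p[lo:hi] * levels[lo:hi]).sum() / w
            score += w * (mu - mu_total) ** 2
    return score

# Example: score two candidate thresholds on a grayscale histogram.
img = np.random.default_rng(0).integers(0, 256, (64, 64))
hist = np.bincount(img.ravel(), minlength=256)
print(otsu_objective(hist, [85, 170]))
```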
The document summarizes an adaptive image steganography technique that embeds secret messages into digital images. It proposes using adaptive quantization embedding, where quantization steps for image blocks are optimized to guarantee more data can be embedded in busy image areas with high contrast. The technique embeds adaptive quantization parameters and message bits into the cover image using a difference expanding algorithm. Simulation results showed the proposed scheme can provide a good balance between imperceptibility and embedding capacity.
Text detection and recognition in scene or natural images has applications in computer vision systems such as registration number plate detection, automatic traffic sign detection, image retrieval, and assistance for visually impaired people. Scene text, however, involves complicated backgrounds, blurred images, partly occluded text, variations in font styles, image noise, and varying illumination. Hence scene text recognition is a difficult computer vision problem. In this paper a connected component method is used to extract the text from the background. Horizontal and vertical projection profiles, geometric properties of text, image binarization, and a gap-filling method are used to extract the text from scene images. Then a histogram-based threshold is applied to separate text from the background of the images. Finally, the text is extracted from the images.
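The projection-profile step is simple to sketch in NumPy (the ink threshold is illustrative):

```python
import numpy as np

def projection_profiles(binary):
    """binary: 2-D array with text pixels = 1."""
    return binary.sum(axis=1), binary.sum(axis=0)   # row sums, column sums

def text_line_rows(binary, min_ink=1):
    """Group consecutive inked rows into candidate text lines."""
    h_profile, _ = projection_profiles(binary)
    rows = np.flatnonzero(h_profile >= min_ink)
    return np.split(rows, np.flatnonzero(np.diff(rows) > 1) + 1)
```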
The document presents a novel pre-processing approach to improve optical character recognition (OCR) accuracy on document images. The approach uses a stack of enhancement techniques including illumination adjustment, grayscale conversion, sharpening, and binarization. Specifically, it applies contrast limited adaptive histogram equalization to adjust illumination, uses a luminance algorithm for grayscale optimization, employs unsharp masking to enhance text, and performs Otsu's method for binarization. The proposed method aims to compensate for distortions and improve OCR accuracy in a nonparametric and unsupervised manner. It is evaluated on standard datasets and shown to significantly improve text detection and OCR performance.
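A hedged OpenCV sketch of that enhancement stack; the paper's exact parameters are not given in the summary, so these values are illustrative:

```python
import cv2

def enhance_for_ocr(bgr):
    gray = cv2.cvtColor(bgr, cv2.COLOR_BGR2GRAY)        # luminance grayscale
    clahe = cv2.createCLAHE(clipLimit=2.0, tileGridSize=(8, 8))
    eq = clahe.apply(gray)                              # illumination adjustment
    blur = cv2.GaussianBlur(eq, (0, 0), sigmaX=3)
    sharp = cv2.addWeighted(eq, 1.5, blur, -0.5, 0)     # unsharp masking
    _, binary = cv2.threshold(sharp, 0, 255,
                              cv2.THRESH_BINARY + cv2.THRESH_OTSU)
    return binary                                       # Otsu binarization
```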
Design and Description of Feature Extraction Algorithm for Old English FontIRJET Journal
This document describes a proposed feature extraction algorithm for recognizing Old English font characters. It consists of four main stages: data collection, preprocessing, feature extraction, and recognition using a minimum distance classifier. For preprocessing, binarization and noise removal are performed. The feature extraction algorithm extracts 20 features from each character image by dividing it into 16x16 zones and identifying black pixels in each zone. Classification is done using a minimum distance classifier to match features to class means. The method achieved a 79% recognition rate on the test data.
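A sketch of zoning-based feature extraction: split the character image into a grid and record the black-pixel density per zone. A 4x5 grid is used here so that 20 features come out, matching the reported feature count; the paper's own 16x16 zoning layout differs:

```python
import numpy as np

def zoning_features(char_img, rows=4, cols=5):
    """char_img: binary 2-D array with text pixels = 1; returns rows*cols features."""
    h, w = char_img.shape
    feats = []
    for r in range(rows):
        for c in range(cols):
            zone = char_img[r * h // rows:(r + 1) * h // rows,
                            c * w // cols:(c + 1) * w // cols]
            feats.append(zone.mean())   # fraction of black pixels in the zone
    return np.array(feats)
```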
This document summarizes and reviews various techniques for optical character recognition (OCR) of English text, including matrix matching, fuzzy logic, feature extraction, structural analysis, and neural networks. It discusses the structure and stages of OCR systems, including image preprocessing, segmentation, feature extraction, classification, and output. Challenges for OCR systems include degraded documents like old books, photocopies, and newspapers. The document reviews several related works on OCR and discusses techniques for English, Indian languages, license plate recognition, document binarization, and removing "bleed-through" effects from financial documents.
Preprocessing techniques for recognition of ancient Kannada epigraphsIJECEIAES
The Dravidian language Kannada is most widely spoken in the state of Karnataka and, because of its extensive library of epigraphs, including old manuscripts and inscriptions, it is regarded as a repository of knowledge. To make this knowledge more accessible, efforts are being made to digitize the documents for optimized usage and storage using optical character recognition (OCR). However, these epigraphs are often in poor condition, and the quality of the image fed to the OCR model may not be good enough to achieve high accuracy of recognition and classification. Preprocessing techniques are used to make the dataset more reliable by improving the quality and accuracy of the model. Preprocessing methods, including binarization, smoothing, edge detection, and segmentation, help to increase the model's interpretability, decrease overfitting, and train it more quickly and with fewer resources. When applied to the epigraphs, these preprocessing approaches significantly increase the image quality and minimize noise, making it simpler to identify and digitize the text.
IRJET - Object Detection using Hausdorff DistanceIRJET Journal
This document proposes using Hausdorff distance for object detection as it can better handle noise compared to other methods like Euclidean distance. The document discusses preprocessing images using Gaussian filtering for noise cancellation. It then represents shapes as point sets for feature extraction before using Hausdorff distance to match shapes between reference and test images for object recognition. Encouraging results were obtained when testing on MNIST, COIL and private handwritten digit datasets.
An Efficient Segmentation Technique for Machine Printed Devanagiri Script: Bo...iosrjce
Segmentation plays a major role in processing script documents for the extraction of various features, and many researchers are working to make the segmentation process simple as well as efficient. In this paper a simple segmentation technique for both line and word segmentation of a script document is proposed. The main objective of this technique is to recognize the spaces that separate two text lines; for word segmentation a similar procedure is followed. In this work, three different scanned documents have been taken as input images for both the line and word segmentation techniques. The results were outstanding, with 100% average accuracy for both line and word segmentation. Evaluation results show that our method outperforms several competing methods.
An effective approach to offline arabic handwriting recognitionijaia
Segmentation is the most challenging part of Arabic handwriting recognition, due to the unique characteristics of Arabic writing that allow the same shape to denote different characters. In this paper, an off-line Arabic handwriting recognition system is proposed. The processing is presented in three main stages. First, the image is skeletonized to one pixel thin. Second, each diagonally connected foreground pixel is transferred to the closest horizontal or vertical line. Finally, these orthogonal lines are coded as vectors of unique integer numbers; each vector represents one letter of the word. To evaluate the proposed techniques, the system has been tested on the IFN/ENIT database, and the experimental results show that the method is superior to those currently available.
Binarization of Degraded Text documents and Palm Leaf ManuscriptsIRJET Journal
This document proposes a technique for binarizing degraded text documents and palm leaf manuscripts. It involves taking the average pixel value of the image as a threshold to distinguish foreground from background. The algorithm first computes the average value of the original image and uses it to set pixels above the threshold to black, removing background. It then computes the average of the remaining image, excluding black pixels, and uses that value as a new threshold to set remaining pixels above it to white, extracting the foreground. The technique is tested on old documents and manuscripts, showing improvement over existing methods based on metrics like peak signal-to-noise ratio. While effective for documents, it needs improvement for palm leaf manuscripts with non-uniform degradation.
Medical Prescription Scanner using Optical Character RecognitionIRJET Journal
This document describes a proposed system to digitize medical prescriptions using optical character recognition (OCR). The system would allow users to scan prescriptions using their device's camera. Deep learning techniques like convolutional neural networks would be used to identify characters and text in the prescriptions despite variability in handwriting. The system is meant to help patients and doctors better manage prescriptions by storing them digitally rather than on paper. It discusses techniques for collecting prescription data, preprocessing images, segmenting characters, training and testing OCR models, and implementing the system in an application. Popular tools like TensorFlow, OpenCV, Keras and NumPy would be utilized.
A Texture Based Methodology for Text Region Extraction from Low Resolution Na...CSCJournals
Automated systems for understanding display boards are finding many applications useful in guiding tourists, assisting visually challenged and also in providing location aware information. Such systems require an automated method to detect and extract text prior to further image analysis. In this paper, a methodology to detect and extract text regions from low resolution natural scene images is presented. The proposed work is texture based and uses DCT based high pass filter to remove constant background. The texture features are then obtained on every 50x50 block of the processed image and potential text blocks are identified using newly defined discriminant functions. Further, the detected text blocks are merged and refined to extract text regions. The proposed method is robust and achieves a detection rate of 96.6% on a variety of 100 low resolution natural scene images each of size 240x320.
The document discusses four different methods for Bangla handwritten digit recognition. Method 1 uses preprocessing techniques like binarization, noise reduction, and segmentation followed by feature extraction and classification with a CNN. It achieves 94% accuracy. Method 2 also uses a CNN called MathNET with data augmentation, achieving 97% accuracy. Method 3 uses preprocessing, HOG feature extraction, and an SVM classifier, achieving 97.08% accuracy. Method 4 develops a dataset, performs data augmentation, uses a multi-layer CNN model with ensembling, and achieves 96.788% accuracy even on noisy images. The methods demonstrate high and improving recognition accuracy for Bangla handwritten digits.
COHESIVE MULTI-ORIENTED TEXT DETECTION AND RECOGNITION STRUCTURE IN NATURAL S...ijdpsjournal
Scene text recognition has brought various new challenges in recent years. Detecting and recognizing text in scenes entails some of the same problems as document processing, but recognizing text in natural scene images also raises numerous novel problems. Recent research in these areas has shown some promise, but much work remains to be done. Most existing techniques have focused on detecting horizontal or near-horizontal text. In this paper, we propose a new scheme that detects text of arbitrary orientation in natural scene images. Our algorithm is equipped with two sets of features specially designed to capture the intrinsic characteristics of text, using MSER regions and the Otsu method. To better evaluate our algorithm and compare it with existing algorithms, we use the existing MSRA dataset, the ICDAR dataset, and our new dataset, which includes various texts in various real-world situations. Experimental results on these standard datasets and the proposed dataset show that our algorithm compares favourably with modern algorithms on horizontal text and achieves significantly improved performance on text of arbitrary orientation in complex natural scene images.
This document summarizes an article from the International Journal of Computer Engineering and Technology. The article proposes an effective thresholding method for segmenting text from degraded ancient manuscript images. It begins with converting color images to grayscale. It then calculates an adaptive threshold value based on pixel intensity and contrast within a local window. This threshold value is used to binarize the grayscale image and produce an enhanced output image with clearer segmented text, suitable for applications processing degraded ancient manuscripts. The method is evaluated on over 100 manuscript images and shown to outperform standard global thresholding methods at preserving readable text from original degraded images.
Character recognition of kannada text in scene images using neuralIAEME Publication
Character recognition in scene images is one of the most fascinating and challenging areas of pattern recognition, with various practical applications. It can contribute immensely to the advancement of automation and can improve the interface between man and machine in many applications, such as reading aids for the blind, traffic guidance systems, tour guide systems, and location-aware systems. In this work, a novel method for recognizing basic Kannada characters in natural scene images is proposed. The method uses zone-wise horizontal and vertical profile based features of character images and works in two phases. During training, zone-wise vertical and horizontal profile based features are extracted from training samples and a neural network is trained. During testing, the test image is processed to obtain features and recognized using the neural network classifier. The method has been evaluated on 490 Kannada character images captured with 2-megapixel mobile phone cameras at sizes 240x320, 600x800 and 900x1200, containing samples of different sizes and styles with various degradations, and achieves an average recognition accuracy of 92%. The system is efficient and insensitive to variations in size and font, noise, blur and other degradations.
Character recognition of kannada text in scene images using neuralIAEME Publication
The document summarizes a proposed method for recognizing Kannada characters in low resolution scene images using neural networks. It involves extracting zone-wise horizontal and vertical profile features from character images. During training, features are extracted from samples and used to train a neural network. During testing, features are extracted from test images and recognized using the trained neural network classifier. The method achieved an average recognition accuracy of 92% on 490 Kannada character images captured from mobile phones under varying conditions.
This document presents a method for recovering text from degraded document images. It involves several steps:
1. Constructing a contrast image to distinguish text from background by calculating local image contrast and gradient.
2. Detecting text stroke edges in the contrast image using Otsu's thresholding and Canny edge detection.
3. Estimating a local threshold for binarization based on mean and standard deviation of detected edge pixel intensities.
4. Converting the image to binary format above the threshold.
5. Post-processing to remove unwanted background pixels.
The method is tested on several degraded documents and shows good performance in recovering text content in a short time.
include numbers, letters and symbols into a system-processable format. Segmentation is an important task of any OCR system: it separates the document image into lines, words and characters. Hence the accuracy of an OCR system depends primarily on the segmentation algorithm used.
Segmentation of handwritten text in Indian languages is challenging compared with Latin-based languages because of its structural complexity and the presence of compound characters. This complexity increases further when recognizing text of ancient Indian or non-Indian epigraphical documents. The epigraphical records engraved on stones, rocks, pillars or other writing material are non-linear in shape and non-uniform in size.

The raw image of an epigraph contains unwanted symbols or marks, embedded noise and text engraved with considerable skew. The spacing between characters and between lines, as well as the skew, can complicate the process of translating the scripts. Touching lines and characters further complicate segmentation, which is the input for the recognition process in later stages. Hence the input document image of an epigraph must be preprocessed for noise removal and skew detection and correction, followed by segmentation of characters [1].
In spite of several successful works on OCR across the world, the development of OCR tools for Indian languages is still a challenging task. Character segmentation plays an important role in character recognition, since incorrectly segmented characters are liable to be recognized wrongly. Hence the proposed work focuses on preprocessing and segmentation of ancient handwritten documents. This is an initial step towards developing OCR for ancient scripts, which can be used by archaeologists and historians for digitization and further exploration of ancient records.
This paper is organized as follows: Section 2 elaborates on related work in the field. The system architecture is highlighted in Section 3. The theory and mathematical background of the approaches used in the current system are discussed in Section 4. The methodology is given in Section 5. Experimental results and performance analysis are covered in Section 6, and Section 7 provides the conclusion.
2. RELATED WORK
Researchers have worked on many approaches for preprocessing and segmentation of various
languages. In this section, some of the works are discussed.
The linear Unsharp Masking (USM) technique [2] is adopted to increase the pictorial quality of an image by highlighting its high-frequency content, improving edges and detail. Although this method is simple and gives good results for many applications, it has two limitations: it is extremely sensitive to noise, and it enhances high-contrast areas much more than areas that do not show high image dynamics, so the output image suffers from unpleasant overshoot artifacts. Adaptive Unsharp Masking employs an adaptive filter that controls the contribution of sharpening so that contrast enhancement occurs in high-detail areas and little or no sharpening occurs in smooth areas. This algorithm compares favourably with several other linear unsharp masking filter technologies.
Binarization is the initial step of processing; depending on the degree of degradation of the source document, either global or local thresholding approaches are chosen. The Otsu thresholding algorithm [3], based on histogram shape analysis, is the most widespread global binarization algorithm. Otsu's thresholding yields promising performance when the histogram has a bimodal distribution. The global threshold is selected automatically by a discriminant criterion.
Another method utilizes image contrast, defined from the local image minimum and maximum. Compared with the image gradient, the contrast derived from the local maximum and minimum has good properties and is more tolerant to uneven illumination and document degradation such as smudges [4]. This method is superior when handling document images with difficult background variation. The ancient document image is finally binarized based on local thresholds derived from the detected high-contrast image pixels; compared with the previous contrast-based method, this method uses the image contrast to recognize the text stroke boundary and can produce highly accurate binarization results.
A common technique for cleaning degraded documents is the modified iterative global threshold algorithm [5]. A good approach to separating object information from the background is to compute a global threshold of intensity values by which the pixels can be divided into two clusters. It is an iterative approach that can handle many degraded conditions. In each iteration the intermediate tones are shifted towards the background, thereby providing an efficient separation between foreground and background. It is especially useful for documents having a non-uniform distribution of noise.
In order to make foreground inscriptions clearly visible against the background, histogram normalization is used. The resulting image may still suffer from uneven background intensity variation, which reduces the clarity of the foreground, so further processing is done to obtain an image with better foreground information. As the intensity of the foreground pixels differs from that of the background, this key factor is used to identify the foreground characters. The main criterion is to find a threshold value that causes the image components to lie in one of two levels: L0, below the threshold value, and L1, above it. The pixels above the threshold value represent the nodes of a graph, and these nodes form the basis for representing the foreground. Nodes that are neighbors in the sampling grid are joined by an edge [6].
This technique is used to reduce the number of pixels in the image by a factor of 4 or 8, which in turn decreases the number of pixels that have to be processed. The image is represented as a graph by associating each pixel with a vertex and connecting pixels that are neighbors in the sampling grid by an edge. The gray value of the pixel is treated as an attribute of the vertex. Since the image is of finite size, so is the graph. Pixels represent the finite regions and vertices represent the faces. The dual of this graph represents the borders of the faces, which are inter-pixel edges and vertices. Dual graph pyramids are constructed using a bottom-up approach. Each level of the pyramid represents an adjacency graph in which vertices correspond to regions and edges represent the relations between regions [7].
The procedure of segmentation is of great importance in handwritten script identification, and an extensive survey of research outcomes in the segmentation field was therefore carried out. The algorithm based on connected components [8] segments the document image into non-overlapping, equi-width vertical zones.
A method to decrease the noise level present in a distorted image is proposed in [9]. The distorted image is first binarized using a threshold determined by Otsu's method. Next, for each pixel p(x, y), the horizontal and vertical run-length counts are estimated. If both the horizontal and vertical run-length counts are less than a specified threshold, the pixel is assumed to be noise and is eliminated.
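A straightforward (unoptimized) sketch of this run-length test is shown below; the threshold t and the function name are illustrative, not taken from [9]:

```python
import numpy as np

def run_length_denoise(binary, t=3):
    # binary: 2-D array with foreground = 1, background = 0.
    h, w = binary.shape
    out = binary.copy()
    for y in range(h):
        for x in range(w):
            if not binary[y, x]:
                continue
            # Horizontal run length through (y, x).
            l = r = x
            while l > 0 and binary[y, l - 1]:
                l -= 1
            while r < w - 1 and binary[y, r + 1]:
                r += 1
            # Vertical run length through (y, x).
            u = d = y
            while u > 0 and binary[u - 1, x]:
                u -= 1
            while d < h - 1 and binary[d + 1, x]:
                d += 1
            # Both runs shorter than the threshold: treat as noise.
            if (r - l + 1) < t and (d - u + 1) < t:
                out[y, x] = 0
    return out
```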
Convolution with a Gaussian kernel is a linear operation; it is used to find the common area between the profile and the Gaussian kernel. The degree of shift in the Gaussian kernel during the convolution process varies linearly with the horizontal profile information, so it can be used to represent randomness in the profile. Convolving the kernel with the profile provides a smooth zero-crossing curve, denoted 'C'. The peaks that are above zero are treated as the gaps between the lines, and based on this information, line segmentation is performed [10].
Another method for segmentation is the nearest-neighbor algorithm [11], which is iterative in nature and scans the characters from the top left of the image. When the first black pixel is reached, the first symbol is identified through its connected component. If it is found to be the first character of the script, it is placed as the first character of the new line used for placing the segmented characters. The centroid of the character is calculated and its x and y coordinates are stored in an array. The document is again scanned from the top left to locate the next black pixel, and hence the next character in the document, whose centroid is also computed. The distance between the centroids is computed using the distance formula. If the distance is less than or equal to a threshold value, the character is assumed to lie on the same line; otherwise the character is considered part of the next line and transferred to the next line in the result. This process continues until all characters are scanned and the whole image has been traversed. At the end of the iterative algorithm, the separated lines are obtained from the source, and the individual characters can also be obtained in the same process.
Text line segmentation is necessary to detect all text regions in the document image. One algorithm is based on multiple histogram projections, using morphological operators to extract features of the image. Horizontal projection is performed on the text image, and line segments are identified by the peaks in the horizontal projection; a threshold is applied to divide the text image into segments, and false lines are eliminated using another threshold. Vertical histogram projections are then used on the line segments to decompose them into words using a threshold, and further into characters. Experimental results show that this kind of approach provides the best performance, with a detection rate (DR) of 98% and a recognition accuracy (RA) of 98% [12, 13].
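A minimal sketch of line segmentation by horizontal projection is given below (the thresholds and names are illustrative; the morphological feature extraction of [12, 13] is omitted). Words can then be obtained by applying the same scan to each line's vertical projection.

```python
import numpy as np

def segment_lines(binary, min_line_height=5):
    # binary: page image with text = 1, background = 0.
    profile = binary.sum(axis=1)   # horizontal projection: ink per row
    lines, start = [], None
    for y, ink in enumerate(profile):
        if ink > 0 and start is None:
            start = y                          # a text line begins
        elif ink == 0 and start is not None:
            if y - start >= min_line_height:   # discard false lines
                lines.append(binary[start:y, :])
            start = None
    if start is not None:
        lines.append(binary[start:, :])
    return lines
```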
Contour tracing is a technique applied to digital images to extract the boundary of an object. One system applies a recent contour tracing algorithm, Theo Pavlidis's algorithm [15], which works with 4-connected patterns, to separate characters. The width of each segmented component is then checked: if it is greater than a criterion value based on the average component width, the component is processed in the next stage, since this indicates touching characters that have not been separated [14].
The tracing-of-background-skeleton approach is applied to separate touching components. To segment touching characters, the background skeleton is processed using the Zhang-Suen thinning algorithm, and then a contour tracing algorithm is applied to extract the skeleton of the background. Subsequently, the characters in each line are sorted by checking their column positions in order to determine the sequence of the characters. This has been applied to practical data from ancient documents [15, 16].
Enhanced stroke filters have shifted attention to the skeletons of potential strokes. In this manner, the task of distinguishing strokes from a complicated background is converted into the task of comparing the skeletons of potential strokes with those of disturbing patches, both of which can be extracted from the output of previous stroke filters. Skeleton constraints, such as length and width constraints, can be introduced into the method to enhance stroke information [17].
For the segmentation of unconstrained handwritten connected numerals, the water reservoir technique is used. The reservoir location and size, and the touching position (top, middle or bottom), are determined. By analyzing the reservoir boundary, the touching position and the topological structure of the touching pattern, the best cutting point, and then the cutting path for segmentation, are generated [18].
Segmenting linked characters is a critical preprocessing step in character recognition applications. The classical drop fall algorithm has proved to be an efficient segmentation method due to its simplicity and effectiveness; however, it is susceptible to small convexities on the contours of characters. Xiujuan Wang, Kangfeng Zheng, and Jun Guo presented the inertial drop fall algorithm and the big drop fall algorithm to avoid this defect [19, 20, 21].
3. SYSTEM ARCHITECTURE
The system "Enhancement and Segmentation of Historical Records" consists mainly of the subcomponents Preprocessing and Segmentation, as shown in Figure 1. The input to the system is ancient epigraphic documents with varying degrees of degradation.
• Image Enhancement: This sub-system enhances the quality of the ancient document images by reducing noise. This is carried out by providing four different filtering options with various filter sizes.
• Binarization: This sub-system converts RGB images to binary images using the thresholding technique known as the Otsu algorithm.
• Segmentation: This sub-system samples out characters from the ancient documents, and is achieved through the Drop Fall and Water Reservoir techniques.
Figure 1. Enhancement and Segmentation System
4. REVIEW ON THE APPROACHES USED IN PROPOSED SYSTEM
The methods used in the current system are described in this section. The preprocessing stage for degraded ancient document images includes enhancement and noise reduction, achieved through different filtering methods for smoothing or sharpening, namely the Bilateral, Mean, Median, and Gaussian blur filters. These filters are provided with different mask sizes and parameter values. This is followed by binarization of the enhanced image, using the Otsu thresholding algorithm, to highlight the foreground information. Character segmentation is then performed based on the Drop Fall and Water Reservoir concepts.
4.1 Mean Filter
The mean filter is a sliding-window spatial filter that replaces the center value in the window with the average (mean) of all the pixel values in that window [22]. Let Sxy represent the set of coordinates in a rectangular sub-image window of size m x n, centered at point (x, y). The arithmetic mean filter computes the average value of the corrupted image g(x, y) in the area defined by Sxy. The restored image value at any point (x, y) is simply the arithmetic mean calculated over the pixels in that region, as given in Equation 1.
f̂(x, y) = (1/mn) Σ_{(s,t)∈Sxy} g(s, t)    (1)
This operation can be implemented using a convolution mask in which all coefficients have the value 1/mn. The mean filter smooths local variations in an image, and as a result of this blurring, noise is reduced.
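For illustration, a 4x4 mean filter can be applied with OpenCV's built-in box filter, or as an explicit convolution with a 1/mn mask; the file name is a placeholder:

```python
import cv2
import numpy as np

gray = cv2.imread("epigraph.png", cv2.IMREAD_GRAYSCALE)  # placeholder file
# Built-in box filter: every mask coefficient equals 1/(m*n).
mean_4x4 = cv2.blur(gray, (4, 4))
# Essentially equivalent explicit convolution with a 1/16 mask.
mask = np.ones((4, 4), np.float32) / 16.0
mean_4x4_conv = cv2.filter2D(gray, -1, mask)
```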
4.2 Median Filter
In image processing, neighborhood averaging is a common method of noise reduction: it can suppress isolated out-of-range noise, but it has the adverse effect of also blurring sudden changes such as sharp edges. The median filter can suppress the noise without damaging the sharp edges. In median filtering, the pixel values in the window are first sorted into numerical order and the center pixel is then replaced with the middle (median) value [22].
Let y denote the filtered image and w a neighborhood centered on location (m, n) in the input image x; the median filter is then given by Equation 2:

y[m, n] = median{ x[i, j] : (i, j) ∈ w }    (2)

Here y[m, n] is the filtered value at coordinates (m, n), and w is the set of neighborhood pixels surrounding position (m, n); the median is taken over all pixels x[i, j] with (i, j) in this neighborhood.
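An illustrative median filtering call using OpenCV is given below; note that cv2.medianBlur accepts only odd aperture sizes, so a 3x3 window stands in here for the 2x2 and 4x4 masks used in this paper:

```python
import cv2

gray = cv2.imread("epigraph.png", cv2.IMREAD_GRAYSCALE)  # placeholder file
# Replace each pixel with the median of its 3x3 neighborhood.
median = cv2.medianBlur(gray, 3)
```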
4.3 Gaussian blur Filter
Gaussian blur, or Gaussian smoothing, blurs an image with a Gaussian function and is used to reduce image noise and detail. Gaussian smoothing is also used as a preprocessing stage in computer vision algorithms in order to enhance image structures at different scales. The 2-dimensional Gaussian function is given by Equation 3:
G(x, y) = (1/(2πσ²)) exp(−(x² + y²)/(2σ²))    (3)
where x is the distance from the origin along the horizontal axis, y is the distance from the origin along the vertical axis, and σ is the standard deviation of the Gaussian distribution. The standard deviation σ determines the amount of smoothing [22].
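An illustrative Gaussian smoothing call follows; with ksize = (0, 0), OpenCV derives the (odd) mask size from sigma, so the amount of smoothing is controlled directly by σ:

```python
import cv2

gray = cv2.imread("epigraph.png", cv2.IMREAD_GRAYSCALE)  # placeholder file
# sigma controls the amount of smoothing; the value is illustrative.
blurred = cv2.GaussianBlur(gray, (0, 0), sigmaX=1.0)
```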
4.4 Bilateral Filter
The bilateral filter [23] is a non-linear filter in the spatial domain, which performs averaging without smoothing the edges. The bilateral filter computes a weighted sum of the pixels in a local neighborhood, where the weights depend on both the spatial distance and the intensity distance. Essentially, each bilateral filter weight is the product of two Gaussian weights, one corresponding to the spatial distance and the other to the intensity difference. When either weight is close to 0 the product becomes insignificant, so no smoothing occurs in regions where the intensity changes swiftly, which usually correspond to sharp edges. As a result, the bilateral filter preserves sharp edges [28]. For a pixel location x, the bilateral filter output is given by Equation 4:
Î(x) = (1/C) Σ_{y∈N(x)} exp(−‖y − x‖²/(2σd²)) · exp(−|I(y) − I(x)|²/(2σr²)) · I(y)    (4)
Here σd and σr are the parameters controlling the fall-off of the weights in the spatial and intensity domains respectively, I and Î are the input and output images, N(x) is the spatial neighborhood of pixel x, and the normalization constant C is given by Equation 5:
C = Σ_{y∈N(x)} exp(−‖y − x‖²/(2σd²)) · exp(−|I(y) − I(x)|²/(2σr²))    (5)
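An illustrative bilateral filtering call; the parameter values are assumptions, not the settings used in the experiments:

```python
import cv2

gray = cv2.imread("epigraph.png", cv2.IMREAD_GRAYSCALE)  # placeholder file
# d: neighborhood diameter; sigmaColor plays the role of sigma_r
# (intensity fall-off) and sigmaSpace of sigma_d (spatial fall-off).
smoothed = cv2.bilateralFilter(gray, d=5, sigmaColor=50, sigmaSpace=50)
```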
4.5 Otsu’s Method of Binarization
Binarization is the method of converting a grayscale image to a binary image, using a threshold selection procedure to categorize each pixel of the image into one of two classes. Binarization using Otsu's method [23] automatically performs histogram shape-based image thresholding, i.e. the reduction of a gray-level image to a binary image. The algorithm assumes that the image to be thresholded contains two classes of pixels, and then calculates the optimum threshold separating those two classes so that their combined spread (intra-class variance) is minimal.
In Otsu's method, the weighted sum of the variances of the two classes is given by:

σw²(t) = ω₁(t) σ₁²(t) + ω₂(t) σ₂²(t)    (6)

where the weights ω₁, ω₂ are the probabilities of the two classes separated by a threshold t, and σ₁², σ₂² are the variances of these classes. Minimizing this intra-class variance is equivalent to maximizing the between-class variance:

σb²(t) = σ² − σw²(t) = ω₁(t) ω₂(t) [µ₁(t) − µ₂(t)]²    (7)

The class probability ω₁(t) is computed from the histogram as:

ω₁(t) = Σ_{i=0..t} p(i)    (8)

while the class mean µ₁(t) is:

µ₁(t) = [Σ_{i=0..t} p(i) x(i)] / ω₁(t)    (9)

where p(i) is the histogram probability of intensity level i and x(i) is the value at the center of the i-th histogram bin. Similarly, ω₂(t) and µ₂(t) are computed on the right-hand side of the histogram for bins greater than t.
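A compact NumPy sketch of this computation, evaluating the between-class variance of Equation 7 for every candidate threshold (an illustrative implementation; in practice cv2.threshold with the THRESH_OTSU flag yields the same threshold):

```python
import numpy as np

def otsu_threshold(gray):
    # Histogram -> per-bin probabilities p(i).
    hist = np.bincount(gray.ravel(), minlength=256).astype(np.float64)
    p = hist / hist.sum()
    omega1 = np.cumsum(p)               # omega_1(t), Equation 8
    mu = np.cumsum(p * np.arange(256))  # cumulative mean up to t
    mu_total = mu[-1]
    # Between-class variance sigma_b^2(t), Equation 7, for every t.
    with np.errstate(divide="ignore", invalid="ignore"):
        sigma_b2 = (mu_total * omega1 - mu) ** 2 / (omega1 * (1.0 - omega1))
    sigma_b2 = np.nan_to_num(sigma_b2)
    return int(np.argmax(sigma_b2))     # t that maximizes sigma_b^2
```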
4.6 Drop Fall Algorithm for Segmentation
The drop fall algorithm [24] is based on the principle that a nearly ideal cut between two touching characters can be created by rolling a hypothetical marble off the top of the first character and making the cut where the marble falls. The important issue in this implementation is where to drop the marble from: if the algorithm starts at the wrong place, the marble may simply roll down the left side of the first digit or the right side of the second digit, and the cut will fail completely. The best place to start the drop fall process is as close as possible to the point at which the two characters touch. The pixels are therefore scanned row by row until a black boundary pixel is found with an adjacent black boundary pixel to its right, the two pixels being separated by white space. This pixel is used as the starting point of the drop fall, as shown in Figure 2.
Figure 2. Identification of Initial Pixel Positions
9. Computer Science & Information Technology (CS & IT) 103
The direction in which the algorithm moves depends on the current pixel position and its surroundings, as shown in Figure 3; a minimal sketch of these movement rules follows the figure.
Figure 3. The Principle of Drop Fall algorithm
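A minimal sketch of these movement rules, assuming 0 marks background and 1 marks ink, with the start pixel located by the row scan described above:

```python
def drop_fall_path(img, x0, y0):
    # img: binary array (0 = background, 1 = ink); (x0, y0): start pixel.
    h, w = img.shape
    x, y = x0, y0
    path, visited = [(x, y)], {(x, y)}
    while y < h - 1:
        # Movement priority: down, down-right, down-left, right, left.
        for dx, dy in [(0, 1), (1, 1), (-1, 1), (1, 0), (-1, 0)]:
            nx, ny = x + dx, y + dy
            if 0 <= nx < w and img[ny, nx] == 0 and (nx, ny) not in visited:
                x, y = nx, ny
                break
        else:
            y += 1  # blocked by ink on all sides: cut straight through it
        path.append((x, y))
        visited.add((x, y))
    return path     # the segmentation cut follows this path
```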
4.7 Water Reservoir Algorithm
The large space generated by touching characters is analyzed with the help of the water reservoir concept. The working principle of the water reservoir method is illustrated in Figure 4. When water is poured from the top (bottom) of a component, the regions of the component where the water is stored are considered the top (bottom) reservoirs; that is, the top (bottom) reservoir is the reservoir obtained when water is poured from the top (bottom). White spaces are found in the regions of the component's bounding box where water can be stored, and these regions are called water reservoirs. Not all reservoirs obtained in this procedure are considered for further processing: only those whose heights are greater than a threshold value T1 are retained.
Figure 4. Reservoirs formed by water flow from the top and bottom, shown for (a) top, (b) middle and (c) bottom touching numerals. Top reservoirs are marked by dots and bottom reservoirs by small line segments.
When two characters touch each other, they create a space (reservoir) between them. This space is very important for segmentation because:
a) cutting points are concentrated around the base of the reservoir, which reduces the search area;
b) the cutting points lie on the base of the reservoir;
c) the attributes of the space (center of gravity and height) help to home in on the best touching position.
If water is poured from the top (bottom) of the large space created by a touching connected numeral, water will be stored in this large space. This water-stored area is termed the "water reservoir" [25]. Figure 5 illustrates this.
Figure 5. Examples of a touching numeral and the space created by the touching
Figure 6 depicts the detection of the touching position. The largest reservoir of the component whose center of gravity lies in the vm region is found; this reservoir is known as the best reservoir for touching. The base-line (lowermost row) of the best reservoir is then identified. To locate the touching position within the components, a morphological thinning operation is applied to the touching components for further processing. The touching position must be known for feature point extraction. The leftmost and rightmost points of the base-line of the considered reservoirs are the initial feature points. From these initial feature points, the best feature point (the one with the maximum confidence value) is chosen for segmentation. To calculate the confidence value (CV), features such as the Euclidean distance of the feature points from the center of gravity of the touching component are considered. An illustrative computation of reservoir depths is sketched after Figure 6.
Figure 6. Feature Detection Approach.
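By analogy with the classic "trapped rain water" problem, the depth of the top reservoir in each column of a touching component can be approximated as sketched below; this is an illustrative reading of the reservoir computation, not the authors' code:

```python
import numpy as np

def top_reservoir_depths(component):
    # component: binary array of one touching component, ink = 1.
    h, w = component.shape
    # Top profile: row index of the first ink pixel in each column.
    top = np.array([int(np.argmax(component[:, x]))
                    if component[:, x].any() else h for x in range(w)])
    depth = np.zeros(w, dtype=int)
    for x in range(1, w - 1):
        # Water level is limited by the shorter of the two highest
        # side walls (smaller row index = higher wall).
        level = max(top[:x].min(), top[x + 1:].min())
        if top[x] > level:
            depth[x] = top[x] - level  # water stored above this column
    return depth
```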
5. PROPOSED SYSTEM AND METHODOLOGY
5.1 DFD of the Preprocessing and Segmentation System
The Data Flow Diagram (DFD) of the current system is shown in Figure 7. The work is carried out in two phases. In the first phase, Preprocessing, the degraded ancient document image is taken as input and converted to grayscale. Then smoothing or sharpening filters, namely the Median, Gaussian, Mean and Bilateral filters with different mask sizes, are applied to reduce the amount of noise and thus enhance the image. Next, the enhanced image is binarized using the Otsu algorithm to differentiate the background and foreground of the ancient document.

In the second phase, Segmentation is carried out using the Drop Fall and Water Reservoir algorithms to obtain sampled characters, which can be used in later stages of OCR.
Figure 7. DFD of the Preprocessing and Segmentation Phases
5.2 Methodology
This section covers in detail the functionality of the two phases, Preprocessing and Segmentation. The ancient input image is first converted to a grayscale image. The image is enhanced and noise is reduced by applying four different filters with mask sizes of 2x2 and 4x4. Next, the enhanced image is converted to a binary image. Lastly, the characters in the document are sampled out using the Drop Fall and Water Reservoir approaches. A minimal sketch of this end-to-end flow is given below.
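A minimal end-to-end sketch of this pipeline using OpenCV (the file name, filter choice and mask size are placeholders):

```python
import cv2

# Phase 1: grayscale conversion, filtering, Otsu binarization.
img = cv2.imread("epigraph.png")                  # placeholder input
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
enhanced = cv2.medianBlur(gray, 5)                # one of the four filters
_, binary = cv2.threshold(enhanced, 0, 255,
                          cv2.THRESH_BINARY + cv2.THRESH_OTSU)
# Phase 2: `binary` is then passed to the Drop Fall / Water Reservoir
# segmentation routines sketched in Section 4.
```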
5.2.1 Preprocessing
Image Enhancement
Input: Degraded grayscale image
Functionality: Enhances an epigraphic image of medium-level degradation by applying spatial filters, namely the Median, Gaussian blur, Mean and Bilateral filters, to the input image.
Output: The enhanced image with reduced noise.
Algorithm for Enhancement using Median filter
[Step 1]: Read the gray image.
[Step 2]: Compute y[m, n] = median{ x[i, j] : (i, j) ∈ w }, where y is the output pixel value and w is a neighborhood centered on location (m, n) in the image.
[Step 3]: Apply the Median filter designed over the entire input image to obtain
enhanced image.
Algorithm for Enhancement using Gaussian blur filter
[Step 1]: Read the gray image.
[Step 2]: Compute G(x, y) = (1/(2πσ²)) exp(−(x² + y²)/(2σ²)), where x is the distance from the origin along the horizontal axis, y is the distance from the origin along the vertical axis, and σ is the standard deviation of the Gaussian distribution.
[Step 3]: Apply the Gaussian filter designed over the entire input image to obtain
enhanced image.
Algorithm for Enhancement using Mean filter
[Step 1]: Read the gray image.
[Step 2]: Compute f̂(x, y) = (1/mn) Σ_{(s,t)∈Sxy} g(s, t), where Sxy represents the set of coordinates in a rectangular sub-image window of size m x n, centered at point (x, y).
[Step 3]: Apply the Mean Filter designed over the entire input image to obtain
enhanced image.
Algorithm for Enhancement using Bilateral filter
[Step 1]: Read gray image.
[Step 2]: Compute C = Σ_{y∈N(x)} exp(−‖y − x‖²/(2σd²)) · exp(−|I(y) − I(x)|²/(2σr²))
[Step 3]: Compute Î(x) = (1/C) Σ_{y∈N(x)} exp(−‖y − x‖²/(2σd²)) · exp(−|I(y) − I(x)|²/(2σr²)) · I(y)
[Step 4]: Apply the Bilateral Filter designed over the entire input image to obtain
enhanced image.
Binarization
Input: Enhanced Image
Functionality: The enhanced images are converted to binary images consisting of ones and zeros.
Output: Binarized image
Algorithm for Binarization
[Step 1]: Compute the histogram and the probabilities of each intensity level
[Step 2]: Initialize ω₁(0) and µ₁(0)
[Step 3]: Step through all possible thresholds t = 1, ..., maximum intensity: update ω₁ and µ₁ and compute σb²(t)
[Step 4]: The desired threshold corresponds to the maximum of σb²(t)
[Step 5]: Compute the two maxima (and the two corresponding thresholds): σb1²(t) is the greater maximum and σb2²(t) is the greater or equal maximum
[Step 6]: The required threshold is (threshold1 + threshold2) / 2
5.2.2 Segmentation
The segmentation of the document image is carried out at the character level using Drop Fall and
Water Reservoir Approaches.
Input: Binary epigraph image
Functionality: The binarized image is segmented into characters
Output: Segmented characters of the input epigraph.
Drop fall algorithm
The drop fall algorithm forms a segmentation path by rolling a hypothetical marble between two touching characters and outputs the segmented characters.
[Step 1]: Input the binary image
[Step 2]: Find the height and width of the touching characters
[Step 3]: Apply the Breadth First Search (BFS) algorithm to find the touching characters
[Step 4]: If found, start the drop fall; the drop always moves downwards, diagonally downwards, to the right, or to the left
[Step 5]: Make the cut where the marble comes to rest; thus the segmentation path for the connected components is found
Water Reservoir Algorithm
[Step 1]: Find the size of the characters to identify touching characters.
[Step 2]: Analyze the positions and sizes of the reservoirs; a reservoir is detected where touching occurs, and the initial feature points for segmentation are noted.
[Step 3]: The best feature points are selected from the initial feature points.
[Step 4]: Based on the touching position, closed-loop positions and the morphological structure of the touching region, the cutting path is produced.
6. EXPERIMENTAL RESULTS, ANALYSIS AND DISCUSSION
6.1 Experimental Results
The developed system was tested on nearly 150 ancient epigraphic images and the results were found to be satisfactory. Sample experimental results are depicted in the following figures. Figure 8 shows an input ancient historical record, which is analysed using different spatial filtering techniques.
Figure 8. Input Image selected for Pre-processing
Figure 9 shows the result of color to grayscale conversion, carried out on the image shown in Figure 8.
Figure 9. The result of Gray Scale Conversion
Figures 10(a) and 10(b) show the results of median filtering for mask sizes of 2x2 and 4x4 respectively. The median filter is an effective method that can suppress isolated noise without blurring sharp edges.
Figure 10(a). The result of Median filtering for Mask size 2x2
Figure 10(b). The result of Median Filter for Mask size 4x4
Figures 11(a) and 11(b) show the results of Gaussian blur filtering for mask sizes of 2x2 and 4x4. Gaussian blurring softens the sharp image so that an image with less pronounced edges is produced.
Figure 11(a). The result of Gaussian Blur Filtering for Mask size 2x2
Figure 11(b). The result of Gaussian blur Filtering for Mask size 4x4
Figure 12 shows the results of Mean filtering for the mask size of 4x4. The Mean filter is a
simple filter that replaces the center value in the window with the mean of all the pixel values in
that window.
Figure 12. The result of Mean Filtering for Mask size 4x4
Figure 13 shows the results of bilateral filtering.
Figure 13. The result of Bilateral Filtering
Figure 14 shows the result of binarization, in which the enhanced image is converted to a binary image.
Figure 14. The results of Binarization
Figures 15 and 16 show the results of character segmentation using the Drop Fall and Water Reservoir algorithms respectively.
Figure 15. The result of Segmented Characters using Drop Fall algorithm
Figure 16. The result of Segmented Characters using Water Reservoir algorithm
6.2 Performance Analysis
The dataset includes 150 samples of medium-degraded images for preprocessing and segmentation. The performance of the two phases is discussed below:
6.2.1 Filtering Techniques
The system was tested on 150 samples of medium-degraded images, using the four spatial filtering techniques with varying mask sizes. The enhancement was found to be appreciable with a 4x4 mask for the Median filter: with the larger mask size, the output image appears clearer with sharp edges. A 2x2 mask for Gaussian blur results in a suitably blurred image. A 4x4 mask for the Mean filter smooths local variations in the image, and noise is reduced as a result of the blurring. The Bilateral filter sharpens the edges. The smoothing Gaussian filter gives good accuracy if the edges in the input image are very thick, whereas the sharpening Bilateral filter gives better output for medium-degraded images.
6.2.2 Segmentation
The system showed good results when tested on 150 images with varying degradation, giving a segmentation rate of 85%-90% for both the Drop Fall and Water Reservoir algorithms.
Figure 17. Segmentation rate of the Drop Fall and Water Reservoir techniques
7. CONCLUSION
The system showed good results when tested on the 150 samples of degraded epigraphs of varying quality. It provides better enhanced output when filters of appropriate mask size are applied: a 4x4 mask for the Median filter, a 2x2 mask for Gaussian blur, and 4x4 masks for the Mean and Bilateral filters. Segmentation is carried out using the Drop Fall and Water Reservoir algorithms, and the system can efficiently segment characters from ancient document images. The system segments compound characters correctly when connectivity is present; in the few cases where connectivity is absent, the parts of a compound character are segmented separately. A segmentation rate of 85%-90% is achieved for both the Drop Fall algorithm and the Water Reservoir algorithm.
REFERENCES
[1] Tanzila Saba, Ghazali Sulong & Amjad Rehman, “A Survey on Methods and Strategies on Touched Characters Segmentation”, International Journal of Research and Reviews in Computer Science (IJRRCS), Vol. 1, No. 2, June 2010.
[2] M. Trentacoste et al., “Unsharp Masking, Countershading and Halos: Enhancements or Artifacts?”, The Eurographics Association and Blackwell Publishing Ltd., 2012.
[3] Maya R. Gupta, Nathaniel P. Jacobson, Eric K. Garcia, “Binarization and Image Preprocessing for Searching Historical Documents”, University of Washington, Seattle, Washington 98195, April 2006.
[4] Bolan Su, Shijian Lu, Chew Lim Tan, “Binarization of Historical Document Images Using the Local Maximum and Minimum”, pp. 9-11, June 2010.
[5] N. Venkata Rao, A. V. Srinivasa Rao, S. Balaji and L. Pratap Reddy, “Cleaning of Ancient Document Images Using Modified Iterative Global Threshold”, IJCSI International Journal of Computer Science Issues, Vol. 8, Issue 6, No. 2, ISSN (online): 1694-0814, November 2011.
[6] Dr. B. P. Mallikarjunaswamy, Karunakara K, “Graph Based Approach for Background and Segmentation of the Image”, Vol. 02, ISSN: 2230-8563, June 2011.
[7] Mamatha H.R and Srikanta Murthy K, “Morphological Operations and Projection Profiles based Segmentation of Handwritten Kannada Documents”, International Journal of Applied Information Systems (IJAIS), ISSN: 2249-0868, Foundation of Computer Science (FCS), New York, USA, Vol. 4, No. 5, pp. 13-19, October 2012.
[8] Das, Reddy, Govardhan, Saikrishna, “Segmentation of Overlapping Text Lines and Characters in Printed Telugu Text Document Images”, International Journal of Engineering Science and Technology, Vol. 2(11), pp. 6606-6610, 2010.
[9] Karthik S, Mamatha H.R, Srikanta Murthy K, “An Approach based on Run Length Count for Denoising the Kannada Characters”, International Journal of Computer Applications, Vol. 50, No. 18, ISSN: 0975-8887, July 2012.
[10] N. Anupama, Ch. Rupa & Prof. E. Sreenivasa Reddy, “Character Segmentation for Telugu Image Document using Multiple Histogram Projections”, Global Journal of Computer Science and Technology: Graphics & Vision, Vol. 13, 2013.
[11] Srinivasa Rao A.V., “Segmentation of Ancient Telugu Text Documents”, I.J. Image, Graphics and Signal Processing, MECS, pp. 8-14, published online July 2012.
[12] Partha Pratim Roy, Umapada Pal, Josep Llados, Mathieu Delalandre, “Multi-oriented and Multi-sized Touching Character Segmentation using Dynamic Programming”, 10th International Conference on Document Analysis and Recognition (ICDAR), DOI 10.1109/ICDAR.124, IEEE, 2009.
[13] Indu Sreedevi, Rishi Pandey, N. Jayanthi, Geetanjali Bhola, Santanu Chaudhury, “NGFICA Based Digitization of Historic Inscription Images”, ISRN Signal Processing, Vol. 3, 2013.
[14] Pavlidis, T., “Algorithms for Graphics and Image Processing”, Computer Science Press, Rockville, Maryland, 1982.
[15] Zhang, T. Y. and Suen, C. Y., “A Fast Parallel Algorithm for Thinning Digital Patterns”, Communications of the ACM, Vol. 27, No. 3, pp. 236-239, 1984.
[16] B. Gangamma, Srikanta Murthy K, Arun Vikas Singh, “Restoration of Degraded Historical Document Image”, Journal of Emerging Trends in Computing and Information Sciences, Vol. 3, No. 5, May 2012.
[17] Xiaoqing Lu, Zhi Tang, Yan Liu, Liangcai, “Stroke-based Character Segmentation of Low-quality Images on Ancient Chinese Tablet”, 12th International Conference on Document Analysis and Recognition, 2013.
[18] U. Pal, A. Belaid and Ch. Choisy, “Touching Numeral Segmentation using Water Reservoir Concept”, Pattern Recognition Letters, Vol. 24, pp. 262-272, 2003.
[19] S. Khan, “Character Segmentation Heuristics for Check Amount Verification”, Master's Thesis, Massachusetts Institute of Technology, 1998.
[20] X. Wang, K. Zheng and J. Guo, “Inertial and Big Drop Fall Algorithm”, International Journal of Information Technology, Vol. 12, No. 4, 2006.
[21] Parminder Singh and Harjinder Singh, “A Comparison of High Pass Spatial Filters using Measurements and Automation”, IJERT, Vol. 1, May 2012.
[22] Mamatha H.R, Sonali Madireddi, Srikanta Murthy K, “Performance Analysis of Various Filters for De-noising of Handwritten Kannada Documents”, International Journal of Computer Applications, ISSN: 0975-8887, Vol. 48, No. 12, June 2012.
[23] Messaoud, I. B., et al., “Region Based Local Binarization Approach for Handwritten Ancient Documents”, in Proc. Int. Conf. on Frontiers in Handwriting Recognition, Bari, pp. 633-638, 2012.
[24] Srinivasa Rao A. V., D. R. Sandeep, V. B. Sandeep, S. Dhanam Jaya, “Segmentation of Touching Handwritten Telugu Characters by using Drop Fall Algorithm”, International Journal of Computers & Technology, Vol. 3, No. 2, October 2012.
[25] Praveen Kumar C., Kiran Y. C., “Kannada Handwritten Character Segmentation using Water Reservoir Method”, International Journal of Systems, Algorithms & Applications, 2012.