Many word spotting strategies for modern documents are not directly applicable to historical handwritten documents due to the variety of writing styles and intense degradation. In this paper, a new method for effective word spotting in handwritten documents is presented that relies upon document-specific local features capturing texture information around representative keypoints. Experimental work on two historical handwritten datasets using standard evaluation measures shows the improved performance achieved by the proposed methodology.
Text extraction using document structure features and support vector machines - Konstantinos Zagoris
In order to successfully locate and retrieve document images such as technical articles and newspapers, a text localization technique must be employed. The proposed method detects and extracts homogeneous text areas in document images, regardless of font type and size, by using connected components analysis to detect blocks of foreground objects. Next, a descriptor consisting of a set of structural features is extracted from the merged blocks and used as input to a trained Support Vector Machine (SVM). Finally, the SVM output classifies each block as text or non-text.
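The first stage of the pipeline above, grouping foreground pixels into candidate blocks, can be sketched as a connected-components pass. This is a generic 4-connected flood-fill labelling, not the paper's exact procedure, and the structural-feature descriptor and SVM stages are not reproduced:

```python
# Hedged sketch: 4-connected component labelling over a binary foreground
# mask, the block-detection step that precedes feature extraction. The
# grid and labelling scheme are illustrative, not the paper's own.

def connected_components(grid):
    rows, cols = len(grid), len(grid[0])
    labels = [[0] * cols for _ in range(rows)]
    current = 0
    for r in range(rows):
        for c in range(cols):
            if grid[r][c] and not labels[r][c]:
                current += 1                      # start a new component
                stack = [(r, c)]
                labels[r][c] = current
                while stack:                      # iterative flood fill
                    y, x = stack.pop()
                    for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                        if 0 <= ny < rows and 0 <= nx < cols \
                                and grid[ny][nx] and not labels[ny][nx]:
                            labels[ny][nx] = current
                            stack.append((ny, nx))
    return current, labels

# Toy binary image with two separate foreground blobs.
img = [
    [1, 1, 0, 0],
    [0, 0, 0, 1],
    [0, 0, 1, 1],
]
n, labels = connected_components(img)
```

Each labelled component would then be described by structural features and passed to the classifier.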
ICFHR 2014 Competition on Handwritten KeyWord Spotting (H-KWS 2014) - Konstantinos Zagoris
H-KWS 2014 is the Handwritten Keyword Spotting Competition organized in conjunction with the ICFHR 2014 conference. The main objective of the competition is to record current advances in keyword spotting algorithms using established performance evaluation measures frequently encountered in the information retrieval literature. The competition comprises two distinct tracks, namely a segmentation-based and a segmentation-free track. Five (5) distinct research groups participated in the competition, with three (3) methods for the segmentation-based track and four (4) methods for the segmentation-free track. The benchmarking datasets used in the contest contain both historical and modern documents from multiple writers. In this paper, the contest details are reported, including the evaluation measures and the performance of the submitted methods, along with a short description of each method.
Handwritten and Machine Printed Text Separation in Document Images using the ... - Konstantinos Zagoris
In a number of types of documents, ranging from forms to archive documents and books with annotations, machine printed and handwritten text may be present in the same document image, giving rise to significant issues within a digitisation and recognition pipeline. It is therefore necessary to separate the two types of text before applying different recognition methodologies to each. In this paper, a new approach is proposed which strives towards identifying and separating handwritten from machine printed text using the Bag of Visual Words paradigm (BoVW). Initially, blocks of interest are detected in the document image. For each block, a descriptor is calculated based on the BoVW. The final characterization of the blocks as Handwritten, Machine Printed or Noise is made by a Support Vector Machine classifier. The promising performance of the proposed approach is shown by using a consistent evaluation methodology which couples meaningful measures along with a new dataset.
Textual information in images constitutes a very rich source of high-level semantics for retrieval and indexing. In this paper, a new approach is proposed using Cellular Automata (CA) which strives towards identifying scene text in natural images. Initially, a binary edge map is calculated. Then, taking advantage of the flexibility of CA, the transition rules change across four consecutive steps, resulting in a four-time-step CA evolution. Finally, a post-processing technique based on edge projection analysis is employed for high-density edge images, eliminating possible false positives. Evaluation results indicate considerable performance gains without sacrificing text detection accuracy.
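One CA evolution step over a binary edge map can be sketched as follows. The paper's actual transition rules are not given in the abstract, so the rule below is an illustrative stand-in: a cell turns on when at least two of its eight neighbours are on, which acts like a mild dilation linking nearby edge fragments:

```python
# Hedged sketch: a single evolution step of a binary cellular automaton on
# an edge map. The neighbour-count rule here is an assumption for
# illustration, not the transition rules used in the paper.

def ca_step(grid, threshold=2):
    rows, cols = len(grid), len(grid[0])
    nxt = [[0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            # count "on" cells in the 8-neighbourhood, clipped at borders
            on_neighbours = sum(
                grid[rr][cc]
                for rr in range(max(0, r - 1), min(rows, r + 2))
                for cc in range(max(0, c - 1), min(cols, c + 2))
                if (rr, cc) != (r, c)
            )
            nxt[r][c] = 1 if (grid[r][c] or on_neighbours >= threshold) else 0
    return nxt

# Two nearby edge fragments that the step should connect.
edges = [
    [1, 0, 1, 0],
    [0, 0, 0, 0],
    [0, 0, 0, 0],
]
step1 = ca_step(edges)
```

Applying four such steps with varying rules would mirror the four-time-step evolution the abstract describes.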
A system was developed that is able to retrieve specific documents from a document collection. In this system the query is given as text by the user and then transformed into an image. Appropriate features were selected in order to capture the general shape of the query and ignore details due to noise or different fonts. To demonstrate the effectiveness of our system, we used a collection of noisy documents and compared our results with those of a commercial OCR package.
Color reduction using the combination of the Kohonen self organized feature m... - Konstantinos Zagoris
The color of digital images is one of the most important components of the image processing research area. In many applications, such as image segmentation, analysis, compression and transmission, it is preferable to reduce the number of colors as much as possible. In this paper, a color clustering technique combining a neural network and a fuzzy algorithm is proposed. Initially, the Kohonen Self-Organized Feature Map (KSOFM) is applied to the original image. Then, the KSOFM results are fed to the Gustafson-Kessel (GK) fuzzy clustering algorithm as starting values. Finally, the output classes of the GK algorithm define the number of colors to which the image will be reduced.
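The first stage of this pipeline, the Kohonen map, can be sketched as a tiny self-organizing map that clusters pixel colours; the Gustafson-Kessel refinement stage is omitted, and the node count and learning-rate schedule below are illustrative assumptions:

```python
import random

# Hedged sketch of the KSOFM stage only: competitive learning pulls each
# winning node towards the pixels it serves, so nodes converge to the
# dominant colours. The GK fuzzy-clustering refinement is not shown.

def train_som(pixels, n_nodes=4, epochs=20, seed=0):
    rng = random.Random(seed)
    nodes = [list(rng.choice(pixels)) for _ in range(n_nodes)]
    for epoch in range(epochs):
        lr = 0.5 * (1 - epoch / epochs)          # decaying learning rate
        for px in pixels:
            # best-matching unit: node closest to the pixel in RGB space
            bmu = min(range(n_nodes),
                      key=lambda i: sum((nodes[i][k] - px[k]) ** 2 for k in range(3)))
            for k in range(3):                   # pull the winner towards the pixel
                nodes[bmu][k] += lr * (px[k] - nodes[bmu][k])
    return nodes

def quantize(px, nodes):
    bmu = min(range(len(nodes)),
              key=lambda i: sum((nodes[i][k] - px[k]) ** 2 for k in range(3)))
    return tuple(round(v) for v in nodes[bmu])

# Two colour clusters (reds and blues) reduced to two palette entries.
pixels = [(250, 10, 10), (240, 20, 15), (10, 10, 245), (5, 25, 250)]
nodes = train_som(pixels, n_nodes=2)
```

In the full method, the converged node colours would seed the GK algorithm rather than form the final palette directly.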
Enhancement and Segmentation of Historical Records - csandit
Document Analysis and Recognition (DAR) aims to extract the information in a document automatically and to support human comprehension. The automatic processing of degraded historical documents is an application of the document image analysis field that is confronted with many difficulties due to storage conditions and the complexity of the script. The main interest in enhancing historical documents is to remove undesirable artifacts that appear in the background and to highlight the foreground, so as to enable automatic recognition of documents with high accuracy. This paper addresses pre-processing and segmentation of ancient scripts as an initial step towards automating the task of an epigraphist in reading and deciphering inscriptions. Pre-processing involves enhancement of degraded ancient document images, achieved through four different spatial filtering methods for smoothing or sharpening, namely the Median, Gaussian blur, Mean and Bilateral filters, with different mask sizes. This is followed by binarization of the enhanced image to highlight the foreground information, using the Otsu thresholding algorithm. In the second phase, segmentation is carried out using the Drop Fall and Water Reservoir approaches to obtain sampled characters, which can be used in later stages of OCR. The system showed good results when tested on nearly 150 samples of degraded epigraphic images of varying quality, giving the best enhanced output for a 4x4 mask size for the Median filter, a 2x2 mask size for Gaussian blur, and a 4x4 mask size for the Mean and Bilateral filters. The system can effectively sample characters from enhanced images, giving a segmentation rate of 85%-90% for both the Drop Fall and Water Reservoir techniques.
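The binarization step named above, Otsu's method, picks the global threshold that maximizes the between-class variance of the grey-level histogram. A minimal sketch over a flat list of 8-bit pixel values (the filtering stages are not reproduced):

```python
# Hedged sketch: Otsu's global threshold. For each candidate threshold t,
# compute the weights and means of the two classes (values <= t and > t)
# and keep the t maximizing the between-class variance.

def otsu_threshold(pixels):
    hist = [0] * 256
    for p in pixels:
        hist[p] += 1
    total = len(pixels)
    total_sum = sum(i * h for i, h in enumerate(hist))
    best_t, best_var = 0, -1.0
    w_bg = sum_bg = 0
    for t in range(256):
        w_bg += hist[t]                 # background weight up to t
        if w_bg == 0:
            continue
        w_fg = total - w_bg             # foreground weight
        if w_fg == 0:
            break
        sum_bg += t * hist[t]
        mean_bg = sum_bg / w_bg
        mean_fg = (total_sum - sum_bg) / w_fg
        between = w_bg * w_fg * (mean_bg - mean_fg) ** 2  # between-class variance
        if between > best_var:
            best_var, best_t = between, t
    return best_t

# Two well-separated grey populations: dark ink near 40-60, paper near 190-210.
pixels = [40] * 50 + [60] * 30 + [190] * 60 + [210] * 40
t = otsu_threshold(pixels)
```

For this bimodal histogram the threshold lands between the ink and paper populations, which is exactly the behaviour the enhancement pipeline relies on.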
Integrated Hidden Markov Model and Kalman Filter for Online Object Tracking - ijsrd.com
Visual priors learned from generic real-world images can be used to represent the objects in a scene. Existing work presented an online tracking algorithm that transfers a visual prior learned offline to online object tracking: a complete dictionary representing the visual prior is learned from a collection of real-world images. The prior knowledge of objects is generic, and the training image set does not contain any observation of the target object. The learned visual prior is transferred to construct the object representation using sparse coding and multiscale max pooling. A linear classifier is learned online to distinguish the target from the background and to identify target and background appearance variations over time. Tracking is carried out within a Bayesian inference framework, and the learned classifier is used to construct the observation model. A particle filter is used to estimate the tracking result sequentially; however, it is unable to work efficiently in noisy scenes, and time-shift variance is not appropriate for tracking the target object using prior information about its structure. We propose an HMM-based Kalman filter to improve online target tracking in noisy sequential image frames. The covariance vector is measured to identify noisy scenes, and discrete time steps are evaluated for separating the target object from the background. Experiments are conducted on challenging scene sequences to evaluate the performance of the object tracking algorithm in terms of tracking success rate, centre location error, number of scenes, learned object sizes, and tracking latency.
Improved wolf algorithm on document images detection using optimum mean techn... - journalBEEI
Detecting text in handwritten historical documents provides high-level features for the challenging problem of handwriting recognition. Such handwriting often contains noise, faint or incomplete strokes, strokes with gaps, and competing lines when embedded in a table or form, making it unsuitable for local line-following algorithms or the associated binarization schemes. In this paper, a method based on an optimum threshold value, named the Optimum Mean method, is presented. The Wolf method fails to detect thin text in non-uniform input images; the proposed method overcomes this problem by deriving a maximum threshold value using the optimum mean. Based on the evaluation, the proposed method obtained a higher F-measure (74.53) and PSNR (14.77) and the lowest NRM (0.11) compared to the Wolf method. In conclusion, the proposed method is successful and effective in solving the Wolf method's problem, producing a high-quality output image.
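The baseline being improved, the Wolf-Jolion local threshold, is commonly written as T = m - k(1 - s/R)(m - M), where m and s are the window mean and standard deviation, M is the darkest grey level in the image, and R is the maximum window standard deviation. The paper's own "Optimum Mean" modification is not specified in the abstract, so only the baseline is sketched, with illustrative values:

```python
import math

# Hedged sketch of the Wolf-Jolion local threshold the paper modifies.
# Formula and parameter k = 0.5 are the commonly cited defaults; the
# window, image_min and r_max values below are made-up examples.

def wolf_threshold(window, image_min, r_max, k=0.5):
    m = sum(window) / len(window)                           # window mean
    s = math.sqrt(sum((p - m) ** 2 for p in window) / len(window))  # window std
    return m - k * (1 - s / r_max) * (m - image_min)

# A low-contrast window of faint strokes on light paper.
window = [180, 175, 185, 170, 190, 182, 178, 188]
t = wolf_threshold(window, image_min=20, r_max=128)
```

Because s is small for such low-contrast windows, the threshold is pulled far below the local mean, which is related to why thin faint text can be missed.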
A Literature Survey: Neural Networks for object detection - vivatechijri
Humans have a great capability to distinguish objects by vision, but for machines object detection is a difficult problem. Thus, neural networks have been introduced in the field of computer science. Neural networks are also called 'Artificial Neural Networks' [13]: computational models of the brain which help in object detection and recognition. This paper describes and demonstrates different types of neural networks, such as ANN, KNN, Faster R-CNN, 3D-CNN and RNN, along with their accuracies. From the study of various research papers, the accuracies of the different networks are discussed and compared, and it can be concluded that, for the given test cases, the ANN gives the best accuracy for object detection.
DETECTION OF DENSE, OVERLAPPING, GEOMETRIC OBJECTS - ijaia
Using a unique data collection, we are able to study the detection of dense geometric objects in image data where object density, clarity, and size vary. The data is a large set of black and white images of scatterplots, taken from journals reporting thermophysical property data of metal systems, whose plot points are represented primarily by circles, triangles, and squares. We built a highly accurate single-class U-Net convolutional neural network model to identify 97% of image objects in a defined set of test images, locating the centers of the objects to within a few pixels of the correct locations. We found an optimal way in which to mark our training data masks to achieve this level of accuracy. The optimal markings for object classification, however, required more information in the masks to identify particular types of geometries. We show a range of different patterns used to mark the training data masks, and how they help or hurt our dual goals of localization and classification. Altering the annotations in the segmentation masks can increase both the accuracy of object classification and localization on the plots, more than other factors such as adding loss terms to the network calculations. However, localization of the plot points and classification of the geometric objects require different optimal training data.
Text detection and recognition from natural scenes - hemanthmcqueen
Text characters in natural scenes and surroundings provide valuable information about a place and may even convey legal or otherwise important information, so detecting and recognising such text is very useful. However, recognising this text is not easy because of the diverse backgrounds and fonts used. In this paper, a method is proposed to extract text information from the surroundings. First, a character descriptor is designed with existing standard detectors and descriptors. Then, character structure is modeled for each character class by designing stroke configuration maps. In natural scenes, text is generally found on nearby sign boards and other objects, and its extraction is difficult because of noisy backgrounds and diverse fonts and text sizes. Nevertheless, many applications have proven efficient at extracting text from surroundings. For this, the method of text extraction is divided into two processes:
Text detection
Text recognition
Mammography is currently the dominant imaging modality for the early detection of breast cancer. However, its robustness in distinguishing malignancy is relatively low, resulting in a large number of unnecessary biopsies. A computer-aided diagnosis (CAD) scheme, capable of visually justifying its results, is expected to aid the decision made by radiologists. Content-based image retrieval (CBIR) accounts for a promising paradigm in this direction. Facing this challenge, we introduce a CBIR scheme that utilizes the extracted features as input to a support vector machine (SVM) ensemble. The final features used for CBIR comprise the participation value of each SVM. The retrieval performance of the proposed scheme has been evaluated quantitatively on the basis of the standard measures. In the experiments, a set of 90 mammograms is used, derived from a widely adopted digital database for screening mammography. The experimental results show the improved performance of the proposed scheme.
A Semi-Automatic Annotation Tool For Arabic Online Handwritten Text - Randa Elanwar
Presentation of PhD dissertation
Content
Text Lines Extraction using dynamic programming
Words Extraction using SVM and RBF
Words Segmentation using HMM
User Interfaces in Matlab
Annotation performance evaluation
Content and Metadata Based Image Document Retrieval (in Greek) - Konstantinos Zagoris
In summary, the present thesis presents solutions to real problems of content-based image retrieval systems, such as image segmentation, text localization, relevance feedback algorithms, and shape/word descriptors. All the proposed methods can be combined to create a fast, modern, MPEG-7 compatible content-based image retrieval system.
Comparative Performance Evaluation of Image Descriptors Over IEEE 802.11b Noi... - Konstantinos Zagoris
We evaluate the image retrieval procedure over an IEEE 802.11b ad hoc network operating at 2.4 GHz, using the IEEE Distributed Coordination Function with CSMA/CA as the multiple access scheme. IEEE 802.11 is a widely used network standard, implemented and supported by a variety of devices, such as desktops, laptops, notebooks and mobile phones, capable of providing a variety of services, such as file transfer and internet access. Therefore, we consider IEEE 802.11b a suitable technology for investigating image retrieval over a noisy wireless channel. The model we use to simulate the noisy environment is based on the scenario in which the wireless network is located in an outdoor noisy environment, or in an indoor environment with partial line-of-sight (LOS) power. We used a large number of descriptors reported in the literature in order to evaluate which one performs best in terms of Mean Average Precision (MAP) under these circumstances. Experimental results on a known benchmarking database show that the majority of the descriptors have decreased performance when transferred and used in such noisy environments.
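The measure used in this comparison, Mean Average Precision, can be computed as follows: for each query, average precision is the mean of the precision values at the ranks where relevant items appear, and MAP averages that over queries. The document identifiers below are made-up examples:

```python
# Hedged sketch: Mean Average Precision (MAP) over ranked retrieval runs.

def average_precision(ranked, relevant):
    hits, precisions = 0, []
    for rank, doc in enumerate(ranked, start=1):
        if doc in relevant:
            hits += 1
            precisions.append(hits / rank)      # precision at each hit
    return sum(precisions) / len(relevant) if relevant else 0.0

def mean_average_precision(runs):
    return sum(average_precision(r, rel) for r, rel in runs) / len(runs)

runs = [
    (["d1", "d2", "d3", "d4"], {"d1", "d3"}),   # AP = (1/1 + 2/3) / 2
    (["d2", "d1", "d4", "d3"], {"d1"}),         # AP = 1/2
]
score = mean_average_precision(runs)
```

A descriptor whose rankings degrade under channel noise shows up directly as a drop in this score.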
Holistic Approach for Arabic Word Recognition - Editor IJCATR
Optical Character Recognition (OCR) is an important research area, and segmenting words into characters is one of the most challenging steps in OCR. As a result of advances in machine speed and memory size, as well as the availability of large training datasets, researchers currently study the holistic approach: recognition of a word without segmentation. This paper describes a method to recognize off-line handwritten Arabic names. The classification approach is based on Hidden Markov Models: for each Arabic word, several HMMs with different numbers of states are trained. The experimental results are encouraging; they also show that the best number of states for each word needs careful selection and consideration.
This paper tackles the problem of the user's inability to describe the image that he seeks by introducing an innovative image search engine called TsoKaDo. Until now, traditional web image search was based only on the comparison between the metadata of the webpage and the user's textual description. In the proposed method, images from various search engines are classified based on their visual content, and new tags are proposed to the user. Recursively, the results converge to the user's desire. The aim of this paper is to present a new way of searching, especially in cases where the query has low generality, giving greater weight to visual content rather than to metadata.
Segmentation and recognition of handwritten Gurmukhi scriptRAJENDRA VERMA
The goal is to segment handwritten cursive words into individual predefined strokes. I designed an algorithm which calculates the angle between two coordinate points and, on the basis of that angle, segments the handwritten cursive word. It improves the accuracy of the handwriting recognition system.
Dynamic Two-Stage Image Retrieval from Large Multimodal DatabasesKonstantinos Zagoris
Content-based image retrieval (CBIR) with global features is notoriously noisy, especially for image queries with low percentages of relevant images in a collection. Moreover, CBIR typically ranks the whole collection, which is inefficient for large databases. We experiment with a method for image retrieval from multimodal databases, which improves both the effectiveness and efficiency of traditional CBIR by exploring secondary modalities. We perform retrieval in a two-stage fashion: first rank by a secondary modality, and then perform CBIR only on the top-K items. Thus, effectiveness is improved by performing CBIR on a ‘better’ subset. Using a relatively ‘cheap’ first stage, efficiency is also improved via the fewer CBIR operations performed. Our main novelty is that K is dynamic, i.e. estimated per query to optimize a predefined effectiveness measure. We show that such dynamic two-stage setups can be significantly more effective and robust than similar setups with static thresholds previously proposed.
Performance of Statistics Based Line Segmentation System for Unconstrained H...AM Publications
Handwritten character recognition is a technique by which a computer system can recognize characters and other symbols written in natural handwriting. Segmentation decomposes the document image into subcomponents such as lines, words and characters. To achieve greater accuracy, segmentation and recognition cannot be treated independently. Most of the existing line segmentation methods have limitations when applied to unconstrained handwritten documents. A statistics-based line segmentation system was developed in Java Developer Kit 1.6 for segmenting unconstrained handwritten document images into lines. The arithmetic mean, trimmed mean and inter-quartile mean were used appropriately to achieve accurate segmentation results. The performance of the system was studied using a few public handwritten document image datasets and images collected from different writers to compare its segmentation accuracy. The datasets contained well-separated, sharing, touching, overlapping, irregular-baseline and short handwritten text lines. The samples from the datasets were also segmented by a few other line segmentation methods. The segmentation accuracy of the system was higher than that of the other methods. Performance measures such as language support, segmented document type and line type were compared with those of other line segmentation methods. The developed system segmented handwritten and printed lines in English, Chinese and Bengali, and supported linear and non-linear lines.
Arabic Handwritten Script Recognition Towards Generalization: A Survey Randa Elanwar
Our concern in this paper is to:
• provide a comprehensive review of recent off-line and on-line trends in Arabic cursive handwriting recognition (publications of the last 10 years), and
• clarify the challenges standing against obtaining a reliable, accurate, simple, general-purpose recognizer based on these trends.
Reference Scope Identification of Citances Using Convolutional Neural NetworkSaurav Jha
In the task of summarizing a scientific paper, a lot of information about a reference paper stands to be gained from the papers that cite it. Automatically generating the reference scope (the span of cited text) in a reference paper corresponding to citances (sentences in the citing papers that cite it) has great significance in preparing a structured summary of the reference paper. We treat this task as a binary classification problem, by extracting feature vectors from pairs of citances and reference sentences. These features are lexical, corpus-based, surface and knowledge-based. We extend the feature set employed for reference-citance pair identification in the current state-of-the-art system. Using these features, we present a novel classification approach for this task that employs a deep Convolutional Neural Network along with two boosting ensemble algorithms. We outperform the existing state-of-the-art for distinguishing between cited spans and non-cited spans of text in the reference paper.
A Visual Exploration of Distance, Documents, and DistributionsRebecca Bilbro
Machine learning often requires us to think spatially and make choices about what it means for two instances to be close or far apart. So which is best - Euclidean? Manhattan? Cosine? It all depends! In this talk, we'll explore open source tools and visual diagnostic strategies for picking good distance metrics when doing machine learning on text.
Efficient Query Processing in Geographic Web Search EnginesYen-Yu Chen
Geographic web search engines allow users to constrain and order search results in an intuitive manner by focusing a query on a particular geographic region. Geographic search technology, also called local search, has recently received significant interest from major search engine companies. Academic research in this area has focused primarily on techniques for extracting geographic knowledge from the web. In this paper, we study the problem of efficient query processing in scalable geographic search engines. Query processing is a major bottleneck in standard web search engines, and the main reason for the thousands of machines used by the major engines. Geographic search engine query processing is different in that it requires a combination of text and spatial data processing techniques. We propose several algorithms for efficient query processing in geographic search engines, integrate them into an existing web search query processor, and evaluate them on large sets of real data and query traces.
The “Local Ranking Problem” (LRP) is related to the computation of a centrality-like rank on a local graph, where the scores of the nodes could significantly differ from the ones computed on the global graph. Previous work has studied LRP on the hyperlink graph but never on the BrowseGraph, namely a graph where nodes are webpages and edges are browsing transitions. Recently, this graph has received more and more attention in many different tasks such as ranking, prediction and recommendation. However, a webserver has only the browsing traffic performed on its pages (local BrowseGraph) and, as a consequence, the local computation can lead to estimation errors, which hinders the increasing number of applications in the state of the art. Also, although the divergence between the local and global ranks has been measured, the possibility of estimating such divergence using only local knowledge has been mainly overlooked. These aspects are of great interest for online service providers who want to: (i) gauge their ability to correctly assess the importance of their resources only based on their local knowledge, and (ii) take into account real user browsing fluxes that better capture the actual user interest than the static hyperlink network. We study the LRP problem on a BrowseGraph from a large news provider, considering as subgraphs the aggregations of browsing traces of users coming from different domains. We show that the distance between rankings can be accurately predicted based only on structural information of the local graph, being able to achieve an average rank correlation as high as 0.8.
COHESIVE MULTI-ORIENTED TEXT DETECTION AND RECOGNITION STRUCTURE IN NATURAL S...ijdpsjournal
Scene text recognition has brought various new challenges in recent years. Detecting and recognizing text in scenes entails some of the same problems as document processing, but there are also numerous novel problems to face when recognizing text in natural scene images. Recent research in these areas has shown some promise, but there is still much work to be done. Most existing techniques have focused on detecting horizontal or near-horizontal texts. In this paper, we propose a new scheme which detects texts of arbitrary orientations in natural scene images. Our algorithm is equipped with two sets of characteristics specially designed for capturing the natural characteristics of texts, using MSER regions with the Otsu method. To better evaluate our algorithm and compare it with other existing algorithms, we use the existing MSRA and ICDAR datasets, as well as our new dataset, which includes various texts in various real-world situations. Experimental results on these standard datasets and the proposed dataset show that our algorithm compares favorably with modern algorithms on horizontal texts and achieves significantly improved performance on texts of arbitrary orientations in complex natural scene images.
Segmentation-based Historical Handwritten Word Spotting using document-specific Local Features
1. SEGMENTATION-BASED HISTORICAL HANDWRITTEN WORD SPOTTING USING DOCUMENT-SPECIFIC LOCAL FEATURES
KONSTANTINOS ZAGORIS¹,², IOANNIS PRATIKAKIS¹, BASILIS GATOS²
¹ Visual Computing Group, Dept. of Electrical and Computer Engineering, Democritus University of Thrace, Xanthi, Greece
² Institute of Informatics and Telecommunications, National Centre of Scientific Research “Demokritos”, Athens, Greece
2. WHAT IS KEYWORD SPOTTING?
• It is the task of identifying locations on a document image which have a high probability of containing an instance of a queried word, without explicitly recognizing it.
• It is related to Content-Based Image Retrieval systems: searching for a word image in a set of unindexed document images using the image content as the only information source.
3. CURRENT LITERATURE TRENDS
• Currently there are two distinct trends: (i) segmentation-based and (ii) segmentation-free approaches.
• Their fundamental difference concerns the search space:
- segmented word images (segmentation-based)
- the complete document image (segmentation-free)
We address the word spotting problem with a segmentation-based approach.
4. PREVIOUS LITERATURE
Rath and Manmatha calculate two families of feature sets:
• scalar-type features that include aspect ratio, area, etc.
• profile-based features that are based on horizontal and vertical word projections and the upper and lower word profiles.
Zagoris et al. created a similar set of profile-based features but:
• encoded them with the Discrete Cosine Transform and
• quantized them through the Gustafson-Kessel fuzzy algorithm.
Rodriguez and Perronnin extract features from a sliding window, based on the first-order gradient and inspired by the SIFT keypoint descriptor.
5. BAG-OF-VISUAL-WORDS MODEL
Recently, there has been an influx of works based on local features in the form of the Bag-of-Visual-Words model.
Lladós et al. evaluate the performance of various word descriptors:
• a bag-of-visual-words procedure (BoVW),
• a pseudo-structural representation based on Loci features,
• a structural approach using words as graphs, and
• sequences of column features based on DTW.
They found that the statistical approach of the BoVW produces the best results, although the memory requirements to store the descriptors are significant.
6. PROBLEMS WITH CURRENT LOCAL FEATURES
Most works using local features are based on the Scale Invariant Feature Transform (SIFT) to describe the local information.
• The original application of these local features is natural images, which have many structural differences compared to document images.
• The detection of the strongest edges through pyramid scaling creates local points between text lines.
• Invariant properties in the descriptor result in noise amplification, so such features are more sensitive to the noise and the complex texture of the background.
7. TEXTURE VS SHAPE FEATURES
Features for word spotting which rely only on word shape characteristics are not effective in dealing with a document collection created by different writers, containing significant writing style variations.
Although slant and skew preprocessing techniques can reduce the shape variations, they cannot eliminate the problem, as the whole structure of the word is different in most cases.
In this respect, we argue that although the shape information is meaningful, the texture information in a spatial context is more reliable.
8. DOCUMENT-SPECIFIC LOCAL FEATURES (DSLF)
Taking into account the aforementioned considerations, we propose:
• novel local features which are specific to documents, and
• a matching procedure that does not rely on codebook creation (as in BoVW).
9. PROPOSED WORD SPOTTING FRAMEWORK
[Flowchart. Offline: Document Dataset → Word Image Segmentation → DSLF Calculation → DSLF Database. Online: Query Word Image → DSLF Calculation → Matching Process → Display Results.]
10. KEYPOINT DETECTION AND SELECTION
[Pipeline. Keypoint Detection (CCs Analysis, Gradient Orientation Quantization, Convex Hull Corner Detection) → Local Point Selection (Entropy-based Keypoint Filtering) → Final Keypoints.]
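The gradient-orientation-quantization stage of the pipeline above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the use of simple finite-difference gradients and of three quantization levels (matching the 3-bin cell histograms used later in the feature extraction) are assumptions.

```python
import numpy as np

def quantize_gradient_orientation(image, levels=3):
    """Quantize the gradient orientation of a grayscale image.

    Returns per-pixel orientation bins in {0, ..., levels-1} and the
    gradient norm. A sketch only: gradient operator and number of
    levels are assumptions for illustration.
    """
    gy, gx = np.gradient(image.astype(float))   # finite differences
    magnitude = np.hypot(gx, gy)                # gradient norm per pixel
    angle = np.mod(np.arctan2(gy, gx), np.pi)   # orientation in [0, pi)
    # Map each orientation to one of `levels` discrete bins.
    bins = np.minimum((angle / np.pi * levels).astype(int), levels - 1)
    return bins, magnitude
```

Keypoints would then be selected among the quantized-orientation corners (via connected-component analysis, convex-hull corner detection and entropy-based filtering, as in the pipeline above).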
11. KEYPOINT DETECTION AND SELECTION
[Figure: original document image; orientation of the gradient vector; quantization of the gradient vector orientation; initial keypoints; final keypoints.]
12. FEATURE EXTRACTION
• The feature for each local keypoint is calculated upon the quantized gradient angles.
• An area of 18x18 pixels around the keypoint is divided into 9 cells of size 6x6 each.
• Each cell is represented by a 3-bin histogram (each bin corresponds to a quantization level).
• Each pixel accumulates a vote in the corresponding angle histogram bin. The strength of the vote depends on the norm of the gradient vector and on the distance from the location of the local point, as shown in the following equation:

$V_{x,y} = s_{x,y}\,\lVert G_{x,y}\rVert, \qquad s_{x,y} = 1 - \frac{\sqrt{(x - x_{LP})^2 + (y - y_{LP})^2}}{9\sqrt{2}}$

• The task of the $s_{x,y}$ variable is to weigh the pixel's participation in the histogram, taking into account its distance from the keypoint.
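The cell-histogram voting described above can be sketched in Python. The 18x18 patch, the 3x3 grid of 6x6 cells, the 3 orientation bins and the gradient-norm weighting follow the slide; the exact linear distance decay used for the weight is an assumption, as is the array layout (`bins` holds per-pixel quantized orientation indices, `magnitude` the gradient norms, and the keypoint is assumed to lie at least 9 pixels from the image border).

```python
import numpy as np

def dslf_descriptor(bins, magnitude, kp_x, kp_y):
    """27-value keypoint descriptor: 9 cells of 6x6 pixels in an 18x18
    patch, each cell summarized by a 3-bin histogram of quantized
    gradient orientations. Votes are weighted by the gradient norm and
    by a distance weight (linear decay is an assumption)."""
    half = 9
    patch_b = bins[kp_y - half:kp_y + half, kp_x - half:kp_x + half]
    patch_m = magnitude[kp_y - half:kp_y + half, kp_x - half:kp_x + half]
    desc = np.zeros((3, 3, 3))                  # 3x3 cells, 3 bins each
    for dy in range(18):
        for dx in range(18):
            # Weight: 1 at the keypoint, 0 at the patch corner.
            d = np.hypot(dx - half, dy - half)
            s = max(0.0, 1.0 - d / (half * np.sqrt(2)))
            desc[dy // 6, dx // 6, patch_b[dy, dx]] += s * patch_m[dy, dx]
    return desc.ravel()                          # 27 values
```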
13. MATCHING PROCEDURE
• In segmentation-based word spotting, the aim is to match the query keypoints to the corresponding keypoints of any word image in the document.
• A Local Proximity Nearest Neighbor (LPNN) search is implemented.
• The advantage of the LPNN search is two-fold:
- it enables a search in focused areas instead of searching in a brute-force manner, and
- it goes beyond the typical use of a descriptor by incorporating spatial context in the local search.
14. MATCHING PROCEDURE
Update the location of each keypoint to a new normalized space:

$p'_{x_i} = \frac{p_{x_i} - c_x}{D_x}, \qquad p'_{y_i} = \frac{p_{y_i} - c_y}{D_y}$

where

$c_x = \frac{1}{k}\sum_{i=1}^{k} p_{x_i}, \qquad c_y = \frac{1}{k}\sum_{i=1}^{k} p_{y_i}$

$D_x = \frac{1}{k}\sum_{i=1}^{k} \lvert p_{x_i} - c_x \rvert, \qquad D_y = \frac{1}{k}\sum_{i=1}^{k} \lvert p_{y_i} - c_y \rvert$

and k denotes the total number of keypoints in a word image.
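The normalization step above, subtracting the keypoint centroid and dividing by the mean absolute deviation per axis so that matching becomes insensitive to word position and size, is a few lines of NumPy (`points` is assumed to be a k×2 array of (x, y) keypoint locations):

```python
import numpy as np

def normalize_keypoints(points):
    """Map keypoint coordinates to the normalized space: subtract the
    centroid (c_x, c_y) and divide per axis by the mean absolute
    deviation (D_x, D_y)."""
    pts = np.asarray(points, dtype=float)   # shape (k, 2)
    c = pts.mean(axis=0)                    # centroid (c_x, c_y)
    d = np.abs(pts - c).mean(axis=0)        # (D_x, D_y)
    return (pts - c) / d
```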
15. EVALUATION - DATASETS
BENTHAM DATASET
• It consists of 50 high-quality (approximately 3000 pixels wide and 4000 pixels high) handwritten manuscripts written by Jeremy Bentham (1748-1832).
• The variation of the same word is extreme and involves writing style, font size and noise, as well as their combination.
16. EVALUATION - DATASETS
WASHINGTON DATASET
• It consists of 20 document images from the George Washington Collection of the Library of Congress.
• The documents were scanned from microfilm at 300 dpi resolution.
17. EVALUATION STRATEGY
• Two evaluation metrics: Precision at the k Top Retrieved words (P@k) and the Mean Average Precision (MAP).
• P@5 is the precision at the top 5 retrieved words. This metric defines how successfully the algorithm places relevant results in the first 5 positions of the ranking list.
• MAP is a typical measure for the performance of information retrieval systems.
• For the experiments, the word image segmentation information is taken from the ground truth corpora.
• The total number of word image queries was 1570 for the Washington dataset and 3668 for the Bentham dataset.
• Both query sets contain words appearing in various frequencies and sizes.
• The proposed method is evaluated against two previous segmentation-based, profile-based strategies.
• Then, in order to highlight the advantage of the proposed DSLF, it was replaced by SIFT while the proposed matching algorithm remained the same.
19. CONCLUSION
In this work, novel local features are proposed, driven by the challenges presented in historical handwritten word spotting scenarios.
The proposed method outperformed both the profile-based strategies and the SIFT local features.
Moreover, a matching procedure was presented based on the Local Proximity Nearest Neighbor search, which augments performance in terms of effectiveness and efficiency by incorporating spatial context.
The proposed framework achieves better performance after a consistent evaluation against two profile-based approaches, as well as against the proposed approach with the popular SIFT local features, on two different handwritten datasets.
Although there is an abundance of systems suitable for both modern and historical printed material, very few of these systems are suitable for handwritten documents due to noise sensitivity, character variation and text layout complexity.
we argue that it is not beneficial in document images to incorporate invariant properties in the descriptor of the local points as it results in noise amplification. We believe that the features that are invariant to rotation are more sensitive to the noise and the complex texture of the background.
Features for word spotting which rely only on word shape characteristics are not effective in dealing with a document collection created by different writers, containing significant writing style variations.
Although slant and skew preprocessing techniques can reduce the shape variations, they cannot eliminate the problem as the whole structure of the word is different in most of the cases.
In this respect, we argue that although the shape information is meaningful, the texture information in a spatial context is more reliable.
For the sake of clarity, it is worth noting that, since the focus of this work is on feature extraction and matching, the segmented word images used in the proposed approach are obtained from the available ground truth dataset, without involving any particular word image segmentation method.
But first, let us look at the proposed word spotting framework.
In the next stage, the LPNN search for each keypoint that resides on the query image is addressed. The LPNN search is realized in a search area which is computed by taking into account a percentage (25%) of the already calculated distances Dx, Dy. During the search, if there are one or more word keypoints in the proximity of the query keypoint under consideration, the Euclidean distance between their descriptors is calculated and the minimum distance is kept. This is repeated for each keypoint in the query image. The final similarity measure is the sum of all the minimal distances. If there is no keypoint in its proximity, then a penalty value is added to the similarity measure, equal to the maximum Euclidean distance that can be calculated between the keypoint descriptors. As a final stage, the system presents to the user all the word images in ascending order of the calculated similarity measure.
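The matching just described can be sketched as follows. This is a minimal illustration under assumptions: keypoints are represented as (position, descriptor) pairs in the normalized space, and `radius` and `penalty` stand in for the 25% proximity area and the maximum descriptor distance, respectively.

```python
import numpy as np

def lpnn_similarity(query_kps, word_kps, radius, penalty):
    """Sum of minimal descriptor distances under the LPNN scheme: each
    query keypoint is compared only against word keypoints lying within
    `radius` of it in the normalized space; if none is found, `penalty`
    is added instead. Lower totals mean more similar words."""
    total = 0.0
    for q_pos, q_desc in query_kps:
        best = None
        for w_pos, w_desc in word_kps:
            if np.linalg.norm(q_pos - w_pos) <= radius:   # local proximity
                d = np.linalg.norm(q_desc - w_desc)       # descriptor distance
                best = d if best is None else min(best, d)
        total += penalty if best is None else best
    return total
```

Ranking the collection then amounts to sorting the word images by this measure in ascending order.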
The proposed method outperformed both the profile-based strategies and the SIFT local features. It is worth noting that the profile-based features were applied to words that were binarized, denoised, de-skewed and de-slanted, as opposed to the local features, which were applied to the original word images. Moreover, although the SIFT descriptor contains more information than the proposed local features (128 values against only 27), the latter performed better on both datasets.