The document summarizes an automatic text extraction system for complex images. The system uses the discrete wavelet transform for text localization, and morphological operations such as erosion and dilation to enhance text identification and segmentation. Text regions are segmented using connected component analysis and properties such as area and bounding-box shape. The extracted text is recognized and written to a text file, which the user can then edit. The system shows better performance than existing techniques.
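As an illustrative sketch (not the paper's implementation), the connected-component step described above can be approximated in pure Python: label 4-connected regions of a binary image, then keep only components whose area and bounding-box aspect ratio look text-like. The thresholds `min_area` and `max_aspect` are hypothetical tuning parameters, not values from the paper.

```python
from collections import deque

def connected_components(binary):
    """Label 4-connected components in a binary grid (list of lists of 0/1)."""
    h, w = len(binary), len(binary[0])
    labels = [[0] * w for _ in range(h)]
    comps = []
    for y in range(h):
        for x in range(w):
            if binary[y][x] and not labels[y][x]:
                comps.append([])
                label = len(comps)
                labels[y][x] = label
                q = deque([(y, x)])
                while q:  # breadth-first flood fill
                    cy, cx = q.popleft()
                    comps[-1].append((cy, cx))
                    for ny, nx in ((cy-1, cx), (cy+1, cx), (cy, cx-1), (cy, cx+1)):
                        if 0 <= ny < h and 0 <= nx < w and binary[ny][nx] and not labels[ny][nx]:
                            labels[ny][nx] = label
                            q.append((ny, nx))
    return comps

def filter_text_like(comps, min_area=2, max_aspect=5.0):
    """Keep components whose area and bounding-box shape look text-like."""
    kept = []
    for pix in comps:
        ys = [p[0] for p in pix]
        xs = [p[1] for p in pix]
        hgt = max(ys) - min(ys) + 1
        wid = max(xs) - min(xs) + 1
        aspect = max(hgt, wid) / min(hgt, wid)
        if len(pix) >= min_area and aspect <= max_aspect:
            kept.append(pix)
    return kept
```

A real system would run this on the binarized wavelet response rather than a toy grid, but the area/shape filtering logic is the same.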
Survey of Hybrid Image Compression Techniques (IJECEIAES)
Compression reduces the size of data while maintaining the quality of the information it contains. This paper presents a survey of research papers from the last decade on improving various hybrid compression techniques. A hybrid compression technique combines the best properties of each group of methods, as is done in the JPEG compression method: it combines lossy and lossless compression to obtain a high compression ratio while maintaining the quality of the reconstructed image. Lossy compression produces a relatively high compression ratio, whereas lossless compression yields high-quality data reconstruction, since the data can later be decompressed with exactly the same content as before compression. The discussion of current knowledge and open issues in hybrid compression development indicates opportunities for further research to improve the performance of image compression methods.
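A minimal sketch of the lossy-then-lossless pipeline the abstract describes (not any surveyed paper's actual method): a uniform quantizer discards precision (lossy stage), then the stdlib `zlib` DEFLATE coder compresses the now-redundant data losslessly. The quantization step size of 16 is an arbitrary illustration.

```python
import zlib

def hybrid_compress(pixels, step=16):
    """Lossy stage: uniform quantization (irreversible).
    Lossless stage: zlib/DEFLATE entropy coding (reversible)."""
    quantized = bytes((p // step) * step for p in pixels)
    return zlib.compress(quantized)

def hybrid_decompress(blob):
    """Undo only the lossless stage; quantization error remains."""
    return list(zlib.decompress(blob))
```

The reconstruction error is bounded by the quantization step, while the entropy coder exploits the runs of repeated values the quantizer creates, which is the trade-off JPEG-style hybrids rely on.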
An Unsupervised Cluster-based Image Retrieval Algorithm using Relevance Feedback (IJMIT Journal)
Content-based image retrieval (CBIR) systems use low-level features of a query image to identify similarity between the query image and the images in a database, so image content plays a significant role in retrieval. There are three fundamental components of content-based image retrieval: visual feature extraction, multidimensional indexing, and retrieval system design. Each image has three kinds of content: color, texture, and shape features. Color and texture are both important visual features used in CBIR to improve results; a color histogram and texture features have the potential to retrieve similar images on the basis of their properties. Because the features extracted from a query are low-level, it is extremely difficult for the user to provide an appropriate example in a query-by-example setting. To overcome these problems and reach higher accuracy in a CBIR system, providing the user with relevance feedback is a well-known and promising solution.
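As a hedged illustration of the color-histogram matching the abstract mentions (not the paper's algorithm): build a normalized intensity histogram per image and compare histograms with the classic histogram-intersection similarity. Bin count and the use of grayscale instead of full color are simplifications.

```python
def color_histogram(pixels, bins=8):
    """Normalized histogram of 8-bit intensity values (0-255)."""
    hist = [0] * bins
    for p in pixels:
        hist[p * bins // 256] += 1
    total = len(pixels)
    return [h / total for h in hist]

def histogram_intersection(h1, h2):
    """Similarity in [0, 1]; 1.0 means identical distributions."""
    return sum(min(a, b) for a, b in zip(h1, h2))
```

Relevance feedback would then reweight or re-cluster results using the user's positive/negative marks; the similarity measure itself stays this simple.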
Improved wolf algorithm on document images detection using optimum mean techn... (journal BEEI)
Detecting text in handwritten historical documents provides high-level features for the challenging problem of handwriting recognition. Such handwriting often contains noise, faint or incomplete strokes, strokes with gaps, and competing lines when embedded in a table or form, making it unsuitable for local line-following algorithms or the associated binarization schemes. This paper presents a method based on an optimum threshold value, named the Optimum Mean method. The Wolf method fails to detect thin text in non-uniform input images; the proposed method overcomes this problem by deriving a maximum threshold value from the optimum mean. In the experiments, the proposed method obtained a higher F-measure (74.53) and PSNR (14.77) and the lowest NRM (0.11) compared with the Wolf method. In conclusion, the proposed method effectively solves the Wolf method's problem by producing a high-quality output image.
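The paper's exact optimum-mean formula is not given in the abstract, so the following is only a stand-in sketch of the general idea of a mean-derived global threshold: pixels darker than the mean intensity are classified as text.

```python
def mean_threshold(pixels):
    """Global binarization at the mean intensity.
    A simplified stand-in for the paper's optimum-mean rule,
    whose exact formula is not stated in the abstract."""
    t = sum(pixels) / len(pixels)
    mask = [1 if p < t else 0 for p in pixels]  # 1 = text (dark foreground)
    return mask, t
```

Wolf-style methods instead compute a local threshold per window from local mean and standard deviation; the abstract's claim is that a well-chosen global mean-based bound recovers thin strokes such local schemes miss.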
Content Based Image Retrieval (CBIR) aims at retrieving images from a database based on a user query that is in visual form rather than the traditional text form. The applications of CBIR extend from surveillance to remote sensing, medical imaging to weather forecasting, and security systems to historical research. Though extensive research has been done on content-based image retrieval in the spatial domain, most images on the internet are JPEG compressed, which motivates image retrieval in the compressed domain itself rather than decoding to raw format before comparison and retrieval. This research addresses the need to retrieve images from a database based on features extracted in the compressed domain, along with the application of a genetic algorithm to improve the retrieval results. The research focuses on various features and their levels of impact on the precision and recall of the CBIR system. Our experimental results also indicate that the compressed-domain CBIR features combined with the genetic algorithm improve the results considerably compared with techniques from the literature.
Scene Text Detection of Curved Text Using Gradiant Vector Flow Method (IJTET Journal)
Text detection and recognition is a hot topic for researchers in the field of image processing and multimedia. The Content Based Image Retrieval (CBIR) community seeks to fill the semantic gap between low-level and high-level features. Several methods have been developed for text detection and extraction that achieve reasonable accuracy for multi-oriented text and natural scene text (camera images). Most methods use a classifier and a large number of training samples to improve detection accuracy, and connected components are generally used to tackle the multi-orientation problem. Connected-component features combined with classifier training work well when the images have high contrast. However, when the same methods are applied directly to text detection in video, they produce disconnections and loss of shape because of low contrast and complex backgrounds; in such cases, choosing geometrical features of the components and a classifier is not easy. To overcome this problem, the proposed research uses a Gradient Vector Flow (GVF) and grouping based method for arbitrarily oriented scene text detection. The GVF of edge pixels in the Sobel edge map of the input frame is explored to identify the dominant edge pixels that represent text components. The method extracts the edge components corresponding to dominant pixels in the Sobel edge map, called the Text Candidates (TC) of the text lines. Experimental results on different datasets, including arbitrarily oriented, non-horizontal, and horizontal text, Hua's data, and the ICDAR-03 data (camera images), show that the proposed method outperforms existing methods.
Comprehensive review of Huffman encoding technique for im... (IJAEMS, Dec 2015, INFOGAIN PUBLICATION)
Image processing is used in every field of life; it is a growing field used by a large number of users to correct problems present in an image, and image enhancement is commonly applied for this purpose. The storage space an image requires is also a very important factor, and a main aim of many image processing techniques is to reduce it. Space requirements are minimized by compression techniques, which are either lossy or lossless in nature. This paper conducts a comprehensive survey of lossless compression with Huffman coding in detail.
A Survey On Thresholding Operators of Text Extraction In Videos (CSCJournals)
Video indexing is an important problem that has interested the visual information communities in image processing. The detection and extraction of scene and caption text from unconstrained, general-purpose video is an important research problem in the context of content-based retrieval and summarization. This paper presents a technique for detecting text in video frames. Finding textual content in images is a challenging and promising research area in information technology; consequently, text detection and recognition in multimedia has become one of the most important fields in computer vision due to its valuable uses in a variety of recent technical applications. The work in this paper uses morphological operations to extract text appearing in video frames, with preprocessing to discriminate text from background where the two are highly similar. Experimental results show that the resulting image contains only text. The evaluation criteria are applied to the result images obtained with the different operators.
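A toy version of the morphological operations the abstract relies on (this is the standard definition of binary dilation, not the paper's specific pipeline): dilation with a square structuring element grows strokes so that broken character fragments merge into solid text regions; erosion is the dual operation.

```python
def dilate(binary, k=1):
    """Binary dilation with a (2k+1) x (2k+1) square structuring element.
    Every foreground pixel 'spreads' into its neighbourhood, so gaps
    between nearby text fragments close up."""
    h, w = len(binary), len(binary[0])
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            if binary[y][x]:
                for ny in range(max(0, y - k), min(h, y + k + 1)):
                    for nx in range(max(0, x - k), min(w, x + k + 1)):
                        out[ny][nx] = 1
    return out
```

Text extraction pipelines typically dilate to merge characters into word/line blobs, then filter the blobs by shape, as in the connected-component sketch earlier in this document.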
Modified Skip Line Encoding for Binary Image Compression (idescitation)
Image compression is an important issue in Internet, mobile communication, digital library, digital photography, multimedia, teleconferencing, and other applications. Application areas of image compression focus on the problem of optimizing storage space and transmission bandwidth. In this paper, a modified form of skip-line encoding is proposed to further reduce the redundancy in the image. Its performance is found to be better than that of skip-line encoding.
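To make the baseline concrete, here is a minimal sketch of plain skip-line encoding (the unmodified scheme the paper improves on; the paper's modification is not specified in the abstract): a scan line identical to the previous one is replaced by a skip marker instead of being stored again.

```python
def skip_line_encode(rows):
    """Replace each row identical to the previous one with the marker None."""
    out, prev = [], None
    for r in rows:
        out.append(None if r == prev else list(r))
        prev = r
    return out

def skip_line_decode(encoded):
    """Expand skip markers by repeating the last explicitly stored row."""
    rows, prev = [], None
    for item in encoded:
        prev = prev if item is None else item
        rows.append(list(prev))
    return rows
```

Binary images (fax pages, masks, line drawings) have long runs of identical scan lines, which is why even this trivial scheme removes substantial redundancy.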
Query Image Searching With Integrated Textual and Visual Relevance Feedback f... (IJERA Editor)
Many researchers have studied relevance feedback in the content-based image retrieval (CBIR) literature, but no CBIR search engine supports it, because of scalability, effectiveness, and efficiency issues. In this work, we implemented integrated relevance feedback for web image retrieval. We concentrated on the integration of relevance feedback (RF) based on both textual features (TF) and visual features (VF), and we also tested each of them individually. The TF-based RF employs an effective search result clustering (SRC) algorithm to obtain salient phrases. A new user interface (UI) is then proposed to support RF. Experimental results show that the proposed algorithm is scalable, effective, and accurate.
A Novel Document Image Binarization For Optical Character Recognition (Editor IJCATR)
This paper presents a technique for document image binarization that accurately segments foreground text from poorly degraded document images. Segmenting text from such images is a very demanding job due to the high variation between the background and the foreground of the document. The paper proposes a novel document image binarization technique that segments the text using adaptive image contrast, a combination of the local image contrast and the local image gradient that is able to tolerate variations in text and background caused by different types of degradation. In the proposed technique, an adaptive contrast map is first constructed for a degraded input document image. The contrast map is then binarized by global thresholding and combined with Canny edge detection to identify the text stroke edge pixels. The text is then further segmented by a local thresholding method. The proposed method is simple, robust, and requires minimal parameter tuning.
Embedding Patient Information In Medical Images Using LBP and LTP (csijjournal)
This paper presents a new, efficient interleaving methodology in which patient textual information is embedded in medical images, considering magnetic resonance (MR), ultrasound, and CT images of the patient's body. The main disadvantages of traditional techniques for embedding patient information in medical images are their inability to withstand attacks and their bit error rate (BER) when the maximum number of characters is embedded; these drawbacks are absent in the proposed algorithm. Two robust embedding techniques, Local Binary Pattern (LBP) and Local Ternary Pattern (LTP), are used to embed the patient information in the medical images. The LBP technique is mainly used to secure the patient information, while a high volume of patient information can be embedded inside the medical image using the LTP technique. Statistical parameters such as normalized root mean square error and peak signal-to-noise ratio (PSNR) are used to measure the reliability of the technique. Experimental results strongly indicate that the present technique achieves zero BER, higher PSNR, and higher data embedding capacity using LTP compared with other data embedding techniques. The technique is found to be robust, and the watermark information is recoverable without distortion.
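For readers unfamiliar with LBP, here is the standard definition the abstract builds on (the embedding scheme itself is not shown here): each pixel gets an 8-bit code by comparing its eight neighbours to the centre value, clockwise from the top-left.

```python
def lbp_code(img, y, x):
    """Classic 8-neighbour Local Binary Pattern at interior pixel (y, x).
    Each neighbour >= centre contributes a 1 bit, clockwise from top-left."""
    c = img[y][x]
    nbrs = [img[y-1][x-1], img[y-1][x], img[y-1][x+1], img[y][x+1],
            img[y+1][x+1], img[y+1][x], img[y+1][x-1], img[y][x-1]]
    code = 0
    for n in nbrs:
        code = (code << 1) | (1 if n >= c else 0)
    return code
```

LTP extends this with a tolerance band around the centre value, giving three states per neighbour instead of two, which is what provides the extra embedding capacity the abstract claims.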
Content Based Image Retrieval Approach Based on Top-Hat Transform And Modifie... (cscpconf)
In this paper, a robust approach is proposed for content-based image retrieval (CBIR) using texture analysis techniques. The proposed approach includes three main steps. In the first, shape detection based on the Top-Hat transform detects and crops the object part of the image. The second step is a texture feature representation algorithm using color local binary patterns (CLBP) and local variance features. Finally, to retrieve the images most closely matching the query, the log likelihood ratio is used. The performance of the proposed approach is evaluated on the Corel and Simplicity image sets and compared with several well-known approaches in terms of precision and recall, showing the superiority of the proposed approach. Low noise sensitivity, rotation invariance, shift invariance, gray-scale invariance, and low computational complexity are among its other advantages.
Facial image retrieval on semantic features using adaptive mean genetic algor... (TELKOMNIKA Journal)
The emergence of larger databases has made image retrieval techniques an essential component and has led to the development of more efficient image retrieval systems. Retrieval can be either content- or text-based. In this paper, the focus is on content-based image retrieval from the FGNET database. Input query images and database images are subjected to several processing techniques before computing the squared Euclidean distance (SED) between them; the images with the shortest distance are considered matches and are retrieved. The processing techniques involve the application of the median modified Wiener filter (MMWF) and extraction of low-level features using histograms of oriented gradients (HOG), the discrete wavelet transform (DWT), GIST, and the local tetra pattern (LTrP). Finally, features are selected using the Adaptive Mean Genetic Algorithm (AMGA). In this study, the average PSNR value obtained after applying the Wiener filter was 45.29. The performance of the AMGA was evaluated in terms of precision, F-measure, and recall, with average values of 0.75, 0.692, and 0.66, respectively. The performance metrics of the AMGA were compared with those of the particle swarm optimization (PSO) and genetic algorithm (GA) approaches and found to be better, proving its efficiency.
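The SED ranking step described above is simple enough to sketch directly (the feature vectors here are hypothetical; in the paper they would be the selected HOG/DWT/GIST/LTrP features):

```python
def squared_euclidean(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def retrieve(query, database, k=2):
    """Return the k database keys whose feature vectors are nearest
    to the query under squared Euclidean distance."""
    ranked = sorted(database, key=lambda name: squared_euclidean(query, database[name]))
    return ranked[:k]
```

Skipping the square root changes no ranking, since squaring is monotonic on non-negative distances, which is why SED is a common shortcut in retrieval systems.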
MMFO: modified moth flame optimization algorithm for region based RGB color i... (IJECEIAES)
Region-based color image segmentation is an elementary step in image processing and computer vision, and it faces the problem of multidimensionality: a color image poses a five-dimensional problem, with three dimensions in color (RGB) and two in geometry (a luminosity layer and a chromaticity layer). In this paper, conversion to the L*a*b color space is used to remove one dimension, and the geometry is flattened into an array, reducing a further dimension. The paper introduces an improved algorithm, modified moth flame optimization (MMFO), for RGB color image segmentation based on bio-inspired techniques. In simulations, MMFO performs better than PSO and GA for region-based color image segmentation in terms of computation time for all images. The experimental results give clear segments based on the different colors and the number of clusters used during the segmentation process.
Wavelet-Based Color Histogram on Content-Based Image Retrieval (TELKOMNIKA Journal)
The growth of image databases in many domains, including fashion, biometrics, graphic design, architecture, etc., has increased rapidly. Content Based Image Retrieval (CBIR) is a technique for finding relevant images in such huge, unannotated image databases based on low-level features of the query images. In this study, a 2nd-level Wavelet Based Color Histogram (WBCH) is employed in a CBIR system. The image database used in this study is taken from Wang's image database, containing 1000 color images. The experimental results show that 2nd-level WBCH gives better precision (0.777) than the other methods, including 1st-level WBCH, Color Histogram, Color Co-occurrence Matrix, and the wavelet texture feature. It can be concluded that 2nd-level WBCH can be applied to CBIR systems.
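A rough sketch of the WBCH idea, under the assumption (not confirmed by the abstract) that the histogram is computed on the Haar approximation subband after the stated number of decomposition levels; the Haar LL coefficients are taken here as plain 2x2 block averages, and grayscale stands in for the three color channels.

```python
def haar_approx(img):
    """One level of the 2D Haar transform's approximation (LL) subband:
    each output pixel is the average of a 2x2 input block."""
    h, w = len(img) // 2, len(img[0]) // 2
    return [[(img[2*y][2*x] + img[2*y][2*x+1] +
              img[2*y+1][2*x] + img[2*y+1][2*x+1]) / 4.0
             for x in range(w)] for y in range(h)]

def wbch(img, levels=2, bins=4):
    """Wavelet-based histogram: decompose, then histogram the LL subband."""
    for _ in range(levels):
        img = haar_approx(img)
    flat = [p for row in img for p in row]
    hist = [0] * bins
    for p in flat:
        hist[min(int(p) * bins // 256, bins - 1)] += 1
    return [c / len(flat) for c in hist]
```

Histogramming the smoothed subband instead of raw pixels makes the descriptor less sensitive to fine-scale noise, which is a plausible reason the 2nd level outperforms the 1st in the reported results.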
IOSR Journal of Pharmacy and Biological Sciences(IOSR-JPBS) is an open access international journal that provides rapid publication (within a month) of articles in all areas of Pharmacy and Biological Science. The journal welcomes publications of high quality papers on theoretical developments and practical applications in Pharmacy and Biological Science. Original research papers, state-of-the-art reviews, and high quality technical notes are invited for publications.
A Survey On Thresholding Operators of Text Extraction In VideosCSCJournals
Â
Video indexing is an important problem that has interested by the communities of visual information in image processing. The detection and extraction of scene and caption text from unconstrained, general purpose video is an important research problem in the context of content-based retrieval and summarization. In this paper, the technique presented is for detection text from frames video. Finding the textual contents in images is a challenging and promising research area in information technology. Consequently, text detection and recognition in multimedia had become one of the most important fields in computer vision due to its valuable uses in a variety of recent technical applications. The work in this paper consists using morphological operations for extract text appearing in the video frames. The proposed scheme well as preprocessing to differentiate among where it as the high similarity between text and background information. Experimental results show that the resultant image is the image with only text. The evaluated criteria are applied with the image result and one obtained bay different operator.
On Text Realization Image SteganographyCSCJournals
Â
In this paper the steganography strategy is going to be implemented but in a different way from a different scope since the important data will neither be hidden in an image nor transferred through the communication channel inside an image, but on the contrary, a well known image will be used that exists on both sides of the channel and a text message contains important data will be transmitted. With the suitable operations, we can re-mix and re-make the source image. MATLAB7 is the program where the algorithm implemented on it, where the algorithm shows high ability for achieving the task to different type and size of images. Perfect reconstruction was achieved on the receiving side. But the most interesting is that the algorithm that deals with secured image transmission transmits no images at all
Inpainting scheme for text in video a surveyeSAT Journals
Â
Abstract
Text data present in video sequences provide useful information of paramount requirement. Although text present in video provide
useful information not all are of them are necessary because it may hide the important portion of the video. So there must way to
erase this type of unwanted text. This can be done in two phase first text components are detected from each frame of the video.
Detected text component are then removed from the video sequences. And restore the occluded part of the video using inpaint
method. This text detection and removal scheme is in two phases. Each phase is broad topic of image processing. Video text
detection and Inpainting are two most important phase in this scheme. Text detection phase consist of text localization, text
segmentation and recognition phase. Inpainting method is used for restoring occluded part produce due to removal of text.
Keywords—Optical Character Recognition(OCR), Stroke width transform, Text Detection,Connected Component
Comparative analysis of c99 and topictiling text segmentation algorithmseSAT Journals
Â
Abstract In this paper, the work done includes the extraction of information from image datasets which contain natural text. The difficulty level of segmenting natural text from an image is too high and so precision is the most important factor to be kept in mind. To minimize the error rates, error filtration technique is provided, as filtration is adopted while doing image segmentation basically text segmentation present in images. Furthermore, a comparative analysis of two different text segmentation algorithms namely C99 and TopicTiling on image documents is presented. To assess how well each algorithm works, each was applied on different datasets and results were compared. The work done also proves the efficiency of TopicTiling over C99. Index Terms: Text Segmentation, text extraction, image documents,C99 and TopicTiling.
IJRET : International Journal of Research in Engineering and Technology is an international peer reviewed, online journal published by eSAT Publishing House for the enhancement of research in various disciplines of Engineering and Technology. The aim and scope of the journal is to provide an academic medium and an important reference for the advancement and dissemination of research results that support high-level learning, teaching and research in the fields of Engineering and Technology. We bring together Scientists, Academician, Field Engineers, Scholars and Students of related fields of Engineering and Technology.
International Journal of Engineering Research and Applications (IJERA) is an open access online peer reviewed international journal that publishes research and review articles in the fields of Computer Science, Neural Networks, Electrical Engineering, Software Engineering, Information Technology, Mechanical Engineering, Chemical Engineering, Plastic Engineering, Food Technology, Textile Engineering, Nano Technology & science, Power Electronics, Electronics & Communication Engineering, Computational mathematics, Image processing, Civil Engineering, Structural Engineering, Environmental Engineering, VLSI Testing & Low Power VLSI Design etc.
A Texture Based Methodology for Text Region Extraction from Low Resolution Na...CSCJournals
Â
Automated systems for understanding display boards are finding many applications useful in guiding tourists, assisting visually challenged and also in providing location aware information. Such systems require an automated method to detect and extract text prior to further image analysis. In this paper, a methodology to detect and extract text regions from low resolution natural scene images is presented. The proposed work is texture based and uses DCT based high pass filter to remove constant background. The texture features are then obtained on every 50x50 block of the processed image and potential text blocks are identified using newly defined discriminant functions. Further, the detected text blocks are merged and refined to extract text regions. The proposed method is robust and achieves a detection rate of 96.6% on a variety of 100 low resolution natural scene images each of size 240x320.
International Journal of Engineering Research and Applications (IJERA) is an open access online peer reviewed international journal that publishes research and review articles in the fields of Computer Science, Neural Networks, Electrical Engineering, Software Engineering, Information Technology, Mechanical Engineering, Chemical Engineering, Plastic Engineering, Food Technology, Textile Engineering, Nano Technology & science, Power Electronics, Electronics & Communication Engineering, Computational mathematics, Image processing, Civil Engineering, Structural Engineering, Environmental Engineering, VLSI Testing & Low Power VLSI Design etc.
A Review of Optical Character Recognition System for Recognition of Printed Textiosrjce
Â
IOSR Journal of Computer Engineering (IOSR-JCE) is a double blind peer reviewed International Journal that provides rapid publication (within a month) of articles in all areas of computer engineering and its applications. The journal welcomes publications of high quality papers on theoretical developments and practical applications in computer technology. Original research papers, state-of-the-art reviews, and high quality technical notes are invited for publications.
A comparative study on content based image retrieval methodsIJLT EMAS
Â
Content-based image retrieval (CBIR) is a method of
finding images from a huge image database according to persons’
interests. Content-based here means that the search involves
analysis the actual content present in the image. As database of
images is growing daybyday, researchers/scholars are searching
for better techniques for retrieval of images maintaining good
efficiency. This paper presents the visual features and various
ways for image retrieval from the huge image database.
LOCALIZATION OF OVERLAID TEXT BASED ON NOISE INCONSISTENCIESaciijournal
Â
In this paper, we present a novel technique for localization of caption text in video frames based on noise inconsistencies. Text is artificially added to the video after it has been captured and as such does not form part of the original video graphics. Typically, the amount of noise level is uniform across the entire captured video frame, thus, artificially embedding or overlaying text on the video introduces yet another segment of noise level. Therefore detection of various noise levels in the video frame may signify availability of overlaid text. Hence we exploited this property by detecting regions with various noise levels to localize overlaid text in video frames. Experimental results obtained shows a great improvement in line with overlaid text localization, where we have performed metric measure based on Recall, Precision and f-measure.
IOSR Journal of Computer Engineering (IOSR-JCE)
e-ISSN: 2278-0661, p-ISSN: 2278-8727, Volume 18, Issue 3, Ver. I (May-Jun. 2016), PP 23-29
www.iosrjournals.org
DOI: 10.9790/0661-1803012329
Automatic Text Extraction System for Complex Images
R. Janani¹, T. Sheela²
¹Assistant Professor, Department of Computer Science and Engineering, Dhanalakshmi Srinivasan Engineering College, Perambalur-621212, India
²Assistant Professor, Department of Information Technology, Dhanalakshmi Srinivasan Engineering College, Perambalur-621212, India
Abstract: The intelligent text extraction system automatically identifies and extracts the text present in different types of images. Detection and extraction of text regions in an image are well-known problems in the area of image processing. The growth of the digital world and the usage of multimedia have generated a new era with a classic problem of pattern recognition. Automatic text extraction from images and videos therefore serves an important role in indexing and efficient retrieval of multimedia. The existing techniques for text extraction, such as region-based and texture-based techniques, cannot cope with all the applications of text extraction. The proposed intelligent text extraction system automatically identifies and extracts the text present in different types of images. The system consists of different stages, which include the localization, segmentation, extraction and recognition of text from the images. In the proposed system, the discrete wavelet transform is used for the localization of text, and morphological operations are used to enhance the identification of the correct text portions. The text part is segmented and recognized using an efficient system. An advantage of the system is that the extracted text is shown in a .txt file. The proposed system also allows modification of the text recognized from the image. This method shows better efficiency, precision and recall compared to the existing techniques, which indicates the possibility of using it in new and more advanced applications.
Keywords - DWT, image processing, morphological operations, Otsu, segmentation, text extraction, text recognition
I. Introduction
The rapid advancement in technology and multimedia has digitalized the world. The availability of cameras and other systems contributes a large number of images to the world. Ranging from cameras embedded in mobile phones to professional ones, surveillance cameras to broadcast videos, everyday images to satellite images, all of these contribute to the increase in multimedia data. Most images may contain text as part of them, which gives some information about the image; therefore, identification of these texts is relevant in many applications, which shows the importance of the text extraction system. In recent years there has been a drastic increase in multimedia libraries, and the amount of data is growing exponentially with time. With the number of television stations broadcasting every day and the widespread availability of affordable digital cameras and inexpensive memory devices, multimedia data is increasing every second.
The text present within an image enables applications such as keyword-based image search, text-based image indexing and automatic video logging. Extraction of these texts from images is a very difficult task due to variations in character fonts, styles, sizes and text directions, and the presence of complex backgrounds and variable lighting conditions. Generally, images can be categorized into three types: document images, scene images and born-digital images. Document images are images of documents, such as PDFs and notes. The text present in a document image is the document text.
Fig.1. (a) Document image
Fig.1. (b) Scene text image
Fig.1.(a) shows the document image. Born-digital images are images generated by computer software. The text in these images is called caption text or overlay text; some researchers refer to overlay text as superimposed text or artificial text. An example of overlay text is the text shown by news channels. Scene images are images that contain text, such as advertising boards and banners, captured naturally when the scene is photographed, as shown in Fig.1.(b); the scene text is therefore included in the background of the image as part of the scene. Compared with document images and born-digital images, scene images have more complex foregrounds and backgrounds, low resolution, compression loss, and severe edge softness, which makes the detection of text in scene images more difficult. Automatic extraction of text from images or video is therefore a challenging task, and research in this field is still in progress. Text present in images exhibits many variations based on the following properties:
(a) Size: The text can have variable size, from small to large.
(b) Alignment: The characters in caption text may appear in clusters, and sometimes they can also appear as non-planar texts. Scene texts possess numerous perspective distortions; they may be aligned in any direction with geometric distortions.
(c) Inter-character distance: Characters present in a text line have some uniform distance between them.
(d) Color: In a simple image, the characters in a text usually have similar or the same color; connected component-based approaches usually make use of this property. However, complex images or color documents often contain text strings with two or more colors for effective visualization.
(e) Edge: Most caption and scene texts are designed to be easily read; therefore, strong edges occur at the boundaries of text and background.
The remaining part of this paper is organized as follows. Section II describes the related works. The proposed system is explained in Section III, and the experiments and results in Section IV. Finally, the conclusion is given in Section V.
II. Related works
The different approaches related to text extraction include region-based and texture-based methods. Region-based methods are divided into two categories: edge-based [1] and connected component-based methods. J. Gllavata et al. [3] proposed a connected component [2] based approach for text extraction. It is based on a color reduction technique, and OCR is used for character recognition. It only detects text with horizontal alignment, and low-quality images are not processed accurately. Zhong et al. [4] used a CC-based method that applies color reduction, quantizing the color space by analyzing the color histogram generated in the RGB color space. This is done mainly on the assumption that text regions usually cluster together in this color space and occupy a significant portion of an image. Each text component then undergoes a filtering stage using a number of heuristics, such as area, spatial alignment and diameter. The performance of this system was evaluated by testing on CD images and book cover images. Kim et al. [5] used a transition map to detect overlay text, proposing a method for overlay text detection and extraction from complex videos. The detection method is based on the observation of transient colors between inserted text and its adjacent background. The transition map is generated first, based on the logarithmic change of intensity and modified saturation. A linked mask is then generated to create connected components for each candidate region, and each of these connected components is reshaped to have smooth boundaries. Research on caption text detection is proposed in [9]. Xiaoqing Liu et al. [6] proposed a method based on the properties of edges. This method is not sensitive to image color or intensity and can handle both printed and document images effectively. However, since it mainly analyses text in the form of blocks, small image regions and strokes are misidentified as text in areas containing large characters.
The proposed system uses wavelets and different morphological operations in a different way from existing techniques for the identification of the text part, so better efficiency is obtained in the extraction of text. Additional features are also added to the system.
III. Proposed System
The proposed method works based on the fact that the texts present in images have some unique features, which include the properties of edges. The architecture of the proposed system is shown in Fig.2. The proposed system is mainly divided into three modules: the edge map generation module, the text area segmentation module, and the text recognition module. The input given to the system is an image, and the output obtained is in the form of a text file.
3.1 Edge Map Generation Module
The input is passed to this module. The input image can be in grayscale or color, in a compressed or uncompressed format. This module contains several steps; first, the input image undergoes a preprocessing stage.
Fig.2. Architecture of the proposed system
3.2 Preprocessing
Edge strength and density are fundamental characteristics of text embedded in a complex image, and they serve as vital features for text extraction. In the proposed algorithm, the input is a color or a grayscale image. If the input is a color image, it undergoes the preprocessing stage, in which the color image is converted to a grayscale image using the equation below.

Y = max(R, G, B)   (1)

The RGB image is converted to the Hue-Saturation-Value (HSV) color space; Y in the above equation refers to the value component of the HSV color space. Thus, the RGB color image is converted to a grayscale image. The image can also be processed by converting it to the YCbCr color space and taking Y, the luminance part, for processing. By applying a median filter to this grayscale image, the noise can be filtered out. The best-known order-statistics filter is the median filter: it replaces the value of a pixel by the median of the gray levels in the neighborhood of that pixel. Median filtering has excellent noise-reduction capabilities with less blurring than linear smoothing filters of similar size, and it is very effective against bipolar and unipolar impulse noise. After this filtering step, most of the noise present in the image is removed while the edges are preserved. The image Y is then processed with the 2-D discrete wavelet transform.
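As an illustration of this preprocessing step, the sketch below (a minimal NumPy version with hypothetical function names, not the authors' code) takes the HSV value component as the grayscale image and applies a 3x3 median filter:

```python
import numpy as np

def to_gray_hsv_value(rgb):
    """Grayscale as the HSV value component: Y = max(R, G, B) per pixel."""
    return rgb.max(axis=2)

def median_filter3(img):
    """3x3 median filter; borders handled by edge-padding."""
    padded = np.pad(img, 1, mode="edge")
    # Stack the nine shifted views of the image and take the per-pixel median.
    stack = np.stack([padded[r:r + img.shape[0], c:c + img.shape[1]]
                      for r in range(3) for c in range(3)])
    return np.median(stack, axis=0)

# Example: a single white impulse on a dark background is removed,
# as expected for impulse noise.
noisy = np.zeros((5, 5))
noisy[2, 2] = 255.0
print(median_filter3(noisy)[2, 2])  # 0.0
```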
3.3 Discrete Wavelet Transform
In this system, the proposed algorithm uses the Haar discrete wavelet transform. The Haar DWT provides a powerful tool that models most of the characteristics of the image. Textured images are mostly well characterized by their edges. On applying the Discrete Wavelet Transform (DWT), the image is decomposed into components of the frequency domain [8]: four sub-bands, i.e., one average component and three detail components. To obtain the components, the row and column directions are handled separately. First, a high-pass filter (HPF) G and a low-pass filter (LPF) H are applied to each row of data, and the results are downsampled by 2 to obtain the high- and low-frequency components of the rows. Next, the high- and low-pass filters are applied again to each column of the high- and low-frequency components, and the results are downsampled by 2. Through this processing, the four sub-band images are generated: HH1, HL1, LH1 and LL1.
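The row/column filtering and downsampling described above can be sketched in a few lines of NumPy. This is a minimal single-level Haar decomposition under the common averaging/differencing convention; the function name and the exact sub-band labelling (which branch is called LH vs. HL varies across texts) are assumptions, not taken from the paper:

```python
import numpy as np

def haar_dwt2(img):
    """One level of the 2-D Haar DWT.
    Low-pass = pairwise average, high-pass = pairwise difference,
    each followed by downsampling by 2 (rows first, then columns)."""
    img = img[:img.shape[0] // 2 * 2, :img.shape[1] // 2 * 2].astype(float)
    # Row direction: filter each row, then downsample by 2.
    lo_r = (img[:, 0::2] + img[:, 1::2]) / 2.0
    hi_r = (img[:, 0::2] - img[:, 1::2]) / 2.0
    # Column direction on each branch, again downsampled by 2.
    LL = (lo_r[0::2, :] + lo_r[1::2, :]) / 2.0
    LH = (lo_r[0::2, :] - lo_r[1::2, :]) / 2.0
    HL = (hi_r[0::2, :] + hi_r[1::2, :]) / 2.0
    HH = (hi_r[0::2, :] - hi_r[1::2, :]) / 2.0
    return LL, LH, HL, HH

# A constant image has all its energy in the average (LL) sub-band.
flat = np.full((4, 4), 8.0)
LL, LH, HL, HH = haar_dwt2(flat)
print(LL[0, 0], LH.max(), HL.max(), HH.max())  # 8.0 0.0 0.0 0.0
```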
Fig.3. Block diagram of 2D DWT
The detail sub-bands containing the vertical details LH1, the horizontal details HL1 and the diagonal details HH1 are used to detect text edges present in the original image. The process of applying the DWT to an image is shown in Fig.3, where D.S. represents downsampling of the image by 2.
Since filtering is done before applying the DWT, the effect of noise in the components is reduced. An edge detection method is then applied to each component. The wavelet function ψ(t) and the scaling function φ(t) of the Haar wavelet are defined below.

ψ(t) = 1 for 0 ≤ t < 1/2; -1 for 1/2 ≤ t < 1; 0 otherwise   (2)

φ(t) = 1 for 0 ≤ t < 1; 0 otherwise   (3)

The Haar wavelet is simpler than other wavelets, which reduces the complexity of the algorithm. A further advantage of the Haar wavelet is that it allows perfect localization in the transform domain: its coefficients are either 1 or -1, and it is real, symmetric and orthogonal. On applying it, the portions with higher edge strength in identical directions can be found, after which a threshold is calculated to filter out infeasible edges. The second derivative of intensity is employed in the measurement of edge strength, as it provides improved detection of intensity transitions and leads to a characterization of text in images. Traditional edge detection filters may provide a similar result, but they cannot detect the three kinds of edges at a time; therefore, traditional edge detection filters work more slowly than the 2-D DWT.
3.4 Text Region Detection
The process of identifying text regions can be split into two sub-problems: detection and localization. In the detection step, general regions of the frame are classified as text or non-text. The size and shape of these regions differ from algorithm to algorithm; for example, some algorithms classify 8x8 pixel blocks, while others classify individual scan lines. In the localization step, the results of detection are grouped together to form one or more text instances, usually represented as a bounding box around each text instance. For the detection of the text region, the three detail subcomponents obtained by applying the Haar DWT are used. The Sobel edge detection algorithm is applied to each detail component. Thus hc, vc and dc are the images obtained after edge detection in the horizontal, vertical and diagonal components respectively: hc contains all the edges in the horizontal direction, vc contains all the edges in the vertical direction, and dc contains all the edges in the diagonal direction. An edge map is then generated from hc, vc and dc using a weighted OR operation. The equation (4) used for the generation of the edge map is shown below:
I = (40 * hc + 70 * vc + 30 * dc) (4)
Here I is the image obtained after applying the edge map, and hc, vc and dc denote the horizontal, vertical and diagonal subcomponents respectively after the Sobel edge detection algorithm is applied. The edge map of an example image is shown in Fig 4. This algorithm uses the Sobel edge detector as it is efficient at locating the strong edges present. The image I now shows edges in all the possible text regions.
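A rough sketch of the edge-map computation of equation (4), assuming the three detail subbands are available as arrays. Binarizing each Sobel response before the weighted sum is one plausible reading of the paper's "weighted OR", and `edge_map` is a hypothetical helper name.

```python
import numpy as np
from scipy import ndimage

def edge_map(h_sub, v_sub, d_sub):
    """Weighted combination of Sobel edges from the three detail subbands.

    Each Sobel magnitude is binarized before the weighted sum (one reading
    of the paper's "weighted OR"); the weights are those of equation (4).
    """
    def sobel_mask(x):
        gx = ndimage.sobel(x, axis=1)
        gy = ndimage.sobel(x, axis=0)
        return np.hypot(gx, gy) > 0          # binary edge mask

    hc = sobel_mask(h_sub)                   # horizontal-direction edges
    vc = sobel_mask(v_sub)                   # vertical-direction edges
    dc = sobel_mask(d_sub)                   # diagonal-direction edges
    return 40 * hc + 70 * vc + 30 * dc       # I = 40*hc + 70*vc + 30*dc
```

Pixels where edges appear in several subbands accumulate larger weights, so stronger responses mark the most likely text regions.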
5. Automatic Text Extraction System For Complex Images
DOI: 10.9790/0661-1803012329 www.iosrjournals.org 27 | Page
Fig.4. Edgemap of the image obtained
3.5 Text Area Segmentation
After the edge map is obtained, the image is binarized by a thresholding operation, using the Otsu algorithm. The thresholding operation removes the non-text regions identified so far. The steps involved in computing the Otsu threshold are:
• Reshape the image into a one-dimensional array.
• Compute the histogram, i.e. the number of pixels at each intensity level.
• Initialize an array of candidate thresholds from 0 to 255.
• Step through all possible thresholds up to the maximum intensity.
• For each threshold, compute the weight, mean and variance of the foreground and of the background.
• Calculate the within-class variance: weight of the foreground * variance of the foreground + weight of the background * variance of the background.
• Find the threshold that gives the minimum within-class variance.
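The steps above can be sketched as a direct (unoptimized) implementation; `otsu_threshold` is an illustrative name and 8-bit intensities are assumed.

```python
import numpy as np

def otsu_threshold(image):
    """Return the threshold that minimizes the within-class variance."""
    flat = np.asarray(image).ravel()                 # reshape to 1-D
    hist, _ = np.histogram(flat, bins=256, range=(0, 256))
    levels = np.arange(256)                          # intensity levels 0..255
    best_t, best_var = 0, np.inf
    for t in range(1, 256):                          # all candidate thresholds
        back, fore = hist[:t], hist[t:]
        wb, wf = back.sum(), fore.sum()              # class weights
        if wb == 0 or wf == 0:
            continue                                 # one class is empty
        mb = (levels[:t] * back).sum() / wb          # background mean
        mf = (levels[t:] * fore).sum() / wf          # foreground mean
        vb = ((levels[:t] - mb) ** 2 * back).sum() / wb   # background variance
        vf = ((levels[t:] - mf) ** 2 * fore).sum() / wf   # foreground variance
        within = wb * vb + wf * vf                   # within-class variance
        if within < best_var:
            best_var, best_t = within, t
    return best_t
```

For a strongly bimodal image the within-class variance drops to zero anywhere between the two modes, so the returned threshold cleanly separates them.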
The threshold found this way is used for the binarization process. The localization step further enhances the text regions by eliminating non-text regions from the image. One of the main properties text exhibits is that its characters appear close to each other in the image, so they form a cluster. Taking this property into account, morphological operations are used. The possible text pixels can be clustered together using the dilation operation, making it possible to eliminate pixels that lie far from the candidate text regions. Dilation is an operation that expands or enhances the region of interest, using a structuring element of the required shape and size. A large structuring element is used in the dilation process so that regions lying close to each other are merged. Morphological operations thus localize the text part clearly.
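The clustering effect of dilation can be shown with a small sketch: two nearby blobs standing in for adjacent characters merge into one region after dilation with a large structuring element (a square is used here for simplicity; the paper uses line and disk elements).

```python
import numpy as np
from scipy import ndimage

# Two nearby text-like blobs, four pixels apart.
mask = np.zeros((20, 40), dtype=bool)
mask[8:12, 5:10] = True        # first "character"
mask[8:12, 14:19] = True       # second "character"

# Dilate with a large (5x5) structuring element.
dilated = ndimage.binary_dilation(mask, structure=np.ones((5, 5)))

_, n_before = ndimage.label(mask)      # components before dilation
_, n_after = ndimage.label(dilated)    # components after dilation
```

Before dilation the two blobs are separate connected components; afterwards they form a single cluster, which is exactly the grouping behaviour the localization step relies on.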
The morphological operations performed are erosion and dilation, with line and disk structuring elements. The identified text portions are then segmented in this phase. For that, the connected components are labeled using 8-connectivity, and a set of shape properties and measurements is computed for each connected component. Only the area and bounding box measurements are considered, as per the requirement. The area of a connected component is a scalar representing the number of pixels in that region.
A bounding box is placed around each identified connected component; it must be the smallest box containing the required text region [11]. Its width, height and upper-left corner position are identified. A new value is computed by multiplying the height and width of the bounding box, and the ratio of this value to the area is taken. If the ratio is less than 1.5, the region is considered a text region. The image resulting from the dilation operation may still contain some non-text regions or noise that need to be eliminated.
To eliminate noise blobs present in the image, area filtering is performed: only those regions in the final image whose area is greater than or equal to 1/15 of the area of the largest identified region are retained.
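The two filters described above (the bounding-box/area ratio below 1.5 and the 1/15-of-maximum area rule) can be sketched together as follows; `filter_regions` is a hypothetical helper name, and the scipy routines stand in for whatever labeling implementation is actually used.

```python
import numpy as np
from scipy import ndimage

def filter_regions(binary, ratio_max=1.5, area_frac=1.0 / 15.0):
    """Keep components with a text-like bounding-box/area ratio and with
    area at least 1/15 of the largest region's area."""
    # Label with 8-connectivity, as in the paper.
    labeled, n = ndimage.label(binary, structure=np.ones((3, 3)))
    if n == 0:
        return np.zeros_like(binary, dtype=bool)
    slices = ndimage.find_objects(labeled)              # bounding boxes
    areas = ndimage.sum(binary, labeled, index=range(1, n + 1))
    max_area = areas.max()
    keep = np.zeros_like(binary, dtype=bool)
    for lab in range(1, n + 1):
        sl = slices[lab - 1]
        h = sl[0].stop - sl[0].start                    # bounding-box height
        w = sl[1].stop - sl[1].start                    # bounding-box width
        area = areas[lab - 1]                           # pixel count
        if (h * w) / area < ratio_max and area >= area_frac * max_area:
            keep[labeled == lab] = True                 # likely text region
    return keep
```

A solid block passes (its bounding box is nearly filled, ratio close to 1), while a thin diagonal streak fails the ratio test and is discarded as noise.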
The Fig 5(a) shows the image after morphological operations and Fig 5(b) gives the image after text
extraction.
Fig.5. (a) Image after morphological operations, (b) Output image after segmentation.
3.6 Text Recognition
The image obtained from the previous stage is the input to this phase. Text recognition is done using the Tesseract OCR engine. The binary image with polygonal text regions defined is given as input to Tesseract. The processing follows a traditional pipeline. The first step is connected component analysis, in which the outlines of the components are stored. This is a computationally expensive design decision, but it has a significant advantage. At this stage, outlines are gathered together, by nesting, into blobs. The blobs thus obtained are organized into text lines, and the lines and regions identified are analyzed as fixed-pitch or proportional text. The text lines are then divided into words according to the character spacing: fixed-pitch text is chopped immediately using the character cells, while proportional text is broken into words using definite spaces. Recognition is a two-pass process. In the first pass, an attempt is made to recognize each word; every word that is satisfactorily recognized is passed to an adaptive classifier as training data, so that the adaptive classifier can recognize text further down the page more accurately. When the first pass is complete, a second pass is executed over the page, in which the words that were not recognized well enough in the first pass are recognized again. A final phase resolves spaces and locates small text by checking alternative hypotheses for the x-height. After recognition, the UTF-8 codes of the characters are returned; these are easily converted to the corresponding characters and displayed as output in the form of a text file.
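The final step reduces to decoding the returned UTF-8 bytes and writing them to a text file. A minimal sketch, where `raw` stands in for the bytes returned after recognition (the string content here is made up, not actual OCR output):

```python
# Hypothetical UTF-8 bytes standing in for the recognizer's output.
raw = "SPEED LIMIT 60".encode("utf-8")

text = raw.decode("utf-8")                   # UTF-8 codes -> characters
with open("recognized.txt", "w", encoding="utf-8") as f:
    f.write(text)                            # display the output as a text file
```

Because the file is plain text, the recognized content can then be edited freely, as the system allows.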
IV. Experiments and Results
The system was tested with a large number of images, and the results obtained are described here. It is clear from Table 1 that the proposed method has better precision and recall rates than other existing techniques. The method proposed by Samarabandhu et al. shows 91.8% precision and 96.6% recall, whereas the technique given by J. Yang et al. has lower rates of about 84.90% precision and 90.0% recall. Our method shows a 99.65% precision rate and a 99.8% recall rate. The other existing methods, proposed by Kim et al. and J. Gllavata et al., exhibit even lower rates, as given in Table 1.
The accuracy of the algorithm in the proposed method is computed by counting the number of correctly located characters and comparing it against the ground truth. The precision and recall rates are calculated by equations (5) and (6):
Precision rate = (correctly located characters / total characters located) x 100 (5)
Recall rate = (correctly located characters / characters in the ground truth) x 100 (6)
Comparisons with some existing methods are shown in Table 1, which shows a clear improvement over existing methods. The performance figures for the other methods are cited from the published works. The proposed system is implemented using Matlab [13].
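Equations (5) and (6) reduce to the standard precision and recall ratios over character counts; a minimal sketch, assuming those usual definitions:

```python
def precision_recall(correct, located, ground_truth):
    """Precision and recall rates (in %) over character counts, assuming the
    standard definitions: precision = correct/located detections,
    recall = correct/ground-truth characters."""
    precision = 100.0 * correct / located
    recall = 100.0 * correct / ground_truth
    return precision, recall
```

For example, 90 correctly located characters out of 100 detections against a 120-character ground truth gives 90% precision and 75% recall.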
Method | Precision rate (%) | Recall rate (%)
Proposed method | 99.65 | 99.8
Samarabandhu et al. [6] | 91.8 | 96.6
J. Yang et al. [11] | 84.90 | 90.0
K.C. Kim et al. [10] | 63.7 | 82.8
J. Gllavata et al. [3] | 83.9 | 88.7
Table 1. Comparisons with existing methods
Fig.6. The GUI showing the input image and the image with the text portion
V. Conclusion
In this paper, a relatively simple, fast and effective method for text detection and extraction is proposed. The method uses the DWT for the efficient working of the algorithm. The process requires little processing time, which is essential for real-time applications, and shows a high precision rate. Most methods fail when the characters are not well aligned or are too small; those methods also miss characters that have very poor contrast with the background. The proposed method, in contrast, is not sensitive to the color or intensity of the image, nor to uneven illumination and reflection effects.
This method can be used in a large variety of application fields such as vehicle license plate detection, object identification, identification of various parts in industrial automation, mobile robot navigation (where it helps to detect text-based landmarks), and the analysis of technical papers containing maps, charts, electric circuits, etc. The algorithm handles both scene text images and document images effectively. Even though there are a large number of algorithms in this area, it is observed that there is no single unified approach or algorithm that fits all applications.
References
[1] Xin Zhang, Fuchun Sun, Lei Gu, "A Combined Algorithm for Video Text Extraction", Seventh International Conference on Fuzzy Systems and Knowledge Discovery, 2010.
[2] Ohya, A. Shio, and S. Akamatsu, "Recognizing Characters in Scene Images", IEEE Transactions on Pattern Analysis and Machine Intelligence, 1994, 214-224.
[3] Gllavata, R. Ewerth, and B. Freisleben, "A Robust Algorithm for Text Detection in Images", Proceedings of the 3rd International Symposium on Image and Signal Processing and Analysis (ISPA), pp. 611-616, 2003.
[4] Zhong, Yu., Karu, K., and Jain, A.K., "Locating Text in Complex Color Images", Proceedings of the Third International Conference on Document Analysis and Recognition, 1995, 1, 14-16: 146-149.
[5] Wonjun Kim, Changick Kim, "A New Approach for Overlay Text Detection and Extraction From Complex Video Scene", IEEE Transactions on Image Processing, Vol. 18, No. 2, pp. 401-411, 2009.
[6] Xiaoqing Liu, Jagath Samarabandu, "Multiscale Edge-Based Text Extraction from Complex Images", International Conference on Multimedia and Expo, pp. 1721-1724, 2006.
[7] Xiao-Wei Zhang, Xiong-Bo Zheng, Zhi-Juan Weng, "Text Localization Algorithm Under Background Image Using Wavelet Transforms", Proceedings of the 2008 International Conference on Wavelet Analysis and Pattern Recognition, Hong Kong, pp. 30-31, Aug. 2008.
[8] N. Otsu, "A Threshold Selection Method from Gray-Level Histograms", IEEE Transactions on Systems, Man and Cybernetics, 1979.
[9] Debapratim Sarkar, Raghunath Ghosh, "A Bottom-Up Approach of Line Segmentation from Handwritten Text", 2009.
[10] K.C. Kim, H.R. Byun, Y.J. Song, Y.W. Choi, S.Y. Chi, K.K. Kim and Y.K. Chung, "Scene Text Extraction in Natural Scene Images using Hierarchical Feature Combining and Verification", Proceedings of the 17th International Conference on Pattern Recognition (ICPR'04), IEEE.
[11] J. Yang, J. Gao, Y. Zhang, X. Chen and A. Waibel, "An Automatic Sign Recognition and Translation System", Proceedings of the Workshop on Perceptive User Interfaces (PUI'01), 2001, pp. 1-8.
[12] A.K. Jain, "Fundamentals of Digital Image Processing", Englewood Cliffs, NJ: Prentice Hall, 1989.