The document summarizes a student project on real-time scene text localization and recognition. It includes an introduction describing the challenges of natural scene text detection compared to well-formatted documents. It also describes previous work on text localization and recognition methods, the proposed image processing and MATLAB software used in the project, experimental results and analysis, and conclusions. The project aims to develop a new video scene text detection method using image enhancement and a Bayesian classifier to classify text pixels without prior image knowledge.
Bangla Optical Digits Recognition using Edge Detection Method (IOSR Journals)
Abstract: This paper addresses Bangla Optical Digit Recognition (ODR) using an edge detection technique. The Bangla digit image is first converted to gray-scale and stored as an M by N array. The input data are off-line printed digit images collected from computer-generated images, scanned documents, or printed text. In the gray-scale array, a value of 255 denotes pure white, 0 (zero) denotes pure black, and intermediate values denote mixtures of white and dark. In the next stage, the four edge touch points, together with each touch point's ratio, are used as parameters to identify each Bangla digit uniquely. Keywords: Edge, image, gray-scale, matrix, ODR.
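The array representation described above can be sketched as follows. This is a hypothetical illustration (the toy 5x5 digit and the `edge_touch_points` helper are not from the paper): 255 marks white space, 0 marks dark space, and dark pixels touching each of the four image edges are counted.

```python
import numpy as np

# Toy 5x5 "digit" image: 255 = white background, 0 = dark stroke.
img = np.array([
    [255, 255,   0, 255, 255],
    [255,   0, 255,   0, 255],
    [  0, 255, 255, 255,   0],
    [255,   0, 255,   0, 255],
    [255, 255,   0, 255, 255],
], dtype=np.uint8)

def edge_touch_points(gray, dark_thresh=128):
    """Count dark pixels touching each of the four image edges."""
    top    = int(np.sum(gray[0, :]  < dark_thresh))
    bottom = int(np.sum(gray[-1, :] < dark_thresh))
    left   = int(np.sum(gray[:, 0]  < dark_thresh))
    right  = int(np.sum(gray[:, -1] < dark_thresh))
    return top, bottom, left, right

print(edge_touch_points(img))  # -> (1, 1, 1, 1)
```

The paper additionally uses ratios derived from these touch points; the counting step above is only the first ingredient.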
The growing volume of online image sharing and downloading today mandates better encoding and decoding schemes. This paper looks into this issue of image coding. Multiple Description Coding is an encoding and decoding scheme specially designed to provide more error resilience for data transmission over lossy channels. This work attempts to address the problem of reconstructing a high-quality image from just one descriptor rather than the full set of conventional descriptors. It compares the use of Type I and Type II quantizers. We propose and compare four coders by examining the quality of the reconstructed images: the JPEG HH (Horizontal Pixel Interleaving with Huffman Coding) model, the JPEG HA (Horizontal Pixel Interleaving with Arithmetic Encoding) model, the JPEG VH (Vertical Pixel Interleaving with Huffman Encoding) model, and the JPEG VA (Vertical Pixel Interleaving with Arithmetic Encoding) model. The findings suggest that the choice between horizontal and vertical pixel interleaving has little effect on the results, whereas the choice of quantizer greatly affects performance.
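Horizontal pixel interleaving, the descriptor-splitting step shared by the coders above, can be sketched as below. This is a minimal illustration under assumed details (even/odd column split, nearest-column reconstruction); the quantizers and entropy coders from the paper are omitted.

```python
import numpy as np

def split_horizontal(img):
    """Horizontal pixel interleaving: even columns form one
    description, odd columns the other."""
    return img[:, 0::2], img[:, 1::2]

def reconstruct_from_one(desc, width):
    """Estimate the full image from a single description by
    repeating each received column into the missing position."""
    return np.repeat(desc, 2, axis=1)[:, :width]

img = np.arange(16, dtype=np.uint8).reshape(4, 4)
d0, d1 = split_horizontal(img)
est = reconstruct_from_one(d0, img.shape[1])
print(d0.shape, d1.shape, est.shape)  # (4, 2) (4, 2) (4, 4)
```

If both descriptions arrive, interleaving them back together restores the image exactly; the point of the scheme is that either one alone still yields a usable estimate.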
Modified Skip Line Encoding for Binary Image Compression (idescitation)
This paper proposes a modified skip line encoding technique for lossless compression of binary images. Skip line encoding exploits correlation between successive scan lines by encoding only one line and skipping similar lines. The proposed technique improves upon existing skip line encoding by allowing a scan line to be skipped if a similar line exists anywhere in the image, rather than just successive lines. Experimental results on sample images show the modified technique achieves higher compression ratios than conventional skip line encoding.
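A minimal sketch of the modified scheme's key idea, skipping a scan line when an identical line has appeared anywhere earlier in the image, might look like this (the function names and tuple encoding are illustrative, not the paper's actual bitstream format):

```python
def skip_line_encode(rows):
    """Encode each scan line either literally or as a reference to
    an identical, previously seen line anywhere in the image."""
    seen = {}  # line content -> index where it first appeared
    out = []
    for i, row in enumerate(rows):
        key = bytes(row)
        if key in seen:
            out.append(('skip', seen[key]))   # reference earlier line
        else:
            seen[key] = i
            out.append(('line', row))
    return out

def skip_line_decode(encoded):
    rows = []
    for tag, payload in encoded:
        rows.append(rows[payload] if tag == 'skip' else payload)
    return rows

image = [[0, 0, 1, 1], [0, 1, 1, 0], [0, 0, 1, 1], [0, 1, 1, 0]]
enc = skip_line_encode(image)
assert skip_line_decode(enc) == image  # lossless round trip
```

Conventional skip line encoding would only detect the repeat of an immediately preceding line; here both repeated lines are replaced by back-references, illustrating the claimed compression gain.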
Corrosion Detection Using A.I : A Comparison of Standard Computer Vision Tech... (csandit)
In this paper we present a comparison between standard computer vision techniques and a Deep Learning approach for automatic metal corrosion (rust) detection. For the classic approach, a classification based on the number of pixels containing specific red components has been utilized. The code, written in Python, used OpenCV libraries to compute and categorize the images. For the Deep Learning approach, we chose Caffe, a powerful framework developed at the Berkeley Vision and Learning Center (BVLC). The test has been performed by classifying images and calculating the total accuracy for the two different approaches.
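The classic red-pixel-counting approach can be sketched roughly as follows, here in plain NumPy rather than the paper's OpenCV code; the margin and threshold values are assumptions chosen for illustration.

```python
import numpy as np

def rust_ratio(rgb, red_margin=40):
    """Fraction of pixels whose red channel exceeds both green and
    blue by a margin -- a crude proxy for rust colouring."""
    r = rgb[..., 0].astype(int)
    g = rgb[..., 1].astype(int)
    b = rgb[..., 2].astype(int)
    reddish = (r - g > red_margin) & (r - b > red_margin)
    return reddish.mean()

def classify(rgb, threshold=0.05):
    """Label an image 'rust' when enough of it looks reddish."""
    return 'rust' if rust_ratio(rgb) > threshold else 'clean'

# Synthetic 4x4 test patch: top half rust-coloured, bottom half grey.
patch = np.zeros((4, 4, 3), dtype=np.uint8)
patch[:2] = (180, 60, 40)    # rusty orange-brown
patch[2:] = (120, 120, 120)  # bare metal grey
print(rust_ratio(patch), classify(patch))  # 0.5 rust
```

This is exactly the kind of hand-tuned colour heuristic the paper compares against a learned Caffe classifier, which needs no explicit colour rule.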
RGBEXCEL: AN RGB IMAGE DATA EXTRACTOR AND EXPORTER FOR EXCEL PROCESSING (sipij)
The objective of this paper was to develop a means of rapidly obtaining RGB image data, as part of an
effort to develop a low-cost method of image processing and analysis based on Microsoft Excel. A simple
standalone GUI (graphical user interface) software application called RGBExcel was developed to extract
RGB image data from any colour image files of any format. For a given image file, the output from the
software is an Excel file with the data from the R (red), G (green), and B (blue) bands of the image
contained in different sheets. The raw data and any enhancements can be visualized by using the surface
chart type in combination with other features. Since Excel can plot a maximum dimension of 255 by 255
pixels, larger images are downscaled to have a maximum dimension of 255 pixels. Results from testing the
application are discussed in the paper.
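The band extraction and 255-pixel downscaling described above might be sketched like this; plain NumPy subsampling stands in for the application's actual resizing and Excel export, which the abstract does not specify.

```python
import numpy as np

MAX_DIM = 255  # Excel surface charts plot at most 255 x 255 points

def extract_bands(rgb):
    """Split an HxWx3 image into R, G, B bands, downscaled by simple
    subsampling so the largest dimension is at most MAX_DIM."""
    step = max(1, -(-max(rgb.shape[:2]) // MAX_DIM))  # ceil division
    small = rgb[::step, ::step]
    return {'R': small[..., 0], 'G': small[..., 1], 'B': small[..., 2]}

img = np.random.randint(0, 256, size=(600, 400, 3), dtype=np.uint8)
bands = extract_bands(img)
print({k: v.shape for k, v in bands.items()})
# each band is 200 x 134 -- within the 255-point limit
```

In the real tool each returned band would be written to its own worksheet, where Excel's surface chart can render it as a false-colour plot.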
Automatic License Plate Detection in Foggy Condition using Enhanced OTSU Tech... (IRJET Journal)
This document presents research on detecting license plates in foggy conditions using an enhanced OTSU technique. The researchers tested their technique on a large database of license plate images taken under different conditions, including clear and foggy images. They evaluated the technique using performance parameters such as MSE, PSNR, SSIM, and aspect ratio. Compared to a base technique, the enhanced OTSU technique showed improvements in these parameters of 14.93%, 14.12%, 39.21%, and 40%, respectively. The technique aims to better handle hazardous imaging conditions, like foggy weather, that existing techniques often struggle with. It uses steps such as image denoising, threshold segmentation, and character extraction to read license plates in low-visibility situations.
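For reference, the standard Otsu thresholding that the enhanced technique builds on can be sketched as follows; this is textbook Otsu (maximize between-class variance over all candidate thresholds), not the paper's enhancement.

```python
import numpy as np

def otsu_threshold(gray):
    """Classic Otsu: choose the threshold maximizing between-class
    variance of the foreground/background split."""
    hist = np.bincount(gray.ravel(), minlength=256).astype(float)
    total = hist.sum()
    global_mean = (hist * np.arange(256)).sum() / total
    best_t, best_var = 0, -1.0
    cum_w = cum_mean = 0.0
    for t in range(256):
        cum_w += hist[t]
        cum_mean += t * hist[t]
        if cum_w == 0 or cum_w == total:
            continue
        w0 = cum_w / total
        m0 = cum_mean / cum_w
        m1 = (global_mean * total - cum_mean) / (total - cum_w)
        var_between = w0 * (1 - w0) * (m0 - m1) ** 2
        if var_between > best_var:
            best_var, best_t = var_between, t
    return best_t

# Bimodal toy image: dark plate background vs bright characters.
img = np.array([20] * 50 + [200] * 50, dtype=np.uint8).reshape(10, 10)
t = otsu_threshold(img)
print(t)  # separates the dark mode (20) from the bright mode (200)
```

Fog flattens this histogram and pushes the two modes together, which is why the paper adds denoising before the thresholding step.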
Implementation of Computer Vision Applications using OpenCV in C++ (IRJET Journal)
This document discusses implementing computer vision applications using OpenCV in C++. It explores integrating OpenCV and C++ for developing robust CV applications. Examples are presented demonstrating seamless integration, including facial recognition, motion detection, and augmented reality. Four use cases are implemented: color detection, barcode decoding, text recognition, and text on images. While some pre-trained libraries were unavailable in C++, the expected outcomes were achieved through alternative methods using OpenCV functions and algorithms. The future of OpenCV in C++ is discussed to include industrial automation, security, real-time systems, and integration with deep learning.
Optical Character Recognition from Text Image (Editor IJCATR)
Optical Character Recognition (OCR) is a system that provides a full alphanumeric recognition of printed or handwritten
characters by simply scanning the text image. OCR system interprets the printed or handwritten characters image and converts it into
corresponding editable text document. The text image is divided into regions by isolating each line, then individual characters with
spaces. After character extraction, the texture and topological features like corner points, features of different regions, ratio of
character area and convex area of all characters of text image are calculated. Previously features of each uppercase and lowercase
letter, digit, and symbols are stored as a template. Based on the texture and topological features, the system recognizes the exact
character using feature matching between the extracted character and the template of all characters as a measure of similarity.
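The template-matching step can be sketched as below. The feature values and the three-entry template table are invented for illustration; the paper's actual texture and topological features (corner points, area ratios, and so on) would populate much longer vectors.

```python
import numpy as np

# Hypothetical per-character feature templates (e.g. corner count,
# character-area ratio, convex-area ratio), built from clean glyphs.
templates = {
    'A': np.array([3.0, 0.45, 0.80]),
    'B': np.array([4.0, 0.60, 0.95]),
    'C': np.array([2.0, 0.30, 0.55]),
}

def recognize(feature_vec):
    """Return the template character closest to the extracted
    feature vector (Euclidean distance as the similarity measure)."""
    return min(templates,
               key=lambda ch: np.linalg.norm(templates[ch] - feature_vec))

extracted = np.array([2.1, 0.33, 0.50])  # features of a segmented glyph
print(recognize(extracted))  # -> C
```

In the full system this lookup runs once per segmented character, after line and character isolation, producing the editable text document.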
FAST AND EFFICIENT IMAGE COMPRESSION BASED ON PARALLEL COMPUTING USING MATLAB (Journal For Research)
Image compression is used in many applications, for example satellite imaging, medical imaging, and video, where images require a great deal of storage space; in such applications compression can be used effectively. There are two types of image compression techniques, lossy and lossless. Both are used for compressing images, but neither is fast: they take considerable time for compression and decompression. For fast and efficient image compression, a parallel computing technique is used in MATLAB, which this project employs for parallel processing of images. In this paper we discuss the regular image compression technique, three alternatives for parallel computing using MATLAB, and a comparison of image compression with and without parallel computing.
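The tile-parallel idea can be sketched in Python (rather than the paper's MATLAB) using threads and zlib for lossless compression; the tile count and the byte-level split are assumptions for illustration.

```python
import zlib
from concurrent.futures import ThreadPoolExecutor

def compress_tile(tile: bytes) -> bytes:
    return zlib.compress(tile, level=9)

def parallel_compress(image_bytes: bytes, n_tiles: int = 4) -> list:
    """Split raw image bytes into tiles and compress them
    concurrently; each tile decompresses independently."""
    size = -(-len(image_bytes) // n_tiles)  # ceil division
    tiles = [image_bytes[i:i + size]
             for i in range(0, len(image_bytes), size)]
    with ThreadPoolExecutor(max_workers=n_tiles) as pool:
        return list(pool.map(compress_tile, tiles))

raw = bytes(range(256)) * 64          # 16 KiB of sample "pixel" data
chunks = parallel_compress(raw)
restored = b''.join(zlib.decompress(c) for c in chunks)
assert restored == raw                # lossless round trip
```

Threads give real concurrency here because zlib releases the GIL during compression; MATLAB's `parfor` achieves the analogous split across workers.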
Mobile Based Application to Scan the Number Plate and To Verify the Owner Det... (inventionjournals)
Any license plate recognition system usually passes through three image-processing steps: 1) extraction of the license plate region; 2) segmentation of the plate characters; and 3) recognition of each character. A number of algorithms have been proposed in recent times for handling this application efficiently. The purpose of this project was to develop a real-time application which recognizes number plates of cars at a gate, for example at the entrance of a parking area or a border crossing. The system, based on a regular PC with a mobile camera, captures video frames which include a visible car number plate and processes them. Once a number plate is detected, its digits are recognized, displayed on the user interface, or checked against a database. The software aspect of the system runs on mobile hardware and can be linked to other applications or databases. It first uses a series of image manipulation techniques to detect, normalize, and enhance the image of the number plate, and then optical character recognition (OCR) to extract the alphanumeric text of the number plate. Such systems are generally deployed in one of two basic approaches: one performs the entire process at the lane location in real time; the other reveals the driver's profile by checking against the registered database.
Comparative Analysis of Lossless Image Compression Based On Row By Row Classi... (IJERA Editor)
This document proposes and evaluates a near lossless image compression algorithm that divides color images into red, green, and blue channels. It classifies pixels in each channel row-by-row and records the results in mask images. The image data is then decomposed into sequences based on the classification and the mask images are hidden in the least significant bits of the sequences. Different encoding schemes like LZW, Huffman, and RLE are applied and compared. Experimental results on test images show the proposed algorithm achieves smaller bits per pixel than simple encoding schemes. PSNR values also indicate very little difference between original and reconstructed images.
Survey paper on image compression techniques (IRJET Journal)
This document summarizes and compares several popular image compression techniques: wavelet compression, JPEG/DCT compression, vector quantization (VQ), fractal compression, and genetic algorithm compression. It finds that all techniques perform satisfactorily at 0.5 bits per pixel, but for very low bit rates like 0.25 bpp, wavelet compression techniques like EZW perform best in terms of compression ratio and quality. Specifically, EZW and JPEG are more practical than others at low bit rates. The document also notes advantages and disadvantages of each technique and concludes hybrid approaches may achieve even higher compression ratios while maintaining image quality.
This document provides an overview of using MATLAB for image processing. It describes MATLAB's development environment and basic data structures. It also covers reading, displaying, and saving images, as well as common image processing techniques like point processing, histogram equalization, and color space conversion that can be performed in MATLAB.
The document discusses four different methods for Bangla handwritten digit recognition. Method 1 uses preprocessing techniques like binarization, noise reduction, and segmentation followed by feature extraction and classification with a CNN. It achieves 94% accuracy. Method 2 also uses a CNN called MathNET with data augmentation, achieving 97% accuracy. Method 3 uses preprocessing, HOG feature extraction, and an SVM classifier, achieving 97.08% accuracy. Method 4 develops a dataset, performs data augmentation, uses a multi-layer CNN model with ensembling, and achieves 96.788% accuracy even on noisy images. The methods demonstrate high and improving recognition accuracy for Bangla handwritten digits.
Steganography is an effective method for covert communication of information during data transfer, and images are a suitable carrier for protecting the embedded content. Several schemes, such as color-image and grayscale-image steganography, store data in different ways; color images can carry very large amounts of secret data across their three color channels. Various color models exist, such as HSV (hue, saturation, value), RGB (red, green, blue), YCbCr (luminance and chrominance), YUV, and YIQ. This paper uses an unusual approach to hide data: an adaptive procedure that increases security when hiding a secret binary image in an RGB color image, implementing the steganography in the YCbCr color space. We performed Exclusive-OR (XOR) operations between the binary image and the RGB color image in the YCbCr space. The byte stored in the 8-bit LSB is not the actual secret byte; rather, it is obtained by translating to another color space and applying the XOR operation. The technique was applied to different groups of images, and the adaptive method gives good results in terms of peak signal-to-noise ratio (PSNR) and mean square error (MSE). When compared with our previous work and other existing techniques, it performs best in both error and message capacity. The technique is easy to model, simple to use, and provides strong security against unauthorized access.
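The XOR-before-LSB embedding idea can be sketched as follows on a single-channel image. This simplification uses one of the cover pixel's own higher bits as the XOR key and omits the paper's RGB-to-YCbCr conversion, so it illustrates only the "stored LSB is not the actual secret bit" property.

```python
import numpy as np

def embed_xor(cover, secret_bits, key_bit_plane=1):
    """Store each secret bit XORed with one of the cover pixel's own
    higher bits, then write the result into the LSB."""
    stego = cover.copy().ravel()
    for i, s in enumerate(secret_bits):
        key = (stego[i] >> key_bit_plane) & 1
        stego[i] = (stego[i] & 0xFE) | (s ^ key)
    return stego.reshape(cover.shape)

def extract_xor(stego, n_bits, key_bit_plane=1):
    """Recover the secret bits by repeating the same XOR."""
    flat = stego.ravel()
    return [int((flat[i] & 1) ^ ((flat[i] >> key_bit_plane) & 1))
            for i in range(n_bits)]

cover = np.random.randint(0, 256, size=(8, 8), dtype=np.uint8)
secret = [1, 0, 1, 1, 0, 0, 1, 0]
stego = embed_xor(cover, secret)
assert extract_xor(stego, len(secret)) == secret
```

Because only the LSB plane is rewritten, the key bit plane survives embedding unchanged, which is what makes the extraction XOR consistent.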
Secure compressed image transmission using self organizing feature maps (Alexander Decker)
This document summarizes a research paper that proposes a method for secure compressed image transmission using self-organizing feature maps. The method involves compressing images using SOFM-based vector quantization, entropy coding the results, and encrypting the compressed data using a scrambler before transmission. Simulation results show the method achieves a compression ratio of up to 38:1 while providing security, outperforming JPEG compression by up to 1 dB. The paper presents the technical details and evaluation of the proposed secure image transmission system.
IRJET- An Optimized Approach for Deaf and Dumb People using Air Writing (IRJET Journal)
This document presents a proposed approach for communication using air writing detection and keyword recognition to help deaf and dumb people communicate. The system works by capturing motion from a camera, detecting colors in the HSV color space, preparing the image for character recognition using thresholding, performing optical character recognition to detect characters, and then matching the detected characters to keywords to convey messages through audio or video output. Some benefits of the system include enabling more expressive communication than limited gestures and providing a cost-effective alternative to other sensors. Future work could make the system more compact and wearable for real-time communication.
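The HSV color detection step can be sketched as a simple range mask; the OpenCV-style 0-179 hue scale and the particular thresholds are assumptions, and a real system would first convert camera frames from BGR to HSV.

```python
import numpy as np

def hsv_mask(hsv, h_lo, h_hi, s_min=80, v_min=80):
    """Binary mask of pixels whose hue lies in [h_lo, h_hi] with
    enough saturation and value (OpenCV-style 0-179 hue scale)."""
    h, s, v = hsv[..., 0], hsv[..., 1], hsv[..., 2]
    return (h >= h_lo) & (h <= h_hi) & (s >= s_min) & (v >= v_min)

# Toy HSV frame: one "marker-coloured" pixel (hue ~60 = green).
frame = np.zeros((2, 2, 3), dtype=np.uint8)
frame[0, 0] = (60, 200, 220)
mask = hsv_mask(frame, 50, 70)
print(mask.sum(), np.argwhere(mask).tolist())  # 1 [[0, 0]]
```

Tracking the masked pixel's centroid across frames yields the air-written stroke that is later passed to thresholding and OCR.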
This document is a mini project report on digital image processing using MATLAB. It discusses various image processing techniques and applications implemented in MATLAB, including image formats, operations, and tools. Applications demonstrated include text recognition, color tracking, solving an engineering problem using image processing, creating a virtual slate using laser tracking, face detection, and distance estimation. The report provides examples of MATLAB functions used for tasks like importing, displaying, converting and cropping images, as well as analyzing and manipulating them.
Analysis of color image features extraction using texture methods (TELKOMNIKA JOURNAL)
Digital color images are among the most important types of data currently exchanged, and they are used in many vital applications. Hence, a compact data representation of the image is an important issue. This paper focuses on analyzing different methods used to extract texture features from a color image. These features can be used as a primary key to identify and recognize the image. The proposed discrete wave equation (DWE) method of generating a color image key is presented, implemented, and tested. This method reduces the key size by 85% compared with other methods.
RANDOMIZED STEGANOGRAPHY IN SKIN TONE IMAGES (ijcseit)
Steganography is the technique of hiding a confidential message in an ordinary message and the extraction
of that secret message at its destination. Different carrier file formats can be used in steganography.
Among these carrier file formats, digital images are the most popular. For this work, digital images are
used. Here steganography is done on the skin portion of an image. First skin portion of an image is
detected. Random pixels are selected from that detected region using a pseudo-random number generator.
The bits of the secret message will be embedded on the LSB of these random pixels. An analysis is done to
check the efficiency and robustness of the proposed method. The aim of this work is to show that
steganography done using random pixel selection is less prone to outside attacks.
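A minimal sketch of seeded random pixel selection with LSB embedding follows. For brevity it selects from the whole image rather than only a detected skin region, and the integer seed stands in for whatever key the sender and receiver share.

```python
import random
import numpy as np

def embed(cover, bits, seed):
    """Write message bits into the LSB of pseudo-randomly chosen
    pixels; the shared seed lets the receiver find the same pixels."""
    flat = cover.copy().ravel()
    idx = random.Random(seed).sample(range(flat.size), len(bits))
    for i, b in zip(idx, bits):
        flat[i] = (flat[i] & 0xFE) | b
    return flat.reshape(cover.shape)

def extract(stego, n_bits, seed):
    """Regenerate the same pseudo-random indices and read the LSBs."""
    flat = stego.ravel()
    idx = random.Random(seed).sample(range(flat.size), n_bits)
    return [int(flat[i] & 1) for i in idx]

cover = np.random.randint(0, 256, size=(16, 16), dtype=np.uint8)
message = [0, 1, 1, 0, 1, 0, 0, 1]
stego = embed(cover, message, seed=1234)
assert extract(stego, len(message), seed=1234) == message
```

Scattering the payload this way is what makes sequential-LSB steganalysis harder: without the seed, an attacker cannot tell which pixels carry message bits.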
Canny Edge Detection Algorithm on FPGA (IOSR Journals)
This document summarizes the implementation of the Canny edge detection algorithm on an FPGA. It begins with an introduction to edge detection and digital image processing. It then describes the high-level implementation of the Canny algorithm using Simulink. The design and system-level block diagram of the implementation on an FPGA is shown, including loading an input image and displaying the output. Simulation and synthesis results are presented, showing the resource utilization on a Spartan 3E FPGA board. The implementation provides real-time edge detection to interface an FPGA with a monitor.
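The gradient computation at the heart of Canny can be sketched in software with a 3x3 Sobel convolution, shown here in plain Python/NumPy for clarity; the paper's implementation is in Simulink/FPGA hardware, and full Canny adds Gaussian smoothing, non-maximum suppression, and hysteresis thresholding on top of this stage.

```python
import numpy as np

KX = np.array([[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]])  # Sobel x kernel
KY = KX.T                                            # Sobel y kernel

def sobel_magnitude(gray):
    """Gradient magnitude via 3x3 Sobel convolution -- the first
    stage of the Canny pipeline."""
    h, w = gray.shape
    out = np.zeros((h, w))
    g = gray.astype(float)
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            win = g[y - 1:y + 2, x - 1:x + 2]
            gx = (win * KX).sum()
            gy = (win * KY).sum()
            out[y, x] = np.hypot(gx, gy)
    return out

# Vertical step edge: left half dark, right half bright.
img = np.zeros((5, 6), dtype=np.uint8)
img[:, 3:] = 255
mag = sobel_magnitude(img)
print(mag[2].round().tolist())  # strongest response at the step
```

On an FPGA the same window arithmetic maps onto line buffers and a small multiply-accumulate array, which is what makes real-time operation straightforward.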
This document introduces the MATLAB image processing environment. It discusses how images are represented and loaded in MATLAB. Common image file formats and data types are described. Basic image processing functions for point processing operations like scaling, histogram equalization and thresholding pixels are also introduced. Color spaces and conversions between RGB and other perceptual color models are briefly covered.
This document introduces the MATLAB development environment and how it can be used for image processing. MATLAB is a data analysis and visualization tool that uses matrices as its basic data structure. Images are represented as matrices of pixels. The document describes how to load, display, and save images in MATLAB. It also covers image data types and quantization, including how to convert between data types and access portions of image matrices using indexing.
Using recycled concrete aggregates (RCA) for pavements is crucial to achieving sustainability. Implementing RCA for new pavement can minimize carbon footprint, conserve natural resources, reduce harmful emissions, and lower life cycle costs. Compared to natural aggregate (NA) pavement, RCA pavement has received fewer comprehensive studies and sustainability assessments.
A review on techniques and modelling methodologies used for checking electrom... (nooriasukmaningtyas)
The proper function of the integrated circuit (IC) in a hostile electromagnetic environment has been a serious concern throughout the decades of revolution in the world of electronics, from discrete devices to today's integrated circuit technology, where billions of transistors are combined on a single chip. The automotive industry, and smart vehicles in particular, confronts design issues such as susceptibility to electromagnetic interference (EMI). Electronic control devices compute incorrect outputs because of EMI, and sensors give misleading values, which can prove fatal in the case of automobiles. In this paper, the authors have non-exhaustively reviewed research work concerned with the investigation of EMI in ICs and the prediction of this EMI using various modelling methodologies and measurement setups.
More Related Content
Similar to REAL-TIME SCENE TEXT LOCALIZATION AND RECOGNITION ppt.pptx
Comparative Analysis of Lossless Image Compression Based On Row By Row Classi...IJERA Editor
This document proposes and evaluates a near lossless image compression algorithm that divides color images into red, green, and blue channels. It classifies pixels in each channel row-by-row and records the results in mask images. The image data is then decomposed into sequences based on the classification and the mask images are hidden in the least significant bits of the sequences. Different encoding schemes like LZW, Huffman, and RLE are applied and compared. Experimental results on test images show the proposed algorithm achieves smaller bits per pixel than simple encoding schemes. PSNR values also indicate very little difference between original and reconstructed images.
Survey paper on image compression techniquesIRJET Journal
This document summarizes and compares several popular image compression techniques: wavelet compression, JPEG/DCT compression, vector quantization (VQ), fractal compression, and genetic algorithm compression. It finds that all techniques perform satisfactorily at 0.5 bits per pixel, but for very low bit rates like 0.25 bpp, wavelet compression techniques like EZW perform best in terms of compression ratio and quality. Specifically, EZW and JPEG are more practical than others at low bit rates. The document also notes advantages and disadvantages of each technique and concludes hybrid approaches may achieve even higher compression ratios while maintaining image quality.
This document provides an overview of using MATLAB for image processing. It describes MATLAB's development environment and basic data structures. It also covers reading, displaying, and saving images, as well as common image processing techniques like point processing, histogram equalization, and color space conversion that can be performed in MATLAB.
The document discusses four different methods for Bangla handwritten digit recognition. Method 1 uses preprocessing techniques like binarization, noise reduction, and segmentation followed by feature extraction and classification with a CNN. It achieves 94% accuracy. Method 2 also uses a CNN called MathNET with data augmentation, achieving 97% accuracy. Method 3 uses preprocessing, HOG feature extraction, and an SVM classifier, achieving 97.08% accuracy. Method 4 develops a dataset, performs data augmentation, uses a multi-layer CNN model with ensembling, and achieves 96.788% accuracy even on noisy images. The methods demonstrate high and improving recognition accuracy for Bangla handwritten digits.
Steganography is a best method for in secret communicating information during the transference of data. Images are an appropriate method that used in steganography can be used to protection the simple bits and pieces. Several systems, this one as color scale images steganography and grayscale images steganography, are used on color and store data in different techniques. These color images can have very big amounts of secret data, by using three main color modules. The different color modules, such as HSV-(hue, saturation, and value), RGB-(red, green, and blue), YCbCr-(luminance and chrominance), YUV, YIQ, etc. This paper uses unusual module to hide data: an adaptive procedure that can increase security ranks when hiding a top secret binary image in a RGB color image, which we implement the steganography in the YCbCr module space. We performed Exclusive-OR (XOR) procedures between the binary image and the RGB color image in the YCBCR module space. The converted byte stored in the 8-bit LSB is not the actual bytes; relatively, it is obtained by translation to another module space and applies the XOR procedure. This technique is practical to different groups of images. Moreover, we see that the adaptive technique ensures good results as the peak signal to noise ratio (PSNR) and stands for mean square error (MSE) are good. When the technique is compared with our previous works and other existing techniques, it is shown to be the best in both error and message capability. This technique is easy to model and simple to use and provides perfect security with unauthorized.
11.secure compressed image transmission using self organizing feature mapsAlexander Decker
This document summarizes a research paper that proposes a method for secure compressed image transmission using self-organizing feature maps. The method involves compressing images using SOFM-based vector quantization, entropy coding the results, and encrypting the compressed data using a scrambler before transmission. Simulation results show the method achieves a compression ratio of up to 38:1 while providing security, outperforming JPEG compression by up to 1 dB. The paper presents the technical details and evaluation of the proposed secure image transmission system.
IRJET- An Optimized Approach for Deaf and Dumb People using Air WritingIRJET Journal
This document presents a proposed approach for communication using air writing detection and keyword recognition to help deaf and dumb people communicate. The system works by capturing motion from a camera, detecting colors in the HSV color space, preparing the image for character recognition using thresholding, performing optical character recognition to detect characters, and then matching the detected characters to keywords to convey messages through audio or video output. Some benefits of the system include enabling more expressive communication than limited gestures and providing a cost-effective alternative to other sensors. Future work could make the system more compact and wearable for real-time communication.
This document is a mini project report on digital image processing using MATLAB. It discusses various image processing techniques and applications implemented in MATLAB, including image formats, operations, and tools. Applications demonstrated include text recognition, color tracking, solving an engineering problem using image processing, creating a virtual slate using laser tracking, face detection, and distance estimation. The report provides examples of MATLAB functions used for tasks like importing, displaying, converting and cropping images, as well as analyzing and manipulating them.
Analysis of color image features extraction using texture methodsTELKOMNIKA JOURNAL
A digital color images are the most important types of data currently being traded; they are used in many vital and important applications. Hence, the need for a small data representation of the image is an important issue. This paper will focus on analyzing different methods used to extract texture features for a color image. These features can be used as a primary key to identify and recognize the image. The proposed discrete wave equation DWE method of generating color image key will be presented, implemented and tested. This method showed that the percentage of reduction in the key size is 85% compared with other methods.
RANDOMIZED STEGANOGRAPHY IN SKIN TONE IMAGESijcseit
Steganography is the technique of hiding a confidential message in an ordinary message and the extraction
of that secret message at its destination. Different carrier file formats can be used in steganography.
Among these carrier file formats, digital images are the most popular. For this work, digital images are
used. Here steganography is done on the skin portion of an image. First skin portion of an image is
detected. Random pixels are selected from that detected region using a pseudo-random number generator.
The bits of the secret message will be embedded on the LSB of these random pixels. An analysis is done to
check the efficiency and robustness of the proposed method. The aim of this work is to show that
steganography done using random pixel selection is less prone to outside attacks.
RANDOMIZED STEGANOGRAPHY IN SKIN TONE IMAGES ijcseit
Steganography is the technique of hiding a confidential message in an ordinary message and the extraction of that secret message at its destination. Different carrier file formats can be used in steganography. Among these carrier file formats, digital images are the most popular. For this work, digital images are used. Here steganography is done on the skin portion of an image. First skin portion of an image is detected. Random pixels are selected from that detected region using a pseudo-random number generator. The bits of the secret message will be embedded on the LSB of these random pixels. An analysis is done to check the efficiency and robustness of the proposed method. The aim of this work is to show that steganography done using random pixel selection is less prone to outside attacks.
Steganography is the technique of hiding a confidential message in an ordinary message and the extraction
of that secret message at its destination. Different carrier file formats can be used in steganography.
Among these carrier file formats, digital images are the most popular. For this work, digital images are
used. Here steganography is done on the skin portion of an image. First skin portion of an image is
detected. Random pixels are selected from that detected region using a pseudo-random number generator.
The bits of the secret message will be embedded on the LSB of these random pixels. An analysis is done to
check the efficiency and robustness of the proposed method. The aim of this work is to show that
steganography done using random pixel selection is less prone to outside attacks.
Canny Edge Detection Algorithm on FPGA IOSR Journals
This document summarizes the implementation of the Canny edge detection algorithm on an FPGA. It begins with an introduction to edge detection and digital image processing. It then describes the high-level implementation of the Canny algorithm using Simulink. The design and system-level block diagram of the implementation on an FPGA is shown, including loading an input image and displaying the output. Simulation and synthesis results are presented, showing the resource utilization on a Spartan 3E FPGA board. The implementation provides real-time edge detection to interface an FPGA with a monitor.
This document summarizes the implementation of the Canny edge detection algorithm on an FPGA. It begins with an introduction to edge detection and digital image processing. It then describes the Canny edge detection algorithm and its benefits. The document outlines the high-level implementation in Simulink and shows the input, grayscaled, and edge detected output images. It presents the system design with the FPGA reading in an image file and performing Canny edge detection. Simulation and synthesis results are shown verifying the design works as intended. The paper concludes the Canny edge detection algorithm was successfully designed, simulated, tested and realized on an FPGA.
Canny Edge Detection Algorithm on FPGA IOSR Journals
This document summarizes the implementation of the Canny edge detection algorithm on an FPGA. It begins with an introduction to edge detection and digital image processing. It then describes the Canny edge detection algorithm and its benefits. The document outlines the high-level implementation in Simulink and shows the input, grayscaled, and edge detected output images. It presents the system design with the FPGA reading in an image file and performing Canny edge detection. Simulation and synthesis results are shown verifying the design works as intended. The paper concludes the Canny edge detection algorithm was successfully designed, simulated, tested and realized on an FPGA.
This document introduces the MATLAB image processing environment. It discusses how images are represented and loaded in MATLAB. Common image file formats and data types are described. Basic image processing functions for point processing operations like scaling, histogram equalization and thresholding pixels are also introduced. Color spaces and conversions between RGB and other perceptual color models are briefly covered.
This document introduces the MATLAB development environment and how it can be used for image processing. MATLAB is a data analysis and visualization tool that uses matrices as its basic data structure. Images are represented as matrices of pixels. The document describes how to load, display, and save images in MATLAB. It also covers image data types and quantization, including how to convert between data types and access portions of image matrices using indexing.
Similar to REAL-TIME SCENE TEXT LOCALIZATION AND RECOGNITION ppt.pptx (20)
Using recycled concrete aggregates (RCA) for pavements is crucial to achieving sustainability. Implementing RCA for new pavement can minimize carbon footprint, conserve natural resources, reduce harmful emissions, and lower life cycle costs. Compared to natural aggregate (NA), RCA pavement has fewer comprehensive studies and sustainability assessments.
A review on techniques and modelling methodologies used for checking electrom...nooriasukmaningtyas
The proper function of the integrated circuit (IC) in an inhibiting electromagnetic environment has always been a serious concern throughout the decades of revolution in the world of electronics, from disjunct devices to today’s integrated circuit technology, where billions of transistors are combined on a single chip. The automotive industry and smart vehicles in particular, are confronting design issues such as being prone to electromagnetic interference (EMI). Electronic control devices calculate incorrect outputs because of EMI and sensors give misleading values which can prove fatal in case of automotives. In this paper, the authors have non exhaustively tried to review research work concerned with the investigation of EMI in ICs and prediction of this EMI using various modelling methodologies and measurement setups.
We have compiled the most important slides from each speaker's presentation. This year’s compilation, available for free, captures the key insights and contributions shared during the DfMAy 2024 conference.
Harnessing WebAssembly for Real-time Stateless Streaming PipelinesChristina Lin
Traditionally, dealing with real-time data pipelines has involved significant overhead, even for straightforward tasks like data transformation or masking. However, in this talk, we’ll venture into the dynamic realm of WebAssembly (WASM) and discover how it can revolutionize the creation of stateless streaming pipelines within a Kafka (Redpanda) broker. These pipelines are adept at managing low-latency, high-data-volume scenarios.
DEEP LEARNING FOR SMART GRID INTRUSION DETECTION: A HYBRID CNN-LSTM-BASED MODELgerogepatton
As digital technology becomes more deeply embedded in power systems, protecting the communication
networks of Smart Grids (SG) has emerged as a critical concern. Distributed Network Protocol 3 (DNP3)
represents a multi-tiered application layer protocol extensively utilized in Supervisory Control and Data
Acquisition (SCADA)-based smart grids to facilitate real-time data gathering and control functionalities.
Robust Intrusion Detection Systems (IDS) are necessary for early threat detection and mitigation because
of the interconnection of these networks, which makes them vulnerable to a variety of cyberattacks. To
solve this issue, this paper develops a hybrid Deep Learning (DL) model specifically designed for intrusion
detection in smart grids. The proposed approach is a combination of the Convolutional Neural Network
(CNN) and the Long-Short-Term Memory algorithms (LSTM). We employed a recent intrusion detection
dataset (DNP3), which focuses on unauthorized commands and Denial of Service (DoS) cyberattacks, to
train and test our model. The results of our experiments show that our CNN-LSTM method is much better
at finding smart grid intrusions than other deep learning algorithms used for classification. In addition,
our proposed approach improves accuracy, precision, recall, and F1 score, achieving a high detection
accuracy rate of 99.50%.
CHINA’S GEO-ECONOMIC OUTREACH IN CENTRAL ASIAN COUNTRIES AND FUTURE PROSPECTjpsjournal1
The rivalry between prominent international actors for dominance over Central Asia's hydrocarbon
reserves and the ancient silk trade route, along with China's diplomatic endeavours in the area, has been
referred to as the "New Great Game." This research centres on the power struggle, considering
geopolitical, geostrategic, and geoeconomic variables. Topics including trade, political hegemony, oil
politics, and conventional and nontraditional security are all explored and explained by the researcher.
Using Mackinder's Heartland, Spykman Rimland, and Hegemonic Stability theories, examines China's role
in Central Asia. This study adheres to the empirical epistemological method and has taken care of
objectivity. This study analyze primary and secondary research documents critically to elaborate role of
china’s geo economic outreach in central Asian countries and its future prospect. China is thriving in trade,
pipeline politics, and winning states, according to this study, thanks to important instruments like the
Shanghai Cooperation Organisation and the Belt and Road Economic Initiative. According to this study,
China is seeing significant success in commerce, pipeline politics, and gaining influence on other
governments. This success may be attributed to the effective utilisation of key tools such as the Shanghai
Cooperation Organisation and the Belt and Road Economic Initiative.
International Conference on NLP, Artificial Intelligence, Machine Learning an...gerogepatton
International Conference on NLP, Artificial Intelligence, Machine Learning and Applications (NLAIM 2024) offers a premier global platform for exchanging insights and findings in the theory, methodology, and applications of NLP, Artificial Intelligence, Machine Learning, and their applications. The conference seeks substantial contributions across all key domains of NLP, Artificial Intelligence, Machine Learning, and their practical applications, aiming to foster both theoretical advancements and real-world implementations. With a focus on facilitating collaboration between researchers and practitioners from academia and industry, the conference serves as a nexus for sharing the latest developments in the field.
A SYSTEMATIC RISK ASSESSMENT APPROACH FOR SECURING THE SMART IRRIGATION SYSTEMSIJNSA Journal
The smart irrigation system represents an innovative approach to optimize water usage in agricultural and landscaping practices. The integration of cutting-edge technologies, including sensors, actuators, and data analysis, empowers this system to provide accurate monitoring and control of irrigation processes by leveraging real-time environmental conditions. The main objective of a smart irrigation system is to optimize water efficiency, minimize expenses, and foster the adoption of sustainable water management methods. This paper conducts a systematic risk assessment by exploring the key components/assets and their functionalities in the smart irrigation system. The crucial role of sensors in gathering data on soil moisture, weather patterns, and plant well-being is emphasized in this system. These sensors enable intelligent decision-making in irrigation scheduling and water distribution, leading to enhanced water efficiency and sustainable water management practices. Actuators enable automated control of irrigation devices, ensuring precise and targeted water delivery to plants. Additionally, the paper addresses the potential threat and vulnerabilities associated with smart irrigation systems. It discusses limitations of the system, such as power constraints and computational capabilities, and calculates the potential security risks. The paper suggests possible risk treatment methods for effective secure system operation. In conclusion, the paper emphasizes the significant benefits of implementing smart irrigation systems, including improved water conservation, increased crop yield, and reduced environmental impact. Additionally, based on the security analysis conducted, the paper recommends the implementation of countermeasures and security approaches to address vulnerabilities and ensure the integrity and reliability of the system. 
By incorporating these measures, smart irrigation technology can revolutionize water management practices in agriculture, promoting sustainability, resource efficiency, and safeguarding against potential security threats.
Embedded machine learning-based road conditions and driving behavior monitoringIJECEIAES
Car accident rates have increased in recent years, resulting in losses in human lives, properties, and other financial costs. An embedded machine learning-based system is developed to address this critical issue. The system can monitor road conditions, detect driving patterns, and identify aggressive driving behaviors. The system is based on neural networks trained on a comprehensive dataset of driving events, driving styles, and road conditions. The system effectively detects potential risks and helps mitigate the frequency and impact of accidents. The primary goal is to ensure the safety of drivers and vehicles. Collecting data involved gathering information on three key road events: normal street and normal drive, speed bumps, circular yellow speed bumps, and three aggressive driving actions: sudden start, sudden stop, and sudden entry. The gathered data is processed and analyzed using a machine learning system designed for limited power and memory devices. The developed system resulted in 91.9% accuracy, 93.6% precision, and 92% recall. The achieved inference time on an Arduino Nano 33 BLE Sense with a 32-bit CPU running at 64 MHz is 34 ms and requires 2.6 kB peak RAM and 139.9 kB program flash memory, making it suitable for resource-constrained embedded systems.
Understanding Inductive Bias in Machine LearningSUTEJAS
This presentation explores the concept of inductive bias in machine learning. It explains how algorithms come with built-in assumptions and preferences that guide the learning process. You'll learn about the different types of inductive bias and how they can impact the performance and generalizability of machine learning models.
The presentation also covers the positive and negative aspects of inductive bias, along with strategies for mitigating potential drawbacks. We'll explore examples of how bias manifests in algorithms like neural networks and decision trees.
By understanding inductive bias, you can gain valuable insights into how machine learning models work and make informed decisions when building and deploying them.
Advanced control scheme of doubly fed induction generator for wind turbine us...IJECEIAES
This paper describes a speed control device for generating electrical energy on an electricity network based on the doubly fed induction generator (DFIG) used for wind power conversion systems. At first, a double-fed induction generator model was constructed. A control law is formulated to govern the flow of energy between the stator of a DFIG and the energy network using three types of controllers: proportional integral (PI), sliding mode controller (SMC) and second order sliding mode controller (SOSMC). Their different results in terms of power reference tracking, reaction to unexpected speed fluctuations, sensitivity to perturbations, and resilience against machine parameter alterations are compared. MATLAB/Simulink was used to conduct the simulations for the preceding study. Multiple simulations have shown very satisfying results, and the investigations demonstrate the efficacy and power-enhancing capabilities of the suggested control system.
REAL-TIME SCENE TEXT LOCALIZATION AND RECOGNITION ppt.pptx
1. REAL-TIME SCENE TEXT
LOCALIZATION AND RECOGNITION
project done by:
208H1A0454 PICHIKA MANOHAR
208H1A0416 CHEEDEPUDI G S PRAVEEN BABU
208H1A0422 DEVARAPALLI CHANDRASEKHAR
218H5A0411 KATARI SRINIVASRAO
3. INTRODUCTION
Scene text recognition (STR) has recently become an increasingly hot research field in computer vision, as manifested by the prosperity of the biennial "robust reading" competitions at ICDAR [1], along with the workshop on Camera-Based Document Analysis and Recognition (CBDAR).
With the extensive demand for information identification, STR technology has large-scale applications in automated logistics distribution, geographical positioning, license plate recognition, and driverless vehicles.
Text detection [2] is a common task in image analysis, while text recognition [3] is a more advanced task: it must not only localize the text spatially, which belongs to object detection, but also recognize the text, i.e., perform text spotting [4].
Compared to detection and recognition of traditional well-formatted document text, natural scene text detection and recognition is a challenging visual task due to multilingual text, varying text sizes, font tilt, blurring, background interference, handwriting, various viewing angles, and so on, as shown.
4. Previous Work
Numerous methods that focus solely on text localization in real-world images have been published [6, 2, 7, 17].
The method of Epshtein et al. [5] converts an input image to greyscale and uses the Canny detector [1] to find edges.
Pairs of parallel edges are then used to calculate a stroke width for each pixel, and pixels with similar stroke width are grouped together into characters. The method is sensitive to noise and blurry images because it depends on successful edge detection, and it provides only a single segmentation for each character, which is not necessarily the best one for an OCR module. A similar edge-based approach with a different connected component algorithm is presented in [24].
A good overview of the methods and their performance can also be found in the ICDAR Robust Reading competition results [10, 9, 20]. Only a few methods that perform both text localization and recognition have been published, such as the method of Wang.
Figure 2. Text localization and recognition overview.
(a) Source 2MPx image.
(b) Intensity channel extracted.
(c) ERs selected in ON by the first stage of the sequential classifier.
(d) ERs selected by the second stage of the classifier.
(e) Text lines found by region grouping.
(f) Only ERs in text lines selected and text recognized by an OCR module.
(g) Number of ERs at the end of each stage and its duration.
5. IMAGE PROCESSING
The term digital image processing refers to the processing of a two-dimensional picture by a digital computer. In a broader context, it implies digital processing of any two-dimensional data.
A digital image is an array of real or complex numbers represented by a finite number of bits. An image given in the form of a transparency, slide, photograph or an X-ray is first digitized and stored as a matrix of binary digits in computer memory.
This digitized image can then be processed and/or displayed on a high-resolution television monitor. For display, the image is stored in a rapid-access buffer memory, which refreshes the monitor at a rate of 25 frames per second to produce a visually continuous display.
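The idea of an image stored as an M-by-N matrix of finitely quantized values can be sketched in a few lines. The snippet below uses Python/NumPy, whose array semantics mirror MATLAB's matrices; the 4×4 pixel values are fabricated purely for illustration.

```python
import numpy as np

# A digitized image is simply a 2-D array of numbers held in memory.
# Each 8-bit pixel takes a value between 0 (dark) and 255 (bright).
image = np.array([
    [  0,  64, 128, 255],
    [ 32,  96, 160, 224],
    [ 16,  80, 144, 208],
    [  8,  72, 136, 200],
], dtype=np.uint8)

print(image.shape)   # (4, 4)  -> an M x N matrix
print(image.dtype)   # uint8   -> each pixel stored in a finite number of bits (8)
```

Once the picture exists as such a matrix, "processing" the image is just arithmetic on the array.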
6. RGB IMAGE
An RGB color image is an M×N×3 array of color pixels, where each color pixel is a triplet corresponding to the red, green and blue components of the image at a specific spatial location.
An RGB image may be viewed as a "stack" of three grayscale images that, when fed into the red, green and blue inputs of a color monitor, produce a color image on the screen.
By convention, the three images forming an RGB color image are referred to as the red, green and blue component images.
The data class of the component images determines their range of values. If an RGB image is of class double, the range of values is [0, 1].
A normal grayscale image has an 8-bit color depth, i.e., 256 gray levels.
A true color image has a 24-bit color depth: 8 + 8 + 8 bits per pixel, giving 256 × 256 × 256 ≈ 16 million colors.
7. BINARY IMAGE
The most elementary type of image representation is the binary image. It typically uses only two levels.
The two levels are referred to as black and white, denoted '0' and '1' respectively. This kind of representation is a 1-bit-per-pixel image, since only a single binary digit is needed to represent each pixel.
These images are often used to depict low-level information about a picture, such as its outline or shape, especially in applications like optical character recognition (OCR), where only the outline of a character is required to recognize the letter it represents.
Binary images are generated from grayscale images through a technique called thresholding.
Two-level thresholding simply acts as a decision rule: pixel values above the threshold switch to '1' and values below it switch to '0'.
8. GRAYSCALE IMAGE
Grayscale (GS) images are also known as neutral or single-color images. Such images carry brightness information only; they contain no color data.
The brightness, however, is represented at different levels: a typical 8-bit image holds a range of 256 brightness levels, known as gray levels.
Here 0 refers to black and 255 refers to white. The 8-bit depiction is natural given that a computer actually handles data in 8-bit units. Fig. 5.7 (a) and (b) below are two examples of such grayscale images.
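Reducing a color image to the brightness-only representation described here is commonly done with a luminance-weighted sum of the R, G and B values. The Python/NumPy sketch below uses the widely used 0.299/0.587/0.114 weighting, which is one common convention rather than the only choice; the two pixels are fabricated for illustration.

```python
import numpy as np

# 8-bit grayscale: 256 brightness levels, 0 = black, 255 = white.
rgb = np.array([[[255, 0, 0],        # a pure red pixel
                 [255, 255, 255]]],  # a pure white pixel
               dtype=np.uint8)

# Luminance-weighted sum of the three color components.
weights = np.array([0.299, 0.587, 0.114])
gray = np.rint(rgb @ weights).astype(np.uint8)

print(gray)   # red -> 76 (dark gray), white -> 255
```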
9. COLOUR IMAGE
A color image (CI) is modeled as three bands of monochromatic light information, where each band
corresponds to a different color.
The following figure illustrates, as an arrow, the vector formed by a single pixel's
red, green, and blue values: the color vector (R, G, B).
10. A color pixel vector consists of the red, green, and blue pixel values (R, G, B) at one given
row/column pixel coordinate (r, c).
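The pixel-vector idea can be sketched in a few lines. The image below is a toy 2 × 2 example, and the luma weights used to collapse a color vector to a single brightness value are the common Rec. 601 coefficients, an assumption not taken from the text:

```python
# A color image as an M-by-N grid of (R, G, B) vectors. We read the pixel
# vector at row r, column c, then reduce it to one gray level with the
# Rec. 601 luma weights (0.299, 0.587, 0.114).

image = [
    [(255, 0, 0), (0, 255, 0)],
    [(0, 0, 255), (255, 255, 255)],
]

r, c = 0, 1
pixel = image[r][c]  # color vector (R, G, B) at coordinate (r, c)
print(pixel)         # (0, 255, 0)

gray = round(0.299 * pixel[0] + 0.587 * pixel[1] + 0.114 * pixel[2])
print(gray)          # 150
```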
A multispectral image is one that captures image data at specific frequencies across the
electromagnetic spectrum. Multispectral images often contain information outside the normal human
perceptual range; this may include infrared, ultraviolet, X-ray, acoustic, or radar data.
Sources of these kinds of images include satellite systems, underwater sonar systems,
and medical diagnostic imaging systems.
11. MATLAB SOFTWARE
MATLAB® is a high-performance language for technical computing. It integrates computation,
visualization, and programming in an easy-to-use environment where problems and solutions are
expressed in familiar mathematical notation.
Typical uses include:
Math and computation
Algorithm development
Data acquisition
Modelling, simulation, and prototyping
Data analysis, exploration, and visualization
Scientific and engineering graphics
Application development, including graphical user interface building.
12. MATLAB is an interactive system whose basic data element is an array that does not require
dimensioning. This allows you to solve many technical computing problems, especially those with
matrix and vector formulations, in a fraction of the time it would take to write a program in a scalar
non-interactive language such as C or Fortran.
The name MATLAB stands for matrix laboratory. MATLAB was originally written to provide easy
access to matrix software developed by the LINPACK and EISPACK projects. Today, MATLAB
engines incorporate the LAPACK and BLAS libraries, embedding the state of the art in software for
matrix computation.
MATLAB has evolved over a period of years with input from many users. In university environments,
it is the standard instructional tool for introductory and advanced courses in mathematics, engineering,
and science. In industry, MATLAB is the tool of choice for high-productivity research, development, and
analysis.
13. MATLAB DESKTOP
The MATLAB Desktop is the main MATLAB application window. The desktop contains several sub-windows: the
command window, the workspace browser, the current directory window, the command history
window, and one or more figure windows, which are shown only when the user displays a graphic.
14. EXISTING METHOD
The method is able to cope with noisy data, but its generality is limited as a lexicon of words (which
contains at most 500 words in their experiments) has to be supplied for each individual image.
Methods presented in [14, 15] detect characters as Maximally Stable Extremal Regions (MSERs) [11]
and perform text recognition using the segmentation obtained by the MSER detector. An MSER is a
particular case of an
Extremal Region whose size remains virtually unchanged over a range of thresholds. The methods
perform well but have problems with blurry images or characters with low contrast. According to the
description provided by the ICDAR 2011 Robust Reading competition organizers [20], the winning
method is based on MSER detection.
15. PROPOSED METHOD
The proposed methodology is described in four subsections. In Section II-A, we compute the product of
Laplacian and Sobel operations on the input image to enhance the text details; this is called the Laplacian–
Sobel product (LSP) process. A Bayesian classifier is used to classify true text pixels based on three
probable matrices, as described in Section II-B. The three probable matrices are obtained on the basis of the LSP:
high-contrast pixels in the LSP classified as text pixels (HLSP), K-means with k = 2 on the maximum
gradient difference of HLSP (K-MGD-HLSP), and K-means on the LSP between the maximum and minimum values of
a sliding window over HLSP.
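The enhancement step described above can be sketched as follows. This is a minimal illustration of a Laplacian–Sobel product, not the paper's exact implementation: the 3 × 3 kernels and the zero-padded border handling are our own assumptions.

```python
# Laplacian-Sobel product (LSP) sketch: multiply the Sobel gradient magnitude
# by the absolute Laplacian response to emphasise high-contrast (text) pixels.

SOBEL_X = [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]
SOBEL_Y = [[-1, -2, -1], [0, 0, 0], [1, 2, 1]]
LAPLACIAN = [[0, 1, 0], [1, -4, 1], [0, 1, 0]]

def convolve(img, k):
    """3x3 correlation with zero padding at the borders."""
    h, w = len(img), len(img[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            s = 0.0
            for dy in (-1, 0, 1):
                for dx in (-1, 0, 1):
                    yy, xx = y + dy, x + dx
                    if 0 <= yy < h and 0 <= xx < w:
                        s += img[yy][xx] * k[dy + 1][dx + 1]
            out[y][x] = s
    return out

def lsp(img):
    """Pointwise product of Sobel gradient magnitude and |Laplacian|."""
    gx, gy = convolve(img, SOBEL_X), convolve(img, SOBEL_Y)
    lap = convolve(img, LAPLACIAN)
    h, w = len(img), len(img[0])
    return [[(gx[y][x] ** 2 + gy[y][x] ** 2) ** 0.5 * abs(lap[y][x])
             for x in range(w)] for y in range(h)]
```

On a gray-scale image containing a bright vertical stripe, the LSP response is large next to the stripe's edges and zero in flat regions, which is the behaviour the classifier exploits.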
16. Posterior probability estimation and text candidates. (a) TPM. (b) NTPM. (c) Bayesian result. (d) Text
candidates.
Boundary growing method. (a) BGM for components. (b) BGM for first line. (c) BGM for second line.
(d) BGM for third line. (e) BGM for third line and false positives. (f) BGM for false positives. (g) False
positives shown. (h) False positive elimination
17. EXPERIMENT RESULT AND ANALYSIS
An end-to-end real-time text localization and recognition method is presented in the paper. In the first stage of the classification, the
probability of each ER being a character is estimated using novel features calculated with O(1) complexity, and only ERs with locally
maximal probability are selected for the second stage, where the classification is improved using more computationally expensive
features. It is demonstrated that, with the novel gradient magnitude projection included, ERs cover 94.8% of characters. The average run
time of the method on an 800 × 600 image is 0.3 s on a standard PC. However, direct comparison is not possible, as the method of Wang et
al. uses a different task formulation and a different evaluation protocol.
19. CONCLUSION
In this paper, we proposed a new video scene text detection method that made use of a new
enhancement method using Laplacian and Sobel operations of input images to enhance low contrast
text pixels. A Bayesian classifier was used to classify true text pixels from the enhanced text matrix
without a priori knowledge of the input image.
Three probable text matrices and three probable non-text matrices were derived based on clustering
and the result of the enhancement method. To traverse multi-oriented text, we proposed a boundary
growing method based on the nearest-neighbor concept.
Experimentation and a comparative study showed that the proposed method outperformed the existing
methods in terms of these measures, especially on complex non-horizontal data. However, there remain a few
problems in handling false positives.
We plan to extend this method to the detection of curve-shaped text lines with good recall, precision,
and F-measure, and low computational time. Notwithstanding the current limitations, which we will deal
with in our future research, the contribution of this paper lies in our continued effort in detecting
multi-oriented text lines in videos, which hitherto has not been well explored by others.
20. REFERENCES
[1] J. Canny. A computational approach to edge detection. IEEE Transactions on Pattern Analysis and Machine
Intelligence, 8:679–698, 1986.
[2] X. Chen and A. L. Yuille. Detecting and reading text in natural scenes. CVPR, 2:366–373, 2004.
[3] H. Cheng, X. Jiang, Y. Sun, and J. Wang. Color image segmentation: advances and prospects. Pattern Recognition,
34(12):2259–2281, 2001.
[4] N. Cristianini and J. Shawe-Taylor. An Introduction to Support Vector Machines. Cambridge University Press,
March 2000.
[5] B. Epshtein, E. Ofek, and Y. Wexler. Detecting text in natural scenes with stroke width transform. In CVPR 2010,
pages 2963–2970.
[6] J.-J. Lee, P.-H. Lee, S.-W. Lee, A. Yuille, and C. Koch. AdaBoost for text detection in natural scene. In ICDAR
2011, pages 429–434, 2011.
[7] R. Lienhart and A. Wernicke. Localizing and segmenting text in images and videos. Circuits and Systems for
Video Technology, 12(4):256–268, 2002.
[8] H. Liu and X. Ding. Handwritten character recognition using gradient feature and quadratic classifier with multiple
discrimination schemes. In ICDAR 2005, pages 19–23, Vol. 1.