Optical character recognition (OCR) is used to recognize the textual content of a scanned image. Such systems produce erroneous results, which necessitates post-processing to correct the recognized sentences. In this paper, we propose a new method for the syntactic and semantic correction of sentences, based on the frequency of pairs of correct words in the sentence together with a recursive technique. The approach starts by computing the frequency of every pair of successive words in the corpora; the words with the greatest frequency form a correction center. We obtained 98% with our approach using the noisy-channel model, and 96% using the same corpus under the same conditions.
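The pair-frequency selection step can be sketched in a few lines of Python (a minimal illustration under assumed inputs, not the authors' implementation; `bigram_counts` and `best_successor` are hypothetical helper names):

```python
from collections import Counter

def bigram_counts(corpus_sentences):
    """Count each pair of successive words across a tokenised corpus."""
    counts = Counter()
    for sentence in corpus_sentences:
        counts.update(zip(sentence, sentence[1:]))
    return counts

def best_successor(counts, word, candidates):
    """Among OCR hypotheses for the next word, keep the candidate whose
    bigram (word, candidate) is most frequent in the corpus."""
    return max(candidates, key=lambda c: counts[(word, c)])

corpus = [["the", "cat", "sat"], ["the", "cat", "ran"], ["the", "car", "sat"]]
counts = bigram_counts(corpus)
# ("the", "cat") occurs twice, ("the", "car") once, so "cat" wins.
print(best_successor(counts, "the", ["cat", "car"]))  # cat
```

The most frequent pair then acts as the "correction center" from which the recursive correction would proceed.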
Digital camera and mobile document image acquisition are new trends arising in the world of optical character recognition and text detection. In some cases, such acquisition introduces many distortions and produces poorly scanned text, text-photo images, and natural images, leading to unreliable OCR digitization. In this paper, we present a novel nonparametric and unsupervised method to compensate for undesirable document image distortions, aiming to optimally improve OCR accuracy. Our approach relies on an efficient stack of document image enhancement techniques to recover deformations across the entire document image. First, we propose a local brightness and contrast adjustment method to handle lighting variations and the irregular distribution of image illumination. Second, we use an optimized greyscale conversion algorithm to transform the document image to greyscale. Third, we sharpen the useful information in the resulting greyscale image using an unsharp masking method. Finally, an optimal global binarization approach prepares the final document image for OCR recognition. The proposed approach can significantly improve the text detection rate and optical character recognition accuracy. To demonstrate its efficiency, an exhaustive experiment on a standard dataset is presented.
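The final binarization stage can be illustrated with a plain-Python global Otsu threshold (a sketch of the standard algorithm on a toy pixel list; the paper's "optimal global binarization" is not necessarily Otsu):

```python
def otsu_threshold(pixels):
    """Global Otsu threshold: pick t maximising between-class variance."""
    hist = [0] * 256
    for p in pixels:
        hist[p] += 1
    total = len(pixels)
    total_sum = sum(i * h for i, h in enumerate(hist))
    best_t, best_var = 0, -1.0
    w0 = sum0 = 0
    for t in range(256):
        w0 += hist[t]          # weight of the "background" class (<= t)
        if w0 == 0:
            continue
        w1 = total - w0        # weight of the "foreground" class (> t)
        if w1 == 0:
            break
        sum0 += t * hist[t]
        m0 = sum0 / w0
        m1 = (total_sum - sum0) / w1
        var = w0 * w1 * (m0 - m1) ** 2
        if var > best_var:
            best_var, best_t = var, t
    return best_t

# Bimodal toy "image": dark background plus bright text pixels.
pixels = [30] * 80 + [200] * 20
t = otsu_threshold(pixels)
binary = [255 if p > t else 0 for p in pixels]
```

On a bimodal intensity distribution like this, the threshold settles between the two modes and cleanly separates text from background.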
A Survey On Thresholding Operators of Text Extraction In Videos (CSCJournals)
Video indexing is an important problem that has attracted the visual-information communities in image processing. The detection and extraction of scene and caption text from unconstrained, general-purpose video is an important research problem in the context of content-based retrieval and summarization. This paper presents a technique for detecting text in video frames. Finding textual content in images is a challenging and promising research area in information technology; consequently, text detection and recognition in multimedia has become one of the most important fields in computer vision, owing to its valuable uses in a variety of recent technical applications. The work in this paper uses morphological operations to extract the text appearing in video frames, together with preprocessing to separate text from background where the two are highly similar. Experimental results show that the resulting image contains only text. The evaluation criteria are applied to the result image and to those obtained by different operators.
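A minimal example of the kind of morphological operation involved, here a 3x3 binary dilation in pure Python (an illustrative sketch, not the scheme from the paper):

```python
def dilate(grid, iterations=1):
    """Binary dilation with a 3x3 square structuring element: a pixel becomes
    1 if any neighbour (or itself) is 1. Repeated dilation merges nearby
    character strokes into solid text regions."""
    h, w = len(grid), len(grid[0])
    for _ in range(iterations):
        out = [[0] * w for _ in range(h)]
        for y in range(h):
            for x in range(w):
                for dy in (-1, 0, 1):
                    for dx in (-1, 0, 1):
                        ny, nx = y + dy, x + dx
                        if 0 <= ny < h and 0 <= nx < w and grid[ny][nx]:
                            out[y][x] = 1
        grid = out
    return grid

# Two separated stroke pixels merge into one region after dilation.
frame = [[0, 0, 0, 0, 0],
         [0, 1, 0, 1, 0],
         [0, 0, 0, 0, 0]]
merged = dilate(frame)
```

Dilation (often paired with erosion as a closing) is what lets nearby glyph fragments coalesce into detectable text blocks.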
BLOB DETECTION TECHNIQUE USING IMAGE PROCESSING FOR IDENTIFICATION OF MACHINE... (ijiert bestjournal)
Optical character recognition systems have been effectively developed for the recognition of printed characters. Optical character recognition is a powerful computer vision technique, with applications ranging from digitally preserving real-time scripts to deriving context-based intelligence from text using natural language processing. One such application is the recognition of machine-printed characters. This paper illustrates a technique to identify machine-printed characters using the blob detection method and image processing. In many cases of machine-printed characters there is similarity between the character colour and the background colour, reflected and scattered light mix, and colour is not consistent across the character or background areas. The paper explains how the blob detection technique is used for the recognition of these machine-printed characters.
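Blob detection on a binarized image reduces to connected-component labelling; a pure-Python 4-connected flood-fill version (illustrative only, not the paper's method):

```python
def label_blobs(grid):
    """4-connected component labelling by iterative flood fill; each labelled
    blob is a candidate printed character."""
    h, w = len(grid), len(grid[0])
    labels = [[0] * w for _ in range(h)]
    current = 0
    for y in range(h):
        for x in range(w):
            if grid[y][x] and not labels[y][x]:
                current += 1
                stack = [(y, x)]
                while stack:
                    cy, cx = stack.pop()
                    if 0 <= cy < h and 0 <= cx < w and grid[cy][cx] and not labels[cy][cx]:
                        labels[cy][cx] = current
                        stack += [(cy + 1, cx), (cy - 1, cx),
                                  (cy, cx + 1), (cy, cx - 1)]
    return labels, current

img = [[1, 1, 0, 1],
       [1, 0, 0, 1],
       [0, 0, 0, 0]]
labels, n = label_blobs(img)  # two separate blobs
```

Each blob can then be cropped and passed to a character classifier.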
Information Extraction from Product Labels: A Machine Vision Approach (gerogepatton)
This research tackles the challenge of manual data extraction from product labels by employing a blend of computer vision and Natural Language Processing (NLP). We introduce an enhanced model that combines Convolutional Neural Networks (CNN) and Recurrent Neural Networks (RNN) in a Convolutional Recurrent Neural Network (CRNN) for reliable text recognition. Our model is further refined by incorporating the Tesseract OCR engine, enhancing its applicability in Optical Character Recognition (OCR) tasks. The methodology is augmented by NLP techniques and extended through the Open Food Facts API (Application Programming Interface) for database population and text-only label prediction. The CRNN model is trained on encoded labels and evaluated for accuracy on a dedicated test set. Importantly, our approach enables visually impaired individuals to access essential information on product labels, such as directions and ingredients. Overall, the study highlights the efficacy of deep learning and OCR in automating label extraction and recognition.
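CRNN text recognisers typically emit one character distribution per image column and are decoded with CTC; a greedy CTC decoder can be sketched as follows (the blank index and toy alphabet are assumptions for illustration, not details from the paper):

```python
def ctc_greedy_decode(frame_probs, alphabet, blank=0):
    """Collapse the per-frame argmax sequence: merge consecutive repeats,
    then drop blanks (index `blank`)."""
    best = [max(range(len(p)), key=p.__getitem__) for p in frame_probs]
    out, prev = [], None
    for idx in best:
        if idx != prev and idx != blank:
            out.append(alphabet[idx])
        prev = idx
    return "".join(out)

alphabet = ["-", "a", "b"]  # index 0 is the CTC blank
frames = [
    [0.1, 0.8, 0.1],    # a
    [0.1, 0.8, 0.1],    # a (repeat, merged)
    [0.9, 0.05, 0.05],  # blank
    [0.1, 0.1, 0.8],    # b
]
print(ctc_greedy_decode(frames, alphabet))  # ab
```

Beam-search CTC decoders improve on this greedy collapse but follow the same repeat-merge, blank-drop logic.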
IOSR Journal of Computer Engineering (IOSR-JCE) is a double-blind peer-reviewed international journal that provides rapid publication (within a month) of articles in all areas of computer engineering and its applications. The journal welcomes high-quality papers on theoretical developments and practical applications in computer technology. Original research papers, state-of-the-art reviews, and high-quality technical notes are invited for publication.
Enhancement and Segmentation of Historical Records (csandit)
Document Analysis and Recognition (DAR) aims to automatically extract the information in a document and also to aid human comprehension. The automatic processing of degraded historical documents is an application of the document image analysis field that is confronted with many difficulties due to storage conditions and the complexity of the script. The main interest in enhancing historical documents is to remove undesirable artifacts that appear in the background and to highlight the foreground, so as to enable automatic recognition of documents with high accuracy. This paper addresses pre-processing and segmentation of ancient scripts as an initial step toward automating the task of an epigraphist in reading and deciphering inscriptions. Pre-processing involves enhancement of degraded ancient document images, achieved through four different spatial filtering methods for smoothing or sharpening, namely the median, Gaussian blur, mean, and bilateral filters, with different mask sizes. This is followed by binarization of the enhanced image using the Otsu thresholding algorithm to highlight the foreground information. In the second phase, segmentation is carried out using the Drop Fall and Water Reservoir approaches to obtain sampled characters, which can be used in later stages of OCR. The system showed good results when tested on nearly 150 samples of variously degraded epigraphic images, giving the best enhanced output with a 4x4 mask for the median filter, a 2x2 mask for Gaussian blur, and a 4x4 mask for the mean and bilateral filters. The system can effectively sample characters from enhanced images, giving segmentation rates of 85%-90% for the Drop Fall and Water Reservoir techniques respectively.
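The smoothing step can be illustrated with a plain-Python median filter over a k x k mask (odd mask sizes for simplicity; a sketch, not the paper's implementation):

```python
from statistics import median

def median_filter(img, k=3):
    """Median filter with a k x k mask; border pixels use the clipped
    neighbourhood. Removes salt-and-pepper noise typical of degraded scans."""
    h, w, r = len(img), len(img[0]), k // 2
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            window = [img[ny][nx]
                      for ny in range(max(0, y - r), min(h, y + r + 1))
                      for nx in range(max(0, x - r), min(w, x + r + 1))]
            out[y][x] = median(window)
    return out

noisy = [[10, 10, 10],
         [10, 255, 10],   # one salt-noise pixel
         [10, 10, 10]]
clean = median_filter(noisy, k=3)  # the outlier is replaced by the median
```

The same sliding-window skeleton accommodates mean, Gaussian, or bilateral weighting by changing how the window values are combined.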
Optical Character Recognition from Text Image (Editor IJCATR)
Optical Character Recognition (OCR) is a system that provides full alphanumeric recognition of printed or handwritten characters by simply scanning the text image. An OCR system interprets the printed or handwritten character image and converts it into a corresponding editable text document. The text image is divided into regions by isolating each line, then individual characters with spaces. After character extraction, texture and topological features such as corner points, features of different regions, and the ratio of character area to convex area are calculated for all characters of the text image. Features of each uppercase and lowercase letter, digit, and symbol are stored beforehand as templates. Based on the texture and topological features, the system recognizes the exact character using feature matching between the extracted character and the templates of all characters as a measure of similarity.
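The feature-matching idea can be sketched with a toy feature vector and a nearest-template search (the two features and the template values below are invented for illustration, not the paper's feature set):

```python
def features(glyph):
    """Toy feature vector for a binary glyph: overall ink ratio and ink
    density inside the bounding box, standing in for the richer texture and
    topological features described above."""
    h, w = len(glyph), len(glyph[0])
    ink = sum(map(sum, glyph))
    rows = [y for y in range(h) if any(glyph[y])]
    cols = [x for x in range(w) if any(g[x] for g in glyph)]
    box = (rows[-1] - rows[0] + 1) * (cols[-1] - cols[0] + 1)
    return (ink / (h * w), ink / box)

def match_template(glyph, templates):
    """Nearest template by squared distance in feature space."""
    f = features(glyph)
    return min(templates,
               key=lambda ch: sum((a - b) ** 2 for a, b in zip(f, templates[ch])))

templates = {"I": (0.2, 1.0), "O": (0.5, 0.6)}  # hypothetical stored features
bar = [[0, 1, 0]] * 5  # a thin vertical bar, "I"-like
print(match_template(bar, templates))  # I
```

Real systems use many more features, but the classify-by-nearest-template structure is the same.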
Optical character recognition, usually abbreviated to OCR, is the mechanical or electronic conversion of scanned images of handwritten, typewritten, or printed text into machine-encoded text. It is a system that provides full alphanumeric recognition of printed or handwritten characters at electronic speed by simply scanning the form. It is widely used as a form of data entry from an original paper data source, whether documents, sales receipts, mail, or any number of printed records. It is a common method of digitizing printed texts so that they can be electronically searched, stored more compactly, displayed on-line, and used in machine processes such as machine translation, text-to-speech, and text mining. OCR is a field of research in pattern recognition, artificial intelligence, and computer vision. More recently, the term Intelligent Character Recognition (ICR) has been used to describe the process of interpreting image data, in particular alphanumeric text.
Optical character recognition (OCR) is the process of classifying the optical patterns contained in a digital image. OCR involves several steps, including pre-processing, segmentation, feature extraction, and classification. Pre-processing performs basic operations on the input image, such as noise reduction, which removes noisy signals from the image. The segmentation stage splits the image into lines and then segments each character from a segmented line. Feature extraction calculates the characteristics of each character. A Radial Basis Function Neural Network (RBFNN), which holds the template database and performs the comparison, is used for classification.
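A minimal RBF classifier, with one Gaussian basis per stored prototype and winner-takes-all output, sketches the classification stage (prototypes, labels, and sigma are illustrative assumptions, not the trained network from the paper):

```python
import math

def rbf_activations(x, centers, sigma=1.0):
    """Gaussian radial basis activations of input vector x for each center."""
    return [math.exp(-sum((a - b) ** 2 for a, b in zip(x, c)) / (2 * sigma ** 2))
            for c in centers]

def rbf_classify(x, centers, labels, sigma=1.0):
    """Winner-takes-all over the basis activations."""
    acts = rbf_activations(x, centers, sigma)
    return labels[max(range(len(acts)), key=acts.__getitem__)]

centers = [(0.0, 0.0), (1.0, 1.0)]  # feature prototypes from the database
labels = ["0", "1"]
print(rbf_classify((0.1, 0.2), centers, labels))  # 0
```

A full RBFNN adds a trained linear output layer over the activations, but the Gaussian-distance core is the same.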
Optical character recognition (OCR) refers to the process of converting text document images into editable and searchable text. The OCR process poses several challenges, in particular for the Arabic language, where it produces a high percentage of errors. In this paper, a method to improve the output of Arabic optical character recognition (AOCR) systems is suggested, based on a statistical language model built from available large corpora. The method includes detecting and correcting non-word and real-word errors according to the context of the word in the sentence. The results show an improvement of up to 98% as the new accuracy for AOCR output.
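The non-word correction step of such a language-model approach can be sketched with edit-distance candidates ranked by corpus frequency (a crude stand-in for the full noisy-channel model; the toy frequency table is invented):

```python
def edits1(word, alphabet="abcdefghijklmnopqrstuvwxyz"):
    """All strings one edit away: deletions, substitutions, insertions."""
    splits = [(word[:i], word[i:]) for i in range(len(word) + 1)]
    deletes = {l + r[1:] for l, r in splits if r}
    subs = {l + c + r[1:] for l, r in splits if r for c in alphabet}
    inserts = {l + c + r for l, r in splits for c in alphabet}
    return deletes | subs | inserts

def correct(word, freqs):
    """Keep in-vocabulary words; otherwise pick the in-vocabulary candidate
    with the highest corpus frequency (the language-model prior)."""
    if word in freqs:
        return word
    candidates = edits1(word) & freqs.keys()
    return max(candidates, key=freqs.get, default=word)

freqs = {"book": 120, "back": 300, "rock": 40}
print(correct("bock", freqs))  # back
```

Real-word errors additionally need sentence context, e.g. the bigram frequencies from the correction method described earlier on this page.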
The most critical parameters that characterize a Wi-Fi network are throughput, delay, latency, and packet loss, since they provide significant benefits, especially to the end-user. This research investigates Wi-Fi performance in an indoor environment under line-of-sight (LOS) and non-line-of-sight (NLOS) conditions. The effects of surrounding obstacles and distance are also reported. The parameters measured are packet loss, packets sent, packets received, throughput, and latency. Site measurements were performed to obtain real-time, optimal results, and the measured parameters were validated using the EMCO Ping Monitor 8 software. The comparison between the measurement and the simulation is presented in the paper, with measurements taken at distances of up to 30 meters. The results indicate that throughput decreases with increasing distance, with the lowest throughput at 24.64 Mbps and the highest at 70.83 Mbps. The maximum measured latency is 79 ms, while the lowest is 56.09 ms. Finally, this research verifies that obstacles and distance are among the contributing factors affecting the throughput and latency performance of a Wi-Fi network.
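The measured quantities relate by simple formulas; a small helper shows how loss, mean latency, and throughput are derived from one measurement run (the numbers are illustrative, not from the study):

```python
def link_stats(sent, received, rtts_ms, bytes_received, duration_s):
    """Summarise one measurement run: packet loss (%), mean latency (ms),
    and throughput (Mbit/s)."""
    loss_pct = 100.0 * (sent - received) / sent
    mean_latency = sum(rtts_ms) / len(rtts_ms)
    throughput_mbps = bytes_received * 8 / duration_s / 1e6
    return loss_pct, mean_latency, throughput_mbps

loss, lat, thr = link_stats(sent=100, received=98,
                            rtts_ms=[56.0, 60.0, 64.0],
                            bytes_received=25_000_000, duration_s=2.0)
# 2 % loss, 60 ms mean latency, 100 Mbit/s throughput
```

Tools such as ping monitors report exactly these aggregates per run, which is what makes the distance and LOS/NLOS comparisons possible.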
Subarrays of phased-array antennas for multiple-input multiple-output radar a... (IJICTJOURNAL)
The subarray MIMO radar (SMIMO) is a multiple-input multiple-output (MIMO) radar whose elements take the form of sub-arrays acting as phased arrays (PAR). It thus combines the key advantage of PAR radar, high directional gain to increase target range, with the key advantage of MIMO radar, high diversity gain to increase the maximum number of detected targets. Different schemes for the number of antenna elements in the transceiver zones, such as uniform and/or variable, overlapped and non-overlapped, significantly determine radar performance in terms of virtual arrays (VARs), maximum number of detected targets, target angle accuracy, detection resolution, SNR detection, and detection probability. Performance is also compared with the PAR, MIMO, and phased-MIMO (PMIMO) radars. The SMIMO radar offers great versatility for radar applications, adapting to different configurations of the multiple targets to be detected and their environment. For example, for a transceiver with M = N = 8 antenna elements, the range of the number of detected targets for the SMIMO radar is flexible compared with the other radars. Moreover, the proposed radar's signal-to-noise ratio (SNR) detection performance and detection probability (K = 5, L = 3) are 1,999 and above 90% respectively, better than the other radars.
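The diversity advantage comes from the virtual array: M orthogonal transmitters and N receivers behave like up to M x N virtual elements. A one-line sketch for the M = N = 8 example (an upper bound under the assumption of no overlapping virtual positions):

```python
def virtual_array_size(m_tx, n_rx):
    """Upper bound on the number of virtual elements formed by M orthogonal
    transmit antennas and N receive antennas in a MIMO radar."""
    return m_tx * n_rx

# For the M = N = 8 example above, the virtual aperture has up to 64
# elements, far more than the 8 physical elements of either side alone.
print(virtual_array_size(8, 8))  # 64
```

The larger virtual aperture is what raises the maximum number of resolvable targets relative to a conventional phased array.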
Similar to Correcting optical character recognition result via a novel approach
Statistical analysis of an orographic rainfall for eight north-east region of... (IJICTJOURNAL)
Autoregressive integrated moving average (ARIMA) models are used to predict the rain rate for orographic rainfall over a long period of time, from 1980 to 2018. As orographic rainfall may cause landslides and other natural disasters, this study is important for rainfall-prediction analysis. In this research, statistical calculations have been performed on rainfall data for twelve regions of India (Cherrapunji, Darjeling, Dawki, Ghum, Itanagar, Kamchenjunga, Mizoram, Nagaland, Pakyong, Saser Kangri, Slot Kangri, and Tripura) from eight states, i.e., Sikkim, Meghalaya, West Bengal, Ladakh (a Union Territory of India), Arunachal Pradesh, Mizoram, Tripura, and Nagaland, with varying altitude. The model's output is assessed using several error measures, and its performance is represented by the fit value, which is reliable for the northeast region of India with increasing altitude. These parameters show the statistical reliability of the rainfall prediction; the lowest root mean square error (RMSE) indicates the best prediction for orographic rainfall.
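The autoregressive core of an ARIMA model can be illustrated with a least-squares AR(1) fit in pure Python (a deliberately simplified stand-in for the full ARIMA methodology; the toy series is invented):

```python
def fit_ar1(series):
    """Least-squares fit of x[t] = a + b * x[t-1], the simplest autoregressive
    member of the ARIMA family."""
    xs, ys = series[:-1], series[1:]
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    b = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
         / sum((x - mx) ** 2 for x in xs))
    a = my - b * mx
    return a, b

def forecast(series, a, b, steps=3):
    """Iterate the fitted recurrence forward from the last observation."""
    out, x = [], series[-1]
    for _ in range(steps):
        x = a + b * x
        out.append(x)
    return out

rain = [100, 110, 105, 112, 108, 115]  # toy monthly rain-rate series
a, b = fit_ar1(rain)
future = forecast(rain, a, b)
```

Full ARIMA adds differencing and moving-average terms, and libraries such as statsmodels handle the general case; the fitted-recurrence forecasting loop is the same in spirit.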
A broadband MIMO antenna's channel capacity for WLAN and WiMAX applications (IJICTJOURNAL)

This paper describes the findings of an investigation into the multiple-input multiple-output (MIMO) channel capacity of a broadband dual-element printed inverted-F antenna (PIFA) array. The array is made up of two PIFAs designed to fit on a very small wireless communication device that operates at 5 GHz; the device's frequency range is between 3.5 and 4.5 GHz. These PIFAs are loaded into the device during installation. To investigate the channel capacity, the ray-tracing method is employed under two different kinds of conditions. Both the simulated and the measured mutual couplings of the broadband MIMO antenna are used to carry out this channel-capacity analysis.
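The standard MIMO capacity expression behind such an analysis is C = log2 det(I + (SNR/Nt) H H^H). A minimal sketch for a 2x2 channel, with an illustrative channel matrix rather than the paper's measured one:

```python
import math

def capacity_2x2(h, snr):
    """Capacity of one 2x2 MIMO channel realisation h (2 rows of 2
    complex gains): C = log2 det(I + (snr/Nt) * H * H^H), Nt = 2."""
    # G = H * H^H (2x2 Hermitian Gram matrix)
    g = [[sum(h[i][k] * h[j][k].conjugate() for k in range(2))
          for j in range(2)] for i in range(2)]
    s = snr / 2.0
    # A = I + s*G; determinant of a 2x2 matrix
    a00, a01 = 1 + s * g[0][0], s * g[0][1]
    a10, a11 = s * g[1][0], 1 + s * g[1][1]
    det = a00 * a11 - a01 * a10
    return math.log2(abs(det))

# Ideal decoupled channel (identity H) at SNR = 10:
print(round(capacity_2x2([[1, 0], [0, 1]], 10.0), 3))
```

Mutual coupling between the two PIFAs correlates the entries of H and lowers this determinant, which is why measured couplings matter for the capacity study.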
Satellite dish antenna control for distributed mobile telemedicine nodes (IJICTJOURNAL)

The positioning control of a dish antenna mounted on distributed mobile telemedicine nodes (DMTNs) within Nigeria, communicating via NigComSat-1R, is presented. The aim was to improve the transient and steady-state performance of the satellite dish antenna and reduce the effect of delay during satellite communication. To achieve this, the equations describing the dynamics of the antenna positioning system were obtained and transformed into state-space equations. A full state feedback controller was developed with a forward path gain and an observer. The proposed controller was introduced into the closed loop of the dish antenna positioning control system. The system was subjected to a unit step forcing function in the MATLAB/Simulink simulation environment, considering three different cases, so as to obtain time-domain parameters that characterize the transient and steady-state response. The simulation results revealed that the full state feedback controller provided improved position tracking of the unit step input, with a rise time of 0.42 s, a settling time of 1.22 s, and an overshoot of 4.91%. With the addition of the observer, the rise time achieved was 0.39 s, the settling time 1.31 s, and the overshoot 10.7%. A time-domain performance comparison of the proposed system with existing systems revealed its superiority over them.
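The kind of step-response evaluation described above can be illustrated with a toy plant. This sketch simulates full state feedback on a simplified double-integrator model (theta'' = u), not the paper's actual antenna dynamics, and the gains are chosen only for demonstration:

```python
def step_response(k1, k2, dt=0.001, t_end=5.0):
    """Unit-step response of a toy antenna model theta'' = u with
    full state feedback u = k1*(r - theta) - k2*omega, r = 1."""
    theta, omega, t, out = 0.0, 0.0, 0.0, []
    while t < t_end:
        u = k1 * (1.0 - theta) - k2 * omega   # position error + rate feedback
        omega += u * dt                        # integrate acceleration
        theta += omega * dt                    # integrate velocity
        out.append(theta)
        t += dt
    return out

# Gains placing the closed-loop poles at s^2 + 7s + 25 (damped response):
resp = step_response(25.0, 7.0)
overshoot = max(resp) - 1.0
print(round(resp[-1], 3), round(100 * overshoot, 1))
```

From the simulated trace one can read off the same time-domain parameters the paper reports: rise time, settling time, and percent overshoot.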
High accuracy sensor nodes for a peat swamp forest fire detection using ESP32... (IJICTJOURNAL)

The use of smoke sensors in high-precision, low-cost forest fire detection kits needs to be developed immediately to help the authorities monitor forest fires, especially in remote areas, more efficiently and systematically. The implementation of an automatic reclosing operation allows the fire detector kit to successfully distinguish between real and non-real smoke. This has usefully reduced kit errors when detecting fires and, in turn, prevents users from receiving incorrect messages. However, a smoke sensor with automatic reclosing alone cannot fully optimize the accuracy of identifying actual smoke, because the sensor node's working conditions are difficult to predict and sometimes unexpected, such as the source of the smoke received. Thus, to further improve detection accuracy, the system is equipped with two digital cameras that can capture and send pictures of fire smoke to the users. The system gives users three options for having the camera capture and send pictures: on request, on smoke trigger, and on movement for security purposes. In all cases, users can request the system to send pictures at any time. The camera-equipped system confirms detected smoke through images sent to the user's Telegram channel and shown on the graphical user interface (GUI) display. Comparing the system before and after adding the cameras, it was found that the camera-equipped system lets users monitor fire smoke more effectively and accurately.
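The three capture options described above can be summarised as a small decision rule. The mode names and the function are hypothetical, written only to make the behaviour explicit; they are not from the paper:

```python
def should_capture(mode, smoke_detected, motion_detected, user_request):
    """Decide whether the cameras should capture and send a picture,
    mirroring the three options above (names are illustrative)."""
    if user_request:               # users may request a picture at any time
        return True
    if mode == "smoke trigger":
        return smoke_detected      # confirm real smoke with an image
    if mode == "movement":
        return motion_detected     # security monitoring
    return False

print(should_capture("smoke trigger", True, False, False))   # True
print(should_capture("movement", False, False, False))       # False
```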
Prediction analysis on the pre and post COVID outbreak assessment using machi... (IJICTJOURNAL)

In this time of global urgency, when people are losing their lives each day in large numbers, researchers are trying to develop technology to solve the challenges of COVID-19. Machine learning (ML) and artificial intelligence (AI) tools have been employed in previous pandemics, where they proved their worth by providing reliable results in varied fields; this is why ML tools are being used extensively to fight this pandemic as well. This review describes the applications of ML in pre- and post-COVID-19 conditions for contact tracing, vaccine development, prediction and diagnosis, risk management, and outbreak prediction, to help the healthcare system work efficiently. It discusses ongoing research on the pandemic virus, in which various ML models have been applied to particular datasets to produce outputs usable for risk or outbreak prediction in the population, vaccine development, and contact tracing. Thus, the significance and contribution of ML against COVID-19 are self-evident, but what must not be compromised is the quality and accuracy of the solutions, methods, and policies produced from this analysis, which will be applied in the real world to real people.
Meliorating usable document density for online event detection (IJICTJOURNAL)

Online event detection (OED) has seen a rise in the research community, as it can quickly identify possible events happening around the world. Through these systems, potential events can be indicated well before they are reported by the news media, by grouping similar documents shared over social media by users. Most OED systems use textual similarity for this purpose. Similar documents that may indicate a potential event are further strengthened by the replies made by other users, improving the potentiality of the group. However, these replies are at times unusable as independent documents, as they may replace previously mentioned noun phrases with pronouns, causing OED systems to fail to group them into suitable clusters. In this paper, a pronoun resolution system that tries to replace pronouns with relevant nouns in social media data is proposed. Results show a significant improvement in performance using the proposed system.
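The core idea, substituting pronouns with a relevant antecedent noun so a reply stands on its own, can be sketched naively. Real resolution systems use far richer linguistic features; this toy version simply reuses the most recently seen capitalised word:

```python
PRONOUNS = {"he", "she", "it", "they", "him", "her", "them"}

def resolve_pronouns(text):
    """Naive illustration only: replace each pronoun with the most
    recently seen capitalised noun candidate."""
    out, antecedent = [], None
    for token in text.split():
        word = token.strip(".,!?")
        if word.lower() in PRONOUNS and antecedent:
            out.append(token.replace(word, antecedent))
        else:
            if word and word[0].isupper():
                antecedent = word     # remember a possible antecedent
            out.append(token)
    return " ".join(out)

print(resolve_pronouns("Messi scored again. he was brilliant."))
```

After substitution, the reply shares vocabulary with the original post again, so a similarity-based OED clusterer can group them correctly.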
Performance analysis on self organization based clustering scheme for FANETs... (IJICTJOURNAL)

With the fast-increasing development of wireless communication networks, the unmanned aerial vehicle (UAV) has emerged as a flying platform for wireless communication with efficient coverage, capacity, and reliability; its network is called a flying ad-hoc network (FANET). A FANET keeps changing its topology due to its dynamic nature, causing inefficient communication, and therefore needs cluster formation. In this paper, we propose cluster formation, selection of the cluster head and its members, and connectivity and transmission with the base station using the K-means algorithm, together with the choice of an optimized transmission path using the firefly optimization algorithm for efficient communication. Performance is evaluated experimentally and compared, with and without the firefly optimization algorithm on top of K-means, in terms of cluster building time, cluster lifetime, energy consumption, and probability of delivery success. The comparison showed that the firefly-optimized scheme outperforms K-means alone in cluster building time, energy consumption, cluster lifetime, and probability of delivery success.
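The K-means clustering step can be sketched as follows: UAV positions are grouped, and each centroid then indicates where a cluster head should sit. This is plain K-means on toy 2D positions, not the paper's full FANET protocol:

```python
import math

def kmeans(points, k, iters=20):
    """Plain K-means (illustrative): cluster node positions; each
    centroid marks where a cluster head would be chosen."""
    centroids = list(points[:k])                 # simple initialisation
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for p in points:
            i = min(range(k), key=lambda c: math.dist(p, centroids[c]))
            clusters[i].append(p)
        centroids = [
            tuple(sum(coord) / len(cl) for coord in zip(*cl)) if cl else centroids[i]
            for i, cl in enumerate(clusters)
        ]
    return centroids, clusters

# Two clearly separated groups of UAV positions (toy coordinates):
pts = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
centroids, clusters = kmeans(pts, 2)
print(sorted(len(c) for c in clusters))
```

In the paper's scheme the firefly optimization algorithm then searches for an efficient transmission path between these cluster heads and the base station.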
A persuasive agent architecture for behavior change intervention (IJICTJOURNAL)

A persuasive agent makes use of persuasion techniques to ensure that its predefined objectives are achieved within its immediate environment. This is made possible by five unique features, namely its sociable, persuasive, autonomous, reactive, and proactive natures. However, only limited successes have been recorded in behavioural interventions, and psychological reactance is responsible for these failures. Psychological reactance is the state in which rejection, negative response, and frustration are felt by the users of the persuasive system. Thus, this study proposes a persuasive agent (PAT) architecture that limits the experience of psychological reactance in order to achieve improved behavioural intervention. The PAT architecture combines the reactance model for behavior change with persuasive design principles. The architecture was evaluated in an experimental study using a user-centred approach. The evaluation showed a reduction in the number of users who experienced psychological reactance from 70 per cent to 3 per cent, a marked improvement compared with previous outcomes. The contribution of this study is a design model and a step-by-step approach for software designers on how to limit the effect of psychological reactance in persuasive system applications and interventions.
Enterprise architecture-based ISA model development for ICT benchmarking in c... (IJICTJOURNAL)

Building on a parallel work-in-progress paper, this paper constructs and evaluates an information systems architecture (ISA) model for the Bahraini architecture, engineering and construction (AEC) sector through the lens of enterprise architecture (EA). The model acts as an information and communication technology (ICT) barometer to identify and benchmark ICT gaps, duplication levels, and future investments. Following design science research, and using a tailored version of the open group architecture framework (TOGAF) embedded in a rigorous case study approach, the conceptual ISA model is constructed, tested, and evaluated to benchmark ICT measurement. Empirically, the study revealed the appropriateness of the model and its ability to identify the availability of 28 groups of 38 individual ICT applications in the Bahraini AEC sector and benchmark them, scoring an average of 18.5% against 17 countries that scored an average of 18.6%.
Empirical studies on the effect of electromagnetic radiation from multiple so... (IJICTJOURNAL)

Since Michael Faraday's pioneering discoveries in electromagnetism, there has been a revolution in communication technology, which led to the invention of radio, television, radar, satellites, and mobile devices. While these machines have made our lives higher in quality, safer, and simpler, they have been associated with alarming potential health hazards owing to their electromagnetic radiation (EMR) emission. In several cases, individuals have reported various health issues related to exposure to electromagnetic fields (EMF) and EMR. Although some people show mild symptoms and respond by avoiding electric fields (EF) and EMR as much as possible, others are so affected that they have changed their entire lifestyle. In this paper, an empirical survey study was carried out in the laboratories of the Daffodil International University (DIU) main and permanent campuses. It was found that some instruments had higher EMF levels. The findings from this survey may help students who work for long hours in the various laboratories during their practical classes to take precautionary measures, as well as users of domestic appliances, office equipment, and industrial instruments.
Cyber attack awareness and prevention in network security (IJICTJOURNAL)

This article aims to provide an overview of cyber attack awareness and prevention in network security. It discusses the different types of cyber attacks, current trends, how to prevent cyber attacks, and UUM students' awareness of cyber attacks. First, we go over the different types of cyber attack, current trends, the impact of cyber attacks, and their prevention. The approach entailed comparing and observing the outcomes of 13 different papers. The survey's findings demonstrate the results obtained after analyzing the collected data, namely questionnaires filled out by respondents after watching a cyber attack awareness video intended to improve students' awareness of cyber attacks. Based on the outcome of this survey, we gain a better understanding of current students' knowledge and awareness of cyber attacks, allowing us to improve students' understanding of cyber threats and the necessity of cyber security.
An internet of things-based irrigation and tank monitoring system (IJICTJOURNAL)

Agriculture plays a significant role in the development of a nation and provides the main source of food production, income, and employment. It was the most practiced occupation in Nigeria and formed the backbone of the economy in the early 1960s, before the discovery of crude oil, which has led to the decline of sufficient food production, exportation, and the agricultural economy at large. Over time, the dry season has always been challenging, with little or no rainfall, and there are no irrigation facilities that incorporate water-saving practices to adapt to these climate changes. In this paper, a cost-effective internet of things irrigation system is designed that is capable of reducing water wastage and manual labor, monitoring the tank water level, and being controlled remotely. The system integrates an Arduino UNO with a soil moisture sensor, an HC-SR04 ultrasonic sensor, and an ESP8266 Wi-Fi module that makes the system controllable remotely via the internet, thus achieving optimal irrigation using the internet of things (IoT). Some of the challenges facing existing irrigation systems are water wastage, poor performance, and high implementation cost. The designed system helps control the water supply to crops when it is needed, and also monitors soil moisture, temperature, and the water tank level. After carrying out experiments for 15 days, the system saved approximately 49% of the water used by the traditional irrigation method. The system is useful in large farming areas to minimize human effort and reduce the cost of hiring personnel.
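The control logic of such a system reduces to a few threshold rules on the moisture and tank-level readings. The thresholds and names below are illustrative assumptions, not values from the paper:

```python
def irrigation_action(moisture_pct, tank_level_pct,
                      dry_threshold=30, min_tank=10):
    """Decide the pump state from soil moisture and tank level
    (thresholds are illustrative, not from the paper)."""
    if tank_level_pct < min_tank:
        return "pump_off_low_tank"   # protect the pump, alert the user
    if moisture_pct < dry_threshold:
        return "pump_on"             # soil is dry: irrigate
    return "pump_off"                # soil is wet enough

print(irrigation_action(25, 80))  # pump_on
print(irrigation_action(55, 80))  # pump_off
print(irrigation_action(25, 5))   # pump_off_low_tank
```

On the real device this decision would run on the Arduino, with the ESP8266 reporting the readings and accepting remote override commands.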
About decentralized swarms of asynchronous distributed cellular automata usin... (IJICTJOURNAL)

This research describes a simple implementation of asynchronous distributed cellular automata, and of decentralized swarms of such automata, built on top of the inter-planetary file system's publish-subscribe (IPFS PubSub) experimental feature. Various publish-subscribe (PubSub) models are described. As an illustration, two distributed versions and a decentralized swarm version of a 2D elementary cellular automaton are thoroughly detailed to highlight the simplicity of implementation with IPFS and the inner workings of these kinds of cellular automata (CA). Both algorithms were implemented, and experiments were conducted across five datacenters of the Grid'5000 testbed in France to obtain preliminary performance results in terms of network bandwidth usage. This work is a precursor to implementing a large-scale decentralized epidemic propagation modeling and prediction system based upon asynchronous distributed cellular automata, with application to the current pandemic of SARS-CoV-2 coronavirus disease 2019 (COVID-19).
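For readers unfamiliar with elementary cellular automata, here is the basic synchronous update rule in its classic one-dimensional form (a self-contained sketch; the paper distributes updates of this kind over IPFS PubSub topics):

```python
def eca_step(cells, rule):
    """One synchronous update of an elementary cellular automaton:
    1D, two states, radius-1 neighbourhood, wrap-around edges.
    `rule` is the Wolfram rule number (0-255)."""
    n = len(cells)
    return [
        (rule >> (cells[(i - 1) % n] * 4 + cells[i] * 2 + cells[(i + 1) % n])) & 1
        for i in range(n)
    ]

# Rule 30 evolved from a single live cell:
row = [0, 0, 0, 1, 0, 0, 0]
for _ in range(3):
    row = eca_step(row, 30)
print(row)
```

In the asynchronous distributed setting, each node would apply this local rule to its own slice of cells and exchange boundary cells with neighbours via PubSub messages instead of a shared array.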
A convolutional neural network for skin cancer classification (IJICTJOURNAL)

Skin diseases can be seen clearly by oneself and others. Although such diseases are visible on the skin, the concern is that they may be harmful. People who experience skin diseases quickly visit a dermatologist to have their complaints and symptoms checked. The skin protects the body, especially from the sun, so problems with it can be lethal. One example of a deadly skin disease is skin cancer, or skin tumors. In this research, we classify skin cancer as benign or malignant using the convolutional neural network (CNN) algorithm. The purpose of this research is to develop a CNN architecture to help identify skin diseases. We used a dataset of 3,297 skin cancer images publicly available on the Kaggle website. We propose two CNN architectures that differ in the number of parameters: the first has 6,427,745 parameters and the second has 2,797,665. The accuracy of the proposed models is 93% and 74%, respectively. The first model, with 6,427,745 parameters, was saved for use in the creation of a website. We created a web-based application with the Django framework for skin disease identification.
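The parameter counts quoted above follow directly from the layer shapes. As an illustration of how such counts are computed (the layer sizes below are hypothetical, not the paper's exact architectures):

```python
def conv2d_params(k_h, k_w, in_ch, out_ch):
    """Trainable parameters of one Conv2D layer: one k_h x k_w kernel
    per (input channel, output channel) pair, plus one bias per filter."""
    return (k_h * k_w * in_ch + 1) * out_ch

def dense_params(in_units, out_units):
    """Fully connected layer: a weight per input per unit, plus biases."""
    return (in_units + 1) * out_units

# A hypothetical small stack for RGB input and 2 classes (benign/malignant):
total = (conv2d_params(3, 3, 3, 32)      # 896
         + conv2d_params(3, 3, 32, 64)   # 18,496
         + dense_params(64, 2))          # 130
print(total)
```

Summing such terms over every layer is exactly how the 6,427,745 and 2,797,665 figures for the two proposed architectures would be obtained.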
A review on notification sending methods to the recipients in different techn... (IJICTJOURNAL)

Women have progressed a lot in terms of social empowerment and economics. They are pursuing higher education, jobs, and many other endeavors, but harassment cases have also been on the rise. Women's safety is therefore a big concern nowadays, especially in developing countries. Many previous studies and attempts have been made to create a feasible safety solution for women. Out of the various features for ensuring women's safety in critical situations, location tracking is a very common and key feature in most previously proposed solutions. This study surveyed the mechanisms for sending the location to different types of recipients in various women's safety solutions. In addition, the advantages and drawbacks of these location-sending methods were analyzed.
Detection of myocardial infarction on recent dataset using machine learning (IJICTJOURNAL)

In developing countries such as India, with a large aging population and limited access to medical facilities, remote and timely diagnosis of myocardial infarction (MI) has the potential to save many lives. An electrocardiogram is the primary clinical tool used to detect the onset of, or a previous, MI incident. Artificial intelligence has made a great impact on every area of research, including medical diagnosis, where doctors' experience can serve as input to predict a disease. It has been observed that a properly cleaned and pruned dataset provides far better accuracy than an unclean one with missing values. Selecting suitable data-cleaning techniques alongside proper classification algorithms leads to prediction systems with enhanced accuracy. In this proposal, detection of myocardial infarction using new parameters is proposed to increase the accuracy and efficiency of the existing model. Additional parameters are used to predict MI more accurately. The proposed model is used for early diagnosis of MI with the help of expert experience and data gathered from hospitals.
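The cleaning-and-pruning step emphasised above can be illustrated with a minimal record filter. The field names and values are invented for illustration and are not from the paper's dataset:

```python
def clean_rows(rows, required):
    """Drop records with missing values in the required fields,
    a simple version of the cleaning/pruning step described above."""
    return [r for r in rows
            if all(r.get(f) not in (None, "") for f in required)]

records = [
    {"age": 63, "chol": 233, "ecg_st": 1},
    {"age": 54, "chol": None, "ecg_st": 0},   # missing cholesterol
    {"age": 61, "chol": 289, "ecg_st": ""},   # missing ECG feature
]
kept = clean_rows(records, ["age", "chol", "ecg_st"])
print(len(kept))  # 1
```

Only complete records would then be passed to the classifier, which is what the abstract credits for the improved accuracy.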
Multiple educational data mining approaches to discover patterns in universit... (IJICTJOURNAL)

This paper presents the use of pattern discovery techniques, combining multiple-relationship and clustering educational data mining approaches, to establish a knowledge base that aids in predicting the ideal college program selection and in enrollment forecasting for incoming freshmen. Results show a significant level of accuracy in predicting college programs for students by mining two years of student college admission and final graduation grade records. The results of these educational predictive data mining methods can be applied to improve the services of an educational institution's admission department, particularly in course alignment, student mentoring, admission forecasting, marketing, and enrollment preparedness.
A novel enhanced algorithm for efficient human tracking (IJICTJOURNAL)

Tracking moving objects has been a challenge in recent years of computer vision and image processing, and human tracking makes it an even more significant one. This area has many aspects and wide applications, such as autonomous driving, human-robot interaction, and human movement analysis. Among the issues that have always made tracking algorithms difficult are their interaction with target recognition methods, the mutable appearance of variable targets, and the simultaneous tracking of multiple targets. In this paper, a method with higher efficiency and accuracy than previous methods is introduced for tracking objects imaged with a fixed camera. The proposed algorithm operates in four steps: it identifies a fixed background and removes noise from it; this background is then subtracted to reveal movable objects; next, while the image is being filtered, the shadows and noise of the filmed image are removed; and finally, using the bubble routing method, the mobile object is separated and tracked. Experimental results indicate that the proposed model for detecting and tracking mobile objects works well and can improve the motion and trajectory estimation of objects, in terms of speed and accuracy, to a desirable level compared with previous methods.
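The background-subtraction step at the heart of the pipeline above can be sketched on grayscale frames stored as 2D lists. This is a simplified, thresholded frame difference, not the paper's full four-step algorithm:

```python
def moving_mask(background, frame, threshold=30):
    """Background subtraction: a pixel is 'moving' when it differs
    from the fixed background by more than `threshold`."""
    return [[1 if abs(f - b) > threshold else 0
             for f, b in zip(frow, brow)]
            for frow, brow in zip(frame, background)]

# Toy 2x3 frames: the bright column is a moving object.
bg    = [[10, 10, 10], [10, 10, 10]]
frame = [[10, 200, 10], [10, 210, 12]]
mask = moving_mask(bg, frame)
print(mask)  # [[0, 1, 0], [0, 1, 0]]
```

Subsequent steps would filter this binary mask to suppress shadows and noise before separating and tracking each connected moving region.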
Correcting optical character recognition result via a novel approach

International Journal of Informatics and Communication Technology (IJ-ICT)
Vol. 11, No. 1, April 2022, pp. 8~19
ISSN: 2252-8776, DOI: 10.11591/ijict.v11i1.pp8-19
Journal homepage: http://ijict.iaescore.com

Otman Maarouf1, Rachid El Ayachi1, Mohamed Biniz2
1 Department of Computer Science, Faculty of Science and Technology, Sultan Moulay Slimane University, Beni-Mellal, Morocco
2 Department of Computer Science, Polydisciplinary Faculty, Sultan Moulay Slimane University, Beni Mellal, Morocco
Article Info

Article history:
Received May 14, 2021
Revised Dec 25, 2021
Accepted Jan 13, 2022

Keywords:
Correction center
Natural language processing
Optical character recognition
Recursive technique
Sentences correction
Tifinagh

ABSTRACT

Optical character recognition (OCR) is a recognition system used to recognize the content of a scanned image. This system gives erroneous results, which necessitates post-processing for sentence correction. In this paper, we propose a new method for syntactic and semantic correction of sentences based on the frequency of two correct words in the sentence and a recursive technique. The approach starts by calculating the frequency of each pair of successive words in the corpora; the words with the greatest frequency build a correction center. Using our approach we obtained 98%, while the noisy-channel method obtained 96% on the same corpus under the same conditions.

This is an open access article under the CC BY-SA license.
Corresponding Author:
Otman Maarouf
Department of Computer Science, Sultan Moulay Slimane University
Av Med V, BP 591, Beni-Mellal 23000, Morocco
Email: maarouf.otman94@gmail.com
1. INTRODUCTION
With the advancement in technology and processing speed, an ever-increasing number of sophisticated algorithms for optical character recognition (OCR) systems, including machine learning and neural networks, have been proposed [1]. OCR is the process of converting an image representation into an editable text format; it is the technique for digitizing printed and handwritten text [2]. Many applications, including number-plate recognition, book scanning, and real-time conversion of handwritten text, benefit from OCR [3]. Unfortunately, the results given by OCR are not always satisfactory; they contain errors that affect the meaning of sentences. These errors fall into several types: i) Missing characters: the OCR output contains fewer characters than the image to be recognized. ii) Added characters: the OCR output contains more characters than appear in the image to be recognized. iii) Modified characters: the OCR output contains the same number of characters as the image to be recognized, but some of them are replaced by characters different from the original.
To solve this problem, a post-processing stage must be added. At this level, we propose a new approach that applies in two stages: it begins with the detection of errors, then performs the syntactic and semantic correction of the OCR output. This approach is based on the frequency of pairs of correct words in the sentence and a recursive technique. It starts by computing the frequency of each pair of successive words in the corpora; the pair with the greatest frequency forms a correction center. It then corrects all words using the recursive technique described in the next section. This approach belongs to the domain of natural language processing (NLP), a branch of artificial intelligence that analyzes,
understands, and generates the natural languages used by humans to interact with computers, in written and spoken contexts, rather than computer languages [4].
In the literature, we find that the NLP domain has been applied to several languages, namely English [5], French [6], and Arabic [7]. On the other hand, the Amazigh language, which uses the Tifinagh characters, has not benefited from the advantages offered by this domain. This lack motivated us to approach this area to improve OCR results for Tifinagh characters.
Tifinagh [8] is the set of alphabets used by the Amazigh population. The Royal Institute of Amazigh
Culture (IRCAM) has normalized the Tifinagh alphabet of thirty-three characters as shown in Figure 1.
Figure 1. Tifinagh characters (IRCAM)
The remainder of the paper is organized as follows: section 2 describes the architecture of the OCR system used for the recognition of documents written in Tifinagh, section 3 presents the proposed NLP approach adopted to improve the results given by an OCR, section 4 reports the experimental results obtained to judge the performance of the proposed approach, and finally a conclusion summarizes the purpose of the work and reports the conclusions drawn.
2. OPTICAL CHARACTER RECOGNITION SYSTEM
The advancement of pattern recognition has accelerated recently due to the many emerging
applications which are not only challenging but also computationally demanding, as is evident in
OCR [9], document classification, computer vision [10], data mining, shape recognition, and biometric
authentication. OCR is becoming an integral part of document scanners and is used in many applications such as postal processing, script recognition, banking, security (for example, passport verification), and language identification. Research in this area has been ongoing for over 50 years and the results have been remarkable, with successful recognition rates for printed characters approaching 100%, and with significant improvements in performance for handwritten cursive character recognition, where recognition rates have exceeded the 90% mark. Nowadays, many organizations rely on OCR systems to eliminate human interaction for better performance and efficiency [11].
In the current work, a character recognition system is presented for recognizing Tifinagh
characters extracted from text embedded in pictures and graphics, such as business card images [12].
Figure 2 describes the steps of our OCR: it takes an image as input, starts with text extraction, then the binarization phase, the segmentation phase, and finally recognition, producing a text file as output.
2.1. Image acquisition
The proposed OCR begins with the image acquisition [3] process, the first step in OCR, which
consists of getting a digital image and converting it into a proper form that can be easily handled by a
computer. This may include quantization as well as compression of the image [13]. A special case of
quantization is binarization, which involves only two image levels [14]. In most cases, the binary image is
sufficient to characterize the image. The compression itself can be lossy or lossless. An overview of the
different image compression techniques is given in [15]. This OCR takes any typewritten image containing Tifinagh characters in either ".png" or ".jpg" format.
Figure 2. Block diagram of the present system
2.2. Text extraction
Through the scanning process, a digital image of the original document is captured. OCR uses optical
scanners, which generally consist of a transport mechanism plus a sensing device
that converts light intensity into gray levels [16]. Printed documents usually consist of black print on a white
background. Hence, when performing OCR, it is common practice to convert the multilevel
image into a bi-level, black-and-white image. Usually this process, known as thresholding, is performed on
the scanner to save memory space and computational effort. Issues in thresholding can be seen in Figure 3.
The thresholding step is important, as the results of the subsequent recognition
depend entirely on the quality of the bi-level image. However, the thresholding performed on the scanner
is usually very simple [17]. A fixed threshold is used, where gray levels below this
threshold are set to black and levels above it to white. For a high-contrast document with a
uniform background, a pre-chosen fixed threshold can be sufficient. However, many documents encountered in
practice have a rather large range in contrast. In these cases, more sophisticated methods for
thresholding are needed to obtain a good result [18].
The best thresholding techniques are usually those that can vary the threshold over the document,
adapting to local properties such as contrast and brightness. However, such techniques
usually depend on multilevel scanning of the document, which requires more memory and computational capacity. Therefore, such techniques are rarely used in OCR systems, even though they produce better images [11].
Figure 3. Issues in thresholding: (a) original gray-level image, (b) image thresholded with a global method, and (c) image thresholded with an adaptive method
2.3. Binarization
A skew-corrected text region is binarized using a simple yet efficient binarization
technique developed by us before segmenting it [19]. The algorithm is given below. Essentially, this is an
improved version of Bernsen's binarization method. In his method, the arithmetic mean of
the maximum (Gmax) and the minimum (Gmin) gray levels around a pixel is taken as the threshold for binarizing
the pixel. In the current algorithm, the eight immediate neighbors around the pixel subject to binarization are
also taken as a deciding factor for binarization. This kind of approach is especially useful to
connect the disconnected foreground pixels of a character [12].
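As a concrete illustration of this idea, here is a minimal Python sketch of a Bernsen-style local binarization (threshold at the midpoint of the local minimum and maximum gray level, with the eight immediate neighbors in the window). It is a generic stand-in written for this edit, not the authors' exact algorithm from [19]:

```python
def binarize_bernsen(img, radius=1):
    """Local binarization: threshold each pixel at (Gmin + Gmax) / 2,
    computed over its neighborhood (the pixel plus its 8 immediate
    neighbors for radius=1), as in Bernsen's method."""
    h, w = len(img), len(img[0])
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            window = [img[ny][nx]
                      for ny in range(max(0, y - radius), min(h, y + radius + 1))
                      for nx in range(max(0, x - radius), min(w, x + radius + 1))]
            threshold = (min(window) + max(window)) / 2
            out[y][x] = 1 if img[y][x] >= threshold else 0
    return out
```

A production version would also handle the low-contrast case Bernsen describes (windows where Gmax - Gmin is small), which this sketch omits.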
2.4. Segmentation
Segmentation is the separation of characters or words. Most OCR
algorithms segment the words into isolated characters, which are recognized individually [11]. Usually,
this segmentation is performed by isolating each connected component, that is, each connected black region. This
procedure is easy to implement; however, problems occur if characters touch, or if
characters are fragmented and consist of several parts. The main problems in segmentation may be divided
into four groups:
- Extraction of touching and fragmented characters.
- Distinguishing noise from the text.
- Mistaking graphics or geometry for text.
- Mistaking text for graphics.
For our case, the operation of the segmentation process is based on histogram-based thresholding.
We denote the histogram of pixel values by h_0, h_1, ..., h_N, where h_k specifies the number of pixels in
an image with gray-scale value k, and N is the maximum pixel value (typically 255). Ridler and Calvard
(1978) and Trussell (1979) proposed a simple algorithm for choosing a single threshold. We will refer
to it as the inter-means algorithm.
First, an initial guess must be made at a possible value for the threshold. From this, the mean
values of the pixels in the two classes produced by this threshold are calculated. The threshold is repositioned to lie
exactly halfway between the two means. Mean values are calculated again, and a new threshold
is obtained, repeating until the threshold stops changing value [20].
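The inter-means procedure described above can be sketched directly from a histogram; this is a generic implementation of the Ridler-Calvard idea, not code from the paper:

```python
def inter_means_threshold(hist):
    """Ridler-Calvard inter-means algorithm on a histogram h_0..h_N:
    start from the global mean, then repeatedly set the threshold
    halfway between the means of the two classes it induces,
    stopping when the threshold no longer changes."""
    total = sum(hist)
    t = sum(k * h for k, h in enumerate(hist)) / total  # initial guess: global mean
    while True:
        lo = [(k, h) for k, h in enumerate(hist) if k <= t and h]
        hi = [(k, h) for k, h in enumerate(hist) if k > t and h]
        if not lo or not hi:
            return round(t)
        m_lo = sum(k * h for k, h in lo) / sum(h for _, h in lo)
        m_hi = sum(k * h for k, h in hi) / sum(h for _, h in hi)
        new_t = (m_lo + m_hi) / 2
        if abs(new_t - t) < 0.5:  # threshold stopped changing value
            return round(new_t)
        t = new_t
```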
2.5. Recognition
Recognition is the last phase of the OCR system and is used to identify the segmented content. In
this step, correlation coefficients are used for classification. The correlation coefficient, computed
from the sample data, measures the strength and direction of a relationship between two variables. A
correlation coefficient is a number between 0 and 1. If there is no relationship
between the predicted values and the actual values, the correlation coefficient is 0 or very low
(the predicted values are no better than random numbers). As the strength of the relationship
between the predicted and actual values increases, so does the correlation coefficient. A perfect fit
gives a coefficient of 1.0. Thus, a better result corresponds to a higher correlation coefficient
[21]. corr2 computes the correlation coefficient using:
r = Σ_m Σ_n (A_mn − Ā)(B_mn − B̄) / √( [Σ_m Σ_n (A_mn − Ā)²] [Σ_m Σ_n (B_mn − B̄)²] )   (1)

where:
- A and B are two matrices of the same size
- m: number of rows
- n: number of columns
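Equation (1) translates almost literally into code. The pure-Python function below is a sketch of a MATLAB-style corr2 written for illustration here:

```python
import math

def corr2(A, B):
    """2-D correlation coefficient between two equal-sized matrices,
    following (1): subtract each matrix's mean, then divide the
    cross-sum by the geometric mean of the two squared sums."""
    m, n = len(A), len(A[0])
    a_bar = sum(A[i][j] for i in range(m) for j in range(n)) / (m * n)
    b_bar = sum(B[i][j] for i in range(m) for j in range(n)) / (m * n)
    num = sum((A[i][j] - a_bar) * (B[i][j] - b_bar)
              for i in range(m) for j in range(n))
    den = math.sqrt(sum((A[i][j] - a_bar) ** 2 for i in range(m) for j in range(n)) *
                    sum((B[i][j] - b_bar) ** 2 for i in range(m) for j in range(n)))
    return num / den
```

Note that, unlike the 0-to-1 range described above, this coefficient ranges from -1 to 1; classification picks the template with the coefficient closest to 1.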
3. THE NOISY CHANNEL MODEL
In this section, we present the noisy channel model and show how best to apply it
to the task of detecting and correcting spelling errors. The noisy channel model was applied to
the spelling correction task at about the same time by researchers at AT&T Bell Laboratories [22] and IBM Watson
Research [23].
The intuition of the noisy channel model in Figure 4 is to treat the misspelled
word as if a correctly spelled word had been "distorted" by passing through a noisy
communication channel. This channel introduces "noise" in the form of substitutions or other changes to the letters,
making it hard to recognize the "true" word. The goal, then, is to build a model of the noisy channel.
Given this model, we then find the true word by passing every word of the language through the
model of the noisy channel and seeing which one comes closest to the misspelled word [24].
In the noisy channel model, we imagine that the surface form we observe is actually a "distorted"
form of an original word that passed through a noisy channel. The decoder passes each hypothesis through a model of
this channel and picks the word that best matches the observed noisy word [25].
Figure 4. Noisy channel model
3.1. Extraction of candidates
The noisy channel model is a kind of Bayesian inference. We see an observation x (a misspelled
word), and the job is to find the word w that generated this misspelled word. Out of all
possible words in the vocabulary V, we want to find the word w such that P(w|x) is highest.
We use the hat notation ˆ to mean "our estimate of the correct word".

ŵ = argmax_{w∈V} P(w|x)   (2)

The function argmax_x f(x) means "the x such that f(x) is maximized". Equation (2) thus means
that, out of all words in the vocabulary, we want the particular word that maximizes the right-hand side P(w|x).
3.2. Bayesian classification
The intuition of Bayesian classification is to use Bayes' rule to transform (2) into a set of other
probabilities. Bayes' rule is introduced in (3); it gives us a way to break down any conditional probability
P(a|b) into three other probabilities:

P(a|b) = P(b|a)P(a) / P(b)   (3)

We can then substitute (3) into (2) to get (4):

ŵ = argmax_{w∈V} P(x|w)P(w) / P(x)   (4)

We can conveniently simplify (4) by dropping the denominator P(x). Why is that? Since we
are choosing a potential correction word out of all possible words, we would compute
P(x|w)P(w) / P(x) for each word.
But P(x) does not change for each word; we are always asking for the most likely word given the
same observed error x, which must have the same probability P(x). Thus, we can choose the
word that maximizes this simpler formula [24]:

ŵ = argmax_{w∈V} P(x|w)P(w)   (5)
To summarize, the noisy channel model says that we have some true underlying word w, and we have
a noisy channel that modifies the word into some possibly misspelled observed surface
form. The likelihood, or channel model, of the noisy channel producing a particular observation
x is modeled by P(x|w). The prior probability of a hidden word is modeled by P(w). We
can compute the most probable word ŵ, given that we have observed some
misspelling x, by multiplying the prior P(w) by the likelihood P(x|w) and picking the word for which
this product is greatest.
The noisy channel approach to correcting non-word spelling errors takes any
word not in the spelling dictionary, generates a list of candidate words, ranks them according to (5),
and picks the highest-ranked one. We can modify (5) to refer to this list of candidate
words C rather than the full vocabulary V, as in [24]:
ŵ = argmax_{w∈C} P(x|w)P(w)   (6)
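A hedged sketch of how (6) is applied: given a candidate list, a prior P(w), and a channel model P(x|w), pick the candidate with the largest product. The probability tables below are toy values following Kernighan et al.'s well-known "acress" illustration, not the paper's Tifinagh data:

```python
def noisy_channel_correct(x, candidates, prior, channel):
    """Rank candidate corrections for the misspelling x by
    P(x|w) * P(w), as in (6), and return the best-scoring one.
    `prior` maps word -> P(w); `channel` maps (x, w) -> P(x|w)."""
    return max(candidates,
               key=lambda w: channel.get((x, w), 0.0) * prior.get(w, 0.0))

# Toy example with illustrative probabilities (real tables come from
# a corpus and from confusion matrices):
prior = {"actress": 0.0000231, "across": 0.000299}
channel = {("acress", "actress"): 0.000117, ("acress", "across"): 0.0000093}
best = noisy_channel_correct("acress", ["actress", "across"], prior, channel)
```

With these unigram priors "across" narrowly wins, which mirrors the point made later in the paper: a unigram prior can pick the wrong word, and a bigram language model is needed to fix it.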
4. THE PROPOSED APPROACH
NLP is a way for computers to analyze, understand, and derive meaning from human language in an
intelligent and useful way. By using NLP, developers can organize and structure knowledge to perform
tasks such as automatic summarization, translation, named entity recognition, relationship
extraction, sentiment analysis, speech recognition, and topic segmentation.
4.1. Correction of words
Correcting a wrong word requires a list of candidates to select from. In this part, we
treat word correction based on a dictionary of Tifinagh words, calculating the distance between words in
order to build the list of candidates from the words that have the maximum distance. The distance between
two words is the number of common letters between the two words in the same (or an adjacent) position, which is
calculated by (7).
dist(w1, w2) = |{ c_i : (c_i^{w1} = c_i^{w2}) ∨ (c_i^{w1} = c_{i±1}^{w2}) ∨ (c_{i±1}^{w1} = c_i^{w2}) ∨ (c_{i±1}^{w1} = c_{i±1}^{w2}) }|   (7)

with:
- c_i: the character of index i in word 1
- w1_k: the position k in word 1
In OCR we can find several types of errors: words corrupted by deletion, insertion, transposition, or
substitution.
Table 1 shows that the two types of erroneous words produced by transposition and substitution have the
same number of characters as the correct word; on the other hand, words corrupted by deletion have
fewer characters than the correct word, and words corrupted by insertion have
more characters than the correct word.
Table 1. Types of erroneous words
Erroneous word Correct word Type of error
β΄°β΅β΄°β΅ β΄°β΄·β΅β΄°β΅ deletion
β΄°β΄·β΄·β΅β΄°β΅ β΄°β΄·β΅β΄°β΅ insertion
β΄°β΅β΄·β΄°β΅ β΄°β΄·β΅β΄°β΅ transposition
β΄°β΄·β΅β΄°β΅ β΄°β΄·β΅β΄°β΅ substitution
To correct an erroneous word, we calculate its distance to all the dictionary words that have
the same size or a size ±1, after which we return the maximum of the distances by (8):

max_dist(w) = max{ dist(w, w_i) }   (8)

with:
- w: the word to correct
- w_i: dictionary words whose size is equal to the size, or size ±1, of w

We can then retrieve all the words that have the maximum distance to the wrong word, using the maximum distance given by (8):

word_candidates(w) = { w_i : dist(w, w_i) = max_dist(w) }   (9)
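The candidate-extraction step of (7)-(9) can be sketched as follows. The distance function is our reading of (7): it counts the characters of the wrong word that match the dictionary word at the same or an adjacent position:

```python
def dist(w1, w2):
    """Common-letter distance in the spirit of (7): count indices of w1
    whose character matches w2 at the same position or at one of the
    two adjacent positions."""
    score = 0
    for i, c in enumerate(w1):
        if any(0 <= j < len(w2) and w2[j] == c for j in (i - 1, i, i + 1)):
            score += 1
    return score

def word_candidates(w, dictionary):
    """Candidates per (8)-(9): dictionary words whose length is within
    +/-1 of w and whose distance to w is maximal."""
    pool = [d for d in dictionary if abs(len(d) - len(w)) <= 1]
    if not pool:
        return []
    best = max(dist(w, d) for d in pool)  # max_dist(w), eq. (8)
    return [d for d in pool if dist(w, d) == best]
```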
4.2. Words frequency
To correct a sentence, we need to start the correction from correct words. Therefore, we
determine the positions of words that belong to the dictionary and whose following word also belongs to the
dictionary, by (10):

pos_w = { i : w_i ∈ dictionary ∧ w_{i+1} ∈ dictionary }   (10)

with:
- w_i: the word of the sentence at position i
If pos_w = ∅, we recalculate pos_w by (11):

pos_w = { i : w_i ∈ dictionary }   (11)

We calculate the frequency, using (12), of the words w_i in the corpus, knowing that w_{i+1} is consecutive
with w_i:

freq_{w_i} = { n_{i/i+1} / N : i ∈ pos_w }   (12)

where:
- n_{i/i+1}: the number of occurrences of w_i in the corpus followed by w_{i+1}
- N: the total number of corpus words

If we used (11), we calculate the frequency of the words by (13):

freq_{w_i} = { n_i / N : i ∈ pos_w }   (13)

with:
- n_i: the number of occurrences of the word in the corpus
- N: the total number of corpus words

After calculating the frequencies of the words, we look for the maximum frequency by (14):

maxfreq = max(freq_w)   (14)

Then we return the position of the word that has the maximum frequency by (15). The
correction process focuses on this position (called the correction center):

pos = { i : maxfreq = freq_{w_i} }   (15)
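The correction-center computation of (10)-(15) might look like this in code; the count tables are hypothetical stand-ins for the corpus statistics:

```python
def correction_center(sentence, dictionary, bigram_count, unigram_count, total):
    """Find the correction center per (10)-(15): among positions i where
    w_i and w_{i+1} are both in the dictionary, pick the i whose pair
    (w_i, w_{i+1}) has the highest corpus frequency. If no such pair
    exists, fall back to unigram frequencies as in (11) and (13)."""
    pos_w = [i for i in range(len(sentence) - 1)
             if sentence[i] in dictionary and sentence[i + 1] in dictionary]
    if pos_w:   # eq. (12): frequency of w_i followed by w_{i+1}
        freq = {i: bigram_count.get((sentence[i], sentence[i + 1]), 0) / total
                for i in pos_w}
    else:       # eqs. (11), (13): any in-dictionary word, unigram frequency
        pos_w = [i for i, w in enumerate(sentence) if w in dictionary]
        freq = {i: unigram_count.get(sentence[i], 0) / total for i in pos_w}
    return max(freq, key=freq.get)  # position with maximum frequency, eq. (15)
```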
4.3. Correction of the sentence
The word frequencies computed in the previous section allow us to correct each
word of a sentence, based on the words that exist in the dictionary. Now we compute the frequency of
the words that are close to the erroneous word, using the results of (14) and (15), by a calculation that takes into
consideration the n correct words after or before the erroneous word, by (16) and (17).
i) Left formula:

freq^L_{w_i/pos} = { n_{i/pos} / N : w_i ∈ word_candidates(w̃) }   (16)

where:
- n_{i/pos}: the number of occurrences of the word w_i in the corpus, followed by the word at position pos
- N: the total number of corpus words
- w̃: the word to correct

ii) Right formula:

freq^R_{w_i/pos} = { n_{i/pos} / N : w_i ∈ word_candidates(w̃) }   (17)

with:
- n_{i/pos}: the number of occurrences of the word w_i in the corpus, preceded by the word at position pos
- N: the total number of corpus words
- w̃: the word to correct

For each of (16) and (17), we calculate the maximum of the frequencies on the left or on the right, as in (18)
and (19):

maxfreq^L_{i/pos} = max(freq^L_{w_i/pos})   (18)

and

maxfreq^R_{i/pos} = max(freq^R_{w_i/pos})   (19)
If maxfreq^R_{i/pos} = 0, we apply the recursive technique: increment pos
(pos = pos + 1) until maxfreq^R_{i/pos} > 0. Similarly, if maxfreq^L_{i/pos} = 0, we apply the recursive
technique: decrement pos (pos = pos − 1) until maxfreq^L_{i/pos} > 0. To correct an erroneous word, we
use one of two formulas depending on its position relative to the correction center. If the position of the
wrong word is above the correction center:

correct^R(w̃) = { w_i : maxfreq^R_{i/pos} = freq^R_{w_i/pos} and w_i ∈ word_candidates(w̃) }   (20)

If the position of the wrong word is below the correction center:

correct^L(w̃) = { w_i : maxfreq^L_{i/pos} = freq^L_{w_i/pos} and w_i ∈ word_candidates(w̃) }   (21)

where w̃ is the wrong word.
Finally, we can correct a sentence, using the word correction on the left or on the right and the recursive
technique, by (22):

correct(sentence) = { correct^L(w_{i<pos}); correct^R(w_{i>pos}) }   (22)

where:
- w_{i<pos}: the sentence words located before the word at the correction center
- w_{i>pos}: the sentence words located after the word at the correction center
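Putting (16)-(22) together, here is a sketch of the whole sentence-correction loop. `pair_freq` and `candidates` are hypothetical stand-ins for the corpus bigram table and the candidate extractor of (9):

```python
def correct_sentence(sentence, dictionary, pair_freq, candidates, center):
    """Sketch of (16)-(22): walk left, then right, from the correction
    center; each out-of-dictionary word is replaced by the candidate
    that most frequently occurs next to its already-corrected neighbor.
    `pair_freq` maps (left_word, right_word) -> corpus frequency;
    `candidates(w)` returns the candidate list of (9)."""
    out = list(sentence)
    # Left correction, eq. (21): words before the center, moving leftwards,
    # so each word is scored against its already-corrected right neighbor.
    for i in range(center - 1, -1, -1):
        if out[i] not in dictionary:
            out[i] = max(candidates(out[i]),
                         key=lambda c: pair_freq.get((c, out[i + 1]), 0.0))
    # Right correction, eq. (20): words after the center, moving rightwards.
    for i in range(center + 1, len(out)):
        if out[i] not in dictionary:
            out[i] = max(candidates(out[i]),
                         key=lambda c: pair_freq.get((out[i - 1], c), 0.0))
    return out
```

The recursive pos adjustment of the paper (moving the center when all neighbor frequencies are zero) is omitted here for brevity; it would wrap the two loops.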
5. TEST AND MEASURE
The evaluation of our system requires experiments; for that, we divided the dataset, containing 5,000
images, into two parts: 80% for training and 20% for testing the performance of the system.
5.1. Experimental results of OCR
The realized OCR system starts by reading an image containing a text written in Tifinagh, then
performs the recognition of the content, and finally generates a file containing the result of the recognition.
Figure 5 is an example of an image, containing a sentence written in Tifinagh, used to test our OCR. The
OCR output obtained is illustrated in Figure 6. This result shows two errors,
reported in Table 2: in the first word, the error is in the third character; in
the second word, the error is in the first character.
Figure 5. Example of image presented at the OCR input
Figure 6. Example of image retrieved at the OCR output
Table 2. Errors extracted
Input Output
β΅β΅β΅£β΅ β΅β΅β΅β΅
β΅β΅ β΅β΅
5.2. Experimental results of noisy channel
We find words that have a similar spelling to the input word. Analysis of spelling error
data has shown that most spelling errors consist of a single-letter change, so we often
make the simplifying assumption that these candidates have an edit distance of 1 from the error
word. To find this list of candidates we use the minimum edit distance algorithm, extended so
that in addition to insertions, deletions, and substitutions, we add a fourth type of edit, transpositions, in
which two letters are swapped. The version of edit distance that includes transposition is called Damerau-Levenshtein
edit distance. Applying all such single edits to "β΅β΅β΅£β΅" yields the list of candidate words as
shown in Table 3.
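Generating the distance-1 candidates (deletions, transpositions, substitutions, insertions) is a standard construction; the sketch below uses a Latin alphabet as a stand-in for the Tifinagh character set the real system would use:

```python
ALPHABET = "abcdefghijklmnopqrstuvwxyz"  # stand-in for the Tifinagh alphabet

def edits1(word):
    """All strings at Damerau-Levenshtein distance 1 from `word`:
    deletions, transpositions of adjacent letters, substitutions,
    and insertions."""
    splits = [(word[:i], word[i:]) for i in range(len(word) + 1)]
    deletes = [L + R[1:] for L, R in splits if R]
    transposes = [L + R[1] + R[0] + R[2:] for L, R in splits if len(R) > 1]
    replaces = [L + c + R[1:] for L, R in splits if R for c in ALPHABET]
    inserts = [L + c + R for L, R in splits for c in ALPHABET]
    return set(deletes + transposes + replaces + inserts)

def candidates(word, dictionary):
    """Keep only the single-edit variants that are real dictionary words."""
    return edits1(word) & dictionary
```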
Once we have a set of candidates, scoring each one using (6) requires that we compute the
prior and the channel model. The prior probability of each correction P(w) is the language model
probability of the word w in context, which can be computed using any language model, from unigram to
trigram or 4-gram. For this example, let us start in Table 4 by assuming a unigram language
model.
Table 3. Candidate corrections for the misspelling "β΅β΅β΅£β΅" and the transformations that would have produced
the error (after [22]; "-" represents a null letter)
Error Correction Types
β΅β΅β΅£β΅ β΅β΅β΅£β΅β΅ deletion
β΅β΅β΅£β΅ β΅β΅β΅₯β΅ substitution
β΅β΅β΅£β΅ β΅β΅β΅£ insertion
Table 4. The prior probability of each candidate correction for the misspelling "β΅β΅β΅£β΅"
w Count(w) P(w)
β΅β΅β΅£β΅β΅ 2 798 0.000256
β΅β΅β΅₯β΅ 19 300 0.001565
β΅β΅β΅£β΅ 230 0.0000695
How might we estimate the likelihood P(x|w), also called the channel model or error model?
An ideal model of the probability that a word will be mistyped would condition the error model on a wide
range of factors: who the typist was, whether the typist was left-handed or right-handed, and so on.
Fortunately, we can get a reasonable estimate of P(x|w) just by looking at the local context: the character of the
correct letter itself, the misspelling, and the surrounding letters.
Once we have the confusion matrices, we can estimate P(x|w) as follows (where w_i is the i-th character of
the correct word w and x_i is the i-th character of the typo x). Using the counts from [17] results in the error
model probabilities shown in Table 5.
P(x|w) =
  del[x_{i−1}, w_i] / count[x_{i−1} w_i],  if deletion
  ins[x_{i−1}, w_i] / count[w_{i−1}],      if insertion
  sub[x_i, w_i] / count[w_i],              if substitution
(23)
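Equation (23) can be read as a table lookup. The sketch below assumes hypothetical confusion-matrix dictionaries del/ins/sub and corpus character counts; the names and shapes of these tables are illustrative, not taken from the paper:

```python
def channel_prob(x, w, i, kind, conf, counts):
    """Estimate P(x|w) for a single-edit error at position i, per (23).
    `conf` holds the confusion matrices as dicts of counts keyed by
    character pairs; `counts` holds corpus counts of characters and
    character bigrams. (Toy stand-ins for matrices learned from an
    error corpus.)"""
    if kind == "deletion":       # a letter of w was dropped after x_{i-1}
        return conf["del"][(x[i - 1], w[i])] / counts[x[i - 1] + w[i]]
    if kind == "insertion":      # a spurious letter appeared in x
        return conf["ins"][(x[i - 1], w[i])] / counts[w[i - 1]]
    if kind == "substitution":   # x_i was produced instead of w_i
        return conf["sub"][(x[i], w[i])] / counts[w[i]]
    raise ValueError(kind)
```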
Table 5. Channel model for β΅β΅β΅£β΅
Candidate correction Correct letter Error letter x/w P(x/w)
β΅β΅β΅£β΅β΅ - β΅ β΅/# 0.000045
β΅β΅β΅₯β΅ β΅₯ β΅£ β΅£/β΅₯ 0.00064
β΅β΅β΅£β΅ β΅β΅ β΅β΅ β΅β΅/β΅β΅ 0.0000012
Table 6 shows the final results for each of the corrections; the unigram prior is combined with the
channel-model probabilities computed with (23) and the confusion matrices. The computations in Table 6 show that the noisy channel model chooses β΅β΅β΅₯β΅
as the best correction, and β΅β΅β΅£β΅β΅ as the second most likely word. For this reason, it is important to use larger language
models than unigrams. For example, if we use a corpus to compute the bigram probability for the words β΅β΅β΅₯β΅ and
β΅β΅β΅£β΅β΅ in their context using add-one smoothing, we get the following probabilities:
P(β΅’β΄°β΅|β΅β΅β΅₯β΅) = 0.000051
P(β΅’β΄°β΅|β΅β΅β΅£β΅β΅) = 0.000002
Combining the language model with the error model in Table 6, the bigram noisy channel model
now chooses the correct word β΅β΅β΅₯β΅.
Table 6. Calculated the ranking for each word correction
Candidate correction Correct letter Error letter x/w P(x/w) P(w)
β΅β΅β΅£β΅β΅ - β΅ β΅/# 0.000045 0.000256
β΅β΅β΅₯β΅ β΅₯ β΅£ β΅£/β΅₯ 0.00064 0.001565
β΅β΅β΅£β΅ β΅β΅ β΅β΅ β΅β΅/β΅β΅ 0.0000012 0.0000695
5.3. Experimental results of proposed approach
Error correction is now realized using the proposed approach (NLP). Using (9), we found the
following candidate words for the correction of "β΅β΅β΅β΅": β΅β΅β΅β΅, β΅β΅β΅₯β΅, β΅β΅β΅£β΅ and β΅β΅β΄³β΅. For the
correction of "β΅β΅", we have three candidate words: β΄³β΅, β΅β΅ and β΅β΅. Now we look for the positions of the
most frequently used word pairs, shown in Table 7.
Table 7. Frequency of two successive correct words
Words Frequency
β΅’β΄°β΅ β΅β΅β΄±β΄° 6.895301738404073E-4
β΅β΅β΅β΅β΅ β΅β΅ 9.85043105486296E-5
According to the results in Table 7, "β΅’β΄°β΅ β΅β΅β΄±β΄°" is the most common combination,
so the position returned by (15) is 1. This is the correction center and the position of the word "β΅’β΄°β΅". Using
the correction on the left, we correct the words that are before the correction center with (20). The word "β΅β΅β΅β΅"
does not exist in the dictionary, so we calculate the frequency of all the candidate words close to it using (20)
with pos = 1 (i.e., the frequency of words close to "β΅β΅β΅β΅" followed by "β΅’β΄°β΅ β΅β΅β΄±β΄°"), as shown in Table 8.
Table 8. Frequency of each word among the candidate words of ββ΅β΅β΅β΅β
Candidate word Frequency
β΅β΅β΅β΅ 0.0
β΅β΅β΅₯β΅ 6.855846505486296E-9
β΅β΅β΅£β΅ 8.855254505486296E-7
According to the results indicated in Table 8, we conclude that the correct word is "β΅β΅β΅£β΅"; instead of
keeping "β΅β΅β΅β΅", we replace it with "β΅β΅β΅£β΅". Now we move to the right correction: the word
"β΅β΅" does not exist in the dictionary, so we compute the frequency of all the candidate words close to it using
(21) with pos = 1 (i.e., the frequency of words close to "β΅β΅" preceded by "β΅β΅β΅£β΅ β΅’β΄°β΅ β΅β΅β΄±β΄°"), as shown in
Table 9.
Table 9. Frequency of each word among the candidate words of ββ΅β΅β
Word Frequency
β΅β΅ 8.31651623321656E-8
β΄³β΅ 0.0
β΅β΅ 0.0
From the results indicated in Table 9, we can conclude that the correct word is "β΅β΅"; instead of
keeping "β΅β΅", we replace it with "β΅β΅". The word "β΅β΅β΅β΅β΅" exists in the dictionary; its frequency,
knowing that it is preceded by "β΅β΅β΅£β΅ β΅’β΄°β΅ β΅β΅β΄±β΄° β΅β΅", is 1.3558880795743596E-6, not null, so the word
"β΅β΅β΅β΅β΅" is correct. Similarly, the word "β΅β΅" exists in the dictionary; its frequency, knowing that it is
preceded by "β΅β΅β΅£β΅ β΅’β΄°β΅ β΅β΅β΄±β΄° β΅β΅ β΅β΅β΅β΅β΅", is 1.3558880795743596E-6, not null, so the word "β΅β΅" is correct.
Similarly, the word "β΅β΅" exists in the dictionary; its frequency, knowing that it is preceded by "β΅β΅β΅£β΅
[11] H. Modi and M. C., "A review on optical character recognition techniques," International Journal of Computer Applications, vol. 160, no. 6, pp. 20–24, Feb. 2017, doi: 10.5120/ijca2017913061.
[12] A. F. Mollah, N. Majumder, S. Basu, and M. Nasipuri, "Design of an optical character recognition system for camera-based handheld devices," IJCSI International Journal of Computer Science, vol. 8, no. 4, pp. 283–289, 2011.
[13] J. Lázaro, J. L. Martín, J. Arias, A. Astarloa, and C. Cuadrado, "Neuro semantic thresholding using OCR software for high precision OCR applications," Image and Vision Computing, vol. 28, no. 4, pp. 571–578, 2010.
[14] N. Islam, Z. Islam, and N. Noor, "A survey on optical character recognition system," arXiv:1710.05703 [cs], Oct. 2017. Accessed: Feb. 07, 2022. [Online]. Available: http://arxiv.org/abs/1710.05703
[15] W. B. Lund, D. J. Kennard, and E. K. Ringger, "Combining multiple thresholding binarization values to improve OCR output," in Document Recognition and Retrieval XX, 2013, vol. 8658, p. 86580R.
[16] K. M. Yindumathi, S. S. Chaudhari, and R. Aparna, "Analysis of image classification for text extraction from bills and invoices," in 2020 11th International Conference on Computing, Communication and Networking Technologies (ICCCNT), Jul. 2020, pp. 1–6, doi: 10.1109/ICCCNT49239.2020.9225564.
[17] R. Deepa and K. N. Lalwani, "Image classification and text extraction using machine learning," in 2019 3rd International Conference on Electronics, Communication and Aerospace Technology (ICECA), Jun. 2019, pp. 680–684, doi: 10.1109/ICECA.2019.8821936.
[18] D. Saravanan and D. Joseph, "Image data extraction using image similarities," 2019, pp. 409–420, doi: 10.1007/978-981-13-1906-8_43.
[19] A. F. Mollah, S. Basu, N. Das, R. Sarkar, M. N. Nasipuri, and M. Kundu, "Binarizing business card images for mobile devices," in Int. Conf. on Advances in Computer Vision and IT, pp. 968–975, 2009.
[20] M.-H. Yang and N. Ahuja, "Recognizing hand gesture using motion trajectories," 2001, pp. 53–81, doi: 10.1007/978-1-4615-1423-7_3.
[21] S. Sahoo, N. Dash, and P. Sahoo, "Word extraction from speech recognition using correlation coefficients," International Journal of Computer Applications, vol. 51, no. 13, pp. 21–25, Aug. 2012, doi: 10.5120/8102-1694.
[22] M. D. Kernighan, K. Church, and W. A. Gale, "A spelling correction program based on a noisy channel model," in Proceedings of the 13th International Conference on Computational Linguistics, 1990, pp. 205–210.
[23] E. Mays, F. J. Damerau, and R. L. Mercer, "Context based spelling correction," Information Processing & Management, vol. 27, no. 5, pp. 517–522, Jan. 1991, doi: 10.1016/0306-4573(91)90066-U.
[24] D. Jurafsky and J. H. Martin, Speech and Language Processing (3rd ed. draft), 2009.
[25] H. Lane, C. Howard, and H. M. Hapke, Natural Language Processing in Action: Understanding, Analyzing, and Generating Text with Python. Shelter Island, NY: Simon and Schuster, 2019.
BIOGRAPHIES OF AUTHORS
Otman Maarouf received his master's degree in business intelligence in 2018
from the Faculty of Science and Technology, University Sultan Moulay Slimane, Beni-Mellal.
He is currently a Ph.D. student. His research activities lie in the area of natural
language processing; specifically, they deal with part-of-speech tagging, named entity
recognition, and lemmatization and stemming of the Amazigh language. He can be contacted
at email: maarouf.otman94@gmail.com.
Rachid El Ayachi obtained a Master's degree in Informatics, Telecom and
Multimedia (ITM) in 2006 from the Faculty of Sciences, Mohammed V University
(Morocco) and a Ph.D. degree in computer science from the Faculty of Science and
Technology, Sultan Moulay Slimane University (Morocco). He is currently a member of
the TIAD laboratory and a professor at the Faculty of Science and Technology, Sultan Moulay
Slimane University, Morocco. His research focuses on image processing, pattern recognition,
machine learning, and the semantic web. He can be contacted at email: rachid.elayachi@usms.ma.
Mohamed Biniz received his master's degree in business intelligence in 2014
and a Ph.D. degree in computer science in 2018 from the Faculty of Science and Technology,
University Sultan Moulay Slimane, Beni-Mellal. He is a professor at the Polydisciplinary
Faculty, University Sultan Moulay Slimane, Beni Mellal, Morocco. His research activities lie
in the areas of semantic web engineering and deep learning; specifically, they deal with
ontology evolution, big data, natural language processing, and dynamic programming.
He can be contacted at email: mohamedbiniz@gmail.com.