The document proposes a new approach called partial character reconstruction for segmenting characters in license plate images to improve license plate recognition performance. It introduces using angular information and stroke width properties in different domains to segment characters and then reconstruct their complete shapes for recognition. Experimental results on several benchmark license plate databases and video databases show the technique is effective in handling images affected by multiple challenges.
This is a VTU final-year project report; the full report is attached below, along with the front pages. The front pages follow the guidelines specified by VTU for our college.
Abstract. With chatbots gaining traction and their adoption growing in different verticals, e.g. Health, Banking, Dating; and users sharing more and more private information with chatbots — studies have started to highlight the privacy risks of chatbots. In this paper, we propose two privacy-preserving approaches for chatbot conversations. The first approach applies ‘entity’ based privacy filtering and transformation, and can be applied directly on the app (client) side. It however requires knowledge of the chatbot design to be enabled. We present a second scheme based on Searchable Encryption that is able to preserve user chat privacy, without requiring any knowledge of the chatbot design. Finally, we present some experimental results based on a real-life employee Help Desk chatbot that validates both the need and feasibility of the proposed approaches.
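The first, entity-based approach can be pictured as client-side redaction before a message ever leaves the app. The sketch below is a minimal illustration only: the entity patterns and placeholder tokens are invented, and a real deployment would reuse the chatbot's own entity definitions, which this approach requires knowing.

```python
import re

# Hypothetical entity patterns; a real client would load these from
# the chatbot's entity definitions.
ENTITY_PATTERNS = {
    "EMAIL": re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+"),
    "PHONE": re.compile(r"\b\d{3}[-.\s]?\d{3}[-.\s]?\d{4}\b"),
    "SSN": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
}

def filter_message(text):
    """Replace each detected entity with a typed placeholder and return
    the filtered text plus a client-local mapping, so entities could be
    restored in the bot's reply without ever reaching the server."""
    mapping = {}
    for label, pattern in ENTITY_PATTERNS.items():
        for i, match in enumerate(pattern.findall(text)):
            token = f"<{label}_{i}>"
            mapping[token] = match
            text = text.replace(match, token, 1)
    return text, mapping

filtered, kept = filter_message("My SSN is 123-45-6789, mail me at bob@example.com")
# filtered: "My SSN is <SSN_0>, mail me at <EMAIL_0>"
```

The mapping stays on the client, which is what makes the scheme privacy-preserving; the second, Searchable-Encryption scheme avoids even needing the pattern list.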
Implementing data-driven decision support system based on independent educati... (IJECEIAES)
Decision makers in the educational field always seek new technologies and tools that provide solid, fast answers to support the decision-making process. They need a platform that utilizes students' academic data and turns it into knowledge for making the right strategic decisions. In this paper, a roadmap for implementing a data-driven decision support system (DSS) based on an educational data mart is presented. The independent data mart is built on students' marks in 8 subjects at a private school (AlIskandaria Primary School in Basrah province, Iraq). The DSS implementation roadmap starts from pre-processing the paper-based data source and ends with providing three categories of online analytical processing (OLAP) queries: multidimensional OLAP, desktop OLAP and web OLAP. A key performance indicator (KPI) is implemented as an essential part of the educational DSS to measure school performance. A static evaluation shows that the proposed DSS satisfies privacy, security and performance requirements, with no errors found after inspecting the DSS knowledge base. The evaluation shows that a data-driven DSS based on an independent data mart with KPIs and OLAP is one of the best platforms to support short- to long-term academic decisions.
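The KPI idea can be sketched by rolling per-subject marks from a (hypothetical) student fact table up into an average and a pass rate, the kind of measure an OLAP query over the data mart would serve. The subjects, marks and pass threshold below are invented for illustration.

```python
# Hypothetical rows from a student-marks fact table: (student, subject, mark)
facts = [
    ("s1", "Math", 78), ("s1", "Science", 42),
    ("s2", "Math", 55), ("s2", "Science", 90),
    ("s3", "Math", 35), ("s3", "Science", 60),
]

PASS_MARK = 50  # assumed threshold

def subject_kpis(rows):
    """Roll the fact rows up by subject into an average mark and a
    pass rate, analogous to an OLAP aggregation feeding a KPI view."""
    by_subject = {}
    for _student, subject, mark in rows:
        by_subject.setdefault(subject, []).append(mark)
    return {
        s: {"avg": sum(m) / len(m),
            "pass_rate": sum(x >= PASS_MARK for x in m) / len(m)}
        for s, m in by_subject.items()
    }

kpis = subject_kpis(facts)
# Math: avg 56.0, pass_rate 2/3; Science: avg 64.0, pass_rate 2/3
```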
INTELLIGENT SOCIAL NETWORKS MODEL BASED ON SEMANTIC TAG RANKING (IJWEST)
Social networks have become one of the most popular platforms allowing users to communicate and share their interests without being in the same geographical location. The great and rapid growth of social media sites such as Facebook, LinkedIn, Twitter, etc. produces a huge amount of user-generated content. Thus, improving information quality and integrity becomes a great challenge for all social media sites, so that users can get the desired content or be linked to the best relation using improved search/link techniques. Introducing semantics to social networks will therefore widen the representation of the social network. In this paper, a new model of social networks based on semantic tag ranking is introduced. The model is based on the concept of multi-agent systems. In the proposed model, the representation of social links is extended with the semantic relationships found in the vocabularies known as tags in most social networks. The proposed model for the social media engine is based on enhanced Latent Dirichlet Allocation (E-LDA) as a semantic indexing algorithm, combined with Tag Rank as a social network ranking algorithm. The improvement in the E-LDA phase is achieved by running the LDA algorithm with optimal parameters, after which a filter is introduced to enhance the final indexing output. In the ranking phase, applying Tag Rank to the indexing output improves the ranking. Simulation results of the proposed model show improvements in both indexing and ranking output.
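The ranking phase can be pictured with a PageRank-style iteration over a directed tag graph. This is a generic sketch of the idea, not the paper's Tag Rank implementation, and the tiny three-tag graph is invented.

```python
def tag_rank(graph, damping=0.85, iters=50):
    """PageRank-style scores over a directed tag graph given as
    {tag: [tags it links to]}. Dangling tags spread rank uniformly,
    so total rank mass stays 1 across iterations."""
    nodes = list(graph)
    n = len(nodes)
    rank = {t: 1.0 / n for t in nodes}
    for _ in range(iters):
        new = {t: (1.0 - damping) / n for t in nodes}
        for t, outs in graph.items():
            share = rank[t] / len(outs) if outs else rank[t] / n
            targets = outs if outs else nodes
            for u in targets:
                new[u] += damping * share
        rank = new
    return rank

graph = {"music": ["rock", "jazz"], "rock": ["music"], "jazz": ["music"]}
scores = tag_rank(graph)
# "music" receives links from both "rock" and "jazz", so it ranks highest
```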
Clustering Prediction Techniques in Defining and Predicting Customers Defecti... (IJECEIAES)
With the growth of the e-commerce sector, customers have more choices, which encourages them to divide their purchases among several e-commerce sites and compare competitors' products, increasing the risk of churn. A review of the literature on customer churn models reveals that no prior research had considered both partial and total defection in non-contractual online environments; studies focused on either total or partial defection alone. This study proposes a customer churn prediction model in an e-commerce context, in which a clustering phase based on the integration of the k-means method and the Length-Recency-Frequency-Monetary (LRFM) model is employed to define churn, followed by a multi-class prediction phase based on three classification techniques: a simple decision tree, artificial neural networks and a decision tree ensemble. The dependent variable classifies a particular customer as continuing loyal buying patterns (non-churned), a partial defector (partially churned), or a total defector (totally churned). Macro-averaging measures, including average accuracy and macro-averages of precision, recall and F-1, are used to evaluate classifier performance under 10-fold cross-validation. Using real data from an online store, the results show the decision tree ensemble is more effective than the other models in identifying both future partial and total defection.
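The clustering phase rests on LRFM variables computed per customer. The sketch below derives them from a toy transaction log (dates and amounts are invented); the resulting vectors are what k-means would then cluster to define the churn classes.

```python
from datetime import date

# Hypothetical transaction log: customer -> [(purchase date, amount), ...]
transactions = {
    "c1": [(date(2024, 1, 5), 30.0), (date(2024, 3, 20), 45.0)],
    "c2": [(date(2024, 3, 28), 12.5)],
}

def lrfm(history, today):
    """Length: days from first to last purchase; Recency: days since the
    last purchase; Frequency: number of purchases; Monetary: total spend."""
    dates = sorted(d for d, _ in history)
    return (
        (dates[-1] - dates[0]).days,   # L
        (today - dates[-1]).days,      # R
        len(history),                  # F
        sum(a for _, a in history),    # M
    )

features = {c: lrfm(h, date(2024, 4, 1)) for c, h in transactions.items()}
# c1 -> (75, 12, 2, 75.0), c2 -> (0, 4, 1, 12.5)
```

In practice the four components would be normalised before clustering, since k-means is sensitive to scale.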
Immune-Inspired Method for Selecting the Optimal Solution in Semantic Web Ser... (IJwest)
The increasing interest in developing efficient and effective optimization techniques has led researchers to turn their attention towards biology. Biology offers many clues for designing novel optimization techniques: such approaches exhibit self-organizing capabilities and can reach promising solutions without a central coordinator. In this paper we handle the problem of dynamic web service composition using the clonal selection algorithm. To assess the optimality of a given composition, we use the QoS attributes of the services involved in the workflow as well as the semantic similarity between these components. The experimental evaluation shows that the proposed approach performs better than other approaches such as the genetic algorithm.
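A minimal clonal selection loop on a toy one-dimensional objective makes the mechanism concrete: better antibodies receive more clones with smaller mutations, and the worst are replaced by random newcomers. All parameters and the objective are illustrative, not the paper's service-composition setting.

```python
import random

def clonal_selection(fitness, lo, hi, pop_size=20, generations=60, seed=1):
    """Minimise `fitness` over [lo, hi] with a basic CLONALG-style loop."""
    rng = random.Random(seed)
    pop = [rng.uniform(lo, hi) for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=fitness)
        clones = []
        for rank, ab in enumerate(pop[: pop_size // 2]):
            n_clones = pop_size // (rank + 2)      # better rank -> more clones
            step = (hi - lo) * 0.05 * (rank + 1)   # better rank -> smaller mutation
            for _ in range(n_clones):
                clones.append(min(hi, max(lo, ab + rng.gauss(0.0, step))))
        # Keep elites plus the best clones, refresh the tail with newcomers
        pop = sorted(pop[:2] + clones, key=fitness)[: pop_size - 3]
        pop += [rng.uniform(lo, hi) for _ in range(3)]
    return min(pop, key=fitness)

best = clonal_selection(lambda x: (x - 3.0) ** 2, -10.0, 10.0)
# elitism guarantees the best antibody only improves across generations
```

In the paper's setting, the fitness would combine QoS attributes and semantic similarity of a candidate composition rather than a simple analytic function.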
Voice Based Search Engine for Visually Impaired People (IJASRD Journal)
The World Wide Web (WWW) is rapidly emerging as the accepted information source for our society. The WWW is normally accessed using a web-browsing application on a networked computer, and the layout of information on the web is visually oriented. This reliance on visual presentation places high cognitive demands on the person operating such a system, and the interaction may at times require the user's full attention. This approach is not practical, in particular for visually impaired people. The focus of this work is to develop a prototype that supports web browsing through a speech-based interface, e.g. a telephone, and to measure its effectiveness. Command input and delivery of web content are entirely in voice. Audio icons are built into the prototype so that users gain a better understanding of the original structure and purpose of a web page, and navigation and control commands are available to enhance the browsing experience. The effectiveness of the prototype is evaluated in a user study involving both normally sighted and visually impaired people. Voice browsers allow people to access the Web using speech synthesis, pre-recorded audio, and speech recognition, which may be supplemented by keypads and small displays. Voice may also be offered as an adjunct to standard desktop browsers with high-resolution graphical displays, providing an accessible alternative to the keyboard or screen, for example in cars where hands-free/eyes-free operation is essential. Voice interaction can escape the physical limitations of keypads and displays as mobile devices become ever smaller.
The browser has an integrated text extraction engine that inspects the content of a page to build a structured representation. The inner nodes of the structure represent various levels of abstraction of the content. This enables simple and flexible navigation of the page so the user can rapidly home in on items of interest.
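To illustrate the extraction-engine idea, the sketch below uses Python's stdlib HTML parser to collect a page's heading outline, a crude stand-in for the multi-level structure a voice browser could read out and navigate. It is a simplification, not the prototype's actual engine.

```python
from html.parser import HTMLParser

class OutlineParser(HTMLParser):
    """Collect (level, text) pairs for h1-h6 headings as a simple
    multi-level abstraction of a page's content."""
    def __init__(self):
        super().__init__()
        self.outline = []
        self._level = None

    def handle_starttag(self, tag, attrs):
        if len(tag) == 2 and tag[0] == "h" and tag[1].isdigit():
            self._level = int(tag[1])

    def handle_data(self, data):
        if self._level is not None and data.strip():
            self.outline.append((self._level, data.strip()))

    def handle_endtag(self, tag):
        if self._level is not None and tag == f"h{self._level}":
            self._level = None

page = "<h1>News</h1><p>story text</p><h2>Sports</h2><h2>Weather</h2>"
parser = OutlineParser()
parser.feed(page)
# parser.outline == [(1, "News"), (2, "Sports"), (2, "Weather")]
```

A speech interface could then announce the level-1 entries first and descend only on request, which is the "home in on items of interest" behaviour the text describes.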
A Study on the Applications and Impact of Artificial Intelligence in E Commer... (ijtsrd)
Trends in computer science show that various aspects of Artificial Intelligence are emerging, and that these advances are being applied to create intelligent information systems. Artificial intelligence is changing the ways in which computers are usable as problem-solving tools; the human talent for smartly creating and operating tools is indeed a feature of human intelligence. This technology is now adopted by various E-Commerce websites in order to identify customer preferences, previous purchases, frequent checks, etc. Google and Microsoft are also investing in artificial intelligence in various forms in order to provide better customer service. The main aim of the study is to analyse and explore the various applications and impact of artificial intelligence in the E-Commerce industry. The study concludes that replacing human experts with artificial intelligence systems in the E-Commerce industry can significantly speed up and cheapen the production or service process. Prof. Lakshmi Narayan. N | Naveena. N, "A Study on the Applications and Impact of Artificial Intelligence in E-Commerce Industry", International Journal of Trend in Scientific Research and Development (ijtsrd), ISSN: 2456-6470, Volume-3, Issue-5, August 2019. URL: https://www.ijtsrd.com/papers/ijtsrd26374.pdf Paper URL: https://www.ijtsrd.com/computer-science/artificial-intelligence/26374/a-study-on-the-applications-and-impact-of-artificial-intelligence-in-e-commerce-industry/prof-lakshmi-narayan-n
Multi-objective NSGA-II based community detection using dynamical evolution s... (IJECEIAES)
Community detection has become a highly demanded topic in social-networking-based applications. It involves finding the maximally intra-connected and minimally inter-connected sub-graphs in a given social network. Many approaches have been developed for community detection, but few of them focus on the dynamical aspect of the social network. The community decision has to consider the pattern of changes in the social network and be smooth enough to enable stable operation of applications that depend on community detection. Unlike existing dynamical community detection algorithms, this article presents a non-domination-aware searching algorithm designated non-dominated sorting based community detection with dynamical awareness (NDS-CD-DA). The algorithm uses the non-dominated sorting genetic algorithm NSGA-II with two objectives: modularity and normalized mutual information (NMI). Experimental results on synthetic networks and real-world social network datasets, compared with a classical genetic algorithm with a single objective, show superiority in terms of both domination and convergence. NDS-CD-DA achieves a domination percentage of 100% over dynamic evolutionary community searching (DECS) for almost all iterations.
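The non-dominated sorting at the heart of NSGA-II can be sketched independently of the community-detection setting. Below, candidate solutions are scored on two objectives to be maximised (standing in for modularity and NMI) and split into Pareto fronts; the points are invented.

```python
def dominates(a, b):
    """a dominates b if it is no worse in every objective (maximising)
    and strictly better in at least one."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))

def non_dominated_fronts(points):
    """Return points grouped into Pareto fronts, best front first.
    O(n^2) per front; NSGA-II uses a faster bookkeeping variant."""
    remaining = list(points)
    fronts = []
    while remaining:
        front = [p for p in remaining
                 if not any(dominates(q, p) for q in remaining if q is not p)]
        fronts.append(front)
        remaining = [p for p in remaining if p not in front]
    return fronts

# (modularity, NMI)-style scores for five candidate partitions
scores = [(0.9, 0.2), (0.5, 0.5), (0.2, 0.9), (0.4, 0.4), (0.1, 0.1)]
fronts = non_dominated_fronts(scores)
# front 0 holds the three mutually non-dominated trade-off points
```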
There are essential security considerations in the systems used by semiconductor companies like TI. Along with other semiconductor companies, TI has recognized that IT security is crucial throughout the system development life cycle (SDLC) followed by its web application developers. The challenges faced by TI web developers were consolidated via questionnaires, starting with how risk management and secure coding can be reinforced in the SDLC, and how to achieve IT security, project management (PM) and SDLC initiatives by developing a prototype, which was then evaluated against the aforementioned goals. This study aimed to put NIST strategies into practice by integrating risk management checkpoints into the SDLC; to enforce secure coding using a static code analysis tool by developing a prototype application mapped to IT security, project management and SDLC initiatives; and to evaluate the impact of the proposed solution. The paper discusses how SecureTI was able to satisfy IT security requirements in the SDLC and PM phases.
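A toy illustration of the static-code-analysis step: walking a Python AST to flag calls that secure-coding rules commonly ban. The deny-list is invented for the sketch; SecureTI's actual tool and rule set are not described here.

```python
import ast

# Hypothetical deny-list a secure-coding rule set might enforce
BANNED_CALLS = {"eval", "exec"}

def find_violations(source):
    """Return (line number, call name) for each call to a banned builtin,
    found by walking the parsed AST rather than matching raw text."""
    violations = []
    for node in ast.walk(ast.parse(source)):
        if (isinstance(node, ast.Call)
                and isinstance(node.func, ast.Name)
                and node.func.id in BANNED_CALLS):
            violations.append((node.lineno, node.func.id))
    return violations

sample = "x = input()\nresult = eval(x)\n"
issues = find_violations(sample)
# issues == [(2, "eval")]
```

A risk-management checkpoint in the SDLC could gate a merge on this kind of report being empty.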
Single object detection to support requirements modeling using faster R-CNN (TELKOMNIKA Journal)
Requirements engineering (RE) is one of the most important phases of a software engineering project: it lays the foundation of a software product, and objectives, assumptions, and functional and non-functional needs are analyzed and consolidated. Many modeling notations and tools have been developed to model the information gathered in the RE process; one popular framework is iStar 2.0. Despite the frameworks and notations available, many engineers still find it easier to draw the diagrams by hand. A problem arises when the corresponding diagram needs to be updated as requirements evolve. This research aims to kickstart the development of a modeling tool that uses a Faster Region-based Convolutional Neural Network for single object detection and recognition of hand-drawn iStar 2.0 objects, together with Gleam grayscale conversion and salt-and-pepper noise, to digitalize hand-drawn diagrams. The single object detection and recognition tool was evaluated and shows promising results: 95% overall accuracy and precision, 100% recall, and an F-1 score of 97.2%.
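Salt-and-pepper noise, used here in preprocessing, can be sketched on a grayscale image held as nested lists: a fraction of pixels is flipped to pure black or pure white. The noise ratio and image are illustrative.

```python
import random

def salt_and_pepper(image, amount=0.2, seed=0):
    """Flip `amount` of the pixels to 0 (pepper) or 255 (salt), half each,
    in a grayscale image given as a list of rows; returns a new image."""
    rng = random.Random(seed)
    h, w = len(image), len(image[0])
    noisy = [row[:] for row in image]
    coords = [(r, c) for r in range(h) for c in range(w)]
    # sample without replacement so each noisy pixel is distinct
    for i, (r, c) in enumerate(rng.sample(coords, int(amount * h * w))):
        noisy[r][c] = 255 if i % 2 == 0 else 0
    return noisy

img = [[128] * 10 for _ in range(10)]
noisy = salt_and_pepper(img)
# 20 of the 100 pixels become 0 or 255; the other 80 stay 128
```

Training the detector on such corrupted copies helps it tolerate the specks and smudges typical of photographed hand-drawn diagrams.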
ARTIFICIAL INTELLIGENCE TECHNIQUES FOR THE MODELING OF A 3G MOBILE PHONE BASE... (ijaia)
The principal objective of this work is to use artificial intelligence techniques to design a predictive model of the performance of a third-generation (3G) mobile phone radio base station (RBS), based on the analysis of KPIs obtained from a statistical data set of the daily behaviour of an RBS. Various techniques, such as decision trees, neural networks and random forests, were used to build these models, allowing faster progress in the deep analysis of large amounts of statistical data and better results. The data describe the behaviour of a 3G radio base station of the Claro operator in Ecuador. For this practical case, several models were generated using various artificial intelligence techniques to predict the performance of a 3G radio base station; after several tests, these led to a predictive model that determines the performance of the radio base. The work concludes that developing a predictive model based on artificial intelligence techniques is very useful for analysing large amounts of data in order to find or predict complex results more quickly and reliably. The data are KPIs of the daily and hourly performance of a 3G radio base station, obtained through the operator's remote monitoring and management tool, Sure call PRS.
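To make the modelling step concrete, here is a minimal decision-stump learner (a one-split decision tree) on invented KPI rows; the study itself used full decision trees, neural networks and random forests on the operator's RBS data.

```python
def best_stump(rows):
    """rows: (kpi_value, label) pairs. Find the threshold on the single
    KPI that best separates 'performing well' (1) from 'degraded' (0)."""
    best = (None, -1)
    for t in sorted({v for v, _ in rows}):
        correct = sum((v >= t) == bool(y) for v, y in rows)
        correct = max(correct, len(rows) - correct)  # allow flipped polarity
        if correct > best[1]:
            best = (t, correct)
    return best

# Hypothetical (call-success-rate %, performing-well?) samples
data = [(99.1, 1), (98.7, 1), (97.9, 1), (95.2, 0), (93.8, 0), (96.4, 0)]
threshold, correct = best_stump(data)
# a cut at 97.9 classifies all six samples correctly
```

A full decision tree repeats this split search recursively over many KPIs; a random forest averages many such trees built on resampled data.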
MITIGATION TECHNIQUES TO OVERCOME DATA HARM IN MODEL BUILDING FOR ML (ijaia)
Given the impact of Machine Learning (ML) on individuals and society, understanding how harm might occur throughout the ML life cycle has become more critical than ever. By offering a framework to identify distinct potential sources of downstream harm in the ML pipeline, the paper demonstrates the importance of the choices made throughout the distinct phases of data collection, development, and deployment, which extend far beyond model training. Relevant mitigation techniques are also suggested, rather than merely relying on generic notions of what counts as fairness.
Mining knowledge graphs to map heterogeneous relations between the internet o... (IJECEIAES)
Patterns for the internet of things (IoT), which represent proven solutions used to solve design problems in the IoT, are numerous. Similar to object-oriented design patterns, these IoT patterns have multiple mutual heterogeneous relationships. However, these pattern relationships are hidden and largely unidentified in most documents. In this paper, we use machine learning techniques to automatically mine knowledge graphs that map the relationships between several IoT patterns. The end result is a semantic knowledge graph database which stores patterns as vertices and their relations as edges. We have identified four main relationships between IoT patterns: a pattern is similar to another pattern if it addresses the same use-case problem; a large-scale pattern uses a small-scale pattern in a lower-level layer; a large pattern is composed of multiple smaller-scale patterns underneath it; and patterns complement and combine with each other to resolve a given use-case problem. Our results show promising prospects for using machine learning techniques to generate an automated repository that organises IoT patterns, which are usually extracted at various levels of abstraction and granularity.
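The resulting database can be pictured as a labeled directed graph. The sketch below stores pattern relations as edge triples and answers a simple query; the pattern names are invented stand-ins, while the relation labels mirror the four relation types identified above.

```python
# Edge triples (source pattern, relation, target pattern); names are invented
edges = [
    ("DeviceGateway", "uses", "DeviceRegistry"),
    ("DeviceShadow", "similar_to", "DigitalTwin"),
    ("EdgeAnalytics", "composed_of", "StreamFilter"),
    ("DeviceGateway", "complements", "EdgeAnalytics"),
]

def related(pattern, relation=None):
    """Return target patterns linked from `pattern`, optionally
    restricted to one of the four relation types."""
    return [dst for src, rel, dst in edges
            if src == pattern and (relation is None or rel == relation)]

all_links = related("DeviceGateway")      # ['DeviceRegistry', 'EdgeAnalytics']
uses = related("DeviceGateway", "uses")   # ['DeviceRegistry']
```

A real deployment would use a graph database with typed edges, but the vertices-plus-labeled-edges shape is the same.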
DATA AUGMENTATION TECHNIQUES AND TRANSFER LEARNING APPROACHES APPLIED TO FACI... (ijaia)
The facial expression is the first thing we pay attention to when we want to understand a person's state of mind. Thus, the ability to recognize facial expressions automatically is a very interesting research field. In this paper, because of the small size of available training datasets, we propose a novel data augmentation technique that improves performance on the recognition task. We apply geometric transformations and build GAN models from scratch that generate new synthetic images for each emotion type. We then fine-tune pretrained convolutional neural networks with different architectures on the augmented datasets. To measure the generalization ability of the models, we apply an extra-database protocol: we train models on the augmented versions of the training dataset and test them on two different databases. The combination of these techniques allows average accuracy values on the order of 85% for the InceptionResNetV2 model.
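The geometric part of the augmentation can be sketched on a tiny grayscale image stored as nested lists: each transform yields an extra training sample carrying the same emotion label. GAN-based generation is beyond a short sketch, so only the geometric half is shown, and the transforms chosen here are generic examples rather than the paper's exact set.

```python
def hflip(img):
    """Mirror the image left-right."""
    return [row[::-1] for row in img]

def rotate90(img):
    """Rotate the image 90 degrees clockwise."""
    return [list(row) for row in zip(*img[::-1])]

def augment(img):
    """Return the original plus two geometric variants, each of which
    keeps the original emotion label."""
    return [img, hflip(img), rotate90(img)]

face = [[1, 2],
        [3, 4]]
samples = augment(face)
# hflip -> [[2, 1], [4, 3]]; rotate90 -> [[3, 1], [4, 2]]
```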
MOVIE SUCCESS PREDICTION AND PERFORMANCE COMPARISON USING VARIOUS STATISTICAL... (ijaia)
Movies are among the most prominent contributors to the global entertainment industry today, and from a commercial standpoint they are among the biggest revenue generators. It is vital to divide films into two categories: successful and unsuccessful. To categorize the movies in this research, a variety of models were utilized, including regression models such as simple linear, multiple linear and logistic regression, techniques such as SVM and K-Means, time series analysis, and an artificial neural network. The models above were compared on a variety of factors, including their accuracy on the training, validation and testing datasets, the availability of new movie characteristics, and a variety of other statistical metrics. The study found that certain characteristics have a greater impact on the likelihood of a film's success than others: for example, the presence of the action genre may significantly affect the forecasts, while another genre, such as sport, may not. The testing dataset for the models and classifiers was taken from the IMDb website for the year 2020. The artificial neural network, with an accuracy of 86 percent, is the best-performing model of all those discussed.
Research trends on CAPTCHA: A systematic literature review (IJECEIAES)
The advent of technology has crept into virtually all sectors, culminating in automated processes that use the Internet to execute various tasks and actions. Web services have become the trend for solving mundane tasks. However, this development comes with the bottleneck of verifying the authenticity and intent of users. Providers of these web services, whether as a platform, software or infrastructure, use various human interaction proofs (HIPs) to validate the authenticity and intent of their users. The completely automated public Turing test to tell computers and humans apart (CAPTCHA), a form of IDS in web services, is advantageous here. Research into CAPTCHA can be grouped into two areas: CAPTCHA development and CAPTCHA recognition. Selective learning, convolutional neural networks (CNN) and deep convolutional neural networks (DCNN) have become emerging trends in both the development and recognition of CAPTCHAs. This paper critically reviews over fifty published articles that show current trends in CAPTCHA schemes, their development and recognition mechanisms, and the way forward for robust yet secure CAPTCHA development, to guide future research in the subject domain.
tool by developing a prototype application mapped with IT Security goals, project management and SDLC
initiatives and evaluation of the impact of the proposed solution. This paper discussed how SecureTI was
able to satisfy IT Security requirements in the SDLC and PM phases.
Single object detection to support requirements modeling using faster R-CNNTELKOMNIKA JOURNAL
Requirements engineering (RE) is one of the most important phases of a software engineering project in which the foundation of a software product is laid, objectives and assumptions, functional and non-functional needs are analyzed and consolidated. Many modeling notations and tools are developed to model the information gathered in the RE process, one popular framework is the iStar 2.0. Despite the frameworks and notations that are introduced, many engineers still find that drawing the diagrams is easier done manually by hand. Problem arises when the corresponding diagram needs to be updated as requirements evolve. This research aims to kickstart the development of a modeling tool using Faster Region-based Convolutional Neural Network for single object detection and recognition of hand-drawn iStar 2.0 objects, Gleam grayscale, and Salt and Pepper noise to digitalize hand-drawn diagrams. The single object detection and recognition tool is evaluated and displays promising results of an overall accuracy and precision of 95%, 100% for recall, and 97.2% for the F-1 score.
ARTIFICIAL INTELLIGENCE TECHNIQUES FOR THE MODELING OF A 3G MOBILE PHONE BASE...ijaia
The principal objective of this work is to be able to use artificial intelligence techniques to be able to
design a predictive model of the performance of a third-generation mobile phone base radio, using the
analysis of KPIs obtained in a statistical data set of the daily behaviour of an RBS. For the realization of
these models, various techniques such as Decision Trees, Neural Networks and Random Forest were used.
which will allow faster progress in the deep analysis of large amounts of data statistics and get better
results. In this part of the work, data was obtained from the behaviour of a third-party mobile phone base
radio generation of the Claro operator in Ecuador, it should be noted that. To specify this practical case,
several models were generated based on in various artificial intelligence technique for the prediction of
performance results of a mobile phone base radio of third generation, the same ones that after several tests
were creation of a predictive model that determines the performance of a mobile phone base radio. As a
conclusion of this work, it was determined that the development of a predictive model based on artificial
intelligence techniques is very useful for the analysis of large amounts of data in order to find or predict
complex results, more quickly and trustworthy. The data are KPIs of the daily and hourly performance of a
radio base of third generation mobile telephony, these data were obtained through the operator's remote
monitoring and management tool Sure call PRS.
MITIGATION TECHNIQUES TO OVERCOME DATA HARM IN MODEL BUILDING FOR MLijaia
Given the impact of Machine Learning (ML) on individuals and the society, understanding how harm might
be occur throughout the ML life cycle becomes critical more than ever. By offering a framework to
determine distinct potential sources of downstream harm in ML pipeline, the paper demonstrates the
importance of choices throughout distinct phases of data collection, development, and deployment that
extend far beyond just model training. Relevant mitigation techniques are also suggested for being used
instead of merely relying on generic notions of what counts as fairness.
Mining knowledge graphs to map heterogeneous relations between the internet o...IJECEIAES
Patterns for the internet of things (IoT) which represent proven solutions used to solve design problems in the IoT are numerous. Similar to objectoriented design patterns, these IoT patterns contain multiple mutual heterogeneous relationships. However, these pattern relationships are hidden and virtually unidentified in most documents. In this paper, we use machine learning techniques to automatically mine knowledge graphs to map these relationships between several IoT patterns. The end result is a semantic knowledge graph database which outlines patterns as vertices and their relations as edges. We have identified four main relationships between the IoT patterns-a pattern is similar to another pattern if it addresses the same use case problem, a large-scale pattern uses a small- scale pattern in a lower level layer, a large pattern is composed of multiple smaller scale patterns underneath it, and patterns complement and combine with each other to resolve a given use case problem. Our results show some promising prospects towards the use of machine learning techniques to generate an automated repository to organise the IoT patterns, which are usually extracted at various levels of abstraction and granularity.
DATA AUGMENTATION TECHNIQUES AND TRANSFER LEARNING APPROACHES APPLIED TO FACI...ijaia
The face expression is the first thing we pay attention to when we want to understand a person’s state of
mind. Thus, the ability to recognize facial expressions in an automatic way is a very interesting research
field. In this paper, because the small size of available training datasets, we propose a novel data
augmentation technique that improves the performances in the recognition task. We apply geometrical
transformations and build from scratch GAN models able to generate new synthetic images for each
emotion type. Thus, on the augmented datasets we fine tune pretrained convolutional neural networks with
different architectures. To measure the generalization ability of the models, we apply extra-database
protocol approach, namely we train models on the augmented versions of training dataset and test them on
two different databases. The combination of these techniques allows to reach average accuracy values of
the order of 85% for the InceptionResNetV2 model.
MOVIE SUCCESS PREDICTION AND PERFORMANCE COMPARISON USING VARIOUS STATISTICAL...ijaia
Movies are among the most prominent contributors to the global entertainment industry today, and they
are among the biggest revenue-generating industries from a commercial standpoint. It's vital to divide
films into two categories: successful and unsuccessful. To categorize the movies in this research, a variety
of models were utilized, including regression models such as Simple Linear, Multiple Linear, and Logistic
Regression, clustering techniques such as SVM and K-Means, Time Series Analysis, and an Artificial
Neural Network. The models stated above were compared on a variety of factors, including their accuracy
on the training and validation datasets as well as the testing dataset, the availability of new movie
characteristics, and a variety of other statistical metrics. During the course of this study, it was discovered
that certain characteristics have a greater impact on the likelihood of a film's success than others. For
example, the existence of the genre action may have a significant impact on the forecasts, although another
genre, such as sport, may not. The testing dataset for the models and classifiers has been taken from the
IMDb website for the year 2020. The Artificial Neural Network, with an accuracy of 86 percent, is the best
performing model of all the models discussed.
Research trends on CAPTCHA: A systematic literature IJECEIAES
The advent of technology has crept into virtually all sectors and this has culminated in automated processes making use of the Internet in executing various tasks and actions. Web services have now become the trend when it comes to providing solutions to mundane tasks. However, this development comes with the bottleneck of authenticity and intent of users. Providers of these Web services, whether as a platform, as a software or as an Infrastructure use various human interaction proof’s (HIPs) to validate authenticity and intent of its users. Completely automated public turing test to tell computer and human apart (CAPTCHA), a form of IDS in web services is advantageous. Research into CAPTCHA can be grouped into two -CAPTCHA development and CAPTCH recognition. Selective learning and convolutionary neural networks (CNN) as well as deep convolutionary neural network (DCNN) have become emerging trends in both the development and recognition of CAPTCHAs. This paper reviews critically over fifty article publications that shows the current trends in the area of the CAPTCHA scheme, its development and recognition mechanisms and the way forward in helping to ensure a robust and yet secure CAPTCHA development in guiding future research endeavor in the subject domain.
Projection Profile Based Number Plate Localization and Recognition csandit
This paper proposes algorithms to localize vehicle
number plates from natural background
images, to segment the characters from the localize
d number plates and to recognize the
segmented characters. The reported system is tested
on a dataset of 560 sample images
captured with different background under various il
luminations. The performance accuracy of
the proposed system has been calculated at each sta
ge, which is 97.1%, 95.4% and 95.72% for
localisation & extraction, character segmentation a
nd character recognition respectively. The
proposed method is also capable of localising and r
ecognising multiple number plates in
images.
PROJECTION PROFILE BASED NUMBER PLATE LOCALIZATION AND RECOGNITIONcscpconf
This paper proposes algorithms to localize vehicle number plates from natural background images to segment the characters from the localized number plates and to recognize the
segmented characters. The reported system is tested on a dataset of 560 sample images captured with different background under various illuminations. The performance accuracy of the proposed system has been calculated at each stage, which is 97.1%, 95.4% and 95.72% for
localisation & extraction, character segmentation and character recognition respectively. The proposed method is also capable of localising and recognising multiple number plates in images.
An Analysis of Various Deep Learning Algorithms for Image Processingvivatechijri
Various applications of image processing has given it a wider scope when it comes to data analysis.
Various Machine Learning Algorithms provide a powerful environment for training modules effectively to
identify various entities of images and segment the same accordingly. Rather one can observe that though the
image classifiers like the Support Vector Machines (SVM) or Random Forest Algorithms do justice to the task,
deep learning algorithms like the Artificial Neural Networks (ANN) and its subordinates, the very well-known
and extremely powerful Algorithm Convolution Neural Networks (CNN) can provide a new dimension to the
image processing domain. It has way higher accuracy and computational power for classifying images further
and segregating their various entities as individual components of the image working region. Major focus will
be on the Region Convolution Neural Networks (R-CNN) algorithm and how well it provides the pixel-level
segmentation further using its better successors like the Fast-Faster and Mask R-CNN versions.
License Plate Recognition using Morphological Operation. Amitava Choudhury
This paper describes an efficient technique of locating and
extracting license plate and recognizing each segmented
character. The proposed model can be subdivided into four
parts- Digitization of image, Edge Detection, Separation of
characters and Template Matching. In this work, we propose a
method which is based on morphological operations where
different Structuring Elements (SE) are used to maximally
eliminate non-plate region and enhance plate region.
Character segmentation is done using Connected Component
Analysis. Correlation based template matching technique is
used for recognition of characters. This system is
implemented using MATLAB7.4.0. The proposed system is
mainly applicable to Indian License Plates.
Character recognition of kannada text in scene images using neuralIAEME Publication
Character recognition in scene images is one of the most fascinating and challenging
areas of pattern recognition with various practical application potentials. It can contribute
immensely to the advancement of an automation process and can improve the interface
between man and machine in many applications. Some practical application potentials of
character recognition system are: reading aid for the blind, traffic guidance systems, tour
guide systems, location aware systems and many more. In this work, a novel method for
recognizing basic Kannada characters in natural scene images is proposed. The proposed
method uses zone wise horizontal and vertical profile based features of character images. The
method works in two phases. During training, zone wise vertical and horizontal profile based
features are extracted from training samples and neural network is trained. During testing, the
test image is processed to obtain features and recognized using neural network classifier. The
method has been evaluated on 490 Kannada character images captured from 2 Mega Pixels
cameras on mobile phones at various sizes 240x320, 600x800 and 900x1200, which contains
samples of different sizes, styles and with different degradations, and achieves an average
recognition accuracy of 92%. The system is efficient and insensitive to the variations in size
and font, noise, blur and other degradations.
Technique for recognizing faces using a hybrid of moments and a local binary...IJECEIAES
The face recognition process is widely studied, and the researchers made great achievements, but there are still many challenges facing the applications of face detection and recognition systems. This research contributes to overcoming some of those challenges and reducing the gap in the previous systems for identifying and recognizing faces of individuals in images. The research deals with increasing the precision of recognition using a hybrid method of moments and local binary patterns (LBP). The moment technique computed several critical parameters. Those parameters were used as descriptors and classifiers to recognize faces in images. The LBP technique has three phases: representation of a face, feature extraction, and classification. The face in the image was subdivided into variable-size blocks to compute their histograms and discover their features. Fidelity criteria were used to estimate and evaluate the findings. The proposed technique used the standard Olivetti Research Laboratory dataset in the proposed system training and recognition phases. The research experiments showed that adopting a hybrid technique (moments and LBP) recognized the faces in images and provide a suitable representation for identifying those faces. The proposed technique increases accuracy, robustness, and efficiency. The results show enhancement in recognition precision by 3% to reach 98.78%.
Kubernetes & AI - Beauty and the Beast !?! @KCD Istanbul 2024Tobias Schneck
As AI technology is pushing into IT I was wondering myself, as an “infrastructure container kubernetes guy”, how get this fancy AI technology get managed from an infrastructure operational view? Is it possible to apply our lovely cloud native principals as well? What benefit’s both technologies could bring to each other?
Let me take this questions and provide you a short journey through existing deployment models and use cases for AI software. On practical examples, we discuss what cloud/on-premise strategy we may need for applying it to our own infrastructure to get it to work from an enterprise perspective. I want to give an overview about infrastructure requirements and technologies, what could be beneficial or limiting your AI use cases in an enterprise environment. An interactive Demo will give you some insides, what approaches I got already working for real.
LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...DanBrown980551
Do you want to learn how to model and simulate an electrical network from scratch in under an hour?
Then welcome to this PowSyBl workshop, hosted by Rte, the French Transmission System Operator (TSO)!
During the webinar, you will discover the PowSyBl ecosystem as well as handle and study an electrical network through an interactive Python notebook.
PowSyBl is an open source project hosted by LF Energy, which offers a comprehensive set of features for electrical grid modelling and simulation. Among other advanced features, PowSyBl provides:
- A fully editable and extendable library for grid component modelling;
- Visualization tools to display your network;
- Grid simulation tools, such as power flows, security analyses (with or without remedial actions) and sensitivity analyses;
The framework is mostly written in Java, with a Python binding so that Python developers can access PowSyBl functionalities as well.
What you will learn during the webinar:
- For beginners: discover PowSyBl's functionalities through a quick general presentation and the notebook, without needing any expert coding skills;
- For advanced developers: master the skills to efficiently apply PowSyBl functionalities to your real-world scenarios.
Essentials of Automations: Optimizing FME Workflows with ParametersSafe Software
Are you looking to streamline your workflows and boost your projects’ efficiency? Do you find yourself searching for ways to add flexibility and control over your FME workflows? If so, you’re in the right place.
Join us for an insightful dive into the world of FME parameters, a critical element in optimizing workflow efficiency. This webinar marks the beginning of our three-part “Essentials of Automation” series. This first webinar is designed to equip you with the knowledge and skills to utilize parameters effectively: enhancing the flexibility, maintainability, and user control of your FME projects.
Here’s what you’ll gain:
- Essentials of FME Parameters: Understand the pivotal role of parameters, including Reader/Writer, Transformer, User, and FME Flow categories. Discover how they are the key to unlocking automation and optimization within your workflows.
- Practical Applications in FME Form: Delve into key user parameter types including choice, connections, and file URLs. Allow users to control how a workflow runs, making your workflows more reusable. Learn to import values and deliver the best user experience for your workflows while enhancing accuracy.
- Optimization Strategies in FME Flow: Explore the creation and strategic deployment of parameters in FME Flow, including the use of deployment and geometry parameters, to maximize workflow efficiency.
- Pro Tips for Success: Gain insights on parameterizing connections and leveraging new features like Conditional Visibility for clarity and simplicity.
We’ll wrap up with a glimpse into future webinars, followed by a Q&A session to address your specific questions surrounding this topic.
Don’t miss this opportunity to elevate your FME expertise and drive your projects to new heights of efficiency.
GraphRAG is All You need? LLM & Knowledge GraphGuy Korland
Guy Korland, CEO and Co-founder of FalkorDB, will review two articles on the integration of language models with knowledge graphs.
1. Unifying Large Language Models and Knowledge Graphs: A Roadmap.
https://arxiv.org/abs/2306.08302
2. Microsoft Research's GraphRAG paper and a review paper on various uses of knowledge graphs:
https://www.microsoft.com/en-us/research/blog/graphrag-unlocking-llm-discovery-on-narrative-private-data/
DevOps and Testing slides at DASA ConnectKari Kakkonen
My and Rik Marselis slides at 30.5.2024 DASA Connect conference. We discuss about what is testing, then what is agile testing and finally what is Testing in DevOps. Finally we had lovely workshop with the participants trying to find out different ways to think about quality and testing in different parts of the DevOps infinity loop.
Key Trends Shaping the Future of Infrastructure.pdfCheryl Hung
Keynote at DIGIT West Expo, Glasgow on 29 May 2024.
Cheryl Hung, ochery.com
Sr Director, Infrastructure Ecosystem, Arm.
The key trends across hardware, cloud and open-source; exploring how these areas are likely to mature and develop over the short and long-term, and then considering how organisations can position themselves to adapt and thrive.
Transcript: Selling digital books in 2024: Insights from industry leaders - T...BookNet Canada
The publishing industry has been selling digital audiobooks and ebooks for over a decade and has found its groove. What’s changed? What has stayed the same? Where do we go from here? Join a group of leading sales peers from across the industry for a conversation about the lessons learned since the popularization of digital books, best practices, digital book supply chain management, and more.
Link to video recording: https://bnctechforum.ca/sessions/selling-digital-books-in-2024-insights-from-industry-leaders/
Presented by BookNet Canada on May 28, 2024, with support from the Department of Canadian Heritage.
State of ICS and IoT Cyber Threat Landscape Report 2024 previewPrayukth K V
The IoT and OT threat landscape report has been prepared by the Threat Research Team at Sectrio using data from Sectrio, cyber threat intelligence farming facilities spread across over 85 cities around the world. In addition, Sectrio also runs AI-based advanced threat and payload engagement facilities that serve as sinks to attract and engage sophisticated threat actors, and newer malware including new variants and latent threats that are at an earlier stage of development.
The latest edition of the OT/ICS and IoT security Threat Landscape Report 2024 also covers:
State of global ICS asset and network exposure
Sectoral targets and attacks as well as the cost of ransom
Global APT activity, AI usage, actor and tactic profiles, and implications
Rise in volumes of AI-powered cyberattacks
Major cyber events in 2024
Malware and malicious payload trends
Cyberattack types and targets
Vulnerability exploit attempts on CVEs
Attacks on counties – USA
Expansion of bot farms – how, where, and why
In-depth analysis of the cyber threat landscape across North America, South America, Europe, APAC, and the Middle East
Why are attacks on smart factories rising?
Cyber risk predictions
Axis of attacks – Europe
Systemic attacks in the Middle East
Download the full report from here:
https://sectrio.com/resources/ot-threat-landscape-reports/sectrio-releases-ot-ics-and-iot-security-threat-landscape-report-2024/
JMeter webinar - integration with InfluxDB and GrafanaRTTS
Watch this recorded webinar about real-time monitoring of application performance. See how to integrate Apache JMeter, the open-source leader in performance testing, with InfluxDB, the open-source time-series database, and Grafana, the open-source analytics and visualization application.
In this webinar, we will review the benefits of leveraging InfluxDB and Grafana when executing load tests and demonstrate how these tools are used to visualize performance metrics.
Length: 30 minutes
Session Overview
-------------------------------------------
During this webinar, we will cover the following topics while demonstrating the integrations of JMeter, InfluxDB and Grafana:
- What out-of-the-box solutions are available for real-time monitoring JMeter tests?
- What are the benefits of integrating InfluxDB and Grafana into the load testing stack?
- Which features are provided by Grafana?
- Demonstration of InfluxDB and Grafana using a practice web application
To view the webinar recording, go to:
https://www.rttsweb.com/jmeter-integration-webinar
Accelerate your Kubernetes clusters with Varnish CachingThijs Feryn
A presentation about the usage and availability of Varnish on Kubernetes. This talk explores the capabilities of Varnish caching and shows how to use the Varnish Helm chart to deploy it to Kubernetes.
This presentation was delivered at K8SUG Singapore. See https://feryn.eu/presentations/accelerate-your-kubernetes-clusters-with-varnish-caching-k8sug-singapore-28-2024 for more details.
PHP Frameworks: I want to break free (IPC Berlin 2024)Ralf Eggert
In this presentation, we examine the challenges and limitations of relying too heavily on PHP frameworks in web development. We discuss the history of PHP and its frameworks to understand how this dependence has evolved. The focus will be on providing concrete tips and strategies to reduce reliance on these frameworks, based on real-world examples and practical considerations. The goal is to equip developers with the skills and knowledge to create more flexible and future-proof web applications. We'll explore the importance of maintaining autonomy in a rapidly changing tech landscape and how to make informed decisions in PHP development.
This talk is aimed at encouraging a more independent approach to using PHP frameworks, moving towards a more flexible and future-proof approach to PHP development.
PHP Frameworks: I want to break free (IPC Berlin 2024)
Novel character segmentation reconstruction approach for license plate recognition
Accepted Manuscript

A Novel Character Segmentation-Reconstruction Approach for License Plate Recognition

Vijeta Khare, Palaiahnakote Shivakumara, Chee Seng Chan, Tong Lu, Liang Kim Meng, Hon Hock Woon, Michael Blumenstein

PII: S0957-4174(19)30260-X
DOI: https://doi.org/10.1016/j.eswa.2019.04.030
Reference: ESWA 12612

To appear in: Expert Systems With Applications

Received date: 26 November 2018
Revised date: 2 March 2019
Accepted date: 16 April 2019

Please cite this article as: Vijeta Khare, Palaiahnakote Shivakumara, Chee Seng Chan, Tong Lu, Liang Kim Meng, Hon Hock Woon, Michael Blumenstein, A Novel Character Segmentation-Reconstruction Approach for License Plate Recognition, Expert Systems With Applications (2019), doi: https://doi.org/10.1016/j.eswa.2019.04.030

This is a PDF file of an unedited manuscript that has been accepted for publication. As a service to our customers we are providing this early version of the manuscript. The manuscript will undergo copyediting, typesetting, and review of the resulting proof before it is published in its final form. Please note that during the production process errors may be discovered which could affect the content, and all legal disclaimers that apply to the journal pertain.
Highlights

- We introduce partial character reconstruction to segment characters.
- Angular information is explored for finding spaces between characters.
- Stroke width properties in different domains are used for shape restoration.
- Experiments are conducted on benchmark databases to show effectiveness and usefulness.
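The third highlight refers to stroke width properties. The paper derives these in the Laplacian and gradient domains; as a much simpler, hedged illustration of the underlying idea (not the authors' actual formulation), the dominant stroke width of a binarized character can be approximated from the lengths of vertical foreground runs:

```python
import numpy as np

def run_lengths(binary):
    """Lengths of vertical foreground runs, collected column by column."""
    lengths = []
    for col in binary.T:
        padded = np.concatenate([[0], col.astype(int), [0]])
        d = np.diff(padded)
        starts = np.flatnonzero(d == 1)   # background -> foreground transitions
        ends = np.flatnonzero(d == -1)    # foreground -> background transitions
        lengths.extend(ends - starts)
    return np.array(lengths)

def stroke_width(binary):
    """Median vertical run length as a crude stroke-width estimate."""
    runs = run_lengths(binary)
    return float(np.median(runs)) if runs.size else 0.0
```

For a synthetic horizontal bar five pixels thick, every vertical run has length 5, so the estimate is exactly 5.0; on real characters the median damps the long runs along vertical strokes.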
A Novel Character Segmentation-Reconstruction Approach for License Plate Recognition

Vijeta Khare (a), Palaiahnakote Shivakumara (b), Chee Seng Chan (b), Tong Lu (c), Liang Kim Meng (d), Hon Hock Woon (d), and Michael Blumenstein (e)

(a) Department of Computer Science and Software Engineering, Concordia University, Montreal, Canada. Email: kharevijeta@gmail.com
(b) Faculty of Computer Systems and Information Technology, University of Malaya, Malaysia. Email: shiva@um.edu.my, hudempsk@yahoo.com, cs.chan@um.edu.my
(c) National Key Lab for Novel Software Technology, Nanjing University, Nanjing, China. Email: lutong@nju.edu.cn
(d) Advanced Informatics Lab, MIMOS Berhad, Kuala Lumpur, Malaysia. Email: liang.kimmeng@mimos.my, hockwoon.hon@mimos.my
(e) Faculty of Engineering and Information Technology, University of Technology Sydney, Australia. Email: Michael.Blumenstein@uts.edu.au
Author Statement

All the authors contributed equally.
Abstract

Developing an automatic license plate recognition system that can cope with multiple adverse factors is a challenging and interesting problem in the current scenario. In this paper, we introduce a new concept called partial character reconstruction to segment the characters of license plates and thereby enhance the performance of license plate recognition systems. Partial character reconstruction is proposed based on the characteristics of stroke width in the Laplacian and gradient domains in a novel way. This results in character components with incomplete shapes. The angular information of character components, determined by PCA and the major axis, is then studied by considering the regular spacing between characters and the aspect ratios of character components in a new way for segmenting characters. Next, the same stroke width properties are used for reconstructing the complete shape of each character in the gray domain rather than in the gradient domain, which helps in improving the recognition rate. Experimental results on benchmark license plate databases, namely, MIMOS, Medialab, UCSD data, Uninsubria data and Challenged data, as well as video databases, namely, ICDAR 2015 and YVT video, and natural scene data, namely, ICDAR 2013, ICDAR 2015, SVT and MSRA, show that the proposed technique is effective and useful.

Keywords: Character segmentation; Character reconstruction; Stroke width; Zero crossing; Gradient vector flow; License plate recognition.
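The abstract describes estimating the angular information of a character component via PCA and the major axis. As a hedged sketch of that step alone (the function name and interface are illustrative, not from the paper), the orientation of a component's major axis can be obtained from the eigenvector of the largest eigenvalue of the pixel-coordinate covariance matrix:

```python
import numpy as np

def component_angle(ys, xs):
    """Major-axis orientation (degrees in [0, 180)) of a character
    component, from PCA on its foreground pixel coordinates."""
    pts = np.stack([xs, ys], axis=1).astype(float)
    pts -= pts.mean(axis=0)                   # center the coordinates
    cov = np.cov(pts.T)                       # 2x2 covariance matrix
    vals, vecs = np.linalg.eigh(cov)
    major = vecs[:, np.argmax(vals)]          # eigenvector of largest eigenvalue
    return np.degrees(np.arctan2(major[1], major[0])) % 180.0
```

A vertical stroke yields an angle near 90 degrees and an upright character component something close to that, so deviations can flag skewed plates or merged components when checked against the regular spacing and aspect-ratio cues the paper uses.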
1. Introduction
Creating a smart/digital/safe city has been one of the important emerging trends in both developing and
developed countries in recent times. As a result, developing automatic systems has become an integral
part of the above-mentioned initiatives (Rathore et al., 2016; Yuan et al., 2017). One such example is to
develop intelligent transport systems for safety and mobility, and to enhance public welfare with the help
of advanced technologies by recognizing license plates (Du, Ibrahim, Shehata & Badawy, 2013; Suresh,
Kumar & Rajagoplan, 2007; Anagnostopoulos, Anagnostopoulos, Loumos & Kayafas, 2006). There are
transport systems proposed for recognizing license plates in the literature for applications such as the
automatic collection of toll fees, automatic monitoring of car speeds on the road, automatic estimation of
traffic volume at different traffic junctions, detection of illegal parking and incorrect traffic flows
(Abolghasemi & Ahmadyafrd, 2009; Tadi, Popovic & Odry, 2016; Azam & Islam, 2016). However, such
a system only works well for a particular application, since any particular system can cope with a single adverse factor but not with multiple factors that affect license plate visuals (Zhou et al., 2012). In addition, most of the existing systems that have been
developed use conventional binarization methods, which are proposed for plain background document
images to localize and recognize license plates (Ghaili, Mashohor, Ramli & Ismail, 2013; Yu et al., 2015;
Du, Ibrahim, Shehata & Badawy, 2013). It is obvious that for different real-time applications, multiple
environmental effects are common (e.g., low resolution, low contrast, complex backgrounds, blur due to
camera or vehicle movements, illumination effects due to sunlight, headlights, degradation effects due to
rain, fog or haze, and distortion effects due to camera angle variations).
(a) Input Image-1 Input Image-2
(b) Binarization results by Zhou, Field, Miller & Wang, 2013 and recognition results by Tesseract OCR, 2016 for the input images in (a): no output for Input Image-1, "HCR22A6" for Input Image-2
(c) Reconstruction and recognition results by the proposed approach for the images in (a): "EXE8950", "WCR2246"
Fig. 1. Binarization and recognition results for license plate images affected by different effects, which lead to varying recognition results.
The illustration shown in Fig. 1 demonstrates that input image-1 is affected by perspective distortion,
while input image-2 is affected by blur as shown in Fig. 1(a). For these two license plate images, the
binarization method (Zhou et al., 2013), which is the state-of-the-art method and works well for low
contrast and complex background images, fails to give good results for input image-1, but gives better
results for input image-2, as shown in Fig. 1(b). However, Tesseract OCR recognizes nothing for input image-1 due to touching, and gives incorrect results for input image-2 due to shape loss, as shown in Fig. 1(b). On the other hand, through reconstruction and segmentation with the same OCR, the proposed method works well except for the first character in input image-1. With this illustration, one can
conclude that there is an urgent need for developing a system, which can withstand multiple adverse
factors such that the same system can be used for several real-time applications successfully.
2. Related Work
The proposed license plate recognition system involves character segmentation through partial
reconstruction, and complete reconstruction for recognition. Therefore, we review the research related to
character segmentation, character recognition and character reconstruction.
Character Segmentation: Phan et al., 2011 proposed a gradient-vector-flow based method for video
character segmentation. The method uses text line length for finding seed points that are unreliable, and
then uses minimum cost path estimation for finding spaces between characters. Sharma et al., 2013
proposed a new method for character segmentation from multi-oriented video words. The method is
sensitive to dominant points. Liang et al., 2015 proposed a new wavelet Laplacian method for arbitrarily-
oriented character segmentation in video text lines. This method explores zero crossing points to find
spaces between words or characters. The performance of the method degrades when an image contains
noisy backgrounds. There are methods proposed for segmenting characters from license plate images. For
example, Tian et al., 2015 proposed a two-stage character segmentation method for Chinese license
plates. This method relies on binarization for segmentation. Sedighi & Vafadust, 2011 proposed a new
and robust method for character segmentation and recognition in license plate images. This method uses a
classifier, and binarization for segmentation. As a result, the method is dataset dependent. Khare et al.,
2015 proposed a new sharpness-based approach for character segmentation of license plate images. The
method explores gradient vector and sharpness for segmentation. However, the method is said to be
sensitive to seed point selection and blur presence. Kim et al., 2016 proposed an effective character
segmentation approach for license plate recognition under varying illumination environments. The
method uses binarization and the super pixel concept for segmentation. However, the method focuses on a
single cause but not multiple causes.
In the same way, recently, Dhar et al., 2018 proposed a system design for license plate recognition using
edge detection and convolutional neural networks. The method uses character segmentation as a
preprocessing step for license plate recognition. For character segmentation, the method explores edge
detection, morphological operations and region properties. However, the method is good for the images
with simple backgrounds but not for images affected by many challenges. Ingole et al., 2017 proposed
character feature-based vehicle license plate detection and recognition. First, the method segments
characters from license plate regions for recognition. For character segmentation, the method proposes
vertical and horizontal projection profile-based features. The proposed projection profile-based features
may not be robust for the images with complex backgrounds. Radchenako et al., 2017 proposed a
segmentation and recognition method for Ukrainian license plates. The method segments characters based
on connected component analysis. The connected component analysis works well when the input image is
binarized without the loss of the character shapes and touching between the characters. However, for the
images with complex backgrounds, it is hard to propose a binarization method to separate foreground and
background information.
In summary, from the above context, we can conclude that most of the methods made an attempt to solve
the problem of low resolution or illumination effects, but do not include other distortions such as blur,
touching and complex backgrounds. In addition, none of the methods explore the concept of
reconstruction for segmenting characters from license plate images.
Character Recognition: To recognize characters in text lines of video, natural scene images and license
plate images, there are methods that use either binarization methods or classifiers (Ye & Doermann,
2015). For example, Zhou et al., 2013 proposed scene text binarization via inverse rendering. The method
proposes a different idea for adapting parameters that tune the method according to image complexity.
However, the assumptions made for proposing a number of criteria limit its ability to work on different
applications. Wang et al., 2015 proposed MRF-based text binarization for complex images using stroke
features. The success of the method depends on how well it selects seed pixels from the foreground and
background. Similarly, Anagnostopoulos et al., 2006 proposed a license plate recognition algorithm for
intelligent transportation applications. Since the method involves binarization and a classifier for
recognition, it may not work well for images affected by multiple adverse effects such as low resolution,
blur and touching. Saha, Basu & Nasipuri, 2015 proposed automatic license plate recognition for Indian
license plate images. The method involves edge map generation, the Hough transform and a classifier for
recognition. The success of the method depends on edge map generation and a classifier. Gou et al., 2016
proposed vehicle license plate recognition based on extremal regions and restricted Boltzmann machines.
The method extracts HoG features for detected characters, and then uses a classifier for recognition. In
summary, it is noted from the above review of license plate recognition approaches that most of the
methods consider binarization algorithms and classifiers for recognition. In addition, the methods do not
consider images affected by multiple factors for achieving their results. Therefore, the methods lose
generality and the ability to work on license plate images of different background and foreground
complexities.
Deep Learning Models for Character Recognition: Jaderberg et al., 2016 proposed an approach for
reading texts in the wild with a convolutional neural network, which explores deep learning for achieving
high recognition results for texts in natural scene images. Goodfellow et al., 2013 proposed multi-digit
number recognition from street view imagery using deep convolutional neural networks, which explores
deep learning at the pixel level. Despite both methods addressing the challenges caused by natural scene
images, they are limited to text recognition from high contrast images but not from low resolution license
plate images and video images. Raghunadan et al., 2017 proposed a Riesz fractional-based model for
enhancing license plate detection and recognition. This method makes an attempt to address the causes
which affect license plate detection and recognition. Based on the experimental results, it is noted that
enhancement of license plate images may improve the recognition results but it is not adequate for real
time applications. Shemarry et al., 2018 proposed an ensemble of AdaBoost cascades of
3L-LBPs classifiers for license plate detection from low quality images. The method explores texture
features based on LBP operations and uses a classifier for license plate detection from images affected by
multiple adverse factors. However, the performance of the method heavily depends on learning and the
number of labeled samples. In addition, the scope is limited to text detection but not recognition as in the
proposed work. Text detection is easier than recognition in this case because detection does not require
the full shapes of characters.
Recently, inspired by the strong ability and discriminating power of deep learning models, some methods
have explored different deep learning models for license plate recognition. For example, Dong et al., 2017
proposed a CNN-based approach for automatic license plate recognition in the wild. The method explores
an R-CNN for license plate recognition. Bulan et al., 2017 proposed segmentation- and annotation-free license plate recognition with deep localization and failure identification. The method explores CNNs for detecting a set of candidate regions. Then it filters false positives from the candidate regions based on
strong CNNs. Silva et al. 2018 proposed license plate detection and recognition in unconstrained
scenarios. The method explores CNNs for addressing challenges caused by degradation. It detects the
license plate region first and then the detected region is fed to an OCR for recognition. Lin et al., 2018
proposed an efficient license plate recognition system using convolution neural networks. The method
detects vehicles for license plate region detection and then it explores CNNs for recognition. Yang et al.
2018 proposed Chinese vehicle license plate recognition using kernel-based extreme learning machines
with deep convolutional features. The method explores the combination of CNN and ELM (extreme
learning machines) for license plate recognition. It is found from the above discussion on deep learning
models that the methods work well when we have a huge number of labeled predefined samples.
However, it is hard to choose predefined samples that represent all possible variations in license plate
recognition, especially for the images affected by multiple adverse factors as in the proposed work. In
addition, deep learning has its own inherent limitations such as optimizing parameters for different
databases and maintaining stability of deep neural networks (Liu et al., 2017). It can be noted from the
above discussion that there is a gap between the state-of-the-art methods and the present demand. This
observation motivated us to propose a new method for license plate recognition without depending much
on classifiers and a large number of labeled samples, as in the existing methods.
Character Reconstruction: Similar to the proposed work, there are methods in the literature, which
reconstruct character shapes to improve recognition rates without the help of classifiers and binarization
algorithms. Shivakumara et al., 2013 proposed a ring radius transform for character shape reconstruction
in video. Its performance is good as long as Canny produces the correct character structures. However, it
is true that Canny is sensitive to blur and other distortions. To overcome this drawback, Tian et al., 2015
proposed a method for character shape restoration using gradient orientations. It finds the medial axis in
the gradient domain with different directions. However, the method does not work well for characters
having blur and complex backgrounds. In addition, the primary objective of this work is to reconstruct the
characters from video, which suffer from low resolution and low contrast, but does not deal with license
plate images.
In light of the above discussions on the review of character segmentation from license plate images,
character recognition from license plate images and character reconstruction, most of the methods focus
on a particular dataset and certain applications, such as natural scene images or video images or license
plate images. As a result the scope of the above methods is limited to specific applications and objectives.
This motivated us to propose a method that can work well for license plate images, natural scenes and
video images. In addition, license plate images are generally affected by multiple adverse factors due to
background and foreground variations, making the problem of recognition more complex and interesting.
Inspired by the work (Shivakumara et al., 2019) where keyword spotting is addressed for multiple types
of images with powerful feature extraction, we propose a novel idea for recognizing characters from
license plates affected by multiple factors. The key contributions of the proposed work are as follows: (1)
Proposing partial reconstruction for segmenting characters from license plate images is novel; (2)
Reconstructing complete shapes of characters from segmented characters without binarization, which can
work well for not only license plate images but also natural scene and video images, is also novel; (3) The
combination of reconstruction and character segmentation in a new way is another interesting step to
achieve good recognition rates for multi-type images. The main advantage of the proposed method is that
since the proposed reconstruction approach preserves character shapes, the performance of the method
does not depend much on classifiers and the number of training samples.
The proposed method is structured as follows. Stroke width pair candidate detection is illustrated by
estimating stroke width distances for each pixel in the images in Section 3.1. In Section 3.2, we propose
symmetry properties based on stroke width distances to obtain partial reconstruction results. Section 3.3
proposes character segmentation using partial reconstruction results based on principal and major axis
information of the character components. We describe the steps for complete reconstruction in the gray
domain in Section 3.4.
3. Proposed Technique
This work considers license plates affected by multiple factors according to various applications, such as
low resolution, low contrast, complex backgrounds, multiple fonts or font sizes, blur, multi-orientation,
touching elements and distortion due to illumination effects, as input for character segmentation and
recognition.
To overcome the problem of low contrast and low resolution, inspired by Laplacian and gradient
operations, which usually enhance high contrast information at the edges or near edges by suppressing
background information (Phan et al., 2011; Liang et al. 2015; Khare et al. 2015), we propose Laplacian
and gradient information for finding pixels which represent stroke width (thickness of the stroke) of
characters in license plate images. This is justified because the Laplacian process, which is the second
order derivative, gives high positive and negative values at the edges and near edges, respectively.
Similarly, the gradient, which is the first order derivative, gives high positive values at the edges and near
edges. This information is used for Stroke Width Pair (SWP) candidate detection. It is true that stroke
width or stroke width distance and color remain constant throughout characters regardless of font or font
size variations (Epshtein, Ofek & Wexler, 2010) at the character level. Most of the time, license plates are
prepared using upper case letters. Furthermore, the spacing between characters in license plate images is
almost constant. Based on these facts, we propose new symmetry features which use Laplacian and
gradient properties at the SWP candidates to find neighboring SWPs. However, due to complex
backgrounds, severe illumination effects and blur, there is a possibility for SWPs to fail in satisfying the
symmetry features. This results in the loss of information and hence we consider the output of this step as
partial reconstruction. We believe that the output of partial reconstruction results preserve the structure of
character components. This may lead to under- and over-segmentation.
It is understood that the eigenvectors of PCA give angles based on the number of pixels which contribute to the direction of character components (Shivakumara et al., 2014). In other words, to estimate the possible angle of the whole character, PCA does not require the full character information. As per our
experiments, in general, if the character contains more than 50% of pixels, one can expect almost the
same angle of the actual character. The same thing is true for angle estimation via the major axis of the
character. With this motivation, we use angle information given by PCA and the Major Axis (MA) to
estimate angles of character components. The angle information between PCA and MA is explored for
character segmentation. Since the proposed symmetry properties are sensitive to blur, touching and
complex backgrounds, we propose the same symmetry properties with weak conditions in the gray
domain instead of Laplacian and gradient domains to reconstruct the full character shape with the help of
the Canny edge image of the input image. This is possible because there is no influence from neighboring
characters after segmenting characters from the image. The reconstructed characters are passed to
Tesseract OCR for recognition. The flow of the proposed method is shown in Fig. 2.
3.1. Stroke Width Pair Candidate Detection
As mentioned in the previous section, the stroke width distances (thickness of the stroke) of characters in
a license plate image are usually the same as shown in Fig. 3(a). To extract stroke width distance, we
propose a Laplacian operation which gives high positive and negative responses for the transition from
background to foreground and vice versa, respectively. The stroke width distance is then defined by searching for two zero crossing points, as shown in Fig. 3(b) and Fig. 3(c), where a pictorial representation of
the marked region in Fig. 3(b) is shown. Since the input images considered have complex backgrounds
Input: License plate image → Stroke width pair candidate detection → Partial character reconstruction → Character segmentation (under-segmentation / over-segmentation) → Complete reconstruction → OCR recognition
Fig. 2. Pipeline of the proposed method
and small orientations due to angle variations, we use the following mask to extract horizontal, vertical and diagonal zero crossing points. Due to background variations and noise introduced by the
Laplacian operation as shown in Fig. 3(b), background and noise pixels may contribute to defining stroke
width distances. Therefore, to overcome this issue, we plot a histogram for stroke width distances as
shown in Fig. 3(c). The distances are chosen from those contributing to the highest peak as candidate
stroke width pairs, which are shown in Fig. 3(d), where one can see all the red pixels denoting stroke
width pair candidates. This is justified because the stroke pixel pairs that define actual stroke width
distance are higher than the pixel pairs defined by background or noise pixels. In this way, the proposed
step can withstand the cause of background noise and degradations. It may be noted from Fig. 3(d) that
Stroke Width Pair (SWP) candidates represent character strokes. In addition, each character has a set of
SWPs. It is evident from Fig. 3(e) that the proposed technique detects SWPs for the complex image in
Fig. 1(a), where touching exists due to perspective distortion.
It is noted from Fig. 3(d) that the number of red pixels differs from one character to another. This is because the proposed steps estimate the stroke width distance by considering all the pixels
of characters but not the pixels of individual characters. Since we consider the common stroke width
distance of the pixels in the image, the number of stroke width pairs vary from one region to another due
to background complexity. As a result, all the pixels of characters may not contribute to the highest peak
in the histogram. Therefore, one cannot predict the number of stroke width pairs for each character as
shown in Fig. 3(d). However, the proposed method has the ability to restore the character shape with one
stroke width pair of each character by the partial reconstruction step. We believe that each character gets
at least one stroke width pair from the histogram operation for the partial reconstruction step because they
follow the same font size and typeface.
Laplace Mask =
[ -1  -1  -1 ]
[ -1   8  -1 ]
[ -1  -1  -1 ]
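The detection step above can be sketched as follows. This is a minimal illustration assuming a grayscale NumPy image with bright characters on a dark background; the function name and the edge-pairing heuristic are ours, not the authors'.

```python
# Sketch of Stroke Width Pair (SWP) candidate detection via the Laplacian
# and a histogram of stroke width distances. Illustrative, not the exact
# implementation from the paper.
import numpy as np
from scipy.ndimage import convolve

# 8-connected Laplacian mask (centre 8, neighbours -1), as in the paper.
LAPLACE_MASK = np.array([[-1, -1, -1],
                         [-1,  8, -1],
                         [-1, -1, -1]], dtype=float)

def swp_candidates(gray):
    """Return the dominant stroke width and the pixel pairs that produce it."""
    lap = convolve(gray.astype(float), LAPLACE_MASK, mode='nearest')
    widths, pairs = [], []
    for y in range(lap.shape[0]):
        pos = lap[y] > 0          # positive Laplacian response marks stroke edges
        edges = np.where(np.diff(pos.astype(int)) != 0)[0]
        # Pair up successive rising/falling transitions: each pair bounds a stroke.
        for a, b in zip(edges[::2], edges[1::2]):
            widths.append(int(b - a))
            pairs.append((y, int(a), int(b)))
    if not widths:
        return None, []
    # Histogram of widths: the highest peak gives the common stroke width.
    sw = int(np.argmax(np.bincount(widths)))
    return sw, [p for p, w in zip(pairs, widths) if w == sw]
```

Pairs whose width matches the histogram peak are kept as SWP candidates; background or noise pairs fall into other bins and are discarded, which mirrors the filtering described above.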
3.2.Partial Character Reconstruction
The proposed technique considers SWP candidates given by the previous section as the representatives to
find neighboring SWP candidates, which define stroke width of the character. To achieve this, for each
SWP candidate, the proposed technique considers eight neighbors of two stroke pixels and then checks all
the combinations to identify the correct SWP as shown in Fig. 4(a), where we can see the process of
searching for the right neighbor SWP. In this work, the proposed method uses an 8-directional code for
searching the correct stroke width pair; one can expect 8 neighbor pixels for each stroke pixel of the pair.
Therefore, the total number of combinations is 8×8 = 64 pairs. The reason to consider 8 neighbors for
each stroke pixel is to ensure that the step does not miss checking any pair of pixels. Since stroke pixels
represent edge pixels of characters, we can expect high gradient values compared to their background.
Similarly, the pixel values between the stroke pixels represent a homogeneous background, and the
gradient gives low values for the pixels compared to the gradient values of the stroke pixels as shown in
Fig. 4(b) (Khare et al., 2015). Therefore, we study the gradual changes from high to low and low to high
as shown in Fig. 4(c), where we can see gradual changes in gradient values which are defined as the
(a) Input image (b) Laplacian image
(c) Stroke width defined by Laplacian zero crossing points, and the histogram of stroke width distances (peak at SW = 2)
(d) Stroke widths that contribute to the highest peak in the histogram for the image in (a)
(e) Stroke widths that contribute to the highest peak in the histogram for a more complex image than the image in (a)
Fig. 3. Stroke width pair candidate detection.
Gradient Symmetry (GS) feature. When we look at the Gradient Vector Flow (GVF) of the stroke pixels,
as shown in Fig. 4(d), we can observe arrows, which are pointing towards the edges; the direction of the
arrows of two stroke pixels have opposite directions. This is called the GVF Symmetry (GVFS) feature as
shown in Fig. 4(e). Similarly, we consider the value of a positive peak of the Laplacian and the difference
between the positive and negative peak values for finding symmetry. In this way, we find the neighboring
SWP of each SWP candidate as shown in Fig. 4(f), where one can see positive and positive-negative
peaks. This is called the Laplacian Symmetry (LS) feature. The proposed technique extracts four
symmetry features for each SWP candidate, and then checks the four symmetry features with all 64
combinations. Subsequently, it chooses the combination which satisfies the four symmetries as the
neighboring SWP, and the pair will be displayed as white pixels. The identified neighbor SWP is
considered as an SWP candidate, and again the whole process repeats recursively to find all the
neighboring SWPs in the image. This process stops when it visits all SWPs. However, the number of
iterations depends on the complexity of the characters and the number of SWPs of each character. As long
as the stroke width pair satisfies the symmetry properties, the partial reconstruction step restores the
contour pixels of the characters. When SWPs fail to satisfy the symmetry properties or there are no more
SWPs to visit, the iterative process terminates. This is the reason to obtain the partial shape of the
character by partial reconstruction as shown in Fig. 5, where we can see the intermediate steps for the
partial reconstruction results. It can also be noted from Fig. 5 that the partial reconstruction results
provide the structures of the characters with some loss of information.
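The recursive 64-combination neighbour search described above can be sketched structurally as follows; `passes_symmetry` is a hypothetical callback standing in for the four symmetry features, and all names are illustrative.

```python
# Structural sketch of the partial reconstruction search: starting from the
# SWP candidates, repeatedly test the 8 x 8 = 64 neighbour combinations of
# each pair and keep those that pass the symmetry checks.
from collections import deque

# 8-directional offsets around a pixel.
OFFSETS = [(dy, dx) for dy in (-1, 0, 1) for dx in (-1, 0, 1) if (dy, dx) != (0, 0)]

def partial_reconstruction(candidates, passes_symmetry, shape):
    """candidates: iterable of ((y1, x1), (y2, x2)) stroke-pixel pairs."""
    visited = set(candidates)
    queue = deque(candidates)
    kept = []
    while queue:                        # terminates when no unvisited SWP remains
        p1, p2 = queue.popleft()
        kept.append((p1, p2))
        # 8 neighbours of each stroke pixel -> 64 candidate combinations.
        for dy1, dx1 in OFFSETS:
            for dy2, dx2 in OFFSETS:
                n1 = (p1[0] + dy1, p1[1] + dx1)
                n2 = (p2[0] + dy2, p2[1] + dx2)
                if not all(0 <= p[0] < shape[0] and 0 <= p[1] < shape[1]
                           for p in (n1, n2)):
                    continue
                pair = (n1, n2)
                if pair not in visited and passes_symmetry(pair):
                    visited.add(pair)
                    queue.append(pair)  # an accepted neighbour becomes a new candidate
    return kept
```

When the symmetry test fails along part of a character (blur, complex background), propagation simply stops there, which is why the output is a partial rather than complete shape.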
The four symmetry features are defined as follows.

(i) Let G_SW = {g_SW(1), g_SW(2), ..., g_SW(n)} and G_NP = {g_NP(1), g_NP(2), ..., g_NP(n)}, where n is the size of the stroke width (SW), and g_SW(i) and g_NP(i) represent the gradient values of the stroke width and the Neighbor Pair (NP) at location i, respectively. Then NP = 1 iff g_SW(1) ≈ g_NP(1) && g_SW(2) ≈ g_NP(2) && ... && g_SW(n) ≈ g_NP(n). The gradient symmetry can be visualized as in Fig. 4(b).

(ii) The angle information of the GVF at the starting point (sp) and end point (ep) of the stroke width is represented as GVF_SW(sp) and GVF_SW(ep). Then NP = 1 iff GVF_NP(sp) ≈ GVF_SW(sp) && GVF_NP(ep) ≈ GVF_SW(ep), where GVF_NP(sp) and GVF_NP(ep) represent the angle information of the GVF at the starting point and end point of NP, respectively. The GVF angle symmetry can be visualized as in Fig. 4(e).

(iii) The peak values of the stroke width Laplacian (L) at the starting point and end point are represented as P_L_SW(sp) and P_L_SW(ep), respectively, and the peak values of the neighbor pair Laplacian at the starting point and end point are denoted by P_L_NP(sp) and P_L_NP(ep), respectively. Then NP = 1 iff P_L_NP(sp) ≈ P_L_SW(sp) && P_L_NP(ep) ≈ P_L_SW(ep).

(iv) Similarly, the difference between the highest and lowest peaks at the Laplacian zero crossings is also used for comparing neighbor pairs. The highest and lowest Laplacian peaks at the zero crossing points are represented as hP_L_SW and lP_L_SW for the stroke width, and hP_L_NP and lP_L_NP for the neighbor pair. The high-to-low differences are then defined as Diff_SW = hP_L_SW - lP_L_SW and Diff_NP = hP_L_NP - lP_L_NP. Then NP = 1 iff Diff_NP ≈ Diff_SW.

The Laplacian symmetries (iii) and (iv) can be visualized as in Fig. 4(f).

(a) Finding neighboring SWPs from the 64 combinations (b) Gradient image (c) Gradient symmetry feature (d) GVF image (e) GVF symmetry feature (f) Laplacian symmetry features
Fig. 4. Exploiting symmetry features for finding neighbor SWPs from the 64 combinations

(a) SWP candidates (b) First intermediate result (c) Second intermediate result (d) Third intermediate result (e) Result of the partial reconstruction process (f) SWPs displayed as white pixels (g) Result of partial reconstruction of a complex image
Fig. 5. Intermediate and final partial reconstruction results
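A minimal sketch of the four tests, assuming the gradient values, GVF angles and Laplacian peaks along a pair have already been sampled; the tolerance values are illustrative placeholders, not taken from the paper.

```python
# Illustrative implementations of the four symmetry features used to accept
# a candidate pair as the neighbour SWP. All tolerances are assumptions.
import numpy as np

def gradient_symmetry(g_sw, g_np, tol=10.0):
    # (i) gradient values must agree point-wise along the stroke width
    return bool(np.all(np.abs(np.asarray(g_sw) - np.asarray(g_np)) <= tol))

def gvf_symmetry(ang_sw, ang_np, tol=15.0):
    # (ii) GVF angles must agree at the starting (sp) and end (ep) points
    return abs(ang_sw[0] - ang_np[0]) <= tol and abs(ang_sw[1] - ang_np[1]) <= tol

def laplacian_peak_symmetry(p_sw, p_np, tol=20.0):
    # (iii) positive Laplacian peaks at sp and ep must agree
    return abs(p_sw[0] - p_np[0]) <= tol and abs(p_sw[1] - p_np[1]) <= tol

def laplacian_diff_symmetry(h_sw, l_sw, h_np, l_np, tol=20.0):
    # (iv) highest-to-lowest peak differences must agree
    return abs((h_sw - l_sw) - (h_np - l_np)) <= tol

def passes_all(sw, np_):
    # A pair is accepted as a neighbour SWP only if all four symmetries hold.
    return (gradient_symmetry(sw['g'], np_['g'])
            and gvf_symmetry(sw['gvf'], np_['gvf'])
            and laplacian_peak_symmetry(sw['peaks'], np_['peaks'])
            and laplacian_diff_symmetry(sw['hp'], sw['lp'], np_['hp'], np_['lp']))
```

Requiring all four conditions at once is what makes the search conservative: a pair corrupted by blur or background clutter typically violates at least one, so propagation halts and only reliable contour pixels are restored.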
3.3. Character Segmentation
When we look at the partial reconstruction results given by the previous section as shown in Fig. 5(f) and
Fig. 5(g), one can understand that even though there is a loss of shape, it still provides enough structure,
which helps us to find the spacing between characters and character regions for segmentation. As
mentioned in the proposed Methodology Section, Principal Component Analysis (PCA) and the Major
Axis (MA) do not require the full character shape to estimate possible directions of character components.
It is also noted that most license plate images including Malaysian license plates contain upper case letters
with numerals, but not the combination of upper case with lower case letters. According to the statement
in Yao et al., 2012 that "for most text lines, the major orientations of characters are nearly perpendicular to the major orientation of the text line", both PCA and MA should give approximately 90 degrees if
characters in the text are aligned in the horizontal direction. The above observations can be confirmed
from the sample results of partial reconstruction on alphabets, namely, A to Z, and numerals, namely, 0-9,
chosen from the databases shown in Fig. 6, where we note that for both alphabet and numeral images,
PCA (yellow color axis) and MA (red color axis) give angles, which are almost the same and
approximately 90 degrees because all the images are inclined in the vertical direction. Similarly, the same
conclusion can be drawn from the results shown in Fig. 7(a)-Fig. 7(b), where we present PCA and MA
angle information for the images affected by low contrast, complex backgrounds, multi-fonts, multi-font
sizes, blur and perspective distortion. In the same way, the sample partial reconstruction results shown in
Fig. 8(a)-Fig. 8(b) for the images of two character components show that PCA and MA give angles of almost 0 degrees, as the merged character components are aligned in the horizontal direction.
(a) Sample alphabet (A-Z) and numeral (0-9) images chosen from the datasets
(b) PCA and MA axes for the partial reconstruction results of the alphabets and numerals in (a)
Fig. 6. Angle information given by PCA and MA for the alphabets and numerals of license plate images. The MA axis is shown in red and the PCA axis in yellow.
The results in Fig. 6, and Fig. 7 show that partial reconstruction has the ability to preserve character
shapes regardless of different causes, while PCA and MA have the ability to give the angle of character
orientation without the complete shape of the character components. This observation leads us to define the following hypothesis for character segmentation. If both axes give almost 90 degrees within a ±26 degree tolerance, then the component is considered a full character; else, if both axes give almost zero degrees within a ±26 degree tolerance, then the component is considered an under-segmentation. This is
possible when two character components are joined together as shown in Fig. 8. Otherwise, the
component is considered as a case of over-segmentation. This occurs when a character loses shape. The
value of ±26 is determined based on experimental results, which will be presented in the Experimental
Section. The reason to fix such a threshold is that segmentation requires either a vertical or horizontal
orientation. With this idea, the proposed technique classifies components from the partial reconstruction
results into three cases.
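The three-way hypothesis above can be sketched as a small decision rule. The function below is illustrative only: it takes the two angles (in degrees) as already computed and applies the paper's ±26 degree tolerance.

```python
TOL = 26  # degrees; the tolerance tuned experimentally in the paper

def classify_component(pca_deg, ma_deg, tol=TOL):
    """Classify a partially reconstructed component by its PCA and MA angles."""
    if abs(pca_deg - 90) <= tol and abs(ma_deg - 90) <= tol:
        return "full_character"       # both axes near vertical
    if abs(pca_deg) <= tol and abs(ma_deg) <= tol:
        return "under_segmentation"   # both axes near horizontal: joined characters
    return "over_segmentation"        # character has lost its shape
```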
In general, characters in license plate images share the same aspect ratio, especially the height of characters, as shown in Fig. 5(a). This observation motivated us to find the width of the components in the three cases. If partial reconstruction outputs characters with clear shapes, and all the components are classified as an
(a). Sample character images chosen from license plates affected by low contrast, complex backgrounds, multiple fonts, multiple font sizes, blur and perspective distortion.
(b). PCA and MA angles of the partial reconstruction results for the images in (a): (74.9, 89.2) (75.1, 89.2) (81.1, 89.3) (83.4, 88.9) (87.5, 72.2) (77.6, 88.1)
Fig. 7. PCA and MA angle information of the partial reconstruction results for the different distorted images
(a). Sample images of two joined character components
(b). PCA and MA angles of the partial reconstruction results for the images in (a): (2.7, 0) (8.3, 25.2) (12.8, 25.2) (3.8, 25.2) (4.2, 25.3) (26.3, 25.3)
Fig. 8. PCA and MA angle information of the partial reconstruction results for images of two character components
ideal character case according to the angular information, the proposed technique considers the width that contributes the highest peak in the histogram as the probable width. If the proposed technique does not find a peak on the basis of width, it takes the average width of the characters as the probable width. The same probable width is used for segmenting characters as shown in Fig. 9, where for the input license plate images in Fig. 9(a) and Fig. 9(b), the proposed technique plots histograms to obtain the probable width as shown in Fig. 9(c), and the segmentation results given by the probable width are shown in Fig. 9(d) and Fig. 9(e), respectively. Fig. 9(d) and Fig. 9(e) show that the probable width segments almost all the characters for image-1 in Fig. 9(a) except for "12". For image-2 in Fig. 9(b), it segments almost all the characters except for "W" and "U". Therefore, segmentation with probable widths works well in ideal cases, as shown in Fig. 9(f), where for the complex image in Fig. 3(e) the probable width segments all the characters successfully using the partial reconstruction results. However, this is not true for all cases; for example, it results in under-segmentation and over-segmentation as shown in Fig. 9(d) and Fig. 9(e), respectively.
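Selecting the probable width from a histogram peak, with a fall-back to the average width, can be sketched as follows. The 5-pixel bin size and the peak criterion are assumptions for illustration; the paper does not state these details.

```python
from collections import Counter

def probable_width(widths, bin_size=5):
    """Pick the center of the dominant width bin; fall back to the mean width."""
    bins = Counter(w // bin_size for w in widths)
    (top_bin, top_count), = bins.most_common(1)
    if top_count > 1:                          # a clear histogram peak exists
        return top_bin * bin_size + bin_size // 2
    return sum(widths) / len(widths)           # no peak: use the average width
```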
To solve the problem of under-segmentation given by the probable width, we propose an iterative-shrinking algorithm, which removes small portions of a component from the right side with a step size of five pixels in the partial reconstruction results, and then checks the angle information for the ideal-character condition. The proposed technique iteratively investigates whether the angle difference between PCA and MA leads to an angle of 90 degrees. When the angle difference satisfies the condition of an ideal character, the iterative process stops, and the character is considered an individual component. Since under-segmentation usually contains two characters, such as "12", the iterative process segments such cases successfully. This process is applied to all the components from the partial reconstruction results to solve the problem of under-segmentation.
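A minimal sketch of the iterative-shrinking loop is shown below, assuming the component is a 2-D boolean array and that a caller-supplied helper `angle_diff(mask)` (hypothetical, not from the paper) returns the PCA/major-axis angle difference of the mask's partial reconstruction.

```python
import numpy as np  # components are assumed to be 2-D boolean arrays

def iterative_shrink(component, angle_diff, step=5, tol=26, min_width=1):
    """Drop 5-pixel slices from the right until the ideal-character condition holds."""
    width = component.shape[1]
    while width > min_width:
        mask = component[:, :width]
        if angle_diff(mask) <= tol:   # axes agree: treat as an individual character
            return mask
        width -= step                 # shrink a small portion from the right side
    return component                  # no split found: return the component as-is
```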
The process of iterative-shrinking is illustrated in Fig. 10, where (a) is a sample under-segmentation case, (b) gives the intermediate results of the iterative process, and (c) shows the final results. It is observed from Fig. 10(b) that the angle difference between the axes given by PCA and MA reduces as the iterations continue, and the process stops when both axes give the same angle.
In the same way as iterative-shrinking for under-segmentation, we propose iterative-expansion to solve the over-segmentation cases. For each component given by the probable width, the proposed technique expands with a step size of five pixels from the left side. At the same time, in the partial reconstruction
(a). License plate image-1 (b) License plate image-2
(c) Histogram for the images in (a) & (b) to find probable widths
(d) Under-segmentation (e) Over-segmentation
(f) Character segmentation result for the image in Fig. 3(e)
Fig. 9. Character segmentation using probable widths
(a) (b)
(c) Segmented characters
Fig. 10. Iterative-shrinking process for under-segmentation. (a) gives the case of under-segmentation, and (b) shows the intermediate results of the iterative process.
results, it calculates the angle difference between PCA and MA. This process continues until it obtains an angle of almost zero degrees. When two characters are merged, both PCA and MA give an angle of zero degrees. At this point, the iterative process stops, and we then use the iterative-shrinking algorithm to segment the two characters. Therefore, the proposed iterative-expansion uses iterative-shrinking for solving the over-segmentation problem. Note that the proposed technique first employs iterative-shrinking to solve under-segmentation, and then uses iterative-expansion for solving over-segmentation; this ordering is necessary because iterative-expansion requires iterative-shrinking. The reason for proposing an iterative procedure for both shrinking and expansion is that when a character component is split into small fragments due to adverse factors, or when character components are joined together, it is necessary to study local information in order to identify the vertical and horizontal cases. The process of iterative-expansion is illustrated in Fig. 11, where (a) shows the cases of over-segmentation, (b) shows the intermediate partial reconstruction results of (a), (c) gives the results of iterative-expansion followed by shrinking for correct segmentation, and (d) gives the final character segmentation results.
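The expansion loop can be sketched in the same style; `angle_of(mask)` is again a hypothetical helper returning the common PCA/major-axis angle, and the returned bounds would then be handed to the shrinking step to re-split the merged pair.

```python
import numpy as np  # the plate strip is assumed to be a 2-D boolean array

def iterative_expand(strip, start, end, angle_of, step=5, tol=26):
    """Grow an over-segmented block leftward until PCA/MA read ~0 degrees,
    i.e. until two characters have merged into one horizontal component."""
    left = start
    while left - step >= 0:
        left -= step                               # expand 5 pixels from the left
        if abs(angle_of(strip[:, left:end])) <= tol:
            return left, end                       # merged pair: hand to shrinking
    return start, end                              # could not merge: keep bounds
```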
3.4. Complete Character Reconstruction
Section 3.2 described the method to obtain partial character reconstruction for input license plate images, and the method presented in Section 3.3 uses the advantage of partial reconstruction for character segmentation. Since characters are segmented well from license plate images even when they are affected by multiple factors, we apply the Canny operator to obtain edges and reconstruct the complete shape of each incomplete character given by partial reconstruction. This is because Canny gives fine edges for low and high contrast images when we supply individual characters rather than the whole license plate image (Saha, Basu & Nasipuri, 2015). Therefore, we consider the output of Canny as the input for reconstructing the information missing from the partial reconstruction results.
(a) Over-segmentation (b) Partial reconstruction of (a) (c) Iterative-expansion followed by shrinking
(d) Characters segmented
Fig. 11. Iterative-expansion process for over-segmentation
For the Canny edge of the input character image shown in Fig. 12(a), the proposed technique finds the Stroke Width Pair (SWP) candidates as described in Section 3.2, where we can see the characters "W" and "5" given by partial reconstruction of the lost shapes. The SWPs are considered as representatives for reconstruction in this section. Whereas the proposed technique earlier defined symmetry features using gradient values, gradient vector flow and the Laplacian, here we define the same symmetry features using gray information rather than gradient information. This is because, according to our analysis of the experimental results, the gradient does not give good responses for low contrast, low resolution and distorted images; this is the main reason for the loss of shapes, and it is also what led to partial reconstruction. Since the characters are segmented and their pixels have uniform color values, we propose symmetry features in the gray domain to restore the incomplete information in the partial reconstruction results and obtain complete character shapes.
(a). Segmented characters, Canny edge image and partial reconstruction with tangent representation
(b) Peak intensity symmetry features (c) Intensity symmetry features
(d) Complete reconstruction for the partial reconstruction
Fig. 12. Complete reconstruction in the gray domain
(a) Segmented characters by the method presented in Section 3.3
(b) Results of the complete reconstruction algorithm
(c) Recognition outcomes for the results in (b) by OCR: "WXD8012" "WNY5555" "EXE8950"
Fig. 13. Effectiveness of the complete reconstruction algorithm
For each SWP, the proposed technique calculates a tangent angle as defined below:

Angle = tan^-1((y - y1) / (x - x1))

where (x, y) is the starting pixel location of the SWP, and (x1, y1) is the location of its neighbor pixel.
Since the tangent angle between a pixel of the SWP and its neighbor pixel gives a direction, the proposed technique finds the neighbor pixel in the same direction, at the same stroke width distance, to restore the neighboring SWP. As long as the difference between the tangent angles of the current pixel and the neighbor pixel remains the same, and the neighbor pair satisfies the stroke width distance of the SWP, the proposed technique moves in the same direction to restore the neighboring SWP. This process works well for straight strokes, whilst at curves and corners the tangent angle gives a large difference. Moreover, tangent-based restoration works well for individual characters but not for the whole license plate image, where the tangent direction may lead into touching, adjacent characters. In this situation, the proposed technique recalculates the stroke width using the eight neighbors of the SWP pixels, as calculated in Section 3.2. To find the right SWP combination out of the 64 candidates, we define two symmetry features: the intensity values at the first and second pixels should be almost the same, as shown in Fig. 12(b), which is called Peak Intensity Symmetry (PIS); and the intensity values between the first and second pixels of the SWP should change gradually from high to low and then from low to high, as shown in Fig. 12(c), which is called Intensity Symmetry (IS). If a combination of SWP satisfies both symmetry features, the pair is considered as actual contour pixels and displayed as white pixels, as shown in Fig. 12(d), where one can see that the information lost in Fig. 12(a) is restored. The potential of complete character reconstruction for the license plate images shown in Fig. 13(a) can be seen in Fig. 13(b), where the shapes are restored, and the recognition results in Fig. 13(c) illustrate correct OCR results for both license plate images.
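The two gray-domain symmetry features can be sketched as checks on the 1-D intensity profile sampled between the two pixels of a candidate SWP. The equality tolerance below is an assumption for illustration; the paper does not state its thresholds.

```python
def is_symmetric_pair(profile, eq_tol=10):
    """Check PIS and IS on the intensity profile between the two SWP pixels.

    PIS: the two contour pixels have near-equal intensity.
    IS: the intensity falls toward the stroke interior and rises back again.
    """
    first, last = profile[0], profile[-1]
    pis = abs(first - last) <= eq_tol
    mid = len(profile) // 2
    falls = all(a >= b for a, b in zip(profile[:mid], profile[1:mid + 1]))
    rises = all(a <= b for a, b in zip(profile[mid:], profile[mid + 1:]))
    return pis and falls and rises
```

A pair passing both checks would be kept as actual contour pixels; a monotonic or asymmetric profile is rejected.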
In summary, the gradient domain helps us to define symmetry properties but, at the same time, misses vital pixels of characters due to its sensitivity to low contrast and low resolution, which results in partial character reconstruction. To overcome this problem, the proposed method defines the same properties using gray values rather than gradient values. This is because a segmented character is not influenced by complex backgrounds, and it is understood that the pixels of the characters have almost uniform values. Therefore, the combination of the properties in the gradient and gray domains helps us to restore the missing information. In other words, partial reconstruction helps in the accurate segmentation of characters, while segmentation helps in restoring the complete shape using intensity values in the gray domain.
4. Experimental Results
To evaluate the effectiveness of the proposed technique for real-time applications, we consider the dataset provided by MIMOS, an institute funded by the Government of Malaysia where License Plate Recognition (LPR) is a live ongoing project. The dataset consists of 680 complex license plate images with various challenges, such as poor quality images where we can expect low contrast, blurred images, and character-touching images due to illumination effects, sunlight, or headlights at night.
To demonstrate the merit of the proposed technique, we consider standard datasets that are publicly available, namely, the UCSD dataset (Zamberletti, Gallo & Noce, 2015) with 1547 images, which present a variety of challenges including blur, license plates with very small fonts captured from a substantial distance, and low resolution. The Medialab dataset (Zamberletti, Gallo & Noce, 2015) contains 680 license plate images with a variety of font sizes, illumination effects, and shadow effects. The Uninsubria dataset (Zamberletti, Gallo & Noce, 2015) contains 503 license plate images captured from nearby; these are of better quality than the UCSD and Medialab images, but generally have more complex backgrounds. In total, we considered 3410 license plate images for experimentation, covering the multiple factors mentioned in the Introduction. In addition, from all the license plate datasets we chose 100 license plate images affected by multiple adverse factors, as mentioned above, to test the ability and effectiveness of the proposed technique; these are termed the challenging data. This data does not include 'good' (easy) images as the other datasets do.
Since the proposed technique is capable of handling multiple causes, we also test it on other standard datasets, such as ICDAR 2013 with 28 videos (Karatzas et al., 2013), YVT with 29 videos (Nguyen, Wang & Belongie, 2013), and ICDAR 2015 with 49 videos (Karatzas et al., 2015). These datasets are popular for evaluating text detection and recognition methods, and they include low resolution, low contrast, complex backgrounds, and multiple fonts, sizes, and orientations. Similarly, for natural scene datasets, we use ICDAR 2013 (Karatzas et al., 2013) with 551 images, SVT with 350 images (Wang & Belongie, 2010), MSRA-TD500 with 500 images (Yao et al., 2012) and ICDAR 2015 with 462 images (Karatzas et al., 2015). The reason for considering natural scene datasets is to show that if the proposed technique works well for low resolution and low contrast images, it will also work for high resolution and high contrast images. The main differences between the video datasets and these datasets are contrast and resolution: video datasets suffer from low resolution and low contrast, while natural scene datasets provide high contrast and high resolution images. In total, 3510 license plate images, 106 videos, and 1863 natural scene images are considered for experimentation to demonstrate that the proposed technique is robust, generic and effective.
The proposed technique involves a reconstruction step and a character segmentation step. To evaluate the
reconstruction step, we follow the standard measures and scheme used in (Peyrard et al., 2015), namely, Peak Signal to Noise Ratio (PSNR), Root Mean Square Error (RMSE), and Mean Structural Similarity (MSSIM), as defined below. Since the measures in (Peyrard et al., 2015) were proposed for evaluating the quality of handwritten images, we prefer them for evaluating the reconstruction steps of the proposed technique.
PSNR = (1/N) * sum_{i=1}^{N} PSNR_i (1)

RMSE = (1/N) * sum_{i=1}^{N} RMSE_i (2)

MSSIM = (1/N) * sum_{i=1}^{N} MSSIM_i (3)
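Equations (1)-(3) average per-image scores over the N test images. A sketch of the per-image RMSE/PSNR computation and the dataset-level average is below (MSSIM is omitted, since it needs a structural-similarity implementation); the 8-bit peak value of 255 is an assumption.

```python
import numpy as np

def avg_quality(images, refs, peak=255.0):
    """Return dataset-level (mean PSNR, mean RMSE) over paired image lists."""
    psnrs, rmses = [], []
    for img, ref in zip(images, refs):
        mse = np.mean((img.astype(float) - ref.astype(float)) ** 2)
        rmses.append(np.sqrt(mse))                     # per-image RMSE_i
        psnrs.append(10 * np.log10(peak ** 2 / mse))   # per-image PSNR_i (dB)
    return np.mean(psnrs), np.mean(rmses)              # Eqs. (1) and (2)
```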
For character segmentation, we use the standard measures proposed in (Phan et al., 2011), namely, Recall (R), Precision (P), F-measure (F), UnderSegmentation (U) and OverSegmentation (O). The definitions of the measures are as follows.
Truly Detected Character (TDC): A segmented block that contains a correctly-segmented character.
Under-Segmented Block (USB): A segmented block that contains more than one character.
Over-Segmented Block (OSB): A segmented block that contains no complete character.
False Detected Block (FDB): A segmented block that does not contain any characters, for example, intermediate objects, boundaries or blank space. The measures are calculated as follows, where ANC is the Actual Number of Characters:
Recall (R) = TDC / ANC
Precision (P) = TDC / (TDC + FDB)
F-measure (F) = (2 * P * R) / (P + R)
UnderSegmentation (U) = USB / ANC
OverSegmentation (O) = OSB / ANC
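The count-based measures can be computed directly. The helper below follows the definitions above, with precision taken as TDC over all reported blocks (TDC + FDB), a common convention assumed here.

```python
def segmentation_measures(tdc, fdb, usb, osb, anc):
    """Recall, precision, F-measure, and under/over-segmentation rates."""
    r = tdc / anc                 # share of ground-truth characters found
    p = tdc / (tdc + fdb)         # share of reported blocks that are correct
    f = 2 * p * r / (p + r)       # harmonic mean of precision and recall
    return {"R": r, "P": p, "F": f, "U": usb / anc, "O": osb / anc}
```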
To validate that the reconstruction step preserves character shapes, we consider the character recognition rate as a measure for the reconstructed images, using the publicly available Tesseract OCR (2016). For evaluating the recognition results, we follow the definitions of Recall (RR), Precision (RP) and F-measure (RF) in (Ami, Basha & Avidan, 2012), because these definitions were proposed for bib number recognition. Since bib numbers and license numbers are similar, we prefer to use these
measures.
RR is defined as the percentage of correctly recognized characters out of the total number of characters (ground truth), and RP is defined as the percentage of correctly recognized characters out of the total number of recognized characters. For the F-measure, we use the same formula employed in the segmentation step to combine RR and RP into one measure.
Note that since no ground truth is available for the license plate datasets MIMOS, UCSD, Medialab and Uninsubria, we manually count the Actual Number of Characters (ANC) as the ground truth. For the standard video and scene image datasets, we use the available ground truth and the evaluation schemes specified with it.
In order to show the usefulness and effectiveness of the proposed technique, we implement existing character segmentation methods to facilitate comparative studies, namely, Phan et al., 2011, which uses minimum cost path estimation for character segmentation in video; Khare et al., 2015, which proposes sharpness features for character segmentation in license plate images; and Sharma et al., 2013, which uses a combination of cluster analysis and minimum cost path estimation for character segmentation in video. The main reason for selecting these existing methods is that a method which focuses on a single factor may not work well for license plate images affected by multiple factors. Phan et al.'s method addresses low resolution and low contrast, Khare et al.'s method is a recent one that addresses license plate issues to some extent, and Sharma et al.'s method addresses multi-oriented and touching characters. Dhar et al., 2018 proposed a system design for license plate recognition using edge detection and convolutional neural networks. Ingole et al., 2017 proposed character feature-based vehicle license plate detection and recognition. Radchenko et al., 2017 proposed a method for the segmentation and recognition of Ukrainian license plates. The reason for choosing these methods is that their objective is the same as that of the proposed work; however, they are confined to specific applications.
In the same way, we choose state-of-the-art recognition methods, namely, the method of Zhao et al., 2013, a robust binarization approach that works well for high resolution and low contrast images; the method of Tian et al., 2015, a recent approach proposed for the recognition of video characters through shape restoration; and the method of Anagnostopoulos et al., 2006, which proposes an artificial neural network for character recognition in license plate images. The motivation for choosing these methods for the comparative study is that Zhao et al.'s method represents state-of-the-art recognition of scene characters through binarization, Tian et al.'s method represents state-of-the-art recognition of video characters through reconstruction, and Anagnostopoulos et al.'s method represents state-of-the-art recognition of characters in license plates
through classifiers. Since the proposed technique is robust to multiple factors, we chose these methods to work on the different datasets in a comparative study that validates the strengths of the proposed technique. Additionally, we also consider the following methods that explore recent deep learning models for license plate recognition. Bulan et al., 2017 proposed segmentation- and annotation-free license plate recognition with deep localization and failure identification; the method explores CNNs for detecting a set of candidate regions. Silva et al., 2018 proposed license plate detection and recognition in unconstrained scenarios; the method explores CNNs to address challenges caused by degradation. Lin et al., 2018 proposed an efficient license plate recognition system using convolutional neural networks.
To find the values of the parameters, thresholds, symmetry properties and conditions, we randomly chose 500 sample images from the datasets for experimentation. Since the proposed method does not involve classifiers for training, we prefer to choose samples randomly from all the databases considered in this work. We use a system with an Intel Core i5 CPU and 8 GB RAM for all experiments. According to our experiments, the proposed method consumes 30 ms per image, which includes partial reconstruction, character segmentation, complete character reconstruction and recognition.
In Section 3.3, we define three hypotheses for ideal character detection, over-segmentation and under-segmentation based on the principal component (PCA) and major (MA) axes. It is expected that PCA and MA give the same angles for ideal characters; however, this is not the case due to the complexity of the problem. Therefore, we set ±26 degrees as the threshold for character segmentation using the partial reconstruction results. To determine this value, we conducted experiments on 500 randomly chosen samples by varying the angle value against the recognition rate, as shown in Fig. 14, where it is observed that for an angle of 26 degrees, the proposed method reports the highest recognition rate. Hence, we choose the same value for all the experiments in this work.
In Section 3.2, the proposed method introduces the partial reconstruction concept for character segmentation. It is expected that the partial reconstruction step outputs enough of the character structure that at least a human could read the character. The question is how to define partial reconstruction quantitatively. Therefore, we conducted experiments by estimating the number of missing pixels compared to the pixels in the ground truth. In this experiment, we manually add noise and blur at different levels to make the character images complex enough to lose pixels, and we calculate the percentage of missing pixels with the help of the ground truth. We illustrate sample results for different percentages of missing pixels during partial reconstruction in Fig. 15(a), where we can see the angles given by PCA and MA, the difference between the PCA and MA angles, and the different percentages of missing white pixels. It is observed from Fig. 15(a) and Fig. 15(b) that from 90% down to 40%, the proposed method reconstructs the complete shape of the character and obtains correct recognition results, but below 40% it loses the shape of the character, which results in incorrect recognition. Based on this experimental analysis, we consider 40% as the threshold for defining partial reconstruction results in this work. It is also noted from Fig. 15(a) that for a difference angle of 28.2 degrees, the proposed criteria for character segmentation fail, as the OCR gives incorrect results. It is evident that ±26 is a feasible threshold for achieving better results.
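The pixel-retention measurement against the ground truth can be sketched as follows. The function below reports the share of ground-truth foreground pixels retained by a reconstruction mask, with ~40% as the readability threshold suggested by the experiment; the mask representation is an assumption.

```python
import numpy as np

def remaining_pixel_pct(recon, truth):
    """Percentage of ground-truth character pixels kept by a reconstruction."""
    kept = np.logical_and(recon, truth).sum()  # pixels present in both masks
    return 100.0 * kept / truth.sum()
```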
Fig. 14. Determining the optimal value for the threshold of PCA and MA to check whether a segmented character is ideal or not, by varying the difference angle given by PCA and MA against the recognition rate. Note: at an angle value of 26, the recognition rate is high compared to the other values.
4.1. Experiments for Analyzing the Contributions of Individual Steps of the Proposed Technique
The major contributions of the proposed technique are partial reconstruction, character segmentation and complete reconstruction. To understand the effectiveness of each step, we conducted experiments on the MIMOS dataset and calculated the respective measures, as reported in Table 1. The reason for selecting the MIMOS dataset is that it consists of live data provided by a research institute. To estimate the quality measures for partial reconstruction and complete reconstruction, we use artificially created Canny edge images of English alphabets as the ground truth. It is noted from the quality measures of partial reconstruction (PR) reported in Table 1 that, except for MSSIM, the other two measures report poor results. This shows that the partial reconstruction step preserves the character structures while at the same time losing some information. It is evident from the recognition results of partial reconstruction reported in Table 1 that all three measures report low results; therefore, one can ascertain that partial reconstruction alone may not help us achieve better results. To analyze the effectiveness of the segmentation step, we apply it to the Canny edge images of the input characters without the partial reconstruction results (SWR). It is observed that all the segmentation measures report low results; in particular, the under- and over-segmentation rates are poor. This shows that the segmentation step alone is inadequate for solving the segmentation problem for license plate images. Similarly, we apply the complete reconstruction algorithm to the Canny edge image of each input image without segmenting characters (CRWS). The results reported in Table 1 show that the quality measures report low results except for MSSIM, and the recognition measures are also poor. Therefore, we can argue that the symmetry features proposed for complete reconstruction are not good when applied to the whole image without segmentation. Overall, we can conclude that reconstruction and character segmentation complement each other to achieve better results.
Fig. 15. Determining the percentage of missing pixels to define partial reconstruction and the threshold value for the angle difference between the PCA and MA angles.
(a) Determining the percentage of missing pixels to define partial reconstruction. Yellow denotes the PCA axis and red the major axis.
Percentage of missing pixels (%): 90 80 70 60 50 40 30 20 10
PCA angle: 89.2 85.48 89.7 88.9 89.8 83.1 78.3 64.1 2.3
Major axis angle: 92.2 92.3 92.2 92.23 92.23 91.99 92.1 92.3 92.2
Difference: 3.0 6.5 2.2 3.4 2.4 8.8 13.8 28.2 89.9
(b) The results of complete reconstruction with recognition results: "0" "0" "0" "0" "0" "0" "U" "U" "-"
Table 1. Performance of the individual steps of the proposed technique on the MIMOS dataset

Steps | Quality measures (PSNR, RMSE, MSSIM) | Segmentation measures (R, P, F, O, U) | Recognition (RR, RP, RF)
PR    | 12.3, 69.4, 0.29                     | -                                     | 23.7, 13.1, 18.5
SWR   | -                                    | 16.3, 8.4, 11.1, 33.3, 26.6           | -
CRWS  | 8.6, 78.6, 0.24                      | -                                     | 14.4, 13.2, 13.8
In the case of license plate recognition, when the images are affected by multiple causes we can sometimes expect a little elongation, such as the effect of perspective distortion. To show the effect of elongation created by multiple causes, we implemented the method of Dhar et al., 2018, which considers extrema points for correcting small tilts to the horizontal direction. In this work, we calculate the quality measures, segmentation measures and recognition rate before and after rectification on the MIMOS dataset, as reported in Table 2. Before rectification, the images are used as input without correcting the small horizontal tilt; after rectification, the corrected images are used. Table 2 shows that all the steps, including the proposed method, give slightly better results after rectification than before, although the difference is marginal. Therefore, we can conclude that, overall, using rectification before recognizing license plates improves the recognition rate slightly.
Table 2. Performance of the individual steps and the proposed method before and after rectification on the MIMOS dataset

Before rectification:
Steps    | Quality (PSNR, RMSE, MSSIM) | Segmentation (R, P, F, O, U) | Recognition (RR, RP, RF)
PR       | 12.3, 69.4, 0.29            | -                            | 23.7, 13.1, 18.5
SWR      | -                           | 16.3, 8.4, 11.1, 33.3, 26.6  | -
CRWS     | 8.6, 78.6, 0.24             | -                            | 14.4, 13.2, 13.8
Proposed | 32.1, 7.1, 0.65             | 86.8, 82.6, 84.6, 10.8, 2.4  | 88.4, 84.3, 86.3

After rectification:
Steps    | Quality (PSNR, RMSE, MSSIM) | Segmentation (R, P, F, O, U) | Recognition (RR, RP, RF)
PR       | 13.7, 65.4, 0.32            | -                            | 27.6, 15.3, 21.4
SWR      | -                           | 18.9, 10.6, 14.7, 30.7, 24.4 | -
CRWS     | 10.4, 74.3, 0.21            | -                            | -
Proposed | 34.7, 6.4, 0.62             | 88.9, 84.3, 86.6, 9.4, 2.1   | 90.6, 87.3, 88.9
4.2. Experiments on the Proposed Character Segmentation Approach
Qualitative results of the proposed technique on license plate images of different datasets, namely, MIMOS, Medialab, UCSD and Uninsubria, are shown in Fig. 16(a)-Fig. 16(b), where we can see that the complexity of the input images varies from one dataset to another due to the multiple factors in the datasets. For such images, the proposed technique segments characters successfully. It is evident that the proposed
(a) Sample input images from the MIMOS, Medialab, UCSD and Uninsubria datasets
(b). Character segmentation results of the proposed technique for the respective images in (a)
Fig. 16. Qualitative results of the proposed technique for character segmentation on different datasets.
technique is robust to multiple factors. Quantitative results of the proposed and existing techniques on the above-mentioned datasets are reported in Table 3, where we note that the proposed technique scores the best on all the measures, especially the under- and over-segmentation rates, which are low compared to those of the existing techniques. Table 3 also shows that all the methods, including the proposed technique, achieve good accuracies on the MIMOS dataset and the lowest on the UCSD dataset. This is because the number of distorted images is higher in the UCSD dataset than in MIMOS and the other datasets. The results of the proposed and existing methods on the challenging data show that the proposed method performs almost the same as on the other license plate datasets, despite the fact that the challenging data does not include any 'good' (easy) images as the other datasets do. The reason for the poor results of the three existing text-detection methods is that their main goal is to detect text in video or natural scene images, not in license plate images. Similarly, although the methods of Dhar et al. (2018), Ingole et al. (2017) and Radchenko et al. (2017) were developed for character segmentation from license plate images, they do not perform well on all the datasets compared to the proposed method. The reason is that these methods depend on profile-based features, binarization and the specific nature of the dataset, as conventional document analysis methods do.
Similarly, quantitative results of the proposed and existing techniques on video and natural scene images are reported in Table 4, where it is observed that the proposed technique is the best in terms of the F-measure and the under- and over-segmentation rates compared to the existing techniques. It may be noted from Table 4 that the proposed technique scores consistent results on all the datasets except MSRA-TD-500. This is because that dataset contains arbitrarily oriented text. Since our aim is to develop a technique for license plate images, where arbitrary orientations rarely occur, the proposed technique gives poor results when characters appear in arbitrary orientations, such as curved text. The reason for the poor results of the existing methods is that all three are sensitive to the starting point, as they need to estimate a minimum cost path. In contrast, the proposed technique requires neither seed points nor starting points to find the spaces between characters. Overall, the segmentation experiments show that the proposed technique is capable of handling license plate as well as video and natural scene images.
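The recall (R), precision (P), F-measure (F) and over-/under-segmentation rates (O, U) reported in Tables 3 and 4 can be computed from per-image counts. A minimal sketch, assuming the standard definitions (the count names below are ours, not from the paper):

```python
def segmentation_measures(correct, detected, actual, over, under):
    """Recall, precision, F-measure and over-/under-segmentation rates
    from raw counts (definitions assumed, as the paper does not spell them out).

    correct  - correctly segmented characters
    detected - total character segments produced by the method
    actual   - ground-truth characters in the image
    over     - segments that split one character into several pieces
    under    - segments that merge several characters into one
    """
    recall = correct / actual
    precision = correct / detected
    f_measure = 2 * precision * recall / (precision + recall)
    over_rate = over / detected
    under_rate = under / detected
    return recall, precision, f_measure, over_rate, under_rate


# Example: 84 of 100 ground-truth characters segmented correctly
# from 95 produced segments, with 6 over- and 5 under-segmentations.
r, p, f, o, u = segmentation_measures(84, 95, 100, 6, 5)
```

A low O and U alongside a high F is exactly the pattern the proposed technique shows in Tables 3 and 4.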
Table 3. Performance of the proposed and existing techniques for character segmentation on different license
plate datasets
Datasets | Measures | Phan et al., 2011 | Khare et al., 2015 | Sharma et al., 2013 | Dhar et al., 2018 | Ingole et al., 2017 | Radchenko et al., 2017 | Proposed
MIMOS
R 39.4 58.4 68.3 73.4 74.6 65.3 86.8
P 38.4 57.3 66.9 72.3 70.4 63.3 82.6
F 38.7 57.5 67.5 72.8 72.5 64.3 84.6
O 21.1 23.2 14.9 14.8 15.3 18.3 10.8
U 38.7 18.4 16.7 12.4 12.2 17.4 2.4
Medialab R 34.3 51.3 54.7 69.7 70 59.4 82.1
P 33.6 47.3 42.1 64.2 67.4 55.4 81.6
F 33.9 49.3 48.3 66.9 68.7 57.4 81.6
O 24.2 25.2 19.7 21.1 20.4 42.6 10.1
U 39.6 20.6 22.6 12.7 10.9 23.3 7.9
UCSD
R 21.3 26.1 41.3 35.2 47.2 29.6 56.7
P 20.4 22.4 36.9 30.6 40.7 27.4 53.4
F 20.8 24.6 39.1 32.9 43.9 28.5 55.1
O 35.5 43.1 26.4 39.7 34 35.9 12.9
U 45.7 30.6 32.6 27.4 22.1 35.6 29.8
Uninusubria
R 31.4 42.7 61.3 41.6 53.6 48.7 75.7
P 30.5 41.6 57.4 39.8 50.9 46.1 66.4
F 30.9 42.1 59.3 40.7 52.2 47.4 71.1
O 35.7 28.9 22.3 31.9 26.9 24.3 12.3
U 32.9 28.4 16.4 27.4 20.8 28.3 12.6
Only Challenged Images
R 33.4 47.2 57.4 43.6 54.8 51.6 72.1
P 36.2 42.3 52.3 41.1 50.4 47.9 73.4
F 34.7 44.7 54.8 42.3 52.6 49.7 72.6
O 34.3 29.7 24.6 33.5 24.8 26.8 13.6
U 30.9 25.5 20.6 24.1 22.6 23.4 13.8
Table 4. Performance of the proposed and existing techniques for character segmentation on different video and
natural scene datasets
Datasets | Measures | Phan et al., 2011 | Khare et al., 2015 | Sharma et al., 2013 | Dhar et al., 2018 | Ingole et al., 2017 | Radchenko et al., 2017 | Proposed
ICDAR 2015
Video
R 22.6 37.9 60.7 39.4 55.3 46.8 66.9
P 24.6 34.2 58.3 37.4 53.9 42.8 62.4
F 23.2 36.1 59.5 38.4 54.6 44.8 64.6
O 38.7 28.1 23.4 31.9 24.1 30.9 18.7
U 36.4 35.8 17.1 29.7 21.3 24.3 18.1
YVT Video
R 30.6 38.2 51.6 51 65.9 57.4 74.9
P 29.7 41.6 52.4 49.1 60.3 56.3 73.4
F 30.1 39.9 52.1 50 63.1 56.8 73.8
O 32.7 32.6 24.3 29.3 24.5 24.8 16.2
U 36.1 29.8 23.6 20.7 12.4 18.3 14.9
ICDAR 2013
Video
R 28.9 37.4 52.6 54.2 66.8 54.5 71.2
P 27.6 39.1 51.4 50.3 63.4 52.2 70.9
F 28.1 38.5 51.9 52.2 65.1 53.3 71.1
O 42.4 33.2 17.8 30.3 20.8 29.2 13.5
U 30.6 28.9 29.4 17.4 14.1 17.4 18.4
ICDAR 2015
Scene Dataset
R 31.4 42.7 61.3 56.7 58.3 54.1 71.3
P 30.5 41.6 57.4 52.2 52.8 48.7 69.4
F 30.9 42.1 59.3 54.7 55.5 51.4 70.7
O 35.7 28.9 22.3 29.1 31.9 27 14.2
U 32.9 28.4 16.4 16.4 12.5 21.6 16.8
ICDAR 2013
Scene Dataset
R 32.6 40.7 61.1 59.3 54.3 53.8 76.8
P 32.5 46.5 52.2 52.6 53.7 47.2 72.3
F 32.5 43.4 56.6 55.9 54 50.5 74.5
O 37.1 29.7 21.3 23.6 25.3 25.4 16.3
U 32.4 26.9 22.0 20.4 20.7 24.1 9.1
SVT Scene
Dataset
R 21.4 38.6 61.3 43.4 50.6 47.9 64.7
P 20.4 31.4 57.4 39.2 45.8 42.7 61.9
F 20.9 35.0 59.3 41.3 48.2 45.3 63.3
O 41.3 22.9 22.3 31.6 27.5 30.8 12.6
U 37.2 42.1 16.4 27.1 24.3 23.9 24.1
MSRA-TD-500
Dataset
R 22.4 26.7 42.1 30.7 38.4 34.3 59.3
P 23.6 24.4 32.1 28.4 35.7 29.4 57.3
F 22.7 25.6 37.1 29.5 37 31.8 58.6
O 32.4 41.2 29.3 42.6 36.6 39.9 22.7
U 46.9 35.7 31.8 27.8 26.3 28.2 28.9
4.3. Experiments on the Proposed Character Recognition Technique through Reconstruction
Qualitative results of the proposed and existing techniques for the recognition of license plate images from different datasets are shown in Fig. 17(a)-Fig. 17(d) for MIMOS, Medialab, UCSD and Uninsubria, respectively. The recognition step takes the output of the segmentation step as input, as shown in Fig. 17, to reconstruct the shapes of the segmented characters. This leads to the conclusion that the proposed technique reconstructs character shapes well across datasets affected by different factors, which is validated by the recognition results given by the OCR in double quotes in Fig. 17. Thus, we can assert that the proposed technique does not require binarization for recognition. To evaluate the reconstruction results given by the proposed and existing techniques, we estimate quality measures, which are reported in Table 5 for the license plate, video and natural scene datasets. Since Tian et al.'s method (Tian et al., 2015) outputs reconstruction results for recognition as our technique does, unlike the other existing methods, the proposed technique is compared with Tian et al.'s method only. Table 5 shows that the proposed technique is better than the existing method in terms of the three quality measures on all three types of datasets. It is also observed from Table 5 that the proposed method performs almost the same on the challenging data as on the other datasets. The main reason for the poor results of the existing method is that it depends on gradient information, which gives good responses only for high contrast images when reconstructing character images, while the proposed technique uses
(a) Reconstruction and recognition results, "HSP6598", on MIMOS
(b) Reconstruction and recognition results, "M.IZ8563", on Medialab
(c) Reconstruction and recognition results, "LAPVhD9", on UCSD
(d) Reconstruction and recognition results, "ZG0013JP", on Uninsubria
Segmented character images | Reconstruction
Fig. 17. Qualitative results of the proposed technique for reconstruction and recognition on different license plate images
both gradient and intensity information for reconstruction to handle the images affected by multiple
factors.
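The RMSE and PSNR measures of Table 5 can be computed directly from the reconstructed and ground-truth character images. A minimal NumPy sketch under the usual definitions (MSSIM would additionally require a windowed SSIM implementation, e.g. scikit-image's structural_similarity; the toy images below are ours):

```python
import numpy as np


def rmse(reference, reconstructed):
    """Root-mean-square error between a ground-truth character image
    and its reconstruction (both grayscale, same shape)."""
    diff = reference.astype(np.float64) - reconstructed.astype(np.float64)
    return np.sqrt(np.mean(diff ** 2))


def psnr(reference, reconstructed, peak=255.0):
    """Peak signal-to-noise ratio in dB; higher means a closer reconstruction."""
    e = rmse(reference, reconstructed)
    return float('inf') if e == 0 else 20 * np.log10(peak / e)


# Toy example: a reconstruction that is off by 10 gray levels everywhere
# gives RMSE 10 and a PSNR of roughly 28 dB.
ref = np.full((32, 32), 200, dtype=np.uint8)
rec = np.full((32, 32), 190, dtype=np.uint8)
```

Lower RMSE and higher PSNR, as the proposed technique obtains in Table 5, both indicate reconstructions closer to the ground truth.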
Quantitative recognition results of the proposed and existing techniques on the license plate, video and natural scene datasets are reported in Table 6. These experiments include recognition results using Canny edge images of the input character images, where the Canny edge images are passed to the OCR directly without reconstruction. To demonstrate that a Canny edge image alone, without reconstruction, is inadequate to achieve good results, we conducted recognition experiments by passing the Canny edge images to the OCR directly. This can be verified from the results reported in Table 6, where the recognition results with Canny images fall far short of those of the proposed technique in terms of all three measures. This is because Canny is sensitive to blur and complex backgrounds. We can also observe from Table 6 that the proposed technique achieves better results than the existing methods on the complex datasets, namely, MIMOS, UCSD, YVT video, SVT, MSRA-TD-500 and the challenging dataset. On the other datasets, the existing method of Silva et al. (2018) achieves better results than all the methods, including the proposed one. This is justifiable because that method explores a powerful deep learning model for unconstrained license plate recognition. It is evident that the methods of Bulan et al. (2017) and Lin et al. (2018), which also explore deep learning models for license plate recognition, achieve better results than all the other existing methods, but both perform worse than the proposed method. However, the difference between the method of Silva et al. (2018) and the proposed method is marginal. Besides, the results on the difficult data show that the proposed method is effective in tackling challenges, as it reports almost the same performance as on the other datasets. Therefore, the proposed technique is robust and generic compared to the existing methods. The major weaknesses of the existing methods are as follows. Since the gradient used in Tian et al.'s method suits high contrast images, it gives poor results for low contrast images; Zhou et al.'s method is developed for high contrast and homogeneous background images and therefore gives poor results otherwise; and Anagnostopoulos et al.'s method involves binarization and parameter tuning, which leads to poor results for images affected by multiple factors. Conversely, the proposed technique does not depend on binarization and explores the combination of gradient and intensity for reconstruction through character segmentation, and it performs better than the existing methods, especially on datasets containing images affected by multiple challenging factors.
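The weakness of the edge-only baseline on low contrast plates can be illustrated with a simplified edge detector (a plain gradient-magnitude threshold standing in for Canny; the threshold and pixel values below are ours): a faint stroke falls entirely below the threshold, leaving the OCR nothing to read, whereas a reconstruction restores the full stroke.

```python
import numpy as np


def edge_map(gray, thresh=80):
    """Simplified edge baseline: central-difference gradient magnitude,
    thresholded to a binary map (a stand-in for the Canny images of Table 6)."""
    g = gray.astype(np.float64)
    gx = np.zeros_like(g)
    gy = np.zeros_like(g)
    gx[:, 1:-1] = g[:, 2:] - g[:, :-2]   # horizontal gradient
    gy[1:-1, :] = g[2:, :] - g[:-2, :]   # vertical gradient
    return (np.hypot(gx, gy) > thresh).astype(np.uint8)


# A low contrast stroke (gray level 135 on a 120 background) produces no
# edges at all, while a high contrast stroke (250) produces clean edges.
img = np.full((16, 16), 120, dtype=np.uint8)
img[:, 7:9] = 135                 # faint vertical stroke
edges = edge_map(img)
```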
Table 5. Performance of the proposed and existing techniques for reconstruction on different license, video and
natural scene datasets
Methods Tian et al., 2015 Proposed
Datasets RMSE PSNR MSSIM RMSE PSNR MSSIM
MIMOS 22.7 19.9 0.74 7.1 32.1 0.65
Medialab 42.7 21.8 0.79 12.4 26.3 0.60
UCSD 69.0 19.7 0.59 31.7 22.4 0.6
Uninsubria 72.4 8.4 0.52 26.3 23.8 0.4
ICDAR 2015 Video 63.5 11.7 0.61 19.7 23.9 0.63
YVT Video 55.3 16.1 0.68 16.3 24.5 0.67
ICDAR 2013 Video 63.6 11.7 0.61 18.4 24.0 0.60
ICDAR 2015 Scene 57.3 15.4 0.67 18.41 24.0 0.64
ICDAR 2013 Scene 57.3 15.4 0.67 16.2 24.6 0.65
SVT Scene 62.1 12.4 0.62 22.3 23.8 0.61
MSRA-TD 500 Scene 68.7 10.7 0.59 26.1 23.7 0.55
Only Challenged 39.2 12.5 0.57 15.6 20.4 0.48
Table 6. Performance of the proposed and existing techniques for recognition on different license, video and natural
scene datasets
Datasets | Measures | Canny | Anagnostopoulos et al., 2006 | Zhou et al., 2013 | Tian et al., 2015 | Bulan et al., 2017 | Silva et al., 2018 | Lin et al., 2018 | Proposed
MIMOS
RR 58.7 63.2 47.4 57.6 86.3 89.3 78.3 88.4
RP 54.3 64.7 52.3 59.7 82.6 83.2 74.9 84.3
RF 56.4 63.8 50.3 58.6 84.5 86.2 76.6 86.3
Medialab
RR 59.3 64.7 52.4 61.2 83.7 86.4 75.6 82.3
RP 52.4 66.9 56.8 62.7 75.3 82.3 71.9 79.3
RF 55.3 65.7 54.6 61.6 79.5 84.3 73.7 81.3
UCSD
RR 29.2 42.3 47.2 44.9 52.4 58.3 51.7 65.7
RP 32.7 44.7 48.1 46.2 47.4 55.3 49.5 62.1
RF 31.3 43.6 47.6 45.5 49.9 56.8 50.6 63.9
Uninsubria
RR 62.4 65.3 68.3 64.3 76.4 78.4 77.1 78.7
RP 66.7 68.7 69.4 69.4 72.4 75.3 77.4 80.3
RF 64.8 66.9 68.8 67.1 74.4 76.8 77.2 79.5
ICDAR 2015
Video
RR 66.2 68.9 71.8 72.6 83.4 86.4 84.3 78.6
RP 61.3 75.7 72.3 72.7 81.3 80.3 78.9 73.4
RF 63.7 72.7 72.1 72.6 82.3 83.3 81.6 76.2
YVT Video
RR 72.4 72.9 66.9 71.4 83.4 85.9 84.9 78.3
RP 65.3 77.8 70.3 74.8 79.2 81.4 78.8 82.6
RF 68.4 75.8 68.7 72.9 81.3 83.6 81.8 80.5
ICDAR 2013
Video
RR 68.2 78.7 71.3 74.9 81.6 83.2 79.5 83.7
RP 61.3 79.3 68.9 71.3 80.4 81.5 78.4 84.2
RF 65.7 78.5 69.8 72.8 81 82.3 78.9 83.5
ICDAR 2015
Scene
RR 66.8 77.3 72.1 65.3 82.3 85.7 80.2 80.3
RP 67.3 72.1 74.3 62.1 81.4 84.4 80.3 82.1
RF 66.9 75.2 73.6 64.4 81.8 85 80.2 81.5
ICDAR 2013
Scene
RR 59.3 72.3 71.3 65.6 83.1 86.1 81.4 78.3
RP 56.3 72.4 68.7 64.3 81.5 84.3 80.4 73.2
RF 58.6 72.3 70.1 64.9 82.3 85.2 80.9 75.8
SVT Scene
RR 58.3 76.4 71.4 66.3 78.2 79.3 76.4 80.4
RP 59.7 78.3 74.7 67.2 77.3 76.3 74.8 81.6
RF 58.6 77.9 73.1 66.8 77.7 77.8 75.6 81.0
MSRA-TD-500
Scene
RR 64.3 73.9 75.9 72.4 78.4 81.3 74.9 82.4
RP 65.8 76.4 74.3 77.3 74.8 80.4 73.8 81.6
RF 64.9 75.9 75.1 75.4 76.6 80.8 74.3 81.9
Only Challenged Images
RR 58.7 51.9 47.4 57.6 54.8 57.6 55.9 62.9
RP 54.3 52.3 52.3 59.7 51.7 56.6 51.4 65.7
RF 56.5 52.1 49.8 58.6 53.2 57.1 53.6 64.3
Overall, to show that the proposed method is robust to the multiple adverse factors mentioned in the Introduction and Proposed Methodology sections, we present sample results of each step on different images affected by low contrast, complex background, multiple fonts, multiple font sizes, blur and distortion due to perspective angle, as shown respectively in Fig. 18(a)-Fig. 18(f), which include the results of partial reconstruction, character segmentation, full reconstruction and recognition. One can assert from
the results shown in Fig. 18 that the proposed method has significant benefits in handling multiple adverse factors. If a license plate image contains a logo or symbol, as shown in Fig. 18(c), the segmentation algorithm segments the symbols as characters; however, when the result is sent to the OCR, recognition of the symbols fails, as shown in Fig. 18(c). As a result, the presence of symbols in license plate images does not affect the overall performance of the technique. It is evident from the results reported in Table 3, Table 5 and Table 6 on the challenging data that the proposed method performs almost the same as on the other data. Note that for the recognition experiments we use a publicly available OCR, which has inherent limitations regarding image size, font variation and orientation. As a result, despite the fact that the proposed method reconstructs character shapes, it does not achieve accuracies above 90%. Since our target is to address the above challenges and to develop a generalized method, we prefer to use available OCR engines to demonstrate the effectiveness and usefulness of the proposed method rather than language models, lexicons or learning models, because these restrict generality. Therefore, we believe the proposed work makes an important statement: there is a way to handle adverse factors such that one can use machine learning or deep learning on the reconstructed results as input to achieve high accuracy instead of using a traditional OCR. Achieving high accuracy on the MIMOS dataset by exploring deep learning is our next target.
To show the effectiveness of the proposed method on license plate images of different countries, we also test the steps of the proposed method on American license plate images, as shown in Fig. 19, where it is noted that each step and the overall method work well for American license plate images. This is the advantage of the steps proposed in this work, i.e. stroke width pair candidate detection, partial reconstruction, character segmentation, complete reconstruction and recognition. This shows that the proposed method is independent of scripts.
(a) Sample results of key steps for a low contrast license plate image
(b) Sample results of key steps for a complex background image
(c) Sample results of key steps for a multi-font image
(d) Sample results of key steps for a multi-font size image
(e) Sample results of key steps for a blurred image
(f) Sample results of key steps for a perspective distortion image
Recognition outputs shown in the figure: "5GUY95Z", "SYOISI S", "LAU PA 78", "LAYK317", "YY'-5673", "LII'I3B1"
Fig. 18. Overall performance of the proposed method on images affected by multiple adverse factors. Column-1 to Column-5 denote the input images with different causes, the results of partial reconstruction, the result of character segmentation, the result of full reconstruction, and recognition, respectively.
To test the effect of scaling on license plate recognition, we calculate the recognition rate of the proposed method at different scales, as shown in Fig. 20. If the image is too small (i.e. the character image is 4×4 pixels), the proposed method reports poor results, as shown in Fig. 20. Such small sizes are rare in license plate recognition. However, for sizes greater than 16×16, the proposed method gives better results. This shows that different scales do not have much effect on the overall performance of the proposed method. Therefore, we can conclude that the proposed method is invariant to scaling. This is justifiable because the proposed features based on stroke width distance are invariant to scaling.
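The scaling experiment can be reproduced by resizing character crops to the probed sizes. A pure-NumPy nearest-neighbour resize sketch (a minimal stand-in for a library resize such as cv2.resize; the 4×4 and 16×16 bounds are taken from the text, the crop itself is synthetic):

```python
import numpy as np


def resize_nearest(img, size):
    """Nearest-neighbour resize of a grayscale crop to size x size."""
    h, w = img.shape
    rows = np.arange(size) * h // size   # source row for each output row
    cols = np.arange(size) * w // size   # source column for each output column
    return img[np.ix_(rows, cols)]


# Probe the bounds discussed for Fig. 20: 4x4 is below the usable range,
# while 16x16 and larger keep enough stroke structure for recognition.
char = np.random.default_rng(0).integers(0, 256, (32, 32), dtype=np.uint8)
tiny = resize_nearest(char, 4)
usable = resize_nearest(char, 16)
```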
Fig. 19. Examples of the proposed stroke width pair candidate detection, reconstruction, segmentation and recognition approaches for American license plate images.
(a) Inputs of different American license plates
(b) Stroke width pair candidate detection
(c) Stroke width pair candidate pixels
(d) The result of the partial reconstruction step
(e) The results of character segmentation from the results of partial reconstruction in (d)
(f) The results of the complete reconstruction step
(g) Recognition results for the outputs of the complete reconstruction step: "6DZG261", "6WDG928", "4NQE750"
5. Conclusions and Future Work
We have proposed a novel technique for recognizing license plate, video and natural scene images through reconstruction. The proposed technique explores gradient and Laplacian symmetry features based on stroke width distance to obtain a partial reconstruction for segmenting characters. To segment characters affected by multiple factors such as low contrast, blur, complex backgrounds and illumination variations, we introduce angular information on the partial reconstruction results based on character structures, which successfully resolves under- and over-segmentation. For the segmented characters, the proposed technique explores symmetry features based on stroke width distance and tangent direction in the gray domain to restore complete shapes from the partial reconstruction results. Comprehensive experiments are conducted on large datasets, including license plate, video and natural scene images, to show that the proposed technique is robust and generic compared to existing methods. In the near future, the same idea can be extended with the help of deep learning to images of different scripts from other countries, such as Indian, Russian, Arabic and European, to develop a generic system.
Acknowledgements
This work was supported by the Natural Science Foundation of China under Grant 61672273 and Grant
61832008, and the Science Foundation for Distinguished Young Scholars of Jiangsu under Grant
BK20160021. This work is also partly supported by the University of Malaya under Grant No:
UM.0000520/HRU.BK (BKS003-2018).
Fig. 20. Recognition rate of the proposed method at different scales to find the lower and upper boundaries for scaling up and down (with examples of differently scaled images).
The authors would like to thank the anonymous reviewers and the Editor for their constructive comments
and suggestions to improve the quality and clarity of this paper.
References
Abolghasemi, V., & Ahmadyfard, A. (2009). An edge-based color-aided method for license plate detection. Image
and Vision Computing, 27(8), 1134-1142.
Al-Ghaili, A. M., Mashohor, S., Ramli, A. R., & Ismail, A. (2013). Vertical-edge-based car-license-plate detection
method. IEEE transactions on vehicular technology, 62(1), 26-38.
Al-Shemarry, M. S., Li, Y., & Abdulla, S. (2018). Ensemble of adaboost cascades of 3L-LBPs classifiers for license
plates detection with low quality images. Expert Systems With Applications, 92, 216-235.
Anagnostopoulos, C. N. E., Anagnostopoulos, I. E., Loumos, V., & Kayafas, E. (2006). A license plate-recognition
algorithm for intelligent transportation system applications. IEEE Transactions on Intelligent transportation
systems, 7(3), 377-392.
Azam, S., & Islam, M. M. (2016). Automatic license plate detection in hazardous condition. Journal of Visual
Communication and Image Representation, 36, 172-186.
Ben-Ami, I., Basha, T., & Avidan, S. (2012). Racing Bib Numbers Recognition. In BMVC (pp. 1-10).
Bulan, O., Kozitsky, V., Ramesh, P., & Shreve, M. (2017). Segmentation-and annotation-free license plate
recognition with deep localization and failure identification. IEEE Trans. ITS, pp 2351-2363.
Dhar, P., Guha, S., Biswas, T., & Abedin, M. Z. (2018). A System Design for License Plate Recognition by Using
Edge Detection and Convolution Neural Network. In Proc. IC4ME2, pp. 1-4.
Dong, M., He, D., Luo, C., Liu, D. and Zeng, W., (2017). A CNN-based approach for automatic license plate
recognition in the wild. In Proc. BMCV, pp 1-12.
Du, S., Ibrahim, M., Shehata, M., & Badawy, W. (2013). Automatic license plate recognition (ALPR): A state-of-
the-art review. IEEE Transactions on circuits and systems for video technology, 23(2), 311-325.
Epshtein, B., Ofek, E., & Wexler, Y. (2010, June). Detecting text in natural scenes with stroke width transform. In
Computer Vision and Pattern Recognition (CVPR), 2010 IEEE Conference on (pp. 2963-2970). IEEE.
Goodfellow, I. J., Bulatov, Y., Ibarz, J., Arnoud, S., & Shet, V. (2013). Multi-digit number recognition from street
view imagery using deep convolutional neural networks. arXiv preprint arXiv:1312.6082.
Gou, C., Wang, K., Yao, Y., & Li, Z. (2016). Vehicle license plate recognition based on extremal regions and
restricted Boltzmann machines. IEEE Transactions on Intelligent Transportation Systems, 17(4), 1096-1107.
Ingole, S. K., & Gundre, S. B. (2017). Characters feature based Indian Vehicle license plate detection and
recognition. In Proc. I2C2, pp. 1-5.
Jaderberg, M., Simonyan, K., Vedaldi, A., & Zisserman, A. (2016). Reading text in the wild with convolutional
neural networks. International Journal of Computer Vision, 116(1), 1-20.
Karatzas, D., Gomez-Bigorda, L., Nicolaou, A., Ghosh, S., Bagdanov, A., Iwamura, M., ... & Shafait, F. (2015,
August). ICDAR 2015 competition on robust reading. In Document Analysis and Recognition (ICDAR), 2015
13th International Conference on (pp. 1156-1160). IEEE.
Karatzas, D., Shafait, F., Uchida, S., Iwamura, M., i Bigorda, L. G., Mestre, S. R., ... & De Las Heras, L. P. (2013,
August). ICDAR 2013 robust reading competition. In Document Analysis and Recognition (ICDAR), 2013 12th
International Conference on (pp. 1484-1493). IEEE.
Khare, V., Shivakumara, P., Raveendran, P., Meng, L. K., & Woon, H. H. (2015, November). A new sharpness
based approach for character segmentation in License plate images. In Pattern Recognition (ACPR), 2015 3rd
IAPR Asian Conference on (pp. 544-548). IEEE.
Kim, D., Song, T., Lee, Y., & Ko, H. (2016, January). Effective character segmentation for license plate recognition
under illumination changing environment. In Consumer Electronics (ICCE), 2016 IEEE International
Conference on (pp. 532-533). IEEE.
Liang, G., Shivakumara, P., Lu, T., & Tan, C. L. (2015, August). A new wavelet-Laplacian method for arbitrarily-
oriented character segmentation in video text lines. In Document Analysis and Recognition (ICDAR), 2015
13th International Conference on (pp. 926-930). IEEE.
Lin, C. H., Lin, Y. S., & Liu, W. C. (2018). An efficient license plate recognition system using convolution neural
networks. In Proc. ICASI pp. 224-227.
Liu, W., Wang, Z., Liu, X., Zeng, N., Liu, Y., & Alsaadi, F. E. (2017). A survey of deep neural network
architectures and their applications. Neurocomputing, 234, 11-26.
Nguyen, P. X., Wang, K., & Belongie, S. (2014, March). Video text detection and recognition: Dataset and
benchmark. In Applications of Computer Vision (WACV), 2014 IEEE Winter Conference on (pp. 776-783).
IEEE.
Peyrard, C., Baccouche, M., Mamalet, F., & Garcia, C. (2015, August). ICDAR2015 competition on text image
super-resolution. In Document Analysis and Recognition (ICDAR), 2015 13th International Conference on (pp.
1201-1205). IEEE.
Phan, T. Q., Shivakumara, P., Su, B., & Tan, C. L. (2011, September). A gradient vector flow-based method for
video character segmentation. In Document Analysis and Recognition (ICDAR), 2011 International Conference
on (pp. 1024-1028). IEEE.
Radchenko, A., Zarovsky, R., & Kazymyr, V. (2017). Method of segmentation and recognition of Ukrainian license
plates. In Proc. YSF, pp. 62-65.
Raghunandan, K. S., Shivakumara, P., Jalab, H. A., Ibrahim, R. W., Kumar, G. H., Pal, U., & Lu, T. (2017). Riesz
fractional based model for enhancing license plate detection and recognition. IEEE Transactions on Circuits and
Systems for Video Technology.
Rathore, M. M., Ahmad, A., Paul, A., & Rho, S. (2016). Urban planning and building smart cities based on the
internet of things using big data analytics. Computer Networks, 101, 63-80.
Saha, S., Basu, S., & Nasipuri, M. (2015). iLPR: an Indian license plate recognition system. Multimedia Tools and
Applications, 74(23), 10621-10656.
Sedighi, A., & Vafadust, M. (2011). A new and robust method for character segmentation and recognition in license
plate images. Expert Systems with Applications, 38(11), 13497-13504.
Sharma, N., Shivakumara, P., Pal, U., Blumenstein, M., & Tan, C. L. (2013, August). A new method for character
segmentation from multi-oriented video words. In Document Analysis and Recognition (ICDAR), 2013 12th
International Conference on (pp. 413-417). IEEE.
Shivakumara, P., Dutta, A., Tan, C. L., & Pal, U. (2014). Multi-oriented scene text detection in video based on
wavelet and angle projection boundary growing. Multimedia tools and applications, 72(1), 515-539.
Shivakumara, P., Phan, T. Q., Bhowmick, S., Tan, C. L., & Pal, U. (2013). A novel ring radius transform for video
character reconstruction. Pattern Recognition, 46(1), 131-140.
Shivakumara, P., Roy, S., Jalab, H. A., Ibrahim, R. W., Pal, U., Lu, T., ... & Wahab, A. W. B. A. (2019). Fractional
means based method for multi-oriented keyword spotting in video/scene/license plate images. Expert Systems
with Applications, 118, 1-19.
Silva, S. M., & Jung, C. R. (2018). License Plate Detection and Recognition in Unconstrained Scenarios. In Proc.
ECCV, pp. 593-609.
Suresh, K. V., Kumar, G. M., & Rajagopalan, A. N. (2007). Superresolution of license plates in real traffic videos.
IEEE Transactions on Intelligent Transportation Systems, 8(2), 321-331.
Tadic, V., Popovic, M., & Odry, P. (2016). Fuzzified Gabor filter for license plate detection. Engineering
Applications of Artificial Intelligence, 48, 40-58.
Tesseract OCR software (2016) http://vision.ucsd.edu/belongie-grp/research/carRec/car_rec.html
Tian, J., Wang, R., Wang, G., Liu, J., & Xia, Y. (2015). A two-stage character segmentation method for Chinese
license plate. Computers & Electrical Engineering, 46, 539-553.
Tian, S., Shivakumara, P., Phan, T. Q., Lu, T., & Tan, C. L. (2015). Character shape restoration system through
medial axis points in video. Neurocomputing, 161, 183-198.
Wang, K., & Belongie, S. (2010, September). Word spotting in the wild. In European Conference on Computer
Vision (pp. 591-604). Springer, Berlin, Heidelberg.
Wang, Y., Shi, C., Xiao, B., & Wang, C. (2015, August). MRF based text binarization in complex images using
stroke feature. In Document Analysis and Recognition (ICDAR), 2015 13th International Conference on (pp.
821-825). IEEE.
Yang, Y., Li, D., & Duan, Z. (2017). Chinese vehicle license plate recognition using kernel-based extreme learning
machine with deep convolutional features. IET Intelligent Transport Systems, pp 213-219.
Yao, C., Bai, X., Liu, W., Ma, Y., & Tu, Z. (2012, June). Detecting texts of arbitrary orientations in natural images.
In 2012 IEEE Conference on Computer Vision and Pattern Recognition (pp. 1083-1090). IEEE.
Ye, Q., & Doermann, D. (2015). Text detection and recognition in imagery: A survey. IEEE transactions on pattern
analysis and machine intelligence, 37(7), 1480-1500.
Yu, S., Li, B., Zhang, Q., Liu, C., & Meng, M. Q. H. (2015). A novel license plate location method based on wavelet
transform and EMD analysis. Pattern Recognition, 48(1), 114-125.
Yuan, Y., Zou, W., Zhao, Y., Xin'an Wang, Hu, X., & Komodakis, N. (2017). A Robust and Efficient Approach to
License Plate Detection. IEEE Trans. Image Processing, 26(3), 1102-1114.
Zamberletti, A., Gallo, I., & Noce, L. (2015, November). Augmented text character proposals and convolutional
neural networks for text spotting from scene images. In Pattern Recognition (ACPR), 2015 3rd IAPR Asian
Conference on (pp. 196-200). IEEE.
Zhou, W., Li, H., Lu, Y., & Tian, Q. (2012). Principal visual word discovery for automatic license plate detection.
IEEE transactions on image processing, 21(9), 4269-4279.
Zhou, Y., Feild, J., Learned-Miller, E., & Wang, R. (2013, August). Scene text segmentation via inverse rendering.
In Document Analysis and Recognition (ICDAR), 2013 12th International Conference on (pp. 457-461). IEEE.