The aim of this paper is to build a real-time eyeglass detection framework based on deep features extracted from facial or ocular images, a key component in forensic analysis, authentication systems, and related applications. Eyeglass detection methods have generally been built on cleaned, carefully curated facial datasets; the resulting models perform well, but even slight deviations from those conditions degrade performance, giving poor results on real-world non-standard facial images. A robust model is therefore introduced that is trained on custom non-standard facial data. A pre-trained convolutional neural network (CNN) based on the Inception V3 architecture is fine-tuned via its hyper-parameters to achieve high accuracy and good precision on non-standard facial images in real time. The model reached accuracy scores of about 99.2% and 99.9% on the training and testing datasets respectively, in little time, demonstrating its robustness across conditions.
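The abstract above reports accuracy and precision on a binary glasses/no-glasses task. As a small illustration of how those two metrics are computed from predictions (a sketch with made-up labels, not the paper's code):

```python
import numpy as np

def accuracy_precision(y_true, y_pred):
    """Accuracy and precision for a binary (1 = glasses, 0 = no glasses) classifier."""
    y_true = np.asarray(y_true)
    y_pred = np.asarray(y_pred)
    tp = np.sum((y_pred == 1) & (y_true == 1))   # true positives
    fp = np.sum((y_pred == 1) & (y_true == 0))   # false positives
    accuracy = float(np.mean(y_pred == y_true))
    precision = float(tp / (tp + fp)) if (tp + fp) > 0 else 0.0
    return accuracy, precision

# Hypothetical ground truth and model output for five test images.
acc, prec = accuracy_precision([1, 0, 1, 1, 0], [1, 0, 1, 0, 0])
```

Here the model misses one positive (hurting accuracy) but every positive prediction is correct, so precision stays at 1.0 — which is why the paper reports both numbers.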
The main aim of Data-Centric Architecture is to reduce the complexity of information systems by using shared data with clear meaning. But how can you trust your data? How do you know whether it is accurate and reliable?
The document discusses cloud computing standards and defines cloud computing. It notes that cloud computing converges many technologies and represents a leap in capabilities. Standards will be important to ensure interoperability between proprietary and standards-based clouds. The document proposes that the US government establish minimum standards and architecture to enable agencies to create interoperable cloud capabilities through a federal cloud infrastructure. This would promote standardization, security, and application portability across agency clouds without limiting innovation.
The document discusses the differences between the deep web and surface web. The deep web refers to content that is not indexed by typical search engines, as it is stored in dynamic databases rather than static web pages. It contains over 500 times more information than the surface web. Some key differences are that deep web content is accessed through direct database queries rather than URLs, and search results are generated dynamically rather than having fixed URLs. Specialized search engines are needed to access the deep web.
The document discusses an introduction to Android development in Uganda. It provides an overview of key Android concepts like activities, intents, services, and user interface design. It encourages attendees to get hands-on with Android app development by exploring the Android framework and APIs, and mentions prerequisites like Java programming skills. The document also highlights example Android views, layouts and app components to help explain building basic Android apps.
The document discusses fog computing as a complement to cloud computing for handling IoT data. It describes fog computing as distributing computing to network edges near IoT devices to overcome limitations of centralized cloud systems. It outlines key aspects of fog including scalability, security, programmability, and low latency. Application examples where fog may provide benefits over cloud are discussed such as connected vehicles, fire detection, and real-time health analytics. Major tech companies are working to advance fog computing standards and technologies.
This document provides an introduction to the deep web, including its size, evolution, and how it can be accessed. The deep web refers to content that is not indexed by standard search engines and is much larger than the surface web that search engines can index. It includes dynamically generated pages that can only be accessed through a form and private pages that require login credentials. Sites on the deep web can only be accessed using special browsers and protocols like Tor that allow for anonymous surfing through onion addresses. While some deep web sites provide legal content, others are used for illegal activities and information sharing.
This document discusses an online FIR (First Information Report) system. The proposed online FIR system allows citizens to register FIRs online from anywhere using the internet. It also allows police and administrators to access the system to update and view logged FIRs and complaint statuses. The system uses a SQL Server 2008 backend and ASP.Net 4.0 frontend. Key features include online crime reporting, viewing missing persons lists and wanted lists, and tracking FIR statuses. Users, police, and administrators each have different access rights within the system.
This document outlines a presentation on cloud storage. It begins with definitions of cloud storage, including that data is stored on servers owned by hosting companies and accessed via APIs. The history of cloud storage from the 1960s to modern services like Dropbox is discussed. Advantages include easy sharing of files, large storage capacity, and online editing. Potential concerns cited are security, technical problems, and unclear data ownership. Applications include personal, public, business and hybrid cloud storage. The conclusion is that cloud storage brings convenience but users should be aware of security risks, and its use will likely continue increasing.
This document provides a summary of an eTL project. eTL is an event management system that allows users to register for events online. It automatically generates and emails certificates to participants. The system efficiently stores and retrieves data from its database. It aims to save time by automating manual record keeping and report generation tasks. The system will use Java, JSP, HTML, CSS, JavaScript, jQuery, Ajax, and Hibernate framework. It will have modules for registration, events, certificates, notifications, user accounts, and administration.
This document is the table of contents for a book about marshaling with C#. It lists the book's chapters and sections. The book discusses marshaling simple types in Chapter 2, covering numeric, textual, and handle data types. It also covers marshaling compound types like structures and unions in Chapter 3. The book is intended for developers familiar with unmanaged code who want to learn how to interface managed and unmanaged code using C#.
Fog computing and iFogSim for sustainable smart city (sindhuRashmi1)
This gives an overview of what fog computing is and how it differs from cloud computing in developing efficient and sustainable smart cities. It also gives basic knowledge about simulating the fog layer and introduces iFogSim, a toolkit that helps with that simulation.
Web scraping involves extracting data from human-readable web pages and converting it into structured data. There are several types of scraping including screen scraping, report mining, and web scraping. The process of web scraping typically involves using techniques like text pattern matching, HTML parsing, and DOM parsing to extract the desired data from web pages in an automated way. Common tools used for web scraping include Selenium, Import.io, Phantom.js, and Scrapy.
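The HTML-parsing technique mentioned above can be sketched with Python's standard library alone — here a tiny parser pulls prices out of hypothetical markup (the tag names and class are invented for illustration):

```python
from html.parser import HTMLParser

class PriceScraper(HTMLParser):
    """Collects the text inside <span class="price"> tags -- a minimal HTML-parsing scraper."""
    def __init__(self):
        super().__init__()
        self.in_price = False
        self.prices = []

    def handle_starttag(self, tag, attrs):
        if tag == "span" and ("class", "price") in attrs:
            self.in_price = True     # entered a price element

    def handle_endtag(self, tag):
        if tag == "span":
            self.in_price = False

    def handle_data(self, data):
        if self.in_price:
            self.prices.append(data.strip())

# Hypothetical page fragment standing in for a fetched web page.
page = '<ul><li><span class="price">19.99</span></li><li><span class="price">4.50</span></li></ul>'
scraper = PriceScraper()
scraper.feed(page)
```

Real scrapers usually layer the same idea behind libraries such as Scrapy, which the text names among the common tools.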
The Android architecture consists of 5 layers: the Linux kernel, native libraries, the Android runtime, application framework, and applications. The Linux kernel handles low-level system functionality like drivers. Native libraries provide common functions like media playback. The Android runtime includes the Dalvik VM and core Java libraries. It allows each app to run in its own process. The application framework offers higher-level services to apps like activity management and notifications. Finally, applications are built on top of the framework and distributed to users.
Internship report based on Flutter Lawyer App.
Script:
Hello everyone! My name is Mohammad Shamim Hossain. I am doing my major in Computer Science and my minor in Environmental Management. In this presentation, I will be discussing my time interning as a full-stack applications developer with Aveneurs solutions.
Control of Photo Sharing on Online Social Network (SAFAD ISMAIL)
A social networking service (also social networking site, SNS, or social media) is an online platform which people use to build social networks or social relations with other people who share similar personal or career interests, activities, backgrounds, or real-life connections.
This presentation lecture was delivered in HITEC University, Pakistan. This is my view of the cloud and next generation computing infrastructure supported by the cloud infrastructure.
An introduction to cloud computing and the technologies behind it. Issues in cloud computing such as security and information safety. An introduction to some applications of cloud computing.
This document summarizes a Java-based chat application created by DVS Technologies. It discusses what chatting and chat applications are, how they allow real-time communication, and how they are used on websites and mobile devices. It then explains the technical details of how sockets allow for two-way communication between client and server programs by binding to specific port numbers, allowing the server to listen for connection requests and the client to identify the server to connect to. Diagrams demonstrate how a port scanner can find the port a chat server is listening on so the client can connect and authenticate with the server.
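The client/server socket mechanics described above — a server bound to a port and listening, a client identifying it by host and port — can be sketched in a few lines (a minimal echo exchange, not the DVS application's Java code):

```python
import socket
import threading

def echo_server(sock):
    """Accept one client and echo back whatever it sends (the server side)."""
    conn, _ = sock.accept()
    data = conn.recv(1024)
    conn.sendall(data)
    conn.close()

# The server binds to a port and listens; port 0 asks the OS for a free port.
server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
server.bind(("127.0.0.1", 0))
server.listen(1)
port = server.getsockname()[1]
threading.Thread(target=echo_server, args=(server,), daemon=True).start()

# The client identifies the server by host and port, connects, and exchanges a message.
client = socket.create_connection(("127.0.0.1", port))
client.sendall(b"hello")
reply = client.recv(1024)
client.close()
```

The same bind/listen/connect pattern underlies chat servers in Java as well; only the API surface differs.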
A Literature Survey on Image Linguistic Visual Question Answering (IRJET Journal)
This document discusses a literature survey on image and linguistic visual question answering. It aims to develop a model that achieves higher performance than state-of-the-art solutions by exploring different existing models and developing a custom model. The paper reviews several existing models for visual question answering and image classification using convolutional neural networks. It also discusses developing a new dataset for visual question answering using automated question generation from image descriptions.
FACE PHOTO-SKETCH RECOGNITION USING DEEP LEARNING TECHNIQUES - A REVIEW (IRJET Journal)
This document reviews various deep learning techniques for face photo-sketch recognition. It summarizes several studies that used deep learning and other machine learning methods to match faces across photo and sketch domains. Feature injection and convolutional neural networks, knowledge distillation frameworks, relational graph modules, domain alignment embedding networks, and using 3D morphable models and CNNs for forensic sketch recognition are some of the techniques discussed. The studies aim to improve cross-domain face recognition accuracy, which remains challenging due to differences in photo and sketch representations. Accuracies ranging from 73-95% are reported for the various methods.
In this study, an end-to-end human iris recognition system is presented to automatically identify individuals for high-security purposes. A new deep-learning-based 2D convolutional neural network (CNN) model is introduced for extracting features and classifying iris patterns. First, the iris dataset is collected, preprocessed, and augmented: it is expanded and enhanced using data augmentation, histogram equalization (HE), and contrast-limited adaptive histogram equalization (CLAHE). Second, the features of the iris patterns are extracted and classified using the CNN. The network comprises convolutional and ReLU layers for extracting features, pooling layers for reducing the number of parameters, and a fully connected layer followed by a Softmax layer for classifying the extracted features into N classes. The backpropagation algorithm and the adaptive moment estimation (Adam) optimizer are used for training and weight updates. The experiments were carried out on a graphics processing unit (GPU) using Matlab. The overall training accuracy of the introduced system was 95.33%, with a training time of 17.59 minutes, while the testing accuracy was 100% with a run time of 12 seconds. The introduced iris recognition system has been applied successfully.
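The layer types named in that abstract — ReLU, pooling, and Softmax — are simple enough to sketch directly in NumPy (illustrative forward-pass pieces, not the study's Matlab model):

```python
import numpy as np

def relu(x):
    """ReLU layer: zero out negative activations."""
    return np.maximum(0, x)

def max_pool2x2(x):
    """2x2 max pooling: halve each spatial dimension, keeping the strongest response."""
    h, w = x.shape
    return x[:h - h % 2, :w - w % 2].reshape(h // 2, 2, w // 2, 2).max(axis=(1, 3))

def softmax(z):
    """Softmax layer: turn raw class scores into probabilities over N classes."""
    e = np.exp(z - z.max())                      # subtract max for numerical stability
    return e / e.sum()

# A made-up 4x4 feature map and 3-class score vector.
fmap = np.array([[1., -2., 3., 0.],
                 [4., 5., -1., 2.],
                 [0., 1., 2., 3.],
                 [-1., 0., 1., 4.]])
pooled = max_pool2x2(relu(fmap))                 # ReLU then pooling, as in the pipeline
probs = softmax(np.array([2.0, 1.0, 0.1]))
```

Pooling here is what "reduces the parameters" downstream: the 4x4 map becomes 2x2, so later fully connected layers see a quarter of the inputs.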
Visual Saliency Model Using Sift and Comparison of Learning Approaches (csandit)
This document discusses a study that aims to develop a visual saliency model to predict where humans look in images. It uses the SIFT feature in addition to low, mid, and high-level image features to train machine learning models on an eye-tracking dataset. Support vector machines (SVM) achieved the best performance, accurately predicting fixations 88% of the time. Including the SIFT feature further improved SVM performance to 91% accuracy. The study evaluates different machine learning methods and determines SVM to be best suited for this binary classification task using high-dimensional image data.
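The SVM used for that binary fixation/non-fixation task can be sketched at its simplest as a linear max-margin classifier trained by sub-gradient descent on the hinge loss (a toy NumPy version with invented 2D data, not the study's high-dimensional setup):

```python
import numpy as np

def train_linear_svm(X, y, lr=0.1, lam=0.01, epochs=200):
    """Sub-gradient descent on the regularized hinge loss; labels are in {-1, +1}."""
    w = np.zeros(X.shape[1])
    b = 0.0
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            if yi * (xi @ w + b) < 1:            # point inside the margin: push it out
                w += lr * (yi * xi - lam * w)
                b += lr * yi
            else:                                 # outside the margin: only regularize
                w -= lr * lam * w
    return w, b

# Linearly separable toy data standing in for fixation vs. non-fixation features.
X = np.array([[2.0, 2.0], [3.0, 3.0], [-2.0, -1.0], [-3.0, -2.0]])
y = np.array([1, 1, -1, -1])
w, b = train_linear_svm(X, y)
preds = np.sign(X @ w + b)
```

Practical saliency work would use a kernel SVM from a library on thousands of image features; the decision rule sign(w·x + b) is the shared core.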
FACEMASK AND PHYSICAL DISTANCING DETECTION USING TRANSFER LEARNING TECHNIQUE (IRJET Journal)
This document describes a system for detecting whether individuals are wearing face masks and maintaining proper social distancing using deep learning and computer vision techniques. The system uses the YOLOv3 object detection algorithm trained on a dataset of images labeled as containing individuals with and without masks. The trained model can detect faces in images and label them as wearing a mask properly or not. It can also detect multiple individuals, calculate distances between them, and identify violations of social distancing guidelines. The system provides outputs by drawing colored boxes around detected faces and individuals to indicate mask wearing and social distancing compliance in real-time. It aims to help authorities monitor adherence to COVID-19 safety measures without manual efforts.
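The distance-checking step described there — compute distances between detected people and flag pairs that are too close — reduces to a pairwise Euclidean distance test (a sketch with invented centroids and an arbitrary threshold, not the system's calibrated code):

```python
import numpy as np

def distancing_violations(centroids, min_dist=2.0):
    """Return index pairs of detected people closer than min_dist (units are arbitrary here)."""
    pts = np.asarray(centroids, dtype=float)
    diffs = pts[:, None, :] - pts[None, :, :]    # pairwise difference vectors
    dists = np.sqrt((diffs ** 2).sum(axis=-1))   # Euclidean distance matrix
    i, j = np.triu_indices(len(pts), k=1)        # each unordered pair exactly once
    return [(int(a), int(b)) for a, b in zip(i, j) if dists[a, b] < min_dist]

# Hypothetical person centroids, e.g. bounding-box centers after camera calibration.
violations = distancing_violations([(0, 0), (1, 0), (5, 5)], min_dist=2.0)
```

In the full system these centroids would come from YOLOv3 detections, and flagged pairs would be drawn with colored boxes.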
End-to-end deep auto-encoder for segmenting a moving object with limited tra... (IJECEIAES)
The document proposes two end-to-end deep auto-encoder approaches for segmenting moving objects from surveillance videos when limited training data is available. The first approach uses transfer learning with a pre-trained VGG-16 model as the encoder and its transposed architecture as the decoder. The second approach uses a multi-depth auto-encoder with convolutional and upsampling layers. Both approaches apply data augmentation techniques like PCA and traditional methods to increase the training data size. The models are trained and evaluated on the CDnet2014 dataset, achieving better performance than other models trained with limited data.
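Among the augmentation techniques that abstract names is PCA. The core PCA computation — find the principal directions of the data and project onto the top few — fits in a few NumPy lines (an illustrative sketch on synthetic data, not the paper's augmentation pipeline):

```python
import numpy as np

def pca_fit(X, k):
    """Top-k principal components of X via SVD on the centered data."""
    mean = X.mean(axis=0)
    _, _, Vt = np.linalg.svd(X - mean, full_matrices=False)
    return mean, Vt[:k]                          # k principal directions, one per row

# Synthetic data whose variance lives almost entirely in the first two axes.
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 3)) @ np.diag([3.0, 1.0, 0.1])
mean, comps = pca_fit(X, 2)

# Project onto 2 components and reconstruct; little is lost along the weak third axis.
X_rec = (X - mean) @ comps.T @ comps + mean
err = np.abs(X - X_rec).max()
```

PCA-based augmentation perturbs samples along these principal directions, which keeps the synthetic variants statistically consistent with the training set.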
Gender classification using custom convolutional neural networks architecture (IJECEIAES)
Gender classification achieves high accuracy in many previous works; however, it does not generalize well in unconstrained settings and environments. Furthermore, many proposed convolutional neural network (CNN) based solutions vary significantly in their characteristics and architectures, which calls for an optimal CNN architecture for this specific task. In this work, a hand-crafted, custom CNN architecture is proposed to distinguish between male and female facial images. This custom CNN requires smaller input image resolutions and significantly fewer trainable parameters than some popular state-of-the-art networks such as GoogleNet and AlexNet. It also employs batch normalization layers, which improve computational efficiency. In experiments on publicly available datasets such as LFW, CelebA, and IMDB-WIKI, the proposed custom CNN delivered the fastest inference time in all tests, needing only 0.92 ms to classify 1200 images on GPU, 1.79 ms on CPU, and 2.51 ms on VPU. The custom CNN also performs on par with the state of the art and even surpassed those methods on CelebA gender classification, delivering the best result at 96% accuracy. Moreover, in a more challenging cross-dataset setting, the custom CNN trained on the CelebA dataset gives the best gender classification accuracy on the IMDB and WIKI datasets, at 97% and 96% respectively.
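The batch normalization layers credited there with better efficiency have a simple forward pass: normalize each feature over the batch, then apply a learnable scale and shift (a sketch of the inference math, not the paper's implementation):

```python
import numpy as np

def batch_norm(x, gamma=1.0, beta=0.0, eps=1e-5):
    """Batch-normalization forward pass: per-feature zero mean and unit variance."""
    mu = x.mean(axis=0)                          # mean of each feature over the batch
    var = x.var(axis=0)                          # variance of each feature over the batch
    x_hat = (x - mu) / np.sqrt(var + eps)        # normalized activations
    return gamma * x_hat + beta                  # learnable scale and shift

# A tiny batch with wildly different feature scales; after the layer both are comparable.
batch = np.array([[1.0, 200.0],
                  [2.0, 220.0],
                  [3.0, 240.0]])
out = batch_norm(batch)
```

Equalizing feature scales this way is what lets training use larger learning rates and converge faster.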
Deep learning methods have improved machine capabilities for face recognition. Three key modules are typically used: 1) A face detector localizes faces in images using region-based or sliding window approaches with DCNNs. 2) A fiducial point detector localizes facial landmarks using DCNN regressors or 3D models. 3) A DCNN extracts features for face recognition and verification by computing similarity scores between representations. Large annotated datasets have enabled DCNNs to learn robust representations for unconstrained images.
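The final step in that pipeline — "computing similarity scores between representations" — is typically a cosine similarity between embedding vectors (a sketch with invented embeddings and an illustrative threshold, not a specific system's values):

```python
import numpy as np

def cosine_similarity(a, b):
    """Similarity score between two face embeddings; 1.0 means identical direction."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

# Hypothetical embeddings: two images of the same person vs. a different person.
same = cosine_similarity([0.9, 0.1, 0.4], [0.8, 0.2, 0.5])
diff = cosine_similarity([0.9, 0.1, 0.4], [-0.5, 0.9, 0.1])
is_match = same > 0.8                            # verification threshold, chosen for illustration
```

Verification then reduces to comparing this score against a threshold tuned on a validation set.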
AN EFFICIENT FACE RECOGNITION EMPLOYING SVM AND BU-LDP (IRJET Journal)
The document presents a study on an efficient face recognition method employing support vector machines (SVM) and biomimetic uncorrelated local difference projection (BU-LDP). The study proposes using BU-LDP, which is based on uncorrelated local projection but uses a different neighborhood coefficient calculation approach inspired by human perception. Experimental results on several datasets show that BU-LDP and its kernel variant KBU-LDP outperform state-of-the-art methods for face recognition. Future work will focus on addressing the "one sample problem" and applying the approach to unlabeled data.
A face recognition system using convolutional feature extraction with linear... (IJECEIAES)
Face recognition is an important biometric authentication research area for security purposes in fields such as pattern recognition and image processing. However, face recognition remains a hard problem for machine learning and deep learning techniques, since input images vary with pose, lighting conditions, expression, age, and illumination, which degrades recognition accuracy. In the present research, the resolution of the image patches is reduced by the max pooling layer in a convolutional neural network (CNN), which also makes the model more robust than the traditional feature extraction technique called local multiple pattern (LMP). The extracted features are fed into linear collaborative discriminant regression classification (LCDRC) for final face recognition. Owing to the CNN-based optimization in LCDRC, the distance between classes is maximized while the distance between features within a class is reduced. The results show that CNN-LCDRC achieved 93.10% and 87.60% mean recognition accuracy, where traditional LCDRC achieved 83.35% and 77.70%, on the ORL and YALE databases respectively, for training number 8 (i.e., 80% training and 20% testing data).
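The objective that abstract describes — maximize the distance between classes while shrinking distances within a class — can be made concrete as a between-class to within-class scatter ratio (an illustrative computation on toy clusters, not the LCDRC algorithm itself):

```python
import numpy as np

def class_distance_ratio(X, y):
    """Between-class to within-class squared-distance ratio: higher means
    classes are tighter and further apart, the property the classifier seeks."""
    classes = np.unique(y)
    grand_mean = X.mean(axis=0)
    between = sum(np.sum((X[y == c].mean(axis=0) - grand_mean) ** 2) for c in classes)
    within = sum(np.sum((X[y == c] - X[y == c].mean(axis=0)) ** 2) for c in classes)
    return between / within

# Two tight, well-separated toy clusters give a very high ratio.
X = np.array([[0.0, 0.0], [0.1, 0.0], [5.0, 5.0], [5.1, 5.0]])
y = np.array([0, 0, 1, 1])
ratio = class_distance_ratio(X, y)
```

A feature extractor that raises this ratio makes the downstream nearest-class decision far less error-prone.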
Face recognition for occluded face with mask region convolutional neural net... (IJECEIAES)
Face recognition technology has been used in many ways, such as in authentication and identification. The object studied here is a face image that lacks complete facial information (an occluded face), which can result from acquisition at a different viewpoint or shooting the face from a different angle. Such objects matter because occlusion degrades detection and identification performance on the face image as a whole. Deep learning methods can be used to solve face recognition problems. Previous research focused more on resolution-based face detection and recognition. The mask region convolutional neural network (Mask R-CNN) method still has a deficiency in its segmentation stage, which reduces the accuracy of face identification for objects with incomplete facial information; the segmentation used in Mask R-CNN is a fully convolutional network (FCN). In this research, many FCN parameters are explored and modified using the CNN backbone's pooling layer, Mask R-CNN is modified for face identification, and modifications are also made to the bounding box regressor. The modifications are expected to yield the best recommendations based on accuracy.
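Work on bounding box regressors like the one mentioned above is usually evaluated with intersection-over-union (IoU) between predicted and ground-truth boxes — a standard measure, sketched here in plain Python (not the paper's evaluation code):

```python
def iou(box_a, box_b):
    """Intersection-over-union of two axis-aligned boxes given as (x1, y1, x2, y2)."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    ix1, iy1 = max(ax1, bx1), max(ay1, by1)      # top-left of the intersection
    ix2, iy2 = min(ax2, bx2), min(ay2, by2)      # bottom-right of the intersection
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area_a = (ax2 - ax1) * (ay2 - ay1)
    area_b = (bx2 - bx1) * (by2 - by1)
    return inter / (area_a + area_b - inter)     # overlap divided by combined area

# Two 10x10 boxes offset by half their width overlap in a 5x5 square.
overlap = iou((0, 0, 10, 10), (5, 5, 15, 15))
```

Detectors in the Mask R-CNN family use thresholds on this quantity both to match predictions to ground truth and inside non-maximum suppression.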
A hybrid approach for face recognition using a convolutional neural network c... - IAESIJAI
Facial recognition technology has been used in many fields such as security, biometric identification, robotics, video surveillance, health, and commerce due to its ease of implementation and minimal data processing time. However, this technology is affected by variations such as pose, lighting, or occlusion. In this paper, we propose a new approach to improve the accuracy rate of face recognition in the presence of variation or occlusion, combining feature extraction techniques, namely histogram of oriented gradients (HOG), scale invariant feature transform (SIFT), Gabor filters, and the Canny contour detector, with a convolutional neural network (CNN) architecture, tested with several combinations of activation function (Softmax and Sigmoid) and training optimization algorithm (Adam, Adamax, RMSprop, and stochastic gradient descent (SGD)). For this, preprocessing was performed on two face databases, ORL and Sheffield; we then performed feature extraction with the mentioned techniques and passed the results to our CNN architecture. Our simulation results show high performance for the SIFT+CNN combination, reaching an accuracy rate of up to 100% in the presence of variations.
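Of the feature extractors listed, HOG is the simplest to illustrate: it histograms gradient orientations weighted by gradient magnitude. The sketch below is a simplified single-cell version in NumPy (the 8-bin count and helper name are assumptions, not details from the paper):

```python
import numpy as np

def orientation_histogram(img: np.ndarray, bins: int = 8) -> np.ndarray:
    """Histogram of gradient orientations weighted by gradient magnitude
    (the core idea behind a HOG cell descriptor, without block normalisation)."""
    gy, gx = np.gradient(img.astype(float))          # per-pixel gradients
    mag = np.hypot(gx, gy)
    ang = np.mod(np.arctan2(gy, gx), np.pi)          # unsigned orientation in [0, pi)
    hist, _ = np.histogram(ang, bins=bins, range=(0, np.pi), weights=mag)
    return hist / (hist.sum() + 1e-12)               # L1-normalise the descriptor

img = np.tile(np.arange(8.0), (8, 1))                # purely horizontal intensity ramp
h = orientation_histogram(img)                        # all weight lands in one bin
```

A real HOG descriptor tiles the image into cells and normalises over blocks; this sketch only shows why the descriptor is robust to small appearance changes: it summarises edge directions rather than raw pixels.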
Using Image Classification to Incentivize Recycling - IRJET Journal
This document discusses using image classification to incentivize recycling. It proposes a web application where users can upload images of recyclable materials. Using image processing and classification algorithms, the material is identified and points are awarded. When enough points are accumulated, users can exchange them for rewards. The system architecture includes image upload and classification, data storage, and transaction processing. Popular classification models like ResNet and FastAI are evaluated. Analysis shows some materials like plastic and metal are confused, indicating room for improvement. The goal is to promote recycling through gamification and make recycling more accessible.
Vehicle Driver Age Estimation using Neural Networks - IRJET Journal
This document presents research on developing a convolutional neural network model to estimate the age and gender of a vehicle driver from their facial image. The researchers assembled a large dataset of over 60,000 face images from various sources to train their CNN model. They implemented the model using Caffe and tested it on a Raspberry Pi 3B+ for real-time age and gender detection. After training, the CNN model was able to accurately classify age and gender from input images with an accuracy of 98.1%. The document discusses the CNN architecture, preprocessing steps, and algorithms used to develop this age and gender detection system for vehicle drivers.
Real-time face detection in digital video-based on Viola-Jones supported by c... - IJECEIAES
Face detection is a critical security function for people (such as a witness whose face appears in a video) who appear in a scene and are frequently captured by the camera. Recognizing people from their faces in images has recently piqued the scientific community's interest, partly due to application concerns, but also because of the difficulty this poses for artificial vision algorithms. The idea for this research stems from a broad interest in courtroom witness face detection. The goal of this work is to detect and track the face of a witness in court. In this work, the Viola-Jones method is used to extract human faces, and then a particular transformation is applied to crop the image. Witness and non-witness images are classified using convolutional neural networks (CNN). The Kanade-Lucas-Tomasi (KLT) algorithm is used to track the witness's face using trained features. The two methods were combined into one model to take advantage of each in terms of speed, reducing the space required to implement the CNN, and detection accuracy. Tests of the proposed model showed that it was 99.5% accurate when executed in real time under adequate lighting.
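The speed of the Viola-Jones detector rests on the integral image, which turns the sum over any rectangle into four table lookups. A minimal NumPy sketch (helper names are illustrative, not from the paper):

```python
import numpy as np

def integral_image(img: np.ndarray) -> np.ndarray:
    """Summed-area table padded with a zero row/column: ii[y, x] = sum of img[:y, :x]."""
    return np.pad(img.cumsum(axis=0).cumsum(axis=1), ((1, 0), (1, 0)))

def rect_sum(ii: np.ndarray, y: int, x: int, h: int, w: int) -> int:
    """Sum of the h x w rectangle with top-left corner (y, x), in O(1) lookups."""
    return ii[y + h, x + w] - ii[y, x + w] - ii[y + h, x] + ii[y, x]

img = np.arange(16).reshape(4, 4)
ii = integral_image(img)
area = rect_sum(ii, 1, 1, 2, 2)   # sum of the 2x2 block 5, 6, 9, 10
```

Because each Haar-like feature is a difference of rectangle sums, this table lets the cascade evaluate thousands of features per window at constant cost, which is what makes real-time detection feasible.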
IRJET - Factors Affecting Deployment of Deep Learning based Face Recognition ... - IRJET Journal
This document discusses factors affecting the deployment of deep learning models for face recognition on smartphones. It examines training data requirements, suitable neural network architectures, and effective loss functions. Larger datasets with more subjects and images are preferred for training models that generalize well. Residual networks like ResNet have achieved good accuracy while being efficient for face recognition. Loss functions like center loss and triplet loss help learn discriminative features by reducing intra-class and increasing inter-class variations.
IRJET - Spot Me - A Smart Attendance System based on Face Recognition - IRJET Journal
Real time multi face detection using deep learning - Reallykul Kuul
This document proposes a framework for real-time multiple face recognition using deep learning on an embedded GPU system. The framework includes face detection using a CNN, face tracking to reduce processing time, and a state-of-the-art deep CNN for face recognition. Experimental results showed the system can recognize up to 8 faces simultaneously in real-time, with processing times up to 0.23 seconds and a minimum recognition rate of 83.67%.
Realtime face matching and gender prediction based on deep learning - IJECEIAES
Face analysis is an essential topic in computer vision that deals with human faces for recognition or prediction tasks. The face is one of the easiest ways to distinguish a person's identity. Face recognition is a type of personal identification system that employs a person's personal traits to determine their identity. A human face recognition scheme generally consists of four steps, namely face detection, alignment, representation, and verification. In this paper, we propose to extract information from the human face for several tasks based on a recent advanced deep learning framework. The proposed approach outperforms the state-of-the-art results.
Automation of Attendance using Deep Learning - IRJET Journal
This document describes a proposed system to automate student attendance using deep learning techniques like face detection and recognition. The system would take pictures of the classroom and use these techniques to identify which students are present, addressing issues with current manual attendance systems. It reviews previous literature on automated attendance systems and face recognition methods. The proposed system would use Python with OpenCV for face detection and an LBPH model for face recognition. It would generate reports with student attendance data and photos/videos from the classroom.
Similar to Real-time eyeglass detection using transfer learning for non-standard facial data (20)
Redefining brain tumor segmentation: a cutting-edge convolutional neural netw... - IJECEIAES
Medical image analysis has witnessed significant advancements with deep learning techniques. In the domain of brain tumor segmentation, the ability to precisely delineate tumor boundaries from magnetic resonance imaging (MRI) scans holds profound implications for diagnosis. This study presents an ensemble convolutional neural network (CNN) with transfer learning, integrating the state-of-the-art Deeplabv3+ architecture with the ResNet18 backbone. The model is rigorously trained and evaluated, exhibiting remarkable performance metrics, including an impressive global accuracy of 99.286%, a high class accuracy of 82.191%, a mean intersection over union (IoU) of 79.900%, a weighted IoU of 98.620%, and a Boundary F1 (BF) score of 83.303%. Notably, a detailed comparative analysis with existing methods showcases the superiority of our proposed model. These findings underscore the model's competence in precise brain tumor localization, underscoring its potential to revolutionize medical image analysis and enhance healthcare outcomes. This research paves the way for future exploration and optimization of advanced CNN models in medical imaging, emphasizing addressing false positives and resource efficiency.
Embedded machine learning-based road conditions and driving behavior monitoring - IJECEIAES
Car accident rates have increased in recent years, resulting in losses in human lives, properties, and other financial costs. An embedded machine learning-based system is developed to address this critical issue. The system can monitor road conditions, detect driving patterns, and identify aggressive driving behaviors. The system is based on neural networks trained on a comprehensive dataset of driving events, driving styles, and road conditions. The system effectively detects potential risks and helps mitigate the frequency and impact of accidents. The primary goal is to ensure the safety of drivers and vehicles. Collecting data involved gathering information on three key road events: normal street and normal drive, speed bumps, circular yellow speed bumps, and three aggressive driving actions: sudden start, sudden stop, and sudden entry. The gathered data is processed and analyzed using a machine learning system designed for limited power and memory devices. The developed system resulted in 91.9% accuracy, 93.6% precision, and 92% recall. The achieved inference time on an Arduino Nano 33 BLE Sense with a 32-bit CPU running at 64 MHz is 34 ms and requires 2.6 kB peak RAM and 139.9 kB program flash memory, making it suitable for resource-constrained embedded systems.
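The accuracy, precision, and recall figures reported above follow directly from confusion-matrix counts. A small illustrative helper (the counts below are invented, not the paper's data):

```python
def metrics(tp: int, fp: int, fn: int, tn: int) -> dict:
    """Accuracy, precision, and recall from binary confusion-matrix counts."""
    return {
        "accuracy": (tp + tn) / (tp + fp + fn + tn),   # all correct / all samples
        "precision": tp / (tp + fp),                   # of predicted positives, how many real
        "recall": tp / (tp + fn),                      # of real positives, how many found
    }

m = metrics(tp=90, fp=10, fn=5, tn=95)   # hypothetical counts for illustration
```

For safety-critical detection like aggressive-driving events, recall is usually the metric to watch: a missed event (false negative) is costlier than a spurious alert.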
Advanced control scheme of doubly fed induction generator for wind turbine us... - IJECEIAES
This paper describes a speed control device for generating electrical energy on an electricity network, based on the doubly fed induction generator (DFIG) used in wind power conversion systems. First, a doubly fed induction generator model was constructed. A control law is formulated to govern the flow of energy between the stator of the DFIG and the energy network using three types of controllers: proportional integral (PI), sliding mode controller (SMC), and second order sliding mode controller (SOSMC). Their results are compared in terms of power reference tracking, reaction to unexpected speed fluctuations, sensitivity to perturbations, and resilience against machine parameter alterations. MATLAB/Simulink was used to conduct the simulations for the preceding study. Multiple simulations have shown very satisfying results, and the investigations demonstrate the efficacy and power-enhancing capabilities of the suggested control system.
Neural network optimizer of proportional-integral-differential controller par... - IJECEIAES
Wide application of proportional-integral-differential (PID)-regulator in industry requires constant improvement of methods of its parameters adjustment. The paper deals with the issues of optimization of PID-regulator parameters with the use of neural network technology methods. A methodology for choosing the architecture (structure) of neural network optimizer is proposed, which consists in determining the number of layers, the number of neurons in each layer, as well as the form and type of activation function. Algorithms of neural network training based on the application of the method of minimizing the mismatch between the regulated value and the target value are developed. The method of back propagation of gradients is proposed to select the optimal training rate of neurons of the neural network. The neural network optimizer, which is a superstructure of the linear PID controller, allows increasing the regulation accuracy from 0.23 to 0.09, thus reducing the power consumption from 65% to 53%. The results of the conducted experiments allow us to conclude that the created neural superstructure may well become a prototype of an automatic voltage regulator (AVR)-type industrial controller for tuning the parameters of the PID controller.
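For context, the PID law whose parameters the neural optimizer tunes has a standard discrete-time form. The gains and time step below are hypothetical, chosen only to illustrate the computation:

```python
class PID:
    """Discrete PID controller: u = Kp*e + Ki*integral(e) + Kd*de/dt."""

    def __init__(self, kp: float, ki: float, kd: float, dt: float):
        self.kp, self.ki, self.kd, self.dt = kp, ki, kd, dt
        self.integral = 0.0
        self.prev_err = 0.0

    def step(self, setpoint: float, measured: float) -> float:
        err = setpoint - measured
        self.integral += err * self.dt                 # accumulate integral term
        deriv = (err - self.prev_err) / self.dt        # finite-difference derivative
        self.prev_err = err
        return self.kp * err + self.ki * self.integral + self.kd * deriv

pid = PID(kp=2.0, ki=0.5, kd=0.1, dt=0.01)             # hypothetical gains
u = pid.step(setpoint=1.0, measured=0.0)               # first control action for unit error
```

A neural optimizer like the one described would adjust kp, ki, and kd online by backpropagating the mismatch between the regulated value and the target through this law.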
An improved modulation technique suitable for a three level flying capacitor ... - IJECEIAES
This research paper introduces an innovative modulation technique for controlling a 3-level flying capacitor multilevel inverter (FCMLI), aiming to streamline the modulation process in contrast to conventional methods. The proposed simplified modulation technique paves the way for more straightforward and efficient control of multilevel inverters, enabling their widespread adoption and integration into modern power electronic systems. Through the amalgamation of sinusoidal pulse width modulation (SPWM) with a high-frequency square wave pulse, this controlling technique attains energy equilibrium across the coupling capacitor. The modulation scheme incorporates a simplified switching pattern and a decreased count of voltage references, thereby simplifying the control algorithm.
A review on features and methods of potential fishing zone - IJECEIAES
This review focuses on the importance of identifying potential fishing zones in seawater for sustainable fishing practices. It explores features like sea surface temperature (SST) and sea surface height (SSH), along with the classification methods used to classify the data. This study underscores the importance of examining potential fishing zones using advanced analytical techniques, thoroughly exploring the methodologies employed by researchers and covering both past and current approaches. The examination centers on data characteristics and the application of classification algorithms for classifying potential fishing zones. Furthermore, the prediction of potential fishing zones relies significantly on the effectiveness of classification algorithms. Previous research has assessed the performance of models like support vector machines (SVM), naïve Bayes, and artificial neural networks (ANN); in one previous result, SVM classified fisheries test data with 97.6% accuracy, more accurate than naïve Bayes's 94.2%. Considering recent works in this area, several recommendations for future work are presented to further improve the performance of potential fishing zone models, which is important to the fisheries community.
Electrical signal interference minimization using appropriate core material f... - IJECEIAES
As demand for smaller, quicker, and more powerful devices rises, Moore's law is strictly followed. The industry has worked hard to make small devices that boost productivity, with the goal of optimizing device density. Scientists are reducing connection delays to improve circuit performance. This led to three-dimensional integrated circuit (3D IC) concepts, which stack active devices and create vertical connections to diminish latency and shorten interconnects. Electrical coupling is a big concern with 3D integrated circuits. Researchers have developed and tested through-silicon via (TSV) and substrate techniques to decrease electrical wave coupling. This study illustrates a novel noise coupling reduction method using several electrical coupling models. A 22% drop in electrical coupling from wave-carrying to victim TSVs introduces this new paradigm and improves system performance even at higher THz frequencies.
Electric vehicle and photovoltaic advanced roles in enhancing the financial p... - IJECEIAES
Climate change's impact on the planet forced the United Nations and governments to promote green energies and electric transportation. The deployment of photovoltaic (PV) and electric vehicle (EV) systems gained stronger momentum due to their numerous advantages over fossil fuel alternatives. The advantages go beyond sustainability to reach financial support and stability. The work in this paper introduces a hybrid system between PV and EV to support industrial and commercial plants. The paper covers the theoretical framework of the proposed hybrid system, including the equations required to complete the cost analysis when PV and EV are present. In addition, the proposed design diagram, which sets the priorities and requirements of the system, is presented. The proposed approach allows setups to improve their power stability, especially during power outages. The presented information supports researchers and plant owners in completing the necessary analysis while promoting the deployment of clean energy. The results of a case study representing a dairy milk farmer support the theoretical work and highlight its benefits to existing plants. The short return on investment of the proposed approach supports the paper's novel approach to a sustainable electrical system. In addition, the proposed system allows for an isolated power setup without the need for a transmission line, which enhances the safety of the electrical network.
Bibliometric analysis highlighting the role of women in addressing climate ch... - IJECEIAES
Fossil fuel consumption increased quickly, contributing to climate change that is evident in unusual flooding, droughts, and global warming. Over the past ten years, women's involvement in society has grown dramatically, and they have succeeded in playing a noticeable role in reducing climate change. A bibliometric analysis of data from the last ten years has been carried out to examine the role of women in addressing climate change. The analysis's findings are discussed in relation to the sustainable development goals (SDGs), particularly SDG 7 and SDG 13. The results consider contributions made by women in various sectors while taking geographic dispersion into account. The bibliometric analysis delves into topics including women's leadership in environmental groups, their involvement in policymaking, their contributions to sustainable development projects, and the influence of gender diversity on attempts to mitigate climate change. This study's results highlight how women have influenced policies and actions related to climate change, point out areas of research deficiency, and offer recommendations on how to increase the role of women in addressing climate change and achieving sustainability. To achieve more successful results, this initiative aims to highlight the significance of gender equality and encourage inclusivity in climate change decision-making processes.
Voltage and frequency control of microgrid in presence of micro-turbine inter... - IJECEIAES
The active and reactive load changes have a significant impact on voltage and frequency. In this paper, in order to stabilize the microgrid (MG) against load variations in islanding mode, the active and reactive power of all distributed generators (DGs), including energy storage (battery), diesel generator, and micro-turbine, are controlled. The micro-turbine generator is connected to the MG through a three-phase to three-phase matrix converter, and the droop control method is applied for controlling the voltage and frequency of the MG. In addition, a method is introduced for voltage and frequency control of micro-turbines in the transition from grid-connected mode to islanding mode. A novel switching strategy of the matrix converter is used for converting the high-frequency output voltage of the micro-turbine to the grid-side frequency of the utility system. Moreover, using this switching strategy, low-order harmonics in the output current and voltage are not produced, and consequently, the size of the output filter can be reduced. In fact, the suggested control strategy is load-independent and has no frequency conversion restrictions. The proposed approach for voltage and frequency regulation demonstrates exceptional performance and favorable response across various load alteration scenarios. The suggested strategy is examined in several scenarios in the MG test systems, and the simulation results are discussed.
Enhancing battery system identification: nonlinear autoregressive modeling fo... - IJECEIAES
Precisely characterizing Li-ion batteries is essential for optimizing their performance, enhancing safety, and prolonging their lifespan across various applications, such as electric vehicles and renewable energy systems. This article introduces an innovative nonlinear methodology for system identification of a Li-ion battery, employing a nonlinear autoregressive with exogenous inputs (NARX) model. The proposed approach integrates the benefits of nonlinear modeling with the adaptability of the NARX structure, facilitating a more comprehensive representation of the intricate electrochemical processes within the battery. Experimental data collected from a Li-ion battery operating under diverse scenarios are employed to validate the effectiveness of the proposed methodology. The identified NARX model exhibits superior accuracy in predicting the battery's behavior compared to traditional linear models. This study underscores the importance of accounting for nonlinearities in battery modeling, providing insights into the intricate relationships between state-of-charge, voltage, and current under dynamic conditions.
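A NARX model predicts the current output from lagged outputs and lagged exogenous inputs. As a simplified illustration of the structure (linear-in-parameters, unlike the nonlinear model the article describes; all names and lag counts are assumptions), the regressor matrix can be fitted by least squares:

```python
import numpy as np

def narx_fit(u: np.ndarray, y: np.ndarray, na: int, nb: int) -> np.ndarray:
    """Fit y[k] ~ theta . [y[k-1..k-na], u[k-1..k-nb]] by least squares."""
    n = max(na, nb)
    rows = [np.r_[y[k - na:k][::-1], u[k - nb:k][::-1]] for k in range(n, len(y))]
    theta, *_ = np.linalg.lstsq(np.array(rows), y[n:], rcond=None)
    return theta

# Synthetic "battery-like" system: y[k] = 0.5*y[k-1] + 0.3*u[k-1]
rng = np.random.default_rng(0)
u = rng.standard_normal(200)                  # excitation signal (e.g. current)
y = np.zeros(200)                             # response (e.g. voltage)
for k in range(1, 200):
    y[k] = 0.5 * y[k - 1] + 0.3 * u[k - 1]

theta = narx_fit(u, y, na=1, nb=1)            # recovers the coefficients 0.5 and 0.3
```

The nonlinear variant replaces the linear regressor row with nonlinear basis functions (polynomials, neural networks), which is what lets NARX capture the battery's electrochemical dynamics.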
Smart grid deployment: from a bibliometric analysis to a survey - IJECEIAES
Smart grids are one of the last decades' innovations in electrical energy. They bring relevant advantages compared to the traditional grid and attract significant interest from the research community. Assessing the field's evolution is essential to propose guidelines for facing new and future smart grid challenges. In addition, knowing the main technologies involved in the deployment of smart grids (SGs) is important to highlight possible shortcomings that can be mitigated by developing new tools. This paper contributes to the research trends mentioned above by focusing on two objectives. First, a bibliometric analysis is presented to give an overview of the current research on smart grid deployment. Second, a survey of the main technological approaches used for smart grid implementation and their contributions is presented. To that effect, we searched the Web of Science (WoS) and Scopus databases, obtaining 5,663 documents from WoS and 7,215 from Scopus on smart grid implementation or deployment. Given the extraction limitations of the Scopus database, 5,872 of the 7,215 documents were extracted using a multi-step process. These two datasets were analyzed using a bibliometric tool called bibliometrix. The main outputs are presented with some recommendations for future research.
Use of analytical hierarchy process for selecting and prioritizing islanding ... - IJECEIAES
One of the problems associated with power systems is the islanding condition, which must be rapidly and properly detected to prevent any negative consequences for the system's protection, stability, and security. This paper offers a thorough overview of several islanding detection strategies, which are divided into two categories: classic approaches, including local and remote approaches, and modern techniques, including techniques based on signal processing and computational intelligence. Additionally, each approach is compared and assessed based on several factors, including implementation costs, non-detected zones, declining power quality, and response times, using the analytical hierarchy process (AHP). Comparing all criteria together, the multi-criteria decision-making analysis gives overall weights of 24.7% for passive methods, 7.8% for active methods, 5.6% for hybrid methods, 14.5% for remote methods, 26.6% for signal processing-based methods, and 20.8% for computational intelligence-based methods. Thus, it can be seen from the total weights that hybrid approaches are the least suitable choice, while signal processing-based methods are the most appropriate islanding detection method to select and implement in a power system with respect to the aforementioned factors. The proposed hierarchy model is studied and examined using Expert Choice software.
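The AHP weights quoted above come from pairwise-comparison matrices. A common approximation of the priority vector normalizes each column of the comparison matrix and averages across rows; the 3x3 reciprocal matrix below is illustrative, not the paper's:

```python
import numpy as np

def ahp_weights(pairwise: np.ndarray) -> np.ndarray:
    """Approximate AHP priority vector: normalise each column, then average each row."""
    norm = pairwise / pairwise.sum(axis=0)   # each column now sums to 1
    return norm.mean(axis=1)                 # row averages give the weights

# Hypothetical reciprocal comparison matrix for three criteria
# (entry [i, j] says how much more important criterion i is than j).
A = np.array([
    [1.0, 3.0, 5.0],
    [1/3, 1.0, 3.0],
    [1/5, 1/3, 1.0],
])
w = ahp_weights(A)   # weights sum to 1, ordered by importance
```

The exact AHP priority vector is the principal eigenvector of the comparison matrix; the column-normalization average shown here is the standard hand-calculation approximation.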
Enhancing of single-stage grid-connected photovoltaic system using fuzzy logi... - IJECEIAES
The power generated by photovoltaic (PV) systems is influenced by environmental factors. This variability hampers the control and utilization of solar cells' peak output. In this study, a single-stage grid-connected PV system is designed to enhance power quality. Our approach employs fuzzy logic in the direct power control (DPC) of a three-phase voltage source inverter (VSI), enabling seamless integration of the PV connected to the grid. Additionally, a fuzzy logic-based maximum power point tracking (MPPT) controller is adopted, which outperforms traditional methods like incremental conductance (INC) in enhancing solar cell efficiency and minimizing response time. Moreover, the inverter's real-time active and reactive power is directly managed to achieve a unity power factor (UPF). The system's performance is assessed through MATLAB/Simulink implementation, showing marked improvement over conventional methods, particularly in steady-state and varying weather conditions. For solar irradiances of 500 and 1,000 W/m², the results show that the proposed method reduces the total harmonic distortion (THD) of the current injected into the grid by approximately 46% and 38% respectively, compared to conventional methods. Furthermore, we compare the simulation results with IEEE standards to evaluate the system's grid compatibility.
Enhancing photovoltaic system maximum power point tracking with fuzzy logic-b... - IJECEIAES
Photovoltaic systems have emerged as a promising energy resource that caters to the future needs of society, owing to their renewable, inexhaustible, and cost-free nature. The power output of these systems relies on solar cell radiation and temperature. In order to mitigate the dependence on atmospheric conditions and enhance power tracking, a conventional approach has been improved by integrating various methods. To optimize the generation of electricity from solar systems, the maximum power point tracking (MPPT) technique is employed. To overcome limitations such as steady-state voltage oscillations and improve transient response, two traditional MPPT methods, namely the fuzzy logic controller (FLC) and perturb and observe (P&O), have been modified. This research paper aims to simulate and validate the step size of the proposed modified P&O and FLC techniques within the MPPT algorithm using MATLAB/Simulink for efficient power tracking in photovoltaic systems.
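The classic fixed-step P&O rule that such papers modify is easy to sketch: perturb the operating voltage, observe the power, and keep or reverse the direction accordingly. The step size and toy power curve below are assumptions for illustration only:

```python
def perturb_and_observe(power, v: float, step: float = 0.1, iters: int = 200) -> float:
    """Fixed-step P&O MPPT: keep perturbing the voltage in the direction that
    last increased power; reverse when power drops."""
    direction = 1.0
    p_prev = power(v)
    for _ in range(iters):
        v += direction * step
        p = power(v)
        if p < p_prev:              # power dropped -> reverse the perturbation
            direction = -direction
        p_prev = p
    return v

# Toy P-V curve with its maximum power point at v = 30 (purely illustrative).
pv_curve = lambda v: -(v - 30.0) ** 2 + 100.0
v_mpp = perturb_and_observe(pv_curve, v=20.0)
```

The steady-state oscillation around the maximum is visible here: the tracker never settles exactly at the peak but bounces within one step of it, which is precisely the limitation that adaptive-step and fuzzy-logic variants target.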
Adaptive synchronous sliding control for a robot manipulator based on neural ... - IJECEIAES
Robot manipulators have become important equipment in production lines, medical fields, and transportation. Improving the quality of trajectory tracking for robot hands is always an attractive topic in the research community. This is a challenging problem because robot manipulators are complex nonlinear systems and are often subject to fluctuations in loads and external disturbances. This article proposes an adaptive synchronous sliding control scheme to improve trajectory tracking performance for a robot manipulator. The proposed controller ensures that the positions of the joints track the desired trajectory, synchronizes the errors, and significantly reduces chattering. First, the synchronous tracking errors and synchronous sliding surfaces are presented. Second, the synchronous tracking error dynamics are determined. Third, a robust adaptive control law is designed, the unknown components of the model are estimated online by the neural network, and the parameters of the switching elements are selected by fuzzy logic. The built algorithm ensures that the tracking and approximation errors are ultimately uniformly bounded (UUB). Finally, the effectiveness of the constructed algorithm is demonstrated through simulation and experimental results, which show that the proposed controller is effective, with small synchronous tracking errors and a significantly reduced chattering phenomenon.
Remote field-programmable gate array laboratory for signal acquisition and de... - IJECEIAES
A remote laboratory utilizing field-programmable gate array (FPGA) technologies enhances students' learning experience anywhere and anytime in embedded system design. Existing remote laboratories prioritize hardware access and visual feedback for observing board behavior after programming, neglecting comprehensive debugging tools to resolve errors that require internal signal acquisition. This paper proposes a novel remote embedded-system design approach targeting FPGA technologies that are fully interactive via a web-based platform. Our solution provides FPGA board access and debugging capabilities beyond the visual feedback provided by existing remote laboratories. We implemented a lab module that allows users to seamlessly incorporate it into their FPGA design. The module minimizes hardware resource utilization while enabling the acquisition of a large number of data samples from the signal during the experiments by adaptively compressing the signal prior to data transmission. The results demonstrate an average compression ratio of 2.90 across three benchmark signals, indicating efficient signal acquisition and effective debugging and analysis. This method allows users to acquire more data samples than conventional methods. The proposed lab allows students to remotely test and debug their designs, bridging the gap between theory and practice in embedded system design.
Detecting and resolving feature envy through automated machine learning and m... - IJECEIAES
Efficiently identifying and resolving code smells enhances software project quality. This paper presents a novel solution, utilizing automated machine learning (AutoML) techniques, to detect code smells and apply move method refactoring. By evaluating code metrics before and after refactoring, we assessed its impact on coupling, complexity, and cohesion. Key contributions of this research include a unique dataset for code smell classification and the development of models using AutoGluon for optimal performance. Furthermore, the study identifies the top 20 influential features in classifying feature envy, a well-known code smell, stemming from excessive reliance on external classes. We also explored how move method refactoring addresses feature envy, revealing reduced coupling and complexity, and improved cohesion, ultimately enhancing code quality. In summary, this research offers an empirical, data-driven approach, integrating AutoML and move method refactoring to optimize software project quality. Insights gained shed light on the benefits of refactoring on code quality and the significance of specific features in detecting feature envy. Future research can expand to explore additional refactoring techniques and a broader range of code metrics, advancing software engineering practices and standards.
Smart monitoring technique for solar cell systems using internet of things ba... - IJECEIAES
Rapidly and remotely monitoring and receiving solar cell system status parameters, namely solar irradiance, temperature, and humidity, is critical to enhancing system efficiency. Hence, in the present article an improved smart prototype of an internet of things (IoT) technique based on an embedded system using the NodeMCU ESP8266 (ESP-12E) was realized experimentally. Three different regions in Egypt, the cities of Luxor, Cairo, and El-Beheira, were chosen to study their solar irradiance profile, temperature, and humidity with the proposed IoT system. The monitored solar irradiance, temperature, and humidity data were visualized live in Ubidots through the hypertext transfer protocol (HTTP). The measured solar power radiation in Luxor, Cairo, and El-Beheira ranged between 216-1000, 245-958, and 187-692 W/m² respectively during the solar day. The accuracy and rapidity of the monitoring results obtained with the proposed IoT system make it a strong candidate for monitoring solar cell systems. On the other hand, the obtained solar power radiation results for the three regions strongly favor Luxor and Cairo over El-Beheira as suitable places to build a solar cell system station.
An efficient security framework for intrusion detection and prevention in int...IJECEIAES
Over the past few years, the internet of things (IoT) has advanced to connect billions of smart devices to improve quality of life. However, anomalies or malicious intrusions pose several security loopholes, leading to performance degradation and threat to data security in IoT operations. Thereby, IoT security systems must keep an eye on and restrict unwanted events from occurring in the IoT network. Recently, various technical solutions based on machine learning (ML) models have been derived towards identifying and restricting unwanted events in IoT. However, most ML-based approaches are prone to miss-classification due to inappropriate feature selection. Additionally, most ML approaches applied to intrusion detection and prevention consider supervised learning, which requires a large amount of labeled data to be trained. Consequently, such complex datasets are impossible to source in a large network like IoT. To address this problem, this proposed study introduces an efficient learning mechanism to strengthen the IoT security aspects. The proposed algorithm incorporates supervised and unsupervised approaches to improve the learning models for intrusion detection and mitigation. Compared with the related works, the experimental outcome shows that the model performs well in a benchmark dataset. It accomplishes an improved detection accuracy of approximately 99.21%.
Optimizing Gradle Builds - Gradle DPE Tour Berlin 2024Sinan KOZAK
Sinan from the Delivery Hero mobile infrastructure engineering team shares a deep dive into performance acceleration with Gradle build cache optimizations. Sinan shares their journey into solving complex build-cache problems that affect Gradle builds. By understanding the challenges and solutions found in our journey, we aim to demonstrate the possibilities for faster builds. The case study reveals how overlapping outputs and cache misconfigurations led to significant increases in build times, especially as the project scaled up with numerous modules using Paparazzi tests. The journey from diagnosing to defeating cache issues offers invaluable lessons on maintaining cache integrity without sacrificing functionality.
Discover the latest insights on Data Driven Maintenance with our comprehensive webinar presentation. Learn about traditional maintenance challenges, the right approach to utilizing data, and the benefits of adopting a Data Driven Maintenance strategy. Explore real-world examples, industry best practices, and innovative solutions like FMECA and the D3M model. This presentation, led by expert Jules Oudmans, is essential for asset owners looking to optimize their maintenance processes and leverage digital technologies for improved efficiency and performance. Download now to stay ahead in the evolving maintenance landscape.
Comparative analysis between traditional aquaponics and reconstructed aquapon...bijceesjournal
The aquaponic system of planting is a method that does not require soil usage. It is a method that only needs water, fish, lava rocks (a substitute for soil), and plants. Aquaponic systems are sustainable and environmentally friendly. Its use not only helps to plant in small spaces but also helps reduce artificial chemical use and minimizes excess water use, as aquaponics consumes 90% less water than soil-based gardening. The study applied a descriptive and experimental design to assess and compare conventional and reconstructed aquaponic methods for reproducing tomatoes. The researchers created an observation checklist to determine the significant factors of the study. The study aims to determine the significant difference between traditional aquaponics and reconstructed aquaponics systems propagating tomatoes in terms of height, weight, girth, and number of fruits. The reconstructed aquaponics system’s higher growth yield results in a much more nourished crop than the traditional aquaponics system. It is superior in its number of fruits, height, weight, and girth measurement. Moreover, the reconstructed aquaponics system is proven to eliminate all the hindrances present in the traditional aquaponics system, which are overcrowding of fish, algae growth, pest problems, contaminated water, and dead fish.
Design and optimization of ion propulsion dronebjmsejournal
Electric propulsion technology is widely used in many kinds of vehicles in recent years, and aircrafts are no exception. Technically, UAVs are electrically propelled but tend to produce a significant amount of noise and vibrations. Ion propulsion technology for drones is a potential solution to this problem. Ion propulsion technology is proven to be feasible in the earth’s atmosphere. The study presented in this article shows the design of EHD thrusters and power supply for ion propulsion drones along with performance optimization of high-voltage power supply for endurance in earth’s atmosphere.
Use PyCharm for remote debugging of WSL on a Windo cf5c162d672e4e58b4dde5d797...shadow0702a
This document serves as a comprehensive step-by-step guide on how to effectively use PyCharm for remote debugging of the Windows Subsystem for Linux (WSL) on a local Windows machine. It meticulously outlines several critical steps in the process, starting with the crucial task of enabling permissions, followed by the installation and configuration of WSL.
The guide then proceeds to explain how to set up the SSH service within the WSL environment, an integral part of the process. Alongside this, it also provides detailed instructions on how to modify the inbound rules of the Windows firewall to facilitate the process, ensuring that there are no connectivity issues that could potentially hinder the debugging process.
The document further emphasizes on the importance of checking the connection between the Windows and WSL environments, providing instructions on how to ensure that the connection is optimal and ready for remote debugging.
It also offers an in-depth guide on how to configure the WSL interpreter and files within the PyCharm environment. This is essential for ensuring that the debugging process is set up correctly and that the program can be run effectively within the WSL terminal.
Additionally, the document provides guidance on how to set up breakpoints for debugging, a fundamental aspect of the debugging process which allows the developer to stop the execution of their code at certain points and inspect their program at those stages.
Finally, the document concludes by providing a link to a reference blog. This blog offers additional information and guidance on configuring the remote Python interpreter in PyCharm, providing the reader with a well-rounded understanding of the process.
artificial intelligence and data science contents.pptxGauravCar
What is artificial intelligence? Artificial intelligence is the ability of a computer or computer-controlled robot to perform tasks that are commonly associated with the intellectual processes characteristic of humans, such as the ability to reason.
› ...
Artificial intelligence (AI) | Definitio
International Journal of Electrical and Computer Engineering (IJECE)
Vol. 12, No. 4, August 2022, pp. 3709-3720
ISSN: 2088-8708, DOI: 10.11591/ijece.v12i4.pp3709-3720
Journal homepage: http://ijece.iaescore.com
Real-time eyeglass detection using transfer learning for
non-standard facial data
Ritik Jain¹, Aashi Goyal¹, Kalaichelvi Venkatesan²
¹ Department of Computer Science, Birla Institute of Technology and Science Pilani, Dubai Campus, Dubai, United Arab Emirates
² Department of Electrical and Electronics Engineering, Birla Institute of Technology and Science Pilani, Dubai Campus, Dubai, United Arab Emirates
Article history: Received Jul 17, 2021; Revised Mar 19, 2022; Accepted Mar 30, 2022

ABSTRACT
The aim of this paper is to build a real-time eyeglass detection framework based on deep features present in facial or ocular images, which serve as a prime factor in forensics analysis, authentication systems, and more. Generally, eyeglass detection methods have been executed on cleaned and fine-tuned facial datasets; this yields a well-developed model, but the slightest deviation can affect its performance, giving poor results on real-time non-standard facial images. Therefore, a robust model trained on custom non-standard facial data is introduced. An Inception V3 architecture based pre-trained convolutional neural network (CNN) is used and fine-tuned via model hyperparameters to achieve high accuracy and good precision on non-standard facial images in real-time. This resulted in accuracy scores of about 99.2% and 99.9% for the training and testing datasets respectively, obtained in little time, thereby showing the robustness of the model in all conditions.
Keywords:
Computer vision
Convolutional neural network
Deep learning
Eyeglass detection
Ocular images
Performance analysis
Transfer learning
This is an open access article under the CC BY-SA license.
Corresponding Author:
Ritik Jain
Department of Computer Science, Birla Institute of Technology and Science Pilani, Dubai Campus
P.O. Box No. 345055, Dubai, United Arab Emirates
Email: ritikcjain@gmail.com; f20180231@dubai.bits-pilani.ac.in
1. INTRODUCTION
Human soft biometrics identification is an important and active computer vision (CV) research field. Surveillance systems provide vast amounts of data to be visualized [1]. Facial analysis serves as the backbone for most of these identification tasks [2]. The human face carries recognizable demographic traits and many attributes to compare against. Although factors such as nose bridge level and forehead expression appear to make face identification easy, the process is not elementary, as certain technological factors, such as the resolution and quality of the image, background and shadows, and the presence of eyeglasses, can affect performance.
Human facial analysis can give inaccurate results because of the obstruction eyeglasses introduce near the ocular region, in terms of reflection from the lenses and frame occlusion, for example in facial analysis models using convolutional neural networks. The situation is even worse when images contain sunglasses, which cover the entire eye area and cause eye detection in the system to fail. To increase the precision level for human faces by eliminating the above factors, many systems have included an eyeglasses versus non-eyeglasses classification phase in their pipelines, followed by removal of the eyeglasses by regenerating the facial images using generative adversarial or similar network models. Therefore, a robust and accurate eyeglasses classifier is needed to improve and strengthen such systems so that they remain accurate on real-world non-standard images as well.
Glasses detection is not a new problem, but a very limited amount of work has been dedicated to the clear detection of glasses in facial images. Various approaches have been adopted by researchers, involving handcrafted feature extraction methods, edge detection in the frame bridge and ocular region, and the use of shallow features to reduce the prediction time of the model. The models used for this classification task can be roughly grouped into three categories: linear classifiers, neural networks, and Bayesian networks. Using a linear classifier can be quite simple, as it does not require a lot of training data given the simplicity of the model, but the challenge lies in the feature extraction task, which can be tricky since the image data may have a variety of features depending on the background and lighting conditions. Hidden Markov models (a Bayesian network model) are quite efficient and give high accuracies. The advantage of this type of model is that it is good at capturing certain temporal features, but prior to learning it requires a clear definition of the model, which is difficult given the diversity of our data [3].
All the existing approaches used pre-existing standard datasets with handcrafted feature extraction methods for training, which do not represent real-life scenarios. This leads to poor real-time performance. Given the success of deep learning in various computer vision projects, a convolutional neural network (CNN) based transfer learning model is capable of extracting deep features, taking as inputs images of the same person with and without glasses to enable easy comparison and detection of eyeglasses relative to other facial recognition factors. A CNN acts as a regularized multilayer perceptron network by convolving input data with learned features using 2D convolutional layers. As a result, CNNs can learn filters or patterns that must otherwise be explicitly defined in other classification algorithms. Similarly, in our case there is no need to explicitly define the filter or pattern for eyeglasses; rather, the network automatically learns the features and creates a gradient filter to focus on a specific sub-area of the input image. But for successful filtering and pattern recognition using a CNN, a varied dataset representing all possible real-life scenarios is required. Therefore, an attempt is made to generate such a dataset and fine-tune the trained model on it to increase the classification capability in real-time while remaining time efficient. The glasses cover the ocular region, which is the key factor for their detection. The performance of the model is tested in real-time using a custom-built framework based on OpenCV.
The rest of the paper is organized as follows. Section 2 presents a literature survey describing 14 past works in similar fields. Dataset collection and preparation are described in section 3. Thereafter, section 4 gives detailed information on the implementation and optimization of our model. Finally, the paper presents the results and conclusion in sections 5 and 6, respectively.
2. LITERATURE REVIEW
A good amount of work has been done in the field of eyeglass detection, as detecting glasses plays a key role in their removal from facial images; eyeglasses cause many problems for ocular and facial recognition tasks because of the reflection and shadow they introduce. Mohammad et al. [4] evaluated and compared various CNN structures applied specifically to the ocular region and the frame bridge, which serve as the regions of interest for feature extraction and training of the model. Their CNN model was compressed to run successfully on mobile devices while offering comparable accuracy.
Basbrain et al. [5] emphasized designing a robust shallow CNN system for eyeglass detection to obtain much faster results with high accuracy. They addressed two major CNN design concerns, namely the dataset size adequate for training and the depth of the proposed network architecture. After learning the parameters from a deep neural network and fine-tuning it on a small dataset, some of the convolutional layers are removed to reduce the depth of the network after testing on the validation data.
When targeting a region with a CNN, unrelated regions are commonly included during learning, which leads to neglecting the effective features. Hao et al. [6] used an optimized CNN method instead of the traditional CNN, since the latter does not consider the weights of learning instances. They combined image recognition with bottom-up segmentation, handling target selection and weight deliberation via reinforcement learning; robustness was demonstrated by evaluating on a large number of images, and the authors concluded that the model needs further improvement. In another implementation, Chauhan et al. [7] used CNNs on the Modified National Institute of Standards and Technology (MNIST) and CIFAR-10 datasets for image recognition; the accuracy on MNIST came out to 99.6% using root mean square propagation (RMSprop) as the optimizer. The paper concluded that improvement was needed on the CIFAR-10 dataset, which could be achieved with more epochs and more hidden layers.
Mo et al. [8] suggested an ensemble-learning-based image recognition algorithm implementing error level analysis (ELE-CNN); this method was used to solve problems that a single model cannot predict correctly. The CIFAR-10 dataset was used as the image dataset. A bagging ensemble was used to train the models, with a network structure combining the DenseNet, ResNet, Inception-ResNet-V2, and DenseNet-BC architectures. The final result was the mean probability of the prediction vectors.
Fernandez et al. [9] proposed an algorithm for automated recognition of glasses in face images based on robust alignment and the robust local binary pattern (RLBP). Normalized and pre-processed image data are used to generate an RLBP pattern, based on the observation that the nose pin of glasses is often located at the same level as the center of the ocular region; finally, these features are used to perform the classification task using a support vector machine. Another approach, involving a unique set of Haar-like features on in-plane rotated facial images, was implemented by Du et al. [10].
Bekhet and Alahmer [11] developed a deep-learning-based recognition model for detection of glasses on a realistic selfie dataset, which includes challenging and non-synthesized images of full or partial faces. After an extensive training of almost two weeks on a CNN-based transfer learning model, they were able to build a model with an accuracy score of 96% on this varied dataset.
Much research has been done on eyeglass recognition, removal, and localization [12], [13]. Jiang et al. [14] used an eyeglasses classifier for detection on facial images, employing only six measures to determine the presence of glasses. This method requires that the position of the eyes in the image be known, since it detects the presence of the glasses' bridge in the ocular region between the eyes. Mohammad et al. [15] proposed two schemes, learning and non-learning, for eyeglasses detection. The non-learning scheme used contrast-limited adaptive histogram equalization (CLAHE) filters to obtain edge information as a hint for eyeglass detection. The learning scheme used a multi-layer perceptron (MLP), a support vector machine (SVM), and linear discriminant analysis (LDA), applied on two databases, with results varying between 97.9% and 99.3% detection accuracy.
Eyeglasses can change the appearance of facial images; under infrared imaging they deteriorate image quality, making the detection of glasses a prerequisite for maintaining quality. Drozdowski et al. [16] used a deep neural network method based on the Caffe framework for detection; the neural network classified images without glasses more accurately than a statistical approach, and worked best on ocular images alone rather than the whole infrared face image. Jing and Mariani [17] used a deformable contour approach for detecting the presence of eyeglasses within a Bayesian network framework; they used 50 key points to mark the position and structure of the glasses, found by maximizing the posterior and finally combining edge and geometrical features. There were some ambiguities, as the grey scale of face skin was nearly identical to that of the glasses, which gave some wrong results. Based on the above survey, a model for detection of eyeglasses in non-standard facial images is proposed. Also, an in-depth analysis has been carried out to optimize the model hyperparameters, and real-time testing was performed using a custom-built framework.
3. DATASET PREPARATION
A pre-trained transfer learning model from Keras, trained on the ImageNet ILSVRC dataset, was used by freezing the weights of certain layers and redefining the top layers. A custom dataset of non-standard facial images was used to refine the pre-trained model and update the weights of the top layers, making the model robust when making predictions on real-life data. The dataset used for modelling contains a total of 46,372 images belonging to two classes, namely eyeglasses present (label 0) and eyeglasses absent (label 1). The three sources from which the data was collected are described below. The first part of the dataset is custom built: facial image data was collected from a few family members and friends by requesting them to take and send their pictures with and without eyeglasses under different illuminations and orientations. This part of the dataset comprises a total of 4,062 images from 20 subjects.
The database released by the authors of "An Indian facial database highlighting the spectacles problem" contains 123,213 images of 58 subjects with and without eyeglasses over multiple sessions. These images were captured in real-time under different physical conditions such as illumination, posture and orientation, alertness level, yawning, and different frame types [18]. However, due to the huge size of the data from just 58 subjects, using the entire dataset could cause the model to overfit on it. Hence, a random sub-sample of about 30% was selected in equal proportions from each subject, giving a total of 37,388 images belonging to both output classes in a balanced ratio. The Kaggle dataset "Glasses or No Glasses", containing 4,922 images, was used as the third part of our dataset. This dataset contains high-quality pictures, thus bringing diversity to our dataset [19].
Figure 1 shows a single batch of 16 images from our labelled dataset collected from the three sources mentioned previously. Apart from this, a small dataset of about 400 images was collected from subjects whose images were not used for training, for the purpose of testing the real-time performance of our model on subjects it has never seen before. This dataset is termed the real test data in the rest of the paper. Additionally, two independent published datasets (Olivetti Research Laboratory (ORL) and Sunglasses) were used to compare the generalization capabilities of the trained model. The ORL dataset includes 400 frontal facial images (in grayscale) of 40 different subjects, both wearing and not wearing eyeglasses, taken under various lighting conditions and facial expressions. The second independent dataset was collected from Kaggle and includes about 3,400 facial images of people with and without sunglasses [20], [21].
Figure 1. Single batch of 16 images from our labelled dataset
4. RESEARCH METHOD
This section discusses the pipeline proposed for real-time prediction of the presence of glasses in facial images, along with optimizing and fine-tuning the model by varying different hyperparameters. The hyperparameters varied are the type of optimizer, the learning rate, and the activation function of the connected layers in the neural network model. Figure 2 represents the proposed pipeline.
Figure 2. Pipeline of proposed work
4.1. Face detection and pre-processing of facial images
The input images need to be pre-processed before being used for training. The first step, as shown in the pipeline in Figure 2, is detecting and extracting the face in each input image, which is done using Haar cascades, a technique based on the renowned Viola-Jones face detection algorithm. OpenCV supports object recognition using Haar features, which can recognize faces in an image and return a rectangular region of interest (ROI) that can be cropped and saved as a new image. Using this approach, a new dataset is generated consisting of close shots of the faces in the base dataset.
When using a pre-trained image classification model, the selection of the model must be done carefully. There are a variety of pre-trained models available online, trained on varied datasets and using different network architectures. For this research, a pre-trained model from Keras was chosen, based on the Inception-V3 architecture and trained on the ImageNet dataset. Our pre-processing therefore includes resizing the images to 299×299 to match the input size of the pre-trained Inception-V3 model. The next step is to normalize each pixel value by dividing it by 255, bringing the red, green, blue (RGB) values of each pixel into the range 0 to 1 for ease of calculation while training.
4.2. Model selection and training
Neural networks have proven to be well suited to classification problems as a result of their ability to learn the most significant classification features by virtue of the neurons' collective behavior and the ability of each neuron to perform a fixed amount of parallel computation. Using a convolutional neural network can further boost performance, as it has the advantage of multiple layers over a simple artificial neural network. Using multiple layers helps to automatically detect the most significant features for training without any human supervision.
Transfer learning is nowadays the most popular machine learning technique for training classification models: a model pre-trained with high accuracy on a significantly large dataset is re-used and then refined on the dataset of interest. A typical transfer learning algorithm involves using a pre-trained model and recycling a portion of its weights while reinitializing and updating the weights of the shallower layers. The elemental instance of this approach modifies or updates the weights of the final classification dense layer of a fully trained neural network and uses it to classify a different dataset [22], [23]. Figure 3 visualizes the proposed convolutional neural network architecture used by the transfer learning model and the filters from its first and last convolutional layers.
Inception-V3, the 3rd version of Google's convolutional neural network from the Inception family, is an architecture, shown in Figure 3(a), with several advancements offered by factorized 7×7 convolutions, label smoothing, and propagation of labels lower down the network using an auxiliary classifier and batch normalization [24]. Using a pre-trained model instance from Keras based on the Inception-V3 architecture and trained on the ImageNet ILSVRC dataset, a new model was defined by taking all 311 layers except the top fully connected layers (used for classification) and explicitly freezing these layers' weights to prevent them from being changed or updated during training. To this model a few extra layers were added: a flatten layer to reshape the tensor, a dropout layer with a rate of 0.5 to prevent overfitting, and finally a dense layer with 2 output neurons representing our 2 output class labels, namely eyeglasses present and eyeglasses absent. Our model thus includes a total of 314 layers, with 94 convolutional filter layers; the first convolutional layer has 32 filters and the last has 192 filters. Figures 3(b) and 3(c) show how a typical convolutional layer filters features in an input image.
Figure 3. Proposed Inception V3 based architecture (a) and its first (b) and last (c) convolutional layer filters
From Figure 3(b) it can be deduced that for an input image passing through the first convolutional layer's filters, almost all the features in the image are retained, although some filters (marked by solid-colored boxes) are not yet active at this layer. Comparing this with the filters of our last convolutional layer in Figure 3(c), it can be inferred that the activations become abstract and less interpretable visually. At this layer certain higher-level concepts are encoded, such as specific borders, corners, and angles. The next step involves training the model, which has 262,146 trainable parameters out of a total of 22,064,930 parameters. Multiple models were trained on the batched image dataset while varying the values of different hyperparameters to find the optimized version of the proposed model, hence fine-tuning it. After training all these models, the optimized model was exported and saved as a ".h5" file, which is loaded to make predictions in real-time using a custom framework.
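The freeze-and-retrain recipe can be illustrated with a toy NumPy stand-in (not the actual Keras model): a randomly initialized "pre-trained" layer is held fixed while only a new two-class dense head is trained, mirroring the frozen Inception-V3 base and the trainable top layers described above. All data and layer sizes here are synthetic assumptions for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Frozen "pre-trained" feature layer: its weights are never updated,
# standing in for the frozen Inception-V3 base.
W_frozen = rng.normal(size=(8, 16))
W_frozen_before = W_frozen.copy()

# New trainable head: a dense layer with 2 output neurons (one per class).
W_head = np.zeros((16, 2))

def forward(x):
    feats = np.maximum(x @ W_frozen, 0.0)            # frozen layer + ReLU
    logits = feats @ W_head
    e = np.exp(logits - logits.max(axis=1, keepdims=True))
    return feats, e / e.sum(axis=1, keepdims=True)   # softmax probabilities

# Synthetic two-class data standing in for the labelled face images.
X = rng.normal(size=(64, 8))
y = (X[:, 0] > 0).astype(int)
Y = np.eye(2)[y]                                     # one-hot labels

for _ in range(300):
    feats, probs = forward(X)
    grad_head = feats.T @ (probs - Y) / len(X)       # cross-entropy gradient
    W_head -= 0.5 * grad_head                        # only the head is updated

accuracy = float((forward(X)[1].argmax(axis=1) == y).mean())
```

After training, the frozen weights are bit-for-bit unchanged; only the head has learned, which is exactly the behavior obtained in Keras by setting `trainable = False` on the base layers.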
4.3. Fine-tuning hyperparameters
4.3.1. Training-testing splitting ratios
To study the effect of using different training-testing split ratios, three different ratios were used, and the model was trained on each of them to compare performance. Table 1 shows that all three models perform equally well, with the 80-20 split performing slightly better, achieving the highest accuracy and least loss of the three. Also, a batch size of 16 was found to be the best selection out of 8, 16, and 32, with a significantly higher accuracy, about 15% more than the other two.
Table 1. Analysis of the model for different proposed training-testing ratios
Train Ratio Test Ratio Train Accuracy Test Accuracy Train Loss
60% 40% 98.57 99.82 0.24
70% 30% 98.78 99.76 0.19
80% 20% 98.91 99.88 0.17
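The seeded splits above can be reproduced along these lines; `split_dataset` is a hypothetical stdlib-only helper sketch, not the authors' code (the actual seed value of 100 is reported later in section 4.3.2):

```python
import random

def split_dataset(samples, train_ratio=0.8, seed=100):
    """Shuffle with a fixed seed and split into train/test parts.

    Illustrative sketch: the 80-20 ratio mirrors the best-performing split
    in Table 1, and the fixed seed makes the split reproducible.
    """
    shuffled = samples[:]
    random.Random(seed).shuffle(shuffled)   # reproducible shuffle
    cut = int(len(shuffled) * train_ratio)
    return shuffled[:cut], shuffled[cut:]

# Example with 1,000 dummy sample ids: 800 train, 200 test, nothing lost.
train, test = split_dataset(list(range(1000)))
```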
4.3.2. Optimizer
Optimizers are of great importance to the performance of neural networks. These are the algorithms that govern the changes or updates in weights and learning rate during the training process; they help reduce and hence minimize the loss. The selection of an optimizer therefore becomes crucial to the performance of our model. As a classification task is performed in this research, the loss function of interest is the categorical cross-entropy loss, the standard loss function for multiclass classification tasks. To minimize the loss and maximize the performance of our model, a detailed analysis was performed based on the performance of five distinct optimization algorithms, namely stochastic gradient descent, RMSprop, Adam, Adagrad, and Adadelta. To ensure the uniformity of our analysis, a seed value of 100 was used while splitting our data into test (20%) and train (80%) datasets. The loss and accuracy plots in Figures 4 and 5, respectively, show the significantly better performance of the Adagrad algorithm over the other optimization algorithms, as it converges to the minimum the fastest and has the least overall loss along with the highest accuracy, as listed in Table 2.
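The categorical cross-entropy loss named above is simple to state in code; a short NumPy sketch (illustrative, not the framework's implementation):

```python
import numpy as np

def categorical_cross_entropy(y_true, y_pred, eps=1e-12):
    """Mean categorical cross-entropy over a batch.

    y_true: one-hot labels, shape (N, C); y_pred: predicted probabilities.
    """
    y_pred = np.clip(y_pred, eps, 1.0)      # avoid log(0)
    return float(-np.mean(np.sum(y_true * np.log(y_pred), axis=1)))

# A confident correct prediction incurs far less loss than a wrong one.
y = np.array([[1.0, 0.0]])
loss_good = categorical_cross_entropy(y, np.array([[0.9, 0.1]]))
loss_bad = categorical_cross_entropy(y, np.array([[0.1, 0.9]]))
```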
4.3.3. Learning rate
As Adagrad (adaptive gradient) turned out to be the best performing optimization algorithm, offering the highest accuracy and minimum loss in comparison to the other algorithms listed in Table 2, it is used as the optimization function, and further fine-tuning is performed on its learning rate, which governs the rate of change of the weights during learning. Adagrad is an improvement over stochastic gradient descent with an adaptive learning rate: it performs larger updates on the weights of features that occur less frequently and smaller updates on features that occur more frequently. Instead of recording the sum of gradients, as in momentum, Adagrad accumulates the sum of squared gradients, which is used to scale the update in each direction as shown in (1). The gradient history is stored in Gt, a diagonal matrix in which each element (i, i) is the sum of squared gradients with respect to wi up to step t; ε is introduced to avoid division by zero. This diagonal matrix recording the history of squared gradients is used to adapt the learning rate (α) [25].
$$w_{t+1,i} = w_{t,i} - \frac{\alpha}{\sqrt{G_{t,ii} + \varepsilon}} \times g_{t,i} \tag{1}$$
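Formula (1) maps almost line-for-line onto code. The following is a minimal NumPy sketch of a single Adagrad step, with variable names mirroring the symbols above; it is illustrative rather than the Keras optimizer actually used for training.

```python
import numpy as np

def adagrad_step(w, g, G, alpha=0.001, eps=1e-8):
    """One Adagrad update per formula (1).

    w : current weights w_t
    g : gradient g_t
    G : running sum of squared gradients (the diagonal of G_t)
    """
    G = G + g ** 2                        # accumulate squared gradients
    w = w - alpha / np.sqrt(G + eps) * g  # per-parameter adaptive step
    return w, G
```

Because G only grows, the effective learning rate alpha/sqrt(G + eps) shrinks over time, and it shrinks fastest for the frequently-updated (large-gradient) parameters, which is exactly the behaviour described above.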
Although Adagrad eliminates the process of manually tuning the learning rate, specifying the initial
learning rate can still change the performance of the model in terms of how fast it converges to its
minima. Hence, a comparison of model performance for five exponentially decreasing values of the initial
learning rate, ranging from 1.0 to 0.0001 while using the Adagrad optimizer, is represented visually in Figure 6.
Int J Elec & Comp Eng ISSN: 2088-8708
Real-time eyeglass detection using transfer learning for non-standard facial data (Ritik Jain)
Figure 4. Training loss for different optimizer functions
Figure 5. Training accuracy for different optimizer functions
ISSN: 2088-8708
Int J Elec & Comp Eng, Vol. 12, No. 4, August 2022: 3709-3720
Table 2. Analysis of the model for the different proposed optimizer functions
Optimizer function Train accuracy (%) Train loss Test accuracy (%) Test loss
SGD 98.80 41.48 99.60 3.87
RMSprop 98.74 24.68 99.94 0.91
Adam 98.84 17.18 99.86 4.00
AdaGrad 99.23 3.56 99.95 0.25
AdaDelta 85.43 32.48 97.14 10.16
Figure 6. Training loss for different values of initial learning rate using Adagrad optimizer
In the plots in Figure 6, the loss decreases with decreasing learning rate only down to 0.001. On
decreasing the learning rate exponentially below 0.001, an increase in the overall loss value was observed for
our trained model. Also, from Table 3 it can be inferred that accuracy is not a suitable factor for comparing
performance across initial learning rates, because the differences in accuracy are marginal. Looking instead at
the loss values in Table 3 for models trained using different learning rates, a learning rate of 0.001 is the
best performing value, as it yields the lowest loss for both training and testing datasets.
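The sensitivity to the initial learning rate can be illustrated on a toy problem. The sketch below runs Adagrad on f(w) = w² for several exponentially decreasing initial rates; which rate works best is problem-dependent (on this toy, larger rates converge faster, whereas Table 3 favours 0.001 for the real network), but the point stands that Adagrad's adaptation does not remove the influence of the initial value.

```python
import numpy as np

def adagrad_minimize(lr, steps=200):
    """Minimize f(w) = w^2 with Adagrad, starting from w = 5; return final loss."""
    w, G = 5.0, 0.0
    for _ in range(steps):
        g = 2.0 * w                       # gradient of w^2
        G += g ** 2                       # accumulate squared gradients
        w -= lr / np.sqrt(G + 1e-8) * g   # adaptive update, as in formula (1)
    return w ** 2

# sweep exponentially decreasing initial learning rates
losses = {lr: adagrad_minimize(lr) for lr in [1.0, 0.1, 0.01, 0.001]}
```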
Table 3. Analysis of the model for the different proposed initial learning rate values
Learning rate Train accuracy (%) Train loss Test accuracy (%) Test loss
1.0 99.15 21.299 99.90 1.018
0.1 99.13 1.991 99.90 0.128
0.01 99.21 0.194 99.70 0.022
0.001 99.17 0.032 99.97 0.002
0.0001 96.81 0.091 99.47 0.024
4.3.4. Activation function (for fully connected layer)
At last, different activation functions were analyzed for the top fully connected dense layer
consisting of 2 output neurons, which gives a probabilistic output for an input image belonging to each of the
two output classes, namely eyeglasses present and eyeglasses absent. The rectified linear unit (ReLU)
activation function is often used for hidden layers because of its non-linear property and because it largely
avoids the backpropagation errors seen with the sigmoid activation function; the output layer, however, needs a
probabilistic activation. Therefore, an analysis of the performance of the Sigmoid and SoftMax activation
functions is performed, as shown in Figure 7.
Figure 7. Training accuracy and loss for different activation function of fully connected layer
The Sigmoid and SoftMax activation functions are very similar to each other, the main difference
being that Sigmoid is generally used for binary classification tasks while SoftMax is generally used for
multi-class classification tasks. As a result of this similarity, a very similar trend is seen in Figure 7, and
the results of both activation functions are found to be nearly identical. Hence the choice between the
proposed activation functions does not change or improve the performance of the model. The SoftMax activation
function was used for the final optimized model.
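The near-identical behaviour of the two activations is expected mathematically: for two classes, SoftMax reduces exactly to a Sigmoid of the logit difference. A small self-contained check (the logit values are illustrative, not from the paper):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def softmax(z):
    e = np.exp(z - np.max(z))   # subtract max for numerical stability
    return e / e.sum()

# For logits [z1, z2], softmax's first output equals sigmoid(z1 - z2)
z = np.array([2.0, -1.0])
p_softmax = softmax(z)[0]
p_sigmoid = sigmoid(z[0] - z[1])
```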
4.4. Simulation of real-time eyeglass detection framework
For the real-time detection framework, the model trained on the fine-tuned hyper-parameters is
loaded from its saved .h5 file. This loaded model is used to make predictions on frames captured in
real-time by OpenCV [26]. Initially, the entire frame is captured, and then the faces present in the frame are
extracted using the Viola-Jones based Haar cascade method, as mentioned in section 4.1. Before making any
predictions, the extracted face image needs to be pre-processed using the techniques mentioned in section 4.1:
each frame extracted from the live camera feed is resized to the model input size of 299×299 and scaled so that
each pixel lies between 0 and 1. This processed frame then serves as input to the loaded model, which makes a
probabilistic prediction for each output class. The label of the output class with the highest probability is
taken as the final output. Some snips from our eyeglass detection framework predicting the presence of
different types of eyeglasses in real-time are shown in the results in section 5.
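The pre-processing step described above (resize each face crop to the 299×299 Inception V3 input size and scale pixels to [0, 1]) can be sketched as follows. The actual framework would use cv2.resize on OpenCV frames; the nearest-neighbour resize here is a self-contained stand-in so the sketch runs without OpenCV.

```python
import numpy as np

def preprocess_face(face, size=299):
    """Resize a face crop to (size, size, 3) and scale pixels to [0, 1]."""
    h, w = face.shape[:2]
    rows = np.arange(size) * h // size   # nearest-neighbour row indices
    cols = np.arange(size) * w // size   # nearest-neighbour column indices
    resized = face[rows][:, cols]
    return resized.astype(np.float32) / 255.0   # pixel range [0, 1]
```

The result, expanded with a batch axis via np.expand_dims(..., 0), matches the Inception V3 input shape of (1, 299, 299, 3).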
5. RESULTS AND DISCUSSION
After successful collection and pre-processing of the data as mentioned in sections 3 and 4.1
respectively, the model was trained on 80% of this data using the Adagrad optimizer with the categorical
cross-entropy loss function, an initial learning rate of 0.001, and SoftMax activation for the final dense
classification layer, giving the results reported in Table 4. The model was evaluated on the remaining 20% of
the data, termed the test data. To further test the robustness and performance of our model, another small
custom-built dataset, termed real test data, was used, which includes subjects different from those in the main
dataset. Two additional independent test datasets, namely the ORL and Sunglasses datasets, were collected from
published resources to further test the generalization capability of our model.
Table 4. Performance metrics for the optimized model on different datasets
Dataset Accuracy (%) Loss
Train Data (80%) 99.28 0.0326
Test Data (20%) 99.95 0.0026
Real Test Data 100.00 0.0004
ORL Test Data 94.38 0.1472
Sunglasses Test Data 98.20 0.0962
The above results clearly demonstrate the robustness of our model in all conditions. The proposed
model performs exceptionally well on the training, testing and real-time testing datasets. Compared to the
performance of the model proposed by Jiang et al. [14] on the ORL dataset, our model is able to generalize
well, with a recall value of 100%, while the edge likelihood probabilistic model proposed in [14]
performed well on its in-house dataset (maximum Fisher value of 28.9) but gave comparatively poor results
(maximum Fisher value of 9.6) on the ORL test dataset. As the ORL dataset contains grayscale facial images, it
is a little more difficult to extract features from such images, which explains the slightly lower accuracy
compared with the other testing datasets. To visualize the results of our model as shown in Figure 8, the
activations of our final dense classification layer were used to plot guided gradient and saliency maps, which
represent the unique contribution of each pixel in an input image. Finally, to test the performance of our
model in real-time, a custom-built framework was used to detect the presence of eyeglasses in facial images, as
shown in Figure 9. Two different cameras, of standard-quality and high-quality resolution respectively, were
used.
Figure 10 shows some of the images from the two independent test datasets that were misclassified
by our trained model. Possible reasons for misclassification include extremely low-quality images, as in the
first two images. Flipped (upside-down) and inappropriately cropped images also caused difficulties in making
positive predictions. It was further observed that certain facial images without eyeglasses were misclassified
because of shadows present in the ocular region. With aging, the ocular region develops wrinkles, which also
produce dark patches that account for some misclassified results.
Figure 8. Guided gradient and saliency map
Figure 9. Real-time framework prediction sample
Figure 10. Misclassified test images
6. CONCLUSION
This paper presents an effective and efficient eyeglass detection model based on a deep
convolutional neural network and transfer learning. First, the input data is classified into with-glasses and
without-glasses images of the same person so as to make the detection of eyeglasses more accurate. Then the
learned weights from the pre-trained model were added to the corresponding layers so that glasses covering the
ocular region can be detected. The model has proved to be highly efficient, giving a desirable accuracy of
99.9% on testing data in a significantly small amount of time. The model has also been tested on two
independent datasets, giving accuracy scores of 98.2% and 94.4% respectively. Our model is robust enough to
detect the presence of all types of eyeglasses in real-time, which gives it an edge over other similar models.
The main contribution of this paper is the design of a robust eyeglass detection model using convolutional
neural networks (CNN) that gives high accuracy on non-standard real-life facial/ocular data.
Automatic eyeglass detection is useful because eyeglasses cause obstruction during facial analysis
and recognition tasks such as security, surveillance, user authentication and other intelligent systems. The
proposed detection algorithm can be extended to remove eyeglasses from facial images and hence improve the
accuracy of facial recognition algorithms. The model can be further generalized by adding more data from
real-life scenarios to boost real-time performance.
ACKNOWLEDGEMENTS
The authors are grateful to the authorities of Birla Institute of Technology and Science Pilani, Dubai
Campus, for the support provided to carry out this research work.
REFERENCES
[1] D. A. Reid, S. Samangooei, C. Chen, M. S. Nixon, and A. Ross, “Soft biometrics for surveillance: an overview,” in Handbook of
statistics-Machine Learning: Theory and Applications, 2013, pp. 327–352.
[2] A. K. Jain, A. Ross, and S. Prabhakar, “An introduction to biometric recognition,” IEEE Transactions on Circuits and Systems for
Video Technology, vol. 14, no. 1, pp. 4–20, Jan. 2004, doi: 10.1109/TCSVT.2003.818349.
[3] B. Garcia and S. A. Viesca, “Real-time American sign language recognition with convolutional neural networks,” Convolutional
Neural Networks for Visual Recognition, vol. 2, pp. 225–232.
[4] A. S. Mohammad, A. Rattani, and R. Derakhshani, “Comparison of squeezed convolutional neural network models for eyeglasses
detection in mobile environment,” Journal of the ACM, vol. 33, no. 5, pp. 136–144, 2018.
[5] A. M. Basbrain, I. Al-Taie, N. Azeez, J. Q. Gan, and A. Clark, “Shallow convolutional neural network for eyeglasses detection in
facial images,” in 2017 9th Computer Science and Electronic Engineering (CEEC), Sep. 2017, pp. 157–161, doi:
10.1109/CEEC.2017.8101617.
[6] W. Hao, R. Bie, J. Guo, X. Meng, and S. Wang, “Optimized CNN based image recognition through target region selection,”
Optik, vol. 156, pp. 772–777, Mar. 2018, doi: 10.1016/j.ijleo.2017.11.153.
[7] R. Chauhan, K. K. Ghanshala, and R. C. Joshi, “Convolutional neural network (CNN) for image detection and recognition,” in
2018 First International Conference on Secure Cyber Computing and Communication (ICSCCC), Dec. 2018, pp. 278–282, doi:
10.1109/ICSCCC.2018.8703316.
[8] W. Mo, X. Luo, Y. Zhong, and W. Jiang, “Image recognition using convolutional neural network combined with ensemble
learning algorithm,” Journal of Physics: Conference Series, vol. 1237, no. 2, Jun. 2019, doi: 10.1088/1742-6596/1237/2/022026.
[9] A. Fernández, R. García, R. Usamentiaga, and R. Casado, “Glasses detection on real images based on robust alignment,” Machine
Vision and Applications, vol. 26, no. 4, pp. 519–531, May 2015, doi: 10.1007/s00138-015-0674-1.
[10] S. Du, J. Liu, Y. Liu, X. Zhang, and J. Xue, “Precise glasses detection algorithm for face with in-plane rotation,” Multimedia
Systems, vol. 23, no. 3, pp. 293–302, Jun. 2017, doi: 10.1007/s00530-015-0483-4.
[11] S. Bekhet and H. Alahmer, “A robust deep learning approach for glasses detection in non‐standard facial images,” IET
Biometrics, vol. 10, no. 1, pp. 74–86, Jan. 2021, doi: 10.1049/bme2.12004.
[12] S. Mehta, S. Gupta, B. Bhushan, and C. K. Nagpal, “Face recognition using neuro-fuzzy inference system,” International Journal
of Signal Processing, Image Processing and Pattern Recognition, vol. 7, no. 1, pp. 331–344, Feb. 2014, doi:
10.14257/ijsip.2014.7.1.31.
[13] W. Ou, X. You, D. Tao, P. Zhang, Y. Tang, and Z. Zhu, “Robust face recognition via occlusion dictionary learning,” Pattern
Recognition, vol. 47, no. 4, pp. 1559–1572, Apr. 2014, doi: 10.1016/j.patcog.2013.10.017.
[14] X. Jiang, M. Binkert, B. Achermann, and H. Bunke, “Towards detection of glasses in facial images,” in Proceedings. Fourteenth
International Conference on Pattern Recognition (Cat. No.98EX170), 1998, vol. 2, pp. 1071–1073, doi:
10.1109/ICPR.1998.711877.
[15] A. S. Mohammad, A. Rattani, and R. Derahkshani, “Eyeglasses detection based on learning and non-learning based classification
schemes,” in 2017 IEEE International Symposium on Technologies for Homeland Security (HST), Apr. 2017, pp. 1–5, doi:
10.1109/THS.2017.7943484.
[16] P. Drozdowski, F. Struck, C. Rathgeb, and C. Busch, “Detection of glasses in near-infrared ocular images,” in 2018 International
Conference on Biometrics (ICB), Feb. 2018, pp. 202–208, doi: 10.1109/ICB2018.2018.00039.
[17] Z. Jing and R. Mariani, “Glasses detection and extraction by deformable contour,” in Proceedings - International Conference on
Pattern Recognition, 2000, vol. 15, no. 2, pp. 933–936, doi: 10.1109/icpr.2000.906227.
[18] M. Z. Lazarus, S. Gupta, and N. Panda, “An Indian facial database highlighting the spectacle problems,” IEEE International
Conference on Innovative Technologies in Engineering 2018 (ICITE OU), 2018, doi: 10.21227/ay8v-kj06.
[19] J. Heaton, “Glasses or no glasses.” Kaggle, 2020, Accessed: Apr. 20, 2021. [Online]. Available:
https://www.kaggle.com/jeffheaton/glasses-or-no-glasses.
[20] J. Zhang, Y. Yan, and M. Lades, “Face recognition: eigenface, elastic matching, and neural nets,” Proceedings of the IEEE, vol.
85, no. 9, pp. 1423–1435, 1997, doi: 10.1109/5.628712.
[21] Amol, “Sunglasses/no sunglasses.” Kaggle, 2021, Accessed: Apr. 21, 2021. [Online]. Available:
https://www.kaggle.com/amol07/sunglasses-no-sunglasses.
[22] K. Weiss, T. M. Khoshgoftaar, and D. Wang, “A survey of transfer learning,” Journal of Big Data, vol. 3, no. 1, Dec. 2016, doi:
10.1186/s40537-016-0043-6.
[23] F. Zhuang et al., “A comprehensive survey on transfer learning,” Proceedings of the IEEE, vol. 109, no. 1, pp. 43–76,
Jan. 2021.
[24] C. Szegedy, V. Vanhoucke, S. Ioffe, J. Shlens, and Z. Wojna, “Rethinking the inception architecture for computer vision,” in
2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Jun. 2016, pp. 2818–2826, doi:
10.1109/CVPR.2016.308.
[25] S. Ruder, “An overview of gradient descent optimization algorithms,” arXiv:1609.04747v2, Sep. 2016.
[26] N. Mahamkali and V. Ayyasamy, “OpenCV for computer vision applications,” in Proceedings of National Conference on Big
Data and Cloud Computing (NCBDC’15), 2015, pp. 52–56.
BIOGRAPHIES OF AUTHORS
Ritik Jain is a final year B.E. Computer Science student at BITS Pilani, Dubai
Campus. He is the General Secretary of Microsoft Tech Club at BITS Pilani, Dubai Campus
and a Microsoft Learn Student Ambassador since 2020. He is a machine learning enthusiast
and has worked on various projects in the same field. His research interests are in the areas of
Computer Vision, Sentiment Analysis, and Intelligent Automated Systems. He can be
contacted at email: ritikcjain@gmail.com.
Aashi Goyal is a final year B.E. Computer Science student at Birla Institute of
Technology and Science (BITS) Pilani, Dubai Campus. She has worked on various projects in
the same field. She is currently the E-Portal lead at the Artificial Intelligence club at BITS
Pilani, Dubai Campus. She has her work published at an international conference. Her area of
research interest includes computer vision, prediction and sentiment analysis. She can be
contacted at email: aashigoyal2109@gmail.com.
Kalaichelvi Venkatesan is an Associate Professor in the Department of Electrical
and Electronics Engineering at BITS Pilani, Dubai Campus. She has been working at BITS Pilani,
Dubai Campus since 2008 and has 29 years of teaching experience. She has her
research work published in refereed international journals and many international conferences.
She has also reviewed many papers in International Journals and Conferences. Her research
area of expertise includes Process Control, Control Systems, Neural Networks, Fuzzy Logic
and Computer Vision. She is currently guiding students in the area of Intelligent Control
Techniques applied to Robotics and Mechatronics. She can be contacted at email:
kalaichelvi@dubai.bits-pilani.ac.in.