In general, good communication is achieved when verbal and non-verbal information, such as body movements, gestures, and facial expressions, flows in both directions between speaker and listener. Facial expression in particular is an indicator of the inner state of the speaker and/or listener during a conversation, so recognizing facial expressions is an important communicative ability. This ability is a challenge for visually impaired persons, a fact that motivated us to develop a facial expression recognition system based on a deep learning algorithm. We implemented the proposed system on a wearable device that enables visually impaired persons to recognize facial expressions during a conversation. We conducted several experiments involving visually impaired persons to validate the proposed system, and promising results were achieved.
IRJET - Facial Expression Recognition using Deep Learning: A Review (IRJET Journal)
This document provides a review of facial expression recognition using deep learning approaches. It begins with an introduction to facial expression recognition and its applications. It then discusses commonly used datasets for facial expression recognition, including image-based datasets like JAFFE and video-based datasets like CK+. The document reviews 26 previous research papers that used deep learning methods like convolutional neural networks for facial expression recognition. It concludes that convolutional neural networks provide more accurate results for facial expression recognition compared to traditional methods.
Face recognition smart cane using haar-like features and eigenfaces (TELKOMNIKA JOURNAL)
1) The document describes a prototype for a smart cane with face recognition capabilities to help visually impaired people identify faces.
2) The prototype uses a Raspberry Pi, camera mounted on eyeglasses, and earphones to provide audio feedback. The camera captures images that are analyzed using Haar-like features and eigenfaces algorithms for face detection and recognition.
3) Testing showed the prototype could recognize faces within 1-1.5 meters with 91.67% accuracy for frontal faces but only 18-32% for other positions. It took around 3 seconds to recognize one face and longer for multiple faces.
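The Haar-like features named above are cheap to evaluate because they reduce to a few lookups in an integral image (summed-area table). As a hedged sketch, not the prototype's actual code, a two-rectangle feature can be computed like this:

```python
# Sketch: a two-rectangle Haar-like feature via an integral image,
# the core trick behind Haar cascade detectors. Names are illustrative.

def integral_image(img):
    """Summed-area table: ii[y][x] = sum of img[0..y][0..x]."""
    h, w = len(img), len(img[0])
    ii = [[0] * w for _ in range(h)]
    for y in range(h):
        row_sum = 0
        for x in range(w):
            row_sum += img[y][x]
            ii[y][x] = row_sum + (ii[y - 1][x] if y > 0 else 0)
    return ii

def rect_sum(ii, x, y, w, h):
    """Sum of pixels in the rectangle with top-left (x, y) and size w x h,
    using four lookups in the integral image."""
    a = ii[y + h - 1][x + w - 1]
    b = ii[y - 1][x + w - 1] if y > 0 else 0
    c = ii[y + h - 1][x - 1] if x > 0 else 0
    d = ii[y - 1][x - 1] if x > 0 and y > 0 else 0
    return a - b - c + d

def haar_two_rect(ii, x, y, w, h):
    """Left-half minus right-half: a simple vertical-edge feature."""
    half = w // 2
    return rect_sum(ii, x, y, half, h) - rect_sum(ii, x + half, y, half, h)
```

A cascade evaluates thousands of such features per window, which is why the integral image precomputation matters: each feature costs only a handful of additions regardless of rectangle size.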
IJRET: International Journal of Research in Engineering and Technology is an international, peer-reviewed, online journal published by eSAT Publishing House for the enhancement of research across the disciplines of Engineering and Technology. The aim and scope of the journal is to provide an academic medium and an important reference for the advancement and dissemination of research results that support high-level learning, teaching and research in the fields of Engineering and Technology. We bring together scientists, academicians, field engineers, scholars and students of related fields of Engineering and Technology.
Review of facial expression recognition system and used datasets (eSAT Journals)
Abstract: The human face is the main feature for recognizing individuals and also conveys important information about a user's current state through different expressions. Therefore, in the biometric research area, automatic face and facial expression recognition attracts researchers' interest. Other areas that use such techniques include computer science, medicine, and psychology. A face recognition system usually consists of many internal tasks, of which face detection is the first. Due to the variations across human faces, detecting a face is a complex process, but with the help of different modeling methods it becomes possible to recognize the face and hence different facial expressions. This paper presents a literature review of the techniques and methods used for facial expression recognition. Different facial expression datasets available for research or for testing existing facial expression recognition methods are also discussed. Keywords: Facial Expression, Face Detection, Feature Extraction, Recognition, Datasets.
MOBILE AUGMENTED REALITY APPLICATION FOR PRIMARY SCHOOL EDUCATION (ijma)
Mobile-based Augmented Reality (AR) has grown rapidly in various sectors, including education. The technology supports users in performing daily tasks, for example as a tool in teaching and learning. However, teaching and learning activities still follow a traditional, book-based method that leaves students bored and unable to focus. With the advent of the 'BelajarBacaanSolat' Augmented Reality application (AR-BBS), students can improve their level of understanding while staying focused and engaged in learning. The objective of this project was to help primary school students study an Islamic subject, namely Kelas al-Quran dan Fardu Ain (KAFA); it can also help KAFA teachers teach more effectively. This research focuses on the topic of 'BacaanSolat' in prayer. The framework proposed in this research contains three items: entity, platform and content. In conclusion, this project can help students, especially KAFA students, learn more efficiently, and it is believed that the application can improve their skills and attract them to learn KAFA effectively.
This document summarizes recent developments in face recognition technology. It discusses how face recognition can be used for verification (one-to-one matching) or identification (one-to-many matching). It then outlines several major approaches to face recognition including template-based methods, statistical approaches, and neural network-based approaches. Specific algorithms are discussed, such as Viola-Jones, HOG, R-CNN, and YOLO. Finally, potential applications of face recognition technology are presented, including biometrics, healthcare, education, and emotion analysis.
IRJET - Prediction of Human Facial Expression using Deep Learning (IRJET Journal)
This document discusses predicting human facial expressions using deep learning. It begins with an introduction to emotion recognition, facial emotion recognition, and deep learning techniques. It then reviews related literature on facial expression recognition using methods like CNNs, active appearance models, and analyzing facial features. The proposed system and methodology are described, including face registration, feature extraction focusing on action units, and using classifiers like KNN, MLP, and SVM to classify emotions based on these features. The goal is to recognize seven basic emotions (neutral, joy, sadness, surprise, anger, fear, disgust) in still images using deep learning techniques.
IRJET - An Innovative Approach for Interviewer to Judge State of Mind of an In... (IRJET Journal)
This document presents a proposed system to analyze the state of mind of an interviewee during an interview using facial expression recognition and classification. The system uses the Fisher Face algorithm to detect facial features from video frames and Naive Bayes classification to categorize the detected expressions as indicators of emotional states such as happy, sad, or angry. This automated analysis of facial expressions could provide feedback to improve the interview process and the selection of candidates.
Development of video-based emotion recognition using deep learning with Googl... (TELKOMNIKA JOURNAL)
Emotion recognition from images, videos, or speech has been a hot research topic for some years, and the introduction of deep learning techniques, e.g., convolutional neural networks (CNN), has produced promising results. Human facial expressions are critical components in understanding one's emotions. This paper sheds light on recognizing emotions from videos using deep learning techniques. The methodology of the recognition process is described, some of the video-based datasets used in many scholarly works are examined, and results from different emotion recognition models are presented along with their performance parameters. An experiment carried out on the fer2013 dataset in Google Colab for depression detection achieved 97% accuracy on the training set and 57.4% accuracy on the testing set.
Face recognition for presence system by using residual networks-50 architectu... (IJECEIAES)
A presence system records individual attendance in a company, school or institution. There are several types of presence system, including manual systems using signatures, systems using fingerprints, and systems using face recognition technology; the latter applies a biometric system to the process of recording attendance. In this research we used one of the convolutional neural network (CNN) architectures that won the ImageNet Large Scale Visual Recognition Competition (ILSVRC) in 2015, namely the Residual Networks-50 architecture (ResNet-50), for face recognition. Our contribution is to determine the effectiveness of the ResNet architecture under different hyperparameter configurations, covering the number of hidden layers, the number of units in each hidden layer, the batch size, and the learning rate. Because hyperparameters are selected experimentally and each one affects the final accuracy, we tried 22 configurations (experiments) to find the best one, obtaining a best model with an accuracy of 99%.
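A hyperparameter search like the one described is typically an enumeration over a small grid. The sketch below shows that idea; the paper's actual 22 configurations are not listed, so these values are placeholders:

```python
# Sketch: enumerating hyperparameter configurations for a grid search.
# The concrete values here are illustrative, not the paper's settings.
from itertools import product

search_space = {
    "hidden_layers": [1, 2],
    "hidden_units": [128, 256],
    "batch_size": [32, 64],
    "learning_rate": [1e-3, 1e-4],
}

def enumerate_configs(space):
    """Cartesian product over the search space, one dict per configuration."""
    keys = list(space)
    return [dict(zip(keys, values))
            for values in product(*(space[k] for k in keys))]

configs = enumerate_configs(search_space)
# Each entry looks like {"hidden_layers": 1, "hidden_units": 128, ...};
# in the paper's setting, each would be trained and scored on accuracy.
```

An uneven count such as 22 suggests the authors pruned or hand-picked configurations rather than running a full cross product, which would yield a power-of-the-grid count like the 16 above.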
An HCI Principles based Framework to Support Deaf Community (IJEACS)
Sign language is the communication language preferred and used by deaf people to converse with others in the community. Even with sign language, a communication gap remains between hearing and deaf persons. Solutions such as sensor gloves already exist to address this problem, but they are limited and do not cover all parts of the language that a deaf person needs for an ordinary person to understand what is being said. Due to the limited credibility of existing sign language translation solutions, we propose a system that aims to assist deaf people in communicating with others in society and, in turn, to help them understand hearing people easily. Knowing the needs of the users helps us focus the Human Computer Interaction technologies for deaf people, making the system more user-friendly and a better alternative to existing technologies. The Human Computer Interaction (HCI) concepts of usability, empirical measurement and simplicity are the key considerations in the development of our system. The proposed Kinect system removes the need for physical contact by using the Microsoft Kinect for Windows SDK beta. The results show a strong, positive, emotional impact on persons with physical disabilities and their families and friends, giving them the ability to communicate easily with non-repetitive gestures.
Automatic Emotion Recognition Using Facial Expression: A Review (IRJET Journal)
This document reviews automatic emotion recognition using facial expressions. It discusses how facial expressions are an important form of non-verbal communication that can express human perspectives and mental states. The document then summarizes several popular techniques for automatic facial expression recognition systems, including those based on statistical movement, auto-illumination correction, identification-driven emotion recognition for social robots, e-learning approaches, cognitive analysis for interactive TV, and motion detection using optical flow. Each technique is assessed in terms of its pros and cons. The goal of the techniques is to enhance human-computer interaction through more accurate and real-time recognition of facial expressions and emotions.
FACE DETECTION AND FEATURE EXTRACTION FOR FACIAL EMOTION DETECTION (vivatechijri)
Abstract: Facial emotion recognition has been a major issue and an advanced area of research in the fields of Human-Machine Interaction and Image Processing. To obtain a facial expression, the system needs to handle a variety of human facial features such as color, shape, reflection, and posture. To determine a person's facial expression it is first necessary to extract various facial features such as eye movement, nose, and lips, and then to classify them by comparison against the trained data using an appropriate classifier. An AI-based approach to this novel visual system is suggested. There are two main processes in the proposed system, namely face detection and feature extraction. Face detection is performed using the Haar cascade method, a suitable feature extraction method is used to extract the features, and a support vector machine is then used to classify the final expression. The FER2013 dataset is used for training.
International Journal of Engineering and Science Invention (IJESI) is an international journal intended for professionals and researchers in all fields of computer science and electronics. IJESI publishes research articles and reviews within the whole field of Engineering, Science and Technology, covering new teaching methods, assessment, validation and the impact of new technologies, and it will continue to provide information on the latest trends and developments in this ever-expanding subject. Papers are selected through double peer review to ensure originality, relevance, and readability. The articles published in our journal can be accessed online.
IRJET - Age Analysis using Face Recognition with Hybrid Algorithm (IRJET Journal)
1) The document presents a study on age analysis of human faces using a hybrid algorithm combining multiple methods including convolutional neural networks.
2) Facial images are collected, preprocessed, and passed through a trained detection model to extract features. A convolutional neural network model then classifies age from the extracted features.
3) The results found the proposed hybrid algorithm approach achieved 96.7% accuracy in age prediction from faces, outperforming existing methods. This shows promise for accurate automatic age analysis.
The document summarizes several techniques for face recognition:
1. Holistic methods identify faces using the whole face image as input, while feature-based methods use local facial features. Hybrid methods combine both approaches.
2. Key factors that affect face recognition performance include illumination variations, pose changes, facial expressions, time delays, and occlusions.
3. Common techniques include Eigenface decomposition using principal component analysis (PCA) and linear discriminant analysis (LDA). PCA represents faces in a low-dimensional space using eigenvectors of face images. LDA provides better discrimination between classes than PCA.
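The Eigenface idea in point 3 can be sketched compactly: stack flattened face images into a matrix, subtract the mean face, and take the top principal components as the "eigenfaces". This is a hedged illustration with made-up shapes, not any particular system's code:

```python
# Sketch: Eigenface projection via PCA. Real systems operate on aligned,
# grayscale face crops; here faces are just (n_samples, n_pixels) rows.
import numpy as np

def fit_eigenfaces(faces, n_components):
    """Return the mean face and the top principal components (eigenfaces)."""
    mean_face = faces.mean(axis=0)
    centered = faces - mean_face
    # SVD of the centered data; rows of vt are the eigenfaces,
    # ordered by decreasing variance explained.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return mean_face, vt[:n_components]

def project(face, mean_face, eigenfaces):
    """Low-dimensional code for one face: its coordinates in eigenface space."""
    return eigenfaces @ (face - mean_face)

def reconstruct(code, mean_face, eigenfaces):
    """Approximate the original face from its low-dimensional code."""
    return mean_face + eigenfaces.T @ code
```

Recognition then reduces to comparing codes (e.g., nearest neighbor in the projected space), which is far cheaper than comparing raw pixel vectors; LDA, as noted above, replaces the variance-maximizing projection with one that maximizes between-class separation.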
AN APPLICATION OF PHYSICS EXPERIMENTS OF HIGH SCHOOL BY USING AUGMENTED REALITY (ijseajournal)
Little research has been done to validate the utility and usability of virtual and augmented reality environments. Evaluating the usability of these new technologies is very important for designing systems that are more intuitive than traditional methods, and such evaluation is also important for the future development of applications that can benefit from the technology. Augmented reality (AR) is a technology that embeds virtual objects (video, pictures and 3D objects) into the user's view of the real world. Combining AR technology with educational content creates a new type of application and enhances the effectiveness and attractiveness of teaching and learning for students in real-life scenarios. This study aims to improve the teaching methods used in secondary school by employing modern educational technology, and thus to assess the effectiveness of AR apps in teaching students physics experiments. Therefore, in this study we took on the challenge of adapting this technology to facilitate the physics subject in secondary school.
The document is a curriculum vitae for Adhika Pratomo that includes his personal information, educational background, work experience, professional activities, publications, and volunteer experience. It details that he is from Medan, Indonesia and is currently an undergraduate student studying Information Systems at Institut Teknologi Sepuluh Nopember. His experience includes internships in network engineering and teaching, public speaking engagements, publications and awards for his work, and involvement in student organizations.
Abstract: Face recognition appears to be an integral part of human-computer interfaces and e-services. To provide security and fault tolerance, various image processing techniques have been incorporated; the 'curse of dimensionality' refers to classifying a pattern with high dimensionality, which requires a large amount of training data. A face recognition and detection system is a computer-driven application for automatically identifying or verifying a person from a still or video image. It does this by comparing selected facial features in the live image against a facial database, returning a list of candidate faces corresponding to training samples from the database. Nodal points are measured to create a numerical code, called a faceprint, that represents the face in the database. Many techniques are in use; here, image processing is implemented using feature extraction by Gabor filters, and the training data are learned using neural networks. Multiscale resolution and matrix sampling are performed efficiently with this technique.
Keywords: Image Processing techniques, Curse of Dimensionality, Faceprint, Feature extraction, Gabor Filters, Neural Networks.
This document summarizes research on facial recognition technology. It discusses how facial recognition works, recognizing facial expressions to determine emotions. It outlines two major systems used - Facial Action Coding System (FACS) and Facial Animation Parameters (FAP). FACS codes facial movements based on action units of facial muscles. FAP defines parameters like distances between facial features. The document also discusses challenges like illumination changes and outlines future areas of research like micro-expression recognition.
IRJET - ATM Security using Machine Learning (IRJET Journal)
This document describes a project to improve security for ATM transactions using machine learning techniques. It proposes using face detection and emotion recognition algorithms like K-Nearest Neighbors and Back Propagation Neural Networks to analyze a user's facial expressions and determine if they appear tense or nervous. If abnormal expressions are detected, a security alert would be sent. It discusses implementing multiple object detection to prevent multiple users, as well as expression analysis to identify signs of forced usage. The goal is to automatically detect suspicious activity and improve ATM security.
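The K-Nearest Neighbors step mentioned above amounts to comparing an extracted expression feature vector against labeled training vectors and taking a majority vote. As a hedged sketch (the feature values and labels here are invented for illustration, not from the paper):

```python
# Sketch: k-nearest-neighbors over expression feature vectors,
# voting "normal" vs "tense". Data and labels are illustrative.
from collections import Counter
import math

def knn_classify(train, query, k=3):
    """train: list of (feature_vector, label) pairs.
    Returns the majority label among the k nearest neighbors of query."""
    nearest = sorted(train, key=lambda pair: math.dist(pair[0], query))
    top_labels = [label for _, label in nearest[:k]]
    return Counter(top_labels).most_common(1)[0][0]

train = [
    ([0.10, 0.20], "normal"),
    ([0.20, 0.10], "normal"),
    ([0.90, 0.80], "tense"),
    ([0.80, 0.90], "tense"),
]
```

In the proposed ATM setting, a "tense" vote over several consecutive frames, rather than a single frame, would presumably trigger the security alert, to avoid false alarms from momentary expressions.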
IRJET - A Comprehensive Survey and Detailed Study on Various Face Recognition ... (IRJET Journal)
This document provides a comprehensive survey and detailed study of various face recognition methods. It begins with an introduction to face recognition and its advantages over other biometric methods. It then categorizes face recognition methods into four types: knowledge-based, feature-based, template matching, and appearance-based. The majority of the document discusses these methods in further detail and provides examples of algorithms that fall under each type, such as Eigenfaces, LDA, ICA, and neural networks. It concludes by stating that reviewing existing face recognition algorithms may lead to improved methods for solving this fundamental problem.
This document summarizes a research paper on face recognition techniques. It discusses the history of face recognition research from the 1950s to present. Early work focused on identifying faces using geometric measurements of facial features. In the 1980s, neural networks and eigenfaces approaches using principal component analysis were introduced. The document also outlines common problems in face recognition systems, including variations in scale, illumination, expression, and noise. Finally, it reviews literature on human and machine face recognition systems.
Framework for A Personalized Intelligent Assistant to Elderly People for Acti... (CSCJournals)
The increasing population of elderly people brings a need to meet their growing requirements and to provide solutions that can improve their quality of life in a smart home. In addition to fear and anxiety towards interfacing with systems, cognitive disabilities, weakened memory, disorganized behavior and physical limitations are some of the problems that elderly people tend to face with increasing age. The essence of providing technology-based solutions that address these needs and create smart, assisted living spaces for the elderly lies in developing systems that can adapt to their diversity and augment their performance in the context of their day-to-day goals. Therefore, this work proposes a framework for the development of a Personalized Intelligent Assistant that helps elderly people perform Activities of Daily Living (ADLs) in a smart, connected Internet of Things (IoT) based environment. The Personalized Intelligent Assistant can analyze different tasks performed by the user and recommend activities by considering their daily routine, current affective state and the underlying user experience. To uphold the efficacy of the proposed framework, it was tested on two datasets, modelling an "average user" and a "specific user" respectively. The results show that the model achieves a performance accuracy of 73.12% when modelling a "specific user", considerably higher than its performance when modelling an "average user", which upholds the relevance of developing and implementing the proposed framework.
Assistance Application for Visually Impaired - VISION (IJSRED)
This document describes an Android application called VISION that was developed to assist visually impaired users in identifying objects. The application uses a smartphone camera and machine learning techniques like convolutional neural networks (CNN) and TensorFlow to detect objects in images. When an object is identified, the application audibly announces the name of the object to the user. The developers tested the application and it was able to accurately identify common daily objects like bags and electrical switches. The conclusion is that the application helps visually impaired people identify objects and enhances their independence.
A COMPREHENSIVE STUDY ON OCCLUSION INVARIANT FACE RECOGNITION UNDER FACE MASK... (mlaij)
The face mask became essential sanitary wear in daily life during the pandemic period and poses a serious threat to current face recognition systems. Masks destroy many details across a large area of the face, making faces difficult to recognize even for humans; evaluation reports show this difficulty clearly when recognizing masked faces. The rapid development and breakthroughs of deep learning in the recent past have produced highly promising results from face recognition algorithms, but these still perform far from satisfactorily in unconstrained environments under challenges such as varying lighting conditions, low resolution, facial expressions, pose variation and occlusion. Facial occlusion is considered one of the most intractable problems, especially when the occlusion occupies a large region of the face, because it destroys many facial features.
This document summarizes a research article about developing a virtual mouse using eye tracking techniques. The researchers created a system that can detect a user's eyes and gaze using a Viola-Jones algorithm and Matlab software. This allows the position of the user's gaze to control the mouse cursor in real-time, providing an alternative input method that is useful for people with disabilities. The document discusses different eye tracking techniques and prior research in face and eye detection algorithms to support the system's development. It presents the overall system design which first detects the face and eyes, then tracks eye movements to control mouse movements and clicks.
The objective of emotion recognition is to identify the emotions of a human. Emotion can be captured either from the face or from verbal communication. In this work we focus on identifying human emotion from facial expressions. Facial emotion recognition is a useful task and can serve as a base for many real-time applications.
Facial expression recognition of masked faces using deep learning (IAESIJAI)
Facial expression recognition (FER) represents one of the most prevalent forms of interpersonal communication and carries rich emotional information. It became even more challenging during COVID, when face masks became a mandatory protective measure, leading to the problem of an occluded lower face during facial expression recognition. In this study, a deep convolutional neural network (DCNN) forms the core of both our full-face FER system and our masked-face FER model. The focus was on incorporating knowledge distillation in transfer learning between a teacher model, the full-face FER DCNN, and a student model, the masked-face FER DCNN, by combining the loss between the teacher's and the student's soft labels with the loss between the dataset's hard labels and the student's hard labels. The teacher-student architecture used FER2013 and a masked, customized version of FER2013 as datasets, yielding accuracies of 69% and 61% respectively. The study therefore shows that knowledge distillation can serve as a form of transfer learning and enhance accuracy: a regular DCNN model (student only) reaches 46% accuracy, compared to 61% with our approach.
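The combined distillation loss described above can be written as a weighted sum of a soft term (cross-entropy of the student against the teacher's temperature-softened distribution) and a hard term (cross-entropy against the dataset label). A hedged sketch follows; the temperature and mixing weight are assumptions, not the paper's exact values:

```python
# Sketch: knowledge-distillation loss = alpha * soft-label term
# + (1 - alpha) * hard-label term. Hyperparameter values are assumed.
import math

def softmax(logits, temperature=1.0):
    """Numerically stable softmax over a list of logits."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)
    exps = [math.exp(z - m) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def distillation_loss(student_logits, teacher_logits, hard_label,
                      temperature=2.0, alpha=0.5):
    """Blend of teacher-matching and label-matching cross-entropies."""
    p_teacher = softmax(teacher_logits, temperature)
    p_student = softmax(student_logits, temperature)
    # Soft term: cross-entropy between softened teacher and student outputs.
    soft = -sum(t * math.log(s) for t, s in zip(p_teacher, p_student))
    # Hard term: standard cross-entropy against the dataset label.
    hard = -math.log(softmax(student_logits)[hard_label])
    return alpha * soft + (1 - alpha) * hard
```

A student whose logits agree with both the teacher and the label incurs a small loss; disagreeing with either drives the corresponding term up, which is how the student inherits the full-face teacher's knowledge while still fitting the masked dataset.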
IRJET- An Innovative Approach for Interviewer to Judge State of Mind of an In... (IRJET Journal)
This document presents a proposed system to analyze the state of mind of an interviewee during an interview using facial expression recognition and classification. The system would use the Fisher Face algorithm to detect facial features from video frames and Naive Bayes classification to categorize the detected expressions as indicators of emotional states such as happy, sad, and angry. This automated analysis of facial expressions could provide feedback to improve the interview process and the selection of candidates.
Development of video-based emotion recognition using deep learning with Googl... (TELKOMNIKA JOURNAL)
Emotion recognition using images, videos, or speech as input has been considered a hot research topic for some years. The introduction of deep learning techniques, e.g., convolutional neural networks (CNNs), applied to emotion recognition has produced promising results. Human facial expressions are considered critical components in understanding one's emotions. This paper sheds light on recognizing emotions from videos using deep learning techniques. The methodology of the recognition process, along with its description, is provided in this paper. Some of the video-based datasets used in many scholarly works are also examined. Results obtained from different emotion recognition models are presented along with their performance parameters. An experiment was carried out on the FER2013 dataset in Google Colab for depression detection, which came out to be 97% accurate on the training set and 57.4% accurate on the testing set.
Face recognition for presence system by using residual networks-50 architectu... (IJECEIAES)
A presence system is a system for recording individual attendance in a company, school, or institution. There are several types of presence system, including manual systems using signatures, systems using fingerprints, and systems using face recognition technology. A presence system using face recognition technology implements a biometric system in the process of recording attendance. In this research we used one of the convolutional neural network (CNN) architectures that won the ImageNet Large Scale Visual Recognition Competition (ILSVRC) in 2015, namely the Residual Networks-50 architecture (ResNet-50), for face recognition. Our contribution in this research is to determine the effectiveness of the ResNet architecture with different configurations of hyperparameters. These hyperparameters include the number of hidden layers, the number of units in the hidden layer, batch size, and learning rate. Because hyperparameters are selected based on how the experiments are performed, and the value of each hyperparameter affects the final accuracy, we tried 22 configurations (experiments) to get the best accuracy. We conducted experiments to get the best model, with an accuracy of 99%.
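A hyperparameter sweep of the kind described (hidden layers, hidden units, batch size, learning rate) is typically enumerated as a grid. The grid values below are hypothetical; the paper's actual 22 configurations are not listed, so this only illustrates the enumeration step:

```python
import itertools

# Hypothetical hyperparameter grid; the values are illustrative, not the
# paper's actual 22 configurations.
grid = {
    "hidden_layers": [1, 2],
    "hidden_units": [256, 512],
    "batch_size": [32, 64],
    "learning_rate": [1e-3, 1e-4],
}

def enumerate_configs(grid):
    """Return every combination of hyperparameter values as a dict."""
    keys = list(grid)
    return [dict(zip(keys, values)) for values in itertools.product(*grid.values())]

configs = enumerate_configs(grid)  # 2*2*2*2 = 16 candidate configurations
```

Each config dict would then be used to build and train one model, keeping the configuration with the best validation accuracy.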
An HCI Principles based Framework to Support Deaf Community (IJEACS)
Sign language is the communication language preferred and used by deaf people to converse with the common people in the community. Even with the existence of sign language, there is a communication gap between hearing people and deaf people. Some solutions such as sensor gloves already exist to address this problem, but they are limited and do not cover all parts of the language required for an ordinary person to understand what a deaf person says and wants. Due to the lack of credibility of the existing solutions for sign language translation, we have proposed a system that aims to assist deaf people in communicating with the common people of society and, in turn, helps them understand hearing people easily. Knowing the needs of the users helps us focus on Human-Computer Interaction technologies for deaf people, to make the system more user-friendly and a better alternative to the existing technologies. The Human-Computer Interface (HCI) concepts of usability, empirical measurement, and simplicity are the key considerations in the development of our system. The proposed Kinect system removes the need for physical contact to operate by using the Microsoft Kinect for Windows SDK beta. The results show that it has a strong, positive, and emotional impact on persons with physical disabilities and their families and friends by giving them the ability to communicate in an easy manner with non-repetitive gestures.
Automatic Emotion Recognition Using Facial Expression: A Review (IRJET Journal)
This document reviews automatic emotion recognition using facial expressions. It discusses how facial expressions are an important form of non-verbal communication that can express human perspectives and mental states. The document then summarizes several popular techniques for automatic facial expression recognition systems, including those based on statistical movement, auto-illumination correction, identification-driven emotion recognition for social robots, e-learning approaches, cognitive analysis for interactive TV, and motion detection using optical flow. Each technique is assessed in terms of its pros and cons. The goal of the techniques is to enhance human-computer interaction through more accurate and real-time recognition of facial expressions and emotions.
FACE DETECTION AND FEATURE EXTRACTION FOR FACIAL EMOTION DETECTION (vivatechijri)
Facial emotion recognition has been a major issue and an advanced area of research in the field of Human-Machine Interaction and Image Processing. To capture facial expressions the system needs to handle a variety of human facial attributes such as color, shape, reflection, and posture. To obtain a person's facial expression it is first necessary to extract various facial features such as eye movement, nose, and lips, and then classify them by comparison against trained data. An AI-based approach to a novel vision system is suggested. There are two main processes in the proposed system, namely face detection and feature extraction. Face detection is performed using the Haar Cascade method. A suitable feature extraction method is used to extract features, and a support vector machine then distinguishes the final expression. The FER2013 dataset is used for training.
The International Journal of Engineering and Science Invention (IJESI) is an international journal intended for professionals and researchers in all fields of computer science and electronics. IJESI publishes research articles and reviews across the whole field of engineering, science and technology, new teaching methods, assessment, validation, and the impact of new technologies, and it will continue to provide information on the latest trends and developments in this ever-expanding subject. Papers are selected for publication through double peer review to ensure originality, relevance, and readability. The articles published in our journal can be accessed online.
IRJET- Age Analysis using Face Recognition with Hybrid Algorithm (IRJET Journal)
1) The document presents a study on age analysis of human faces using a hybrid algorithm combining multiple methods including convolutional neural networks.
2) Facial images are collected then preprocessed and trained on a detection model to extract features. A convolutional neural network model is used to classify ages from the extracted features.
3) The results found the proposed hybrid algorithm approach achieved 96.7% accuracy in age prediction from faces, outperforming existing methods. This shows promise for accurate automatic age analysis.
The document summarizes several techniques for face recognition:
1. Holistic methods identify faces using the whole face image as input, while feature-based methods use local facial features. Hybrid methods combine both approaches.
2. Key factors that affect face recognition performance include illumination variations, pose changes, facial expressions, time delays, and occlusions.
3. Common techniques include Eigenface decomposition using principal component analysis (PCA) and linear discriminant analysis (LDA). PCA represents faces in a low-dimensional space using eigenvectors of face images. LDA provides better discrimination between classes than PCA.
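The eigen-decomposition step behind the PCA/Eigenface approach above can be sketched on tiny made-up "face" vectors: mean-center the data, form the covariance matrix, and find its dominant eigenvector (the first eigenface direction) by power iteration:

```python
# Toy sketch of the eigen-decomposition behind Eigenfaces. The 2-dimensional
# "face vectors" here are made up; real face images would be flattened pixel
# vectors with thousands of dimensions.

def covariance(vectors):
    """Covariance matrix of mean-centered row vectors."""
    n, d = len(vectors), len(vectors[0])
    mean = [sum(v[j] for v in vectors) / n for j in range(d)]
    centered = [[v[j] - mean[j] for j in range(d)] for v in vectors]
    return [[sum(x[i] * x[j] for x in centered) / n for j in range(d)]
            for i in range(d)]

def power_iteration(A, iters=200):
    """Dominant eigenvector of a square matrix A via repeated multiplication."""
    v = [1.0] * len(A)
    for _ in range(iters):
        w = [sum(A[i][j] * v[j] for j in range(len(v))) for i in range(len(A))]
        norm = sum(x * x for x in w) ** 0.5
        v = [x / norm for x in w]
    return v
```

Projecting each centered face onto the top few eigenvectors gives the low-dimensional representation PCA-based recognition compares.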
AN APPLICATION OF PHYSICS EXPERIMENTS OF HIGH SCHOOL BY USING AUGMENTED REALITY (ijseajournal)
Little research has been done to validate the utility and usability of virtual and augmented reality environments. Evaluating the usability of these new technologies is very important for designing systems that are more intuitive than traditional methods. Such an evaluation is also important for the future development of applications that can benefit from this new technology. Augmented reality (AR) is a technology that embeds virtual objects (video, pictures, and 3D objects) into the user's view of the real world. Combining AR technology with educational content creates a new type of automated application and acts to enhance the effectiveness and attractiveness of teaching and learning for students in real-life scenarios. The study aims to improve the teaching methods used in secondary school by employing modern educational technology and thus to assess the effectiveness of AR apps in teaching students physics experiments. Therefore, in this study we took on the challenge of adapting this technology to facilitate the physics subject in secondary school.
The document is a curriculum vitae for Adhika Pratomo that includes his personal information, educational background, work experience, professional activities, publications, and volunteer experience. It details that he is from Medan, Indonesia and is currently an undergraduate student studying Information Systems at Institut Teknologi Sepuluh Nopember. His experience includes internships in network engineering and teaching, public speaking engagements, publications and awards for his work, and involvement in student organizations.
Abstract: Face recognition appears to be an integral part of human-computer interfaces and e-services. To provide security and fault tolerance, various image processing techniques have been incorporated while contending with the 'Curse of Dimensionality', which refers to classifying patterns with high dimensionality that require a large amount of training data. A face recognition and detection system is a computer-driven application for automatically identifying or verifying a person from a still or video image. It does this by comparing selected facial features in the live image against a facial database, where the system returns a list of possible faces corresponding to training samples from the database. The nodal points are measured to create a numerical code, called a faceprint, representing the face in the database. Relatively many techniques are used. The image processing technique has been implemented using feature extraction by Gabor filters and training the database using neural networks. Multiscale resolution and matrix sampling are performed efficiently using this technique.
Keywords: Image Processing techniques, Curse of Dimensionality, Faceprint, Feature extraction, Gabor Filters, Neural Networks.
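A single Gabor filter kernel (the feature extractor named above) is an oriented sinusoid under a Gaussian envelope. A minimal sketch follows; the parameter values here are illustrative, not taken from the paper:

```python
import math

# Hedged sketch of one real-valued Gabor kernel: a Gaussian envelope
# modulating a cosine carrier at orientation theta and wavelength lambd.
# Convolving an image with a bank of these at several orientations and
# scales yields the texture features used for recognition.

def gabor_kernel(size, theta, lambd, sigma, gamma=0.5, psi=0.0):
    half = size // 2
    kernel = []
    for y in range(-half, half + 1):
        row = []
        for x in range(-half, half + 1):
            # rotate coordinates by theta
            xr = x * math.cos(theta) + y * math.sin(theta)
            yr = -x * math.sin(theta) + y * math.cos(theta)
            envelope = math.exp(-(xr * xr + gamma * gamma * yr * yr)
                                / (2 * sigma * sigma))
            carrier = math.cos(2 * math.pi * xr / lambd + psi)
            row.append(envelope * carrier)
        kernel.append(row)
    return kernel
```

With psi=0 the kernel peaks at 1.0 in the center, and every entry is bounded by the Gaussian envelope.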
This document summarizes research on facial recognition technology. It discusses how facial recognition works, recognizing facial expressions to determine emotions. It outlines two major systems used - Facial Action Coding System (FACS) and Facial Animation Parameters (FAP). FACS codes facial movements based on action units of facial muscles. FAP defines parameters like distances between facial features. The document also discusses challenges like illumination changes and outlines future areas of research like micro-expression recognition.
IRJET- ATM Security using Machine Learning (IRJET Journal)
This document describes a project to improve security for ATM transactions using machine learning techniques. It proposes using face detection and emotion recognition algorithms like K-Nearest Neighbors and Back Propagation Neural Networks to analyze a user's facial expressions and determine if they appear tense or nervous. If abnormal expressions are detected, a security alert would be sent. It discusses implementing multiple object detection to prevent multiple users, as well as expression analysis to identify signs of forced usage. The goal is to automatically detect suspicious activity and improve ATM security.
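The K-Nearest Neighbors step mentioned above amounts to labeling an expression feature vector by majority vote among its closest training examples. A toy sketch follows; the feature values and the "calm"/"tense" labels are made up for illustration:

```python
from collections import Counter

# Toy k-nearest-neighbours classifier of the kind the abstract mentions for
# labelling expression feature vectors; the training data here is invented.

def knn_predict(train, query, k=3):
    """train: list of (feature_vector, label) pairs; returns majority label
    among the k training points nearest to query (squared Euclidean)."""
    by_distance = sorted(
        train,
        key=lambda item: sum((a - b) ** 2 for a, b in zip(item[0], query)),
    )
    votes = Counter(label for _, label in by_distance[:k])
    return votes.most_common(1)[0][0]
```

In the described system, a "tense" prediction on the live feature vector would trigger the security alert.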
IRJET- A Comprehensive Survey and Detailed Study on Various Face Recognition ... (IRJET Journal)
This document provides a comprehensive survey and detailed study of various face recognition methods. It begins with an introduction to face recognition and its advantages over other biometric methods. It then categorizes face recognition methods into four types: knowledge-based, feature-based, template matching, and appearance-based. The majority of the document discusses these methods in further detail and provides examples of algorithms that fall under each type, such as Eigenfaces, LDA, ICA, and neural networks. It concludes by stating that reviewing existing face recognition algorithms may lead to improved methods for solving this fundamental problem.
This document summarizes a research paper on face recognition techniques. It discusses the history of face recognition research from the 1950s to present. Early work focused on identifying faces using geometric measurements of facial features. In the 1980s, neural networks and eigenfaces approaches using principal component analysis were introduced. The document also outlines common problems in face recognition systems, including variations in scale, illumination, expression, and noise. Finally, it reviews literature on human and machine face recognition systems.
Framework for A Personalized Intelligent Assistant to Elderly People for Acti... (CSCJournals)
The increasing population of elderly people is associated with the need to meet their increasing requirements and to provide solutions that can improve their quality of life in a smart home. In addition to fear and anxiety towards interfacing with systems, cognitive disabilities, weakened memory, disorganized behavior, and even physical limitations are some of the problems that elderly people tend to face with increasing age. The essence of providing technology-based solutions to address these needs and to create smart and assisted living spaces for the elderly lies in developing systems that can adapt by addressing their diversity and can augment their performance in the context of their day-to-day goals. Therefore, this work proposes a framework for the development of a Personalized Intelligent Assistant to help elderly people perform Activities of Daily Living (ADLs) in a smart and connected Internet of Things (IoT) based environment. This Personalized Intelligent Assistant can analyze different tasks performed by the user and recommend activities by considering their daily routine, current affective state, and the underlying user experience. To uphold the efficacy of this proposed framework, it has been tested on two datasets for modelling an "average user" and a "specific user" respectively. The results presented show that the model achieves a performance accuracy of 73.12% when modelling a "specific user", which is considerably higher than its performance while modelling an "average user"; this upholds the relevance of developing and implementing the proposed framework.
Assistance Application for Visually Impaired - VISION (IJSRED)
This document describes an Android application called VISION that was developed to assist visually impaired users in identifying objects. The application uses a smartphone camera and machine learning techniques like convolutional neural networks (CNN) and TensorFlow to detect objects in images. When an object is identified, the application audibly announces the name of the object to the user. The developers tested the application and it was able to accurately identify common daily objects like bags and electrical switches. The conclusion is that the application helps visually impaired people identify objects and enhances their independence.
A COMPREHENSIVE STUDY ON OCCLUSION INVARIANT FACE RECOGNITION UNDER FACE MASK... (mlaij)
The face mask became an essential piece of sanitary wear in daily life during the pandemic period and is a big threat to current face recognition systems. Masks destroy a lot of detail in a large area of the face, making faces difficult to recognize even for humans. The evaluation report illustrates this difficulty well for recognizing masked faces. Rapid development and breakthroughs in deep learning in the recent past have yielded the most promising results from face recognition algorithms. But they perform far from satisfactorily in unconstrained environments under challenges such as varying lighting conditions, low resolution, facial expressions, pose variation, and occlusions. Facial occlusion is considered one of the most intractable problems, especially when the occlusion occupies a large region of the face, because it destroys many facial features.
The document proposes a smart glass system that utilizes computer vision and deep learning to help visually impaired people navigate their environment. The system uses a camera to capture images, detect faces and objects using models like Faster R-CNN, and provides audio descriptions to the user, such as identifying known people and alerting about obstacles. It aims to make movement safer and easier for the visually impaired by recognizing surroundings through real-time voice messages. The system is designed to be low-cost, fast, and user-friendly. It evaluates the performance of its detection and recognition abilities through metrics like accuracy, sensitivity and specificity.
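The accuracy, sensitivity, and specificity metrics the smart-glass system is evaluated on all follow directly from a binary confusion matrix. A minimal sketch, with made-up counts:

```python
# Standard binary-classification metrics from a confusion matrix:
# tp/fp/tn/fn = true/false positives and negatives.

def confusion_metrics(tp, fp, tn, fn):
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    sensitivity = tp / (tp + fn)   # true-positive rate (recall)
    specificity = tn / (tn + fp)   # true-negative rate
    return accuracy, sensitivity, specificity
```

Sensitivity measures how often a real obstacle or known face is caught; specificity measures how rarely the system raises a false alert.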
Two Level Decision for Recognition of Human Facial Expressions using Neural N... (IIRindia)
Facial expressions of a human being are the outcome of the inner feelings of the mind, reflecting the person's internal emotional states and intentions. A person's face provides a lot of information such as age, gender, identity, mood, and expressions. Faces play an important role in recognizing a person's expressions. In this research, an attempt is made to design a model to classify human facial expressions according to features extracted from human facial images by applying 3-sigma limits in a second-level decision using a Neural Network (NN). Nowadays, the Artificial Neural Network (ANN) is widely used as a tool for solving many decision modeling problems. In this paper feed-forward propagation neural networks are constructed as an expression classification system for gray-scale facial images. Three groups of expressions, Happy, Sad, and Anger, are used in the classification system. A second-level decision is proposed in which the output obtained from the Neural Network (primary level) is refined at the second level in order to improve the accuracy of the recognition rate. The accuracy of the system is analyzed by varying the range of the expression groups. The efficiency of the system is demonstrated through the experimental results.
Facial Emotion Recognition using Convolution Neural Network (ijtsrd)
Facial expression plays a major role in every aspect of human life and communication. It has been a boon for research in facial emotion, with systems that give rise to the terminology of human-computer interaction in real life. Humans socially interact with each other via emotions. In this research paper, we have proposed an approach to building a system that recognizes facial emotion using a Convolutional Neural Network (CNN), one of the most popular neural networks available. It is said to be a pattern recognition neural network. A Convolutional Neural Network reduces the dimension of large-resolution images without losing quality, gives the expected prediction output, and captures facial expressions even at odd angles, which sets it apart from other models, i.e., it works well for non-frontal images. Unfortunately, a CNN-based detector is computationally heavy, and using a CNN on video input is a challenge. We implement a facial emotion recognition system using a Convolutional Neural Network on a dataset. Our system predicts the output based on the input given to it. This system can be useful for sentiment analysis, for clinical practice, for getting a person's review of a certain product, and more. Raheena Bagwan | Sakshi Chintawar | Komal Dhapudkar | Alisha Balamwar | Prof. Sandeep Gore "Facial Emotion Recognition using Convolution Neural Network" Published in International Journal of Trend in Scientific Research and Development (ijtsrd), ISSN: 2456-6470, Volume-5 | Issue-3, April 2021, URL: https://www.ijtsrd.com/papers/ijtsrd39972.pdf Paper URL: https://www.ijtsrd.com/computer-science/artificial-intelligence/39972/facial-emotion-recognition-using-convolution-neural-network/raheena-bagwan
Hybrid Facial Expression Recognition (FER2013) Model for Real-Time Emotion Cl... (BIJIAM Journal)
Facial expression recognition is a vital research topic in fields ranging from artificial intelligence and gaming to human-computer interaction (HCI) and psychology. This paper proposes a hybrid model for facial expression recognition, which comprises a deep convolutional neural network (DCNN) and a Haar Cascade deep learning architecture. The objective is to classify real-time and digital facial images into one of the seven facial emotion categories considered. The DCNN employed in this research has more convolutional layers, ReLU activation functions, and multiple kernels to enhance filtering depth and facial feature extraction. In addition, a Haar Cascade model was used to detect facial features in real-time images and video frames. Grayscale images from the Kaggle repository (FER-2013) were used, and graphics processing unit (GPU) computation was exploited to expedite the training and validation process. Pre-processing and data augmentation techniques are applied to improve training efficiency and classification performance. The experimental results show significantly improved classification performance compared to state-of-the-art (SoTA) experiments and research. Compared to other conventional models, this paper validates that the proposed architecture is superior in classification performance, with an improvement of up to 6%, totaling up to 70% accuracy, and with a lower execution time of 2,098.8 s.
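The convolution-plus-ReLU layers the DCNN above is built from can be sketched minimally. This is a generic valid-mode 2D convolution, not the paper's architecture; the identity-diagonal kernel in the usage note is illustrative:

```python
# Minimal valid-mode 2D convolution followed by ReLU, the basic building
# block of a DCNN layer. Real layers add padding, stride, many channels
# and learned kernels; this sketch shows only the core arithmetic.

def conv2d_relu(image, kernel):
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    out = []
    for r in range(out_h):
        row = []
        for c in range(out_w):
            # dot product of the kernel with the image patch at (r, c)
            acc = sum(
                image[r + i][c + j] * kernel[i][j]
                for i in range(kh) for j in range(kw)
            )
            row.append(max(0.0, acc))  # ReLU clips negatives to zero
        out.append(row)
    return out
```

For example, convolving a 3x3 image with a 2x2 kernel yields a 2x2 feature map, and any negative responses are zeroed by the ReLU.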
Model for autism disorder detection using deep learning (IAESIJAI)
Autism spectrum disorder (ASD) is a neurodevelopmental condition that impacts individuals' social abilities. The majority of available approaches rely on structural and resting-state functional magnetic resonance imaging (fMRI) to detect ASD with a small dataset, resulting in high accuracy but low generality. Most known technologies for detecting ASD with a limited dataset involve machine learning, pattern recognition, and other techniques, leading to high accuracy but moderate generality. To address this constraint and improve the efficacy of the automated autism diagnosis model, an ASD detection model based on deep learning (DL) is provided in this work. The classification challenge is solved using a convolutional neural network classifier. The suggested model beats state-of-the-art methodologies in terms of accuracy, according to simulation findings. The proposed approach investigates how anatomical and functional connectivity indicators can be used to determine whether or not a person is autistic. The proposed method delivers state-of-the-art results, with the classification of autistic patients achieving 93.41% accuracy and the localization of the classified data regressed to 0.29 mean absolute error (MAE).
Comparative Studies for the Human Facial Expressions Recognition Techniques (ijtsrd)
This document provides a review and comparative study of different techniques for recognizing human facial expressions. It begins with an introduction to facial expression recognition, its applications, and challenges. It then discusses methods for measuring emotions, including discrete and dimensional perspectives. The document reviews techniques for identifying faces, extracting facial features, and classifying expressions. It provides an overview of common emotions and factors that influence facial expression recognition, such as pose, occlusion, and lighting. Overall, the document analyzes and compares recent studies on automatic facial expression recognition techniques.
The document describes a deep learning model called Deep-Emotion that uses an attentional convolutional network for facial expression recognition. The model focuses on important facial regions like the eyes and mouth that are most relevant to detecting emotions. The model achieves state-of-the-art accuracy on several facial expression datasets, outperforming previous methods. Visualization techniques show the model learns to focus on different facial regions for different emotions.
Facial emoji recognition is a human-computer interaction system. In recent times, automatic face recognition and facial expression recognition have attracted increasing attention from researchers in psychology, computer science, linguistics, neuroscience, and similar fields. The facial emoji recognizer is an end-user application which detects the expression of the person in the video being captured by the camera. The smiley relevant to the expression of the person in the video is shown on the screen and changes as the expression changes. Facial expressions are important in human communication and interactions; they are also used as an important tool in behavioral studies and in medical fields. The facial emoji recognizer provides a fast and practical approach for non-intrusive emotion detection. The purpose was to develop an intelligent system for facial-expression-based classification using a CNN algorithm. A Haar classifier is used for face detection and a CNN algorithm is utilized for expression detection, giving the emoticon relevant to the expression as the output. N. Swapna Goud | K. Revanth Reddy | G. Alekhya | G. S. Sucheta "Facial Emoji Recognition" Published in International Journal of Trend in Scientific Research and Development (ijtsrd), ISSN: 2456-6470, Volume-3 | Issue-3, April 2019, URL: https://www.ijtsrd.com/papers/ijtsrd23166.pdf
Paper URL: https://www.ijtsrd.com/engineering/computer-engineering/23166/facial-emoji-recognition/n-swapna-goud
Face and liveness detection with criminal identification using machine learni... (IAESIJAI)
In the past, real-world photos have been used to train classifiers for face liveness identification, since the related face presentation attacks (PA) and real-world images have a high degree of overlap. However, the use of deep convolutional neural networks (CNNs) together with real-world face photos to identify the liveness of a face has received very little study. A face recognition system should be able to identify real faces as well as spoofing attempts using printed or digital presentations. A true spoofing avoidance method involves observing facial liveness cues, such as eye blinking and lip movement. However, this strategy is rendered useless when defending against replay attacks that use video. The anti-spoofing technique consists of two modules: the ConvNet classifier module and the blinking-eye module, which measures lip and eye movement. The test results demonstrate that the developed module is capable of identifying various face spoof attacks, including those made with posters, masks, or smartphones. In this study, convolutional features adaptively fused from deep-CNN-produced face pictures and convolutional layers learned from real-world identification are assessed. Extensive tests using intra-database and cross-database scenarios on cutting-edge face anti-spoofing databases including CASIA, OULU, NUAA, and a replay-attack dataset demonstrate the effectiveness of the proposed face liveness detection methods. The algorithm has a 94.30% accuracy rate.
A Study on Face Expression Observation Systems (ijtsrd)
Human expressions can convey a great deal of information. We cannot learn every language in the world, but we can decipher the majority of human expressions. The state of a user's conduct in various settings and scenarios can be inferred from their facial expressions. Through various human-computer interface and programming approaches, facial expression can be digitized. Face detection, feature extraction, and determining the kind of expression are all parts of the facial expression perception process. Both verbal and non-verbal forms of communication are possible; through their emotions, people can communicate nonverbally. We have given a broad summary of the various facial expression perception processes in the literature in this article. Jyoti | Neeraj Chawaria | Ekta "A Study on Face Expression Observation Systems" Published in International Journal of Trend in Scientific Research and Development (ijtsrd), ISSN: 2456-6470, Volume-6 | Issue-5, August 2022, URL: https://www.ijtsrd.com/papers/ijtsrd50450.pdf Paper URL: https://www.ijtsrd.com/computer-science/computer-graphics/50450/a-study-on-face-expression-observation-systems/jyoti
Face Mask Detection And Attendance System (Darsh Jain)
The current COVID-19 pandemic brought the world to a halt and captured everyone's attention because it was a global pandemic; however, some underdeveloped nations suffer from viral outbreaks every year. Protecting ourselves from such outbreaks requires a continuous supply of sanitizers and face masks to maintain personal hygiene. To help such nations and contribute to the global effort, this paper proposes a Face Mask Detection and Attendance System. It also discusses the various techniques that can be used for facial recognition and object detection, such as the Haar cascade method, machine learning algorithms, and deep learning.
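The Haar cascade method mentioned above is built on Haar-like features computed in constant time from an integral image. A minimal sketch of that computation (the 4×4 test data and the horizontal two-rectangle feature are illustrative choices, not the paper's detector):

```python
import numpy as np

def integral_image(img):
    """Cumulative sum over rows and columns; ii[y, x] holds the sum of
    img[:y, :x], with a zero row/column prepended for easy indexing."""
    ii = np.zeros((img.shape[0] + 1, img.shape[1] + 1), dtype=np.int64)
    ii[1:, 1:] = np.cumsum(np.cumsum(img, axis=0), axis=1)
    return ii

def rect_sum(ii, y, x, h, w):
    """Sum of the h-by-w rectangle whose top-left corner is (y, x),
    read off the integral image with four lookups."""
    return ii[y + h, x + w] - ii[y, x + w] - ii[y + h, x] + ii[y, x]

def two_rect_feature(ii, y, x, h, w):
    """Horizontal two-rectangle Haar-like feature: left half minus
    right half (w must be even)."""
    half = w // 2
    return rect_sum(ii, y, x, h, half) - rect_sum(ii, y, x + half, h, half)
```

A cascade classifier evaluates thousands of such features at many positions and scales, which is only feasible because each one costs a handful of lookups.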
AI Therapist – Emotion Detection using Facial Detection and Recognition and S... (ijtsrd)
This paper presents an integrated system for emotion detection using facial detection and recognition. We take into account the fact that emotions are most widely expressed through the eyes and mouth. In this research effort, we implement a general convolutional neural network (CNN) building framework for designing real-time CNNs. We validate our models by creating a real-time vision system that accomplishes face detection, emotion classification, and generation of content matching the person's emotion or mood, simultaneously in one blended step using our proposed CNN architecture. The proposed model consists of modules for image processing, feature extraction, feature classification, and recommendation. The images used in the experiment are pre-processed with various image processing methods such as Canny edge detection, histogram equalization, and ellipse fitting, and the FER dataset is used for conducting the experiments. With a trained profile that can be updated flexibly, a user can detect his/her behavior on a real-time basis. The system utilizes state-of-the-art face detection and recognition algorithms. Sanket Godbole | Jaivardhan Shelke "AI Therapist – Emotion Detection using Facial Detection and Recognition & Showing Content According to Emotions" Published in International Journal of Trend in Scientific Research and Development (ijtsrd), ISSN: 2456-6470, Volume-4 | Issue-6, October 2020, URL: https://www.ijtsrd.com/papers/ijtsrd33267.pdf Paper URL: https://www.ijtsrd.com/computer-science/artificial-intelligence/33267/ai-therapist-–-emotion-detection-using-facial-detection-and-recognition-and-showing-content-according-to-emotions/sanket-godbole
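Histogram equalization, one of the pre-processing steps listed, spreads a narrow intensity range across the full 8-bit scale. A small NumPy sketch, assuming grayscale input (the two-level test image below is a contrived example):

```python
import numpy as np

def equalize_histogram(img):
    """Histogram equalization for an 8-bit grayscale image: map each
    intensity through the normalized cumulative histogram so the output
    uses the full 0-255 range more evenly."""
    hist = np.bincount(img.ravel(), minlength=256)
    cdf = hist.cumsum()
    cdf_min = cdf[cdf > 0][0]          # first non-empty bin
    total = img.size
    # Classic equalization formula, rounded to the nearest level.
    lut = np.round((cdf - cdf_min) / max(total - cdf_min, 1) * 255)
    lut = np.clip(lut, 0, 255).astype(np.uint8)
    return lut[img]

# A low-contrast image occupying only levels 100-101.
low_contrast = np.array([[100, 100], [101, 101]], dtype=np.uint8)
```

After equalization the two levels are pushed to the extremes of the range, which helps a downstream CNN see consistent contrast across lighting conditions.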
Facial emotion recognition using enhanced multi-verse optimizer method (IJECEIAES)
In recent years, facial emotion recognition has gained significant improvement and attention. This technology utilizes advanced algorithms to analyze facial expressions, enabling computers to detect and interpret human emotions accurately. Its applications span a wide range of fields, from improving customer service through sentiment analysis to enhancing mental health support by monitoring emotional states. However, facial emotion recognition faces several challenges, including variability in individual expressions, cultural differences in emotion display, and privacy concerns related to data collection and usage. Lighting conditions, occlusions, and the need for diverse datasets also impact accuracy. To address these issues, an enhanced multi-verse optimizer (EMVO) technique is proposed to improve the efficiency of recognizing emotions. EMVO is used to improve convergence speed, the exploration-exploitation balance, solution quality, and applicability to different types of optimization problems. Two datasets were used to collect the data, namely the YouTube and Surrey audio-visual expressed emotion (SAVEE) datasets. Classification is then done using convolutional neural networks (CNN) to improve the performance of emotion recognition. Compared to the existing methods shuffled frog leaping algorithm-incremental wrapper-based subset selection (SFLA-IWSS), hierarchical deep neural network (H-DNN) and unique preference learning (UPL), the proposed method achieved better accuracies, measured at 98.65% and 98.76% on the YouTube and SAVEE datasets, respectively.
A review of human skin detection applications based on image processing (journalBEEI)
In computer science, digital image processing is the use of a digital computer to manipulate digital images through an algorithm for many applications. Before starting a new research topic, it is worth knowing which applications are currently most in demand. Therefore, this article reviews many applications based on human skin and human life, such as detection, classification, blocking, cryptography, identification, localization, steganography, segmentation, tracking, and recognition. The article surveys published work on human skin-based image processing from international publishers such as Springer, IEEE, arXiv, and Elsevier, restricted to the period 2015-2019. Human skin detection and recognition are the most frequent topics, accounting for 43% and 28.5% of the investigated articles, respectively. Human skin models are widely used in the image processing of various applications.
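Many of the surveyed detection approaches start from an explicit color rule. One widely cited example, assuming RGB pixels under uniform daylight (the thresholds follow the classic Kovac-style rule, not any single surveyed paper):

```python
def is_skin_rgb(r, g, b):
    """Explicit RGB skin rule (uniform-daylight variant of the Kovac
    et al. classifier): a pixel counts as skin if it is bright enough,
    not grayscale, and red dominates green and blue by a margin."""
    return (r > 95 and g > 40 and b > 20
            and max(r, g, b) - min(r, g, b) > 15
            and abs(r - g) > 15 and r > g and r > b)
```

Rules like this are fast but lighting-sensitive, which is why the surveyed literature also covers learned models and alternative color spaces.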
This document summarizes research on the Blue Eyes technology, which aims to give computers human-like perceptual abilities. It discusses how Blue Eyes uses non-intrusive sensors like video cameras and microphones to identify a user's actions, physical state, emotions, and where they are looking. This information is analyzed to build a model of the user over time to help the computer adapt and create a more productive environment. The document also reviews related work on detecting emotions from facial expressions and touch input and explores using eye tracking for computer input.
PREPROCESSING CHALLENGES FOR REAL WORLD AFFECT RECOGNITION (CSEIJJournal)
Real-world human affect recognition requires immediate attention and is a significant aspect of human-computer interaction. Audio-visual modalities can make a significant contribution by providing rich contextual information. Preprocessing is an important step in which the relevant information is extracted, and it has a crucial impact on prominent feature extraction and further processing. The main aim is to highlight the challenges in preprocessing real-world data. The research focuses on experimental testing and comparative analysis of preprocessing using OpenCV, Single Shot MultiBox Detector (SSD), DLib, Multi-Task Cascaded Convolutional Neural Networks (MTCNN), and RetinaFace detectors. The comparative analysis shows that MTCNN and RetinaFace perform better on real-world data. The performance of facial affect recognition using a pre-trained CNN model is analysed with the lab-controlled dataset CK+ and the representative in-the-wild dataset AFEW. This comparative analysis demonstrates the impact of preprocessing issues on the feature engineering framework in real-world affect recognition.
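Detector comparisons like the one above are typically scored by matching detector boxes to ground truth with an intersection-over-union (IoU) threshold. A small sketch of that metric, assuming axis-aligned (x1, y1, x2, y2) boxes; the 0.5 threshold is the conventional default, not necessarily the paper's setting:

```python
def iou(box_a, box_b):
    """Intersection-over-union of two boxes given as (x1, y1, x2, y2)."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    ix1, iy1 = max(ax1, bx1), max(ay1, by1)
    ix2, iy2 = min(ax2, bx2), min(ay2, by2)
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area_a = (ax2 - ax1) * (ay2 - ay1)
    area_b = (bx2 - bx1) * (by2 - by1)
    union = area_a + area_b - inter
    return inter / union if union else 0.0

def detector_recall(detections, ground_truth, iou_thresh=0.5):
    """Fraction of ground-truth faces matched by at least one detection."""
    hits = sum(1 for gt in ground_truth
               if any(iou(gt, det) >= iou_thresh for det in detections))
    return hits / len(ground_truth) if ground_truth else 0.0
```

Running the same ground truth against each detector's output gives the per-detector numbers that such a comparative analysis reports.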
Similar to Deep learning based facial expressions recognition system for assisting visually impaired persons (20)
Square transposition: an approach to the transposition process in block cipher (journalBEEI)
The transposition process is needed in cryptography to create a diffusion effect in the data encryption standard (DES) and advanced encryption standard (AES) algorithms, the standard information security algorithms of the National Institute of Standards and Technology. The problem with the DES and AES algorithms is that their transposition index values form patterns rather than random values. This condition makes it easier for a cryptanalyst to look for relationships between ciphertexts, because some processes are predictable. This research designs a transposition algorithm called square transposition. Each process uses an 8 × 8 square as the place to insert and retrieve 64 bits. Pairing an insertion scheme with a retrieval scheme whose flows differ is an important factor in producing a good transposition. The square transposition can generate random, pattern-free indices, so transposition can be done better than in DES and AES.
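The core idea, a write scheme paired with a differently flowing read scheme over an 8 × 8 square, can be sketched as follows; the row-wise write and boustrophedon column read are illustrative choices, not the paper's exact schemes:

```python
def transpose_block(bits):
    """Write 64 bits row-by-row into an 8x8 square, then read them back
    column-by-column, alternating direction (a boustrophedon read)."""
    assert len(bits) == 64
    square = [bits[r * 8:(r + 1) * 8] for r in range(8)]
    out = []
    for c in range(8):
        col = [square[r][c] for r in range(8)]
        out.extend(col if c % 2 == 0 else col[::-1])
    return out

def invert_transpose(bits):
    """Undo transpose_block by replaying the read order as a write order."""
    square = [[None] * 8 for _ in range(8)]
    i = 0
    for c in range(8):
        rows = range(8) if c % 2 == 0 else range(7, -1, -1)
        for r in rows:
            square[r][c] = bits[i]
            i += 1
    return [square[r][c] for r in range(8) for c in range(8)]
```

Any invertible pair of unequal write/read flows works the same way; the paper's contribution is choosing pairs whose index sequences avoid the regular patterns of DES/AES permutations.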
Hyper-parameter optimization of convolutional neural network based on particl... (journalBEEI)
The document proposes using a particle swarm optimization (PSO) algorithm to optimize the hyperparameters of a convolutional neural network (CNN) for image classification. The PSO algorithm is used to find optimal values for CNN hyperparameters like the number and size of convolutional filters. In experiments on the MNIST handwritten digit dataset, the optimized CNN achieved a testing error rate of 0.87%, which is competitive with state-of-the-art models. The proposed approach finds optimized CNN architectures automatically without requiring manual design or encoding strategies during training.
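A stripped-down version of the PSO loop, here tuning a single hyper-parameter against a toy stand-in for validation error (the coefficients 0.7/1.5/1.5 are common defaults, and the quadratic objective is purely illustrative, not the paper's CNN training):

```python
import random

def pso(objective, bounds, n_particles=12, iters=60, seed=1):
    """Minimal particle swarm optimizer over a single dimension.
    Each particle keeps a velocity and its personal best, and is pulled
    toward the swarm's global best (inertia 0.7, pulls 1.5/1.5)."""
    rng = random.Random(seed)
    lo, hi = bounds
    pos = [rng.uniform(lo, hi) for _ in range(n_particles)]
    vel = [0.0] * n_particles
    pbest = pos[:]
    pbest_val = [objective(p) for p in pos]
    g = min(range(n_particles), key=lambda i: pbest_val[i])
    gbest, gbest_val = pbest[g], pbest_val[g]
    for _ in range(iters):
        for i in range(n_particles):
            vel[i] = (0.7 * vel[i]
                      + 1.5 * rng.random() * (pbest[i] - pos[i])
                      + 1.5 * rng.random() * (gbest - pos[i]))
            pos[i] = min(hi, max(lo, pos[i] + vel[i]))
            val = objective(pos[i])
            if val < pbest_val[i]:
                pbest[i], pbest_val[i] = pos[i], val
                if val < gbest_val:
                    gbest, gbest_val = pos[i], val
    return gbest, gbest_val

# Toy stand-in for "validation error as a function of filter count":
# a smooth bowl with its minimum at 32 filters.
err = lambda n_filters: (n_filters - 32.0) ** 2 / 100.0
```

In the paper's setting the objective would be the CNN's validation error, the particles would be vectors of hyper-parameters (filter counts, kernel sizes), and each evaluation would be a training run.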
Supervised machine learning based liver disease prediction approach with LASS... (journalBEEI)
In this contemporary era, machine learning techniques are increasingly used in medical science to detect various diseases such as liver disease (LD). Around the globe, a large number of people die from this deadly disease. By diagnosing the disease at an early stage, treatment can begin soon enough to cure the patient. In this research paper, a method is proposed to diagnose LD using supervised machine learning classification algorithms, namely logistic regression, decision tree, random forest, AdaBoost, KNN, linear discriminant analysis, gradient boosting and support vector machine (SVM). We also deployed a least absolute shrinkage and selection operator (LASSO) feature selection technique on the dataset to suggest the attributes most highly correlated with LD. The predictions made by the algorithms with 10-fold cross-validation (CV) are tested in terms of accuracy, sensitivity, precision and f1-score. The decision tree algorithm has the best performance, with accuracy, precision, sensitivity and f1-score values of 94.295%, 92%, 99% and 96% respectively with the inclusion of LASSO. Furthermore, a comparison with recent studies is shown to prove the significance of the proposed system.
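LASSO's value for feature selection is that the L1 penalty drives the weights of weakly relevant attributes to exactly zero. A self-contained sketch using iterative soft-thresholding on deterministic toy data (the paper itself applies a standard LASSO implementation to clinical attributes; the data, step size, and penalty here are illustrative):

```python
import numpy as np

def soft_threshold(z, lam):
    """Proximal operator of the L1 penalty: shrink toward zero and
    clip small values to exactly zero."""
    return np.sign(z) * np.maximum(np.abs(z) - lam, 0.0)

def lasso_ista(X, y, lam, lr=0.004, iters=3000):
    """LASSO via iterative soft-thresholding (ISTA) on
    1/2 * ||y - Xw||^2 + lam * ||w||_1."""
    w = np.zeros(X.shape[1])
    for _ in range(iters):
        grad = X.T @ (X @ w - y)
        w = soft_threshold(w - lr * grad, lr * lam)
    return w

# Deterministic toy data: the target depends only on the first feature.
col1 = np.arange(1.0, 9.0)
col2 = np.array([1.0, -1.0] * 4)     # nearly uncorrelated with y
X = np.column_stack([col1, col2])
y = 3.0 * col1
```

The irrelevant feature's weight lands exactly at zero, which is the selection behaviour the paper exploits to pick the most informative liver-disease attributes.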
A secure and energy saving protocol for wireless sensor networks (journalBEEI)
The research domain of wireless sensor networks (WSN) has been extensively studied thanks to innovative technologies and research directions addressing the usability of WSN under various schemes. This domain permits dependable tracking of a diversity of environments for both military and civil applications. The key management mechanism is a primary protocol for keeping the privacy and confidentiality of the data transmitted among different sensor nodes in WSNs. Since the nodes are small, they are intrinsically limited by inadequate resources such as battery lifetime and memory capacity. The proposed secure and energy saving protocol (SESP) for wireless sensor networks has a significant impact on overall network lifetime and energy dissipation. To encrypt sent messages, SESP uses the concept of public-key cryptography. It depends on sensor nodes' identities (IDs) to prevent message replay, so that the security goals of authentication, confidentiality, integrity, availability, and freshness are achieved. Finally, simulation results show that the proposed approach yields better energy consumption and network lifetime than the LEACH protocol: sensors die after 900 rounds in the proposed SESP protocol, whereas in the low-energy adaptive clustering hierarchy (LEACH) scheme they die after 750 rounds.
Plant leaf identification system using convolutional neural network (journalBEEI)
This paper proposes a leaf identification system using a convolutional neural network (CNN). The proposed system can identify five types of local Malaysian leaves: acacia, papaya, cherry, mango and rambutan. Using CNN-based deep learning, the network is trained for image classification on a database of leaf images captured with a mobile phone. ResNet-50 was the architecture used for the image classification network and for training it for leaf identification. Recognition of leaf photographs requires several steps, starting with image pre-processing, feature extraction, plant identification, matching and testing, and finally extracting the results, all achieved in MATLAB. The testing sets consist of three types of images: white background, noise added, and random background. Finally, an interface for the leaf identification system was developed as the end software product using MATLAB App Designer. The accuracy achieved on the training sets for the five leaf classes was above 98%, so the recognition process was successfully implemented.
Customized moodle-based learning management system for socially disadvantaged... (journalBEEI)
This study aims to develop a Moodle-based LMS with customized learning content and a modified user interface to facilitate pedagogical processes during the COVID-19 pandemic, and to investigate how teachers of socially disadvantaged schools perceived its usability and technology acceptance. A co-design process was conducted with two activities: 1) a need assessment phase using an online survey and interview sessions with the teachers, and 2) the development phase of the LMS. The system was evaluated by 30 teachers from socially disadvantaged schools for relevance to their distance learning activities. We employed the computer software usability questionnaire (CSUQ) to measure perceived usability, and the technology acceptance model (TAM) with its 3 original variables (i.e., perceived usefulness, perceived ease of use, and intention to use) and 5 external variables (i.e., attitude toward the system, perceived interaction, self-efficacy, user interface design, and course design). The average CSUQ rating exceeded 5.0 on a 7-point scale, indicating that teachers agreed the information quality, interaction quality, and user interface quality were clear and easy to understand. The TAM results concluded that the LMS design was usable, interactive, and well-developed. Teachers reported an effective user interface that allows effective teaching operations, leading to prompt adoption of the system.
Understanding the role of individual learner in adaptive and personalized e-l... (journalBEEI)
Dynamic learning environments have emerged as powerful platforms in modern e-learning systems. Constantly changing learning situations have forced learning platforms to adapt and personalize their learning resources for students. Evidence suggests that adaptation and personalization of e-learning systems (APLS) can be achieved by utilizing learner modeling, domain modeling, and instructional modeling. In the APLS literature, questions have been raised about which individual characteristics are relevant for adaptation. With several options available, a further problem arises: the attributes of students in APLS often overlap and are not related across studies. Therefore, this study proposes a list of learner model attributes in dynamic learning to support adaptation and personalization. The study was conducted by exploring concepts from literature selected on best-fit criteria. We then describe the key concepts in student modeling and provide definitions and examples of the data values that researchers have used. We also discuss the implementation of the selected learner model in providing adaptation in dynamic learning.
Prototype mobile contactless transaction system in traditional markets to sup... (journalBEEI)
1) Researchers developed a prototype contactless transaction system using QR codes and digital payments to support physical distancing during the COVID-19 pandemic in traditional markets.
2) The system allows sellers and buyers in traditional markets to conduct fast, secure transactions via smartphones without direct cash exchange. Buyers scan sellers' QR codes to view product details and make e-wallet payments.
3) Testing showed the system's functions worked properly and users found it easy to use and useful for supporting contactless transactions and digital transformation of traditional markets. However, further development is needed to increase trust in digital payments for users unfamiliar with the technology.
Wireless HART stack using multiprocessor technique with laxity algorithm (journalBEEI)
A real-time operating system (RTOS) is required for the demarcation of industrial wireless sensor network (IWSN) stacks. In the industrial world, a vast number of sensors are utilised to gather various types of data, and the data gathered cannot be prioritised ahead of time because all of the information is equally essential. As a result, a protocol stack is employed to guarantee that data are acquired and processed fairly. In IWSN, the protocol stack is implemented on an RTOS. The data collected from IWSN sensor nodes are processed using non-preemptive scheduling and the protocol stack, then sent in parallel to the IWSN's central controller. The RTOS mediates between hardware and software. Packets must be sent at a certain time, and some packets may collide during transmission; this project is undertaken to avoid such collisions. As a prototype, the project is divided into two parts: the first uses an RTOS on the LPC2148 as a master node, while the second serves as a standard data collection node to which sensors are attached. Any controller may be used in the second part, depending on the situation. WirelessHART allows the two nodes to communicate with each other.
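The laxity algorithm named in the title picks, at each scheduling point, the task that can least afford to wait. A minimal sketch, assuming each task is described by its deadline and remaining execution time (the sensor task names are invented examples):

```python
def least_laxity_first(tasks, now):
    """Pick the ready task with the smallest laxity, where
    laxity = deadline - now - remaining execution time.
    `tasks` maps a task name to a (deadline, remaining) pair."""
    def laxity(item):
        name, (deadline, remaining) = item
        return deadline - now - remaining
    return min(tasks.items(), key=laxity)[0]

# Three sensor readings waiting in the stack at time 0:
tasks = {"temperature": (30, 5),   # laxity 25
         "pressure":    (12, 8),   # laxity 4  <- most urgent
         "flow":        (25, 10)}  # laxity 15
```

Because laxity shrinks as a task waits, re-evaluating it at every tick keeps equally important packets from starving, which matches the fairness goal described above.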
Implementation of double-layer loaded on octagon microstrip yagi antenna (journalBEEI)
This document describes the implementation of a double-layer structure on an octagon microstrip yagi antenna (OMYA) to improve its performance at 5.8 GHz. The double-layer consists of two double positive (DPS) substrates placed above the OMYA. Simulation and experimental results show that the double-layer configuration increases the gain of the OMYA by 2.5 dB compared to without the double-layer. The measured bandwidth of the OMYA with double-layer is 14.6%, indicating the double-layer can increase both the gain and bandwidth of the OMYA.
The calculation of the field of an antenna located near the human head (journalBEEI)
In this work, a numerical calculation was carried out in one of the universal programs for automatic electrodynamic design. The calculation aims to obtain numerical values of the specific absorption rate (SAR). It is the SAR value that can be used to determine the effect of a wireless device's antenna on biological objects; the dipole parameters were selected for GSM1800. Investigation of the influence of the distance between the cell phone and the head shows that the effect on the brain of electromagnetic radiation absorbed in the head decreases by a factor of three. This is an important result: the SAR value decreased by almost three times, which is an acceptable level.
Exact secure outage probability performance of uplink-downlink multiple access... (journalBEEI)
In this paper, we study uplink-downlink non-orthogonal multiple access (NOMA) systems, considering secure performance at the physical layer. In the considered system model, the base station acts as a relay to allow two users on the left side to communicate with two users on the right side. Considering imperfect channel state information (CSI), secure performance needs to be studied because an eavesdropper wants to overhear signals processed on the downlink. To provide a secure performance metric, we derive exact expressions for the secrecy outage probability (SOP) and evaluate the impacts of the main parameters on the SOP metric. The important finding is that higher secrecy performance can be achieved at high signal-to-noise ratio (SNR). Moreover, the numerical results demonstrate that the SOP tends to a constant at high SNR. Finally, our results show that the power allocation factors and target rates are the main factors affecting the secrecy performance of the considered uplink-downlink NOMA systems.
Design of a dual-band antenna for energy harvesting application (journalBEEI)
This report presents an investigation of how to improve a current dual-band antenna to obtain better antenna parameters for energy harvesting applications, and to develop and validate a new design operating at 2.4 GHz and 5.4 GHz. At 5.4 GHz, more data can be transmitted than at 2.4 GHz; however, 2.4 GHz radiates over a longer distance, so it can be used far from the antenna module, whereas 5 GHz has a short radiation distance. The development of this project includes designing and testing the antenna using Computer Simulation Technology (CST) 2018 software and vector network analyzer (VNA) equipment. During the design process, the fundamental parameters of the antenna are measured and validated in order to identify the better antenna performance.
Transforming data-centric eXtensible markup language into relational database... (journalBEEI)
eXtensible markup language (XML) appeared internationally as the format for data representation over the web. Yet, most organizations are still utilising relational databases as their database solutions. As such, it is crucial to provide seamless integration via effective transformation between these database infrastructures. In this paper, we propose XML-REG to bridge these two technologies based on node-based and path-based approaches. The node-based approach is good to annotate each positional node uniquely, while the path-based approach provides summarised path information to join the nodes. On top of that, a new range labelling is also proposed to annotate nodes uniquely by ensuring the structural relationships are maintained between nodes. If a new node is to be added to the document, re-labelling is not required as the new label will be assigned to the node via the new proposed labelling scheme. Experimental evaluations indicated that the performance of XML-REG exceeded XMap, XRecursive, XAncestor and Mini-XML concerning storing time, query retrieval time and scalability. This research produces a core framework for XML to relational databases (RDB) mapping, which could be adopted in various industries.
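A range labelling like the one proposed can be sketched by numbering elements in document order and recording, per node, the interval its subtree occupies: ancestor tests then reduce to interval containment. A minimal version (without the over-allocated ranges XML-REG uses to avoid re-labelling on insertion):

```python
import xml.etree.ElementTree as ET

def range_label(root):
    """Assign each element a (start, end, level) label, where
    [start, end] covers exactly the numbers given to the node and its
    descendants.  Node A is an ancestor of node B iff A's interval
    contains B's."""
    labels = {}
    counter = [0]

    def visit(node, level):
        counter[0] += 1
        start = counter[0]
        for child in node:
            visit(child, level + 1)
        labels[node] = (start, counter[0], level)

    visit(root, 0)
    return labels
```

Stored as three integer columns in a relational table, these labels let structural joins (ancestor/descendant queries) run as plain range predicates in SQL.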
Key performance requirement of future next wireless networks (6G) (journalBEEI)
The document provides an overview of the key performance indicators (KPIs) for 6G wireless networks compared to 5G networks. Some of the major KPIs discussed for 6G include: achieving data rates of up to 1 Tbps and individual user data rates up to 100 Gbps; reducing latency below 10 milliseconds; supporting up to 10 million connected devices per square kilometer; improving spectral efficiency by up to 100 times through technologies like terahertz communications and smart surfaces; and achieving an energy efficiency of 1 pico-joule per bit transmitted through techniques like wireless power transmission and energy harvesting. The document outlines how 6G aims to integrate terrestrial, aerial and maritime communications into a single network to provide ubiquitous connectivity with higher
Noise resistance territorial intensity-based optical flow using inverse confi... (journalBEEI)
This paper presents the use of the inverse confidential technique on bilateral function with the territorial intensity-based optical flow to prove the effectiveness in noise resistance environment. In general, the image’s motion vector is coded by the technique called optical flow where the sequences of the image are used to determine the motion vector. But, the accuracy rate of the motion vector is reduced when the source of image sequences is interfered by noises. This work proved that the inverse confidential technique on bilateral function can increase the percentage of accuracy in the motion vector determination by the territorial intensity-based optical flow under the noisy environment. We performed the testing with several kinds of non-Gaussian noises at several patterns of standard image sequences by analyzing the result of the motion vector in a form of the error vector magnitude (EVM) and compared it with several noise resistance techniques in territorial intensity-based optical flow method.
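The error vector magnitude used to score the flow estimates is the RMS length of the per-pixel error vectors. A small sketch, assuming flow fields given as lists of (u, v) pairs:

```python
import math

def error_vector_magnitude(estimated, reference):
    """Root-mean-square magnitude of the per-pixel error between an
    estimated flow field and the reference, each a list of (u, v)."""
    n = len(reference)
    total = sum((ue - ur) ** 2 + (ve - vr) ** 2
                for (ue, ve), (ur, vr) in zip(estimated, reference))
    return math.sqrt(total / n)
```

Comparing EVM across noise levels and techniques is how the paper quantifies the benefit of the inverse confidential bilateral variant.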
Modeling climate phenomenon with software grids analysis and display system i... (journalBEEI)
This study aims to model climate change based on rainfall, air temperature, pressure, humidity and wind with the GrADS software, and to create a global warming module. The research uses the 3D model: define, design, and develop. The modeling results for the five climate elements are as follows: the annual average temperature in Indonesia in 2009-2015 was between 29 °C and 30.1 °C; the horizontal distribution of the annual average pressure in Indonesia in 2009-2018 was between 800 mbar and 1000 mbar; the horizontal distribution of the average annual humidity in Indonesia ranged between 27-57 in 2009 and 2011, and between 30-60 in 2012-2015, 2017 and 2018. During the east monsoon, wind circulation moves from northern Indonesia to southern Indonesia; during the west monsoon, it moves from the southern part of Indonesia to the northern part. The global warming module produced for SMA/MA is feasible to use, in accordance with the validator's score of 69, which is in the appropriate category, and with the 91% positive response of teachers and students to a questionnaire.
An approach of re-organizing input dataset to enhance the quality of emotion ... (journalBEEI)
The purpose of this paper is to propose an approach of re-organizing input data to recognize emotion based on short signal segments and increase the quality of emotional recognition using physiological signals. MIT's long physiological signal set was divided into two new datasets, with shorter and overlapped segments. Three different classification methods (support vector machine, random forest, and multilayer perceptron) were implemented to identify eight emotional states based on statistical features of each segment in these two datasets. By re-organizing the input dataset, the quality of recognition results was enhanced. The random forest shows the best classification result among three implemented classification methods, with an accuracy of 97.72% for eight emotional states, on the overlapped dataset. This approach shows that, by re-organizing the input dataset, the high accuracy of recognition results can be achieved without the use of EEG and ECG signals.
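The re-organization step, cutting long recordings into shorter, optionally overlapping segments, can be sketched as a sliding window (the drop-the-tail policy is an illustrative choice):

```python
def segment(signal, length, overlap):
    """Split a 1-D signal into fixed-length segments; consecutive
    segments share `overlap` samples.  Trailing samples that cannot
    fill a whole segment are dropped."""
    step = length - overlap
    return [signal[i:i + length]
            for i in range(0, len(signal) - length + 1, step)]
```

Overlapping windows multiply the number of training examples extracted from the same recording, which is the mechanism behind the accuracy gain the paper reports on its overlapped dataset.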
Parking detection system using background subtraction and HSV color segmentation (journalBEEI)
With manual vehicle parking, finding a vacant parking lot is difficult because drivers have to check each space directly; when many people are parking, this takes a great deal of time or requires many people to handle. This research develops a real-time system to detect parking occupancy. The system is designed using the HSV color segmentation method to determine the background image, and the detection process uses the background subtraction method. Applying these two methods requires image preprocessing using several methods such as grayscaling and blurring (low-pass filtering), followed by thresholding and filtering to obtain the best image for the detection process. A region of interest (ROI) is determined to focus on the area identified as empty parking. The parking detection process achieves a best average accuracy of 95.76%, with a minimum threshold of 0.4 relative to the maximum pixel value of 255. This value is the best among 33 test data covering several criteria, such as time of capture, composition and color of the vehicle, the shape of shadows in the object's environment, and light intensity. This parking detection system can be implemented in real time to determine the position of an empty space.
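The occupancy decision from background subtraction reduces to thresholding the per-pixel difference and checking what fraction of the ROI changed. A NumPy sketch; the 0.4 fraction echoes the threshold reported in the abstract, while the synthetic 8 × 8 ROI and the difference threshold of 30 are invented for illustration:

```python
import numpy as np

def occupied(frame, background, diff_thresh=30, min_fraction=0.4):
    """Declare a parking ROI occupied when the fraction of pixels that
    differ from the background by more than `diff_thresh` reaches
    `min_fraction`."""
    diff = np.abs(frame.astype(np.int16) - background.astype(np.int16))
    mask = diff > diff_thresh          # foreground mask
    return mask.mean() >= min_fraction

empty_roi = np.full((8, 8), 90, dtype=np.uint8)   # background model
car_roi = empty_roi.copy()
car_roi[2:7, 1:7] = 200                           # a bright "vehicle"
```

In the full system the background model comes from the HSV segmentation step and the mask is cleaned with filtering before the fraction test.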
Quality of service performances of video and voice transmission in universal ... (journalBEEI)
The universal mobile telecommunications system (UMTS) has the distinct benefit of supporting the wide range of quality of service (QoS) criteria that users require. The real-time transmission of video and audio places high demands on the cellular network, so QoS is a major problem in these applications. Providing QoS in the UMTS backbone network necessitates an active QoS mechanism in order to maintain the necessary level of convenience on UMTS networks. For UMTS networks, investigation models for end-to-end QoS, total transmitted and received data, packet loss, and throughput-provisioning techniques are run and assessed, and the simulation results are examined. According to the results, appropriate QoS adaptation allows for specific voice and video transmission. Finally, by analyzing existing QoS parameters, the QoS performance of 4G/UMTS networks may be improved.
CHINA'S GEO-ECONOMIC OUTREACH IN CENTRAL ASIAN COUNTRIES AND FUTURE PROSPECT (jpsjournal1)
The rivalry between prominent international actors for dominance over Central Asia's hydrocarbon reserves and the ancient silk trade route, along with China's diplomatic endeavours in the area, has been referred to as the "New Great Game." This research centres on that power struggle, considering geopolitical, geostrategic, and geoeconomic variables. Topics including trade, political hegemony, oil politics, and traditional and non-traditional security are all explored and explained. Using Mackinder's Heartland, Spykman's Rimland, and Hegemonic Stability theories, the study examines China's role in Central Asia. It adheres to an empirical epistemological method and takes care to remain objective, critically analyzing primary and secondary research documents to elaborate the role of China's geo-economic outreach in Central Asian countries and its future prospects. According to this study, China is seeing significant success in commerce, pipeline politics, and gaining influence over other governments, thanks to the effective utilisation of key tools such as the Shanghai Cooperation Organisation and the Belt and Road Economic Initiative.
Using recycled concrete aggregates (RCA) for pavements is crucial to achieving sustainability. Implementing RCA for new pavement can minimize carbon footprint, conserve natural resources, reduce harmful emissions, and lower life cycle costs. Compared to natural aggregate (NA), RCA pavement has fewer comprehensive studies and sustainability assessments.
Understanding Inductive Bias in Machine Learning (SUTEJAS)
This presentation explores the concept of inductive bias in machine learning. It explains how algorithms come with built-in assumptions and preferences that guide the learning process. You'll learn about the different types of inductive bias and how they can impact the performance and generalizability of machine learning models.
The presentation also covers the positive and negative aspects of inductive bias, along with strategies for mitigating potential drawbacks. We'll explore examples of how bias manifests in algorithms like neural networks and decision trees.
By understanding inductive bias, you can gain valuable insights into how machine learning models work and make informed decisions when building and deploying them.
Literature Review Basics and Understanding Reference Management.pptxDr Ramhari Poudyal
Three-day training on academic research focuses on analytical tools at United Technical College, supported by the University Grant Commission, Nepal. 24-26 May 2024
Harnessing WebAssembly for Real-time Stateless Streaming PipelinesChristina Lin
Traditionally, dealing with real-time data pipelines has involved significant overhead, even for straightforward tasks like data transformation or masking. However, in this talk, we’ll venture into the dynamic realm of WebAssembly (WASM) and discover how it can revolutionize the creation of stateless streaming pipelines within a Kafka (Redpanda) broker. These pipelines are adept at managing low-latency, high-data-volume scenarios.
International Conference on NLP, Artificial Intelligence, Machine Learning an...gerogepatton
International Conference on NLP, Artificial Intelligence, Machine Learning and Applications (NLAIM 2024) offers a premier global platform for exchanging insights and findings in the theory, methodology, and applications of NLP, Artificial Intelligence, Machine Learning, and their applications. The conference seeks substantial contributions across all key domains of NLP, Artificial Intelligence, Machine Learning, and their practical applications, aiming to foster both theoretical advancements and real-world implementations. With a focus on facilitating collaboration between researchers and practitioners from academia and industry, the conference serves as a nexus for sharing the latest developments in the field.
International Conference on NLP, Artificial Intelligence, Machine Learning an...
Deep learning based facial expressions recognition system for assisting visually impaired persons
Bulletin of Electrical Engineering and Informatics
Vol. 9, No. 3, June 2020, pp. 1208~1219
ISSN: 2302-9285, DOI: 10.11591/eei.v9i3.2030 1208
Journal homepage: http://beei.org
Deep learning based facial expressions recognition system for
assisting visually impaired persons
Hendra Kusuma, Muhammad Attamimi, Hasby Fahrudin
Department of Electrical Engineering, Institut Teknologi Sepuluh Nopember, Indonesia
Article Info ABSTRACT
Article history:
Received Aug 6, 2019
Revised Nov 3, 2019
Accepted Dec 4, 2020
In general, good interaction, including communication, can be achieved
when verbal and non-verbal information such as body movements, gestures,
and facial expressions can be processed in two directions between the speaker
and listener. In particular, the facial expression is one of the indicators
of the inner state of the speaker and/or the listener during communication.
Therefore, recognizing facial expressions is a necessary and important ability
in communication. Such an ability is a challenge for visually impaired persons.
This fact motivated us to develop a facial expression recognition system based
on a deep learning algorithm. We implemented the proposed system on a wearable
device that enables visually impaired persons to recognize facial expressions
during communication. We have conducted several experiments involving visually
impaired persons to validate the proposed system, and promising results
were achieved.
Keywords:
Deep learning
Facial expressions recognition
Visual recognition
Visually impaired persons
Wearable devices
This is an open access article under the CC BY-SA license.
Corresponding Author:
Muhammad Attamimi,
Department of Electrical Engineering,
Institut Teknologi Sepuluh Nopember,
Raya ITS, Sukolilo 60111, Surabaya, Indonesia.
Email: attamimi@ee.its.ac.id
1. INTRODUCTION
In everyday life, visually impaired persons face many challenges, such as finding a path in
unknown environments and visually recognizing their surroundings, including objects, the human body,
and faces. Many studies have been conducted to enable computers and machines to see the world as
humans do by developing visual recognition algorithms, including object detection,
object recognition, and so forth [1-15]. One such ability in visual recognition is recognizing the facial
expressions of a person during an interaction. Generally, non-verbal information such as body movements,
gestures, and facial expressions plays a great role in face-to-face interaction, including communication [16-18].
Moreover, the facial expression is strongly related to human emotion, which indicates the inner state of the
speaker and/or the listener during communication. Such information is important in engaging in good
communication and interaction with others. Visually impaired persons, who in general have
difficulty obtaining such information, face limitations in social interaction [19]. In particular, those
who lost their vision early in life have even more difficulty in social interaction [20]. To the best of our
knowledge, there are only a few assistive devices available for supporting visually impaired persons in
capturing non-verbal information during social interaction.
Without the ability to see, humans can process information usually obtained through sight
by using other senses such as touch and/or hearing [21-23]. Textual information
can be delivered in an auditory form or by touching a medium designed to convey it. One knows
that braille is commonly used to deliver textual information to visually impaired persons. This fact
inspired us to develop a wearable device that can capture non-verbal information to assist visually
impaired persons. In this study, we focus on facial expression recognition as one important type of
non-verbal information during social interaction. We propose a facial expression recognition system based
on deep learning. The proposed system is implemented on the wearable device shown in Figure 1, which
is worn by the visually impaired person. Thanks to the proposed system, the input image captured by
the device can be classified into one of seven facial expressions: anger, disgust, fear, neutral, happy,
surprised, and sadness. To deliver the results to users who are visually impaired, it is natural to
consider tactile or auditory information. We investigated both and chose auditory information as
the final output of our wearable device.
Figure 1. A wearable device developed in this study for assisting visually impaired persons
2. RESEARCH METHOD
This study focuses on proposing a facial expression recognition system and implementing
it in a wearable device to assist visually impaired persons. There are several groups of related
studies. First are studies on assistive tools for visually impaired persons. For example,
vibrotactile feedback was used in [21-22] and sound in [23] to convey information to
the users. In particular, the study in [21] developed a wearable device that conveys facial expressions as
haptic information via a vibrotactile belt. In [21], the device consisted of a Microsoft Surface Pro 4 tablet
(6th Gen 2.2-GHz Intel Core i7-6650U processor with Intel Iris graphics 540, Windows 10 operating system)
and a webcam mounted on a cap to capture images. FaceReader 6™ (Vicar Vision, Amsterdam,
The Netherlands) was used for face and facial expression recognition. The integrated system performed
excellently, but it is difficult to realize such a device cheaply. Therefore, we try to develop a cheaper
wearable device that can support visually impaired persons.
Second are studies related to datasets for facial expression recognition [24-26].
The Cohn-Kanade (CK) dataset aims to promote research on the automatic detection of individual
facial expressions. The CK dataset has become one of the datasets commonly used for the development
and evaluation of algorithms. However, according to [24], the CK dataset still has some limitations, and
the Extended Cohn-Kanade (CK+) dataset provides solutions to overcome them. In [24], the number of
sequences increased by 22% and the number of subjects by 27%, and the target expression of each
sequence was FACS-coded and its emotion label validated and revised. Another dataset is
AffectNet [25]. AffectNet is a database created by querying different search engines (Google, Bing,
and Yahoo) using 1250 emotion-related tags in six different languages (English, Spanish, Portuguese,
German, Arabic, and Farsi). AffectNet contains more than one million images with faces. Twelve experts
annotated 450,000 of these images in terms of categories and dimensions. To measure the similarity of
perception, 36,000 images were annotated by two people. AffectNet is so far the largest dataset of
facial expressions on static images that includes both categorical and dimensional models.
In this study, we combine the mentioned datasets with the Japanese Female Facial Expressions
(JAFFE) dataset [26] to realize an adaptable facial expression recognition system. The JAFFE dataset
consists of 213 images of seven facial expressions (six basic facial expressions and neutral) posed by
10 Japanese women. Each image is rated on six emotions by 60 subjects. JAFFE was designed and created
by Michael Lyons, Miyuki Kamachi, and Jiro Gyoba. The photos were taken at the psychology department
of Kyushu University. Each expressor was shot while looking through a semi-reflective plastic sheet
towards the camera. Hair was tied away from the face to reveal all expressive regions of the face.
A tungsten lamp was positioned to add illumination to the face, and a box covered the area between
the camera and the plastic sheet to reduce back reflection. The JAFFE dataset has been used as research
data to demonstrate cultural differences in the interpretation of facial expressions. In [27], it was
found that there are differences in the interpretation of facial expressions between people with
different cultural identities.
Third are studies related to image recognition, particularly methods for image
classification. There are two paradigms in image classification: feature-based ones, such as in [12-15],
[28, 29], and end-to-end ones, which are popular with neural networks [30]. The latter paradigm has
become powerful since the advent of deep learning. One popular method in deep learning is the
Convolutional Neural Network (CNN), which has been proven to outperform other methods and has become
the state of the art in visual recognition [31]. However, this method needs high computational power.
Fortunately, there are frameworks that help us implement such algorithms on a low-power device by
reducing their performance to a certain degree [32]. These problems can also be mitigated by training
the deep learning model on a cloud service.
3. PROPOSED METHOD
In this study, we propose a facial expression recognition system that can classify seven human
expressions: sadness, anger, happiness, disgust, fear, surprise, and the neutral expression. Our proposed
method is based on a deep learning framework and uses transfer learning with a publicly available model
based on convolutional neural networks (CNN). Figure 2 illustrates the proposed method.
The mechanism of transfer learning is similar to our previous work in [33]: we collect datasets
consisting of the seven types of facial expressions and train the publicly available model on the new
dataset. Since model building is time consuming, the training phase is conducted offline. Once the model
is trained, we can use it for facial expression recognition.
One can see from Figure 2 that the proposed method is divided into two blocks:
preprocessing of the input data and the prediction phase. To predict the preprocessed input image,
a pretrained network is used. The proposed method is then implemented on the wearable device shown in
Figure 1. The main processor of our device is a Raspberry Pi 3, which controls the camera mounted on
the cap to capture images and processes them to output the recognition results. Thanks to our proposed
device, the facial expression results can be conveyed as auditory information, which allows visually
impaired persons to understand the expressions during social interaction. The details of the proposed
method are explained as follows.
3.1. Implementation of a wearable device
In this paper, we implement the facial expression recognition system on the wearable device shown
in Figure 1. We use auditory information instead of vibration patterns. Visually impaired
persons obtain the facial expression of their partner by remembering the tone assigned to each
expression beforehand. We consider that auditory information can convey facial expressions more simply
than vibration. We used an earphone to convey the auditory information because it is relatively small
and inexpensive. A Raspberry Pi 3 is used as the main processing unit because it is compatible with the
latest deep learning frameworks, so we can implement our trained network without compatibility issues.
A webcam provides the raw input data for the trained network frame by frame. The webcam is placed on
a cap so that it is parallel to the user's line of sight. Push buttons serve as pause, play, and shutdown
commands. We added this feature to help visually impaired persons use the device independently: the user
can pause the recognition system when taking a break instead of turning off the power, which is not
energy efficient. The Raspberry Pi 3, push buttons, and power supply are placed in a small box. With
this design, we can predict at least once every 1.2 seconds, which is not very fast but enough to
maintain a good conversation.
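As a hypothetical sketch of the device's control loop described above (the class and function names are ours, not from the paper's firmware), the pause/play button and the roughly 1.2-second prediction cycle could be structured as follows:

```python
class ExpressionDevice:
    """Illustrative sketch of the wearable device's main loop.

    `capture`, `classify`, and `play_audio` stand in for the webcam,
    the trained network, and the earphone output, respectively.
    """
    EXPRESSIONS = ("anger", "disgust", "fear", "neutral",
                   "happy", "surprise", "sadness")

    def __init__(self, capture, classify, play_audio, period=1.2):
        self.capture = capture        # callable returning one camera frame
        self.classify = classify      # callable(frame) -> expression label
        self.play_audio = play_audio  # callable(label) -> plays tone pattern
        self.period = period          # ~1.2 s per prediction on Raspberry Pi 3
        self.paused = False

    def toggle_pause(self):
        # Push button: pause recognition without cutting the power
        self.paused = not self.paused

    def step(self):
        # One recognition cycle; returns the label, or None while paused
        if self.paused:
            return None
        label = self.classify(self.capture())
        self.play_audio(label)
        return label
```

In a real deployment, `step()` would be called from a timed loop on the Raspberry Pi, with the push buttons wired to `toggle_pause` and shutdown handlers.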
3.2. Data preprocessing
In Figure 2, the data preprocessing block has several steps. It is used to
maintain the consistency and quality of the raw input data. As one can see, in the first step we convert the RGB
image into a grayscale image. There are two reasons we use grayscale images: 1) the computational cost
is low since the data is reduced to one-third (one channel instead of three), and 2) to identify a facial
expression we only need the shape, texture, and depth of the human face, which can be extracted even from
a grayscale image.
We use a boosted cascade to localize the human face in the captured image [34]. The boosted cascade
has good accuracy with low computational cost. Pictures captured in the wild usually do not have
good lighting conditions, which can decrease the trained network's accuracy. Therefore, applying
the histogram equalization method adjusts the contrast of the image so that the main facial features
are not damaged or overlooked.
Figure 2. Overview of proposed facial expressions recognition system
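As an illustration of the preprocessing block, the grayscale conversion and histogram equalization steps might be sketched as follows in Python with NumPy. This is a minimal sketch under our own naming, not the authors' code; the boosted-cascade face localization step (commonly available as `cv2.CascadeClassifier` in OpenCV) is omitted here.

```python
import numpy as np

def to_grayscale(rgb):
    # ITU-R BT.601 luma weights; one channel instead of three
    return (0.299 * rgb[..., 0] + 0.587 * rgb[..., 1]
            + 0.114 * rgb[..., 2]).astype(np.uint8)

def equalize_histogram(gray):
    # Spread the intensity CDF over [0, 255] to adjust contrast
    hist = np.bincount(gray.ravel(), minlength=256)
    cdf = hist.cumsum()
    nonzero = cdf[cdf > 0]
    lo, hi = nonzero[0], nonzero[-1]     # lowest/highest occupied bins
    lut = np.clip((cdf - lo) * 255.0 / max(hi - lo, 1), 0, 255).astype(np.uint8)
    return lut[gray]                     # apply the lookup table per pixel
```

After equalization, the darkest occupied intensity maps to 0 and the brightest to 255, which helps under the poor lighting conditions mentioned above.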
3.3. Deep learning for facial expressions classification
We used the transfer learning method to develop our own facial expression recognition model. In this
study, one of our goals is to reduce production cost, so using a cheap microcomputer is one of the
solutions. Cheaper devices generally have low computational power; therefore, we need a lightweight
model. In this paper, we use MobileNetV2 as the pretrained network due to its excellent performance as
a very efficient mobile-oriented model [35]. The intuition is that the bottlenecks encode the model's
intermediate inputs and outputs, while the inner layers transform the pixel information into general
descriptors such as image categories. MobileNetV2 uses residual connections in its architecture to
increase performance, with shortcuts that increase model speed.
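To make the bottleneck-and-residual idea concrete, here is a toy NumPy sketch of a MobileNetV2-style inverted residual block (1x1 expansion, 3x3 depthwise convolution, linear 1x1 projection, plus the identity shortcut). It illustrates the structure only, under our own naming; it is not the actual MobileNetV2 implementation.

```python
import numpy as np

def relu6(x):
    # MobileNetV2 uses ReLU6 as its non-linearity
    return np.minimum(np.maximum(x, 0.0), 6.0)

def depthwise_conv3x3(x, w):
    # x: (H, W, C), w: (3, 3, C); stride 1, zero padding, one filter per channel
    H, W, C = x.shape
    xp = np.pad(x, ((1, 1), (1, 1), (0, 0)))
    out = np.zeros_like(x)
    for i in range(H):
        for j in range(W):
            out[i, j] = np.sum(xp[i:i + 3, j:j + 3, :] * w, axis=(0, 1))
    return out

def inverted_residual(x, w_expand, w_dw, w_project):
    # 1x1 "expansion" conv (a matmul over the channel axis)
    h = relu6(x @ w_expand)
    # 3x3 depthwise conv on the expanded representation
    h = relu6(depthwise_conv3x3(h, w_dw))
    # linear 1x1 projection back to the bottleneck (no activation)
    h = h @ w_project
    # residual shortcut (valid when stride is 1 and channels match)
    return x + h
```

The linear projection and the shortcut are the two design choices the paragraph above refers to: the bottleneck carries the information, and the shortcut speeds up and stabilizes the block.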
In this study, we combined the CK+ dataset [24], AffectNet dataset [25], and JAFFE dataset [26] for
the training and test sets. We divided the dataset into the six basic human emotions, which are anger,
sadness, disgust, happiness, surprise, and fear, with the addition of the neutral expression. As shown
in Figure 2, we process images in grayscale format; to maintain consistency among the datasets, we
processed all the data in grayscale as well. The data were checked manually by a human before being
split into training data and validation data to assure data integrity. The manual check followed the
guidance in [24-26]. In the training phase, we focused on the learning rate, the number of epochs, and
the dropout layers. We believe parameter tuning can greatly improve network performance; these
parameters were tuned manually according to the experimental results.
3.4. Outputs
Prediction results are presented in the form of audio: each facial expression is assigned a different
audio pattern made from three distinct tones, called audio_1, audio_2, and audio_3. Instead of using
seven different tones, one per expression, simple patterns built from fewer tones are easier to
remember. Table 1 shows the audio representation of each facial expression.
Table 1. Auditory information
Facial expression    Audio representation
Anger                Playing audio_1 three times
Disgust              Playing audio_1 two times
Fear                 Playing audio_1
Neutral              Playing audio_2
Happy                Playing audio_3
Surprise             Playing audio_3 two times
Sadness              Playing audio_3 three times
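The mapping in Table 1 can be expressed directly as a lookup table; the sketch below is our own illustration (the dictionary and function names are not from the paper):

```python
# Table 1 as a lookup: each expression maps to (tone, repetitions).
AUDIO_MAP = {
    "anger":    ("audio_1", 3),
    "disgust":  ("audio_1", 2),
    "fear":     ("audio_1", 1),
    "neutral":  ("audio_2", 1),
    "happy":    ("audio_3", 1),
    "surprise": ("audio_3", 2),
    "sadness":  ("audio_3", 3),
}

def playback_sequence(expression):
    # Returns the list of tones to play for a predicted expression
    tone, repeats = AUDIO_MAP[expression]
    return [tone] * repeats
```

Because neighboring expressions differ only in repetition count, this encoding also makes the anger/disgust confusion discussed later easy to see: both use audio_1, differing by one repetition.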
4. EXPERIMENTS AND DISCUSSION
To evaluate our proposed facial expression recognition system, we conducted several
experiments. The first experiment compares the results of transfer learning using
the MobileNetV2 model [35], the Xception model [36], and the VGG16 model [37]; here, we compared
the computational cost and performance of these models. Next, we chose the best model
and performed parameter tuning to obtain a better model. Finally, we implemented the proposed facial
expression recognition system on a wearable device and tested the performance of the whole system.
The experimental settings and results are described in the following.
4.1. Experimental setup
One of the major factors in the quality of a deep learning based algorithm is the dataset: the more
representative the dataset used, the better the model's performance. In this study, we used a combination
of three publicly available datasets, i.e., the CK+ dataset [24], AffectNet dataset [25], and JAFFE
dataset [26]. Combining the datasets was needed for training the deep learning model implemented in the
proposed system. The images in the JAFFE and AffectNet datasets are already separated into seven emotion
categories, i.e., sadness, happy, fear, disgust, anger, surprise, and neutral, by storing the images in
folders titled by facial expression. However, we needed to extract the images in the CK+ dataset because
it is provided in the form of videos; the images therefore had to be selected manually and sorted by
emotion category. The combined dataset contained approximately 40,000 images, which were divided into
32,000 images for the training set and 8,000 images for the test set.
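The paper does not state how the 32,000/8,000 division was carried out; a minimal, deterministic 80/20 split that reproduces those counts could be sketched as follows (function name and seed are our own assumptions):

```python
import random

def train_test_split(paths, train_fraction=0.8, seed=0):
    # Shuffle a copy deterministically, then cut at the 80/20 boundary
    shuffled = list(paths)
    random.Random(seed).shuffle(shuffled)
    cut = int(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]
```

On 40,000 image paths this yields 32,000 training and 8,000 test items; in practice one would also stratify by expression so that each of the seven classes keeps the same ratio.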
The next step was to separate the face in each image from the background. This ensures that the
learning process focuses on facial features. To ensure the quality of the dataset, we checked the images
one by one: images with inappropriate facial expression labels were relabeled accordingly, and images
containing ambiguous facial expressions were deleted. At this stage, human perception was needed to
classify the facial expressions. Figure 3 shows some examples of images in the dataset with their
corresponding labels. Once the dataset preparation was done, we performed the training task. To realize
a better implementation of the facial expression recognition system, we needed to choose the best model
according to computational cost and performance; Section 4.2 discusses the model comparison used in this
study. After the best model was found, it needed to be tuned to obtain better performance; the results
of parameter tuning are discussed in Section 4.3.
Figure 3. Example images from the CK+, AffectNet, and JAFFE datasets, labeled with the seven
expressions (anger, disgust, fear, neutral, happy, surprised, and sad)
Finally, since the ultimate goal of this study is to assist visually impaired persons in recognizing
the facial expressions of their speaking partners in daily activities, we conducted an experiment to
evaluate our wearable device in indoor and outdoor environments with visually impaired users. Figure 4
shows the experimental scenes in the indoor and outdoor environments with visually impaired subjects.
As shown in Figure 4, the person using the proposed device guessed the facial expressions of his/her
partner in some communication scenarios. The distance between the speaker and listener was one meter.
It should be noted that, in an ice-breaking phase, we trained each user for about 10 minutes to obtain
information from the auditory output of the proposed device. We then compared the ground-truth
expression performed by the facial expressor with the one reported by the user wearing the device.
Three facial expressors participated in this experiment, and each performed seven types of facial
expressions five times. Due to practical limitations, we invited only three visually impaired persons
to use our wearable device and participate in our experiment. The results are reported in Section 4.4.
Figure 4. Experimental scenes in (a) an indoor environment and (b) an outdoor environment
4.2. Model comparison
In this study, we compared three publicly available deep learning models, MobileNetV2 [35],
Xception [36], and VGG16 [37], in two aspects: computational speed and performance. Using the training
set described in Section 4.1, we trained the models via transfer learning and compared their accuracies
on the test set. In general, training a deep learning model requires a personal computer with a powerful
graphics processing unit (GPU). We used a 3.0-GHz Intel Core i5-7400 processor with a GeForce GTX 1060
6 GB and the Linux Mint 19 operating system to create the model to be used on our proposed wearable
device, which is based on a Raspberry Pi 3.
Table 2 shows the comparison results on the personal computer, whereas Table 3 shows the results on
the Raspberry Pi 3. From Table 2, the differences in accuracy and speed are not significant: if the
prediction time is less than 0.1 seconds, the model is fast enough to predict images continuously in a
social interaction scenario, and there is no significant difference in the accuracy of the three
architectures. On the other hand, one can see from Table 3 that there is a significant difference in
speed on the Raspberry Pi 3, while the three models have relatively the same accuracy. Therefore,
MobileNetV2 was the most suitable for implementation on the proposed wearable device because it was
the fastest model in predicting the input images.
Table 2. The results on a personal computer
Model architecture   Speed [seconds/prediction]   Accuracy [%]
VGG16                0.05123                      58
Xception             0.03244                      62
MobileNetV2          0.01982                      60
Table 3. The results on Raspberry Pi 3
Model architecture   Speed [seconds/prediction]   Accuracy [%]
VGG16                9.132                        53
Xception             11.983                       56
MobileNetV2          1.2783                       55
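The selection logic implied by Tables 2 and 3, picking the fastest model given comparable accuracy, can be sketched as follows. The `min_accuracy` threshold is our own illustrative assumption; the speed and accuracy figures are the Raspberry Pi 3 numbers from Table 3.

```python
# (name, seconds/prediction on Raspberry Pi 3, accuracy %) from Table 3
CANDIDATES = [
    ("VGG16",       9.132,  53),
    ("Xception",    11.983, 56),
    ("MobileNetV2", 1.2783, 55),
]

def pick_model(candidates, min_accuracy=50):
    # Keep models whose accuracy is acceptable, then take the fastest one
    viable = [c for c in candidates if c[2] >= min_accuracy]
    return min(viable, key=lambda c: c[1])[0]
```

With the Table 3 figures, all three models clear a 50% accuracy floor and MobileNetV2 wins on speed, matching the choice made in the text.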
4.3. Parameter tuning
Tuning the parameters of a deep learning model is an important process to obtain a robust model. In
this study, the parameters tuned were the learning rate and the momentum. Using the training set, we
trained several times with different momentum values to find the optimal one; it should be noted that
each model was also tested on the test set. We likewise tuned the learning rate to obtain its optimal
value. First, we tuned the momentum. Momentum helps determine the direction of parameter updates by
referring to the results of the previous step, and can also prevent oscillations in gradient descent.
Figure 5 shows the model accuracy and model loss for momentum values of 0.5, 0.7, and 0.9. Based on
the figure, changing the momentum value can significantly influence the learning process. At values of
0.9 and 0.7, the accuracy on the training set increased to reach 0.9 by the 15th epoch, while changes
on the test set were less significant. The loss on the test set at these values (0.9 and 0.7) increased
as the learning session progressed, which caused no increase in accuracy on the test set. From this
tendency, it can be concluded that the values of 0.9 and 0.7 were not suitable and led to overfitting.
At a momentum value of 0.5, the accuracy on the training set increased gradually; at the same epoch,
its training accuracy was not as high as for the other values (0.9 and 0.7). However, the accuracy on
the test data tended to increase, and the loss on the test set decreased as the learning session
progressed. From these trends, it can be concluded that the momentum value suitable for this application
was approximately 0.5.
Figure 5. Model accuracy and model loss using momentum values of (a) 0.5, (b) 0.7, and (c) 0.9
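For reference, the momentum update being tuned here follows the classical rule v ← momentum·v − lr·grad, w ← w + v. The toy sketch below applies it to a one-dimensional quadratic loss; the function and its defaults are illustrative, not the paper's training code:

```python
def sgd_momentum(w, steps, lr=0.0004, momentum=0.5):
    # Minimize the toy loss L(w) = w^2 with momentum SGD.
    # The velocity term accumulates past gradients, damping oscillation.
    v = 0.0
    for _ in range(steps):
        grad = 2.0 * w            # dL/dw for L(w) = w^2
        v = momentum * v - lr * grad
        w = w + v
    return w
```

Raising `momentum` toward 0.9 makes each step lean more heavily on past gradients, which is what speeds up training-set accuracy in Figure 5 but, as the text notes, encourages overfitting on this task.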
Next, after selecting the momentum value, parameter tuning was performed on the learning rate.
The learning rate determines how quickly the model updates its parameters; by finding an appropriate
value, the model can learn the input features well and quickly. Figure 6 depicts the model accuracy and
model loss for learning rate values of 0.001, 0.0001, and 0.00001. Based on the figure, changes in the
learning rate affect the learning session. At a value of 0.001, the model has difficulty recognizing
the test set: the training history shows a very high test loss even though the training accuracy is
high and the training loss is low. From these characteristics, it can be concluded that the model
overfits with a learning rate of 0.001. The values of 0.0001 and 0.00001 show the same tendency: the
accuracy on both the training and test sets increases, and the loss on both sets decreases as the
learning session progresses. However, the value of 0.00001 requires a longer learning session to match
the ability of the model at 0.0001; observed in more detail, its training accuracy stagnates over the
last few epochs, so matching the performance of the model with a learning rate of 0.0001 would take a
very long time. Therefore, the suitable value of the learning rate was approximately 0.0001.
Finally, after finding the optimal parameters, 0.55 for the momentum and 0.0004 for the learning
rate, we trained our model with those parameters. Figure 7 shows the model accuracy. One can see that
the accuracy on the training set almost reached 99%, and the accuracy on the test set exceeded our
minimum requirement, reaching close to 85%. Therefore, the model from this learning session can be used
as the final model to be implemented on the wearable device.
Figure 6. Model accuracy and model loss using learning rate values of (a) 0.001, (b) 0.0001,
and (c) 0.00001
Figure 7. Model accuracy using the optimal momentum and learning rate
4.4. Evaluation results of the whole system
In this paper, to validate the whole system, we tested it with visually impaired persons. The results
of the indoor and outdoor experiments are shown in Figure 8 and Figure 9, respectively. One can see
from Figure 8(a) that subjects A and B have relatively similar success rates, while there is a
significant difference for subject C. To investigate this result, we interviewed subject C, who
explained that they did not really understand the explanation during the ice breaking but felt more
comfortable after taking data with the second expressor. The confusion matrix of the result is shown
in Figure 8(b). From the figure, one can see that the subjects could guess the facial expressions
correctly with an accuracy of 80% under natural room conditions without any additional lighting. The
fear expression produced the most errors. For the disgust expression, the erroneous answer was always
the anger expression. This may be because anger and disgust have similar facial features, or because
the user was imprecise in interpreting the auditory information, since the anger expression has a sound
pattern very similar to that of disgust (see Table 1). The other expressions have errors that can be
tolerated. In general, the device can be said to work properly.
One can see from Figure 9(a) that subjects A, B, and C show almost the same trend in prediction
accuracy. The fear expression has the lowest success rate, while the neutral and happy expressions
always rank first in the percentage of successful predictions. From Figure 9(b), the users could guess
the facial expressions correctly 78% of the time in an outdoor environment with no additional lighting.
Figure 8. The results of the indoor experiment: (a) success rates of each subject, (b) confusion
matrix of facial expression recognition
Figure 9. The results of the outdoor experiment: (a) success rates of each subject, and (b) confusion
matrix of facial expression recognition
5. CONCLUSION
In this paper, we have proposed a facial expression recognition system based on a deep learning
framework that utilizes transfer learning from publicly available models. To boost the accuracy of the
proposed method, we combined several publicly available datasets and performed comprehensive parameter
tuning. We also implemented the proposed system on a wearable device designed to assist visually
impaired persons in social interaction, using auditory information to convey the facial expressions
predicted during the interaction. To validate the proposed system, we invited visually impaired persons
to have conversations with facial expressors in indoor and outdoor environments. The recognition
results were 80% for the indoor environment and 78% for the outdoor environment. These results are good
enough for our proposed wearable device to be considered helpful for assisting visually impaired
persons in social interaction. In the future, we plan to improve this device in terms of accuracy,
ergonomics, and power management. To improve accuracy, we plan to collect more data and possibly build
our own dataset. We also plan to study the characteristics of basic human facial expressions, which can
help us build a better facial expression dataset.
REFERENCES
[1] T-Y. Lin, P. Goyal, R. Girshick, K. He, and P. Dollár, "Focal loss for dense object detection," IEEE Transactions
on Pattern Analysis and Machine Intelligence, vol. 42, no. 2, pp. 318-327, 2020.
[2] X. Zhu, H. Hu, S. Lin, and J. Dai, "Deformable ConvNets v2: More deformable, better results," Proceedings of the
IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 9308-9316, 2019.
[3] S. Zhang, L. Wen, X. Bian, Z. Lei, and S. Z. Li, "Single-shot refinement neural network for object detection,"
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 4203-4212, 2018.
[4] J. Redmon and A. Farhadi, "YOLOv3: An incremental improvement," arXiv preprint arXiv:1804.02767,
pp. 1-6, 2018.
[5] J. Redmon and A. Farhadi, "YOLO9000: Better, faster, stronger," IEEE Conference on Computer Vision and
Pattern Recognition (CVPR), pp. 6517-6525, 2017.
[6] J. Dai et al., "Deformable convolutional networks," IEEE International Conference on Computer Vision, pp. 764-773, 2017.
[7] J. Redmon, S. Divvala, R. Girshick, and A. Farhadi, "You only look once: Unified, real-time object detection,"
IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 779-788, 2016.
[8] H. Bilen et al., "Weakly supervised deep detection networks," Conference on Computer Vision and Pattern
Recognition, pp. 2846-2854, 2016.
[9] N. Bodla, B. Singh, R. Chellappa, and L. S. Davis, "Soft-NMS: Improving object detection with one line of code,"
IEEE International Conference on Computer Vision, pp. 5562-5570, 2017.
Bulletin of Electr Eng & Inf, Vol. 9, No. 3, June 2020: 1208 – 1219
[10] Z. Cai and N. Vasconcelos, "Cascade R-CNN: Delving into high quality object detection," Conference on Computer
Vision and Pattern Recognition, pp. 6155-6162, 2018.
[11] K. Chen et al., "Hybrid task cascade for instance segmentation," Conference on Computer Vision and Pattern
Recognition, pp. 4974-4983, 2019.
[12] M. Attamimi, T. Araki, T. Nakamura, and T. Nagai, "Visual recognition system for cleaning tasks by humanoid
robots," International Journal of Advanced Robotic Systems, vol. 10, no. 11, pp. 1-14, 2013.
[13] M. Attamimi, T. Nagai, and D. Purwanto, "Particle filter with integrated multiple features for object detection and
tracking," TELKOMNIKA Telecommunication Computing Electronics and Control, vol. 16, no. 6, pp. 3008-3015, 2018.
[14] M. Attamimi et al., "Real-Time 3D visual sensor for robust object recognition," IEEE/RSJ International
Conference on Intelligent Robots and Systems, pp. 4560-4565, 2010.
[15] M. Attamimi, T. Nakamura, and T. Nagai, "Hierarchical multilevel object recognition using Markov model," 21st
International Conference on Pattern Recognition, pp. 2963-2966, 2012.
[16] M. Attamimi, Y. Katakami, K. Abe, T. Nakamura, and T. Nagai, "Modeling of honest signals for human robot
interaction," 11th ACM/IEEE International Conference on Human-Robot Interaction (HRI), pp. 415-416, 2016.
[17] K. Abe, C. Hieida, M. Attamimi, and T. Nagai, "Toward playmate robots that can play with children considering
personality," 2nd International Conference on Human-Agent Interaction, pp. 165-168, 2014.
[18] M. Attamimi, M. Miyata, T. Yamada, T. Omori, and R. Hida, "Attention estimation for child-robot interaction,"
Proceedings of the Fourth International Conference on Human Agent Interaction, pp. 267-271, 2016.
[19] M. G. Frank, "Facial expressions," International Encyclopedia of the Social & Behavioral Sciences,
pp. 5230-5234, 2001.
[20] M. D. Naraine, D. Mala, and P. H. Lindsay, "Social inclusion of employees who are blind or low vision," Disabil
Soc., vol. 26, no. 4, pp. 389-403, 2011.
[21] H. P. Buimer et al., "Conveying facial expressions to blind and visually impaired persons through a wearable
vibrotactile device," PLoS One, vol. 13, no. 3, pp. 1-16, 2018.
[22] Y. Jin, J. Kim, B. Kim, and R. Mallipeddi, "Smart cane: Face recognition system for blind," Proceedings of the 3rd
International Conference on Human-Agent Interaction, pp. 145-148, 2015.
[23] Y. Zhao, S. Wu, L. Reynolds, and S. Azenkot, "A face recognition application for people with visual impairments:
Understanding use beyond the lab," Conference on Human Factors in Computing Systems, pp. 1-4, 2018.
[24] P. Lucey et al., "The extended cohn-kanade dataset (CK+): A complete dataset for action unit
and emotion-specified expression," IEEE Computer Society Conference on Computer Vision and Pattern
Recognition -Workshops, pp. 96-101, 2010.
[25] A. Mollahosseini, B. Hasani, and M. H. Mahoor, "AffectNet: A database for facial expression, valence,
and arousal computing in the wild," IEEE Transactions on Affective Computing, vol. 10, no. 1, pp. 18-31, 2017.
[26] M. J. Lyons, M. G. Kamachi, and J. Gyoba, "Japanese female facial expressions (JAFFE)," Database of Digital
Images, pp. 200-205, 1997.
[27] M. N. Dailey, C. Joyce, M. J. Lyons, and M. G. Kamachi, "Evidence and a computational explanation of cultural
differences in facial expression recognition," Emotion, vol. 10, no. 6, pp. 874-893, 2010.
[28] R. Osada, T. A. Funkhouser, B. M. Chazelle, and D. P. Dobkin, "Shape distributions," ACM Transactions on
Graphics, vol. 21, no. 4, pp. 807-832, 2002.
[29] D. G. Lowe, "Distinctive image features from scale-invariant keypoints," International Journal of Computer
Vision, vol. 60, no. 2, pp. 91-110, 2004.
[30] C. M. Bishop, "Pattern Recognition and Machine Learning," Springer, 2006.
[31] A. Krizhevsky, I. Sutskever, and G. E. Hinton, "ImageNet classification with deep convolutional neural networks,"
Advances in Neural Information Processing Systems, pp. 1097-1105, 2012.
[32] C. Sakr and N. Shanbhag, "Per-tensor fixed-point quantization of the back-propagation algorithm," International
Conference on Learning Representations, pp. 1-26, 2019.
[33] M. Attamimi, R. Mardiyanto, and A. N. Irfansyah, "Inclined image recognition for aerial mapping using deep
learning and tree based models," TELKOMNIKA, vol. 16, no. 6, pp. 3034-3044, 2018.
[34] P. Viola and M. Jones, "Rapid object detection using a boosted cascade of simple features," IEEE Computer Society
Conference on Computer Vision and Pattern Recognition, vol. 1, pp. 511-518, 2001.
[35] M. Sandler, A. Howard, M. Zhu, A. Zhmoginov, and L-C. Chen, "MobileNetV2: Inverted residuals and linear
bottlenecks," IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 4510-4520, 2018.
[36] F. Chollet, "Xception: Deep learning with depthwise separable convolutions," IEEE Conference on Computer
Vision and Pattern Recognition, pp. 1800-1807, 2017.
[37] K. Simonyan and A. Zisserman, "Very deep convolutional networks for large-scale image recognition,"
International Conference on Learning Representations, pp. 1-14, 2015.
BIOGRAPHIES OF AUTHORS
Hendra Kusuma received the BS and PhD degrees from Institut Teknologi Sepuluh Nopember
Surabaya, both in electrical engineering, in 1988 and 2016, respectively. He also received the MS
degree in renewable energy engineering from Curtin University of Technology. Since 1989, he has
been a lecturer in the Department of Electrical Engineering, Institut Teknologi Sepuluh Nopember
Surabaya. His research interests include artificial intelligence, machine learning, pattern
recognition, IoT, and applied electronics.
Muhammad Attamimi received his BE, ME, and DE degrees from The University
of Electro-Communications in 2010, 2012, and 2015, respectively. He received a scholarship from
the Ministry of Education, Culture, Sports, Science and Technology of Japan (MEXT) for his BE
and ME courses. From 2012 to 2015, he was also a Research Fellow (DC1) of the Japan Society for
the Promotion of Science (JSPS). He was a postdoctoral researcher at the Department
of Mechanical Engineering and Intelligent Systems, The University of Electro-Communications
(UEC), from April 2015 to December 2015. From January 2016, he was with the Tamagawa
University Brain Science Institute as a commissioned researcher for six months. Currently,
he is a lecturer at the Department of Electrical Engineering, Institut Teknologi Sepuluh Nopember,
Surabaya, Indonesia. His research interests include computer vision, visual recognition, machine
learning, multimodal categorization, artificial intelligence, probabilistic robotics, intelligent
systems, and intelligent robotics.
Hasby Fahrudin completed his elementary education at SDIT Ghilmani Surabaya in 2009.
He then continued his education at SMPN 6 Surabaya and SMAN 2 Surabaya. After graduating
from high school, he proceeded to tertiary education at Institut Teknologi Sepuluh Nopember.
While pursuing his bachelor's degree, he spent much of his time developing the electrical
workshop, and he was also an assistant in the basic electronics laboratory. He was a participant
in the ASEAN SCIENS GKS held in South Korea in 2018.