Biometric identification using multiple modalities has attracted the attention of many researchers, as it produces more robust and trustworthy results than single-modality biometrics. In this paper, we present a novel multimodal recognition system that trains a deep learning network to automatically learn features after extracting multiple biometric modalities from a single data source, i.e., facial video clips. Utilizing the different modalities present in the facial video clips, i.e., left ear, left profile face, frontal face, right profile face, and right ear, we train supervised denoising autoencoders to automatically extract robust and non-redundant features. The automatically learned features are then used to train modality-specific sparse classifiers that perform the multimodal recognition. Experiments conducted on a constrained facial video dataset (WVU) and an unconstrained facial video dataset (HONDA/UCSD) yielded rank-1 recognition rates of 99.17% and 97.14%, respectively. The multimodal recognition accuracy demonstrates the superiority and robustness of the proposed approach irrespective of the illumination, non-planar movement, and pose variations present in the video clips.
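As an illustration of the feature-learning step described above, the following is a minimal single-layer denoising autoencoder in NumPy; the dimensions, masking-noise level, and learning rate are invented for this sketch and are not the paper's actual architecture.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Illustrative dimensions (not the paper's actual architecture)
n_in, n_hid = 64, 16
W = rng.normal(0, 0.1, (n_hid, n_in))   # tied weights: decoder uses W.T
b_h = np.zeros(n_hid)
b_o = np.zeros(n_in)

def dae_step(x, noise=0.3, lr=0.1):
    """One gradient step: corrupt the input, reconstruct the clean version."""
    global W, b_h, b_o
    x_tilde = x * (rng.random(x.shape) > noise)     # masking noise
    h = sigmoid(W @ x_tilde + b_h)                  # encode
    y = sigmoid(W.T @ h + b_o)                      # decode (tied weights)
    err = y - x                                     # error vs. the CLEAN input
    grad_o = err * y * (1 - y)                      # output pre-activation grad
    grad_h = (W @ grad_o) * h * (1 - h)             # hidden pre-activation grad
    W -= lr * (np.outer(grad_h, x_tilde) + np.outer(grad_o, h).T)
    b_o -= lr * grad_o
    b_h -= lr * grad_h
    return float(np.mean(err ** 2))

x = rng.random(n_in)
losses = [dae_step(x) for _ in range(200)]
```

Each step corrupts the input with masking noise and updates the tied encoder/decoder weights to reconstruct the clean vector, which is what makes the learned features robust to missing or noisy inputs.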
MULTIMODAL BIOMETRICS RECOGNITION FROM FACIAL VIDEO VIA DEEP LEARNING (csandit)
Bimodal Biometric System using Multiple Transformation Features of Fingerprin... (IDES Editor)
This document presents a bimodal biometric system that fuses fingerprint and iris features for identification. It extracts features from the iris using two-level discrete wavelet transformation and discrete cosine transformation. Fingerprint features are extracted using fast Fourier transformation and discrete wavelet transformation. The iris and fingerprint features are concatenated to form the final feature set. Experimental results on fingerprint and iris databases show that the proposed bimodal system has lower false rejection and false acceptance rates and higher total success rate compared to existing unimodal systems.
A comparative review of various approaches for feature extraction in Face rec... (Vishnupriya T H)
This document provides an overview of various approaches for feature extraction in face recognition. It discusses common feature extraction algorithms such as PCA, DCT, LDA, and ICA. PCA is aimed at data compression while ensuring no information loss. DCT transforms images from spatial to frequency domains. LDA maximizes between-class variations and minimizes within-class variations. ICA determines statistically independent variables and minimizes higher-order dependencies. The document reviews several papers comparing the performance of these algorithms individually and in combination for face recognition applications.
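As a concrete sketch of one of the algorithms reviewed above, PCA-based feature extraction can be written in a few lines of NumPy; the data here is random and purely illustrative.

```python
import numpy as np

rng = np.random.default_rng(1)
X = rng.normal(size=(100, 32))            # 100 flattened "face" vectors (illustrative)

def pca_features(X, k):
    """Project data onto the k leading principal components."""
    Xc = X - X.mean(axis=0)               # center the data
    cov = Xc.T @ Xc / (len(X) - 1)        # sample covariance matrix
    vals, vecs = np.linalg.eigh(cov)      # eigenvalues in ascending order
    components = vecs[:, ::-1][:, :k]     # top-k eigenvectors
    return Xc @ components                # reduced features

Z = pca_features(X, k=8)
print(Z.shape)                            # (100, 8)
```

The same projection idea underlies the Eigenfaces method mentioned elsewhere on this page; DCT, LDA, and ICA replace the covariance eigendecomposition with other transforms or objectives.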
IRJET- Autonamy of Attendence using Face Recognition (IRJET Journal)
This document summarizes an automated attendance system using video-based face recognition. The system works by capturing a video of students in a classroom and using face detection and recognition algorithms to identify and mark the attendance of each student. It first detects faces in each video frame using the Haar cascade classifier, then recognizes the faces by comparing them to a training database of student faces using the Eigenfaces algorithm. Finally, it registers the attendance in an Excel sheet. The system aims to make the attendance process more efficient and accurate compared to traditional manual methods.
This document summarizes a research paper that proposes a feature level fusion based bimodal biometric authentication system using fingerprint and face recognition with transformation domain techniques. The system extracts fingerprint features using Dual Tree Complex Wavelet Transforms and extracts face features using Discrete Wavelet Transforms. It then concatenates the fingerprint and face features into a single feature vector. Euclidean distance is used to match test biometrics to those stored in a database. The proposed algorithm is shown to achieve better equal error rates and true positive identification rates compared to individual transformation domain techniques.
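The concatenation-plus-Euclidean-distance matching scheme described above can be sketched as follows; the feature dimensions and data are invented, and the actual wavelet-based feature extraction is omitted.

```python
import numpy as np

def fuse(f_finger, f_face):
    """Feature-level fusion by concatenation, normalized per modality."""
    f1 = f_finger / (np.linalg.norm(f_finger) + 1e-12)
    f2 = f_face / (np.linalg.norm(f_face) + 1e-12)
    return np.concatenate([f1, f2])

def match(probe, gallery):
    """Return the index of the gallery template closest in Euclidean distance."""
    d = np.linalg.norm(gallery - probe, axis=1)
    return int(np.argmin(d))

rng = np.random.default_rng(2)
gallery = np.stack([fuse(rng.normal(size=16), rng.normal(size=16)) for _ in range(5)])
probe = gallery[3] + rng.normal(scale=0.01, size=32)   # noisy copy of subject 3
print(match(probe, gallery))                            # 3
```

Per-modality normalization before concatenation keeps one modality from dominating the distance, a common design choice in feature-level fusion.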
Face recognition systems are becoming increasingly important for security applications such as surveillance cameras. They use biometric facial features, which are easier to capture from non-collaborating individuals than other biometrics. The document outlines the steps of a face recognition system: acquiring an image, detecting faces, and recognizing faces to identify individuals. It discusses challenges such as illumination and occlusion, and methods are categorized as knowledge-based or appearance-based. The problem is to design a system for a robotics lab that detects and recognizes the frontal faces of at least 50 people under changing lighting, excluding sunglasses. The thesis outline covers the literature review, proposed system theory, experiments and results, discussion, and future work.
A Fast Recognition Method for Pose and Illumination Variant Faces on Video Se... (IOSR Journals)
This document summarizes a research paper that proposes a new face recognition method for video sequences with variations in pose and illumination. The proposed method uses an active appearance model without nonlinear programming to extract features, and a lazy classifier for recognition, in order to reduce computational complexity compared to previous methods. Experimental results show the proposed method achieves better recognition performance and lower computational cost than conventional techniques. The document provides background on video face recognition challenges and reviews related work on pose-invariant and illumination-robust recognition methods.
IJERD (www.ijerd.com) International Journal of Engineering Research and Devel... (IJERD Editor)
The document describes a proposed algorithm called Fusion of Hybrid Domain features for Iris Recognition (FHDIR).
The algorithm pre-processes iris images by resizing, binarization, cropping and splitting them. It then applies Fast Fourier Transform (FFT) to the left half of the iris image to extract features and applies Principal Component Analysis (PCA) to the right half to extract features. These feature sets are then fused using arithmetic addition to generate a final feature vector. Test iris features are compared to the database using Euclidean Distance for identification.
The proposed algorithm is evaluated on the CASIA iris database and is found to have better performance than existing algorithms in terms of false rejection rate, false acceptance rate, and true
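A rough sketch of the FHDIR-style pipeline described above (FFT features from one half of the image, PCA features from the other, fusion by arithmetic addition), on toy data; the image sizes and feature truncation are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(3)
imgs = rng.random((10, 8, 8))                 # tiny illustrative "iris" images

left, right = imgs[:, :, :4], imgs[:, :, 4:]  # split each image into halves

# FFT features of the left half: leading magnitude coefficients
fft_feats = np.abs(np.fft.fft2(left)).reshape(10, -1)[:, :8]

# PCA features of the right half via SVD of the centered data
R = right.reshape(10, -1)
Rc = R - R.mean(axis=0)
_, _, Vt = np.linalg.svd(Rc, full_matrices=False)
pca_feats = Rc @ Vt[:8].T                     # top-8 principal components

fused = fft_feats + pca_feats                 # fusion by arithmetic addition
print(fused.shape)                            # (10, 8)
```

Fusion by addition requires both feature sets to have the same length, which is why both halves are truncated or projected to the same dimension here.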
The document discusses face detection and recognition methods. It covers several topics:
1) Face detection can be categorized into knowledge-based (using facial features, skin color, templates) and image-based methods (AdaBoost, Eigenfaces, neural networks, support vector machines).
2) Important factors for face detection include facial features, skin color modeling, and template matching.
3) Face recognition uses linear/nonlinear projection methods, neural networks for 2D recognition, and 3D geometric measures for 3D recognition.
Face recognition is a widely used biometric approach. Face recognition technology has developed rapidly in recent years, and it is more direct, user-friendly, and convenient than other methods. However, face recognition systems are vulnerable to spoof attacks made with non-real faces; it is easy to spoof such systems with facial pictures, for example portrait photographs. A secure system therefore needs liveness detection to guard against such spoofing. In this work, face liveness detection approaches are categorized by the type of technique used for liveness detection. This categorization helps in understanding different spoof attack scenarios and their relation to the developed solutions. A review of the latest work on face liveness detection is presented. The main aim is to provide a simple path for the future development of novel and more secure face liveness detection approaches.
https://telecombcn-dl.github.io/2017-dlcv/
Deep learning technologies are at the core of the current revolution in artificial intelligence for multimedia data analysis. The convergence of large-scale annotated datasets and affordable GPU hardware has allowed the training of neural networks for data analysis tasks which were previously addressed with hand-crafted features. Architectures such as convolutional neural networks, recurrent neural networks and Q-nets for reinforcement learning have shaped a brand new scenario in signal processing. This course will cover the basic principles and applications of deep learning to computer vision problems, such as image classification, object detection or image captioning.
Face recognization using artificial nerual network (Dharmesh Tank)
This document presents an overview of face recognition using artificial neural networks. It discusses the basic concepts of face recognition, issues with existing systems, and proposes a new system using discrete cosine transform (DCT) for feature extraction and an artificial neural network with backpropagation for classification. DCT is used to extract illumination invariant features and reduce dimensionality. The neural network is trained on these features to recognize faces. Thresholding rules are also introduced to improve recognition performance. Real-time applications of face recognition like Microsoft's Project Natal are mentioned.
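As an illustration of the DCT feature-extraction step mentioned above, a separable 2D DCT can be built from the orthonormal DCT-II matrix; truncating to the first few coefficients below is a simplification of the zigzag scan of low-frequency coefficients usually used.

```python
import numpy as np

def dct_matrix(n):
    """Orthonormal DCT-II transform matrix."""
    k = np.arange(n)[:, None]
    i = np.arange(n)[None, :]
    C = np.cos(np.pi * (2 * i + 1) * k / (2 * n))
    C[0] *= np.sqrt(1 / n)
    C[1:] *= np.sqrt(2 / n)
    return C

def dct2_features(img, k=8):
    """Separable 2D DCT of a square image; keep the first k coefficients
    of the flattened transform (a real system would zigzag-scan the
    low-frequency corner instead)."""
    C = dct_matrix(img.shape[0])
    D = C @ img @ C.T
    return D.flatten()[:k]

img = np.outer(np.arange(8), np.ones(8)).astype(float)   # toy 8x8 "face"
print(dct2_features(img).shape)                           # (8,)
```

Low-frequency DCT coefficients capture the coarse structure of the face while discarding fine, illumination-sensitive detail, which is why DCT is used here for illumination-invariant dimensionality reduction.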
The document summarizes face recognition techniques. It discusses how face recognition involves detecting faces, extracting and matching features. Common feature extraction methods discussed include principal component analysis, linear discriminant analysis, and neural networks. The document also summarizes different categories of face recognition approaches, such as template-based, statistical, neural network-based, and hybrid approaches. Local geometry-based features and other approaches like using range, infrared, or profile images are also mentioned.
Scale Invariant Feature Transform Based Face Recognition from a Single Sample... (ijceronline)
Presentation on Face detection and recognition - Credits goes to Mr Shriram, "https://www.hackster.io/sriram17ei/facial-recognition-opencv-python-9bc724"
This document describes a wearable AI device that uses computer vision and speech synthesis to help blind individuals. The device uses a Raspberry Pi with a camera to perform three main functions: facial recognition using convolutional neural networks and linear discriminant analysis, optical character recognition (OCR) to convert text to speech using a text-to-speech system, and object detection. The facial recognition and text are conveyed to the blind user through a speaker. The system is designed to be portable and help blind people identify faces, read text, and detect objects to assist them in daily life.
Iris recognition is a method of biometric identification. Biometric identification provides automatic recognition of an individual based on unique physiological or behavioral characteristics. Iris recognition recognizes a person by analyzing the iris pattern. This survey paper covers the different iris recognition techniques and methods.
Explaining Aluminous Ascientification Of Significance Examples Of Personal St... (SubmissionResearchpa)
This document describes research on algorithms for recognizing ear tags for biometric identification. It presents three algorithms: 1) using discrete cosine transformation to distinguish ear image characteristics, which achieved 86% accuracy; 2) using principal component analysis of ear images, which achieved 89% accuracy; and 3) segmenting ear images into static marks, which achieved the best result of 92% accuracy with 12 marks. The discrete cosine method was less accurate due to extracting too many characteristics, while the principal component and segmentation methods performed better with fewer extracted characteristics.
International Journal of Image Processing (IJIP) Volume (1) Issue (2) (CSCJournals)
This document summarizes a research paper on a face recognition system that uses a multi-local feature selection approach. The proposed system consists of five stages: face detection, extraction of facial features like eyes, nose and mouth, generation of moments to represent the features, classification of facial features using RBF neural networks, and face identification. The system was tested on over 3000 images from three facial databases and achieved recognition rates over 89%, outperforming global feature-based and single local feature approaches. The technique was also found to be robust to variations in translation, orientation and scaling.
This document compares various biometric methods for identification and verification. It discusses fingerprint recognition, face recognition, voice recognition, and iris recognition as some of the main biometric techniques. For each method, it describes how the biometric data is captured and analyzed, the advantages and disadvantages, and examples of applications where the technique can be used. The document provides an overview of the history of biometrics and the typical modules involved in a biometric system, such as sensors, feature extraction, matching, and template databases.
The paper explores iris recognition for personal identification and verification. A new iris recognition technique is proposed using the Scale Invariant Feature Transform (SIFT). The image-processing algorithms have been validated on a noisy real iris image database. The proposed technique is computationally efficient as well as reliable in terms of recognition rates.
IRDO: Iris Recognition by fusion of DTCWT and OLBP (IJERA Editor)
This document proposes a new iris recognition method called IRDO that fuses Dual Tree Complex Wavelet Transform (DTCWT) and Overlapping Local Binary Pattern (OLBP) features. DTCWT is used to extract micro-texture features from the iris, while OLBP enhances the extraction of edge features. Fusing these two methods results in improved matching performance and classification accuracy compared to state-of-the-art techniques. The proposed IRDO method achieves higher iris recognition rates as measured by Total Success Rate and Equal Error Rate.
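The OLBP half of this method builds on the basic local binary pattern; below is a minimal basic-LBP sketch (not the paper's overlapping variant), where each interior pixel gets an 8-bit code from comparisons with its neighbours.

```python
import numpy as np

def lbp(img):
    """Basic 8-neighbour local binary pattern for interior pixels."""
    c = img[1:-1, 1:-1]                       # centers
    # neighbours in clockwise order, each contributing one bit
    neighbours = [img[:-2, :-2], img[:-2, 1:-1], img[:-2, 2:],
                  img[1:-1, 2:], img[2:, 2:], img[2:, 1:-1],
                  img[2:, :-2], img[1:-1, :-2]]
    code = np.zeros_like(c, dtype=np.uint8)
    for bit, n in enumerate(neighbours):
        code |= (n >= c).astype(np.uint8) << bit   # set bit if neighbour >= center
    return code

img = np.arange(16.0).reshape(4, 4)                # toy image
codes = lbp(img)
hist = np.bincount(codes.ravel(), minlength=256)   # LBP texture histogram
print(codes.shape)                                 # (2, 2)
```

The histogram of codes, not the code image itself, is what typically serves as the texture feature for matching.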
The goal of this paper is to present a critical survey of the existing literature on human face detection and recognition over the last 4-5 years. An application for automatic face detection and tracking in video streams from surveillance cameras in public or commercial places is discussed. A prototype is designed to work with web cameras, with the face detection and tracking system based on Visual Studio 2010 C# and OpenCV. This system can be used for security purposes to record visitors' faces as well as to detect and track faces.
Keywords: Face Detection, Face Recognition, OpenCV, Face Tracking, Video Streams.
International Journal of Engineering and Science Invention (IJESI) (inventionjournals)
International Journal of Engineering and Science Invention (IJESI) is an international journal intended for professionals and researchers in all fields of computer science and electronics. IJESI publishes research articles and reviews across the whole field of engineering, science, and technology, covering new teaching methods, assessment, validation, and the impact of new technologies, and it will continue to provide information on the latest trends and developments in this ever-expanding subject. Papers are selected through double peer review to ensure originality, relevance, and readability. The articles published in the journal can be accessed online.
Biometric Iris Recognition Based on Hybrid Technique (ijsc)
This document presents a study on implementing an iris recognition system using a hybrid technique. The system utilizes several image processing and machine learning techniques. It begins with preprocessing the iris image, including capturing, resizing and converting to grayscale. Histogram equalization is then used for enhancement. Two-dimensional discrete wavelet transform (2D DWT) is applied for feature extraction. Various edge detection algorithms including Canny, Prewitt, Roberts and Sobel are used to detect iris boundaries. The features are then stored in a vector for classification. The system is tested on different iris images and analysis shows 2D DWT and Canny edge detection provide adequate results for feature extraction and iris recognition.
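The 2D DWT step above can be illustrated with one level of the Haar wavelet, the simplest DWT; this is a generic sketch, not necessarily the exact wavelet the study used.

```python
import numpy as np

def haar_dwt2(img):
    """One level of the 2D Haar wavelet transform (image sides must be even)."""
    # rows: scaled sums and differences of adjacent pixel pairs
    lo = (img[:, ::2] + img[:, 1::2]) / np.sqrt(2)
    hi = (img[:, ::2] - img[:, 1::2]) / np.sqrt(2)
    # columns: the same filtering applied to the row-filtered outputs
    LL = (lo[::2] + lo[1::2]) / np.sqrt(2)   # approximation subband
    LH = (lo[::2] - lo[1::2]) / np.sqrt(2)   # horizontal detail
    HL = (hi[::2] + hi[1::2]) / np.sqrt(2)   # vertical detail
    HH = (hi[::2] - hi[1::2]) / np.sqrt(2)   # diagonal detail
    return LL, LH, HL, HH

img = np.arange(16.0).reshape(4, 4)
LL, LH, HL, HH = haar_dwt2(img)
print(LL.shape)   # (2, 2)
```

Because the Haar transform is orthonormal, the four subbands together preserve the image's energy, and the LL subband is what feeds further decomposition levels or the feature vector.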
ANALYSIS OF MMSE SPEECH ESTIMATION IMPACT IN WEST SUMATRA'S NOISES (sipij)
This study aimed to estimate an original voice signal corrupted by noise using the MMSE method based on the Wiener filter; the Wiener filter is classified as an MMSE estimator and has been studied by previous researchers. The study used as input a voice signal of a European woman counting down, distorted by two types of noise: an outdoor noise, the siren of a fire engine, and an indoor noise, represented by the noise of a lecturer's room on campus. Each noisy signal was estimated with an MMSE estimator approximated by a Wiener filter, which requires finding and computing the covariance of each signal process involved in the system. The researchers thus estimated both types of noisy signals and assessed the impact in the form of plots against time, the shape of the signal, and the SNR.
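A minimal frequency-domain Wiener filter sketch on synthetic data (a sinusoid standing in for the voice, white noise standing in for the siren or room noise); the study's covariance-based formulation is more involved, so this only shows the core gain rule.

```python
import numpy as np

rng = np.random.default_rng(4)
t = np.arange(1024)
clean = np.sin(2 * np.pi * 32 * t / 1024)          # stand-in for the voice
noisy = clean + rng.normal(scale=0.5, size=t.size)

# Wiener gain per frequency bin, H = Sx / (Sx + Sn), with the noise power
# spectrum Sn estimated from a separate noise-only recording
X = np.fft.rfft(noisy)
Sn = np.abs(np.fft.rfft(rng.normal(scale=0.5, size=t.size))) ** 2
Sx = np.maximum(np.abs(X) ** 2 - Sn, 0.0)          # crude signal-power estimate
H = Sx / (Sx + Sn + 1e-12)
est = np.fft.irfft(H * X, n=t.size)

def snr_db(sig, ref):
    """SNR of `sig` in dB relative to the clean reference `ref`."""
    return 10 * np.log10(np.sum(ref ** 2) / np.sum((sig - ref) ** 2))

print(snr_db(est, clean) > snr_db(noisy, clean))   # True: the filter helps
```

The gain suppresses bins where the estimated signal power is low relative to the noise power, which is exactly the per-frequency MMSE trade-off the Wiener filter formalizes.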
A PROPOSED MULTI-DOMAIN APPROACH FOR AUTOMATIC CLASSIFICATION OF TEXT DOCUMENTS (ijsc)
Classification is an important technique used in information retrieval. Supervised classification suffers from certain limitations concerning the collection and labeling of the training dataset. When facing multi-domain classification, multiple training datasets and classifiers are needed, which is relatively difficult. In this paper, an unsupervised classification system is proposed that can also manage the multi-domain classification problem. It is a multi-domain system in which each domain is represented by an ontology. A document is mapped onto each ontology based on the weights of the mutual tokens between them, with the help of fuzzy sets, resulting in a mapping degree of the document with each domain. An experiment was carried out showing satisfactory classification results, with an improvement in the evaluation results of the proposed system compared to Apache Lucene.
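The mapping degree between a document and an ontology might be sketched as a weighted token overlap; the function and the domain data below are invented for illustration, not the paper's exact fuzzy-set formulation.

```python
def mapping_degree(doc_tokens, ontology_weights):
    """Fuzzy membership of a document in one domain ontology:
    the weight of shared terms over the ontology's total weight."""
    shared = set(doc_tokens) & set(ontology_weights)
    total = sum(ontology_weights.values())
    return sum(ontology_weights[t] for t in shared) / total if total else 0.0

# hypothetical domain ontologies: term -> importance weight
domains = {
    "sports":  {"match": 0.9, "team": 0.8, "score": 0.7},
    "finance": {"stock": 0.9, "market": 0.8, "score": 0.3},
}
doc = ["the", "team", "won", "the", "match"]
degrees = {d: mapping_degree(doc, w) for d, w in domains.items()}
print(max(degrees, key=degrees.get))   # sports
```

Because the degrees are graded rather than binary, a document can belong partially to several domains, which is what lets a single unsupervised system cover the multi-domain case.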
A PREDICTION METHOD OF GESTURE TRAJECTORY BASED ON LEAST SQUARES FITTING MODEL (VLSICS Design)
Implicit interaction based on context information is widely used and studied in virtual scenes. In context-based human-computer interaction, the meaning of an action is well defined by its context: for instance, a right-hand wave may mean turning a page or a PPT slide in one context and turning the volume up in another. However, context information cannot be used when selecting the object to be manipulated. In view of this situation, this paper proposes using least-squares fitting of a curve bundle to predict the user's trajectory, so as to determine which object the user wants to operate. The fitting performance of three curves is compared, and the curve that best matches the hand movement trajectory is obtained. In addition, the bounding-box size is used to control movement of the Z variable to an appropriate location. Experimental results show that controlling the Z variable based on the bounding-box size works well, and that fitting the trajectory of the human hand predicts the object the subject wants to operate with a correct rate of 91%.
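Least-squares trajectory fitting and extrapolation of the kind described above can be sketched with `np.polyfit`; the trajectory data here is synthetic and noise-free, so the fit recovers the curve exactly.

```python
import numpy as np

# A hand trajectory sampled over time; fit a curve by least squares and
# extrapolate it to predict where the hand is heading.
t = np.linspace(0, 1, 10)
x = 2 * t + 0.5 * t ** 2          # observed x(t) with a quadratic trend
coeffs = np.polyfit(t, x, deg=2)  # least-squares quadratic fit
x_pred = np.polyval(coeffs, 1.5)  # extrapolate beyond the observed samples
print(round(x_pred, 3))           # 4.125
```

In the paper's setting the predicted point is then intersected with candidate objects' bounding boxes to decide which object the user intends to operate.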
The Natural Language Toolkit (NLTK) is a generic platform for processing data in various natural (human) languages, and it also provides resources for Indian languages such as Hindi, Bangla, and Marathi. In the proposed work, the repositories provided by NLTK are used to process Hindi text and then to analyze multiword expressions (MWEs). MWEs are lexical items that can be decomposed into multiple lexemes and that display lexical, syntactic, semantic, pragmatic, and statistical idiomaticity. The main focus of this paper is the processing and analysis of MWEs in Hindi text. The corpus used for Hindi text processing is taken from the famous Hindi novel "KaramaBhumi" by Munshi PremChand. The result analysis is done using the Hindi corpus provided by the Resource Centre for Indian Language Technology Solutions (CFILT). Results are analyzed to justify the accuracy of the proposed work.
The document discusses face detection and recognition methods. It covers several topics:
1) Face detection can be categorized into knowledge-based (using facial features, skin color, templates) and image-based methods (AdaBoost, Eigenfaces, neural networks, support vector machines).
2) Important factors for face detection include facial features, skin color modeling, and template matching.
3) Face recognition uses linear/nonlinear projection methods, neural networks for 2D recognition, and 3D geometric measures for 3D recognition.
Face recognition is a widely used biometric approach. Face recognition technology has developed rapidly
in recent years and it is more direct, user friendly and convenient compared to other methods. But face
recognition systems are vulnerable to spoof attacks made by non-real faces. It is an easy way to spoof face
recognition systems by facial pictures such as portrait photographs. A secure system needs Liveness
detection in order to guard against such spoofing. In this work, face liveness detection approaches are
categorized based on the various types techniques used for liveness detection. This categorization helps
understanding different spoof attacks scenarios and their relation to the developed solutions. A review of
the latest works regarding face liveness detection works is presented. The main aim is to provide a simple
path for the future development of novel and more secured face liveness detection approach.
https://telecombcn-dl.github.io/2017-dlcv/
Deep learning technologies are at the core of the current revolution in artificial intelligence for multimedia data analysis. The convergence of large-scale annotated datasets and affordable GPU hardware has allowed the training of neural networks for data analysis tasks which were previously addressed with hand-crafted features. Architectures such as convolutional neural networks, recurrent neural networks and Q-nets for reinforcement learning have shaped a brand new scenario in signal processing. This course will cover the basic principles and applications of deep learning to computer vision problems, such as image classification, object detection or image captioning.
Face recognization using artificial nerual networkDharmesh Tank
This document presents an overview of face recognition using artificial neural networks. It discusses the basic concepts of face recognition, issues with existing systems, and proposes a new system using discrete cosine transform (DCT) for feature extraction and an artificial neural network with backpropagation for classification. DCT is used to extract illumination invariant features and reduce dimensionality. The neural network is trained on these features to recognize faces. Thresholding rules are also introduced to improve recognition performance. Real-time applications of face recognition like Microsoft's Project Natal are mentioned.
The document summarizes face recognition techniques. It discusses how face recognition involves detecting faces, extracting and matching features. Common feature extraction methods discussed include principal component analysis, linear discriminant analysis, and neural networks. The document also summarizes different categories of face recognition approaches, such as template-based, statistical, neural network-based, and hybrid approaches. Local geometry-based features and other approaches like using range, infrared, or profile images are also mentioned.
Scale Invariant Feature Transform Based Face Recognition from a Single Sample...ijceronline
International Journal of Computational Engineering Research (IJCER) is dedicated to protecting personal information and will make every reasonable effort to handle collected information appropriately. All information collected, as well as related requests, will be handled as carefully and efficiently as possible in accordance with IJCER standards for integrity and objectivity.
Presentation on Face detection and recognition - Credit goes to Mr Shriram, "https://www.hackster.io/sriram17ei/facial-recognition-opencv-python-9bc724"
This document describes a wearable AI device that uses computer vision and speech synthesis to help blind individuals. The device uses a Raspberry Pi with a camera to perform three main functions: facial recognition using convolutional neural networks and linear discriminant analysis, optical character recognition (OCR) to convert text to speech using a text-to-speech system, and object detection. The facial recognition and text are conveyed to the blind user through a speaker. The system is designed to be portable and help blind people identify faces, read text, and detect objects to assist them in daily life.
Iris recognition is a method of biometric identification. Biometric identification provides automatic recognition of an individual based on unique physiological or behavioral characteristics. Iris recognition recognizes a person by analyzing the iris pattern. This survey paper covers the different iris recognition techniques and methods.
Explaining Aluminous Ascientification Of Significance Examples Of Personal St...SubmissionResearchpa
This document describes research on algorithms for recognizing ear tags for biometric identification. It presents three algorithms: 1) using discrete cosine transformation to distinguish ear image characteristics, which achieved 86% accuracy; 2) using principal component analysis of ear images, which achieved 89% accuracy; and 3) segmenting ear images into static marks, which achieved the best result of 92% accuracy with 12 marks. The discrete cosine method was less accurate due to extracting too many characteristics, while the principal component and segmentation methods performed better with fewer extracted characteristics.
International Journal of Image Processing (IJIP) Volume (1) Issue (2)CSCJournals
This document summarizes a research paper on a face recognition system that uses a multi-local feature selection approach. The proposed system consists of five stages: face detection, extraction of facial features like eyes, nose and mouth, generation of moments to represent the features, classification of facial features using RBF neural networks, and face identification. The system was tested on over 3000 images from three facial databases and achieved recognition rates over 89%, outperforming global feature-based and single local feature approaches. The technique was also found to be robust to variations in translation, orientation and scaling.
This document compares various biometric methods for identification and verification. It discusses fingerprint recognition, face recognition, voice recognition, and iris recognition as some of the main biometric techniques. For each method, it describes how the biometric data is captured and analyzed, the advantages and disadvantages, and examples of applications where the technique can be used. The document provides an overview of the history of biometrics and the typical modules involved in a biometric system, such as sensors, feature extraction, matching, and template databases.
The paper explores iris recognition for personal identification and verification. In this paper a new iris recognition technique using the Scale Invariant Feature Transform (SIFT) is proposed. The image-processing algorithms have been validated on a noisy real iris image database. The proposed technique is computationally efficient as well as reliable in terms of recognition rates.
IRDO: Iris Recognition by fusion of DTCWT and OLBPIJERA Editor
This document proposes a new iris recognition method called IRDO that fuses Dual Tree Complex Wavelet Transform (DTCWT) and Overlapping Local Binary Pattern (OLBP) features. DTCWT is used to extract micro-texture features from the iris, while OLBP enhances the extraction of edge features. Fusing these two methods results in improved matching performance and classification accuracy compared to state-of-the-art techniques. The proposed IRDO method achieves higher iris recognition rates as measured by Total Success Rate and Equal Error Rate.
The goal of this paper is to present a critical survey of the existing literature on human face detection and recognition over the last 4-5 years. An application for automatic face detection and tracking in video streams from surveillance cameras in public or commercial places is discussed in this paper. A prototype face detection and tracking system is designed to work with web cameras, based on Visual Studio 2010 C# and OpenCV. This system can be used for security purposes to record the visitor's face as well as to detect and track the face.
Keywords:- Face Detection, Face Recognition, OpenCV, Face Tracking, Video Streams.
International Journal of Engineering and Science Invention (IJESI)inventionjournals
International Journal of Engineering and Science Invention (IJESI) is an international journal intended for professionals and researchers in all fields of computer science and electronics. IJESI publishes research articles and reviews within the whole field Engineering Science and Technology, new teaching methods, assessment, validation and the impact of new technologies and it will continue to provide information on the latest trends and developments in this ever-expanding subject. The publications of papers are selected through double peer reviewed to ensure originality, relevance, and readability. The articles published in our journal can be accessed online.
Biometric Iris Recognition Based on Hybrid Techniqueijsc
This document presents a study on implementing an iris recognition system using a hybrid technique. The system utilizes several image processing and machine learning techniques. It begins with preprocessing the iris image, including capturing, resizing and converting to grayscale. Histogram equalization is then used for enhancement. Two-dimensional discrete wavelet transform (2D DWT) is applied for feature extraction. Various edge detection algorithms including Canny, Prewitt, Roberts and Sobel are used to detect iris boundaries. The features are then stored in a vector for classification. The system is tested on different iris images and analysis shows 2D DWT and Canny edge detection provide adequate results for feature extraction and iris recognition.
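The 2D DWT and edge-detection stages above can be sketched with a one-level Haar transform and a Sobel gradient — a simplified, numpy-only stand-in for the paper's DWT plus Canny/Prewitt/Roberts/Sobel comparison:

```python
import numpy as np

def haar_dwt2(img):
    # One level of a 2-D Haar DWT: approximation (LL) and detail
    # (LH, HL, HH) sub-bands, each half the size of the input.
    a = (img[0::2, :] + img[1::2, :]) / 2.0   # rows: low-pass
    d = (img[0::2, :] - img[1::2, :]) / 2.0   # rows: high-pass
    LL = (a[:, 0::2] + a[:, 1::2]) / 2.0
    LH = (a[:, 0::2] - a[:, 1::2]) / 2.0
    HL = (d[:, 0::2] + d[:, 1::2]) / 2.0
    HH = (d[:, 0::2] - d[:, 1::2]) / 2.0
    return LL, LH, HL, HH

def sobel_edges(img, thresh=0.5):
    # Gradient magnitude via Sobel kernels, then a binary edge map.
    gx = np.zeros_like(img); gy = np.zeros_like(img)
    gx[1:-1, 1:-1] = (img[1:-1, 2:] - img[1:-1, :-2]) * 2 \
                   + (img[:-2, 2:] - img[:-2, :-2]) + (img[2:, 2:] - img[2:, :-2])
    gy[1:-1, 1:-1] = (img[2:, 1:-1] - img[:-2, 1:-1]) * 2 \
                   + (img[2:, :-2] - img[:-2, :-2]) + (img[2:, 2:] - img[:-2, 2:])
    mag = np.hypot(gx, gy)
    return mag > thresh * mag.max()

iris = np.zeros((16, 16))
iris[:, 8:] = 1.0                      # synthetic stand-in for an iris boundary
LL, LH, HL, HH = haar_dwt2(iris)
edges = sobel_edges(iris)
print(LL.shape)  # (8, 8)
```

The synthetic step image only demonstrates the sub-band shapes and the detected vertical boundary; a real pipeline would run on the preprocessed grayscale iris crop.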
ANALYSIS OF MMSE SPEECH ESTIMATION IMPACT IN WEST SUMATRA'S NOISESsipij
This study aimed to estimate an original voice signal corrupted by noise, using an MMSE method based on the Wiener filter. The Wiener filter is classified as the MMSE estimator studied by previous researchers. The study used as input a countdown spoken by a European woman, distorted by two types of noise: an outdoor noise, the siren of a fire engine, and an indoor noise, recorded in a lecturer's room on campus. Each noisy signal is estimated by an MMSE estimator approximated by a Wiener filter, which requires finding and computing the covariance of each signal process in the system. The researchers thus estimated both noisy versions of the woman's voice, corrupted by the fire-engine siren and by the faculty-room noise, and assessed the impact in terms of plots against time, the shape of the signal, and the SNR.
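The Wiener-filter estimation step can be sketched in the frequency domain — a minimal illustration assuming a known noise power spectrum, not the study's exact covariance-based formulation:

```python
import numpy as np

def wiener_denoise(noisy, noise_psd):
    # Frequency-domain Wiener gain H(f) = S(f) / (S(f) + N(f)), the
    # MMSE linear estimator of the clean spectrum. The clean PSD S(f)
    # is approximated here by spectral subtraction.
    spec = np.fft.rfft(noisy)
    noisy_psd = np.abs(spec) ** 2 / len(noisy)
    clean_psd = np.maximum(noisy_psd - noise_psd, 1e-12)
    gain = clean_psd / (clean_psd + noise_psd)
    return np.fft.irfft(gain * spec, n=len(noisy))

rng = np.random.default_rng(1)
t = np.arange(4096) / 8000.0
clean = np.sin(2 * np.pi * 440 * t)            # stand-in for the voice
noise = 0.5 * rng.standard_normal(t.size)      # stand-in for siren/room noise
noise_psd = np.abs(np.fft.rfft(noise)) ** 2 / t.size
est = wiener_denoise(clean + noise, noise_psd)

snr = lambda x: 10 * np.log10(np.mean(clean**2) / np.mean((x - clean)**2))
print(round(snr(clean + noise), 1), "->", round(snr(est), 1))  # SNR improves
```

Using the true noise realization for `noise_psd` is an oracle assumption for the demo; in practice the noise PSD is estimated from speech-free segments.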
A PROPOSED MULTI-DOMAIN APPROACH FOR AUTOMATIC CLASSIFICATION OF TEXT DOCUMENTSijsc
Classification is an important technique used in information retrieval. Supervised classification suffers from certain limitations concerning the collection and labeling of the training dataset. When facing multi-domain classification, multiple training datasets and classifiers are needed, which is relatively difficult. In this paper an unsupervised classification system is proposed that can also manage the multi-domain classification problem. It is a multi-domain system where each domain is represented by an ontology. A document is mapped onto each ontology based on the weights of the mutual tokens between them with the help of fuzzy sets, resulting in a mapping degree of the document with each domain. An experiment was carried out showing satisfactory classification results, with an improvement in the evaluation results of the proposed system compared to Apache Lucene.
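The fuzzy mapping-degree idea can be sketched as follows — toy ontologies and an assumed scoring scheme (shared-token weight over total ontology weight), for illustration only:

```python
def mapping_degree(doc_tokens, ontology):
    # Fuzzy membership of a document in a domain: sum of the weights
    # of tokens shared with the domain ontology, normalised by the
    # total ontology weight (assumed scheme, not the paper's exact one).
    shared = sum(w for t, w in ontology.items() if t in doc_tokens)
    total = sum(ontology.values())
    return shared / total if total else 0.0

# Hypothetical toy ontologies (term -> weight).
sports = {"match": 0.9, "goal": 0.8, "team": 0.6}
finance = {"stock": 0.9, "market": 0.8, "bond": 0.7}

doc = {"the", "team", "scored", "a", "goal"}
scores = {"sports": mapping_degree(doc, sports),
          "finance": mapping_degree(doc, finance)}
print(max(scores, key=scores.get))  # sports
```

The document is assigned to whichever domain yields the highest mapping degree, with no labeled training data required.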
A PREDICTION METHOD OF GESTURE TRAJECTORY BASED ON LEAST SQUARES FITTING MODELVLSICS Design
Implicit interaction based on context information is widely used and studied in virtual scenes. In context-based human-computer interaction, the meaning of an action A is well defined: for instance, a right-hand wave means turning a page or advancing a PPT slide in context B, and volume up in context C. However, context information cannot be used when selecting the object to be manipulated. In view of this situation, this paper proposes using least-squares fitting of a curve beam to predict the user's trajectory, so as to determine which object the user wants to operate. The fitting effects of three curve families were compared, and the curve that best matches the hand movement trajectory was obtained. In addition, the bounding-box size is used to control the Z variable so that it moves to an appropriate location. Experimental results show that the proposed bounding-box-based control of the Z variable works well, and that fitting the trajectory of the human hand predicts the object the subject wants to operate with a correct rate of 91%.
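The core prediction step — least-squares fitting of the sampled hand trajectory and extrapolating one step ahead — can be sketched with `numpy.polyfit` (a polynomial model is assumed here; the paper compares several curve families):

```python
import numpy as np

def predict_next(xs, ys, degree=2, step=1.0):
    # Fit x(t) and y(t) by least squares and extrapolate one time step
    # ahead to anticipate which object the hand is heading toward.
    t = np.arange(len(xs), dtype=float)
    px = np.polyfit(t, xs, degree)
    py = np.polyfit(t, ys, degree)
    t_next = t[-1] + step
    return np.polyval(px, t_next), np.polyval(py, t_next)

# Samples from y = x^2 are fit exactly by a degree-2 model.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [0.0, 1.0, 4.0, 9.0, 16.0]
print(predict_next(xs, ys))  # ~ (5.0, 25.0)
```

The predicted point would then be tested against candidate object bounding boxes to select the manipulation target.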
Natural Language Toolkit (NLTK) is a generic platform to process the data of various natural (human) languages, and it also provides resources for Indian languages such as Hindi, Bangla, and Marathi. In the proposed work, the repositories provided by NLTK are used to carry out the processing of Hindi text and further to analyse Multiword Expressions (MWEs). MWEs are lexical items that can be decomposed into multiple lexemes and display lexical, syntactic, semantic, pragmatic and statistical idiomaticity. The main focus of this paper is on the processing and analysis of MWEs in Hindi text. The corpus used for Hindi text processing is taken from the famous Hindi novel "KaramaBhumi" by Munshi PremChand. The result analysis is done using the Hindi corpus provided by the Resource Centre for Indian Language Technology Solutions (CFILT). Results are analysed to justify the accuracy of the proposed work.
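A standard statistical-idiomaticity test for MWE candidates is pointwise mutual information over adjacent token pairs; a language-agnostic sketch (shown on Latin-script tokens for readability — the same code applies to tokenised Hindi):

```python
import math
from collections import Counter

def pmi_bigrams(tokens, min_count=2):
    # PMI(a, b) = log P(a, b) / (P(a) P(b)); high-PMI bigrams are
    # multiword-expression candidates (statistical idiomaticity).
    unigrams = Counter(tokens)
    bigrams = Counter(zip(tokens, tokens[1:]))
    n = len(tokens)
    scores = {}
    for (a, b), c in bigrams.items():
        if c >= min_count:
            scores[(a, b)] = math.log((c / (n - 1)) /
                                      ((unigrams[a] / n) * (unigrams[b] / n)))
    return sorted(scores.items(), key=lambda kv: -kv[1])

toks = "new delhi is far but new delhi is big and is old".split()
print(pmi_bigrams(toks)[0][0])  # ('new', 'delhi')
```

NLTK's collocation finders offer the same measure off the shelf; this pure-Python version just makes the computation explicit.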
BUILDING A SYLLABLE DATABASE TO SOLVE THE PROBLEM OF KHMER WORD SEGMENTATIONijnlc
Word segmentation is a basic problem in natural language processing. For languages with a complex writing system, like the Khmer language of southern Vietnam, this problem is truly intractable and poses significant challenges. Although some experts in Vietnam and abroad have researched this problem in depth, there are still no results that meet the demand; in particular, the ambiguity phenomenon in Khmer language processing has not been treated thoroughly so far. This paper presents a solution based on dividing syllables into component clusters using two proposed syllable models, thereby building a Khmer syllable database that was not previously available. The method uses a lexical database updated from online Khmer dictionaries, with some supporting dictionaries serving as training data and supplying complementary linguistic characteristics. Each component cluster is labelled and located by its first and last letter to identify a syllable in its entirety. This approach is workable, and the test results achieve high accuracy, eliminate ambiguity, and contribute to solving the word segmentation problem and to efficient Khmer language processing.
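A dictionary-backed segmenter of the kind the syllable database enables can be sketched with greedy longest-match lookup — a simplified stand-in for the paper's cluster-based syllable models, shown on Latin text for readability:

```python
def segment(text, lexicon):
    # Greedy longest-match segmentation against a syllable/word lexicon;
    # unknown characters fall through as single-character tokens.
    out, i = [], 0
    while i < len(text):
        for j in range(len(text), i, -1):   # try the longest candidate first
            if text[i:j] in lexicon:
                out.append(text[i:j])
                i = j
                break
        else:                               # no lexicon entry matched
            out.append(text[i])
            i += 1
    return out

lex = {"in", "tract", "able", "act"}        # toy lexicon
print(segment("intractable", lex))  # ['in', 'tract', 'able']
```

Real Khmer segmentation additionally needs the cluster labelling described above to resolve ambiguous matches; greedy longest-match alone cannot do that.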
INNOVATIVE AND SOCIAL TECHNOLOGIES ARE REVOLUTIONIZING SAUDI CONSUMER ATTITUD...ijsptm
The Internet today connects about 40% of the world population. Half of these are living outside the
advanced economies, often in some countries that are quickly climbing the developmental ladder, with
diverse populations and inarguable economic potentialities including Saudi Arabia. There are 20 million
Internet users in Saudi Arabia by the end of 2016. This research examines the impact of the Internet,
innovation, disruption and the dynamics that affect adoption of electronic commerce (e-commerce) in
Saudi Arabia. The study further analyzes the influence of Internet security and its impact on the users’
decision-making process. The data were gathered through questionnaires disseminated by email to 2,823 male and female Saudi Internet users. The response rate was 23%. The important findings discussed in this study show that there are significant factors influencing the adoption of e-commerce, including legal procedures to ease doing business and the security features and controls introduced by the online vendors.
A CASE STUDY ON AUTO SOCIALIZATION IN ONLINE PLATFORMSIJMIT JOURNAL
This document presents a case study on how the theory of auto socialization can be applied to understand socialization in online platforms. The study examined how clinical research associates used a Yammer platform to socialize with their employer organization. The results showed that the associates did not imitate or conform to the values of management as represented on Yammer. As they could not reflect the significant others on the platform, they rejected the socialization attempt. The document discusses how mastery learning used in MOOCs lacks an interaction dimension, hindering socialization and contributing to high dropout rates. It argues online learning needs to incorporate social aspects to be effective.
A thorough review of trust models is carried out in this paper to reveal the key capabilities of existing trust
models and compare how they differ among disciplines. Trust decisions are risky due to uncertainties and
the loss of control. On the other hand, not trusting might mean giving up some potential benefits. Advances
in electronic transactions, multiagent systems, and decision support systems create a necessity to develop
trust and reputation models. The development of such models will allow for trust reasoning and decisions
to be made in situations with high risk and uncertainty. In recent years, several attempts have been made to
model reputation and trust. However, perceiving trust differently and the lack of having a unified trust
definition are among the main causes of the proliferation of many trust models across different disciplines.
A STUDY ON QUANTITATIVE PARAMETERS OF SPECTRUM HANDOFF IN COGNITIVE RADIO NET...ijwmn
The innovation of wireless technologies requires dynamic allocation of spectrum band in an efficient
manner. This has been achieved by Cognitive Radio (CR) networks which allow unlicensed users to make
use of free licensed spectrum, when the licensed users are kept away from that spectrum. The cognitive
radio makes decision, switching from primary user to secondary user and vice-versa, based on its built-in
interference engine. It allows secondary users to make use of a channel based on its availability, i.e. on the
absence of the primary user and they should vacate the channel once the primary user re-enters and
continue their communication on another available channel and this process in the cognitive radio is
known as spectrum mobility. The main objective of spectrum mobility is that, there is no interruption
caused due to the channel occupied by secondary users and maintains a good quality of service. In order to
achieve better spectrum mobility, it is mandatory to choose an effective spectrum handoff strategy with the
capability of predicting spectrum mobility. The handoff strategy with its parameters and its impact is an
important concept in spectrum mobility but fairly explored. In this paper an empirical study on quantitative
parameters involved in spectrum mobility prediction are discussed in detail. These parameters are studied
extensively because they play a vital role in the spectrum handoff process moreover the impact of these
parameters in various handoff methods can be used to predict the effectiveness of the system.
NETWORK PERFORMANCE ENHANCEMENT WITH OPTIMIZATION SENSOR PLACEMENT IN WIRELES...ijwmn
Progress in sensor manufacturing technology on one side and wireless communication technology on the other has driven the growth and deployment of Wireless Sensor Networks (WSNs). Adequate performance of a WSN depends on several parameters, such as optimized sensor placement and the structure of the sensor network. Optimized placement in a WSN not only minimizes the number of sensors, but also helps obtain more precise information. Different solutions have therefore been proposed to reduce cost and increase the lifetime of sensor networks, most of them concentrated on routing and information transmission. In this paper, locations that need new sensor placements or sensor movements are determined, and the performance of the WSN after applying these changes is calculated. To achieve the optimum placement, the network must be evaluated precisely and the criteria affecting performance must be extracted. The criteria are therefore ranked and, after weighting with AHP algorithms, combined using a Geographical Information System (GIS); new sensor placements are created in the locations where the WSN does not perform well enough. The proposed method improves the performance of the WSN by 21.11% through sensor placement in the low-performance locations, while adding 26.09% more sensors, the lowest number of added sensors in comparison with other methods.
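The AHP weighting step mentioned in the abstract can be sketched as a principal-eigenvector computation over a pairwise-comparison matrix — the criteria names and judgment values below are assumptions for illustration:

```python
import numpy as np

def ahp_weights(pairwise, iters=100):
    # Principal right eigenvector of the pairwise-comparison matrix,
    # found by power iteration and normalised to sum to 1: the
    # standard AHP criterion-weighting step.
    w = np.ones(pairwise.shape[0]) / pairwise.shape[0]
    for _ in range(iters):
        w = pairwise @ w
        w /= w.sum()
    return w

# Hypothetical comparison of three placement criteria
# (coverage, cost, lifetime): coverage judged 3x as important as
# cost and 5x as important as lifetime.
A = np.array([[1.0,   3.0, 5.0],
              [1 / 3, 1.0, 2.0],
              [1 / 5, 1 / 2, 1.0]])
w = ahp_weights(A)
print(w.round(2))  # coverage gets the largest weight
```

The resulting weight vector is what a GIS layer-combination step would then apply to the ranked criteria maps.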
A NOVEL IMAGE STEGANOGRAPHY APPROACH USING MULTI-LAYERS DCT FEATURES BASED ON...ijma
Steganography is the science of hiding data in a cover image without perceptibly altering the cover image. Recent steganography research has focused on hiding large amounts of information within image and/or audio files. This paper proposes a novel approach for hiding the data of a secret image using Discrete Cosine Transform (DCT) features and a linear Support Vector Machine (SVM) classifier. The DCT features are used to reduce the redundant information of the image. Moreover, DCT is used to embed the secret message in the least significant bits of the RGB channels. Each bit in the cover image is changed only to an extent not visible to the human eye. The SVM is used as a classifier to speed up the hiding process via the DCT features. The proposed method was implemented and the results show significant improvements. In addition, a performance analysis is computed based on the parameters MSE, PSNR, NC, processing time, capacity, and robustness.
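The least-significant-bit embedding step the abstract mentions can be sketched as follows; this shows only plain LSB embedding, not the paper's DCT/SVM pipeline:

```python
import numpy as np

def embed(cover, bits):
    # Write payload bits into the least significant bit of each pixel
    # channel; each value changes by at most 1 intensity level.
    flat = cover.ravel().copy()
    flat[:len(bits)] = (flat[:len(bits)] & 0xFE) | bits
    return flat.reshape(cover.shape)

def extract(stego, n):
    # Read back the first n payload bits.
    return stego.ravel()[:n] & 1

rng = np.random.default_rng(2)
cover = rng.integers(0, 256, size=(8, 8, 3), dtype=np.uint8)  # toy RGB cover
msg = np.array([1, 0, 1, 1, 0, 0, 1, 0], dtype=np.uint8)
stego = embed(cover, msg)
recovered = extract(stego, len(msg))
print((recovered == msg).all())  # True — payload recovered intact
```

Since only the lowest bit moves, the maximum per-pixel distortion is 1 level, which is what keeps the change below the visibility threshold.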
ROBUST FEATURE EXTRACTION USING AUTOCORRELATION DOMAIN FOR NOISY SPEECH RECOG...sipij
Previous research has identified the autocorrelation domain as an appropriate domain for signal and noise separation. This paper discusses a simple and effective method for decreasing the effect of noise on the
autocorrelation of the clean signal. This could later be used in extracting mel cepstral parameters for
speech recognition. Two different methods are proposed to deal with the effect of error introduced by
considering speech and noise completely uncorrelated. The basic approach deals with reducing the effect
of noise via estimation and subtraction of its effect from the noisy speech signal autocorrelation. In order
to improve this method, we consider inserting a speech/noise cross correlation term into the equations used
for the estimation of clean speech autocorrelation, using an estimate of it, found through Kernel method.
Alternatively, we used an estimate of the cross correlation term using an averaging approach. A further
improvement was obtained through introduction of an overestimation parameter in the basic method. We
tested our proposed methods on the Aurora 2 task. The Basic method has shown considerable improvement
over the standard features and some other robust autocorrelation-based features. The proposed techniques
have further increased the robustness of the basic autocorrelation-based method.
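The basic method — subtracting an estimated noise autocorrelation from the noisy-speech autocorrelation under the uncorrelatedness assumption — can be sketched as follows (an illustration using an oracle noise estimate, not the paper's estimator):

```python
import numpy as np

def autocorr(x):
    # Biased autocorrelation estimate via the FFT (Wiener-Khinchin).
    n = len(x)
    spec = np.fft.rfft(x, 2 * n)
    return np.fft.irfft(np.abs(spec) ** 2)[:n] / n

def clean_autocorr(noisy, noise_est):
    # Basic method: subtract the estimated noise autocorrelation,
    # assuming speech and noise are uncorrelated (the cross-term the
    # paper's refinements try to account for is neglected here).
    return autocorr(noisy) - autocorr(noise_est)

rng = np.random.default_rng(3)
t = np.arange(4096)
clean = np.sin(0.3 * t)                       # stand-in for voiced speech
noise = 0.5 * rng.standard_normal(t.size)
r_est = clean_autocorr(clean + noise, noise)  # oracle noise estimate
err_sub = abs(r_est[0] - autocorr(clean)[0])
err_raw = abs(autocorr(clean + noise)[0] - autocorr(clean)[0])
print(err_sub < err_raw)  # subtraction brings lag 0 closer to clean
```

Mel-cepstral features would then be computed from the cleaned autocorrelation sequence rather than from the noisy signal directly.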
DEVELOPMENT OF AN ANDROID APPLICATION FOR OBJECT DETECTION BASED ON COLOR, SH...ijma
Object detection and recognition is an important task in many computer vision applications. In this paper
an Android application was developed using Eclipse IDE and OpenCV3 Library. This application is able to
detect objects in an image that is loaded from the mobile gallery, based on its color, shape, or local
features. The image is processed in the HSV color domain for better color detection. Circular shapes are
detected using Circular Hough Transform and other shapes are detected using Douglas-Peucker
algorithm. BRISK (binary robust invariant scalable keypoints) local features were applied in the developed
Android application for matching an object image in another scene image. The steps of the proposed
detection algorithms are described, and the interfaces of the application are illustrated. The application is
ported and tested on Galaxy S3, S6, and Note1 Smartphones. Based on the experimental results, the
application is capable of detecting eleven different colors, detecting two-dimensional geometrical shapes including circles, rectangles, triangles, and squares, and correctly matching local features of object and scene images under different conditions. The application could be used as a standalone application, or as a part of
another application such as Robot systems, traffic systems, e-learning applications, information retrieval
and many others.
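The HSV color-detection stage can be sketched with the standard library's `colorsys` — a per-pixel illustration of the hue/saturation/value thresholding that OpenCV's `cvtColor` plus `inRange` would do vectorized in the real app:

```python
import colorsys
import numpy as np

def detect_color(img, h_lo, h_hi, s_min=0.4, v_min=0.2):
    # Mark pixels whose HSV hue falls in [h_lo, h_hi] (hue in [0, 1])
    # with enough saturation and brightness to be a confident color.
    mask = np.zeros(img.shape[:2], dtype=bool)
    for i in range(img.shape[0]):
        for j in range(img.shape[1]):
            h, s, v = colorsys.rgb_to_hsv(*(img[i, j] / 255.0))
            mask[i, j] = h_lo <= h <= h_hi and s >= s_min and v >= v_min
    return mask

img = np.zeros((2, 2, 3), dtype=np.uint8)
img[0, 0] = (200, 30, 30)    # red-ish pixel
img[1, 1] = (30, 30, 200)    # blue-ish pixel
red = detect_color(img, 0.0, 0.05)
print(red[0, 0], red[1, 1])  # True False
```

Thresholding in HSV rather than RGB is what makes the detection tolerant to brightness changes, since hue is separated from value.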
A NOVEL METHOD FOR PERSON TRACKING BASED K-NN : COMPARISON WITH SIFT AND MEAN...sipij
Object tracking can be defined as the process of detecting an object of interest from a video scene and
keeping track of its motion, orientation, occlusion, etc., in order to extract useful information. It is indeed a challenging and important task, and many researchers are drawn to the field of computer vision, specifically to object tracking in video surveillance. The main purpose of this paper is to give the reader the present state of the art in object tracking, together with the steps involved in background subtraction and its techniques. In the related literature we found three main methods of object tracking: the first is optical flow; the second is background subtraction, which is divided into two types presented in this paper, followed by temporal differencing and the SIFT method; and the last is the mean-shift method. We present a novel approach to background subtraction that compares the current frame with a previously built background model, so that each pixel of the image can be classified as foreground or background; a tracking step then represents our object of interest, a person, by its centroid. The tracking step is divided into two different methods, the surface method and the K-NN method, both explained in the paper. Our proposed method is implemented and evaluated using the CAVIAR database.
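The background-subtraction-plus-centroid pipeline described above can be sketched in a few lines — a minimal frame-differencing illustration, not the paper's full surface/K-NN tracker:

```python
import numpy as np

def foreground_mask(frame, background, thresh=25):
    # Pixels differing from the background model beyond a threshold
    # are classified as foreground.
    return np.abs(frame.astype(int) - background.astype(int)) > thresh

def centroid(mask):
    # Centre of mass of the foreground pixels (the tracked person).
    ys, xs = np.nonzero(mask)
    return (ys.mean(), xs.mean()) if ys.size else None

bg = np.zeros((60, 80), dtype=np.uint8)   # learned background model
frame = bg.copy()
frame[20:30, 40:50] = 200                 # a bright "person" enters the scene
print(centroid(foreground_mask(frame, bg)))  # (24.5, 44.5)
```

Tracking then amounts to re-computing the centroid per frame and associating it with the previous position.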
A COLLAGE IMAGE CREATION & “KANISEI” ANALYSIS SYSTEM BY COMBINING MULTIPLE IM...ijma
The collage is an artistic method to create a new image by combining multiple images under some rules or
conditions for collage creators. To realize a mechanism to interpret “Kansei” of the collage creator by the
combination process of multiple image-materials and a formed collage image, we propose a new system to
analyze the collage work by using a database stored to each collage creation information. The collage
works which a creator made by using this system was evaluated. And also, we clarified how to combine
image-materials to make the good collage image having specific impression.
ASPHALTIC MATERIAL IN THE CONTEXT OF GENERALIZED POROTHERMOELASTICITYijsc
This document summarizes a mathematical model of generalized porothermoelasticity for a porous asphalt material. The model considers a poroelastic half-space saturated with fluid and constructs governing equations for the stresses, strains, displacements, and temperatures of the solid and fluid phases. The equations are derived using the theory of generalized porothermoelasticity with one relaxation time. Numerical solutions to the equations are obtained by applying the model to an asphalt material subjected to thermal shock on its surface. Graphs of the temperature, stresses, strains, and displacements within the material are presented.
BFO – AIS: A FRAME WORK FOR MEDICAL IMAGE CLASSIFICATION USING SOFT COMPUTING...ijsc
Medical images provide diagnostic evidence and information about anatomical pathology. Database growth is enormous, as digital medical imaging equipment such as Magnetic Resonance Imaging (MRI), Computed Tomography (CT), and Positron Emission Tomography CT (PET-CT) is part of clinical work. CT images distinguish various tissues according to gray levels to help medical diagnosis. CT is more reliable for early detection of tumours and haemorrhages, as it provides anatomical information for planning radiotherapy. The goal of medical information systems is to deliver information to the right persons at the right time and place to improve the quality and efficiency of the care process. This paper proposes an Artificial Immune System (AIS) classifier and a feature selection method based on hybrid Bacterial Foraging Optimization (BFO) with Local Search (LS) for medical image classification.
REAL-TIME INTRUSION DETECTION SYSTEM FOR BIG DATAijp2p
The objective of the proposed system is to integrate a high volume of data with important considerations such as monitoring a wide array of heterogeneous security sources. When a real-time cyber attack occurs, the Intrusion Detection System automatically stores the log in a distributed environment and checks the log against an existing intrusion dictionary. At the same time, the system categorizes the severity of the log as high, medium, or low. After the categorization, the system automatically takes the necessary action against the user unit according to the severity of the log. The advantage of the system is that it utilizes anomaly detection, evaluates data, and issues alert messages or reports based on abnormal behaviour.
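The dictionary-lookup-and-categorize step can be sketched as follows; the signature patterns and severities below are hypothetical, purely for illustration:

```python
# Hypothetical intrusion dictionary: signature pattern -> severity.
DICTIONARY = {
    "sql injection": "high",
    "port scan": "medium",
    "failed login": "low",
}

def categorize(log_line):
    # Match an incoming log entry against the intrusion dictionary and
    # return its severity; unmatched entries would fall through to the
    # anomaly-detection path (represented here by "unknown").
    line = log_line.lower()
    for pattern, severity in DICTIONARY.items():
        if pattern in line:
            return severity
    return "unknown"

print(categorize("2024-01-01 10:02 SQL injection attempt from 10.0.0.7"))  # high
```

In the distributed setting the same lookup would run per node over the stored logs, with the severity driving the automated response.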
STANDARD ARABIC VERBS INFLECTIONS USING NOOJ PLATFORMijnlc
This article describes the morphological analysis of standard Arabic for natural language processing, as part of an electronic-dictionary construction phase. A full model of 3-lettered inflected verbs is formalized based on a linguistic classification using the NooJ platform. The classification yields certain representative verbs that are considered as lemmas; these verbs form our dictionary entries and are also conjugated according to our inflection paradigm, relying on certain specific morphological properties. This dictionary will be considered an Arabic resource that will help NLP applications and the NooJ platform analyse sophisticated Arabic corpora.
A NOVEL APPROACH FOR INFORMATION RETRIEVAL TECHNIQUE FOR WEB USING NLPijnlc
Webpages are loaded with vast and varied information about real-world entities. Information retrieval from the Web is of great significance today; getting accurately queried data within the desired time frame is becoming more difficult with each passing day. There is a need for a system that solves entity-extraction problems from the web better than traditional systems. It is critical to have a clear understanding of natural-language sentence processing in the web page, along with the structure of the web page, to get the correct information quickly. Here we propose an information retrieval technique for the Web using NLP, where Hierarchical Conditional Random Fields (HCRF) and extended Semi-Markov Conditional Random Fields (Semi-CRF), along with Visual Page Segmentation, are used to get accurate results. Parallel processing is also used to achieve results in the desired time frame. The approach further improves the decision making between HCRF and Semi-CRF by using a bidirectional rather than a top-down approach, enabling better understanding of the content and page structure.
Multimodal Biometrics Recognition from Facial Video via Deep Learning cscpconf
Biometrics identification using multiple modalities has attracted the attention of many
researchers as it produces more robust and trustworthy results than single modality biometrics.
In this paper, we present a novel multimodal recognition system that trains a Deep Learning
Network to automatically learn features after extracting multiple biometric modalities from a
single data source, i.e., facial video clips. Utilizing different modalities, i.e., left ear, left profile
face, frontal face, right profile face, and right ear, present in the facial video clips, we train
supervised denoising auto-encoders to automatically extract robust and non-redundant features.
The automatically learned features are then used to train modality specific sparse classifiers to
perform the multimodal recognition. Experiments conducted on the constrained facial video
dataset (WVU) and the unconstrained facial video dataset (HONDA/UCSD) resulted in 99.17% and 97.14% rank-1 recognition rates, respectively. The multimodal recognition
accuracy demonstrates the superiority and robustness of the proposed approach irrespective of
the illumination, non-planar movement, and pose variations present in the video clips.
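The denoising auto-encoder at the heart of this pipeline can be illustrated with a minimal one-layer numpy version — a toy stand-in for the stacked, supervised, per-modality networks the paper trains, with tied weights and masking-noise corruption assumed:

```python
import numpy as np

rng = np.random.default_rng(0)

class DenoisingAutoencoder:
    # Minimal one-hidden-layer denoising auto-encoder with tied
    # weights, trained by plain gradient descent on the squared
    # reconstruction error of the *clean* input.
    def __init__(self, n_in, n_hidden, lr=0.5):
        self.W = rng.normal(0, 0.1, (n_in, n_hidden))
        self.b = np.zeros(n_hidden)   # encoder bias
        self.c = np.zeros(n_in)       # decoder bias
        self.lr = lr

    def encode(self, x):
        return 1.0 / (1.0 + np.exp(-(x @ self.W + self.b)))

    def step(self, x, noise=0.3):
        # Corrupt the input (masking noise), reconstruct the clean
        # version, and backpropagate the reconstruction error.
        x_noisy = x * (rng.random(x.shape) > noise)
        h = self.encode(x_noisy)
        x_hat = h @ self.W.T + self.c
        err = x_hat - x
        dh = (err @ self.W) * h * (1 - h)          # through the sigmoid
        self.W -= self.lr * (x_noisy.T @ dh + err.T @ h) / len(x)
        self.b -= self.lr * dh.mean(axis=0)
        self.c -= self.lr * err.mean(axis=0)
        return float((err ** 2).mean())

X = rng.random((256, 16))                # stand-in for modality patches
dae = DenoisingAutoencoder(16, 8)
losses = [dae.step(X) for _ in range(200)]
print(losses[0] > losses[-1])  # reconstruction error decreases
```

The hidden activations `dae.encode(x)` are the learned features; in the paper's setup these would feed the modality-specific sparse classifiers, and the supervised variant adds label information to the training objective.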
IRJET- Human Face Recognition in Video using Convolutional Neural Network (CNN)IRJET Journal
This document describes a method for human face recognition in videos using convolutional neural networks (CNNs). It involves preprocessing video frames to grayscale, detecting faces using Viola-Jones detection, extracting features using BRISK, and classifying faces using a CNN. The CNN approach improves efficiency and accuracy in detecting faces under varying poses and illumination conditions compared to previous methods. The goal is to develop a system that can accurately recognize faces in videos despite challenges like illumination changes, poses, occlusions, etc.
Analysis of student sentiment during video class with multi-layer deep learni...IJECEIAES
The modern education system is an essential part of the rise of technology. The e-learning education system is no longer just an experimental system; it has become a vital learning system for the whole world over the last few months. In our research, we have developed a learning method that is more effective and modern for students and teachers. For the implementation, we use convolutional neural networks and advanced data classifiers. The expression and mood analysis of a student during an online class is the main focus of our study. The final output is classified as attentive, inattentive, understanding, or neutral. To show the output in the real-time online class and for sensory analysis, we use a support vector machine (SVM) and OpenCV. A 5x4-level neural network is created for this work. An advanced learning medium is proposed through our study: teachers can monitor the live class and the different feelings of a student during the class period through this system.
This document provides a review of face recognition techniques and several face video databases. It discusses four main face video databases: NRC-IIT, Honda/UCSD, Choke Point, and McGill. For each database, it describes the image characteristics, subjects, environment, limitations, and relevant research. It also reviews approaches for face detection, feature extraction, and recognition in video-based systems. Common techniques discussed include Viola-Jones detection, SIFT features, and combining detection with tracking across frames. The performance of these methods on the different video datasets is analyzed and areas for future improvement are identified.
Face detection is one of the most suitable applications for image processing and biometric programs. Artificial neural networks have been used in the many field like image processing, pattern recognition, sales forecasting, customer research and data validation. Face detection and recognition have become one of the most popular biometric techniques over the past few years. There is a lack of research literature that provides an overview of studies and research-related research of Artificial neural networks face detection. Therefore, this study includes a review of facial recognition studies as well systems based on various Artificial neural networks methods and algorithms.
Ioannis Pitas, Professor, Aristotle University of Thessaloniki, Department of Informatics (IEEE Fellow), Semantic 3DTV Content Analysis and Description
This document summarizes a research project on face and facial feature detection from images. The project uses the Viola-Jones object detection framework combined with geometric information to detect faces and locate features like eyes, nose, and mouth. Key steps include face detection using Haar-like features and AdaBoost classification, then detecting facial features based on characteristics like size, shape, color, and position relative to other facial features. MATLAB functions like videoinput and getsnapshot are used to acquire video frames and capture images for processing.
International Journal of Engineering and Science Invention (IJESI) is an international journal intended for professionals and researchers in all fields of computer science and electronics. IJESI publishes research articles and reviews within the whole field Engineering Science and Technology, new teaching methods, assessment, validation and the impact of new technologies and it will continue to provide information on the latest trends and developments in this ever-expanding subject. The publications of papers are selected through double peer reviewed to ensure originality, relevance, and readability. The articles published in our journal can be accessed online.
1) The document presents a new face parts detection algorithm that combines the Viola-Jones object detection framework with geometric information of facial features.
2) It detects faces, then isolates regions of interest for the eyes, nose, and mouth. Eye pupils are located using iris recognition techniques.
3) The algorithm was tested on hundreds of images and showed promising results for automated facial feature detection.
DEEPFAKE DETECTION TECHNIQUES: A REVIEWvivatechijri
Noteworthy advancements in the field of deep learning have led to the rise of highly realistic AI generated fake videos, these videos are commonly known as Deepfakes. They refer to manipulated videos, that are generated by sophisticated AI, that yield formed videos and tones that seem to be original. Although this technology has numerous beneficial applications, there are also significant concerns about the disadvantages of the same. So there is a need to develop a system that would detect and mitigate the negative impact of these AI generated videos on society. The videos that get transferred through social media are of low quality, so the detection of such videos becomes difficult. Many researchers in the past have done analysis on Deepfake detection which were based on Machine Learning, Support Vector Machine and Deep Learning based techniques such as Convolution Neural Network with or without LSTM .This paper analyses various techniques that are used by several researchers to detect Deepfake videos.
Exploring visual and motion saliency for automatic video object extractionMuthu Samy
Sybian Technologies Pvt Ltd
Final Year Projects & Real Time live Projects
JAVA(All Domains)
DOTNET(All Domains)
ANDROID
EMBEDDED
VLSI
MATLAB
Project Support
Abstract, Diagrams, Review Details, Relevant Materials, Presentation,
Supporting Documents, Software E-Books,
Software Development Standards & Procedure
E-Book, Theory Classes, Lab Working Programs, Project Design & Implementation
24/7 lab session
Final Year Projects For BE,ME,B.Sc,M.Sc,B.Tech,BCA,MCA
PROJECT DOMAIN:
Cloud Computing
Networking
Network Security
PARALLEL AND DISTRIBUTED SYSTEM
Data Mining
Mobile Computing
Service Computing
Software Engineering
Image Processing
Bio Medical / Medical Imaging
Contact Details:
Sybian Technologies Pvt Ltd,
No,33/10 Meenakshi Sundaram Building,
Sivaji Street,
(Near T.nagar Bus Terminus)
T.Nagar,
Chennai-600 017
Ph:044 42070551
Mobile No:9790877889,9003254624,7708845605
Mail Id:sybianprojects@gmail.com,sunbeamvijay@yahoo.com
Exploring visual and motion saliency for automatic video object extractionMuthu Samy
Sybian Technologies Pvt Ltd
Final Year Projects & Real Time live Projects
JAVA(All Domains)
DOTNET(All Domains)
ANDROID
EMBEDDED
VLSI
MATLAB
Project Support
Abstract, Diagrams, Review Details, Relevant Materials, Presentation,
Supporting Documents, Software E-Books,
Software Development Standards & Procedure
E-Book, Theory Classes, Lab Working Programs, Project Design & Implementation
24/7 lab session
Final Year Projects For BE,ME,B.Sc,M.Sc,B.Tech,BCA,MCA
PROJECT DOMAIN:
Cloud Computing
Networking
Network Security
PARALLEL AND DISTRIBUTED SYSTEM
Data Mining
Mobile Computing
Service Computing
Software Engineering
Image Processing
Bio Medical / Medical Imaging
Contact Details:
Sybian Technologies Pvt Ltd,
No,33/10 Meenakshi Sundaram Building,
Sivaji Street,
(Near T.nagar Bus Terminus)
T.Nagar,
Chennai-600 017
Ph:044 42070551
Mobile No:9790877889,9003254624,7708845605
Mail Id:sybianprojects@gmail.com,sunbeamvijay@yahoo.com
This document discusses a face identification system to identify criminals. The system stores images of known criminals along with their details in a database. Eyewitnesses can then help construct a face using image slices from the database, which is compared to stored images. If a match of 99% or more is found, the person is identified as the criminal. The objectives, advantages, and disadvantages of the system are provided. Technical, economic, and operational feasibility of the system are also analyzed.
An Iot Based Smart Manifold Attendance SystemIJERDJOURNAL
ABSTRACT:- Attendance has been an age old procedure employed in different disciplines of educational institutions. While attendance systems have witnessed growth right from manual techniques to biometrics, plight of taking attendance is undeniable. In fingerprint based attendance monitoring, if fingers get roughed / scratched, it leads to misreading. Also for face recognition, students will have to make a queue and each one will have to wait until their face gets recognised. Our proposed system is employing “manifold attendance” that means employing passive attendance, where at a time, the attendance of multiple people can get captured. We have eliminated the need of queue system / paper-pen system of attendance, and just with a single click the attendance is not only captured, but monitored as well, that too without any human intervention. In the proposed system, creation of database and face detection is done by using the concepts of bounding box, whereas for face recognition we employ histogram equalization and matching technique.
MUSIC RECOMMENDATION THROUGH FACE RECOGNITION AND EMOTION DETECTIONIRJET Journal
This document describes a system for music recommendation based on facial emotion recognition using convolutional neural networks. It analyzes facial images using CNN models to detect seven basic emotions. Based on the detected emotion, it recommends songs from predefined playlists that match the user's mood. The system architecture inputs images of the user's face, uses a CNN for emotion detection, and then selects appropriate music from playlists organized by emotion. It was developed to provide personalized music recommendations based on a user's real-time facial expressions and emotional state, unlike other systems that rely only on search queries or prior listening history.
Implementation of video tagging identifying characters in videoeSAT Publishing House
IJRET : International Journal of Research in Engineering and Technology is an international peer reviewed, online journal published by eSAT Publishing House for the enhancement of research in various disciplines of Engineering and Technology. The aim and scope of the journal is to provide an academic medium and an important reference for the advancement and dissemination of research results that support high-level learning, teaching and research in the fields of Engineering and Technology. We bring together Scientists, Academician, Field Engineers, Scholars and Students of related fields of Engineering and Technology
Fake Video Creation and Detection: A ReviewIRJET Journal
This document summarizes research on fake video creation and detection using deep learning techniques. It discusses how advances in deep learning, particularly generative adversarial networks (GANs), have made it easier to generate realistic fake videos but also pose risks if misused. The document reviews methods for creating fake videos, such as face swapping and face reenactment using autoencoders, as well as methods for detecting fake videos by examining visual artifacts in frames or temporal inconsistencies across frames using classifiers like CNNs. Overall, the document provides an overview of the state of deepfake video generation and detection.
This document describes an iris recognition system implemented using National Instruments LabVIEW for secure voting. The system has four main stages: 1) image acquisition using an infrared camera, 2) iris localization by detecting circles in the iris image, 3) pattern matching to extract an iris code, and 4) authentication by matching the iris code to a database. The database stores voter information and iris codes in an encrypted format. On voting day, the system matches the voter's ID and captured iris image to the database to verify their identity before allowing them to vote. The system aims to provide more secure identity verification than traditional password or ID systems.
PERFORMANCE ANALYSIS OF FINGERPRINTING EXTRACTION ALGORITHM IN VIDEO COPY DET...IJCSEIT Journal
A video fingerprint is a recognizer that is derived from a piece of video content. The video fingerprinting
methods obtain unique features of a video that differentiates one video clip from another. It aims to identify
whether a query video segment is a copy of video from the video database or not based on the signature of
the video. It is difficult to find whether a video is a copied video or a similar video, since the features of the
content are very similar from one video to the other. The main focus of this paper is to detect that the query
video is present in the video database with robustness depending on the content of video and also by fast
search of fingerprints. The Fingerprint Extraction Algorithm and Fast Search Algorithms are adopted in
this paper to achieve robust, fast, efficient and accurate video copy detection. As a first step, the
Fingerprint Extraction algorithm is employed which extracts a fingerprint through the features from the
image content of video. The images are represented as Temporally Informative Representative Images
(TIRI). Then, the second step is to find the presence of copy of a query video in a video database, in which
a close match of its fingerprint in the corresponding fingerprint database is searched using inverted-filebased
method. The proposed system is tested against various attacks like noise, brightness, contrast,
rotation and frame drop. Thus the performance of the proposed system on an average shows high true
positive rate of 98% and low false positive rate of 1.3% for different attacks.
Similar to MULTIMODAL BIOMETRICS RECOGNITION FROM FACIAL VIDEO VIA DEEP LEARNING (20)
Infrastructure Challenges in Scaling RAG with Custom AI modelsZilliz
Building Retrieval-Augmented Generation (RAG) systems with open-source and custom AI models is a complex task. This talk explores the challenges in productionizing RAG systems, including retrieval performance, response synthesis, and evaluation. We’ll discuss how to leverage open-source models like text embeddings, language models, and custom fine-tuned models to enhance RAG performance. Additionally, we’ll cover how BentoML can help orchestrate and scale these AI components efficiently, ensuring seamless deployment and management of RAG systems in the cloud.
Climate Impact of Software Testing at Nordic Testing DaysKari Kakkonen
My slides at Nordic Testing Days 6.6.2024
Climate impact / sustainability of software testing discussed on the talk. ICT and testing must carry their part of global responsibility to help with the climat warming. We can minimize the carbon footprint but we can also have a carbon handprint, a positive impact on the climate. Quality characteristics can be added with sustainability, and then measured continuously. Test environments can be used less, and in smaller scale and on demand. Test techniques can be used in optimizing or minimizing number of tests. Test automation can be used to speed up testing.
GraphRAG for Life Science to increase LLM accuracyTomaz Bratanic
GraphRAG for life science domain, where you retriever information from biomedical knowledge graphs using LLMs to increase the accuracy and performance of generated answers
HCL Notes und Domino Lizenzkostenreduzierung in der Welt von DLAUpanagenda
Webinar Recording: https://www.panagenda.com/webinars/hcl-notes-und-domino-lizenzkostenreduzierung-in-der-welt-von-dlau/
DLAU und die Lizenzen nach dem CCB- und CCX-Modell sind für viele in der HCL-Community seit letztem Jahr ein heißes Thema. Als Notes- oder Domino-Kunde haben Sie vielleicht mit unerwartet hohen Benutzerzahlen und Lizenzgebühren zu kämpfen. Sie fragen sich vielleicht, wie diese neue Art der Lizenzierung funktioniert und welchen Nutzen sie Ihnen bringt. Vor allem wollen Sie sicherlich Ihr Budget einhalten und Kosten sparen, wo immer möglich. Das verstehen wir und wir möchten Ihnen dabei helfen!
Wir erklären Ihnen, wie Sie häufige Konfigurationsprobleme lösen können, die dazu führen können, dass mehr Benutzer gezählt werden als nötig, und wie Sie überflüssige oder ungenutzte Konten identifizieren und entfernen können, um Geld zu sparen. Es gibt auch einige Ansätze, die zu unnötigen Ausgaben führen können, z. B. wenn ein Personendokument anstelle eines Mail-Ins für geteilte Mailboxen verwendet wird. Wir zeigen Ihnen solche Fälle und deren Lösungen. Und natürlich erklären wir Ihnen das neue Lizenzmodell.
Nehmen Sie an diesem Webinar teil, bei dem HCL-Ambassador Marc Thomas und Gastredner Franz Walder Ihnen diese neue Welt näherbringen. Es vermittelt Ihnen die Tools und das Know-how, um den Überblick zu bewahren. Sie werden in der Lage sein, Ihre Kosten durch eine optimierte Domino-Konfiguration zu reduzieren und auch in Zukunft gering zu halten.
Diese Themen werden behandelt
- Reduzierung der Lizenzkosten durch Auffinden und Beheben von Fehlkonfigurationen und überflüssigen Konten
- Wie funktionieren CCB- und CCX-Lizenzen wirklich?
- Verstehen des DLAU-Tools und wie man es am besten nutzt
- Tipps für häufige Problembereiche, wie z. B. Team-Postfächer, Funktions-/Testbenutzer usw.
- Praxisbeispiele und Best Practices zum sofortigen Umsetzen
AI 101: An Introduction to the Basics and Impact of Artificial IntelligenceIndexBug
Imagine a world where machines not only perform tasks but also learn, adapt, and make decisions. This is the promise of Artificial Intelligence (AI), a technology that's not just enhancing our lives but revolutionizing entire industries.
Monitoring and Managing Anomaly Detection on OpenShift.pdfTosin Akinosho
Monitoring and Managing Anomaly Detection on OpenShift
Overview
Dive into the world of anomaly detection on edge devices with our comprehensive hands-on tutorial. This SlideShare presentation will guide you through the entire process, from data collection and model training to edge deployment and real-time monitoring. Perfect for those looking to implement robust anomaly detection systems on resource-constrained IoT/edge devices.
Key Topics Covered
1. Introduction to Anomaly Detection
- Understand the fundamentals of anomaly detection and its importance in identifying unusual behavior or failures in systems.
2. Understanding Edge (IoT)
- Learn about edge computing and IoT, and how they enable real-time data processing and decision-making at the source.
3. What is ArgoCD?
- Discover ArgoCD, a declarative, GitOps continuous delivery tool for Kubernetes, and its role in deploying applications on edge devices.
4. Deployment Using ArgoCD for Edge Devices
- Step-by-step guide on deploying anomaly detection models on edge devices using ArgoCD.
5. Introduction to Apache Kafka and S3
- Explore Apache Kafka for real-time data streaming and Amazon S3 for scalable storage solutions.
6. Viewing Kafka Messages in the Data Lake
- Learn how to view and analyze Kafka messages stored in a data lake for better insights.
7. What is Prometheus?
- Get to know Prometheus, an open-source monitoring and alerting toolkit, and its application in monitoring edge devices.
8. Monitoring Application Metrics with Prometheus
- Detailed instructions on setting up Prometheus to monitor the performance and health of your anomaly detection system.
9. What is Camel K?
- Introduction to Camel K, a lightweight integration framework built on Apache Camel, designed for Kubernetes.
10. Configuring Camel K Integrations for Data Pipelines
- Learn how to configure Camel K for seamless data pipeline integrations in your anomaly detection workflow.
11. What is a Jupyter Notebook?
- Overview of Jupyter Notebooks, an open-source web application for creating and sharing documents with live code, equations, visualizations, and narrative text.
12. Jupyter Notebooks with Code Examples
- Hands-on examples and code snippets in Jupyter Notebooks to help you implement and test anomaly detection models.
UiPath Test Automation using UiPath Test Suite series, part 6DianaGray10
Welcome to UiPath Test Automation using UiPath Test Suite series part 6. In this session, we will cover Test Automation with generative AI and Open AI.
UiPath Test Automation with generative AI and Open AI webinar offers an in-depth exploration of leveraging cutting-edge technologies for test automation within the UiPath platform. Attendees will delve into the integration of generative AI, a test automation solution, with Open AI advanced natural language processing capabilities.
Throughout the session, participants will discover how this synergy empowers testers to automate repetitive tasks, enhance testing accuracy, and expedite the software testing life cycle. Topics covered include the seamless integration process, practical use cases, and the benefits of harnessing AI-driven automation for UiPath testing initiatives. By attending this webinar, testers, and automation professionals can gain valuable insights into harnessing the power of AI to optimize their test automation workflows within the UiPath ecosystem, ultimately driving efficiency and quality in software development processes.
What will you get from this session?
1. Insights into integrating generative AI.
2. Understanding how this integration enhances test automation within the UiPath platform
3. Practical demonstrations
4. Exploration of real-world use cases illustrating the benefits of AI-driven test automation for UiPath
Topics covered:
What is generative AI
Test Automation with generative AI and Open AI.
UiPath integration with generative AI
Speaker:
Deepak Rai, Automation Practice Lead, Boundaryless Group and UiPath MVP
Let's Integrate MuleSoft RPA, COMPOSER, APM with AWS IDP along with Slackshyamraj55
Discover the seamless integration of RPA (Robotic Process Automation), COMPOSER, and APM with AWS IDP enhanced with Slack notifications. Explore how these technologies converge to streamline workflows, optimize performance, and ensure secure access, all while leveraging the power of AWS IDP and real-time communication via Slack notifications.
HCL Notes and Domino License Cost Reduction in the World of DLAUpanagenda
Webinar Recording: https://www.panagenda.com/webinars/hcl-notes-and-domino-license-cost-reduction-in-the-world-of-dlau/
The introduction of DLAU and the CCB & CCX licensing model caused quite a stir in the HCL community. As a Notes and Domino customer, you may have faced challenges with unexpected user counts and license costs. You probably have questions on how this new licensing approach works and how to benefit from it. Most importantly, you likely have budget constraints and want to save money where possible. Don’t worry, we can help with all of this!
We’ll show you how to fix common misconfigurations that cause higher-than-expected user counts, and how to identify accounts which you can deactivate to save money. There are also frequent patterns that can cause unnecessary cost, like using a person document instead of a mail-in for shared mailboxes. We’ll provide examples and solutions for those as well. And naturally we’ll explain the new licensing model.
Join HCL Ambassador Marc Thomas in this webinar with a special guest appearance from Franz Walder. It will give you the tools and know-how to stay on top of what is going on with Domino licensing. You will be able lower your cost through an optimized configuration and keep it low going forward.
These topics will be covered
- Reducing license cost by finding and fixing misconfigurations and superfluous accounts
- How do CCB and CCX licenses really work?
- Understanding the DLAU tool and how to best utilize it
- Tips for common problem areas, like team mailboxes, functional/test users, etc
- Practical examples and best practices to implement right away
Removing Uninteresting Bytes in Software FuzzingAftab Hussain
Imagine a world where software fuzzing, the process of mutating bytes in test seeds to uncover hidden and erroneous program behaviors, becomes faster and more effective. A lot depends on the initial seeds, which can significantly dictate the trajectory of a fuzzing campaign, particularly in terms of how long it takes to uncover interesting behaviour in your code. We introduce DIAR, a technique designed to speedup fuzzing campaigns by pinpointing and eliminating those uninteresting bytes in the seeds. Picture this: instead of wasting valuable resources on meaningless mutations in large, bloated seeds, DIAR removes the unnecessary bytes, streamlining the entire process.
In this work, we equipped AFL, a popular fuzzer, with DIAR and examined two critical Linux libraries -- Libxml's xmllint, a tool for parsing xml documents, and Binutil's readelf, an essential debugging and security analysis command-line tool used to display detailed information about ELF (Executable and Linkable Format). Our preliminary results show that AFL+DIAR does not only discover new paths more quickly but also achieves higher coverage overall. This work thus showcases how starting with lean and optimized seeds can lead to faster, more comprehensive fuzzing campaigns -- and DIAR helps you find such seeds.
- These are slides of the talk given at IEEE International Conference on Software Testing Verification and Validation Workshop, ICSTW 2022.
Ocean lotus Threat actors project by John Sitima 2024 (1).pptxSitimaJohn
Ocean Lotus cyber threat actors represent a sophisticated, persistent, and politically motivated group that poses a significant risk to organizations and individuals in the Southeast Asian region. Their continuous evolution and adaptability underscore the need for robust cybersecurity measures and international cooperation to identify and mitigate the threats posed by such advanced persistent threat groups.
Building Production Ready Search Pipelines with Spark and MilvusZilliz
Spark is the widely used ETL tool for processing, indexing and ingesting data to serving stack for search. Milvus is the production-ready open-source vector database. In this talk we will show how to use Spark to process unstructured data to extract vector representations, and push the vectors to Milvus vector database for search serving.
Full-RAG: A modern architecture for hyper-personalizationZilliz
Mike Del Balso, CEO & Co-Founder at Tecton, presents "Full RAG," a novel approach to AI recommendation systems, aiming to push beyond the limitations of traditional models through a deep integration of contextual insights and real-time data, leveraging the Retrieval-Augmented Generation architecture. This talk will outline Full RAG's potential to significantly enhance personalization, address engineering challenges such as data management and model training, and introduce data enrichment with reranking as a key solution. Attendees will gain crucial insights into the importance of hyperpersonalization in AI, the capabilities of Full RAG for advanced personalization, and strategies for managing complex data integrations for deploying cutting-edge AI solutions.
TrustArc Webinar - 2024 Global Privacy SurveyTrustArc
How does your privacy program stack up against your peers? What challenges are privacy teams tackling and prioritizing in 2024?
In the fifth annual Global Privacy Benchmarks Survey, we asked over 1,800 global privacy professionals and business executives to share their perspectives on the current state of privacy inside and outside of their organizations. This year’s report focused on emerging areas of importance for privacy and compliance professionals, including considerations and implications of Artificial Intelligence (AI) technologies, building brand trust, and different approaches for achieving higher privacy competence scores.
See how organizational priorities and strategic approaches to data security and privacy are evolving around the globe.
This webinar will review:
- The top 10 privacy insights from the fifth annual Global Privacy Benchmarks Survey
- The top challenges for privacy leaders, practitioners, and organizations in 2024
- Key themes to consider in developing and maintaining your privacy program
MULTIMODAL BIOMETRICS RECOGNITION FROM FACIAL VIDEO VIA DEEP LEARNING

Signal & Image Processing : An International Journal (SIPIJ) Vol.8, No.1, February 2017
DOI : 10.5121/sipij.2017.8101

Sayan Maity1, Mohamed Abdel-Mottaleb2 and Shihab S. Asfour1

1 Industrial Engineering (IE) Department, University of Miami; 1251 Memorial Drive; Coral Gables; Florida
2 Electrical and Computer Engineering Department, University of Miami; 1251 Memorial Drive; Coral Gables; Florida
ABSTRACT

Biometrics identification using multiple modalities has attracted the attention of many researchers as it produces more robust and trustworthy results than single modality biometrics. In this paper, we present a novel multimodal recognition system that trains a Deep Learning Network to automatically learn features after extracting multiple biometric modalities from a single data source, i.e., facial video clips. Utilizing different modalities, i.e., left ear, left profile face, frontal face, right profile face, and right ear, present in the facial video clips, we train supervised denoising auto-encoders to automatically extract robust and non-redundant features. The automatically learned features are then used to train modality specific sparse classifiers to perform the multimodal recognition. Experiments conducted on the constrained facial video dataset (WVU) and the unconstrained facial video dataset (HONDA/UCSD) resulted in rank-1 recognition rates of 99.17% and 97.14%, respectively. The multimodal recognition accuracy demonstrates the superiority and robustness of the proposed approach irrespective of the illumination, non-planar movement, and pose variations present in the video clips.

KEYWORDS

Multimodal Biometrics, Auto-encoder, Deep Learning, Sparse Classification.
1. INTRODUCTION

There are several motivations for building robust multimodal biometric systems that extract multiple modalities from a single source of biometrics, i.e., facial video clips. Firstly, acquiring video clips of facial data is straightforward using conventional video cameras, which are ubiquitous. Secondly, the nature of data collection is non-intrusive, and the ear, frontal face, and profile face can appear in the same video. The proposed system, shown in Figure 1, consists of three distinct components that perform the task of efficient multimodal recognition from facial video clips. First, the object detection technique proposed by Viola and Jones [1] is adopted for the automatic detection of modality specific regions from the video frames. Unconstrained facial video clips contain significant head pose variations due to non-planar movements and sudden changes in facial expressions. This results in an uneven number of detected modality specific video frames for the same subject in different video clips, and also a different number of modality specific images for different subjects. From the aspect of building a robust and accurate model, it is always preferable to use the entire available training data. However, classification through sparse representation (SRC) is vulnerable in the presence of an uneven number of modality specific training samples for different subjects. Thus, to overcome the vulnerability of SRC whilst using all of the detected modality specific regions, in the model building phase we train supervised
denoising sparse auto-encoders to construct a mapping function. This mapping function is used to automatically extract the discriminative features, preserving robustness to the possible variances, using the uneven number of detected modality specific regions. Therefore, applying the Deep Learning Network as the second component in the pipeline results in an equal number of training sample features for the different subjects. Finally, using the modality specific recognition results, score level multimodal fusion is performed to obtain the multimodal recognition result.

Figure 1. System Block Diagram: Multimodal Biometrics Recognition from Facial Video

Due to the unavailability of proper datasets for multimodal recognition studies [2], virtual multimodal databases are often synthetically obtained by pairing modalities of different subjects from different databases. To the best of our knowledge, the proposed approach is the first study where multiple modalities are extracted from a single data source that belongs to the same subject. The main contribution of the proposed approach is the application of training a Deep Learning Network for automatic feature learning in multimodal biometrics recognition using a single source of biometrics, i.e., facial video data, irrespective of the illumination, non-planar movement, and pose variations present in the face video clips.

The remainder of this paper is organized as follows: Section 2 details the modality specific frame detection from the constrained and unconstrained facial video clips. Section 3 describes the automatic feature learning using supervised denoising sparse auto-encoders (deep learning). Section 4 presents the modality specific classification using sparse representation and multimodal fusion. Section 5 provides the experimental results on the constrained facial video dataset (WVU [3]) and the unconstrained facial video dataset (HONDA/UCSD [4]) to demonstrate the performance of the proposed framework. Finally, conclusions and future research directions are presented in Section 6.
3. Signal & Image Processing : An International Journal (SIPIJ) Vol.8, No.1, February 2017
3
2. MODALITY SPECIFIC IMAGE FRAME DETECTION
To perform multimodal biometric recognition, we first need to detect the images of the different
modalities from the facial video. The facial video clips in the constrained dataset are collected in
a controlled environment, where the camera rotates around the subject's head. The video
sequences start with the left profile of each subject (0 degrees) and proceed to the right profile
(180 degrees). Each of these video sequences contains image frames of different modalities, e.g.,
left ear, left profile face, frontal face, right profile face, and right ear, respectively. The video
sequences in the unconstrained dataset contain uncontrolled and non-uniform head rotations and
changing facial expressions. Thus, the appearance of a specific modality in a certain frame of the
unconstrained video clip is random compared with the constrained video clips.
To automate the detection of the modality specific image frames, we adopt the Adaboost object detection technique proposed by Viola and Jones [1]. The algorithm is trained to detect frontal and profile faces in the video frames using manually cropped frontal face images from the color FERET database and profile face images from the University of Notre Dame Collection J2 database. It is also trained using cropped ear images from the UND color ear database to detect ears in the video frames. These modality specific trained detectors are then applied to the entire video sequence to detect the face and ear regions in the video frames.
Before the detected modality specific regions from the video frames are used for feature extraction, some pre-processing steps are performed. The facial video clips recorded in the unconstrained environment contain illumination variations and low contrast, so histogram equalization is performed to enhance the contrast of the images. Finally, all detected modality specific regions from the facial video clips are resized: ear images to 110 X 70 pixels and face images (frontal and profile) to 128 X 128 pixels.
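The pre-processing above can be sketched in plain NumPy (a minimal sketch: the equalization and nearest-neighbour resize are standard routines, while the trained Viola-Jones detectors that supply the cropped regions are omitted; only the target sizes 110 X 70 and 128 X 128 come from the text):

```python
import numpy as np

def equalize_histogram(img):
    """Histogram equalization for an 8-bit grayscale image."""
    hist = np.bincount(img.ravel(), minlength=256)
    cdf = hist.cumsum()
    cdf_min = cdf[cdf > 0][0]
    if cdf[-1] == cdf_min:        # constant image: nothing to equalize
        return img
    # Map each grey level through the normalized cumulative histogram.
    lut = np.round((cdf - cdf_min) / (cdf[-1] - cdf_min) * 255).astype(np.uint8)
    return lut[img]

def resize_nearest(img, out_h, out_w):
    """Nearest-neighbour resize (stand-in for any standard resize routine)."""
    in_h, in_w = img.shape
    rows = np.arange(out_h) * in_h // out_h
    cols = np.arange(out_w) * in_w // out_w
    return img[rows][:, cols]

def preprocess(region, modality):
    """Equalize contrast, then resize to the modality specific size from the paper."""
    region = equalize_histogram(region)
    if modality == "ear":
        return resize_nearest(region, 110, 70)
    return resize_nearest(region, 128, 128)   # frontal or profile face
```

In the full pipeline each detected region would pass through `preprocess` before the Gabor feature extraction of Section 3.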
3. AUTOMATIC FEATURE LEARNING USING DEEP NEURAL NETWORK
Even though the modality specific sparse classifiers result in relatively high recognition accuracy on the constrained face video clips, the accuracy suffers in the case of unconstrained video because the sparse classifier is vulnerable to bias in the number of training images from different subjects. For example, subjects in the HONDA/UCSD dataset [4] randomly change their head pose. This results in a non-uniform number of detected modality specific video frames across different video clips, which is not ideal for classification through sparse representation.
In the subsequent sections, we first describe the Gabor feature extraction technique. Then, we describe the supervised denoising sparse auto-encoders, which we use to automatically learn an equal number of feature vectors for each subject from the uneven number of modality specific detected regions.
3.1. Feature Extraction
2D Gabor filters [5] are used in a broad range of applications to extract scale and rotation invariant feature vectors. In our feature extraction step, uniform down-sampled Gabor wavelets are computed for the detected regions:
$$\psi_{\mu,\nu}(z) = \frac{\|k_{\mu,\nu}\|^{2}}{s^{2}}\; e^{-\frac{\|k_{\mu,\nu}\|^{2}\|z\|^{2}}{2s^{2}}} \left[ e^{\,i\,k_{\mu,\nu}\cdot z} - e^{-\frac{s^{2}}{2}} \right], \qquad (1)$$
where $z = (x, y)$ represents each pixel in the 2D image, $k_{\mu,\nu}$ is the wave vector, which can be defined as $k_{\mu,\nu} = k_\nu e^{i\phi_\mu}$, $k_\nu = k_{\max}/f^{\nu}$, $k_{\max}$ is the maximum frequency, $f$ is the spacing factor between kernels in the frequency domain, $\phi_\mu = \pi\mu/8$, and the value of $s$ determines the ratio of the Gaussian window width to the wavelength. Using Equation 1, Gabor kernels can be generated from one filter using different scaling and rotation factors. In this paper, we used five scales, $\nu \in \{0, \ldots, 4\}$, and eight orientations, $\mu \in \{0, \ldots, 7\}$. The other parameter values used are $s = 2\pi$, $k_{\max} = \pi/2$, and $f = \sqrt{2}$.
Before computing the Gabor features, all detected ear regions are resized to the average size of all
the ear images, i.e., 110 X 70 pixels, and all face images (frontal and profile) are resized to the
average size of all the face images, i.e., 128 X 128 pixels. Gabor features are computed by
convolving each Gabor wavelet with the detected 2D region, as follows:
$$C_{\mu,\nu}(z) = T(z) * \psi_{\mu,\nu}(z), \qquad (2)$$

where $T(z)$ is the detected 2D region, and $z = (x, y)$ represents the pixel location. The feature vector is constructed from $C_{\mu,\nu}$ by concatenating its rows.
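Equations 1 and 2 can be sketched in NumPy as follows. The parameter values ($s = 2\pi$, $k_{\max} = \pi/2$, $f = \sqrt{2}$, five scales, eight orientations) come from the text; the kernel size, the down-sampling step, and the use of response magnitudes are illustrative assumptions:

```python
import numpy as np

S = 2 * np.pi          # Gaussian width / wavelength ratio (from the paper)
K_MAX = np.pi / 2      # maximum frequency (from the paper)
F = np.sqrt(2)         # spacing factor between kernels (from the paper)

def gabor_kernel(mu, nu, size=31):
    """Gabor wavelet of Equation 1 for orientation mu and scale nu."""
    k = K_MAX / F**nu
    phi = np.pi * mu / 8
    half = size // 2
    y, x = np.mgrid[-half:half + 1, -half:half + 1]
    norm_z2 = x**2 + y**2
    envelope = (k**2 / S**2) * np.exp(-k**2 * norm_z2 / (2 * S**2))
    # Oscillatory part minus the DC-compensation term.
    wave = np.exp(1j * k * (x * np.cos(phi) + y * np.sin(phi))) - np.exp(-S**2 / 2)
    return envelope * wave

def convolve_fft(img, kernel):
    """Convolution of Equation 2, computed in the frequency domain."""
    K = np.fft.fft2(kernel, s=img.shape)
    return np.fft.ifft2(np.fft.fft2(img) * K)

def gabor_features(img, scales=5, orientations=8, step=4):
    """Uniformly down-sampled magnitude responses, concatenated into one vector."""
    feats = []
    for nu in range(scales):
        for mu in range(orientations):
            resp = convolve_fft(img, gabor_kernel(mu, nu))
            feats.append(np.abs(resp)[::step, ::step].ravel())
    return np.concatenate(feats)
```

With the five scales and eight orientations above, a detected region yields 40 down-sampled responses concatenated into a single feature vector.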
3.2. Supervised Stacked Denoising Auto-encoder
The application of neural networks to supervised learning [6] is well proven in different applications, including computer vision and speech recognition. An auto-encoder neural network is an unsupervised learning algorithm, one of the commonly used building blocks in deep neural networks, which applies backpropagation with the target values set equal to the inputs. The reconstruction error between the input and the output of the network is used to adjust the weights of each layer. An auto-encoder tries to learn a function $\hat{x} \approx x$, where $x$ belongs to an unlabelled training set $\{x^{(1)}, x^{(2)}, x^{(3)}, \ldots, x^{(m)}\}$ and $x \in \mathbb{R}^{n}$. In other words, it tries to learn an approximation to the identity function, producing an output $\hat{x}$ that is similar to $x$, in two subsequent stages: (i) an encoder that maps the input $x$ to the hidden nodes through a deterministic mapping function $f$: $h = f(x)$, and (ii) a decoder that maps the hidden nodes back to the original input space through another deterministic mapping function $g$: $\hat{x} = g(h)$. For real-valued input, the parameters of the encoder and decoder can be learned by minimizing the reconstruction error $\|x - g(f(x))\|^{2}$.
To learn features which are robust to illumination, viewing angle, pose, etc., from the modality specific image regions, we adopted the supervised auto-encoder [7]. The supervised auto-encoder is trained using features extracted from image regions ($\hat{x}_i$) containing variations in illumination, viewing angle, and pose, whereas the features of selected image regions ($x_i$) with similar illumination and without pose variations are utilized as the target. By minimizing the objective criterion given in Equation 3 (subject to the constraint that the modality specific features of the same person are similar), the supervised auto-encoders learn to capture a modality specific robust representation.

$$\min_{W, b_e, b_d} \frac{1}{N} \sum_{i=1}^{N} \left( \|x_i - g(f(\hat{x}_i))\|_2^2 + \lambda \|f(x_i) - f(\hat{x}_i)\|_2^2 \right), \qquad (3)$$

where the output of the hidden layer is $h = f(x) = \tanh(Wx + b_e)$, the decoder is $g(h) = \tanh(W^{T}h + b_d)$, $N$ is the total number of training samples, and $\lambda$ is the weight preservation term. The first term in Equation 3 minimizes the reconstruction error, i.e., after passing through the encoder and the decoder, the variations (illumination, viewing angle, and pose) of the features extracted from the unconstrained images will be repaired. The second term in Equation 3 enforces the similarity of modality specific features corresponding to the same person.
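Equation 3 translates directly into code. The sketch below evaluates the objective in NumPy for a single tied-weight layer, as in the text; the dimensions and data are illustrative:

```python
import numpy as np

def sdae_loss(W, b_e, b_d, X_target, X_var, lam=1.0):
    """Objective of Equation 3.

    X_target : (n, N) clean, well-lit target features x_i (one per column)
    X_var    : (n, N) features with illumination/pose variations (x_i-hat)
    b_e, b_d : (m, 1) and (n, 1) encoder / decoder biases
    """
    f = lambda x: np.tanh(W @ x + b_e)        # encoder
    g = lambda h: np.tanh(W.T @ h + b_d)      # decoder (tied weights)
    recon = np.sum((X_target - g(f(X_var)))**2, axis=0)       # first term
    preserve = np.sum((f(X_target) - f(X_var))**2, axis=0)    # second term
    return np.mean(recon + lam * preserve)
```

Minimizing this quantity over `W`, `b_e`, and `b_d` (by gradient descent) yields the robust modality specific representation described above.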
After training a stack of encoders, its highest-level output representation can be used as the input to a stand-alone supervised learning algorithm. A logistic regression (LR) layer was added on top of
the encoders as the final output layer, which enables the deep neural network to perform supervised learning. By performing gradient descent on a supervised cost function, the Supervised Stacked Denoising Auto-encoder (SDAE) automatically learns fine-tuned network weights. Thus, the parameters of the entire SDAE network are fine-tuned to minimize the error in predicting the supervised target (e.g., class labels).
3.3. Training the Deep Learning Network
We adopt a two stage training of the Deep Learning Network, which provides a better initialization to begin with and fine-tuned network weights that lead to a more accurate high-level representation of the dataset. The steps of the two stage Deep Learning Network training are as follows:
Step1. Stacked Denoising Autoencoders are used to train the initial network weights one layer at a
time in a greedy fashion using Deep Belief Network (DBN).
Step2. The initial weights of the Deep learning network are initialized using the learned
parameters from DBN.
Step3. Labelled training data are used as input, and their predicted classification labels obtained from the logistic regression layer, along with the initial weights, are used to apply back propagation on the SDAE.
Step4. Back propagation is applied to the network to optimize the objective function (given in Equation 5), which results in fine-tuning the weights and biases of the entire network.
Step5. Finally, the learned network weights and bias are used to extract image features to train the
sparse classifier.
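The five steps above can be sketched as follows. This is a simplified NumPy illustration, not the paper's Theano implementation: pretraining uses tied-weight auto-encoders with a linear decoder, and for brevity the supervised fine-tuning updates only the logistic regression layer, whereas the paper back-propagates through the whole SDAE; layer sizes and learning rates are illustrative.

```python
import numpy as np

rng = np.random.RandomState(0)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def pretrain_layer(X, n_hidden, epochs=100, lr=0.1):
    """Step 1: greedily train one tied-weight auto-encoder on representation X (n, N)."""
    n, N = X.shape
    W = 0.1 * rng.randn(n_hidden, n)
    b1 = np.zeros((n_hidden, 1))
    b2 = np.zeros((n, 1))
    for _ in range(epochs):
        H = np.tanh(W @ X + b1)              # encode
        Xr = W.T @ H + b2                    # linear decode
        D = 2.0 * (Xr - X) / N               # d(mean squared error) / d(Xr)
        dH = (W @ D) * (1.0 - H**2)          # backprop through tanh
        dW = dH @ X.T + (D @ H.T).T          # encoder + decoder (tied) contributions
        W -= lr * dW
        b1 -= lr * dH.sum(axis=1, keepdims=True)
        b2 -= lr * D.sum(axis=1, keepdims=True)
    return W, b1

def train_network(X, labels, layer_sizes=(16, 8), epochs=200, lr=0.5):
    # Steps 1-2: greedy pretraining; the learned weights initialize the deep network.
    encoders, rep = [], X
    for nh in layer_sizes:
        W, b = pretrain_layer(rep, nh)
        encoders.append((W, b))
        rep = np.tanh(W @ rep + b)
    # Steps 3-4: supervised fine-tuning of the logistic regression output layer.
    w, c, N = np.zeros((rep.shape[0], 1)), 0.0, X.shape[1]
    for _ in range(epochs):
        p = sigmoid(w.T @ rep + c)           # Equation 4
        w -= lr * (rep @ (p - labels).T / N) # gradient of Equation 5
        c -= lr * float(np.mean(p - labels))
    return encoders, (w, c)

def extract_features(X, encoders):
    """Step 5: the learned encoder weights produce features for the sparse classifier."""
    rep = X
    for W, b in encoders:
        rep = np.tanh(W @ rep + b)
    return rep
```

On toy two-class data this skeleton pretrains two encoder layers and fits the output layer well enough to separate the classes; the full pipeline would replace the toy data with the Gabor features of Section 3.1.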
Figure 2. Supervised Stacked Denoising Auto-encoder
The network is illustrated in Figure 2, which shows a two-category classification problem (there are two output values). The decoding part of the SDAE is removed, and the encoding part is retained to produce the initial features. In addition, the output layer of the whole network, also called the logistic regression layer, is added. The following sigmoid function is used as the activation function of the logistic regression layer:
$$h(x) = \frac{1}{1 + e^{-Wx - b}}, \qquad (4)$$
where $x$ is the output of the last encoding layer, i.e., the features pre-trained by the SDAE network. The output of the sigmoid function lies between 0 and 1, which denotes the classification result in the case of a two class classification problem. Therefore, we can use the errors between the predicted classification results and the true labels associated with the training data points to fine-tune the whole network weights. The cost function is defined as the following cross-entropy function:
$$Cost = -\frac{1}{N}\left[\sum_{i=1}^{N} l_i \log\big(h(x_i)\big) + (1 - l_i)\log\big(1 - h(x_i)\big)\right], \qquad (5)$$

where $l_i$ denotes the label of the sample $x_i$. By minimizing the cost function, we update the network weights.
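Equations 4 and 5 translate directly into code (a minimal sketch; the variable names are illustrative):

```python
import numpy as np

def sigmoid_output(x, W, b):
    """Equation 4: logistic regression activation on the top-layer features x."""
    return 1.0 / (1.0 + np.exp(-(W @ x + b)))

def cross_entropy_cost(predictions, labels):
    """Equation 5: mean cross-entropy between predictions h(x_i) and labels l_i."""
    p = np.clip(predictions, 1e-12, 1 - 1e-12)   # numerical safety near 0 and 1
    return -np.mean(labels * np.log(p) + (1 - labels) * np.log(1 - p))
```

Gradient descent on `cross_entropy_cost` with respect to the network parameters is what fine-tunes the SDAE weights.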
4. MODALITY SPECIFIC AND MULTIMODAL RECOGNITION
The modality specific sub-dictionaries ($d_i^j$) contain feature vectors generated by the Deep Learning Network using the modality specific training data of each individual subject, where $i$ represents the modality, $i \in \{1, 2, \ldots, 5\}$, and $j$ stands for the index of the training video sequence. Later, we concatenate the modality specific learned sub-dictionaries ($d_i^j$) of all the subjects in the dataset to obtain the modality specific (i.e., left ear, left profile face, frontal face, right profile face, and right ear) dictionary $D_i$, as follows:

$$D_i = \big[d_i^{1};\, d_i^{2};\, \ldots;\, d_i^{n}\big], \quad \forall i \in \{1, 2, \ldots, 5\}, \qquad (6)$$

where $n$ is the number of subjects.
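The modality specific classification through sparse representation can be sketched as follows: a test feature vector is coded over the dictionary $D_i$ with an L1-regularized solver, and the subject whose atoms yield the smallest class-wise reconstruction residual wins. The solver choice (iterative soft-thresholding, ISTA) and parameters are illustrative assumptions; the paper does not specify them:

```python
import numpy as np

def soft_threshold(v, t):
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def ista(D, x, lam=0.05, iters=200):
    """Solve min_a 0.5*||x - D a||^2 + lam*||a||_1 by iterative soft-thresholding."""
    step = 1.0 / np.linalg.norm(D, 2)**2      # 1 / Lipschitz constant of the gradient
    a = np.zeros(D.shape[1])
    for _ in range(iters):
        a = soft_threshold(a - step * (D.T @ (D @ a - x)), lam * step)
    return a

def src_classify(D, atom_labels, x):
    """Sparse-representation classification: smallest class-wise residual wins."""
    a = ista(D, x)
    best, best_r = None, np.inf
    for c in np.unique(atom_labels):
        a_c = np.where(atom_labels == c, a, 0.0)   # keep class-c coefficients only
        r = np.linalg.norm(x - D @ a_c)
        if r < best_r:
            best, best_r = c, r
    return best
```

Here `atom_labels` maps each dictionary column of $D_i$ back to its subject, so the returned value is the predicted subject for that modality.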
4.1. Multimodal Recognition
The recognition results from the five modalities -- left ear, left profile face, frontal face, right
profile face, and right ear are combined using score level fusion. Score level fusion has the
flexibility of fusing various modalities upon their availability. To prepare for fusion, the matching scores obtained from the different matchers are transformed into a common domain using a score normalization technique; we have adopted the tanh score normalization technique [8], which is both robust and efficient. The normalized match scores are then fused using the weighted sum technique:
$$S = \sum_{i=1}^{M} w_i \cdot s_i, \qquad (7)$$

where $w_i$ and $s_i$ are the weight and normalized match score of the $i$-th modality specific classifier, respectively, such that $\sum_{i=1}^{M} w_i = 1$. In this study, the weights $w_i$, $i = 1, 2, 3, 4, 5$, correspond to the left ear, left profile face, frontal face, right profile face, and right ear modalities, respectively.
These weights can be obtained by exhaustive search or based on the individual performance of
the classifiers [8]. In this work, the weights for the modality specific classifiers in the score level fusion were determined using a separate training set, with the goal of maximizing the fused multimodal recognition accuracy.
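A sketch of the normalization and fusion steps: the tanh rule below uses the plain mean and standard deviation of a reference score set, which is a common simplification of the Hampel-estimator-based scheme in [8], and the weights are illustrative:

```python
import numpy as np

def tanh_normalize(scores, ref_mean, ref_std):
    """Tanh score normalization [8]: maps raw matcher scores into (0, 1)."""
    return 0.5 * (np.tanh(0.01 * (scores - ref_mean) / ref_std) + 1.0)

def weighted_sum_fusion(normalized_scores, weights):
    """Equation 7: fused score S = sum_i w_i * s_i, with the weights summing to 1."""
    weights = np.asarray(weights, dtype=float)
    assert np.isclose(weights.sum(), 1.0), "fusion weights must sum to 1"
    return float(np.dot(weights, normalized_scores))
```

For example, five modality scores (left ear, left profile face, frontal face, right profile face, right ear) would each be tanh-normalized against their matcher's reference statistics and then combined with the learned weights.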
5. EXPERIMENTAL RESULTS
In this section, we describe the results of the modality specific and multimodal recognition experiments on both datasets. The feature vectors automatically learned using the trained Deep Learning network are of length 9600 for the frontal and profile faces and 4160 for the ears. In order to decrease the computational complexity and to find the most effective feature vector length to
maximize the recognition accuracy, the dimensionality of the feature vectors is reduced using Principal Component Analysis (PCA) [9]. Using PCA, the number of features is reduced to 500 and 1000. Table 1 shows the modality specific recognition accuracy obtained for the reduced feature vectors of length 500 and 1000. Feature vectors of length 1000 resulted in the best recognition accuracy for both modality specific and multimodal recognition.
Table 1. Modality Specific and Multimodal Rank-1 Recognition Accuracy.
Gabor Feature Length | Frontal face | Left profile face | Right profile face | Left ear | Right ear | Multimodal
No feature reduction | 91.43%       | 71.43%            | 71.43%             | 85.71%   | 85.71%    | 88.57%
1000                 | 91.43%       | 71.43%            | 74.29%             | 88.57%   | 88.57%    | 97.14%
500                  | 88.57%       | 68.57%            | 68.57%             | 85.71%   | 82.86%    | 91.42%
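The PCA reduction step [9] can be sketched in NumPy via the singular value decomposition (the dimensions below are illustrative; the text reduces 9600- and 4160-dimensional vectors to 500 or 1000):

```python
import numpy as np

def pca_fit(X, k):
    """Learn a k-dimensional PCA projection from rows-as-samples data X (N, d)."""
    mean = X.mean(axis=0)
    # Right singular vectors of the centred data are the principal directions,
    # already ordered by decreasing explained variance.
    _, _, Vt = np.linalg.svd(X - mean, full_matrices=False)
    return mean, Vt[:k]

def pca_transform(X, mean, components):
    """Project samples onto the learned principal directions."""
    return (X - mean) @ components.T
```

The projection is fitted once on the training features and then applied to every learned feature vector before classification.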
The best rank-1 recognition rates, using the ear, frontal face, and profile face modalities for multimodal recognition, compared with the results reported in [10-12], are shown in Table 2.
Table 2. Comparison of 2D multimodal (frontal face, profile face and ear) rank-1 recognition accuracy
with the state-of-the-art techniques
Approaches         | Modalities                    | Fusion Performed In | Best Reported Rank-1 Accuracy
Kisku et al. [11]  | Ear and Frontal Face          | Decision Level      | Ear: 93.53%; Frontal Face: 91.96%; Profile Face: NA; Fusion: 95.53%
Pan et al. [12]    | Ear and Profile Face          | Feature Level       | Ear: 91.77%; Frontal Face: NA; Profile Face: 93.46%; Fusion: 96.84%
Boodoo et al. [10] | Ear and Frontal Face          | Decision Level      | Ear: 90.7%; Frontal Face: 94.7%; Profile Face: NA; Fusion: 96%
This Work          | Ear, Frontal and Profile Face | Score Level         | Ear: 95.04%; Frontal Face: 97.52%; Profile Face: 93.39%; Fusion: 99.17%
5.1. Parameter selection for the Deep Neural Network
We have tested the performance of the proposed multimodal recognition framework against different parameters of the Deep Neural Network. We varied the number of hidden layers from three to seven and achieved the best performance with five hidden layers. The pre-training learning rate of the DBN is set to 0.001 and the fine-tuning learning rate of the SDAE is set to 0.1 to achieve the optimal performance. The number of nodes in the input layer of the SDAE is equal to the size of the modality specific detected image, i.e., 7700 nodes for the left and right ears (ear image size: 110 X 70 pixels) and 16384 nodes for the frontal and profile face images (face image size: 128 X 128 pixels). When training the SDAE network on a Windows® PC with a Core i7-2600K CPU clocked at 3.40 GHz, using the Theano library (Python), pre-training of the DBN takes approximately 600 minutes and fine-tuning of the SDAE network converges within 48 epochs in 560.2 minutes.
6. CONCLUSIONS
We proposed a system for multimodal recognition using a single biometrics data source, i.e.,
facial video clips. Using the Adaboost detector, we automatically detect modality specific
regions. We use Gabor features extracted from the detected regions to automatically learn robust
and non-redundant features by training a Supervised Stacked Denoising Auto-encoder (Deep
Learning) network. Classification through sparse representation is used for each modality. Then,
the multimodal recognition is obtained through the fusion of the results from the modality
specific recognition.
REFERENCES
[1] Viola, P. & Jones, M. (2001) "Rapid object detection using a boosted cascade of simple features.", Computer Vision and Pattern Recognition, pp 511-518.
[2] Huang, Zengxi & Liu, Yiguang & Li, Chunguang & Yang, Menglong & Chen, Liping (2013) "A robust face and ear based multimodal biometric system using sparse representation.", Pattern Recognition, pp 2156-2168.
[3] Fahmy, Gamal & El-Sherbeeny, Ahmed & M., Susmita & Abdel-Mottaleb, Mohamed & Ammar, Hany (2006) "The effect of lighting direction/condition on the performance of face recognition algorithms.", SPIE Conference on Biometrics for Human Identification, pp 188-200.
[4] Lee, K. C. & Ho, J. & Yang, M. H. & Kriegman, D. (2005) "Visual Tracking and Recognition Using Probabilistic Appearance Manifolds.", Computer Vision and Image Understanding.
[5] Liu, Chengjun & Wechsler, H. (2002) "Gabor feature based classification using the enhanced Fisher linear discriminant model for face recognition.", IEEE Transactions on Image Processing, pp 467-476.
[6] Rumelhart, David E. & McClelland, James L. (1986) "Parallel Distributed Processing: Explorations in the Microstructure of Cognition.", MIT Press, Cambridge, MA, USA.
[7] Gao, Shenghua & Zhang, Yuting & Jia, Kui & Lu, Jiwen & Zhang, Yingying (2015) "Single Sample Face Recognition via Learning Deep Supervised Autoencoders.", IEEE Transactions on Information Forensics and Security, pp 2108-2118.
[8] Ross, A. A. & Nandakumar, K. & Jain, A. K. (2006) "Handbook of Multibiometrics.", Springer.
[9] Turk, Matthew & Pentland, Alex (1991) "Eigenfaces for recognition.", J. Cognitive Neuroscience, MIT Press, pp 71-86.
[10] Boodoo, Nazmeen Bibi & Subramanian, R. K. (2009) "Robust Multi biometric Recognition Using Face and Ear Images.", J. CoRR.
[11] Kisku, Dakshina Ranjan & Sing, Jamuna Kanta & Gupta, Phalguni (2010) "Multibiometrics Belief Fusion.", J. CoRR.
[12] Pan, Xiuqin & Cao, Yongcun & Xu, Xiaona & Lu, Yong & Zha, Yue (2008) "Ear and face based multimodal recognition based on KFDA.", International Conference on Audio, Language and Image Processing, pp 965-969.
AUTHORS
Sayan Maity received the M.E. in Electronics and Telecommunication Engineering
from Jadavpur University, India in 2011. He is currently working towards his PhD
degree in Industrial Engineering at the University of Miami. His research interests
include biometrics, computer vision and pattern recognition.
Mohamed Abdel-Mottaleb received the Ph.D. degree in computer science from the
University of Maryland, College Park, in 1993. From 1993 to 2000, he was with Philips
Research, Briarcliff Manor, NY, where he was a Principal Member of the Research Staff
and a Project Leader. At Philips Research, he led several projects in image processing
and content-based multimedia retrieval. He joined the University of Miami in 2001. He
is currently a Professor and a Chairman of the Department of Electrical and Computer
Engineering. He represented Philips in the standardization activity of ISO for MPEG-7,
where some of his work was included in the standard. He holds 22 U.S. patents and over 30 international
patents. He has authored over 120 journal and conference papers in the areas of image processing,
computer vision, and content-based retrieval. His research focuses on 3-D face and ear biometrics, dental
biometrics, visual tracking, and human activity recognition. He is an Editorial Board Member of the Pattern
Recognition journal.
Dr. Shihab Asfour received his Ph.D. in Industrial Engineering from Texas Tech University, Lubbock, Texas. He has served as the Associate Dean of the College of Engineering at the University of Miami since August 15, 2007, and as the Chairman of the Department of Industrial Engineering at the University of Miami since June 1, 1999. In addition to his appointment in Industrial Engineering, he holds the position of Professor in the Biomedical Engineering and the Orthopedics and Rehabilitation Departments. Dr.
Asfour has published over 250 articles in national and international journals, proceedings and books. His
publications have appeared in the Ergonomics, Human Factors, Spine, and IIE journals. He has also edited
a two-volume book titled Trends in Ergonomics/Human Factors IV published by Elsevier Science
Publishers in 1987, and the book titled Computer Aided Ergonomics, published by Taylor and Francis,
1990.