Face recognition is used in a wide range of applications and has, in recent years, become one of the most successful applications of image analysis and understanding. Different research groups, using different statistical methods, have reported contradictory results when comparing the principal component analysis (PCA), independent component analysis (ICA), and linear discriminant analysis (LDA) algorithms proposed in recent years. The goal of this paper is to compare and analyze the three algorithms and conclude which performs best. The FERET dataset is used for consistency.
Multimodal Biometrics Recognition by Dimensionality Diminution Method (IJERA Editor)
A multimodal biometric system utilizes two or more biometric modalities, e.g., face, ear, fingerprint, signature, and palmprint, to improve the recognition accuracy of conventional unimodal methods. We propose a new dimensionality reduction method called Dimension Diminish Projection (DDP) in this paper. DDP can not only preserve local information by capturing the intra-modal geometry, but also effectively extract between-class structures relevant for classification. Experimental results show that our proposed method performs better than other algorithms, including PCA, LDA, and MFA.
Reduct generation for the incremental data using rough set theory (csandit)
In today's changing world, a huge amount of data is generated and transferred frequently. Although the data is sometimes static, most commonly it is dynamic and transactional, and newly generated data is constantly added to the old/existing data. To discover knowledge from this incremental data, one approach is to run the algorithm repeatedly on the modified data sets, which is time consuming. The paper proposes a dimension reduction algorithm that can be applied in a dynamic environment for generation of a reduced attribute set as a dynamic reduct. The method analyzes the new dataset, when it becomes available, and modifies the reduct accordingly to fit the entire dataset. The concepts of discernibility relation, attribute dependency, and attribute significance from Rough Set Theory are integrated for the generation of the dynamic reduct set, which not only reduces the complexity but also helps to achieve higher accuracy of the decision system. The proposed method has been applied on a few benchmark datasets collected from the UCI repository, and a dynamic reduct is computed. Experimental results show the efficiency of the proposed method.
Content Based Image Retrieval Using Gray Level Co-Occurance Matrix with SVD a... (ijcisjournal)
In this paper, the gray level co-occurrence matrix (GLCM), the GLCM with singular value decomposition (SVD), and the local binary pattern (LBP) are presented for content based image retrieval. Based upon the feature vector parameters of energy, contrast, and entropy, and distance metrics such as the Euclidean, Canberra, and Manhattan distances, the retrieval efficiency, precision, and recall of the images are calculated. The retrieval results of the proposed method are tested on the Corel-1k database. The investigated results show a significant improvement in average retrieval rate, average retrieval precision, and recall for the different algorithms (GLCM, GLCM & SVD, LBP with radius one, and LBP with radius two) based on different distance metrics.
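As an illustration of the texture features this abstract relies on, here is a minimal pure-Python sketch (not the paper's implementation) that builds a normalized co-occurrence matrix for a single horizontal offset and derives the energy, contrast, and entropy descriptors from it. The tiny 4x4 image is invented for the example.

```python
import math

def glcm(img, levels, dx=1, dy=0):
    """Normalized gray-level co-occurrence matrix for one (dx, dy) offset."""
    m = [[0.0] * levels for _ in range(levels)]
    h, w = len(img), len(img[0])
    count = 0
    for y in range(h):
        for x in range(w):
            y2, x2 = y + dy, x + dx
            if 0 <= y2 < h and 0 <= x2 < w:
                m[img[y][x]][img[y2][x2]] += 1
                count += 1
    return [[v / count for v in row] for row in m]

def texture_features(m):
    # Energy, contrast, and entropy as used in GLCM-based retrieval.
    energy = sum(p * p for row in m for p in row)
    contrast = sum((i - j) ** 2 * p
                   for i, row in enumerate(m) for j, p in enumerate(row))
    entropy = -sum(p * math.log2(p) for row in m for p in row if p > 0)
    return energy, contrast, entropy

img = [[0, 0, 1, 1],
       [0, 0, 1, 1],
       [0, 2, 2, 2],
       [2, 2, 3, 3]]
m = glcm(img, levels=4)
energy, contrast, entropy = texture_features(m)
```

A feature vector per image would concatenate such descriptors over several offsets; the retrieval step then compares vectors with a distance metric such as the Euclidean or Canberra distance.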
A New Method Based on MDA to Enhance the Face Recognition Performance (CSCJournals)
A novel tensor-based method is proposed to solve the supervised dimensionality reduction problem. In this paper, multilinear principal component analysis (MPCA) is used to reduce the tensor object dimension, and then multilinear discriminant analysis (MDA) is applied to find the best subspaces. Because the number of possible subspace dimensions for any kind of tensor object is extremely high, testing all of them to find the best one is not feasible, so this paper also presents a method to solve that problem. The main criterion of the algorithm differs from Sequential Mode Truncation (SMT); full projection is used to initialize the iterative solution and find the best dimension for MDA. This saves the extra time that would otherwise be spent finding the best dimension, so the execution time decreases substantially. It should be noted that both algorithms work with tensor objects of the same order, so the structure of the objects is never broken, and the performance of the method improves accordingly. The advantage of these algorithms is avoiding the curse of dimensionality and achieving better performance with small sample sizes. Finally, experiments on the ORL and CMU-PIE databases are provided.
Supervised learning: discover patterns in the data that relate data attributes to a target (class) attribute. These patterns are then used to predict the values of the target attribute in future data instances.
Unsupervised learning: the data have no target attribute. We explore the data to find intrinsic structures in them.
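The contrast between the two settings can be sketched in a few lines of Python; the toy 1-D data, labels, and helper names below are invented for illustration.

```python
# Supervised: labels guide the model. Here, a 1-nearest-neighbour classifier
# predicts the class of a new point from the labelled training data.
def predict_1nn(train, labels, x):
    dists = [abs(t - x) for t in train]
    return labels[dists.index(min(dists))]

train, labels = [1.0, 1.2, 8.0, 8.3], ["low", "low", "high", "high"]
pred = predict_1nn(train, labels, 1.1)

# Unsupervised: no labels. A simple 2-means pass discovers the same two
# groups purely from the structure of the data.
def two_means(data, iters=10):
    c1, c2 = min(data), max(data)
    for _ in range(iters):
        g1 = [x for x in data if abs(x - c1) <= abs(x - c2)]
        g2 = [x for x in data if abs(x - c1) > abs(x - c2)]
        c1, c2 = sum(g1) / len(g1), sum(g2) / len(g2)
    return sorted((c1, c2))

centers = two_means([1.0, 1.2, 8.0, 8.3])
```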
Automatic Unsupervised Data Classification Using Jaya Evolutionary Algorithm (aciijournal)
In this paper we attempt to solve an automatic clustering problem by concurrently optimizing multiple objectives, such as automatic determination of k and a set of cluster validity indices (CVIs). The proposed automatic clustering technique uses the recent Jaya optimization algorithm as its underlying optimization strategy. This evolutionary technique aims to attain a global best solution rather than a local best solution on larger datasets. The exploration and exploitation imposed on the proposed work detect the number of clusters automatically, find the appropriate partitioning present in the data sets, and reach near-optimal values on the CVI frontiers. Twelve datasets of different complexity are used to validate the performance of the proposed algorithm. The experiments show that the theoretical advantages of multi-objective clustering optimized with evolutionary approaches translate into realistic and scalable performance gains.
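For readers unfamiliar with Jaya, its characteristic update rule moves every candidate toward the current best solution and away from the current worst: x' = x + r1*(best - |x|) - r2*(worst - |x|). A minimal sketch minimizing a simple sphere function (not the paper's multi-objective clustering setup) might look like:

```python
import random

def jaya(fitness, dim, n_pop=20, iters=200, lo=-5.0, hi=5.0, seed=0):
    """Minimise `fitness` with the Jaya update rule and greedy acceptance."""
    rng = random.Random(seed)
    pop = [[rng.uniform(lo, hi) for _ in range(dim)] for _ in range(n_pop)]
    for _ in range(iters):
        scores = [fitness(x) for x in pop]
        best = pop[scores.index(min(scores))]
        worst = pop[scores.index(max(scores))]
        for i, x in enumerate(pop):
            # Move toward the best and away from the worst solution.
            cand = [xj + rng.random() * (bj - abs(xj))
                       - rng.random() * (wj - abs(xj))
                    for xj, bj, wj in zip(x, best, worst)]
            cand = [min(hi, max(lo, v)) for v in cand]  # keep within bounds
            if fitness(cand) < fitness(x):              # greedy acceptance
                pop[i] = cand
    return min(pop, key=fitness)

# Sphere function: the global minimum is 0 at the origin.
sol = jaya(lambda x: sum(v * v for v in x), dim=2)
```

A notable design property of Jaya is that, unlike most evolutionary methods, it has no algorithm-specific tuning parameters beyond population size and iteration count.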
Data Mining: Concepts and Techniques — Chapter 2 (Salah Amean)
The presentation contains the following:
-Data Objects and Attribute Types.
-Basic Statistical Descriptions of Data.
-Data Visualization.
-Measuring Data Similarity and Dissimilarity.
-Summary.
Graph Theory Based Approach For Image Segmentation Using Wavelet Transform (CSCJournals)
This paper presents an image segmentation approach based on graph theory and thresholding. Amongst the various segmentation approaches, graph theoretic approaches make the formulation of the problem more flexible and the computation more efficient. The problem is modeled as partitioning a graph into several sub-graphs, such that each of them represents a meaningful region in the image. The segmentation problem is then solved in a spatially discrete space by well-established tools from graph theory. After the literature review, the problem is formulated in terms of a graph representation of the image and a threshold function. The boundaries between the regions are determined as per the segmentation criteria, and the segmented regions are labeled with random colors. In the presented approach, the image is preprocessed by a discrete wavelet transform and a coherence filter before graph segmentation. The experiments are carried out on a number of natural images taken from the Berkeley Image Database as well as synthetic images from online resources, using the Haar, DB2, DB4, DB6, and DB8 wavelets. The results are evaluated and compared using performance evaluation parameters such as execution time, Performance Ratio, Peak Signal to Noise Ratio, Precision, and Recall, and the obtained results are encouraging.
In the present day, a huge amount of data is generated every minute and transferred frequently. Although the data is sometimes static, most commonly it is dynamic and transactional, and newly generated data is constantly added to the old/existing data. To discover knowledge from this incremental data, one approach is to run the algorithm repeatedly on the modified data sets, which is time consuming. Further, to analyze the datasets properly, construction of an efficient classifier model is necessary; the objective of developing such a classifier is to classify unlabeled datasets into appropriate classes. The paper proposes a dimension reduction algorithm that can be applied in a dynamic environment for generation of a reduced attribute set as a dynamic reduct, and an optimization algorithm which uses the reduct to build up the corresponding classification system. The method analyzes the new dataset, when it becomes available, modifies the reduct accordingly to fit the entire dataset, and from the entire dataset generates interesting optimal classification rule sets. The concepts of discernibility relation, attribute dependency, and attribute significance from Rough Set Theory are integrated for the generation of the dynamic reduct set, and optimal classification rules are selected using the PSO method, which not only reduces the complexity but also helps to achieve higher accuracy of the decision system. The proposed method has been applied on some benchmark datasets collected from the UCI repository; a dynamic reduct is computed, and from the reduct optimal classification rules are also generated. Experimental results show the efficiency of the proposed method.
A new model for iris data set classification based on linear support vector m... (IJECEIAES)
Data mining is known as the process of detecting patterns in large amounts of data, as part of knowledge discovery. Classification is a data analysis task that extracts a model describing important data classes. One of the outstanding classification methods in data mining is the support vector machine (SVM). It is capable of predicting results and is often more effective than other classification methods. The SVM is a well-known supervised machine learning technique that has been applied successfully to a variety of problems, including regression, classification, and clustering, in diverse domains such as gene expression and web text mining. In this study, we propose a new model for classifying the iris data set using an SVM classifier, with a genetic algorithm to optimize the C and gamma parameters of the linear SVM; in addition, the principal component analysis (PCA) algorithm is used for feature reduction.
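The GA-over-(C, gamma) idea can be sketched independently of any SVM library. In the sketch below, the cross-validated accuracy is replaced by a made-up stand-in function peaked at a hypothetical optimum (C=10, gamma=0.1); a real run would train and score an SVM on iris folds at each evaluation.

```python
import math
import random

rng = random.Random(1)

def cv_accuracy(c, gamma):
    # Stand-in for cross-validated SVM accuracy (invented for this sketch);
    # it peaks at the hypothetical optimum C=10, gamma=0.1.
    return 1.0 / (1.0 + (math.log10(c) - 1.0) ** 2
                      + (math.log10(gamma) + 1.0) ** 2)

def ga_tune(generations=40, pop_size=16):
    # Individuals are (log10 C, log10 gamma) pairs: searching in log space
    # is the usual choice for scale parameters like these.
    pop = [(rng.uniform(-2, 4), rng.uniform(-4, 2)) for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=lambda g: -cv_accuracy(10 ** g[0], 10 ** g[1]))
        parents = pop[: pop_size // 2]              # truncation selection
        children = []
        while len(children) < pop_size - len(parents):
            a, b = rng.sample(parents, 2)
            children.append(((a[0] + b[0]) / 2 + rng.gauss(0, 0.3),  # crossover
                             (a[1] + b[1]) / 2 + rng.gauss(0, 0.3))) # + mutation
        pop = parents + children
    best = max(pop, key=lambda g: cv_accuracy(10 ** g[0], 10 ** g[1]))
    return 10 ** best[0], 10 ** best[1]

c, gamma = ga_tune()
```

Keeping the parents each generation (elitism) guarantees the best-found pair never gets lost to mutation.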
Towards Semantic Clustering – A Brief Overview (CSCJournals)
Image clustering is an important technique that helps users manage the large amount of online visual information, especially after the rapid growth of the Web. This paper focuses on image clustering methods and their application in image collections and online image repositories. Current progress in image clustering related to image retrieval and image annotation is summarized, and some open problems are discussed. Related works are summarized based on the problems addressed: image segmentation, compact representation of image sets, search space reduction, and the semantic gap. Issues in current progress are also identified, and semantic clustering is conjectured to be the potential trend. Our framework of semantic clustering, as well as the main abstraction levels involved, is briefly discussed.
A HYBRID MODEL FOR MINING MULTI DIMENSIONAL DATA SETS (Editor IJCATR)
This paper presents a hybrid data mining approach based on supervised and unsupervised learning to identify the closest data patterns in the database. The technique achieves a maximal accuracy rate with minimal complexity. The proposed algorithm is compared with traditional clustering and classification algorithms and is also implemented with multidimensional datasets. The implementation results show better prediction accuracy and reliability.
Abstract: Face recognition is a form of computer vision that uses faces to identify a person or verify a person's claimed identity. In this paper, a neural-based algorithm is presented to detect frontal views of faces. The dimensionality of the input face image is reduced by principal component analysis, and classification is performed by a back-propagation neural network. The method is robust for a dataset of 300 face images and achieves a recognition rate of 80-90%.
IJERA (International journal of Engineering Research and Applications) is International online, ... peer reviewed journal. For more detail or submit your article, please visit www.ijera.com
COMPRESSION BASED FACE RECOGNITION USING DWT AND SVM (sipij)
Biometrics are used to identify a person effectively and are employed in almost all day-to-day applications. In this paper, we propose compression based face recognition using the Discrete Wavelet Transform (DWT) and a Support Vector Machine (SVM). The novel concept of converting many images of a single person into one image using an averaging technique is introduced to reduce execution time and memory. The DWT is applied on the averaged face image to obtain the approximation (LL) and detail bands. The LL band coefficients are given as input to the SVM to obtain support vectors (SVs). The LL coefficients of the DWT and the SVs are fused based on arithmetic addition to extract the final features. The Euclidean Distance (ED) is used to compare test image features with database image features to compute performance parameters. It is observed that the proposed algorithm performs better than existing algorithms.
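Two of the steps described, averaging several images of one person and extracting the LL band of a one-level 2D Haar DWT, can be sketched as follows. The SVM fusion stage is omitted, and the tiny 4x4 "images" are invented for the example.

```python
def average_faces(images):
    """Collapse several images of one person into a single mean image."""
    n = len(images)
    return [[sum(img[y][x] for img in images) / n
             for x in range(len(images[0][0]))]
            for y in range(len(images[0]))]

def haar_ll(img):
    """LL (approximation) band of a one-level 2D Haar DWT.

    With the orthonormal Haar filters, each output pixel is the sum of a
    2x2 block divided by 2 (i.e. (a+b)/sqrt(2) applied along each axis).
    """
    h, w = len(img), len(img[0])
    return [[(img[y][x] + img[y][x + 1] + img[y + 1][x] + img[y + 1][x + 1]) / 2.0
             for x in range(0, w, 2)] for y in range(0, h, 2)]

# Two 4x4 "face images" of the same person (invented values).
imgs = [[[10, 10, 20, 20]] * 4,
        [[30, 30, 40, 40]] * 4]
ll = haar_ll(average_faces(imgs))
```

Averaging before the transform is what gives the memory and execution-time saving: only one DWT is computed per person rather than one per image.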
International Journal of Engineering Research and Applications (IJERA) is an open access online peer reviewed international journal that publishes research and review articles in the fields of Computer Science, Neural Networks, Electrical Engineering, Software Engineering, Information Technology, Mechanical Engineering, Chemical Engineering, Plastic Engineering, Food Technology, Textile Engineering, Nano Technology & science, Power Electronics, Electronics & Communication Engineering, Computational mathematics, Image processing, Civil Engineering, Structural Engineering, Environmental Engineering, VLSI Testing & Low Power VLSI Design etc.
FACE RECOGNITION USING PRINCIPAL COMPONENT ANALYSIS WITH MEDIAN FOR NORMALIZA... (csandit)
Recognizing faces helps to name the various subjects present in an image. This work focuses on labeling faces in an image containing humans of various age groups (a heterogeneous set). Principal component analysis finds the mean of the data set and subtracts the mean value from the data set with the intention of normalizing the data. Normalization, with respect to an image, is the removal of common features from the data set. This work brings in the novel idea of deploying the median, another measure of central tendency, for normalization rather than the mean. The work was implemented using MATLAB. Results show that the median is the better measure for normalization for a heterogeneous data set, which gives rise to outliers.
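The centering step this abstract contrasts can be shown directly; the numbers below are invented, with one outlier standing in for the heterogeneous face data.

```python
import statistics

def center(data, stat):
    """Normalise a data set by subtracting a chosen measure of central tendency."""
    c = stat(data)
    return [x - c for x in data]

# A heterogeneous set with one outlier value.
values = [4.0, 5.0, 5.0, 6.0, 40.0]

by_mean = center(values, statistics.mean)      # mean = 12.0, dragged by the outlier
by_median = center(values, statistics.median)  # median = 5.0, robust to it
```

The mean-centered version shifts every typical value by a large amount because the single outlier drags the mean, while the median-centered version leaves the bulk of the data near zero, which is the intuition behind preferring the median on heterogeneous sets.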
QUANTUM CLUSTERING-BASED FEATURE SUBSET SELECTION FOR MAMMOGRAPHIC I... (ijcsit)
In this paper, we present an algorithm for feature selection, labeled QC-FS (Quantum Clustering for Feature Selection), which performs the selection in two steps. First, the original feature space is partitioned using the Quantum Clustering algorithm in order to group similar features. Then a representative of each cluster is selected using similarity measures such as the correlation coefficient (CC) and mutual information (MI); the feature that maximizes this information is chosen by the algorithm.
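The second step, choosing one representative per feature cluster by a similarity score, can be sketched with the correlation-coefficient variant. The clusters, feature values, and target below are invented; the quantum clustering step is assumed to have already produced the groups.

```python
import statistics

def pearson(a, b):
    """Pearson correlation coefficient between two equal-length sequences."""
    ma, mb = statistics.mean(a), statistics.mean(b)
    num = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    den = (sum((x - ma) ** 2 for x in a) * sum((y - mb) ** 2 for y in b)) ** 0.5
    return num / den

def pick_representatives(clusters, features, target):
    """For each cluster of feature names, keep the feature whose |CC| with
    the class label is largest (the MI variant would swap the score)."""
    return [max(cluster, key=lambda f: abs(pearson(features[f], target)))
            for cluster in clusters]

target = [0, 0, 1, 1, 1]
features = {
    "f1": [1.0, 2.0, 1.5, 2.1, 1.8],   # weakly related to the target
    "f2": [0.1, 0.2, 0.9, 1.0, 1.1],   # tracks the target closely
    "f3": [5.0, 4.0, 3.0, 2.0, 1.0],   # strong negative correlation
}
reps = pick_representatives([["f1", "f2"], ["f3"]], features, target)
```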
Fast and robust tracking of multiple faces is receiving increased attention from computer vision researchers, as it finds potential applications in many fields, such as video surveillance and computer-mediated video conferencing. Real-time tracking of multiple faces in high resolution videos involves three basic tasks, namely initialization, tracking, and display. Among these, tracking is quite compute intensive, as it involves particle filtering, which will not yield real-time performance on a conventional CPU-based system alone.
This paper presents a study of the efficiency and performance speedup achieved by applying Graphics Processing Units (GPUs) to face recognition. We explore one possibility of parallelizing and optimizing a well-known face recognition algorithm, Principal Component Analysis (PCA) with Eigenfaces. In recent years, GPUs have been the subject of extensive research, and their computation speed has been increasing rapidly.
A review on data mining techniques for Digital Mammographic Analysis (ijdmtaiir)
Medical data mining is the search for relationships and patterns within medical data that could provide useful knowledge for effective medical diagnosis. Using computational applications, prediction of disease will become more effective, and early detection of disease will aid in increased exposure to required patient care and improved cure rates. The review shows that the reasons for feature selection include improvement in prediction performance, reduction in computational requirements, reduction in data storage requirements, reduction in the cost of future measurements, and improvement in data or model understanding.
A Novel Approach to Mathematical Concepts in Data Mining (ijdmtaiir)
This paper describes three fundamental mathematical programming approaches that are relevant to data mining: feature selection, clustering, and robust representation. The paper covers two clustering algorithms, the k-mean algorithm and the k-median algorithm. Clustering is illustrated by the unsupervised learning of patterns and clusters that may exist in a given database and is a useful tool for Knowledge Discovery in Databases (KDD). The results of the k-median algorithm are used to identify blood cancer patients in a medical database. K-mean clustering is a data mining/machine learning algorithm used to cluster observations into groups of related observations without any prior knowledge of those relationships. The k-mean algorithm is one of the simplest clustering techniques and is commonly used in medical imaging, biometrics, and related fields.
Analysis of Classification Algorithm in Data Miningijdmtaiir
Data Mining is the extraction of hidden predictive
information from large database. Classification is the process
of finding a model that describes and distinguishes data classes
or concept. This paper performs the study of prediction of class
label using C4.5 and Naïve Bayesian algorithm.C4.5 generates
classifiers expressed as decision trees from a fixed set of
examples. The resulting tree is used to classify future samples
.The leaf nodes of the decision tree contain the class name
whereas a non-leaf node is a decision node. The decision node
is an attribute test with each branch (to another decision tree)
being a possible value of the attribute. C4.5 uses information
gain to help it decide which attribute goes into a decision node.
A Naïve Bayesian classifier is a simple probabilistic classifier
based on applying Baye’s theorem with strong (naive)
independence assumptions. Naive Bayesian classifier assumes
that the effect of an attribute value on a given class is
independent of the values of the other attribute. This
assumption is called class conditional independence. The
results indicate that Predicting of class label using Naïve
Bayesian classifier is very effective and simple compared to
C4.5 classifier
Performance Analysis of Selected Classifiers in User Profilingijdmtaiir
User profiles can serve as indicators of personal
preferences which can be effectively used while providing
personalized services. Building user files which can capture
accurate information of individuals has been a daunting task.
Several attempts have been made by researchers to extract
information from different data sources to build user profiles
on different application domains. Towards this end, in this
paper we employ different classification algorithmsto create
accurate user profiles based on information gathered from
demographic data. The aim of this work is to analyze the
performance of five most effective classification methods,
namely Bayesian Network(BN), Naïve Bayesian(NB), Naives
Bayes Updateable(NBU), J48, and Decision Table(DT). Our
simulation results show that, in general, the J48has the highest
classification accuracy performance with the lowest error rate.
On the other hand, it is found that Naïve Bayesian and Naives
Bayes Updateable classifiers have the lowest time requirement
to build the classification model
Analysis of Sales and Distribution of an IT Industry Using Data Mining Techni...ijdmtaiir
The goal of this work is to allow a corporation to
improve its marketing, sales, and customer support operations
through a better understanding of its customers. Keep in mind,
however, that the data mining techniques and tools described
here are equally applicable in fields ranging from law
enforcement to radio astronomy, medicine, and industrial
process control. Businesses in today’s environment
increasingly focus on gaining competitive advantages.
Organizations have recognized that the effective use of data is
the key element in the next generation is to predict the sales
value and emerging trend of technology market. Data is
becoming an important resource for the companies to analyze
existing sales value with current technology trends and this
will be more useful for the companies to identify future sales
value. There a variety of data analysis and modeling techniques
to discover patterns and relationships in data that are used to
understand what your customers want and predict what they
will do. The main focus of this is to help companies to select
the right prospects on whom to focus, offer the right additional
products to company’s existing customers and identify good
customers who may be about to leave. This results in improved
revenue because of a greatly improved ability to respond to
each individual contact in the best way and reduced costs due
to properly allocated resources. Keywords: sales, customer,
technology, profit.
Analysis of Influences of memory on Cognitive load Using Neural Network Back ...ijdmtaiir
Educational mining used to evaluate the leaner's
performance and the learning environment. The learning
process are involved and influenced by different
components. The memory is playing vital role in the
process of learning. The long term, short term, working,
instant, responsive, process, recollect, reference,
instruction and action memory are involved in the
process of learning. The influencing factors on these
memories are identified through the construction analysis
of Neural Network Back Propagation Algorithm. The
observed set of data represented using cubical dataset
format for the mining approach. The mining process is
carried out using neural network based back propagation
network model to decide the influencing cognitive load
for the different learning challenges. The learners’
difficulties are identified through the experimental
results.
An Analysis of Data Mining Applications for Fraud Detection in Securities Marketijdmtaiir
-In recent securities fraud broadly refers to deceptive
practices in connection with the offering for sale of securities.
There are many challenges involved in developing data mining
applications for fraud detection in securities market, including:
massive datasets, accuracy, privacy, performance measures and
complexity. The impacts on the market and the training of
regulators are other issues that need to be addressed. In this
paper we present the results of a Comprehensive systematic
literature review on data mining techniques for detecting
fraudulent activities and market manipulation in securities
market. We identify the best practices that are based on data
mining methods for detecting known fraudulent patterns and
discovering new predatory strategies. Furthermore, we
highlight the challenges faced in the development and
implementation of data mining systems for detecting market
manipulation in securities market and we provide
recommendation for future research works accordingly
An Ill-identified Classification to Predict Cardiac Disease Using Data Cluste...ijdmtaiir
The health care industry contains large amount of
health care data with hidden information. This information is
useful for making effective decision. For getting appropriate
result from the hidden information computer based data mining
techniques are used. Previously Neural Network (NN) is
widely used for predicting cardiac disease. In this paper, a
Cardiac Disease Prediction System (CDPS) is developed by
using data clustering. The CDPS system uses 15 parameters to
predict the disease, for example BP, Obesity, cholesterol, etc.
This 15 attributes like sex, age, weight are given as the input.
In this paper by using the patient’s medical record, an illdefined classification is used at the early stage of the patient to
diagnose the cardiac disease. Based on the result the patients
are advised to keep the sensor to predict them.
Scaling Down Dimensions and Feature Extraction in Document Repository Classif...ijdmtaiir
-In this study a comprehensive evaluation of two
supervised feature selection methods for dimensionality
reduction is performed - Latent Semantic Indexing (LSI) and
Principal Component Analysis (PCA). This is gauged against
unsupervised techniques like fuzzy feature clustering using
hard fuzzy C-means (FCM) . The main objective of the study is
to estimate the relative efficiency of two supervised techniques
against unsupervised fuzzy techniques while reducing the
feature space. It is found that clustering using FCM leads to
better accuracy in classifying documents in the face of
evolutionary algorithms like LSI and PCA. Results show that
the clustering of features improves the accuracy of document
classification
Music Promotes Gross National Happiness Using Neutrosophic fuzzyCognitive Map...ijdmtaiir
This paper provides an investigation to promote
gross national happiness from music using fuzzy logic model.
Music influences rate of learning. It has been the subject of
study for many years. Researchers have confirmed that loud
background noise impedes learning, concentration, and
information acquisition. An interesting phenomenon that
occurs frequently in listening new music among the students
creates sense of anxiety even without having
properunderstandingofmusic.Happiness is the emotion that
expresses various degrees of positive and negativefeelings
ranging from satisfaction to extreme joy. The happiness is
thegoal most people strive to achieve. Happypeople are
satisfied with their lives. The goal of this work is to find the
particular component of music which will ultimately promote
the happiness of people because of indeterminacy situation in
components of music
A Study on Youth Violence and Aggression using DEMATEL with FCM Methodsijdmtaiir
The DEMATEL method is then a good technique for
making decisions. In this paper we analyzed the risk factors of
youth violence and what makes them more aggressive. Since
there are more risk factors of youth violence, to relate each
other more complex to construct FCM and analyze them.
Moreover the data is an unsupervised one obtained from
survey as well as interviews. Hence fuzzy alone has the
capacity to analyses these concepts.
Certain Investigation on Dynamic Clustering in Dynamic Dataminingijdmtaiir
Clustering is the process of grouping a set of objects
into classes of similar objects. Dynamic clustering comes in a
new research area that is concerned about dataset with dynamic
aspects. It requires updates of the clusters whenever new data
records are added to the dataset and may result in a change of
clustering over time. When there is a continuous update and
huge amount of dynamic data, rescan the database is not
possible in static data mining. But this is possible in Dynamic
data mining process. This dynamic data mining occurs when
the derived information is present for the purpose of analysis
and the environment is dynamic, i.e. many updates occur.
Since this has now been established by most researchers and
they will move into solving some of the problems and the
research is to concentrate on solving the problem of using data
mining dynamic databases. This paper gives some
investigation of existing work done in some papers related with
dynamic clustering and incremental data clustering
Analyzing the Role of a Family in Constructing Gender Roles Using Combined Ov...ijdmtaiir
Family, as a social institution and as the
fundamental unit of the society, plays a vital role in forming
persons. We become full-fledged members of the society
through the process of socialization which starts in the family.
It provides the first foundational formation of personhood.
Especially in traditional societies like India the role of the
family assumes even greater importance. We learn to
differentiate and to discriminate between man and woman from
the role our parents play. In this paper we analyze the role of
the family in constructing gender roles using Combined
Overlap Block Fuzzy Cognitive Maps
An Interval Based Fuzzy Multiple Expert System to Analyze the Impacts of Clim...ijdmtaiir
Indian agriculture is completely dependent on the
environment and any undesirable change in the environment
has an adverse impact on agriculture. Climate change and
pollution in India have caused great damage to the
environment. In this paper we analyse the impact of climate
change on Indian agriculture. The first section gives an
introduction to the problem. In section two we introduce a new
fuzzy tool called interval based fuzzy multiple expert system.
Section three adapt the new fuzzy tool to analyse the problem
of agricultural impacts of climate change and in section four
we give the results and suggestions based on our analysis
An Approach for the Detection of Vascular Abnormalities in Diabetic Retinopathyijdmtaiir
Diabetic Retinopathy is a common complication of
diabetes that is caused by changes in the blood vessels of the
retina. The blood vessels in the retina get altered. Exudates are
secreted, micro-aneurysms and hemorrhages occur in the
retina. The appearance of these features represents the degree
of severity of the disease. In this paper the proposed approach
detects the presence of abnormalities in the retina using image
processing techniques by applying morphological processing
techniques to the fundus images to extract features such as
blood vessels, micro aneurysms and exudates. These features
are used for the detection of severity of Diabetic Retinopathy.
It can quickly process a large number of fundus images
obtained from mass screening to help reduce the cost, increase
productivity and efficiency for ophthalmologists.
Improve the Performance of Clustering Using Combination of Multiple Clusterin...ijdmtaiir
The ever-increasing availability of textual
documents has lead to a growing challenge for information
systems to effectively manage and retrieve the information
comprised in large collections of texts according to the user’s
information needs. There is no clustering method that can
adequately handle all sorts of cluster structures and properties
(e.g. shape, size, overlapping, and density). Combining
multiple clustering methods is an approach to overcome the
deficiency of single algorithms and further enhance their
performances. A disadvantage of the cluster ensemble is the
highly computational load of combing the clustering results
especially for large and high dimensional datasets. In this paper
we propose a multiclustering algorithm , it is a combination of
Cooperative Hard-Fuzzy Clustering model based on
intermediate cooperation between the hard k-means (KM) and
fuzzy c-means (FCM) to produce better intermediate clusters
and ant colony algorithm. This proposed method gives better
result than individual clusters.
The Study of Symptoms of Tuberculosis Using Induced Fuzzy Coginitive Maps (IF...ijdmtaiir
Tuberculosis or TB is a common, infectious disease
caused by various strains of mycobacterium usually called as
mycobacterium Tuberculosis. Tb attacks the lungs but can also
affect other parts of the body. Most infections are
asymptomatic and latent, but about one in ten latent infections
eventually progresses to active disease which, if left untreated,
kills more than 50% of those so infected. Hence, this paper
analysisthes y m p t o m s o f Tuberculosis using Induced
Fuzzy Cognitive Maps (IFCMs). IFCMs area fuzzy-graph
modeling approach based on expert’s opinion. This is the nonstatistical approach to study the problems with imprecise
information.
A Study on Finding the Key Motive of Happiness Using Fuzzy Cognitive Maps (FCMs)ijdmtaiir
Happiness is subjective. It is difficult to compare one
person’s happiness with another. It can be especially difficult
to compare happiness across cultures. The function of man is
to live a certain kind of life and this activity implies a rational
principle. The function of a good man is the good and noble
performance. If any action is well performed it is performed in
accord with the appropriate excellence. If this is the case, then
happiness turns out to be an activity of the soul in accordance
with virtue. That is happiness as the exercise of virtue. Every
human being thinks happiness in his own perspective.
Everyone wants to be happy and searching happiness in all
their activities. Happiness is typically measured using
subjective measures. Happiness cannot be defined in terms of
rigid boundaries and it is a vague term and it is appropriate to
use Fuzzy Logic. In particular, we use Fuzzy Cognitive
Mapping to find the key motive of happiness. This paper
consists of four sections. The first section is of inductive
nature about happiness. The section two introduces the
concept of Fuzzy Cognitive Mapping (FCMs). In section three
Fuzzy Cognitive Mapping applied on the concept of
Happiness. Section four gives the conclusions and
suggestions
Study of sustainable development using Fuzzy Cognitive Relational Maps (FCM)ijdmtaiir
Sustainable development provides a framework under
which communities can use resources efficiently, create efficient
infrastructures, protect and enhance quality of life, and create new
businesses to strengthen their economies. It can help us create
healthy communities that can sustain our generation, as well as
those that follow ours. Sustainable development is not a new
concept. Rather, it is the latest expression of a long-standing ethic
involving peoples' relationships with the environment and the
current generation's responsibilities to future generations. For a
community to be truly sustainable, it must adopt a three-pronged
approach that considers economic, environmental and cultural
resources. Communities must consider these needs in the short
term as well as the long term. Sustainable Development also can
be defined simply as a better quality of life for everyone, now and
for generations to come. It is a vision of progress that links
economic development, protection of the environment and social
justice, and its values are recognised by democratic governments
and political movements the world over. Sustainable
Development is therefore closely linked to Governance, Better
Regulation and Impact Assessment. Indicators to measure
progress are also vital.This paper has four sections. In the first
section we introduce the notion of fuzzy cognitive maps and
Combined Fuzzy Cognitive Maps (CFCMs). In section two we
describe the problem and justification for the use of FCMs. In
section three we give the adaptation of FCM to the problem. In
the final section we give conclusions based on our analysis of the
problem using FCM
A Study of Personality Influence in Building Work Life Balance Using Fuzzy Re...ijdmtaiir
Personality plays an important role in work life
balance irrespective of the organizational setups and other
factors. It has become a subject of concern in terms of
technological, market and organizational changes associated
with an individual’s personality. Here in this study an attempt is
made to study about the holistic picture of personality influence
in work-life balance on the basis of experts’ opinion. The
influence of personality is studied from the big five factors of
personality traits. The data were analyzed using Fuzzy
Relational mapping (FRM) model and conclusions arrived for
which personality has more influence in building work life
balance and which one is more vulnerable for work life
imbalance
Saudi Arabia stands as a titan in the global energy landscape, renowned for its abundant oil and gas resources. It's the largest exporter of petroleum and holds some of the world's most significant reserves. Let's delve into the top 10 oil and gas projects shaping Saudi Arabia's energy future in 2024.
Final project report on grocery store management system..pdfKamal Acharya
In today’s fast-changing business environment, it’s extremely important to be able to respond to client needs in the most effective and timely manner. If your customers wish to see your business online and have instant access to your products or services.
Online Grocery Store is an e-commerce website, which retails various grocery products. This project allows viewing various products available enables registered users to purchase desired products instantly using Paytm, UPI payment processor (Instant Pay) and also can place order by using Cash on Delivery (Pay Later) option. This project provides an easy access to Administrators and Managers to view orders placed using Pay Later and Instant Pay options.
In order to develop an e-commerce website, a number of Technologies must be studied and understood. These include multi-tiered architecture, server and client-side scripting techniques, implementation technologies, programming language (such as PHP, HTML, CSS, JavaScript) and MySQL relational databases. This is a project with the objective to develop a basic website where a consumer is provided with a shopping cart website and also to know about the technologies used to develop such a website.
This document will discuss each of the underlying technologies to create and implement an e- commerce website.
Integrated Intelligent Research (IIR) International Journal of Data Mining Techniques and Applications
Volume: 03 Issue: 01 June 2014, Page No. 42-45
ISSN: 2278-2419
Comparison on PCA ICA and LDA in Face Recognition
C. Christry¹, J. Pavithra², F. Mahimai Sathya³
St. Joseph's College of Arts & Science (Autonomous), Cuddalore
Email: ¹canschris@yahoo.com, ²pavithrasam27@gmail.com, ³mahimaisathya@gmail.com
Abstract- Face recognition is used in a wide range of applications. In recent years, face recognition has become one of the most successful applications of image analysis and understanding. Different research groups, using different statistical methods, have reported contradictory results when comparing the principal component analysis (PCA), independent component analysis (ICA), and linear discriminant analysis (LDA) algorithms proposed in recent years. The goal of this paper is to compare and analyze the three algorithms and conclude which performs best. The FERET dataset is used for consistency.
Keywords: Face Recognition, PCA, ICA, LDA.
I. INTRODUCTION
Face recognition is defined as the identification of a person from an image of their face. It is one of the most successful applications of image analysis and understanding. Many face recognition algorithms, along with their modifications, have been developed over the past decades [1]. In recent years, face recognition has gained much attention: a face is identified as known or unknown by comparing it against a database of persons. The interest of researchers in face recognition has grown rapidly, since there is a wide range of commercial and law-enforcement applications, such as credit card verification, passport checks, and criminal investigation. A face recognition solution consists of three parts: detection of the face region, feature extraction from that region, and a decision step. The decision is based on recognition, verification, or categorization of the unknown face by comparing it with the database. The problem is not easy to solve; face recognition still suffers from technical problems such as lack of robustness to illumination and pose variation. In this paper two families of techniques are considered: appearance-based techniques and feature-based techniques.
A. Appearance-based techniques: these use holistic features that are applied to the whole face or to a specified region of the face.
B. Feature-based techniques: these use geometric facial features and the relations between them.
The goal of this paper is to compare and analyze the three algorithms and determine which is best. Previous work proposed comparing them under equal working conditions. The three algorithms are PCA, ICA, and LDA.
a. PCA: PCA finds a set of projection vectors such that the projected samples retain the information of the original samples [2].
b. ICA: ICA captures second- and higher-order statistics and projects the input data onto basis vectors that are statistically independent [3][4].
c. LDA: LDA is also known as Fisher's discriminant analysis. It uses class information and finds a set of vectors that maximize the between-class scatter matrix while minimizing the within-class scatter matrix [5][6].
The comparison is done on the FERET dataset [7], which is used for consistency.
II. LITERATURE REVIEW
Bartlett et al. [3] and Liu [8] claim that ICA outperforms PCA, while Baek et al. [9] claim that PCA is better. Moghaddam [10] states that there is no significant statistical difference. Beveridge et al. [11] claim that in their tests LDA performed uniformly worse than PCA, Martinez [12] states that LDA is better for some tasks, and Belhumeur et al. [5] and Navarrete et al. [13] claim that LDA outperforms PCA on all tasks in their tests (for more than two samples per class in the training phase).
A. PCA (Principal Component Analysis)
Each face in a training set of M images is represented as an s-dimensional vector. PCA finds a t-dimensional subspace whose basis vectors correspond to the maximum-variance directions in the original image space; normally the dimension is low (t << s). The basis vectors define a subspace of images called the "face space". To identify a known face, the images are projected onto the face space and the weight contributed by each basis vector is found. To identify an unknown face, the image is projected onto the face space to obtain its set of weights. By comparing the weights of the known and unknown faces, the face can be identified. If the images are considered as random variables, the PCA basis vectors are the eigenvectors of the scatter matrix S_T, defined as

S_T = Σ_{i=1}^{M} (x_i − µ)(x_i − µ)^T

where µ is the mean of all images in the training set and x_i is the i-th image with its columns concatenated into a vector. The projection matrix W_PCA is composed of the t eigenvectors corresponding to the t largest eigenvalues; thus PCA finds a t-dimensional face space [2].
The steps in finding the principal components can be summarized as follows:
• Collect the samples x_i of an n-dimensional data set x, i = 1, 2, …, m.
• Mean-correct (center) all the points: calculate the mean m_x and subtract it from each data point, x_i − m_x.
• Calculate the covariance matrix: C = Σ_{i=1}^{m} (x_i − m_x)(x_i − m_x)^T.
• Determine the eigenvalues and eigenvectors of the matrix C.
• Sort the eigenvalues (and corresponding eigenvectors) in decreasing order.
• Select the first d ≤ n eigenvectors and generate the data set in the new representation.
• The projected test image is compared to every projected training image using a similarity measure. The result is the training image that is closest to the test image.
B. ICA (Independent Component Analysis)
PCA treats the image elements as random variables with a Gaussian distribution and minimizes second-order statistics. For a non-Gaussian distribution, the largest variations do not correspond to the PCA basis vectors. ICA [3][4] minimizes both second-order and higher-order dependencies in the input data and tries to find the basis along which the data are statistically independent. ICA is a statistical method for transforming an observed multidimensional random vector into components that are statistically independent. It is a special case of redundancy-reduction technique and represents the data in terms of statistically independent variables. Bartlett [3] proposed two architectures for face recognition. Architecture 1: statistically independent basis images. Architecture 2: a factorial code representation. The INFOMAX algorithm of Bell and Sejnowski [3] was implemented and used for ICA. As a result, to perform ICA, PCA is first used to reduce the dimensionality of the data. Compared with other statistical methods, ICA provides a more powerful data representation than PCA.
The basic steps to derive the independent components are as follows:
• Collect the samples x_i of an n-dimensional data set x, i = 1, 2, …, m.
• Mean-correct all the points: calculate the mean m_x and subtract it from each data point, x_i − m_x.
• Calculate the covariance matrix: C = Σ_{i=1}^{m} (x_i − m_x)(x_i − m_x)^T.
• The ICA of x factorizes the covariance matrix C into the form C = F Δ F^T, where Δ is a diagonal real positive matrix.
• F transforms the original data x into Z such that the components of the new data Z are independent: X = F Z.
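A minimal NumPy sketch of the factorization step above. Note that C = F Δ F^T by itself only decorrelates (whitens) the data; a full ICA implementation (e.g. INFOMAX or FastICA) would additionally rotate the whitened data toward statistical independence:

```python
import numpy as np

def whiten(X):
    """Factorize the covariance C = F . diag(delta) . F^T (F orthonormal
    eigenvectors, delta positive) and transform the centered data so its
    components are decorrelated with unit variance.  Full ICA would then
    rotate Z further to maximize statistical independence."""
    Xc = X - X.mean(axis=0)              # mean-correct the points
    C = (Xc.T @ Xc) / len(Xc)            # covariance matrix
    delta, F = np.linalg.eigh(C)         # C = F diag(delta) F^T
    delta = np.clip(delta, 1e-12, None)  # guard tiny negative round-off
    Z = Xc @ F / np.sqrt(delta)          # whitened data: cov(Z) = I
    return Z, F, delta
```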
C. LDA (Linear Discriminant Analysis)
Linear Discriminant Analysis (LDA) is a dimensionality reduction technique used for classification problems. It finds the vectors in the underlying space that best discriminate among classes [5][6]. The goal of LDA is to maximize the between-class scatter measure while minimizing the within-class scatter measure. The between-class scatter matrix is denoted S_B and the within-class scatter matrix S_W.
The basic steps in LDA are as follows [14]:
• Collect the samples for class 1 and class 2.
• Calculate the means of class 1 and class 2, i.e. Mu1 and Mu2.
• Calculate the covariance matrices of the first and second class, i.e. S1 and S2.
• Calculate the within-class scatter matrix: S_W = S1 + S2.
• Calculate the between-class scatter matrix: S_B = (Mu1 − Mu2)(Mu1 − Mu2)^T.
• Calculate the mean of all classes.
• Compute the LDA projection by solving the generalized eigenvalue problem S_W^{-1} S_B W = λW, i.e. W = eig(S_W^{-1} S_B), where W is the projection vector.
• Compare the test image's projection with the projection of each training image using a similarity measure. The result is the training image that is closest to the test image.
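The two-class steps above can be sketched in NumPy; for two classes the generalized eigenproblem has a single nonzero eigenvalue, whose eigenvector is proportional to S_W^{-1}(Mu1 − Mu2):

```python
import numpy as np

def lda_two_class(X1, X2):
    """Two-class LDA: X1 and X2 hold the samples of class 1 and
    class 2, one n-dimensional sample per row.  Returns the unit
    projection vector w."""
    mu1, mu2 = X1.mean(axis=0), X2.mean(axis=0)   # class means
    S1 = (X1 - mu1).T @ (X1 - mu1)                # class-1 scatter
    S2 = (X2 - mu2).T @ (X2 - mu2)                # class-2 scatter
    Sw = S1 + S2                                  # within-class scatter
    d = (mu1 - mu2).reshape(-1, 1)
    Sb = d @ d.T                                  # between-class scatter
    # Solve Sw^{-1} Sb w = lambda w; the top eigenvector is
    # proportional to Sw^{-1} (mu1 - mu2).
    evals, evecs = np.linalg.eig(np.linalg.inv(Sw) @ Sb)
    w = np.real(evecs[:, np.argmax(np.real(evals))])
    return w / np.linalg.norm(w)
```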
III. SAMPLE MODEL
The gallery contains 1,196 face images; the training images are a randomly selected subset of 500 gallery images. Most importantly, there are four different sets of probe images: fa, fb, duplicate I, and duplicate II. The fafb probe set contains 1,195 images of subjects taken at the same time as the gallery images; the only difference is that the subjects were told to assume a different facial expression than in the gallery image. The duplicate I probe set contains 722 images of subjects taken between one minute and 1,031 days after the gallery image was taken. The duplicate II probe set is a subset of the duplicate I probe set, where the probe image is taken at least 18 months after the gallery image; it has 234 images. All images in the data set are grayscale and of size 384 × 256 pixels.
IV. TRAINING
To train the PCA algorithm we used a subset of classes for which there were exactly three images per class. We found 225 such classes (different persons), so our training set consisted of 3 × 225 = 675 images (M = 675, c = 225). One important question worth answering at this stage is: to what extent do the training set and the gallery and probe sets overlap? Out of the 675 images in the training set, 224 were taken from the gallery (33%), another 224 (33%) were taken from the fb set and were of the same subjects as the ones taken from the gallery, while 3 are in the dup1 set. The remaining 224 were not in any set used in the recognition stage. We can therefore conclude that the algorithms were trained on roughly 33% of the subjects later used in the recognition stage. The effect that this percentage of overlap has on algorithm performance needs further exploration and will be part of our future work. PCA derived, in accordance with theory, M − 1 = 674 meaningful eigenvectors. We adopted the FERET recommendation and kept the top 40% of those, resulting in a 270-dimensional PCA subspace (40% of 674 ≈ 270). It was calculated that 97.85% of the energy was retained in those 270 eigenvectors. This subspace was used for recognition as the PCA face space and as input to ICA and LDA (PCA was the preprocessing dimensionality reduction step). ICA yielded two representations (ICA1 & ICA2) using the input from PCA (as in [3]). The dimensionality of both ICA representations was also 270. However, LDA yielded only a 224-dimensional space, since it can, by theory, produce a maximum of c − 1 basis vectors. All of those were kept to stay close to the dimensionality of the PCA and ICA spaces and thus make the comparisons as fair as possible. After all the subspaces had been derived, all images from the data sets were projected onto each subspace and recognition using nearest
neighbour classification with various distance measures was performed.
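The recognition step described above can be sketched as a minimal nearest-neighbour classifier. This assumes the gallery and probe images have already been projected into one of the 270-dimensional subspaces; the synthetic gallery and the noisy probe below are illustrative placeholders, not FERET data.

```python
import numpy as np

# Hypothetical projected gallery: one 270-dim feature vector per person.
rng = np.random.default_rng(1)
gallery = rng.standard_normal((225, 270))
gallery_ids = np.arange(225)

# A probe is a (here, slightly perturbed) image of a gallery subject.
probe = gallery[42] + 0.01 * rng.standard_normal(270)

# Rank-1 nearest-neighbour match under the L2 norm; the other metrics
# (L1, Mahalanobis, cosine) would simply replace this distance.
dists = np.linalg.norm(gallery - probe, axis=1)
ranked = gallery_ids[np.argsort(dists)]   # full ranking -> CMS curves
print(ranked[0])                          # rank-1 identity -> 42
```

Sorting the whole gallery by distance, rather than taking only the closest match, is what produces the cumulative match score (CMS) curves referenced in Table 1.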
Table 1 summarizes algorithm performance across the four distance metrics.
The left part contains the rank-1 results, and the best algorithm-metric
combination for each probe set is noted in the right part.
V. STATISTICAL SIGNIFICANCE
The simplest method for determining significance is to treat
each probe image as an independent trial that either succeeds or
fails, so that the success counts follow a binomial model. Under
this model, PCA is significantly better than ICA on every probe
set. For the fa and fb probe sets, the differences are significant
with a probability of 99.99% in both cases. For the duplicate I
and duplicate II probe sets, the differences are significant at
99.96% and 99.87%, respectively. When the L2 norm is used, ICA
performs significantly better on the fafb and duplicate I probe
sets, but not on the duplicate II probe set.
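One standard way to carry out the binomial comparison sketched above is a pooled two-proportion z-test on the two algorithms' success counts over the same probe set. The function below is our own illustration of that approximation (the paper does not give its exact computation), and the counts passed to it are made-up placeholders, not the paper's data.

```python
import math

# Two-proportion z-test under the binomial model: k1 and k2 are the
# rank-1 success counts of two algorithms on the same n probe images.
def binomial_significance(k1, k2, n):
    p1, p2 = k1 / n, k2 / n
    p = (k1 + k2) / (2 * n)                  # pooled success rate
    se = math.sqrt(2 * p * (1 - p) / n)      # std. error of p1 - p2
    z = (p1 - p2) / se
    # two-sided p-value from the normal approximation to the binomial
    return 2 * (1 - 0.5 * (1 + math.erf(abs(z) / math.sqrt(2))))

# Placeholder counts for illustration only.
p_value = binomial_significance(k1=960, k2=900, n=1195)
print(f"difference significant at {100 * (1 - p_value):.2f}%")
```

A p-value below 0.0001 corresponds to the "significant to a probability of 99.99%" phrasing used in the text.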
Table 1: Best algorithm-metric combination for each probe set
(rank-1 recognition rates; MAH = Mahalanobis, COS = cosine angle;
the last column states whether the highest CMS curve matches the
best rank-1 combination)

Probe  Algorithm  L1      L2      MAH     COS     Highest CMS curve  Same as rank 1
Fa     PCA        82,26%  82,18%  64,94%  81,00%  PCA+COS            N
       ICA1       81,00%  81,51%  64,94%  80,92%  ICA1+L2            Y
       ICA2       64,94%  74,31%  64,94%  83,85%  ICA2+COS           Y
       LDA        78,08%  82,76%  70,88%  81,51%  LDA+COS            N
Fb     PCA        55,67%  25,26%  32,99%  18,56%  PCA+L1             Y
       ICA1       18,04%  17,53%  32,99%  12,89%  ICA1+L1            N
       ICA2       15,98%  44,85%  32,99%  64,95%  ICA2+COS           Y
       LDA        26,80%  26,80%  41,24%  20,62%  LDA+L2             N
Dup1   PCA        36,29%  33,52%  25,62%  33,52%  PCA+L1             Y
       ICA1       32,55%  31,86%  25,62%  32,27%  ICA1+L1            Y
       ICA2       28,81%  31,99%  25,62%  42,66%  ICA2+COS           Y
       LDA        34,76%  32,96%  27,70%  33,38%  LDA+L1             Y
Dup2   PCA        17,09%  10,68%  14,53%  11,11%  PCA+L1             Y
       ICA1       8,97%   7,69%   14,53%  8,97%   ICA1+MAH           Y
       ICA2       16,24%  19,66%  14,53%  28,21%  ICA2+COS           Y
       LDA        16,24%  10,26%  16,67%  10,68%  LDA+L1             N
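The four distance measures heading the columns of Table 1 can be written out directly. This is a generic sketch of the standard definitions (the paper does not list its exact formulas); `cov_inv` is the inverse covariance of the training features in the subspace, which for the PCA face space is diagonal with the reciprocal eigenvalues.

```python
import numpy as np

# L1 (city-block) distance between two feature vectors.
def l1(x, y):
    return np.sum(np.abs(x - y))

# L2 (Euclidean) distance.
def l2(x, y):
    return np.sqrt(np.sum((x - y) ** 2))

# Mahalanobis distance, weighting coordinates by inverse covariance.
def mahalanobis(x, y, cov_inv):
    d = x - y
    return np.sqrt(d @ cov_inv @ d)

# Cosine distance: 1 minus the cosine of the angle between vectors.
def cosine_distance(x, y):
    return 1.0 - (x @ y) / (np.linalg.norm(x) * np.linalg.norm(y))

x, y = np.array([1.0, 0.0]), np.array([0.0, 1.0])
print(l1(x, y), l2(x, y), cosine_distance(x, y))  # 2.0, sqrt(2), 1.0
```

With an identity covariance, the Mahalanobis distance reduces to L2, which is why the MAH column often tracks the L2 column when the subspace coordinates have similar variances.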
VI. CONCLUSION
Based on the rank-1 results, PCA, ICA, and LDA perform at
broadly comparable levels across all four probe sets, with the
best algorithm depending on the probe set and the distance
metric chosen. A fuller statistical analysis of whether the
differences in performance between PCA, ICA, and LDA are
significant is left to future work.
REFERENCES
[1] W. Zhao, R. Chellappa, J. Phillips, A. Rosenfeld, "Face Recognition in
Still and Video Images: A Literature Survey", ACM Computing Surveys,
Vol. 35, Dec. 2003, pp. 399-458
[2] M. Turk, A. Pentland, "Eigenfaces for Recognition", Journal of
Cognitive Neuroscience, Vol. 3, No. 1, 1991, pp. 71-86
[3] M.S. Bartlett, J.R. Movellan, T.J. Sejnowski, "Face Recognition by
Independent Component Analysis", IEEE Trans. on Neural Networks,
Vol. 13, No. 6, November 2002, pp. 1450-1464
[4] B. Draper, K. Baek, M.S. Bartlett, J.R. Beveridge, "Recognizing Faces
with PCA and ICA", Computer Vision and Image Understanding
(Special Issue on Face Recognition), Vol. 91, Issues 1-2, July-August
2003, pp. 115-137
[5] W. Zhao, R. Chellappa, A. Krishnaswamy, "Discriminant Analysis of
Principal Components for Face Recognition", Proc. of the 3rd IEEE
International Conference on Automatic Face and Gesture Recognition,
14-16 April 1998, Nara, Japan, pp. 336-341
[6] P.J. Phillips, H. Moon, S.A. Rizvi, P.J. Rauss, "The FERET Evaluation
Methodology for Face Recognition Algorithms", IEEE Trans. on Pattern
Analysis and Machine Intelligence, Vol. 22, No. 10, October 2000, pp.
1090-1104
[7] C. Liu, H. Wechsler, "Comparative Assessment of Independent
Component Analysis (ICA) for Face Recognition", Second International
Conference on Audio- and Video-based Biometric Person Authentication,
Washington D.C., USA, 22-23 March 1999
[8] K. Baek, B. Draper, J.R. Beveridge, K. She, "PCA vs. ICA: A
Comparison on the FERET Data Set", Proc. of the Fourth International
Conference on Computer Vision, Pattern Recognition and Image
Processing, Durham, NC, USA, 8-14 March 2002, pp. 824-827
[9] B. Moghaddam, "Principal Manifolds and Probabilistic Subspaces for
Visual Recognition", IEEE Trans. on Pattern Analysis and Machine
Intelligence, Vol. 24, No. 6, October 2002, pp. 780-788
[10] J.R. Beveridge, K. She, B. Draper, G.H. Givens, "A Nonparametric
Statistical Comparison of Principal Component and Linear Discriminant
Subspaces for Face Recognition", Proc. of the IEEE Conference on
Computer Vision and Pattern Recognition, December 2001, Kauai, HI,
USA, pp. 535-542
[11] A. Martinez, A. Kak, "PCA versus LDA", IEEE Trans. on Pattern
Analysis and Machine Intelligence, Vol. 23, No. 2, February 2001, pp.
228-233
[12] P. Navarrete, J. Ruiz-del-Solar, "Analysis and Comparison of
Eigenspace-Based Face Recognition Approaches", International Journal
of Pattern Recognition and Artificial Intelligence, Vol. 16, No. 7,
November 2002, pp. 817-830
[13] Aly A. Farag, Shireen Y. Elhabian, "A Tutorial on Data Reduction",
October 2, 2008