"FingerPrint Recognition Using Principle Component Analysis(PCA)”Er. Arpit Sharma
Fingerprint recognition is one of the oldest and most popular biometric technologies, used in criminal investigations, civilian and commercial applications, and so on. Fingerprint matching is the process used to determine whether two sets of fingerprint details come from the same finger. This work focuses on the feature extraction and minutiae matching stages. Many matching techniques are used in fingerprint recognition systems, such as minutiae-based matching, pattern-based matching, correlation-based matching, and image-based matching.
A new method based upon Principal Component Analysis (PCA) for fingerprint enhancement is proposed in this paper. PCA is a useful statistical technique that has found application in fields such as face recognition and image compression, and is a common technique for finding patterns in high-dimensional data. In the proposed method, the image is first decomposed into directional images using a decimation-free directional filter bank (DDFB). PCA is then applied to these directional fingerprint images, yielding PCA-filtered images, which are themselves directional images. These directional images are then reconstructed into a single image, which is the enhanced one. Simulation results are included illustrating the capability of the proposed method.
"FingerPrint Recognition Using Principle Component Analysis(PCA)”Er. Arpit Sharma
Fingerprint recognition is one of the oldest and most popular biometric technologies and it is used in criminal investigations, civilian, commercial applications, and so on. Fingerprint matching is the process used to determine whether the two sets of fingerprints details come from the same finger or not. This work focuses on feature extraction and minutiae matching stage. There are many matching techniques used for fingerprint recognition systems such as minutiae based matching, pattern based matching, Correlation based matching, and image based matching.
A new method based upon Principal Component Analysis (PCA) for fingerprint enhancement is proposed in this paper. PCA is a useful statistical technique that has found application in fields such as face recognition and image compression, and is a common technique for finding patterns in data of high dimension. In the proposed method image is first decomposed into directional images using decimation free Directional Filter bank DDFB. Then PCA is applied to these directional fingerprint images which gives the PCA filtered images. Which are basically directional images? Then these directional images are reconstructed into one image which is the enhanced one. Simulation results are included illustrating the capability of the proposed method.
Methodological study of opinion mining and sentiment analysis techniques - ijsc
Decision making, both on the individual and the organizational level, is always accompanied by a search for others' opinions on the same subject. Opinion-rich resources such as reviews, forum discussions, blogs, micro-blogs, and Twitter provide a rich anthology of sentiments. This user-generated content can serve as a boon to the market if its semantic orientations are deliberated. Opinion mining and sentiment analysis are the formalization for studying and construing opinions and sentiments. The digital ecosystem has itself paved the way for the use of the huge volume of opinionated data recorded. This paper is an attempt to review and evaluate the various techniques used for opinion and sentiment analysis.
PROFIT AGENT CLASSIFICATION USING FEATURE SELECTION EIGENVECTOR CENTRALITY - IAEME Publication
Classification is a method that groups data into related categories according to their similarities. High-dimensional data used in the classification process can make classification sub-optimal because there are huge amounts of otherwise meaningless data. In this paper, we try to classify profit agents from PT.XYZ and find the features that have the greatest impact on agent profit. Feature selection is one of the methods that can optimize the dataset for the classification process. We apply a graph-based feature selection method, which identifies the most important nodes that are interrelated with their neighboring nodes. Eigenvector centrality estimates the importance of a feature relative to its neighbors; using eigenvector centrality, we rank central nodes as candidate features for the classification method and find the best features for classifying the agent data. Support Vector Machines (SVM) are then used to test whether feature selection with eigenvector centrality further improves classification accuracy.
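To make the pipeline concrete, here is a minimal, hedged sketch of eigenvector-centrality feature selection followed by an SVM, using synthetic data; the PT.XYZ dataset and the paper's exact graph construction are not available, so the correlation-graph construction below is an assumption.

```python
# Hedged sketch: feature selection via eigenvector centrality on a feature-correlation
# graph, followed by an SVM classifier (synthetic data, not the PT.XYZ agent data).
import numpy as np
import networkx as nx
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC
from sklearn.metrics import accuracy_score

X, y = make_classification(n_samples=500, n_features=30, n_informative=8, random_state=0)

# Build a graph whose nodes are features and whose edge weights are
# absolute pairwise correlations between features.
corr = np.abs(np.corrcoef(X, rowvar=False))
np.fill_diagonal(corr, 0.0)
G = nx.from_numpy_array(corr)

# Rank features by eigenvector centrality and keep the top k as candidates.
centrality = nx.eigenvector_centrality(G, weight="weight", max_iter=1000)
top_k = [node for node, _ in sorted(centrality.items(), key=lambda kv: -kv[1])[:10]]

X_train, X_test, y_train, y_test = train_test_split(X[:, top_k], y, random_state=0)
clf = SVC().fit(X_train, y_train)
print("accuracy with selected features:", accuracy_score(y_test, clf.predict(X_test)))
```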
LOSSLESS RECONSTRUCTION OF SECRET IMAGE USING THRESHOLD SECRET SHARING AND TR... - IJNSA Journal
This paper proposes to provide confidentiality for a secret image that may be used by multiple users or stored on multiple servers. Secret sharing is a technique to protect secret information that will be used by multiple users. Threshold secret sharing is more efficient, as the secret can be reconstructed from only a threshold number of shares. Along with Shamir's secret sharing method, we propose applying the Radon transform before dividing the image into shares, so that the shares do not retain the original pixel intensities. Run-length coding is used to compress the image after the transformation, and the secret sharing technique is then applied. The image is reconstructed losslessly into the original by applying the operations in reverse order.
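For reference, here is a minimal, hedged sketch of (k, n) Shamir threshold sharing over a prime field for a single secret value; the paper's Radon transform and run-length compression steps, and the per-pixel application to a whole image, are omitted.

```python
# Hedged sketch of (k, n) Shamir threshold secret sharing over a prime field for a
# single secret value (e.g. one pixel); not the paper's full image pipeline.
import random

PRIME = 257  # small prime > 255, enough to share one byte value

def make_shares(secret, k, n):
    # Random polynomial of degree k-1 with the secret as its constant term.
    coeffs = [secret] + [random.randrange(PRIME) for _ in range(k - 1)]
    def f(x):
        return sum(c * pow(x, i, PRIME) for i, c in enumerate(coeffs)) % PRIME
    return [(x, f(x)) for x in range(1, n + 1)]

def reconstruct(shares):
    # Lagrange interpolation at x = 0 modulo PRIME.
    secret = 0
    for i, (xi, yi) in enumerate(shares):
        num, den = 1, 1
        for j, (xj, _) in enumerate(shares):
            if i != j:
                num = (num * -xj) % PRIME
                den = (den * (xi - xj)) % PRIME
        secret = (secret + yi * num * pow(den, -1, PRIME)) % PRIME
    return secret

shares = make_shares(secret=173, k=3, n=5)
print(reconstruct(shares[:3]))  # any 3 of the 5 shares recover 173
```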
Rough set theory is a mathematical tool for handling uncertainty in decision-making problems. It offers a new viewpoint for studying conflict analysis decision making, as in the Pawlak conflict analysis model. Conflict theory supports de facto political relations such as the well-known sayings "the friend of my friend is my friend" and "the enemy of my enemy is my friend". According to the coalition relation in Pawlak conflict theory, it is possible to infer indirect relationships involving neutral agents based on their relationships with others; there is no prior research dedicated to implementing or discussing these features.
In this paper, we attempt to develop a conflict analysis system that predicts the changes that may happen in coalition and conflict relations among the agents. These changes usually occur with the neutral agents, who may change their stance toward coalition or conflict. The proposed modification of the conflict model depends on suggested operations performed on the graph representation of the information system, such as ORing, ANDing, XORing, and finding indirect coalition and conflict paths among the agents in the model.
With these components in place, we present the Data
Science Machine — an automated system for generating
predictive models from raw data. It starts with a relational
database and automatically generates features to be used
for predictive modeling.
Face Emotion Analysis Using Gabor Features In Image Database for Crime Invest... - Waqas Tariq
The face is an extraordinary communicator and plays an important role in interpersonal relations and human-machine interaction. Facial expressions play an important role wherever humans interact with computers and with each other to communicate their emotions and intentions. Facial expressions and other gestures convey non-verbal communication cues in face-to-face interactions. In this paper we develop an algorithm capable of identifying a person's facial expression and categorizing it as happiness, sadness, surprise, or neutral. Our approach is based on local binary patterns for representing face images. We use training sets of faces and non-faces to train the machine to identify face images exactly. Facial expression classification is based on Principal Component Analysis. We developed methods for face tracking and expression identification from the face image input. Applying the facial expression recognition algorithm, the developed software is capable of processing faces and recognizing the person's facial expression. The system analyses the face and determines the expression by comparing the image with the training sets in the database. We use PCA and neural networks in analyzing and identifying the facial expressions.
Data Science - Part XVII - Deep Learning & Image Processing - Derek Kane
This lecture provides an overview of Image Processing and Deep Learning for the applications of data science and machine learning. We will go through examples of image processing techniques using a couple of different R packages. Afterwards, we will shift our focus and dive into the topics of Deep Neural Networks and Deep Learning. We will discuss topics including Deep Boltzmann Machines, Deep Belief Networks, and Convolutional Neural Networks, and finish the presentation with a practical exercise in handwriting recognition.
CDS is a criminal face identification system based on a capsule neural network.
To solve common problems in image recognition such as illumination and scale variability, and to fight one of the most common problems, the pose problem, we introduce a Face Reconstruction System.
Machine Learning Tutorial Part - 2 | Machine Learning Tutorial For Beginners ... - Simplilearn
This presentation on Machine Learning will help you understand what clustering is, K-Means clustering, a flowchart to understand K-Means clustering along with a demo showing clustering of cars into brands, what logistic regression is, the logistic regression curve, the sigmoid function, and a demo on how to classify a tumor as malignant or benign based on its features. Machine Learning algorithms can help computers play chess, perform surgeries, and get smarter and more personal. K-Means and logistic regression are two widely used Machine Learning algorithms which we are going to discuss in this video. Logistic Regression is used to estimate discrete values (usually binary values like 0/1) from a set of independent variables. It helps to predict the probability of an event by fitting data to a logit function. It is also called logit regression. K-means clustering is an unsupervised learning algorithm. In this case, you don't have labeled data, unlike in supervised learning. You have a set of data that you want to group into clusters, which means objects that are similar in nature and similar in characteristics need to be put together. This is what K-means clustering is all about. Now, let us get started and understand K-Means clustering and logistic regression in detail.
The below topics are explained in this Machine Learning tutorial, part 2:
1. Clustering
- What is clustering?
- K-Means clustering
- Flowchart to understand K-Means clustering
- Demo - Clustering of cars based on brands
2. Logistic regression
- What is logistic regression?
- Logistic regression curve & Sigmoid function
- Demo - Classify a tumor as malignant or benign based on features
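As a rough companion to the tutorial topics above, here is a hedged scikit-learn sketch of both algorithms on built-in/synthetic data; the car-brand and tumor demos themselves are not reproduced, and the dataset choices below are assumptions.

```python
# Hedged sketch of the two algorithms discussed above, using scikit-learn on
# built-in/synthetic data rather than the car-brand or tumor datasets from the demo.
from sklearn.datasets import load_breast_cancer, make_blobs
from sklearn.cluster import KMeans
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# K-Means: group unlabeled points into 3 clusters.
X_blobs, _ = make_blobs(n_samples=300, centers=3, random_state=0)
kmeans = KMeans(n_clusters=3, n_init=10, random_state=0).fit(X_blobs)
print("cluster centers:\n", kmeans.cluster_centers_)

# Logistic regression: predict a binary label (malignant/benign) from features.
X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)
logreg = LogisticRegression(max_iter=5000).fit(X_train, y_train)
print("test accuracy:", logreg.score(X_test, y_test))
```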
About Simplilearn Machine Learning course:
A form of artificial intelligence, Machine Learning is revolutionizing the world of computing as well as all people’s digital interactions. Machine Learning powers such innovative automated technologies as recommendation engines, facial recognition, fraud protection and even self-driving cars. This Machine Learning course prepares engineers, data scientists and other professionals with the knowledge and hands-on skills required for certification and job competency in Machine Learning.
We recommend this Machine Learning training course for the following professionals in particular:
1. Developers aspiring to be a data scientist or Machine Learning engineer
2. Information architects who want to gain expertise in Machine Learning algorithms
3. Analytics professionals who want to work in Machine Learning or artificial intelligence
4. Graduates looking to build a career in data science and Machine Learning
Learn more at: https://www.simplilearn.com/
Decision tree knowledge discovery through neural networks
Structure of decision trees and neural networks
How they work
Models
Working
Knowledge discovery
Clustering
Opendatabay - Open Data Marketplace.pptx - Opendatabay
Opendatabay.com unlocks the power of data for everyone. Open Data Marketplace fosters a collaborative hub for data enthusiasts to explore, share, and contribute to a vast collection of datasets.
First ever open hub for data enthusiasts to collaborate and innovate. A platform to explore, share, and contribute to a vast collection of datasets. Through robust quality control and innovative technologies like blockchain verification, opendatabay ensures the authenticity and reliability of datasets, empowering users to make data-driven decisions with confidence. Leverage cutting-edge AI technologies to enhance the data exploration, analysis, and discovery experience.
From intelligent search and recommendations to automated data productisation and quotation, Opendatabay's AI-driven features streamline the data workflow. Finding the data you need shouldn't be complex. Opendatabay simplifies the data acquisition process with an intuitive interface and robust search tools. Effortlessly explore, discover, and access the data you need, allowing you to focus on extracting valuable insights. Opendatabay breaks new ground with dedicated, AI-generated synthetic datasets.
Leverage these privacy-preserving datasets for training and testing AI models without compromising sensitive information. Opendatabay prioritizes transparency by providing detailed metadata, provenance information, and usage guidelines for each dataset, ensuring users have a comprehensive understanding of the data they're working with. By leveraging a powerful combination of distributed ledger technology and rigorous third-party audits Opendatabay ensures the authenticity and reliability of every dataset. Security is at the core of Opendatabay. Marketplace implements stringent security measures, including encryption, access controls, and regular vulnerability assessments, to safeguard your data and protect your privacy.
Adjusting primitives for graph : SHORT REPORT / NOTES - Subhajit Sahu
Graph algorithms, like PageRank, commonly operate on Compressed Sparse Row (CSR), an adjacency-list based graph representation.
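For context, a minimal, hedged Python sketch of a CSR adjacency representation follows; the notes' actual experiments use C++/OpenMP/CUDA kernels, which are not shown.

```python
# Hedged sketch: a minimal Compressed Sparse Row (CSR) adjacency representation
# in Python (the experiments listed below use OpenMP/CUDA code, not this).
def to_csr(num_vertices, edges):
    """edges: iterable of (u, v) pairs; returns (offsets, targets)."""
    degree = [0] * num_vertices
    for u, _ in edges:
        degree[u] += 1
    offsets = [0] * (num_vertices + 1)
    for u in range(num_vertices):
        offsets[u + 1] = offsets[u] + degree[u]
    targets = [0] * offsets[num_vertices]
    cursor = offsets[:-1].copy()
    for u, v in edges:
        targets[cursor[u]] = v
        cursor[u] += 1
    return offsets, targets

offsets, targets = to_csr(4, [(0, 1), (0, 2), (1, 2), (2, 3)])
print(targets[offsets[0]:offsets[1]])  # neighbours of vertex 0 -> [1, 2]
```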
Multiply with different modes (map)
1. Performance of sequential execution based vs OpenMP based vector multiply.
2. Comparing various launch configs for CUDA based vector multiply.
Sum with different storage types (reduce)
1. Performance of vector element sum using float vs bfloat16 as the storage type.
Sum with different modes (reduce)
1. Performance of sequential execution based vs OpenMP based vector element sum.
2. Performance of memcpy vs in-place based CUDA based vector element sum.
3. Comparing various launch configs for CUDA based vector element sum (memcpy).
4. Comparing various launch configs for CUDA based vector element sum (in-place).
Sum with in-place strategies of CUDA mode (reduce)
1. Comparing various launch configs for CUDA based vector element sum (in-place).
Chatty Kathy - UNC Bootcamp Final Project Presentation - Final Version - 5.23... - John Andrews
SlideShare Description for "Chatty Kathy - UNC Bootcamp Final Project Presentation"
Title: Chatty Kathy: Enhancing Physical Activity Among Older Adults
Description:
Discover how Chatty Kathy, an innovative project developed at the UNC Bootcamp, aims to tackle the challenge of low physical activity among older adults. Our AI-driven solution uses peer interaction to boost and sustain exercise levels, significantly improving health outcomes. This presentation covers our problem statement, the rationale behind Chatty Kathy, synthetic data and persona creation, model performance metrics, a visual demonstration of the project, and potential future developments. Join us for an insightful Q&A session to explore the potential of this groundbreaking project.
Project Team: Jay Requarth, Jana Avery, John Andrews, Dr. Dick Davis II, Nee Buntoum, Nam Yeongjin & Mat Nicholas
Data Centers - Striving Within A Narrow Range - Research Report - MCG - May 2... - pchutichetpong
M Capital Group (“MCG”) expects demand and the evolution of supply to be shaped by institutional investment rotating out of offices and into work from home (“WFH”), alongside the ever-expanding need for data storage as global internet usage grows, with experts predicting 5.3 billion users by 2023. These market factors will be underpinned by technological changes, such as progressing cloud services and edge sites, allowing the industry to see strong expected annual growth of 13% over the next 4 years.
Whilst competitive headwinds remain, represented through the recent second bankruptcy filing of Sungard, which blames “COVID-19 and other macroeconomic trends including delayed customer spending decisions, insourcing and reductions in IT spending, energy inflation and reduction in demand for certain services”, the industry has seen key adjustments, where MCG believes that engineering cost management and technological innovation will be paramount to success.
MCG reports that the more favorable market conditions expected over the next few years, helped by the winding down of pandemic restrictions and a hybrid working environment will be driving market momentum forward. The continuous injection of capital by alternative investment firms, as well as the growing infrastructural investment from cloud service providers and social media companies, whose revenues are expected to grow over 3.6x larger by value in 2026, will likely help propel center provision and innovation. These factors paint a promising picture for the industry players that offset rising input costs and adapt to new technologies.
According to M Capital Group: “Specifically, the long-term cost-saving opportunities available from the rise of remote managing will likely aid value growth for the industry. Through margin optimization and further availability of capital for reinvestment, strong players will maintain their competitive foothold, while weaker players exit the market to balance supply and demand.”
Techniques to optimize the pagerank algorithm usually fall in two categories. One is to try reducing the work per iteration, and the other is to try reducing the number of iterations. These goals are often at odds with one another. Skipping computation on vertices which have already converged has the potential to save iteration time. Skipping in-identical vertices, with the same in-links, helps reduce duplicate computations and thus could help reduce iteration time. Road networks often have chains which can be short-circuited before pagerank computation to improve performance. Final ranks of chain nodes can be easily calculated. This could reduce both the iteration time, and the number of iterations. If a graph has no dangling nodes, pagerank of each strongly connected component can be computed in topological order. This could help reduce the iteration time, no. of iterations, and also enable multi-iteration concurrency in pagerank computation. The combination of all of the above methods is the STICD algorithm. [sticd] For dynamic graphs, unchanged components whose ranks are unaffected can be skipped altogether.
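As a point of reference for the techniques described above, here is a minimal, hedged power-iteration PageRank in Python that skips recomputation of already-converged vertices; the in-identical-vertex, chain, SCC-ordering, and STICD combinations are not implemented here.

```python
# Hedged sketch: plain power-iteration PageRank that skips recomputation of
# vertices whose rank has already converged; assumes no dangling vertices.
def pagerank(num_vertices, edges, damping=0.85, tol=1e-10, max_iter=100):
    out_deg = [0] * num_vertices
    in_edges = [[] for _ in range(num_vertices)]
    for u, v in edges:
        out_deg[u] += 1
        in_edges[v].append(u)
    rank = [1.0 / num_vertices] * num_vertices
    converged = [False] * num_vertices
    base = (1.0 - damping) / num_vertices
    for _ in range(max_iter):
        changed = False
        new_rank = rank[:]
        for v in range(num_vertices):
            if converged[v]:
                continue  # skip vertices whose rank has stopped changing
            r = base + damping * sum(rank[u] / out_deg[u] for u in in_edges[v])
            if abs(r - rank[v]) < tol:
                converged[v] = True
            else:
                changed = True
            new_rank[v] = r
        rank = new_rank
        if not changed:
            break
    return rank

print(pagerank(4, [(0, 1), (1, 2), (2, 0), (2, 3), (3, 0)]))
```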
As Europe's leading economic powerhouse and the fourth-largest economy globally, Germany stands at the forefront of innovation and industrial might. Renowned for its precision engineering and high-tech sectors, Germany's economic structure is heavily supported by a robust service industry, accounting for approximately 68% of its GDP. This economic clout and strategic geopolitical stance position Germany as a focal point in the global cyber threat landscape.
In the face of escalating global tensions, particularly those emanating from geopolitical disputes with nations like Russia and China, Germany has witnessed a significant uptick in targeted cyber operations. Our analysis indicates a marked increase in cyberattack sophistication aimed at critical infrastructure and key industrial sectors. These attacks range from ransomware campaigns to Advanced Persistent Threats (APTs), threatening national security and business integrity.
🔑 Key findings include:
🔍 Increased frequency and complexity of cyber threats.
🔍 Escalation of state-sponsored and criminally motivated cyber operations.
🔍 Active dark web exchanges of malicious tools and tactics.
Our comprehensive report delves into these challenges, using a blend of open-source and proprietary data collection techniques. By monitoring activity on critical networks and analyzing attack patterns, our team provides a detailed overview of the threats facing German entities.
This report aims to equip stakeholders across public and private sectors with the knowledge to enhance their defensive strategies, reduce exposure to cyber risks, and reinforce Germany's resilience against cyber threats.
Data Science with Python
Facial Expression Recognition
Final Project Report - OPIM 5894 - Data Science with Python
Team Brogrammers: Santanu Paul, Sree Inturi, Saurav Gupta, Vibhuti Upadhyay, Sunender Pothula
Nov 30 2017
Table of Contents
Facial Expression Recognition
1. Introduction
1.1 Background – What is Representation Learning?
1.2 Research Objectives
2. Data Description and Exploration
2.1 About the Dataset
2.2 Data Exploration
2.3 Data Preprocessing
3. Dimensionality Reduction
3.1 Curse of Dimensionality
3.2 Principal Component Analysis
Takeaway from the plot
Visualizing the Eigenvalues
Interactive visualizations of PCA representation
Improvements
3.3 Linear Discriminant Analysis
4. Modeling
4.1 Support Vector Machine
4.2 Neural Networks
4.3 Conclusion
5. Scope for Improvement
5.1 CNN (Convolutional Neural Network) and Parameter Tuning
Attachments – Python Notebooks and Code
1. Introduction
1.1 Background – What is Representation Learning?
Older machine learning algorithms rely on the input being a set of features and then learn a classifier, regressor, etc. on top of them. Most of these features are hand-crafted, i.e. designed by humans; classical examples of features in computer vision include SIFT, LBP, etc. The problem with these is that they are designed by humans based on heuristics. Images can be represented using these features and ML algorithms can be applied on top of that, but they may not be optimal in terms of the objective function, i.e. it may be possible to design better features that lead to lower objective function values. Instead of hand-crafting these image representations, we can learn them. That is known as representation learning. We can have a neural network that takes the image as an input and outputs a vector, which is the feature representation of the image. This is the representation learner. It can be followed by another neural network that acts as the classifier, regressor, etc.
For example: A wheel has a geometric shape, but its image may be complicated by
shadows falling on the wheel, the sun glaring off the metal parts of the wheel, the fender of
the car or an object in the foreground obscuring part of the wheel, and so on. We can try to
manually describe what a wheel should look like and how it can be represented. Say, it should
be circular, be black in color, have treads, etc. But these are all hand-crafted features and
may not generalize to all situations. For example, if you look at the wheel from a different
angle, it might be oval in shape. Or the lighting may cause it to have lighter and darker
patches. These kinds of variations are hard to account for manually. Instead, we can let the
representation learning neural network learn them from data by giving it several positive and
negative examples of a wheel and training it end to end.
1.2 Research Objectives
The major objective of this project is to classify an image using its facial expression.
(1) Image Classification
We present a method for the classification of facial expressions from the analysis of facial deformations. The classification process is based on Convolutional Neural Networks, which classify an image as "Happy" or "Sad". Our neural network model extracts an expression skeleton of facial features. We also demonstrate the efficiency of our classifier. Our classifier was compared with PCA and LDA classifiers working on the same data.
2. Data Description and Exploration
The data set used in this project is Challenges in Representation Learning: Facial
Expression Recognition Challenge, which contains 48x48 pixel grayscale images of faces. The
faces have been automatically registered so that the face is more or less centered and occupies
about the same amount of space in each image. The task is to categorize each face, based on
the emotion shown in the facial expression, into one of two categories (3 = Happy, 4 = Sad).
2.1 About the Dataset
The training set consists of 15,066 examples (Happy: 8989, Sad: 6077) and two columns, "emotion" and "pixels". The "emotion" column contains a numeric code, 3 or 4, for the emotion that is present in the image. The "pixels" column contains a quoted string for each image; the contents of this string are space-separated pixel values in row-major order.
Similarly, the test set used for the leaderboard consists of 3,589 examples and contains only the "pixels" column; our task is to predict the emotion (Happy or Sad).
There were no missing values in our data set; it was a clean dataset.
Value Counts of our data points: Our data set is quite balanced.
Screenshot of data
2.2 Data Exploration
As we had pixel information in the pixel column, our first goal is to split the pixel column into multiple fields, so that we get a rough idea of what a final 48x48 picture looks like.
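A minimal, hedged sketch of this step follows, assuming the training data is in a pandas DataFrame with the "emotion" and "pixels" columns described above; the file name and variable names are placeholders, not the project's actual notebook code.

```python
# Hedged sketch: split the space-separated "pixels" string into a 48x48 array and
# display one face. The CSV file name below is hypothetical.
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

df = pd.read_csv("fer_happy_sad.csv")  # hypothetical file with "emotion" and "pixels"

# One row per image: 2304 pixel values plus the emotion label.
pixels = np.stack([np.array(s.split(), dtype=np.float32) for s in df["pixels"]])
labels = df["emotion"].values  # 3 = Happy, 4 = Sad

plt.imshow(pixels[0].reshape(48, 48), cmap="gray")
plt.title("Happy" if labels[0] == 3 else "Sad")
plt.show()
```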
Let's see what the emotions look like (sample images: Happy Face and Sad Face).
2.3 Data Preprocessing
Standardization: Standardization is a good practice for many machine learning algorithms. Although our data is already on the same scale, i.e. values from 1 to 255, we still preferred to standardize it.
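A one-line hedged sketch of this step, reusing the `pixels` array from the previous sketch:

```python
# Hedged sketch: standardize the flattened pixel matrix (zero mean, unit variance
# per pixel column), reusing the `pixels` array from the previous sketch.
from sklearn.preprocessing import StandardScaler

X_std = StandardScaler().fit_transform(pixels)
```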
3. Dimensionality Reduction
3.1 Curse of Dimensionality
This term is often thrown about, especially when PCA or LDA enters the mix. The phrase refers to how our perfectly good and reliable machine learning methods may suddenly perform badly when we are dealing with a very high-dimensional space. But what exactly do these two acronyms do? They are essentially transformation methods used for
dimensionality reduction. Therefore, if we are able to project our data from a higher-dimensional
space to a lower one while keeping most of the relevant information, that would make life a lot
easier for our learning methods.
In our data, the 48 x 48 pixel images contribute 2304 columns. Modeling in such a high-dimensional space could make our model perform badly, so it is the perfect time to introduce dimensionality reduction methods.
3.2 Principal Component Analysis
In a nutshell, PCA is a linear transformation algorithm that seeks to project the original
features of our data onto a smaller set of features (or subspace) while still retaining most of the
information. To do this the algorithm tries to find the most appropriate directions/angles (which
are the principal components) that maximize the variance in the new subspace.
We know that principal components are orthogonal to each other. As such when
generating the covariance matrix in our new subspace, the off-diagonal values of the covariance
matrix will be zero and only the diagonals (or eigenvalues) will be non-zero. It is these diagonal
values that represent the variances of the principal components i.e. the information about the
variability of our features.
This is what our final preprocessed data looks like:
The method follows (a code sketch is given below):
1. Standardize the data (already done)
2. Compute the eigenvectors and eigenvalues of the covariance matrix
3. Create a list of (eigenvalue, eigenvector) tuples
4. Sort the (eigenvalue, eigenvector) pairs from high to low
5. Calculate the explained variance from the eigenvalues
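Here is a minimal, hedged NumPy sketch of these steps, using the standardized matrix X_std from the preprocessing sketch; the variable names are assumptions, not the notebook's own.

```python
# Hedged sketch of the steps listed above, on the standardized matrix X_std.
import numpy as np

cov = np.cov(X_std, rowvar=False)                  # 2304 x 2304 covariance matrix
eig_vals, eig_vecs = np.linalg.eigh(cov)           # eigh: covariance is symmetric
order = np.argsort(eig_vals)[::-1]                 # sort pairs from high to low
eig_vals, eig_vecs = eig_vals[order], eig_vecs[:, order]

explained = eig_vals / eig_vals.sum()              # explained variance per component
cumulative = np.cumsum(explained)
n_90 = int(np.searchsorted(cumulative, 0.90)) + 1  # ~107 components in the report
X_pca = X_std @ eig_vecs[:, :n_90]                 # project onto the top components
```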
Takeaway from the plot:
There are two plots above, a smaller one embedded within the larger plot. The smaller
plot (Green and Red) shows the distribution of the Individual and Explained variances across all
features while the larger plot (Golden and black) portrays a zoomed section of the explained
variances only.
As we can see, out of our 2304 features or columns, approximately 90% of the explained variance can be captured by just over 107 features. So, if we wanted to implement PCA on this, extracting the top 107 components would be a very logical choice, as they already account for the majority of the variance in the data.
Visualizing the Eigenvalues:
As alluded to above, the PCA method seeks to obtain the optimal directions (or eigenvectors) that capture the most variance (spread out the data points the most). Therefore, it may be informative to visualize these directions and their associated eigenvalues. For the purposes of this notebook, and for speed, we invoke PCA to extract only the top 28 components. Of interest is that when one compares the first component, "Eigenvalue 1", to the 28th component, "Eigenvalue 28", it is obvious that more complicated directions or components are being generated in the search to maximize variance in the new feature subspace.
Interactive visualizations of PCA representation
When it comes to these dimensionality reduction methods, scatter plots are most
commonly implemented because they allow for great and convenient visualizations of clustering
(if any existed) and this will be exactly what we will be doing as we plot the first 2 principal
components as follows. We observed that there are no observable clusters for the first two principal components.
Improvements:
Looking at the reconstruction of the original image vs the image generated after PCA, it appears that the reconstructed images are not similar enough to the original ones to discern them categorically. Facial expressions can be subtle, and a lot more information will be needed to detect them.
Sometimes even the naked eye fails to read the reconstructed images' emotions. Hence, 90% is not enough information. Let's move to 95% variance (259 components).
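A hedged scikit-learn sketch of this reconstruction comparison follows, again assuming the X_std matrix from the earlier sketches; the component counts selected may differ slightly from the report's 107/259.

```python
# Hedged sketch: reconstruct faces from a reduced PCA representation and compare
# retained-variance levels (uses X_std from earlier).
import matplotlib.pyplot as plt
from sklearn.decomposition import PCA

for variance in (0.90, 0.95):                       # ~107 and ~259 components
    pca = PCA(n_components=variance).fit(X_std)
    reconstructed = pca.inverse_transform(pca.transform(X_std[:1]))
    plt.figure()
    plt.imshow(reconstructed.reshape(48, 48), cmap="gray")
    plt.title(f"{int(variance * 100)}% variance, {pca.n_components_} components")
plt.show()
```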
But as we know, PCA is an unsupervised method and therefore is not optimized for separating different class labels. Classifying more accurately is what we try to accomplish with the very next method, i.e. LDA.
3.3 Linear Discriminant Analysis
LDA, much like PCA is also a linear transformation method commonly used in
dimensionality reduction tasks. However, unlike the latter which is an unsupervised learning
algorithm, LDA falls into the class of supervised learning methods. As such, given the available information about class labels, LDA seeks to maximize the separation between the different classes by computing the component axes (linear discriminants) that do this.
LDA Implementation from Scratch
The objective of LDA is to preserve the class separation information whilst still reducing
the dimensions of the dataset. As such implementing the method from scratch can roughly be
split into 4 distinct stages as below.
A. Projected Means
Since this method was designed to take into account class labels we therefore first need to
establish a suitable metric with which to measure the 'distance' or separation between different
classes. Let's assume that we have a set of data points x that belong to one particular class w.
Therefore, in LDA the first step is to project these points onto a new line, Y, that contains the
class-specific information via the transformation
$$Y = \omega^{\intercal} x$$
With this the idea is to find some method that maximizes the separation of these new projected
variables. To do so, we first calculate the projected mean.
B. Scatter Matrices and their solutions: Having introduced our projected means, we now need
to find a function that can represent the difference between the means and then maximize it. Like
in linear regression, where the most basic case is to find the line of best fit, we need to find the equivalent of the variance in this context. Hence this is where we introduce scatter matrices, where the scatter is the equivalent of the variance.
$$\tilde{S}^{2} = (y - \tilde{\mu})^{2}$$
C. Selecting Optimal Projection Matrices
D. Transforming features onto new subspace
LDA Implementation via sklearn: We used scikit-learn's built-in LDA class, and hence we invoke an LDA model as follows:
The syntax for the LDA implementation is very much like PCA, whereby one calls the fit and transform methods, which fit the LDA model to the data and then apply the LDA dimensionality reduction to it. However, since LDA is a supervised learning algorithm, there is a second argument that the user must provide: the class labels, which in this case are the emotion labels.
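A hedged sketch of that call, with X_std and the emotion labels assumed from the earlier sketches:

```python
# Hedged sketch of the scikit-learn LDA call described above.
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

lda = LinearDiscriminantAnalysis()
X_lda = lda.fit_transform(X_std, labels)  # class labels are the second argument
print(X_lda.shape)  # with two classes, LDA yields a single linear discriminant
```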
Interactive visualizations of LDA representation:
From the scatter plot above, we can see that the data points are more clearly clustered when using LDA than when using PCA. This is an inherent advantage of having class labels to supervise the method with.
4. Modeling
4.1 Support Vector Machine
SVM can be considered as an extension of the perceptron. Using the perceptron algorithm, we
can minimize misclassification errors. However, in SVMs, our optimization objective is
to maximize the margin between the classes. The margin is defined as the distance between the
separating hyperplane (decision boundary) and the training samples (support vectors) that are
closest to this hyperplane.
Input X: components from PCA, i.e. 107 components.
Running an SVM classifier with default parameters on it, we get an accuracy of 62%.
Input X: components from PCA, i.e. 259 components.
Running an SVM classifier with default parameters on it, we get an accuracy of 65%.
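A hedged sketch of this experiment with scikit-learn's default SVC, reusing X_pca and labels from the earlier sketches; the 62%/65% figures above are the report's results and are not guaranteed by this sketch.

```python
# Hedged sketch: SVM with default parameters on the PCA-reduced features.
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

X_train, X_test, y_train, y_test = train_test_split(X_pca, labels, random_state=0)
clf = SVC().fit(X_train, y_train)          # default RBF kernel, default C
print("SVM accuracy on PCA features:", clf.score(X_test, y_test))
```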
Input X: output from LDA, i.e. LD 1.
Running an SVM classifier with default parameters, we get an accuracy of 66.4%.
The misclassification rate is 33.6%; we will try to fit a neural network model so that our model classifies with more accuracy.
4.2 Neural Networks
A neural network is a computational model that works in a similar way to the neurons in the human brain: each neuron takes an input, performs some operations, then passes the output to the following neuron. As we are done pre-processing and splitting our dataset, we can start implementing our neural network.
We have designed a simple neural network with one hidden layer, i.e. a vanilla NN with 50 nodes and the hyperbolic tangent activation function.
We have used a simple neural network with one hidden layer having 50 nodes. The learning rate used is also quite low in order to find the optimum solution. A mix of gradient descent and the momentum method is used. The hyperbolic tangent function is applied in the hidden layer, and a cross-entropy loss function is used on the softmax output. An accuracy of 65.8% was achieved.
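A hedged approximation of this setup using scikit-learn's MLPClassifier; the report's own implementation is likely custom, so the exact loss and training details below are assumptions, and the 65.8% result is not reproduced here.

```python
# Hedged sketch: a one-hidden-layer network approximating the described setup
# (50 tanh units, SGD with momentum, low learning rate), reusing the PCA split above.
from sklearn.neural_network import MLPClassifier

mlp = MLPClassifier(hidden_layer_sizes=(50,), activation="tanh",
                    solver="sgd", learning_rate_init=0.001, momentum=0.9,
                    max_iter=300, random_state=0)
mlp.fit(X_train, y_train)
print("neural network accuracy:", mlp.score(X_test, y_test))
```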
The maximum accuracy is achieved rather quickly in this method using gradient descent and
momentum.
4.3 Conclusion
Our model misclassifies roughly 33 times out of 100. We looked at the original images to see which features it is not able to predict correctly. Pictures like the following are what our model cannot predict correctly, perhaps because of the hair, the eyes, or the lighting. As the image set is quite varied, there may be some error there. Because of the time constraint we were not able to run a CNN (Convolutional Neural Network) on the dataset, but that would be our next step.
Many pictures in our data had watermarks just like this one and were misclassified. The majority of our training data does not have watermarks, which is also a reason the model is not able to classify to its maximum capacity.
5. Scope for Improvement
5.1 CNN (Convolutional Neural Network) and Parameter Tuning
We were not able to tune the parameters of our neural network model because of the time crunch, and it took a lot of time to train on this huge dataset. So, going forward, not for the grades but for our own learning, we will be focusing on TensorFlow and CNNs.
Traditional neural networks that do image classification have many more parameters and take a lot of time if trained on a CPU. CNNs are faster and are applied heavily in image and video recognition, recommender systems, and natural language processing. CNNs share weights in convolutional layers, which means that the same filter weight bank is used for each receptive field in the layer; this reduces the memory footprint and improves performance.
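A minimal, hedged Keras sketch of a starting-point CNN for this task follows; the architecture is illustrative, not one the team trained or tuned.

```python
# Hedged sketch: a minimal Keras CNN as a possible next step; architecture is
# illustrative only, not the team's actual model.
from tensorflow.keras import layers, models

cnn = models.Sequential([
    layers.Input(shape=(48, 48, 1)),
    layers.Conv2D(32, (3, 3), activation="relu"),   # shared filter weights per layer
    layers.MaxPooling2D((2, 2)),
    layers.Conv2D(64, (3, 3), activation="relu"),
    layers.MaxPooling2D((2, 2)),
    layers.Flatten(),
    layers.Dense(64, activation="relu"),
    layers.Dense(2, activation="softmax"),          # Happy vs Sad
])
cnn.compile(optimizer="adam", loss="sparse_categorical_crossentropy",
            metrics=["accuracy"])
# cnn.fit(pixels.reshape(-1, 48, 48, 1) / 255.0, (labels == 4).astype(int), epochs=10)
```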
Attachments – Python Notebooks and Code
1. Python Project_Image Classification.ipynb
Initial Data exploration and Preprocessing. Dimensionality Reduction by PCA, LDA
2. Python Project_Image Classification2.ipynb
SVM Implementation on top of PCA and LDA (Comparison)
3. Vanilla Neural Network.ipynb
Neural Network Implementation