Dimensionality Reduction
Principal Component Analysis (PCA)
 The central idea of principal component analysis (PCA) is to reduce the
dimensionality of a data set consisting of a large number of interrelated
features while retaining as much as possible of the variation present in the
data set.
 This is achieved by transforming to a new set of features, the principal
components (PCs), which are uncorrelated, and which are ordered so that
the first few retain most of the variation present in all of the original
features.
Mathematics Behind PCA
 PCA can be thought of as an unsupervised learning problem.
 The whole process of PCA can be summarized as follows (a code sketch of these steps appears after the list):
• Standardize the given set of d-dimensional samples using Z-score normalization.
• Compute the covariance matrix of the standardized dataset.
• Compute the eigenvectors and the corresponding eigenvalues of the covariance matrix.
• Sort the eigenvectors by decreasing eigenvalue and choose the eigenvectors corresponding to the largest k eigenvalues to form a d × k matrix W.
• Use this d × k eigenvector matrix W to transform the samples onto the new subspace.
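 The following is a minimal NumPy sketch of these steps. The data in X is a small hypothetical 2-D dataset chosen only for illustration (it is not the dataset used in these slides), and the names X, Z, W, and k are likewise illustrative.

```python
import numpy as np

# Hypothetical 2-D samples (rows = samples, columns = features).
X = np.array([[4.0, 11.0],
              [8.0, 4.0],
              [13.0, 5.0],
              [7.0, 14.0]])

# 1. Standardize each feature with Z-score normalization.
Z = (X - X.mean(axis=0)) / X.std(axis=0)

# 2. Covariance matrix of the standardized data.
C = np.cov(Z, rowvar=False)

# 3. Eigenvalues and eigenvectors of the (symmetric) covariance matrix.
eigvals, eigvecs = np.linalg.eigh(C)

# 4. Sort eigenvectors by decreasing eigenvalue and keep the top k,
#    giving the d x k projection matrix W.
order = np.argsort(eigvals)[::-1]
k = 1
W = eigvecs[:, order[:k]]          # shape (d, k)

# 5. Project the samples onto the new k-dimensional subspace.
Z_projected = Z @ W                # shape (n_samples, k)
print(Z_projected)
```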
 Consider the following two-dimensional dataset with features x1 and x2:
 Our goal is to use PCA to reduce the dimensionality of our dataset from two to one, that is, from R² to R.
 Let's visualize the given dataset:
1. Standardization of the given dataset
 Suppose we want to perform PCA on two features - a person's age and weight. If the unit of weight is grams, then the magnitude of its spread, or variance, will be much larger than that of the age feature.
 The variance of the weight would be on the order of, say, 10,000, while that of age would be around 10.
 Because PCA uses the variance of each feature to reduce the dimensionality, it would focus on extracting information from the features with higher variance and largely ignore the features with lower variance.
 The way to overcome this is to first perform standardization so that all the features are transformed to the same unitless scale. In PCA, we perform Z-score normalization.
 Z-score normalization is performed as follows: for a feature x with mean μ and standard deviation σ, each value is transformed as z = (x - μ) / σ.
 After performing this standardization, the transformed features would each have a mean
of 0 and a standard deviation of 1.
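 A short sketch of this transformation is shown below. The feature values in x are illustrative only (they are not a column of the slide dataset); the standard deviation is computed over the full population (ddof=0), matching the convention used in the worked example that follows.

```python
import numpy as np

# Illustrative feature values (hypothetical, not the slides' x1 column).
x = np.array([63.0, 71.0, 52.0, 58.0, 45.0, 40.0])

mu = x.mean()            # feature mean
sigma = x.std()          # population standard deviation (ddof=0)
z = (x - mu) / sigma     # Z-score normalized feature

print(z.mean())          # ~0 after standardization
print(z.std())           # ~1 after standardization
```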
 As an example, let's manually standardize our first feature x1. We first need to compute the mean and the standard deviation of this feature. We begin with the mean, μ1 = (1/n) Σ x1,i, the average of the n values of x1.
 Next, let's compute the (population) standard deviation, σ1 = sqrt((1/n) Σ (x1,i - μ1)²).
 Now, we can compute each scaled value of x1 as z1,i = (x1,i - μ1) / σ1.
 For example, substituting the first value of x1 (together with μ1 and σ1) gives z1,1 = 1.650, as shown in the table below.
 We repeat this process for the rest of the values in the feature to finally obtain the scaled feature z1.
 Remember that we've only standardized the feature x1 - we need to repeat the entire process
(starting with computing the mean and standard deviation) for the feature x2 also. The data set
after Z-score normalization will be as follows:
Sample      z1        z2
  1        1.650     0.990
  2        0.889     0.078
  3       -0.637     0.990
  4        0.126     0.534
  5       -1.019    -0.835
  6       -1.019    -1.749
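 As a quick check, we can plug the standardized values from the table above into a short script and confirm that each column has approximately zero mean and unit standard deviation; the small deviations come from the three-decimal rounding in the table.

```python
import numpy as np

# Standardized features copied from the table above (rounded to 3 decimals).
z1 = np.array([1.650, 0.889, -0.637, 0.126, -1.019, -1.019])
z2 = np.array([0.990, 0.078, 0.990, 0.534, -0.835, -1.749])

for name, z in [("z1", z1), ("z2", z2)]:
    # Population mean and standard deviation: ~0 and ~1 for both columns.
    print(name, round(z.mean(), 3), round(z.std(), 3))
```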
 Our dataset visually looks like the following after standardization
 As we can see, the layout of the points still looks similar even after standardization,
and they are now centered around the origin!
Finding the principal components
 The next step of PCA is to find a line (a principal component) onto which to project the given samples, chosen so that it captures the relationship between the two features well.
 How well the relationship is captured is measured by how much variance is preserved when the samples are projected onto the line.
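 To make "preserved variance" concrete, the sketch below projects standardized samples onto a few candidate directions and reports the variance of the projected values; the direction with the largest projected variance is the first principal component. The sample values and candidate angles are illustrative, not taken from the slides.

```python
import numpy as np

# Standardized 2-D samples (illustrative values).
Z = np.array([[ 1.2,  0.9],
              [ 0.8,  0.4],
              [-0.3,  0.2],
              [-0.5, -0.6],
              [-1.2, -0.9]])

# Candidate unit-length directions, parameterized by an angle.
for angle in np.deg2rad([0, 30, 45, 60, 90]):
    w = np.array([np.cos(angle), np.sin(angle)])   # unit vector for this direction
    projections = Z @ w                             # scalar projection of each sample onto w
    print(f"angle {np.rad2deg(angle):5.1f} deg -> projected variance {projections.var():.3f}")
```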
 To intuitively understand what is meant by finding the principal components, consider the following example. Suppose we have the following samples:
