Principal Component Analysis
(Dimensionality Reduction)
By:
Tarun Bhatia
Y7475
Overview:
• What is Principal Component Analysis
• Computing the components in PCA
• Dimensionality Reduction using PCA
• A 2D example in PCA
• Applications of PCA in computer vision
• Importance of PCA in analysing data in higher dimensions
• Questions
Principal Component Analysis
• Most common form of factor analysis
• The new variables/dimensions
– Are linear combinations of the original ones
– Are uncorrelated with one another
• Orthogonal in original dimension space
– Capture as much of the original variance in the
data as possible
– Are called Principal Components
What are the new axes?
• Orthogonal directions of greatest variance in data
• Projections along PC1 discriminate the data most along any one axis
Principal Components
• First principal component is the direction of greatest
variability (variance) in the data
• Second is the next orthogonal (uncorrelated) direction of greatest variability
– So first remove all the variability along the first
component, and then find the next direction of
greatest variability
• And so on …
Principal Components Analysis (PCA)
• Principle
– Linear projection method to reduce the number of parameters
– Transform a set of correlated variables into a new set of uncorrelated
variables
– Map the data into a space of lower dimensionality
– Form of unsupervised learning
• Properties
– It can be viewed as a rotation of the existing axes to new positions in the
space defined by the original variables
– New axes are orthogonal and represent the directions with maximum
variability
Computing the Components
• Data points are vectors in a multidimensional space
• Projection of vector x onto an axis (dimension) u is u·x
• Direction of greatest variability is that in which the average square of the
projection is greatest
– I.e. u such that E((u·x)²) over all x is maximized
– (we subtract the mean along each dimension, and center the original axis
system at the centroid of all data points, for simplicity)
– This direction of u is the direction of the first Principal Component
Computing the Components
• E((u·x)²) = E((uᵀx)(xᵀu)) = uᵀ E(x·xᵀ) u
• The matrix S = E(x·xᵀ) contains the correlations (similarities) of the original axes, based
on how the data values project onto them
• So we are looking for the u that maximizes uᵀSu, subject to u being unit-length
• It is maximized when u is the principal eigenvector of the matrix S, in which case
– uᵀSu = uᵀλu = λ if u is unit-length, where λ is the principal eigenvalue of S
– The eigenvalue denotes the amount of variability captured along that dimension
Why the Eigenvectors?
Maximise uᵀxxᵀu s.t. uᵀu = 1
Construct the Lagrangian uᵀxxᵀu – λuᵀu
Set the vector of partial derivatives to zero:
xxᵀu – λu = (xxᵀ – λI)u = 0
Since u ≠ 0, u must be an eigenvector of xxᵀ with eigenvalue λ
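This derivation can be checked numerically. A minimal NumPy sketch (not part of the original slides; the toy data is made up) that centers the data, takes the principal eigenvector of E(x·xᵀ), and confirms that the average squared projection along it equals the top eigenvalue:

    import numpy as np

    rng = np.random.default_rng(0)
    X = rng.normal(size=(500, 3)) * [3.0, 1.0, 0.3]   # toy data: 3 dimensions, unequal spread
    Xc = X - X.mean(axis=0)                           # center the axes at the centroid

    S = Xc.T @ Xc / len(Xc)                           # E(x x^T) for the centered data
    lam, U = np.linalg.eigh(S)                        # eigenvalues in ascending order
    u1 = U[:, -1]                                     # principal eigenvector (unit length)

    proj_var = np.mean((Xc @ u1) ** 2)                # E((u·x)²) along u1
    print(np.isclose(proj_var, lam[-1]))              # True: equals the principal eigenvalue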
Computing the Components
• Similarly for the next axis, etc.
•So, the new axes are the eigenvectors of the matrix of correlations of the
original variables, which captures the similarities of the original variables
based on how data samples project to them
• Geometrically: centering followed by rotation
– Linear transformation
PCs, Variance and Least-Squares
• The first PC retains the greatest amount of variation in the sample
• The kth PC retains the kth greatest fraction of the variation in the sample
•The kth largest eigenvalue of the correlation matrix C is the variance in the
sample along the kth PC
•The least-squares view: PCs are a series of linear least
squares fits to a sample, each orthogonal to all previous ones
How Many PCs?
• For n original dimensions, the correlation matrix is
n×n, and has up to n eigenvectors. So n PCs.
•Where does dimensionality reduction come
from?
Dimensionality Reduction
Can ignore the components of lesser significance.
You do lose some information, but if the eigenvalues are small, you don’t lose
much
– n dimensions in original data
– calculate n eigenvectors and eigenvalues
– choose only the first p eigenvectors, based on their eigenvalues
– final data set has only p dimensions
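A generic helper along these lines, as a sketch only (the function name and interface are not from the slides), keeps the first p eigenvectors and reports how much of the variance is retained:

    import numpy as np

    def pca_reduce(X, p):
        """Project X (rows = samples, columns = dimensions) onto its first p PCs."""
        Xc = X - X.mean(axis=0)                  # subtract the mean along each dimension
        S = np.cov(Xc, rowvar=False)             # covariance matrix of the centered data
        eigvals, eigvecs = np.linalg.eigh(S)     # eigenvalues in ascending order
        order = np.argsort(eigvals)[::-1]        # order components by eigenvalue, high to low
        W = eigvecs[:, order[:p]]                # keep only the first p eigenvectors
        retained = eigvals[order[:p]].sum() / eigvals.sum()
        return Xc @ W, retained                  # p-dimensional data + fraction of variance kept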
Eigenvectors of a Correlation Matrix
PCA Example –STEP 1
• Subtract the mean from each of the data dimensions: all the x values
have the mean of x subtracted, and all the y values have the mean of y
subtracted from them. This produces a data set whose mean is zero.
Subtracting the mean makes the variance and covariance calculations
easier by simplifying their equations. The variance and covariance
values are not affected by the mean value.
PCA Example –STEP 1
DATA:
x y
2.5 2.4
0.5 0.7
2.2 2.9
1.9 2.2
3.1 3.0
2.3 2.7
2 1.6
1 1.1
1.5 1.6
1.1 0.9
ZERO MEAN DATA:
x y
.69 .49
-1.31 -1.21
.39 .99
.09 .29
1.29 1.09
.49 .79
.19 -.31
-.81 -.81
-.31 -.31
-.71 -1.01
http://kybele.psych.cornell.edu/~edelman/Psych-465-Spring-2003/PCA-tutorial.pdf
PCA Example –STEP 1
http://kybele.psych.cornell.edu/~edelman/Psych-465-Spring-2003/PCA-tutorial.pdf
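For reference, a NumPy sketch of this step (not part of the slides) that reproduces the zero-mean table above:

    import numpy as np

    # The ten 2-D points from the example, one (x, y) pair per row
    data = np.array([[2.5, 2.4], [0.5, 0.7], [2.2, 2.9], [1.9, 2.2], [3.1, 3.0],
                     [2.3, 2.7], [2.0, 1.6], [1.0, 1.1], [1.5, 1.6], [1.1, 0.9]])

    mean = data.mean(axis=0)       # [1.81, 1.91]
    zero_mean = data - mean        # matches the ZERO MEAN DATA table above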
PCA Example –STEP 2
• Calculate the covariance matrix
cov = .616555556 .615444444
.615444444 .716555556
• Since the off-diagonal elements of this covariance
matrix are positive, we should expect that the x
and y variables increase together.
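Continuing the Step 1 sketch (which defined zero_mean), the same matrix is obtained with:

    cov = np.cov(zero_mean, rowvar=False)   # columns are variables; divides by n - 1 = 9
    # cov ≈ [[0.61655556, 0.61544444],
    #        [0.61544444, 0.71655556]]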
PCA Example –STEP 3
• Calculate the eigenvectors and eigenvalues of
the covariance matrix
eigenvalues = .0490833989
1.28402771
eigenvectors = -.735178656 -.677873399
.677873399 -.735178656
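In the same sketch, NumPy's eigen-solver reproduces these values (the ordering of the eigenpairs and the overall sign of each eigenvector may differ between implementations; each column is one eigenvector):

    eigvals, eigvecs = np.linalg.eig(cov)
    # eigvals ≈ [0.0490834, 1.28402771]
    # eigvecs ≈ [[-0.73517866, -0.6778734 ],
    #            [ 0.6778734 , -0.73517866]]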
PCA Example –STEP 3
http://kybele.psych.cornell.edu/~edelman/Psych-465-Spring-2003/PCA-tutorial.pdf
• The eigenvectors are plotted as diagonal dotted lines on the plot.
• Note they are perpendicular to each other.
• Note one of the eigenvectors goes through the middle of the points, like
drawing a line of best fit.
• The second eigenvector gives us the other, less important, pattern in the
data: all the points follow the main line, but are off to the side of the
main line by some amount.
PCA Example –STEP 4
• Reduce dimensionality and form feature vector
The eigenvector with the highest eigenvalue is the principal
component of the data set.
In our example, the eigenvector with the largest eigenvalue
was the one that pointed down the middle of the data.
Once eigenvectors are found from the covariance matrix, the
next step is to order them by eigenvalue, highest to lowest.
This gives you the components in order of significance.
PCA Example –STEP 4
Now, if you like, you can decide to ignore the components of
lesser significance.
You do lose some information, but if the eigenvalues are
small, you don’t lose much
• n dimensions in your data
• calculate n eigenvectors and eigenvalues
• choose only the first p eigenvectors
• final data set has only p dimensions.
PCA Example –STEP 4
• Feature Vector
FeatureVector = (eig1 eig2 eig3 … eign)
We can either form a feature vector with both of the
eigenvectors:
-.677873399  -.735178656
-.735178656   .677873399
or, we can choose to leave out the smaller, less
significant component and only have a single
column:
-.677873399
-.735178656
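In the running sketch, ordering the eigenvectors by eigenvalue and forming the feature vector looks like this (variable names are my own):

    order = np.argsort(eigvals)[::-1]            # indices of eigenvalues, largest first
    feature_vector = eigvecs[:, order]           # both PCs, most significant column first
    feature_vector_1pc = eigvecs[:, order[:1]]   # keep only the most significant PC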
PCA Example –STEP 5
• Deriving the new data
FinalData = RowFeatureVector x RowZeroMeanData
RowFeatureVector is the matrix with the eigenvectors in the
columns transposed so that the eigenvectors are now in the
rows, with the most significant eigenvector at the top
RowZeroMeanData is the mean-adjusted data
transposed, i.e. the data items are in each column,
with each row holding a separate dimension.
PCA Example –STEP 5
FinalData transpose: dimensions
along columns
x y
-.827970186 -.175115307
1.77758033 .142857227
-.992197494 .384374989
-.274210416 .130417207
-1.67580142 -.209498461
-.912949103 .175282444
.0991094375 -.349824698
1.14457216 .0464172582
.438046137 .0177646297
1.22382056 -.162675287
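The same transformation in the running sketch (signs may be flipped relative to the table above, since an eigenvector and its negative describe the same direction):

    row_feature_vector = feature_vector.T        # eigenvectors as rows, PC1 on top
    row_zero_mean_data = zero_mean.T             # one data point per column
    final_data = row_feature_vector @ row_zero_mean_data
    print(final_data.T)                          # matches the table above (up to sign)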
PCA Example –STEP 5
http://kybele.psych.cornell.edu/~edelman/Psych-465-Spring-2003/PCA-tutorial.pdf
Reconstruction of original Data
• If we reduced the dimensionality, obviously, when
reconstructing the data we would lose those
dimensions we chose to discard. In our example let
us assume that we considered only the x dimension (i.e. the first principal component)…
Reconstruction of original Data
http://kybele.psych.cornell.edu/~edelman/Psych-465-Spring-2003/PCA-tutorial.pdf
x
-.827970186
1.77758033
-.992197494
-.274210416
-1.67580142
-.912949103
.0991094375
1.14457216
.438046137
1.22382056
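Sticking with the sketch, keeping only the first principal component and mapping back to the original axes looks like this; the reconstructed points lie exactly on the best-fit line through the data:

    pc1 = feature_vector[:, :1]                  # 2 x 1: the most significant eigenvector
    reduced = pc1.T @ zero_mean.T                # 1 x 10: the single-column data above
    reconstructed = (pc1 @ reduced).T + mean     # approximate original (x, y) points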
Applications in computer vision
PCA to find patterns:
• 20 face images, each of size N×N
• Each image is represented as one long vector of its pixel values
• All 20 image vectors are put into one big matrix
• Performing PCA finds the patterns in the face images
• Identifying faces by measuring differences along the new axes (PCs)
PCA for image compression:
• Compile a dataset of 20 images
• Build the 20×20 covariance matrix (each image is treated as one dimension)
• Compute the eigenvectors and eigenvalues
• Based on the eigenvalues, the 5 dimensions with the smallest eigenvalues can be left out
• 1/4th of the space is saved.
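One way this can be sketched in NumPy (the image data here is a random stand-in; following the slide, each of the 20 images is treated as one dimension):

    import numpy as np

    pixels = np.random.rand(64 * 64, 20)             # hypothetical: one flattened image per column
    centered = pixels - pixels.mean(axis=0)
    cov20 = np.cov(centered, rowvar=False)           # 20 x 20 covariance matrix
    vals, vecs = np.linalg.eigh(cov20)               # eigenvalues in ascending order
    W = vecs[:, -15:]                                # drop the 5 smallest-eigenvalue directions
    compressed = centered @ W                        # 15 of 20 columns kept: 1/4 of the space saved
    approx = compressed @ W.T + pixels.mean(axis=0)  # approximate reconstruction of all 20 images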
Importance of PCA
• In data of high dimensions, where graphical representation is difficult, PCA is a
powerful tool for analysing data and finding patterns in it.
• Data compression is possible using PCA
• Data is expressed most efficiently along perpendicular (uncorrelated) components,
as done in PCA.
Questions:
• What do the eigenvectors of the covariance matrix give us when computing the
principal components?
• At what point in the PCA process can we decide to compress the data?
• Why are the principal components orthogonal?
• How many different covariance values can you calculate for an n-dimensional data
set?
THANK YOU