Principal Component Analysis
Mason Ziemer
12/2/16
Abstract
One problem that often crops up in data analysis is the presence of a high-dimensional dataset. In this report we explore a specific dimension-reduction technique called Principal Component Analysis (PCA). The aim of this report is to use eigenvectors, eigenvalues, and orthogonality to understand the concept of PCA and to show why it is useful.
Introduction
The aim of Principal Component Analysis lies in its title: finding the principal components of the data. PCA is used to project data into a new, lower-dimensional coordinate system whose axes correspond to the principal components. What is PCA useful for? It reduces the dimensionality of a dataset, which improves the efficiency of running a machine learning algorithm on it; it also simplifies the dataset, allowing those algorithms to run faster. So, what is a principal component? A principal component is the direction along which the data has the most variance. To get a visual, the first principal component of data on the x-y plane is shown below.
As you can see above, the first principal component is the line along which the data varies the most. Suppose we want to project the data along the first principal component only. This would effectively reduce the dimension of the dataset from two to one while retaining as much information as possible. Although the projection discards some of the data, it still holds information from both x and y. This is what the projection looks like.
[Figures: the original data on the x-y plane with the first principal component drawn through it, and the data projected onto that component.]
Here is the data projected onto the first principal component. Now, looking back at the original graph, the second principal component must be orthogonal to the first in order to capture the most remaining variance that the first principal component did not. Here is what the second principal component looks like. If we were to perform PCA and project the data onto the first two principal components, then none of the data would be lost in the transformation. This is because we are transforming the data from the x-y plane, which has two dimensions, into a new two-dimensional space whose axes are the two principal components. Completing this transformation merely rotates the data onto the new axes, and looks like this:
[Figures: the second principal component drawn orthogonal to the first on the x-y plane, and the data rotated onto the new principal component axes.]
As you can see, none of the data has changed; we are just looking at it from a different angle.
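This two-dimensional picture is straightforward to reproduce. Below is a minimal sketch in R; the simulated data and seed are illustrative assumptions, not part of the report's figures.

# Toy 2-D illustration: simulate correlated data, find the first
# principal component, and project the points onto it.
set.seed(1)
x <- rnorm(100)
y <- 0.8 * x + rnorm(100, sd = 0.3)
D <- scale(cbind(x, y), center = TRUE, scale = FALSE)  # center the data
v1 <- eigen(cov(D))$vectors[, 1]          # first principal component
proj <- (D %*% v1) %*% t(v1)              # the 1-D projection, drawn in 2-D
plot(D, asp = 1)
points(proj, col = "red")                 # projected points lie on the PC line
abline(0, v1[2] / v1[1], col = "red")     # the first principal component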
Eigenvalues and Eigenvectors
In mathematical terms, the principal components are the eigenvectors of the covariance matrix of the dataset. The example below illustrates how to obtain the covariance matrix along with its eigenvectors and eigenvalues. The eigenvectors of the covariance matrix point in the directions where the most variance in the data lies. Each eigenvector has a corresponding eigenvalue, a scalar that denotes the amount of variance in the data along that eigenvector. There can only be as many eigenvectors with corresponding eigenvalues as there are variables in the dataset. The larger an eigenvalue, the more of the variance in the data is accounted for. In the previous example, since the data lies on the x-y plane, there are only two eigenvectors with corresponding eigenvalues. For any dataset, the first principal component is the eigenvector that corresponds to the largest eigenvalue. It is also important to note that the matrix formed by the dataset does not have to be square; the variables make up the columns of the matrix, while the observations make up the rows. It does not have to be square because we take the eigenvalues from the covariance matrix, which is always square, as the example below will show.
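In R, the eigen() function returns both pieces at once. Here is a quick sketch verifying the defining relation $Av = \lambda v$ for the first eigen pair; the small symmetric matrix is an arbitrary illustration:

# Verify A v = lambda v for the first eigen pair of a symmetric matrix.
A <- matrix(c(2, 1, 1, 2), nrow = 2)        # a small symmetric matrix
e <- eigen(A)
v <- e$vectors[, 1]                         # first eigenvector
lambda <- e$values[1]                       # its eigenvalue (the largest)
all.equal(as.vector(A %*% v), lambda * v)   # TRUE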
Example
For an example and implementation of PCA, I will refer to the iris dataset in R. Iris contains measurements, in cm, of 150 iris flowers on four different features: Sepal.Length, Sepal.Width, Petal.Length, and Petal.Width. The dataset also contains the species of each iris flower; the three species are setosa, versicolor, and virginica. The four features make up the columns of our matrix, while the 150 observations of each feature make up the rows. Here is what the first six rows of the data look like.
> head(iris[-5])
Sepal.Length Sepal.Width Petal.Length Petal.Width
1 5.1 3.5 1.4 0.2
2 4.9 3.0 1.4 0.2
3 4.7 3.2 1.3 0.2
4 4.6 3.1 1.5 0.2
5 5.0 3.6 1.4 0.2
6 5.4 3.9 1.7 0.4
The next step is to find the covariance matrix, which can be computed by the following formula:
$$
\mathrm{COV}(X, Y) = \frac{\sum_{i=1}^{n} (X_i - \bar{X})(Y_i - \bar{Y})}{n - 1}
$$
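As a quick sanity check, this formula can be applied by hand in R and compared against the built-in cov() function; a minimal sketch using two of the iris features:

# Sample covariance from the formula above, checked against cov().
sl <- iris$Sepal.Length
pl <- iris$Petal.Length
n <- length(sl)
sum((sl - mean(sl)) * (pl - mean(pl))) / (n - 1)  # 1.274315
cov(sl, pl)   # same value; matches the Sepal.Length/Petal.Length entry below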
Since our dataset has four variables (a four-dimensional dataset), the covariance between all four variables can be measured. It is also important to remember that the covariance of a variable with itself, COV(X, X), just equals its variance, VAR(X). Suppose we use arbitrary variables W, X, Y, and Z to set up the covariance matrix for this example. The resulting 4x4 matrix will look like this:
$$
\begin{pmatrix}
\mathrm{VAR}(W)   & \mathrm{COV}(W,X) & \mathrm{COV}(W,Y) & \mathrm{COV}(W,Z) \\
\mathrm{COV}(X,W) & \mathrm{VAR}(X)   & \mathrm{COV}(X,Y) & \mathrm{COV}(X,Z) \\
\mathrm{COV}(Y,W) & \mathrm{COV}(Y,X) & \mathrm{VAR}(Y)   & \mathrm{COV}(Y,Z) \\
\mathrm{COV}(Z,W) & \mathrm{COV}(Z,X) & \mathrm{COV}(Z,Y) & \mathrm{VAR}(Z)
\end{pmatrix}
$$
It is also important to note that COV(X, Y) equals COV(Y, X); hence the matrix is symmetric about the diagonal, and the diagonal holds the variances of W, X, Y, and Z. The covariance matrix for our dataset can be obtained in R with the following command.
> cov(iris[-5])
Sepal.Length Sepal.Width Petal.Length Petal.Width
Sepal.Length 0.6856935 -0.0424340 1.2743154 0.5162707
Sepal.Width -0.0424340 0.1899794 -0.3296564 -0.1216394
Petal.Length 1.2743154 -0.3296564 3.1162779 1.2956094
Petal.Width 0.5162707 -0.1216394 1.2956094 0.5810063
Now that we have obtained the covariance matrix for the iris dataset, we can go ahead and find its eigenvectors and their corresponding eigenvalues. Remember, an eigenvector is a nonzero vector $\vec{x}$ such that $A\vec{x} = \lambda\vec{x}$ for some scalar $\lambda$. The scalar $\lambda$ is the eigenvalue for the corresponding eigenvector. Solving for the eigenvectors and eigenvalues, we get:
> eig <- eigen(cov(iris[-5]))
> eig$vectors
                      v1          v2          v3         v4
Sepal.Length 0.36138659 -0.65658877 -0.58202985 0.3154872
Sepal.Width -0.08452251 -0.73016143 0.59791083 -0.3197231
Petal.Length 0.85667061 0.17337266 0.07623608 -0.4798390
Petal.Width 0.35828920 0.07548102 0.54583143 0.7536574
As you can see, our first eigenvector, better known as the first principal component, is dominated by Petal.Length, with a loading of 0.85667. This means that Petal.Length contributes the most to the variation captured along the first dimension. So, if we wanted to reduce our dataset to one variable, Petal.Length would be the best choice.
> eig$values
[1] 4.22824171 0.24267075 0.07820950 0.02383509
The eigenvalues of the covariance matrix tell us how much variance is explained by each eigenvector. Note that the first eigenvalue, 4.228, is much larger than the following three. Thus, the proportion of variance explained by the first eigenvector is equal to:
$$
\frac{\lambda_1}{\lambda_1 + \lambda_2 + \lambda_3 + \lambda_4} = 0.9246
$$
This means that 92.46% of the variance in the data is captured by the first principal component. If we want the proportion of overall variance explained by the first two principal components, we just add $\lambda_2$ to the numerator; the ratio then equals 97.77%. So, 97.77% of the variance in the data containing 4 variables can be explained by the first two principal components.
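These proportions can be computed directly from the eigenvalues obtained above; a short sketch (assuming eig still holds the eigen decomposition from before):

# Proportion of variance explained by each principal component.
lambda <- eig$values
round(lambda / sum(lambda), 4)           # 0.9246 0.0531 0.0171 0.0052
round(cumsum(lambda) / sum(lambda), 4)   # 0.9246 0.9777 0.9948 1.0000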
Projection
Now that we have obtained the eigenvectors and eigenvalues, it is time to project the data onto fewer dimensions. Since we computed above that the first two principal components account for almost 98% of the variance in the data, we will project our data onto the first two principal components. The coordinates are found by solving the equation A = XV, where X is our original matrix with 4 columns and 150 rows (note that this matrix has to be centered so each column has mean 0), V is our matrix of eigenvectors, and A is the matrix of coordinates in the new principal component space spanned by the eigenvectors in V.
# Center the data, then project it onto the eigenvectors: A = XV
X <- scale(iris[1:4], center = TRUE, scale = FALSE)
scores <- data.frame(X %*% eig$vectors)
colnames(scores) <- c("Prin1", "Prin2", "Prin3", "Prin4")
scores[1:10, ]
Prin1 Prin2 Prin3 Prin4
1 -2.684126 -0.31939725 -0.02791483 0.002262437
2 -2.714142 0.17700123 -0.21046427 0.099026550
3 -2.888991 0.14494943 0.01790026 0.019968390
4 -2.745343 0.31829898 0.03155937 -0.075575817
5 -2.728717 -0.32675451 0.09007924 -0.061258593
6 -2.280860 -0.74133045 0.16867766 -0.024200858
7 -2.820538 0.08946138 0.25789216 -0.048143106
8 -2.626145 -0.16338496 -0.02187932 -0.045297871
9 -2.886383 0.57831175 0.02075957 -0.026744736
10 -2.672756 0.11377425 -0.19763272 -0.056295401
The commands above give the coordinates, or scores, for each principal component. Since we know about 98% of the variance in the data is captured by the first two principal components, we will use the first two columns of coordinates from above to plot our dataset in 2 dimensions with the following command in R. The axes of this new two-dimensional projection are the first two principal components.
plot(scores$Prin1, scores$Prin2,
     main = "Data Projected on First 2 Principal Components",
     xlab = "First Principal Component",
     ylab = "Second Principal Component",
     col = c("green", "red", "blue")[iris$Species])
[Figure: the iris data plotted on the first two principal components, colored by species.]
Note: the three different colors represent the species of iris flower.
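As a cross-check (not part of the original analysis), base R's prcomp() performs the same centering and projection internally; its scores should match ours up to a possible sign flip in each component:

# Cross-check with base R's prcomp(): same scores (up to a sign flip
# per component) and the same variance proportions.
pca <- prcomp(iris[1:4], center = TRUE, scale. = FALSE)
head(pca$x[, 1:2])   # compare with scores$Prin1, scores$Prin2
summary(pca)         # proportion of variance: 0.9246, 0.0531, ...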
Conclusion
What was just accomplished is the exact goal of PCA. We were able to effectively reduce our iris dataset from four dimensions down to two while maintaining nearly 98% of the variance in the original data. We were able to do this by using the concepts of eigenvalues and eigenvectors. To review: we start by setting up the matrix for the data, which has the observations as rows and the variables as columns. The next step is to compute the covariance matrix for the data, which results in an NxN matrix, where N is the number of variables. The next step is to find the eigenvectors and the corresponding eigenvalues of the covariance matrix; the eigenvectors make up the principal components. Next, it is important to analyze the eigenvectors and eigenvalues to see how much variability is accounted for by each component and to see which variable contributes the most to each eigenvector. Once you decide how many dimensions your projection should have, obtain the scores, or coordinates, for the new axes. The final step is to plot the data to see what the reduced dimensions look like, and PCA is successfully complete!
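For reference, the review steps above condense into a few lines of R; this is a sketch mirroring the example in this report, not a new analysis:

# End-to-end recap: center, covariance, eigen decomposition, projection.
X   <- scale(iris[1:4], center = TRUE, scale = FALSE)  # observations x variables
eig <- eigen(cov(iris[1:4]))          # eigenvectors = principal components
eig$values / sum(eig$values)          # variance explained per component
scores <- X %*% eig$vectors[, 1:2]    # project onto the first 2 PCs
plot(scores, xlab = "First Principal Component",
     ylab = "Second Principal Component", col = iris$Species)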