PCA (Principal Components Analysis)
IsaacLu
Concept
• Principal component analysis (PCA) projects the features onto the principal
components.
• The motivation is to reduce the dimensionality of the features while losing only a
small amount of information.
Procedure:
• The first principal component is the normalized linear combination of the
variables that has the highest variance.
• The second principal component has the largest variance subject to being
uncorrelated with the first (that is, the second principal component is
orthogonal to the first one).
• And so on for the remaining components.
Why we choose the direction with the
most variation
• Reason 1: In signal analysis, it is commonly assumed
that the signal has larger variance and the noise
has smaller variance.
• Reason 2: In the figure, the data
projected onto the green line (var = 0.6524)
are separated better than the data projected onto the
purple line (var = 0.1678).
So our goal is to choose, for each PC, the direction with the most
variation in the data.
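The idea of "the direction with the most variation" can be checked numerically. The following is a minimal sketch, assuming NumPy and using the ten example points from the worked example in this deck. It scans unit directions and measures the sample variance of the data projected onto each one. The grid resolution and the helper name `projected_variance` are choices made for this sketch only.

```python
import numpy as np

# The ten 2-D points used in the worked example of this deck.
x = np.array([2.5, 0.5, 2.2, 1.9, 3.1, 2.3, 2.0, 1.0, 1.5, 1.1])
y = np.array([2.4, 0.7, 2.9, 2.2, 3.0, 2.7, 1.6, 1.1, 1.6, 0.9])
data = np.column_stack([x, y])
centered = data - data.mean(axis=0)


def projected_variance(angle):
    """Sample variance of the data projected onto the unit vector at `angle`."""
    direction = np.array([np.cos(angle), np.sin(angle)])
    return np.var(centered @ direction, ddof=1)


# Scan directions from 0 to 180 degrees in 0.1 degree steps (hypothetical grid).
angles = np.linspace(0.0, np.pi, 1801)
variances = np.array([projected_variance(a) for a in angles])

best_angle = angles[np.argmax(variances)]
best_direction = np.array([np.cos(best_angle), np.sin(best_angle)])

print("direction with the most variance:", best_direction.round(3))
print("largest projected variance:", round(float(variances.max()), 3))
print("smallest projected variance:", round(float(variances.min()), 3))
```

Under these assumptions the best direction should come out close to (0.678, 0.735), which is the first principal component found in the example.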
Example
DATA   p1    p2    p3    p4    p5    p6    p7    p8    p9    p10
x      2.5   0.5   2.2   1.9   3.1   2.3   2.0   1.0   1.5   1.1
y      2.4   0.7   2.9   2.2   3.0   2.7   1.6   1.1   1.6   0.9
We have 10 points (p1 to p10) in two
dimensions, listed in the table
above.
We want to use PCA to reduce the
dimension from 2 to 1.
• First step: zero-centering (subtract the mean from each variable)
• Reason: We want to move the center of the data to the origin. This makes the
calculations clearer because the offset (bias) no longer has to be considered.
Here the mean of x is 1.81 and the mean of y is 1.91.
DATA   p1     p2     p3     p4     p5     p6     p7     p8     p9     p10
x      0.69  -1.31   0.39   0.09   1.29   0.49   0.19  -0.81  -0.31  -0.71
y      0.49  -1.21   0.99   0.29   1.09   0.79  -0.31  -0.81  -0.31  -1.01
• Second step: calculate the covariance matrix
• Reason: In probability theory and statistics, covariance is a measure of the joint
variability of two random variables (it measures how the two variables deviate together).
• The sign of the covariance shows the tendency in the linear
relationship between the variables. Variables whose covariance is zero are
called uncorrelated.
In our case:
\[
\mathrm{cov} =
\begin{pmatrix}
0.616556 & 0.615444 \\
0.615444 & 0.716556
\end{pmatrix}
\]
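The first two steps can be sketched in NumPy as follows. This is a minimal sketch, not the author's code. It assumes the sample covariance with an n - 1 denominator, which is the choice that reproduces the matrix above. The variable names are illustrative.

```python
import numpy as np

# Example data from the table: rows are points p1..p10, columns are x and y.
data = np.array([
    [2.5, 2.4], [0.5, 0.7], [2.2, 2.9], [1.9, 2.2], [3.1, 3.0],
    [2.3, 2.7], [2.0, 1.6], [1.0, 1.1], [1.5, 1.6], [1.1, 0.9],
])

# Step 1: zero-centering, subtract the per-column mean.
mean = data.mean(axis=0)
centered = data - mean

# Step 2: sample covariance matrix.
# Assumption: n - 1 denominator, which matches the numbers on the slide.
n = centered.shape[0]
cov = centered.T @ centered / (n - 1)

# np.cov gives the same matrix when told that columns are the variables.
assert np.allclose(cov, np.cov(data, rowvar=False))

print("mean:", mean)
print("covariance matrix:")
print(cov.round(6))
```

The off-diagonal entries are positive, which agrees with the upward trend of the points: x and y tend to increase together.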
Covariance matrix
The covariance matrix defines the
shape of the data. Diagonal spread
is captured by the covariance,
while axis-aligned spread is
captured by the variance.
• Third step: calculate the eigenvalues and eigenvectors of the covariance matrix
• Reason: We want to find the directions of the principal components. These are the
eigenvectors of the covariance matrix, and each eigenvalue is the variance along its eigenvector.
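\[
\text{eigenvalues} = \begin{pmatrix} 0.049 & 1.284 \end{pmatrix}, \qquad
\text{eigenvectors} =
\begin{pmatrix}
-0.735 & 0.678 \\
0.678 & 0.735
\end{pmatrix}
\]
Each column of the eigenvector matrix is the eigenvector for the eigenvalue in the same position.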
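The eigenvalues and eigenvectors can be reproduced with NumPy. The sketch below assumes `np.linalg.eigh`, which is intended for symmetric matrices and returns eigenvalues in ascending order, so it also reorders them from largest to smallest. The sign of an eigenvector is arbitrary. The sign flip at the end is a convention chosen here only so that the output matches the signs used in this deck.

```python
import numpy as np

# Covariance matrix from the second step.
cov = np.array([[0.616556, 0.615444],
                [0.615444, 0.716556]])

# eigh returns eigenvalues in ascending order and the matching
# unit-length eigenvectors as the columns of the second result.
eigenvalues, eigenvectors = np.linalg.eigh(cov)

# Reorder so the largest eigenvalue (first principal component) comes first.
order = np.argsort(eigenvalues)[::-1]
eigenvalues = eigenvalues[order]
eigenvectors = eigenvectors[:, order]

# Sign convention for this sketch only: make the last entry of every
# eigenvector positive, which reproduces the signs shown in the deck.
signs = np.where(eigenvectors[-1, :] < 0, -1.0, 1.0)
eigenvectors = eigenvectors * signs

print("eigenvalues:", eigenvalues.round(3))
print("eigenvectors (columns):")
print(eigenvectors.round(3))
```

A different library or a different sign convention may return the same directions with opposite signs. The projected data then changes sign as well, but its variance does not.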
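Sort the eigenvalues from largest to smallest, and arrange the
eigenvectors in the same order (eigenvectors are the columns).
\[
\text{eigenvalues} = \begin{pmatrix} 1.284 & 0.049 \end{pmatrix}, \qquad
\text{eigenvectors} =
\begin{pmatrix}
0.678 & -0.735 \\
0.735 & 0.678
\end{pmatrix}
\]
To reduce the data to one dimension we keep only the first eigenvector, (0.678, 0.735), as the feature vector.
• Fourth step: map the zero-centered data to the new space to generate the new data
\[
\text{New data} = \text{Zero-centered data} \times \text{Feature vector}
\]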
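\[
\begin{pmatrix}
0.69 & 0.49 \\
-1.31 & -1.21 \\
0.39 & 0.99 \\
0.09 & 0.29 \\
1.29 & 1.09 \\
0.49 & 0.79 \\
0.19 & -0.31 \\
-0.81 & -0.81 \\
-0.31 & -0.31 \\
-0.71 & -1.01
\end{pmatrix}
\begin{pmatrix}
0.678 \\
0.735
\end{pmatrix}
=
\begin{pmatrix}
0.828 \\
-1.778 \\
0.992 \\
0.274 \\
1.676 \\
0.913 \\
-0.099 \\
-1.145 \\
-0.438 \\
-1.224
\end{pmatrix}
\]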
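The projection is a single matrix-vector product. The sketch below assumes NumPy and uses the rounded feature vector (0.678, 0.735) from the deck, so the results agree with the values above only to about three decimals. The final variance check is an addition for illustration and is not part of the original slides.

```python
import numpy as np

# Zero-centered data from the first step: one row per point p1..p10.
centered = np.array([
    [0.69, 0.49], [-1.31, -1.21], [0.39, 0.99], [0.09, 0.29], [1.29, 1.09],
    [0.49, 0.79], [0.19, -0.31], [-0.81, -0.81], [-0.31, -0.31], [-0.71, -1.01],
])

# Feature vector: the eigenvector with the largest eigenvalue (rounded).
feature_vector = np.array([0.678, 0.735])

# New data = zero-centered data * feature vector, giving one value per point.
new_data = centered @ feature_vector

# Illustration only: the variance of the projected data is approximately
# the largest eigenvalue (about 1.284).
projected_variance = np.var(new_data, ddof=1)

print(new_data.round(3))
print("variance of projected data:", round(float(projected_variance), 3))
```

Each two-dimensional point is now represented by a single coordinate along the first principal component.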


Editor's Notes

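  1. PCA is an unsupervised algorithm. LDA, by contrast, chooses the projection direction that makes the within-class variance small and the between-class variance large after projection.
  2. The direction in which the data varies the most falls along the green line. Because this is the direction with the most variation in the data, it is the first principal component (direction). It is also the line for which the sum of squared distances from the points is the smallest possible.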
  3. The magnitude of the covariance is not easy to interpret because it is not normalized and hence depends on the magnitudes of the variables. The normalized version of the covariance, the correlation coefficient, however, shows by its magnitude the strength of the linear relation.
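
The relation between covariance and the correlation coefficient in note 3 can be illustrated on the example data. This is a minimal sketch assuming NumPy. The rescaling factor of 100 is a hypothetical choice, used only to show that covariance depends on the units while correlation does not.

```python
import numpy as np

x = np.array([2.5, 0.5, 2.2, 1.9, 3.1, 2.3, 2.0, 1.0, 1.5, 1.1])
y = np.array([2.4, 0.7, 2.9, 2.2, 3.0, 2.7, 1.6, 1.1, 1.6, 0.9])

# Sample covariance (n - 1 denominator) and its normalized version.
cov_xy = np.cov(x, y)[0, 1]
r_manual = cov_xy / (np.std(x, ddof=1) * np.std(y, ddof=1))
r_numpy = np.corrcoef(x, y)[0, 1]

# Rescaling x (for example, changing its units) scales the covariance
# by the same factor but leaves the correlation coefficient unchanged.
cov_scaled = np.cov(100 * x, y)[0, 1]
r_scaled = np.corrcoef(100 * x, y)[0, 1]

print("covariance:", round(float(cov_xy), 6))
print("correlation:", round(float(r_numpy), 4))
print("covariance after rescaling x:", round(float(cov_scaled), 4))
print("correlation after rescaling x:", round(float(r_scaled), 4))
```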