Dimensionality Reduction
Principal Component Analysis (PCA)
➢ The central idea of principal component analysis (PCA) is to reduce the
dimensionality of a data set consisting of a large number of interrelated
features while retaining as much as possible of the variation present in the
data set.
➢ This is achieved by transforming to a new set of features, the principal
components (PCs), which are uncorrelated, and which are ordered so that
the first few retain most of the variation present in all of the original
features.
Mathematics Behind PCA
➢ PCA can be thought of as an unsupervised learning problem.
➢ The whole process of PCA can be summarized as follows:
• Standardize the given set of d-dimensional samples using Z-score normalization.
• Compute the covariance matrix of the standardized dataset.
• Compute the eigenvectors and the corresponding eigenvalues of the
covariance matrix.
• Sort the eigenvectors in decreasing order of their eigenvalues and choose the
eigenvectors corresponding to the k largest eigenvalues to form a d × k
matrix W.
• Use this d × k eigenvector matrix W to transform the samples onto the new k-dimensional subspace.
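The five steps above can be sketched in a few lines of NumPy. This is a minimal illustration rather than the slides' own implementation: the function name `pca_sketch`, the random test data, and the use of `np.linalg.eigh` for the symmetric covariance matrix are all assumptions made for the example.

```python
import numpy as np

def pca_sketch(X, k):
    """Project the rows of X (n samples x d features) onto k principal components."""
    X = np.asarray(X, dtype=float)
    # Step 1: Z-score normalization of every feature (column).
    Z = (X - X.mean(axis=0)) / X.std(axis=0)
    # Step 2: covariance matrix of the standardized data (d x d).
    C = np.cov(Z, rowvar=False)
    # Step 3: eigen-decomposition; eigh is meant for symmetric matrices
    # and returns the eigenvalues in ascending order.
    eigvals, eigvecs = np.linalg.eigh(C)
    # Step 4: sort by decreasing eigenvalue and keep the top k eigenvectors as W (d x k).
    order = np.argsort(eigvals)[::-1]
    W = eigvecs[:, order[:k]]
    # Step 5: transform the samples onto the new k-dimensional subspace.
    return Z @ W, W, eigvals[order]

# Hypothetical data: 100 samples with 3 features, two of them strongly correlated.
rng = np.random.default_rng(0)
base = rng.normal(size=(100, 1))
X = np.hstack([base + 0.1 * rng.normal(size=(100, 1)),
               2 * base + 0.1 * rng.normal(size=(100, 1)),
               rng.normal(size=(100, 1))])

scores, W, eigvals = pca_sketch(X, k=2)
print(scores.shape, W.shape)  # (100, 2) (3, 2)
```

The columns of `scores` are the new features: they are uncorrelated with each other, and the first one carries the largest share of the variance.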
➢ Consider the following two-dimensional dataset with features x1 and x2:
➢ Our goal is to use PCA to reduce the dimensions of our dataset from
two to one, that is, from R² to R.
➢ Let's visualize the given dataset:
1. Standardization of the given dataset
➢ Suppose we want to perform PCA on two features: a person's age and weight. If weight is
measured in grams, then the magnitude of its spread, or variance, will be much larger than that
of the age feature.
➢ The variance of weight might be on the order of, say, 10,000, while that of
age might be around 10.
➢ Because PCA uses the variance of each feature to reduce the dimensionality, it would focus more
on extracting information from features with higher variance and largely ignore the features with
lower variance.
➢ The way to overcome this is to first standardize the data so that all the features are
transformed to the same unitless scale. In PCA, we perform Z-score normalization.
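To make the scale problem concrete, the sketch below compares the direction of largest variance before and after standardization. The data are synthetic and purely hypothetical: they are generated so that age has a variance of about 10 and weight (in grams) a variance of about 10,000, echoing the magnitudes mentioned above. The helper name `first_direction` is also an assumption.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 500
# Hypothetical people: age in years (variance about 10) and weight in grams
# (variance about 10,000), loosely following the magnitudes quoted above.
age = rng.normal(40.0, np.sqrt(10.0), n)
weight_g = 70_000.0 + 20.0 * (age - 40.0) + rng.normal(0.0, 80.0, n)
X = np.column_stack([age, weight_g])

def first_direction(data):
    """Unit eigenvector of the covariance matrix with the largest eigenvalue."""
    eigvals, eigvecs = np.linalg.eigh(np.cov(data, rowvar=False))
    return eigvecs[:, -1]  # eigh sorts eigenvalues in ascending order

raw_pc1 = first_direction(X)                 # PCA on the raw features
Z = (X - X.mean(axis=0)) / X.std(axis=0)     # Z-score normalization
std_pc1 = first_direction(Z)                 # PCA on the standardized features

print("raw:         ", np.abs(raw_pc1).round(3))
print("standardized:", np.abs(std_pc1).round(3))
```

On the raw data the first direction points almost entirely along the weight axis, so age is effectively ignored. After standardization both features contribute with equal weight, which is what we want when the units are arbitrary.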
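➢ Z-score normalization is performed as follows:
$$z = \frac{x - \mu}{\sigma}$$
where \mu is the mean of the feature and \sigma is its standard deviation.
➢ After performing this standardization, the transformed features would each have a mean
of 0 and a standard deviation of 1.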
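As a quick check of the claim that standardized features end up with mean 0 and standard deviation 1, here is a small sketch. The function name `zscore` and the sample measurements are hypothetical, and the population form of the standard deviation (dividing by n) is an assumption.

```python
import numpy as np

def zscore(X):
    """Standardize each column: subtract its mean, then divide by its standard deviation."""
    X = np.asarray(X, dtype=float)
    mu = X.mean(axis=0)
    sigma = X.std(axis=0)  # population form (divides by n); an assumption
    return (X - mu) / sigma

# Hypothetical measurements: age in years, weight in grams.
X = np.array([[25, 61_000],
              [32, 72_500],
              [47, 80_200],
              [51, 68_900],
              [38, 75_400]])

Z = zscore(X)
print(Z.mean(axis=0).round(6))  # each column mean is 0 up to rounding error
print(Z.std(axis=0).round(6))   # each column standard deviation is 1
```

Note that a constant feature would have a standard deviation of 0 and would need special handling before dividing.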
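➢ As an example, let's manually standardize our first feature x1. We first need to compute
the mean and the standard deviation of this feature. We begin with the mean over the n samples:
$$\mu_1 = \frac{1}{n} \sum_{i=1}^{n} x_1^{(i)}$$
➢ Next, let's compute the standard deviation:
$$\sigma_1 = \sqrt{\frac{1}{n} \sum_{i=1}^{n} \left(x_1^{(i)} - \mu_1\right)^2}$$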
➢ Now, we can compute each scaled value of x1 as:
$$z_1^{(i)} = \frac{x_1^{(i)} - \mu_1}{\sigma_1}$$
➢ For example, to compute the first value:
➢ We repeat this process for the rest of the values in the feature to finally obtain the scaled
feature z1.
➢ Remember that we've only standardized the feature x1. We need to repeat the entire process
(starting with computing the mean and standard deviation) for the feature x2 as well. The dataset
after Z-score normalization is as follows:
Sample   Z1       Z2
1         1.650    0.990
2         0.889    0.078
3        -0.637    0.990
4         0.126    0.534
5        -1.019   -0.835
6        -1.019   -1.749
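The standardized table can be checked numerically, and it also feeds directly into the next step of the summary (computing the covariance matrix). The sketch below uses only the six rows shown above. Dividing by n rather than n - 1 (`bias=True`) is an assumption, since the slides do not state which convention they use.

```python
import numpy as np

# Standardized samples copied from the table above (columns Z1, Z2).
Z = np.array([[ 1.650,  0.990],
              [ 0.889,  0.078],
              [-0.637,  0.990],
              [ 0.126,  0.534],
              [-1.019, -0.835],
              [-1.019, -1.749]])

means = Z.mean(axis=0)   # should be close to 0 for both columns
stds = Z.std(axis=0)     # should be close to 1 (np.std divides by n)

# Covariance matrix of the standardized data. bias=True divides by n so that
# the diagonal matches the standard deviations computed above.
C = np.cov(Z, rowvar=False, bias=True)

print(means.round(3), stds.round(3))
print(C.round(3))
```

Because the table is rounded to three decimals, the means and standard deviations are only approximately 0 and 1. The off-diagonal entry of `C` comes out at about 0.63, which indicates a fairly strong positive relationship between the two features.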
➢ Our dataset visually looks like the following after standardization:
➢ As we can see, the layout of the points still looks similar even after standardization,
and they are now centered around the origin!
➢ The next step of PCA is to find a line (known as a principal component) on which to
project the given samples and that captures the relationship between the two features well.
➢ How well the relationship is captured is measured by how much variance is preserved when the
samples are projected onto the line.
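The variance criterion can be tried out directly on the standardized samples. The sketch below projects the six points onto candidate lines through the origin, one per degree, and reports the line that preserves the most variance. The helper name `projected_variance` and the one-degree grid are assumptions for illustration. The eigenvector computation at the end is only a cross-check against the procedure summarized earlier.

```python
import numpy as np

# Standardized samples (columns Z1, Z2) from the table in the standardization step.
Z = np.array([[ 1.650,  0.990],
              [ 0.889,  0.078],
              [-0.637,  0.990],
              [ 0.126,  0.534],
              [-1.019, -0.835],
              [-1.019, -1.749]])

def projected_variance(Z, theta):
    """Variance of the samples after projection onto the line at angle theta (radians)."""
    u = np.array([np.cos(theta), np.sin(theta)])  # unit vector along the line
    return np.var(Z @ u)                          # Z @ u gives the 1-D coordinates

angles = np.deg2rad(np.arange(0, 180))            # candidate lines, 1 degree apart
variances = np.array([projected_variance(Z, t) for t in angles])
best = np.rad2deg(angles[np.argmax(variances)])

# Cross-check: the largest eigenvalue of the covariance matrix is the maximum
# variance any line through the origin can preserve.
eigvals, eigvecs = np.linalg.eigh(np.cov(Z, rowvar=False, bias=True))

print(f"best angle: {best:.0f} degrees, variance kept: {variances.max():.3f}")
print(f"largest eigenvalue: {eigvals[-1]:.3f}")
```

For this dataset the best line lies at 45 degrees to the Z1 axis and keeps a variance of about 1.63 out of a total of roughly 2, which matches the largest eigenvalue of the covariance matrix.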
Finding the principal components
➢ To understand intuitively what is meant by finding the principal components,
consider the following example. Suppose we have the following samples:

Unit3_1.pptx
