TABLE OF CONTENTS
 INTRODUCTION
 WHAT IS PCA
 PRINCIPAL COMPONENTS IN PCA
 STEPS OF PCA
 APPLICATIONS
 CONCLUSION
INTRODUCTION
 PCA is a dimensionality reduction technique with four main
parts: computing the feature covariance, eigen decomposition,
transforming the data onto the principal components, and choosing how
many components to keep in terms of explained variance. PCA works by
considering the variance of each attribute, because attributes with high
variance tend to separate the classes well, and keeping only the
high-variance directions is what reduces the dimensionality.
 Some real-world applications of PCA are image processing,
movie recommendation systems, and optimizing the power
allocation across various communication channels.
WHAT IS PCA?
To understand PCA, we have to know its purpose. To understand the
purpose, we have to know the Curse of Dimensionality, i.e., as the
number of features or dimensions grows, the amount of data we need
to generalize accurately grows exponentially.
There are two options to reduce dimensionality:
Feature elimination: we drop some features outright.
Feature extraction: we derive a smaller set of new features that retain most of the information from all the original features.
We apply PCA to achieve the latter. Note that PCA is not the only method
that does feature extraction; a minimal sketch contrasting the two options follows.
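A small Python sketch (illustrative, not from the slides; the data is a hypothetical random matrix) contrasting the two options:

```python
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 4))      # hypothetical dataset: 100 samples, 4 features

# Feature elimination: drop columns outright (here, keep only the first two).
X_elim = X[:, :2]

# Feature extraction: PCA builds 2 new features as combinations of all 4.
X_pca = PCA(n_components=2).fit_transform(X)

print(X_elim.shape, X_pca.shape)   # both (100, 2), but the features differ
```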
SOME COMMON TERMS USED IN THE PCA ALGORITHM:
Dimensionality: the number of
features or variables present in the
given dataset; more simply, the
number of columns present in the
dataset.
Correlation: how strongly two
variables are related to each other,
i.e., if one changes, the other tends
to change as well. The correlation
value ranges from -1 to +1:
-1 means the variables are
perfectly inversely related, and
+1 means they are perfectly
directly related.
Continued…
Orthogonal: the variables are
uncorrelated with each other, and hence
the correlation between each pair of
variables is zero.
Eigenvectors: given a square matrix A
and a non-zero vector v, v is an
eigenvector of A if Av is a scalar
multiple of v, i.e., Av = λv, where
the scalar λ is the eigenvalue.
Covariance Matrix: a matrix containing
the covariance between each pair of
variables is called the Covariance
Matrix.
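A small numpy sketch (illustrative, with made-up data) tying these terms together: compute a covariance matrix and check that its eigenvectors satisfy Av = λv:

```python
import numpy as np

rng = np.random.default_rng(1)
X = rng.normal(size=(200, 3))      # 200 samples, 3 variables

C = np.cov(X, rowvar=False)        # 3x3 covariance matrix between pairs of variables
vals, vecs = np.linalg.eigh(C)     # eigenvalues and eigenvectors (C is symmetric)

v, lam = vecs[:, 0], vals[0]
print(np.allclose(C @ v, lam * v)) # True: Av is a scalar multiple of v
```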
Principal Components in PCA
 As described above, the transformed new features, i.e., the output of PCA, are the Principal Components (PCs).
The number of PCs is less than or equal to the number of original features present in the dataset.
Some properties of these principal components, illustrated in the sketch below, are:
• Each principal component is a linear combination of the original features.
• The components are orthogonal, i.e., the correlation between any pair of them is zero.
• The importance of each component decreases going from 1 to n: the 1st PC has the most
importance, and the n-th PC has the least.
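A hedged scikit-learn sketch of these properties (the Iris dataset here is just a convenient example): the components' importance, measured by explained variance, decreases from the first PC to the last, and the PCs are mutually uncorrelated:

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.decomposition import PCA

X = load_iris().data                    # 150 samples, 4 features
pca = PCA().fit(X)

print(pca.explained_variance_ratio_)    # decreasing: PC1 explains the most variance
Z = pca.transform(X)                    # the PCs are orthogonal, uncorrelated features
print(np.round(np.corrcoef(Z, rowvar=False), 3))  # ~identity: off-diagonals near zero
```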
STEPS OF PCA
Getting the dataset
Firstly, we take the input dataset and divide it into two subparts X and Y, where X is the training set and
Y is the validation set.
Representing the data in a structure
Now we represent our dataset in a structure: a two-dimensional matrix of the independent variables X, where
each row corresponds to a data item and each column corresponds to a feature. The number of columns is
the dimensionality of the dataset.
Standardizing the data
In this step we standardize our dataset. Within a column, features with high variance would otherwise
count as more important than features with lower variance.
If the importance of features should be independent of their variance, we center each column on its mean and
divide each data item in the column by the column's standard deviation. We name the resulting matrix Z.
Calculating the Covariance of Z
To calculate the covariance of Z, we take the matrix Z and transpose it. After transposing, we multiply it
by Z and divide by n − 1 (for n data items). The output matrix, ZᵀZ/(n − 1), is the covariance matrix of Z.
Calculating the Eigenvalues and Eigenvectors
Now we calculate the eigenvalues and eigenvectors of the resulting covariance matrix of Z. The eigenvectors
of the covariance matrix are the directions of the axes that carry the most information (variance), and the
eigenvalue paired with each eigenvector gives the amount of variance along that direction.
Sorting the Eigenvectors
In this step we take all the eigenvalues and sort them in decreasing order, from largest to
smallest, and simultaneously sort the eigenvectors accordingly into a matrix P. The resulting matrix of
sorted eigenvectors is named P*.
Calculating the new features, or Principal Components
Here we calculate the new features by multiplying Z by the matrix P*. In the resulting matrix
Z*, each observation is a linear combination of the original features, and the columns of Z* are
uncorrelated with each other.
Removing less important features from the new dataset
Now that the new feature set exists, we decide what to keep and what to remove: we keep only the
relevant or important components in the new dataset, and the unimportant ones are removed.
The whole procedure is sketched in code below.
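A minimal numpy sketch of the whole procedure, under the assumptions above (columns centered and scaled, covariance computed as ZᵀZ/(n − 1)); the function name and the random test data are illustrative:

```python
import numpy as np

def pca_steps(X, k):
    """Step-by-step PCA as described above; X is (samples, features), keep k PCs."""
    # Standardize: center each column and divide by its standard deviation.
    Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)

    # Covariance of Z: transpose Z and multiply by Z, scaled by 1/(n - 1).
    n = Z.shape[0]
    C = (Z.T @ Z) / (n - 1)

    # Eigenvalues and eigenvectors of the covariance matrix (C is symmetric).
    vals, P = np.linalg.eigh(C)

    # Sort eigenvalues in decreasing order and the eigenvectors with them: P*.
    order = np.argsort(vals)[::-1]
    vals, P_star = vals[order], P[:, order]

    # New features: project Z onto the sorted eigenvectors.
    Z_star = Z @ P_star

    # Keep only the k most important components; drop the rest.
    return Z_star[:, :k], vals

rng = np.random.default_rng(2)
X = rng.normal(size=(100, 5))
Z_k, eigvals = pca_steps(X, k=2)
print(Z_k.shape, eigvals)          # (100, 2), eigenvalues in decreasing order
```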
 Pros
• PCA reduces the dimensionality while retaining most of the information from the original features.
• Reduce storage space needed to store data.
• Speed up the learning algorithm (with lower dimension).
• Address the multicollinearity issue (all principal components are orthogonal to each other).
• Help visualize data with high dimensionality (after reducing the dimension to 2 or 3).
 Cons
• Using PCA makes the original features and their individual impact harder to interpret, because each
principal component mixes all of them and the eigenvector weights have no direct meaning.
• Computing the covariances and the covariance matrix can be laborious in high dimensions.
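As an illustration of the visualization point above, a brief matplotlib sketch (the Iris dataset is again just an example) that reduces 4-dimensional data to 2 PCs for plotting:

```python
import matplotlib.pyplot as plt
from sklearn.datasets import load_iris
from sklearn.decomposition import PCA

data = load_iris()
Z = PCA(n_components=2).fit_transform(data.data)  # 4 dimensions -> 2 PCs

plt.scatter(Z[:, 0], Z[:, 1], c=data.target)      # color points by class
plt.xlabel("PC 1"); plt.ylabel("PC 2")
plt.show()
```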
Applications of Principal Component Analysis
• PCA is mainly used as a dimensionality reduction technique in various AI applications such as
computer vision, image compression, etc.
• It can also be used for finding hidden patterns when data has high dimensions. Some fields where PCA is
used are finance, data mining, psychology, etc.
• PCA in machine learning is used to visualize multidimensional data.
• In healthcare, it is used to explore the factors assumed to be important in increasing the risk of
chronic disease.
• PCA helps to compress or resize images; a minimal sketch follows this list.
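A minimal sketch of the image compression use, assuming an 8-bit grayscale image whose rows are treated as samples (the random image stands in for a real one); reconstructing from a few PCs yields a compressed approximation:

```python
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(3)
img = rng.integers(0, 256, size=(128, 128)).astype(float)  # stand-in grayscale image

pca = PCA(n_components=16)                 # keep 16 of 128 components
compressed = pca.fit_transform(img)        # (128, 16) scores; 16 components also stored
restored = pca.inverse_transform(compressed)

print(compressed.size / img.size)          # storage fraction for the scores alone
print(np.abs(img - restored).mean())       # mean reconstruction error
```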
CONCLUSION