SlideShare a Scribd company logo
1 of 35
Dimensionality Reduction
Fereshteh Sadeghi
CSEP 546
Motivation
• Clustering
• One way to summarize a complex real-valued data point with a single
categorical variable
• Dimensionality reduction
• Another way to simplify complex high-dimensional data
• Summarize data with a lower dimensional real valued vector
Motivation
• Clustering
• One way to summarize a complex real-valued data point with a single
categorical variable
• Dimensionality reduction
• Another way to simplify complex high-dimensional data
• Summarize data with a lower dimentional real valued vector
• Given data points in d dimensions
• Convert them to data points in r<d dimensions
• With minimal loss of information
Data Compression
(inches)
(cm)
Reduce data from
2D to 1D
Andrew Ng
Data Compression
Reduce data from
2D to 1D
(inches)
(cm)
Andrew Ng
Data Compression
Reduce data from 3D to 2D
Andrew Ng
Principal Component Analysis (PCA) problem formulation
Reduce from 2-dimension to 1-dimension: Find a direction (a vector )
onto which to project the data so as to minimize the projection error.
Reduce from n-dimension to k-dimension: Find vectors
onto which to project the data, so as to minimize the projection error.
Andrew Ng
Covariance
• Variance and Covariance:
• Measure of the “spread” of a set of points around their center of mass(mean)
• Variance:
• Measure of the deviation from the mean for points in one dimension
• Covariance:
• Measure of how much each of the dimensions vary from the mean with
respect to each other
• Covariance is measured between two dimensions
• Covariance sees if there is a relation between two dimensions
• Covariance between one dimension is the variance
Positive: Both dimensions increase or decrease together Negative: While one increase the other decrease
Covariance
• Used to find relationships between dimensions in high dimensional
data sets
The Sample mean
Eigenvector and Eigenvalue
Ax = λx
A: Square Matirx
λ: Eigenvector or characteristic vector
X: Eigenvalue or characteristic value
• The zero vector can not be an eigenvector
• The value zero can be eigenvalue
Eigenvector and Eigenvalue
Ax = λx
A: Square Matirx
λ: Eigenvector or characteristic vector
X: Eigenvalue or characteristic value
Example
Eigenvector and Eigenvalue
Ax - λx = 0
(A – λI)x = 0
B = A – λI
Bx = 0
x = B-10 = 0
If we define a new matrix B:
If B has an inverse:
BUT! an eigenvector
cannot be zero!!
x will be an eigenvector of A if and only if B does
not have an inverse, or equivalently det(B)=0 :
det(A – λI) = 0
Ax = λx
Example 1: Find the eigenvalues of
two eigenvalues: 1,  2
Note: The roots of the characteristic equation can be repeated. That is, λ1 = λ2 =…= λk.
If that happens, the eigenvalue is said to be of multiplicity k.
Example 2: Find the eigenvalues of
λ = 2 is an eigenvector of multiplicity 3.









5
1
12
2
A
)
2
)(
1
(
2
3
12
)
5
)(
2
(
5
1
12
2
2























 A
I











2
0
0
0
2
0
0
1
2
A
0
)
2
(
2
0
0
0
2
0
0
1
2
3








 



 A
I
Eigenvector and Eigenvalue
Principal Component Analysis
Input:
Summarize a D dimensional vector X with K dimensional
feature vector h(x)
Set of basis vectors:
Principal Component Analysis
Basis vectors are orthonormal
New data representation h(x)
Principal Component Analysis
New data representation h(x)
Empirical mean of the data
• The top three principal components of SIFT descriptors from a set of images are computed
• Map these principal components to the principal components of the RGB space
• pixels with similar colors share similar structures
SIFT feature visualization
Original Image
• Divide the original 372x492 image into patches:
• Each patch is an instance that contains 12x12 pixels on a grid
• View each as a 144-D vector
Application: Image compression
PCA compression: 144D  60D
PCA compression: 144D  16D
16 most important eigenvectors
2 4 6 8 10 12
2
4
6
8
10
12
2 4 6 8 10 12
2
4
6
8
10
12
2 4 6 8 10 12
2
4
6
8
10
12
2 4 6 8 10 12
2
4
6
8
10
12
2 4 6 8 10 12
2
4
6
8
10
12
2 4 6 8 10 12
2
4
6
8
10
12
2 4 6 8 10 12
2
4
6
8
10
12
2 4 6 8 10 12
2
4
6
8
10
12
2 4 6 8 10 12
2
4
6
8
10
12
2 4 6 8 10 12
2
4
6
8
10
12
2 4 6 8 10 12
2
4
6
8
10
12
2 4 6 8 10 12
2
4
6
8
10
12
2 4 6 8 10 12
2
4
6
8
10
12
2 4 6 8 10 12
2
4
6
8
10
12
2 4 6 8 10 12
2
4
6
8
10
12
2 4 6 8 10 12
2
4
6
8
10
12
PCA compression: 144D ) 6D
2 4 6 8 10 12
2
4
6
8
10
12
2 4 6 8 10 12
2
4
6
8
10
12
2 4 6 8 10 12
2
4
6
8
10
12
2 4 6 8 10 12
2
4
6
8
10
12
2 4 6 8 10 12
2
4
6
8
10
12
2 4 6 8 10 12
2
4
6
8
10
12
6 most important eigenvectors
PCA compression: 144D  3D
2 4 6 8 10 12
2
4
6
8
10
12
2 4 6 8 10 12
2
4
6
8
10
12
2 4 6 8 10 12
2
4
6
8
10
12
3 most important eigenvectors
PCA compression: 144D  1D
60 most important eigenvectors
Looks like the discrete cosine bases of JPG!...
2D Discrete Cosine Basis
http://en.wikipedia.org/wiki/Discrete_cosine_transform
Dimensionality reduction
• PCA (Principal Component Analysis):
• Find projection that maximize the variance
• ICA (Independent Component Analysis):
• Very similar to PCA except that it assumes non-Guassian features
• Multidimensional Scaling:
• Find projection that best preserves inter-point distances
• LDA(Linear Discriminant Analysis):
• Maximizing the component axes for class-separation
• …
• …

More Related Content

Similar to PCA_csep546.pptx

Lecture32
Lecture32Lecture32
Lecture32
zukun
 
Kulum alin-11 jan2014
Kulum alin-11 jan2014Kulum alin-11 jan2014
Kulum alin-11 jan2014
rolly purnomo
 
Week 12 Dimensionality Reduction Bagian 1
Week 12 Dimensionality Reduction Bagian 1Week 12 Dimensionality Reduction Bagian 1
Week 12 Dimensionality Reduction Bagian 1
khairulhuda242
 

Similar to PCA_csep546.pptx (20)

Lecture32
Lecture32Lecture32
Lecture32
 
An introduction to R
An introduction to RAn introduction to R
An introduction to R
 
SVD.ppt
SVD.pptSVD.ppt
SVD.ppt
 
Large Scale Machine Learning with Apache Spark
Large Scale Machine Learning with Apache SparkLarge Scale Machine Learning with Apache Spark
Large Scale Machine Learning with Apache Spark
 
Introduction to Principle Component Analysis
Introduction to Principle Component AnalysisIntroduction to Principle Component Analysis
Introduction to Principle Component Analysis
 
Kulum alin-11 jan2014
Kulum alin-11 jan2014Kulum alin-11 jan2014
Kulum alin-11 jan2014
 
Realtime Analytics
Realtime AnalyticsRealtime Analytics
Realtime Analytics
 
Data preprocessing
Data preprocessingData preprocessing
Data preprocessing
 
pcappt-140121072949-phpapp01.pptx
pcappt-140121072949-phpapp01.pptxpcappt-140121072949-phpapp01.pptx
pcappt-140121072949-phpapp01.pptx
 
Data Mining Lecture_10(b).pptx
Data Mining Lecture_10(b).pptxData Mining Lecture_10(b).pptx
Data Mining Lecture_10(b).pptx
 
DimensionalityReduction.pptx
DimensionalityReduction.pptxDimensionalityReduction.pptx
DimensionalityReduction.pptx
 
Week 12 Dimensionality Reduction Bagian 1
Week 12 Dimensionality Reduction Bagian 1Week 12 Dimensionality Reduction Bagian 1
Week 12 Dimensionality Reduction Bagian 1
 
Principal component analysis and lda
Principal component analysis and ldaPrincipal component analysis and lda
Principal component analysis and lda
 
Support vector machine
Support vector machineSupport vector machine
Support vector machine
 
Regression Analysis.pptx
Regression Analysis.pptxRegression Analysis.pptx
Regression Analysis.pptx
 
Principal Component Analysis (PCA) and LDA PPT Slides
Principal Component Analysis (PCA) and LDA PPT SlidesPrincipal Component Analysis (PCA) and LDA PPT Slides
Principal Component Analysis (PCA) and LDA PPT Slides
 
Paris data-geeks-2013-03-28
Paris data-geeks-2013-03-28Paris data-geeks-2013-03-28
Paris data-geeks-2013-03-28
 
5 DimensionalityReduction.pdf
5 DimensionalityReduction.pdf5 DimensionalityReduction.pdf
5 DimensionalityReduction.pdf
 
Unit 4_3 Correlation Regression.pptx
Unit 4_3 Correlation Regression.pptxUnit 4_3 Correlation Regression.pptx
Unit 4_3 Correlation Regression.pptx
 
Data Mining Lecture_9.pptx
Data Mining Lecture_9.pptxData Mining Lecture_9.pptx
Data Mining Lecture_9.pptx
 

Recently uploaded

🔥HOT🔥📲9602870969🔥Prostitute Service in Udaipur Call Girls in City Palace Lake...
🔥HOT🔥📲9602870969🔥Prostitute Service in Udaipur Call Girls in City Palace Lake...🔥HOT🔥📲9602870969🔥Prostitute Service in Udaipur Call Girls in City Palace Lake...
🔥HOT🔥📲9602870969🔥Prostitute Service in Udaipur Call Girls in City Palace Lake...
Apsara Of India
 
Rohini Sector 18 Call Girls Delhi 9999965857 @Sabina Saikh No Advance
Rohini Sector 18 Call Girls Delhi 9999965857 @Sabina Saikh No AdvanceRohini Sector 18 Call Girls Delhi 9999965857 @Sabina Saikh No Advance
Rohini Sector 18 Call Girls Delhi 9999965857 @Sabina Saikh No Advance
Call Girls In Delhi Whatsup 9873940964 Enjoy Unlimited Pleasure
 

Recently uploaded (20)

🔥HOT🔥📲9602870969🔥Prostitute Service in Udaipur Call Girls in City Palace Lake...
🔥HOT🔥📲9602870969🔥Prostitute Service in Udaipur Call Girls in City Palace Lake...🔥HOT🔥📲9602870969🔥Prostitute Service in Udaipur Call Girls in City Palace Lake...
🔥HOT🔥📲9602870969🔥Prostitute Service in Udaipur Call Girls in City Palace Lake...
 
Book Cheap Flight Tickets - TraveljunctionUK
Book  Cheap Flight Tickets - TraveljunctionUKBook  Cheap Flight Tickets - TraveljunctionUK
Book Cheap Flight Tickets - TraveljunctionUK
 
A tour of African gastronomy - World Tourism Organization
A tour of African gastronomy - World Tourism OrganizationA tour of African gastronomy - World Tourism Organization
A tour of African gastronomy - World Tourism Organization
 
High Profile 🔝 8250077686 📞 Call Girls Service in Siri Fort🍑
High Profile 🔝 8250077686 📞 Call Girls Service in Siri Fort🍑High Profile 🔝 8250077686 📞 Call Girls Service in Siri Fort🍑
High Profile 🔝 8250077686 📞 Call Girls Service in Siri Fort🍑
 
9 Days Kenya Ultimate Safari Odyssey with Kibera Holiday Safaris
9 Days Kenya Ultimate Safari Odyssey with Kibera Holiday Safaris9 Days Kenya Ultimate Safari Odyssey with Kibera Holiday Safaris
9 Days Kenya Ultimate Safari Odyssey with Kibera Holiday Safaris
 
08448380779 Call Girls In Shahdara Women Seeking Men
08448380779 Call Girls In Shahdara Women Seeking Men08448380779 Call Girls In Shahdara Women Seeking Men
08448380779 Call Girls In Shahdara Women Seeking Men
 
Rohini Sector 18 Call Girls Delhi 9999965857 @Sabina Saikh No Advance
Rohini Sector 18 Call Girls Delhi 9999965857 @Sabina Saikh No AdvanceRohini Sector 18 Call Girls Delhi 9999965857 @Sabina Saikh No Advance
Rohini Sector 18 Call Girls Delhi 9999965857 @Sabina Saikh No Advance
 
Texas Tales Brenham and Amarillo Experiences Elevated by Find American Rental...
Texas Tales Brenham and Amarillo Experiences Elevated by Find American Rental...Texas Tales Brenham and Amarillo Experiences Elevated by Find American Rental...
Texas Tales Brenham and Amarillo Experiences Elevated by Find American Rental...
 
Night 7k to 12k Daman Call Girls 👉👉 8617697112⭐⭐ 100% Genuine Escort Service ...
Night 7k to 12k Daman Call Girls 👉👉 8617697112⭐⭐ 100% Genuine Escort Service ...Night 7k to 12k Daman Call Girls 👉👉 8617697112⭐⭐ 100% Genuine Escort Service ...
Night 7k to 12k Daman Call Girls 👉👉 8617697112⭐⭐ 100% Genuine Escort Service ...
 
Hire 💕 8617697112 Reckong Peo Call Girls Service Call Girls Agency
Hire 💕 8617697112 Reckong Peo Call Girls Service Call Girls AgencyHire 💕 8617697112 Reckong Peo Call Girls Service Call Girls Agency
Hire 💕 8617697112 Reckong Peo Call Girls Service Call Girls Agency
 
Kanpur Call Girls Service ☎ ️82500–77686 ☎️ Enjoy 24/7 Escort Service
Kanpur Call Girls Service ☎ ️82500–77686 ☎️ Enjoy 24/7 Escort ServiceKanpur Call Girls Service ☎ ️82500–77686 ☎️ Enjoy 24/7 Escort Service
Kanpur Call Girls Service ☎ ️82500–77686 ☎️ Enjoy 24/7 Escort Service
 
Night 7k Call Girls Noida Sector 93 Escorts Call Me: 8448380779
Night 7k Call Girls Noida Sector 93 Escorts Call Me: 8448380779Night 7k Call Girls Noida Sector 93 Escorts Call Me: 8448380779
Night 7k Call Girls Noida Sector 93 Escorts Call Me: 8448380779
 
ITALY - Visa Options for expats and digital nomads
ITALY - Visa Options for expats and digital nomadsITALY - Visa Options for expats and digital nomads
ITALY - Visa Options for expats and digital nomads
 
"Embark on the Ultimate Adventure: Top 10 Must-Visit Destinations for Thrill-...
"Embark on the Ultimate Adventure: Top 10 Must-Visit Destinations for Thrill-..."Embark on the Ultimate Adventure: Top 10 Must-Visit Destinations for Thrill-...
"Embark on the Ultimate Adventure: Top 10 Must-Visit Destinations for Thrill-...
 
visa consultant | 📞📞 03094429236 || Best Study Visa Consultant
visa consultant | 📞📞 03094429236 || Best Study Visa Consultantvisa consultant | 📞📞 03094429236 || Best Study Visa Consultant
visa consultant | 📞📞 03094429236 || Best Study Visa Consultant
 
Call Girls Service !! New Friends Colony!! @9999965857 Delhi 🫦 No Advance VV...
Call Girls Service !! New Friends Colony!! @9999965857 Delhi 🫦 No Advance  VV...Call Girls Service !! New Friends Colony!! @9999965857 Delhi 🫦 No Advance  VV...
Call Girls Service !! New Friends Colony!! @9999965857 Delhi 🫦 No Advance VV...
 
08448380779 Call Girls In Chirag Enclave Women Seeking Men
08448380779 Call Girls In Chirag Enclave Women Seeking Men08448380779 Call Girls In Chirag Enclave Women Seeking Men
08448380779 Call Girls In Chirag Enclave Women Seeking Men
 
Genesis 1:6 || Meditate the Scripture daily verse by verse
Genesis 1:6  ||  Meditate the Scripture daily verse by verseGenesis 1:6  ||  Meditate the Scripture daily verse by verse
Genesis 1:6 || Meditate the Scripture daily verse by verse
 
Call Girls Service !! Indirapuram!! @9999965857 Delhi 🫦 No Advance VVVIP 🍎 S...
Call Girls Service !! Indirapuram!! @9999965857 Delhi 🫦 No Advance  VVVIP 🍎 S...Call Girls Service !! Indirapuram!! @9999965857 Delhi 🫦 No Advance  VVVIP 🍎 S...
Call Girls Service !! Indirapuram!! @9999965857 Delhi 🫦 No Advance VVVIP 🍎 S...
 
Hire 💕 8617697112 Chamba Call Girls Service Call Girls Agency
Hire 💕 8617697112 Chamba Call Girls Service Call Girls AgencyHire 💕 8617697112 Chamba Call Girls Service Call Girls Agency
Hire 💕 8617697112 Chamba Call Girls Service Call Girls Agency
 

PCA_csep546.pptx

  • 2. Motivation • Clustering • One way to summarize a complex real-valued data point with a single categorical variable • Dimensionality reduction • Another way to simplify complex high-dimensional data • Summarize data with a lower dimensional real valued vector
  • 3. Motivation • Clustering • One way to summarize a complex real-valued data point with a single categorical variable • Dimensionality reduction • Another way to simplify complex high-dimensional data • Summarize data with a lower dimentional real valued vector • Given data points in d dimensions • Convert them to data points in r<d dimensions • With minimal loss of information
  • 5. Data Compression Reduce data from 2D to 1D (inches) (cm) Andrew Ng
  • 6. Data Compression Reduce data from 3D to 2D Andrew Ng
  • 7. Principal Component Analysis (PCA) problem formulation Reduce from 2-dimension to 1-dimension: Find a direction (a vector ) onto which to project the data so as to minimize the projection error. Reduce from n-dimension to k-dimension: Find vectors onto which to project the data, so as to minimize the projection error. Andrew Ng
  • 8.
  • 9. Covariance • Variance and Covariance: • Measure of the “spread” of a set of points around their center of mass(mean) • Variance: • Measure of the deviation from the mean for points in one dimension • Covariance: • Measure of how much each of the dimensions vary from the mean with respect to each other • Covariance is measured between two dimensions • Covariance sees if there is a relation between two dimensions • Covariance between one dimension is the variance
  • 10. Positive: Both dimensions increase or decrease together Negative: While one increase the other decrease
  • 11. Covariance • Used to find relationships between dimensions in high dimensional data sets The Sample mean
  • 12. Eigenvector and Eigenvalue Ax = λx A: Square Matirx λ: Eigenvector or characteristic vector X: Eigenvalue or characteristic value • The zero vector can not be an eigenvector • The value zero can be eigenvalue
  • 13. Eigenvector and Eigenvalue Ax = λx A: Square Matirx λ: Eigenvector or characteristic vector X: Eigenvalue or characteristic value Example
  • 14. Eigenvector and Eigenvalue Ax - λx = 0 (A – λI)x = 0 B = A – λI Bx = 0 x = B-10 = 0 If we define a new matrix B: If B has an inverse: BUT! an eigenvector cannot be zero!! x will be an eigenvector of A if and only if B does not have an inverse, or equivalently det(B)=0 : det(A – λI) = 0 Ax = λx
  • 15. Example 1: Find the eigenvalues of two eigenvalues: 1,  2 Note: The roots of the characteristic equation can be repeated. That is, λ1 = λ2 =…= λk. If that happens, the eigenvalue is said to be of multiplicity k. Example 2: Find the eigenvalues of λ = 2 is an eigenvector of multiplicity 3.          5 1 12 2 A ) 2 )( 1 ( 2 3 12 ) 5 )( 2 ( 5 1 12 2 2                         A I            2 0 0 0 2 0 0 1 2 A 0 ) 2 ( 2 0 0 0 2 0 0 1 2 3               A I Eigenvector and Eigenvalue
  • 16. Principal Component Analysis Input: Summarize a D dimensional vector X with K dimensional feature vector h(x) Set of basis vectors:
  • 17. Principal Component Analysis Basis vectors are orthonormal New data representation h(x)
  • 18. Principal Component Analysis New data representation h(x) Empirical mean of the data
  • 19.
  • 20.
  • 21.
  • 22.
  • 23. • The top three principal components of SIFT descriptors from a set of images are computed • Map these principal components to the principal components of the RGB space • pixels with similar colors share similar structures SIFT feature visualization
  • 24. Original Image • Divide the original 372x492 image into patches: • Each patch is an instance that contains 12x12 pixels on a grid • View each as a 144-D vector Application: Image compression
  • 27. 16 most important eigenvectors 2 4 6 8 10 12 2 4 6 8 10 12 2 4 6 8 10 12 2 4 6 8 10 12 2 4 6 8 10 12 2 4 6 8 10 12 2 4 6 8 10 12 2 4 6 8 10 12 2 4 6 8 10 12 2 4 6 8 10 12 2 4 6 8 10 12 2 4 6 8 10 12 2 4 6 8 10 12 2 4 6 8 10 12 2 4 6 8 10 12 2 4 6 8 10 12 2 4 6 8 10 12 2 4 6 8 10 12 2 4 6 8 10 12 2 4 6 8 10 12 2 4 6 8 10 12 2 4 6 8 10 12 2 4 6 8 10 12 2 4 6 8 10 12 2 4 6 8 10 12 2 4 6 8 10 12 2 4 6 8 10 12 2 4 6 8 10 12 2 4 6 8 10 12 2 4 6 8 10 12 2 4 6 8 10 12 2 4 6 8 10 12
  • 29. 2 4 6 8 10 12 2 4 6 8 10 12 2 4 6 8 10 12 2 4 6 8 10 12 2 4 6 8 10 12 2 4 6 8 10 12 2 4 6 8 10 12 2 4 6 8 10 12 2 4 6 8 10 12 2 4 6 8 10 12 2 4 6 8 10 12 2 4 6 8 10 12 6 most important eigenvectors
  • 31. 2 4 6 8 10 12 2 4 6 8 10 12 2 4 6 8 10 12 2 4 6 8 10 12 2 4 6 8 10 12 2 4 6 8 10 12 3 most important eigenvectors
  • 33. 60 most important eigenvectors Looks like the discrete cosine bases of JPG!...
  • 34. 2D Discrete Cosine Basis http://en.wikipedia.org/wiki/Discrete_cosine_transform
  • 35. Dimensionality reduction • PCA (Principal Component Analysis): • Find projection that maximize the variance • ICA (Independent Component Analysis): • Very similar to PCA except that it assumes non-Guassian features • Multidimensional Scaling: • Find projection that best preserves inter-point distances • LDA(Linear Discriminant Analysis): • Maximizing the component axes for class-separation • … • …