SlideShare a Scribd company logo
1 of 19
Foundations of Machine Learning
Sudeshna Sarkar
IIT Kharagpur
Module 3: Instance Based Learning and
Feature Reduction
Part C: Feature Extraction
Feature extraction - definition
• Given a set of features 𝐹 = {𝑥1, … , 𝑥𝑁}
the Feature Extraction(“Construction”) problem is
is to map 𝐹 to some feature set 𝐹′′ that maximizes
the learner’s ability to classify patterns
Feature Extraction
• Find a projection matrix w from N-dimensional to M-
dimensional vectors that keeps error low
• Assume that N features are linear combination of M < 𝑁
vectors
𝑧𝑖 = 𝑤𝑖1𝑥𝑖1 + ⋯ + 𝑤𝑖𝑑𝑥𝑖𝑁
𝒛 = 𝑤𝑇
𝒙
• What we expect from such basis
– Uncorrelated, cannot be reduced further
– Have large variance or otherwise bear no information
Geometric picture of principal components (PCs)
Geometric picture of principal components (PCs)
Geometric picture of principal components (PCs)
Algebraic definition of PCs
.
,
,
2
,
1
,
1
1
1
1 p
j
x
w
x
w
z
N
i
ij
i
j
T



 

  N
p
x
x
x 

,
,
, 2
1 
]
var[ 1
z
Given a sample of p observations on a vector of N variables
define the first principal component of the sample
by the linear transformation
where the vector
is chosen such that is maximum.
)
,
,
,
(
)
,
,
,
(
2
1
1
21
11
1
Nj
j
j
j
N
x
x
x
x
w
w
w
w




PCA
PCA
• Choose directions such that a total variance of data
will be maximum
1. Maximize Total Variance
• Choose directions that are orthogonal
2. Minimize correlation
• Choose 𝑀 < 𝑁 orthogonal directions which
maximize total variance
PCA
• 𝑁-dimensional feature space
• N × 𝑁 symmetric covariance matrix estimated from
samples 𝐶𝑜𝑣 𝒙 = Σ
• Select 𝑀 largest eigenvalue of the covariance matrix
and associated 𝑀 eigenvectors
• The first eigenvector will be a direction with largest
variance
PCA for image compression
p=1 p=2 p=4 p=8
p=16 p=32 p=64 p=100
Original
Image
Is PCA a good criterion for classification?
• Data variation
determines the
projection direction
• What’s missing?
– Class information
What is a good projection?
• Similarly, what is a
good criterion?
– Separating different
classes
Two classes
overlap
Two classes are
separated
What class information may be useful?
Between-class distance
• Between-class distance
– Distance between the centroids
of different classes
What class information may be useful?
Within-class distance
• Between-class distance
– Distance between the centroids of
different classes
• Within-class distance
• Accumulated distance of an instance
to the centroid of its class
• Linear discriminant analysis (LDA) finds
most discriminant projection by
• maximizing between-class distance
• and minimizing within-class distance
Linear Discriminant Analysis
• Find a low-dimensional space such that when
𝒙 is projected, classes are well-separated
Means and Scatter after projection
Good Projection
• Means are as far away as possible
• Scatter is small as possible
• Fisher Linear Discriminant
 
 
2
1 2
2 2
1 2
m m
J
s s



w
Thank You

More Related Content

Similar to introduction to machine learning 3c-feature-extraction.pptx

Singular Value Decomposition (SVD).pptx
Singular Value Decomposition (SVD).pptxSingular Value Decomposition (SVD).pptx
Singular Value Decomposition (SVD).pptxrajalakshmi5921
 
EDAB Module 5 Singular Value Decomposition (SVD).pptx
EDAB Module 5 Singular Value Decomposition (SVD).pptxEDAB Module 5 Singular Value Decomposition (SVD).pptx
EDAB Module 5 Singular Value Decomposition (SVD).pptxrajalakshmi5921
 
Lect4 principal component analysis-I
Lect4 principal component analysis-ILect4 principal component analysis-I
Lect4 principal component analysis-Ihktripathy
 
introduction to Statistical Theory.pptx
 introduction to Statistical Theory.pptx introduction to Statistical Theory.pptx
introduction to Statistical Theory.pptxDr.Shweta
 
Digital Image Classification.pptx
Digital Image Classification.pptxDigital Image Classification.pptx
Digital Image Classification.pptxHline Win
 
Lecture 11 - KNN and Clustering, a lecture in subject module Statistical & Ma...
Lecture 11 - KNN and Clustering, a lecture in subject module Statistical & Ma...Lecture 11 - KNN and Clustering, a lecture in subject module Statistical & Ma...
Lecture 11 - KNN and Clustering, a lecture in subject module Statistical & Ma...Maninda Edirisooriya
 
ML SFCSE.pptx
ML SFCSE.pptxML SFCSE.pptx
ML SFCSE.pptxNIKHILGR3
 
04 Classification in Data Mining
04 Classification in Data Mining04 Classification in Data Mining
04 Classification in Data MiningValerii Klymchuk
 
Data Science and Machine Learning with Tensorflow
 Data Science and Machine Learning with Tensorflow Data Science and Machine Learning with Tensorflow
Data Science and Machine Learning with TensorflowShubham Sharma
 
Supervised and unsupervised learning
Supervised and unsupervised learningSupervised and unsupervised learning
Supervised and unsupervised learningAmAn Singh
 
Week 13 Feature Selection Computer Vision Bagian 2
Week 13 Feature Selection Computer Vision Bagian 2Week 13 Feature Selection Computer Vision Bagian 2
Week 13 Feature Selection Computer Vision Bagian 2khairulhuda242
 
Machine learning Introduction
Machine learning IntroductionMachine learning Introduction
Machine learning IntroductionKuppusamy P
 
Machine Learning Foundations for Professional Managers
Machine Learning Foundations for Professional ManagersMachine Learning Foundations for Professional Managers
Machine Learning Foundations for Professional ManagersAlbert Y. C. Chen
 
UNIT 3: Data Warehousing and Data Mining
UNIT 3: Data Warehousing and Data MiningUNIT 3: Data Warehousing and Data Mining
UNIT 3: Data Warehousing and Data MiningNandakumar P
 
Predictive analytics
Predictive analyticsPredictive analytics
Predictive analyticsDinakar nk
 

Similar to introduction to machine learning 3c-feature-extraction.pptx (20)

Singular Value Decomposition (SVD).pptx
Singular Value Decomposition (SVD).pptxSingular Value Decomposition (SVD).pptx
Singular Value Decomposition (SVD).pptx
 
EDAB Module 5 Singular Value Decomposition (SVD).pptx
EDAB Module 5 Singular Value Decomposition (SVD).pptxEDAB Module 5 Singular Value Decomposition (SVD).pptx
EDAB Module 5 Singular Value Decomposition (SVD).pptx
 
Lect4 principal component analysis-I
Lect4 principal component analysis-ILect4 principal component analysis-I
Lect4 principal component analysis-I
 
introduction to Statistical Theory.pptx
 introduction to Statistical Theory.pptx introduction to Statistical Theory.pptx
introduction to Statistical Theory.pptx
 
Digital Image Classification.pptx
Digital Image Classification.pptxDigital Image Classification.pptx
Digital Image Classification.pptx
 
Covariance.pdf
Covariance.pdfCovariance.pdf
Covariance.pdf
 
Lecture 11 - KNN and Clustering, a lecture in subject module Statistical & Ma...
Lecture 11 - KNN and Clustering, a lecture in subject module Statistical & Ma...Lecture 11 - KNN and Clustering, a lecture in subject module Statistical & Ma...
Lecture 11 - KNN and Clustering, a lecture in subject module Statistical & Ma...
 
ML SFCSE.pptx
ML SFCSE.pptxML SFCSE.pptx
ML SFCSE.pptx
 
04 Classification in Data Mining
04 Classification in Data Mining04 Classification in Data Mining
04 Classification in Data Mining
 
Data Science and Machine Learning with Tensorflow
 Data Science and Machine Learning with Tensorflow Data Science and Machine Learning with Tensorflow
Data Science and Machine Learning with Tensorflow
 
Supervised and unsupervised learning
Supervised and unsupervised learningSupervised and unsupervised learning
Supervised and unsupervised learning
 
Week 13 Feature Selection Computer Vision Bagian 2
Week 13 Feature Selection Computer Vision Bagian 2Week 13 Feature Selection Computer Vision Bagian 2
Week 13 Feature Selection Computer Vision Bagian 2
 
PPT1
PPT1PPT1
PPT1
 
Knn 160904075605-converted
Knn 160904075605-convertedKnn 160904075605-converted
Knn 160904075605-converted
 
Density based clustering
Density based clusteringDensity based clustering
Density based clustering
 
Machine learning Introduction
Machine learning IntroductionMachine learning Introduction
Machine learning Introduction
 
Machine Learning Foundations for Professional Managers
Machine Learning Foundations for Professional ManagersMachine Learning Foundations for Professional Managers
Machine Learning Foundations for Professional Managers
 
UNIT 3: Data Warehousing and Data Mining
UNIT 3: Data Warehousing and Data MiningUNIT 3: Data Warehousing and Data Mining
UNIT 3: Data Warehousing and Data Mining
 
Biehl hanze-2021
Biehl hanze-2021Biehl hanze-2021
Biehl hanze-2021
 
Predictive analytics
Predictive analyticsPredictive analytics
Predictive analytics
 

More from Pratik Gohel

Introduction to Asic Design and VLSI Design
Introduction to Asic Design and VLSI DesignIntroduction to Asic Design and VLSI Design
Introduction to Asic Design and VLSI DesignPratik Gohel
 
Information and Theory coding Lecture 18
Information and Theory coding Lecture 18Information and Theory coding Lecture 18
Information and Theory coding Lecture 18Pratik Gohel
 
introduction to machine learning 3c.pptx
introduction to machine learning 3c.pptxintroduction to machine learning 3c.pptx
introduction to machine learning 3c.pptxPratik Gohel
 
introduction to machine learning 3d-collab-filtering.pptx
introduction to machine learning 3d-collab-filtering.pptxintroduction to machine learning 3d-collab-filtering.pptx
introduction to machine learning 3d-collab-filtering.pptxPratik Gohel
 
Introduction to embedded System.pptx
Introduction to embedded System.pptxIntroduction to embedded System.pptx
Introduction to embedded System.pptxPratik Gohel
 
710402_Lecture 1.ppt
710402_Lecture 1.ppt710402_Lecture 1.ppt
710402_Lecture 1.pptPratik Gohel
 
Interdependencies of IoT and cloud computing.pptx
Interdependencies of IoT and cloud computing.pptxInterdependencies of IoT and cloud computing.pptx
Interdependencies of IoT and cloud computing.pptxPratik Gohel
 
6-IoT protocol.pptx
6-IoT protocol.pptx6-IoT protocol.pptx
6-IoT protocol.pptxPratik Gohel
 
C Programming for ARM.pptx
C Programming for ARM.pptxC Programming for ARM.pptx
C Programming for ARM.pptxPratik Gohel
 
ARM Introduction.pptx
ARM Introduction.pptxARM Introduction.pptx
ARM Introduction.pptxPratik Gohel
 
machine learning.ppt
machine learning.pptmachine learning.ppt
machine learning.pptPratik Gohel
 

More from Pratik Gohel (18)

Introduction to Asic Design and VLSI Design
Introduction to Asic Design and VLSI DesignIntroduction to Asic Design and VLSI Design
Introduction to Asic Design and VLSI Design
 
Information and Theory coding Lecture 18
Information and Theory coding Lecture 18Information and Theory coding Lecture 18
Information and Theory coding Lecture 18
 
introduction to machine learning 3c.pptx
introduction to machine learning 3c.pptxintroduction to machine learning 3c.pptx
introduction to machine learning 3c.pptx
 
introduction to machine learning 3d-collab-filtering.pptx
introduction to machine learning 3d-collab-filtering.pptxintroduction to machine learning 3d-collab-filtering.pptx
introduction to machine learning 3d-collab-filtering.pptx
 
13486500-FFT.ppt
13486500-FFT.ppt13486500-FFT.ppt
13486500-FFT.ppt
 
Introduction to embedded System.pptx
Introduction to embedded System.pptxIntroduction to embedded System.pptx
Introduction to embedded System.pptx
 
Lecture 3.ppt
Lecture 3.pptLecture 3.ppt
Lecture 3.ppt
 
710402_Lecture 1.ppt
710402_Lecture 1.ppt710402_Lecture 1.ppt
710402_Lecture 1.ppt
 
UNIT-2.pptx
UNIT-2.pptxUNIT-2.pptx
UNIT-2.pptx
 
Interdependencies of IoT and cloud computing.pptx
Interdependencies of IoT and cloud computing.pptxInterdependencies of IoT and cloud computing.pptx
Interdependencies of IoT and cloud computing.pptx
 
Chapter1.pdf
Chapter1.pdfChapter1.pdf
Chapter1.pdf
 
6-IoT protocol.pptx
6-IoT protocol.pptx6-IoT protocol.pptx
6-IoT protocol.pptx
 
IOT gateways.pptx
IOT gateways.pptxIOT gateways.pptx
IOT gateways.pptx
 
AVRTIMER.pptx
AVRTIMER.pptxAVRTIMER.pptx
AVRTIMER.pptx
 
C Programming for ARM.pptx
C Programming for ARM.pptxC Programming for ARM.pptx
C Programming for ARM.pptx
 
ARM Introduction.pptx
ARM Introduction.pptxARM Introduction.pptx
ARM Introduction.pptx
 
arm
armarm
arm
 
machine learning.ppt
machine learning.pptmachine learning.ppt
machine learning.ppt
 

Recently uploaded

Study on Air-Water & Water-Water Heat Exchange in a Finned Tube Exchanger
Study on Air-Water & Water-Water Heat Exchange in a Finned Tube ExchangerStudy on Air-Water & Water-Water Heat Exchange in a Finned Tube Exchanger
Study on Air-Water & Water-Water Heat Exchange in a Finned Tube ExchangerAnamika Sarkar
 
Introduction to Microprocesso programming and interfacing.pptx
Introduction to Microprocesso programming and interfacing.pptxIntroduction to Microprocesso programming and interfacing.pptx
Introduction to Microprocesso programming and interfacing.pptxvipinkmenon1
 
Application of Residue Theorem to evaluate real integrations.pptx
Application of Residue Theorem to evaluate real integrations.pptxApplication of Residue Theorem to evaluate real integrations.pptx
Application of Residue Theorem to evaluate real integrations.pptx959SahilShah
 
IVE Industry Focused Event - Defence Sector 2024
IVE Industry Focused Event - Defence Sector 2024IVE Industry Focused Event - Defence Sector 2024
IVE Industry Focused Event - Defence Sector 2024Mark Billinghurst
 
Biology for Computer Engineers Course Handout.pptx
Biology for Computer Engineers Course Handout.pptxBiology for Computer Engineers Course Handout.pptx
Biology for Computer Engineers Course Handout.pptxDeepakSakkari2
 
Call Us ≽ 8377877756 ≼ Call Girls In Shastri Nagar (Delhi)
Call Us ≽ 8377877756 ≼ Call Girls In Shastri Nagar (Delhi)Call Us ≽ 8377877756 ≼ Call Girls In Shastri Nagar (Delhi)
Call Us ≽ 8377877756 ≼ Call Girls In Shastri Nagar (Delhi)dollysharma2066
 
Microscopic Analysis of Ceramic Materials.pptx
Microscopic Analysis of Ceramic Materials.pptxMicroscopic Analysis of Ceramic Materials.pptx
Microscopic Analysis of Ceramic Materials.pptxpurnimasatapathy1234
 
main PPT.pptx of girls hostel security using rfid
main PPT.pptx of girls hostel security using rfidmain PPT.pptx of girls hostel security using rfid
main PPT.pptx of girls hostel security using rfidNikhilNagaraju
 
VIP Call Girls Service Hitech City Hyderabad Call +91-8250192130
VIP Call Girls Service Hitech City Hyderabad Call +91-8250192130VIP Call Girls Service Hitech City Hyderabad Call +91-8250192130
VIP Call Girls Service Hitech City Hyderabad Call +91-8250192130Suhani Kapoor
 
GDSC ASEB Gen AI study jams presentation
GDSC ASEB Gen AI study jams presentationGDSC ASEB Gen AI study jams presentation
GDSC ASEB Gen AI study jams presentationGDSCAESB
 
HARMONY IN THE NATURE AND EXISTENCE - Unit-IV
HARMONY IN THE NATURE AND EXISTENCE - Unit-IVHARMONY IN THE NATURE AND EXISTENCE - Unit-IV
HARMONY IN THE NATURE AND EXISTENCE - Unit-IVRajaP95
 
OSVC_Meta-Data based Simulation Automation to overcome Verification Challenge...
OSVC_Meta-Data based Simulation Automation to overcome Verification Challenge...OSVC_Meta-Data based Simulation Automation to overcome Verification Challenge...
OSVC_Meta-Data based Simulation Automation to overcome Verification Challenge...Soham Mondal
 
HARMONY IN THE HUMAN BEING - Unit-II UHV-2
HARMONY IN THE HUMAN BEING - Unit-II UHV-2HARMONY IN THE HUMAN BEING - Unit-II UHV-2
HARMONY IN THE HUMAN BEING - Unit-II UHV-2RajaP95
 
CCS355 Neural Network & Deep Learning UNIT III notes and Question bank .pdf
CCS355 Neural Network & Deep Learning UNIT III notes and Question bank .pdfCCS355 Neural Network & Deep Learning UNIT III notes and Question bank .pdf
CCS355 Neural Network & Deep Learning UNIT III notes and Question bank .pdfAsst.prof M.Gokilavani
 
chaitra-1.pptx fake news detection using machine learning
chaitra-1.pptx  fake news detection using machine learningchaitra-1.pptx  fake news detection using machine learning
chaitra-1.pptx fake news detection using machine learningmisbanausheenparvam
 
Current Transformer Drawing and GTP for MSETCL
Current Transformer Drawing and GTP for MSETCLCurrent Transformer Drawing and GTP for MSETCL
Current Transformer Drawing and GTP for MSETCLDeelipZope
 

Recently uploaded (20)

POWER SYSTEMS-1 Complete notes examples
POWER SYSTEMS-1 Complete notes  examplesPOWER SYSTEMS-1 Complete notes  examples
POWER SYSTEMS-1 Complete notes examples
 
Study on Air-Water & Water-Water Heat Exchange in a Finned Tube Exchanger
Study on Air-Water & Water-Water Heat Exchange in a Finned Tube ExchangerStudy on Air-Water & Water-Water Heat Exchange in a Finned Tube Exchanger
Study on Air-Water & Water-Water Heat Exchange in a Finned Tube Exchanger
 
Introduction to Microprocesso programming and interfacing.pptx
Introduction to Microprocesso programming and interfacing.pptxIntroduction to Microprocesso programming and interfacing.pptx
Introduction to Microprocesso programming and interfacing.pptx
 
Application of Residue Theorem to evaluate real integrations.pptx
Application of Residue Theorem to evaluate real integrations.pptxApplication of Residue Theorem to evaluate real integrations.pptx
Application of Residue Theorem to evaluate real integrations.pptx
 
young call girls in Rajiv Chowk🔝 9953056974 🔝 Delhi escort Service
young call girls in Rajiv Chowk🔝 9953056974 🔝 Delhi escort Serviceyoung call girls in Rajiv Chowk🔝 9953056974 🔝 Delhi escort Service
young call girls in Rajiv Chowk🔝 9953056974 🔝 Delhi escort Service
 
IVE Industry Focused Event - Defence Sector 2024
IVE Industry Focused Event - Defence Sector 2024IVE Industry Focused Event - Defence Sector 2024
IVE Industry Focused Event - Defence Sector 2024
 
Biology for Computer Engineers Course Handout.pptx
Biology for Computer Engineers Course Handout.pptxBiology for Computer Engineers Course Handout.pptx
Biology for Computer Engineers Course Handout.pptx
 
Call Us ≽ 8377877756 ≼ Call Girls In Shastri Nagar (Delhi)
Call Us ≽ 8377877756 ≼ Call Girls In Shastri Nagar (Delhi)Call Us ≽ 8377877756 ≼ Call Girls In Shastri Nagar (Delhi)
Call Us ≽ 8377877756 ≼ Call Girls In Shastri Nagar (Delhi)
 
★ CALL US 9953330565 ( HOT Young Call Girls In Badarpur delhi NCR
★ CALL US 9953330565 ( HOT Young Call Girls In Badarpur delhi NCR★ CALL US 9953330565 ( HOT Young Call Girls In Badarpur delhi NCR
★ CALL US 9953330565 ( HOT Young Call Girls In Badarpur delhi NCR
 
Microscopic Analysis of Ceramic Materials.pptx
Microscopic Analysis of Ceramic Materials.pptxMicroscopic Analysis of Ceramic Materials.pptx
Microscopic Analysis of Ceramic Materials.pptx
 
main PPT.pptx of girls hostel security using rfid
main PPT.pptx of girls hostel security using rfidmain PPT.pptx of girls hostel security using rfid
main PPT.pptx of girls hostel security using rfid
 
VIP Call Girls Service Hitech City Hyderabad Call +91-8250192130
VIP Call Girls Service Hitech City Hyderabad Call +91-8250192130VIP Call Girls Service Hitech City Hyderabad Call +91-8250192130
VIP Call Girls Service Hitech City Hyderabad Call +91-8250192130
 
GDSC ASEB Gen AI study jams presentation
GDSC ASEB Gen AI study jams presentationGDSC ASEB Gen AI study jams presentation
GDSC ASEB Gen AI study jams presentation
 
HARMONY IN THE NATURE AND EXISTENCE - Unit-IV
HARMONY IN THE NATURE AND EXISTENCE - Unit-IVHARMONY IN THE NATURE AND EXISTENCE - Unit-IV
HARMONY IN THE NATURE AND EXISTENCE - Unit-IV
 
OSVC_Meta-Data based Simulation Automation to overcome Verification Challenge...
OSVC_Meta-Data based Simulation Automation to overcome Verification Challenge...OSVC_Meta-Data based Simulation Automation to overcome Verification Challenge...
OSVC_Meta-Data based Simulation Automation to overcome Verification Challenge...
 
HARMONY IN THE HUMAN BEING - Unit-II UHV-2
HARMONY IN THE HUMAN BEING - Unit-II UHV-2HARMONY IN THE HUMAN BEING - Unit-II UHV-2
HARMONY IN THE HUMAN BEING - Unit-II UHV-2
 
CCS355 Neural Network & Deep Learning UNIT III notes and Question bank .pdf
CCS355 Neural Network & Deep Learning UNIT III notes and Question bank .pdfCCS355 Neural Network & Deep Learning UNIT III notes and Question bank .pdf
CCS355 Neural Network & Deep Learning UNIT III notes and Question bank .pdf
 
Design and analysis of solar grass cutter.pdf
Design and analysis of solar grass cutter.pdfDesign and analysis of solar grass cutter.pdf
Design and analysis of solar grass cutter.pdf
 
chaitra-1.pptx fake news detection using machine learning
chaitra-1.pptx  fake news detection using machine learningchaitra-1.pptx  fake news detection using machine learning
chaitra-1.pptx fake news detection using machine learning
 
Current Transformer Drawing and GTP for MSETCL
Current Transformer Drawing and GTP for MSETCLCurrent Transformer Drawing and GTP for MSETCL
Current Transformer Drawing and GTP for MSETCL
 

introduction to machine learning 3c-feature-extraction.pptx

  • 1. Foundations of Machine Learning Sudeshna Sarkar IIT Kharagpur Module 3: Instance Based Learning and Feature Reduction Part C: Feature Extraction
  • 2. Feature extraction - definition • Given a set of features 𝐹 = {𝑥1, … , 𝑥𝑁} the Feature Extraction(“Construction”) problem is is to map 𝐹 to some feature set 𝐹′′ that maximizes the learner’s ability to classify patterns
  • 3. Feature Extraction • Find a projection matrix w from N-dimensional to M- dimensional vectors that keeps error low • Assume that N features are linear combination of M < 𝑁 vectors 𝑧𝑖 = 𝑤𝑖1𝑥𝑖1 + ⋯ + 𝑤𝑖𝑑𝑥𝑖𝑁 𝒛 = 𝑤𝑇 𝒙 • What we expect from such basis – Uncorrelated, cannot be reduced further – Have large variance or otherwise bear no information
  • 4. Geometric picture of principal components (PCs)
  • 5. Geometric picture of principal components (PCs)
  • 6. Geometric picture of principal components (PCs)
  • 7. Algebraic definition of PCs . , , 2 , 1 , 1 1 1 1 p j x w x w z N i ij i j T         N p x x x   , , , 2 1  ] var[ 1 z Given a sample of p observations on a vector of N variables define the first principal component of the sample by the linear transformation where the vector is chosen such that is maximum. ) , , , ( ) , , , ( 2 1 1 21 11 1 Nj j j j N x x x x w w w w    
  • 8. PCA
  • 9. PCA • Choose directions such that a total variance of data will be maximum 1. Maximize Total Variance • Choose directions that are orthogonal 2. Minimize correlation • Choose 𝑀 < 𝑁 orthogonal directions which maximize total variance
  • 10. PCA • 𝑁-dimensional feature space • N × 𝑁 symmetric covariance matrix estimated from samples 𝐶𝑜𝑣 𝒙 = Σ • Select 𝑀 largest eigenvalue of the covariance matrix and associated 𝑀 eigenvectors • The first eigenvector will be a direction with largest variance
  • 11. PCA for image compression p=1 p=2 p=4 p=8 p=16 p=32 p=64 p=100 Original Image
  • 12. Is PCA a good criterion for classification? • Data variation determines the projection direction • What’s missing? – Class information
  • 13. What is a good projection? • Similarly, what is a good criterion? – Separating different classes Two classes overlap Two classes are separated
  • 14. What class information may be useful? Between-class distance • Between-class distance – Distance between the centroids of different classes
  • 15. What class information may be useful? Within-class distance • Between-class distance – Distance between the centroids of different classes • Within-class distance • Accumulated distance of an instance to the centroid of its class • Linear discriminant analysis (LDA) finds most discriminant projection by • maximizing between-class distance • and minimizing within-class distance
  • 16. Linear Discriminant Analysis • Find a low-dimensional space such that when 𝒙 is projected, classes are well-separated
  • 17. Means and Scatter after projection
  • 18. Good Projection • Means are as far away as possible • Scatter is small as possible • Fisher Linear Discriminant     2 1 2 2 2 1 2 m m J s s    w