Theory and Toolkits
of PCA
2009 5/4 IRLab
Study Group
Presenter : Chin-Hui Chen
Agenda
 Theory :
◦ 1. Scenario
◦ 2. What is PCA?
◦ 3. How to minimize Squared-Error ?
◦ 4. Dimensionality Reduction
 Toolkit :
◦ A list of PCA toolkits
◦ Demo
Scenario (Point? Line?)
 Consider a 2-dimensional space
◦ Which single point, or which line, represents the data with the least squared error ?
What is PCA ? (1)
 Principal component analysis (PCA)
involves a mathematical procedure that
transforms a number of possibly
correlated variables into a smaller number
of uncorrelated variables called “principal
components”.
What is PCA ? (2)
 What can PCA do ?
◦ Dimensionality Reduction
 For example :
◦ Assuming N points in D-dim space
◦ e.g. {x1, x2, x3, x4} ; xi = (v1, v2)
◦ A set of M basis vectors for projection
◦ e.g. {u1}
 The basis vectors are orthonormal (unit length, pairwise inner product 0)
 M << D (represent each point with M features)
◦ e.g. xi = (p1)
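The example above can be sketched numerically. This is a minimal illustration, assuming NumPy; the four points and the basis vector u1 are made-up values, and centering on the mean is left out to keep it short.

```python
import numpy as np

# Four hypothetical 2-D points (N = 4, D = 2); the values are illustrative only.
X = np.array([[1.0, 1.0],
              [2.0, 2.0],
              [3.0, 3.0],
              [4.0, 4.0]])

# One orthonormal basis vector (M = 1): unit length, pointing along the diagonal.
u1 = np.array([1.0, 1.0]) / np.sqrt(2.0)

# Each point xi = (v1, v2) is reduced to a single coordinate p1 = u1 . xi.
P = X @ u1
print(P)
```

Each 2-D point is now described by one number, its coordinate along u1.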
How to minimize Squared-Error ?
 Consider a D-dimensional space
◦ Given n points : {x1, x2, …, xn}
◦ Each xi is a D-dim vector
 How to
◦ 1. Find a point that minimizes the squared error
◦ 2. Find a line that minimizes the squared error
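How to ? - Point
◦ Goal : Find $x_0$ that minimizes $J_0(x_0) = \sum_{k=1}^{n} \|x_0 - x_k\|^2$
◦ Let $m = \frac{1}{n} \sum_{k=1}^{n} x_k$ (the sample mean).
◦ Then $J_0(x_0) = n\|x_0 - m\|^2 + \sum_{k=1}^{n} \|x_k - m\|^2$ ; the second term does not depend on $x_0$, so $J_0$ is smallest at $x_0 = m$.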
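A quick numerical check that the sample mean minimizes the sum of squared distances. This is only a sketch, assuming NumPy; the data are random, and `J0` is a hypothetical helper name for the squared-error criterion.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(50, 3))   # hypothetical data: 50 points in 3 dimensions

def J0(x0, X):
    """Sum of squared distances from x0 to every row of X."""
    return np.sum((X - x0) ** 2)

m = X.mean(axis=0)

# Moving away from the mean in any direction should never lower the error.
trials = [J0(m + rng.normal(scale=0.1, size=3), X) for _ in range(100)]
print(J0(m, X), min(trials))
```

Every perturbed point gives a larger error than the mean.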
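How to ? – Point - Line
 ∴ $x_0 = m$
◦ 1. Find a point that minimizes the squared error
◦ 2. Find a line that minimizes the squared error
 Line L through $x_0$ with unit direction $e$ : $x_k' - x_0 = a_k e$
 $x_k' = x_0 + a_k e$
 $= m + a_k e$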
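How to ? – Line
 L : $x_k' = m + a_k e$
 Goal : minimize $J_1(a_1, \dots, a_n, e) = \sum_{k=1}^{n} \|(m + a_k e) - x_k\|^2$
 First, find $a_1, \dots, a_n$
 Expanding with $\|e\| = 1$ : $J_1 = \sum_{k} a_k^2 - 2\sum_{k} a_k e^T (x_k - m) + \sum_{k} \|x_k - m\|^2$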
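How to ? – Line
 Differentiating with respect to each $a_k$ gives $2a_k - 2e^T(x_k - m)$
 Setting this to zero : $a_k = e^T(x_k - m)$
 What does it mean ? $a_k$ is the coordinate of $x_k$ along $e$ : the orthogonal projection of $x_k - m$ onto the line
$x_k' = m + a_k e$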
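The projection step can be sketched as follows, assuming NumPy. The data are random and the direction e is an arbitrary unit vector, not yet the optimal one.

```python
import numpy as np

rng = np.random.default_rng(1)
X = rng.normal(size=(20, 2))       # hypothetical data: 20 points in 2 dimensions
m = X.mean(axis=0)
e = np.array([3.0, 4.0]) / 5.0     # any unit vector; chosen arbitrarily here

a = (X - m) @ e                    # a_k = e^T (x_k - m), one scalar per point
X_proj = m + np.outer(a, e)        # x_k' = m + a_k e, the points on the line L

# The residual x_k - x_k' should be orthogonal to the direction e.
residual_dot_e = (X - X_proj) @ e
print(np.abs(residual_dot_e).max())
```

The residuals are orthogonal to e, so each x_k' is the closest point to x_k on the line.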
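How to ? – Line
 Then, how about e ?
 Substitute $a_k = e^T(x_k - m)$ back into $J_1$
How to ? – Line
 Let $S = \sum_{k=1}^{n} (x_k - m)(x_k - m)^T$ (the scatter matrix)
 Then $J_1(e) = -e^T S e + \sum_{k=1}^{n} \|x_k - m\|^2$, where the last term is independent of e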
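How to ? – Line
 Minimizing $J_1(e)$ is the same as minimizing $J_1'(e) = -e^T S e$, i.e. maximizing $e^T S e$
 This is a constrained problem : optimize f(x,y) subject to g(x,y) = 0
 Use a Lagrange multiplier : optimize $f(x,y) - \lambda g(x,y)$
 Because $\|e\| = 1$ : $u = e^T S e - \lambda(e^T e - 1)$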
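How to ? – Line
 $\partial u / \partial e = 2Se - 2\lambda e = 0$, so $Se = \lambda e$
◦ What is S ?
 The covariance matrix of the data (a constant scale factor such as 1/n does not change the eigenvectors)
◦ Assume D-dim : S is a D × D matrix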
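How to ? – Line
 $Se = \lambda e$, and we know S.
 Then, what is e ? An eigenvector of S.
 This has the same form as the eigenvalue equation $AX = \lambda X$. Since $e^T S e = \lambda e^T e = \lambda$, choose the eigenvector with the largest eigenvalue.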
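A small numerical check of this result, assuming NumPy: the eigenvector with the largest eigenvalue should give a value of e^T S e at least as large as any other unit vector. The data and the helper name `quad` are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(2)
# Hypothetical correlated 2-D data.
X = rng.normal(size=(200, 2)) @ np.array([[2.0, 0.0], [1.5, 0.5]])
m = X.mean(axis=0)
S = (X - m).T @ (X - m)                # scatter matrix, D x D

# eigh is meant for symmetric matrices and returns eigenvalues in ascending order.
eigvals, eigvecs = np.linalg.eigh(S)
e_best = eigvecs[:, -1]                # eigenvector of the largest eigenvalue

def quad(e):
    """The quantity e^T S e that the line direction should maximize."""
    return e @ S @ e

# Compare against 500 random unit vectors in the plane.
angles = rng.uniform(0.0, 2.0 * np.pi, size=500)
candidates = np.stack([np.cos(angles), np.sin(angles)], axis=1)
best_random = max(quad(c) for c in candidates)
print(quad(e_best), best_random)
```

No random direction beats the top eigenvector, and its value of e^T S e equals the largest eigenvalue.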
How to ? – conclusion
 Summary :
◦ Find a line : $x_k' = m + a_k e$
 $a_k = e^T(x_k - m)$
 $Se = \lambda e$ ; e is an eigenvector of the covariance matrix.
◦ A D-dim space yields D eigenvectors.
Dimensionality
Reduction
Dimensionality Reduction
 Consider a 2-dim space …
◦ Original axes : X1 = (a,b), X2 = (c,d)
◦ New axes (after the change of basis) : X1 = (a’,b’), X2 = (c’,d’)
 We are going to keep only the first new coordinate …
◦ X1 = (a’), X2 = (c’)
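Dimensionality Reduction
 We want to prove :
◦ The new axes of the data are uncorrelated.
 Consider n m-dim vectors
◦ {x1, x2, … ,xn}
◦ Let $X = [x_1 - \bar{x} \;\; x_2 - \bar{x} \;\; \dots \;\; x_n - \bar{x}]^T$, where $\bar{x}$ is the mean (written m on the earlier slides)
◦ Let $E = [e_1 \; e_2 \; \dots \; e_m]$
 Eigendecomposition $Se_i = \lambda_i e_i$ gives
◦ Eigenvectors {e1,…,em}
◦ Eigenvalues {λ1,…, λm}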
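Dimensionality Reduction
 $SE = [Se_1 \; Se_2 \; \dots \; Se_m]$
 $= [\lambda_1 e_1 \; \lambda_2 e_2 \; \dots \; \lambda_m e_m]$
 $= [e_1 \; e_2 \; \dots \; e_m] \, \mathrm{diag}(\lambda_1, \lambda_2, \dots, \lambda_m)$
 $= ED$, where $D = \mathrm{diag}(\lambda_1, \dots, \lambda_m)$
 $S = EDE^{-1}$ ( $E^{-1} = E^T$ because the eigenvectors are orthonormal)
where $E = [e_1 \; e_2 \; \dots \; e_m]$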
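The decomposition can be verified numerically. A minimal sketch, assuming NumPy; the symmetric matrix S below is a made-up example, not taken from the slides.

```python
import numpy as np

# Hypothetical symmetric matrix standing in for a covariance matrix.
S = np.array([[4.0, 2.0, 0.0],
              [2.0, 3.0, 1.0],
              [0.0, 1.0, 2.0]])

lam, E = np.linalg.eigh(S)     # columns of E are orthonormal eigenvectors
D = np.diag(lam)               # eigenvalues on the diagonal

print(np.allclose(S @ E, E @ D))                  # SE = ED
print(np.allclose(S, E @ D @ np.linalg.inv(E)))   # S = E D E^-1
print(np.allclose(np.linalg.inv(E), E.T))         # E^-1 = E^T
```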
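Dimensionality Reduction
 We want to know the new covariance matrix of the projected vectors.
 Let $Y = [y_1 \; y_2 \; \dots \; y_n]^T$
 $E = [e_1 \; e_2 \; \dots \; e_m]$
 Each projected vector is $y_k = E^T \tilde{x}_k$, where $\tilde{x}_k$ is the k-th centered vector ; with one vector per row of X this is $Y = XE$
 $S_Y = \frac{1}{n} Y^T Y = E^T \left(\frac{1}{n} X^T X\right) E = E^T S E$ (taking $S = \frac{1}{n} X^T X$)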
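Dimensionality Reduction
 $S_Y = E^T S E = E^T (E D E^{-1}) E = D$
 1. The covariance between any two new axes is 0.
 2. The more of the data an axis represents, the larger the variance along that axis, and the larger its λ.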
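A numerical check that the covariance of the projected data is the diagonal matrix of eigenvalues. This is a sketch, assuming NumPy, random data, one vector per row, and the 1/n covariance convention (an assumption; 1/(n-1) only rescales the eigenvalues).

```python
import numpy as np

rng = np.random.default_rng(3)
# Hypothetical correlated 3-D data, one vector per row.
X_raw = rng.normal(size=(500, 3)) @ np.array([[1.0, 0.5, 0.2],
                                               [0.0, 1.0, 0.3],
                                               [0.0, 0.0, 0.4]])
X = X_raw - X_raw.mean(axis=0)     # centered data
n = X.shape[0]

S = X.T @ X / n                    # covariance matrix (1/n convention)
lam, E = np.linalg.eigh(S)

Y = X @ E                          # row k is the projection E^T x_k of centered x_k
S_Y = Y.T @ Y / n                  # covariance of the projected data

print(np.round(S_Y, 6))
```

The off-diagonal entries come out as zero and the diagonal holds the eigenvalues.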
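Dimensionality Reduction
 Conclusion :
 To reduce the dimension from D to M (M << D) :
 1. Find the covariance matrix S
 2. Compute its eigenvalues and eigenvectors
 3. Select the top M eigenvalues and their eigenvectors
 4. Project the data onto those M eigenvectors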
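The four steps can be sketched as one function. This is a minimal illustration, assuming NumPy; `pca_reduce` is a hypothetical name, the data are random, and the 1/n covariance convention is an assumption. It is not the implementation used by any of the toolkits listed later.

```python
import numpy as np

def pca_reduce(X, M):
    """Project the rows of X (N x D) onto the top-M principal components."""
    m = X.mean(axis=0)
    Xc = X - m                            # center the data on the mean
    S = Xc.T @ Xc / X.shape[0]            # 1. covariance matrix (1/n convention)
    lam, E = np.linalg.eigh(S)            # 2. eigenvalues and eigenvectors
    order = np.argsort(lam)[::-1][:M]     # 3. indices of the top-M eigenvalues
    E_top = E[:, order]
    return Xc @ E_top, lam[order], E_top  # 4. projected data, N x M

rng = np.random.default_rng(4)
# Hypothetical data: 100 points in 5 dimensions with very different spreads.
X = rng.normal(size=(100, 5)) * np.array([5.0, 3.0, 1.0, 0.5, 0.1])
Y, top_lam, E_top = pca_reduce(X, M=2)
print(Y.shape, top_lam)
```

The variance of each column of Y equals the corresponding eigenvalue, which is why keeping the largest eigenvalues keeps the most variance.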
Toolkits
A List of PCA Toolkits
 C & Java
◦ Fionn Murtagh's Multivariate Data Analysis Software and Resources
◦ http://astro.u-strasbg.fr/~fmurtagh/mda-sw/
 Perl
◦ PDL::PCA
 Matlab
◦ Statistics Toolbox™ : princomp
 Weka
◦ weka.attributeSelection.PrincipalComponents
(http://www.laps.ufpa.br/aldebaro/weka/feature_selection.html )
A List of PCA Toolkits
 C & Java
◦ Fionn Murtagh's Multivariate Data Analysis Software and Resources
◦ http://astro.u-strasbg.fr/~fmurtagh/mda-sw/
C :
Download: pca.c
Compile: cc pca.c -lm -o pcac
Run: ./pcac spectr.dat 36 8 R > pcaout.c.txt
Java :
Download: JAMA, PCAcorr.java
Compile: javac -classpath Jama-1.0.2.jar PCAcorr.java
Run: java PCAcorr iris.dat > pcaout.java.txt

Editor's Notes

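  1. This means that once e is known, projecting any point xk in the space onto the line L only requires shifting its original coordinates by the mean and taking the inner product with e; the result is the point's new coordinate in the transformed space.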