Principal Component Analysis

Given a classification problem… how do we choose the right features?
Multivariate Observations
- require more 'resources' in the analysis
- information overlaps across several variables
What Is a Principal Component?
• It is a linear combination of the observed variables → the information contained in a PC is a blend of all the variables, each with a certain weight
• The linear combination chosen is the one with the largest variance → it carries the most information
• The PCs are mutually orthogonal → uncorrelated → their information does not overlap
Principal Component Analysis
Original variables: {X1, X2, …, Xp}
Principal components: {PC1, PC2, …, PCp}
Only k < p PCs are kept, yet they capture most of the information.
An Illustration of Principal Components
To describe what our high-school sweetheart's face looked like, we need not mention the sharp nose, the smooth skin, the beautiful flowing hair, and so on. It is enough to say 'My high-school sweetheart was beautiful.' The single word 'beautiful' already captures the whole description.
The Form of a Principal Component
PC1 = a1′x = a11x1 + … + a1pxp
If the original variables {X1, X2, …, Xp} have variance-covariance matrix Σ, then the variance of this principal component is

$$\sigma^2_{PC_1} = a_1' \Sigma a_1 = \sum_{i=1}^{p} \sum_{j=1}^{p} a_{1i} a_{1j} \sigma_{ij}$$

Our task is to find the vector a1 that makes this variance as large as possible (this vector turns out to be an eigenvector of Σ, as the next slide shows).
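A minimal sketch of this computation, assuming NumPy (the matrix reuses the two-variable covariance values from the worked example later in this deck): the variance a1′Σa1 of any weight vector can be evaluated directly, and the unit vector that maximizes it is the leading eigenvector.

```python
import numpy as np

# Covariance matrix from the 2D example later in the deck
Sigma = np.array([[6.6707, 3.4170],
                  [3.4170, 6.2384]])

def combo_var(a, Sigma):
    """Variance of the linear combination a'x, i.e. a' Sigma a."""
    a = np.asarray(a, dtype=float)
    return a @ Sigma @ a

print(combo_var([1.0, 0.0], Sigma))        # 6.6707: variance of x1 alone

# The unit vector with maximum variance is the leading eigenvector.
eigvals, eigvecs = np.linalg.eigh(Sigma)   # eigh: for symmetric matrices
a1 = eigvecs[:, -1]                        # eigenvector of largest eigenvalue
print(combo_var(a1, Sigma))                # ~9.8783, the largest eigenvalue
```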
Obtaining the First PC
• The vector a1 is the eigenvector of Σ associated with the largest eigenvalue.
• The linear combination of {X1, X2, …, Xp} given by PC1 = a1′x = a11x1 + … + a1pxp is known as the first principal component, and its variance is λ1, the largest eigenvalue.
The Second PC
• Its form is PC2 = a2′x = a21x1 + … + a2pxp
• We seek the vector a2 that maximizes the variance of PC2, subject to PC2 being uncorrelated with PC1
• a2 is simply the eigenvector associated with the second-largest eigenvalue of Σ.
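A short sketch of obtaining a1 and a2 by eigendecomposition and checking the two defining properties, maximal variance and zero correlation (NumPy assumed; the data are hypothetical):

```python
import numpy as np

rng = np.random.default_rng(0)            # hypothetical data for illustration
X = rng.normal(size=(200, 4))
X[:, 1] += 0.8 * X[:, 0]                  # induce some correlation

Sigma = np.cov(X, rowvar=False)           # sample variance-covariance matrix
eigvals, eigvecs = np.linalg.eigh(Sigma)  # eigh returns ascending eigenvalues
order = np.argsort(eigvals)[::-1]         # reorder to descending
eigvals, eigvecs = eigvals[order], eigvecs[:, order]

a1, a2 = eigvecs[:, 0], eigvecs[:, 1]
pc1, pc2 = X @ a1, X @ a2                 # scores on PC1 and PC2
print(np.var(pc1, ddof=1), eigvals[0])    # var(PC1) equals lambda_1
print(np.var(pc2, ddof=1), eigvals[1])    # var(PC2) equals lambda_2
print(np.cov(pc1, pc2)[0, 1])             # ~0: PC1 and PC2 uncorrelated
```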
Principal Components
Let λ1 ≥ λ2 ≥ … ≥ λp > 0 be the eigenvalues of Σ, with corresponding eigenvectors a1, a2, …, ap, each of unit length, i.e. ai′ai = 1 for i = 1, 2, …, p. Then PC1 = a1′x, PC2 = a2′x, …, PCp = ap′x are, respectively, the first, second, …, p-th principal components of x. Moreover var(PC1) = λ1, var(PC2) = λ2, …, var(PCp) = λp; that is, the eigenvalues of the variance-covariance matrix Σ are the variances of the principal components.
The Contribution of Each PC
• The variance of each PC equals an eigenvalue of Σ, namely λi
• The total variance of the original variables is tr(Σ), which equals the sum of all the eigenvalues
• Hence the contribution of the j-th PC is

$$\frac{\lambda_j}{\sum_{i=1}^{p} \lambda_i}$$
Interpreting Each PC
• Each PC is interpreted through the values in its vector aj, because these values are linearly related to the correlations between the X's and the PC
• The information in a PC is dominated by the X's that have large coefficients.
Common Issues in PCA
• Deriving the PCs from the 'variance-covariance matrix' vs the 'correlation matrix'
• Deciding how many PCs to keep
Correlation matrix or variance-covariance matrix?
In general this is a difficult question: there is no clear relationship between the eigenvalues and eigenvectors of the variance-covariance matrix and those of the correlation matrix, and the principal components produced by the two can be very different. The same goes for how many principal components to use.
Correlation matrix or variance-covariance matrix?
Differences in measurement units, which generally imply differences in the variances of the variables, are one of the main reasons for using the correlation matrix. Some even argue that the correlation matrix should always be used.
Correlation matrix or variance-covariance matrix?
Using the correlation matrix works quite well, with two caveats.
First, in theory, statistical tests on the eigenvalues and eigenvectors of a correlation matrix are far more complicated.
Second, the correlation matrix forces every variable to have the same variance, so the goal of finding the variables that contribute the most variance cannot be met.
Choosing the Number of PCs: Method 1
• based on the cumulative proportion of total variance explained.
• This is the most widely used method, and it applies whether the correlation matrix or the variance-covariance matrix is used.
• A minimum percentage of variance to be explained is set in advance, and the smallest number of components that reaches this threshold is taken as the number of principal components to use.
• There is no firm rule for this minimum; some books cite 70%, 80%, even 90%.
Choosing the Number of PCs: Method 2
• applies only when the correlation matrix is used. With this matrix the original variables are transformed to have equal variance, namely one.
• Components are selected by their variances, which are simply the eigenvalues. The method was proposed by Kaiser (1960), who argued that if the original variables were mutually independent, the principal components would just be the original variables, each with variance one.
• Under this rule, components whose eigenvalues are less than one are discarded.
Choosing the Number of PCs: Method 3
• uses a graph called the scree plot.
• It can be used whether the starting point is the correlation matrix or the variance-covariance matrix.
• The scree plot plots the eigenvalues λk against k.
• The number of components chosen, k, is the point at which the plot falls steeply to the left but not to the right. The idea is to keep components up to the point where the differences between successive eigenvalues are no longer large. Interpretation of this plot is, however, highly subjective. The three rules are sketched in code below.
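All three rules can be sketched in a few lines; `choose_k` below is a hypothetical helper, fed with the frog-example eigenvalues from later in the deck (NumPy assumed):

```python
import numpy as np

def choose_k(eigvals, total=None, threshold=0.80):
    """Sketch of the three selection rules for the number of PCs."""
    ev = np.sort(np.asarray(eigvals, dtype=float))[::-1]
    total = ev.sum() if total is None else total
    cum = np.cumsum(ev) / total

    k_cumulative = int(np.searchsorted(cum, threshold) + 1)  # Method 1
    k_kaiser = int(np.sum(ev > 1.0))    # Method 2: correlation matrix only
    drops = -np.diff(ev)                # Method 3: plot ev vs k, find elbow
    return k_cumulative, k_kaiser, drops

# First 10 of 16 eigenvalues from the frog example; total variance = 16
ev = [5.855, 3.420, 1.122, 1.116, 0.982, 0.725, 0.563, 0.529, 0.476, 0.375]
k1, k2, drops = choose_k(ev, total=16.0, threshold=0.80)
print(k1, k2)   # 6 PCs reach 80% cumulative variance; 4 have eigenvalue > 1
```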
Other Uses of PCs
• Two-dimensional plots of PC scores serve as an initial diagnostic tool in cluster analysis
• Mutually independent PCs resolve multicollinearity in regression analysis
Data Reduction
• summarization of data with many (p) variables by a smaller set of (k) derived (synthetic, composite) variables.
[Diagram: an n × p data matrix A is summarized by an n × k matrix X of derived variables.]
Data Reduction
• "Residual" variation is information in A that is not retained in X
• balancing act between
– clarity of representation, ease of understanding
– oversimplification: loss of important or relevant information.
Principal Component Analysis (PCA)
• probably the most widely used and best known of the "standard" multivariate methods
• invented by Pearson (1901) and Hotelling (1933)
• first applied in ecology by Goodall (1954) under the name "factor analysis" ("principal factor analysis" is a synonym of PCA).
Principal Component Analysis (PCA)
• takes a data matrix of n objects by p variables, which may be correlated, and summarizes it by uncorrelated axes (principal components or principal axes) that are linear combinations of the original p variables
• the first k components display as much as possible of the variation among objects.
Geometric Rationale of PCA
• objects are represented as a cloud of n points in a multidimensional space with an axis for each of the p variables
• the centroid of the points is defined by the mean of each variable
• the variance of each variable is the average squared deviation of its n values around the mean of that variable:

$$V_i = \frac{1}{n-1} \sum_{m=1}^{n} \left( X_{im} - \bar{X}_i \right)^2$$
Geometric Rationale of PCA
• the degree to which the variables are linearly correlated is represented by their covariances:

$$C_{ij} = \frac{1}{n-1} \sum_{m=1}^{n} \left( X_{im} - \bar{X}_i \right)\left( X_{jm} - \bar{X}_j \right)$$

where C_ij is the covariance of variables i and j, the sum runs over all n objects, X_im is the value of variable i in object m, and X̄_i is the mean of variable i.
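A minimal NumPy sketch of both formulas on hypothetical data; `np.cov` applies the same 1/(n − 1) definitions, so it can serve as a check:

```python
import numpy as np

rng = np.random.default_rng(42)
X = rng.normal(size=(50, 3))               # hypothetical: n=50 objects, p=3

n = X.shape[0]
Xc = X - X.mean(axis=0)                    # deviations from each mean

V = (Xc ** 2).sum(axis=0) / (n - 1)        # variances V_i
C = (Xc[:, 0] * Xc[:, 1]).sum() / (n - 1)  # covariance C_12

print(np.allclose(V, np.diag(np.cov(X, rowvar=False))))  # True
print(np.isclose(C, np.cov(X, rowvar=False)[0, 1]))      # True
```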
Geometric Rationale of PCA
• the objective of PCA is to rigidly rotate the axes of this p-dimensional space to new positions (principal axes) that have the following properties:
– ordered such that principal axis 1 has the highest variance, axis 2 has the next highest variance, …, and axis p has the lowest variance
– the covariance between each pair of principal axes is zero (the principal axes are uncorrelated).
2D Example of PCA
[Scatter plot of Variable X2 against Variable X1, with the centroid marked.]
• variables X1 and X2 have positive covariance and each has a similar variance:
V1 = 6.67, V2 = 6.24, C1,2 = 3.42, X̄1 = 8.35, X̄2 = 4.91
Configuration is Centered
[Scatter plot of the centered data: Variable X2 against Variable X1.]
• each variable is adjusted to a mean of zero (by subtracting the mean from each value).
Principal Components are Computed
[Scatter plot of the same data in PC coordinates: PC2 against PC1.]
• PC 1 has the highest possible variance (9.88)
• PC 2 has a variance of 3.03
• PC 1 and PC 2 have zero covariance.
The Dissimilarity Measure Used in PCA is Euclidean Distance
• PCA uses the Euclidean distance calculated from the p variables as the measure of dissimilarity among the n objects
• PCA derives the best possible k-dimensional (k < p) representation of the Euclidean distances among objects.
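A sketch of this claim on hypothetical data: squared Euclidean distances are unchanged by the rotation to PC coordinates, and the distances computed from only the first k axes approximate the full ones:

```python
import numpy as np

rng = np.random.default_rng(1)
X = rng.normal(size=(30, 5))              # hypothetical data: n=30, p=5
Xc = X - X.mean(axis=0)                   # center the configuration

evals, evecs = np.linalg.eigh(np.cov(Xc, rowvar=False))
U = evecs[:, np.argsort(evals)[::-1]]     # eigenvectors, descending order
Z = Xc @ U                                # scores on all p principal axes

def sq_dists(D):
    """All pairwise squared Euclidean distances among rows of D."""
    G = D @ D.T
    g = np.diag(G)
    return g[:, None] + g[None, :] - 2.0 * G

print(np.allclose(sq_dists(Xc), sq_dists(Z)))   # True: a rigid rotation
k = 2                                           # keep only the first k axes
err = np.abs(sq_dists(Xc) - sq_dists(Z[:, :k])).mean()
print(err)   # small: the best k-dimensional approximation of the distances
```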
Generalization to p-dimensions
• In practice nobody uses PCA with only 2 variables
• The algebra for finding principal axes readily generalizes to p variables
• PC 1 is the direction of maximum variance in the p-dimensional cloud of points
• PC 2 is in the direction of the next highest variance, subject to the constraint that it has zero covariance with PC 1.
Generalization to p-dimensions
• PC 3 is in the direction of the next highest variance, subject to the constraint that it has zero covariance with both PC 1 and PC 2
• and so on… up to PC p
[Scatter plot: the PC 1 and PC 2 axes drawn over the cloud of Variable X1 and Variable X2 values.]
• each principal axis is a linear combination of the original two variables
• PCi = ai1Y1 + ai2Y2 + … + aipYp
• the aij are the coefficients for component i, multiplying the measured value of variable j
[Scatter plot: the same PC 1 and PC 2 axes shown as a rotation of the Variable X1 and Variable X2 axes.]
• the PC axes are a rigid rotation of the original variables
• PC 1 is simultaneously the direction of maximum variance and a least-squares "line of best fit" (the squared distances of points away from PC 1 are minimized).
Generalization to p-dimensions
• if we take the first k principal components, they define the k-dimensional "hyperplane of best fit" to the point cloud
• of the total variance of all p variables:
– PCs 1 to k represent the maximum possible proportion of that variance that can be displayed in k dimensions
– i.e. the squared Euclidean distances among points calculated from their coordinates on PCs 1 to k are the best possible representation of their squared Euclidean distances in the full p dimensions.
Covariance vs Correlation
• using covariances among variables only makes sense if they are measured in the same units
• even then, variables with high variances will dominate the principal components
• these problems are generally avoided by standardizing each variable to unit variance and zero mean:

$$X'_{im} = \frac{X_{im} - \bar{X}_i}{SD_i}$$

where X̄_i is the mean and SD_i the standard deviation of variable i.
Covariance vs Correlation
• covariances between the standardized variables are correlations
• after standardization, each variable has a variance of 1.000
• correlations can also be calculated from the variances and covariances:

$$r_{ij} = \frac{C_{ij}}{\sqrt{V_i V_j}}$$

where r_ij is the correlation between variables i and j, C_ij is their covariance, and V_i and V_j are their variances.
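A brief sketch of both identities on hypothetical data: standardizing and then taking covariances gives the same matrix as dividing each covariance by the product of the standard deviations:

```python
import numpy as np

rng = np.random.default_rng(7)
X = rng.normal(size=(100, 3)) * [1.0, 5.0, 0.2]   # very different variances

S = np.cov(X, rowvar=False)               # variance-covariance matrix
V = np.diag(S)                            # variances on the diagonal

R_from_S = S / np.sqrt(np.outer(V, V))    # r_ij = C_ij / sqrt(V_i V_j)

Xstd = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)  # unit variance, mean 0
R_from_std = np.cov(Xstd, rowvar=False)   # covariances of standardized vars

print(np.allclose(R_from_S, R_from_std))  # True: both are the correlations
print(np.allclose(R_from_S, np.corrcoef(X, rowvar=False)))  # True
```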
The Algebra of PCA
• the first step is to calculate the cross-products matrix of variances and covariances (or correlations) among every pair of the p variables
• a square, symmetric matrix
• diagonals are the variances, off-diagonals are the covariances.

Variance-covariance matrix:
        X1      X2
X1  6.6707  3.4170
X2  3.4170  6.2384

Correlation matrix:
        X1      X2
X1  1.0000  0.5297
X2  0.5297  1.0000
The Algebra of PCA
• in matrix notation, this is computed as S = X′X (scaled by 1/(n − 1), per the variance definition above), where X is the n × p data matrix with each variable centered (and also standardized by its SD if correlations are being used).
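In NumPy this is a one-liner; a sketch on hypothetical data, with the 1/(n − 1) factor included to match the variance definition used earlier:

```python
import numpy as np

rng = np.random.default_rng(3)
X = rng.normal(size=(40, 2))              # hypothetical n x p data matrix
n = X.shape[0]

Xc = X - X.mean(axis=0)                   # center each variable
S = Xc.T @ Xc / (n - 1)                   # S = X'X / (n-1): covariances

Xs = Xc / X.std(axis=0, ddof=1)           # also standardize by SD ...
R = Xs.T @ Xs / (n - 1)                   # ... and the same product gives R

print(np.allclose(S, np.cov(X, rowvar=False)))       # True
print(np.allclose(R, np.corrcoef(X, rowvar=False)))  # True
```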
Manipulating Matrices
• transposing: change the columns to rows, or the rows to columns
• multiplying matrices:
– the premultiplicand matrix must have the same number of columns as the postmultiplicand matrix has rows

X = | 10 0 4 |        X′ = | 10 7 |
    |  7 1 2 |             |  0 1 |
                           |  4 2 |
The Algebra of PCA
• the sum of the diagonals of the variance-covariance matrix is called the trace
• it represents the total variance in the data
• it is the mean squared Euclidean distance between each object and the centroid in p-dimensional space.

Variance-covariance matrix: trace = 6.6707 + 6.2384 = 12.9091
Correlation matrix: trace = 1.0000 + 1.0000 = 2.0000
The Algebra of PCA
• finding the principal axes involves eigenanalysis of the cross-products matrix S
• the eigenvalues (latent roots) of S are the solutions λ to the characteristic equation

$$|S - \lambda I| = 0$$
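For the 2 × 2 example the characteristic equation can be checked both ways; a sketch, assuming NumPy:

```python
import numpy as np

S = np.array([[6.6707, 3.4170],
              [3.4170, 6.2384]])          # the deck's 2D covariance matrix

# numerical eigenanalysis
lam, _ = np.linalg.eigh(S)
print(np.sort(lam)[::-1])                 # [9.8783, 3.0308] as on the slides

# for p = 2 the characteristic equation |S - lam*I| = 0 is the quadratic
# lam^2 - tr(S)*lam + det(S) = 0
tr, det = np.trace(S), np.linalg.det(S)
roots = np.roots([1.0, -tr, det])
print(np.sort(roots)[::-1])               # the same two eigenvalues
```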
The Algebra of PCA
• the eigenvalues λ1, λ2, …, λp are the variances of the coordinates on each principal component axis
• the sum of all p eigenvalues equals the trace of S (the sum of the variances of the original variables).

For the 2D example: λ1 = 9.8783 and λ2 = 3.0308; note that λ1 + λ2 = 12.9091 = trace.
The Algebra of PCA
• each eigenvector consists of p values that represent the "contribution" of each variable to the principal component axis
• the eigenvectors are uncorrelated (orthogonal): their cross-products are zero.

Eigenvectors:
        u1       u2
X1  0.7291  -0.6844
X2  0.6844   0.7291

Check: 0.7291 × (−0.6844) + 0.6844 × 0.7291 = 0
The Algebra of PCA
• the coordinates of each object i on the k-th principal axis, known as the scores on PC k, are computed as

$$z_{ki} = u_{1k} x_{1i} + u_{2k} x_{2i} + \dots + u_{pk} x_{pi}$$

• in matrix form, Z = XU, where Z is the n × k matrix of PC scores, X is the n × p centered data matrix, and U is the p × k matrix of eigenvectors.
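A sketch of the score computation Z = XU, which also previews the diagonal covariance structure asserted on the next slides (NumPy, hypothetical data):

```python
import numpy as np

rng = np.random.default_rng(5)
X = rng.normal(size=(60, 4))              # hypothetical data
Xc = X - X.mean(axis=0)                   # centered n x p matrix

evals, evecs = np.linalg.eigh(np.cov(Xc, rowvar=False))
order = np.argsort(evals)[::-1]
evals, U = evals[order], evecs[:, order]  # eigenvectors, descending order

k = 2
Z = Xc @ U[:, :k]                         # n x k matrix of PC scores

# variance of each score column equals its eigenvalue; covariances are ~0
print(np.cov(Z, rowvar=False))            # approximately diag(evals[:k])
print(evals[:k])
```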
The Algebra of PCA
• the variance of the scores on each PC axis is equal to the corresponding eigenvalue for that axis
• the eigenvalue represents the variance displayed ("explained" or "extracted") by the k-th axis
• the sum of the first k eigenvalues is the variance explained by the k-dimensional ordination.
[Scatter plot of PC 2 against PC 1 scores.] λ1 = 9.8783, λ2 = 3.0308, trace = 12.9091.
PC 1 displays ("explains") 9.8783 / 12.9091 = 76.5% of the total variance.
The Algebra of PCA
• the cross-products matrix computed among the p principal axes has a simple form:
– all off-diagonal values are zero (the principal axes are uncorrelated)
– the diagonal values are the eigenvalues.

Variance-covariance matrix of the PC axes:
         PC1     PC2
PC1   9.8783  0.0000
PC2   0.0000  3.0308
A more challenging example
• data from research on habitat definition in the endangered Baw Baw frog (Philoria frosti)
• 16 environmental and structural variables measured at each of 124 sites
• the correlation matrix is used because the variables have different units
Eigenvalues
Axis   Eigenvalue   % of Variance   Cumulative % of Variance
1      5.855        36.60           36.60
2      3.420        21.38           57.97
3      1.122         7.01           64.98
4      1.116         6.97           71.95
5      0.982         6.14           78.09
6      0.725         4.53           82.62
7      0.563         3.52           86.14
8      0.529         3.31           89.45
9      0.476         2.98           92.42
10     0.375         2.35           94.77
Interpreting Eigenvectors
• correlations between the variables and the principal axes are known as loadings
• each element of the eigenvectors represents the contribution of a given variable to a component

Variable        1         2         3
Altitude    0.3842    0.0659   -0.1177
pH         -0.1159    0.1696   -0.5578
Cond       -0.2729   -0.1200    0.3636
TempSurf    0.0538   -0.2800    0.2621
Relief     -0.0765    0.3855   -0.1462
maxERht     0.0248    0.4879    0.2426
avERht      0.0599    0.4568    0.2497
%ER         0.0789    0.4223    0.2278
%VEG        0.3305   -0.2087   -0.0276
%LIT       -0.3053    0.1226    0.1145
%LOG       -0.3144    0.0402   -0.1067
%W         -0.0886   -0.0654   -0.1171
H1Moss      0.1364   -0.1262    0.4761
DistSWH    -0.3787    0.0101    0.0042
DistSW     -0.3494   -0.1283    0.1166
DistMF      0.3899    0.0586   -0.0175
How many axes are needed?
• does the (k+1)-th principal axis represent more variance than would be expected by chance?
• several tests and rules have been proposed
• a common "rule of thumb" when PCA is based on correlations is that axes with eigenvalues > 1 are worth interpreting
[Scree plot: Baw Baw frog, PCA of 16 habitat variables; eigenvalues (0–7) against PC axis number (1–10).]
What are the assumptions of PCA?
• PCA assumes that relationships among variables are LINEAR
– the cloud of points in p-dimensional space has linear dimensions that can be effectively summarized by the principal axes
• if the structure in the data is NONLINEAR (the cloud of points twists and curves its way through p-dimensional space), the principal axes will not be an efficient and informative summary of the data.
When should PCA be used?
• In community ecology, PCA is useful for summarizing variables whose relationships are approximately linear or at least monotonic
– e.g. a PCA of many soil properties might be used to extract a few components that summarize the main dimensions of soil variation
• PCA is generally NOT useful for ordinating community data
• Why? Because relationships among species are highly nonlinear.
[Plot: simulated species abundance curves; abundance (0–70) against a simulated environmental gradient R (0–1).]
The "Horseshoe" or Arch Effect
• community trends along environmental gradients appear as "horseshoes" in PCA ordinations
• none of the PC axes effectively summarizes the trend in species composition along the gradient
• SUs at opposite extremes of the gradient appear relatively close together.
[Plot: species abundance along an environmental gradient, illustrating the ambiguity of absence.]
[PCA ordination, Axis 2 against Axis 1, of simulated data with beta diversity 2R, covariance matrix.]
The "Horseshoe" Effect
• the curvature of the gradient and the degree of infolding of the extremes increase with beta diversity
• PCA ordinations are not useful summaries of community data except when beta diversity is very low
• using correlation generally does better than covariance
– this is because standardization by species improves the correlation between Euclidean distance and environmental distance.
What if there's more than one underlying ecological gradient?
The "Horseshoe" Effect
• when there are two or more underlying gradients with high beta diversity, a "horseshoe" is usually not detectable
• the SUs fall on a curved hypersurface that twists and turns through the p-dimensional species space
• the interpretation problems are more severe
• PCA should NOT be used with community data (except perhaps when beta diversity is very low).
Impact on Ordination History
• by 1970, PCA was the ordination method of choice for community data
• simulation studies by Swan (1970) and Austin & Noy-Meir (1971) demonstrated the horseshoe effect and showed that the linear assumption of PCA was not compatible with the nonlinear structure of community data
• this stimulated the quest for more appropriate ordination methods.