STAT 4002
Applied Multivariate Analysis
Chapter 7
Factor Analysis
Agenda
The factor analysis model
Maximum likelihood estimation
Rotation of factors
Factor scores
Introduction
- Factor analysis (FA) is a multivariate statistical technique for
  modelling the covariance or correlation structure between variables.
- It models this structure by introducing a small number of
  unobservable factors (also known as latent variables).
- The technique is commonly used in psychology, education research and
  marketing research, which often involve quantities that cannot be
  observed directly, such as self-confidence, intelligence quotient (IQ),
  emotional quotient (EQ), verbal ability, analytic power and loyalty.
The factor analysis model
The k-factor analysis model can be formulated as follows. Let
$x = (X_1, \ldots, X_p)'$ be observable random variables,
$\mu = (\mu_1, \ldots, \mu_p)'$ be a constant vector representing the mean, and
$f = (F_1, \ldots, F_k)'$ be unobservable common factors (latent
variables). Then we can write
$$
\begin{aligned}
X_1 &= \mu_1 + \lambda_{11}F_1 + \cdots + \lambda_{1k}F_k + \varepsilon_1,\\
X_2 &= \mu_2 + \lambda_{21}F_1 + \cdots + \lambda_{2k}F_k + \varepsilon_2,\\
&\;\;\vdots\\
X_p &= \mu_p + \lambda_{p1}F_1 + \cdots + \lambda_{pk}F_k + \varepsilon_p,
\end{aligned}
$$
where $\lambda_{ij}$ is the factor loading (sensitivity) of the i-th response
with respect to the j-th factor.
The factor analysis model
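In matrix notation, we have
$$
x = \mu + \Lambda f + \varepsilon,
$$
where
$$
\Lambda =
\begin{pmatrix}
\lambda_{11} & \cdots & \lambda_{1k}\\
\vdots & \ddots & \vdots\\
\lambda_{p1} & \cdots & \lambda_{pk}
\end{pmatrix},
\quad
\varepsilon =
\begin{pmatrix}
\varepsilon_1\\ \vdots\\ \varepsilon_p
\end{pmatrix}
\quad\text{and } k < p. \tag{1}
$$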
Λ is a p × k matrix of factor loadings with respect to the common
factors f and ε is a p × 1 vector of unique factors (also known as
specific factors, or uniquenesses).
Since we are interested in the covariance structure rather than the
mean, we may simply assume that µ = 0. In addition, we also
assume that E(f) = 0 and E(ε) = 0 and hence E(x) = 0.
The factor analysis model
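We further assume that
$$
\operatorname{Var}(f) = I_k, \qquad
\operatorname{Var}(\varepsilon) = \Psi = \operatorname{diag}(\psi_{11}, \ldots, \psi_{pp}), \qquad
\operatorname{Cov}(f, \varepsilon) = 0_{k \times p}.
$$
The covariance matrix of x can then be written as
$$
\Sigma = \operatorname{Var}(\Lambda f + \varepsilon) = \Lambda\Lambda' + \Psi. \tag{2}
$$
The variance of $X_i$ can be split into two parts,
$$
\sigma_{ii} = \sum_{j=1}^{k} \lambda_{ij}^2 + \psi_{ii} = h_i^2 + \psi_{ii},
$$
where $h_i^2 = \sum_{j=1}^{k} \lambda_{ij}^2$ is called the communality and $\psi_{ii}$ is the
specific variance.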
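The step from the assumptions to equation (2) is short but easy to skip over. As a sketch, using only the bilinearity of the covariance and the three assumptions on f and ε stated above, it can be written out as follows.

```latex
\begin{aligned}
\operatorname{Var}(\Lambda f + \varepsilon)
  &= \Lambda \operatorname{Var}(f) \Lambda'
     + \Lambda \operatorname{Cov}(f, \varepsilon)
     + \operatorname{Cov}(\varepsilon, f) \Lambda'
     + \operatorname{Var}(\varepsilon) \\
  &= \Lambda I_k \Lambda' + 0 + 0 + \Psi \\
  &= \Lambda \Lambda' + \Psi .
\end{aligned}
```

Reading off the i-th diagonal entry of the last line gives the split of $\sigma_{ii}$ into the communality $h_i^2$ and the specific variance $\psi_{ii}$.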
Example
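Consider the covariance matrix
$$
\Sigma =
\begin{pmatrix}
19 & 30 & 2 & 12\\
30 & 57 & 5 & 23\\
2 & 5 & 38 & 47\\
12 & 23 & 47 & 68
\end{pmatrix}.
$$
The equality
$$
\begin{pmatrix}
19 & 30 & 2 & 12\\
30 & 57 & 5 & 23\\
2 & 5 & 38 & 47\\
12 & 23 & 47 & 68
\end{pmatrix}
=
\begin{pmatrix}
4 & 1\\
7 & 2\\
-1 & 6\\
1 & 8
\end{pmatrix}
\begin{pmatrix}
4 & 7 & -1 & 1\\
1 & 2 & 6 & 8
\end{pmatrix}
+
\begin{pmatrix}
2 & 0 & 0 & 0\\
0 & 4 & 0 & 0\\
0 & 0 & 1 & 0\\
0 & 0 & 0 & 3
\end{pmatrix}
$$
implies that Σ has the structure produced by a k = 2 factor model with
$$
\Lambda =
\begin{pmatrix}
4 & 1\\
7 & 2\\
-1 & 6\\
1 & 8
\end{pmatrix}
\quad\text{and}\quad
\Psi =
\begin{pmatrix}
2 & 0 & 0 & 0\\
0 & 4 & 0 & 0\\
0 & 0 & 1 & 0\\
0 & 0 & 0 & 3
\end{pmatrix}.
$$
The variance of $X_1$ can be decomposed as
$$
\sigma_{11} = \lambda_{11}^2 + \lambda_{12}^2 + \psi_{11},
\quad\text{or}\quad
19 = 4^2 + 1^2 + 2.
$$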
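The decomposition in this example is easy to check numerically. The following R sketch rebuilds Σ from the Λ and Ψ given above and recomputes the communalities; the object names (Lambda, Psi, Sigma, h2) are chosen here for illustration only.

```r
# Loadings and specific variances copied from the example
Lambda <- matrix(c( 4, 1,
                    7, 2,
                   -1, 6,
                    1, 8), nrow = 4, byrow = TRUE)
Psi <- diag(c(2, 4, 1, 3))

# Covariance matrix implied by the 2-factor model: Lambda Lambda' + Psi
Sigma <- Lambda %*% t(Lambda) + Psi
print(Sigma)

# Communalities h_i^2 are the row sums of the squared loadings
h2 <- rowSums(Lambda^2)
print(h2)                # 17 53 37 65

# Communality plus specific variance recovers the diagonal of Sigma
print(h2 + diag(Psi))    # 19 57 38 68
```

The first entry reproduces the hand calculation 19 = 4^2 + 1^2 + 2 for the variance of X1.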
Remarks on standardization
- When the units of the variables are not comparable, it is usually
  desirable to work with the standardized variables.
- Standardization prevents a variable with a large variance from
  dominating the factor loadings.
- The decomposition is then applied to the sample correlation matrix R
  instead of the covariance matrix,
  $$
  R = \hat\Lambda\hat\Lambda' + \hat\Psi.
  $$
- Note that the results based on Σ and R are not the same.
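Working with standardized variables is the same as replacing the covariance matrix by the correlation matrix. A minimal R sketch of this equivalence follows; the data frame and its column names are invented for illustration and do not come from the course data.

```r
set.seed(1)
# Hypothetical data: three variables measured on very different scales
x <- data.frame(height_cm = rnorm(50, 170, 10),
                weight_g  = rnorm(50, 70000, 9000),
                time_s    = rnorm(50, 12, 0.5))

z   <- scale(x)   # centre each column and divide by its standard deviation
S_z <- cov(z)     # covariance matrix of the standardized variables
R   <- cor(x)     # correlation matrix of the raw variables

round(diag(cov(x)), 2)                        # raw variances differ by orders of magnitude
all.equal(S_z, R, check.attributes = FALSE)   # the two matrices coincide
```

On the raw scale the weight column would dominate any decomposition of the covariance matrix; after standardization every variable has unit variance.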
Maximum likelihood estimation
If x has a multivariate normal distribution, then W = (n − 1)S has a
Wishart distribution with n − 1 degrees of freedom, where S is the
sample covariance matrix and n is the number of observations. Recall
from Chapter 2 that the density function of the Wishart distribution
with n − 1 degrees of freedom is
$$
f_{n-1}(W \mid \Sigma) =
\frac{|W|^{(n-p-2)/2}\,|\Sigma|^{-(n-1)/2}\,e^{-\operatorname{tr}(\Sigma^{-1}W)/2}}{K},
$$
where K is a scaling constant that does not depend on Σ. Substituting
W = (n − 1)S, the log-likelihood function can be written as
$$
l(\Sigma \mid S) = -\ln K + \frac{n-p-2}{2}\ln|(n-1)S|
 - \frac{n-1}{2}\ln|\Sigma| - \frac{n-1}{2}\operatorname{tr}(\Sigma^{-1}S)
$$
in terms of the unknown parameter Σ.
Alternatively, using $\Sigma = \Lambda\Lambda' + \Psi$, we can rewrite the log-likelihood
function as
$$
l(\Lambda, \Psi \mid S) = -\ln K + \frac{n-p-2}{2}\ln|(n-1)S|
 - \frac{n-1}{2}\ln|\Lambda\Lambda' + \Psi|
 - \frac{n-1}{2}\operatorname{tr}\!\left[(\Lambda\Lambda' + \Psi)^{-1}S\right] \tag{3}
$$
in terms of Λ and Ψ. The maximum likelihood estimators (MLE)
$\hat\Lambda$ and $\hat\Psi$ are the values that maximize $l(\Lambda, \Psi \mid S)$.
This is a complicated maximization problem because both Λ and
Ψ are matrix-valued. No explicit solution is available, but
Jöreskog (1969) developed a reliable numerical method for computing
the maximum likelihood estimates.
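To make the objective concrete, here is a hedged R sketch of the part of the log-likelihood that depends on Σ, namely −(n − 1)/2 · [ln|Σ| + tr(Σ⁻¹S)], evaluated at three candidate matrices. The data are simulated from a made-up 2-factor model; the loadings, the seed and the helper loglik() are assumptions of this example, not part of the course material.

```r
set.seed(4002)
n <- 300; p <- 6; k <- 2

# Simulate from a hypothetical 2-factor model with unit variances
Lambda <- cbind(c(0.9, 0.8, 0.7, 0.1, 0.0, 0.1),
                c(0.0, 0.1, 0.2, 0.8, 0.9, 0.7))
Psi <- diag(1 - rowSums(Lambda^2))
f   <- matrix(rnorm(n * k), n, k)
eps <- matrix(rnorm(n * p), n, p) %*% sqrt(Psi)
x   <- f %*% t(Lambda) + eps
colnames(x) <- paste0("x", 1:p)

# Sigma-dependent part of the log-likelihood (additive constants dropped)
loglik <- function(Sigma, S, n) {
  -(n - 1) / 2 * (as.numeric(determinant(Sigma)$modulus) +
                  sum(diag(solve(Sigma, S))))
}

R   <- cor(x)
fit <- factanal(x, factors = k)          # numerical MLE on the correlation matrix
L   <- unclass(fit$loadings)
Sigma_hat <- L %*% t(L) + diag(fit$uniquenesses)

ll_sat   <- loglik(R, R, n)              # unrestricted maximum, Sigma = R
ll_fit   <- loglik(Sigma_hat, R, n)      # fitted 2-factor model
ll_indep <- loglik(diag(p), R, n)        # no common factors, Sigma = I
c(saturated = ll_sat, factor2 = ll_fit, independence = ll_indep)
```

The fitted factor model cannot beat the unrestricted choice Σ = R, but it should sit well above the independence model, which is the special case Λ = 0.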
Maximum likelihood estimation
R has a built-in function factanal() to compute the MLE of the
k-factor model on the correlation matrix. Let us use the
decath.csv data again to illustrate this.
    d <- read.csv("decath.csv")   # read in data
    x <- d[, 2:11]                # extract columns 2 to 11
    fa2 <- factanal(x, factors = 2, scores = "regression")   # save output to fa2
    names(fa2)                    # display items in fa2
     [1] "converged"    "loadings"     "uniquenesses" "correlation"  "criteria"
     [6] "factors"      "dof"          "method"       "rotmat"       "scores"
    [11] "STATISTIC"    "PVAL"         "n.obs"        "call"
    (U <- fa2$uniquenesses)       # save uniquenesses to U and display
        m100     h110     m400    m1500 longjump highjump     pole     shot   discus      jav
       0.284    0.287    0.234    0.594    0.347    0.737    0.285    0.109    0.188    0.473
Maximum likelihood estimation
Let us display the factor loadings Λ̂ = L, communality and the
uniqueness Ψ̂ = diag(U).
    (L <- fa2$loadings)   # save factor loadings to L and display

    Loadings:
             Factor1 Factor2
    m100       0.781  -0.325
    h110       0.741  -0.406
    m400       0.875  -0.030
    m1500      0.544   0.333
    longjump  -0.738   0.328
    highjump  -0.388   0.335
    pole      -0.576   0.619
    shot      -0.135   0.934
    discus    -0.102   0.895
    jav       -0.177   0.704

                   Factor1 Factor2
    SS loadings      3.308   3.155
    Proportion Var   0.331   0.315
    Cumulative Var   0.331   0.646

    apply(L^2, 1, sum)    # compute communality
        m100     h110     m400    m1500 longjump highjump     pole     shot   discus      jav
       0.716    0.713    0.766    0.406    0.653    0.263    0.715    0.891    0.812    0.527
Maximum likelihood estimation
Factor 1 loads positively on the running times and negatively on the
jumps, so it can be interpreted as a weighted average of speed and
jumping ability, while factor 2 loads mainly on the throwing events and
represents power. We can also compute
$\hat\Lambda\hat\Lambda' + \hat\Psi$ and compare it with the correlation matrix.

    RMLE <- L %*% t(L) + diag(U)   # compute RMLE = LL' + diag(U)
    round(RMLE, 3)                 # display RMLE
               m100   h110   m400  m1500 longjump highjump   pole   shot discus    jav
    m100      1.000  0.710  0.693  0.317   -0.683   -0.412 -0.651 -0.409 -0.371 -0.367
    h110      0.710  1.000  0.660  0.268   -0.680   -0.423 -0.678 -0.479 -0.439 -0.417
    m400      0.693  0.660  1.000  0.466   -0.656   -0.350 -0.522 -0.146 -0.116 -0.176
    m1500     0.317  0.268  0.466  1.000   -0.292   -0.099 -0.107  0.238  0.242  0.138
    longjump -0.683 -0.680 -0.656 -0.292    1.000    0.396  0.628  0.406  0.369  0.361
    highjump -0.412 -0.423 -0.350 -0.099    0.396    1.000  0.431  0.366  0.340  0.305
    pole     -0.651 -0.678 -0.522 -0.107    0.628    0.431  1.000  0.656  0.613  0.538
    shot     -0.409 -0.479 -0.146  0.238    0.406    0.366  0.656  1.000  0.850  0.681
    discus   -0.371 -0.439 -0.116  0.242    0.369    0.340  0.613  0.850  1.000  0.648
    jav      -0.367 -0.417 -0.176  0.138    0.361    0.305  0.538  0.681  0.648  1.000
Maximum likelihood estimation
    R <- cor(x)
    round(R, 3)   # compare with the correlation matrix
               m100   h110   m400  m1500 longjump highjump   pole   shot discus    jav
    m100      1.000  0.751  0.698  0.254   -0.691   -0.364 -0.627 -0.420 -0.353 -0.344
    h110      0.751  1.000  0.655  0.155   -0.654   -0.487 -0.709 -0.489 -0.403 -0.350
    m400      0.698  0.655  1.000  0.554   -0.636   -0.275 -0.521 -0.142 -0.154 -0.150
    m1500     0.254  0.155  0.554  1.000   -0.356   -0.132 -0.070  0.202  0.288  0.045
    longjump -0.691 -0.654 -0.636 -0.356    1.000    0.471  0.632  0.391  0.375  0.446
    highjump -0.364 -0.487 -0.275 -0.132    0.471    1.000  0.472  0.321  0.376  0.338
    pole     -0.627 -0.709 -0.521 -0.070    0.632    0.472  1.000  0.643  0.620  0.557
    shot     -0.420 -0.489 -0.142  0.202    0.391    0.321  0.643  1.000  0.856  0.703
    discus   -0.353 -0.403 -0.154  0.288    0.375    0.376  0.620  0.856  1.000  0.618
    jav      -0.344 -0.350 -0.150  0.045    0.446    0.338  0.557  0.703  0.618  1.000
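Comparing two 10 × 10 matrices by eye is tedious, so it can help to look at the residual matrix directly. The sketch below does this on simulated data, since decath.csv is not reproduced here; the data-generating loadings and the variable names v1 to v6 are assumptions of the example.

```r
set.seed(7)
n <- 200
# Hypothetical data with a 2-factor structure, standing in for decath.csv
f <- matrix(rnorm(n * 2), n, 2)
Lambda <- cbind(c(0.8, 0.7, 0.9, 0.1, 0.2, 0.0),
                c(0.1, 0.2, 0.0, 0.8, 0.7, 0.9))
x <- f %*% t(Lambda) + matrix(rnorm(n * 6, sd = 0.5), n, 6)
colnames(x) <- paste0("v", 1:6)

fit <- factanal(x, factors = 2)
L <- unclass(fit$loadings)          # plain numeric matrix of loadings
U <- fit$uniquenesses

# On the correlation scale, communality + uniqueness is approximately 1
h2 <- rowSums(L^2)
round(cbind(communality = h2, uniqueness = U, total = h2 + U), 3)

# Residual correlations: sample correlation minus fitted LL' + diag(U)
R_fit <- L %*% t(L) + diag(U)
resid <- cor(x) - R_fit
round(resid, 3)
max(abs(resid[upper.tri(resid)]))   # largest off-diagonal misfit
```

Small off-diagonal residuals suggest that two factors reproduce the correlations well; large ones point to variable pairs that the chosen k does not explain.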
Rotation of factors
The covariance structure will not change if Λ is replaced by
Π = ΛG for any orthogonal matrix G, because
$$
\Pi\Pi' + \Psi = \Lambda G G' \Lambda' + \Psi = \Lambda I \Lambda' + \Psi = \Sigma.
$$
Geometrically speaking, multiplication by an orthogonal
matrix is equivalent to a rotation of the coordinate axes of the
factor space. So it is possible to choose an orthogonal matrix (also
known as a rotation matrix) that makes the factors easier to interpret.
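The invariance under rotation can be confirmed numerically. This R sketch reuses Λ and Ψ from the earlier example and applies a rotation by an arbitrarily chosen angle; the angle and the object names are assumptions made for illustration.

```r
Lambda <- matrix(c( 4, 1,
                    7, 2,
                   -1, 6,
                    1, 8), nrow = 4, byrow = TRUE)
Psi <- diag(c(2, 4, 1, 3))

theta <- pi / 6                       # arbitrary rotation angle
G <- matrix(c(cos(theta), -sin(theta),
              sin(theta),  cos(theta)), nrow = 2, byrow = TRUE)

Pi_mat <- Lambda %*% G                # rotated loadings

all.equal(G %*% t(G), diag(2))        # G is orthogonal
all.equal(Pi_mat %*% t(Pi_mat) + Psi,
          Lambda %*% t(Lambda) + Psi) # same covariance matrix
round(Pi_mat, 3)                      # but different loadings
```

Both sets of loadings fit the covariance matrix equally well, which is why an extra criterion is needed to pick one of them.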
Rotation of factors
One commonly used method to determine the rotation matrix is
called varimax, which is the default option of factanal() in R.
Varimax finds the orthogonal matrix G such that
$$
V = \sum_{j=1}^{k}\left[\sum_{i=1}^{p}\pi_{ij}^4 - \frac{1}{p}\left(\sum_{i=1}^{p}\pi_{ij}^2\right)^{2}\right]
  = \sum_{j=1}^{k}\left[\sum_{i=1}^{p}\left(\pi_{ij}^2 - \overline{\pi^2}_{\bullet j}\right)^{2}\right]
$$
is maximized, where $\overline{\pi^2}_{\bullet j} = \frac{1}{p}\sum_{i=1}^{p}\pi_{ij}^2$ is the mean of the
squared loadings in column j and $\pi_{ij}$ is the (i, j)-th entry of
Π = ΛG. The factor loadings in the output of factanal() are
the rotated factor loadings.
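For k = 2 every rotation is described by a single angle, so the varimax criterion can be maximized by brute force. The sketch below does this for the Λ of the earlier example; the helper functions V() and rot() and the grid of angles are assumptions of this illustration, and the grid search is not the algorithm that R uses internally. The built-in varimax() is called at the end only for comparison, with normalize = FALSE so that it targets the raw criterion.

```r
Lambda <- matrix(c( 4, 1,
                    7, 2,
                   -1, 6,
                    1, 8), nrow = 4, byrow = TRUE)

# Varimax criterion: for each column, sum of 4th powers minus
# (sum of squares)^2 / p, added up over the columns
V <- function(P) sum(colSums(P^4) - colSums(P^2)^2 / nrow(P))

rot <- function(theta) matrix(c(cos(theta), -sin(theta),
                                sin(theta),  cos(theta)), 2, 2, byrow = TRUE)

# Grid search over the rotation angle (theta = 0 is the unrotated solution)
thetas <- seq(0, pi / 2, length.out = 901)
Vs <- sapply(thetas, function(th) V(Lambda %*% rot(th)))
best <- thetas[which.max(Vs)]

c(V_unrotated = V(Lambda), V_best = max(Vs), angle = best)
round(Lambda %*% rot(best), 3)        # loadings at the best grid angle

# For comparison: built-in routine without Kaiser normalization
vm <- varimax(Lambda, normalize = FALSE)
V(unclass(vm$loadings))
```

Angles in [0, π/2] are enough here because V does not change when columns are swapped or their signs are flipped.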
Factor scores
Once the k-factor model is fitted and the factor loadings are
obtained, it may be of interest to estimate the realized value of the
factors f for an individual observation $x_0 = (x_1, \ldots, x_p)'$, that is,
$$
\hat f_0 = E(f \mid x = x_0).
$$
To compute this conditional expectation, we need the joint
distribution of $x = (X_1, \ldots, X_p)'$ and $f = (F_1, \ldots, F_k)'$. Since
$\operatorname{Cov}(X_i, F_j) = \lambda_{ij}$, we get
$$
\begin{pmatrix} x \\ f \end{pmatrix}
\sim N_{p+k}\left(
\begin{pmatrix} 0 \\ 0 \end{pmatrix},
\begin{pmatrix} \Lambda\Lambda' + \Psi & \Lambda \\ \Lambda' & I_k \end{pmatrix}
\right).
$$
Factor scores
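Then, by the conditional-mean formula for the multivariate normal
distribution, $E(f \mid x) = \operatorname{Cov}(f, x)\operatorname{Var}(x)^{-1}x$, the conditional
expectation of f given x is
$$
\hat f = E(f \mid x) = \Lambda'(\Lambda\Lambda' + \Psi)^{-1}x.
$$
Replacing Λ and Ψ with their estimates and given the observation
$x = x_0$, the factor score is given by
$$
\hat f_0 = \hat\Lambda'(\hat\Lambda\hat\Lambda' + \hat\Psi)^{-1}x_0.
$$
The factor score defined above is called the regression factor
score. However, it is a biased estimator in the sense that
$E[\hat f \mid f] = \Lambda'(\Lambda\Lambda' + \Psi)^{-1}\Lambda f \neq f$ in general.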
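A small numerical sketch of the regression score, using the Λ and Ψ of the earlier example as if they were known. The observation x0 is made up for illustration. The second form of the score, which only inverts a k × k matrix, is a standard matrix identity and is included as a cross-check rather than as something taken from the slides.

```r
Lambda <- matrix(c( 4, 1,
                    7, 2,
                   -1, 6,
                    1, 8), nrow = 4, byrow = TRUE)
Psi <- diag(c(2, 4, 1, 3))
Sigma <- Lambda %*% t(Lambda) + Psi

x0 <- c(3, 5, -2, 1)                     # hypothetical centred observation

# Regression factor score: Lambda' Sigma^{-1} x0
f_reg <- t(Lambda) %*% solve(Sigma, x0)

# Equivalent form: (I + Lambda' Psi^{-1} Lambda)^{-1} Lambda' Psi^{-1} x0
Psi_inv <- diag(1 / diag(Psi))
M <- t(Lambda) %*% Psi_inv %*% Lambda
f_reg2 <- solve(diag(2) + M, t(Lambda) %*% Psi_inv %*% x0)

# E[f_hat | f] = B f with B below; B differs from I, hence the bias
B <- t(Lambda) %*% solve(Sigma, Lambda)

drop(f_reg)
drop(f_reg2)
round(B, 3)
```

The eigenvalues of B lie strictly between 0 and 1, so the regression score shrinks the factors towards zero.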
Factor scores
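An alternative, unbiased estimate is the weighted least squares estimate
$$
\hat f = \left[(\Psi^{-1/2}\Lambda)'(\Psi^{-1/2}\Lambda)\right]^{-1}(\Psi^{-1/2}\Lambda)'\,\Psi^{-1/2}x
       = \left(\Lambda'\Psi^{-1}\Lambda\right)^{-1}\Lambda'\Psi^{-1}x.
$$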
Factor scores
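$\hat f$ defined above is unbiased in the sense that
$$
\begin{aligned}
E[\hat f \mid f]
 &= E\left[\left(\Lambda'\Psi^{-1}\Lambda\right)^{-1}\Lambda'\Psi^{-1}x \,\middle|\, f\right]\\
 &= E\left[\left(\Lambda'\Psi^{-1}\Lambda\right)^{-1}\Lambda'\Psi^{-1}(\Lambda f + \varepsilon) \,\middle|\, f\right]\\
 &= \left(\Lambda'\Psi^{-1}\Lambda\right)^{-1}\Lambda'\Psi^{-1}\Lambda f\\
 &= f.
\end{aligned}
$$
The factor score is then obtained by replacing Λ and Ψ with their
estimates and x by the observation $x_0$,
$$
\hat f_0 = \left(\hat\Lambda'\hat\Psi^{-1}\hat\Lambda\right)^{-1}\hat\Lambda'\hat\Psi^{-1}x_0.
$$
This estimate is known as Bartlett's factor score.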
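Bartlett's formula can be applied by hand to the output of factanal() and compared with the scores that the function returns. The sketch below uses simulated data, and it assumes that factanal() computes its scores from the standardized data, which is my understanding of its behaviour; the data-generating loadings and variable names are invented.

```r
set.seed(21)
n <- 150
# Hypothetical data with a 2-factor structure
f <- matrix(rnorm(n * 2), n, 2)
Lambda <- cbind(c(0.8, 0.7, 0.9, 0.1, 0.2, 0.0),
                c(0.1, 0.2, 0.0, 0.8, 0.7, 0.9))
x <- f %*% t(Lambda) + matrix(rnorm(n * 6, sd = 0.5), n, 6)
colnames(x) <- paste0("v", 1:6)

fit <- factanal(x, factors = 2, scores = "Bartlett")
L <- unclass(fit$loadings)               # estimated (rotated) loadings
Psi_inv <- diag(1 / fit$uniquenesses)    # inverse of the estimated Psi
z <- scale(x)                            # standardized observations

# Weight matrix (L' Psi^-1 L)^-1 L' Psi^-1, of dimension k x p
A <- solve(t(L) %*% Psi_inv %*% L, t(L) %*% Psi_inv)
scores_manual <- z %*% t(A)              # one row of scores per observation

head(round(cbind(scores_manual, fit$scores), 3))
max(abs(scores_manual - fit$scores))     # should be numerically negligible
```

If the two sets of columns agree, the hand computation and scores = "Bartlett" implement the same estimator.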
Factor scores
The R function factanal() computes the factor scores and stores
them in the scores component of the output object. The option
    scores = "regression"
computes the regression factor scores, while
    scores = "Bartlett"
gives Bartlett's factor scores.
Factor scores
Let us plot these factor scores versus the observation number.
    fs1 <- fa2$scores[, 1]    # save 1st factor scores to fs1
    fs2 <- fa2$scores[, 2]    # save 2nd factor scores to fs2
    par(mfrow = c(2, 1))      # define 2x1 multi-frame graph
    plot(fs1, type = "o")     # plot fs1
    plot(fs2, type = "o")     # plot fs2
    par(mfrow = c(1, 1))      # reset multi-frame graph to 1x1
    plot(fs1, fs2, main = "factor score with obs. no.")
    text(fs1 - 0.1, fs2 + 0.1, cex = 0.6)   # add obs. no. to the points
Recall that these observations are ordered according to the official
athletics results. For the first factor, a smaller score is better,
while for the second factor, a larger score is better.
Factor scores
[Plot: fs1 (upper panel) and fs2 (lower panel) against Index]
Figure 1: A plot of the factor scores against observation number.
Factor scores
[Plot: fs2 (vertical axis) against fs1 (horizontal axis), titled "factor score with obs. no.", points labelled by observation number 1 to 34]
Figure 2: A plot of $\hat f_1$ against $\hat f_2$.
