Example 33.2 Principal Factor Analysis This example uses t.docx

Example 33.2 Principal Factor Analysis
This example uses the data presented in Example 33.1 and
performs a principal factor analysis
with squared multiple correlations for the prior communality
estimates. Unlike Example 33.1,
which analyzes the principal components (with default
PRIORS=ONE), the current analysis is
based on a common factor model. To use a common factor
model, you specify PRIORS=SMC in
the PROC FACTOR statement, as shown in the following:
ods graphics on;
proc factor data=SocioEconomics
priors=smc msa residual
rotate=promax reorder
outstat=fact_all
plots=(scree initloadings preloadings loadings);run;
ods graphics off;
In the PROC FACTOR statement, you include several other

options to help you analyze the
results. To help determine whether the common factor model is
appropriate, you request the
Kaiser’s measure of sampling adequacy with the MSA option.
You specify the RESIDUALS
option to compute the residual correlations and partial
correlations.
The ROTATE= and REORDER options are specified to enhance
factor interpretability. The
ROTATE=PROMAX option produces an orthogonal varimax
prerotation (default) followed by
an oblique Procrustes rotation, and the REORDER option
reorders the variables according to
their largest factor loadings. An OUTSTAT= data set is created
by PROC FACTOR and
displayed in Output 33.2.15.
PROC FACTOR can produce high-quality graphs that are very
useful for interpreting the factor
solutions. To request these graphs, you must first enable ODS
Graphics by specifying the ODS
GRAPHICS ON statement, as shown in the preceding
statements. All ODS graphs in PROC
FACTOR are requested with the PLOTS= option. In this
example, you request a scree plot

(SCREE) and loading plots for the factor matrix during the
following three stages: initial
unrotated solution (INITLOADINGS), prerotated (varimax)
solution (PRELOADINGS), and
promax-rotated solution (LOADINGS). The scree plot helps you
determine the number of
factors, and the loading plots help you visualize the patterns of
factor loadings during various
stages of analyses.
Principal Factor Analysis: Kaiser’s MSA and Factor Extraction
Results
Output 33.2.1 displays the results of the partial correlations and
Kaiser’s measure of sampling
adequacy.
Output 33.2.1 Principal Factor Analysis: Partial Correlations
and Kaiser’s MSA
Partial Correlations Controlling all other Variables
Population School Employment Services HouseValue
http://support.sas.com/documentation/cdl/en/statug/63347/HTM
L/default/statug_factor_sect028.htm
L/default/statug_factor_sect006.htm#statug.factor.factorpriorso

p
p
L/default/statug_factor_sect006.htm#statug.factor.factormsa
L/default/statug_factor_sect006.htm#statug.factor.factorresidual
s
L/default/statug_factor_sect006.htm#statug.factor.factorrotate
L/default/statug_factor_sect006.htm#statug.factor.factorreorder
L/default/statug_factor_sect006.htm#statug.factor.factoroutstat
L/default/statug_factor_sect029.htm#statug.factor.pprint1__
L/default/statug_factor_sect006.htm#statug.factor.factorgplots
L/default/statug_factor_sect029.htm#statug.factor.facex2a
Partial Correlations Controlling all other Variables
Population 1.00000 -0.54465 0.97083 0.09612 0.15871
School -0.54465 1.00000 0.54373 0.04996 0.64717
Employment 0.97083 0.54373 1.00000 0.06689 -0.25572

Services 0.09612 0.04996 0.06689 1.00000 0.59415
HouseValue 0.15871 0.64717 -0.25572 0.59415 1.00000
Kaiser's Measure of Sampling Adequacy: Overall MSA =
0.57536759
0.47207897 0.55158839 0.48851137 0.80664365 0.61281377
If the data are appropriate for the common factor model, the
partial correlations (controlling all
other variables) should be small compared to the original
correlations. For example, the partial
correlation between the variables School and HouseValue is ,
slightly less than the original
correlation of (see Output 33.1.3). The partial correlation
between Population and School is
, which is much larger in absolute value than the original
correlation; this is an indication of
trouble. Kaiser’s MSA is a summary, for each variable and for
all variables together, of how
much smaller the partial correlations are than the original
correlations. Values of or are
considered good, while MSAs below are unacceptable. The

variables Population, School, and
Employment have very poor MSAs. Only the Services variable
has a good MSA. The overall
MSA of is sufficiently poor that additional variables should be
included in the analysis to
better define the common factors. A commonly used rule is that
there should be at least three
variables per factor. In the following analysis, you determine
that there are two common factors
in these data. Therefore, more variables are needed for a
reliable analysis.
Output 33.2.2 displays the results of the principal factor
extraction.
Output 33.2.2 Principal Factor Analysis: Factor Extraction
Prior Communality Estimates: SMC
0.96859160 0.82228514 0.96918082 0.78572440 0.84701921
Eigenvalues of the Reduced Correlation Matrix: Total =
4.39280116 Average = 0.87856023
Eigenvalue Difference Proportion Cumulative
1 2.73430084 1.01823217 0.6225 0.6225

2 1.71606867 1.67650586 0.3907 1.0131
3 0.03956281 0.06408626 0.0090 1.0221
4 -.02452345 0.04808427 -0.0056 1.0165
5 -.07260772 -0.0165 1.0000
L/default/statug_factor_sect028.htm#statug.factor.facex1c
L/default/statug_factor_sect006.htm#statug.factor.factormsa
L/default/statug_factor_sect029.htm#statug.factor.facex2r
이강복
강조
이강복
강조
The square multiple correlations are shown as prior
communality estimates in Output 33.2.2. The
PRIORS=SMC option basically replaces the diagonal of the
original observed correlation matrix
by these square multiple correlations. Because the square
multiple correlations are usually less
than one, the resulting correlation matrix for factoring is called
the reduced correlation matrix. In
the current example, the SMCs are all fairly large; hence, you

expect the results of the principal
factor analysis to be similar to those in the principal component
analysis.
The first two largest positive eigenvalues of the reduced
correlation matrix account for of
the common variance. This is possible because the reduced
correlation matrix, in general, is not
necessarily positive definite, and negative eigenvalues for the
matrix are possible. A pattern like
this suggests that you might not need more than two common
factors. The scree and variance
explained plots of Output 33.2.3 clearly support the conclusion
that two common factors are
present. Showing in the left panel of Output 33.2.3 is the scree
plot of the eigenvalues of the
reduced correlation matrix. A sharp bend occurs at the third
eigenvalue, reinforcing the
conclusion that two common factors are present. These
cumulative proportions of common
variance explained by factors are plotted in the right panel of
Output 33.2.3, which shows that
the curve essentially flattens out after the second factor.
Output 33.2.3 Scree and Variance Explained Plots

Principal Factor Analysis: Initial Factor
Solution
For the current analysis, PROC FACTOR retains two factors by
certain default criteria. This
decision agrees with the conclusion drawn by inspecting the
scree plot. The principal factor
p

이강복
강조
pattern with the two factors is displayed in Output 33.2.4. This
factor pattern is similar to the
principal component pattern seen in Output 33.1.5 of Example
33.1. For example, the variable
Services has the largest loading on the first factor, and the
Population variable has the smallest.
The variables Population and Employment have large positive
loadings on the second factor, and
the HouseValue and School variables have large negative
loadings.
Output 33.2.4 Initial Factor Pattern Matrix and Communalities
Factor Pattern
Factor1 Factor2

Services 0.87899 -0.15847
HouseValue 0.74215 -0.57806
Employment 0.71447 0.67936
School 0.71370 -0.55515
Population 0.62533 0.76621
Variance Explained by Each
Factor
Factor1 Factor2
2.7343008 1.7160687
Final Communality Estimates: Total = 4.450370
0.97811334 0.81756387 0.97199928 0.79774304 0.88494998

Comparing the current factor loading matrix in Output 33.2.4
with that in Output 33.1.5 in
Example 33.1, you notice that the variables are arranged
differently in the two output tables. This
is due to the use of the REORDER option in the current
analysis. The advantage of using this
option might not be very obvious in Output 33.2.4, but you can
see its value when looking at the
rotated solutions, as shown in Output 33.2.7 and Output
33.2.11.
The final communality estimates are all fairly close to the priors
(shown in Output 33.2.2). Only
the communality for the variable HouseValue increased
appreciably, from to .
Therefore, you are sure that all the common variance is
accounted for.

Output 33.2.5 shows that the residual correlations (off-diagonal
elements) are low, the largest
being . The partial correlations are not quite as impressive,
since the uniqueness values are
also rather small. These results indicate that the squared
multiple correlations are good but not
quite optimal communality estimates.
Output 33.2.5 Residual and Partial Correlations
Residual Correlations With Uniqueness on the Diagonal
L/default/statug_factor_sect029.htm#statug.factor.facex2d
L/default/statug_factor_sect028.htm#statug.factor.facex1e

L/default/statug_factor_sect029.htm#statug.factor.facex2g
L/default/statug_factor_sect029.htm#statug.factor.facex2l
Population 0.02189 -0.01118 0.00514 0.01063 0.00124
School -0.01118 0.18244 0.02151 -0.02390 0.01248
Employment 0.00514 0.02151 0.02800 -0.00565 -0.01561

Services 0.01063 -0.02390 -0.00565 0.20226 0.03370
HouseValue 0.00124 0.01248 -0.01561 0.03370 0.11505
Root Mean Square Off-Diagonal Residuals: Overall =
0.01693282
0.00815307 0.01813027 0.01382764 0.02151737 0.01960158
Partial Correlations Controlling Factors
Population 1.00000 -0.17693 0.20752 0.15975 0.02471
School -0.17693 1.00000 0.30097 -0.12443 0.08614
Employment 0.20752 0.30097 1.00000 -0.07504 -0.27509
Services 0.15975 -0.12443 -0.07504 1.00000 0.22093

HouseValue 0.02471 0.08614 -0.27509 0.22093 1.00000
Root Mean Square Off-Diagonal Partials: Overall = 0.18550132
0.15850824 0.19025867 0.23181838 0.15447043 0.18201538
As displayed in Output 33.2.6, the unrotated factor pattern
reveals two tight clusters of variables,
with the variables HouseValue and School at the negative end of
Factor2 axis and the variables
Employment and Population at the positive end. The Services
variable is in between but closer to
the HouseValue and School variables. A good rotation would
place the axes so that most
variables would have zero loadings on most factors. As a result,
the axes would appear as though

they are put through the variable clusters.
Output 33.2.6 Unrotated Factor Loading Plot
L/default/statug_factor_sect029.htm#statug.factor.facex2f
Principal Factor Analysis: Varimax Prerotation
In Output 33.2.7, the results of the varimax prerotation are
shown. To yield the varimax-rotated
factor loading (pattern), the initial factor loading matrix is
postmultiplied by an orthogonal
transformation matrix. This orthogonal transformation matrix is
shown in Output 33.2.7,
followed by the varimax-rotated factor pattern. This rotation or
transformation leads to small
loadings of Population and Employment on the first factor and
small loadings of HouseValue

and School on the second factor. Services appears to have a
larger loading on the first factor than
it has on the second factor, although both loadings are
substantial. Hence, Services appears to be
factorially complex.
With the REORDER option in effect, you can see the variable
clusters clearly in the factor
pattern. The first factor is associated more with the first three
variables (first three rows of
variables): HouseValue, School, and Services. The second
factor is associated more with the last
two variables (last two rows of variables): Population and
Employment.
For orthogonal factor solutions such as the current varimax-
rotated solution, you can also
interpret the values in the factor loading (pattern) matrix as

correlations. For example,
HouseValue and Factor 1 have a high correlation at , while
Population and Factor 1 have a
low correlation at .
Output 33.2.7 Varimax Rotation: Transform Matrix and Rotated
Pattern
Orthogonal Transformation Matrix
1 2
1 0.78895 0.61446
2 -0.61446 0.78895

Rotated Factor Pattern
Factor1 Factor2
HouseValue 0.94072 -0.00004
School 0.90419 0.00055
Services 0.79085 0.41509
Population 0.02255 0.98874
Employment 0.14625 0.97499
Factor
Factor1 Factor2
2.3498567 2.1005128

0.97811334 0.81756387 0.97199928 0.79774304 0.88494998
The variance explained by the factors are more evenly
distributed in the varimax-rotated
solution, as compared with that of the unrotated solution.
Indeed, this is a typical fact for any
kinds of factor rotation. In the current example, before the
varimax rotation the two factors
explain and , respectively, of the common variance (see Output
33.2.4). After the
varimax rotation the two rotated factors explain and ,
respectively, of the common
variance. However, the total variance accounted for by the
factors remains unchanged after the

varimax rotation. This invariance property is also observed for
the communalities of the
variables after the rotation, as evidenced by comparing the
current communality estimates in
Output 33.2.7 with those in Output 33.2.4.
Output 33.2.8 shows the graphical plot of the varimax-rotated
factor loadings. Clearly,
HouseValue and School cluster together on the Factor 1 axis,
while Population and Employment
cluster together on the Factor 2 axis. Service is closer to the
cluster of HouseValue and School.
Output 33.2.8 Varimax-Rotated Factor Loadings

L/default/statug_factor_sect029.htm#statug.factor.facex2i
An alternative to the scatter plot of factor loadings is the so-
called vector plot of loadings, which
is shown in Output 33.2.9. The vector plot is requested with the
suboption VECTOR in the
PLOTS= option. That is:
plots=preloadings(vector)
This generates the vector plot of loadings in Output 33.2.9.
Output 33.2.9 Varimax-Rotated Factor Loadings: Vector Plot
L/default/statug_factor_sect029.htm#statug.factor.facex2x
L/default/statug_factor_sect006.htm#statug.factor.factorgplots

L/default/statug_factor_sect029.htm#statug.factor.facex2x
Principal Factor Analysis: Oblique Promax Rotation
For some researchers, the varimax-rotated factor solution in the
preceding section might be good
enough to provide them useful and interpretable results. For
others who believe that common
factors are seldom orthogonal, an obliquely rotated factor
solution might be more desirable, or at
least should be attempted.
PROC FACTOR provides a very large class of oblique factor
rotations. The current example
shows a particular one—namely, the promax rotation as
requested by the ROTATE=PROMAX
option.

The results of the promax rotation are shown in Output 33.2.10
and Output 33.2.11. The
corresponding plot of factor loadings is shown in Output
33.2.12.
Output 33.2.10 Promax Rotation: Procrustean Target and
Transformation
Target Matrix for Procrustean Transformation
Factor1 Factor2
L/default/statug_factor_sect029.htm#statug.factor.facex2j
L/default/statug_factor_sect029.htm#statug.factor.facex2o
Target Matrix for Procrustean Transformation

Factor1 Factor2
HouseValue 1.00000 -0.00000
School 1.00000 0.00000
Services 0.69421 0.10045
Population 0.00001 1.00000
Employment 0.00326 0.96793
Procrustean Transformation Matrix
1 2
1 1.04116598 -0.0986534
2 -0.1057226 0.96303019
Normalized Oblique Transformation
Matrix

1 2
1 0.73803 0.54202
2 -0.70555 0.86528
Output 33.2.10 shows the Procrustean target, to which the
varimax factor pattern is rotated,
followed by the display of the Procrustean transformation
matrix. This is the matrix that
transforms the varimax factor pattern so that the rotated pattern
is as close as possible to the
Procrustean target. However, because the variances of factors
have to be fixed at 1 during the
oblique transformation, a normalized version of the Procrustean
transformation matrix is the one
that is actually used in the transformation. This normalized
transformation matrix is shown at the

bottom of Output 33.2.10. Using this transformation matrix
leads to the promax-rotated factor
solution, as shown in Output 33.2.11.
Output 33.2.11 Promax Rotation: Factor Correlations and Factor
Pattern
Inter-Factor Correlations
Factor1 Factor2
Factor1 1.00000 0.20188
Factor2 0.20188 1.00000
Rotated Factor Pattern (Standardized Regression Coefficients)
Factor1 Factor2
HouseValue 0.95558485 -0.0979201
School 0.91842142 -0.0935214

Services 0.76053238 0.33931804
Factor1 Factor2
Population -0.0790832 1.00192402
Employment 0.04799 0.97509085
After the promax rotation, the factors are no longer
uncorrelated. As shown in Output 33.2.11,
the correlation of the two factors is now . In the (initial)
unrotated and the varimax solutions,

the two factors are not correlated.
In addition to allowing the factors to be correlated, in an
oblique factor solution you seek a
pattern of factor loadings that is more "differentiated" (referred
to as the "simple structures" in
the literature). The more differentiated the loadings, the easier
the interpretation of the factors.
For example, factor loadings of Services and Population on
Factor 2 are and ,
respectively, in the (orthogonal) varimax-rotated factor pattern
(see Output 33.2.7). With the
(oblique) promax rotation (see Output 33.2.11), these two
loadings become even more
differentiated with values and , respectively. Overall, however,
the factor patterns
before and after the promax rotation do not seem to differ too

much. This fact is confirmed by
comparing the graphical plots of factor loadings. The plots in
Output 33.2.12 (promax-rotated
factor loadings) and Output 33.2.8 (varimax-rotated factor
loadings) show very similar patterns.
Output 33.2.12 Promax Rotation: Factor Loading Plot
L/default/statug_factor_sect029.htm#statug.factor.facex2i
Unlike the orthogonal factor solutions where you can interpret
the factor loadings as correlations

between variables and factors, in oblique factor solutions such
as the promax solution, you have
to turn to the factor structure matrix for examining the
correlations between variables and
factors. Output 33.2.13 shows the factor structures of the
promax-rotated solution.
Output 33.2.13 Promax Rotation: Factor Structures and Final
Communalities
Factor Structure (Correlations)
Factor1 Factor2
HouseValue 0.93582 0.09500
School 0.89954 0.09189
Services 0.82903 0.49286
Population 0.12319 0.98596

Employment 0.24484 0.98478
Factor Ignoring Other Factors
L/default/statug_factor_sect029.htm#statug.factor.facex2n
Factor1 Factor2
2.4473495 2.2022803
0.97811334 0.81756387 0.97199928 0.79774304 0.88494998
Basically, the factor structure matrix shown in Output 33.2.13
reflects a similar pattern to the

factor pattern matrix shown in Output 33.2.11. The critical
difference is that you can have the
correlation interpretation only by using the factor structure
matrix. For example, in the factor
structure matrix shown in Output 33.2.13, the correlation
between Population and Factor 2 is
. The corresponding value shown in the factor pattern matrix in
Output 33.2.11 is ,
which certainly cannot be interpreted as a correlation
coefficient.
Common variance explained by the promax-rotated factors are
2.447 and 2.202, respectively, for
the two factors. Unlike the orthogonal factor solutions (for
example, the prerotated varimax
solution), variance explained by these promax-rotated factors do
not sum up to the total

communality estimate 4.45. In oblique factor solutions, variance
explained by oblique factors
cannot be partitioned for the factors. Variance explained by a
common factor is computed while
ignoring the contributions from the other factors.
However, the communalities for the variables, as shown in the
bottom of Output 33.2.13, do not
change from rotation to rotation. They are still the same set of
communalities in the initial,
varimax-rotated, and promax-rotated solutions. This is a basic
fact about factor rotations: they
only redistribute the variance explained by the factors; the total
variance explained by the factors
for any variable (that is, the communality of the variable)
remains unchanged.
In the literature of exploratory factor analysis, reference axes
had been an important tool in factor

rotation. Nowadays, rotations are seldom done through the uses
of the reference axes. Despite
that, results about reference axes do provide additional
information for interpreting factor
analysis results. For the current example of the promax rotation,
PROC FACTOR shows the
relevant results about the reference axes in Output 33.2.14.
Output 33.2.14 Promax Rotation: Reference Axis Correlations
and Reference Structures
Reference Axis Correlations
Factor1 Factor2
Factor1 1.00000 -0.20188
Factor2 -0.20188 1.00000
Reference Structure (Semipartial Correlations)

Factor1 Factor2
HouseValue 0.93591 -0.09590
L/default/statug_factor_sect029.htm#statug.factor.facex2m
Reference Structure (Semipartial Correlations)
Factor1 Factor2
School 0.89951 -0.09160

Services 0.74487 0.33233
Population -0.07745 0.98129
Employment 0.04700 0.95501
Factor Eliminating Other
Factors
Factor1 Factor2
2.2480892 2.0030200
To explain the results in the reference-axis system, some
geometric interpretations of the factor
axes are needed. Consider a single factor in a system of
common factors in an oblique factor
solution. Taking away the factor under consideration, the

remaining factors span a
hyperplane in the factor space of dimensions. The vector that is
orthogonal to this
hyperplane is the reference axis (reference vector) of the factor
under consideration. Using the
same definition for the remaining factors, you have reference
vectors for factors.
A factor in an oblique factor solution can be considered as the
sum of two independent
components: its associated reference vector and a component
that is overlapped with all other
factors. In other words, the reference vector of a factor is a
unique part of the factor that is not
predictable from all other factors. Thus, the loadings on a
reference vector are the unique effects
of the corresponding factor, partialling out the effects from all
other factors. The variances

explained by a reference vector are the unique variances
explained by the corresponding factor,
partialling out the variances explained by all other factors.
Output 33.2.14 shows the reference axis correlations. The
correlation between the reference
vectors is . Next, Output 33.2.14 shows the loadings on the
reference vectors in the table
entitled "Reference Structure (Semipartial Correlations)." As
explained previously, loadings on a
reference vector are also the unique effects of the corresponding
factor, partialling out the effects
from the all other factors. For example, the unique effect of
Factor 1 on HouseValue is .
Another important property of the reference vector system is
that loadings on a reference vector

are also correlations between the variables and the
corresponding factor, partialling out the
correlations between the variables and other factors. This means
that the loading 0.936 in the
reference structure table is the unique correlation between
HouseValue and Factor 1, partialling
out the correlation between HouseValue with Factor 2. Hence,
as suggested by the title of table,
all loadings reported in the "Reference Structure (Semipartial
Correlations)" can be interpreted as
semipartial correlations between variables and factors.
The last table shown in Output 33.2.14 are the variances
explained by the reference vectors. As

explained previously, these are also unique variances explained
by the factors, partialling out the
variances explained by all other factors (or eliminating all other
factors, as suggested by the title
of the table). In the current example, Factor 1 explains of the
variable variances, partialling
out all variable variances explained by Factor 2.
Notice that factor pattern (shown in Output 33.2.11), factor
structures (correlations, shown in
Output 33.2.13), and reference structures (semipartial
correlations, shown in Output 33.2.14)
give you different information about the oblique factor
solutions such as the promax-rotated
solution. However, for orthogonal factor solutions such as the
varimax-rotated solution, factor
structures and reference structures are all the same as the factor

pattern.
Principal Factor Analysis: Factor Rotations with Factor Pattern
Input
The promax rotation is one of the many rotations that PROC
FACTOR provides. You can
specify many different rotation algorithms by using the
ROTATE= options. In this section, you
explore different rotated factor solutions from the initial
principal factor solution. Specifically,
you want to examine the factor patterns yielded by the
quartimax transformation (an orthogonal
transformation) and the Harris-Kaiser (an oblique
transformation), respectively.
Rather than analyzing the entire problem again with new
rotations, you can simply use the
OUTSTAT= data set from the preceding factor analysis results.

First, the OUTSTAT= data set is printed using the following
statements:
proc print data=fact_all;run;
The output data set is displayed in Output 33.2.15.
Output 33.2.15 Output Data Set
Factor Output Data Set
Obs _TYPE_ _NAME_ Population School Employment Services
HouseValue
1 MEAN 6241.67 11.4417 2333.33 120.833 17000.00
2 STD 3439.99 1.7865 1241.21 114.928 6367.53
3 N 12.00 12.0000 12.00 12.000 12.00
4 CORR Population 1.00 0.0098 0.97 0.439 0.02

5 CORR School 0.01 1.0000 0.15 0.691 0.86
6 CORR Employment 0.97 0.1543 1.00 0.515 0.12
7 CORR Services 0.44 0.6914 0.51 1.000 0.78
8 CORR HouseValue 0.02 0.8631 0.12 0.778 1.00
L/default/statug_factor_sect029.htm#statug.factor.pprint1__

Obs _TYPE_ _NAME_ Population School Employment Services
HouseValue
9 COMMUNAL 0.98 0.8176 0.97 0.798 0.88
10 PRIORS 0.97 0.8223 0.97 0.786 0.85
11 EIGENVAL 2.73 1.7161 0.04 -0.025 -0.07
12 UNROTATE Factor1 0.63 0.7137 0.71 0.879 0.74
13 UNROTATE Factor2 0.77 -0.5552 0.68 -0.158 -0.58
14 RESIDUAL Population 0.02 -0.0112 0.01 0.011 0.00
15 RESIDUAL School -0.01 0.1824 0.02 -0.024 0.01
16 RESIDUAL Employment 0.01 0.0215 0.03 -0.006 -0.02
17 RESIDUAL Services 0.01 -0.0239 -0.01 0.202 0.03
18 RESIDUAL HouseValue 0.00 0.0125 -0.02 0.034 0.12
19 PRETRANS Factor1 0.79 -0.6145 . . .

20 PRETRANS Factor2 0.61 0.7889 . . .
21 PREROTAT Factor1 0.02 0.9042 0.15 0.791 0.94
22 PREROTAT Factor2 0.99 0.0006 0.97 0.415 -0.00
23 TRANSFOR Factor1 0.74 -0.7055 . . .
24 TRANSFOR Factor2 0.54 0.8653 . . .
25 FCORR Factor1 1.00 0.2019 . . .
26 FCORR Factor2 0.20 1.0000 . . .
27 PATTERN Factor1 -0.08 0.9184 0.05 0.761 0.96
28 PATTERN Factor2 1.00 -0.0935 0.98 0.339 -0.10
29 RCORR Factor1 1.00 -0.2019 . . .
30 RCORR Factor2 -0.20 1.0000 . . .
31 REFERENC Factor1 -0.08 0.8995 0.05 0.745 0.94

32 REFERENC Factor2 0.98 -0.0916 0.96 0.332 -0.10
33 STRUCTUR Factor1 0.12 0.8995 0.24 0.829 0.94
34 STRUCTUR Factor2 0.99 0.0919 0.98 0.493 0.09
Various results from the previous factor analysis are saved in
this data set, including the initial
unrotated solution (its factor pattern is saved in observations
with _TYPE_=UNROTATE), the
prerotated varimax solution (its factor pattern is saved in
observations with
_TYPE_=PREROTAT), and the oblique promax solution (its
factor pattern is saved in
observations with _TYPE_=PATTERN).
When PROC FACTOR reads in an input data set with
TYPE=FACTOR, the observations with
_TYPE_=PATTERN are treated as the initial factor pattern to

be rotated by PROC FACTOR.
Hence, it is important that you provide the correct initial factor
pattern for PROC FACTOR to
read in.
In the current example, you need to provide the unrotated
solution from the preceding analysis as
the input factor pattern. The following statements create a
TYPE=FACTOR data set fact2 from
the preceding OUTSTAT= data set fact_all:
data fact2(type=factor);
set fact_all;
if _TYPE_ in('PATTERN' 'FCORR') then delete;
if _TYPE_='UNROTATE' then _TYPE_='PATTERN';

In these statements, you delete observations with
_TYPE_=PATTERN or _TYPE_=FCORR,
which are for the promax-rotated factor solution, and change
observations with
_TYPE_=UNROTATE to _TYPE_=PATTERN in the new data
set fact2. In this way, the initial
orthogonal factor pattern matrix is saved in the observations
with _TYPE_=PATTERN.
You use this new data set and rotate the initial solution to
another oblique solution with the
ROTATE=QUARTIMAX option, as shown in the following
statements:
proc factor data=fact2 rotate=quartimax reorder;
run;
As shown in Output 33.2.16, the input data set is of the
FACTOR type for the new rotation.

Output 33.2.16 Quartimax Rotation With Input Factor Pattern
Quartimax Rotation From a TYPE=FACTOR Data Set
The FACTOR Procedure
Input Data Type FACTOR
N Set/Assumed in Data Set 12
N for Significance Tests 12
The quartimax-rotated factor pattern is displayed in Output
33.2.17.
Output 33.2.17 Quartimax-Rotated Factor Pattern
Orthogonal Transformation Matrix
1 2

1 0.80138 0.59815
2 -0.59815 0.80138
Factor1 Factor2
HouseValue 0.94052 -0.01933
L/default/statug_factor_sect029.htm#statug.factor.facex2p
L/default/statug_factor_sect029.htm#statug.factor.facex2q
Factor1 Factor2

School 0.90401 -0.01799
Services 0.79920 0.39878
Population 0.04282 0.98807
Employment 0.16621 0.97179
Factor
Factor1 Factor2
2.3699941 2.0803754
The quartimax rotation produces an orthogonal transformation
matrix shown at the top of Output
33.2.17. After the transformation, the factor pattern is shown
next. Compared with the varimax-
rotated factor pattern (see Output 33.2.7), the quartimax-rotated

factor pattern shows some
differences. The loadings of HouseValue and School on Factor 1
drop only slightly in the
quartimax factor pattern, while the loadings of Services,
Population, and Employment on Factor
1 gain relatively larger amounts. The total variance explained
by Factor 1 in the varimax-rotated
solution (see Output 33.2.7) is , while it is after the quartimax-
rotation. In other words,
more variable variances are explained by the first factor in the
quartimax factor pattern than in
the varimax factor pattern. Although not very strongly
demonstrated in the current example, this
illustrates a well-known property about the quartimax rotation:
it tends to produce a general
factor for all variables.

Another oblique rotation is now explored. The Harris-Kaiser
transformation weighted by the
Cureton-Mulaik technique is applied to the initial factor pattern.
To achieve this, you use the
ROTATE=HK and NORM=WEIGHT options in the following
PROC FACTOR statement:
ods graphics on;
proc factor data=fact2 rotate=hk norm=weight reorder
plots=loadings;run;
ods graphics off;
Output 33.2.18 shows the variable weights in the rotation.
Output 33.2.18 Harris-Kaiser Rotation: Weights
Variable Weights for Rotation

0.95982747 0.93945424 0.99746396 0.12194766 0.94007263
L/default/statug_factor_sect006.htm#statug.factor.factornorm
L/default/statug_factor_sect029.htm#statug.factor.facex2s
While all other variables have weights at least as large as , the
weight for Services is only
. This means that due to its small weight, Services is not as

important as the other variables
for determining the rotation (transformation). This makes sense
when you look at the initial
unrotated factor pattern plot in Output 33.2.6. In the plot, there
are two main clusters of
variables, and Services does not seem to fall into either of the
clusters. In order to yield a Harris-
Kaiser rotation (transformation) that would gear towards to two
clusters, the Cureton-Mulaik
weighting essentially downweights the contribution from
Services in the factor rotation.
The results of the Harris-Kaiser factor solution are displayed in
Output 33.2.19, with a graphical
plot of rotated loadings displayed in Output 33.2.20.
Output 33.2.19 Harris-Kaiser Rotation: Factor Correlations and
Factor Pattern

Inter-Factor Correlations
Factor1 Factor2
Factor1 1.00000 0.08358
Factor2 0.08358 1.00000
Factor1 Factor2
HouseValue 0.94048 0.00279
School 0.90391 0.00327
Services 0.75459 0.41892
Population -0.06335 0.99227
Employment 0.06152 0.97885
Because the Harris-Kaiser produces an oblique factor solution,

you compare the current results
with that of the promax (see Output 33.2.11), which also
produces an oblique factor solution.
The correlation between the factors in the Harris-Kaiser
solution is ; this value is much
smaller than the same correlation in the promax solution, which
is . However, the Harris-
Kaiser rotated factor pattern shown in Output 33.2.19 is more or
less the same as that of the
promax-rotated factor pattern shown in Output 33.2.11. Which
solution would you consider to be
more reasonable or interpretable?
From the statistical point of view, the Harris-Kaiser and promax
factor solutions are equivalent.
They explain the observed variable relationships equally well.
From the simplicity point of view,

however, you might prefer to interpret the Harris-Kaiser
solution because the factor correlation is
smaller. In other words, the factors in the Harris-Kaiser solution
do not overlap that much
conceptually; hence they should be more distinctive to interpret.
However, in practice simplicity
in factor correlations might not the only principle to consider.
Researchers might actually expect
to have some factors to be highly correlated based on
theoretical or substantive grounds.
Although the Harris-Kaiser and the promax factor patterns are
very similar, the graphical plots of
the loadings from the two solutions paint slightly different
pictures. The plot of the promax-
L/default/statug_factor_sect029.htm#statug.factor.facex2f
L/default/statug_factor_sect029.htm#statug.factor.facex2t

L/default/statug_factor_sect029.htm#statug.factor.facex2w
L/default/statug_factor_sect029.htm#statug.factor.facex2t
rotated loadings is shown in Output 33.2.12, while the plot of
the loadings for the current Harris-
Kaiser solution is shown in Output 33.2.20.
Output 33.2.20 Harris-Kaiser Rotation: Factor Loading Plot
The two factor axes in the Harris-Kaiser rotated pattern (Output
33.2.20) clearly cut through the
centers of the two variable clusters, while the Factor 1 axis in
the promax solution lies above a
variable cluster (Output 33.2.12). The reason for this subtle

difference is that in the Harris-Kaiser
rotation, the Services is a "loner" that has been downweighted
by the Cureton-Mulaik technique
(see its relatively small weight in Output 33.2.18). As a result,
the rotated axes are basically
determined by the two variable clusters in the Harris-Kaiser
rotation.
As far as the current discussion goes, it is not recommending
one rotation method over another.
Rather, it simply illustrates how you could control certain types
of characteristics of factor
rotation through the many options supported by PROC
FACTOR. Should you prefer an
orthogonal rotation to an oblique rotation? Should you choose
the oblique factor solution with
the smallest factor correlations? Should you use a weighting
scheme that would enable you to

find independent variable clusters? While PROC FACTOR
enables you to explore all these
L/default/statug_factor_sect029.htm#statug.factor.facex2s
alternatives, you must consult advanced textbooks and
published articles to get satisfactory and
complete answers to these questions.

Assignment
#6
Jake
Rifkin
Introduction:
In
this
report,
principal
component
analysis
(PCA)
is
explored
for
its

properties
in
dimensional
reduction
as
well
as
its
applicability
to
linear
modeling.
The
data
set
used
is
a
stock
portfolio
dataset
and
the
goal

of
PCA
and
linear
regression
is
to
help
explain
the
variation
in
the
market
index
with
20
daily
individual
stock
returns.
PCA
allows

us
to
reduce
and
group
the
stocks
into
principle
components
to
better
understand
how
the
stocks
are
related
to
each
other
and
what
the

covariance
structure
looks
like
while
linear
regression
allows
us
to
evaluate
the
impact
of
an
individual
stock
on
the
market
index
close.

Results:
Correlations
between
stocks
and
the
market
index
The
daily
log
return
for
each
stock
is
calculated
as
well
as
the

log
return
of
the
market
index
variable.
The
correlation
between
the
log
return
of
each
stock
and
the
log
return
of
the
market
index

is
summarized
in
the
table
below.
All
of
the
log
returns
have
a
positive
correlation
with
the
log
market
return
ranging
from
.44

to
.76.
In
other
words,
all
stock
returns
have
at
least
a
moderately
strong
correlation
with
the
return
of
the
market
index.
At
this

point,
noting
that
so
many
stocks
are
positively
correlated
to
the
market
index,
we
should
expect
that
some
degree
of
multicollinearity
is
present

between
the
predictor
stocks.
As
all
log
return
are
positively
correlated,
each
of
these
stocks
is
giving
a
relatively
similar
piece
of
information
to

our
model
if
we
were
to
employ
a
linearly
model
to
predict
the
daily
market
return.
A
full
interpretation
of
the
covariance
matrix
would

be
a
worthwhile
exercise
to
unpack
the
underlying
correlation
structure
especially
since
we’re
looking
to
fit
a
linear
model
to
explain
the
variation

in
the
log
of
the
market
index.
Multicollinearity
is
a
major
issue
for
linear
models.
return_AA
return_BAC
return_BHI
return_CVX
return_DD
return_DOW
return_DPS

Response_vv
corr
.63241
.65019
.5775
.7209
.68952
.62645
.4435
return_GS
return_HAL
return_HES
return_HON
return_HUN
return_JPM
return_KO
Response_vv

corr
.71216
.5975
.6108
.76838
.58194
.65785
.5998
return_MMM
return_MPC
return_PEP
return_SLB
return_WFC
return_XOM
Response_vv
corr

.76085
.47312
.50753
.69285
.73357
.72111
It
is
worth
noting
that
many
of
these
stocks
belong
to
the
same
sector.
For

example,
the
BAC,
GS,
JPM,
and
WFC
stocks
all
belong
to
the
banking
sector.
One
way
to
potentially
achieve
some
dimensional
reduction
and

gain
a
better
understanding
if
we’re
getting
similar
information
from
similar
stocks
is
to
group
the
correlation
by
their
industry.
The
following
chart
presents

the
stock
correlations
to
the
market
index
grouped
by
sector.
Looking
at
the
banking
sector,
all
four

banking
stocks
have
high
correlation
to
the
market
index
and
low
variation
within
the
group.
The
oil
refining
group
has
a
much
greater
variability.

While
our
base
sizes
for
the
various
clusters
are
fairly
small,
we
can
start
to
gain
an
understanding
of
how
sectors
relate
to
the

market
index
returns.
A
way
to
summarize
by
sector
would
be
to
take
the
mean
correlation
coefficient
by
sector,
but
it
should
be

noted
that
this
method
is
not
recommended
for
interpreting
the
effect
of
a
sector
on
the
market
index
returns.
For
one,
it
is
inappropriate

to
simply
average
the
correlation
coefficients
to
produce
a
mean
coefficient.
If
a
mean
coefficient
is
desired
for
Pearson
correlation
coefficients,
a
Fisher

z
transformation
where
the
correlation
coefficient
is
transformed
to
a
z-‐value
and
then
averaged
to
produce
a
single
correlation
coefficient
is
a
preferable
interpretation1.

Additionally,
averaging
the
correlation
coefficients
ignores
the
joint
effect
of
the
various
individual
stocks
on
the
market
index.
In
other
words,
as

a
Pearson
correlation
coefficient
is
only
between
a
response
variable
and
an
explanatory
variable,
there
is
an
implicit
assumption
of
et
ceteris
paribus
of

the
other
variables.
This
is
an
over
simplified
view
of
potentially
joint
effects
and
gives
rise
to
the
reason
why
the
covariance
matrix

and
PCA
is
used
to
analyze
the
underlying
covariance
structure
of
a
set
of
explanatory
variables
on
a
single
response
variable.

1
Corey,
David
M.,
William
P.
Dunlap,
and
Michael
J.
Burke.
"Averaging
Correlations:
Expected

Values
Principal
Component
Analysis
In
this
section,
PCA
is
used
to
further
examine
the
underlying
covariance
structure

of
the
stock
return
variables
(excluding
the
market
index
variable)
with
the
goal
of
dimensional
reduction
and
pre-‐
supposing
potential
multicollinearity.
The
following

plot
represents
the
proportion
of
variance
explained
by
the
principal
component.
Approximately
48%
of
the
variance
is
explained
by
the
first
principle
component

while
the
second
principle
component
drops
to
explaining
less
than
8%
of
the
variance.
As
there
is
no
exact
way
to
choose
the
number

of
principal
components
to
use,
there
are
a
couple
of
general
wisdoms
to
employ.
One
states
that
components
should
be
included
until
at
least

80%
of
the
variance
is
explained.
In
our
chart,
this
would
suggest
including
8
principal
components.
This
method
ensures
that
a
high

proportion
of
variance
is
explained
by
the
included
components
but
may
not
yield
the
simplest
interpretation.
Another
wisdom
to
consider
is
that
of
parsimony.

The
concept
of
creating
the
most
parsimonious
explanation
of
the
phenomenon
means
explaining
as
much
as
possible
with
as
few
components
as
possible.

The
concept
of
parsimony
prevents
including
too
many
components
which
makes
the
interpretation
of
the
effects
difficult.
Ideally,
the
first
few
principal
components

contain
the
vast
majority
of
the
variance
(70-‐90%).
A
way
to
potentially
achieve
parsimony
is
to
exclude
any
components
whose
Eigenvalue
is
less
than

the
mean
Eigenvalue
of
all
the
other
components.
Applying
this
threshold
to
our
data,
we
would
only
include
the
first
3
components
and
only

explain
62.61%
of
the
variance.
For
this
report,
I
will
include
the
first
8
principle

components
in
the
regression
portion
as
they
explain
80%
of
the
variance.
In
this
section,
the
first
two
principal
components
are
examined
more
closely

by
plotting
the
first
two
principal
components
in
a
scatter
plot.
Additionally,
a
table
of
the
first
two
principal
components
is
included.

Looking
at
the
first
principal
component,
which
contains
48%
of
the
variance;
note
that
all
of
the
stock

tickers
are
positive
and
relatively
closely
contained.
One
interpretation
of
this
principal
component
could
be
a
weighted
measure
of
the
overall
market
performance.
Some

stocks
contribute
slightly
less
than
others,
but
for
the
most
part,
all
stocks
are
contributing
fairly
equally
across
the
board.
The
second
principal

component
has
3
distinct
outliers
clearly
visible
in
the
scatter
plot
while
the
rest
of
the
data
clusters
around
the
bottom
right
corner.
Interestingly,

the
3
stocks
towards
the
high
end
of
principal
component
2
are
all
soft
drink
manufacturers.
This
suggests
that
the
soft
drink
sector
provides

disproportionately
to
explaining
the
variance
of
the
second
principal
component.
Multiple
Regression
Modeling
In
this
section,
a
training
and
test

set
is
created
to
aid
in
comparing
two
multiple
regression
models
on
metrics
like
mean
absolute
error
and
goodness
of
fit.
The
first
model

fit
uses
all
of
the
log
stock
return
variables
to
predict
the
market
index
variable.
The
second
model
fit
uses
the
first
eight
principal

components.
The
models
are
compared
across
their
goodness
of
fit
as
well
as
their
ability
to
predict
in
and

out
of
sample
data
with
MSE
and
MAE.
The
first
model
includes
all
20
stocks
as
predictors
attempting
to
explain
the
variance

of
the
training
set
market
return.
The
overall
adjusted
r
square
is
.8919
indicating
that
the
model
explains
a
lot
of
the
training

set
variation.
Not
all
of
the
parameter
estimates
are
statistically
significant
and
if
this
model
was
to
be
used
for
inferential
purposes,
it
should

be
noted
that
those
variables
should
be
trimmed
and
the
model
refit
on
a
subset
of
the
variables.
In
addition
to
examining
the
coefficients,

the
variance
inflation
factor
(VIF)
of
each
coefficient
is
checked.
While
no
VIFs
are
greater
than
5
(an
indicator
of
potential
multicollinearity),
the

blind
log
transformation
we
employed
on
the
returns
at
the
beginning
of
the
analysis
is
most
likely
suppressing
the
values
of
the
VIF.
Considering

that
the
data
in
the
regression
is
all
on
the
log
scale,
the
VIFs
of
2
or
3
may
be
indicators
of
multicollinearity.

Looking
at
the
residual
plots,
the
log
return
model
fits
the
data
fairly
well
and
it
appears
that
no
major
assumptions
of

linear
modeling
are
violated.
The
residual
plot
shows
a
good
random
spread.
The
QQ
plot
indicates
incredibly
minimal
departures
form
normality
and
cook’s
D

indicates
only
a
couple
of
outliers.
The
histogram
of
the
residuals
appears
to
follow
the
assumptions
of
normality
and
homogeneity.
Overall,
this
model

seems
to
fit
well
though
concerns
about
multicollinearity
of
the
predictor
variables
exist
and
the
model
should
prune
the
non-‐significant
coefficients
and
refit.

The
second
model
is
fit
on
the
first
eight
principal
components
to
explain
the
log
return
of
the
market
index.

This
model
has
an
adjusted
r
squared
value
of
.8886
which
indicates
a
very
high
proportion
of
the
variance
of
the
log
return
of

the
market
index
is
explained
by
these
8
variables.
Already,
we
see
a
reduction
in
model
complexity
with
this
model
in
terms
of
the

number
of
terms
included.
Looking
at
the
coefficients,
the
first
4
and
the
eighth
principle
components
have
statistically
significant
non
zero
effects
on

the
log
of
the
market
index.
The
insignificant
variables
should
be
pruned
from
the
model
and
the
model
should
be
refit.
So
far,
this

model
has
taken
a
slight
hit
in
adjusted
r
square
compared
to
the
first
model
but
the
real
benefit
of
the
second
model
is

apparent
when
looking
at
the
VIFs
of
the
coefficients.
All
VIFs
are
within
.03
of
1.
While
the
previous
model
has
probably
multicollinearity,
this

model
is
relatively
free
of
multicollinearity
as
indicated
by
the
low
VIF
values.
This
indicates
that
a
more
stable
model

that
better
and
more
consistently
explains
the
variation
in
the
log
of
the
market
index.
Additionally,
this
model
will
most
likely
extrapolate
better
than

the
previous
model.
Looking
at
the
residual
of
the
second
model,
the
model
has
just
as
good
of
a
goodness
of
fit
as

the
first
model.
All
assumptions
of
linear
regression
appear
to
be
maintained.
The
residual
scatter
plot
is
random
and
devoid
of
a
clear

pattern
while
the
QQ
plot
shows
minimal
deviations
from
normality.
The
Cook’s
D
plot
shows
only
a
few
influential
points
and
the
histogram
of

the
residual
appears
to
fit
the
assumption
of
normality
perfectly.
Overall,
this
model
fits
as
well
as
the
previous
model
and
avoids
some
issues

with
multicollinearity
that
are
present
in
the
log
return
version
of
the
model.
The
insignificant
variables
should
be
pruned
and
the
model
should

be
refit.
This
model
has
the
added
benefit
of
having
potentially
easier
interpretability
than
the
previous
model
as
long
as
the
interpretation
of
the

PCA
components
is
logical.
Finally,
the
two
models
are
compared
on
their
ability
to
predict
using
MSE
and

MAE.
The
table
below
summarizes
the
findings
on
training
and
testing
data.
Overall,
both
models
perform
similarly.
They
both
have
higher
error
rates

in
the
out
of
sample
data
than
the
in
sample
data.
The
first
model
fit
on
log
returns
outperforms
the
principal
component
fit
model

at
the
hundred-‐thousandths
place
in
the
MAE
on
both
the
training
and
test
data.
The
first
model
outperforms
the
second
model
by
more
in

the
training
set
than
the
out
of
sample
test
set.
Data/Model
Model
1
MSE
Model
1
MAE
Model
2
MSE
Model

2
MAE
Test
.000009306
.002144904
.000009677
.002179249
Train
.000005994
.001902032
.000006410
.001975239
Conclusions:
In
this

report,
stock
return
data
was
analyzed
with
the
goal
of
fitting
and
comparing
two
models
that
explain
the
variation
in
the
log
return
of

the
market
index.
Principal
component
analysis
was
conducted
to
aide
in
dimension
reduction
as
well
as
for
use
as
regression
inputs.
Overall,
PCA
was

only
modestly
beneficial
over
the
log
transformation
of
the
stock
returns
because
it
took
several
principal
components
to
reach
an
appropriate
threshold
of

variance
explained.
Principal
components
beyond
the
second
only
accounted
for
less
than
7%
of
the
variance
of
the
explanatory
variables.
Additionally,
including
many

principal
components
directly
challenges
the
goal
of
having
a
parsimonious
interpretation.
One
major
assumption
this
report
makes
is
that
the
log
transformations

of
the
daily
returns
were
necessary
and
good
for
the
modeling.
First,
the
log
transformation
at
the
beginning
made
it
difficult
to
identify

multicollinearity
using
the
VIF
metric
and
secondly
PCA
is
not
scale
invariant
and
thus
performing
PCA
on
the
non-‐log
returns
could
yield
different
results.

In
terms
of
modeling
help,
PCA
helped
our
linear
model
reduce
the
multicollinearity
issue
present
in
the
raw
data
though
both
models
seemed

to
perform
similarly
on
testing
and
training
data.
In
terms
of
which
model
is
preferable,
in
reality
both
models
perform
very
similarly
and
given

the
nature
of
the
data,
if
our
goal
is
accurate
prediction,
a
time
series
approach
may
be
a
more
appropriate.
In
choosing
between
the

two
models
fit
in
this
report,
the
second
model
has
less
multicollinearity
with
very
similar
performance.
This
model
would
be
the
preferred
model

to
employ.
Code:
libname
mydata
"/scs/wtm926/"
access=readonly;
data
temp;
set
mydata.stock_portfolio_data;
run;

proc
print
data=temp(obs=10);
run;
quit;
proc
sort
data=temp;
by
date;
run;
quit;
data
temp;
set
temp;

*
Compute
the
log-‐returns
-‐
log
of
the
ratio
of
today's
price
to
yesterday's
price;
*
Note
that
the
data
needs

to
be
sorted
in
the
correct
direction
in
order
for
us
to
compute
the
correct
return;
return_AA
=
log(AA/lag1(AA));

return_BAC
=
log(BAC/lag1(BAC));
return_BHI
=
log(BHI/lag1(BHI));
return_CVX
=
log(CVX/lag1(CVX));
return_DD
=
log(DD/lag1(DD));
return_DOW
=

log(DOW/lag1(DOW));
return_DPS
=
log(DPS/lag1(DPS));
return_GS
=
log(GS/lag1(GS));
return_HAL
=
log(HAL/lag1(HAL));
return_HES
=
log(HES/lag1(HES));

return_HON
=
log(HON/lag1(HON));
return_HUN
=
log(HUN/lag1(HUN));
return_JPM
=
log(JPM/lag1(JPM));
return_KO
=
log(KO/lag1(KO));
return_MMM

=
log(MMM/lag1(MMM));
return_MPC
=
log(MPC/lag1(MPC));
return_PEP
=
log(PEP/lag1(PEP));
return_SLB
=
log(SLB/lag1(SLB));
return_WFC
=
log(WFC/lag1(WFC));
return_XOM

=
log(XOM/lag1(XOM));
*return_VV
=
log(VV/lag1(VV));
response_VV
=
log(VV/lag1(VV));
run;
proc
print
data=temp(obs=10);
run;
quit;

*
We
can
use
ODS
TRACE
to
print
out
all
of
the
data
sets
available
to
ODS
for
a

particular
SAS
procedure.;
*
We
can
also
look
these
data
sets
up
in
the
SAS
User's
Guide
in
the
chapter
for
the
selected

procedure.;
*ods
trace
on;
ods
output
PearsonCorr=portfolio_correlations;
proc
corr
data=temp;
*var
return:
with
response_VV;
var
return_:;
with
response_VV;

run;
quit;
*ods
trace
off;
proc
print
data=portfolio_correlations;
run;
quit;
data
wide_correlations;
set
portfolio_correlations

(keep=return_:);
run;
*
Note
that
wide_correlations
is
a
'wide'
data
set
and
we
need
a
'long'
data
set;

*
SAS
has
two
'standard'
data
formats
-‐
wide
and
long;
*
We
can
use
PROC
TRANSPOSE
to
convert
data
from
one
format
to

the
other;
proc
transpose
data=wide_correlations
out=long_correlations;
run;
quit;
data
long_correlations;
set
long_correlations;
tkr

=
substr(_NAME_,8,3);
drop
_NAME_;
rename
COL1=correlation;
run;
proc
print
data=long_correlations;
run;
quit;

*
Merge
on
sector
id
and
make
a
colored
bar
plot;
data
sector;
input
tkr
$
1-‐3
sector
$
4-‐35;

datalines;
AA
Industrial
-‐
Metals
BAC
Banking
BHI
Oil
Field
Services
CVX
Oil
Refining
DD
Industrial
-‐

Chemical
DOW
Industrial
-‐
Chemical
DPS
Soft
Drinks
GS
Banking
HAL
Oil
Field
Services
HES
Oil
Refining
HON

Manufacturing
HUN
Industrial
-‐
Chemical
JPM
Banking
KO
Soft
Drinks
MMM
Manufacturing
MPC
Oil
Refining
PEP
Soft
Drinks

SLB
Oil
Field
Services
WFC
Banking
XOM
Oil
Refining
VV
Market
Index
;
run;

proc
print
data=sector;
run;
quit;
proc
sort
data=sector;
by
tkr;
run;
proc
sort
by
tkr;
run;

data
long_correlations;
merge
long_correlations
(in=a)
sector
(in=b);
by
tkr;
if
(a=1)
and
(b=1);
run;

proc
print
run;
quit;
*
Make
Grouped
Bar
Plot;
*
p.
48
Statistical
Graphics
Procedures
By

Example;
ods
graphics
on;
title
'Correlations
with
the
Market
Index';
proc
sgplot
format
correlation
3.2;
vbar
tkr

/
response=correlation
group=sector
groupdisplay=cluster
datalabel;
run;
quit;
ods
graphics
off;
*
Still
not
the
correct
graphic.
We

want
the
tickers
grouped
and
color
coded
by
sector;
*
We
want
ticker
labels
directly
under
the
x-‐axis
and
sector
labels
under
the
ticker

labels
denoting
each
group.
Looks
like
we
have
an
open
SAS
graphics
project.;
ods
graphics
on;
title
'Correlations
with
the

Market
Index';
proc
sgplot
format
correlation
3.2;
vbar
sector
/
group=tkr
groupdisplay=cluster
datalabel;
run;
quit;
ods

graphics
off;
*
SAS
can
produce
bar
plots
by
sector
of
the
mean
correlation;
proc
means
data=long_correlations

nway
noprint;
class
sector;
var
correlation;
output
out=mean_correlation
mean(correlation)=mean_correlation;
run;
quit;
proc
print
data=mean_correlation;
run;

ods
graphics
on;
title
'Mean
Correlations
with
the
Market
Index
by
Sector';
proc
sgplot
data=mean_correlation;
format
mean_correlation
3.2;

vbar
sector
/
response=mean_correlation
datalabel;
run;
quit;
ods
graphics
off;
*
Note
that
we

have
been
using
PROC
SGPLOT
to
display
a
data
summary,
and
hence
we
have
not
been
able
to
make
the
display
that
we

want.
In
reality
PROC
SGPLOT
is
designed
to
take
an
input
data
set,
perform
some
routine
data
summaries,
and
display
that

output.
Unfortunately,
routine
data
summaries
are
typically
frequency
counts
for
discrete
data
or
averages
for
contiuous
data.
Here
is

an
example
of
the
default
use
of
PROC
SGPLOT.;
ods
graphics
on;
title
'Mean
Correlations
with
the
Market
Index
by
Sector
-‐
SGPLOT

COMPUTES
MEANS';
proc
sgplot
format
correlation
3.2;
vbar
sector
/
stat=mean
datalabel;
run;
quit;
ods
graphics

off;
*
Reset
title
statement
to
blank;
title
'';
*****************************************************
*******************************;
*
Begin
Modeling;
*****************************************************
*******************************;

*
Note
that
we
do
not
want
the
response
variable
in
the
data
used
to
compute
the
principal
components;

data
return_data;
set
temp
(keep=
return_:);
*
What
happens
when
I
put
this
keep
statement
in
the
set
statement?;

*
Look
it
up
in
The
Little
SAS
Book;
run;
proc
print
data=return_data(obs=10);
run;

*****************************************************
*******************************;
*
Compute
Principal
Components;
*****************************************************
*******************************;
ods
graphics
on;
proc
princomp
data=return_data
out=pca_output
outstat=eigenvectors
plots=scree(unpackpanel);
run;
quit;

ods
graphics
off;
*
Notice
that
PROC
PRINCOMP
produces
a
lot
of
output;
*
How
many
principal
components
should
we
keep?;
*

Do
the
principal
components
have
any
interpretability?;
*
Can
we
display
that
interpretability
using
graphics?;
proc
print
data=pca_output(obs=10);
run;

proc
print
data=eigenvectors(where=(_TYPE_='SCORE'));
run;
*
Display
the
two
plots
and
the
Eigenvalue
table
from
the
output;

*
Plot
the
first
two
eigenvectors;
data
pca2;
set
eigenvectors(where=(_NAME_
in
('Prin1','Prin2')));
drop
_TYPE_

;
run;
proc
print
data=pca2;
run;
proc
transpose
data=pca2
out=long_pca;
run;
quit;
proc
print
data=long_pca;
run;

data
long_pca;
set
long_pca;
format
tkr
$3.;
tkr
=
substr(_NAME_,8,3);
drop
_NAME_;
run;

proc
print
data=long_pca;
run;
*
Plot
the
first
two
principal
components;
ods
graphics
on;
proc
sgplot
data=long_pca;

scatter
x=Prin1
y=Prin2
/
datalabel=tkr;
run;
quit;
ods
graphics
off;
*****************************************************
*******************************;
*
Create
a

training
data
set
and
a
testing
data
set
from
the
PCA
output
;

*
Note
that
we
will
use
a
SAS
shortcut
to
keep
both
of
these
'datasets'
in
one
;

*
data
set
that
we
will
call
cv_data
(cross-‐validation
data).

;
*****************************************************
*******************************;
data
cv_data;
merge
pca_output
temp(keep=response_VV);
*
No
BY
statement

needed
here.
We
are
going
to
append
a
column
in
its
current
order;
*
generate
a
uniform(0,1)
random
variable
with
seed
set

to
123;
u
=
uniform(123);
if
(u
<
0.70)
then
train
=
1;
else
train
=
0;

if
(train=1)
then
train_response=response_VV;
else
train_response=.;
run;
proc
print
data=cv_data(obs=10);
run;

*
You
can
double
check
this
merge
by
printing
out
the
original
data;
proc
print
data=temp(keep=response_VV
obs=10);
run;
quit;

*****************************************************
********************************
*******;
*
Fit
a
regression
model
using
the
raw
predictor
variables;
*****************************************************
********************************
*******;
*
Fit

a
regression
model
using
all
of
the
raw
predictor
variables
and
VV
as
the
response
variable;
ods
graphics
on;
proc
reg

data=cv_data;
model
train_response
=
return_:
/
vif
;
output
out=model1_output
predicted=Yhat
;
run;
quit;
ods
graphics
off;
*
Examine
the

Goodness-‐Of-‐Fit
for
this
model.
How
well
does
it
fit?
Are
there
any
problems?;
*****************************************************
********************************
*******;
*

Fit
a
regression
model
using
the
rotated
predictor
variables
(Principal
Component
Scores)
;
*****************************************************
********************************
*******;
*
Now
fit
a
regression
model
using

your
selected
number
of
principal
components
and
VV
as
the
response
variable;
*
Examine
the
Goodness-‐Of-‐Fit
for
this
model.
How

well
does
it
fit?
Are
there
any
problems?;
ods
graphics
on;
proc
reg
data=cv_data;
model
train_response
=
prin1-‐prin8
/

vif
;
output
out=model2_output
predicted=Yhat
;
run;
quit;
ods
graphics
off;
*****************************************************
********************************
*******;
*

Compare
fit
and
predictive
accuracy
of
the
two
fitted
models
;
*****************************************************
********************************
*******;
*
Use
the
Mean
Square
Error
(MSE)
and
the

Mean
Absolute
Error
(MAE)
metrics
for
your
comparison.;
proc
print
data=model1_output(obs=10);
run;
*
Model

1;
data
model1_output;
set
model1_output;
square_error
=
(response_VV
-‐
Yhat)**2;
absolute_error
=
abs(response_VV
-‐
Yhat);
run;

proc
means
data=model1_output
nway
noprint;
class
train;
var
square_error
absolute_error;
output
out=model1_error
mean(square_error)=MSE_1
mean(absolute_error)=MAE_1;

run;
quit;
proc
print
data=model1_error;
title
"Model
1
(Log
Return)
MSE";run;
*
Model
2;
data
model2_output;

set
model2_output;
square_error
=
(response_VV
-‐
Yhat)**2;
absolute_error
=
abs(response_VV
-‐
Yhat);
run;
proc
means

data=model2_output
nway
noprint;
class
train;
var
square_error
absolute_error;
output
out=model2_error
mean(square_error)=MSE_2
mean(absolute_error)=MAE_2;
run;
quit;

proc
print
data=model2_error;
title
"Model
2
(PCA)
MSE";
run;
*
Merge
them
together
in
one
table;
data
error_table;

merge
model1_error(drop=
_TYPE_
_FREQ_)
model2_error(drop=
_TYPE_
_FREQ_);
by
train;
run;
proc
print
data=error_table;
run;

PCA Examples
1. Job Performance
This example uses the PRINCOMP procedure to analyze job
performance. Police officers were
rated by their supervisors in 14 categories as part of standard
police departmental administrative
procedure.
The following statements create the Jobratings data set:
options validvarname=any;
data Jobratings;
input ('Communication Skills'n
'Problem Solving'n
'Learning Ability'n

'Judgment Under Pressure'n
'Observational Skills'n
'Willingness to Confront Problems'n
'Interest in People'n
'Interpersonal Sensitivity'n
'Desire for Self-Improvement'n
'Appearance'n
'Dependability'n
'Physical Ability'n
'Integrity'n
'Overall Rating'n) (1.);
datalines;

26838853879867
74758876857667
56757863775875
67869777988997
99997798878888
89897899888799
89999889899798
87794798468886
35652335143113
89888879576867
76557899446397
97889998898989
76766677598888

... more lines ...
99899899899899
76656399567486
;
The data set Jobratings contains 14 variables. Each variable
contains the job ratings, using a
scale measurement from 1 to 10 (1=fail to comply,
10=exceptional). The last variable Overall
Rating contains a score as an overall index on how each officer
performs.
The following statements request a principal component
analysis on the Jobratings data set,

output the scores to the Scores data set (OUT= Scores), and
produce default plots. Note that
variable Overall Rating is excluded from the analysis.
ods graphics on;
proc princomp data=Jobratings(drop='Overall Rating'n);
run;
By default, the PROC PRINCOMP statement requests principal
components computed from the
correlation matrix, so the total variance is equal to the number
of variables, 13. In this example,
it would also be reasonable to use the COV option, which would
cause variables with a high
variance (such as Dependability) to have more influence on the
results than variables with a low
variance (such as Learning Ability). If you used the COV

option, scores would be computed
from centered rather than standardized variables.
Conclusions
The first component reflects overall performance since the first
eigenvector shows approximately
equal loadings on all variables. The second eigenvector has high
positive loadings on the
variables Observational Skills and Willingness to Confront
Problems but even higher negative
loadings on the variables Interest in People and Interpersonal
Sensitivity. This component seems
to reflect the ability to take action, but it also reflects a lack of
interpersonal skills. The third
eigenvector has a very high positive loading on the variable
Physical Ability and high negative
loadings on the variables Problem Solving and Learning Ability.

This component seems to
reflect physical strength, but also shows poor learning and
problem-solving skills.
In short, the three components represent the following:
First Component:
overall performance
Second Component:
smart, tough, and introverted
Third Component:
superior strength and average intellect
2. Basketball Rankings
The data in this example are rankings of 35 college basketball
teams. The rankings were
made before the start of the 1985–86 season by 10 news

services.
The purpose of the principal component analysis is to compute a
single variable that best
summarizes all 10 of the preseason rankings.
Note that the various news services rank different numbers of
teams, varying from 20
through 30 (there is a missing rank in one of the variables,
WashPost). And, of course, not all
services rank the same teams, so there are missing values in
these data. Each of the 35 teams
is ranked by at least one news service.
The PRINCOMP procedure omits observations with missing
values. To obtain principal
component scores for all of the teams, it is necessary to replace
the missing values. Since it is

the best teams that are ranked, it is not appropriate to replace
missing values with the mean of
the nonmissing values. Instead, an ad hoc method is used that
replaces missing values with
the mean of the unassigned ranks. For example, if 20 teams are
ranked by a news service,
then ranks 21 through 35 are unassigned. The mean of ranks 21
through 35 is 28, so missing
values for that variable are replaced by the value 28. To prevent
the method of missing-value
replacement from having an undue effect on the analysis, each
observation is weighted
according to the number of nonmissing values it has. See
Example 71.2 in Chapter 71, The
PRINQUAL Procedure, for an alternative analysis of these data.
Since the first principal component accounts for 78% of the

variance, there is substantial
agreement among the rankings. The eigenvector shows that all
the news services are about
equally weighted; this is also suggested by the nearly horizontal
line of the pattern profile
plot in Output 70.2.3. So a simple average would work almost
as well as the first principal
component. The following statements produce Output 70.2.1.
/*-----------------------------------------------------------*/
/* */
/* Pre-season 1985 College Basketball Rankings */
/* (rankings of 35 teams by 10 news services) */
/* */
/* Note: (a) news services rank varying numbers of teams; */

/* (b) not all teams are ranked by all news services; */
/* (c) each team is ranked by at least one service; */
/* (d) rank 20 is missing for UPI. */
/* */
/*-----------------------------------------------------------*/
data HoopsRanks;
input School $13. CSN DurSun DurHer WashPost USAToday
Sport InSports UPI AP SI;
label CSN = 'Community Sports News (Chapel Hill, NC)'
DurSun = 'Durham Sun'
DurHer = 'Durham Morning Herald'
WashPost = 'Washington Post'

USAToday = 'USA Today'
Sport = 'Sport Magazine'
InSports = 'Inside Sports'
UPI = 'United Press International'
AP = 'Associated Press'
SI = 'Sports Illustrated'
;
L/default/statug_prinqual_sect030.htm
L/default/prinqual_toc.htm
L/default/prinqual_toc.htm
L/default/statug_princomp_sect020.htm#statug.princomp.princo
mpoutball_d
L/default/statug_princomp_sect020.htm#statug.princomp.princo

mpoutball_a
이강복
강조
format CSN--SI 5.1;
datalines;
Louisville 1 8 1 9 8 9 6 10 9 9
Georgia Tech 2 2 4 3 1 1 1 2 1 1
Kansas 3 4 5 1 5 11 8 4 5 7
Michigan 4 5 9 4 2 5 3 1 3 2
Duke 5 6 7 5 4 10 4 5 6 5
UNC 6 1 2 2 3 4 2 3 2 3
Syracuse 7 10 6 11 6 6 5 6 4 10
Notre Dame 8 14 15 13 11 20 18 13 12 .

Kentucky 9 15 16 14 14 19 11 12 11 13
LSU 10 9 13 . 13 15 16 9 14 8
DePaul 11 . 21 15 20 . 19 . . 19
Georgetown 12 7 8 6 9 2 9 8 8 4
Navy 13 20 23 10 18 13 15 . 20 .
Illinois 14 3 3 7 7 3 10 7 7 6
Iowa 15 16 . . 23 . . 14 . 20
Arkansas 16 . . . 25 . . . . 16
Memphis State 17 . 11 . 16 8 20 . 15 12
Washington 18 . . . . . . 17 . .
UAB 19 13 10 . 12 17 . 16 16 15
UNLV 20 18 18 19 22 . 14 18 18 .

NC State 21 17 14 16 15 . 12 15 17 18
Maryland 22 . . . 19 . . . 19 14
Pittsburgh 23 . . . . . . . . .
Oklahoma 24 19 17 17 17 12 17 . 13 17
Indiana 25 12 20 18 21 . . . . .
Virginia 26 . 22 . . 18 . . . .
Old Dominion 27 . . . . . . . . .
Auburn 28 11 12 8 10 7 7 11 10 11
St. Johns 29 . . . . 14 . . . .
UCLA 30 . . . . . . 19 . .
St. Joseph's . . 19 . . . . . . .
Tennessee . . 24 . . 16 . . . .
Montana . . . 20 . . . . . .

Houston . . . . 24 . . . . .
Virginia Tech . . . . . . 13 . . .
;
/* PROC MEANS is used to output a data set containing the
*/
/* maximum value of each of the newspaper and magazine
*/
/* rankings. The output data set, maxrank, is then used */
/* to set the missing values to the next highest rank plus */
/* thirty-six, divided by two (that is, the mean of the */
/* missing ranks). This ad hoc method of replacing missing */
/* values is based more on intuition than on rigorous */
/* statistical theory. Observations are weighted by the */

/* number of nonmissing values. */
/* */
title 'Pre-Season 1985 College Basketball Rankings';
proc means data=HoopsRanks;
output out=MaxRank
max=CSNMax DurSunMax DurHerMax
WashPostMax USATodayMax SportMax
InSportsMax UPIMax APMax SIMax;
run;
Conclusions

Pre-Season 1985 College Basketball Rankings
College Teams as Ordered by PROC PRINCOMP
Obs School Prin1
1 Georgia Tech -0.58068
2 UNC -0.53317
3 Michigan -0.47874
4 Kansas -0.40285
5 Duke -0.38464
6 Illinois -0.33586
7 Syracuse -0.31578
8 Louisville -0.31489
9 Georgetown -0.29735

10 Auburn -0.09785
11 Kentucky 0.00843
12 LSU 0.00872
13 Notre Dame 0.09407
14 NC State 0.19404
15 UAB 0.19771
16 Oklahoma 0.23864
17 Memphis State 0.25319
18 Navy 0.28921
19 UNLV 0.35103
20 DePaul 0.43770
21 Iowa 0.50213

22 Indiana 0.51713
23 Maryland 0.55910
24 Arkansas 0.62977
25 Virginia 0.67586
26 Washington 0.67756
27 Tennessee 0.70822
28 St. Johns 0.71425
29 Virginia Tech 0.71638
30 St. Joseph's 0.73492
31 UCLA 0.73965
32 Pittsburgh 0.75078
Obs School Prin1

33 Houston 0.75534
34 Montana 0.75790
35 Old Dominion 0.76821
Assignment
#7
Jake
Rifkin

Introduction:
In
this
report,
exploratory
factor
analysis
is
performed
on
stock
portfolio
return
data.
The
goal
of
the
report
is
to
explore

and
understand
the
common
factors
shared
between
the
various
stocks
included
in
the
portfolio
by
examining
their
correlation
structure
through
factor
analysis.
Rotations
are

used
to
increase
the
interpretability
of
the
factor
analyses.
The
data
set
is
pruned
to
contain
4
key
sectors
(banking,
oil
field
services,

oil
refining,
and
industrial
chemical),
thus
we
would
ideally
expect
exploratory
factor
analysis
to
identify
four
common
factors.
All
data
is
log
transformed
per

the
instructions
of
the
assignment
to
transform
our
daily
return
calculation
into
a
more
normal
distribution
thus
meeting
a
fundamental
requirement
of
factor

analysis
that
the
predictor
variables
have
a
multinormal
distribution.
Results:
Principal
Factor
Analysis
without
Rotation
The
first
exploratory

factor
analysis
is
fit
using
the
principal
factor
analysis
method
with
priors
determined
by
smc
and
no
apriori
suggestion
of
number
of
factors.
SAS

selects
the
number
of
factors
for
us
present
in
the
data
using
the
proportion
of
variance
explained
by
calculating
eigenvalues
and
identifying
the

point
of
diminishing
returns.
Visually,
this
corresponds
to
the
elbow
in
the
scree
plot.
The
first
factor
analysis
includes

2
factors
which
is
fewer
factors
than
expected
considering
we
have
a
dataset
that
contains
four
sectors.
This
method
of
factor
selection
is
not

immune
to
Heywood
cases
as
indicated
by
the
presence
of
negative
eigenvalues
that
lead
to
some
mathematical
violations.
It
also
leads
to
the

first
two
factors
explaining
a
cumulative
1.0101
of
the
variance.
It
is
odd
to
consider
that
more
than
100%
of
the
variance
could
be

explained.
Given
that
Heywood
cases
are
present,
we
should
not
proceed
with
this
model.
The
factors
loadings
produced
by
this
first
analysis
are
presented

below.
A
simple
structure
is
not
present
and
interpretation
of
these
factors
is
difficult.
The
first
factor
is
relatively
flat
in
terms
of

variance
and
all
values
are
positive
loadings.
The
second
factor
has
a
greater
variance
than
the
first,
contains
many
negative
loadings,
and
in
examining

the
absolute
values
of
the
factors,
none
are
greater
than
.4.
While
it
was
expected
exploratory
factor
analysis
would
identify

the
sectors
as
factors,
plotting
the
two
factors
loadings
against
each
other
it
is
interesting
to
note
the
clumps
of
the
different
sectors
are

located
close
to
each
other
in
the
coordinate
plane.
As
there
is
little
variance
in
the
first
factor,
all
returns
fall
relatively

close
together
on
the
horizontal
axis.
The
vertical
axis
is
what
creates
the
separation
between
the
clusters.
The
bottom
three
returns
(BHI,
HAL,
and

SLB)
are
all
oil
field
services,
the
next
higher
three
(XOM,
CVA,
and
HES)
all
belong
to
the
oil
refining
sector.
All
oil
related

stocks
have
negative
loadings
in
the
second
factor.
The
top
right
quadrant
of
the
graph
contains
the
remaining
two
sectors
and
the
banking

and
industrial
chemical
sectors
are
also
clustered
closely
together.
Principal
Factor
Analysis
with
Orthogonal
Rotation
The
second
principal

factor
analysis
model
fit
for
this
assignment
includes
an
orthogonal
varimax
rotation.
As
we
have
not
specified
the
number
of
factors
to
expect
apriori,

SAS
performs
the
same
eigenvalue
calculation
as
the
first
model
and
thus
arrives
at
the
same
number
of
factors
to
include
into
the

model.
Specifying
a
rotation
did
not
help
SAS
generate
additional
factors
as
the
rotation
step
occurs
after
fitting
the
model.
The
same
issue

with
Heywood
cases
is
still
present
in
this
version
of
the
model.
Before
the
rotation,
SAS
has
produced
the
same
factor
loadings
as
the

previous
model.
After
the
varimax
rotation,
the
variance
explained
by
each
factor
becomes
much
closer
than
the
first
model.
A
simple
structure
has

still
not
been
achieved
as
there
is
a
lack
of
zeros
in
our
factor
loading
matrix.
The
components
that
changed
the
most
are
the

three
oil
field
service
stocks
(SLB,
HAL,
and
BHI).
The
rotational
change
is
most
visible
in
the
plot
of
the
components

against
the
two
factor
loadings.
Examining
the
plot,
the
clusters
of
our
sectors
still
remain
closely
related
to
each
other,
but
the
rotation

has
moved
all
of
the
components
into
the
top
right
quadrant.
As
for
the
interpretability
of
the
factor
loadings,
the
rotation
seems

to
add
some
clarity
into
what
the
factors
represent.
The
first
factor
has
larger
loadings
for
the
banking
industries
(BAC,
JPM,
and
WFC)

while
the
second
factor
has
high
positive
loadings
for
the
oil
field
services
and
moderately
high
loadings
for
the
oil
refining
industry.
The
interpretation

may
be
a
bit
clearer,
but
2
factors
are
still
less
than
the
four
sectors
we’d
expect
and
we
have
not
yet
achieved
a

simple
structure.
That
coupled
with
the
presence
of
Heywood
cases
in
the
estimation
of
the
factors
means
we
should
continue
our

search
for
an
acceptable
factor
analysis.
Maximum
Likelihood
Estimation
Factor
Analysis
with
Varimax
rotation
The
next
model
uses
maximum

likelihood
estimation
to
perform
factor
analysis
with
a
varimax
rotation.
Like
the
previous
two
models,
the
number
of
factors
is
not
provided
to
the

model
apriori
and
the
MLE
process
also
reaches
the
conclusion
that
2
factors
should
be
considered.
Unlike
the
previous
two
models,
MLE
has

a
formal
statistical
test
that
can
aide
in
the
interpretation
of
how
many
factors
to
consider.
The
test
uses

a
chi-‐square
distribution
and
has
the
null
hypothesis
that
there
are
no
common
factors
with
the
alternative
hypothesis
that
there
is
at
least
one

common
factor.
If
the
test
is
significant
at
our
predetermined
alpha
rate,
we
reject
the
null
hypothesis,
conclude
that
at
least
one
common
factor

exists,
and
iterate
through
the
tests
by
increasing
the
number
of
common
factors
in
the
null
hypothesis
until
we
reach
a
result
that

is
not
significant.
There
are
a
few
other
major
benefits
that
are
achieved
when
using
MLE
as
the
factor
estimation
tool
including
the
ability

to
formally
compare
factor
models
against
each
other
using
criteria
like
AIC
of
BIC
as
the
presence
of
the
likelihood
function
allows
these

direct
statistical
comparisons
(assuming
all
assumptions
are
met).
The
rotated
factor
loadings
produced
by
the
MLE
are
slightly
different
than
the
rotated
factor

loadings
produced
by
our
second
model
but
very
similar.
In
terms
of
interpretability,
there
is
a
similar
level
of
interpretability
of
the
factor
loadings

as
the
previous
model
since
both
models
achieve
similar
factor
loadings
after
the
rotation.
A
simple
structure
is
still
not
present
as

no
zeros
or
near
zero
values
are
present
in
the
loadings.
It
is
reassuring
to
see
that
similar
results
are
produced
from
different
techniques.

While
a
formal
bootstrap
or
cross
validation
method
should
be
employed
to
ensure
that
the
factor
loadings
are
not
specific
to
this
particular

sample,
at
the
very
least,
we’re
seeing
consistent
results
from
different
processes.
Maximum
Likelihood
Estimation
Factor
Analysis
with
rotation

Max
prior
The
final
factor
analysis
model
in
this
report
is
another
maximum
likelihood
estimation
factor
analysis
that
uses
the
orthogonal
varimax

rotation
but
uses
a
different
prior
communality.
This
approach
uses
the
largest
absolute
correlation
for
a
variable
with
any
other
variable
as
the
communality

estimate
for
the
variable.
Using
this
prior
structure,
SAS
suggests
five
factors
should
be
included
in
this
model
using
the
same
iterative
statistical

testing
framework
as
the
previous
model.
The
large
difference
in
factors
included
in
the
model
based
on
differing
prior
communality
estimates
suggests
that
the

prior
communality
has
a
large
effect
on
factor
analysis.
Looking
at
the
eigenvalues
produced
by
this
model,
there
are
fewer
Heywood

cases
than
the
previous
models
but
Heywood
cases
are
still
present.
Because
of
the
presence
of
Heywood
cases,
this
model
is

potentially
misspecified
and
should
not
be
used.
It
is
difficult
to
interpret
if
the
five-‐factor
model
fits
the
data
better
than
the
two-‐factor

model.
In
terms
of
interpretability,
the
five-‐factor
model’s
first
four
factors
align
with
our
expectations
of
seeing
a
factor
per
sector.
Note
that
we

still
have
not
reached
a
simple
structure
as
indicated
by
the
lack
of
zeros
in
the
loadings
matrix.
The
fifth
factor
is
especially

puzzling
and
difficult
to
interpret.
The
absolute
values
of
the
factor
loadings
for
this
model
are
all
below
.25
and
the
largest
three
loadings

in
the
fifth
factor
are
spread
across
our
sectors.
Conclusions:
In
this
report,
we
fit
four
different
factor
analysis

models.
Two
using
principal
factor
analysis
and
two
uses
maximum
likelihood
estimation.
None
of
our
models
perfectly
aligned
with
our
intuitive
understanding
of

the
data
being
split
into
four
sectors
though
all
models
hinted
at
the
four
sectors
either
through
clusters
in
the
factor
relationship
or
through

the
individual
factors.
All
of
our
models
had
Heywood
cases
present
meaning
that
these
models
should
not
be
used.
Perhaps
these
Heywood
cases

are
indicating
that
our
data
does
not
perfectly
align
with
the
strict
assumptions
of
the
factor
analysis
models.
Rotations
helped

increase
the
interpretability
of
our
model,
but
we
were
unable
to
find
a
simple
structure
in
any
of
the
models.
Perhaps
if
we
had

used
an
oblique
rotation
and
relaxed
the
orthogonality
requirement
enforced
by
varimax
rotations,
we
could
have
found
a
simpler
structure,
but
we
would
no

longer
be
looking
at
the
correlation
structure.
Additionally,
the
transformation
of
the
returns
to
the
log
scale
may
have
helped
our
data
fit

the
model,
though
we
never
verified
this
assumption
in
this
report.
Overall,
the
factor
analysis
process
is
interesting
to
try
and
understand
how
the

data
are
correlated
to
each
other,
though
it
did
confirm
knowledge
that
we
already
had
and
was
readily
accessible
to
us
by
simply

researching
the
companies.
It
seems
that
a
great
deal
more
time
would
be
required
to
fiddle
with
the
knobs
of
this
model
to
achieve

a
simple
structure
that
is
interpretable
and
informative
of
our
use
case.
Given
the
Heywood
cases,
I
would
not
suggest
using
any
of
the

models
developed
in
this
report.
Code:
libname
mydata
"/scs/wtm926/"
access=readonly;
/*
1
*/
data
temp;

set
mydata.stock_portfolio_data;
drop
AA
HON
MMM
DPS
KO
PEP
MPC
GS
;
run;
proc
sort
data=temp;
by
date;

run;
data
temp;
set
temp;
*
Compute
the
log-‐returs;
return_BAC
=
log(BAC/lag1(BAC));
return_BHI
=
log(BHI/lag1(BHI));
return_CVX
=
log(CVX/lag1(CVX));

return_DD
=
log(DD/lag1(DD));
return_DOW
=
log(DOW/lag1(DOW));
return_HAL
=
log(HAL/lag1(HAL));
return_HES
=
log(HES/lag1(HES));
return_HUN
=
log(HUN/lag1(HUN));
return_JPM
=
log(JPM/lag1(JPM));

return_SLB
=
log(SLB/lag1(SLB));
return_WFC
=
log(WFC/lag1(WFC));
return_XOM
=
log(XOM/lag1(XOM));
response_VV
=
log(VV/lag1(VV));
run;
data
return_data;
set

temp
(keep=
return_:);
run;
/*
2
*/
ods
graphics
on;
proc
factor

data=return_data
method=principal
priors=smc
rotate=none
plots=(all);
run;
ods
graphics
off;
/*
3
*/

ods
graphics
on;
proc
factor
data=return_data
method=principal
priors=smc
rotate=varimax
plots=(all);
run;

ods
graphics
off;
/*
4
*/
ods
graphics
on;
proc
factor
data=return_data
method=ML
priors=smc

rotate=varimax
plots=(loadings);
run;
ods
graphics
off;
/*
5
*/
ods
graphics
on;

proc
factor
data=return_data
method=ML
priors=max
rotate=varimax
plots=(loadings);
run;
ods
graphics
off;

HW03 Collinearity, Principal Components and Factor Analysis
(100 points)
Some Definitions:
Factor - A linear combination of the original variables. They
represent the
underlying dimensions that summarize or account for the
original set of observed
variables.
Factor Rotation - The process of manipulating or adjusting the
factor axes to
achieve a simpler and more meaningful factor solution.
Orthogonal - Mathematical independence, no correlation of
factor axes to each
other, at right angles or 90 degrees.

Example 33.2 Principal Factor Analysis This example uses t.docx

Example 33.2 Principal Factor Analysis This example uses t.docx

Recommended

Recommended

More Related Content

Similar to Example 33.2 Principal Factor Analysis This example uses t.docx

Similar to Example 33.2 Principal Factor Analysis This example uses t.docx (20)

More from SANSKAR20

More from SANSKAR20 (20)

Recently uploaded

Recently uploaded (20)

Example 33.2 Principal Factor Analysis This example uses t.docx