factor analysis (basics) for research .ppt

•FACTOR ANALYSIS
© Dr. Maher Khelifa

• Factor analysis is a technique used to
reduce a large number of variables to a smaller
number of factors.

• Suppose you are conducting a survey and you want
to know whether the items in the survey have similar
patterns of responses: do these items “hang
together” to form a construct? The basic
assumption of factor analysis is that, for a collection
of observed variables, there is a set
of underlying variables called factors (fewer in number than the
observed variables) that can explain the
interrelationships among those variables.

• Frequently used to develop questionnaires.
• If the task is to develop a scale, it is important to ensure
that each question relates to what the scale is intended to measure.
• Data reduction tool
• Removes redundancy or duplication from a set of correlated variables
• Represents correlated variables with a smaller set of “derived” variables.
• Factors are formed that are relatively independent of one another.
• Two types of “variables”:
• latent variables: factors
• observed variables

• Latent variables are variables that are
measured indirectly, through observable
variables.
• Observed variables are variables that are
measured directly; they are also called indicator variables.
Understanding Factor Analysis
• Factor analysis is commonly used in:
• Data reduction
• Scale development
• The evaluation of the psychometric quality of a measure,
and
• The assessment of the dimensionality of a set of variables.

• Regardless of purpose, factor analysis is used in:
• the determination of a small number of factors based on a
particular number of inter-related quantitative variables.
• Unlike directly measured variables such as speed, height, and
weight, some variables such as egoism, creativity,
happiness, religiosity, and comfort are not single
measurable entities.
• They are constructs that are derived from the
measurement of other, directly observable variables.

• Constructs are usually defined as unobservable latent variables, e.g.:
• motivation, love, hate, care, altruism, anxiety, worry, stress, product
quality, physical aptitude, democracy, reliability, power.
• Example: the construct of teaching effectiveness. Several variables
(usually several scale items) are used to measure such a construct
because the construct may include several dimensions.
• Factor analysis measures constructs that are not directly observable by
measuring several of their underlying dimensions.
• The identification of such underlying dimensions (factors) simplifies the
understanding and description of complex constructs.

• Generally, the number of factors is much smaller than the
number of measures.
• Therefore, the expectation is that a factor represents a set
of measures.
• From this angle, factor analysis is viewed as a data-reduction
technique, as it reduces a large number of
overlapping variables to a smaller set of factors that
reflect construct(s) or different dimensions of construct(s).

• The assumption of factor analysis is that underlying
dimensions (factors) can be used to explain complex
phenomena.
• Observed correlations between variables result from their
sharing of factors.
• Example: Correlations between a person’s test scores
might be linked to shared factors such as general
intelligence, critical thinking and reasoning skills, reading
comprehension, etc.
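
The idea that correlations arise from shared factors can be illustrated with a small simulation. This is only a sketch under stated assumptions: the three loadings, the sample size, and the single-factor structure are hypothetical and are not taken from the slides.

```python
import numpy as np

rng = np.random.default_rng(0)    # fixed seed so the sketch is repeatable
n = 50_000                        # hypothetical sample size

# Hypothetical loadings of three test scores on one common factor.
loadings = np.array([0.8, 0.7, 0.6])

factor = rng.standard_normal(n)          # latent factor (never observed directly)
unique = rng.standard_normal((n, 3))     # variable-specific (unique) part
# Observed score = loading * factor + unique part, scaled so each variance is 1.
scores = factor[:, None] * loadings + unique * np.sqrt(1 - loadings**2)

observed_r = np.corrcoef(scores, rowvar=False)
# Under a one-factor model the correlation of two variables is the
# product of their loadings.
implied_r = np.outer(loadings, loadings)
np.fill_diagonal(implied_r, 1.0)

print(np.round(observed_r, 2))
print(np.round(implied_r, 2))
```

The two printed matrices should be close: the only reason the simulated scores correlate is that they share the factor.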

Ingredients of a Good Factor Analysis Solution
• A major goal of factor analysis is to represent
relationships among sets of variables parsimoniously while
keeping the factors meaningful.
• A good factor solution is both simple and interpretable.
• When factors can be interpreted, new insights are
possible.

Application of Factor Analysis
• Defining indicators of constructs:
• Ideally, 4 or more measures should be chosen to represent each
construct of interest.
• The choice of measures should, as much as possible, be guided by
theory, previous research, and logic.

• Defining dimensions for an existing measure:
• In this case the variables to be analyzed are chosen by the
initial researcher and not the person conducting the analysis.
• Factor analysis is performed on a predetermined set of
items/scales.
• Results of factor analysis may not always be satisfactory:
• The items or scales may be poor indicators of the construct or
constructs.
• There may be too few items or scales to represent each
underlying dimension.

• Selecting items or scales to be included in a measure:
• Factor analysis may be conducted to determine which items or
scales should be included in, or excluded from, a measure.
• Results of the analysis should not be used alone in making
decisions about inclusion or exclusion. Decisions should be
taken in conjunction with theory and what is known
about the construct(s) that the items or scales assess.

Steps in Factor Analysis
• Factor analysis usually proceeds in four steps:
• 1st Step: the correlation matrix for all variables is computed
• 2nd Step: Factor extraction
• 3rd Step: Factor rotation
• 4th Step: Make final decisions about the number of
underlying factors

Steps in Factor Analysis: The Correlation Matrix
• 1st Step: the correlation matrix
• Generate a correlation matrix for all variables
• Identify variables not related to other variables
• If the correlations between variables are small, it is unlikely that
they share common factors (variables must be related to each
other for the factor model to be appropriate).
• Think of correlations in absolute value.
• Correlation coefficients greater than 0.3 in absolute value
indicate acceptable correlations.
• Visually examine the correlation matrix to judge the appropriateness of the factor model.
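
The screening described above can be sketched in a few lines. This is a minimal illustration, not a prescribed procedure: the helper name `weakly_related`, the simulated data, and the use of NumPy are assumptions made for the example; only the 0.3 cut-off comes from the slide.

```python
import numpy as np

def weakly_related(data, threshold=0.3):
    """Indices of columns whose largest |r| with any other column is <= threshold."""
    r = np.corrcoef(data, rowvar=False)
    off_diag = np.abs(r - np.eye(r.shape[0]))   # remove the 1s on the diagonal
    return np.where(off_diag.max(axis=0) <= threshold)[0]

# Hypothetical data: three items driven by one common source, plus one unrelated item.
rng = np.random.default_rng(1)
common = rng.standard_normal(2000)
data = np.column_stack([
    common + 0.5 * rng.standard_normal(2000),
    common + 0.5 * rng.standard_normal(2000),
    common + 0.5 * rng.standard_normal(2000),
    rng.standard_normal(2000),                  # item unrelated to the others
])

print(weakly_related(data))   # candidates to drop before factoring
```

Items flagged this way are candidates for removal, since a variable that correlates with nothing is unlikely to share a factor with the rest.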

• Bartlett's test of sphericity:
• Used to test the hypothesis that the correlation matrix is
an identity matrix.
• It is one of the statistics associated with factor analysis: a
test statistic used to examine the hypothesis that the
variables are uncorrelated in the population.
• In other words, under this hypothesis the population correlation
matrix is an identity matrix; each variable has no correlation with
the other variables (r = 0).

• Bartlett's test of sphericity tests the null
hypothesis that the variables are uncorrelated in the
population, i.e., that the population
correlation matrix is an identity matrix: each variable
correlates perfectly with itself (r = 1) but has no
correlation with the other variables (r = 0). When the
observed significance level is small (here, .0000), the
hypothesis is rejected. It is then concluded that
the variables are correlated in the population, and
it is a good idea to proceed with a factor analysis
of the data.
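
A minimal sketch of the test statistic, assuming the usual chi-square approximation, chi² = -(n - 1 - (2p + 5)/6) · ln|R| with p(p - 1)/2 degrees of freedom, where n is the sample size, p the number of variables, and R the sample correlation matrix. The function name and the simulated data are hypothetical; the slides do not give the formula.

```python
import numpy as np

def bartlett_sphericity(data):
    """Bartlett's chi-square statistic and degrees of freedom for H0: R is an identity matrix."""
    n, p = data.shape
    r = np.corrcoef(data, rowvar=False)
    _, log_det = np.linalg.slogdet(r)            # ln|R|, computed stably
    statistic = -(n - 1 - (2 * p + 5) / 6) * log_det
    df = p * (p - 1) // 2
    return statistic, df

rng = np.random.default_rng(2)
common = rng.standard_normal((500, 1))
correlated = common + 0.5 * rng.standard_normal((500, 4))   # items sharing one factor
independent = rng.standard_normal((500, 4))                 # unrelated items

stat_corr, df = bartlett_sphericity(correlated)
stat_ind, _ = bartlett_sphericity(independent)
# With 6 degrees of freedom the 5% critical value of chi-square is about 12.59.
print(f"correlated:  chi2 = {stat_corr:.1f}, df = {df}")
print(f"independent: chi2 = {stat_ind:.1f}, df = {df}")
```

To turn the statistic into a significance level, it can be passed to a chi-square survival function such as `scipy.stats.chi2.sf(statistic, df)`, if SciPy is available.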

• The Kaiser-Meyer-Olkin (KMO) measure of sampling
adequacy:
• Small values of the KMO statistic indicate that the correlations between
pairs of variables cannot be explained by other variables and that factor
analysis may not be appropriate.
• The closer the KMO measure is to 1, the greater the sampling
adequacy (.8 and higher is great, .7 is acceptable, .6 is
mediocre, less than .5 is unacceptable).
• Reasonably large values are needed for a good factor analysis.
Small KMO values indicate that a factor analysis of the
variables may not be a good idea.
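
As a rough sketch, the overall KMO can be computed from a correlation matrix by comparing squared correlations with squared partial correlations: KMO = Σr² / (Σr² + Σa²), summing over off-diagonal pairs, where a denotes the partial correlation of a pair given all other variables. This definition and the example matrix are assumptions brought in for illustration; the slides give only the interpretation.

```python
import numpy as np

def kmo(r):
    """Overall Kaiser-Meyer-Olkin measure for a correlation matrix r."""
    inv = np.linalg.inv(r)
    scale = np.sqrt(np.diag(inv))
    partial = -inv / np.outer(scale, scale)       # partial correlations (off-diagonal)
    off_diag = ~np.eye(r.shape[0], dtype=bool)    # mask selecting off-diagonal cells
    r2 = np.sum(r[off_diag] ** 2)
    a2 = np.sum(partial[off_diag] ** 2)
    return r2 / (r2 + a2)

# Hypothetical example: four variables that all correlate 0.5 with one another.
r = np.full((4, 4), 0.5)
np.fill_diagonal(r, 1.0)
print(round(kmo(r), 3))   # 0.8 for this matrix
```

For this matrix every partial correlation is 0.25, so KMO = 3 / (3 + 0.75) = 0.8, which falls in the "great" band above.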

• Communalities: the communality is the proportion of each variable's
variance that can be explained by the factors (i.e., the
underlying latent content). It is also denoted
h² and can be defined as the sum of the
squared factor loadings for that variable.
• It should be at least .40.
• Initial communalities represent the relation
between the variable and all other variables (i.e.,
the squared multiple correlation between the item
and all other items) before factors are extracted.

• Extraction communalities are estimates of the
variance in each variable accounted for by the
factors in the factor solution.
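
Both kinds of communality reduce to short calculations. The sketch below assumes a hypothetical two-factor loading matrix and a hypothetical correlation matrix; neither comes from the slides.

```python
import numpy as np

# Hypothetical loadings of four variables on two extracted factors.
loadings = np.array([
    [0.8, 0.1],
    [0.7, 0.2],
    [0.1, 0.6],
    [0.2, 0.3],
])

# Extraction communality h² = sum of squared loadings across factors, per variable.
h2 = (loadings ** 2).sum(axis=1)
low = np.where(h2 < 0.40)[0]          # variables below the .40 guideline
print(np.round(h2, 2), low)

def initial_communalities(r):
    """Squared multiple correlation of each variable with all the others."""
    return 1 - 1 / np.diag(np.linalg.inv(r))

# Hypothetical correlation matrix: four variables, all pairwise r = 0.5.
r = np.full((4, 4), 0.5)
np.fill_diagonal(r, 1.0)
print(np.round(initial_communalities(r), 3))
```

Here the third and fourth variables fall below .40, so they would be the ones to review.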

Factor Extraction
• 2nd Step: Factor extraction
• The primary objective of this stage is to determine the factors.
• Initial decisions can be made here about the number of factors
underlying a set of measured variables.
• Estimates of the initial factors are usually obtained using principal axis
factoring.
• It is the most commonly used extraction method.

Methods of factor extraction
• Principal component analysis
• Unweighted least squares
• Generalized least squares
• Maximum likelihood
• Alpha factoring
• Image factoring

Factor Extraction
• In principal components analysis, linear combinations of the observed
variables are formed.
• The 1st principal component is the combination that accounts for the
largest amount of variance in the sample (1st extracted factor).
• The 2nd principal component accounts for the next largest amount of
variance and is uncorrelated with the first (2nd extracted factor).
• Successive components explain progressively smaller portions of the total
sample variance, and all are uncorrelated with each other.
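
One common way to obtain principal components is an eigen-decomposition of the correlation matrix. The following is a sketch under that assumption; the 3 × 3 correlation matrix is hypothetical, and statistical packages may differ in details such as the sign of each component.

```python
import numpy as np

def principal_components(r):
    """Eigenvalues (largest first) and component loadings of a correlation matrix."""
    eigvals, eigvecs = np.linalg.eigh(r)      # eigh: for symmetric matrices
    order = np.argsort(eigvals)[::-1]         # sort from largest to smallest
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    loadings = eigvecs * np.sqrt(eigvals)     # scale each column by sqrt(eigenvalue)
    return eigvals, loadings

# Hypothetical correlation matrix for three variables.
r = np.array([
    [1.0, 0.6, 0.3],
    [0.6, 1.0, 0.2],
    [0.3, 0.2, 1.0],
])
eigvals, loadings = principal_components(r)
print(np.round(eigvals, 3))
print(np.round(loadings, 3))
```

The eigenvalues sum to the number of variables, and the columns of `loadings` are mutually orthogonal, which is the sense in which successive components are uncorrelated.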

Factor Extraction
• To decide how many factors we need to
represent the data, we use 2 statistical criteria:
• Eigenvalues, and
• The scree plot.
• Eigenvalue: the eigenvalue represents the
total variance explained by each factor.
• The number of factors is usually determined by
considering only factors with
eigenvalues greater than 1.
• Factors with a variance less than 1 are no better
than a single variable, since each standardized variable
has a variance of 1.
• If the number of variables is less than 20,
this approach will result in a conservative
number of factors.
Total Variance Explained

            Initial Eigenvalues                     Extraction Sums of Squared Loadings
Component   Total   % of Variance   Cumulative %    Total   % of Variance   Cumulative %
 1          3.046   30.465           30.465         3.046   30.465          30.465
 2          1.801   18.011           48.476         1.801   18.011          48.476
 3          1.009   10.091           58.566         1.009   10.091          58.566
 4           .934    9.336           67.902
 5           .840    8.404           76.307
 6           .711    7.107           83.414
 7           .574    5.737           89.151
 8           .440    4.396           93.547
 9           .337    3.368           96.915
10           .308    3.085          100.000

Extraction Method: Principal Component Analysis.
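
The columns of the Total Variance Explained table follow directly from the eigenvalues. As a small sketch (NumPy is an assumption; the eigenvalues are the ten values reported in the table), the percentages and the eigenvalue-greater-than-1 count can be recomputed like this:

```python
import numpy as np

# Eigenvalues as reported in the Total Variance Explained table.
eigenvalues = np.array([3.046, 1.801, 1.009, 0.934, 0.840,
                        0.711, 0.574, 0.440, 0.337, 0.308])

# With standardized variables the total variance equals the number of variables.
pct_variance = 100 * eigenvalues / eigenvalues.size
cumulative = np.cumsum(pct_variance)

# Eigenvalue-greater-than-1 rule.
n_factors = int(np.sum(eigenvalues > 1))

print(np.round(pct_variance, 2))
print(np.round(cumulative, 2))
print(n_factors)   # 3 components have eigenvalues above 1
```

The recomputed percentages match the table up to rounding of the printed eigenvalues, and three components are retained, which is why only three rows appear under Extraction Sums of Squared Loadings.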

Factor Extraction
• The scree plot provides a visual display
of the variance associated with each factor.
• Scree plot: a plot of the eigenvalues
against the number of factors, in order of extraction.
• The steep slope shows the large factors.
• The gradual trailing off (scree) shows the rest of the
factors, usually with eigenvalues lower than 1.
• In choosing the number of factors, in addition to the
statistical criteria, one should make initial decisions
based on conceptual and theoretical grounds.
• At this stage, the decision about the number of
factors is not final.
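
A scree plot is normally drawn with a plotting package; as a dependency-free sketch, the same shape can be seen in a text rendering. The helper `scree_lines` is hypothetical, and the eigenvalues are those reported in the Total Variance Explained table.

```python
# Eigenvalues as reported in the Total Variance Explained table.
eigenvalues = [3.046, 1.801, 1.009, 0.934, 0.840,
               0.711, 0.574, 0.440, 0.337, 0.308]

def scree_lines(values, scale=10):
    """One text bar per factor; bar length is proportional to the eigenvalue."""
    lines = []
    for number, value in enumerate(values, start=1):
        marker = " <- eigenvalue > 1" if value > 1 else ""
        lines.append(f"{number:2d} {value:5.3f} {'#' * round(value * scale)}{marker}")
    return lines

print("\n".join(scree_lines(eigenvalues)))

# Drop from each eigenvalue to the next: large drops form the steep slope,
# small drops form the scree.
drops = [a - b for a, b in zip(eigenvalues, eigenvalues[1:])]
print([round(d, 3) for d in drops])
```

The first two drops are much larger than the rest, after which the bars trail off gradually.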

• A Priori Determination. Sometimes, because of prior
knowledge, the researcher knows how many factors to
expect and thus can specify the number of factors to be
extracted beforehand.
• Determination Based on Percentage of Variance. In
this approach the number of factors extracted is
determined so that the cumulative percentage of variance
extracted by the factors reaches a satisfactory level. It is
recommended that the factors extracted should account
for at least 60% of the variance.
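
The percentage-of-variance rule can be applied mechanically to the cumulative percentages. The sketch below uses the Cumulative % column of the Total Variance Explained table; the variable names and the use of NumPy are assumptions.

```python
import numpy as np

# Cumulative % of variance as reported in the Total Variance Explained table.
cumulative_pct = np.array([30.465, 48.476, 58.566, 67.902, 76.307,
                           83.414, 89.151, 93.547, 96.915, 100.000])

threshold = 60.0   # the satisfactory level recommended above
# Index of the first component at which the cumulative % reaches the threshold.
n_factors = int(np.argmax(cumulative_pct >= threshold)) + 1

print(n_factors, cumulative_pct[n_factors - 1])
```

For these data the rule asks for four factors (67.902%), one more than the three components with eigenvalues above 1 (58.566%), which illustrates why the criteria are weighed together rather than applied in isolation.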

Factor Rotation
• 3rd Step: Factor rotation.
• In this step, factors are rotated.
• Unrotated factors are typically not very interpretable.
• Factors are rotated to make them more meaningful and easier to
interpret.
• In rotating the factors, we would like each factor to have
nonzero, or significant, loadings or coefficients for only
some of the variables. Likewise, we would like each
variable to have nonzero or significant loadings with only
a few factors, if possible with only one.
• Different rotation methods may result in the identification of
somewhat different factors.

Factor Rotation
• The most popular rotational method is Varimax rotation, an
orthogonal method. It minimizes the number of variables with high
loadings on a factor, thereby enhancing the interpretability of the factors.
This rotation results in factors that are uncorrelated.
• The rotation is called oblique when the axes are not
maintained at right angles and the factors are correlated.
Sometimes, allowing for correlations among factors can simplify
the factor pattern matrix. Oblique rotation should be used
when factors in the population are likely to be strongly
correlated.
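
For readers who want to see what an orthogonal rotation does numerically, here is a sketch of a widely used iterative SVD formulation of Varimax. It is an illustration only: the unrotated loadings are hypothetical, the function is not taken from the slides, and statistical packages typically add refinements such as Kaiser normalization.

```python
import numpy as np

def varimax(loadings, max_iter=100, tol=1e-8):
    """Rotate a loading matrix with an SVD-based Varimax iteration (no normalization)."""
    p, k = loadings.shape
    rotation = np.eye(k)
    previous = 0.0
    for _ in range(max_iter):
        rotated = loadings @ rotation
        # Gradient-like target of the varimax criterion for the current rotation.
        target = rotated ** 3 - rotated * (rotated ** 2).sum(axis=0) / p
        u, s, vt = np.linalg.svd(loadings.T @ target)
        rotation = u @ vt                 # nearest orthogonal matrix
        current = s.sum()
        if previous > 0 and current < previous * (1 + tol):
            break                         # criterion has stopped improving
        previous = current
    return loadings @ rotation, rotation

# Hypothetical unrotated loadings: six variables, two factors.
unrotated = np.array([
    [0.7,  0.4], [0.7,  0.3], [0.6,  0.4],
    [0.5, -0.5], [0.6, -0.5], [0.5, -0.6],
])
rotated, rotation = varimax(unrotated)
print(np.round(rotated, 2))
```

Because the rotation matrix is orthogonal, each variable's communality (its row sum of squared loadings) is the same before and after rotation; only the distribution of the loadings across factors changes.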

Factor Rotation
• Oblique rotations are less frequently used because their results
are more difficult to summarize.
• Other rotational methods include:
• Quartimax (Orthogonal)
• Equamax (Orthogonal)
• Promax (Oblique)

Making Final Decisions
• 4th Step: Making final decisions
• The final decision is to choose the number of factors
whose rotated solution is most interpretable.
• To identify factors, group variables that have large loadings for the same factor.
• By examining the factor matrix, one could select for each factor the variable with the
highest loading on that factor. That variable could then be used as a surrogate variable for
the associated factor.
• However, the choice is not as easy if two or more variables have similarly high loadings. In
such a case, the choice between these variables should be based on theoretical and
measurement considerations.
• Plots of loadings provide a visual for variable clusters.
• Interpret factors according to the meaning of the variables
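
Grouping variables by their largest loading, and picking a surrogate variable per factor, can be sketched as follows. The rotated loadings and item names are hypothetical and serve only to show the mechanics.

```python
import numpy as np

# Hypothetical rotated loadings: five items, two factors.
names = ["item1", "item2", "item3", "item4", "item5"]
rotated = np.array([
    [0.82, 0.10],
    [0.75, 0.15],
    [0.68, 0.22],
    [0.12, 0.79],
    [0.20, 0.71],
])
n_factors = rotated.shape[1]

# Assign each variable to the factor on which it loads most strongly.
assigned = np.abs(rotated).argmax(axis=1)
groups = {f: [names[i] for i in np.where(assigned == f)[0]]
          for f in range(n_factors)}

# Surrogate variable: the variable with the highest loading on each factor.
surrogates = {f: names[int(np.abs(rotated[:, f]).argmax())]
              for f in range(n_factors)}

print(groups)
print(surrogates)
```

When two items load almost equally on a factor, the argmax still returns one of them, so the choice should be checked against theoretical and measurement considerations, as noted above.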

• This decision should be guided by:
• A priori conceptual beliefs about the number of factors from past
research or theory.
• Eigenvalues computed in step 2.
• The relative interpretability of rotated solutions computed in step 3.
