EXPLORATORY FACTOR
ANALYSIS (EFA)
Kalle Lyytinen & James Gaskin
Learning Objectives
1. Understand what the factor analysis technique
is and how it is applied in research
2. Discuss exploratory factor analysis (EFA)
3. Run EFA with SPSS and interpret the resulting
output
4. Briefly estimate reliability
5. Briefly assess construct validity
The whole works
Theory Constructs
Items linked to constructs
EFA
Collect data
Build/Run Structural Model
Modify the Measurement Model
Link items to constructs; Label constructs
Test structural hypotheses
Conduct CFA Without CMB
Conduct CFA With CMB
Conduct Multi-group CFA
Goodness of fit & psychometric properties filter
Data cleaning filter
Modify the Structural Model
Goodness of fit filter
Contribute to theory
Analyzing the factor structure of the multi-item data
Family Tree of SEM
T-test
Latent Growth Curve Analysis
ANOVA
Multi-way ANOVA
Repeated Measure Designs
Growth Curve Analysis
Bivariate Correlation
Multiple Regression
Path Analysis
Structural Equation Modeling
Factor Analysis
Confirmatory Factor Analysis
Exploratory Factor Analysis
Source: PIRE
Is the difference between samples on a variable significant?
Is the correlation between different variables significant?
Multiple samples, multiple variables, over time, etc.
Multiple variables, overall model, measurement model, etc.
SCOPE of Factor Analysis today
 Factor analysis and principal component analysis
 Carrying out the analyses in SPSS
 Deciding on the number of factors
 Rotating factors
 Producing factor and component scores
 Assumptions and sample size
 Exploratory and confirmatory FA
Types of Measurement Models
 Exploratory (EFA)
 Confirmatory (CFA)
 Multitrait-Multimethod (MTMM)
 Hierarchical CFA
EFA vs. CFA
 Exploratory Factor Analysis
is concerned with how
many factors are necessary
to explain the relations
among a set of indicators
and with estimation of factor
loadings. It is associated
with theory development.
 Confirmatory Factor
Analysis is concerned with
determining whether the number of
factors “conforms” to what is
expected on the basis of
pre-established theory: do
items load as predicted on
the expected number of
factors? The number of
factors is hypothesized
beforehand.
CONTENT:
1. Does the system provide the precise information you need?
2. Does the information content meet your needs?
3. Does the system provide reports that seem to be just about exactly what you need?
4. Does the system provide sufficient information?
ACCURACY:
1. Is the system accurate?
2. Are you satisfied with the accuracy of the system?
FORMAT:
1. Do you think the output is presented in a useful format?
2. Is the information clear?
EASE OF USE:
1. Is the system user friendly?
2. Is the system easy to use?
TIMELINESS:
1. Do you get the information you need in time?
2. Does the system provide up-to-date information?
End-User Computing Satisfaction (EUCS)
EUCS: An instrument for measuring satisfaction with an information system
Factor Analysis
 Factor Analysis is a method for identifying a structure (or
factors, or dimensions) that underlies the relations
among a set of observed variables.
 Factor analysis is a technique that transforms the
correlations among a set of observed variables into a
smaller number of underlying factors, which contain all
the essential information about the linear
interrelationships among the original test scores.
 Factor analysis is a statistical procedure that involves the
relationship between observed variables
(measurements) and the underlying latent factors.
Factor Analysis
 Factor analysis is a fundamental component of Structural
Equation modeling.
 Factor analysis explores the inter-relationships among
variables to discover if those variables can be grouped
into a smaller set of underlying factors.
Many variables are “reduced” (grouped) into a smaller
number of factors
These variables reflect the causal impact of the “latent”
underlying factors
 Statistical technique for dealing with multiple variables
Explore data for patterns.
Often a researcher is unclear whether items or variables have discernible patterns. Factor Analysis
can be done in an exploratory fashion to reveal
patterns among the inter-relationships of the items.
Data Reduction.
Factor analysis can be used to reduce a large number of variables into a smaller and more
manageable number of factors. Factor analysis can create factor scores for each subject that
represent these higher order variables.
Factor Analysis can be used to reduce a large number of variables into a parsimonious set of a few
factors that account better for the underlying variance (causal impact) in the measured
phenomenon.
Confirm Hypothesis of Factor Structure.
Factor Analysis can be used to test whether a set of items designed to measure a certain
variable(s) do, in fact, reveal the hypothesized factor structure (i.e. whether the underlying latent
factor truly “causes” the variance in the observed variables and how “certain” we can be about it).
In measurement research when a researcher wishes to validate a scale with a given or
hypothesized factor structure, Confirmatory Factor Analysis is used.
Theory Testing.
Factor Analysis can be used to test a priori hypotheses about the relations among a set of
observed variables.
Applications of Factor Analysis
How would you group these Items?
In EFA, the researcher
is attempting to explore
the relationships among
items to determine if the
items can be grouped
into a smaller number of
underlying factors.
In this analysis, all items
are assumed to be related
to all factors.
Exploratory Factor Analysis
Factorial Solution
[Diagram: items V1 to V4, each with an error term ε, and two factors; labels: Factor Loading, Item, Cross-Loading?]
Measured Variables or
Indicators:
These variables are those
that the researcher has
observed or measured.
In this example, they are
the four items on the scale.
Note, they are drawn as
rectangles or squares.
Unmeasured or Latent
Variables:
These variables are not directly
measurable; the researcher only
has indicators of them.
They are often the more
interesting variables, but they are
more difficult to measure
(e.g., self-efficacy).
In this example, the latent variables
are the two factors.
Note, they are drawn as ellipses.
Factor Loadings:
Measure the relationship between
the items and the factors.
Factor loadings can be interpreted
like correlation coefficients;
ranging between -1.0 and +1.0.
The closer the value is to 1.0,
positive or negative, the stronger
the relationship between the factor
and the item.
Loadings can be both positive
or negative.
Factor Loadings:
Note the direction of the arrows;
the factors are thought to
influence the indicators, not
vice versa.
Each item is being predicted by
the factors.
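As a sketch of what the arrows express, each item can be written as a weighted sum of the factors plus an error term. The notation is an assumption for illustration and does not appear in the slides: x_i is item i, F_1 and F_2 are the two factors, λ_ij is the loading of item i on factor j, and ε_i is the error of item i.

```latex
x_i = \lambda_{i1} F_1 + \lambda_{i2} F_2 + \varepsilon_i, \qquad i = 1, \dots, 4
```

Read this way, a loading near 0 means the factor contributes little to the item, and a loading near ±1 means the factor largely determines it.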
Errors in Measurement:
Each of the indicator variables has
some error in measurement.
The small circles with the ε indicate
the error.
The error is composed of 'we know
not what': influences that are not
measured directly.
These errors in measurement are
used to estimate the reliability
of each indicator variable.
Multi-Indicator Approach
 A multiple-indicator approach reduces the
overall effect of measurement error of any
individual observed variable on the accuracy of
the results
 A distinction is made between observed
variables (indicators) and underlying latent
variables or factors (constructs)
 Together the observed variables and the latent
variables make up the measurement model
Conceptual Model
This model holds that there are two uncorrelated factors
that explain the relationships among the six emotion variables.
Factor (Latent): Positive Affect; Variables (Observed): Awe, Joy, Happiness
Factor (Latent): Negative Affect; Variables (Observed): Guilt, Fear, Sadness
Measurement Model
Items        Positive Affect (Factor 1)   Negative Affect (Factor 2)
Joy          Loading*                     0
Awe          Loading                      0
Happiness    Loading                      0
Fear         0                            Loading
Guilt        0                            Loading
Sadness      0                            Loading
*The loading is a data-driven parameter that estimates the relationship
(correlation) between an observed item and a latent factor.
The data matrix must have a sufficient number of correlations.
Variables must be inter-related in some way, since factor analysis
seeks the underlying common dimensions among the variables. If
the variables are not related, each variable will be its own factor!
Rule of thumb: a substantial number of correlations greater than .30
Metric variables are assumed, although dummy variables may be
used (coded 0,1).
The factors or unobserved variables are assumed to be independent
of one another. All variables in a factor analysis must be measured on at
least an ordinal scale. Nominal data are not appropriate for factor
analysis.
Assumptions of Factor Analysis
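The .30 rule of thumb can be checked mechanically. The following is a minimal sketch, assuming the correlation matrix is available as a nested list; the function name and the 4-item matrix are hypothetical and serve only to illustrate the count.

```python
from itertools import combinations


def share_of_correlations_above(corr, cutoff=0.30):
    """Return the fraction of off-diagonal correlations with |r| > cutoff."""
    pairs = list(combinations(range(len(corr)), 2))
    strong = sum(1 for i, j in pairs if abs(corr[i][j]) > cutoff)
    return strong / len(pairs)


# Hypothetical correlation matrix for four items (illustrative values only).
corr = [
    [1.00, 0.55, 0.42, 0.10],
    [0.55, 1.00, 0.38, 0.05],
    [0.42, 0.38, 1.00, 0.12],
    [0.10, 0.05, 0.12, 1.00],
]
share = share_of_correlations_above(corr)
print(f"{share:.2f} of item pairs correlate above .30")
```

Here the fourth item correlates weakly with everything else, which is the kind of variable that tends to end up as its own factor.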
Quick Quips about Factor Analysis
How many cases? Rule of 10: 10 cases for every item; rule of
100: the number of respondents should be the larger of (1) 5 times
the number of variables or (2) 100.
How many variables do I need for FA? The more the better (at least 3)
Is normality of data required? Nope
Is it necessary to standardize one's variables before FA? Nope
Can you pool data from two samples together in a FA? Yep, but you
must show they have the same factor structure.
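The two sample-size rules quoted above reduce to simple arithmetic. A minimal sketch, with a hypothetical function name, for a 12-item instrument:

```python
def minimum_sample_size(n_items):
    """Apply the rule of 10 and the rule of 100 to a number of items."""
    rule_of_10 = 10 * n_items                 # 10 cases for every item
    rule_of_100 = max(5 * n_items, 100)       # larger of 5 x items or 100
    return {"rule_of_10": rule_of_10, "rule_of_100": rule_of_100}


sizes = minimum_sample_size(12)
print(sizes)
```

The two rules can disagree; which one to follow is a judgment call that the slides leave open.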
Two statistics on the SPSS output allow you to look at some of the
basic assumptions.
Kaiser-Meyer-Olkin (KMO) Measure of Sampling Adequacy, and
Bartlett's Test of Sphericity
The Kaiser-Meyer-Olkin Measure of Sampling Adequacy generally indicates
whether or not the variables can be grouped into a smaller
set of underlying factors. That is, will the data factor well?
KMO varies from 0 to 1 and should be .60 or higher to proceed (.50 can be
used as a more lenient cut-off).
High values (close to 1.0) generally indicate that a factor analysis may
be useful with your data.
If the value is less than .50, the results of the factor analysis probably
won't be very useful.
Tests for Basic Assumptions
Kaiser-Meyer-Olkin (KMO)
 Marvelous - - - - - - .90s
 Meritorious - - - - - .80s
 Middling - - - - - - - .70s
 Mediocre - - - - - - - .60s
 Miserable - - - - - - .50s
 Unacceptable - - - below .50
KMO Statistics: Interpreting the Output
In this example, the data support the use of factor analysis and suggest that the
data may be grouped into a smaller set of underlying factors.
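SPSS reports KMO directly, but it can help to see where the number comes from. The sketch below assumes the commonly cited definition (sum of squared correlations divided by the sum of squared correlations plus squared partial correlations, off-diagonal elements only) and uses NumPy; the function names and the 3-variable matrix are hypothetical.

```python
import numpy as np


def kmo(corr):
    """Overall KMO measure computed from a correlation matrix."""
    corr = np.asarray(corr, dtype=float)
    inverse = np.linalg.inv(corr)
    # Partial correlation of each pair, holding all other variables constant.
    scale = np.sqrt(np.outer(np.diag(inverse), np.diag(inverse)))
    partial = -inverse / scale
    off_diagonal = ~np.eye(corr.shape[0], dtype=bool)
    squared_r = np.sum(corr[off_diagonal] ** 2)
    squared_partial = np.sum(partial[off_diagonal] ** 2)
    return squared_r / (squared_r + squared_partial)


def kmo_label(value):
    """Map a KMO value to the verbal scale listed in the slides."""
    scale = [(0.90, "marvelous"), (0.80, "meritorious"), (0.70, "middling"),
             (0.60, "mediocre"), (0.50, "miserable")]
    for floor, label in scale:
        if value >= floor:
            return label
    return "unacceptable"


# Hypothetical matrix: three variables that all correlate .50 with each other.
corr = [[1.0, 0.5, 0.5], [0.5, 1.0, 0.5], [0.5, 0.5, 1.0]]
value = kmo(corr)
print(round(value, 3), kmo_label(value))
```

KMO rises when the partial correlations are small relative to the raw correlations, that is, when the shared variance is spread across many variables rather than confined to pairs.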
What does Bartlett’s Test of Sphericity explore?
Correlation Matrix
 Bartlett's Test of Sphericity
Tests hypothesis that correlation matrix is an
identity matrix.
 Diagonals are ones
 Off-diagonals are zeros
Significant result indicates matrix is not an
identity matrix.
Bartlett’s Test of Sphericity
Bartlett’s Test of Sphericity compares your correlation matrix to an identity matrix.
An identity matrix is a correlation matrix with 1.0 on the principal diagonal and
zeros in all other correlations. So clearly you want your Bartlett value to be
significant, as you are expecting relationships between your variables if a factor
analysis is going to be appropriate!
A problem with Bartlett’s test occurs with large n’s: even small correlations tend to be
statistically significant, so the test may not mean much!
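The large-n problem is easy to see numerically. The sketch below assumes the usual chi-square approximation for Bartlett's test, χ² = -(n - 1 - (2p + 5)/6) ln|R| with p(p - 1)/2 degrees of freedom; the function name and the weakly correlated matrix are hypothetical, and the p-value lookup is left out to keep the example dependent on NumPy only.

```python
import math

import numpy as np


def bartlett_sphericity(corr, n_obs):
    """Return (chi-square statistic, degrees of freedom) for Bartlett's test."""
    corr = np.asarray(corr, dtype=float)
    p = corr.shape[0]
    _, log_det = np.linalg.slogdet(corr)  # natural log of |R|
    chi_square = -(n_obs - 1 - (2 * p + 5) / 6) * log_det
    dof = p * (p - 1) // 2
    return chi_square, dof


# Hypothetical matrix: three variables that correlate only .05 with each other.
weak = [[1.00, 0.05, 0.05], [0.05, 1.00, 0.05], [0.05, 0.05, 1.00]]
small_chi, dof = bartlett_sphericity(weak, n_obs=100)
large_chi, _ = bartlett_sphericity(weak, n_obs=10_000)

# 7.81 is the .05 critical value of chi-square with 3 degrees of freedom.
print(f"n=100:    chi-square = {small_chi:.2f} (df = {dof})")
print(f"n=10000:  chi-square = {large_chi:.2f} (df = {dof})")
```

The correlations are identical in both runs; only the sample size changes, yet the statistic moves from below to well above the critical value.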
Two Extraction Methods
 Principal Component Analysis (PCA)
 Considers all of the available variance (common + unique) (places 1’s on the diagonal of the
correlation matrix).
 Seeks a linear combination of variables such that maximum variance is extracted, then repeats
this step.
 Use when the concern is prediction or parsimony and the specific and error variance are known
to be small.
 Results in orthogonal (uncorrelated) factors.
 Principal Axis Factoring (PAF) or Common Factor Analysis
• Considers only common variance (places communality estimates on the diagonal of the correlation
matrix).
• Seeks the least number of factors that can account for the common variance (correlation) of a
set of variables.
• PAF analyzes only common factor variability, removing the uniqueness or unexplained
variability from the model.
 PAF is preferred in SEM because it accounts for co-variation, whereas PCA accounts for total
variance.
Methods of Factor Extraction
 Principal-axis factoring (PAF)
diagonals replaced by estimates of
communalities
iterative process
continues until negligible changes in
communalities
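The iterative process above can be sketched in a few lines of NumPy. This is an illustration under stated assumptions, not SPSS's implementation: the starting communalities are squared multiple correlations (a common choice), the function name and tolerance are hypothetical, and the correlation matrix is constructed from made-up one-factor loadings so the expected answer is known.

```python
import numpy as np


def principal_axis_factoring(corr, n_factors, max_iter=200, tol=1e-6):
    """Iterated principal axis factoring on a correlation matrix."""
    corr = np.asarray(corr, dtype=float)
    # Starting communalities: squared multiple correlation of each variable.
    communalities = 1.0 - 1.0 / np.diag(np.linalg.inv(corr))
    for _ in range(max_iter):
        reduced = corr.copy()
        np.fill_diagonal(reduced, communalities)  # replace the 1.0 diagonal
        eigvals, eigvecs = np.linalg.eigh(reduced)
        order = np.argsort(eigvals)[::-1][:n_factors]
        top_vals = np.clip(eigvals[order], 0.0, None)
        loadings = eigvecs[:, order] * np.sqrt(top_vals)
        updated = np.sum(loadings ** 2, axis=1)
        converged = np.max(np.abs(updated - communalities)) < tol
        communalities = updated
        if converged:  # negligible change in communalities
            break
    return loadings, communalities


# Hypothetical one-factor data: correlations implied by these loadings.
true_loadings = np.array([0.8, 0.7, 0.6, 0.5])
corr = np.outer(true_loadings, true_loadings)
np.fill_diagonal(corr, 1.0)

loadings, communalities = principal_axis_factoring(corr, n_factors=1)
print(np.round(np.abs(loadings[:, 0]), 3))
print(np.round(communalities, 3))
```

The sign of an extracted factor is arbitrary, which is why the example prints absolute loadings.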
What is a Common Factor?
 It is an abstraction, a hypothetical
construct that affects at least two of our
measurement variables.
 We want to estimate the common factors
that contribute to the variance in our
variables.
 Is this an act of discovery or an act of
invention?
What is a Unique Factor?
 It is a factor that contributes to the
variance in only one variable.
 There is one unique factor for each
variable.
 The unique factors are unrelated to one
another and unrelated to the common
factors.
 We want to exclude these unique factors
from our solution.
Comparison of Extraction Models
 PCA vs. PAF
Factor loadings and eigenvalues are a little
larger with Principal Components
One may always obtain a solution with
Principal Components
Often little practical difference
FYI: other less-used extraction methods (image, alpha, ML, ULS, GLS factoring)
Principal Components Extraction
 A communality (C) is the extent to which an item correlates
with all other items.
Thus, in PCA extraction method when the initial
communalities are set to 1.0, then all of the variability of
each item is accounted for in the analysis.
 Of course some of the variability is explained and some is
unexplained.
 In PCA with these initial communalities set to 1.0, you are
trying to find both the common factor variance and the
unique or error variance.
Principal Components Extraction
 Statisticians have indicated that the assumption that all of the variability of
the items, whether explained or unique, can be accounted for in the
analysis is flawed, and that PCA extraction therefore should not be used in an
exploratory factor model.
 Some researchers suggest PAF as the appropriate method for
factor extraction using EFA.
 In PAF extraction, the amount of variability each item shares with all
other items is determined and this value is inserted into the
correlation matrix replacing the 1.0 on the diagonals. As a result,
PAF is only analyzing common factor variability; removing the
uniqueness or unexplained variability from the model.
Factor Rotation: Orthogonal
 Varimax (most common)
 pushes the loadings on a factor toward either high or low values, minimizing the
number of variables with high loadings on a factor, which makes it possible to
identify a variable with a factor
 Quartimax
 minimizes the number of factors needed to explain each
variable. Tends to generate a general factor on which most
variables load with medium to high values; not helpful for research
 Equimax
 combination of Varimax and Quartimax
Q&A:
Why use a rotation method? Rotation causes factor loadings to be more
clearly differentiated, which is necessary to facilitate interpretation
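To make the effect of rotation concrete, here is a sketch of a widely used SVD-based iteration for Varimax. It is an illustration, not the routine SPSS runs: the function name, the stopping rule, and the unrotated loading matrix are all hypothetical, and Kaiser normalization is omitted.

```python
import numpy as np


def varimax(loadings, max_iter=100, tol=1e-6):
    """Orthogonally rotate a loading matrix toward the Varimax criterion."""
    loadings = np.asarray(loadings, dtype=float)
    n_items, n_factors = loadings.shape
    rotation = np.eye(n_factors)
    criterion = 0.0
    for _ in range(max_iter):
        rotated = loadings @ rotation
        column_ss = np.diag(np.sum(rotated ** 2, axis=0))
        target = loadings.T @ (rotated ** 3 - rotated @ column_ss / n_items)
        u, s, vt = np.linalg.svd(target)
        rotation = u @ vt  # nearest orthogonal matrix to the target
        new_criterion = s.sum()
        if new_criterion < criterion * (1 + tol):
            break
        criterion = new_criterion
    return loadings @ rotation


# Hypothetical unrotated loadings: every item loads on both factors.
unrotated = np.array([
    [0.70, 0.40], [0.65, 0.45], [0.72, 0.38],
    [0.60, -0.50], [0.55, -0.52], [0.62, -0.48],
])
rotated = varimax(unrotated)
print(np.round(rotated, 2))
```

Because the rotation is orthogonal, each item's communality (its row sum of squared loadings) is unchanged; only the way that variance is divided between the factors shifts.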
Non-orthogonal (oblique)
The real issue is that you don’t have a basis for knowing how many
factors there are or what they are, much less whether they are
correlated! Researchers assume variables are indicators of two or
more factors, a measurement model which implies orthogonal
rotation.
 Direct oblimin (DO)
Factors are allowed to be correlated. Diminished interpretability
 Promax
Computationally faster than DO
Used for large datasets
Oblique Rotation
The variables are assessed for the unique
relationship between each factor and the
variables (removing relationships that are
shared by multiple factors)
The matrix of unique relationships is called
the pattern matrix.
The pattern matrix is treated like the
loading matrix in orthogonal rotation.
Decisions to be made
 EXTRACTION:
PCA vs PAF
 ROTATION:
Orthogonal or Oblique (non-orthogonal)
Procedures for Factor Analysis
 Multiple different statistical procedures exist by which the
appropriate number of factors can be identified.
 These procedures are called "Extraction Methods."
 By default SPSS does PCA extraction.
 This Principal Components Method is simpler and until
recently was considered the appropriate method for Exploratory
Factor Analysis.
 Statisticians now advocate a different extraction method due to a
flaw in the approach that Principal Components uses for
extraction.
What else?
 How many factors do you extract?
One convention is to extract all factors with
eigenvalues greater than 1 (e.g. PCA)
Another is to extract all factors with non-
negative eigenvalues
Yet another is to look at the scree plot
Number based on theory
Try multiple numbers and see what gives
best interpretation.
Total Variance Explained
        Initial Eigenvalues                  Extraction Sums of Squared Loadings  Rotation Sums of Squared Loadings
Factor  Total   % of Variance  Cumulative %  Total   % of Variance  Cumulative %  Total   % of Variance  Cumulative %
1       3.513   29.276         29.276        3.296   27.467         27.467        3.251   27.094         27.094
2       3.141   26.171         55.447        2.681   22.338         49.805        1.509   12.573         39.666
3       1.321   11.008         66.455        .843    7.023          56.828        1.495   12.455         52.121
4       .801    6.676          73.132        .329    2.745          59.573        .894    7.452          59.573
5       .675    5.623          78.755
6       .645    5.375          84.131
7       .527    4.391          88.522
8       .471    3.921          92.443
9       .342    2.851          95.294
10      .232    1.936          97.231
11      .221    1.841          99.072
12      .111    .928           100.000
Extraction Method: Principal Axis Factoring.
Eigenvalues greater than 1
Scree Plot
[Plot: eigenvalue (y-axis, 0 to 4) against factor number (x-axis, 1 to 12)]
Three Factor Solution
Criteria For Retention Of Factors
 Eigenvalue greater than 1
A single standardized variable has variance equal to 1, so a retained factor
should account for more variance than one variable
 Plot of total variance - Scree plot
Gradual trailing off of variance accounted for
is called the scree.
 Note cumulative % of variance of rotated
factors
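The eigenvalue criterion can be applied directly to the initial eigenvalues listed under Total Variance Explained. A minimal sketch; the function names are hypothetical, and the eigenvalues are copied from the slides.

```python
# Initial eigenvalues for the 12 items, as reported in the slides.
initial_eigenvalues = [3.513, 3.141, 1.321, 0.801, 0.675, 0.645,
                       0.527, 0.471, 0.342, 0.232, 0.221, 0.111]


def kaiser_retained(eigenvalues, cutoff=1.0):
    """Count the factors whose eigenvalue exceeds the cutoff."""
    return sum(1 for value in eigenvalues if value > cutoff)


def cumulative_percent(eigenvalues):
    """Cumulative percentage of total variance after each factor."""
    total = sum(eigenvalues)
    running, result = 0.0, []
    for value in eigenvalues:
        running += value
        result.append(100.0 * running / total)
    return result


n_retained = kaiser_retained(initial_eigenvalues)
cumulative = cumulative_percent(initial_eigenvalues)
print(n_retained, round(cumulative[n_retained - 1], 1))
```

With 12 standardized items the eigenvalues sum to 12, so each percentage is simply the eigenvalue divided by 12. Small differences from the SPSS table in the last decimal place come from the eigenvalues being rounded.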
Interpretation of Rotated Matrix
 Loadings of .40 or higher
 Name each factor based on 3 or 4
variables with highest loadings.
 Do not expect perfect conceptual fit of all
variables.
Loading size based on sample
(from Hair et al 2010 Table 3-2)
Significant Factor Loadings based on Sample Size
Sample Size Sufficient Factor Loading
50 0.75
60 0.70
70 0.65
85 0.60
100 0.55
120 0.50
150 0.45
200 0.40
250 0.35
350 0.30
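The table can be turned into a lookup. This sketch makes one assumption that the table itself does not state: for a sample size that falls between two rows, it uses the row for the next smaller sample size, which gives the stricter loading. The names are hypothetical.

```python
# (sample size, sufficient loading) pairs from the table, largest sample first.
SUFFICIENT_LOADING = [
    (350, 0.30), (250, 0.35), (200, 0.40), (150, 0.45), (120, 0.50),
    (100, 0.55), (85, 0.60), (70, 0.65), (60, 0.70), (50, 0.75),
]


def sufficient_loading(sample_size):
    """Return the loading threshold for a sample size, or None below 50."""
    for size, loading in SUFFICIENT_LOADING:
        if sample_size >= size:
            return loading
    return None  # the table gives no guidance for samples smaller than 50


for n in (60, 130, 200, 400):
    print(n, sufficient_loading(n))
```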
What else?
 How do you know when the factor
structure is good?
When it makes sense and has a (relatively)
simple and clean structure.
Total Variance Explained > .60
 How do you interpret factors?
Good question, that is where the true art of
this comes in.
Why EFA?
EDM 643 51
Reflective versus Formative
Diet (Reflective)
 R1. I eat healthy food.
 R2. I do not eat much
junk food.
 R3. I have a balanced
diet.
Health (Formative)
 F1. I have a balanced diet
 F2. I exercise regularly
 F3. I get sufficient sleep
each night
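[Diagram: Diet (reflective) with indicators R1, R2, R3 and error terms e1, e2, e3; Health (formative) with indicators F1, F2, F3 and a single error term]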
Diet (Reflective)
 Direction of causality is from
construct to measure
 Measures expected to be
correlated
 Indicators are
interchangeable
Health (Formative)
 Direction of causality is from
measure to construct
 No reason to expect the
measures are correlated
 Indicators are not
interchangeable
*From Jarvis et al 2003
Adequacy
 Residuals ≤ 5%
 KMO ≥ 0.8 is better
 Communalities ≥ 0.5 is better
Validity
 Face Validity (do they make sense?)
 Pattern Matrix
 Convergent (high loadings)
 Discriminant (no cross-loadings)
 Factor Correlations
 ≤.7 is better
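The convergent and discriminant checks on a pattern matrix can be sketched as a simple screen. The .40 primary cutoff follows the "loadings of .40 or higher" guideline in the slides; the .30 cross-loading cutoff, the function name, and the example loadings are assumptions made for illustration.

```python
def check_pattern_matrix(pattern, primary_cutoff=0.40, cross_cutoff=0.30):
    """Flag weak primary loadings and cross-loadings for each item."""
    report = {}
    for item, loadings in pattern.items():
        magnitudes = sorted((abs(value) for value in loadings), reverse=True)
        primary = magnitudes[0]
        secondary = magnitudes[1] if len(magnitudes) > 1 else 0.0
        report[item] = {
            "convergent": primary >= primary_cutoff,    # loads well somewhere
            "cross_loading": secondary >= cross_cutoff,  # loads on a second factor too
        }
    return report


# Hypothetical two-factor pattern matrix (item -> loadings on each factor).
pattern = {
    "item1": [0.78, 0.05],
    "item2": [0.45, 0.41],
    "item3": [0.22, 0.18],
}
report = check_pattern_matrix(pattern)
for item, flags in report.items():
    print(item, flags)
```

An item like item2 passes the convergent check but fails the discriminant one, and item3 fails to load clearly on either factor; both are candidates for a closer look before the factor structure is accepted.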
Reliability
 Split data and do two EFAs
 Cronbach’s Alpha (>.70) for each factor
SPSS: Scale -> Reliability Analysis
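For readers who want to see what SPSS computes, Cronbach's alpha follows the standard formula α = k/(k − 1) × (1 − Σ item variances / variance of the total score). A minimal sketch using only the standard library; the function name and the six respondents' scores are hypothetical.

```python
from statistics import variance


def cronbach_alpha(items):
    """Cronbach's alpha; `items` holds one list of scores per item."""
    k = len(items)
    sum_item_variances = sum(variance(scores) for scores in items)
    totals = [sum(values) for values in zip(*items)]  # total score per respondent
    return k / (k - 1) * (1 - sum_item_variances / variance(totals))


# Hypothetical scores of six respondents on the three items of one factor.
factor_items = [
    [4, 5, 3, 4, 2, 5],
    [4, 4, 3, 5, 2, 5],
    [5, 5, 2, 4, 3, 4],
]
alpha = cronbach_alpha(factor_items)
print(f"alpha = {alpha:.3f}, acceptable: {alpha > 0.70}")
```

Alpha is computed separately for each factor, using only the items that load on it.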
EMERCE - 2024 - AMSTERDAM - CROSS-PLATFORM TRACKING WITH GOOGLE ANALYTICS.pptx
 
Effects of Smartphone Addiction on the Academic Performances of Grades 9 to 1...
Effects of Smartphone Addiction on the Academic Performances of Grades 9 to 1...Effects of Smartphone Addiction on the Academic Performances of Grades 9 to 1...
Effects of Smartphone Addiction on the Academic Performances of Grades 9 to 1...
 
NLP Data Science Project Presentation:Predicting Heart Disease with NLP Data ...
NLP Data Science Project Presentation:Predicting Heart Disease with NLP Data ...NLP Data Science Project Presentation:Predicting Heart Disease with NLP Data ...
NLP Data Science Project Presentation:Predicting Heart Disease with NLP Data ...
 
办理学位证纽约大学毕业证(NYU毕业证书)原版一比一
办理学位证纽约大学毕业证(NYU毕业证书)原版一比一办理学位证纽约大学毕业证(NYU毕业证书)原版一比一
办理学位证纽约大学毕业证(NYU毕业证书)原版一比一
 
专业一比一美国俄亥俄大学毕业证成绩单pdf电子版制作修改
专业一比一美国俄亥俄大学毕业证成绩单pdf电子版制作修改专业一比一美国俄亥俄大学毕业证成绩单pdf电子版制作修改
专业一比一美国俄亥俄大学毕业证成绩单pdf电子版制作修改
 
Deep Generative Learning for All - The Gen AI Hype (Spring 2024)
Deep Generative Learning for All - The Gen AI Hype (Spring 2024)Deep Generative Learning for All - The Gen AI Hype (Spring 2024)
Deep Generative Learning for All - The Gen AI Hype (Spring 2024)
 
办理学位证中佛罗里达大学毕业证,UCF成绩单原版一比一
办理学位证中佛罗里达大学毕业证,UCF成绩单原版一比一办理学位证中佛罗里达大学毕业证,UCF成绩单原版一比一
办理学位证中佛罗里达大学毕业证,UCF成绩单原版一比一
 
毕业文凭制作#回国入职#diploma#degree澳洲中央昆士兰大学毕业证成绩单pdf电子版制作修改#毕业文凭制作#回国入职#diploma#degree
毕业文凭制作#回国入职#diploma#degree澳洲中央昆士兰大学毕业证成绩单pdf电子版制作修改#毕业文凭制作#回国入职#diploma#degree毕业文凭制作#回国入职#diploma#degree澳洲中央昆士兰大学毕业证成绩单pdf电子版制作修改#毕业文凭制作#回国入职#diploma#degree
毕业文凭制作#回国入职#diploma#degree澳洲中央昆士兰大学毕业证成绩单pdf电子版制作修改#毕业文凭制作#回国入职#diploma#degree
 
Kantar AI Summit- Under Embargo till Wednesday, 24th April 2024, 4 PM, IST.pdf
Kantar AI Summit- Under Embargo till Wednesday, 24th April 2024, 4 PM, IST.pdfKantar AI Summit- Under Embargo till Wednesday, 24th April 2024, 4 PM, IST.pdf
Kantar AI Summit- Under Embargo till Wednesday, 24th April 2024, 4 PM, IST.pdf
 

Factor analysis (1)

  • 1. EXPLORATORY FACTOR ANALYSIS (EFA) Kalle Lyytinen & James Gaskin
  • 2. Learning Objectives: 1. Understand what the factor analysis technique is and how it is applied in research. 2. Discuss exploratory factor analysis (EFA). 3. Run EFA with SPSS and interpret the resulting output. 4. Briefly estimate reliability. 5. Briefly assess construct validity.
  • 3. The whole works: Theory Constructs | Items linked to constructs | EFA | Collect data | Build/Run Structural Model | Modify the Measurement Model | Link items to constructs; Label constructs | Test structural hypotheses | Conduct CFA Without CMB | Conduct CFA With CMB | Conduct Multi-group CFA | Goodness of fit & psychometric properties filter | Data cleaning filter | Modify the Structural Model | Goodness of fit filter | Contribute to theory | Analyzing the factor structure of the multi-item data
  • 4. Family Tree of SEM (Source: PIRE): T-test; ANOVA; Multi-way ANOVA; Repeated Measure Designs; Growth Curve Analysis; Latent Growth Curve Analysis; Bivariate Correlation; Multiple Regression; Path Analysis; Structural Equation Modeling; Factor Analysis; Exploratory Factor Analysis; Confirmatory Factor Analysis. Questions addressed: Is the difference between samples on a variable significant? Is the correlation between different variables significant? Multiple samples, multiple variables, over time, etc. Multiple variables, overall model, measurement model, etc.
  • 5. SCOPE of Factor Analysis today: Factor analysis and principal component analysis; Carrying out the analyses in SPSS; Deciding on the number of factors; Rotating factors; Producing factor and component scores; Assumptions and sample size; Exploratory and confirmatory FA
  • 6. Types of Measurement Models: Exploratory (EFA); Confirmatory (CFA); Multitrait-Multimethod (MTMM); Hierarchical CFA
  • 7. EFA vs. CFA. Exploratory Factor Analysis is concerned with how many factors are necessary to explain the relations among a set of indicators and with estimation of factor loadings. It is associated with theory development. Confirmatory Factor Analysis is concerned with determining whether the number of factors “conforms” to what is expected on the basis of pre-established theory. Do items load as predicted on the expected number of factors? The number of factors is hypothesized beforehand.
  • 8. End-User Computing Satisfaction (EUCS): An instrument for measuring satisfaction with an information system. CONTENT: 1. Does the system provide the precise information you need? 2. Does the information content meet your needs? 3. Does the system provide reports that seem to be just about exactly what you need? 4. Does the system provide sufficient information? ACCURACY: 1. Is the system accurate? 2. Are you satisfied with the accuracy of the system? FORMAT: 1. Do you think the output is presented in a useful format? 2. Is the information clear? EASE OF USE: 1. Is the system user friendly? 2. Is the system easy to use? TIMELINESS: 1. Do you get the information you need in time? 2. Does the system provide up-to-date information?
  • 9. Factor Analysis. Factor analysis is a method for identifying a structure (or factors, or dimensions) that underlies the relations among a set of observed variables. Factor analysis is a technique that transforms the correlations among a set of observed variables into a smaller number of underlying factors, which contain all the essential information about the linear interrelationships among the original test scores. Factor analysis is a statistical procedure that involves the relationship between observed variables (measurements) and the underlying latent factors.
  • 10. Factor Analysis. Factor analysis is a fundamental component of Structural Equation Modeling. Factor analysis explores the inter-relationships among variables to discover whether those variables can be grouped into a smaller set of underlying factors. Many variables are “reduced” (grouped) into a smaller number of factors. These variables reflect the causal impact of the “latent” underlying factors. It is a statistical technique for dealing with multiple variables.
  • 11. Applications of Factor Analysis. Explore data for patterns: Often a researcher is unclear whether items or variables have a discernible pattern. Factor analysis can be done in an exploratory fashion to reveal patterns among the inter-relationships of the items. Data reduction: Factor analysis can be used to reduce a large number of variables into a smaller and more manageable number of factors. It can create factor scores for each subject that represent these higher-order variables, and it can reduce a large number of variables into a parsimonious set of a few factors that account better for the underlying variance (causal impact) in the measured phenomenon. Confirm hypothesis of factor structure: Factor analysis can be used to test whether a set of items designed to measure a certain variable (or variables) does, in fact, reveal the hypothesized factor structure (i.e. whether the underlying latent factor truly “causes” the variance in the observed variables and how “certain” we can be about it). In measurement research, when a researcher wishes to validate a scale with a given or hypothesized factor structure, Confirmatory Factor Analysis is used. Theory testing: Factor analysis can be used to test a priori hypotheses about the relations among a set of observed variables.
  • 12. How would you group these Items?
  • 13. Exploratory Factor Analysis. In EFA, the researcher is attempting to explore the relationships among items to determine if the items can be grouped into a smaller number of underlying factors. In this analysis, all items are assumed to be related to all factors. [Diagram: items V1 to V4, each with an error term ε, and two factors]
  • 15. Exploratory Factor Analysis. Measured Variables or Indicators: These variables are those that the researcher has observed or measured. In this example, they are the four items on the scale. Note that they are drawn as rectangles or squares. [Diagram: items V1 to V4, each with an error term ε, and two factors]
  • 16. Exploratory Factor Analysis. Unmeasured or Latent Variables: These variables are not directly measurable; rather, the researcher only has indicators of them. They are often the more interesting, but more difficult, variables to measure (e.g., self-efficacy). In this example, the latent variables are the two factors. Note that they are drawn as ellipses. [Diagram: items V1 to V4, each with an error term ε, and two factors]
  • 17. Exploratory Factor Analysis. Factor Loadings: Measure the relationship between the items and the factors. Factor loadings can be interpreted like correlation coefficients, ranging between -1.0 and +1.0. The closer the value is to 1.0, positive or negative, the stronger the relationship between the factor and the item. Loadings can be either positive or negative. [Diagram: items V1 to V4, each with an error term ε, and two factors]
  • 18. Exploratory Factor Analysis. Factor Loadings: Note the direction of the arrows; the factors are thought to influence the indicators, not vice versa. Each item is being predicted by the factors. [Diagram: items V1 to V4, each with an error term ε, and two factors]
  • 19. Exploratory Factor Analysis. Errors in Measurement: Each of the indicator variables has some error in measurement. The small circles with the ε indicate the error. The error is composed of 'we know not what', that is, influences that are not measured directly. These errors in measurement are considered the reliability estimates for each indicator variable. [Diagram: items V1 to V4, each with an error term ε, and two factors]
  • 20. Multi-Indicator Approach. A multiple-indicator approach reduces the overall effect of measurement error of any individual observed variable on the accuracy of the results. A distinction is made between observed variables (indicators) and underlying latent variables or factors (constructs). Together the observed variables and the latent variables make up the measurement model.
  • 21. Conceptual Model. Variables (observed) and their factors (latent): Joy, Awe and Happiness load on Positive Affect; Fear, Guilt and Sadness load on Negative Affect. This model holds that there are two uncorrelated factors that explain the relationships among the six emotion variables.
  • 22. Measurement Model
        Item        Positive Affect (Factor 1)   Negative Affect (Factor 2)
        Joy         Loading*                     0
        Awe         Loading                      0
        Happiness   Loading                      0
        Fear        0                            Loading
        Guilt       0                            Loading
        Sadness     0                            Loading
    *The loading is a data-driven parameter that estimates the relationship (correlation) between an observed item and a latent factor.
  • 23. Assumptions of Factor Analysis. The data matrix must have a sufficient number of correlations. Variables must be inter-related in some way, since factor analysis seeks the underlying common dimensions among the variables. If the variables are not related, each variable will be its own factor! Rule of thumb: a substantial number of correlations greater than .30. Metric variables are assumed, although dummy variables may be used (coded 0,1). The factors or unobserved variables are assumed to be independent of one another. All variables in a factor analysis must be measured on at least an ordinal scale. Nominal data are not appropriate for factor analysis.
  • 24. Quick Quips about Factor Analysis. How many cases? Rule of 10: 10 cases for every item; rule of 100: the number of respondents should be the larger of (1) 5 times the number of variables or (2) 100. How many variables do I need to FA? The more the better (at least 3). Is normality of data required? Nope. Is it necessary to standardize one's variables before FA? Nope. Can you pool data from two samples together in a FA? Yep, but you must show they have the same factor structure.
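The two sample-size rules of thumb on slide 24 reduce to simple arithmetic. The sketch below is only an illustration: the function names are made up for this example, and the 12-item count assumes the EUCS instrument from slide 8 is the scale being analyzed.

```python
def min_cases_rule_of_10(n_items: int) -> int:
    """Rule of 10: ten cases for every item."""
    return 10 * n_items


def min_cases_rule_of_100(n_variables: int) -> int:
    """Rule of 100: the larger of 5 cases per variable or 100 cases."""
    return max(5 * n_variables, 100)


# Assumption for illustration: the 12 EUCS items listed on slide 8.
n_items = 12
print("Rule of 10 :", min_cases_rule_of_10(n_items))
print("Rule of 100:", min_cases_rule_of_100(n_items))
```

The two rules can disagree; taking the larger of the two results is the more cautious reading.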
  • 25. Tests for Basic Assumptions. Two statistics on the SPSS output allow you to look at some of the basic assumptions: the Kaiser-Meyer-Olkin (KMO) Measure of Sampling Adequacy and Bartlett's Test of Sphericity. The KMO measure generally indicates whether or not the variables can be grouped into a smaller set of underlying factors; that is, will the data factor well? KMO varies from 0 to 1 and should be .60 or higher to proceed (.50 can be used as a more lenient cut-off). High values (close to 1.0) generally indicate that a factor analysis may be useful with your data. If the value is less than .50, the results of the factor analysis probably won't be very useful.
  • 26. Kaiser-Meyer-Olkin (KMO): Marvelous: .90s; Meritorious: .80s; Middling: .70s; Mediocre: .60s; Miserable: .50s; Unacceptable: below .50
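The slides read KMO off the SPSS output, but the statistic itself can be sketched from a correlation matrix. The example below uses the standard definition (sum of squared off-diagonal correlations divided by that sum plus the sum of squared partial correlations) and NumPy; the function name and the one-factor demo matrix are assumptions made for illustration, not data from the deck.

```python
import numpy as np


def kmo(R):
    """Overall Kaiser-Meyer-Olkin measure for a correlation matrix R."""
    R = np.asarray(R, dtype=float)
    inv = np.linalg.inv(R)
    # Partial correlations between each pair, controlling for all other variables.
    d = np.sqrt(np.diag(inv))
    partial = -inv / np.outer(d, d)
    off_diag = ~np.eye(R.shape[0], dtype=bool)
    r2 = np.sum(R[off_diag] ** 2)        # squared correlations
    p2 = np.sum(partial[off_diag] ** 2)  # squared partial correlations
    return r2 / (r2 + p2)


# Hypothetical correlation matrix implied by one factor with these loadings.
demo_loadings = np.array([0.8, 0.7, 0.6, 0.5])
R_demo = np.outer(demo_loadings, demo_loadings)
np.fill_diagonal(R_demo, 1.0)

kmo_demo = kmo(R_demo)
print(f"KMO = {kmo_demo:.3f}")
```

The resulting value can then be read against the labels on slide 26.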
  • 27. KMO Statistics: Interpreting the Output. In this example, the data support the use of factor analysis and suggest that the data may be grouped into a smaller set of underlying factors. What does Bartlett’s Test of Sphericity explore?
  • 28. Correlation Matrix. Bartlett's Test of Sphericity tests the hypothesis that the correlation matrix is an identity matrix (diagonals are ones; off-diagonals are zeros). A significant result indicates that the matrix is not an identity matrix.
  • 29. Bartlett’s Test of Sphericity. Bartlett’s Test of Sphericity compares your correlation matrix to an identity matrix. An identity matrix is a correlation matrix with 1.0 on the principal diagonal and zeros in all other correlations. So clearly you want your Bartlett value to be significant, as you are expecting relationships between your variables if a factor analysis is going to be appropriate! A problem with Bartlett’s test occurs with large n, as small correlations tend to be statistically significant, so the test may not mean much!
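The comparison against an identity matrix can be sketched numerically. The example below computes the usual Bartlett chi-square statistic, -(n - 1 - (2p + 5)/6) * ln|R| with p(p - 1)/2 degrees of freedom, using NumPy. The function name, the sample size of 200 and the demo matrix are assumptions for illustration; the p-value (which SPSS reports) would come from a chi-square distribution and is left out here.

```python
import numpy as np


def bartlett_sphericity(R, n):
    """Bartlett's test statistic and degrees of freedom for correlation matrix R
    estimated from n cases."""
    R = np.asarray(R, dtype=float)
    p = R.shape[0]
    _, logdet = np.linalg.slogdet(R)  # log of the determinant of R
    chi2 = -(n - 1 - (2 * p + 5) / 6.0) * logdet
    df = p * (p - 1) // 2
    return chi2, df


# Hypothetical one-factor correlation matrix and sample size.
demo_loadings = np.array([0.8, 0.7, 0.6, 0.5])
R_demo = np.outer(demo_loadings, demo_loadings)
np.fill_diagonal(R_demo, 1.0)

chi2_demo, df_demo = bartlett_sphericity(R_demo, n=200)
chi2_identity, _ = bartlett_sphericity(np.eye(4), n=200)
print(f"correlated items: chi2 = {chi2_demo:.1f}, df = {df_demo}")
print(f"identity matrix : chi2 = {chi2_identity:.1f}")
```

Because the multiplier grows with n, the statistic becomes large for big samples even when correlations are small, which is the caveat slide 29 raises.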
  • 30. Two Extraction Methods. Principal Component Analysis (PCA): Considers all of the available variance, common plus unique (places 1’s on the diagonal of the correlation matrix). Seeks a linear combination of variables such that maximum variance is extracted, then repeats this step. Use when the concern is prediction and parsimony and the researcher knows that specific and error variance are small. Results in orthogonal (uncorrelated) factors. Principal Axis Factoring (PAF), or Common Factor Analysis: Considers only common variance (places communality estimates on the diagonal of the correlation matrix). Seeks the least number of factors that can account for the common variance (correlation) of a set of variables. PAF analyzes only common factor variability, removing the uniqueness or unexplained variability from the model. PAF is preferred in SEM because it accounts for co-variation, whereas PCA accounts for total variance.
  • 31. Methods of Factor Extraction. Principal-axis factoring (PAF): diagonals are replaced by estimates of communalities; the iterative process continues until changes in the communalities are negligible.
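The iterative process on slide 31 can be sketched in a few lines of NumPy. This is a minimal illustration, not the SPSS implementation: it assumes squared multiple correlations as the starting communalities, and the function name, tolerance and demo matrix are chosen for this example.

```python
import numpy as np


def principal_axis_factoring(R, n_factors, max_iter=1000, tol=1e-8):
    """Iterated principal axis factoring on a correlation matrix R."""
    R = np.asarray(R, dtype=float)
    # Starting communalities (assumption): squared multiple correlations.
    h2 = 1.0 - 1.0 / np.diag(np.linalg.inv(R))
    for _ in range(max_iter):
        reduced = R.copy()
        np.fill_diagonal(reduced, h2)            # communalities on the diagonal
        eigvals, eigvecs = np.linalg.eigh(reduced)
        top = np.argsort(eigvals)[::-1][:n_factors]
        lam = np.clip(eigvals[top], 0.0, None)   # guard against tiny negatives
        loadings = eigvecs[:, top] * np.sqrt(lam)
        new_h2 = np.sum(loadings ** 2, axis=1)
        converged = np.max(np.abs(new_h2 - h2)) < tol
        h2 = new_h2
        if converged:                            # negligible change: stop
            break
    # Sign convention: make each factor's loadings sum to a positive number.
    signs = np.sign(loadings.sum(axis=0))
    signs[signs == 0] = 1.0
    return loadings * signs, h2


# Hypothetical one-factor correlation matrix built from known loadings.
true_loadings = np.array([0.8, 0.7, 0.6, 0.5])
R_demo = np.outer(true_loadings, true_loadings)
np.fill_diagonal(R_demo, 1.0)

paf_loadings, paf_h2 = principal_axis_factoring(R_demo, n_factors=1)
print(np.round(paf_loadings[:, 0], 3))
print(np.round(paf_h2, 3))
```

On this constructed matrix the recovered loadings should sit close to the ones used to build it, since the data follow a one-factor model exactly.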
  • 32. What is a Common Factor? It is an abstraction, a hypothetical construct that affects at least two of our measurement variables. We want to estimate the common factors that contribute to the variance in our variables. Is this an act of discovery or an act of invention?
  • 33. What is a Unique Factor? It is a factor that contributes to the variance in only one variable. There is one unique factor for each variable. The unique factors are unrelated to one another and unrelated to the common factors. We want to exclude these unique factors from our solution.
  • 34. Comparison of Extraction Models: PCA vs. PAF. Factor loadings and eigenvalues are a little larger with Principal Components. One may always obtain a solution with Principal Components. Often there is little practical difference. FYI: other, less-used extraction methods include image, alpha, ML, ULS and GLS factoring.
  • 35. Principal Components Extraction. A communality (C) is the extent to which an item correlates with all other items. Thus, in the PCA extraction method, when the initial communalities are set to 1.0, all of the variability of each item is accounted for in the analysis. Of course, some of the variability is explained and some is unexplained. In PCA, with these initial communalities set to 1.0, you are trying to find both the common factor variance and the unique or error variance.
  • 36. Principal Components Extraction. Statisticians have indicated that assuming all of the variability of the items, whether explained or unique, can be accounted for in the analysis is flawed, and that this assumption definitely should not be used in an exploratory factor model. Some researchers suggest PAF as the appropriate method for factor extraction in EFA. In PAF extraction, the amount of variability each item shares with all other items is determined, and this value is inserted into the correlation matrix, replacing the 1.0 on the diagonals. As a result, PAF analyzes only common factor variability, removing the uniqueness or unexplained variability from the model.
  • 37. Factor Rotation: Orthogonal. Varimax (most common): minimizes the number of variables with high (or low) loadings on a factor, which makes it possible to identify a variable with a factor. Quartimax: minimizes the number of factors needed to explain each variable; tends to generate a general factor on which most variables load with medium to high values, which is not helpful for research. Equimax: a combination of Varimax and Quartimax. Q&A: Why use a rotation method? Rotation causes factor loadings to be more clearly differentiated, which is necessary to facilitate interpretation.
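To make the effect of an orthogonal rotation concrete, here is a minimal NumPy sketch of raw varimax using the common SVD-based iteration. It is an illustration under stated assumptions: the function name and the unrotated loading matrix are hypothetical, and SPSS additionally applies Kaiser normalization by default, which this sketch omits.

```python
import numpy as np


def varimax(loadings, max_iter=100, tol=1e-8):
    """Raw varimax rotation; returns rotated loadings and the rotation matrix."""
    L = np.asarray(loadings, dtype=float)
    p, k = L.shape
    T = np.eye(k)           # start from no rotation
    criterion = 0.0
    for _ in range(max_iter):
        rotated = L @ T
        # Gradient of the varimax criterion with respect to the rotation.
        col_ss = np.sum(rotated ** 2, axis=0)
        grad = L.T @ (rotated ** 3 - rotated * (col_ss / p))
        u, s, vt = np.linalg.svd(grad)
        T = u @ vt          # nearest orthogonal matrix to the gradient
        new_criterion = np.sum(s)
        if new_criterion < criterion * (1 + tol):
            break           # no meaningful improvement
        criterion = new_criterion
    return L @ T, T


# Hypothetical unrotated loadings: six items, two factors.
unrotated = np.array([
    [0.7, 0.4], [0.6, 0.5], [0.7, 0.3],
    [0.5, -0.6], [0.6, -0.5], [0.4, -0.6],
])
rotated, T = varimax(unrotated)
print(np.round(rotated, 2))
```

Because the rotation matrix is orthogonal, each item's communality (row sum of squared loadings) is unchanged; only the way the variance is distributed across factors differs.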
  • 38. Non-orthogonal (oblique) rotation. The real issue is that you don’t have a basis for knowing how many factors there are or what they are, much less whether they are correlated! Researchers assume variables are indicators of two or more factors, a measurement model which implies orthogonal rotation. Direct oblimin (DO): factors are allowed to be correlated; diminished interpretability. Promax: computationally faster than DO; used for large datasets.
  • 39. Oblique Rotation. The variables are assessed for the unique relationship between each factor and the variables (removing relationships that are shared by multiple factors). The matrix of unique relationships is called the pattern matrix. The pattern matrix is treated like the loading matrix in orthogonal rotation.
  • 40. Decisions to be made. EXTRACTION: PCA vs PAF. ROTATION: Orthogonal or Oblique (non-orthogonal).
  • 41. Procedures for Factor Analysis. Multiple statistical procedures exist by which the appropriate number of factors can be identified. These procedures are called "Extraction Methods." By default SPSS does PCA extraction. The Principal Components method is simpler and, until recently, was considered the appropriate method for Exploratory Factor Analysis. Statisticians now advocate a different extraction method because of a flaw in the approach that Principal Components uses for extraction.
  • 42. What else? How many factors do you extract? One convention is to extract all factors with eigenvalues greater than 1 (e.g. PCA). Another is to extract all factors with non-negative eigenvalues. Yet another is to look at the scree plot. The number can also be based on theory. Try multiple numbers and see what gives the best interpretation.
  • 43. Total Variance Explained (Extraction Method: Principal Axis Factoring; callout: eigenvalues greater than 1)
                 Initial Eigenvalues            Extraction Sums of Squared Loadings   Rotation Sums of Squared Loadings
        Factor   Total   % of Var.   Cum. %     Total   % of Var.   Cum. %            Total   % of Var.   Cum. %
        1        3.513   29.276      29.276     3.296   27.467      27.467            3.251   27.094      27.094
        2        3.141   26.171      55.447     2.681   22.338      49.805            1.509   12.573      39.666
        3        1.321   11.008      66.455      .843    7.023      56.828            1.495   12.455      52.121
        4         .801    6.676      73.132      .329    2.745      59.573             .894    7.452      59.573
        5         .675    5.623      78.755
        6         .645    5.375      84.131
        7         .527    4.391      88.522
        8         .471    3.921      92.443
        9         .342    2.851      95.294
        10        .232    1.936      97.231
        11        .221    1.841      99.072
        12        .111     .928     100.000
  • 44. Scree Plot [Figure: eigenvalue (y-axis, 0 to 4) plotted against factor number (x-axis, 1 to 12); annotated "Three Factor Solution"]
  • 45. Criteria For Retention Of Factors. Eigenvalue greater than 1: a single variable has variance equal to 1. Plot of total variance (scree plot): the gradual trailing off of variance accounted for is called the scree. Note the cumulative % of variance of the rotated factors.
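The eigenvalue-greater-than-1 criterion can be checked directly against the initial eigenvalues reported on slide 43. The short sketch below uses only those twelve values; the helper names are made up for illustration.

```python
# Initial eigenvalues as reported in the "Total Variance Explained" table (slide 43).
eigenvalues = [3.513, 3.141, 1.321, 0.801, 0.675, 0.645,
               0.527, 0.471, 0.342, 0.232, 0.221, 0.111]


def kaiser_retained(eigs):
    """Number of factors with an eigenvalue greater than 1."""
    return sum(1 for e in eigs if e > 1.0)


def cumulative_percent(eigs):
    """Cumulative percentage of total variance after each factor."""
    total = sum(eigs)
    running, out = 0.0, []
    for e in eigs:
        running += e
        out.append(100.0 * running / total)
    return out


n_keep = kaiser_retained(eigenvalues)
cum = cumulative_percent(eigenvalues)
print("Factors retained:", n_keep)
print(f"Cumulative % of variance at that point: {cum[n_keep - 1]:.1f}")
```

With these values the criterion keeps three factors, which agrees with the "Three Factor Solution" label on the scree plot slide; the cumulative percentage differs from the table's 66.455 only in the second decimal because the eigenvalues are rounded.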
  • 46. Interpretation of Rotated Matrix. Loadings of .40 or higher. Name each factor based on the 3 or 4 variables with the highest loadings. Do not expect a perfect conceptual fit of all variables.
  • 47. Loading size based on sample: Significant Factor Loadings based on Sample Size (from Hair et al 2010, Table 3-2)
        Sample Size   Sufficient Factor Loading
        50            0.75
        60            0.70
        70            0.65
        85            0.60
        100           0.55
        120           0.50
        150           0.45
        200           0.40
        250           0.35
        350           0.30
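The table on slide 47 can be turned into a small lookup. The sketch below encodes exactly the ten rows shown; how to treat sample sizes that fall between rows is not stated on the slide, so using the nearest tabulated size at or below n (the stricter threshold) is an assumption of this example, as is the function name.

```python
# (sample size, sufficient factor loading) pairs from slide 47.
HAIR_TABLE = [
    (50, 0.75), (60, 0.70), (70, 0.65), (85, 0.60), (100, 0.55),
    (120, 0.50), (150, 0.45), (200, 0.40), (250, 0.35), (350, 0.30),
]


def sufficient_loading(n):
    """Loading threshold for sample size n, or None if n is below the table."""
    threshold = None
    for size, loading in HAIR_TABLE:
        if n >= size:
            threshold = loading  # keep the last row whose size does not exceed n
    return threshold


for n in (40, 90, 200, 1000):
    print(n, sufficient_loading(n))
```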
  • 48. What else? How do you know when the factor structure is good? When it makes sense and has a (relatively) simple and clean structure; Total Variance Explained > .60. How do you interpret factors? Good question; that is where the true art of this comes in.
  • 51. Reflective versus Formative (EDM 643). Diet (Reflective): R1. I eat healthy food. R2. I do not eat much junk food. R3. I have a balanced diet. Health (Formative): F1. I have a balanced diet. F2. I exercise regularly. F3. I get sufficient sleep each night. [Diagram: Diet with indicators R1, R2, R3 and errors e1, e2, e3; Health with indicators F1, F2, F3 and error e3]
  • 52. Diet (Reflective): Direction of causality is from construct to measure; measures are expected to be correlated; indicators are interchangeable. Health (Formative): Direction of causality is from measure to construct; no reason to expect the measures to be correlated; indicators are not interchangeable. *From Jarvis et al 2003. [Diagram: Diet with indicators R1, R2, R3 and errors e1, e2, e3; Health with indicators F1, F2, F3 and error e3]
  • 53. Adequacy: Residuals ≤ 5%; KMO ≥ 0.8 is better; Communalities ≥ 0.5 is better
  • 54. Validity: Face Validity (do they make sense?); Pattern Matrix: Convergent (high loadings), Discriminant (no cross-loadings); Factor Correlations: ≤ .7 is better
  • 55. Reliability: Split the data and do two EFAs; Cronbach’s Alpha (> .70) for each factor. SPSS: Scale > Reliability Analysis
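Outside SPSS, Cronbach's alpha for the items of one factor can be sketched with the standard formula, alpha = k/(k - 1) * (1 - sum of item variances / variance of the total score). The example uses only the Python standard library; the function name and the response data are hypothetical.

```python
from statistics import variance


def cronbach_alpha(items):
    """Cronbach's alpha.

    items: one list of scores per item, all in the same respondent order.
    """
    k = len(items)
    totals = [sum(scores) for scores in zip(*items)]  # total score per respondent
    item_variance_sum = sum(variance(scores) for scores in items)
    return (k / (k - 1)) * (1 - item_variance_sum / variance(totals))


# Hypothetical responses: three items of one factor, five respondents.
demo_items = [
    [4, 5, 3, 4, 2],
    [4, 4, 3, 5, 2],
    [5, 4, 2, 4, 1],
]
alpha_demo = cronbach_alpha(demo_items)
print(f"Cronbach's alpha = {alpha_demo:.3f}")
```

The result would be compared with the > .70 guideline on slide 55, computed separately for each factor's items.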