Factor analysis in R by Aman Chauhan

Factor Analysis
• Factor analysis is a useful tool for investigating relationships among variables for complex
concepts such as socioeconomic status, dietary patterns, or psychological scales.
• It allows researchers to investigate concepts that are not easily measured directly
by collapsing a large number of variables into a few interpretable underlying
factors.
• Unlike regression or multiple regression, it has no dependent or independent
variables; all variables are analyzed together.

What is a factor
• The key concept of factor analysis is that multiple observed variables have similar patterns of
responses because they are all associated with a latent (i.e. not directly measured) variable.
• For example, people may respond similarly to questions about income, education, and occupation,
which are all associated with the latent variable socioeconomic status.
• In simple words, the common underlying dimensions shared by a group of variables are known as factors.
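
The idea of a latent variable can be illustrated with a small simulation. This is only a sketch: the loadings (0.8, 0.7, 0.6), the noise levels, and the sample size are made-up values, and the variable names simply mirror the socioeconomic status example.

```r
# Simulate one latent variable and three observed indicators (all values are invented)
set.seed(42)
n <- 500
ses <- rnorm(n)   # latent socioeconomic status, never observed directly

# Each observed variable = loading * latent factor + its own unique noise
income     <- 0.8 * ses + rnorm(n, sd = 0.6)
education  <- 0.7 * ses + rnorm(n, sd = 0.7)
occupation <- 0.6 * ses + rnorm(n, sd = 0.8)

observed <- data.frame(income, education, occupation)

# The indicators correlate with each other only because they share the latent factor
round(cor(observed), 2)
```

Factor analysis works in the opposite direction: it starts from a correlation matrix like this one and tries to recover the factor behind it.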

Why we use Factor Analysis
• To reduce the number of variables; it is a data summarization technique.
• Factor analysis examines the interrelationships among a large number of variables
and, on the basis of those relationships, groups many variables under a few
common (similar) underlying dimensions.
• For example, if we have a large amount of student data, we can group the variables
into a few categories:
1. Academic 2. Sports 3. Cultural

Understanding factor analysis
Regardless of purpose, factor analysis is used to determine a small number of
factors from a set of interrelated quantitative variables.

Assumptions in FA
• Variables must be interrelated (correlated with one another)
• Sample size should be at least 50, preferably 100 or more
• Observations per variable: at least 5, preferably 10
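
These rules of thumb can be checked before running an analysis. The sketch below uses the nine non-binary columns of R's built-in mtcars data as a stand-in; the 0.3 correlation cutoff is an assumed convention, not something the slides specify.

```r
# Numeric mtcars columns (drops the binary vs and am columns)
dat <- mtcars[, c(1:7, 10, 11)]

n_obs <- nrow(dat)
n_var <- ncol(dat)
obs_per_var <- n_obs / n_var

# Rule-of-thumb checks from the list above
c(sample_size_ok = n_obs >= 50, ratio_ok = obs_per_var >= 5)

# Interrelatedness: share of variable pairs with |r| > 0.3 (assumed cutoff)
r <- cor(dat)
off_diag <- r[upper.tri(r)]
mean(abs(off_diag) > 0.3)
```

With only 32 rows, mtcars falls short of both sample-size guidelines, so any analysis of it is best read as an illustration rather than a reliable result.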

Issues
• Choosing between Principal Component Analysis (PCA) and Common Factor Analysis (Factor Analysis)

Basic difference between PCA and FA
• PCA: the total variance is analyzed, that is, shared (common) variance, unique variance,
and error variance.
• FA: only the shared variance is analyzed, that is, the part of each variable's variance
that it has in common with the other variables.
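
One way to see this difference is to look at the matrix each method decomposes. The sketch below assumes the classic principal-axis view of common factor analysis, with squared multiple correlations (SMC) as initial communality estimates; this is one common choice and not necessarily what any particular R function does internally. It again uses the numeric mtcars columns as stand-in data.

```r
dat <- mtcars[, c(1:7, 10, 11)]
r <- cor(dat)

# PCA: the diagonal of the correlation matrix is 1 for every variable,
# so the eigenvalues add up to the total variance (one unit per variable)
sum(eigen(r)$values)

# Common FA (principal-axis idea): replace the diagonal with an estimate of
# each variable's shared variance, here the squared multiple correlation
smc <- 1 - 1 / diag(solve(r))
r_reduced <- r
diag(r_reduced) <- smc

# Only this smaller amount of variance is analyzed
sum(diag(r_reduced))
```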

Performing PCA
We will use the built-in dataset mtcars, which has 32 observations of 11 variables.
It gives 11 features such as ‘miles per gallon’, ‘number of cylinders’, and ‘horsepower’
for 32 different car models. Two of the variables are binary (categorical).
The first is ‘vs’, which records the engine shape (0 = V-shaped, 1 = straight).
The second is ‘am’, which records the transmission type (0 = automatic,
1 = manual).
We leave these two variables out of the analysis because PCA is designed for
continuous numeric data.
We compute the principal components with the prcomp() function, centering and
scaling each variable first.

Code:
 mtcars.pca <- prcomp(mtcars[, c(1:7, 10, 11)], center = TRUE, scale. = TRUE)
 summary(mtcars.pca)
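
The proportions that summary() reports can be reproduced by hand from the component standard deviations. This is a short sketch; the eigenvalue-greater-than-1 cutoff (the Kaiser criterion) is an assumed rule of thumb that the slides do not mention.

```r
mtcars.pca <- prcomp(mtcars[, c(1:7, 10, 11)], center = TRUE, scale. = TRUE)

# Variance of each component = squared standard deviation
eigenvalues <- mtcars.pca$sdev^2
prop_var <- eigenvalues / sum(eigenvalues)

# Same quantities summary() labels "Proportion of Variance" and "Cumulative Proportion"
round(rbind(proportion = prop_var, cumulative = cumsum(prop_var)), 3)

# Number of components with eigenvalue > 1 (Kaiser criterion)
sum(eigenvalues > 1)
```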

Factor Analysis in R
• Factor analysis (FA), or exploratory factor analysis, is another technique for reducing
the number of variables to a smaller set of factors. FA identifies the relationships
among a set of variables and summarizes them with a smaller set of factors.
• We will use the bfi dataset, which comes from the psych package rather than base R
(recent releases distribute it in the companion psychTools package). It contains
responses to 25 personality self-report items, plus three demographic columns. We will
need the psych and GPArotation packages, so install and load them.

Code:
 library(psych)
 parallel <- fa.parallel(bfi, fm = "minres", fa = "fa")
Output:
Parallel analysis suggests that the number of factors = 7 and the number of
components = NA
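
The logic behind parallel analysis can be sketched in base R. This is a simplified, component-based version (eigenvalues of the plain correlation matrix), not the factor-based procedure that fa.parallel() runs with fa = "fa". It uses mtcars so that it works without extra packages, and the 200 simulations and the 95th-percentile threshold are assumed settings.

```r
set.seed(1)
dat <- mtcars[, c(1:7, 10, 11)]
n <- nrow(dat)
p <- ncol(dat)

# Eigenvalues of the real correlation matrix
observed_eig <- eigen(cor(dat))$values

# Eigenvalues from uncorrelated random data with the same number of rows and columns
n_sims <- 200
random_eig <- replicate(n_sims, eigen(cor(matrix(rnorm(n * p), n, p)))$values)
threshold <- apply(random_eig, 1, quantile, probs = 0.95)

# Retain the leading components whose eigenvalue beats the random-data threshold
exceeds <- observed_eig > threshold
n_retain <- if (all(exceeds)) p else which(!exceeds)[1] - 1
n_retain
```

A component is kept only if it explains more variance than pure noise of the same shape would be expected to produce.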

Applying Factor Analysis
• Now that we know how many factors to extract, we can run the factor analysis
with the fa() function.
 library(GPArotation)  # provides the oblimin rotation used by fa()
 factors <- fa(bfi, nfactors = 7, rotate = "oblimin", fm = "minres")
 print(factors)

Summary
• PCA and factor analysis in R are both multivariate analysis techniques. Both
reduce the number of variables while retaining as much of the variance as
possible. The main difference between the two methods lies in the new
variables they derive.
• The principal components are normalized linear combinations of the
original variables, whereas the factors come from a measurement model of
latent variables. Although both techniques serve the same purpose, they take
different approaches and give different results.
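
The phrase "normalized linear combinations" can be checked directly on a prcomp() result. This is a minimal sketch on the same mtcars columns used for the PCA example.

```r
cols <- c(1:7, 10, 11)
pca <- prcomp(mtcars[, cols], center = TRUE, scale. = TRUE)

# Each column of the rotation matrix holds the weights of one component;
# "normalized" means every weight vector has length 1
colSums(pca$rotation^2)

# Component scores are the standardized data multiplied by those weights
scaled <- scale(mtcars[, cols])
scores <- scaled %*% pca$rotation
all.equal(unname(scores), unname(pca$x))
```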
