The document provides guidance on conducting factor analysis and principal components analysis. It discusses sample size requirements, tests to check the suitability of the data for factor analysis, determining the number of factors to retain, interpreting the factor solution, computing factor scores, and assessing the reliability of factors using Cronbach's alpha.
This document provides information about statistical tests and data analysis presented by Dr. Muhammedirfan H. Momin. It discusses the different types of statistical data, such as qualitative vs quantitative and continuous vs discrete data. It also covers topics like sample data sets, frequency distributions, risk factors for diseases, hypothesis testing, and tests for comparing proportions and means. Specific statistical tests discussed include the z-test and how to calculate test statistics and compare them to critical values to determine statistical significance. Examples are provided to illustrate how to perform these tests to analyze differences between data sets.
This document summarizes an SPSS workshop held on September 6-7, 2014 at the Faculty of Science, UM. It discusses various SPSS procedures like entering and cleaning data, checking for missing values, frequencies, descriptive statistics, reliability analysis, factor analysis, t-tests, ANOVA, and linear regression. Frequency tables are presented to analyze gender distribution and responses to motivation questions. Reliability analysis and factor analysis are conducted to assess scales. T-tests are used to compare depression, satisfaction, productivity, supervisor support, and coworker support between groups. ANOVA tests for differences in these variables between multiple ethnic groups.
The document discusses a one-sample t-test used to compare sample data to a standard value. It provides an example comparing intelligence scores of university students to the average score of 100. The sample of 6 students had a mean of 120. Running a one-tailed t-test in SPSS, the results showed the mean score was significantly higher than 100 with t(5)=3.15, p=.02. This allows the inference that the population mean intelligence at the university is greater than the standard score of 100.
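Outside SPSS, the same kind of one-tailed, one-sample t-test can be sketched with scipy.stats. The six scores below are invented, chosen only to have a mean of 120 since the original data aren't given, so the resulting t and p values differ from the t(5)=3.15, p=.02 reported above.

```python
from scipy import stats

scores = [120, 105, 130, 125, 115, 125]           # hypothetical sample, mean = 120
t, p = stats.ttest_1samp(scores, popmean=100,
                         alternative='greater')   # one-tailed: is the mean > 100?
# Reject H0 at alpha = .05 if p < .05
```

The `alternative='greater'` argument makes the test one-tailed, matching the direction of the claim.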
This document provides an introduction to medical statistics. It discusses why statistics are used in medical research, which is to collect, summarize, analyze and draw conclusions from sample data. It then describes different types of data (categorical, numerical) and statistical methods used to describe data, including percentages, mean, median, mode, and standard deviation. Methods for quantifying confidence, such as confidence intervals and p-values, are also introduced. Key concepts like null hypotheses and what confidence intervals say about significance are explained through examples.
This chapter discusses the importance of psychological science and the scientific method. It explains key concepts like the need to think critically and avoid biases. Researchers use various methods like case studies, surveys, experiments and statistical analysis to study behavior and mental processes in a rigorous, objective manner. The scientific approach allows psychologists to develop and test theories to better understand human thought and action.
The document discusses two-sample hypothesis tests, including tests for differences between two population means and two population proportions. It provides examples of hypothesis tests comparing means and proportions from two independent samples, including the steps to set up null and alternative hypotheses, determine the appropriate test statistic, identify the rejection region, and make a conclusion. It also discusses tests for paired or dependent samples.
The document describes a study that used a paired t-test to compare various parameters of food quality and customer experience between Dunkin' Donuts and McDonald's in Cyber City, Gurgaon. A survey was administered to employees who frequented both locations, rating 10 parameters on a scale of 1 to 5. SPSS analysis found no statistically significant differences between the restaurants on any of the 10 parameters, including taste, menu variety, cost, quality, hygiene, service, ambience, nutrition, operating hours, and time to receive food. The null hypothesis of no difference between restaurants was not rejected for any comparison at the 5% significance level.
A one-sample z-test is used to compare a sample proportion to a population proportion. The document provides an example where a survey claims 90% of doctors recommend aspirin, and a sample of 100 doctors found 82% recommend aspirin. The z-test is calculated to determine if this difference is statistically significant. The null hypothesis is the sample and population proportions are the same. If the calculated z-statistic falls outside the critical values of -1.96 and 1.96, the null will be rejected, meaning the proportions are significantly different.
This document describes how to perform a one-sample z-test to compare a sample proportion to a population proportion. It provides an example where a survey claims 90% of doctors recommend aspirin, and a sample of 100 doctors found 82% recommend aspirin. It outlines calculating the z-statistic to determine if this difference is statistically significant using a 95% confidence level. With a standard error under the null of sqrt(0.90 × 0.10 / 100) = 0.03, the z-statistic works out to -2.67, which falls outside the critical values of ±1.96, so the null hypothesis that the population and sample proportions are the same is rejected.
The document discusses small sample tests of hypotheses. It explains that for small sample sizes (n<30), a t-distribution is used instead of the normal distribution to account for the small sample size. There are three cases discussed for small sample tests: testing a population mean, comparing the means of two independent samples, and comparing the means of two paired samples. For each case, the assumptions, test statistic (involving a t-distribution), and an example are provided.
Calculating a two sample z test by hand (Ken Plummer)
The document describes how to calculate a two-sample z-test by hand to determine if there is a statistically significant difference between the reported anxiety symptoms of patients taking a new anti-anxiety medication versus a placebo. It provides the formula for the z-statistic and walks through calculating it step-by-step for a sample problem where 64 out of 200 patients taking the medication reported anxiety symptoms compared to 92 out of 200 patients taking the placebo. The calculated z-statistic is then compared to critical values to determine whether to reject or fail to reject the null hypothesis that there is no difference between the groups.
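The hand calculation described can be checked in a few lines. Using the counts in the summary (64 of 200 on medication, 92 of 200 on placebo) and the pooled-proportion standard error, z comes out around -2.87, outside ±1.96:

```python
from math import sqrt

x1, n1 = 64, 200     # medication group reporting anxiety symptoms
x2, n2 = 92, 200     # placebo group
p1, p2 = x1 / n1, x2 / n2
p_pool = (x1 + x2) / (n1 + n2)                     # pooled proportion = 0.39
se = sqrt(p_pool * (1 - p_pool) * (1/n1 + 1/n2))   # pooled standard error
z = (p1 - p2) / se
reject = abs(z) > 1.96                             # two-tailed, alpha = 0.05
```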
Calculating a single sample z test by hand (Ken Plummer)
The document explains how to calculate a single-sample z-test. It provides an example of testing a claim that 9 out of 10 doctors recommend aspirin by taking a random sample of 100 doctors, of which 82 recommend aspirin. It defines the null and alternative hypotheses, identifies the critical z-value of -1.96 and +1.96, and shows the step-by-step calculations to find the z-statistic of -2.67, which falls outside the critical values. This indicates the sample result is statistically significant and differs from the claimed population value.
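The arithmetic behind that -2.67 is short enough to verify directly:

```python
from math import sqrt

p_hat, p0, n = 0.82, 0.90, 100
se = sqrt(p0 * (1 - p0) / n)       # SE under H0 = 0.03
z = (p_hat - p0) / se              # (0.82 - 0.90) / 0.03 ≈ -2.67
reject = z < -1.96 or z > 1.96     # outside the critical values -> reject H0
```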
The document discusses standard deviation and the normal distribution. Some key points:
- Standard deviation is a measure of how spread out values are from the mean. It is the square root of the variance.
- For a normal distribution, approximately 68% of values fall within 1 standard deviation of the mean, 95% within 2 standard deviations, and 99.7% within 3 standard deviations.
- The normal distribution is the most common continuous probability distribution and is important because many variables tend toward a normal distribution as the number of trials increases. It is used in statistical quality control.
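The 68/95/99.7 figures in the list above can be reproduced from the standard normal CDF (scipy assumed available):

```python
from scipy.stats import norm

# P(-k < Z < k): fraction within k standard deviations of the mean
within = {k: norm.cdf(k) - norm.cdf(-k) for k in (1, 2, 3)}
# within ≈ {1: 0.6827, 2: 0.9545, 3: 0.9973}
```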
Researchers tested a new anti-anxiety medication on 200 people and a placebo on another 200 people. 64 of those on the medication and 92 of those on the placebo reported anxiety symptoms. The researchers want to determine if there is a statistically significant difference in reported anxiety between the two groups using a two-sample z-test with an alpha of 0.05. A two-sample z-test is used to compare differences between two sample proportions and determines if any observed difference is likely due to chance or not.
This document provides information about the t-test and chi-square test. It defines the t-test as a test used to compare the means of two samples when the population standard deviation is unknown. It lists the assumptions of the t-test and provides the formula. An example t-test problem and solution is given. Chi-square is introduced as a test used with categorical and numerical data to test for independence and goodness of fit. The chi-square test statistic, degrees of freedom, and hypothesis testing process are outlined. An example chi-square goodness of fit problem and solution is also provided.
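A chi-square goodness-of-fit test of the kind described can be run with scipy.stats.chisquare; the die-roll counts below are hypothetical, not taken from the document.

```python
from scipy.stats import chisquare

observed = [18, 22, 16, 25, 24, 15]   # hypothetical counts from 120 die rolls
# H0: fair die, expected 20 per face; chisquare defaults to equal expected counts
stat, p = chisquare(observed)
# df = 6 - 1 = 5; fail to reject H0 if p > 0.05
```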
Chapter 8: Hypothesis Testing
8.3: Testing a Claim About a Mean
A study of psychiatric patients hospitalized for depression found a moderate positive correlation between black-and-white thinking ("simplicity") and depression. Statistical analysis showed this relationship was significant. While the study shows a relationship, causation cannot be determined. Further research is needed to understand the connection between these variables in clinical and general populations.
The document explains the hypothesis testing process used to determine if a weight loss program's claim can be supported. It involves:
1) Defining the null and alternative hypotheses based on the claim and sample data, with the null being that average weight loss is not less than 10 pounds.
2) Calculating a z-score for the sample mean and finding the critical value from the z-table based on a 95% confidence level.
3) Comparing the z-score to the critical value and rejecting the null hypothesis if the z-score is lower, meaning the sample provides evidence against the claim.
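The three steps above can be sketched numerically; the sample figures used here (n = 36, mean loss 9.2 lb, known sigma of 2.4 lb) are invented for illustration.

```python
from math import sqrt

mu0, xbar, sigma, n = 10.0, 9.2, 2.4, 36     # hypothetical sample figures
z = (xbar - mu0) / (sigma / sqrt(n))         # step 2: z-score for the sample mean
z_crit = -1.645                              # one-tailed critical value, 95% level
reject = z < z_crit                          # step 3: compare and decide
```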
Hypothesis testing: tests of proportions and variances in six sigma (vdheerajk)
The document provides information about various statistical hypothesis tests that can be used to analyze data and test if process improvements have resulted in significant changes. It discusses one proportion tests, two proportions tests, one-variance tests, two-variances tests, and how to determine which test to use based on the type of data and questions being asked. Examples are also provided of applying these tests using Minitab software to analyze sample data and test hypotheses about changes between before and after process improvement situations. The document aims to help determine the appropriate statistical tests for validating improvements in processes.
This document discusses significance tests for population means and proportions using Student's t-distribution and the normal distribution. It provides examples of hypothesis testing for a population mean using a paired t-test and for a population proportion using a single-sample z-test. It also discusses the assumptions, test statistics, and interpretations for these tests. Confidence intervals are presented as complementary to significance tests for estimating population parameters.
This document provides examples and explanations of common hypothesis testing techniques including:
- Z tests for large samples with known population variance to test claims about population means
- T tests for small samples with unknown population variance
- Tests comparing two population means using Z tests for large samples and T tests for small samples
- One-tailed and two-tailed tests at various significance levels (e.g. 5%, 10%)
Step-by-step solutions and calculations are shown for multiple examples testing claims about means, differences in means, and whether sample data is consistent with hypothesized population parameters.
This document discusses inferential statistics and epidemiological research. It introduces concepts like the central limit theorem, standard error, confidence intervals, hypothesis testing, and different statistical tests. Specifically, it covers:
- The central limit theorem states that sample means will be approximately normally distributed for sufficiently large samples, even if the population is not normally distributed.
- Standard error is used to measure sampling variation and determine confidence intervals around sample statistics to estimate population parameters.
- Hypothesis testing involves a null hypothesis of no difference and an alternative hypothesis of a significant difference.
- Common tests discussed include chi-square tests to compare proportions between groups and determine if differences are significant.
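The standard-error and confidence-interval ideas in the list above translate directly to code; the sample values here are invented.

```python
from math import sqrt
from statistics import mean, stdev

sample = [5.1, 4.9, 5.3, 5.0, 4.8, 5.2, 5.0, 4.7, 5.4, 5.1]  # hypothetical data
m = mean(sample)
se = stdev(sample) / sqrt(len(sample))       # standard error of the mean
ci95 = (m - 1.96 * se, m + 1.96 * se)        # approximate 95% CI for the mean
```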
A researcher wants to conduct a survey to determine the prevalence of abusive behavior in children aged 6-12 years old in Manila. Using a sample size formula for one group proportions, the required sample size is calculated as 139 children based on: a past reported prevalence of 10%, a desired confidence level of 95%, and a tolerable error of 5%. Adding a 10% expected non-response rate, the total required sample size is rounded up to 153 children.
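The 139 and 153 figures follow from the usual single-proportion sample-size formula, n = z² p(1-p) / e²:

```python
from math import ceil

z, p, e = 1.96, 0.10, 0.05             # 95% confidence, 10% prevalence, 5% error
n = ceil(z**2 * p * (1 - p) / e**2)    # 138.3 -> 139 children
n_total = ceil(n * 1.10)               # +10% non-response -> 153 children
```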
A second study wants to compare exam pass rates between two statistics class sections. Using the sample size formula for two groups proportions, the required sample size per group is calculated as 44 students based on: detecting a 15% pass rate difference, a confidence level of 95%,
Introduction to hypothesis testing ppt @ bec doms (Babasab Patil)
This document introduces hypothesis testing, including:
- Formulating null and alternative hypotheses for tests involving population means and proportions
- Using test statistics, critical values, and p-values to test hypotheses
- Defining Type I and Type II errors and their probabilities
- Examples of hypothesis tests for means (using z-tests and t-tests) and proportions (using z-tests) are provided to illustrate the concepts.
The document describes how to calculate a single-sample z-test. It explains that the z-critical value is determined by the significance level, often 0.05 corresponding to z-values of -1.96 and 1.96. An example calculates the z-statistic to test if sample data matches a population claim. It finds the z-statistic is -2.67, which is outside the critical values and considered rare, so the null hypothesis that the sample matches the claim is rejected.
This document discusses descriptive data mining techniques including hierarchical clustering, k-means clustering, and association rules. It provides examples of how to calculate distances between observations using Euclidean distance and standardized z-scores. It also discusses measures of similarity between observations for categorical variables, including matching coefficients and Jaccard's coefficient. Methods for calculating similarity between clusters in hierarchical clustering are presented, including single linkage, complete linkage, and average linkage.
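Two of the measures mentioned, Euclidean distance and Jaccard's coefficient, are simple to write out:

```python
from math import sqrt

def euclidean(a, b):
    """Straight-line distance between two numeric observations."""
    return sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def jaccard(a, b):
    """Jaccard's coefficient for binary attribute vectors: |A and B| / |A or B|."""
    both = sum(1 for x, y in zip(a, b) if x == 1 and y == 1)
    either = sum(1 for x, y in zip(a, b) if x == 1 or y == 1)
    return both / either

d = euclidean((0, 0), (3, 4))            # 5.0
j = jaccard([1, 1, 0, 1], [1, 0, 0, 1])  # 2 shared positives / 3 present = 2/3
```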
This document provides an overview of different types of variables and methods for summarizing clinical data, including descriptive statistics. It discusses categorical variables like gender and ordinal variables like disease staging. For continuous variables it explains measures of central tendency like mean, median and mode, and measures of variation like range, standard deviation, and interquartile range. Graphs for summarizing univariate data are also covered, such as bar charts for categorical variables and histograms and box plots for continuous variables.
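Python's standard library covers the central-tendency and spread measures listed above; the values below are made up for illustration.

```python
from statistics import mean, median, mode, stdev

values = [2, 4, 4, 4, 5, 5, 7, 9]    # hypothetical continuous measurements
m, md, mo = mean(values), median(values), mode(values)
sd = stdev(values)                   # sample standard deviation
rng = max(values) - min(values)      # range
```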
This document discusses regression analysis techniques. It begins with defining regression and its objectives, such as using independent variables to predict dependent variable values. It then covers understanding regression through layman terms and statistical terms. The rest of the document assesses goodness of fit both graphically and statistically. It discusses assumptions of regression like normality, equal variance, and independent errors. It also covers analyzing residuals, outliers, influential cases, and addressing issues like multicollinearity.
Welcome to International Journal of Engineering Research and Development (IJERD) (IJERD Editor)
The document discusses using k-nearest neighbor (k-NN) algorithm for missing data imputation. It compares the performance of mean, median, and standard deviation imputation techniques when combined with k-NN. The techniques are applied to group data of different sizes, and median and standard deviation show better results than mean substitution. Accuracy improves with larger group sizes and higher percentages of missing data. Median and standard deviation imputation have slightly better performance than mean imputation for missing data imputation when combined with k-NN.
This document provides information about describing data using measures of center and spread such as the mean and standard deviation. It discusses Chebyshev's rule, which states that a certain percentage of data will fall within a given number of standard deviations from the mean. For a normal distribution, it presents the empirical rule - that approximately 68% of data lies within 1 standard deviation of the mean, 95% within 2 standard deviations, and 99.7% within 3 standard deviations. Several examples demonstrate calculating these percentages and interpreting data based on its mean and standard deviation. Practice problems at the end have the reader calculate ranges that certain percentages of data will fall within using Chebyshev's rule and the empirical rule.
This document describes various methods of descriptive data analysis, including:
1) Descriptive analysis uses measures of central tendency like mean, median, and mode to summarize large datasets. It is used to describe data through tables and diagrams.
2) Variables can be categorized as dependent or independent. Descriptive analysis examines relationships between these types of variables.
3) Common descriptive methods include frequency distributions, measures of central tendency and variability, as well as graphical presentations like histograms, bar charts, and scatter plots. These methods organize and illustrate key characteristics of quantitative data.
Factor Analysis for Exploratory StudiesManohar Pahan
This document presents a factor analysis that was conducted to identify factors related to fitness trainer popularity. It discusses the research problem, domain, and hypotheses. A 13-item questionnaire was administered to 50 fitness trainers. The data was cleaned and factor analysis was performed. Three factors were extracted based on eigenvalues above 1, explaining 72% of the variance. The factors were interpreted as adapting new fitness programs, introducing latest trends to clients, and client view of the trainer. Reliability analysis found the factors to be reliable.
This document analyzes the psychometric properties of the Swedish version of the Strengths and Difficulties Questionnaire (SDQ) among children and adolescents ages 12-16. It examines the internal consistency, factor structure, and validity of the SDQ using data from community and service contact samples. The results show good internal consistency for most scales. Factor analyses support a bifactor model over the original 5-factor model. Validity analyses find the Emotional Problems scale best distinguishes the samples, while other scales have lower accuracy. Further analyses are suggested to improve understanding of the SDQ's performance in Swedish populations.
This chapter discusses the scientific method in psychology and the need for psychological science to understand human behavior. It covers key concepts like theories, hypotheses, variables, the placebo effect, correlation, experimentation, and statistical analysis. Critical thinking is important to examine assumptions and evaluate evidence presented by psychologists in their research.
This document discusses the use of latent semantic analysis (LSA) for document clustering. It describes issues with traditional information retrieval systems, defines key concepts like synonymy and polysemy, and explains how LSA addresses these issues by reducing the semantic space. An experiment is described where documents are clustered with and without LSA preprocessing, showing that LSA leads to improved cluster quality metrics like purity, entropy, and average intra-cluster similarity. The study demonstrates LSA can perform comparably to dedicated clustering tools for organizing documents by topic.
Exploring Author Gender in Book Rating and Recommendation
M. D. Ekstrand, M. Tian, M. R. I. Kazi, H. Mehrpouyan, and D. Kluver
https://doi.org/10.1145/3240323.3240373
RecSys2018 論文読み会 (2018-11-17) https://atnd.org/events/101334
In this webinar Dr. Lani discusses key points in successfully completing your quantitative analysis. You will learn how to conduct common statistical analyses, how to examine assumptions, how to easily generate APA 6th edition tables and figures, how to use Intellectus Statistics(TM) Software, how to identify and interpret the appropriate statistics, and how to present and summarize your findings.
SSP is now Intellectus Statistics Software. Intellectus Statistics™ software primarily serves the academic and research communities as a powerful statistical package that can be purchased via four distinct cloud based subscriptions. Learn more here: http://www.statisticssolutions.com/buy-intellectus/
This document provides an overview of various multidimensional measurement and factor analysis techniques, including elementary linkage analysis, factor analysis, cluster analysis, multidimensional scaling, structural equation modeling, and multilevel modeling. It discusses the key stages and considerations for conducting factor analysis and interpreting the results, and provides examples of interpreting outputs from SPSS.
This document provides an overview of logistic regression. It begins by explaining that linear regression is not appropriate when the dependent variable is dichotomous. Logistic regression uses an S-shaped logistic function to model the probabilities of different outcomes. The logistic function transforms the non-linear probabilities into linear-looking data that can be modeled using linear regression. Examples are provided to demonstrate how logistic regression can be used to predict the probability of coronary heart disease based on age and to analyze the relationship between patient satisfaction and residence.
An Introduction to Factor analysis pptMukesh Bisht
This document discusses exploratory factor analysis (EFA). EFA is used to identify underlying factors that explain the pattern of correlations within a set of observed variables. The document outlines the steps of EFA, including testing assumptions, constructing a correlation matrix, determining the number of factors, rotating factors, and interpreting the factor loadings. It provides an example of running EFA on a dataset with 11 physical performance and anthropometric variables from 21 participants. The analysis extracts 3 factors that explain over 80% of the total variance.
The standard normal curve & its application in biomedical sciencesAbhi Manu
1) The document discusses the normal distribution and its applications in statistical inference. It is the most important probability distribution used to model many continuous variables in biomedical fields.
2) The normal distribution is characterized by its mean and standard deviation. It is perfectly symmetrical and bell-shaped. Properties of the normal curve include that about 68%, 95%, and 99.7% of the data lies within 1, 2, and 3 standard deviations of the mean, respectively.
3) The standard normal distribution is used to convert raw scores to z-scores in order to compare variables measured on different scales. Z-scores indicate how many standard deviations a score is above or below the mean and can be used to determine probabilities, percentiles
This document discusses exploratory factor analysis (EFA). EFA is used to identify underlying factors that explain the pattern of correlations within a set of observed variables. The document outlines the steps of EFA including testing assumptions, constructing a correlation matrix, determining the number of factors, rotating factors, and interpreting the factor loadings. It provides an example of running EFA on a dataset with 11 physical performance and anthropometric variables from 21 participants. The analysis extracts 3 factors that explain over 80% of the total variance.
This document provides guidance on performing and interpreting logistic regression analyses in SPSS. It discusses selecting appropriate statistical tests based on variable types and study objectives. It covers assumptions of logistic regression like linear relationships between predictors and the logit of the outcome. It also explains maximum likelihood estimation, interpreting coefficients, and evaluating model fit and accuracy. Guidelines are provided on reporting logistic regression results from SPSS outputs.
The document discusses key principles and guidelines for regulated bioanalysis including validation of quantitative bioanalytical methods. It covers validation of both non-chromatographic and chromatographic assays. Some of the main points covered include validation criteria for calibration curves, quality controls, selectivity, accuracy, precision, reproducibility, recovery, and stability. Examples of validation results are also provided to illustrate concepts like matrix effects, column ruggedness, and recovery.
2. » What hypotheses are being tested?
» What types of analyses are planned to test the hypotheses?
» Look over the instrument and create a map or outline of possible analysis methods
» What is the magnitude of the differences you would like to detect?
3. » The most obvious reason for pilot testing is to be able to estimate the sample size.
» Find potential sources of bias
» Assists in power calculations
» Discover possible distribution problems prior to surveying the entire sample
5. » A Type I error occurs when a true null hypothesis is rejected. The probability of a Type I error is denoted by α, and is the significance level of the hypothesis test, with 0.05 being a common value for α.
» On the other hand, a Type II error occurs when the null hypothesis is false and it is not rejected. A Type II error is denoted by β and is often set to 0.20.
6.                        True Results
   Experimental Results   Ho is true                 Ho is false
   Reject Ho              α (Type I error rate)      Power = 1 - β
   Accept Ho              Correct decision (1 - α)   β (Type II error rate)
7. » Statistical Power Analysis for the Behavioral Sciences, by Jacob Cohen
» The power of a significance test is the probability of rejecting a false null hypothesis, and is equal to 1 - β. If β is 0.20, the power = 0.80.
» 0.80 is generally considered to be an adequate level of power
» Since sample size and power are related, a small sample size results in less power, or a reduced probability of rejecting a false null hypothesis.
9. d = 0.2, 0.5, 0.8 (small, medium, and large effects)

   n (for each group)   d = 0.2   d = 0.5   d = 0.8
   30                   0.03      0.24      0.66
   40                   0.04      0.35      0.82
   50                   0.06      0.45      0.91
   60                   0.07      0.55      >0.995
   80                   0.12      0.82      >0.995
   100                  0.29      0.99      >0.995
   200                  0.29      >0.995    >0.995
   500                  0.72      >0.995    >0.995
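Power values like those tabulated above come from the noncentral t distribution. A minimal Python sketch for a two-sided, two-sample t-test with equal group sizes (the significance level behind the original table is not stated, so α is left as a parameter here):

```python
from scipy import stats

def t_test_power(d, n_per_group, alpha=0.05):
    """Power of a two-sided two-sample t-test with equal group sizes."""
    df = 2 * n_per_group - 2
    ncp = d * (n_per_group / 2) ** 0.5        # noncentrality parameter
    t_crit = stats.t.ppf(1 - alpha / 2, df)   # two-sided critical value
    # P(|t| > t_crit) under the alternative, from the noncentral t distribution
    return (1 - stats.nct.cdf(t_crit, df, ncp)) + stats.nct.cdf(-t_crit, df, ncp)

# Cohen's well-known benchmark: d = 0.5 needs about 64 per group for 0.80 power
print(round(t_test_power(0.5, 64), 2))
```

Sample size enters through the noncentrality parameter, which is why power grows with n for a fixed effect size d.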
10. » Missing Completely at Random (MCAR)
˃ Given two variables X and Y, the missingness is unrelated to either. The missing values in X are independent of Y and vice versa.
˃ If the data are MCAR, then listwise deletion is appropriate
» Missing at Random (MAR)
˃ Given two variables X and Y, the missingness is related to or dependent upon X, but not Y. Suppose X = age and Y = income, and income is more often missing in certain age groups, but within each age group no income group is missing more often than any other; then the data are MAR.
» Nonignorable
˃ Given two variables X and Y, the missingness is related to X, but may also be related to Y. In our age-income example, certain income groups within an age group may be less likely to respond.
11. » Select items with a missing percentage greater than 1% or 2%.
» Recode them into binary variables with 1 = missing and 0 = non-missing.
» Analyze these variables by the demographic variables using t-tests or chi-square, as appropriate.
» Significant results indicate that missingness is associated with one or more of the demographic variables.
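The recode-and-test procedure above can be sketched in Python with pandas; the variable names and data here are hypothetical, not from the survey in the slides:

```python
import pandas as pd
from scipy import stats

# Hypothetical survey data: 'income' has missing values, 'gender' is a demographic
df = pd.DataFrame({
    "gender": ["F", "M", "F", "M", "F", "M", "F", "M"] * 5,
    "income": [50, None, 48, 62, None, 58, None, 61] * 5,
})

# Recode into a binary indicator: 1 = missing, 0 = non-missing
df["income_missing"] = df["income"].isna().astype(int)

# Chi-square test of the missingness indicator against the demographic variable
table = pd.crosstab(df["gender"], df["income_missing"])
chi2, p, dof, expected = stats.chi2_contingency(table)
print(p)  # a small p-value suggests missingness is related to gender
```

For a continuous demographic (e.g. age), a t-test comparing the two indicator groups plays the same role as the chi-square test here.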
12. » Used to uncover relationship patterns among a group of variables with the goal of reducing the variables to a smaller group
» Two types of data reduction methods: confirmatory and exploratory
» Exploratory factor analysis does not assume any particular structure prior to the analysis and is used to “explore” relationships between variables
» Confirmatory factor analysis is used to test hypotheses regarding the underlying structure of a group of variables
» Traditional factor analysis and principal components analysis are exploratory data reduction methods
13. » Principal components analysis is a method often used for reducing the number of variables
» Principal components analysis is part of the factor analysis procedures in SAS and SPSS
» Although factor analysis (FA) and principal components analysis (PCA) have mathematical differences, the results are often similar
» Many authors loosely use the term “factor analysis” to refer to data reduction methods in general
14. » Finds groups of variables that are correlated with each other, possibly measuring the same construct.
» Reduces the variables in the data to a smaller number of items that account for most of the variance of all of the variables in the data
» The first component accounts for the greatest amount of variance. The second component accounts for the greatest amount of variance not accounted for by the first, and is uncorrelated with the first component.
15. » Suggested sample size: at least 100 subjects and 10 observations per variable
» A correlation analysis of the variables should result in most correlations greater than 0.3
» Bartlett’s test of sphericity is significant (p < 0.05)
» Kaiser-Meyer-Olkin (KMO) test of sampling adequacy ≥ 0.6
» Determinant > 0.00001, which indicates that multicollinearity is not a problem
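Bartlett’s test of sphericity can be computed directly from the correlation matrix using the standard chi-square approximation; a minimal sketch on simulated data (not data from the slides):

```python
import numpy as np
from scipy import stats

def bartlett_sphericity(X):
    """Bartlett's test that the correlation matrix is an identity matrix."""
    n, p = X.shape
    R = np.corrcoef(X, rowvar=False)
    # Chi-square approximation: -(n - 1 - (2p + 5)/6) * ln|R|, df = p(p-1)/2
    chi2 = -(n - 1 - (2 * p + 5) / 6) * np.log(np.linalg.det(R))
    df = p * (p - 1) / 2
    return chi2, stats.chi2.sf(chi2, df)

rng = np.random.default_rng(1)
base = rng.normal(size=(200, 1))
X = base + rng.normal(scale=0.7, size=(200, 4))   # four correlated items
chi2, p = bartlett_sphericity(X)
print(p < 0.05)  # significant: the correlations support factoring
```

A significant result rejects the hypothesis that the variables are uncorrelated, which is a precondition for meaningful data reduction.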
16. » In SPSS select principal components under “extraction method”
» Select varimax rotation.
˃ A rotation uses a transformation to aid in the interpretation of the factor solution
˃ A varimax rotation is orthogonal, so the components are uncorrelated; it maximizes the variance of the squared loadings within each column of the loading matrix
17. » Kaiser criterion: choose components with eigenvalues greater than one.
» Scree plot: a plot of the eigenvalues
˃ Retain the eigenvalues before the leveling-off point of the plot.
» Want the proportion of variance accounted for by each factor (or component) to be 5% to 10%
» Cumulative variance accounted for should be 70% to 80%
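The Kaiser criterion and the variance proportions can be read directly from the eigenvalues of the correlation matrix; a numpy sketch on simulated data (not the survey data):

```python
import numpy as np

rng = np.random.default_rng(0)
# Six simulated items forming two correlated triplets (two underlying constructs)
a, b = rng.normal(size=(2, 100, 1))
X = np.hstack([a + rng.normal(scale=0.6, size=(100, 3)),
               b + rng.normal(scale=0.6, size=(100, 3))])

R = np.corrcoef(X, rowvar=False)
eigvals = np.sort(np.linalg.eigvalsh(R))[::-1]   # largest eigenvalue first

n_retained = int(np.sum(eigvals > 1))            # Kaiser criterion
prop = eigvals / eigvals.sum()                   # proportion of variance per component
print(n_retained, np.round(np.cumsum(prop), 2))
```

Because the correlation matrix is used, the eigenvalues sum to the number of variables, so an eigenvalue above 1 means the component explains more variance than a single standardized item.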
20. » There should be at least three items with significant loadings on each component
» Check the conceptualization of the component items
» With an orthogonal rotation, the factor loadings = correlation between variable and component
» A communality is the proportion of variance in a variable that is accounted for by the retained components or factors. A communality is large if the variable loads heavily on at least one component.
21. » Factor score
˃ Save the regression scores as variables
˃ Standardize the survey responses
˃ For each subject’s response, multiply the standardized survey response by the corresponding regression weights, then add the results
» Factor-based score
˃ Average the responses of the items in the component
˃ Check for reverse codings and missing data.
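A factor-based score is simply the item average after any reverse coding; a small sketch with hypothetical 1-5 Likert responses, where the third item is negatively worded:

```python
import numpy as np

# Hypothetical responses on a 1-5 scale for one component's three items;
# item 3 is negatively worded, so it must be reverse-coded first
responses = np.array([
    [4.0, 5.0, 2.0],
    [2.0, 1.0, 4.0],
    [5.0, 4.0, 1.0],
])

responses[:, 2] = 6 - responses[:, 2]   # reverse-code: 1<->5, 2<->4 on a 1-5 scale
factor_based = responses.mean(axis=1)   # average the items in the component
print(factor_based)
```

On a 1-to-k scale the reverse code is (k + 1) minus the response; missing items would need to be excluded from the average rather than treated as zeros.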
22. » Cronbach’s alpha is used to measure the reliability, or internal consistency, of the factors or components.
» The variables in a scale are all entered into the calculation to obtain the alpha score.
» A Cronbach’s alpha > 0.7 is considered sufficient for demonstrating internal consistency for most social science research, while values > 0.6 are marginally acceptable.
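Cronbach’s alpha can be computed from the item variances and the variance of the summed scale; a minimal sketch of the standard formula (my own implementation with made-up data, not SPSS output):

```python
import numpy as np

def cronbach_alpha(items):
    """Cronbach's alpha for a subjects-by-items matrix of scale responses."""
    items = np.asarray(items, dtype=float)
    k = items.shape[1]                          # number of items in the scale
    item_vars = items.var(axis=0, ddof=1)       # variance of each item
    total_var = items.sum(axis=1).var(ddof=1)   # variance of the summed scale
    return (k / (k - 1)) * (1 - item_vars.sum() / total_var)

# Three items that tend to move together give a high alpha
scale = np.array([[4, 5, 4], [2, 1, 2], [5, 4, 5], [3, 3, 2], [1, 2, 1]])
print(round(cronbach_alpha(scale), 2))
```

When the items covary strongly, the variance of the sum exceeds the sum of the item variances, which pushes alpha toward 1; perfectly identical items give exactly 1.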