Mean, standard deviation, reliability,
correlation, and regression
Data entry in SPSS
• SPSS Statistics is a software package used for logical
batched and non-batched statistical analysis.
• Correct data entry in SPSS is crucial for smoother analysis.
• Refer to this link for the entry of data in SPSS
• Mean : The mean is the average of all numbers
and is sometimes called the arithmetic mean.
• Standard deviation : a quantity expressing by
how much the members of a group differ from
the mean value for the group.
• Divide the mean and standard deviation of each scale total
by the number of items in that scale.
• For example, if the value given in the table for TC is 12.47 and
the number of items is 5, then 12.47/5 = 2.494.
This indicates that respondents considered their training
less helpful, as the mean estimate is low on the
7-point Likert scale.
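The rescaling above can be sketched in a few lines of Python. The function name `per_item` is hypothetical; the total of 12.47 and the item count of 5 are the values from the example.

```python
def per_item(scale_total_stat: float, n_items: int) -> float:
    """Rescale a statistic of the scale total (mean or SD) to the per-item metric."""
    if n_items <= 0:
        raise ValueError("n_items must be positive")
    return scale_total_stat / n_items


# TC example from the slides: total-score mean 12.47 over 5 items.
tc_item_mean = per_item(12.47, 5)
print(round(tc_item_mean, 3))  # prints 2.494
```

The same division applies to the standard deviation, because dividing every total by a constant divides its standard deviation by that constant.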
• Reliability: a test is considered reliable if we get the same result repeatedly.
• Cronbach’s alpha, α (or coefficient alpha), developed by Lee
Cronbach in 1951, measures reliability, or internal consistency.
“Reliability” is how consistently a test measures what it should. For example,
a company might give a job satisfaction survey to their employees.
High reliability means the items consistently measure job satisfaction, while low
reliability means they measure something else (or possibly nothing at
all).
• Cronbach’s alpha tests to see if multiple-question Likert scale surveys
are reliable. These questions measure latent variables — hidden or
unobservable variables like: a person’s conscientiousness, neurosis or
openness. These are very difficult to measure in real life. Cronbach’s
alpha will tell you if the test you have designed is accurately
measuring the variable of interest.
• A reliability above 0.70 is an acceptable level: it
indicates that the scale used to collect the data
provides consistent results and is thus reliable
for further analysis.
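As a rough illustration of what the reliability figure summarizes, Cronbach's alpha can be computed from the item variances and the variance of the scale total using the standard formula alpha = k/(k-1) × (1 − sum of item variances / variance of totals). The sketch below is not SPSS output; the function name `cronbach_alpha` and the response data are made up for illustration.

```python
from statistics import variance


def cronbach_alpha(items):
    """items: list of k lists, one per item, each holding one score per respondent."""
    k = len(items)
    if k < 2:
        raise ValueError("need at least two items")
    # Scale total for each respondent (sum across items).
    totals = [sum(scores) for scores in zip(*items)]
    item_var_sum = sum(variance(col) for col in items)
    return k / (k - 1) * (1 - item_var_sum / variance(totals))


# Hypothetical responses: 3 items x 5 respondents on a 7-point scale.
demo_items = [
    [2, 3, 5, 6, 7],
    [1, 3, 4, 6, 7],
    [2, 4, 5, 5, 6],
]
alpha = cronbach_alpha(demo_items)
print(round(alpha, 3), "acceptable" if alpha > 0.70 else "below 0.70")
```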
• Correlation analysis is a method of statistical
evaluation used to study the strength of a relationship
between two, numerically measured, continuous
variables (e.g. height and weight).
• Pearson’s product-moment coefficient is the
measurement of correlation and ranges (depending on
the correlation) between +1 and -1. +1 indicates the
strongest positive correlation possible, and -1
indicates the strongest negative correlation possible.
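A minimal sketch of the Pearson coefficient, computed directly from its definition (covariance divided by the product of the standard deviations). The function name `pearson_r` and the height and weight values are hypothetical.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


# Hypothetical height (cm) and weight (kg) measurements.
heights_cm = [150, 160, 170, 180, 190]
weights_kg = [52, 58, 69, 77, 85]
r = pearson_r(heights_cm, weights_kg)
print(round(r, 3))
```

A perfectly increasing linear relationship gives +1 and a perfectly decreasing one gives -1.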
• Enter all the total values in SPSS.
• As the correlation estimates between each pair of
variables are positive and significant (p value
<0.001), this indicates that all the variables are
related to each other.
• This gives a basis for further regression analysis
to understand the causal relationship between
the variables.
• Regression analysis is used to model the
relationship between a response variable and
one or more predictor variables.
1. Standardize the variable data
In statistics, standardized coefficients or beta
coefficients are the estimates resulting from
a regression analysis that have been
standardized so that the variances of
dependent and independent variables are 1.
2. Put the data in regression model
IV: Independent variable = TC
M: Mediator = SE
DV: Dependent variable = CAA
• First, enter the IV as the independent variable and
the mediator as the dependent variable.
• Then, in a second regression, enter the DV as the dependent variable and
the IV as the independent variable.
• Then click on ‘Next’ to open a second block
• and add the mediator as an independent variable.
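Outside SPSS, the two-block setup above can be mimicked with a small Python sketch. It is an illustration only, not SPSS syntax: block 1 regresses the DV on the IV, block 2 adds the mediator, and both R square values are obtained from the pairwise correlations using the standard two-predictor formula. The function names and the data are hypothetical.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def hierarchical_r2(iv, mediator, dv):
    """Return (R2 of block 1: DV ~ IV, R2 of block 2: DV ~ IV + mediator)."""
    r_xy = pearson_r(iv, dv)
    r_my = pearson_r(mediator, dv)
    r_xm = pearson_r(iv, mediator)
    r2_block1 = r_xy ** 2
    # Two-predictor R2 from correlations; requires |r_xm| < 1.
    r2_block2 = (r_xy ** 2 + r_my ** 2 - 2 * r_xy * r_my * r_xm) / (1 - r_xm ** 2)
    return r2_block1, r2_block2


# Hypothetical per-item scores for TC (IV), SE (mediator) and CAA (DV).
tc = [2, 3, 3, 4, 5, 5, 6, 7]
se = [3, 2, 4, 4, 4, 6, 5, 6]
caa = [2, 3, 2, 4, 5, 4, 6, 6]

r2_1, r2_2 = hierarchical_r2(tc, se, caa)
print(f"block 1 R2 = {r2_1:.3f}, block 2 R2 = {r2_2:.3f}, change = {r2_2 - r2_1:.3f}")
```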
R square estimate
• R-squared is a statistical measure of how close the
data are to the fitted regression line. It is also known
as the coefficient of determination, or the coefficient
of multiple determination for multiple regression.
• R-squared is always between 0 and 100%:
• 0% indicates that the model explains none of the
variability of the response data around its mean.
• 100% indicates that the model explains all the
variability of the response data around its mean.
• In general, the higher the R-squared, the better the
model fits your data.
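The two endpoints described above follow from the usual definition, R-squared = 1 − (residual sum of squares / total sum of squares). A minimal sketch, with a hypothetical function name and made-up values:

```python
def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_residual / SS_total."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1 - ss_res / ss_tot


y = [2.0, 4.0, 6.0, 8.0]

# A model that reproduces every observation explains all the variability.
perfect = r_squared(y, y)            # 1.0
# A model that always predicts the mean (5.0) explains none of it.
mean_only = r_squared(y, [5.0] * 4)  # 0.0
print(perfect, mean_only)
```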
• The p-value for each term tests the null hypothesis
that the coefficient is equal to zero (no effect). A low
p-value (< 0.05) indicates that you can reject the null
hypothesis. In other words, a predictor that has a low
p-value is likely to be a meaningful addition to your
model because changes in the predictor's value are
related to changes in the response variable.
• Conversely, a larger (insignificant) p-value suggests
that changes in the predictor are not associated with
changes in the response.
A standardized beta coefficient compares the strength of the effect of
each individual independent variable on the dependent variable. The
higher the absolute value of the beta coefficient, the stronger the
effect. For example, a beta of -.9 has a stronger effect than a beta of +.8.
Standardized beta coefficients have standard deviations as their units.
This means the variables can be easily compared to each other. In
other words, standardized beta coefficients are the coefficients that
you would get if the variables in the regression were all converted
to z-scores before running the analysis.
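For the single-predictor case, the z-score statement above can be checked numerically: rescaling the raw slope by sd(x)/sd(y) gives the same number as refitting the regression on z-scores. This is a minimal sketch with hypothetical helper names and made-up data; with several predictors the same idea holds but requires a multiple regression.

```python
from statistics import mean, stdev


def slope(x, y):
    """Least-squares slope of y on x (simple regression with intercept)."""
    mx, my = mean(x), mean(y)
    return sum((a - mx) * (b - my) for a, b in zip(x, y)) / sum((a - mx) ** 2 for a in x)


def zscores(values):
    """Convert values to z-scores using the sample standard deviation."""
    m, s = mean(values), stdev(values)
    return [(v - m) / s for v in values]


# Hypothetical predictor and response values.
x = [1, 2, 3, 4, 5, 6]
y = [3, 4, 8, 9, 13, 14]

b_raw = slope(x, y)                          # unstandardized coefficient
beta_rescaled = b_raw * stdev(x) / stdev(y)  # standardized by rescaling
beta_from_z = slope(zscores(x), zscores(y))  # standardized by refitting on z-scores
print(round(beta_rescaled, 4), round(beta_from_z, 4))
```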
• As R square changed from 0.250 to 0.299, this
indicates that adding the mediator to the equation
contributes to the relationship between the IV
and the DV.
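• As the p-value is below 0.05, the result is significant at the 5% level:
the chance of a Type I error (rejecting a true null hypothesis) is below 5%.
The p-value does not, by itself, bound the Type II error rate.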
• Effect of TC on SE = 0.526, p < 0.05
• Effect of SE on CAA = 0.261, p < 0.05
• Effect of TC on CAA = 0.362, p < 0.05
• As all the relationships are significant, this indicates that
SE has a significant mediating role between TC and
CAA.
[Figure: mediation path diagram. TC → SE: β = 0.526***; SE → CAA: β = 0.261***; TC → CAA: β = 0.362***]
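As a rough consistency check on the reported coefficients, the product of the TC → SE and SE → CAA paths gives an indirect effect of about 0.137, and adding the TC → CAA coefficient gives an implied total effect of about 0.499. This sketch assumes that 0.362 was estimated with SE in the equation and that all three values are standardized betas; the slides do not state either point explicitly. Under those assumptions the squared total effect is close to the first-step R square of 0.250.

```python
a = 0.526        # TC -> SE (reported)
b = 0.261        # SE -> CAA (reported)
c_prime = 0.362  # TC -> CAA, ASSUMED to be estimated with SE in the model

indirect = a * b              # effect of TC on CAA that runs through SE
total = c_prime + indirect    # implied total effect of TC on CAA
r2_change = 0.299 - 0.250     # reported R square change when SE is added

print(f"indirect effect = {indirect:.3f}")
print(f"implied total   = {total:.3f}")
print(f"total squared   = {total ** 2:.3f}")
print(f"R square change = {r2_change:.3f}")
```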
• Note that a mediational model is a causal model.
• For example, the mediator is presumed to
cause the outcome and not vice versa. If the
presumed causal model is not correct, the
results from the mediational analysis are likely
of little value.
• Mediation is not defined statistically; rather
statistics can be used to evaluate a presumed
mediational model.
Baron and Kenny mediation steps
• The above steps of mediation are based on the four-step mediation
analysis proposed by Baron and Kenny (1986), Judd and Kenny
(1981), and James and Brett (1984).
• Thus, to indicate mediation, four steps are to be analyzed:
Step 1: Show that the causal variable is correlated with the
outcome. Use Y as the criterion variable in a regression equation and X
as a predictor (estimate and test path c in the above figure). This step
establishes that there is an effect that may be mediated.
Step 2: Show that the causal variable is correlated with the
mediator. Use M as the criterion variable in the regression equation and
X as a predictor (estimate and test path a). This step essentially involves
treating the mediator as if it were an outcome variable.
Step 3: Show that the mediator affects the outcome variable. Use
Y as the criterion variable in a regression equation and X and M
as predictors (estimate and test path b). It is not sufficient just to
correlate the mediator with the outcome because the mediator and
the outcome may be correlated because they are both caused by
the causal variable X. Thus, the causal variable must be
controlled in establishing the effect of the mediator on the outcome.
Step 4: To establish that M completely mediates the X-Y
relationship, the effect of X on Y controlling for M (path c')
should be zero (see discussion below on significance
testing). The effects in both Steps 3 and 4 are estimated in the same equation.
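The four steps can be sketched as standardized path estimates computed from the pairwise correlations. This is a minimal illustration, not the authors' procedure: it returns point estimates only (no significance tests), the function names are hypothetical, and the data are made up. Paths b and c' come from the same two-predictor equation, as Steps 3 and 4 require.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((p - mx) * (q - my) for p, q in zip(x, y))
    sxx = sum((p - mx) ** 2 for p in x)
    syy = sum((q - my) ** 2 for q in y)
    return sxy / sqrt(sxx * syy)


def baron_kenny_paths(x, m, y):
    """Standardized estimates of paths c, a, b and c' for X -> M -> Y."""
    r_xy, r_xm, r_my = pearson_r(x, y), pearson_r(x, m), pearson_r(m, y)
    denom = 1 - r_xm ** 2  # requires X and M not perfectly correlated
    return {
        "c": r_xy,                                # Step 1: Y on X
        "a": r_xm,                                # Step 2: M on X
        "b": (r_my - r_xy * r_xm) / denom,        # Step 3: M in Y on X and M
        "c_prime": (r_xy - r_my * r_xm) / denom,  # Step 4: X in Y on X and M
    }


# Hypothetical scores for X (causal variable), M (mediator) and Y (outcome).
x = [2, 3, 3, 4, 5, 5, 6, 7]
m = [3, 2, 4, 4, 4, 6, 5, 6]
y = [2, 3, 2, 4, 5, 4, 6, 6]

paths = baron_kenny_paths(x, m, y)
for name, value in paths.items():
    print(f"{name} = {value:.3f}")
```

In this linear setting the total effect decomposes as c = c' + a × b, which gives a quick check on the arithmetic.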
Final mediation decision
• If all four of these steps are met, then the data are
consistent with the hypothesis that variable
M completely mediates the X-Y relationship, and if
the first three steps are met but Step 4 is not,
then partial mediation is indicated. Meeting these
steps does not, however, conclusively establish that
mediation has occurred because there are other
(perhaps less plausible) models that are consistent
with the data. Some of these models are considered
later in the Specification Error section.
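The decision rule above can be written as a small helper. This sketch simplifies Step 4 by treating a non-significant c' as "zero", which is an assumption of the example rather than part of the original rule; the function name is hypothetical. As the text notes, the returned label only says what the data are consistent with.

```python
def mediation_decision(c_sig: bool, a_sig: bool, b_sig: bool, c_prime_sig: bool) -> str:
    """Classify the outcome of the four steps from significance flags.

    c_sig, a_sig, b_sig: whether paths c, a and b are significant (Steps 1-3).
    c_prime_sig: whether path c' is still significant with M controlled (Step 4).
    """
    if not (c_sig and a_sig and b_sig):
        return "no mediation indicated"
    return "partial mediation" if c_prime_sig else "complete mediation"


# All paths significant, including the direct path with the mediator controlled.
print(mediation_decision(True, True, True, True))   # prints "partial mediation"
print(mediation_decision(True, True, True, False))  # prints "complete mediation"
```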