2. Types of Statistical Tests
When running a t test or ANOVA
We compare:
Mean differences between groups
We assume:
random sampling
the groups have homogeneous variances
the distribution is normal
samples are large enough to represent the population (>30)
DV data: represented on an interval or ratio scale
These are parametric tests!
3. Types of Tests
When the assumptions are violated:
Subjects were not randomly sampled
DV data:
Ordinal (ranked)
Nominal (categorized: types of car, levels of education, learning styles, Likert scale)
The scores are greatly skewed, or we have no knowledge of the distribution
We use tests that are equivalent to the t test and ANOVA:
Non-parametric tests!
4. Requirements for Chi-Square Test
Must be a random sample from the population
Data must be in raw frequencies
Variables must be independent
A sufficiently large sample size is required (at least 20)
Actual count data (not percentages)
Observations must be independent.
Does not prove causality.
5. Different Scales, Different Measures of Association

Scale of Both Variables      Measure of Association
Nominal scale                Pearson chi-square: χ²
Ordinal scale                Spearman's rho
Interval or ratio scale      Pearson r
6. Chi Square
Used when data are nominal (both IV and DV)
Compares frequencies of distributions occurring in different categories or groups
Tests whether group distributions are different
• Shoppers' preference for the taste of 3 brands of candy
Determines the association between IV and DV by counting the frequencies of distribution
• Gender relative to study preference (alone or in a group)
7. Important
The chi square test can only be used on data that has the following characteristics:
The data must be in the form of frequencies
The frequency data must have a precise numerical value and must be organised into categories or groups.
The expected frequency in any one cell of the table must be greater than 5.
The total number of observations must be greater than 20.
8. Formula
χ² = ∑ (O − E)² / E
χ² = the value of chi square
O = the observed value
E = the expected value
∑ (O − E)² / E = all the values of (O − E)², each divided by E, then added together
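As a sketch, the formula can be computed directly in a few lines of Python. The observed and expected counts below are hypothetical illustration data, not from these slides:

```python
# A minimal sketch of the chi-square formula: sum of (O - E)^2 / E over all cells.

def chi_square(observed, expected):
    """Compute the chi-square statistic from parallel lists of counts."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

# Hypothetical counts: 60 observations over three categories,
# with equal expected frequencies of 20 each.
observed = [30, 15, 15]
expected = [20, 20, 20]
print(chi_square(observed, expected))  # 7.5
```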
9. What is it?
Test of proportions
Non-parametric test
Dichotomous variables are used
Tests the association between two factors
e.g. treatment and disease
gender and mortality
10. Types of Chi-Square Analysis Techniques
Test of Independence: a chi-square technique used to determine whether two characteristics (such as food spoilage and refrigeration temperature) are related or independent.
Goodness-of-fit test: a chi-square technique used to study similarities between proportions or frequencies between groupings (or classifications) of categorical data (comparing a distribution of data with another distribution where the expected frequencies are known).
11. Chi Square Test of Independence
Purpose
To determine whether two variables of interest are independent (not related) or related (dependent).
When the variables are independent, knowledge of one gives us no information about the other variable. When they are dependent, knowledge of one variable is helpful in predicting the value of the other variable.
The chi-square test of independence is a test of the influence or impact that a subject's value on one variable has on the same subject's value for a second variable.
Some examples where one might use the chi-square test of independence:
• Is level of education related to level of income?
• Is the level of price related to the level of quality in production?
Hypotheses
The null hypothesis is that the two variables are independent. This will be true if the observed counts in the sample are similar to the expected counts.
• H0: X and Y are independent
• H1: X and Y are dependent
12. Chi Square Test of Independence
Wording of research questions:
Are X and Y independent?
Are X and Y related?
The research hypothesis states that the two variables are dependent or related. This will be true if the observed counts for the categories of the variables in the sample are different from the expected counts.
Level of Measurement
Both X and Y are categorical
13. Assumptions
Chi Square Test of Independence
Each subject contributes data to only one cell
Finite values
Observations must be grouped in categories. No assumption is made about level of data. Nominal, ordinal, or interval data may be used with chi-square tests.
A sufficiently large sample size
In general N > 20.
There is no one accepted cutoff; the general rules are:
• No cells with observed frequency = 0
• No cells with expected frequency < 5
• Applying chi-square to small samples exposes the researcher to an unacceptable rate of Type II errors.
Note: chi-square must be calculated on actual count data, not substituted percentages, which would have the effect of pretending the sample size is 100.
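The sample-size rules above can be collected into a small pre-flight check. The helper below is a sketch (the function name is my own); the example counts are borrowed loosely from the gun-control table later in these slides:

```python
# A sketch of a pre-test check for the chi-square sample-size rules:
# N > 20, no observed cell of 0, and no expected cell below 5.

def check_assumptions(observed, expected):
    """Return (ok, reason) for the general chi-square sample-size rules."""
    n = sum(observed)
    if n <= 20:
        return False, "total N must exceed 20"
    if any(o == 0 for o in observed):
        return False, "a cell has observed frequency 0"
    if any(e < 5 for e in expected):
        return False, "a cell has expected frequency below 5"
    return True, "assumptions satisfied"

ok, reason = check_assumptions([10, 10, 30, 15, 15, 10],
                               [13.9, 13.9, 22.2, 11.1, 11.1, 17.8])
print(ok, reason)  # True assumptions satisfied
```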
14. Chi Square Test of Goodness of Fit
Purpose
To determine whether an observed frequency distribution departs significantly from a hypothesized frequency distribution. This test is sometimes called a One-sample Chi Square Test.
Hypotheses
The null hypothesis is that the observed distribution follows the hypothesized distribution. This will be true if the observed counts in the sample are similar to the expected counts.
• H0: X follows the hypothesized distribution
• H1: X deviates from the hypothesized distribution
15. Chi Square Test of Goodness of Fit
Sample Research Questions
Do students buy more Coke, Gatorade or Coffee?
Does my sample contain a disproportionate number of Hispanics as compared to the population of the county from which they were sampled?
Has the ethnic composition of the city of Amman changed since 1990?
Level of Measurement
X is categorical
16. Assumptions
Chi Square Test of Goodness of Fit
The research question involves the comparison of the observed frequency of one categorical variable within a sample to the expected frequency of that variable.
The observed and theoretical distributions must contain the same divisions (i.e. 'levels' or 'classes')
The expected frequency in each division must be > 5
There must be a sufficient sample (in general N > 20)
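The goodness-of-fit logic can be illustrated with the drink-preference question from the sample research questions. The sketch below uses made-up purchase counts; with no hypothesis to the contrary, the expected frequencies are equal across the three drinks:

```python
# A sketch of a goodness-of-fit chi-square with equal expected frequencies.
# The purchase counts are hypothetical illustration data.

counts = {"Coke": 45, "Gatorade": 30, "Coffee": 15}
n = sum(counts.values())         # 90 observations in total
expected = n / len(counts)       # 90 / 3 = 30 expected per category

chi_square = sum((o - expected) ** 2 / expected for o in counts.values())
print(chi_square)  # 15.0  (df = 3 - 1 = 2; critical value at alpha = .05 is 5.991)
```

Since 15.0 exceeds 5.991, the null hypothesis of equal preference would be rejected for these hypothetical data.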
17. Steps in Test of Hypothesis
1. Determine the appropriate test
2. Establish the level of significance: α
3. Formulate the statistical hypothesis
4. Calculate the test statistic
5. Determine the degree of freedom
6. Compare computed test statistic against a
tabled/critical value
18. 1. Determine Appropriate Test
Chi Square is used when both variables are
measured on a nominal scale.
It can be applied to interval or ratio data that
have been categorized into a small number of
groups.
It assumes that the observations are
randomly sampled from the population.
All observations are independent (an
individual can appear only once in a table and
there are no overlapping categories).
It does not make any assumptions about the
shape of the distribution nor about the
homogeneity of variances.
19. 2. Establish Level of Significance
α is a predetermined value
The convention
• α = .05
• α = .01
• α = .001
20. 3. Determine the Hypothesis: Whether There is an Association or Not
Ho : The two variables are independent
Ha : The two variables are associated
21. 4. Calculating Test Statistics
Contrasts observed frequencies in each cell of a
contingency table with expected frequencies.
The expected frequencies represent the number
of cases that would be found in each cell if the
null hypothesis were true ( i.e. the nominal
variables are unrelated).
The expected frequency of two unrelated events is the product of the row and column frequency divided by the number of cases:
Fe = (Fr × Fc) / N
Expected frequency = (row total × column total) / grand total
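The expected-frequency formula can be sketched for a whole table at once. The 2 × 2 observed counts below are hypothetical:

```python
# A sketch of Fe = (Fr x Fc) / N applied to every cell of a contingency table.

def expected_table(observed):
    """Expected frequencies for each cell, from the row and column totals."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    n = sum(row_totals)
    return [[r * c / n for c in col_totals] for r in row_totals]

observed = [[20, 30],
            [30, 20]]  # hypothetical counts; every margin is 50, N = 100
print(expected_table(observed))  # [[25.0, 25.0], [25.0, 25.0]]
```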
23. 4. Calculating Test Statistics
χ² = ∑ (Fo − Fe)² / Fe
Fo = observed frequencies
Fe = expected frequencies
24. 5. Determine Degrees of Freedom
df = (R − 1)(C − 1)
R = number of levels in the row variable
C = number of levels in the column variable
25. 6. Compare Computed Test Statistic Against a Tabled/Critical Value
The computed value of the Pearson chi-square statistic is compared with the critical value to determine if the computed value is improbable
The critical tabled values are based on sampling distributions of the Pearson chi-square statistic
If the calculated χ² is greater than the tabled χ² value, reject H0
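Step 6 reduces to a simple comparison. As a sketch, the small lookup table below holds the standard chi-square critical values at α = .05 for the first few degrees of freedom:

```python
# A sketch of the decision rule: reject H0 when the calculated chi-square
# exceeds the tabled critical value for the given degrees of freedom.

CRITICAL_05 = {1: 3.841, 2: 5.991, 3: 7.815, 4: 9.488}  # alpha = .05

def reject_h0(chi_square, df, critical=CRITICAL_05):
    """True when H0 should be rejected at alpha = .05."""
    return chi_square > critical[df]

print(reject_h0(11.03, df=2))  # True: 11.03 > 5.991, so H0 is rejected
print(reject_h0(3.00, df=2))   # False: 3.00 < 5.991, so H0 is retained
```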
26. Decision and Interpretation
If the probability of the test statistic is less than
or equal to the probability of the alpha error rate,
we reject the null hypothesis and conclude that
our data supports the research hypothesis. We
conclude that there is a relationship between the
variables.
If the probability of the test statistic is greater
than the probability of the alpha error rate, we
fail to reject the null hypothesis. We conclude
that there is no relationship between the
variables, i.e. they are independent.
27. Example
Suppose a researcher is interested in
voting preferences on gun control issues.
A questionnaire was developed and sent
to a random sample of 90 voters.
The researcher also collects information
about the political party membership of the
sample of 90 respondents.
28. Bivariate Frequency Table or Contingency Table

             Favor   Neutral   Oppose   f row
Democrat       10       10       30      50
Republican     15       15       10      40
f column       25       25       40     n = 90
29. Bivariate Frequency Table or Contingency Table
(the cell entries are the observed frequencies)

             Favor   Neutral   Oppose   f row
Democrat       10       10       30      50
Republican     15       15       10      40
f column       25       25       40     n = 90
30. Bivariate Frequency Table or Contingency Table
(the f row column gives the row frequencies)

             Favor   Neutral   Oppose   f row
Democrat       10       10       30      50
Republican     15       15       10      40
f column       25       25       40     n = 90
31. Bivariate Frequency Table or Contingency Table
(the f column row gives the column frequencies)

             Favor   Neutral   Oppose   f row
Democrat       10       10       30      50
Republican     15       15       10      40
f column       25       25       40     n = 90
32. 1. Determine Appropriate Test
1. Party Membership (2 levels), nominal
2. Voting Preference (3 levels), nominal
34. 3. Determine the Hypothesis
• H0: There is no difference between Democrats & Republicans in their opinion on the gun control issue.
• Ha: There is an association between responses to the gun control survey and party membership in the population.
35. 4. Calculating Test Statistics

             Favor       Neutral     Oppose      f row
Democrat     fo = 10     fo = 10     fo = 30      50
             fe = 13.9   fe = 13.9   fe = 22.2
Republican   fo = 15     fo = 15     fo = 10      40
             fe = 11.1   fe = 11.1   fe = 17.8
f column       25          25          40        n = 90
36. 4. Calculating Test Statistics
(e.g. the Democrat/Favor expected frequency: fe = 50 × 25 / 90 = 13.9)

             Favor       Neutral     Oppose      f row
Democrat     fo = 10     fo = 10     fo = 30      50
             fe = 13.9   fe = 13.9   fe = 22.2
Republican   fo = 15     fo = 15     fo = 10      40
             fe = 11.1   fe = 11.1   fe = 17.8
f column       25          25          40        n = 90
37. 4. Calculating Test Statistics
(e.g. the Republican/Favor expected frequency: fe = 40 × 25 / 90 = 11.1)

             Favor       Neutral     Oppose      f row
Democrat     fo = 10     fo = 10     fo = 30      50
             fe = 13.9   fe = 13.9   fe = 22.2
Republican   fo = 15     fo = 15     fo = 10      40
             fe = 11.1   fe = 11.1   fe = 17.8
f column       25          25          40        n = 90
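The expected frequencies and the test statistic for the gun-control table can be reproduced in a short sketch:

```python
# A sketch reproducing the gun-control example: expected frequencies from
# the margins, then chi-square = sum of (Fo - Fe)^2 / Fe over all 6 cells.

observed = [[10, 10, 30],   # Democrat:   Favor, Neutral, Oppose
            [15, 15, 10]]   # Republican: Favor, Neutral, Oppose

row_totals = [sum(row) for row in observed]        # [50, 40]
col_totals = [sum(col) for col in zip(*observed)]  # [25, 25, 40]
n = sum(row_totals)                                # 90

chi_square = sum(
    (observed[i][j] - row_totals[i] * col_totals[j] / n) ** 2
    / (row_totals[i] * col_totals[j] / n)
    for i in range(2) for j in range(3)
)
print(round(chi_square, 3))  # 11.025, matching the SPSS output on slide 41
```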
40. 6. Compare Computed Test Statistic Against a Tabled/Critical Value
α = 0.05
df = 2
Critical tabled value = 5.991
Test statistic, 11.03, exceeds the critical value
The null hypothesis is rejected
Democrats & Republicans differ significantly in their opinions on gun control issues
41. SPSS Output for Gun Control Example
Chi-Square Tests

                               Value     df   Asymp. Sig. (2-sided)
Pearson Chi-Square             11.025a    2   .004
Likelihood Ratio               11.365     2   .003
Linear-by-Linear Association    8.722     1   .003
N of Valid Cases                   90
a. 0 cells (.0%) have expected count less than 5. The minimum expected count is 11.11.
42. Additional Information in SPSS Output
Exceptions that might distort the χ² assumptions:
Associations in some but not all categories
Low expected frequency per cell
The extent of association is not the same as statistical significance
Demonstrated through an example
43. Another Example: Heparin Lock Placement
Complication Incidence * Heparin Lock Placement Time Group Crosstabulation
(Time group: 1 = 72 hrs, 2 = 96 hrs)

                                                   Group 1   Group 2   Total
Had complication      Count                            9        11       20
                      Expected Count                 10.0      10.0     20.0
                      % within placement time group  18.0%     22.0%    20.0%
Had NO complication   Count                           41        39       80
                      Expected Count                 40.0      40.0     80.0
                      % within placement time group  82.0%     78.0%    80.0%
Total                 Count                           50        50      100
                      Expected Count                 50.0      50.0    100.0
                      % within placement time group 100.0%    100.0%   100.0%
from Polit Text: Table 8-1
44. Hypotheses in Heparin Lock Placement
Ho: There is no association between
complication incidence and length of
heparin lock placement. (The variables are
independent).
Ha: There is an association between
complication incidence and length of
heparin lock placement. (The variables are
related).
45. More of SPSS Output
Chi-Square Tests

                               Value   df   Asymp. Sig.   Exact Sig.   Exact Sig.
                                             (2-sided)     (2-sided)    (1-sided)
Pearson Chi-Square             .250b    1   .617
Continuity Correctiona         .063     1   .803
Likelihood Ratio               .250     1   .617
Fisher's Exact Test                                        .803         .402
Linear-by-Linear Association   .248     1   .619
N of Valid Cases               100
a. Computed only for a 2x2 table
b. 0 cells (.0%) have expected count less than 5. The minimum expected count is 10.00.
46. Pearson Chi-Square
Pearson Chi-Square = .250, p = .617
Since p > .05, we fail to reject the null hypothesis that the complication rate is unrelated to heparin lock placement time.
The continuity correction is used in situations in which the expected frequency for any cell in a 2 by 2 table is less than 10.
47. More SPSS Output
Symmetric Measures

                                             Value   Asymp. Std. Errora   Approx. Tb   Approx. Sig.
Nominal by Nominal    Phi                    -.050                                      .617
                      Cramer's V              .050                                      .617
Interval by Interval  Pearson's R            -.050         .100             -.496      .621c
Ordinal by Ordinal    Spearman Correlation   -.050         .100             -.496      .621c
N of Valid Cases      100
a. Not assuming the null hypothesis.
b. Using the asymptotic standard error assuming the null hypothesis.
c. Based on normal approximation.
48. Phi Coefficient
Pearson Chi-Square provides information about the existence of a relationship between 2 nominal variables, but not about the magnitude of the relationship.
The phi coefficient is a measure of the strength of the association:
φ = √(χ² / N)
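The phi formula can be checked against the SPSS output above. Note that SPSS reports a signed value (−.050 for this 2 × 2 table), while the formula as written here yields the magnitude:

```python
# A sketch of the phi coefficient: phi = sqrt(chi-square / N).

import math

def phi(chi_square, n):
    """Strength of association for a 2x2 table (magnitude only)."""
    return math.sqrt(chi_square / n)

# Values from the heparin lock SPSS output: chi-square = .250, N = 100.
print(phi(0.250, 100))  # 0.05
```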
49. Cramer's V
When the table is larger than 2 by 2, a different index must be used to measure the strength of the relationship between the variables. One such index is Cramer's V.
If Cramer's V is large, it means that there is a tendency for particular categories of the first variable to be associated with particular categories of the second variable.
V = √(χ² / (N(k − 1)))
50. Cramer's V
V = √(χ² / (N(k − 1)))
N = the number of cases
k = the smaller of the number of rows or columns
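As a sketch, Cramer's V for the earlier 2 × 3 gun-control table (χ² = 11.025, N = 90, k = 2):

```python
# A sketch of Cramer's V: V = sqrt(chi-square / (N(k - 1))),
# where k is the smaller of the number of rows and columns.

import math

def cramers_v(chi_square, n, n_rows, n_cols):
    k = min(n_rows, n_cols)
    return math.sqrt(chi_square / (n * (k - 1)))

# Gun-control example: chi-square = 11.025, N = 90, a 2 x 3 table, so k = 2.
print(round(cramers_v(11.025, 90, 2, 3), 2))  # 0.35
```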
51. Interpreting Cell Differences in
a Chi-square Test - 1
A chi-square test of
independence of the
relationship between sex
and marital status finds a
statistically significant
relationship between the
variables.
52. Interpreting Cell Differences in a Chi-square Test - 2
Researchers often try to identify which cell or cells are the major contributors to the significant chi-square test by examining the pattern of column percentages.
Based on the column percentages, we would identify the cells on the married row and the widowed row as the ones producing the significant result because they show the largest differences: 8.2% on the married row (50.9% - 42.7%) and 9.0% on the widowed row (13.1% - 4.1%).
53. Interpreting Cell Differences in
a Chi-square Test - 3
Using a level of significance of 0.05, the critical value for a
standardized residual would be -1.96 and +1.96. Using
standardized residuals, we would find that only the cells on the
widowed row are the significant contributors to the chi-square
relationship between sex and marital status.
If we interpreted the contribution of the married marital status,
we would be mistaken. Basing the interpretation on column
percentages can be misleading.
54. Chi-Square Test of Independence: post hoc test in SPSS (1)
You can conduct a chi-square test of independence in crosstabulation in SPSS by selecting:
Analyze > Descriptive Statistics > Crosstabs…
55. Chi-Square Test of Independence: post hoc test in SPSS (2)
First, select and move the variables for the question to the "Row(s):" and "Column(s):" list boxes.
The variable mentioned first in the problem, sex, is used as the independent variable and is moved to the "Column(s):" list box.
The variable mentioned second in the problem, [fund], is used as the dependent variable and is moved to the "Row(s):" list box.
Second, click on the "Statistics…" button to request the test statistic.
56. Chi-Square Test of Independence: post hoc test in SPSS (3)
First, click on "Chi-square" to request the chi-square test of independence.
Second, click on the "Continue" button to close the Statistics dialog box.
57. Chi-Square Test of Independence: post hoc test in SPSS (6)
In the Chi-Square Tests result table, SPSS also tells us that "0 cells have expected count less than 5 and the minimum expected count is 70.63".
The sample size requirement for the chi-square test of independence is satisfied.
58. Chi-Square Test of Independence: post hoc test in SPSS (7)
The probability of the chi-square test statistic (chi-square = 2.821) was p = 0.244, greater than the alpha level of significance of 0.05. The null hypothesis that differences in "degree of religious fundamentalism" are independent of differences in "sex" is not rejected.
The research hypothesis that differences in "degree of religious fundamentalism" are related to differences in "sex" is not supported by this analysis.
Thus, the answer for this question is False. We do not interpret cell differences unless the chi-square test statistic supports the research hypothesis.
59. SPSS Analysis
Chi Square Test of Goodness of Fit
Analyze > Nonparametric Tests > Chi Square
The X variable goes in the variable list.
If all expected frequencies are the same, check that option in the dialog; if the expected frequencies are not all the same, enter the expected value for each division.
60. Examples (using the Montana.sav data)
One example where all expected frequencies are the same; one where they are not.
61. Types of Statistical Inference
Parameter estimation:
Used to estimate a population value, such as a mean, a relative risk index, or a mean difference between two groups.
Estimation can take two forms:
• Point estimation: involves calculating a single statistic to estimate the parameter, e.g. a mean or median. Disadvantages: point estimates offer no context for interpreting their accuracy, and a point estimate gives no information regarding the probability that it is correct or close to the population value.
• Interval estimation: estimates a range of values that has a high probability of containing the population value.
62. Interval Estimation
For example, it is more likely that the population height mean lies between 165 and 175 cm.
Interval estimation involves constructing a
confidence interval (CI) around the point
estimate.
The upper and lower limits of the CI are called
confidence limits.
A CI around a sample mean communicates a
range of values for the population value, and the
probability of being right. That is, the estimate is
made with a certain degree of confidence of
capturing the parameter.
63. Confidence Intervals around a Mean
95% CI = mean ± (1.96 × SEM)
This statement indicates that we can be 95% confident that the population mean lies between the confidence limits, and that these limits are equal to 1.96 times the standard error, above and below the sample mean.
E.g. if the mean = 61 inches and SEM = 1, what is the 95% CI?
Solution: 95% CI = 61 ± (1.96 × 1) = 61 ± 1.96
95% CI: 59.04 < μ < 62.96
E.g. if the mean = 61 inches and SEM = 1, what is the 99% CI?
Solution: 99% CI = 61 ± (2.58 × 1) = 61 ± 2.58
99% CI: 58.42 < μ < 63.58
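The two worked examples can be sketched as:

```python
# A sketch of the confidence-interval formula: CI = mean +/- (z x SEM),
# with z = 1.96 for a 95% CI and z = 2.58 for a 99% CI.

def confidence_interval(mean, sem, z):
    """Return the lower and upper confidence limits."""
    return mean - z * sem, mean + z * sem

low95, high95 = confidence_interval(61, 1, z=1.96)
print(round(low95, 2), round(high95, 2))  # 59.04 62.96

low99, high99 = confidence_interval(61, 1, z=2.58)
print(round(low99, 2), round(high99, 2))  # 58.42 63.58
```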
64. Data Interpretation
Considerations:
1. Accuracy
1. critical view of the data
2. investigating evidence of the results
3. consider other studies' results
4. peripheral data analysis
5. conduct power analysis: Type I & Type II errors

             H0 True        H0 False
Accept H0    Correct        Type II error
Reject H0    Type I error   Correct
65. Types of Errors

If You…                      When the Null Hypothesis is…            Then You Have…
Reject the null hypothesis   True (there really is no difference)    Made a Type I error
Reject the null hypothesis   False (there really is a difference)    ☻
Accept the null hypothesis   False (there really is a difference)    Made a Type II error
Accept the null hypothesis   True (there really is no difference)    ☻
66. alpha: the level of significance used for establishing Type I error
β: the probability of Type II error
1 − β: the probability of obtaining significant results (power)
Effect size: how much we can say that the intervention made a significant difference
67. 2. Meaning of the results
- translation of the results to make them understandable
3. Importance:
- translation of the significant findings into practical findings
4. Generalizability:
- how can we make the findings useful for the whole population
5. Implication:
- what have we learned related to what has been used during the study
68. Needed Parameters
Alpha--chance of a Type I error
Beta--chance of a Type II error
Power = 1 - beta
Effect size--difference between groups or
amount of variance explained or how
much relationship there is between the DV
and the IVs
69. Remember this in English?
Type I error is when you say there is a
difference or relationship and there is not
Type II error is when you say there is no
difference or relationship and there really
is
70. Which is more important?
Type I error more important if possibility of
harm or lethal effect
Type II error more important in relatively
unexplored areas of research
In some studies, Type I and Type II errors
may be equally important
71. How to Increase Power
1. Increase the n
2. Decrease the unexplained variance--control by design or statistics
(e.g. ANCOVA)
3. Increase alpha (controversial)
4. Use a one tailed test (directional hypothesis)--puts the zone of
rejection all in one tail; same effect as increasing alpha
5. Use parametric statistics as long as you meet the assumptions. If
not, parametric statistics are LESS powerful
6. Decrease measurement error (decrease unexplained variance)--use
more reliable instruments, standardize measurement protocol,
frequent calibration of physiologic instruments, improve inter-rater
reliability
72. What is good power?
By tradition, “good” power is 80%
The correct answer is it depends on the nature of
the phenomenon and which kind of error is most
important in your study. This is a theoretical
argument that you have to make.
Using convention (alpha = .05 and power = .80,
beta = .20) you are saying that Type I error is
_________ as serious as a Type II error
73. Effect Size
How large an effect do I expect exists in the
population if the null is false?
OR
How much of a difference do I want to be
able to detect?
The larger the effect, the fewer the cases
needed to see it. (The difference is so big
you can trip on it.)
74. The World According to Power
Kraemer & Thiemann
The more stringent the significance level, the greater the
necessary sample size. More subjects are needed for a
1% level than a 5% level
Two tailed tests require larger sample sizes than one
tailed tests. Assessing two directions at the same time
requires a greater investment.
The smaller the effect size, the larger the necessary
sample size. Subtle effects require greater efforts.
The larger the power required, the larger the necessary
sample size. Greater protection from failure requires
greater effort.
The smaller the sample size, the smaller the power, i.e. the greater the chance of failure.
75. The World According to Power
Kraemer & Thiemann
If one proposes to go with a sample size of 20 or fewer, you have to be willing to accept a high risk of failure or a huge effect size
To achieve 99% power for an effect size of .01, you need > 150,000 subjects
76. Power for each test
You do a power analysis for each statistic you are going to use.
Choose the sample size based on the highest number of subjects from the power analyses.
Use the most conservative power analysis--this guarantees you the most subjects.