Week 5 Lecture 14
The Chi Square Test
Quite often, patterns of responses or measures give us a lot of
information. Patterns are generally the result of counting how
many things fit into a particular category. Whenever we make a
histogram, bar, or pie chart we are looking at the pattern of the
data. Frequently, changes in these visual patterns will be our
first clues that things have changed, and the first clue that we
need to initiate a research study (Lind, Marchel, & Wathen,
2008).
One of the most useful tests in examining patterns and
relationships in data involving counts (how many fit into this
category, how many into that, etc.) is the chi-square. It is
extremely easy to calculate and has many more uses than we
will cover. Examining patterns involves two uses of the Chi-
square - the goodness of fit and the contingency table. Both of
these uses have a common trait: they involve counts per group.
In fact, the chi-square is the only statistic we will look at that
we use when we have counts per multiple groups (Tanner &
Youssef-Morgan, 2013).
Chi Square Goodness of Fit Test
The goodness of fit test checks to see if the data distribution
(counts per group) matches some pattern we are interested in.
Example: Are the employees in our example company
distributed equally across the grades? Or, as a more reasonable
expectation for a company, are the employees
distributed in a pyramid fashion – most at the bottom and few
at the top?
The Chi Square test compares the actual versus a proposed
distribution of counts by generating a measure for each cell or
count: (actual – expected)^2/expected. Summing these for all of the
cells or groups provides us with the Chi Square Statistic. As
with our other tests, we determine the p-value of getting a result
as large or larger to decide whether we reject or fail to reject our null
hypothesis. An example will show the approach using Excel.
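As a minimal sketch of the calculation just described, the Python function below divides each squared difference by the expected count (the standard form of the statistic, and the one used later for contingency tables) and sums the results. The function name and the three-category counts are hypothetical, chosen only for illustration; they are not the lecture's data.

```python
def chi_square_statistic(actual, expected):
    """Sum (actual - expected)^2 / expected over all cells."""
    if len(actual) != len(expected):
        raise ValueError("actual and expected need the same number of cells")
    return sum((a - e) ** 2 / e for a, e in zip(actual, expected))


# Hypothetical counts for three categories (illustration only).
actual_counts = [10, 20, 30]
expected_counts = [20, 20, 20]

statistic = chi_square_statistic(actual_counts, expected_counts)
print(statistic)  # 100/20 + 0/20 + 100/20 = 10.0
```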
Whichever Chi Square test we run, the chi square related
functions are found in the fx Statistical list rather than in the
Data Analysis tool where we found the t and ANOVA test functions.
The most important for us are:
· CHISQ.TEST(actual range, expected range) – returns the p-
value for the test
· CHISQ.INV.RT(p-value, df) – returns the actual Chi Square
value for the p-value or probability value used.
· CHISQ.DIST.RT(X, df) – returns the p-value for a given
chi square value.
When we have a table of actual and expected results, using the
=CHISQ.TEST(actual range, expected range) will provide us
with the p-value of the calculated chi square value (but does not
give us the actual calculated chi square value for the test). We
can compare this value against our alpha criteria (generally
0.05) to make our decision about rejecting or not rejecting the
null hypothesis.
If, after finding the p-value for our chi square test, we want to
determine the calculated value of the chi square statistic, we can
use the =CHISQ.INV.RT(probability, df) function. The value for
probability is our chi square test outcome (the p-value), and the degrees of
freedom (df) equals the number of cells in our actual table
minus 1 (6 – 1 = 5 for a problem working with our 6 grade
levels). Finally, if we are interested in the probability of
exceeding a particular chi square value, we can use the
CHIDIST or CHISQ.DIST.RT function.
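For readers working outside Excel, the two distribution functions can be approximated with the Python standard library alone. This is a sketch under stated assumptions, not Excel's implementation: chi2_right_tail and chi2_inverse_right_tail are hypothetical names standing in for CHISQ.DIST.RT and CHISQ.INV.RT, the right-tail probability comes from a series for the incomplete gamma function, and the inverse is found by bisection.

```python
import math


def chi2_right_tail(x, df):
    """Right-tail probability P(X >= x) for a chi-square variable with df degrees of freedom."""
    if x <= 0:
        return 1.0
    a, z = df / 2.0, x / 2.0
    # Series for the regularized lower incomplete gamma function P(a, z).
    term = 1.0 / a
    total = term
    n = 1
    while abs(term) > 1e-15 * abs(total):
        term *= z / (a + n)
        total += term
        n += 1
    lower = total * math.exp(-z + a * math.log(z) - math.lgamma(a))
    return max(0.0, 1.0 - lower)


def chi2_inverse_right_tail(p, df):
    """Chi-square value whose right-tail probability is p, found by bisection."""
    if not 0.0 < p < 1.0:
        raise ValueError("p must be strictly between 0 and 1")
    lo, hi = 0.0, 1.0
    while chi2_right_tail(hi, df) > p:  # widen the bracket until it contains the answer
        hi *= 2.0
    for _ in range(100):
        mid = (lo + hi) / 2.0
        if chi2_right_tail(mid, df) > p:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0


# With df = 5, a 0.05 right tail corresponds to a chi-square value near 11.07.
critical = chi2_inverse_right_tail(0.05, 5)
tail = chi2_right_tail(critical, 5)
print(round(critical, 2), round(tail, 3))
```

Feeding the output of one function into the other recovers the starting value, which is the same round trip the text describes between a p-value and its chi square statistic.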
Excel Example. To see if our employees are distributed in a
traditional pyramid shape, we would use the Chi Square
Goodness of Fit test as we are dealing both with count data and
with a proposed distribution pattern. For this test, let us assume
the following table shows the expected distribution of our 50
employees in a pyramid organizational structure.
Grade:  A   B   C   D   E   F   Total
Count:  15  12  10  6   4   3   50
The actual or observed distribution within our sample is shown
below.
Grade:  A   B   C   D   E   F   Total
Count:  15  7   5   5   12  6   50
The research question: Are employees distributed in a pyramidal
fashion?
Step 1: Ho: No difference exists between observed and expected
frequency counts.
Ha: Observed and expected frequencies differ.
Step 2: Reject the null hypothesis if the p-value < alpha = .05.
Step 3: Chi Square Goodness of Fit test.
Step 4: Conduct the test. Below is a screenshot of an Excel
solution.
Step 5: Conclusions and Interpretation. Since our p-value of
0.00024 is < our alpha of 0.05, we reject the null hypothesis.
The employees are not distributed in a pyramid pattern.
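The same test can be recomputed by hand from the expected and observed tables given earlier. The Python sketch below uses illustrative variable names and shows how much each grade contributes to the chi square value; it computes only the statistic and the degrees of freedom, not the p-value.

```python
grades = ["A", "B", "C", "D", "E", "F"]
expected = [15, 12, 10, 6, 4, 3]   # pyramid pattern from the lecture
observed = [15, 7, 5, 5, 12, 6]    # sample counts from the lecture

# Per-cell measure: (observed - expected)^2 / expected.
contributions = [(o - e) ** 2 / e for o, e in zip(observed, expected)]
chi_square = sum(contributions)
df = len(grades) - 1

for grade, part in zip(grades, contributions):
    print(f"{grade}: {part:.3f}")
print(f"chi square = {chi_square:.2f}, df = {df}")  # chi square = 23.75, df = 5
```

Grade E, with 12 observed against 4 expected, contributes 16 of the 23.75 total, so most of the departure from the pyramid pattern comes from that one cell.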
Side Note: We might think that if our sample had an equal
number of employees per grade we would have a better chance
of grade based differences averaging out. Doing this same test
and assuming an equal distribution across grades produces a p-
value of 0.063 causing us to fail to reject the null hypothesis.
The student is encouraged to try this; the expected value for each
grade would be 50/6.
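A short Python sketch of that exercise follows. Instead of a p-value it compares the statistic with the tabled critical value for df = 5 at alpha = 0.05, which is an equivalent decision rule swapped in here to keep the example free of distribution functions; the constant 11.07 is that standard table value, not a number from the lecture.

```python
observed = [15, 7, 5, 5, 12, 6]        # sample counts per grade from the lecture
n = sum(observed)                      # 50 employees
expected_each = n / len(observed)      # 50/6, about 8.33 per grade

chi_square = sum((o - expected_each) ** 2 / expected_each for o in observed)

# Tabled right-tail critical value for df = 5 at alpha = 0.05
# (what CHISQ.INV.RT(0.05, 5) gives, to two decimals).
CRITICAL_VALUE = 11.07

reject_null = chi_square > CRITICAL_VALUE
print(f"chi square = {chi_square:.2f}")   # chi square = 10.48
print("reject null:", reject_null)        # reject null: False
```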
Effect size. For a single row, goodness-of-fit test, the
associated effect size measure is called effect size r, and equals
the square root of: the chi square value/(N*df), where df = the
number of cells – 1. A value less than .30 is considered small,
between .30 and .50 is considered moderate, and more than .50
is considered large (Steinberg, 2008). Since we rejected the
null in the example above, the effect size would be: r = square
root (23.75/(50*5)) = sqrt(0.095) = 0.31. This is a moderate
impact, suggesting that both sample size and variable
interaction had some impact. With moderate results, we
generally would want to get a larger sample and repeat the test
(Tanner & Youssef-Morgan, 2013).
Chi Square Contingency Table test
Contingency table tests, also known as tests of independence,
are slightly more complex than goodness of fit tables. They
classify the data by two or more variable labels (we will limit
our discussions to two variable tables), so the data table looks a lot like the
input table for the ANOVA two-factor without replication we
looked at last week. Both variables involve the counts per
category (nominal, ordinal, or interval/ratio data in ranges) of
items that meet our research interest (Lind, Marchel, & Wathen,
2008).
With most contingency tables, we do not have a given expected
frequency as we had with the goodness of fit situation. To find
the expected value for each cell for a multiple row/column
table, we use the formula: row total * column total/grand total
(the count we would expect in each cell if the two variables were
unrelated, not an unreasonable expectation under the null hypothesis).
Once we have generated the values for the expected table, we
use the same formula to perform the Chi Square test. Manually,
this is the sum of ((actual – expected)^2/expected) for all of the
cells. The same fx Chi Square functions used for the Goodness
of Fit test are used for the Contingency Table analysis.
The null hypothesis for a contingency table test is “no
relationship exists between the variables.” The alternate
hypothesis would be: “a relationship exists.” In general, you
are testing either for similar distributions between the groups of
interest or to see if a relationship ("correlation") exists (even if
the data is nominal level). The df for a contingency table is
(number of rows-1)*(number of columns – 1).
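The expected-count formula, the chi square sum, and the degrees of freedom can be sketched together in Python. The 2 x 3 table of counts below is hypothetical and is not the lecture's rating data; it only illustrates the mechanics.

```python
# Hypothetical 2 x 3 table of counts (rows: two groups, columns: three rating bands).
observed = [
    [10, 10, 5],
    [5, 10, 10],
]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
grand_total = sum(row_totals)

# Expected count per cell = row total * column total / grand total.
expected = [[r * c / grand_total for c in col_totals] for r in row_totals]

chi_square = sum(
    (o - e) ** 2 / e
    for obs_row, exp_row in zip(observed, expected)
    for o, e in zip(obs_row, exp_row)
)
df = (len(row_totals) - 1) * (len(col_totals) - 1)

print(expected)  # [[7.5, 10.0, 7.5], [7.5, 10.0, 7.5]]
print(f"chi square = {chi_square:.3f}, df = {df}")  # chi square = 3.333, df = 2
```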
Excel Example. The data entry for this test is the same as with
our earlier test, and the functions are found in the fx statistical
list. One possible explanation for different salaries is the
performance on the job, reflected in the performance rating.
We might wonder if males and females are evaluated differently
(either due to actual performance or to bias; if so, we have
another issue to examine). So, our research question for this
issue becomes, are males and females rated the same?
Step 1: Ho: Male and Female ratings are similar (no difference
in distributions)
Ha: Male and Female rating distributions differ
Step 2: Reject Ho if p-value is < alpha = 0.05.
Step 3: Chi Square Contingency Table Analysis
Step 4: Perform Test.
Step 5: Conclusions and Interpretation. Since the p-value
(CHISQ.TEST result) is greater than (>) alpha = .05, we fail to
reject the null hypothesis and
conclude that males and females are evaluated in a similar
pattern. It does not appear that performance ratings impact
average salary differences.
Effect size. Now, as with the t-test and ANOVA, had we
rejected the null hypothesis, we would have wanted to examine
the practical impact of the outcome using an effect size
measure. The effect size measure for the Chi Square is a
correlation measure. Two measures are generally used with the
contingency table outcomes – the Phi coefficient and Cramer’s
V (Tanner & Youssef-Morgan, 2013).
The Phi coefficient (=square root of (chi square/sample size))
provides a rough estimate of the correlation between the two
variables. Phi is primarily used with small tables (2x2, 2x3, or
3x2). Values below .30 are weak, .30 to about .50 are
moderate, and above .50 (to 1) are strong relationships (Tanner
& Youssef-Morgan, 2013).
Cramer’s V can be considered as a percent of the shared
variation – or common variation between the variables. It
equals the square root of (phi squared/(k – 1)), where k is the smaller
of the number of rows or columns. It ranges from 0 (no relationship or variation in
common) to 1.0 (strong relationship, all variation in common)
(Tanner & Youssef-Morgan, 2013).
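The two definitions can be sketched in Python as follows. The function names are hypothetical; the inputs are the chi square value (1.978) and sample size (50) that this lecture works through for its rating example, and the column count of 3 is an assumption, since only the smaller table dimension (2) affects the result.

```python
import math


def phi_coefficient(chi_square, n):
    """Phi = sqrt(chi square / sample size)."""
    return math.sqrt(chi_square / n)


def cramers_v(chi_square, n, n_rows, n_cols):
    """Cramer's V = sqrt(phi^2 / (k - 1)), with k the smaller table dimension."""
    phi_squared = chi_square / n
    return math.sqrt(phi_squared / (min(n_rows, n_cols) - 1))


phi = phi_coefficient(1.978, 50)
v = cramers_v(1.978, 50, n_rows=2, n_cols=3)  # n_cols=3 is assumed for illustration

print(round(phi, 3))  # 0.199
print(round(v, 3))    # 0.199, equal to phi because the smaller dimension is 2
```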
For our example above, it would not make sense to calculate
either value since we did not reject the null; but for illustrative
purposes we will.
· Phi = square root of (1.978/50) = square root of (0.03956) =
0.199 – small, no relationship
· V = square root of (0.199^2/(2-1)) = 0.199. Note, when the
smaller of the number of rows and columns equals 2, V will
equal Phi (Tanner & Youssef-Morgan, 2013).
Caution
Due to the division involved in calculating the Chi Square
value, it is strongly influenced by cells that have small
expected values. Most texts say simply that if the expected
frequency in one or more cells is less than (<) 5, we should not use the
Chi Square distribution in a hypothesis test. There are some
different opinions about this issue, and different texts issue
different rules on what to do if we have expected frequencies of
5 or less in cells.
As a compromise, let’s use the standard that no more than 20%
of the cells should have an expected value of less than 5. If
they do, we need to combine rows or columns to reduce this
percentage (Lind, Marchel, & Wathen, 2008).
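The 20% standard is easy to check mechanically. The Python sketch below uses hypothetical function names and applies the check to the pyramid expected counts from the goodness of fit example (15, 12, 10, 6, 4, 3).

```python
def share_of_small_cells(expected_counts, threshold=5):
    """Fraction of cells whose expected count is below the threshold."""
    small = sum(1 for e in expected_counts if e < threshold)
    return small / len(expected_counts)


def meets_twenty_percent_rule(expected_counts, threshold=5, max_share=0.20):
    """True when no more than max_share of the cells fall below the threshold."""
    return share_of_small_cells(expected_counts, threshold) <= max_share


pyramid_expected = [15, 12, 10, 6, 4, 3]
share = share_of_small_cells(pyramid_expected)
rule_ok = meets_twenty_percent_rule(pyramid_expected)

print(f"{share:.0%} of cells below 5")  # 33% of cells below 5
print(rule_ok)                          # False
```

Under this standard, grades E and F in that expected table would be candidates for combining before relying on the test.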
References
Lind, D. A., Marchel, W. G., & Wathen, S. A. (2008).
Statistical Techniques in Business & Finance. (13th Ed.)
Boston: McGraw-Hill Irwin.
Steinberg, W.J. (2008). Statistics Alive! Thousand Oaks, CA:
Sage Publications, Inc.
Tanner, D. E. & Youssef-Morgan, C. M. (2013). Statistics for
Managers. San Diego, CA: Bridgepoint Education.
Week 5 Lecture 13
This week we look at two different approaches to analyzing data
and making inferences about the populations they come from.
The first is confidence intervals, a range of values that we
expect to contain the actual population mean based on the
sample results we obtained. The other is a way to use nominal
and ordinal data in a statistical analysis. The Chi Square family
of tests looks at patterns within samples and sees whether the
underlying populations could contain the same pattern of
measure distributions (Lind, Marchel, & Wathen, 2008).
Confidence Intervals
When we perform a t-test or ANOVA, we are using a single
point estimate for the means of the populations we are testing.
Some professionals and managers are a bit uncomfortable with
this; they understand that the sample has a sampling error – and
the actual population mean could be – and most likely is – a bit
different. They are interested in getting an estimate of what the
sampling error is and how much the population mean could
differ from the sample mean.
We deal with this through the use of confidence intervals, a
range of values that have a specific probability of containing
the actual population mean. We have seen one example of a
confidence interval already: the intervals used to determine which
population means varied when we rejected the null hypothesis
for the ANOVA test were confidence intervals.
Confidence intervals often provide the added information and
comfort about estimates of population parameter values that the
single point estimates lack. Since the one thing we do know
about a statistic generated from a sample is that it will not
exactly equal the population parameter, we can use a confidence
interval to get a better feel for the range of values that might be
the actual population parameter. They also give us an
indication of how much variation exists in the data set. The
larger the range (at the same confidence level), the more
variation within the sample data set and the less representative
the mean would be (Lind, Marchel, & Wathen, 2008). We are
going to look at two different kinds of confidence intervals this
week – intervals for a one sample mean and intervals for the
differences between the means of two samples (Lind, Marchel,
& Wathen, 2008).
One Sample Confidence Interval for the mean
A confidence interval is simply a range of values that could
contain the actual population parameter of interest. It is
centered on the sample mean, and uses the variation in the
sample to estimate a range of possible values (Lind, Marchel, &
Wathen, 2008). To construct a confidence interval, we use
several pieces of information from the sample and the
confidence level we want.
From the sample we use the mean, standard deviation, and size.
We also choose the confidence level: a desired probability (usually set
at 95%) that the interval does, in fact, contain the population
mean.
Example. The confidence interval for the female mean salary in
the population would be calculated this way. The sample mean
value is 38, the standard deviation is 18.3, and the sample size is
25 (from Week 1 material). Once we determine the
confidence level we want, we use the associated 2-tail t value to
achieve it. The t-value is found with the fx function t.inv.2t
(Prob, df). For a 95% confidence interval, we would use
t.inv.2t(0.05, 24), which equals 2.064 (rounded).
We now have all the information we need to construct a 95%
confidence interval for the female salary mean:
CI = mean +/- t * stdev/sqrt(sample size) = 38 +/-
2.064*18.3/sqrt(25) = 38 +/- 7.6.
This is typically written as 30.4 to 45.6. Note: the standard
deviation divided by the square root of the sample size is called
the standard error of the mean, and is the variation measure of
the sample used in several statistical tests, including the t-test
and confidence intervals.
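The interval calculation can be sketched in Python. The function name is hypothetical, and the t-value of 2.064 is passed in as given in the text rather than computed.

```python
import math


def mean_confidence_interval(mean, stdev, n, t_value):
    """Return (low, high) for mean +/- t * stdev / sqrt(n)."""
    standard_error = stdev / math.sqrt(n)  # standard error of the mean
    margin = t_value * standard_error
    return mean - margin, mean + margin


# Female salary sample from the lecture: mean 38, stdev 18.3, n = 25, t = 2.064.
low, high = mean_confidence_interval(mean=38, stdev=18.3, n=25, t_value=2.064)
print(f"{low:.1f} to {high:.1f}")  # 30.4 to 45.6
```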
The associated 95% CI for males is 44.6 to 59.3. Note that the
intervals overlap – the male smallest value is 44.6 while the female
largest value is 45.6. This suggests that both population
average salaries could be the same and around 45. However,
just as the two one-sample t-tests gave us misleading
information on possible equality, using two confidence intervals
to compare two populations also is not the best approach.
The Confidence Interval for mean differences.
When comparing multiple samples, it is always best to use all
the possible information in a single test or procedure. The same
is true for confidence intervals. If we are interested in seeing if
sample means could be equal, we look to see if the difference
between the averages could be 0 or not. If so, then the means
could be the same; if not, then the means must be significantly
different.
The formula for the mean difference confidence interval is mean
difference +/- t*standard error. The standard error for the
difference of two populations is found by adding the
variance/sample size (which is the standard error squared) for
each and taking the square root (Lind, Marchel, & Wathen,
2008). For our salary data set we have the following values:
Female mean = 38      Male mean = 52      t = t.inv.2t(0.05, 48) = 2.011
Female Stdev = 18.3   Male Stdev = 17.8   Sample size = 50, df = 48
Standard error = sqrt(Variance (female)/25 + Variance
(male)/25) = sqrt(334.7/25 + 316/25) = 5.10.
This gives us a 95% confidence interval for the difference
equaling:
(52-38) +/- 2.011 * 5.10 = 14 +/- 10.3 = 3.7 to 24.3.
Since this confidence interval does not contain 0, we are 95%
confident that the male and female salary means are not equal –
which is the same result we got from our 2 sample t-test in week
2. We also now have a sense of how much variation exists in
our measures.
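A Python sketch of the mean-difference interval follows. The function name is hypothetical, the variances (334.7 and 316) and group sizes (25 each) are the ones quoted in the text, and the two-tailed critical value for 48 degrees of freedom is taken as approximately 2.0106 from a standard t table, which is an assumption of this sketch rather than a computed value.

```python
import math


def difference_confidence_interval(mean_1, var_1, n_1, mean_2, var_2, n_2, t_value):
    """Return (low, high, standard_error) for (mean_1 - mean_2) +/- t * SE."""
    standard_error = math.sqrt(var_1 / n_1 + var_2 / n_2)
    margin = t_value * standard_error
    difference = mean_1 - mean_2
    return difference - margin, difference + margin, standard_error


T_VALUE_DF_48 = 2.0106  # assumed two-tailed 95% critical value for df = 48

low, high, se = difference_confidence_interval(
    52, 316.0, 25,   # male mean, variance, sample size
    38, 334.7, 25,   # female mean, variance, sample size
    T_VALUE_DF_48,
)
contains_zero = low <= 0 <= high

print(f"SE = {se:.2f}")                    # SE = 5.10
print(f"{low:.1f} to {high:.1f}")          # 3.7 to 24.3
print("interval contains 0:", contains_zero)  # interval contains 0: False
```

With this t-value the interval excludes 0, which agrees with the conclusion that the two salary means differ.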
Side note: The “+/- t* SE” term is often called the margin of
error. We most often hear this phrase in conjunction with
opinion polls – particularly political polls: “candidate A has a
43% approval rating with a margin of error of 3.5%.” While we
do not deal with proportions in the class, they are calculated the
same as an empirical probability – number of positive replies
divided by the sample size. The construction of these margins
or confidence intervals is conceptually the same – a t-value and a
standard error of the proportion based on the sample size and
results (Lind, Marchel, & Wathen, 2008).
References
Lind, D. A., Marchel, W. G., & Wathen, S. A. (2008).
Statistical Techniques in Business & Finance. (13th Ed.)
Boston: McGraw-Hill Irwin.

More Related Content

Similar to Week 5 Lecture 14 The Chi Square TestQuite often, patterns of .docx

1Running head RESEARCH PROJECT PROPOSAL 13RESEA.docx
1Running head RESEARCH PROJECT PROPOSAL    13RESEA.docx1Running head RESEARCH PROJECT PROPOSAL    13RESEA.docx
1Running head RESEARCH PROJECT PROPOSAL 13RESEA.docxfelicidaddinwoodie
 
Week 3 Lecture 9 Effect Size When we reject the null h.docx
Week 3 Lecture 9 Effect Size When we reject the null h.docxWeek 3 Lecture 9 Effect Size When we reject the null h.docx
Week 3 Lecture 9 Effect Size When we reject the null h.docxcockekeshia
 
QUESTION 1Question 1 Describe the purpose of ecumenical servic.docx
QUESTION 1Question 1 Describe the purpose of ecumenical servic.docxQUESTION 1Question 1 Describe the purpose of ecumenical servic.docx
QUESTION 1Question 1 Describe the purpose of ecumenical servic.docxmakdul
 
Quantitative_analysis.ppt
Quantitative_analysis.pptQuantitative_analysis.ppt
Quantitative_analysis.pptmousaderhem1
 
PAGE O&M Statistics – Inferential Statistics Hypothesis Test.docx
PAGE  O&M Statistics – Inferential Statistics Hypothesis Test.docxPAGE  O&M Statistics – Inferential Statistics Hypothesis Test.docx
PAGE O&M Statistics – Inferential Statistics Hypothesis Test.docxgerardkortney
 
Medical Statistics Part-II:Inferential statistics
Medical Statistics Part-II:Inferential  statisticsMedical Statistics Part-II:Inferential  statistics
Medical Statistics Part-II:Inferential statisticsRamachandra Barik
 
Article Write-upsTo help you connect what you’re learning in cl.docx
Article Write-upsTo help you connect what you’re learning in cl.docxArticle Write-upsTo help you connect what you’re learning in cl.docx
Article Write-upsTo help you connect what you’re learning in cl.docxdavezstarr61655
 
Marketing Research Hypothesis Testing.pptx
Marketing Research Hypothesis Testing.pptxMarketing Research Hypothesis Testing.pptx
Marketing Research Hypothesis Testing.pptxxababid981
 
Chi square and t tests, Neelam zafar & group
Chi square and t tests, Neelam zafar & groupChi square and t tests, Neelam zafar & group
Chi square and t tests, Neelam zafar & groupNeelam Zafar
 
STATISTICS : Changing the way we do: Hypothesis testing, effect size, power, ...
STATISTICS : Changing the way we do: Hypothesis testing, effect size, power, ...STATISTICS : Changing the way we do: Hypothesis testing, effect size, power, ...
STATISTICS : Changing the way we do: Hypothesis testing, effect size, power, ...Musfera Nara Vadia
 
Day 11 t test for independent samples
Day 11 t test for independent samplesDay 11 t test for independent samples
Day 11 t test for independent samplesElih Sutisna Yanto
 
Chi-square tests are great to show if distributions differ or i.docx
 Chi-square tests are great to show if distributions differ or i.docx Chi-square tests are great to show if distributions differ or i.docx
Chi-square tests are great to show if distributions differ or i.docxMARRY7
 
© 2014 Laureate Education, Inc. Page 1 of 5 Week 4 A.docx
© 2014 Laureate Education, Inc.   Page 1 of 5  Week 4 A.docx© 2014 Laureate Education, Inc.   Page 1 of 5  Week 4 A.docx
© 2014 Laureate Education, Inc. Page 1 of 5 Week 4 A.docxgerardkortney
 
Excel Files AssingmentsCopy of Student_Assignment_File.11.01..docx
Excel Files AssingmentsCopy of Student_Assignment_File.11.01..docxExcel Files AssingmentsCopy of Student_Assignment_File.11.01..docx
Excel Files AssingmentsCopy of Student_Assignment_File.11.01..docxSANSKAR20
 
BUS 308 – Week 4 Lecture 2 Interpreting Relationships .docx
BUS 308 – Week 4 Lecture 2 Interpreting Relationships .docxBUS 308 – Week 4 Lecture 2 Interpreting Relationships .docx
BUS 308 – Week 4 Lecture 2 Interpreting Relationships .docxcurwenmichaela
 
BUS 308 – Week 4 Lecture 2 Interpreting Relationships .docx
BUS 308 – Week 4 Lecture 2 Interpreting Relationships .docxBUS 308 – Week 4 Lecture 2 Interpreting Relationships .docx
BUS 308 – Week 4 Lecture 2 Interpreting Relationships .docxjasoninnes20
 

Similar to Week 5 Lecture 14 The Chi Square TestQuite often, patterns of .docx (20)

1Running head RESEARCH PROJECT PROPOSAL 13RESEA.docx
1Running head RESEARCH PROJECT PROPOSAL    13RESEA.docx1Running head RESEARCH PROJECT PROPOSAL    13RESEA.docx
1Running head RESEARCH PROJECT PROPOSAL 13RESEA.docx
 
Chapter12
Chapter12Chapter12
Chapter12
 
Week 3 Lecture 9 Effect Size When we reject the null h.docx
Week 3 Lecture 9 Effect Size When we reject the null h.docxWeek 3 Lecture 9 Effect Size When we reject the null h.docx
Week 3 Lecture 9 Effect Size When we reject the null h.docx
 
QUESTION 1Question 1 Describe the purpose of ecumenical servic.docx
QUESTION 1Question 1 Describe the purpose of ecumenical servic.docxQUESTION 1Question 1 Describe the purpose of ecumenical servic.docx
QUESTION 1Question 1 Describe the purpose of ecumenical servic.docx
 
Quantitative_analysis.ppt
Quantitative_analysis.pptQuantitative_analysis.ppt
Quantitative_analysis.ppt
 
PAGE O&M Statistics – Inferential Statistics Hypothesis Test.docx
PAGE  O&M Statistics – Inferential Statistics Hypothesis Test.docxPAGE  O&M Statistics – Inferential Statistics Hypothesis Test.docx
PAGE O&M Statistics – Inferential Statistics Hypothesis Test.docx
 
Medical Statistics Part-II:Inferential statistics
Medical Statistics Part-II:Inferential  statisticsMedical Statistics Part-II:Inferential  statistics
Medical Statistics Part-II:Inferential statistics
 
Article Write-upsTo help you connect what you’re learning in cl.docx
Article Write-upsTo help you connect what you’re learning in cl.docxArticle Write-upsTo help you connect what you’re learning in cl.docx
Article Write-upsTo help you connect what you’re learning in cl.docx
 
Data analysis
Data analysisData analysis
Data analysis
 
Marketing Research Hypothesis Testing.pptx
Marketing Research Hypothesis Testing.pptxMarketing Research Hypothesis Testing.pptx
Marketing Research Hypothesis Testing.pptx
 
Chi square and t tests, Neelam zafar & group
Chi square and t tests, Neelam zafar & groupChi square and t tests, Neelam zafar & group
Chi square and t tests, Neelam zafar & group
 
STATISTICS : Changing the way we do: Hypothesis testing, effect size, power, ...
STATISTICS : Changing the way we do: Hypothesis testing, effect size, power, ...STATISTICS : Changing the way we do: Hypothesis testing, effect size, power, ...
STATISTICS : Changing the way we do: Hypothesis testing, effect size, power, ...
 
Day 11 t test for independent samples
Day 11 t test for independent samplesDay 11 t test for independent samples
Day 11 t test for independent samples
 
Chi-square tests are great to show if distributions differ or i.docx
 Chi-square tests are great to show if distributions differ or i.docx Chi-square tests are great to show if distributions differ or i.docx
Chi-square tests are great to show if distributions differ or i.docx
 
© 2014 Laureate Education, Inc. Page 1 of 5 Week 4 A.docx
© 2014 Laureate Education, Inc.   Page 1 of 5  Week 4 A.docx© 2014 Laureate Education, Inc.   Page 1 of 5  Week 4 A.docx
© 2014 Laureate Education, Inc. Page 1 of 5 Week 4 A.docx
 
TEST OF SIGNIFICANCE.pptx
TEST OF SIGNIFICANCE.pptxTEST OF SIGNIFICANCE.pptx
TEST OF SIGNIFICANCE.pptx
 
Basic statistics
Basic statisticsBasic statistics
Basic statistics
 
Excel Files AssingmentsCopy of Student_Assignment_File.11.01..docx
Excel Files AssingmentsCopy of Student_Assignment_File.11.01..docxExcel Files AssingmentsCopy of Student_Assignment_File.11.01..docx
Excel Files AssingmentsCopy of Student_Assignment_File.11.01..docx
 
BUS 308 – Week 4 Lecture 2 Interpreting Relationships .docx
BUS 308 – Week 4 Lecture 2 Interpreting Relationships .docxBUS 308 – Week 4 Lecture 2 Interpreting Relationships .docx
BUS 308 – Week 4 Lecture 2 Interpreting Relationships .docx
 
BUS 308 – Week 4 Lecture 2 Interpreting Relationships .docx
BUS 308 – Week 4 Lecture 2 Interpreting Relationships .docxBUS 308 – Week 4 Lecture 2 Interpreting Relationships .docx
BUS 308 – Week 4 Lecture 2 Interpreting Relationships .docx
 

More from cockekeshia

at least 2 references in each peer responses! I noticed .docx
at least 2 references in each peer responses! I noticed .docxat least 2 references in each peer responses! I noticed .docx
at least 2 references in each peer responses! I noticed .docxcockekeshia
 
At least 2 pages longMarilyn Lysohir, an internationally celebra.docx
At least 2 pages longMarilyn Lysohir, an internationally celebra.docxAt least 2 pages longMarilyn Lysohir, an internationally celebra.docx
At least 2 pages longMarilyn Lysohir, an internationally celebra.docxcockekeshia
 
At least 2 citations. APA 7TH EditionResponse 1. TITop.docx
At least 2 citations. APA 7TH EditionResponse 1. TITop.docxAt least 2 citations. APA 7TH EditionResponse 1. TITop.docx
At least 2 citations. APA 7TH EditionResponse 1. TITop.docxcockekeshia
 
At each decision point, you should evaluate all options before selec.docx
At each decision point, you should evaluate all options before selec.docxAt each decision point, you should evaluate all options before selec.docx
At each decision point, you should evaluate all options before selec.docxcockekeshia
 
At an elevation of nearly four thousand metres above sea.docx
At an elevation of nearly four thousand metres above sea.docxAt an elevation of nearly four thousand metres above sea.docx
At an elevation of nearly four thousand metres above sea.docxcockekeshia
 
At a minimum, your outline should include the followingIntroducti.docx
At a minimum, your outline should include the followingIntroducti.docxAt a minimum, your outline should include the followingIntroducti.docx
At a minimum, your outline should include the followingIntroducti.docxcockekeshia
 
At least 500 wordsPay attention to the required length of these.docx
At  least 500 wordsPay attention to the required length of these.docxAt  least 500 wordsPay attention to the required length of these.docx
At least 500 wordsPay attention to the required length of these.docxcockekeshia
 
At a generic level, innovation is a core business process concerned .docx
At a generic level, innovation is a core business process concerned .docxAt a generic level, innovation is a core business process concerned .docx
At a generic level, innovation is a core business process concerned .docxcockekeshia
 
Asymmetric Cryptography•Description of each algorithm•Types•Encrypt.docx
Asymmetric Cryptography•Description of each algorithm•Types•Encrypt.docxAsymmetric Cryptography•Description of each algorithm•Types•Encrypt.docx
Asymmetric Cryptography•Description of each algorithm•Types•Encrypt.docxcockekeshia
 
Astronomy HWIn 250-300 words,What was Aristarchus idea of the.docx
Astronomy HWIn 250-300 words,What was Aristarchus idea of the.docxAstronomy HWIn 250-300 words,What was Aristarchus idea of the.docx
Astronomy HWIn 250-300 words,What was Aristarchus idea of the.docxcockekeshia
 
Astronomy ASTA01The Sun and PlanetsDepartment of Physic.docx
Astronomy ASTA01The Sun and PlanetsDepartment of Physic.docxAstronomy ASTA01The Sun and PlanetsDepartment of Physic.docx
Astronomy ASTA01The Sun and PlanetsDepartment of Physic.docxcockekeshia
 
Astronomers have been reflecting laser beams off the Moon since refl.docx
Astronomers have been reflecting laser beams off the Moon since refl.docxAstronomers have been reflecting laser beams off the Moon since refl.docx
Astronomers have been reflecting laser beams off the Moon since refl.docxcockekeshia
 
Astrategicplantoinformemergingfashionretailers.docx
Astrategicplantoinformemergingfashionretailers.docxAstrategicplantoinformemergingfashionretailers.docx
Astrategicplantoinformemergingfashionretailers.docxcockekeshia
 
Asthma, Sleep, and Sun-SafetyPercentage of High School S.docx
Asthma, Sleep, and Sun-SafetyPercentage of High School S.docxAsthma, Sleep, and Sun-SafetyPercentage of High School S.docx
Asthma, Sleep, and Sun-SafetyPercentage of High School S.docxcockekeshia
 
Asthma DataSchoolNumStudentIDGenderZipDOBAsthmaRADBronchitisWheezi.docx
Asthma DataSchoolNumStudentIDGenderZipDOBAsthmaRADBronchitisWheezi.docxAsthma DataSchoolNumStudentIDGenderZipDOBAsthmaRADBronchitisWheezi.docx
Asthma DataSchoolNumStudentIDGenderZipDOBAsthmaRADBronchitisWheezi.docxcockekeshia
 
Assumption-Busting1. What assumption do you have that is in s.docx
Assumption-Busting1.  What assumption do you have that is in s.docxAssumption-Busting1.  What assumption do you have that is in s.docx
Assumption-Busting1. What assumption do you have that is in s.docxcockekeshia
 
Assuming you have the results of the Business Impact Analysis and ri.docx
Assuming you have the results of the Business Impact Analysis and ri.docxAssuming you have the results of the Business Impact Analysis and ri.docx
Assuming you have the results of the Business Impact Analysis and ri.docxcockekeshia
 
Assuming you are hired by a corporation to assess the market potenti.docx
Assuming you are hired by a corporation to assess the market potenti.docxAssuming you are hired by a corporation to assess the market potenti.docx
Assuming you are hired by a corporation to assess the market potenti.docxcockekeshia
 
Assuming that you are in your chosen criminal justice professi.docx
Assuming that you are in your chosen criminal justice professi.docxAssuming that you are in your chosen criminal justice professi.docx
Assuming that you are in your chosen criminal justice professi.docxcockekeshia
 
assuming that Nietzsche is correct that conventional morality is aga.docx
assuming that Nietzsche is correct that conventional morality is aga.docxassuming that Nietzsche is correct that conventional morality is aga.docx
assuming that Nietzsche is correct that conventional morality is aga.docxcockekeshia
 

Week 5 Lecture 14
The Chi Square Test

Quite often, patterns of responses or measures give us a lot of information. Patterns are generally the result of counting how many things fit into a particular category. Whenever we make a histogram, bar chart, or pie chart, we are looking at the pattern of the data. Frequently, changes in these visual patterns will be our first clue that things have changed, and the first clue that we need to initiate a research study (Lind, Marchel, & Wathen, 2008).

One of the most useful tests for examining patterns and relationships in data involving counts (how many fit into this category, how many into that, etc.) is the chi-square. It is extremely easy to calculate and has many more uses than we will cover. Examining patterns involves two uses of the chi-square: the goodness of fit test and the contingency table test. Both of these uses have a common trait: they involve counts per group. In fact, the chi-square is the only statistic we will look at that we use when we have counts per multiple groups (Tanner & Youssef-Morgan, 2013).

Chi Square Goodness of Fit Test

The goodness of fit test checks whether the data distribution (counts per group) matches some pattern we are interested in. Example: Are the employees in our example company distributed equally across the grades? Or, a more reasonable expectation for a company might be: are the employees distributed in a pyramid fashion, with most at the bottom and few at the top?

The Chi Square test compares the actual versus a proposed distribution of counts by generating a measure for each cell or count: (actual - expected)^2/expected. Summing these for all of the cells or groups provides us with the Chi Square statistic. As with our other tests, we determine the p-value of getting a result as large or larger to decide whether we reject or fail to reject our null hypothesis. An example will show the approach using Excel.
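For readers who want to see the arithmetic outside Excel, here is a minimal Python sketch of the statistic and its right-tail p-value. It is an illustration, not the lecture's method: the function names are hypothetical, the p-value uses a closed-form series that assumes df is a positive integer, and the counts are the expected and observed grade counts from the lecture's Excel example.

```python
import math


def chi_square_statistic(observed, expected):
    """Sum of (observed - expected)^2 / expected over all cells."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected need the same number of cells")
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


def chi_square_right_tail(x, df):
    """P(chi-square >= x) for a positive integer df (closed-form series).

    Plays the role of Excel's CHISQ.DIST.RT(x, df) in this sketch.
    """
    if df < 1:
        raise ValueError("df must be a positive integer")
    if x <= 0:
        return 1.0
    if df % 2 == 0:
        # Even df: exp(-x/2) * sum of (x/2)^j / j! for j = 0 .. df/2 - 1
        term, total = 1.0, 1.0
        for j in range(1, df // 2):
            term *= (x / 2) / j
            total += term
        return math.exp(-x / 2) * total
    # Odd df: erfc term plus a series in x^j / (1 * 3 * ... * (2j + 1))
    term, total = 1.0, 0.0
    for j in range((df - 1) // 2):
        if j > 0:
            term *= x / (2 * j + 1)
        total += term
    return (math.erfc(math.sqrt(x / 2))
            + math.sqrt(2 * x / math.pi) * math.exp(-x / 2) * total)


# Grade counts (A through F) from the lecture's pyramid example.
expected = [15, 12, 10, 6, 4, 3]
observed = [15, 7, 5, 5, 12, 6]

statistic = chi_square_statistic(observed, expected)
df = len(observed) - 1  # number of cells minus 1
p_value = chi_square_right_tail(statistic, df)
print(f"chi-square = {statistic:.2f}, df = {df}, p = {p_value:.5f}")
```

With these counts the sketch should give a statistic of about 23.75 and a p-value of about 0.00024, matching the figures the lecture reports for this example.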
Regardless of which Chi Square test we run, the chi square related functions are found in the fx Statistics window rather than in Data Analysis, where we found the t and ANOVA test functions. The most important for us are:

· CHISQ.TEST(actual range, expected range): returns the p-value for the test.
· CHISQ.INV.RT(p-value, df): returns the Chi Square value for the p-value or probability value used.
· CHISQ.DIST.RT(X, df): returns the p-value for a given Chi Square value.

When we have a table of actual and expected results, =CHISQ.TEST(actual range, expected range) gives us the p-value of the calculated chi square value (but not the calculated chi square value itself). We compare this p-value against our alpha criterion (generally 0.05) to make our decision about rejecting or not rejecting the null hypothesis.

If, after finding the p-value for our chi square test, we want the calculated value of the chi square statistic, we can use =CHISQ.INV.RT(probability, df), where probability is the p-value returned by CHISQ.TEST and the degrees of freedom (df) equal the number of cells in our actual table minus 1 (6 - 1 = 5 for a problem working with our 6 grade levels). Finally, if we are interested in the probability of exceeding a particular chi square value, we can use the CHIDIST or CHISQ.DIST.RT function.

Excel Example. To see if our employees are distributed in a traditional pyramid shape, we use the Chi Square Goodness of Fit test, as we are dealing both with count data and with a proposed distribution pattern. For this test, let us assume the following table shows the expected distribution of our 50 employees in a pyramid organizational structure.

Grade:    A    B    C    D    E    F    Total
Count:   15   12   10    6    4    3    50

The actual or observed distribution within our sample is shown below.

Grade:    A    B    C    D    E    F    Total
Count:   15    7    5    5   12    6    50

The research question: Are employees distributed in a pyramidal fashion?

Step 1: Ho: No difference exists between observed and expected frequency counts.
        Ha: Observed and expected frequencies differ.
Step 2: Reject the null hypothesis if the p-value < alpha = .05.
Step 3: Chi Square Goodness of Fit test.
Step 4: Conduct the test. [Screenshot of the Excel solution not included.]
Step 5: Conclusions and Interpretation. Since our p-value of 0.00024 is < our alpha of 0.05, we reject the null hypothesis. The employees are not distributed in a pyramid pattern.

Side Note: We might think that if our sample had an equal number of employees per grade, we would have a better chance of grade-based differences averaging out. Doing this same test and assuming an equal distribution across grades produces a p-value of 0.063, causing us to fail to reject the null hypothesis. The student is encouraged to try this; the equal value for each grade would be 50/6.

Effect size. For a single-row goodness-of-fit test, the associated effect size measure is called effect size r, and equals the square root of (chi square value/(N*df)), where df = the number of cells - 1. A value less than .30 is considered small, between .30 and .50 is considered moderate, and more than .50 is considered large (Steinberg, 2008). Since we rejected the null in the example above, the effect size would be: r = square root of (23.75/(50*5)) = sqrt(0.095) = 0.31. This is a moderate impact, suggesting that both sample size and variable interaction had some impact. With moderate results, we generally would want to get a larger sample and repeat the test (Tanner & Youssef-Morgan, 2013).

Chi Square Contingency Table test

Contingency table tests, also known as tests of independence, are slightly more complex than goodness of fit tests. They classify the data by two or more variable labels (we will limit our discussions to two-variable tables), so the data table looks a lot like the input table for the ANOVA two-factor without replication we looked at last week. Both variables involve the counts per category (nominal, ordinal, or interval/ratio data in ranges) of items that meet our research interest (Lind, Marchel, & Wathen, 2008). With most contingency tables, we do not have a given expected frequency as we had with the goodness of fit situation.
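Because no expected table is supplied, the expected counts have to be built from the observed table's own row and column totals. The sketch below shows one way to do that in Python, assuming a hypothetical 2x3 table of gender by performance rating; these counts are invented for illustration and are not the lecture's data, and the function names are likewise hypothetical. Each expected cell is row total * column total / grand total, and the cells are then combined with the same (actual - expected)^2/expected sum used for goodness of fit.

```python
def expected_counts(observed):
    """Expected cell counts: row total * column total / grand total."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    grand_total = sum(row_totals)
    return [[r * c / grand_total for c in col_totals] for r in row_totals]


def contingency_chi_square(observed):
    """Return (statistic, df, expected table) for a two-way table of counts."""
    expected = expected_counts(observed)
    statistic = sum(
        (o - e) ** 2 / e
        for obs_row, exp_row in zip(observed, expected)
        for o, e in zip(obs_row, exp_row)
    )
    # df = (number of rows - 1) * (number of columns - 1)
    df = (len(observed) - 1) * (len(observed[0]) - 1)
    return statistic, df, expected


# Hypothetical counts: rows are male / female,
# columns are low / middle / high performance rating.
observed = [
    [8, 12, 5],
    [6, 13, 6],
]

statistic, df, expected = contingency_chi_square(observed)
print("expected:", expected)
print(f"chi-square = {statistic:.3f}, df = {df}")
```

The resulting statistic and df can then be passed to Excel's CHISQ.DIST.RT(X, df) to obtain the p-value.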
To find the expected value for each cell of a multiple row/column table, we use the formula: row total * column total/grand total (the count each cell would hold if the row and column variables were unrelated, not an unreasonable expectation under the null hypothesis). Once we have generated the values for the expected table, we use the same formula as before to perform the Chi Square test. Manually, this is the sum of ((actual - expected)^2/expected) for all of the cells. The same fx Chi Square functions used for the Goodness of Fit test are used for the Contingency Table analysis.

The null hypothesis for a contingency table test is "no relationship exists between the variables." The alternate hypothesis would be: "a relationship exists." In general, you are testing either for similar distributions between the groups of interest or to see if a relationship ("correlation") exists (even if the data is nominal level). The df for a contingency table is (number of rows - 1)*(number of columns - 1).

Excel Example. The data entry for this test is the same as with our earlier test, and the functions are found in the fx statistical list. One possible explanation for different salaries is the performance on the job, reflected in the performance rating. We might wonder if males and females are evaluated differently (either due to actual performance or to bias; if so, we have another issue to examine). So, our research question for this issue becomes: are males and females rated the same?

Step 1: Ho: Male and female ratings are similar (no difference in distributions).
        Ha: Male and female rating distributions differ.
Step 2: Reject Ho if the p-value is < alpha = 0.05.
Step 3: Chi Square Contingency Table Analysis.
Step 4: Perform the test.
Step 5: Conclusions and Interpretation. Since the p-value (CHISQ.TEST result) is greater than (>) alpha = .05, we fail to reject the null hypothesis and conclude that males and females are evaluated in a similar pattern. It does not appear that performance ratings impact average salary differences.

Effect size. Now, as with the t-test and ANOVA, had we rejected the null hypothesis, we would have wanted to examine the practical impact of the outcome using an effect size measure. The effect size measure for the Chi Square is a correlation measure. Two measures are generally used with the contingency table outcomes: the Phi coefficient and Cramer's V (Tanner & Youssef-Morgan, 2013).

The Phi coefficient (= square root of (chi square/sample size)) provides a rough estimate of the correlation between the two variables. Phi is primarily used with small tables (2x2, 2x3, or 3x2). Values below .30 are weak, .30 to about .50 are moderate, and above .50 (to 1) are strong relationships (Tanner & Youssef-Morgan, 2013).

Cramer's V can be considered as a percent of the shared variation, or common variation, between the variables. It equals the square root of (phi squared/(smaller of the number of rows or columns - 1)). It ranges from 0 (no relationship or variation in common) to 1.0 (strong relationship, all variation in common) (Tanner & Youssef-Morgan, 2013).

For our example above, it would not make sense to calculate either value since we did not reject the null; but for illustrative purposes we will.

· Phi = square root of (1.978/50) = square root of (0.03956) = 0.199: small, no relationship.
· V = square root of (0.199^2/(2 - 1)) = 0.199. Note: when the smaller of the number of rows and columns equals 2, the divisor is 2 - 1 = 1, so V will equal Phi (Tanner & Youssef-Morgan, 2013).

Caution

Due to the division involved in calculating the Chi Square value, it is heavily influenced by cells that have small expected values. Most texts say simply not to use the Chi Square distribution in a hypothesis test if the expected frequency in one or more cells is less than (<) 5. There are some different opinions about this issue, and different texts give different rules on what to do if we have expected frequencies of 5 or less in cells. As a compromise, let's use the standard that no more than 20% of the cells should have an expected value of less than 5. If they do, we need to combine rows or columns to reduce this percentage (Lind, Marchel, & Wathen, 2008).

References

Lind, D. A., Marchel, W. G., & Wathen, S. A. (2008). Statistical Techniques in Business & Finance (13th ed.). Boston: McGraw-Hill Irwin.
Steinberg, W. J. (2008). Statistics Alive! Thousand Oaks, CA: Sage Publications, Inc.
Tanner, D. E., & Youssef-Morgan, C. M. (2013). Statistics for Managers. San Diego, CA: Bridgeport Education.

Week 5 Lecture 13

This week we look at two different approaches to analyzing data and making inferences about the populations they come from. The first is confidence intervals, a range of values that we expect to contain the actual population mean based on the sample results we obtained. The other is a way to use nominal and ordinal data in a statistical analysis. The Chi Square family of tests looks at patterns within samples and sees whether the underlying populations could contain the same pattern of measure distributions (Lind, Marchel, & Wathen, 2008).

Confidence Intervals

When we perform a t-test or ANOVA, we are using a single point estimate for the means of the populations we are testing. Some professionals and managers are a bit uncomfortable with this; they understand that the sample has a sampling error, and that the actual population mean could be, and most likely is, a bit different. They are interested in getting an estimate of what the sampling error is and how much the population mean could differ from the sample mean.

We deal with this through the use of confidence intervals, a range of values that has a specific probability of containing the actual population mean. We have seen one example of a confidence interval already: the intervals used to determine which population means varied when we rejected the null hypothesis for the ANOVA test were confidence intervals.

Confidence intervals often provide the added information and comfort about estimates of population parameter values that the single point estimates lack. Since the one thing we do know about a statistic generated from a sample is that it will not exactly equal the population parameter, we can use a confidence interval to get a better feel for the range of values that might be the actual population parameter. They also give us an indication of how much variation exists in the data set. The larger the range (at the same confidence level), the more variation within the sample data set and the less representative the mean would be (Lind, Marchel, & Wathen, 2008).

We are going to look at two different kinds of confidence intervals this week: intervals for a one-sample mean and intervals for the difference between the means of two samples (Lind, Marchel, & Wathen, 2008).

One Sample Confidence Interval for the mean

A confidence interval is simply a range of values that could contain the actual population parameter of interest. It is centered on the sample mean, and uses the variation in the sample to estimate a range of possible values (Lind, Marchel, & Wathen, 2008). To construct a confidence interval, we use several pieces of information from the sample and the confidence level we want. From the sample we use the mean, standard deviation, and size. The confidence level is a desired probability (usually set at 95%) that the interval does, in fact, contain the population mean.

Example. The confidence interval for the female mean salary in the population would be calculated this way. The sample mean value is 38, the standard deviation is 18.3, and the sample size is 25 (from Week 1 material).
Once we determine the confidence level we want, we use the associated 2-tail t value to achieve it. The t-value is found with the fx function t.inv.2t(Prob, df). For a 95% confidence interval, we would use t.inv.2t(0.05, 24), which equals 2.064 (rounded). We now have all the information we need to construct a 95% confidence interval for the female salary mean:

CI = mean +/- t * stdev/sqrt(sample size) = 38 +/- 2.064*18.3/sqrt(25) = 38 +/- 7.6.

This is typically written as 30.4 to 45.6. Note: the standard deviation divided by the square root of the sample size is called the standard error of the mean, and is the variation measure of the sample used in several statistical tests, including the t-test and confidence intervals.

The associated 95% CI for males is 44.6 to 59.3. Note that the intervals overlap: the male smallest value is 44.6 while the female largest value is 45.6. This suggests that both population average salaries could be the same and around 45. However, just as the two one-sample t-tests gave us misleading information on possible equality, using two confidence intervals to compare two populations also is not the best approach.

The Confidence Interval for mean differences

When comparing multiple samples, it is always best to use all the possible information in a single test or procedure. The same is true for confidence intervals. If we are interested in seeing if sample means could be equal, we look to see if the difference between the averages could be 0 or not. If so, then the means could be the same; if not, then the means must be significantly different.

The formula for the mean difference confidence interval is mean difference +/- t*standard error. The standard error for the difference of two populations is found by adding the variance/sample size (which is the standard error squared) for each and taking the square root (Lind, Marchel, & Wathen, 2008).

For our salary data set we have the following values:

Female mean = 38
Male mean = 52
Female Stdev = 18.3
Male Stdev = 17.8
Sample size = 50, df = 48
t = t.inv.2t(0.05, 48) = 2.0106

Standard error = sqrt(Variance (female)/25 + Variance (male)/25) = sqrt(334.7/25 + 316/25) = 5.10.

This gives us a 95% confidence interval for the difference equaling: (52 - 38) +/- 2.0106 * 5.10 = 14 +/- 10.3 = 3.7 to 24.3.

Since this confidence interval does not contain 0, we are 95% confident that the male and female salary means are not equal, which is the same result we got from our 2-sample t-test in Week 2. We also now have a sense of how much variation exists in our measures.

Side note: The "+/- t*SE" term is often called the margin of error. We most often hear this phrase in conjunction with opinion polls, particularly political polls: "candidate A has a 43% approval rating with a margin of error of 3.5%." While we do not deal with proportions in the class, they are calculated the same as an empirical probability: the number of positive replies divided by the sample size. The construction of these margins or confidence intervals is conceptually the same: a t-value and a standard error of the proportion based on the sample size and results (Lind, Marchel, & Wathen, 2008).

References

Lind, D. A., Marchel, W. G., & Wathen, S. A. (2008). Statistical Techniques in Business & Finance (13th ed.). Boston: McGraw-Hill Irwin.
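As a supplement, the two confidence-interval calculations from this lecture can be reproduced with a short Python sketch. This is only an illustration under stated assumptions: the function names are hypothetical, and the critical t-values are hard-coded as rounded results of Excel's T.INV.2T(0.05, 24) and T.INV.2T(0.05, 48) rather than computed in the code. The means, standard deviation, variances, and sample sizes are the salary figures quoted in the lecture.

```python
import math


def mean_ci(mean, stdev, n, t_crit):
    """Confidence interval for one mean: mean +/- t * stdev / sqrt(n)."""
    margin = t_crit * stdev / math.sqrt(n)
    return mean - margin, mean + margin


def mean_diff_ci(mean_a, var_a, n_a, mean_b, var_b, n_b, t_crit):
    """Confidence interval for (mean_a - mean_b).

    The standard error is sqrt(var_a / n_a + var_b / n_b).
    """
    standard_error = math.sqrt(var_a / n_a + var_b / n_b)
    margin = t_crit * standard_error
    diff = mean_a - mean_b
    return diff - margin, diff + margin


# Assumed critical values, rounded from Excel:
T_DF24 = 2.064   # T.INV.2T(0.05, 24)
T_DF48 = 2.0106  # T.INV.2T(0.05, 48)

# One-sample interval for the female mean salary (mean 38, stdev 18.3, n 25).
female_lo, female_hi = mean_ci(38, 18.3, 25, T_DF24)

# Interval for the male minus female difference, using the lecture's variances.
diff_lo, diff_hi = mean_diff_ci(52, 316.0, 25, 38, 334.7, 25, T_DF48)

print(f"female mean CI: {female_lo:.1f} to {female_hi:.1f}")
print(f"male - female difference CI: {diff_lo:.1f} to {diff_hi:.1f}")
print("difference interval excludes 0:", diff_lo > 0 or diff_hi < 0)
```

If the difference interval excludes 0, the two means are judged significantly different at the chosen confidence level, which is the decision rule described in the lecture.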