The document provides information about the chi-square test, including its introduction by Karl Pearson, its applications and uses, assumptions, and examples. The chi-square test is used to determine whether an observed set of frequencies differs from the expected frequencies. It can be used to test differences between categorical data and expected values. Examples shown include a goodness-of-fit test comparing blood group frequencies to an expected equal distribution, and a one-dimensional coin-flipping example.
3. INTRODUCTION
• Karl Pearson (1900) introduced a test to
distinguish whether an observed set of
frequencies differs from a specified frequency
distribution.
• This is the chi-square test, also referred to as the
chi-squared test.
4. • The Greek letter χ is called “chi”; its square, χ², gives the
test its name, the “chi-square” or “chi-squared” test.
• It is used to check how frequently each category occurs in the data.
• The chi-square test uses frequency data to generate
a statistical result
• Pearson's chi-squared test (χ²) is the best-known of
several chi-squared tests.
5. APPLICATION/UTILITY
• The chi-square test checks whether there is a
significant difference between the expected
values and the observed values; it can be
used for categorical data.
• In simple words, it tests whether a
significant difference exists between the
observed number of responses and the
expected number of responses.
6. Parametric and Nonparametric Tests
• The term "non-parametric" refers to the fact that
the chi-square tests do not require assumptions
about population parameters nor do they test
hypotheses about population parameters.
• Some examples of hypothesis tests, such as the t
tests and ANOVA, are parametric tests and they
do include assumptions about parameters and
hypotheses about parameters.
7. Parametric and Nonparametric Tests
• The most obvious difference between the chi-
square tests and the other hypothesis tests we have
considered (t and ANOVA) is the nature of the
data.
• For chi-square, the data are frequencies rather
than numerical scores.
8. The Chi-Square Test for Goodness-of-Fit
• A statistical method used to determine goodness
of fit.
– Goodness of fit refers to how close the
observed data are to those predicted from a
hypothesis
Note:
– The chi square test does not prove that a
hypothesis is correct
• It evaluates to what extent the data and the
hypothesis have a good fit
9. The Chi-Square Test for Goodness-of-Fit
(cont.)
• The chi-square test for goodness-of-fit uses
frequency data from a sample to test hypotheses
about the shape or proportions of a population.
• Each individual in the sample is classified into one
category on the scale of measurement.
• The data, called observed frequencies, simply
count how many individuals from the sample are
in each category.
10. The Chi-Square Test for Goodness-of-Fit
(cont.)
• The null hypothesis specifies the proportion of the
population that should be in each category.
• The proportions from the null hypothesis are used
to compute expected frequencies that describe how
the sample would appear if it were in perfect
agreement with the null hypothesis.
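As a minimal sketch of this step, the expected frequency for each category is the null-hypothesis proportion multiplied by the sample size. The function name `expected_frequencies` and the four equal proportions below are illustrative assumptions, not part of the original slides.

```python
def expected_frequencies(proportions, sample_size):
    """Expected count per category = null-hypothesis proportion * sample size."""
    if abs(sum(proportions) - 1.0) > 1e-9:
        raise ValueError("null-hypothesis proportions must sum to 1")
    return [p * sample_size for p in proportions]


# Hypothetical illustration: four categories assumed equally likely,
# with a sample of 120 individuals.
expected = expected_frequencies([0.25, 0.25, 0.25, 0.25], 120)
print(expected)
```

Note that expected frequencies need not be whole numbers, since they describe an idealized sample rather than actual counts.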
11. Insight of chi-square Test
Interpretation
Rationale for the Chi-Square Test
The Basics of a Chi-Square Test
Types of Chi-Square Tests
One dimensional example
Two folded example
Conclusion
12. INTERPRETATION
• If the observed and expected values are
close, the chi-square value is small and there
is no significant difference. If they differ
widely, the chi-square value is large and, once
it exceeds the critical value, the difference
is statistically significant.
13. Rationale for the Chi-Square Test
• Various chi-square tests to deal with cases
involving frequency data
• Enumeration data
• Categorical data
• Qualitative data
• Contingency table
14. The Basics of A Chi-Square Test
• The chi-square test compares the observed
frequencies with the expected frequencies:
$$\chi^2 = \sum \frac{(O - E)^2}{E}$$
• Where
– O = observed data in each category
– E = expected data in each category based on the
experimenter’s hypothesis
– Σ = sum of the calculations for each category
15. Chi Square Calculation
Each entry in the summation can be referred to as
“The observed minus the
expected, squared, divided by the expected”.
The chi square value for the test as a whole is
“The sum of the observed minus the
expected, squared, divided by the expected”
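The verbal rule above translates directly into a few lines of code. This is a sketch only: the function name `chi_square_statistic` and the two-category counts are hypothetical and chosen purely for illustration.

```python
def chi_square_statistic(observed, expected):
    """Sum of (observed - expected)^2 / expected over all categories."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have the same length")
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


# Hypothetical counts: two categories, 20 expected in each.
stat = chi_square_statistic([18, 22], [20, 20])
print(round(stat, 3))
```

Each term is non-negative, so categories where the observed count is below the expected count contribute just as much as categories where it is above.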
20. Characteristics of the Chi-Square Distribution
1. It is not symmetric.
2. The shape of the chi-square
distribution depends upon the degrees
of freedom, just like Student’s t-distribution.
3. As the number of degrees of freedom
increases, the chi-square distribution
becomes more symmetric.
4. The values are non-negative. That is, the
values of χ² are greater than or equal to 0.
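One way to see the third characteristic numerically is through the skewness of the distribution. The sketch below relies on the standard result that a chi-square distribution with k degrees of freedom has skewness √(8/k); the function name `chi_square_skewness` is an assumption made for illustration.

```python
import math


def chi_square_skewness(df):
    """Skewness of a chi-square distribution with `df` degrees of freedom."""
    if df <= 0:
        raise ValueError("degrees of freedom must be positive")
    return math.sqrt(8 / df)


# The skewness shrinks toward 0 (perfect symmetry) as df grows.
for df in (1, 3, 10, 50):
    print(df, round(chi_square_skewness(df), 3))
```

The skewness never reaches zero for a finite number of degrees of freedom, which matches the first characteristic: the distribution is not symmetric, only increasingly close to it.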
22. The Basics of A Chi-Square Test
The test asks whether the calculated chi-square value
could easily occur by chance, or whether it is an
unusual event that is unlikely to occur by chance
25. Types of Chi-Square Test
Chi-square tests are used to determine the following:
• Whether the two variables are independent.
• Whether various subgroups are homogeneous.
• Whether there is a significant difference in the
proportions in the subclasses among the subgroups
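For the test of independence, a rough sketch is given below. It assumes the usual construction in which the expected count of each cell is (row total × column total) / grand total and the degrees of freedom are (rows − 1) × (columns − 1). The function names and the 2×2 table are hypothetical, not data from the slides.

```python
def expected_counts(table):
    """Expected cell counts under independence of rows and columns."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    return [[r * c / grand_total for c in col_totals] for r in row_totals]


def chi_square_independence(table):
    """Return the chi-square statistic and degrees of freedom for a table."""
    expected = expected_counts(table)
    stat = sum(
        (o - e) ** 2 / e
        for obs_row, exp_row in zip(table, expected)
        for o, e in zip(obs_row, exp_row)
    )
    df = (len(table) - 1) * (len(table[0]) - 1)
    return stat, df


# Hypothetical 2x2 contingency table (rows: groups, columns: outcomes).
table = [[30, 20], [20, 30]]
stat, df = chi_square_independence(table)
print(stat, df)
```

The same statistic is used for tests of homogeneity; what differs is how the sample was drawn and how the hypothesis is worded.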
26. Assumptions / Limitation in the
Use of Chi-Square
• Data is from a random sample.
• A sufficiently large sample size is required (at least
20); a common rule of thumb also asks for an expected
frequency of at least 5 in each category.
• Observations must be independent.
• Actual count data (not percentages)
• Does not prove causality.
28. EXAMPLE
• In many families where the parents could have
produced children of all four blood groups, the total
number of children with each blood group was :
– Blood group A 26
– Blood group B 31
– Blood group AB 39
– Blood group O 24
– TOTAL 120
29. Cont…
Those numbers are the results observed and
have been put into the O column in this table
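The calculation for this example can be sketched as follows, assuming (as the summary at the top states) that the expected distribution is equal across the four blood groups. The variable names are illustrative, and the intermediate table from the original slides is not reproduced here.

```python
# Observed counts taken from the example above.
observed = {"A": 26, "B": 31, "AB": 39, "O": 24}

total = sum(observed.values())
# Assumption: the null hypothesis is an equal share for each blood group.
expected_each = total / len(observed)

# (O - E)^2 / E for each blood group.
contributions = {
    group: (count - expected_each) ** 2 / expected_each
    for group, count in observed.items()
}
chi_square = sum(contributions.values())
df = len(observed) - 1  # number of categories minus one

print(total, expected_each)
print(round(chi_square, 3), df)
```

Under this assumption the AB group contributes most to the total, because its observed count lies furthest from the expected count.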
46. Cont…
• The degrees of freedom are always the number
of possible phenotypes minus one.
So in this case 4 – 1 = 3
47. Cont…
• Then we find the 3 degrees of freedom row
and read along until we find the number
which is closest to our result…
• Finally we look which column this value is
in…
48. Cont…
• If it's in the 0.05 (or 5 %) column or
any column less than this…
(meaning that the chi-squared value is
bigger than the one in the 0.05 column)
• Then any differences are unlikely to be due to
chance alone…
• Which means that something else may be
affecting our results…
50. cont
• But if it's in a column above 0.05 (known as
the significance level)…
• (meaning that the chi-squared value is
smaller than the critical value in the 0.05 column)
• Then any differences in our results can be
attributed to chance…
51. Cont…
• And the null hypothesis is not rejected
• Remember that the significance level is
conventionally 0.05…
57. Cont…
• In this case the critical value is 3.841
(1 degree of freedom at the 0.05 level)
• We got a value of 0.72, which is lower than
the critical value, so we do not reject the null
hypothesis.
• If the calculated value is more than the critical
value, then we reject the null hypothesis.
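The decision rule can be sketched in code as a simple comparison against a table of critical values. The dictionary below holds the standard chi-square critical values at the 0.05 level for 1 to 4 degrees of freedom; the names `CRITICAL_VALUES_05` and `reject_null` are assumptions made for this illustration.

```python
# Standard chi-square critical values at the 0.05 significance level,
# keyed by degrees of freedom.
CRITICAL_VALUES_05 = {1: 3.841, 2: 5.991, 3: 7.815, 4: 9.488}


def reject_null(chi_square, df, critical_values=CRITICAL_VALUES_05):
    """Return True when the statistic exceeds the critical value for df."""
    if df not in critical_values:
        raise KeyError(f"no critical value tabulated for df={df}")
    return chi_square > critical_values[df]


# The value from the slide: 0.72 with 1 degree of freedom.
print(reject_null(0.72, 1))  # False, since 0.72 < 3.841
```

A result of `False` means the null hypothesis is not rejected at the 0.05 level; it does not prove that the null hypothesis is true.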
58. SOME AYURVEDIC JOURNALS
USING CHI SQUARE
• Allopathic vs. Ayurvedic practices in tertiary care
institutes of urban North India
(Indian Journal of Pharmacology, Vol. 39, No. 1,
January-February, 2007, pp. 52-54).
• Assessment of oligospermia with special reference to
rakta dusti.
(International research journal of Pharmacy 2012)
• Evaluation of efficacy of karshaneeya yavagu (An
Ayurvedic preparation) in the management of
Obesity.
(International Journal of Research in Ayurveda &
Pharmacy;Mar2012, Vol. 3 Issue 2, p295)
59. Conclusion
• Note that all we really can conclude is that our data
is different from the expected outcome given a
situation
– Regardless of the position of the frequencies
we’d have come up with the same result
– It is a non-directional test regardless of the
prediction.
Editor's Notes
A portmanteau test is a type of statistical hypothesis test in which the null hypothesis is well specified, but the alternative hypothesis is more loosely specified. Tests constructed in this context can have the property of being at least moderately powerful against a wide range of departures from the null hypothesis. Thus, in applied statistics, a portmanteau test provides a reasonable way of proceeding as a general check of a model's match to a dataset where there are many different ways in which the model may depart from the underlying data generating process. Use of such tests avoids having to be very specific about the particular type of departure being tested.
The percentages of various drugs used in both hospitals were compared using the chi-square test. A P-value < 0.001 was considered statistically significant.