Contingency Table Test
Presented by
Muhammad Asad Hayat, 14-CE-148
Civil Engineering Department, UET Taxila
What is a Contingency Table?
• A table with at least two rows and two columns used
in statistics to present the data.
• It displays frequency distribution (Observed
frequency) of the variables.
• It shows the distribution of one variable in rows and
another in columns.
• It is used to study the association (relationship) between the two
variables.
Example of Contingency Table
Groups  | Dogs | Cats | Total
Males   | 42   | 10   | 52
Females | 9    | 39   | 48
Total   | 51   | 49   | 100
Marginal Totals
The row totals and column totals of the
variables tabulated in a contingency
table (here 52, 48, 51 and 49) are called
marginal totals.
A simple 2 x 2 Contingency Table
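The marginal totals of the table above can be reproduced with a short Python sketch. The function name `marginal_totals` and the nested-list layout are illustrative assumptions, not part of the original slides.

```python
# Observed counts from the 2 x 2 example.
# Rows: Males, Females. Columns: Dogs, Cats.
observed = [
    [42, 10],
    [9, 39],
]


def marginal_totals(table):
    """Return (row totals, column totals, grand total) for a table of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    return row_totals, col_totals, sum(row_totals)


row_totals, col_totals, grand_total = marginal_totals(observed)
print(row_totals, col_totals, grand_total)  # [52, 48] [51, 49] 100
```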
Origin of Contingency Table
• The term contingency table was first used by Karl
Pearson who was an English mathematician and
biometrician in "On the Theory of Contingency and
Its Relation to Association and Normal
Correlation", published in 1904.
Why do we perform a Contingency Table Test?
• Many times the elements of a sample from a
population may be classified according to two
different criteria or in the form of table. It is then of
interest to know whether the variables involved are
statistically independent; for example, we may
consider the population of graduating engineers and
may wish to determine whether starting salary is
independent of academic disciplines.
• In fact, we are interested in testing the hypothesis
that the row-and-column methods of classification
are independent. If we reject this hypothesis, we
conclude some interaction exists between the two
criteria of classification.
Terms
Null Hypothesis
To test whether the value of one variable in a table is
related to that of another, we begin by assuming that the
variables are not related to each other (independent).
This assumption is called the Null Hypothesis (H0). If we
reject the null hypothesis, we say that our result is
statistically significant.
Alternative Hypothesis
The assumption opposite to the null hypothesis, namely
that the variables are not independent, is called the
Alternate (Alternative) Hypothesis. Rejecting the null
hypothesis means accepting the alternative hypothesis.
Terms (Contd.)
Significance Level (α)
The significance level, denoted by alpha (α), is the
probability of rejecting the null hypothesis when it is
true. Significance levels of 5% or 1% are most commonly
chosen.
Degree of Freedom (df)
The number of independent comparisons that can
be made among the elements of a sample.
For a contingency table it is calculated by the following formula:
$$df = (\text{No. of Rows} - 1) \times (\text{No. of Columns} - 1)$$
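As a minimal sketch, the degrees-of-freedom formula can be written as a small Python helper. The name `degrees_of_freedom` is an assumption made for illustration; the row and column counts exclude the "Total" row and column.

```python
def degrees_of_freedom(n_rows, n_cols):
    """df = (rows - 1) * (columns - 1), counting category rows/columns only."""
    if n_rows < 2 or n_cols < 2:
        raise ValueError("a contingency table needs at least two rows and two columns")
    return (n_rows - 1) * (n_cols - 1)


print(degrees_of_freedom(2, 2))  # 1
print(degrees_of_freedom(2, 3))  # 2
```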
Terms (Contd.)
P-Value
The P-value is the smallest level of significance that
would lead to rejection of the null hypothesis with
the given data.
Pearson’s Chi-Square Test ($\chi_0^2$)
• Pearson’s Chi-Square Analysis is used to analyze a
contingency table.
• It is employed when the variables involved in the
experiment are nominal (categorical). It is generally not
the test of choice for ordinal or higher-level variables,
because it ignores the ordering of the categories.
• We find the expected frequencies from the contingency
table and then use the following formula to calculate the
Chi-Square value:
$$\chi_0^2 = \sum \frac{(\text{Observed Frequency} - \text{Expected Frequency})^2}{\text{Expected Frequency}}$$
Pearson’s Chi-Square Test ($\chi_0^2$)
(Contd.)
• The expected frequency of an individual cell is
obtained by the following formula:
$$\text{Expected Frequency} = \frac{\text{Column Total}}{\text{Grand Total}} \times \text{Row Total}$$
• This formula gives the value for only one cell of the table. We
apply it to every combination of row and column variables
involved in the experiment.
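A minimal Python sketch of the expected-frequency calculation, applied to the Dogs/Cats table from the earlier slide. The function name `expected_frequencies` and the nested-list layout are assumptions for illustration.

```python
def expected_frequencies(observed):
    """Expected count for each cell: row total * column total / grand total."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    grand_total = sum(row_totals)
    return [[r * c / grand_total for c in col_totals] for r in row_totals]


# Rows: Males, Females. Columns: Dogs, Cats.
observed = [[42, 10], [9, 39]]
expected = expected_frequencies(observed)
for row in expected:
    print([round(value, 2) for value in row])
# [26.52, 25.48]
# [24.48, 23.52]
```

The expected counts keep the same marginal totals as the observed table, which is a quick way to check the calculation by hand.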
Pearson’s Chi-Square Test ($\chi_0^2$)
(Contd.)
• The Chi-Square value is then compared with the Chi-Square
distribution (table) value, for which we need the
significance level and the degrees of freedom.
• If our Chi-Square value is greater than the Chi-Square
distribution table value (or the P-value is less
than or equal to the level of significance), we reject the
null hypothesis and say that the variables involved in
the experiment are not independent and affect
each other (or they are contingent).
• Rejection means that, statistically, an association exists
between the variables. If the null hypothesis is not rejected,
the data show no correlation or only a very low correlation
between the variables.
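The comparison rule can be sketched in Python as follows, using the Dogs/Cats table from the earlier slide. The function names are hypothetical, and the critical value 3.841 (α = 0.05, df = 1) is taken from a standard Chi-Square distribution table rather than from the slides.

```python
def chi_square_statistic(observed):
    """Pearson's chi-square statistic for a table of observed counts."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    grand_total = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(observed):
        for j, count in enumerate(row):
            expected = row_totals[i] * col_totals[j] / grand_total
            statistic += (count - expected) ** 2 / expected
    return statistic


def reject_independence(statistic, critical_value):
    """Reject the null hypothesis when the statistic exceeds the table value."""
    return statistic > critical_value


observed = [[42, 10], [9, 39]]  # rows: Males, Females; columns: Dogs, Cats
CRITICAL_VALUE = 3.841          # table value for alpha = 0.05, df = 1

statistic = chi_square_statistic(observed)
print(round(statistic, 2))                             # 38.42
print(reject_independence(statistic, CRITICAL_VALUE))  # True
```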
Steps involved in Pearson’s Chi-Square Test
The Chi-Square test involves the following steps:
1. Choosing a significance level (α)
2. Defining the parameter of interest
3. Stating the null hypothesis (H0)
4. Stating the alternate hypothesis (accepted if the null hypothesis is rejected)
5. Finding expected frequencies
6. Determination of Chi-Square value
7. Determination of degree of freedom
8. Determination of Chi-Square distribution value
9. Comparing Chi-Square value with Chi-Square distribution value
10. Concluding result
Example
A company has to choose among three health
insurance plans. Management wishes to know
whether the preference for plans is independent of
job classification and wants to use α = 0.05 (5%
significance level). The opinions of a random sample
of 500 employees are shown below:
Job Classification | Plan 1 | Plan 2 | Plan 3 | Total
Salaried Workers   | 160    | 140    | 40     | 340
Hourly Workers     | 40     | 60     | 60     | 160
Total              | 200    | 200    | 100    | 500
Solution: Example
• Parameter of Interest:
Our interest is in “Which insurance plan do
employees prefer?”
• Null Hypothesis
Preference is independent of salaried versus hourly
job classification.
• Alternate Hypothesis
Preference is not independent of salaried versus
hourly job classification or “We reject the null
hypothesis.”
• Degree of Freedom
Since there are 3 columns and 2 rows, the degrees of
freedom (df) = (r-1)(c-1) = (2-1)(3-1) = 2
Solution: Example (Contd.)
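In the next step we find the expected frequencies. For example, for salaried workers and Plan 1:
$$\text{Expected Frequency} = \frac{200}{500} \times 340 = 136$$
Job Classification | Plan 1 | Plan 2 | Plan 3 | Total
Salaried Workers   | 136    | 136    | 68     | 340
Hourly Workers     | 64     | 64     | 32     | 160
Total              | 200    | 200    | 100    | 500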
Solution: Example (Contd.)
• Chi-Square Value
The Chi-Square value is found by the following formula:
$$\chi_0^2 = \sum \frac{(\text{Observed Frequency} - \text{Expected Frequency})^2}{\text{Expected Frequency}}$$
$$\chi_0^2 = \frac{(160-136)^2}{136} + \frac{(40-64)^2}{64} + \cdots + \frac{(60-32)^2}{32} = 49.63$$
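The sum over all six cells can be checked with a few lines of Python. This is a sketch that takes the observed and expected tables exactly as given in the example; the variable names are illustrative.

```python
# Rows: Salaried Workers, Hourly Workers. Columns: Plan 1, Plan 2, Plan 3.
observed = [[160, 140, 40], [40, 60, 60]]
expected = [[136, 136, 68], [64, 64, 32]]

# Sum (observed - expected)^2 / expected over every cell of the table.
chi_square = sum(
    (o - e) ** 2 / e
    for obs_row, exp_row in zip(observed, expected)
    for o, e in zip(obs_row, exp_row)
)
print(round(chi_square, 2))  # 49.63
```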
Solution: Example (Contd.)
• Chi-Square Table Value
In the next step, we find the critical value from the
Chi-Square distribution table:
$$\chi_{0.05,2}^2 = 5.99$$
• P-Value
In the Chi-Square distribution table, with our significance
level and degrees of freedom, we see where our
Chi-Square value lies. The corresponding P-value
turns out to be very small compared with the level of
significance. Therefore, we can reject the null hypothesis.
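For this example the table lookup can be cross-checked without a statistics library. The sketch below relies on a special case that is not stated in the slides: with exactly 2 degrees of freedom, the upper-tail probability of the Chi-Square distribution is exp(-x / 2). For any other df, a distribution table or a statistics package is needed instead.

```python
import math

ALPHA = 0.05        # significance level chosen in the example
CHI_SQUARE = 49.63  # Chi-Square value computed in the example

# Valid only for df = 2, where P(X > x) = exp(-x / 2).
critical_value = -2.0 * math.log(ALPHA)  # value with upper-tail area ALPHA
p_value = math.exp(-CHI_SQUARE / 2.0)    # upper-tail area beyond CHI_SQUARE

print(round(critical_value, 2))  # 5.99
print(p_value < ALPHA)           # True
```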
Solution: Example (Contd.)
Also, we have the following relation:
$$\chi_0^2 > \chi_{0.05,2}^2$$
The above relation shows us that preference for health
insurance plans is not independent of job
classification. This means we reject our null
hypothesis and accept the alternate hypothesis.

Any Questions?
Contingency Table Test, M. Asad Hayat, UET Taxila
