KEBBI STATE UNIVERSITY OF SCIENCE & TECHNOLOGY,
ALIERO
COLLEGE OF HEALTH SCIENCES
DEPARTMENT OF COMMUNITY HEALTH
COM 201: BIOSTATISTICS
CHI-SQUARE TEST OF ASSOCIATION
By
Prof AU Ka’oje
Test of Association I: Chisquare Test
• A chi-square test (denoted as 𝜒2) is used to assess whether the distribution of a
categorical variable is significantly different between two or more groups.
• It is used to detect the existence of an association (relation) between two categorical variables,
• that is, between data in rows and data in columns,
• but it does not indicate the strength of any association
• In health research, a test of chi-square is frequently used to assess whether disease
(present/absent) is associated with exposure (yes/no)
• Chi-square tests are appropriate for most study designs, but the results are
influenced by the sample size.
• The value of 𝜒2 depends on the size of the differences between observed and expected
frequencies, the degrees of freedom, and the sample size.
• There are two types of tests:
• 1. The goodness of fit: It is used to determine how well the observed data
represent the population (expected values)
• 2. Test for independence: It is used to determine if there is a relation between two
categorical variables
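A goodness-of-fit calculation can be sketched in a few lines of Python. The counts and the expected proportions below are made up for illustration, and the function name `goodness_of_fit` is an assumption of this sketch, not something taken from the lecture.

```python
def goodness_of_fit(observed, expected_proportions):
    """Chi-square goodness-of-fit statistic and its degrees of freedom."""
    total = sum(observed)
    # Expected count in each category if the population proportions hold
    expected = [total * p for p in expected_proportions]
    statistic = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    df = len(observed) - 1  # number of categories minus one
    return statistic, df

# Hypothetical data: 100 subjects in three categories,
# compared with assumed population proportions of 40%, 40% and 20%.
observed = [50, 30, 20]
proportions = [0.4, 0.4, 0.2]

statistic, df = goodness_of_fit(observed, proportions)
print(f"chi-square = {statistic:.2f}, df = {df}")
```

The test for independence uses the same statistic, but the expected counts come from the row and column totals of a contingency table rather than from stated population proportions.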
Test of Association I: Chisquare Test
• 𝜒2 cannot have a negative value: the distribution curve lies entirely on the
positive side and is skewed to the right.
• More of the area under the curve lies towards the left of the graph
• It has no symmetry
• Its shape depends on the degrees of freedom (df): one less than the number of
categories (k - 1) for a goodness-of-fit test, and (r - 1) x (c - 1) for an
r x c contingency table
• The chi-squared test is a non-parametric test.
Presentation of Chisquare Test Data
• Data are presented in an r x c table (row x column),
• also called a cross-classification or contingency table
• Data are presented in cells, arranged in rows (horizontal) and columns
(vertical).
• These often appear in the form of a 2 x 2 table
• 𝜒2 is also used with more than two rows and/or columns;
• an r by c contingency table has r rows and c columns
Example of a 2 x 2 table because each
variable has two levels
• 𝜒2 only works when frequencies are used in the cells.
• Data such as proportions, means or physical measurements are not valid.
• 𝜒2 is more accurate when large frequencies are used
• The test is used to detect an association between data in rows and data in columns, but
it does not indicate the strength of any association.
• In a contingency table, one variable (usually the exposure) forms the rows and the
other variable (usually the disease) forms the columns.
• Column is the y axis and row is the x-axis
• Four internal cells (a – d) show the counts for each of the disease/exposure
groups
Presentation of Chisquare Test Data
• Cell ‘a’ shows the number with exposure present (immunized) and
disease present (illness positive).
• Cell ‘b’ shows the number with exposure present (immunized) and
disease absent (illness negative), etc.
• As in all analyses, it is important to
identify which variable is the outcome
variable (Column) and which variable
is the explanatory variable (Row).
Example of a 2 x 2 table
                   OUTCOME
Exposure present   Illness positive   Illness negative   Total
Immunized          a                  b                  a+b
Not immunized      c                  d                  c+d
Total              a+c                b+d                a+b+c+d
• It is important to set up the crosstabulation table so that it displays the percentages that
are appropriate for answering the research question.
• This can be achieved by either:
• entering the independent (explanatory) variable in the rows, the dependent
(outcome) in the columns and using row percentages, or
• entering the independent (explanatory) variable in the columns, the dependent
(outcome) in the rows and using column percentages.
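As a minimal sketch of the first option (explanatory variable in the rows, outcome in the columns, row percentages), the Python below uses made-up counts for the immunization layout. The numbers and the function name `row_percentages` are illustrative assumptions.

```python
def row_percentages(table):
    """Return every cell as a percentage of its own row total."""
    result = []
    for row in table:
        total = sum(row)
        result.append([100.0 * cell / total for cell in row])
    return result

# Hypothetical counts (not from the lecture):
# rows    = exposure (immunized, not immunized)
# columns = outcome  (illness positive, illness negative)
counts = [[10, 90],
          [30, 70]]

percentages = row_percentages(counts)
for label, row in zip(["Immunized", "Not immunized"], percentages):
    print(f"{label}: {row[0]:.1f}% ill, {row[1]:.1f}% not ill")
```

Reading across each row then answers the question directly: what percentage of each exposure group developed the illness.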
Assumptions/Conditions for using a chi-square test
• The assumptions that must be met when using a chi-square test are that:
• Each observation must be independent
• each participant is represented in the table once only (NO REPEAT DATA)
• None of the cells contains zero frequency
• all of the expected frequencies should be more than 1
• No cell should contain an expected frequency of less than 5;
• for larger tables, not more than 20% of the cells should contain an expected frequency of less than 5
• Samples are randomly drawn and are independent
• If these conditions are not met, the Chi-squared test is not valid and therefore
cannot be used.
• If the 𝜒2 is not valid and a 2 x 2 table is being used, Fisher's exact test is utilised.
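The expected-frequency conditions above can be checked mechanically before running the test. The sketch below is one possible way to do it; the counts are hypothetical and the helper names `expected_counts` and `check_expected_counts` are assumptions of this example.

```python
def expected_counts(table):
    """Expected count per cell: row total x column total / grand total."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand = sum(row_totals)
    return [[r * c / grand for c in col_totals] for r in row_totals]

def check_expected_counts(table):
    """Summarise the expected-frequency conditions for a chi-square test."""
    expected = [e for row in expected_counts(table) for e in row]
    share_below_5 = sum(1 for e in expected if e < 5) / len(expected)
    return {
        "min_expected": min(expected),
        "share_below_5": share_below_5,
        "all_above_1": min(expected) > 1,
        "at_most_20pct_below_5": share_below_5 <= 0.20,
    }

# Hypothetical 2 x 2 table with one small expected count
counts = [[3, 7],
          [12, 18]]

report = check_expected_counts(counts)
print(report)
```

Here one of the four expected counts falls below 5, so the 20% condition fails and an exact test would be the safer choice for this table.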
Which Chi-square to use?
• The chi-square statistic that is conventionally used depends on both the sample size
and the expected cell counts. The options are:
• Pearson’s chi-square
• Continuity correction
• Fisher’s exact test
• Linear-by-linear
Fisher’s exact test
• Fisher’s exact test is regarded as a gold standard and, when available, could be used in all
situations
• Pearson’s chi-square and the continuity correction tests are approximations.
• Fisher’s exact test is generally calculated for 2 × 2 contingency tables and small
sample sizes;
• depending on the program used, it may also be produced for crosstabulations larger than 2 × 2.
• The exact calculation, based on the exact distribution of the test statistic, provides a
reliable P value irrespective of the sample size or distribution of the data.
• In a 2 × 2 contingency table, Pearson’s chi-square tends to produce smaller P values than
Fisher’s exact test,
• so a type I error may occur
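To show what "exact" means here, the sketch below computes a two-sided Fisher's exact P value for a 2 × 2 table directly from the hypergeometric distribution, using only the Python standard library. It assumes one common two-sided convention (summing the probabilities of all tables that are no more likely than the observed one); the counts and the function name are illustrative.

```python
from math import comb

def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact P value for the 2 x 2 table [[a, b], [c, d]]."""
    row1, row2 = a + b, c + d
    col1 = a + c
    n = row1 + row2
    denom = comb(n, col1)

    def prob(x):
        # Hypergeometric probability that cell 'a' equals x, margins fixed
        return comb(row1, x) * comb(row2, col1 - x) / denom

    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    p_observed = prob(a)
    # Add up every table that is as likely as, or less likely than, the observed one.
    # The small tolerance guards against floating-point ties.
    return sum(
        p for p in (prob(x) for x in range(lowest, highest + 1))
        if p <= p_observed * (1 + 1e-9)
    )

# Hypothetical small-sample table
p_value = fisher_exact_two_sided(3, 1, 1, 3)
print(f"P = {p_value:.4f}")
```

No large-sample approximation is involved, which is why the result stays reliable when the cell counts are small.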
Fisher’s exact test (continued)
• A correction made to the calculation of Pearson’s chi-square (Yates’ continuity
correction) increases the P value.
• The correction tends to overestimate the P value, and
• a type II error may occur
• Yates’ correction should generally not be applied unless the sample size is small.
• The linear-by-linear test is a trend test,
• most appropriate in situations in which an ordered exposure variable has three or
more categories and the outcome variable is binary.
Calculating Chi-square value
• The test statistic is calculated by taking the:
• frequencies that are actually observed
(O) and then working out the
frequencies which would be expected
(E) if the null hypothesis were true.
• Null hypothesis (H0): there
is no association between the variables.
• Alternative hypothesis (H1): there is an
association between the variables.
• The expected count is the value
expected by chance alone and is
calculated for each cell as:
                                            Total
a                     b                     Row total = a+b
c                     d                     Row total = c+d
Column total = a+c    Column total = b+d    Grand total = a+b+c+d
For cell ‘a’:
$$E_a = \frac{(a+b) \times (a+c)}{a+b+c+d}$$
In general:
$$\text{Expected count} = \frac{\text{Row total} \times \text{Column total}}{\text{Grand total}}$$
This formula is used to produce the 𝜒2 statistic:
$$\chi^2 = \sum \frac{(O - E)^2}{E}$$
§ Where:
§ O = observed frequencies,
§ E = expected frequencies.
§ Degrees of freedom (d.f.) are calculated using
the formula: d.f. = (r-1) x (c-1), where
§ r = number of rows and
§ c = number of columns.
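Expected frequency for each cell (a –
d) in a 2 x 2 table can be calculated
as follows, with N = a+b+c+d:
$$E_a = \frac{(a+b)(a+c)}{N}, \quad E_b = \frac{(a+b)(b+d)}{N}, \quad E_c = \frac{(c+d)(a+c)}{N}, \quad E_d = \frac{(c+d)(b+d)}{N}$$
Expected frequencies can also be
calculated using probability theory/
distribution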
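The calculation described on this slide (expected count per cell, then the sum of (O – E)2/E, then the degrees of freedom) can be sketched for any r x c table as follows. The counts are hypothetical and the function name `chi_square` is an assumption of this sketch.

```python
def chi_square(table):
    """Pearson chi-square statistic and degrees of freedom for an r x c table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand = sum(row_totals)

    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count = row total x column total / grand total
            expected = row_totals[i] * col_totals[j] / grand
            statistic += (observed - expected) ** 2 / expected

    df = (len(table) - 1) * (len(table[0]) - 1)
    return statistic, df

# Hypothetical 2 x 2 table: every expected count is 25
counts = [[20, 30],
          [30, 20]]

statistic, df = chi_square(counts)
print(f"chi-square = {statistic:.2f}, df = {df}")
```

Each cell deviates from its expected count by 5, so each contributes 25/25 = 1 and the statistic is 4 on 1 degree of freedom.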
• Chi-square statistic compares the observed count in each cell to the count which
would be expected under the assumption of no association between the row and
column classifications.
• The continuity-corrected chi-square (Yates’s continuity correction) is calculated in a similar way,
with a correction for small data: 0.5 is subtracted from the absolute value of each
deviation (O – E) before it is squared.
• It is especially important to use when frequencies are small.
• Yates' correction can only be used for 2 x 2 tables.
• It lowers the value of the chi-square statistic and therefore makes the result less significant, i.e., the significance is
slightly reduced.
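The effect of the correction can be seen by computing both versions of the statistic for the same 2 x 2 table. This is a sketch with hypothetical counts; the function name and the `yates` flag are assumptions of the example.

```python
def chi_square_2x2(table, yates=False):
    """Chi-square for a 2 x 2 table, optionally with Yates' continuity correction."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand = sum(row_totals)

    statistic = 0.0
    for i in range(2):
        for j in range(2):
            expected = row_totals[i] * col_totals[j] / grand
            deviation = abs(table[i][j] - expected)
            if yates:
                # Subtract 0.5 from the absolute deviation before squaring
                deviation -= 0.5
            statistic += deviation ** 2 / expected
    return statistic

# Hypothetical counts
counts = [[20, 30],
          [30, 20]]

uncorrected = chi_square_2x2(counts)
corrected = chi_square_2x2(counts, yates=True)
print(f"Pearson = {uncorrected:.2f}, Yates = {corrected:.2f}")
```

The corrected value is always the smaller of the two, which is why the corrected P value is larger.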
• H0 for a chi-square test: there is no significant difference between the observed and
expected frequencies.
• If the observed and expected values are similar, then the 𝜒2 value will be close to zero and
therefore will not be significant.
• The larger the difference between the observed and expected frequencies, the larger the
𝜒2 value becomes, the less likely it is that the null hypothesis is true,
• and the more likely it is that the P value will be significant.
Steps in Hypothesis Testing
1. State hypothesis: null & alternative
2. Decide level of significance
3. Decide appropriate test statistic for the hypothesis
4. Review/ensure assumptions of sampling distribution
5. Choose the critical region of the test statistic
6. Work out test statistic:
§ i. write out chi-square formula, if not provided
§ ii. Calculate the expected frequency for each cell,
§ iii. Create table of expected frequencies and other variables,
§ iv. Compute the final chi-square value; check statistic against a distribution with known properties
7. Make a decision rule/interpret your finding
8. Conclusion
EXAMPLE: Frequencies for HbA1c testing by ethnic group (Source: adapted from Stewart and
Rao,2000)
CELLS   O   E   O – E   (O – E)2   (O – E)2/E
a
b
c
d
Total                              ∑ (O – E)2/E =
Create table of expected frequencies
• Using the marginal totals, probabilities can be
calculated.
• P(HbA1c +ve) = 558/774
• P(Asian) = 198/774
• If H0 is true and the two events are
independent, then P(HbA1c +ve and Asian) =
P(HbA1c +ve) x P(Asian)
• The joint probability in the cell for Asian
and HbA1c +ve = 558/774 x 198/774
• The expected frequency in the cell for
Asian and HbA1c +ve = (558/774) x (198/774)
x 774
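Carrying the arithmetic through with the marginal totals quoted on this slide (558, 198 and 774), one factor of 774 cancels, which gives the same result as the row total x column total / grand total rule:

$$E = \frac{558}{774} \times \frac{198}{774} \times 774 = \frac{558 \times 198}{774} = \frac{110484}{774} \approx 142.7$$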
Steps in Hypothesis testing
1. State hypothesis: null & alternative
2. Decide level of significance
3. Decide appropriate test statistic for the hypothesis
4. Review/ensure assumptions of sampling distribution
5. Choose the critical region of the test statistic: the critical value of 𝜒2 with df of 1 and alpha of
0.05 from the table = 3.841
6. Work out test statistic:
§ i. write out chi-square formula, if not provided
§ ii. Calculate the expected frequency for each cell,
§ iii. Create table of expected frequencies and other variables,
§ iv. Compute the final chi-square value; the statistic can then be checked against a distribution with known
properties
7. Make a decision rule/interpret your finding: if the observed value is bigger than this critical value,
you would say that there is a significant relationship between the two variables.
8. Conclusion
• The Chi-square test statistic = 7.32
• On the Chi-square distribution table, we look along the row for d.f. = 1.
• We look along the row to find the values to the left and right of the 𝜒2 statistic: it lies
between 6.635 and 10.827.
• Reading up the columns for these two values shows that the corresponding P-value is less than
0.01 but greater than 0.001; we can therefore write the P-value as P < 0.01.
• Thus there is strong evidence to reject the null hypothesis, and we may conclude
that there is an association between being Asian and receiving an HbA1c check.
Asian patients are significantly less likely to receive an HbA1c check, and appear to
receive a poorer quality of care in this respect.
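The table lookup can be cross-checked numerically. For 1 degree of freedom the chi-square variable is the square of a standard normal variable, so the upper-tail P value has a closed form that needs only the Python standard library. The function name below is an assumption of this sketch, and the shortcut applies to d.f. = 1 only.

```python
from math import erfc, sqrt

def chi_square_p_value_df1(statistic):
    """Upper-tail P value for a chi-square statistic with 1 degree of freedom."""
    # With df = 1, P(chi-square > x) = P(|Z| > sqrt(x)) = erfc(sqrt(x / 2))
    return erfc(sqrt(statistic / 2))

p_value = chi_square_p_value_df1(7.32)
print(f"P = {p_value:.4f}")

# Tabulated critical values for d.f. = 1 give back their significance levels
print(f"{chi_square_p_value_df1(3.841):.3f}")
print(f"{chi_square_p_value_df1(6.635):.3f}")
```

The P value for 7.32 falls between 0.001 and 0.01, in line with the reading taken from the distribution table.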
Class work
• The table shows the results of the
field trial of 2 whooping cough
vaccines.
• The question that arises is
whether the vaccine B was really
superior to vaccine A,
• OR
• whether the difference was
merely due to chance.
Assignment: In a study to find whether diabetes mellitus is related to blood group, a group
of 65 patients with DM was compared with 120 normal healthy individuals. The
observations are presented in the table below:
Subject    O    A    B    AB   TOTAL
NORMAL     58   30   28   4    120
DIABETIC   32   16   15   2    65
TOTAL      90   46   43   6    185
Is there an association between being diabetic and having a particular blood group type?
Chisquare test in SPSS
• When conducting a chi-square test in SPSS, the significance level is calculated using
the ‘asymptotic’ method,
• which means that P values are calculated based on the assumption that the data has
a large enough sample size to conform to a certain distribution
• If the sample size is small or some cells have a low count, the ‘exact’ P values should
be reported since the asymptotic P values will be unreliable.
• Exact calculation based on the exact distribution of the test statistics provides a
reliable P value irrespective of the sample size or distribution of the data
§ The result shows that 40.2% of males in the sample were premature compared with 20.3%
of females, i.e., the rate of prematurity in the males is almost twice that in the females.
§ The smallest cell has an observed count of 12.
§ Expected number for the cell: 59 × 45/141, or 18.83, as shown in the footnote of the
Chi-Square Tests table overleaf
§ In the 𝜒2 table, the third column’s heading, ‘Asymp. Sig. (two-sided)’, indicates that the significance
level for a two-sided test is calculated asymptotically.
§ The sample size is large, so the chi-square distribution approximates the exact distribution of the
Pearson statistic, and the Pearson chi-square value should be reported.
§ The continuity correction (Yates) results in a P value of 0.020, which is slightly higher than the P
value of 0.017 for the Fisher’s exact test.
§ The Fisher’s exact test would not be reported in this study because the sample size (141) is large
• This test is two-tailed and the corresponding value indicates that the difference in
rates of prematurity between the genders is statistically significant at P = 0.017.
• This result can be reported as ‘Fisher’s exact test indicated that there was a
significant difference in prematurity between males and females (40.2% vs 20.3%,
P = 0.02)’.
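The SPSS output itself is not reproduced in this text, but the figures quoted above are enough to infer a 2 x 2 table: 141 infants, 59 of them female with 12 premature (20.3%), and 45 premature overall, which leaves 33 premature among 82 males (40.2%). Treat these counts as an inference from the quoted numbers, not as the original data. Under that assumption, the sketch below recomputes the Pearson and continuity-corrected statistics and their P values for 1 degree of freedom.

```python
from math import erfc, sqrt

# Counts inferred from the quoted percentages and totals (an assumption):
# rows = sex (male, female); columns = (premature, not premature)
counts = [[33, 49],
          [12, 47]]

def chi_square_2x2(table, yates=False):
    """Chi-square for a 2 x 2 table, optionally with Yates' correction."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand = sum(row_totals)
    statistic = 0.0
    for i in range(2):
        for j in range(2):
            expected = row_totals[i] * col_totals[j] / grand
            deviation = abs(table[i][j] - expected)
            if yates:
                deviation -= 0.5
            statistic += deviation ** 2 / expected
    return statistic

def p_value_df1(statistic):
    """Upper-tail P value for 1 degree of freedom."""
    return erfc(sqrt(statistic / 2))

pearson = chi_square_2x2(counts)
corrected = chi_square_2x2(counts, yates=True)
smallest_expected = 59 * 45 / 141

print(f"smallest expected count = {smallest_expected:.2f}")
print(f"Pearson chi-square = {pearson:.3f}, P = {p_value_df1(pearson):.3f}")
print(f"Yates chi-square   = {corrected:.3f}, P = {p_value_df1(corrected):.3f}")
```

With these inferred counts the continuity-corrected P value rounds to the 0.020 quoted above, and the smallest expected count matches the 18.83 in the footnote, which supports the reconstruction.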
Strength of association
• Measures of the strength of association include
• Phi
• Contingency Coefficient
• Cramer’s V
• Most of these are measures of the strength of association rather than tests for its existence.
• The measures are based on modifying the chi-square statistic to take account of sample
size and degrees of freedom, and
• they try to restrict the range of the statistic to between 0 and 1
• to make it similar to the correlation coefficient
Strength of association
• Phi: This statistic is accurate for 2 X 2 contingency tables.
• for tables with greater than two dimensions the value of phi may not lie between 0 and 1 because the
chi-square value can exceed the sample size.
• Pearson suggested the use of the coefficient of contingency.
• Contingency Coefficient: This coefficient ensures a value between 0 and 1
• unfortunately, it seldom reaches its upper limit of 1,
• for this reason Cramer devised Cramer’s V.
• Cramer’s V: When both variables have only two categories, phi and Cramer’s V are
identical.
• when variables have more than two categories Cramer’s statistic can attain its maximum of
one – unlike the other two – and
• so it is the most useful.
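The slides do not give formulas for these three measures. The sketch below uses the usual textbook definitions, which is an assumption about what the lecture intends: phi = sqrt(𝜒2/N), contingency coefficient = sqrt(𝜒2/(𝜒2 + N)), and Cramer's V = sqrt(𝜒2/(N x min(r-1, c-1))). The input values are hypothetical.

```python
from math import sqrt

def association_measures(chi_square, n, n_rows, n_cols):
    """Phi, contingency coefficient and Cramer's V from a chi-square value."""
    phi = sqrt(chi_square / n)
    contingency = sqrt(chi_square / (chi_square + n))
    cramers_v = sqrt(chi_square / (n * min(n_rows - 1, n_cols - 1)))
    return phi, contingency, cramers_v

# Hypothetical 2 x 2 result: chi-square of 4.0 from 100 subjects
phi, contingency, cramers_v = association_measures(4.0, 100, 2, 2)
print(f"phi = {phi:.3f}, C = {contingency:.3f}, V = {cramers_v:.3f}")
```

For a 2 x 2 table min(r-1, c-1) is 1, so phi and Cramer's V coincide, as stated above.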
• Cramer’s statistic is 0.36 out of a possible
maximum value of 1.
• This represents a medium association
between the variables.
• Read like a correlation coefficient, it
represents a medium effect size
• The value is highly significant (p < .001)
indicating that a value of the test statistic
that is this big is unlikely to have happened
by chance, and therefore the strength of
the relationship is significant.
• These results confirm what the chi-square
test already told us but also give us some
idea of the size of effect.

More Related Content

Similar to Chisquare Test of Association.pdf in biostatistics

Tests of statistical significance : chi square and spss
Tests of statistical significance : chi square and spss Tests of statistical significance : chi square and spss
Tests of statistical significance : chi square and spss Drsnehas2
 
Chi-Square Presentation - Nikki.ppt
Chi-Square Presentation - Nikki.pptChi-Square Presentation - Nikki.ppt
Chi-Square Presentation - Nikki.pptBAGARAGAZAROMUALD2
 
The Chi-Square Statistic: Tests for Goodness of Fit and Independence
The Chi-Square Statistic: Tests for Goodness of Fit and IndependenceThe Chi-Square Statistic: Tests for Goodness of Fit and Independence
The Chi-Square Statistic: Tests for Goodness of Fit and Independencejasondroesch
 
chi-Square. test-
chi-Square. test-chi-Square. test-
chi-Square. test-shifanaz9
 
Chi square test
Chi square testChi square test
Chi square testNayna Azad
 
Research methodology and iostatistics ppt
Research methodology and iostatistics pptResearch methodology and iostatistics ppt
Research methodology and iostatistics pptNikhat Mohammadi
 
Chi square test final presentation
Chi square test final presentationChi square test final presentation
Chi square test final presentationRitesh Tiwari
 
Categorical Data and Statistical Analysis
Categorical Data and Statistical AnalysisCategorical Data and Statistical Analysis
Categorical Data and Statistical AnalysisMichael770443
 
Test of significance in Statistics
Test of significance in StatisticsTest of significance in Statistics
Test of significance in StatisticsVikash Keshri
 
Chi-square IMP.ppt
Chi-square IMP.pptChi-square IMP.ppt
Chi-square IMP.pptShivraj Nile
 

Similar to Chisquare Test of Association.pdf in biostatistics (20)

Tests of statistical significance : chi square and spss
Tests of statistical significance : chi square and spss Tests of statistical significance : chi square and spss
Tests of statistical significance : chi square and spss
 
Chi square mahmoud
Chi square mahmoudChi square mahmoud
Chi square mahmoud
 
Chi-Square Presentation - Nikki.ppt
Chi-Square Presentation - Nikki.pptChi-Square Presentation - Nikki.ppt
Chi-Square Presentation - Nikki.ppt
 
Contingency tables
Contingency tables  Contingency tables
Contingency tables
 
The Chi-Square Statistic: Tests for Goodness of Fit and Independence
The Chi-Square Statistic: Tests for Goodness of Fit and IndependenceThe Chi-Square Statistic: Tests for Goodness of Fit and Independence
The Chi-Square Statistic: Tests for Goodness of Fit and Independence
 
Statistic and orthodontic by almuzian
Statistic and orthodontic by almuzianStatistic and orthodontic by almuzian
Statistic and orthodontic by almuzian
 
chi-Square. test-
chi-Square. test-chi-Square. test-
chi-Square. test-
 
Chi square test
Chi square testChi square test
Chi square test
 
Research methodology and iostatistics ppt
Research methodology and iostatistics pptResearch methodology and iostatistics ppt
Research methodology and iostatistics ppt
 
chapter18.ppt
chapter18.pptchapter18.ppt
chapter18.ppt
 
chapter18.ppt
chapter18.pptchapter18.ppt
chapter18.ppt
 
chi sqare test.ppt
chi sqare test.pptchi sqare test.ppt
chi sqare test.ppt
 
chapter18.ppt
chapter18.pptchapter18.ppt
chapter18.ppt
 
Chi square test final presentation
Chi square test final presentationChi square test final presentation
Chi square test final presentation
 
Chi square test
Chi square test Chi square test
Chi square test
 
Categorical Data and Statistical Analysis
Categorical Data and Statistical AnalysisCategorical Data and Statistical Analysis
Categorical Data and Statistical Analysis
 
Chisquared test.pptx
Chisquared test.pptxChisquared test.pptx
Chisquared test.pptx
 
Test of significance in Statistics
Test of significance in StatisticsTest of significance in Statistics
Test of significance in Statistics
 
Chi-square IMP.ppt
Chi-square IMP.pptChi-square IMP.ppt
Chi-square IMP.ppt
 
chi_square test.pptx
chi_square test.pptxchi_square test.pptx
chi_square test.pptx
 

Recently uploaded

Schema on read is obsolete. Welcome metaprogramming..pdf
Schema on read is obsolete. Welcome metaprogramming..pdfSchema on read is obsolete. Welcome metaprogramming..pdf
Schema on read is obsolete. Welcome metaprogramming..pdfLars Albertsson
 
Unveiling Insights: The Role of a Data Analyst
Unveiling Insights: The Role of a Data AnalystUnveiling Insights: The Role of a Data Analyst
Unveiling Insights: The Role of a Data AnalystSamantha Rae Coolbeth
 
B2 Creative Industry Response Evaluation.docx
B2 Creative Industry Response Evaluation.docxB2 Creative Industry Response Evaluation.docx
B2 Creative Industry Response Evaluation.docxStephen266013
 
Delhi Call Girls CP 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
Delhi Call Girls CP 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip CallDelhi Call Girls CP 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
Delhi Call Girls CP 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Callshivangimorya083
 
High Class Call Girls Noida Sector 39 Aarushi 🔝8264348440🔝 Independent Escort...
High Class Call Girls Noida Sector 39 Aarushi 🔝8264348440🔝 Independent Escort...High Class Call Girls Noida Sector 39 Aarushi 🔝8264348440🔝 Independent Escort...
High Class Call Girls Noida Sector 39 Aarushi 🔝8264348440🔝 Independent Escort...soniya singh
 
VIP Call Girls in Amravati Aarohi 8250192130 Independent Escort Service Amravati
VIP Call Girls in Amravati Aarohi 8250192130 Independent Escort Service AmravatiVIP Call Girls in Amravati Aarohi 8250192130 Independent Escort Service Amravati
VIP Call Girls in Amravati Aarohi 8250192130 Independent Escort Service AmravatiSuhani Kapoor
 
Invezz.com - Grow your wealth with trading signals
Invezz.com - Grow your wealth with trading signalsInvezz.com - Grow your wealth with trading signals
Invezz.com - Grow your wealth with trading signalsInvezz1
 
VIP High Class Call Girls Jamshedpur Anushka 8250192130 Independent Escort Se...
VIP High Class Call Girls Jamshedpur Anushka 8250192130 Independent Escort Se...VIP High Class Call Girls Jamshedpur Anushka 8250192130 Independent Escort Se...
VIP High Class Call Girls Jamshedpur Anushka 8250192130 Independent Escort Se...Suhani Kapoor
 
Industrialised data - the key to AI success.pdf
Industrialised data - the key to AI success.pdfIndustrialised data - the key to AI success.pdf
Industrialised data - the key to AI success.pdfLars Albertsson
 
VIP High Profile Call Girls Amravati Aarushi 8250192130 Independent Escort Se...
VIP High Profile Call Girls Amravati Aarushi 8250192130 Independent Escort Se...VIP High Profile Call Girls Amravati Aarushi 8250192130 Independent Escort Se...
VIP High Profile Call Girls Amravati Aarushi 8250192130 Independent Escort Se...Suhani Kapoor
 
Ukraine War presentation: KNOW THE BASICS
Ukraine War presentation: KNOW THE BASICSUkraine War presentation: KNOW THE BASICS
Ukraine War presentation: KNOW THE BASICSAishani27
 
Full night 🥵 Call Girls Delhi New Friends Colony {9711199171} Sanya Reddy ✌️o...
Full night 🥵 Call Girls Delhi New Friends Colony {9711199171} Sanya Reddy ✌️o...Full night 🥵 Call Girls Delhi New Friends Colony {9711199171} Sanya Reddy ✌️o...
Full night 🥵 Call Girls Delhi New Friends Colony {9711199171} Sanya Reddy ✌️o...shivangimorya083
 
{Pooja: 9892124323 } Call Girl in Mumbai | Jas Kaur Rate 4500 Free Hotel Del...
{Pooja:  9892124323 } Call Girl in Mumbai | Jas Kaur Rate 4500 Free Hotel Del...{Pooja:  9892124323 } Call Girl in Mumbai | Jas Kaur Rate 4500 Free Hotel Del...
{Pooja: 9892124323 } Call Girl in Mumbai | Jas Kaur Rate 4500 Free Hotel Del...Pooja Nehwal
 
Kantar AI Summit- Under Embargo till Wednesday, 24th April 2024, 4 PM, IST.pdf
Kantar AI Summit- Under Embargo till Wednesday, 24th April 2024, 4 PM, IST.pdfKantar AI Summit- Under Embargo till Wednesday, 24th April 2024, 4 PM, IST.pdf
Kantar AI Summit- Under Embargo till Wednesday, 24th April 2024, 4 PM, IST.pdfSocial Samosa
 
Predicting Employee Churn: A Data-Driven Approach Project Presentation
Predicting Employee Churn: A Data-Driven Approach Project PresentationPredicting Employee Churn: A Data-Driven Approach Project Presentation
Predicting Employee Churn: A Data-Driven Approach Project PresentationBoston Institute of Analytics
 
Call Girls in Defence Colony Delhi 💯Call Us 🔝8264348440🔝
Call Girls in Defence Colony Delhi 💯Call Us 🔝8264348440🔝Call Girls in Defence Colony Delhi 💯Call Us 🔝8264348440🔝
Call Girls in Defence Colony Delhi 💯Call Us 🔝8264348440🔝soniya singh
 
VIP Call Girls Service Miyapur Hyderabad Call +91-8250192130
VIP Call Girls Service Miyapur Hyderabad Call +91-8250192130VIP Call Girls Service Miyapur Hyderabad Call +91-8250192130
VIP Call Girls Service Miyapur Hyderabad Call +91-8250192130Suhani Kapoor
 
Saket, (-DELHI )+91-9654467111-(=)CHEAP Call Girls in Escorts Service Saket C...
Saket, (-DELHI )+91-9654467111-(=)CHEAP Call Girls in Escorts Service Saket C...Saket, (-DELHI )+91-9654467111-(=)CHEAP Call Girls in Escorts Service Saket C...
Saket, (-DELHI )+91-9654467111-(=)CHEAP Call Girls in Escorts Service Saket C...Sapana Sha
 
Aminabad Call Girl Agent 9548273370 , Call Girls Service Lucknow
Aminabad Call Girl Agent 9548273370 , Call Girls Service LucknowAminabad Call Girl Agent 9548273370 , Call Girls Service Lucknow
Aminabad Call Girl Agent 9548273370 , Call Girls Service Lucknowmakika9823
 

Recently uploaded (20)

Schema on read is obsolete. Welcome metaprogramming..pdf
Schema on read is obsolete. Welcome metaprogramming..pdfSchema on read is obsolete. Welcome metaprogramming..pdf
Schema on read is obsolete. Welcome metaprogramming..pdf
 
Unveiling Insights: The Role of a Data Analyst
Unveiling Insights: The Role of a Data AnalystUnveiling Insights: The Role of a Data Analyst
Unveiling Insights: The Role of a Data Analyst
 
B2 Creative Industry Response Evaluation.docx
B2 Creative Industry Response Evaluation.docxB2 Creative Industry Response Evaluation.docx
B2 Creative Industry Response Evaluation.docx
 
Delhi Call Girls CP 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
Delhi Call Girls CP 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip CallDelhi Call Girls CP 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
Delhi Call Girls CP 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
 
High Class Call Girls Noida Sector 39 Aarushi 🔝8264348440🔝 Independent Escort...
High Class Call Girls Noida Sector 39 Aarushi 🔝8264348440🔝 Independent Escort...High Class Call Girls Noida Sector 39 Aarushi 🔝8264348440🔝 Independent Escort...
High Class Call Girls Noida Sector 39 Aarushi 🔝8264348440🔝 Independent Escort...
 
VIP Call Girls in Amravati Aarohi 8250192130 Independent Escort Service Amravati
VIP Call Girls in Amravati Aarohi 8250192130 Independent Escort Service AmravatiVIP Call Girls in Amravati Aarohi 8250192130 Independent Escort Service Amravati
VIP Call Girls in Amravati Aarohi 8250192130 Independent Escort Service Amravati
 
Invezz.com - Grow your wealth with trading signals
Invezz.com - Grow your wealth with trading signalsInvezz.com - Grow your wealth with trading signals
Invezz.com - Grow your wealth with trading signals
 
VIP High Class Call Girls Jamshedpur Anushka 8250192130 Independent Escort Se...
VIP High Class Call Girls Jamshedpur Anushka 8250192130 Independent Escort Se...VIP High Class Call Girls Jamshedpur Anushka 8250192130 Independent Escort Se...
VIP High Class Call Girls Jamshedpur Anushka 8250192130 Independent Escort Se...
 
Industrialised data - the key to AI success.pdf
Industrialised data - the key to AI success.pdfIndustrialised data - the key to AI success.pdf
Industrialised data - the key to AI success.pdf
 
VIP High Profile Call Girls Amravati Aarushi 8250192130 Independent Escort Se...
VIP High Profile Call Girls Amravati Aarushi 8250192130 Independent Escort Se...VIP High Profile Call Girls Amravati Aarushi 8250192130 Independent Escort Se...
VIP High Profile Call Girls Amravati Aarushi 8250192130 Independent Escort Se...
 
Ukraine War presentation: KNOW THE BASICS
Ukraine War presentation: KNOW THE BASICSUkraine War presentation: KNOW THE BASICS
Ukraine War presentation: KNOW THE BASICS
 
Full night 🥵 Call Girls Delhi New Friends Colony {9711199171} Sanya Reddy ✌️o...
Full night 🥵 Call Girls Delhi New Friends Colony {9711199171} Sanya Reddy ✌️o...Full night 🥵 Call Girls Delhi New Friends Colony {9711199171} Sanya Reddy ✌️o...
Full night 🥵 Call Girls Delhi New Friends Colony {9711199171} Sanya Reddy ✌️o...
 
{Pooja: 9892124323 } Call Girl in Mumbai | Jas Kaur Rate 4500 Free Hotel Del...
{Pooja:  9892124323 } Call Girl in Mumbai | Jas Kaur Rate 4500 Free Hotel Del...{Pooja:  9892124323 } Call Girl in Mumbai | Jas Kaur Rate 4500 Free Hotel Del...
{Pooja: 9892124323 } Call Girl in Mumbai | Jas Kaur Rate 4500 Free Hotel Del...
 
Kantar AI Summit- Under Embargo till Wednesday, 24th April 2024, 4 PM, IST.pdf
Kantar AI Summit- Under Embargo till Wednesday, 24th April 2024, 4 PM, IST.pdfKantar AI Summit- Under Embargo till Wednesday, 24th April 2024, 4 PM, IST.pdf
Kantar AI Summit- Under Embargo till Wednesday, 24th April 2024, 4 PM, IST.pdf
 
E-Commerce Order PredictionShraddha Kamble.pptx
E-Commerce Order PredictionShraddha Kamble.pptxE-Commerce Order PredictionShraddha Kamble.pptx
E-Commerce Order PredictionShraddha Kamble.pptx
 
Predicting Employee Churn: A Data-Driven Approach Project Presentation
Predicting Employee Churn: A Data-Driven Approach Project PresentationPredicting Employee Churn: A Data-Driven Approach Project Presentation
Predicting Employee Churn: A Data-Driven Approach Project Presentation
 
Call Girls in Defence Colony Delhi 💯Call Us 🔝8264348440🔝
Call Girls in Defence Colony Delhi 💯Call Us 🔝8264348440🔝Call Girls in Defence Colony Delhi 💯Call Us 🔝8264348440🔝
Call Girls in Defence Colony Delhi 💯Call Us 🔝8264348440🔝
 
VIP Call Girls Service Miyapur Hyderabad Call +91-8250192130
VIP Call Girls Service Miyapur Hyderabad Call +91-8250192130VIP Call Girls Service Miyapur Hyderabad Call +91-8250192130
VIP Call Girls Service Miyapur Hyderabad Call +91-8250192130
 
Saket, (-DELHI )+91-9654467111-(=)CHEAP Call Girls in Escorts Service Saket C...
Saket, (-DELHI )+91-9654467111-(=)CHEAP Call Girls in Escorts Service Saket C...Saket, (-DELHI )+91-9654467111-(=)CHEAP Call Girls in Escorts Service Saket C...
Saket, (-DELHI )+91-9654467111-(=)CHEAP Call Girls in Escorts Service Saket C...
 
Aminabad Call Girl Agent 9548273370 , Call Girls Service Lucknow
Aminabad Call Girl Agent 9548273370 , Call Girls Service LucknowAminabad Call Girl Agent 9548273370 , Call Girls Service Lucknow
Aminabad Call Girl Agent 9548273370 , Call Girls Service Lucknow
 

Chisquare Test of Association.pdf in biostatistics

  • 1. KEBBI STATE UNIVERSITY OF SCIENCE & TECHNOLOGY, ALIERO COLLEGE OF HEALTH SCIENCES DEPARTMENT OF COMMUNITH HEALTH COM 201: BIOSTATISTICS CHI-SQUARE TEST OF ASSOCIATION By Prof AU Ka’oje
  • 2. Test of Association I: Chisquare Test • A chi-square test (denoted as 𝜒2) is used to assess whether the distribution of a categorical variable is significantly different between two or more groups. • Used to /detect the existence of association/relation between two variables - categorical variables. • between data in rows and data in columns, • but it does not indicate the strength of any association • In health research, a test of chi-square is frequently used to assess whether disease (present/absent) is associated with exposure (yes/no) • Chi-square tests are appropriate for most study designs but the results are influenced by the sample size.
  • 3. • It depends on the size of the differences between observed and expected frequencies, degrees of freedom, and sample size. • There are two types of tests: • 1. The goodness of fit: It is used to determine how good the observed data represents the population (expected values) • 2. Test for independence: It is used to determine if there is a relation between two categorical variables
  • 4. Test of Association I: Chisquare Test • 𝜒2 cannot have negative value, as distribution curve is always on the positive side, i.e, skewed to the right. • More areas under the curve towards the left of the graph • it has no symmetry • It depends on degree of freedom (df) which are always one less than sample size (n-1) • The chi- squared test is a non- parametric test.
  • 5. Presentation of Chisquare Test Data • Data is presented in an r x c table (row x column), • cross-classification or contingency table • Data are presented in cells, arranged in rows (horizontal) and columns (vertical). • These often appear in the form of a 2 x 2 table • 𝜒2 is also used in more than two rows and/or columns; • m by n contingency table with m columns and n rows Example of a 2 x 2 table because each variable has two levels
  • 6. • 𝜒2 only works when frequencies are used in the cells. • Data such as proportions, means or physical measurements are not valid. • 𝜒2 – is more accurate when large frequencies are used – • test is used to detect an association between data in rows and data in columns, but it does not indicate the strength of any association. • In a contingency table, one variable (usually the exposure) forms the rows and the other variable (usually the disease) forms the columns. • Column is the y axis and row is the x-axis • Four internal cells (a – d) show the counts for each of the disease/exposure groups
  • 7. Presentation of Chi-square Test Data • Cell ‘a’ shows the number who satisfy exposure present (immunized) and disease present (illness positive). • Cell ‘b’ shows the number who satisfy exposure present (immunized) and disease absent (illness negative), etc. • As in all analyses, it is important to identify which variable is the outcome variable (column) and which is the explanatory variable (row). • Example of a 2 × 2 table:
      Exposure         Outcome: illness positive   Outcome: illness negative   Total
      Immunized        a                           b                           a + b
      Not immunized    c                           d                           c + d
      Total            a + c                       b + d                       a + b + c + d
  • 8. • It is important to set up the crosstabulation table so that it displays the percentages appropriate for answering the research question. • This can be achieved by either: • entering the independent (explanatory) variable in the rows and the dependent (outcome) variable in the columns, and using row percentages, or • entering the independent (explanatory) variable in the columns and the dependent (outcome) variable in the rows, and using column percentages.
  • 9. Assumptions/Conditions for using a chi-square test • The assumptions that must be met when using a chi-square test are: • Each observation must be independent: • each participant is represented in the table once only (NO REPEAT DATA). • Samples are randomly drawn and are independent. • No cell should have an expected frequency of zero; all expected frequencies should be at least 1. • In a 2 × 2 table, no cell should have an expected frequency of less than 5. • In larger tables, not more than 20% of the cells should have an expected frequency of less than 5.
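The expected-frequency conditions can be checked mechanically before the test is run. The sketch below is one possible reading of the rules on this slide, assuming that the "no expected frequency below 5" rule applies to 2 × 2 tables and the 20% rule to larger tables. The function names and the counts are hypothetical.

```python
def expected_counts(table):
    """Expected count for each cell: row total x column total / grand total."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand = sum(row_totals)
    return [[r * c / grand for c in col_totals] for r in row_totals]

def expected_counts_ok(table):
    """Check the expected-frequency conditions (assumed reading of the slide)."""
    cells = [e for row in expected_counts(table) for e in row]
    if min(cells) < 1:                       # every expected count must be at least 1
        return False
    small = sum(1 for e in cells if e < 5)   # cells with expected count below 5
    if len(table) == 2 and len(table[0]) == 2:
        return small == 0                    # 2 x 2 table: none may be below 5
    return small / len(cells) <= 0.20        # larger table: at most 20% below 5

print(expected_counts_ok([[30, 20], [25, 25]]))   # hypothetical large sample
print(expected_counts_ok([[3, 1], [1, 3]]))       # hypothetical small sample
```

Independence of observations and random sampling cannot be checked from the table; they depend on the study design.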
  • 10. • If these conditions are not met, the chi-square test is not valid and should not be used. • If 𝜒² is not valid and a 2 × 2 table is being used, Fisher's exact test is used instead. Which Chi-square to use? • The chi-square statistic that is conventionally used depends on both the sample size and the expected cell counts. The options are: • Pearson’s chi-square • Continuity correction • Fisher’s exact test • Linear-by-linear
  • 11. Fisher’s exact test • Fisher’s exact test is a gold standard test and, when available, can be used in all situations. • Pearson’s chi-square and the continuity correction tests are approximations. • Fisher’s exact test is generally calculated for 2 × 2 contingency tables and small sample sizes; • depending on the program used, it may also be produced for crosstabulations larger than 2 × 2. • The exact calculation, based on the exact distribution of the test statistic, provides a reliable P value irrespective of the sample size or distribution of the data. • In a 2 × 2 contingency table, Pearson’s chi-square tends to produce smaller P values than Fisher’s exact test, • so a type I error may occur.
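To make the idea of an exact calculation concrete, the following sketch computes a two-sided Fisher's exact P value for a 2 × 2 table from the hypergeometric distribution. It uses one common definition of the two-sided P value (summing the probabilities of all tables no more likely than the observed one); statistical packages may use a different convention. The function name and the counts are hypothetical.

```python
from math import comb

def fisher_exact_2x2(a, b, c, d):
    """Two-sided Fisher's exact P value for the 2 x 2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    n = row1 + row2
    denom = comb(n, col1)

    def prob(x):
        # Hypergeometric probability of x in cell 'a' with all margins fixed
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_obs = prob(a)
    lowest = max(0, col1 - row2)    # smallest possible value of cell 'a'
    highest = min(row1, col1)       # largest possible value of cell 'a'
    # Sum the probabilities of all tables no more likely than the observed one
    return sum(prob(x) for x in range(lowest, highest + 1)
               if prob(x) <= p_obs * (1 + 1e-9))

p = fisher_exact_2x2(3, 1, 1, 3)    # hypothetical small sample of 8
print(round(p, 4))
```

Because every possible table is enumerated, no large-sample approximation is involved, which is why the result remains reliable when counts are small.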
  • 12. Fisher’s exact test (continued) • A correction made to the calculation of Pearson’s chi-square (Yates’ continuity correction) increases the P value. • The correction tends to overestimate the P value, • so a type II error may occur. • Yates’ correction should generally not be applied unless the sample size is small. • The linear-by-linear test is a trend test, • most appropriate when an ordered exposure variable has three or more categories and the outcome variable is binary.
  • 13. Calculating Chi-square value • The test statistic is calculated by taking the frequencies that are actually observed (O) and then working out the frequencies that would be expected (E) if the null hypothesis were true. • Null hypothesis (H0): there is no association between the variables. • Alternative hypothesis (H1): there is an association between the variables. • The expected count is the value expected by chance alone and is calculated for each cell from the marginal totals:
      a                      b                      Row total = a + b
      c                      d                      Row total = c + d
      Column total = a + c   Column total = b + d   Grand total = a + b + c + d
    • Expected count = (Row total × Column total) / Grand total • For cell ‘a’, expected count = (a + b) × (a + c) / (a + b + c + d)
  • 14. • This formula is used to produce the 𝜒² statistic: 𝜒² = ∑ (O – E)² / E, summed over all cells, § where: § O = observed frequencies, § E = expected frequencies. § Degrees of freedom (d.f.) are calculated using the formula d.f. = (r – 1) × (c – 1), where § r = number of rows and § c = number of columns. • The expected frequency for each cell (a – d) in a 2 × 2 table is its row total multiplied by its column total, divided by the grand total. • Expected frequencies can also be calculated using probability theory.
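The calculation can be sketched in a few lines of Python. This is a minimal illustration, not a replacement for a statistics package; the function name `chi_square` and the counts are hypothetical.

```python
def chi_square(table):
    """Pearson chi-square statistic and degrees of freedom for an r x c table of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count = row total x column total / grand total
            expected = row_totals[i] * col_totals[j] / grand
            chi2 += (observed - expected) ** 2 / expected
    df = (len(table) - 1) * (len(table[0]) - 1)
    return chi2, df

# Hypothetical 2 x 2 table: rows = immunized / not immunized, columns = ill / not ill
chi2, df = chi_square([[10, 40], [30, 20]])
print(round(chi2, 2), df)
```

The same function works for any r × c table, since both the expected counts and the degrees of freedom are derived from the table's dimensions and margins.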
  • 15. • The chi-square statistic compares the observed count in each cell with the count that would be expected under the assumption of no association between the row and column classifications. • The continuity-corrected chi-square (Yates’ continuity correction) is calculated in a similar way, with a correction for small samples: 0.5 is subtracted from the absolute value of each deviation |O – E| before it is squared. • It is especially important to use when frequencies are small. • Yates’ correction can only be used for 2 × 2 tables. • It lowers the value of the chi-square statistic and therefore makes the result less significant, i.e. the P value is slightly larger. • H0 for a chi-square test: there is no significant difference between the observed and expected frequencies. • If the observed and expected values are similar, the 𝜒² value will be close to zero and therefore will not be significant.
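The effect of the continuity correction can be seen by computing the statistic both ways. The sketch below assumes a 2 × 2 table with cells a, b, c, d; the counts are hypothetical, and clamping the corrected deviation at zero is an added safeguard rather than something stated on the slide.

```python
def chi_square_2x2(a, b, c, d, yates=False):
    """Chi-square for a 2 x 2 table, optionally with Yates' continuity correction."""
    n = a + b + c + d
    observed = [a, b, c, d]
    expected = [(a + b) * (a + c) / n, (a + b) * (b + d) / n,
                (c + d) * (a + c) / n, (c + d) * (b + d) / n]
    chi2 = 0.0
    for o, e in zip(observed, expected):
        deviation = abs(o - e)
        if yates:
            # Subtract 0.5 from |O - E| before squaring (never below zero)
            deviation = max(deviation - 0.5, 0.0)
        chi2 += deviation ** 2 / e
    return chi2

plain = chi_square_2x2(12, 8, 5, 15)                   # hypothetical counts
corrected = chi_square_2x2(12, 8, 5, 15, yates=True)
print(round(plain, 3), round(corrected, 3))
```

The corrected value is always the smaller of the two, which is why the correction gives a larger P value.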
  • 16. • The larger the difference between the observed and expected frequencies, the larger the 𝜒² value becomes and the stronger the evidence against the null hypothesis, • so the more likely the P value is to be significant. Steps in Hypothesis Testing 1. State the hypotheses: null and alternative 2. Decide the level of significance 3. Decide the appropriate test statistic for the hypothesis 4. Review/ensure the assumptions of the sampling distribution 5. Choose the critical region of the test statistic 6. Work out the test statistic: § i. write out the chi-square formula, if not provided, § ii. calculate the expected frequency for each cell, § iii. create a table of expected frequencies and other variables, § iv. compute the final chi-square value and check the statistic against a distribution with known properties 7. Make a decision rule/interpret your finding 8. Conclusion
  • 17. EXAMPLE: Frequencies for HbA1c testing by ethnic group (Source: adapted from Stewart and Rao, 2000) • Create a table of expected frequencies with the following layout:
      Cell     O    E    O – E    (O – E)²    (O – E)² / E
      a
      b
      c
      d
      Total                                   ∑ (O – E)² / E =
  • 18. • Using the marginal totals, probabilities can be calculated: • P(HbA1c +ve) = 558/774 • P(N) = 198/774 • If H0 is true and the two events are independent, then P(HbA1c +ve and N) = P(HbA1c +ve) × P(N). • The joint probability for the cell Asian and HbA1c +ve = 558/774 × 198/774. • The expected frequency for the cell Asian and HbA1c +ve = (558/774) × (198/774) × 774 ≈ 142.7
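The probability route and the marginal-totals rule give the same expected frequency, as this short check shows. It uses only the three totals quoted on the slide; the variable names are illustrative.

```python
n = 774             # grand total
hba1c_total = 558   # marginal total for HbA1c +ve
group_total = 198   # marginal total used for the Asian cell on the slide

# Under independence, the joint probability is the product of the marginal probabilities
joint_probability = (hba1c_total / n) * (group_total / n)
expected_by_probability = joint_probability * n

# The same value from the "row total x column total / grand total" rule
expected_by_margins = hba1c_total * group_total / n

print(round(expected_by_probability, 2), round(expected_by_margins, 2))
```

Multiplying the joint probability by the grand total cancels one factor of 774, which is why the two expressions agree.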
  • 19. Steps in Hypothesis testing 1. State the hypotheses: null and alternative 2. Decide the level of significance 3. Decide the appropriate test statistic for the hypothesis 4. Review/ensure the assumptions of the sampling distribution 5. Choose the critical region of the test statistic: the critical value of 𝜒² with 1 df at alpha = 0.05, from the table, is 3.841 6. Work out the test statistic: § i. write out the chi-square formula, if not provided, § ii. calculate the expected frequency for each cell, § iii. create a table of expected frequencies and other variables, § iv. compute the final chi-square value; the statistic can then be checked against a distribution with known properties 7. Make a decision rule/interpret your finding: if the observed value is bigger than this critical value, you would say that there is a significant relationship between the two variables. 8. Conclusion
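Steps 5 and 7 amount to comparing the calculated statistic with a tabulated critical value. The sketch below hard-codes the standard 5% critical values for 1 to 4 degrees of freedom; the function name `decide` and the wording of the returned messages are hypothetical.

```python
# Critical values of the chi-square distribution at alpha = 0.05 (standard table values)
CRITICAL_0_05 = {1: 3.841, 2: 5.991, 3: 7.815, 4: 9.488}

def decide(chi2, df):
    """Apply the decision rule at the 5% level for df between 1 and 4."""
    critical = CRITICAL_0_05[df]
    if chi2 > critical:
        return "reject H0: evidence of an association"
    return "do not reject H0: no evidence of an association"

print(decide(7.32, 1))   # statistic reported in this deck for the HbA1c example
```

For other significance levels or more degrees of freedom, the dictionary would need the corresponding values from a chi-square table.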
  • 20.
  • 21. • The chi-square test statistic = 7.32. • On the chi-square distribution table, we look along the row for d.f. = 1 • to find the values to the left and right of the 𝜒² statistic: it lies between 6.635 and 10.827. • Reading up the columns for these two values shows that the corresponding P value is less than 0.01 but greater than 0.001, so we can write the P value as P < 0.01. • Thus there is strong evidence to reject the null hypothesis, and we may conclude that there is an association between being Asian and receiving an HbA1c check. Asian patients are significantly less likely to receive an HbA1c check, and appear to receive a poorer quality of care in this respect.
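Instead of bracketing the P value from a table, it can be computed directly when d.f. = 1. The sketch below relies on the fact that a chi-square variable with 1 degree of freedom is the square of a standard normal variable; it does not apply to other degrees of freedom. The function name is hypothetical.

```python
from math import erfc, sqrt

def p_value_df1(chi2):
    """Upper-tail P value of the chi-square distribution with 1 degree of freedom."""
    # With 1 df, chi-square is the square of a standard normal variable Z,
    # so P(X > chi2) = P(|Z| > sqrt(chi2)) = erfc(sqrt(chi2 / 2)).
    return erfc(sqrt(chi2 / 2))

p = p_value_df1(7.32)
print(p)   # falls between 0.001 and 0.01, in line with the table lookup
```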
  • 22. Class work • The table shows the results of a field trial of two whooping cough vaccines. • The question that arises is whether vaccine B was really superior to vaccine A, • or • whether the difference was merely due to chance.
  • 23. Assignment: In a study to find out whether diabetes mellitus is related to blood group, a group of 65 patients with DM was compared with 120 normal healthy individuals. The observations are presented in the table below:
      Subject     O     A     B     AB    Total
      Normal      58    30    28    4     120
      Diabetic    32    16    15    2     65
      Total       90    46    43    6     185
    Is there an association between being diabetic and having a particular blood group?
  • 24. Chi-square test in SPSS • When conducting a chi-square test in SPSS, the significance level is calculated using the ‘asymptotic’ method, • which means that P values are calculated on the assumption that the sample is large enough for the test statistic to conform to a known distribution. • If the sample size is small or some cells have a low count, the ‘exact’ P values should be reported, since the asymptotic P values will be unreliable. • The exact calculation, based on the exact distribution of the test statistic, provides a reliable P value irrespective of the sample size or distribution of the data.
  • 25. § The result shows that 40.2% of males in the sample were premature compared with 20.3% of females, i.e. the rate of prematurity in males is almost twice that in females. § The smallest cell has an observed count of 12. § The expected count for this cell is 59 × 45/141 = 18.83, as shown in the footnote of the Chi-Square Tests table overleaf.
  • 26. § In the chi-square tests table, the heading of the third column, ‘Asymp. Sig. (two-sided)’, indicates that the significance level for a two-sided test is calculated asymptotically. § The sample size is large, so the chi-square distribution approximates the exact distribution of the Pearson statistic, and the Pearson chi-square value should be reported. § The continuity correction (Yates) results in a P value of 0.020, which is slightly higher than the P value of 0.017 for Fisher’s exact test. § Fisher’s exact test would not be reported in this study because the sample size (141) is large.
  • 27. • This test is two-tailed and the corresponding value indicates that the difference in rates of prematurity between the genders is statistically significant at P = 0.017. • This result can be reported as ‘Fisher’s exact test indicated that there was a significant difference in prematurity between males and females (40.2% vs 20.3%, P = 0.02)’.
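The figures quoted for the SPSS example can be cross-checked by hand. The sketch below assumes a 2 × 2 table reconstructed from the numbers on these slides (141 infants, a row total of 59 and a column total of 45 for the smallest cell, an observed count of 12, and prematurity rates of 40.2% and 20.3%), which gives 33 of 82 males and 12 of 59 females premature. These counts are an inference, not taken from the SPSS output itself, and the function names are hypothetical. The shortcut formula for 2 × 2 tables is used here in place of the cell-by-cell sum.

```python
from math import erfc, sqrt

def chi_square_2x2(a, b, c, d, yates=False):
    """Shortcut chi-square formula for a 2 x 2 table, optionally Yates-corrected."""
    n = a + b + c + d
    diff = abs(a * d - b * c)
    if yates:
        diff = max(diff - n / 2, 0)   # continuity correction in shortcut form
    return n * diff ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))

def p_value_df1(chi2):
    """Upper-tail P value for chi-square with 1 degree of freedom."""
    return erfc(sqrt(chi2 / 2))

# Reconstructed counts (an assumption): premature / not premature
males = (33, 49)      # 33 / 82 = 40.2%
females = (12, 47)    # 12 / 59 = 20.3%

expected_smallest = 59 * 45 / 141   # expected count for the smallest cell
pearson = chi_square_2x2(*males, *females)
corrected = chi_square_2x2(*males, *females, yates=True)

print(round(expected_smallest, 2))
print(round(pearson, 2), round(p_value_df1(pearson), 3))
print(round(corrected, 2), round(p_value_df1(corrected), 3))
```

Under these assumed counts the continuity-corrected P value comes out close to the 0.020 quoted above, which supports the reconstruction, and the uncorrected Pearson P value is smaller, as the earlier slides lead one to expect.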
  • 28. Strength of association • Measures of the strength of association include: • Phi • Contingency Coefficient • Cramer’s V • These measures are based on modifying the chi-square statistic to take account of sample size and degrees of freedom, and • they try to restrict the range of the statistic to between 0 and 1, • to make it similar to the correlation coefficient.
  • 29. Strength of association • Phi: this statistic is accurate for 2 × 2 contingency tables. • For tables larger than 2 × 2, the value of phi may not lie between 0 and 1, because the chi-square value can exceed the sample size. • For this reason Pearson suggested the use of the coefficient of contingency. • Contingency Coefficient: this coefficient ensures a value between 0 and 1; • unfortunately, it seldom reaches its upper limit of 1, • and for this reason Cramer devised Cramer’s V. • Cramer’s V: when both variables have only two categories, phi and Cramer’s V are identical. • When variables have more than two categories, Cramer’s statistic can attain its maximum of 1, unlike the other two, • so it is the most useful.
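All three measures are simple transformations of the chi-square statistic. The sketch below uses the standard textbook definitions; the function names and the input values are hypothetical.

```python
from math import sqrt

def phi(chi2, n):
    """Phi coefficient: square root of chi-square divided by sample size."""
    return sqrt(chi2 / n)

def contingency_coefficient(chi2, n):
    """Pearson's contingency coefficient: always below 1."""
    return sqrt(chi2 / (chi2 + n))

def cramers_v(chi2, n, rows, cols):
    """Cramer's V: rescales by the smaller of (rows - 1) and (cols - 1)."""
    return sqrt(chi2 / (n * min(rows - 1, cols - 1)))

# Hypothetical result: chi-square = 16.0 from a 2 x 2 table with n = 100
print(phi(16.0, 100), cramers_v(16.0, 100, 2, 2))
print(round(contingency_coefficient(16.0, 100), 3))
```

For a 2 × 2 table the rescaling factor in Cramer's V equals 1, so it coincides with phi, as the slide states.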
  • 30. • Cramer’s statistic is 0.36 out of a possible maximum value of 1. • This represents a medium association between the variables; • read like a correlation coefficient, it represents a medium effect size. • The value is highly significant (p < .001), indicating that a test statistic this large is unlikely to have arisen by chance, and therefore that the relationship is significant. • These results confirm what the chi-square test already told us but also give us some idea of the size of the effect.