OBJECTIVES:
By the end of this session the students would be able to:
• Recognize the differences between categorical data and continuous data
• Discuss the assumptions of the chi-square distribution
• Correctly interpret and use the terms: chi-square test of independence, contingency table, degrees of freedom, “2x2” and “r x c” table
• Calculate the expected numbers for the cells of a contingency table
• Calculate the chi-square test statistic and its appropriate degrees of freedom
• Refer to the chi-square table to obtain the tabulated value
Categorical variables take on values that are names or labels, such as ethnicity (e.g. Sindhi, Punjabi, Balochi) and method of teaching (e.g. lecture, discussion, activity-based).
Quantitative variables are numerical. They represent a measurable quantity, for example the number of students taking Biostatistics supplementary classes.
CHI-SQUARE TEST:
It is used to determine whether there is a significant association between two categorical variables measured on a single population.
CHI-SQUARE DISTRIBUTION PROPERTIES:
• As the degrees of freedom increase, the chi-square curve approaches a normal distribution
• It has many shapes, which depend on its degrees of freedom (df)
• The distribution is skewed to the right
• A chi-square distribution takes positive values only
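The properties above can be illustrated numerically. The sketch below relies on standard facts that are not stated on the slides: a chi-square distribution with df degrees of freedom has mean df, variance 2·df and skewness sqrt(8/df). The function name `chi_square_shape` is made up for this illustration.

```python
import math

def chi_square_shape(df):
    """Return (mean, variance, skewness) of a chi-square distribution.

    Uses the textbook results mean = df, variance = 2*df and
    skewness = sqrt(8/df); these are assumptions brought in from
    standard theory, not taken from the slides.
    """
    mean = df
    variance = 2 * df
    skewness = math.sqrt(8 / df)
    return mean, variance, skewness

# Skewness is always positive (right skew) and shrinks toward 0 as df grows,
# which is why the curve looks more and more like a normal distribution.
for df in (1, 2, 5, 10, 30, 100):
    mean, variance, skewness = chi_square_shape(df)
    print(f"df={df:>3}  mean={mean:>3}  variance={variance:>3}  skewness={skewness:.3f}")
```

The skewness never reaches zero for a finite df, so the normal shape is only an approximation that improves as df increases.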
Commonly used approaches are:
• Test for independence
• Test of homogeneity
CHI-SQUARE TEST OF INDEPENDENCE:
A chi-square test of independence is used when we want to see if there is a relationship/association between two categorical variables.
EXAMPLES OF RELATIONSHIPS BETWEEN QUALITATIVE VARIABLES:
Qualitative variables are either ordinal or nominal.
Examples:
• Do nurses feel differently about a new postoperative procedure than doctors?
  Preference (Old/New) and Subjects (Nurses/Doctors)
• Is there any relationship between soya use and lung cancer?
  Soya Intake (Yes/No) and Lung Cancer (Yes/No)
• Is there any relationship between parents’ and their children’s education?
  Parent’s Education (Illiterate/Up to Intermediate/Graduate) and Children’s Education (Illiterate/Up to Intermediate/Graduate)
CONTINGENCY TABLE:
A table which classifies observations according to the categories of qualitative variables.
The number of individuals or items assigned to each category is called the frequency.
WHAT INFORMATION DOES CONTINGENCY TABLE REVEAL?
When we consider two categorical variables at a time, each observation belongs to a particular category of variable one as well as a particular category of variable two. A table that cross-classifies observations in this way is referred to as a contingency table.
The simplest form of contingency table is a 2x2 contingency table, with both variables having exactly two categories.
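As a minimal sketch of how such a table arises, the snippet below cross-tabulates paired observations into a 2x2 table. The six observations are hypothetical and exist only to show the mechanics, and `cross_tabulate` is an illustrative helper, not something from the lecture.

```python
from collections import Counter

# Hypothetical raw data: one (soya intake, lung cancer) pair per person.
observations = [
    ("Yes", "No"), ("Yes", "No"), ("Yes", "Yes"),
    ("No", "Yes"), ("No", "No"), ("No", "Yes"),
]

def cross_tabulate(pairs, row_levels, col_levels):
    """Count how many observations fall in each (row, column) category pair."""
    counts = Counter(pairs)
    return [[counts[(r, c)] for c in col_levels] for r in row_levels]

# Rows: soya intake Yes/No; columns: lung cancer Yes/No.
table = cross_tabulate(observations, ["Yes", "No"], ["Yes", "No"])
print(table)  # [[1, 2], [2, 1]]
```

Each cell is a frequency: the number of observations sharing that combination of categories.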
2. Chi-Square Test of Independence
Shakir Rahman
BScN, MScN, MSc Applied Psychology, PhD Nursing (Candidate)
University of Minnesota USA.
Principal & Assistant Professor
Ayub International College of Nursing & AHS Peshawar
Visiting Faculty
Swabi College of Nursing & Health Sciences Swabi
Nowshera College of Nursing & Health Sciences Nowshera
15. Helmet used at the time of road accident and serious brain injury

Helmet used at the time    Got serious brain injury
of road accident           Yes      No
Yes                        5        995
No                         25       975

• What information does cell no. 1 give?
Five persons (5) who used a helmet at the time of the road accident had a serious brain injury.
16. WHAT OTHER INFORMATION DOES CONTINGENCY TABLE REVEAL?
In this table, two categorical variables form an “r x c” contingency table, where “r” is the number of rows (the number of categories in the first variable, e.g. helmet used at the time of accident or not?) and “c” is the number of columns (the number of categories in the second variable, e.g. got severe brain injury or not?).
18. ASSUMPTIONS OF CHI-SQUARE TEST OF INDEPENDENCE
• The data are obtained from a random sample
• The expected frequency of each cell must be 5 or greater
Note: Frequencies must be used. If percentages are given, convert them into frequencies.
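The expected-frequency assumption can be checked before running the test. The sketch below uses the helmet table from slide 15 and the usual rule E = (row total x column total) / grand total; the helper names and the second, small table are hypothetical and only serve as a contrast.

```python
def expected_frequencies(observed):
    """Expected count per cell: row total * column total / grand total."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    grand_total = sum(row_totals)
    return [[r * c / grand_total for c in col_totals] for r in row_totals]

def meets_expected_count_rule(observed, minimum=5):
    """True when every expected frequency is at least `minimum`."""
    return all(e >= minimum
               for row in expected_frequencies(observed)
               for e in row)

# Helmet table from slide 15 (rows: helmet Yes/No; columns: injury Yes/No).
helmet = [[5, 995], [25, 975]]
print(expected_frequencies(helmet))       # [[15.0, 985.0], [15.0, 985.0]]
print(meets_expected_count_rule(helmet))  # True

# Hypothetical small table where every expected count is only 2.
small = [[3, 1], [1, 3]]
print(meets_expected_count_rule(small))   # False
```

The rule applies to expected counts, not observed ones, so a table with a small observed cell can still satisfy the assumption.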
19. FISHER'S EXACT TEST
• If the assumptions of the chi-square test are not fulfilled, i.e. one or more of the cells has an expected frequency of less than five, Fisher's exact test is used. It can be applied regardless of how small the expected frequencies are.
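For a 2x2 table, Fisher's exact test can be sketched with the standard library alone. The version below fixes the row and column totals, uses the hypergeometric probability of each possible table, and sums the probabilities of all tables that are no more likely than the observed one. That two-sided convention is one common choice and is an assumption here, as are the function name and the hypothetical 2x2 counts.

```python
from math import comb

def fisher_exact_two_sided(table):
    """Two-sided Fisher's exact p-value for a 2x2 table [[a, b], [c, d]]."""
    (a, b), (c, d) = table
    row1, row2 = a + b, c + d
    col1 = a + c
    n = row1 + row2

    def prob(x):
        # Hypergeometric probability that the top-left cell equals x
        # when all row and column totals are held fixed.
        return comb(row1, x) * comb(row2, col1 - x) / comb(n, col1)

    lowest = max(0, col1 - row2)   # smallest feasible top-left count
    highest = min(row1, col1)      # largest feasible top-left count
    p_observed = prob(a)
    # Small tolerance so ties with the observed probability are included.
    return sum(p for p in (prob(x) for x in range(lowest, highest + 1))
               if p <= p_observed * (1 + 1e-9))

# Hypothetical table whose expected counts are all 2 (below five).
p_value = fisher_exact_two_sided([[3, 1], [1, 3]])
print(f"{p_value:.4f}")  # 0.4857
```

With a p-value this large, such a table would give no evidence of association at the 5% level.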
20. HYPOTHESIS TESTING IN CHI-SQUARE TEST OF INDEPENDENCE
Null Hypothesis:
H0: Two variables are independent
OR
H0: There is no association between two variables
Alternate Hypothesis:
Ha: Two variables are not independent
OR
Ha: There is an association between two variables
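21. HYPOTHESIS TESTING IN CHI-SQUARE TEST OF INDEPENDENCE
Significance level: Alpha (α)
Test Statistic: Chi-square test
Expected Frequency (E) for a cell = (Row Total x Column Total) / Grand Total
$$\chi^2 = \sum_{i=1}^{r} \sum_{j=1}^{c} \frac{(O_{ij} - E_{ij})^2}{E_{ij}}$$
where O_ij and E_ij are the observed and expected frequencies of the cell in row i and column j.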
22. HYPOTHESIS TESTING IN CHI-SQUARE TEST OF INDEPENDENCE
Degrees of freedom (df) = (r - 1)(c - 1), where “r” is the total number of rows and “c” is the total number of columns.
Critical Region: $\chi^2_{(cal)} > \chi^2_{(tab)}$, where $\chi^2_{(tab)} = \chi^2_{\alpha, df}$
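The degrees of freedom and the decision rule can be sketched as follows. The dictionary holds upper-tail critical values at α = 0.05 as printed in standard chi-square tables for df 1 to 6; the function names and the statistic of 10.0 in the usage lines are hypothetical.

```python
# Tabulated chi-square critical values for alpha = 0.05 (upper tail).
CRITICAL_0_05 = {1: 3.841, 2: 5.991, 3: 7.815, 4: 9.488, 5: 11.070, 6: 12.592}

def degrees_of_freedom(rows, columns):
    """df = (r - 1)(c - 1) for an r x c contingency table."""
    return (rows - 1) * (columns - 1)

def reject_null(chi_square_calculated, rows, columns):
    """Reject H0 when the calculated value exceeds the tabulated value."""
    df = degrees_of_freedom(rows, columns)
    return chi_square_calculated > CRITICAL_0_05[df]

print(degrees_of_freedom(2, 2))  # 1
print(degrees_of_freedom(3, 3))  # 4
# Hypothetical calculated statistic of 10.0 for a 3 x 3 table:
print(reject_null(10.0, 3, 3))   # True, because 10.0 > 9.488
```

For other significance levels or larger tables, the lookup would need the corresponding column or rows of the printed table.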
24. How to calculate Chi-square test statistic?

Row#  Column#  O (Observed)  E (Expected)  (O-E)       (O-E)^2       (O-E)^2/E
1     1        O11           E11           (O11-E11)   (O11-E11)^2   (O11-E11)^2/E11
1     2        O12           E12           (O12-E12)   (O12-E12)^2   (O12-E12)^2/E12
.     .        .             .             .           .             .
i     j        Oij           Eij           (Oij-Eij)   (Oij-Eij)^2   (Oij-Eij)^2/Eij
.     .        .             .             .           .             .
r     c        Orc           Erc           (Orc-Erc)   (Orc-Erc)^2   (Orc-Erc)^2/Erc
Sum            Grand Total   Grand Total                             χ²
25. STEPS TO CALCULATE CHI-SQUARE (χ²)
• First calculate the expected frequency (E) for every cell
• Subtract the expected frequency from the observed frequency (O-E)
• Square the difference (O-E)
• Divide (O-E)^2 by E
• Do this for all cells in the table, and add the results together
• The sum of the column (O-E)^2/E gives the chi-square (χ²) value
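The steps above can be sketched in a few lines that build one working row per cell, in the layout of slide 24. The helmet counts from slide 15 are used as input; `chi_square_working_table` is an illustrative name, not part of the lecture.

```python
def chi_square_working_table(observed):
    """Return (rows, chi_square); each row is (i, j, O, E, O-E, (O-E)^2, (O-E)^2/E)."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    grand_total = sum(row_totals)
    rows = []
    for i, row in enumerate(observed, start=1):
        for j, o in enumerate(row, start=1):
            e = row_totals[i - 1] * col_totals[j - 1] / grand_total  # expected
            diff = o - e                                             # O - E
            rows.append((i, j, o, e, diff, diff ** 2, diff ** 2 / e))
    chi_square = sum(r[-1] for r in rows)  # sum of the last column
    return rows, chi_square

# Helmet table from slide 15 (rows: helmet Yes/No; columns: injury Yes/No).
helmet = [[5, 995], [25, 975]]
rows, chi_square = chi_square_working_table(helmet)

print("Row Col     O       E    O-E  (O-E)^2  (O-E)^2/E")
for i, j, o, e, diff, squared, contribution in rows:
    print(f"{i:>3} {j:>3} {o:>5} {e:>7.1f} {diff:>6.1f} {squared:>8.1f} {contribution:>10.4f}")
print(f"Chi-square = {chi_square:.2f}")  # Chi-square = 13.54
```

Most of the total comes from the two "injury: Yes" cells, because the same squared difference is divided by a much smaller expected count there.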
26. Observed Frequency:

Helmet used at the time    Got serious brain injury    Total
of road accident           Yes      No
Yes                        5        995                1000
No                         25       975                1000
Total                      30       1970               2000

Calculation of Expected Frequency:

Helmet used at the time    Got serious brain injury            Total
of road accident           Yes             No
Yes                        1000*30/2000    1000*1970/2000      1000
No                         1000*30/2000    1000*1970/2000      1000
Total                      30              1970                2000
27. Expected Frequencies after calculation:

Helmet used at the time    Got serious brain injury
of road accident           Yes      No
Yes                        15       985
No                         15       985
28. TYPE OF THERAPY AND PATIENT’S CONDITION WITH INCOMPLETE SPINAL CORD INJURY
A total of 165 patients with incomplete spinal cord injury who came to a clinic over a period of one year were treated with three treatment regimens (1: Only medicine; 2: Medicine & physical therapies; 3: Medicine and physical therapies with counseling). Each patient’s condition was rated fully improved, partially improved or not improved. The results are shown here.

Type of Therapy                  Patient’s Condition                                    Total
                                 Fully improved   Partially improved   Not improved
Only Medicine                    10               15                   25               R1 = 50
Medicine & physical therapies    15               25                   15               R2 = 55
Medicine & physical therapies    20               30                   10               R3 = 60
with counseling
Total                            C1 = 45          C2 = 70              C3 = 50          N = 165

Test whether there is an association between type of therapy and patient’s condition at 5% level of significance.
29. Type of Therapy and Patient’s Condition with Incomplete Spinal Cord Injury (contd.)

Type of Therapy              Fully improved        Partially improved    Not improved          Total
Only Medicine (I)            O11 = 10              O12 = 15              O13 = 25              R1 = 50
                             E11 = (50)(45)/165    E12 = (50)(70)/165    E13 = (50)(50)/165
                                 = 13.6                = 21.2                = 15.2
Medicine & physical          O21 = 15              O22 = 25              O23 = 15              R2 = 55
therapies with skill         E21 = (55)(45)/165    E22 = (55)(70)/165    E23 = (55)(50)/165
building activities (II)         = 15.0                = 23.3                = 16.7
Medicine & physical          O31 = 20              O32 = 30              O33 = 10              R3 = 60
therapies with skill         E31 = (60)(45)/165    E32 = (60)(70)/165    E33 = (60)(50)/165
building activities and          = 16.4                = 25.5                = 18.2
counseling (III)
Total                        C1 = 45               C2 = 70               C3 = 50               N = 165
32. STEPS OF HYPOTHESIS TESTING
1) Hypothesis:
H0: There is no association between type of therapy and patient’s condition
Ha: There is an association between type of therapy and patient’s condition
2) Alpha = 0.05
3) Test Statistic: Chi-Square Test
$$\chi^2 = \sum_{i=1}^{r} \sum_{j=1}^{c} \frac{(O_{ij} - E_{ij})^2}{E_{ij}}$$
Chi-square calculated value = 14.59
33. STEPS OF HYPOTHESIS TESTING
4) Critical Region: Reject H0 if $\chi^2_{(cal)} > \chi^2_{(tab)}$
df = (r-1)(c-1) = (3-1)(3-1) = 4
$\chi^2_{(tab)} = \chi^2_{\alpha, df} = \chi^2_{\alpha = 0.05, df = 4} = 9.49$
5) Conclusion: As $\chi^2_{(cal)}$ = 14.59 is greater than the tabulated value of 9.49, we reject H0 at the 5% level of significance and conclude that there is an association between type of therapy and patient’s condition.
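The worked example can be re-run end to end as a check. The sketch below uses the observed counts from slide 28 and the tabulated value 9.49 quoted above; the function name is illustrative. Because it keeps the expected frequencies unrounded, it arrives at about 14.78 rather than 14.59. The small gap is attributable to rounding the expected frequencies during hand calculation, and the conclusion is the same either way.

```python
def chi_square_independence(observed):
    """Return (chi_square, df, expected) for an r x c table of observed counts."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    grand_total = sum(row_totals)
    expected = [[r * c / grand_total for c in col_totals] for r in row_totals]
    chi_square = sum((o - e) ** 2 / e
                     for obs_row, exp_row in zip(observed, expected)
                     for o, e in zip(obs_row, exp_row))
    df = (len(observed) - 1) * (len(observed[0]) - 1)
    return chi_square, df, expected

# Rows: the three therapy regimens; columns: fully / partially / not improved.
therapy = [[10, 15, 25],
           [15, 25, 15],
           [20, 30, 10]]

chi_square, df, expected = chi_square_independence(therapy)
critical_value = 9.49  # tabulated value for alpha = 0.05, df = 4 (from the slide)

print(f"chi-square = {chi_square:.2f}, df = {df}")
print("Reject H0" if chi_square > critical_value else "Fail to reject H0")
```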
35. Acknowledgements
Dr Tazeen Saeed Ali
RM, RM, BScN, MSc ( Epidemiology & Biostatistics), Ph.D.
(Medical Sciences), Post Doctorate (Health Policy & Planning)
Associate Dean School of Nursing & Midwifery
The Aga Khan University Karachi.
Kiran Ramzan Ali Lalani
BScN, MSc Epidemiology & Biostatistics (Candidate)
Registered Nurse (NICU)
Aga Khan University Hospital