1. Chi-Square Test
What is chi-square testing?
o Identifies significant differences between the observed frequencies and the expected
frequencies of a particular group.
o Attempts to identify whether any differences between the expected and observed
frequencies are due to chance or to some other factor that is affecting them.
o There are actually many types of Chi-square tests, but the most common one is the
Pearson Chi-square Test.
Terms and Definitions
o Data - 2 types
a. Numerical data - comes in the form of numbers. (ex. 1, 2, 3, 4)
b. Categorical data - comes in the form of divisions. (ex. Yes or no)
o Expected Frequencies
- values for parameters that are hypothesized to occur
- can be determined through:
1) Hypothesizing that the frequencies are equal for each category.
2) Hypothesizing the values on the basis of some prior knowledge.
3) A mathematical method (see Step 4 of the example)
Two applications of the Pearson Chi-Square Test
1) Chi-square test for independence
- This tests whether the “category” from which the data come affects the data.
- May also be thought of as testing whether the categories in the experiment “prefer” certain
kinds of data.
Example: Is there a difference in the car choices of males and females?
2) Chi-square test for goodness-of-fit
- This tests whether the observed data “fit” the expected data.
Example: Do the car sales this year match the car sales last year? (i.e., did we still sell around 50
blue cars? 25 red cars?)
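The goodness-of-fit idea can be sketched in a few lines of Python. This is a minimal illustration, not a full test: it uses the Pearson statistic (the sum of (O - E)^2 / E over the categories), takes the expected counts of 50 blue and 25 red cars from the example, and assumes hypothetical observed counts for this year, since the text gives none. The function name is also an assumption made for illustration.

```python
def goodness_of_fit_statistic(observed, expected):
    """Return the Pearson chi-square statistic for paired category counts."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have the same length")
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

# Expected counts from the example: about 50 blue cars and 25 red cars.
expected_sales = [50, 25]
# Hypothetical observed counts for this year (NOT from the source).
observed_sales = [45, 30]

statistic = goodness_of_fit_statistic(observed_sales, expected_sales)
print(f"chi-square = {statistic:.2f}")
```

A larger statistic means the observed sales fit the expected sales less well; whether it is large enough to matter is decided by comparing it with a tabular value.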
Requirements of the Chi-squared Test
1. The values of the parameters to be compared are quantitative (counts) and nominal (grouped by category).
2. There should be one or more categories in the setup.
3. The observations should be independent of each other.
4. An adequate sample size. (At least 10)
5. Most of the time, it is the frequency of the observations that is used.
2. Example
A student wants to see whether the food preferences of males and females differ. He tried to see
whether males and females had a general difference in their preference for cooked and raw foods. A survey
was conducted with the following results:
Twelve males preferred cooked foods.
Eight males preferred raw foods.
Five females preferred cooked foods.
Five females preferred raw foods.
Step 1: State the null hypothesis and the alternative hypothesis.
H0: There is no significant difference between the food preferences of males and females.
Or
Food preference is independent of gender.
Ha: There is a significant difference between the food preferences of males and females.
Or
Food preference is affected by gender.
Step 2: State the level of significance.
α = 0.05
0.05 is the level of significance for most scientific experiments.
Step 3: Set up a contingency table:
The contingency table summarizes the data.
The categories on the rows are the “preferences” that you are checking.
The categories on the columns are the “populations” whose preferences are being checked. A row total and
a column total are always included as well.
Preference        Male    Female    Total (Row)
Cooked            12      5         17
Raw               8       5         13
Total (Column)    20      10        30
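As a rough sketch, the same table and its totals can be built in Python from the four survey counts. The dictionary layout and variable names are assumptions chosen for illustration; only the counts come from the survey.

```python
# Observed counts from the survey: outer keys are preferences (rows),
# inner keys are genders (columns).
observed = {
    "Cooked": {"Male": 12, "Female": 5},
    "Raw": {"Male": 8, "Female": 5},
}

# Row totals: sum across the genders for each preference.
row_totals = {pref: sum(counts.values()) for pref, counts in observed.items()}

# Column totals: sum across the preferences for each gender.
column_totals = {}
for counts in observed.values():
    for gender, count in counts.items():
        column_totals[gender] = column_totals.get(gender, 0) + count

# The grand total is the number of respondents.
grand_total = sum(row_totals.values())

print(row_totals)
print(column_totals)
print(grand_total)
```

The row totals and the column totals must both add up to the same grand total, which is a quick check that the table was entered correctly.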
Step 4: Compute the expected frequencies.
The chi-square test for independence usually uses the third method of getting expected frequencies.
$$E = \frac{(\text{Row Total})(\text{Column Total})}{\text{Grand Total}}$$
This expected frequency is computed for EACH cell.
Preference        Male                   Female                 Total (Row)
Cooked            (20)(17)/30 = 11.33    (10)(17)/30 = 5.67     17
Raw               (13)(20)/30 = 8.67     (13)(10)/30 = 4.33     13
Total (Column)    20                     10                     30
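The expected-frequency calculation above could be written as a small Python function. This is a sketch under the assumption that the observed table is passed as a list of rows without the totals; the function name is hypothetical.

```python
def expected_frequencies(observed):
    """Expected count per cell: (row total)(column total) / grand total."""
    row_totals = [sum(row) for row in observed]
    column_totals = [sum(column) for column in zip(*observed)]
    grand_total = sum(row_totals)
    return [[r * c / grand_total for c in column_totals] for r in row_totals]

# Rows: Cooked, Raw. Columns: Male, Female.
observed = [[12, 5], [8, 5]]
expected = expected_frequencies(observed)

for label, row in zip(["Cooked", "Raw"], expected):
    print(label, [round(value, 2) for value in row])
```

Rounded to two decimals, the result matches the cells of the table: 11.33 and 5.67 for Cooked, 8.67 and 4.33 for Raw.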
The fundamental formula for the Chi-squared test is:
$$\chi^2 = \sum \frac{(O - E)^2}{E}$$
where O is the observed frequency,
E is the expected frequency,
and $\chi^2$ is the chi-square value, summed over all cells.
Step 5: Rearrange the table to show the observed and expected frequencies on the columns, and the
subcategories on the rows.
Preference         Observed    Expected    Chi-square
Cooked, Males      12          11.33       0.0396
Cooked, Females    5           5.67        0.0792
Raw, Males         8           8.67        0.0518
Raw, Females       5           4.33        0.1037
Total                                      0.2743
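The chi-square column in the table above can be reproduced with a short Python sketch. It computes the statistic twice: once with the expected frequencies rounded to two decimals, as in the table, and once without rounding. The unrounded value is about 0.2715, slightly lower than the table's 0.2743, which does not change the conclusion. Variable and function names are assumptions for illustration.

```python
# Cell order: cooked males, cooked females, raw males, raw females.
observed = [12, 5, 8, 5]
row_totals = [17, 17, 13, 13]
column_totals = [20, 10, 20, 10]
grand_total = 30

# Expected frequency per cell: (row total)(column total) / grand total.
expected = [r * c / grand_total for r, c in zip(row_totals, column_totals)]

def chi_square(observed, expected):
    """Sum of (O - E)^2 / E over all cells."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

exact = chi_square(observed, expected)
rounded = chi_square(observed, [round(e, 2) for e in expected])

print(f"with unrounded expected values: {exact:.4f}")
print(f"with expected values rounded to 2 decimals: {rounded:.4f}")
```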
Step 6: Determine the degrees of freedom
The degrees of freedom are: df = (Rows – 1)(Columns – 1), where the total row and total column are not counted.
df = (2 – 1)(2 – 1) = 1
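The same rule can be sketched as a small Python helper for tables of any size. It assumes the table is given as a list of rows with the totals left out; the function name is hypothetical.

```python
def degrees_of_freedom(table):
    """df = (rows - 1)(columns - 1), for a table given without its totals."""
    rows = len(table)
    columns = len(table[0])
    return (rows - 1) * (columns - 1)

# The 2 x 2 food-preference table from the example.
print(degrees_of_freedom([[12, 5], [8, 5]]))
```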
Step 7: Check the tabular Chi-squared value with your df and level of significance.
Checking the table, we see that the tabular chi-squared value for df = 1 and α = 0.05 is 3.841.
Since our calculated chi-square (0.2743) is less than this, the conclusion is to fail to reject the null hypothesis. Hence,
there is no significant evidence that food preference depends on gender.
If it were greater, we would reject the null hypothesis.
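The table lookup can be cross-checked with the Python standard library. The sketch below relies on a shortcut that holds only for df = 1: a chi-square variable with one degree of freedom is the square of a standard normal variable, so the critical value is the squared two-sided normal quantile. For other degrees of freedom a printed table or a statistics library would be needed. The variable names are assumptions for illustration.

```python
from math import erfc, sqrt
from statistics import NormalDist

alpha = 0.05
chi_square_value = 0.2743  # calculated value from Step 5

# Valid for df = 1 only: critical value = (two-sided normal quantile)^2.
critical_value = NormalDist().inv_cdf(1 - alpha / 2) ** 2

# p-value for df = 1: P(chi-square > x) = erfc(sqrt(x / 2)).
p_value = erfc(sqrt(chi_square_value / 2))

reject_null = chi_square_value > critical_value

print(f"critical value = {critical_value:.3f}")
print(f"p-value = {p_value:.3f}")
print("reject H0" if reject_null else "fail to reject H0")
```

The computed critical value agrees with the tabular 3.841, and the p-value is well above 0.05, so the decision matches the comparison made by hand.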