Relationship Between Work Status and Life Satisfaction
Crosstabulation
Crosstabulation is useful to show the relationship between two or more categorical variables.
Usually, continuous data is not used for chi-square analyses since a great deal of information is
lost by the process of categorization.
Crosstabs Example
Do people with different work statuses (e.g., full-time, retired, etc.) differ in (a) how
exciting life is and (b) happiness?
This amounts to tabulating frequencies for life excitement and happiness, but it must be
broken down by work status
Crosstabs gives frequencies for one variable separately for each level of another variable
Computing Crosstabs in SPSS
Choose Statistics, Summarize, Crosstabs
Select categorical variables; put one in Row and the other in Column
Output:
Case Processing Summary shows missing values for each table
Crosstab shows frequencies of one variable for each level of the other
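Outside SPSS, the same idea can be sketched in a few lines of Python. This is only an illustration: the records below are hypothetical, and `crosstab` is a helper name made up for this example, not an SPSS or library function.

```python
from collections import Counter

# Hypothetical respondents: (work status, how exciting life is)
records = [
    ("full-time", "exciting"),
    ("full-time", "routine"),
    ("full-time", "exciting"),
    ("retired", "routine"),
    ("retired", "exciting"),
    ("retired", "routine"),
]

def crosstab(pairs):
    """Count how often each (row level, column level) pair occurs."""
    counts = Counter(pairs)
    rows = sorted({r for r, _ in pairs})
    cols = sorted({c for _, c in pairs})
    # One dict per row level, holding the frequency of every column level
    return {r: {c: counts[(r, c)] for c in cols} for r in rows}

table = crosstab(records)
for row, cells in table.items():
    print(row, cells)
```

Each printed line corresponds to one row of the crosstab: the frequencies of the column variable for one level of the row variable.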
Calculating Percentages
Choose Cells, Row Percentages to show percentages across each row
Choose Cells, Column Percentages to show percentages down each column
Choose Cells, Total Percentages to find the percentage of respondents that were in each cell
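The three kinds of percentages differ only in the base each count is divided by. A minimal sketch, assuming a hypothetical 2x2 table of counts and a made-up helper called `percentages`:

```python
def percentages(table, kind):
    """Return row, column, or total percentages for a table given as a list of rows."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    result = []
    for i, row in enumerate(table):
        out = []
        for j, count in enumerate(row):
            if kind == "row":
                base = row_totals[i]      # divide by the row total
            elif kind == "column":
                base = col_totals[j]      # divide by the column total
            elif kind == "total":
                base = n                  # divide by the grand total
            else:
                raise ValueError("kind must be 'row', 'column', or 'total'")
            out.append(100.0 * count / base)
        result.append(out)
    return result

counts = [[30, 10], [20, 40]]  # hypothetical observed counts
row_pct = percentages(counts, "row")
col_pct = percentages(counts, "column")
tot_pct = percentages(counts, "total")
print(row_pct)
print(col_pct)
print(tot_pct)
```

Row percentages sum to 100 across each row, column percentages sum to 100 down each column, and total percentages sum to 100 over the whole table.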
Expected Counts
Expected counts are based on marginal percentages
Multiply the marginal percentages together to get the expected percentage for that cell,
then multiply by N to get expected counts
Or, have SPSS compute them -- Choose Cells, Expected Counts
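The hand calculation described above can be sketched as follows. The table is hypothetical and `expected_counts` is an illustrative helper name; the arithmetic follows the steps in the notes (marginal proportion for the row, times marginal proportion for the column, times N).

```python
def expected_counts(table):
    """Expected count for each cell under independence of rows and columns."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    expected = []
    for r in row_totals:
        row_share = r / n  # marginal proportion for this row
        # marginal row proportion * marginal column proportion * N
        expected.append([row_share * (c / n) * n for c in col_totals])
    return expected

observed = [[30, 10], [20, 40]]  # hypothetical observed counts
expected = expected_counts(observed)
print(expected)
```

Algebraically this is the same as (row total x column total) / N for each cell.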
Residuals
Difference between observed and expected counts (observed minus expected)
Choose Cells, Unstandardized Residuals
Standardized Residuals are distributed as z-scores (they were divided by the standard
deviation of the residuals)
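A small sketch of both kinds of residual, using hypothetical counts. It assumes the standard deviation of a residual is estimated by the square root of the expected count, which gives the Pearson residual; that estimate is an assumption of this example rather than something stated in the notes.

```python
import math

# Hypothetical observed counts and their expected counts under independence
observed = [[30, 10], [20, 40]]
expected = [[20.0, 20.0], [30.0, 30.0]]

# Unstandardized residual: observed minus expected
unstandardized = [
    [o - e for o, e in zip(obs_row, exp_row)]
    for obs_row, exp_row in zip(observed, expected)
]

# Standardized residual: residual divided by sqrt(expected) (assumed estimate of its SD)
standardized = [
    [(o - e) / math.sqrt(e) for o, e in zip(obs_row, exp_row)]
    for obs_row, exp_row in zip(observed, expected)
]

print(unstandardized)
print(standardized)
```

Standardized residuals far from zero (roughly beyond 2 in absolute value) point to the cells that depart most from what independence would predict.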
Controlling for a Third Variable
Controlling for a variable means it is held constant
This allows us to look at crosstabs separately for each value of a third variable
Example: wrkstat by life separately for men and women
In SPSS add sex as a layer in Crosstabs
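Layering amounts to building one crosstab per value of the control variable. A minimal sketch with hypothetical records; `layered_crosstab` is an illustrative helper name, not an SPSS feature.

```python
from collections import Counter, defaultdict

# Hypothetical respondents: (sex, work status, how exciting life is)
records = [
    ("male", "full-time", "exciting"),
    ("male", "retired", "routine"),
    ("male", "full-time", "exciting"),
    ("female", "full-time", "routine"),
    ("female", "retired", "exciting"),
    ("female", "retired", "exciting"),
]

def layered_crosstab(rows):
    """Build a separate (row, column) frequency table for each layer value."""
    layers = defaultdict(Counter)
    for layer, row, col in rows:
        layers[layer][(row, col)] += 1
    return layers

layers = layered_crosstab(records)
for sex in sorted(layers):
    print(sex, dict(layers[sex]))
```

Within each layer the control variable is constant, so any pattern seen there cannot be due to differences in that variable.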
Bar Charts
Simple
Can only show frequencies of one variable
Choose Graphs, Bar, Simple
Cluster and Stacked
Can show frequencies of one variable broken down by another
Percentage information can also be shown
Crosstabs
Compute percentages of happy for different values of wrkstat
Compute percentages of wrkstat for different values of life; include expected values and
residuals
Compute percentages of wrkstat for different values of life, layered by gender
Compute bar charts for wrkstat by life
Chi-Square (χ²)
Chi-Square Lecture
Chi-Square Example
Research questions - Are there gender differences in happiness? How about in how
important it is to have a fulfilling job?
What would you hypothesize?
The hypothesis test for whether the pattern of percentages in one variable differs as a
function of another is called the chi-square test
Hypothesis Testing
We test the null hypothesis that nothing interesting is happening (versus the alternative
hypothesis that the findings are interesting)
The null hypothesis can only be rejected if there is less than a .05 probability that our
findings are due to chance
Hypothesis tests determine the extent to which our findings may be due to chance
Computing the Pearson Chi-Square test in SPSS
Chi-Square (χ²) Tests of Independence: SPSS can compute the expected value for each
cell, based on the assumption that the two variables are independent of each other. If
there is a large discrepancy between the observed values and the expected values, the
χ² statistic would be large, which suggests a significant difference between observed
and expected values. In addition, a probability value is also computed.
Click Statistics, Summarize, Crosstabs.
Click the desired variable in the list to the left, then click the uppermost of the right
arrows to indicate that this variable will be the row variable.
Click a second variable, and click the middle right arrow (to indicate the column
variable).
For three or more variables: use the lowest box in this window. Click on the third
variable in the list, and then click the lowest of the three right arrows.
Click OK when complete.
You can now conduct a chi-square analysis. Click Statistics. Here, many different
tests of independence or association are listed. Click Chi-square, click Phi and
Cramer's V, click Continue, click OK.
To conduct a cross tabulation and chi-square analysis on a subset of a certain
variable, select the variables for crosstabulation, choose cell values, and the
desired statistics. Then, click Data (in the Menu Bar at the top of the screen). Click
Select Cases, click If condition is satisfied, click If. Select the desired variable from
the list on the left, click the right arrow to paste it in the "active" box, and type in the
selected levels to consider. Click Continue when completed.
Output shows Pearson chi-square and "Asymp. Sig." (significance level)
If "Asymp. Sig." is less than .05 then the residuals differ as a function of the
independent variable
The chi-square test essentially tells us whether the results of a crosstab are
statistically significant
A chi-square will be significant if the residuals for one level of a variable differ as
a function of another variable
The chi-square value does not tell us the nature of the differences
The Chi-Square Formula
What are all those symbols?
χ² = Σ (fo - fe)² / fe
χ² = chi-square
Σ = Sigma (sum of...)
fo = frequency observed
fe = frequency expected
Degrees of freedom are necessary to compute the significance of the chi-square: df =
(#rows - 1)(#columns - 1)
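The formula and the degrees of freedom can be worked through in a short Python sketch. The table is hypothetical and `chi_square` is an illustrative helper name; fo and fe in the code match the symbols defined above.

```python
def chi_square(observed):
    """Return the Pearson chi-square statistic and its degrees of freedom."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    n = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(observed):
        for j, fo in enumerate(row):
            fe = row_totals[i] * col_totals[j] / n  # frequency expected
            chi2 += (fo - fe) ** 2 / fe             # one term of the sum
    df = (len(observed) - 1) * (len(observed[0]) - 1)
    return chi2, df

observed = [[30, 10], [20, 40]]  # hypothetical observed counts
chi2, df = chi_square(observed)
print(round(chi2, 3), df)
```

Every cell contributes one term to the sum, so a single cell with a large gap between observed and expected counts can drive the whole statistic.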
Assumptions of the Chi-Square
Categories are independent (no overlap)
Must have an expected count of at least 5 in each cell
Remember that large samples mean large chi-squares, thus making it easier to find a
significant chi-square (this is called power)
Bivariate Analysis: Categorical Variables
Doing Crosstabulation in SPSS
The following working examples refer to the dataset from the US General Social Survey
1993.
1. Analyze -> Descriptive Statistics -> Crosstabs
2. Select and put the independent variable in the "Column(s)" box and the dependent variable in the "Row(s)" box
this "Column(s)" and "Row(s)" arrangement facilitates the "percentage downward"
convention in the output crosstabulation table
if there are no grounds for distinguishing the independent from the dependent variable, it
does not matter which box each variable is put in
3. Then press the Statistics button
in most questionnaire surveys, researchers are interested not only in the sample statistics,
but also in generalizing the findings to the target population
click to choose Chi-square to test the significance of the relationship between the two
variables; this option must always be chosen
below Chi-square, there are measures of the strength of the relationship; choose the
appropriate ones for the level of measurement of the variables
press Continue to return to the above dialog box
4. Having returned to the first dialog box, press Cells
each classification in the crosstabulation table is named a cell
click to choose Observed Counts to obtain actual number of cases in each classification
if the positions of independent and dependent variables can be assumed, choose Column
Percentages to produce the "percentage downward" output
however, if independent-dependent cannot be assumed, both Column Percentages and
Row Percentages must be chosen
press Continue to return to the first dialog box
press OK if you want to get the results immediately, or
press Paste to copy out the command syntax, then run it in the Syntax window to get the
output
5. SPSS Output for Crosstabulation
5.1 Number of cases in each cell and the "percentage downward" results
we want to know whether males and females behaved differently in the 1992 election
VOTE92 Voting in 1992 Election * SEX Respondent's Sex Crosstabulation

                                    SEX Respondent's Sex
VOTE92 Voting in 1992 Election      1 Male     2 Female    Total
1 voted         Count               448        584         1032
                % within SEX        72.1%      70.3%       71.1%
2 did not vote  Count               173        247         420
                % within SEX        27.9%      29.7%       28.9%
Total           Count               621        831         1452
                % within SEX        100.0%     100.0%      100.0%
the cells with yellow background (the Male column) show the "percentage downward" result:
o 72.1% of male respondents voted in the 1992 election, and 27.9% did not
o 72.1% + 27.9% = 100% as shown in the Total row at the bottom
comparing the yellow (Male) column with the blue (Female) column, 72.1% vs. 70.3% voted,
so we could draw an initial conclusion that males and females did not behave differently
however, since we are more interested in generalizing the sample finding to the target
population, the above conclusion must be tested for statistical significance by the
Chi-square test shown below
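As a quick check on the "percentage downward" figures, the column percentages can be recomputed from the counts in the crosstabulation above. This is only a sketch; the variable names are chosen for this example.

```python
# Counts taken from the VOTE92 by SEX crosstabulation
voted = {"Male": 448, "Female": 584}
did_not_vote = {"Male": 173, "Female": 247}

# Percentage within each sex (column percentage) who voted
pct_voted = {
    sex: 100.0 * voted[sex] / (voted[sex] + did_not_vote[sex])
    for sex in voted
}

# Difference between the two columns, in percentage points
gap = pct_voted["Male"] - pct_voted["Female"]

for sex, pct in pct_voted.items():
    print(sex, round(pct, 1))
print("gap in percentage points:", round(gap, 1))
```

A gap of under two percentage points is small, which is why the initial reading is that the two groups behaved alike; whether it is statistically distinguishable from zero is a separate question.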
5.2 Test for significance of the relationship between sex and voting behaviour - Chi-square test
the null hypothesis is: no relationship between sex and voting behaviour
normally, the row starting with Pearson Chi-Square is what we need to examine
the column labelled "Asymp. Sig. (2-sided)" is the level of significance for the chi-
square value (0.601) with the corresponding degrees of freedom (df=1)
o the significance shows p=0.438
o in sociological research, the typical level of significance adopted to reject the null
hypothesis is p < 0.05
o in the current example, 0.438 is much greater than 0.05, so we cannot reject the
null hypothesis
o hence, we find no evidence that sex and voting behaviour are related in our
target population
the Chi-square test is nonparametric, which means the strict assumption of population
distribution is relaxed
however, there is still a requirement to fulfil: the expected frequency (not the observed or
actual frequency) in each cell must be 5 or more
o should this requirement not be fulfilled, a warning will be issued in the SPSS
output
o in any circumstances, if the proportion of cells with expected frequency less than 5 is as
high as 25% or more, the chi-square is not reliable
o you should then reconsider the classifications of the variables involved in the
analysis:
regroup some categories to yield more cases
exclude categories with almost no or very few cases
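The test and the expected-frequency check can be retraced by hand in Python using the counts from the VOTE92 by SEX table. This is a sketch, not SPSS's own procedure: the p-value line relies on the identity that, for df = 1 only, the upper-tail chi-square probability equals erfc(sqrt(χ²/2)); tables with more degrees of freedom would need a statistics library instead.

```python
import math

# Observed counts: rows = voted / did not vote, columns = Male / Female
observed = [[448, 584], [173, 247]]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
n = sum(row_totals)

# Expected counts under the null hypothesis of no relationship
expected = [[r * c / n for c in col_totals] for r in row_totals]

# Pearson chi-square: sum of (fo - fe)^2 / fe over all cells
chi2 = sum(
    (fo - fe) ** 2 / fe
    for obs_row, exp_row in zip(observed, expected)
    for fo, fe in zip(obs_row, exp_row)
)
df = (len(observed) - 1) * (len(observed[0]) - 1)

# Valid for df = 1 only: P(chi-square > chi2) = erfc(sqrt(chi2 / 2))
p_value = math.erfc(math.sqrt(chi2 / 2))

# Expected-frequency requirement: count the cells with expected count below 5
cells = [fe for row in expected for fe in row]
small_cells = sum(1 for fe in cells if fe < 5)
share_small = small_cells / len(cells)

print(round(chi2, 3), df, round(p_value, 3))
print(small_cells, share_small)
```

The chi-square and significance printed here should agree with the Pearson Chi-Square row of the SPSS output (.601 and .438), and no cell falls below the expected-count threshold of 5.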
Chi-Square Tests

                               Value     df   Asymp. Sig.   Exact Sig.   Exact Sig.
                                              (2-sided)     (2-sided)    (1-sided)
Pearson Chi-Square             .601(b)   1    .438
Continuity Correction(a)       .514      1    .473
Likelihood Ratio               .602      1    .438
Fisher's Exact Test                                         .448         .237
Linear-by-Linear Association   .601      1    .438
N of Valid Cases               1452
a Computed only for a 2x2 table
b 0 cells (.0%) have expected count less than 5. The minimum expected