How to use SPSS


Published in: Technology, Sports

  1. A Step-by-Step Guide to Analysis and Interpretation, Brian C. Cronk
  2. Choosing the Appropriate Statistical Test
     [Figure: decision flowchart for choosing a statistical test]
     NOTE: Relevant section numbers are given in parentheses. For instance, "(6.9)" refers you to Section 6.9 in Chapter 6.
  3. Notice
     SPSS is a registered trademark of SPSS, Inc. Screen images © by SPSS, Inc. and Microsoft Corporation. Used with permission. This book is not approved or sponsored by SPSS.
     "Pyrczak Publishing" is an imprint of Fred Pyrczak, Publisher, A California Corporation.
     Although the author and publisher have made every effort to ensure the accuracy and completeness of information contained in this book, we assume no responsibility for errors, inaccuracies, omissions, or any inconsistency herein. Any slights of people, places, or organizations are unintentional.
     Project Director: Monica Lopez.
     Consulting Editors: George Burruss, Jose L. Galvan, Matthew Giblin, Deborah M. Oh, Jack Petit, and Richard Rasor.
     Editorial assistance provided by Cheryl Alcorn, Randall R. Bruce, Karen M. Disner, Brenda Koplin, Erica Simmons, and Sharon Young.
     Cover design by Robert Kibler and Larry Nichols.
     Printed in the United States of America by Malloy, Inc.
     Copyright © 2008, 2006, 2004, 2002, 1999 by Fred Pyrczak, Publisher. All rights reserved. No portion of this book may be reproduced or transmitted in any form or by any means without the prior written permission of the publisher.
     ISBN 1-884585-79-5
  4. Table of Contents
     Introduction to the Fifth Edition
       What's New?
       Audience
       Organization
       SPSS Versions
       Availability of SPSS
       Conventions
       Screenshots
       Practice Exercises
       Acknowledgments
     Chapter 1 Getting Started
       1.1 Starting SPSS
       1.2 Entering Data
       1.3 Defining Variables
       1.4 Loading and Saving Data Files
       1.5 Running Your First Analysis
       1.6 Examining and Printing Output Files
       1.7 Modifying Data Files
     Chapter 2 Entering and Modifying Data
       2.1 Variables and Data Representation
       2.2 Transformation and Selection of Data
     Chapter 3 Descriptive Statistics
       3.1 Frequency Distributions and Percentile Ranks for a Single Variable
       3.2 Frequency Distributions and Percentile Ranks for Multiple Variables
       3.3 Measures of Central Tendency and Measures of Dispersion for a Single Group
       3.4 Measures of Central Tendency and Measures of Dispersion for Multiple Groups
       3.5 Standard Scores
     Chapter 4 Graphing Data
       4.1 Graphing Basics
       4.2 The New SPSS Chart Builder
       4.3 Bar Charts, Pie Charts, and Histograms
       4.4 Scatterplots
       4.5 Advanced Bar Charts
       4.6 Editing SPSS Graphs
     Chapter 5 Prediction and Association
       5.1 Pearson Correlation Coefficient
       5.2 Spearman Correlation Coefficient
       5.3 Simple Linear Regression
       5.4 Multiple Linear Regression
  5. Chapter 6 Parametric Inferential Statistics
       6.1 Review of Basic Hypothesis Testing
       6.2 Single-Sample t Test
       6.3 Independent-Samples t Test
       6.4 Paired-Samples t Test
       6.5 One-Way ANOVA
       6.6 Factorial ANOVA
       6.7 Repeated-Measures ANOVA
       6.8 Mixed-Design ANOVA
       6.9 Analysis of Covariance
       6.10 Multivariate Analysis of Variance (MANOVA)
     Chapter 7 Nonparametric Inferential Statistics
       7.1 Chi-Square Goodness of Fit
       7.2 Chi-Square Test of Independence
       7.3 Mann-Whitney U Test
       7.4 Wilcoxon Test
       7.5 Kruskal-Wallis H Test
       7.6 Friedman Test
     Chapter 8 Test Construction
       8.1 Item-Total Analysis
       8.2 Cronbach's Alpha
       8.3 Test-Retest Reliability
       8.4 Criterion-Related Validity
     Appendix A Effect Size
     Appendix B Practice Exercise Data Sets
       Practice Data Set 1
       Practice Data Set 2
       Practice Data Set 3
     Appendix C Glossary
     Appendix D Sample Data Files Used in Text
       COINS.sav
       GRADES.sav
       HEIGHT.sav
       QUESTIONS.sav
       RACE.sav
       SAMPLE.sav
       SAT.sav
       Other Files
     Appendix E Information for Users of Earlier Versions of SPSS
     Appendix F Graphing Data with SPSS 13.0 and 14.0
  6. Chapter 1 Getting Started

     Section 1.1 Starting SPSS

     Startup procedures for SPSS will differ slightly, depending on the exact configuration of the machine on which it is installed. On most computers, you can start SPSS by clicking on Start, then clicking on Programs, then on SPSS. On many installations, there will be an SPSS icon on the desktop that you can double-click to start the program.

     When SPSS is started, you may be presented with the dialog box to the left, depending on the options your system administrator selected for your version of the program. If you have the dialog box, click Type in data and OK, which will present a blank data window.[1] If you were not presented with the dialog box to the left, SPSS should open automatically with a blank data window.

     The data window and the output window provide the basic interface for SPSS. A blank data window is shown below.

     [1] Items that appear in the glossary are presented in bold. Italics are used to indicate menu items.

     Section 1.2 Entering Data

     One of the keys to success with SPSS is knowing how it stores and uses your data. To illustrate the basics of data entry with SPSS, we will use Example 1.2.1.

     Example 1.2.1
     A survey was given to several students from four different classes (Tues/Thurs mornings, Tues/Thurs afternoons, Mon/Wed/Fri mornings, and Mon/Wed/Fri afternoons). The students were asked whether or not they were "morning people" and whether or not they worked. This survey also asked for their final grade in the class (100% being the highest grade possible). The response sheets from two students are presented below:
  7. Response Sheet 1
       ID: 4593
       Day of class: [ ] MWF  [X] TTh
       Class time: [ ] Morning  [X] Afternoon
       Are you a morning person? [ ] Yes  [X] No
       Final grade in class: 85%
       Do you work outside school? [ ] Full-time  [ ] Part-time  [X] No

     Response Sheet 2
       ID: 1901
       Day of class: [X] MWF  [ ] TTh
       Class time: [X] Morning  [ ] Afternoon
       Are you a morning person? [X] Yes  [ ] No
       Final grade in class: 83%
       Do you work outside school? [ ] Full-time  [X] Part-time  [ ] No

     Our goal is to enter the data from the two students into SPSS for use in future analyses. The first step is to determine the variables that need to be entered. Any information that can vary among participants is a variable that needs to be considered. Example 1.2.2 lists the variables we will use.

     Example 1.2.2
       ID
       Day of class
       Class time
       Morning person
       Final grade
       Whether or not the student works outside school

     In the SPSS data window, columns represent variables and rows represent participants. Therefore, we will be creating a data file with six columns (variables) and two rows (students/participants).

     Section 1.3 Defining Variables

     Before we can enter any data, we must first enter some basic information about each variable into SPSS. For instance, variables must first be given names that:
       o begin with a letter;
       o do not contain a space.
  8. Thus, the variable name "Q7" is acceptable, while the variable name "7Q" is not. Similarly, the variable name "PRE_TEST" is acceptable, but the variable name "PRE TEST" is not. Capitalization does not matter, but variable names are capitalized in this text to make it clear when we are referring to a variable name, even if the variable name is not necessarily capitalized in screenshots.

     To define a variable, click on the Variable View tab at the bottom of the main screen. This will show you the Variable View window. To return to the Data View window, click on the Data View tab.

     From the Variable View screen, SPSS allows you to create and edit all of the variables in your data file. Each column represents some property of a variable, and each row represents a variable. All variables must be given a name. To do that, click on the first empty cell in the Name column and type a valid SPSS variable name. The program will then fill in default values for most of the other properties.

     One useful function of SPSS is the ability to define variable and value labels. Variable labels allow you to associate a description with each variable. These descriptions can describe the variables themselves or the values of the variables.

     Value labels allow you to associate a description with each value of a variable. For example, for most procedures, SPSS requires numerical values. Thus, for data such as the day of the class (i.e., Mon/Wed/Fri and Tues/Thurs), we need to first code the values as numbers. We can assign the number 1 to Mon/Wed/Fri and the number 2 to Tues/Thurs. To help us keep track of the numbers we have assigned to the values, we use value labels.

     To assign value labels, click in the cell you want to assign values to in the Values column. This will bring up a small gray button (see arrow, below at left). Click on that button to bring up the Value Labels dialog box.
     When you enter a value label, you must click Add after each entry. This will move the value and its associated label into the bottom section of the window. When all labels have been added, click OK to return to the Variable View window.
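The two naming rules stated in Section 1.3 can be sketched in Python as a quick check. This is only an illustration of the rules as the text gives them; the function name `is_valid_name` is hypothetical, and SPSS itself applies additional restrictions that are not modeled here.

```python
# Encodes ONLY the two rules stated in the text: a variable name must
# begin with a letter and must not contain a space. SPSS enforces further
# rules (for example, reserved words) that this sketch does not cover.
def is_valid_name(name: str) -> bool:
    """Return True if `name` begins with a letter and contains no space."""
    return bool(name) and name[0].isalpha() and " " not in name


# The four names discussed in the text.
for candidate in ["Q7", "7Q", "PRE_TEST", "PRE TEST"]:
    verdict = "acceptable" if is_valid_name(candidate) else "not acceptable"
    print(f"{candidate!r}: {verdict}")
```

Running the loop reproduces the verdicts given in the text for "Q7", "7Q", "PRE_TEST", and "PRE TEST".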
  9. In addition to naming and labeling the variable, you have the option of defining the variable type. To do so, simply click on the Type, Width, or Decimals columns in the Variable View window. The default value is a numeric field that is eight digits wide with two decimal places displayed. If your data are more than eight digits to the left of the decimal place, they will be displayed in scientific notation (e.g., the number 2,000,000,000 will be displayed as 2.00E+09).[2] SPSS maintains accuracy beyond two decimal places, but all output will be rounded to two decimal places unless otherwise indicated in the Decimals column.

     In our example, we will be using numeric variables with all of the default values.

     Practice Exercise
     Create a data file for the six variables and two sample students presented in Example 1.2.1. Name your variables: ID, DAY, TIME, MORNING, GRADE, and WORK. You should code DAY as 1 = Mon/Wed/Fri, 2 = Tues/Thurs. Code TIME as 1 = morning, 2 = afternoon. Code MORNING as 0 = No, 1 = Yes. Code WORK as 0 = No, 1 = Part-Time, 2 = Full-Time. Be sure you enter value labels for the different variables. Note that because value labels are not appropriate for ID and GRADE, these are not coded. When done, your Variable View window should look like the screenshot below:

     [Screenshot: Variable View window]

     Click on the Data View tab to open the data-entry screen. Enter data horizontally, beginning with the first student's ID number. Enter the code for each variable in the appropriate column; to enter the GRADE variable value, enter the student's class grade.

     [Screenshot: Data View window]

     [2] Depending upon your version of SPSS, it may be displayed as 2.0E+009.
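For readers who think in code, the coding scheme from the practice exercise can be sketched as plain Python data. This is not how SPSS stores a file; it only illustrates the idea that each cell holds a numeric code and a separate table of value labels translates codes back into text. The names `VALUE_LABELS`, `students`, and `labeled` are illustrative assumptions.

```python
# Value labels from the practice exercise: variable -> {code: label}.
VALUE_LABELS = {
    "DAY": {1: "Mon/Wed/Fri", 2: "Tues/Thurs"},
    "TIME": {1: "morning", 2: "afternoon"},
    "MORNING": {0: "No", 1: "Yes"},
    "WORK": {0: "No", 1: "Part-Time", 2: "Full-Time"},
}

# One dict per participant (a row); one key per variable (a column).
# Values are the codes for the two response sheets in Example 1.2.1.
students = [
    {"ID": 4593, "DAY": 2, "TIME": 2, "MORNING": 0, "GRADE": 85, "WORK": 0},
    {"ID": 1901, "DAY": 1, "TIME": 1, "MORNING": 1, "GRADE": 83, "WORK": 1},
]


def labeled(row):
    """Return a copy of `row` with codes replaced by their value labels.

    ID and GRADE have no value labels, so they pass through unchanged.
    """
    return {var: VALUE_LABELS.get(var, {}).get(val, val) for var, val in row.items()}


for row in students:
    print(labeled(row))
```

Printing the labeled rows corresponds roughly to switching the Data View from codes to value labels.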
  10. The previous data window can be changed to look instead like the screenshot below by clicking on the Value Labels icon (see arrow). In this case, the cells display value labels rather than the corresponding codes. If data is entered in this mode, it is not necessary to enter codes, as clicking the button which appears in each cell as the cell is selected will present a drop-down list of the predefined labels. You may use either method, according to your preference.

     [Screenshot: Data View window with value labels displayed]

     Instead of clicking the Value Labels icon, you may optionally toggle between views by clicking Value Labels under the View menu.

     Section 1.4 Loading and Saving Data Files

     Once you have entered your data, you will need to save it with a unique name for later use so that you can retrieve it when necessary.

     Loading and saving SPSS data files works in the same way as most Windows-based software. Under the File menu, there are Open, Save, and Save As commands. SPSS data files have a ".sav" extension, which is added by default to the end of the filename. This tells Windows that the file is an SPSS data file.

     Save Your Data
     When you save your data file (by clicking File, then clicking Save or Save As to specify a unique name), pay special attention to where you save it. Most systems default to the location <C:\Program Files\SPSS>. You will probably want to save your data on a floppy disk, CD-R, or removable USB drive so that you can take the file with you.

     Load Your Data
     When you load your data (by clicking File, then clicking Open, then Data, or by clicking the open file folder icon), you get a similar window. This window lists all files with the ".sav" extension. If you have trouble locating your saved file, make sure you are looking in the right directory.
  11. Practice Exercise
     To be sure that you have mastered saving and opening data files, name your sample data file "SAMPLE" and save it to a removable storage medium. Once it is saved, SPSS will display the name of the file at the top of the data window. It is wise to save your work frequently, in case of computer crashes. Note that filenames may be upper- or lowercase. In this text, uppercase is used for clarity.

     After you have saved your data, exit SPSS (by clicking File, then Exit). Restart SPSS and load your data by selecting the "SAMPLE.sav" file you just created.

     Section 1.5 Running Your First Analysis

     Any time you open a data window, you can run any of the analyses available. To get started, we will calculate the students' average grade. (With only two students, you can easily check your answer by hand, but imagine a data file with 10,000 student records.)

     The majority of the available statistical tests are under the Analyze menu. This menu displays all the options available for your version of the SPSS program (the menus in this book were created with SPSS Student Version 15.0). Other versions may have slightly different sets of options.

     [Screenshot: Analyze menu]

     To calculate a mean (average), we are asking the computer to summarize our data set. Therefore, we run the command by clicking Analyze, then Descriptive Statistics, then Descriptives.

     This brings up the Descriptives dialog box. Note that the left side of the box contains a list of all the variables in our data file. On the right is an area labeled Variable(s), where we can specify the variables we would like to use in this particular analysis.
  12. We want to compute the mean for the variable called GRADE. Thus, we need to select the variable name in the left window (by clicking on it). To transfer it to the right window, click on the right arrow between the two windows. The arrow always points to the window opposite the highlighted item and can be used to transfer selected variables in either direction. Note that double-clicking on the variable name will also transfer the variable to the opposite window. Standard Windows conventions of "Shift" clicking or "Ctrl" clicking to select multiple variables can be used as well.

     When we click on the OK button, the analysis will be conducted, and we will be ready to examine our output.

     Section 1.6 Examining and Printing Output Files

     After an analysis is performed, the output is placed in the output window, and the output window becomes the active window. If this is the first analysis you have conducted since starting SPSS, then a new output window will be created. If you have run previous analyses and saved them, your output is added to the end of your previous output.

     To switch back and forth between the data window and the output window, select the desired window from the Window menu bar (see arrow, below).

     The output window is split into two sections. The left section is an outline of the output (SPSS refers to this as the "outline view"). The right section is the output itself.

     Descriptive Statistics
                           N    Minimum   Maximum   Mean      Std. Deviation
       GRADE               2    83.00     85.00     84.0000   1.41421
       Valid N (listwise)  2

     The section on the left of the output window provides an outline of the entire output window.
     All of the analyses are listed in the order in which they were conducted. Note that this outline can be used to quickly locate a section of the output. Simply click on the section you would like to see, and the right window will jump to the appropriate place.
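The Descriptives output for GRADE in Section 1.6 is small enough to verify independently. The sketch below uses Python's standard `statistics` module as a stand-in for the SPSS procedure; it assumes the two grades from Example 1.2.1 and uses the sample standard deviation (n - 1 denominator), which matches the value shown in the output.

```python
import statistics

# Final grades of the two students in Example 1.2.1.
grades = [85, 83]

summary = {
    "N": len(grades),
    "Minimum": min(grades),
    "Maximum": max(grades),
    "Mean": statistics.mean(grades),
    # Sample standard deviation (divides by n - 1).
    "Std. Deviation": statistics.stdev(grades),
}

for name, value in summary.items():
    print(f"{name}: {value}")
```

With only two cases this is easy to confirm by hand: the mean is (85 + 83) / 2 = 84, and the standard deviation is the square root of 2, about 1.41421.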
  13. Clicking on a statistical procedure also selects all of the output for that command. By pressing the Delete key, that output can be deleted from the output window. This is a quick way to be sure that the output window contains only the desired output. Output can also be selected and pasted into a word processor by clicking Edit, then Copy Objects to copy the output. You can then switch to your word processor and click Edit, then Paste.

     To print your output, simply click File, then Print, or click on the printer icon on the toolbar. You will have the option of printing all of your output or just the currently selected section. Be careful when printing! Each time you run a command, the output is added to the end of your previous output. Thus, you could be printing a very large output file containing information you may not want or need.

     One way to ensure that your output window contains only the results of the current command is to create a new output window just before running the command. To do this, click File, then New, then Output. All your subsequent commands will go into your new output window.

     Practice Exercise
     Load the sample data file you created earlier (SAMPLE.sav). Run the Descriptives command for the variable GRADE and print the output. Your output should look like the example shown earlier in this section. Next, select the data window and print it.

     Section 1.7 Modifying Data Files

     Once you have created a data file, it is really quite simple to add additional cases (rows/participants) or additional variables (columns). Consider Example 1.7.1.

     Example 1.7.1
     Two more students provide you with surveys. Their information is:

     Response Sheet 3
       ID: 8734
       Day of class: [ ] MWF  [X] TTh
       Class time: [X] Morning  [ ] Afternoon
       Are you a morning person? [ ] Yes  [X] No
       Final grade in class: 80%
       Do you work outside school? [ ] Full-time  [ ] Part-time  [X] No

     Response Sheet 4
       ID: 1909
       Day of class: [X] MWF  [ ] TTh
       Class time: [X] Morning  [ ] Afternoon
       Are you a morning person? [X] Yes  [ ] No
       Final grade in class: 73%
       Do you work outside school? [ ] Full-time  [X] Part-time  [ ] No
  14. To add these data, simply place two additional rows in the Data View window (after loading your sample data). Notice that as new participants are added, the row numbers become bold. When done, the screen should look like the screenshot here.

     [Screenshot: Data View window with value labels displayed]
          ID       DAY          TIME       MORNING  GRADE  WORK
       1  4593.00  Tue/Thu      afternoon  No       85.00  No
       2  1901.00  Mon/Wed/Fri  morning    Yes      83.00  Part-Time
       3  8734.00  Tue/Thu      morning    No       80.00  No
       4  1909.00  Mon/Wed/Fri  morning    Yes      73.00  Part-Time

     New variables can also be added. For example, if the first two participants were given special training on time management, and the two new participants were not, the data file can be changed to reflect this additional information. The new variable could be called TRAINING (whether or not the participant received training), and it would be coded so that 0 = No and 1 = Yes. Thus, the first two participants would be assigned a "1" and the last two participants a "0." To do this, switch to the Variable View window, then add the TRAINING variable to the bottom of the list. Then switch back to the Data View window to update the data.

     [Screenshot: Data View window after adding TRAINING]
          ID       DAY          TIME       MORNING  GRADE  WORK       TRAINING
       1  4593.00  Tue/Thu      afternoon  No       85.00  No         Yes
       2  1901.00  Mon/Wed/Fri  morning    Yes      83.00  Part-Time  Yes
       3  8734.00  Tue/Thu      morning    No       80.00  No         No
       4  1909.00  Mon/Wed/Fri  morning    Yes      73.00  Part-Time  No

     Adding data and adding variables are just logical extensions of the procedures we used to originally create the data file. Save this new data file. We will be using it again later in the book.
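The two modifications in Section 1.7, adding cases (rows) and adding a variable (column), can be mirrored in a short Python sketch. It assumes the same coding scheme as the Chapter 1 practice exercise; the structure (a list of dicts) and the helper name `training_by_id` are illustrative choices, not anything SPSS exposes.

```python
# The original two participants, coded as in the Chapter 1 practice exercise.
students = [
    {"ID": 4593, "DAY": 2, "TIME": 2, "MORNING": 0, "GRADE": 85, "WORK": 0},
    {"ID": 1901, "DAY": 1, "TIME": 1, "MORNING": 1, "GRADE": 83, "WORK": 1},
]

# Adding cases: each new participant becomes a new row (Example 1.7.1).
students.append({"ID": 8734, "DAY": 2, "TIME": 1, "MORNING": 0, "GRADE": 80, "WORK": 0})
students.append({"ID": 1909, "DAY": 1, "TIME": 1, "MORNING": 1, "GRADE": 73, "WORK": 1})

# Adding a variable: every row gains a new column.
# TRAINING is coded 0 = No, 1 = Yes; the first two participants were trained.
training_by_id = {4593: 1, 1901: 1, 8734: 0, 1909: 0}
for row in students:
    row["TRAINING"] = training_by_id[row["ID"]]

for row in students:
    print(row)
```

The result has four rows and seven columns, matching the updated Data View described above.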
  15. Practice Exercise
     Follow the example above (where TRAINING is the new variable). Make the modifications to your SAMPLE.sav data file and save it.
  16. Chapter 2 Entering and Modifying Data

     In Chapter 1, we learned how to create a simple data file, save it, perform a basic analysis, and examine the output. In this section, we will go into more detail about variables and data.

     Section 2.1 Variables and Data Representation

     In SPSS, variables are represented as columns in the data file. Participants are represented as rows. Thus, if we collect 4 pieces of information from 100 participants, we will have a data file with 4 columns and 100 rows.

     Measurement Scales
     There are four types of measurement scales: nominal, ordinal, interval, and ratio. While the measurement scale will determine which statistical technique is appropriate for a given set of data, SPSS generally does not discriminate. Thus, we start this section with this warning: If you ask it to, SPSS may conduct an analysis that is not appropriate for your data. For a more complete description of these four measurement scales, consult your statistics text or the glossary in Appendix C.

     Newer versions of SPSS allow you to indicate which types of data you have when you define your variable. You do this using the Measure column. You can indicate Nominal, Ordinal, or Scale (SPSS does not distinguish between interval and ratio scales).

     Look at the sample data file we created in Chapter 1. We calculated a mean for the variable GRADE. GRADE was measured on a ratio scale, and the mean is an acceptable summary statistic (assuming that the distribution is normal).

     We could have had SPSS calculate a mean for the variable TIME instead of GRADE. If we did, we would get the output presented here.

     [Screenshot: Descriptives output for TIME]

     The output indicates that the average TIME was 1.25. Remember that TIME was coded as an ordinal variable (1 = morning class, 2 = afternoon class). Thus, the mean is not an appropriate statistic for an ordinal scale, but SPSS calculated it anyway. The importance of considering the type of data cannot be overemphasized. Just because SPSS will compute a statistic for you does not mean that you should use it.
  17. Later in the text, when specific statistical procedures are discussed, the conditions under which they are appropriate will be addressed.

     Missing Data
     Often, participants do not provide complete data. For some students, you may have a pretest score but not a posttest score. Perhaps one student left one question blank on a survey, or perhaps she did not state her age. Missing data can weaken any analysis. Often, a single missing question can eliminate a subject from all analyses.

     If you have missing data in your data set, leave that cell blank. In the example to the left, the fourth subject did not complete Question 2. Note that the total score (which is calculated from both questions) is also blank because of the missing data for Question 2. SPSS represents missing data in the data window with a period (although you should not enter a period; just leave it blank).

     Section 2.2 Transformation and Selection of Data

     We often have more data in a data file than we want to include in a specific analysis. For example, our sample data file contains data from four participants, two of whom received special training and two of whom did not. If we wanted to conduct an analysis using only the two participants who did not receive the training, we would need to specify the appropriate subset.

     Selecting a Subset
     We can use the Select Cases command to specify a subset of our data. The Select Cases command is located under the Data menu. When you select this command, the dialog box below will appear.

     You can specify which cases (participants) you want to select by using the selection criteria, which appear on the right side of the Select Cases dialog box.
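Two points from Section 2.1 can be illustrated with a short Python sketch: software will happily average category codes, and a total computed from a missing answer should itself be missing. The TIME codes are those of the four students from Chapter 1; the answer values passed to `total_score` are hypothetical, because the screenshot data did not survive, and the use of `None` to stand for a blank cell is an assumption of this sketch.

```python
import statistics

# TIME codes for the four students from Chapter 1 (1 = morning, 2 = afternoon).
time_codes = [2, 1, 1, 1]

# The arithmetic works, but the result has no useful interpretation:
# there is no such thing as class time "1.25".
mean_time = statistics.mean(time_codes)
print("Mean of TIME codes:", mean_time)


def total_score(*answers):
    """Sum the answers, or return None (missing) if any answer is missing."""
    if any(answer is None for answer in answers):
        return None
    return sum(answers)


# Hypothetical answers to two questions; None marks a blank cell.
print(total_score(2, 2))     # both questions answered
print(total_score(4, None))  # Question 2 left blank, so the total is missing too
```

The design choice here mirrors the behavior the text describes: a missing input produces a missing total rather than silently treating the blank as zero.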
  18. By default, All cases will be selected. The most common way to select a subset is to click If condition is satisfied, then click on the button labeled If. This will bring up a new dialog box that allows you to indicate which cases you would like to use.

     You can enter the logic used to select the subset in the upper section. If the logical statement is true for a given case, then that case will be selected. If the logical statement is false, that case will not be selected. For example, you can select all cases that were coded as Mon/Wed/Fri by entering the formula DAY = 1 in the upper-right part of the window. If DAY is 1, then the statement will be true, and SPSS will select the case. If DAY is anything other than 1, the statement will be false, and the case will not be selected. Once you have entered the logical statement, click Continue to return to the Select Cases dialog box. Then, click OK to return to the data window.

     After you have selected the cases, the data window will change slightly. The cases that were not selected will be marked with a diagonal line through the case number. For example, for our sample data, the first and third cases are not selected. Only the second and fourth cases are selected for this subset.

     [Screenshot: Data View window with unselected cases struck through]

     An additional variable will also be created in your data file. The new variable is called FILTER_$ and indicates whether a case was selected or not.
     If we calculate a mean GRADE using the subset we just selected, we will receive the output at right. Notice that we now have a mean of 78.00 with a sample size (N) of 2 instead of 4.

     Descriptive Statistics
                           N    Minimum   Maximum   Mean      Std. Deviation
       GRADE               2    73.00     83.00     78.0000   7.0711
       Valid N (listwise)  2
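The Select Cases example (DAY = 1) can be reproduced in a few lines of Python to confirm the output above. This is a sketch of the idea only: the rows carry just the variables needed here, and storing the flag under the key "FILTER_$" with 1 for selected and 0 for not selected is an assumption made to echo the variable SPSS creates.

```python
import statistics

# The four participants from Chapter 1, reduced to the variables needed here.
# DAY is coded 1 = Mon/Wed/Fri, 2 = Tues/Thurs.
students = [
    {"ID": 4593, "DAY": 2, "GRADE": 85},
    {"ID": 1901, "DAY": 1, "GRADE": 83},
    {"ID": 8734, "DAY": 2, "GRADE": 80},
    {"ID": 1909, "DAY": 1, "GRADE": 73},
]

# Flag each case according to the logical statement DAY = 1.
for row in students:
    row["FILTER_$"] = 1 if row["DAY"] == 1 else 0

# Only the flagged cases enter the analysis.
selected_grades = [row["GRADE"] for row in students if row["FILTER_$"] == 1]

print("N:", len(selected_grades))
print("Mean:", statistics.mean(selected_grades))
print("Std. Deviation:", round(statistics.stdev(selected_grades), 4))
```

As in the text, the second and fourth cases are selected, giving N = 2 and a mean of 78 rather than the four-case mean.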
  19. Be careful when you select subsets. The subset remains in effect until you run the command again and select all cases. You can tell if you have a subset selected because the bottom of the data window will indicate that a filter is on. In addition, when you examine your output, N will be less than the total number of records in your data set if a subset is selected. The diagonal lines through some cases will also be evident when a subset is selected. Be careful not to save your data file with a subset selected, as this can cause considerable confusion later.

     Computing a New Variable
     SPSS can also be used to compute a new variable or manipulate your existing variables. To illustrate this, we will create a new data file. This file will contain data for four participants and three variables (Q1, Q2, and Q3). The variables represent the number of points each participant received on three different questions. Now enter the data shown on the screen to the right. When done, save this data file as "QUESTIONS.sav." We will be using it again in later chapters.

     [Screenshot: Data View window for QUESTIONS.sav]

     Now you will calculate the total score for each subject. We could do this manually, but if the data file were large, or if there were a lot of questions, this would take a long time. It is more efficient (and more accurate) to have SPSS compute the totals for you. To do this, click Transform and then click Compute Variable.

     After clicking the Compute Variable command, we get the dialog box at right. The blank field marked Target Variable is where we enter the name of the new variable we want to create. In this example, we are creating a variable called TOTAL, so type the word "total."

     Notice that there is an equals sign between the Target Variable blank and the Numeric Expression blank. These two blank areas are the two sides of an equation that SPSS will calculate.
  20. For example, total = q1 + q2 + q3 is the equation that is entered in the sample presented here (screenshot at left). Note that it is possible to create any equation here simply by using the number and operational keypad at the bottom of the dialog box. When we click OK, SPSS will create a new variable called TOTAL and make it equal to the sum of the three questions.

     Save your data file again so that the new variable will be available for future sessions.

     [Screenshot: Data View window showing the new TOTAL variable]

     Recoding a Variable: Different Variable
     SPSS can create a new variable based upon data from another variable. Say we want to split our participants on the basis of their total score. We want to create a variable called GROUP, which is coded 1 if the total score is low (less than or equal to 8) or 2 if the total score is high (9 or larger). To do this, we click Transform, then Recode into Different Variables.
  21. This will bring up the Recode into Different Variables dialog box shown here. Transfer the variable TOTAL to the middle blank. Type "group" in the Name field under Output Variable. Click Change, and the middle blank will show that TOTAL is becoming GROUP, as shown below.

     To help keep track of variables that have been recoded, it's a good idea to open the Variable View and enter "Recoded" in the Label column in the TOTAL row. This is especially useful with large datasets which may include many recoded variables.

     Click Old and New Values. This will bring up the Recode dialog box. In this example, we have entered a 9 in the Range, value through HIGHEST field and a 2 in the Value field under New Value. When we click Add, the blank on the right displays the recoding formula. Now enter an 8 on the left in the Range, LOWEST through value blank and a 1 in the Value field under New Value. Click Add, then Continue. Click OK. You will be redirected to the data window. A new variable (GROUP) will have been added and coded as 1 or 2, based on TOTAL.
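The Compute Variable and Recode steps described above amount to two small transformations, sketched here in Python. The Q1 to Q3 scores are hypothetical placeholders, because the data shown in the book's screenshot could not be recovered, and the function name `recode_group` is an illustrative assumption. A value that falls in neither range (for example 8.5) is returned as `None`, which is this sketch's stand-in for a value left uncoded.

```python
# Hypothetical scores for four participants on three questions.
# These are placeholders, NOT the values from QUESTIONS.sav.
rows = [
    {"Q1": 3, "Q2": 3, "Q3": 4},
    {"Q1": 4, "Q2": 2, "Q3": 3},
    {"Q1": 2, "Q2": 2, "Q3": 3},
    {"Q1": 1, "Q2": 3, "Q3": 1},
]


def recode_group(total):
    """Recode TOTAL into GROUP using the two ranges from the text."""
    if total >= 9:   # Range, value through HIGHEST: 9 -> 2
        return 2
    if total <= 8:   # Range, LOWEST through value: 8 -> 1
        return 1
    return None      # between 8 and 9: covered by neither range


for row in rows:
    # Compute Variable: total = q1 + q2 + q3
    row["TOTAL"] = row["Q1"] + row["Q2"] + row["Q3"]
    # Recode into Different Variables: TOTAL -> GROUP
    row["GROUP"] = recode_group(row["TOTAL"])
    print(row)
```

Keeping the recode in its own function leaves TOTAL untouched, which corresponds to recoding into a different variable rather than into the same one.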
  22. 22. Chapter 3 Descriptive Statistics
In Chapter 2, we discussed many of the options available in SPSS for dealing with data. Now we will discuss ways to summarize our data. The procedures used to describe and summarize data are called descriptive statistics.
Section 3.1 Frequency Distributions and Percentile Ranks for a Single Variable
Description
The Frequencies command produces frequency distributions for the specified variables. The output includes the number of occurrences, percentages, valid percentages, and cumulative percentages. The valid percentages and the cumulative percentages are based only on the data that are not designated as missing.
The Frequencies command is useful for describing samples where the mean is not useful (e.g., nominal or ordinal scales). It is also useful as a method of getting the feel of your data. It provides more information than just a mean and standard deviation and can be useful in determining skew and identifying outliers. A special feature of the command is its ability to determine percentile ranks.
Assumptions
Cumulative percentages and percentiles are valid only for data that are measured on at least an ordinal scale. Because the output contains one line for each value of a variable, this command works best on variables with a relatively small number of values.
Drawing Conclusions
The Frequencies command produces output that indicates both the number of cases in the sample of a particular value and the percentage of cases with that value. Thus, conclusions drawn should relate only to describing the numbers or percentages of cases in the sample. If the data are at least ordinal in nature, conclusions regarding the cumulative percentage and/or percentiles can be drawn.
SPSS Data Format
The SPSS data file for obtaining frequency distributions requires only one variable, and that variable can be of any type.
  23. 23. Chapter 3 Descriptive Statistics
Creating a Frequency Distribution
To run the Frequencies command, click Analyze, then Descriptive Statistics, then Frequencies. (This example uses the CARS.sav data file that comes with SPSS. It is typically located at <C:\Program Files\SPSS\Cars.sav>.) This will bring up the main dialog box. Transfer the variable for which you would like a frequency distribution into the Variable(s) blank to the right. Be sure that the Display frequency tables option is checked. Click OK to receive your output.
Note that the dialog boxes in newer versions of SPSS show both the type of variable (the icon immediately left of the variable name) and the variable labels if they are entered. Thus, the variable YEAR shows up in the dialog box as Model Year (modulo 100).
Output for a Frequency Distribution
The output consists of two sections. The first section indicates the number of records with valid data for each variable selected. Records with a blank score are listed as missing. In this example, the data file contained 406 records. Notice that the variable label is Model Year (modulo 100).
Statistics
Model Year (modulo 100)
N   Valid     405
    Missing     1
The second section of the output contains a cumulative frequency distribution for each variable selected. At the top of the section, the variable label is given. The output itself consists of five columns. The first column lists the values of the variable in sorted order. There is a row for each value of your variable, and additional rows are added at the bottom for the Total and Missing data. The second column gives the frequency of each value, including missing values.
The third column gives the percentage of all records (including records with missing data) for each value. The fourth column, labeled Valid Percent, gives the percentage of records (without including records with missing data) for each value. If there were any missing values, these values would be larger than the values in column three because the total number of records would have been reduced by the number of records with missing values. The final column gives cumulative percentages. Cumulative percentages indicate the percentage of records with a score equal to or smaller than the current value. Thus, the last value is always 100%. These values are equivalent to percentile ranks for the values listed.
Model Year (modulo 100)
                         Frequency   Percent   Valid Percent   Cumulative Percent
Valid     70                 34        8.4          8.4               8.4
          71                 29        7.1          7.2              15.6
          72                 28        6.9          6.9              22.5
          73                 40        9.9          9.9              32.3
          74                 27        6.7          6.7              39.0
          75                 30        7.4          7.4              46.4
          76                 34        8.4          8.4              54.8
          77                 28        6.9          6.9              61.7
          78                 36        8.9          8.9              70.6
          79                 29        7.1          7.2              77.8
          80                 29        7.1          7.2              84.9
          81                 30        7.4          7.4              92.3
          82                 31        7.6          7.7             100.0
          Total             405       99.8        100.0
Missing   0 (Missing)         1         .2
Total                       406      100.0
  24. 24. Chapter 3 Descriptive Statistics
Determining Percentile Ranks
The Frequencies command can be used to provide a number of descriptive statistics, as well as a variety of percentile values (including quartiles, cut points, and scores corresponding to a specific percentile rank).
To obtain either the descriptive or percentile functions of the Frequencies command, click the Statistics button at the bottom of the main dialog box. This brings up the Frequencies: Statistics dialog box. Note that the Central Tendency and Dispersion sections of this box are useful for calculating values, such as the Median or Mode, which cannot be calculated with the Descriptives command (see Section 3.3).
Check any additional desired statistic by clicking on the blank next to it. For percentiles, enter the desired percentile rank in the blank to the right of the Percentile(s) label. Then, click Add to add it to the list of percentiles requested. Once you have selected all your required statistics, click Continue to return to the main dialog box. Click OK.
Output for Percentile Ranks
The Statistics dialog box adds on to the previous output from the Frequencies command. The new section of the output is shown at left.
Statistics
Model Year (modulo 100)
N             Valid      405
              Missing      1
Percentiles   25       73.00
              50       76.00
              75       79.00
              80       80.00
The output contains a row for each piece of information you requested. In the example above, we checked Quartiles and asked for the 80th percentile. Thus, the output contains rows for the 25th, 50th, 75th, and 80th percentiles.
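The relationship between the Percent, Valid Percent, and Cumulative Percent columns can be sketched in Python. The scores below are hypothetical (they are not the CARS.sav data), and the quartile call uses one common interpolation rule from Python's standard library; whether it matches the percentile method SPSS applies to a given data set is an assumption that should be checked against actual output.

```python
from collections import Counter
from statistics import quantiles

# Hypothetical scores for eight records; None marks a missing value.
scores = [1, 2, 2, 3, 3, 3, 4, None]

valid = sorted(s for s in scores if s is not None)
counts = Counter(valid)

table = []        # one row per distinct value
cumulative = 0.0
for value in sorted(counts):
    freq = counts[value]
    percent = 100 * freq / len(scores)       # base: all records, missing included
    valid_percent = 100 * freq / len(valid)  # base: non-missing records only
    cumulative += valid_percent              # running total of valid percent
    table.append(
        (value, freq, round(percent, 1), round(valid_percent, 1), round(cumulative, 1))
    )

for row in table:
    print(row)

# Quartiles (25th, 50th, 75th percentiles) of the non-missing scores.
q1, q2, q3 = quantiles(valid, n=4, method="exclusive")
print(q1, q2, q3)
```

Because one record is missing, every Valid Percent is larger than the matching Percent, and the cumulative column ends at 100.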
  25. 25. Chapter 3 Descriptive Statistics
Practice Exercise
Using Practice Data Set 1 in Appendix B, create a frequency distribution table for the mathematics skills scores. Determine the mathematics skills score at which the 60th percentile lies.
Section 3.2 Frequency Distributions and Percentile Ranks for Multiple Variables
Description
The Crosstabs command produces frequency distributions for multiple variables. The output includes the number of occurrences of each combination of levels of each variable. It is possible to have the command give percentages for any or all variables.
The Crosstabs command is useful for describing samples where the mean is not useful (e.g., nominal or ordinal scales). It is also useful as a method for getting a feel for your data.
Assumptions
Because the output contains a row or column for each value of a variable, this command works best on variables with a relatively small number of values.
SPSS Data Format
The SPSS data file for the Crosstabs command requires two or more variables. Those variables can be of any type.
Running the Crosstabs Command
This example uses the SAMPLE.sav data file, which you created in Chapter 1. To run the procedure, click Analyze, then Descriptive Statistics, then Crosstabs. This will bring up the main Crosstabs dialog box, below.
The dialog box initially lists all variables on the left and contains two blanks labeled Row(s) and Column(s). Enter one variable (TRAINING) in the Row(s) box. Enter the second (WORK) in the Column(s) box. To analyze more than two variables, you would enter the third, fourth, etc., in the unlabeled area (just under the Layer indicator).
  26. 26. Chapter 3 Descriptive Statistics
The Cells button allows you to specify percentages and other information to be generated for each combination of values. Click Cells, and you will get the box at right.
For the example presented here, check Row, Column, and Total percentages. Then click Continue. This will return you to the Crosstabs dialog box. Click OK to run the analysis.
TRAINING * WORK Crosstabulation
                                          WORK
                                      No     Part-Time     Total
TRAINING  Yes    Count                 1         1            2
                 % within TRAINING   50.0%     50.0%       100.0%
                 % within WORK       50.0%     50.0%        50.0%
                 % of Total          25.0%     25.0%        50.0%
          No     Count                 1         1            2
                 % within TRAINING   50.0%     50.0%       100.0%
                 % within WORK       50.0%     50.0%        50.0%
                 % of Total          25.0%     25.0%        50.0%
Total            Count                 2         2            4
                 % within TRAINING   50.0%     50.0%       100.0%
                 % within WORK      100.0%    100.0%       100.0%
                 % of Total          50.0%     50.0%       100.0%
Interpreting Crosstabs Output
The output consists of a contingency table. Each level of WORK is given a column. Each level of TRAINING is given a row. In addition, a row is added for total, and a column is added for total.
Each cell contains the number of participants (e.g., one participant received no training and does not work; two participants received no training, regardless of employment status).
The percentages for each cell are also shown. Row percentages add up to 100% horizontally. Column percentages add up to 100% vertically. For example, of all the individuals who had no training, 50% did not work and 50% worked part-time (using the "% within TRAINING" row). Of the individuals who did not work, 50% had no training and 50% had training (using the "% within WORK" row).
Practice Exercise
Using Practice Data Set 1 in Appendix B, create a contingency table using the Crosstabs command. Determine the number of participants in each combination of the variables SEX and MARITAL. What percentage of participants is married? What percentage of participants is male and married?
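The three kinds of cell percentages in a contingency table differ only in their denominator, which a short Python sketch can make explicit. The four records below are hypothetical rows chosen to be consistent with the TRAINING by WORK table in this section, and the function name cell_summary is illustrative only.

```python
from collections import Counter

# Hypothetical (TRAINING, WORK) values for four participants,
# chosen so that every cell of the table holds one participant.
records = [("Yes", "No"), ("Yes", "Part-Time"), ("No", "No"), ("No", "Part-Time")]

cell = Counter(records)                      # count per (row, column) combination
row_total = Counter(t for t, _ in records)   # count per TRAINING level
col_total = Counter(w for _, w in records)   # count per WORK level
n = len(records)


def cell_summary(training, work):
    """Count plus row, column, and total percentages for one cell."""
    count = cell[(training, work)]
    return {
        "count": count,
        "row_pct": 100 * count / row_total[training],  # % within TRAINING
        "col_pct": 100 * count / col_total[work],      # % within WORK
        "total_pct": 100 * count / n,                  # % of Total
    }


for t in ("Yes", "No"):
    for w in ("No", "Part-Time"):
        print(t, w, cell_summary(t, w))
```

Row percentages divide by the row total, column percentages by the column total, and total percentages by the number of all participants, which is why the three values in a cell usually differ.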
Section 3.3 Measures of Central Tendency and Measures of Dispersion for a Single Group
Description
Measures of central tendency are values that represent a typical member of the sample or population. The three primary types are the mean, median, and mode. Measures of dispersion tell you the variability of your scores. The primary types are the range and the standard deviation. Together, a measure of central tendency and a measure of dispersion provide a great deal of information about the entire data set.
  27. 27. Chapter 3 Descriptive Statistics
We will discuss these measures of central tendency and measures of dispersion in the context of the Descriptives command. Note that many of these statistics can also be calculated with several other commands (e.g., the Frequencies or Compare Means commands are required to compute the mode or median; the Statistics option for the Frequencies command is shown here).
Assumptions
Each measure of central tendency and measure of dispersion has different assumptions associated with it. The mean is the most powerful measure of central tendency, and it has the most assumptions. For example, to calculate a mean, the data must be measured on an interval or ratio scale. In addition, the distribution should be normal or, at least, not highly skewed. The median requires at least ordinal data. Because the median indicates only the middle score (when scores are arranged in order), there are no assumptions about the shape of the distribution. The mode is the weakest measure of central tendency. There are no assumptions for the mode.
The standard deviation is the most powerful measure of dispersion, but it, too, has several requirements. It is a mathematical transformation of the variance (the standard deviation is the square root of the variance). Thus, if one is appropriate, the other is also. The standard deviation requires data measured on an interval or ratio scale. In addition, the distribution should be normal. The range is the weakest measure of dispersion. To calculate a range, the variable must be at least ordinal. For nominal scale data, the entire frequency distribution should be presented as a measure of dispersion.
Drawing Conclusions
A measure of central tendency should be accompanied by a measure of dispersion. Thus, when reporting a mean, you should also report a standard deviation. When presenting a median, you should also state the range or interquartile range.
SPSS Data Format
Only one variable is required.
  28. 28. Chapter 3 Descriptive Statistics
Running the Command
The Descriptives command will be the command you will most likely use for obtaining measures of central tendency and measures of dispersion. This example uses the SAMPLE.sav data file we have used in the previous chapters.
To run the command, click Analyze, then Descriptive Statistics, then Descriptives. This will bring up the main dialog box for the Descriptives command. Any variables you would like information about can be placed in the right blank by double-clicking them or by selecting them, then clicking on the arrow.
By default, you will receive the N (number of cases/participants), the minimum value, the maximum value, the mean, and the standard deviation. Note that some of these may not be appropriate for the type of data you have selected.
If you would like to change the default statistics that are given, click Options in the main dialog box. You will be given the Options dialog box presented here.
Reading the Output
The output for the Descriptives command is quite straightforward. Each type of output requested is presented in a column, and each variable is given in a row. The output presented here is for the sample data file. It shows that we have one variable (GRADE) and that we obtained the N, minimum, maximum, mean, and standard deviation for this variable.
Descriptive Statistics
                      N    Minimum    Maximum      Mean    Std. Deviation
grade                 4      73.00      85.00   80.2500           5.25198
Valid N (listwise)    4
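The default Descriptives statistics are simple enough to reproduce by hand, which is a useful check on your reading of the output. The sketch below uses Python's standard library and assumes the four GRADE values that appear in this chapter's sample output (85, 80, 83, and 73); the variable names are illustrative.

```python
from statistics import mean, stdev

# GRADE values for the four participants in the sample file.
grades = [85, 80, 83, 73]

n = len(grades)
minimum = min(grades)
maximum = max(grades)
average = mean(grades)
# stdev() divides by n - 1 (the sample standard deviation).
sd = stdev(grades)

print(n, minimum, maximum, average, round(sd, 5))
```

The values agree with the Descriptive Statistics table above: a mean of 80.25 and a standard deviation of about 5.25198.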
  29. 29. Chapter 3 Descriptive Statistics
Practice Exercise
Using Practice Data Set 1 in Appendix B, obtain the descriptive statistics for the age of the participants. What is the mean? The median? The mode? What is the standard deviation? Minimum? Maximum? The range?
Section 3.4 Measures of Central Tendency and Measures of Dispersion for Multiple Groups
Description
The measures of central tendency discussed earlier are often needed not only for the entire data set, but also for several subsets. One way to obtain these values for subsets would be to use the data-selection techniques discussed in Chapter 2 and apply the Descriptives command to each subset. An easier way to perform this task is to use the Means command. The Means command is designed to provide descriptive statistics for subsets of your data.
Assumptions
The assumptions discussed in the section on Measures of Central Tendency and Measures of Dispersion for a Single Group (Section 3.3) also apply to multiple groups.
Drawing Conclusions
A measure of central tendency should be accompanied by a measure of dispersion. Thus, when giving a mean, you should also report a standard deviation. When presenting a median, you should also state the range or interquartile range.
SPSS Data Format
Two variables in the SPSS data file are required. One represents the dependent variable and will be the variable for which you receive the descriptive statistics. The other is the independent variable and will be used in creating the subsets. Note that while SPSS calls this variable an independent variable, it may not meet the strict criteria that define a true independent variable (e.g., treatment manipulation). Thus, some SPSS procedures refer to it as the grouping variable.
Running the Command
This example uses the SAMPLE.sav data file you created in Chapter 1.
The Means command is run by clicking Analyze, then Compare Means, then Means. This will bring up the main dialog box for the Means command. Place the selected variable in the blank field labeled Dependent List.
  30. 30. Chapter 3 Descriptive Statistics
Place the grouping variable in the box labeled Independent List. In this example, through use of the SAMPLE.sav data file, measures of central tendency and measures of dispersion for the variable GRADE will be given for each level of the variable MORNING.
By default, the mean, number of cases, and standard deviation are given. If you would like additional measures, click Options and you will be presented with the dialog box at right. You can opt to include any number of measures.
Reading the Output
The output for the Means command is split into two sections. The first section, called a case processing summary, gives information about the data used. In our sample data file, there are four students (cases), all of whom were included in the analysis.
Case Processing Summary
                                    Cases
                    Included         Excluded          Total
                    N   Percent      N   Percent      N   Percent
grade * morning     4   100.0%       0     .0%        4   100.0%
  31. 31. Chapter 3 Descriptive Statistics
The second section of the output is the report from the Means command. This report lists the name of the dependent variable at the top (GRADE). Every level of the independent variable (MORNING) is shown in a row in the table. In this example, the levels are 0 and 1, labeled No and Yes. Note that if a variable is labeled, the labels will be used instead of the raw values.
Report
GRADE
MORNING       Mean    N    Std. Deviation
No         82.5000    2           3.53553
Yes        78.0000    2           7.07107
Total      80.2500    4           5.25198
The summary statistics given in the report correspond to the data, where the level of the independent variable is equal to the row heading (e.g., No, Yes). Thus, two participants were included in each row.
An additional row is added, named Total. That row contains the combined data, and the values are the same as they would be if we had run the Descriptives command for the variable GRADE.
Extension to More Than One Independent Variable
If you have more than one independent variable, SPSS can break down the output even further. Rather than adding more variables to the Independent List section of the dialog box, you need to add them in a different layer. Note that SPSS indicates with which layer you are working.
If you click Next, you will be presented with Layer 2 of 2, and you can select a second independent variable (e.g., TRAINING). Now, when you run the command (by clicking OK), you will be given summary statistics for the variable GRADE by each level of MORNING and TRAINING.
Your output will look like the output at right. You now have two main sections (No and Yes), along with the Total. Now, however, each main section is broken down into subsections (No, Yes, and Total).
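The grouping that the Means command performs, with one layer or with two, can be sketched in Python. The four rows below are the (MORNING, TRAINING, GRADE) values implied by the sample output in this section; the helper names summarize and report are hypothetical and only illustrate the idea.

```python
from statistics import mean, stdev

# (MORNING, TRAINING, GRADE) for the four participants in the sample file.
rows = [("No", "Yes", 85), ("No", "No", 80), ("Yes", "Yes", 83), ("Yes", "No", 73)]


def summarize(values):
    """Mean, N, and sample standard deviation for one subset of grades."""
    sd = stdev(values) if len(values) > 1 else None  # undefined for a single case
    return {"mean": mean(values), "n": len(values), "sd": sd}


def report(key_fn):
    """Group the grades by whatever key_fn extracts from each row."""
    groups = {}
    for row in rows:
        groups.setdefault(key_fn(row), []).append(row[2])
    return {key: summarize(vals) for key, vals in groups.items()}


by_morning = report(lambda r: r[0])        # one independent variable
by_layers = report(lambda r: (r[0], r[1])) # MORNING in layer 1, TRAINING in layer 2

print(by_morning)
print(by_layers)
```

With two layers, every subset here contains a single participant, so no standard deviation can be computed for it; the sketch returns None in that case.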
The variable you used in Layer 1 (MORNING) is the first one listed, and it defines the main sections. The variable you had in Layer 2 (TRAINING) is listed second. Thus, the first row represents those participants who were not morning people and who received training. The second row represents participants who were not morning people and did not receive training. The third row represents the total for all participants who were not morning people.
Report
GRADE
MORNING    TRAINING       Mean    N    Std. Deviation
No         Yes         85.0000    1                 .
           No          80.0000    1                 .
           Total       82.5000    2           3.53553
Yes        Yes         83.0000    1                 .
           No          73.0000    1                 .
           Total       78.0000    2           7.07107
Total      Yes         84.0000    2           1.41421
           No          76.5000    2           4.94975
           Total       80.2500    4           5.25198
  32. 32. Chapter 3 Descriptive Statistics
Notice that standard deviations are not given for all of the rows. This is because there is only one participant per cell in this example. One problem with using many subsets is that it increases the number of participants required to obtain meaningful results. See a research design text or your instructor for more details.
Practice Exercise
Using Practice Data Set 1 in Appendix B, compute the mean and standard deviation of ages for each value of marital status. What is the average age of the married participants? The single participants? The divorced participants?
Section 3.5 Standard Scores
Description
Standard scores allow the comparison of different scales by transforming the scores into a common scale. The most common standard score is the z-score. A z-score is based on a standard normal distribution (i.e., a mean of 0 and a standard deviation of 1). A z-score, therefore, represents the number of standard deviations above or below the mean (e.g., a z-score of -1.5 represents a score 1.5 standard deviations below the mean).
Assumptions
Z-scores are based on the standard normal distribution. Therefore, the distributions that are converted to z-scores should be normally distributed, and the scales should be either interval or ratio.
Drawing Conclusions
Conclusions based on z-scores consist of the number of standard deviations above or below the mean. For example, a student scores 85 on a mathematics exam in a class that has a mean of 70 and standard deviation of 5. The student's test score is 15 points above the class mean (85 - 70 = 15). The student's z-score is 3 because she scored 3 standard deviations above the mean (15 / 5 = 3).
If the same student scores 90 on a reading exam, with a class mean of 80 and a standard deviation of 10, the z-score will be 1.0 because she is one standard deviation above the mean. Thus, even though her raw score was higher on the reading test, she actually did better in relation to other students on the mathematics test because her z-score was higher on that test.
SPSS Data Format
Calculating z-scores requires only a single variable in SPSS. That variable must be numerical.
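The comparison of the mathematics and reading exams in Drawing Conclusions comes down to one formula, z = (raw score - mean) / standard deviation. A minimal Python sketch, with the illustrative function name z_score, reproduces both results:

```python
def z_score(raw, mean, sd):
    """Number of standard deviations that raw lies above (+) or below (-) the mean."""
    return (raw - mean) / sd


math_z = z_score(85, 70, 5)      # mathematics exam: class mean 70, SD 5
reading_z = z_score(90, 80, 10)  # reading exam: class mean 80, SD 10

print(math_z, reading_z)
print(math_z > reading_z)
```

The raw reading score is higher, but the mathematics z-score is larger, so the student did relatively better in mathematics.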
  33. 33. Chapter 3 Descriptive Statistics
Running the Command
Computing z-scores is a component of the Descriptives command. To access it, click Analyze, then Descriptive Statistics, then Descriptives. This example uses the sample data file (SAMPLE.sav) created in Chapters 1 and 2.
This will bring up the standard dialog box for the Descriptives command. Notice the checkbox in the bottom-left corner labeled Save standardized values as variables. Check this box and move the variable GRADE into the right-hand blank. Then click OK to complete the analysis. You will be presented with the standard output from the Descriptives command. Notice that the z-scores are not listed. They were inserted into the data window as a new variable.
Switch to the Data View window and examine your data file. Notice that a new variable, called ZGRADE, has been added. When you asked SPSS to save standardized values, it created a new variable with the same name as your old variable preceded by a Z. The z-score is computed for each case and placed in the new variable.
Reading the Output
After you conducted your analysis, the new variable was created. You can perform any number of subsequent analyses on the new variable.
Practice Exercise
Using Practice Data Set 2 in Appendix B, determine the z-score that corresponds to each employee's salary. Determine the mean z-scores for salaries of male employees and female employees. Determine the mean z-score for salaries of the total sample.
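What the saved ZGRADE variable contains can be sketched in Python by standardizing a whole column at once. The sketch assumes the four GRADE values from this chapter's sample output and assumes that the z-scores are based on the sample (n - 1) standard deviation reported by Descriptives; the list name zgrade simply mirrors the new variable's name.

```python
from statistics import mean, stdev

# GRADE values for the four participants in the sample file.
grades = [85, 80, 83, 73]

m = mean(grades)
s = stdev(grades)  # sample standard deviation (divides by n - 1)

# One z-score per case, in the same order as the original variable.
zgrade = [(g - m) / s for g in grades]

for g, z in zip(grades, zgrade):
    print(g, round(z, 3))
```

A standardized variable always has a mean of 0 and a standard deviation of 1, which is a quick way to confirm that the transformation worked.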
  34. 34. Chapter 4 Graphing Data
Section 4.1 Graphing Basics
In addition to the frequency distributions, the measures of central tendency and measures of dispersion discussed in Chapter 3, graphing is a useful way to summarize, organize, and reduce your data. It has been said that a picture is worth a thousand words. In the case of complicated data sets, this is certainly true.
With Version 15.0 of SPSS, it is now possible to make publication-quality graphs using only SPSS. One important advantage of using SPSS to create your graphs instead of other software (e.g., Excel or SigmaPlot) is that the data have already been entered. Thus, duplication is eliminated, and the chance of making a transcription error is reduced.
Section 4.2 The New SPSS Chart Builder
Data Set
For the graphing examples, we will use a new set of data. Enter the data below by defining the three subject variables in the Variable View window: HEIGHT (in inches), WEIGHT (in pounds), and SEX (1 = male, 2 = female). When you create the variables, designate HEIGHT and WEIGHT as Scale measures and SEX as a Nominal measure (in the far-right column of the Variable View). Switch to the Data View to enter the data values for the 16 participants. Now use the Save As command to save the file, naming it HEIGHT.sav.
HEIGHT   WEIGHT   SEX
  66      150      1
  69      155      1
  73      160      1
  72      160      1
  68      150      1
  63      140      1
  74      165      1
  70      150      1
  66      110      2
  64      100      2
  60       95      2
  67      110      2
  64      105      2
  63      100      2
  67      110      2
  65      105      2
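As a quick sanity check on the data set above, a short Python sketch can compute the mean of each column. The three lists are copied from the data listing, and the variable names are only illustrative; if your own entries give different means, one of the values was probably mistyped.

```python
from statistics import mean

# The 16 participants from the HEIGHT.sav data listing, column by column.
height = [66, 69, 73, 72, 68, 63, 74, 70, 66, 64, 60, 67, 64, 63, 67, 65]
weight = [150, 155, 160, 160, 150, 140, 165, 150, 110, 100, 95, 110, 105, 100, 110, 105]
sex = [1] * 8 + [2] * 8  # 1 = male, 2 = female

# Every column must have one value per participant.
assert len(height) == len(weight) == len(sex) == 16

mean_height = mean(height)
mean_weight = mean(weight)
mean_sex = mean(sex)

print(mean_height, mean_weight, mean_sex)
```

The same check can be run inside SPSS with the Descriptives command.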
  35. 35. Chapter 4 Graphing Data
Make sure you have entered the data correctly by calculating a mean for each of the three variables (click Analyze, then Descriptive Statistics, then Descriptives). Compare your results with those in the table below.
Descriptive Statistics
                       N    Minimum    Maximum       Mean    Std. Deviation
HEIGHT                16      60.00      74.00    66.9375            3.9067
WEIGHT                16      95.00     165.00   129.0625           26.3451
SEX                   16       1.00       2.00     1.5000             .5164
Valid N (listwise)    16
Chart Builder Basics
Make sure that the HEIGHT.sav data file you created above is open. In order to use the chart builder, you must have a data file open.
New with Version 15.0 of SPSS is the Chart Builder command. This command is accessed using Graphs, then Chart Builder in the submenu. This is a very versatile new command that can make graphs of excellent quality.
When you first run the Chart Builder command, you will probably be presented with the following dialog box:
"Before you use this dialog, measurement level should be set properly for each variable in your chart. In addition, if your chart contains categorical variables, value labels should be defined for each category. Press OK to define your chart. Press Define Variable Properties to set measurement level or define value labels for chart variables."
This dialog box is asking you to ensure that your variables are properly defined. Refer to Sections 1.3 and 2.1 if you had difficulty defining the variables used in creating the data set for this example, or to refresh your knowledge of this topic. Click OK.
The Chart Builder allows you to make any kind of graph that is normally used in publication or presentation, and much of it is beyond the scope of this text. This text, however, will go over the basics of the Chart Builder so that you can understand its mechanics.
On the left side of the Chart Builder window are the four main tabs that let you control the graphs you are making. The first one is the Gallery tab. The Gallery tab allows you to choose the basic format of your graph.
  36. 36. Chapter 4 Graphing Data
For example, the screenshot here shows the different kinds of bar charts that the Chart Builder can create.
After you have selected the basic form of graph that you want using the Gallery tab, you simply drag the image from the bottom right of the window up to the main window at the top (where it reads, "Drag a Gallery chart here to use it as your starting point").
Alternatively, you can use the Basic Elements tab to drag a coordinate system (labeled Choose Axes) to the top window, then drag variables and elements into the window.
The other tabs (Groups/Point ID and Titles/Footnotes) can be used for adding other standard elements to your graphs.
The examples in this text will cover some of the basic types of graphs you can make with the Chart Builder. After a little experimentation on your own, once you have mastered the examples in the chapter, you will soon gain a full understanding of the Chart Builder.
Section 4.3 Bar Charts, Pie Charts, and Histograms
Description
Bar charts, pie charts, and histograms represent the number of times each score occurs through the varying heights of bars or sizes of pie pieces. They are graphical representations of the frequency distributions discussed in Chapter 3.
Drawing Conclusions
The Frequencies command produces output that indicates both the number of cases in the sample with a particular value and the percentage of cases with that value. Thus, conclusions drawn should relate only to describing the numbers or percentages for the sample. If the data are at least ordinal in nature, conclusions regarding the cumulative percentages and/or percentiles can also be drawn.
SPSS Data Format
You need only one variable to use this command.
  37. 37. Chapter 4 Graphing Data
Running the Command
The Frequencies command will produce graphical frequency distributions. Click Analyze, then Descriptive Statistics, then Frequencies. You will be presented with the main dialog box for the Frequencies command, where you can enter the variables for which you would like to create graphs or charts. (See Chapter 3 for other options with this command.)
Click the Charts button at the bottom to produce frequency distributions. This will give you the Charts dialog box.
There are three types of charts available with this command: Bar charts, Pie charts, and Histograms. For each type, the Y axis can be either a frequency count or a percentage (selected with the Chart Values option).
You will receive the charts for any variables selected in the main Frequencies command dialog box.
Output
The bar chart consists of a Y axis, representing the frequency, and an X axis, representing each score. Note that the only values represented on the X axis are those values with nonzero frequencies (61, 62, and 71 are not represented).
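The counting behind such a bar chart can be sketched in Python with a simple text chart. The heights are the 16 HEIGHT values from this chapter's data set; the rest (the names and the use of # characters as bars) is purely illustrative and is not how SPSS draws the chart.

```python
from collections import Counter

# HEIGHT values (in inches) for the 16 participants.
heights = [66, 69, 73, 72, 68, 63, 74, 70, 66, 64, 60, 67, 64, 63, 67, 65]

freq = Counter(heights)

# One bar per observed value; the bar length is the frequency.
for value in sorted(freq):
    print(f"{value:>3} {'#' * freq[value]}")

# Values inside the observed range that never occur get no bar at all.
not_represented = [v for v in range(min(heights), max(heights) + 1) if v not in freq]
print(not_represented)
```

Only values that actually occur receive a bar, which is why 61, 62, and 71 are absent from the X axis.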
  38. 38. Chapter 4 Graphing Data
The pie chart shows the percentage of the whole that is represented by each value.
The Histogram command creates a grouped frequency distribution. The range of scores is split into evenly spaced groups. The midpoint of each group is plotted on the X axis, and the Y axis represents the number of scores for each group.
If you select With Normal Curve, a normal curve will be superimposed over the distribution. This is very useful in determining if the distribution you have is approximately normal. The distribution represented here is clearly not normal due to the asymmetry of the values.
Practice Exercise
Use Practice Data Set 1 in Appendix B. After you have entered the data, construct a histogram that represents the mathematics skills scores and displays a normal curve, and a bar chart that represents the frequencies for the variable AGE.
Section 4.4 Scatterplots
Description
Scatterplots (also called scattergrams or scatter diagrams) display two values for each case with a mark on the graph. The X axis represents the value for one variable. The Y axis represents the value for the second variable.
  39. 39. Chapter 4 Graphing Data
Assumptions
Both variables should be interval or ratio scales. If nominal or ordinal data are used, be cautious about your interpretation of the scattergram.
SPSS Data Format
You need two variables to perform this command.
Running the Command
You can produce scatterplots by clicking Graphs, then Chart Builder. (Note: You can also use the Legacy Dialogs. For this method, please see Appendix F.)
In Gallery Choose from: select Scatter/Dot. Then drag the Simple Scatter icon (top left) up to the main chart area as shown in the screenshot at left. Disregard the Element Properties window that pops up by choosing Close.
Next, drag the HEIGHT variable to the X-Axis area, and the WEIGHT variable to the Y-Axis area. (Remember that standard graphing conventions indicate that dependent variables should be Y and independent variables should be X. This would mean that we are trying to predict weights from heights.) At this point, your screen should look like the example below. Note that your actual data are not shown, just a set of dummy values.
Click OK. You should get your new graph (next page) as Output.
  40. 40. Chapter 4 Graphing Data

Output
The output will consist of a mark for each participant at the appropriate X and Y levels.

[Figure: scatterplot; X axis: height]

Adding a Third Variable
Even though the scatterplot is a two-dimensional graph, it can plot a third variable. To make it do so, select the Groups/Point ID tab in the Chart Builder. Click the Grouping/stacking variable option. Again, disregard the Element Properties window that pops up. Next, drag the variable SEX into the upper-right corner where it indicates Set Color. When this is done, your screen should look like the image at right. If you are not able to drag the variable SEX, it may be because it is not identified as nominal or ordinal in the Variable View window.

[Screenshot: Chart Builder, Groups/Point ID tab, with SEX in the Set Color area]

Click OK to have SPSS produce the graph.
  41. 41. Chapter 4 Graphing Data

Now our output will have two different sets of marks. One set represents the male participants, and the second set represents the female participants. These two sets will appear in two different colors on your screen. You can use the SPSS chart editor (see Section 4.6) to make them different shapes, as shown in the example below.

[Figure: scatterplot by height, with a separate marker shape for each sex]

Practice Exercise
Use Practice Data Set 2 in Appendix B. Construct a scatterplot to examine the relationship between SALARY and EDUCATION.

Section 4.5 Advanced Bar Charts

Description
Bar charts can be produced with the Frequencies command (see Section 4.3). Sometimes, however, we are interested in a bar chart where the Y axis is not a frequency. To produce such a chart, we need to use the Bar charts command.

SPSS Data Format
You need at least two variables to perform this command. There are two basic kinds of bar charts: those for between-subjects designs and those for repeated-measures designs. Use the between-subjects method if one variable is the independent variable and the other is the dependent variable. Use the repeated-measures method if you have a dependent variable for each value of the independent variable (e.g., you would have three
  42. 42. Chapter 4 Graphing Data

variables for a design with three values of the independent variable). This normally occurs when you make multiple observations over time.

This example uses the GRADES.sav data file, which will be created in Chapter 6. Please see Section 6.4 for the data if you would like to follow along.

Running the Command
Open the Chart Builder by clicking Graphs, then Chart Builder. In the Gallery tab, select Bar. If you had only one independent variable, you would select the Simple Bar chart example (top left corner). If you have more than one independent variable (as in this example), select the Clustered Bar Chart example from the middle of the top row.

Drag the example to the top working area. Once you do, the working area should look like the screenshot below. (Note that you will need to open the data file you would like to graph in order to run this command.)

[Screenshot: Chart Builder with the Clustered Bar example in the working area]

If you are using a repeated-measures design like our example here using GRADES.sav from Chapter 6 (three different variables representing the Y values that we want), you need to select all three variables (you can <Ctrl>-click them to select multiple variables) and then drag all three variable names to the Y-Axis area. When you do, you will be given the warning message above. Click OK.

[Screenshot: Chart Builder warning message]
  43. 43. Chapter 4 Graphing Data

Next, you will need to drag the INSTRUCT variable to the top right in the Cluster: set color area (see screenshot at left).

[Screenshot: Chart Builder with INSTRUCT in the Cluster: set color area]

Note: The Chart Builder pays attention to the types of variables that you ask it to graph. If you are getting error messages or unusual results, be sure that your categorical variables are properly designated as Nominal in the Variable View tab (see Chapter 2, Section 2.1).

Output

[Figure: clustered bar chart]

Practice Exercise
Use Practice Data Set 1 in Appendix B. Construct a clustered bar graph examining the relationship between MATHEMATICS SKILLS scores (as the dependent variable) and MARITAL STATUS and SEX (as independent variables). Make sure you classify both SEX and MARITAL STATUS as nominal variables.
  44. 44. Chapter 4 Graphing Data

Section 4.6 Editing SPSS Graphs

Whatever command you use to create your graph, you will probably want to do some editing to make it appear exactly as you want it to look. In SPSS, you do this in much the same way that you edit graphs in other software programs (e.g., Excel). After your graph is made, in the output window, select your graph (this will create handles around the outside of the entire object) and right-click. Then, click SPSS Chart Object, and click Open. Alternatively, you can double-click on the graph to open it for editing.

When you open the graph, the Chart Editor window and the corresponding Properties window will appear.

[Screenshot: Chart Editor window and Properties window]

Once Chart Editor is open, you can easily edit each element of the graph. To select an element, just click on the relevant spot on the graph. For example, if you have added a title to your graph ("Histogram" in the example that follows), you may select the element representing the title of the graph by clicking anywhere on the title.

[Screenshot: Chart Editor with the title "Histogram" selected]
  45. 45. Chapter 4 Graphing Data

Once you have selected an element, you can tell whether the correct element is selected because it will have handles around it.

If the item you have selected is a text element (e.g., the title of the graph), a cursor will be present and you can edit the text as you would in a word processing program. If you would like to change another attribute of the element (e.g., the color or font size), use the Properties box. (Text properties are shown below.)

[Screenshot: Properties box showing text properties]

With a little practice, you can make excellent graphs using SPSS. Once your graph is formatted the way you want it, simply select File, Save, then Close.

[Screenshot: Chart Editor File menu]
  46. 46. Chapter 5 Prediction and Association

Section 5.1 Pearson Correlation Coefficient

Description
The Pearson correlation coefficient (sometimes called the Pearson product-moment correlation coefficient or simply the Pearson r) determines the strength of the linear relationship between two variables.

Assumptions
Both variables should be measured on interval or ratio scales (or be dichotomous nominal variables). If a relationship exists between them, that relationship should be linear. Because the Pearson correlation coefficient is computed with z-scores, both variables should also be normally distributed. If your data do not meet these assumptions, consider using the Spearman rho correlation coefficient instead.

SPSS Data Format
Two variables are required in your SPSS data file. Each subject must have data for both variables.

Running the Command
To select the Pearson correlation coefficient, click Analyze, then Correlate, then Bivariate (bivariate refers to two variables). This will bring up the Bivariate Correlations dialog box. This example uses the HEIGHT.sav data file entered at the start of Chapter 4.

[Screenshot: Analyze menu with Correlate, then Bivariate selected]

Move at least two variables from the box at left into the box at right by using the transfer arrow (or by double-clicking each variable). Make sure that a check is in the Pearson box under Correlation Coefficients. It is acceptable to move more than two variables.
  47. 47. Chapter 5 Prediction and Association

For our example, we will move all three variables over and click OK.

[Screenshot: Bivariate Correlations dialog box]

Reading the Output
The output consists of a correlation matrix. Every variable you entered in the command is represented as both a row and a column. We entered three variables in our command. Therefore, we have a 3 x 3 table. There are also three rows in each cell: the correlation, the significance level, and the N. If a correlation is significant at less than the .05 level, a single * will appear next to the correlation. If it is significant at the .01 level or lower, ** will appear next to the correlation. For example, the correlation in the output at right has a significance level of < .001, so it is flagged with ** to indicate that it is less than .01.

To read the correlations, select a row and a column. For example, the correlation between height and weight is determined through selection of the WEIGHT row and the HEIGHT column (.806). We get the same answer by selecting the HEIGHT row and the WEIGHT column. The correlation between a variable and itself is always 1, so there is a diagonal set of 1s.

Drawing Conclusions
The correlation coefficient will be between -1.0 and +1.0. Coefficients close to 0.0 represent a weak relationship. Coefficients close to 1.0 or -1.0 represent a strong relationship. Generally, correlations greater than 0.7 (in absolute value) are considered strong. Correlations less than 0.3 are considered weak. Correlations between 0.3 and 0.7 are considered moderate.

Significant correlations are flagged with asterisks. A significant correlation indicates a reliable relationship, but not necessarily a strong correlation. With enough participants, a very small correlation can be significant. Please see Appendix A for a discussion of effect sizes for correlations.

Phrasing a Significant Result
In the example above, we obtained a correlation of .806 between HEIGHT and WEIGHT.
Correlations
                                 height     weight     sex
height   Pearson Correlation     1          .806**     -.644**
         Sig. (2-tailed)                    .000       .007
         N                       16         16         16
weight   Pearson Correlation     .806**     1          -.968**
         Sig. (2-tailed)         .000                  .000
         N                       16         16         16
sex      Pearson Correlation     -.644**    -.968**    1
         Sig. (2-tailed)         .007       .000
         N                       16         16         16
**. Correlation is significant at the 0.01 level (2-tailed).

A correlation of .806 is a strong positive correlation, and it is significant at the .001 level. Thus, we could state the following in a results section:
  48. 48. Chapter 5 Prediction and Association

A Pearson correlation coefficient was calculated for the relationship between participants' height and weight. A strong positive correlation was found (r(14) = .806, p < .001), indicating a significant linear relationship between the two variables. Taller participants tend to weigh more.

The conclusion states the direction (positive), strength (strong), value (.806), degrees of freedom (14), and significance level (< .001) of the correlation. In addition, a statement of direction is included (taller is heavier).

Note that the degrees of freedom given in parentheses is 14. The output indicates an N of 16. While most SPSS procedures give degrees of freedom, the correlation command gives only the N (the number of pairs). For a correlation, the degrees of freedom is N - 2.

Phrasing Results That Are Not Significant
Using our SAMPLE.sav data set from the previous chapters, we could calculate a correlation between ID and GRADE. If so, we get the output at right. The correlation has a significance level of .783. Thus, we could write the following in a results section (note that the degrees of freedom is N - 2):

A Pearson correlation was calculated examining the relationship between participants' ID numbers and grades. A weak correlation that was not significant was found (r(2) = .217, p > .05). ID number is not related to grade in the course.

Practice Exercise
Use Practice Data Set 2 in Appendix B. Determine the value of the Pearson correlation coefficient for the relationship between SALARY and YEARS OF EDUCATION.

Section 5.2 Spearman Correlation Coefficient

Description
The Spearman correlation coefficient determines the strength of the relationship between two variables. It is a nonparametric procedure. Therefore, it is weaker than the Pearson correlation coefficient, but it can be used in more situations.

Assumptions
Because the Spearman correlation coefficient functions on the basis of the ranks of data, it requires ordinal (or interval or ratio) data for both variables. They do not need to be normally distributed.
Correlations
                                ID        GRADE
ID      Pearson Correlation     1.000     .217
        Sig. (2-tailed)                   .783
        N                       4         4
GRADE   Pearson Correlation     .217      1.000
        Sig. (2-tailed)         .783
        N                       4         4
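The two coefficients described in Sections 5.1 and 5.2 can be sketched in a few lines of Python. This is only an illustration of the arithmetic, not the routine SPSS itself runs: the function names and the small data set are hypothetical, and Spearman rho is computed here as a Pearson r on the ranks of the data, with tied scores sharing the average of their ranks.

```python
import math


def pearson_r(x, y):
    """Pearson r for two equal-length sequences of paired scores."""
    n = len(x)
    if n != len(y) or n < 3:
        raise ValueError("need two sequences of equal length with at least 3 pairs")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("a variable with no variation has no correlation")
    return sxy / math.sqrt(sxx * syy)


def average_ranks(values):
    """Rank scores from 1 to n; tied scores get the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover every score tied with the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # positions are 0-based, ranks are 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman rho: the Pearson r of the ranked scores."""
    return pearson_r(average_ranks(x), average_ranks(y))


def correlation_df(n_pairs):
    """Degrees of freedom for a correlation: N - 2."""
    return n_pairs - 2


# Hypothetical scores, not the HEIGHT.sav data used in the text.
heights = [60, 62, 65, 68, 70]
weights = [105, 120, 118, 150, 160]
print(f"r({correlation_df(len(heights))}) = {pearson_r(heights, weights):.3f}")
print(f"rho({correlation_df(len(heights))}) = {spearman_rho(heights, weights):.3f}")
```

Because rho uses only the ranks, it does not require normally distributed scores, which is why the text recommends it when the Pearson assumptions are not met.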
  49. 49. Chapter 5 Prediction and Association

SPSS Data Format
Two variables are required in your SPSS data file. Each subject must provide data for both variables.

Running the Command
Click Analyze, then Correlate, then Bivariate. This will bring up the main dialog box for Bivariate Correlations (just like the Pearson correlation). About halfway down the dialog box, there is a section for indicating the type of correlation you will compute. You can select as many correlations as you want. For our example, remove the check in the Pearson box (by clicking on it) and click on the Spearman box.

[Screenshot: Analyze menu and the Correlation Coefficients section of the Bivariate Correlations dialog box]

Use the variables HEIGHT and WEIGHT from our HEIGHT.sav data file (Chapter 4). This is also one of the few commands that allows you to choose a one-tailed test, if desired.

Reading the Output
The output is essentially the same as for the Pearson correlation. Each pair of variables has its correlation coefficient indicated twice. The Spearman rho can range from -1.0 to +1.0, just like the Pearson r.

The output for this example indicates a correlation of .883 between HEIGHT and WEIGHT. Note the significance level of .000, shown in the "Sig. (2-tailed)" row. This is, in fact, a significance level of < .001. The actual significance value rounds to .000, but it is not zero.

Drawing Conclusions
The correlation will be between -1.0 and +1.0. Scores close to 0.0 represent a weak relationship. Scores close to 1.0 or -1.0 represent a strong relationship. Significant correlations are flagged with asterisks. A significant correlation indicates a reliable relationship, but not necessarily a strong correlation. With enough participants, a very small correlation can be significant. Generally, correlations greater than 0.7 are considered strong. Correlations less than 0.3 are considered weak. Correlations between 0.3 and 0.7 are considered moderate.
Correlations
                                                    HEIGHT     WEIGHT
Spearman's rho   HEIGHT   Correlation Coefficient   1.000      .883**
                          Sig. (2-tailed)                      .000
                          N                         16         16
                 WEIGHT   Correlation Coefficient   .883**     1.000
                          Sig. (2-tailed)           .000
                          N                         16         16
**. Correlation is significant at the .01 level (2-tailed).
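The rules of thumb for reading a correlation (strong above 0.7, weak below 0.3, moderate in between, with * and ** marking significance) can be written out as a short sketch. The function names are hypothetical, and the cutoffs are simply the ones quoted in this chapter, applied to the absolute value of the coefficient so that negative correlations are graded the same way.

```python
def describe_strength(r):
    """Label a correlation using the chapter's 0.3 and 0.7 cutoffs."""
    size = abs(r)  # -.644 is as strong as +.644; the sign only gives direction
    if size > 0.7:
        return "strong"
    if size < 0.3:
        return "weak"
    return "moderate"


def significance_flag(p):
    """Mimic the asterisks in the output: ** for p <= .01, * for p < .05."""
    if p <= 0.01:
        return "**"
    if p < 0.05:
        return "*"
    return ""


# Coefficients and significance values quoted in Sections 5.1 and 5.2.
# SPSS prints .000 for very small values; 0.0005 here is a stand-in for "< .001".
for label, r, p in [("height/weight (rho)", 0.883, 0.0005),
                    ("height/sex (r)", -0.644, 0.007),
                    ("ID/GRADE (r)", 0.217, 0.783)]:
    print(f"{label}: {r:+.3f}{significance_flag(p)} is {describe_strength(r)}")
```

Strength and significance are separate judgments: a correlation can carry asterisks and still be weak when the sample is large.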
  50. 50. Chapter 5 Prediction and Association

Phrasing Results That Are Significant
In the example above, we obtained a correlation of .883 between HEIGHT and WEIGHT. A correlation of .883 is a strong positive correlation, and it is significant at the .001 level. Thus, we could state the following in a results section:

A Spearman rho correlation coefficient was calculated for the relationship between participants' height and weight. A strong positive correlation was found (rho(14) = .883, p < .001), indicating a significant relationship between the two variables. Taller participants tend to weigh more.

The conclusion states the direction (positive), strength (strong), value (.883), degrees of freedom (14), and significance level (< .001) of the correlation. In addition, a statement of direction is included (taller is heavier). Note that the degrees of freedom given in parentheses is 14. The output indicates an N of 16. For a correlation, the degrees of freedom is N - 2.

Phrasing Results That Are Not Significant
Using our SAMPLE.sav data set from the previous chapters, we could calculate a Spearman rho correlation between ID and GRADE. If so, we would get the output at right. The correlation coefficient equals .000 and has a significance level of 1.000. Note that this value is rounded and is not, in fact, exactly 1.000. We could state the following in a results section:

A Spearman rho correlation coefficient was calculated for the relationship between a subject's ID number and grade. An extremely weak correlation that was not significant was found (rho(2) = .000, p > .05). ID number is not related to grade in the course.

Practice Exercise
Use Practice Data Set 2 in Appendix B. Determine the strength of the relationship between salary and job classification by calculating the Spearman rho correlation.

Section 5.3 Simple Linear Regression

Description
Simple linear regression allows the prediction of one variable from another.

Assumptions
Simple linear regression assumes that both variables are interval- or ratio-scaled. In addition, the dependent variable should be normally distributed around the prediction line.
This, of course, assumes that the variables are related to each other linearly. Typically, both variables should be normally distributed. Dichotomous variables (variables with only two levels) are also acceptable as independent variables.

Correlations
                                                  ID        GRADE
Spearman's rho   ID      Correlation Coefficient  1.000     .000
                         Sig. (2-tailed)                    1.000
                         N                        4         4
                 GRADE   Correlation Coefficient  .000      1.000
                         Sig. (2-tailed)          1.000
                         N                        4         4

  51. 51. Chapter 5 Prediction and Association

SPSS Data Format
Two variables are required in the SPSS data file. Each subject must contribute to both values.

Running the Command
Click Analyze, then Regression, then Linear. This will bring up the main dialog box for Linear Regression. On the left side of the dialog box is a list of the variables in your data file (we are using the HEIGHT.sav data file from the start of this section). On the right are blocks for the dependent variable (the variable you are trying to predict), and the independent variable (the variable from which we are predicting).

[Screenshot: Analyze menu with Regression, then Linear selected, and the Linear Regression dialog box]

We are interested in predicting someone's weight on the basis of his or her height. Thus, we should place the variable WEIGHT in the dependent variable block and the variable HEIGHT in the independent variable block. Then we can click OK to run the analysis.

Reading the Output
For simple linear regressions, we are interested in three components of the output. The first is called the Model Summary, and it occurs after the Variables Entered/Removed section. For our example, you should see this output.

Model Summary
Model   R         R Square   Adjusted R Square   Std. Error of the Estimate
1       .806(a)   .649       .624                16.14801
a. Predictors: (Constant), height

R Square (called the coefficient of determination) gives you the proportion of the variance of your dependent variable (WEIGHT) that can be explained by variation in your independent variable (HEIGHT). Thus, 64.9% of the variation in weight can be explained by differences in height (taller individuals weigh more).

The standard error of estimate gives you a measure of dispersion for your prediction equation. When the prediction equation is used, 68% of the data will fall within
  52. 52. Chapter 5 Prediction and Association

one standard error of estimate (predicted) value. Just over 95% will fall within two standard errors. Thus, in the previous example, 95% of the time, our estimated weight will be within 32.296 pounds of being correct (i.e., 2 x 16.148 = 32.296).

ANOVA(b)
Model            Sum of Squares   df   Mean Square   F        Sig.
1   Regression   6760.323         1    6760.323      25.926   .000(a)
    Residual     3650.614         14   260.758
    Total        10410.938        15
a. Predictors: (Constant), HEIGHT
b. Dependent Variable: WEIGHT

The second part of the output that we are interested in is the ANOVA summary table, as shown above. The important number here is the significance level in the rightmost column. If that value is less than .05, then we have a significant linear regression. If it is larger than .05, we do not.

The final section of the output is the table of coefficients. This is where the actual prediction equation can be found.

Coefficients(a)
                 Unstandardized Coefficients   Standardized Coefficients
Model            B           Std. Error        Beta                        t        Sig.
1   (Constant)   -234.681    71.552                                        -3.280   .005
    height       5.434       1.067             .806                        5.092    .000
a. Dependent Variable: weight

In most texts, you learn that Y' = a + bX is the regression equation. Y' (pronounced "Y prime") is your dependent variable (primes are normally predicted values or dependent variables), and X is your independent variable. In SPSS output, the values of both a and b are found in the B column. The first value, -234.681, is the value of a (labeled Constant). The second value, 5.434, is the value of b (labeled with the name of the independent variable). Thus, our prediction equation for the example above is WEIGHT' = -234.681 + 5.434(HEIGHT). In other words, the average subject who is an inch taller than another subject weighs 5.434 pounds more. A person who is 60 inches tall should weigh -234.681 + 5.434(60) = 91.359 pounds. Given our earlier discussion of standard error of estimate, 95% of individuals who are 60 inches tall will weigh between 59.063 (91.359 - 32.296 = 59.063) and 123.655 (91.359 + 32.296 = 123.655) pounds.
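The arithmetic in this example can be checked with a short Python sketch. The coefficients, standard error, and sums of squares are copied from the output described above; the function names are hypothetical, and the band of two standard errors is the rough 95% rule from the text, not a formal prediction interval.

```python
# Values copied from the Coefficients and Model Summary output above.
INTERCEPT = -234.681              # a, the row labeled (Constant)
SLOPE = 5.434                     # b, the row labeled height
STD_ERROR_OF_ESTIMATE = 16.148    # rounded as in the text


def predict_weight_from_height(height_inches):
    """Apply the prediction equation WEIGHT' = a + b(HEIGHT)."""
    return INTERCEPT + SLOPE * height_inches


def two_standard_error_band(predicted, std_error):
    """Return (low, high): the prediction plus or minus two standard errors."""
    margin = 2 * std_error
    return predicted - margin, predicted + margin


def f_and_r_square(ss_regression, ss_residual, df_regression, df_residual):
    """Recompute F and R Square from the ANOVA sums of squares."""
    f_ratio = (ss_regression / df_regression) / (ss_residual / df_residual)
    r_square = ss_regression / (ss_regression + ss_residual)
    return f_ratio, r_square


predicted = predict_weight_from_height(60)
low, high = two_standard_error_band(predicted, STD_ERROR_OF_ESTIMATE)
f_ratio, r_square = f_and_r_square(6760.323, 3650.614, 1, 14)

print(f"predicted weight at 60 inches: {predicted:.3f}")
print(f"two-standard-error band: {low:.3f} to {high:.3f}")
print(f"F = {f_ratio:.3f}, R Square = {r_square:.3f}")
```

Recomputing F and R Square from the sums of squares is a quick way to confirm that the ANOVA table and the Model Summary describe the same analysis.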
  53. 53. Chapter 5 Prediction and Association

Drawing Conclusions
Conclusions from regression analyses indicate (a) whether or not a significant prediction equation was obtained, (b) the direction of the relationship, and (c) the equation itself.

Phrasing Results That Are Significant
In the examples on pages 46 and 47, we obtained an R Square of .649 and a regression equation of WEIGHT' = -234.681 + 5.434(HEIGHT). The ANOVA resulted in F = 25.926 with 1 and 14 degrees of freedom. The F is significant at the less than .001 level. Thus, we could state the following in a results section:

A simple linear regression was calculated predicting participants' weight based on their height. A significant regression equation was found (F(1,14) = 25.926, p < .001), with an R² of .649. Participants' predicted weight is equal to -234.68 + 5.43(HEIGHT) pounds when height is measured in inches. Participants' average weight increased 5.43 pounds for each inch of height.

The conclusion states the direction (increase), strength (.649), value (25.926), degrees of freedom (1,14), and significance level (< .001) of the regression. In addition, a statement of the equation itself is included.

Phrasing Results That Are Not Significant
If the ANOVA is not significant (e.g., see the output at right), the section of the output labeled Sig. for the ANOVA will be greater than .05, and the regression equation is not significant. A results section might include the following statement:

A simple linear regression was calculated predicting participants' ACT scores based on their height. The regression equation was not significant (F(1,14) = 4.12, p > .05) with an R² of .227. Height is not a significant predictor of ACT scores.

[Output: Model Summary, ANOVA, and Coefficients tables for the regression predicting ACT from height]

Note that for results that are not significant, the ANOVA results and R² results are given, but the regression equation is not.

Practice Exercise
Use Practice Data Set 2 in Appendix B. If we want to predict salary from years of education, what salary would you predict for someone with 12 years of education? What salary would you predict for someone with a college education (16 years)?
  54. 54. Chapter 5 Prediction and Association

Section 5.4 Multiple Linear Regression

Description
The multiple linear regression analysis allows the prediction of one variable from several other variables.

Assumptions
Multiple linear regression assumes that all variables are interval- or ratio-scaled. In addition, the dependent variable should be normally distributed around the prediction line. This, of course, assumes that the variables are related to each other linearly. All variables should be normally distributed. Dichotomous variables are also acceptable as independent variables.

SPSS Data Format
At least three variables are required in the SPSS data file. Each subject must contribute to all values.

Running the Command
Click Analyze, then Regression, then Linear. This will bring up the main dialog box for Linear Regression. On the left side of the dialog box is a list of the variables in your data file (we are using the HEIGHT.sav data file from the start of this chapter). On the right side of the dialog box are blanks for the dependent variable (the variable you are trying to predict) and the independent variables (the variables from which you are predicting).

[Screenshot: Analyze menu with Regression, then Linear selected, and the Linear Regression dialog box]

We are interested in predicting someone's weight based on his or her height and sex. We believe that both sex and height influence weight. Thus, we should place the dependent variable WEIGHT in the Dependent block and the independent variables HEIGHT and SEX in the Independent(s) block. Enter both in Block 1.

This will perform an analysis to determine if WEIGHT can be predicted from SEX and/or HEIGHT. There are several methods SPSS can use to conduct this analysis. These can be selected with the Method box. Method Enter, the most widely
  55. 55. Chapter 5 Prediction and Association

used, puts all variables in the equation, whether they are significant or not. The other methods use various means to enter only those variables that are significant predictors. Click OK to run the analysis.

[Screenshot: Method box set to Enter]

Reading the Output
For multiple linear regression, there are three components of the output in which we are interested. The first is called the Model Summary, which is found after the Variables Entered/Removed section. For our example, you should get the following output.

Model Summary
Model   R         R Square   Adjusted R Square   Std. Error of the Estimate
1       .997(a)   .993       .992                2.29571
a. Predictors: (Constant), sex, height

R Square (called the coefficient of determination) tells you the proportion of the variance in the dependent variable (WEIGHT) that can be explained by variation in the independent variables (HEIGHT and SEX, in this case). Thus, 99.3% of the variation in weight can be explained by differences in height and sex (taller individuals weigh more, and men weigh more). Note that when a second variable is added, our R Square goes up from .649 to .993. The .649 was obtained using the Simple Linear Regression example in Section 5.3.

The Standard Error of the Estimate gives you a margin of error for the prediction equation. Using the prediction equation, 68% of the data will fall within one standard error of estimate (predicted) value. Just over 95% will fall within two standard errors of estimates. Thus, in the example above, 95% of the time, our estimated weight will be within 4.591 (2 x 2.29571) pounds of being correct. In our Simple Linear Regression example in Section 5.3, this number was 32.296. Note the higher degree of accuracy.

The second part of the output that we are interested in is the ANOVA summary table. For more information on reading ANOVA tables, refer to the sections on ANOVA in Chapter 6. For now, the important number is the significance in the rightmost column. If that value is less than .05, we have a significant linear regression. If it is larger than .05, we do not.

ANOVA(b)
Model            Sum of Squares   df   Mean Square   F         Sig.
1   Regression   10342.424        2    5171.212      981.202   .000(a)
    Residual     68.514           13   5.270
    Total        10410.938        15
a. Predictors: (Constant), sex, height
b. Dependent Variable: weight

The final section of output we are interested in is the table of coefficients. This is where the actual prediction equation can be found.
  56. 56. Chapter 5 Prediction and Association

Coefficients(a)
                 Unstandardized Coefficients   Standardized Coefficients
Model            B           Std. Error        Beta                        t         Sig.
1   (Constant)   47.138      14.843                                        3.176     .007
    height       2.101       .198              .312                        10.588    .000
    sex          -39.133     1.501             -.767                       -26.071   .000
a. Dependent Variable: weight

In most texts, you learn that Y' = a + bX is the regression equation. For multiple regression, our equation changes to Y' = B0 + B1X1 + B2X2 + ... + BzXz (where z is the number of independent variables). Y' is your dependent variable, and the Xs are your independent variables. The Bs are listed in a column. Thus, our prediction equation for the example above is WEIGHT' = 47.138 - 39.133(SEX) + 2.101(HEIGHT) (where SEX is coded as 1 = Male, 2 = Female, and HEIGHT is in inches). In other words, the average difference in weight for participants who differ by one inch in height is 2.101 pounds. Males tend to weigh 39.133 pounds more than females. A female who is 60 inches tall should weigh 47.138 - 39.133(2) + 2.101(60) = 94.932 pounds. Given our earlier discussion of the standard error of estimate, 95% of females who are 60 inches tall will weigh between 90.341 (94.932 - 4.591 = 90.341) and 99.523 (94.932 + 4.591 = 99.523) pounds.

Drawing Conclusions
Conclusions from regression analyses indicate (a) whether or not a significant prediction equation was obtained, (b) the direction of the relationship, and (c) the equation itself. Multiple regression is generally much more powerful than simple linear regression. Compare our two examples.

With multiple regression, you must also consider the significance level of each independent variable. In the example above, the significance level of both independent variables is less than .001.
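As a check on the worked example, the multiple regression equation can be evaluated in Python. The coefficients and the standard error of the estimate are taken from the output above; the function name and the error handling for the SEX code are assumptions added for illustration.

```python
# B values from the Coefficients table; SEX is coded 1 = Male, 2 = Female.
B_CONSTANT = 47.138
B_HEIGHT = 2.101
B_SEX = -39.133
STD_ERROR_OF_ESTIMATE = 2.29571   # from the Model Summary table


def predict_weight_from_height_and_sex(height_inches, sex_code):
    """Apply WEIGHT' = B0 + B1(HEIGHT) + B2(SEX)."""
    if sex_code not in (1, 2):
        raise ValueError("SEX must be coded 1 (male) or 2 (female)")
    return B_CONSTANT + B_HEIGHT * height_inches + B_SEX * sex_code


female_60 = predict_weight_from_height_and_sex(60, 2)
male_60 = predict_weight_from_height_and_sex(60, 1)
margin = 2 * STD_ERROR_OF_ESTIMATE   # the rough 95% band used in the text

print(f"female, 60 inches: {female_60:.3f} pounds")
print(f"band: {female_60 - margin:.3f} to {female_60 + margin:.3f} pounds")
print(f"male minus female at the same height: {male_60 - female_60:.3f} pounds")
```

Because SEX enters the equation as a 1 or 2 code, its coefficient is simply the predicted difference between the two groups at any fixed height.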
Phrasing Results That Are Significant
In our example, we obtained an R Square of .993 and a regression equation of WEIGHT' = 47.138 - 39.133(SEX) + 2.101(HEIGHT). The ANOVA resulted in F = 981.202 with 2 and 13 degrees of freedom. F is significant at the less than .001 level. Thus, we could state the following in a results section:
  57. 57. Chapter 5 Prediction and Association

A multiple linear regression was calculated to predict participants' weight based on their height and sex. A significant regression equation was found (F(2,13) = 981.202, p < .001), with an R² of .993. Participants' predicted weight is equal to 47.138 - 39.133(SEX) + 2.101(HEIGHT), where SEX is coded as 1 = Male, 2 = Female, and HEIGHT is measured in inches. Participants' predicted weight increased 2.101 pounds for each inch of height, and males weighed 39.133 pounds more than females. Both sex and height were significant predictors.

The conclusion states the direction (increase), strength (.993), value (981.20), degrees of freedom (2,13), and significance level (< .001) of the regression. In addition, a statement of the equation itself is included. Because there are multiple independent variables, we have noted whether or not each is significant.

Phrasing Results That Are Not Significant
If the ANOVA does not find a significant relationship, the Sig. section of the output will be greater than .05, and the regression equation is not significant. A results section for the output at right might include the following statement:

A multiple linear regression was calculated predicting participants' ACT scores based on their height and sex. The regression equation was not significant (F(2,13) = 2.511, p > .05) with an R² of .279. Neither height nor sex is a significant predictor of ACT scores.

[Output: Model Summary, ANOVA, and Coefficients tables for the regression predicting ACT from height and sex]

Note that for results that are not significant, the ANOVA results and R² results are given, but the regression equation is not.

Practice Exercise
Use Practice Data Set 2 in Appendix B.
Determine the prediction equation for predicting salary based on education, years of service, and sex. Which variables are significant predictors? If you believe that men were paid more than women were, what would you conclude after conducting this analysis?
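The results statements in this chapter follow a fixed pattern: the statistic, its degrees of freedom in parentheses, the value, and a significance statement. A minimal Python sketch of that pattern is given below. The helper names are hypothetical, and the intermediate cutoffs ("p < .01" and "p < .05") are an assumption, since the examples in the text show only "p < .001" and "p > .05".

```python
def strip_leading_zero(value, decimals=3):
    """Format a coefficient the way the text prints it: .806, not 0.806."""
    text = f"{value:.{decimals}f}"
    if text.startswith("0."):
        return text[1:]
    if text.startswith("-0."):
        return "-" + text[2:]
    return text


def format_p(p):
    """Turn a significance value into the phrase used in a results section."""
    for cutoff in (0.001, 0.01, 0.05):
        if p < cutoff:
            return f"p < {strip_leading_zero(cutoff)}"
    return "p > .05"


def report_correlation(symbol, r, n_pairs, p):
    """Example: r(14) = .806, p < .001 (degrees of freedom are N - 2)."""
    return f"{symbol}({n_pairs - 2}) = {strip_leading_zero(r)}, {format_p(p)}"


def report_f(f_ratio, df_regression, df_residual, p):
    """Example: F(1,14) = 25.926, p < .001."""
    return f"F({df_regression},{df_residual}) = {f_ratio:.3f}, {format_p(p)}"


# SPSS prints .000 for very small values; 0.0004 is a stand-in for "< .001".
print(report_correlation("r", 0.806, 16, 0.0004))
print(report_correlation("r", 0.217, 4, 0.783))
print(report_f(25.926, 1, 14, 0.0004))
```

Writing the pattern down once makes it easier to keep the degrees of freedom consistent: N - 2 for a correlation, and the regression and residual df from the ANOVA table for F.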
  58. 58. Chapter 6 Parametric Inferential Statistics

Parametric statistical procedures allow you to draw inferences about populations based on samples of those populations. To make these inferences, you must be able to make certain assumptions about the shape of the distributions of the population samples.

Section 6.1 Review of Basic Hypothesis Testing

The Null Hypothesis

In hypothesis testing, we create two hypotheses that are mutually exclusive (i.e., both cannot be true at the same time) and all inclusive (i.e., one of them must be true). We refer to those two hypotheses as the null hypothesis and the alternative hypothesis. The null hypothesis generally states that any difference we observe is caused by random error. The alternative hypothesis generally states that any difference we observe is caused by a systematic difference between groups.

Type I and Type II Errors

All hypothesis testing attempts to draw conclusions about the real world based on the results of a test (a statistical test, in this case). There are four possible combinations of results (see the figure at right). Two of the possible results are correct test results. The other two results are errors. A Type I error occurs when we reject a null hypothesis that is, in fact, true, while a Type II error occurs when we fail to reject the null hypothesis that is, in fact, false.

Significance tests determine the probability of making a Type I error. In other words, after performing a series of calculations, we obtain the probability of observing a result at least as extreme as ours if the null hypothesis is true. If that probability is low, such as 5 or less in 100 (.05), by convention, we reject the null hypothesis. In other words, we typically use the .05 level (or less) as the maximum Type I error rate we are willing to accept.
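The decision rule and the four possible outcomes described above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not part of SPSS: the names `reject_null` and `classify_outcome` are hypothetical, and in real research the true state of the null hypothesis is unknown, so `classify_outcome` is only a way of naming the four combinations.

```python
ALPHA = 0.05  # conventional maximum Type I error rate described in the text


def reject_null(p_value: float, alpha: float = ALPHA) -> bool:
    """Reject the null hypothesis when p is at or below alpha (5 or less in 100)."""
    if not 0.0 <= p_value <= 1.0:
        raise ValueError("p_value must be between 0 and 1")
    return p_value <= alpha


def classify_outcome(null_is_true: bool, rejected: bool) -> str:
    """Name one of the four combinations of test result and real world."""
    if rejected and null_is_true:
        return "Type I error"    # rejected a null hypothesis that is true
    if not rejected and not null_is_true:
        return "Type II error"   # failed to reject a null hypothesis that is false
    return "No error"


if __name__ == "__main__":
    # Hypothetical p values chosen only to show both decisions.
    for p in (0.001, 0.20):
        decision = "reject" if reject_null(p) else "fail to reject"
        print(f"p = {p}: {decision} the null hypothesis")
```

Separating the decision (`reject_null`) from the real-world state keeps the sketch honest about what a significance test can tell us: it controls how often a Type I error occurs when the null hypothesis is true, but it cannot reveal which of the four outcomes actually happened.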
[Figure: possible outcomes of a hypothesis test]

                               REAL WORLD
TEST RESULT                    Null Hypothesis True    Null Hypothesis False
Reject null hypothesis         Type I Error            No Error
Fail to reject null hypothesis No Error                Type II Error

When there is a low probability of a Type I error, such as .05, we can state that the significance test has led us to "reject the null hypothesis." This is synonymous with saying that a difference is "statistically significant." For example, on a reading test, suppose you found that a random sample of girls from a school district scored higher than a random