spss intro

  1. 1. SPSS for Windows Presented by: Office of Information Technology (OIT) Written by: William Dardick
  2. 2. SPSS for Windows Brief Outline 1. Introduction a. Windows i. Data Editor ii. Output iii. Syntax 2. The Basics a. Menu b. Tool Bar c. Opening Files d. Saving Files 3. Working with Data a. Data Entry b. Data Transformation c. Data Set Manipulation 4. Analyze Data a. Descriptive Statistics b. Compare Means c. Correlate d. Regression 5. HELP!!!
  3. 3. Introduction to SPSS version 13 (this document still holds for most of v 15) The objective of this course is to teach the user the basics of SPSS version 13 in a Windows setting. Most tasks in SPSS can be accomplished through the pull-down menus. The course focuses on the basic functions of SPSS, such as entering or importing data, creating variables, and transforming and analyzing data. For more information on SPSS for Windows, please use the Help menu or refer to the detailed user and applications guides created by SPSS for version 13 (SPSS Base Applications, Base User, Regression Models, Interactive Graphics, Advanced Models). You are also well advised to consult statistical texts for questions regarding analysis. General Overview The majority of work done in SPSS for Windows is performed in the Data Editor and the Viewer (output) windows. SPSS syntax is not the focus of this course, but a brief overview is given as a foundation for future programming. It is not necessary to know syntax to use SPSS for Windows. Dialog boxes, small windows without menus, pop up when most menu functions are used. Understanding these boxes will aid in the successful use of SPSS. Below is the Data Editor with the Open File dialog box. This is what the general framework of SPSS for Windows looks like.
  4. 4. Data Editor When working with SPSS, the two most important windows to understand are the Data Editor and the Viewer. The Data Editor looks like most other spreadsheets you may have encountered. It is designed specifically for transforming and sorting data. When a data file is opened, the title of the Data Editor window changes to the file's name. The Data Editor is where you create new files or import existing files.
  5. 5. Viewer The Viewer is the output window for SPSS. It displays tables, charts and other graphics that have been created through SPSS. Output in the Viewer window can be edited with cut, paste and other common options from the Edit menu. If Print is selected from the Viewer window, the entire content of the window will print. To print only specific sections, highlight the desired output first. Because tables and charts can be cut and pasted from the Viewer window, it is a convenient place to start assembling a report of results.
  6. 6. Syntax Editor There are two ways to work in SPSS: through the pull-down menus or by writing syntax. This course focuses on the Windows pull-down version and does not deal with writing syntax. Syntax is programming in SPSS. The Syntax window allows you to write and save syntax commands. To open a new Syntax file click on File, New, Syntax. To open existing Syntax files go to File, Open, Syntax. The Open File dialog box will appear. Make sure Files of Type has Syntax (*.sps) selected.
  7. 7. Quick Guide for learning Syntax If you need to learn the syntax for some procedure that you have performed in SPSS for Windows, you can always open the SPSS journal. It keeps a record, in syntax, of everything you have done during the session. Select File > Open > Syntax. Go to the C drive, open the Windows folder, then open the temp folder. Type *.jnl in the File name box and open the SPSS folder. Copy and paste the syntax into a new Syntax window. Another way to start learning syntax as you use the Windows format is to set up your output to display syntax. Select Edit > Options. The Options dialog window has several tabs. Select Draft Viewer. Check Display commands in the log. Click the OK button at the bottom of the box. Your output will now display syntax commands. This option is particularly useful for learning data manipulation and analysis syntax.
  8. 8. The Main Menu The Main Menu is located at the top of the window underneath the title bar. This menu has a variety of functions. The File menu gives access to open or close files, save files, change the page setup, or print the information in the file being used. The Edit menu has the typical cut and paste commands. The Data menu can change the way your cases and variables are organized. You can split, sort, and merge files in this menu. Transform is a useful menu for creating new variables from old variables or manipulating existing variables. Analyze is the statistical menu. There are dozens of statistics to choose from in this menu. T tests, Correlations and Regressions are just a few of your many options. The Graphs menu can show data in a visual form. Some of its options are bar, line and area graphs. Utilities is a menu that holds information about the file or variables. The Window menu allows you to switch from one window to another. Help is a very useful menu for people learning SPSS or having difficulty with analysis. It holds the table of contents, tutorials, and Coaches for statistics and results.
  9. 9. Tool Bar The Tool Bar can be an easy way to perform many functions in SPSS. Becoming familiar with the main menu is important, but some things are much quicker and easier to do by selecting an icon on the tool bar. Here are the standard icons for the tool bar in the SPSS Data Editor. To Open a file use the folder icon. It will bring up the Open File dialog box. To Save a file use the disk icon. To Print the information in the window select the printer icon. To Recall a dialog box, click the mini dialog box icon. Undo is the curved arrow which points to your right; it takes you back one action. Go to Chart selects a chart if one is available. Go to Case allows you to select a case. Variables allows you to locate a given variable and gives you information on it. Find allows you to find the next occurrence of a given score in a data set. Insert Case allows you to insert a case into the data set. Insert Variable allows you to insert a variable into the data set. Split File gives you different ways to split the data file and still perform analyses on each group of data. Weight Cases gives cases different weights for statistical analysis. Select Cases can set up subgroups of cases. Value Labels can assign descriptive labels to each value in a variable. Use Sets can restrict which variables are displayed in the dialog boxes.
  10. 10. Opening Data Files Opening data in SPSS is relatively simple. By selecting File > Open > Data you can easily access SPSS files. The Open File dialog box opens and is set by default to open SPSS (*.sav) data files. With the Open File dialog box you can also open spreadsheets, such as Excel or Lotus files, without complications. Text (.txt/.dat) files are easy to open as well: select the desired text file and the Text Import Wizard will appear and guide you through importing the data. Understanding the format of the file is important for the file to be properly read by SPSS. The file will be displayed at the bottom of the window. When the Text Import Wizard opens, the first page will ask if your file matches a predefined format. If you have previously saved a format from the Text Wizard and wish to use the same format, select this option and search for your format. To
  11. 11. create a format for your text, select No and follow the steps presented by the Text Wizard by selecting the Next option. The Text Wizard will now walk you through steps 2-6. If your variables are separated by commas, tabs, etc… select the Delimited option. If variables are aligned in fixed columns, use the Fixed width option. Make sure to answer Yes to the second question if you have already named your variables in the text file. The text file is previewed at the bottom of the window to assist with selecting options. To open a database, select File > Open Database. The Database Wizard window will appear and assist with opening the database. Opening database files may require driver installation (if not already installed) or a login ID and password.
  12. 12. Example: Transferring Files from Access to SPSS. Select Open Database, then New Query. The Database Wizard dialog box opens. SELECT Add Data Source. The ODBC Administrator dialog box opens. SELECT the User DSN tab. SELECT the Add… button. SELECT the Microsoft Access Driver (*.mdb) by double clicking it. Name the data source and describe it. The Database section can help you find the location of the file: the Select… button will open a file browser. Search the directories for Access files. SELECT a file and SELECT OK.
  13. 13. The link should now be present under User Data Sources in the User DSN tab. SELECT OK. You should now be back at the Database Wizard. SELECT your newly created source. SELECT Next to choose which tables to add. Drag the tables you want into the Retrieve Fields… list. Select Next. Limit Retrieved Cases if desired. Select Next. Define your variables. Select Next. Check your results and select Finish.
  14. 14. Saving Files SPSS requires a data set. If you are not importing data from another source or a previous SPSS session, you will need to create a new data set. When your data set is complete, or you simply want to stop and come back to the same data later, you will need to save your file. To save a Data Editor file in SPSS, go to File > Save. The Save Data As dialog box will open. To save as a regular SPSS data file, simply type in the File name and click Save. It will be saved as an SPSS (*.sav) file. You can also save the data as other types of files. Go to File > Save As. The Save As dialog box will open.
  15. 15. Go to Save as Type and select another type of file to save the data in. If you know the file's extension, such as (*.xls) for Excel, you can simply type it at the end of your file name, e.g. SPSStestfile.xls. To save output from the Viewer, first make sure you are in the Viewer window. Then you can save the same way as you did with the Data Editor file. Naming the output will keep it from being saved under the default file name of Output#. The type of file should be Viewer Files (*.spo).
  16. 16. Entering Variables and New Data Variables Starting with a brief understanding of variables can be helpful when working with a data set. What is a variable? One of the most basic concepts to understand in research is that of variables. A variable is a construct, characteristic, or property capable of taking on different values. Variables are traits that can vary from person to person or thing to thing. “A variable is a symbol that can be replaced by any one of the elements of some specified set. The particular set is called the range of the variable.” (William Hays, Statistics for the Social Sciences, 1973) The sets, scores, numbers, or values of a variable can be of two types: discrete or continuous. A discrete variable can take only a finite set of distinct scores. A continuous variable can take any value within a range, so there is some measurable difference between cases; such variables are ordinal, interval or ratio in value. The most basic distinction drawn between variables is between dependent and independent variables. When a researcher manipulates a variable it is said to be the independent variable. The variable measured in the experiment is the dependent variable. Another way of stating the difference between the two variable types is that the independent variable is the cause and the dependent variable is the effect. In many analyses the distinction can be seen through symbols: X is considered the independent variable and Y the dependent variable. X is also referred to as the predictor variable and Y as the predicted variable. Getting Started with Variables Before entering your data into the Data Editor it is advised that you define your variables. Defining variables can be done rather effectively by going to the bottom of your Data Editor page and selecting the Variable View tab. You will see at the top of the window Name, Type, Width, etc…
  17. 17. Select the first line and enter the variable's name into the Name column. Variable names cannot contain a blank space. Versions of SPSS before 12 limited names to eight characters; version 12 and later allow up to 64. By double clicking under the Type column, you can change the type of variable. A dialog box will open that allows you to select the variable type. The two most common types are string and numeric. A string variable is qualitative and typically a word, such as a person's name, religion or country. Numeric allows a variable to be used in all transformations and analyses. Other types of variables, such as Date or Dollar, can be selected through this window. Width sets how many characters or digits a value may have; SPSS has a default width of eight. Decimals sets the number of decimal places to which values are displayed. The default for SPSS is two decimal places for a numeric variable. Labeling your variables is very important in research. Knowing what a variable's name stands for is not always intuitive. The Label option allows for a detailed description of a variable. This is particularly useful when more than one person is working on a data set or when there will be long periods of time between uses of the data set. Going back to a data set a year later can be frustrating if you did not label the original variables. Values can be given to variables in order to compute information from what would normally be a string variable. For example: one of your variables is gender, but you would like to do computations with gender as a numeric value. Select a value for Male and a separate value for Female. In this case, Female will be 1 and Male will be 2. Under the Values column in the Variable View, select the variable you are going to use, in this
  18. 18. case Gender, and select the Values box that references this variable. When the Values box for Gender is selected, double click the box with three dots that appears in the Values column. A dialog box called Value Labels pops up. Select the box for Value and type in 1. Next to Value Label type ‘Female’. Select Add. Now whenever you have a 1 for this variable it will carry the value label Female. Do the same thing for Male with the value of 2 and select Add. Select OK. Value labels have been added. Toggle to the Data View and select the Value Labels icon on the tool bar. You can select and deselect that icon to display and remove value labels. The Variable View can also be used to indicate Missing values. Select the Missing box for the appropriate variable. Double click the grey box with three dots. The Missing Values dialog box opens. Three options are given: no missing values, three discrete missing values, or a range of missing values plus one discrete value. The three discrete values can be any numeric value, any string value of 8 characters or less, or a blank string value. Select the Discrete missing values button. Type the appropriate value(s) into the box(es), select OK, and those values will be treated as missing in all computations. You can use missing values in conjunction with value labels to keep track of why a value is missing. To change the Columns size, either directly edit the number displayed or use the up or down buttons displayed when the Columns box is selected. Alignment of text can be changed to left, right, or center. Scales of measurement (Measure) Measure allows you to pick from Scale (interval and ratio), Ordinal, and Nominal. Nominal data is grouping data, ordinal data is ranked or ordered without even intervals, and scale data has equal intervals regardless of an absolute zero. Scales of Measurement explained When a number or name is assigned to an observation it becomes some form of measurement.
The type of measurement system used for observations can dictate what types of statistical analysis can be performed on the data. In statistics there are four separate scales of measurement. The nominal scale is qualitative. The categories formed are mutually exclusive and exhaustive. This means that no observation can fall into more than one category and that there must be enough categories to include all of the observations. This type of variable is considered categorical because it forms categories. The difference between values of the variable cannot be said to be meaningful. The difference between genders is not quantifiable. The qualities in the two genders may be different but they cannot be quantified. When determining if a scale of measurement is nominal, think about the differences between values in a variable. If the differences are not able to be quantified or
  19. 19. have no meaningful mathematical distinction, you are probably working with a nominal scale. Qualitative differences like location, M&M color, type of coffee, or movie all have differences that aren’t measurable. When comparing these groups we are doing little more than naming them and putting observations in these named placeholders. How these qualities are named is only important for categorical purposes. We can label our M&M’s with numbers as long as there is the understanding that the numbers are there only to identify them. If we label M&M’s 1 for red, 2 for green, and 3 for brown, it does not mean that brown M&M’s have some greater value, or that the average of a red and a brown M&M is a green one. Qualitative values should never be manipulated in this way when labeling them with numbers. This does not mean that there are no ways to interpret the values of such categorical data. There are statistics that were created for the sole purpose of analyzing nominal data. As we will see later, statistics such as chi-square can be used to interpret nominal data. The ordinal scale is quantitative. It has mutually exclusive categories that are ranked in order of magnitude. When we rank order a variable we can tell that there is some sort of quantitative difference between values; however, nothing specifies that the differences between scores are equal. In a race, the difference between first and second place and the difference between third and fourth place do not need to be equal. They are simply ranked. First place is the fastest, then second, then third. It doesn’t matter if first and second almost tied and third was far behind them. When data can be ranked but the intervals between values are not equal, and the data also has the properties of the nominal scale (mutually exclusive and exhaustive), it is ordinal data.
Think of your favorite foods, pick 5 and place them in order, the first one being your absolute favorite food and the 5th being the least of your favorite foods. For example, I might order mine steamed crabs (1), barbeque ribs (2), clams casino (3), Szechwan shrimp (4), and pizza (5). Now when I look at my list of foods I can’t say exactly how much more I like steamed crabs than barbeque ribs. By adding pizza and Szechwan shrimp I don’t get the value of clams casino. There may be a large gap between steamed crabs and barbeque ribs, but almost no difference between the ribs and the clams. These differences between values help me to place them in a greater than, less than order, but they do not allow me to say anything quantitative about the magnitude (size) of the difference between scores. Ordinal data can be discrete or continuous. Nonparametric statistics are useful for the ordinal scale. Rank ordering can be informative, but sometimes you need to know something more about your data. The interval scale has all of the properties of the ordinal scale as well as having equal distance between scores. This means that the difference between any two adjacent values on the scale will be equal. The difference between 10 degrees Celsius and 11 degrees is the same as the difference between 100 and 101 degrees. The interval scale gives meaning to the magnitude between values. The ratio scale has all of the properties of the interval scale but also has an absolute zero. This absolute zero means that with a rating of zero there is a complete absence of the thing being measured. A zero on the Celsius scale does not mean you cannot have a colder
  20. 20. temperature. Kelvin temperature and weight in pounds are examples of scales with an absolute zero. If you have a weight of zero there is an absence of weight. An absolute zero also gives the scale another quality. When you have 25 pounds and 50 pounds you can say 50 pounds is twice as heavy as 25 pounds. You cannot say that 20 degrees Celsius is twice as warm as 10 degrees Celsius. A true zero means that there is no quantity of some particular trait. If you are weightless in space your weight is zero. There can be a height of 10 ft, 1000 ft, and 0 ft. If you can remove all value from a measure, and twice the value of one score means that it has twice the quantity, then you are working with a ratio scale.
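The difference between interval and ratio scales can be checked with a little arithmetic. The sketch below is only an illustration: the function name `celsius_to_kelvin` and the variable names are made up for this example, and the standard offset of 273.15 is used to move Celsius readings onto the Kelvin scale, which has an absolute zero.

```python
# Illustrative sketch: ratios are only meaningful on a scale with a true zero.

def celsius_to_kelvin(celsius):
    """Convert a Celsius reading to Kelvin (absolute zero is 0 K)."""
    return celsius + 273.15

cold_c, warm_c = 10.0, 20.0

# Dividing the Celsius readings suggests "twice as warm", which is misleading
# because 0 degrees Celsius is not an absence of temperature.
naive_ratio = warm_c / cold_c

# On the Kelvin scale the same two temperatures differ by only a few percent.
true_ratio = celsius_to_kelvin(warm_c) / celsius_to_kelvin(cold_c)

# Weight in pounds already has a true zero, so its ratio can be read directly.
pounds_ratio = 50 / 25

print(naive_ratio, round(true_ratio, 3), pounds_ratio)
```

The Celsius ratio of 2 is an artifact of where that scale places its zero, while the ratio of the two weights is a real statement about quantity.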
  21. 21. Entering Data SPSS works like most spreadsheets. To start entering data into SPSS, go to your first variable and first case. Type the datum into the outlined box. To move to the next case in the same variable, press Enter or the down arrow. To move across to another variable, press the left or right arrow. You may always use the mouse to select any box in the spreadsheet. Now try opening a file. It will appear as an SPSS (*.sav) file. To open, select File > Open, or use the open folder icon on the Tool Bar. With a file open you can edit the data simply by typing new cases or variables into the spreadsheet. If data in an existing variable or case needs to be replaced, simply select the datum in question and type in a new value.
  22. 22. Transform The Transform option in the main menu is a very useful tool for data analysis. This option allows for the creation of new variables or the transformation of old ones. Compute The Compute option allows for a multitude of transformations. Select the Compute option from the Transform menu. The Compute Variable dialog box will appear on the screen. It will look something like this: Under Target Variable you can name a new variable (eight characters or less in versions of SPSS before 12). By clicking on the Type & Label button you can give the variable a type and label as you would when creating a variable in the Data Editor. All of your existing variables are listed under the Type & Label button. The dialog box has a calculator pad and a list of functions which can be used to compute new variables.
  23. 23. Examples: Anxiety + Tension This adds together two variables that already exist in the data set to create a new variable. Anxiety*10-100 This multiplies the variable by 10 and then subtracts 100. Functions can be very useful in transforming your data into a new, useful variable. For example: Mean(test1, test2, test3) This function averages across variables and computes a new variable that is the mean for each case. Count Count returns a 1 for an occurrence of a value and a 0 for the absence of the value in the selected variable. Recoding The Recode option can transform variables into the same variable or a new variable. Recoding into the same variable This option allows you to eliminate the old values and replace them with the new recoded values. To use this option select Recode, Into Same Variables, under the Transform menu. The dialog box will have the variables in your data set on the left. Simply select the variable to recode and click on the arrow button. After selecting your variable, click the Old and New Values button; this will bring up a new dialog box to specify the recoding. Recoding into a different variable The main difference when recoding into a different variable is that you retain the original values in a separate variable. You must name the new variable, as in the Compute function, and label the variable. Selecting the Old and New Values button will bring up a dialog box that works in much the same way as it did in the Recoding into the same variable option.
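The Compute, Count and Recode examples above can be mimicked outside SPSS to see what each one produces case by case. The following is a rough sketch in Python rather than SPSS syntax; the two cases, their scores, and the new variable names (`stress`, `anx_scaled`, `test_avg`, `anx_is_15`, `anx_group`) are invented for illustration, as are the cut-off values used in the recode.

```python
from statistics import mean

# Two hypothetical cases (rows) with made-up scores.
cases = [
    {"anxiety": 12, "tension": 8, "test1": 70, "test2": 80, "test3": 90},
    {"anxiety": 15, "tension": 5, "test1": 60, "test2": 65, "test3": 70},
]

for case in cases:
    # Compute: Anxiety + Tension
    case["stress"] = case["anxiety"] + case["tension"]
    # Compute: Anxiety*10-100
    case["anx_scaled"] = case["anxiety"] * 10 - 100
    # Compute: Mean(test1, test2, test3), one mean per case
    case["test_avg"] = mean([case["test1"], case["test2"], case["test3"]])
    # Count-style flag: 1 if the value 15 occurs in anxiety, otherwise 0
    case["anx_is_15"] = 1 if case["anxiety"] == 15 else 0
    # Recode into a different variable: the original anxiety score is kept
    case["anx_group"] = 1 if case["anxiety"] < 14 else 2

for case in cases:
    print(case)
```

Because every result is written to a new key, the original scores remain available, which is the same reason recoding into a different variable is usually the safer choice.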
  25. 25. Data set manipulations The data set that you have may need to be manipulated before you can work with the data. The Data menu allows several types of useful manipulations. Splitting a data set can be accomplished by selecting the Data menu and clicking on Split File. The dialog box for Split File appears. If you select Compare groups, the data will be split as directed by the grouping variable. You can use the Organize output by groups option to view the results of any procedure separately for each group. Inserting new variables/cases can be accomplished through the Data menu or by right clicking on the mouse when selecting variables/cases. You can sort cases by selecting Sort Cases under the Data menu. Choose ascending or descending order to sort your cases on the selected variable.
  26. 26. Merging files can be accomplished by selecting Merge Files under the Data menu. You can choose to merge variables or cases. Select another SPSS file to merge with the current file. Transposing variables and cases may come in handy for certain research questions or when transferring data into SPSS from another system. Under the Data menu select Transpose. A dialog box will appear and give options for selecting the variables to be transposed.
  27. 27. After Transpose: (screenshot of the transposed data set not reproduced)
  28. 28. Analyze SPSS offers powerful statistical procedures in a relatively easy to use framework. This course provides an introduction to some of these analyses. It is assumed that you have some base level of knowledge in statistics prior to this course. A statistic should be used only when the researcher performing the analysis understands the analysis being performed. Poor understanding of statistical techniques often leads to incorrect interpretation of the results they yield. This section begins with an introduction to basic descriptive statistics. The middle road Central tendency is the middle of the road score: the one typical score used to represent the whole set of scores. There are three methods to derive the central score of a set of values: mode, median and mean. The mode is the most frequently occurring value. You might use this for a quick look at a data set to get a rough estimate of central values. Just by looking at a frequency distribution you can determine the most frequently occurring value. It is possible to have more than one mode for a distribution. When there are two modes the distribution is said to be bimodal. As a measure of central tendency the mode is very limited for descriptive purposes. Even though the mode is the most stable measure of central tendency when data is skewed, it is also the most susceptible to sampling error and variation, more so than the other measures of central tendency. The median is the middlemost score. When data is severely skewed you can get a good measure of central tendency with the median. The mean is the arithmetic average. It has the greatest amount of reliability. There will be less variability from sample to sample amongst means than amongst other measures of central tendency. Using the arithmetic mean allows for the use of a wide variety of other statistical applications. The mean is the measure most sensitive to outliers and distribution problems.
The mathematical definition of the mean is the sum of all the scores divided by the number of scores: $\bar{X} = \frac{\sum X}{N}$, where $X$ represents any score, $\sum$ is the symbol for “the sum of”, and $N$ is the total number of scores being summed for the set of values.
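To make the three measures of central tendency concrete, here is a small sketch using Python's standard `statistics` module. The scores are invented for the example, and `arithmetic_mean` is a hypothetical helper that simply spells out the sum-divided-by-N definition given above.

```python
from statistics import median, multimode

# Made-up scores; the second list adds one extreme score (an outlier).
scores = [2, 3, 3, 4, 5, 5, 5, 6]
with_outlier = scores + [40]

def arithmetic_mean(values):
    """Mean as defined above: the sum of the scores divided by N."""
    return sum(values) / len(values)

for label, data in (("original", scores), ("with outlier", with_outlier)):
    # multimode returns a list because a distribution can have several modes.
    print(label, arithmetic_mean(data), median(data), multimode(data))
```

In this example a single extreme score leaves the mode unchanged, shifts the median only slightly, and nearly doubles the mean.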
  29. 29. Effects on the mode, median, and mean when data is skewed: the mode stays constant, the median stays relatively stable, and the mean is most affected by skewed data and outliers. In the theoretical normal curve all three of these measures share the same value. Measures of dispersion A variable has scores that are likely to change. The characteristics of a single variable can be as diverse as the number of subjects measured. When you take into account that there can be thousands of variables measured from a single population or sample, each with its own dispersion, it is often not satisfactory to simply calculate the mean and look at differences. If we take two samples from a population, both with 10 people in each sample, and both with a mean height of 70 inches, we could conclude that these samples were equal. However, if we look at the raw scores we may not come to the same conclusion. The first sample pulled from the population consists of 10 subjects, each 70 inches tall. The second sample is more diverse: 50, 55, 63, 68, 72, 74, 76, 78, 79, and 85. Deviation scores: $\sum (X - \bar{X}) = 0$. Sum of squares: $SS_x = \sum (X - \bar{X})^2$. Variance: $S^2 = \frac{\sum (X - \bar{X})^2}{n}$. Standard deviation: $S_x = \sqrt{\frac{\sum (X - \bar{X})^2}{n}}$. Semi-interquartile range: $\frac{Q_3 - Q_1}{2}$. The Range measures the difference between the minimum and maximum scores.
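The two height samples described above can be run through the dispersion formulas directly. The sketch below is an illustration in Python; the function name `describe` and the dictionary keys are invented for this example, and the variance divides by n as in the formulas on this slide (SPSS itself reports the sample variance, which divides by n - 1).

```python
from math import sqrt

def describe(scores):
    """Return mean, deviation scores, sum of squares, variance, SD and range."""
    n = len(scores)
    mean = sum(scores) / n
    deviations = [x - mean for x in scores]      # X minus the mean
    sum_of_squares = sum(d ** 2 for d in deviations)
    variance = sum_of_squares / n                # divides by n, as on the slide
    return {
        "mean": mean,
        "deviation_sum": sum(deviations),        # always 0 (up to rounding)
        "sum_of_squares": sum_of_squares,
        "variance": variance,
        "sd": sqrt(variance),
        "range": max(scores) - min(scores),
    }

sample_a = [70] * 10                                  # ten subjects, all 70 inches
sample_b = [50, 55, 63, 68, 72, 74, 76, 78, 79, 85]   # the more diverse sample

for name, sample in (("first sample", sample_a), ("second sample", sample_b)):
    print(name, describe(sample))
```

Both samples have a mean of 70 inches, but the first has no spread at all while the second has a sum of squares of 1104 and a standard deviation of roughly 10.5 inches.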
  30. 30. Descriptive Statistics The Analyze option on the main menu has many statistics that can be performed on your data. One of the most common options in the Analyze menu is the Descriptive Statistics procedure. By opening the Analyze menu you can select Descriptive Statistics. Under this option you have four types of analyses that can be performed. Frequency Distribution Looking at raw data can often be confusing. There is not much that can be said with certainty by looking at raw, unordered scores. The larger the data set, the more difficult it is to speak meaningfully about it. Today, more so than at any other time in history, we have huge databases. Databases with millions of records are commonplace. The near future will make billions commonplace, and on its coattails the trillions. What if a database kept track of each order at every supermarket in the world? The data would stay in the billions only if it were tracked by person; tracked by order, the number of visits would soon reach into the trillions. In the United States alone we could have 100 million people stop into the supermarket three times a week for a year. Quickly multiplying these values, 3 x 52 x 100 million, we get 15.6 billion visits to the supermarket that year. Imagine the data set when the number of people doing something worldwide is in the billions and they do it several times a week or day. Frequency distributions are one way of organizing data so that observations about the data set can be made. The frequency distribution shows the possible scores and the number of observations for each score. When working with a large distribution it is sometimes beneficial to group scores and look at the grouped distribution.
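A frequency distribution, a grouped distribution, and the supermarket arithmetic above can all be reproduced in a few lines. This is a sketch only: the ten scores and the choice of a group width of two are made up for the example.

```python
from collections import Counter

# Ten made-up raw scores, unordered as they might be entered.
scores = [3, 1, 2, 3, 3, 2, 5, 4, 3, 2]

# Frequency distribution: each possible score and its number of observations.
frequencies = Counter(scores)
for score in sorted(frequencies):
    print(score, frequencies[score])

# Grouped distribution with a group width of 2: scores 1-2, 3-4, 5-6.
def group_of(score, width=2):
    low = ((score - 1) // width) * width + 1
    return (low, low + width - 1)

grouped = Counter(group_of(s) for s in scores)
for group in sorted(grouped):
    print(group, grouped[group])

# The supermarket estimate: 3 visits a week x 52 weeks x 100 million people.
visits_per_year = 3 * 52 * 100_000_000
print(visits_per_year)
```

Grouping trades detail for readability: three groups are easier to scan than five separate scores, but the count for any individual score is no longer visible.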
  31. 31. Computing Frequencies Select the first analysis, labeled Frequencies. A dialog box will pop up with the heading Frequencies. In the left box all of the variables from your data set will be present. By selecting one or more of these variables and clicking the directional arrow shown in the window you can place them into the Variable(s): section of the window. The variable is now ready to be analyzed. Before doing so, the other options available to you in the Frequencies window need to be discussed. Select Display frequency tables. This will provide a frequency table in your output, or Viewer, window. On the bottom of your dialog box there are three options: Statistics, Charts, and Format. Select Statistics. The Frequencies: Statistics dialog box will open. The box is composed of several sections.
  32. 32. First is the Percentile Values section. There are several options to divide the data to obtain specific percentages above and below specific points. The Quartiles option will divide the data into 4 equal groups. Cut points allows for a number of equal groups other than four. Individual percentiles can be selected by typing the appropriate percentile in the Percentile(s) option. Select Quartiles. Central Tendency has four selections: Mean, Median, Mode and Sum. The Dispersion section is for measuring the amount of variability in the data set. The Values are group midpoints selection is used only if the data is midpoint coded. This selection will give estimates for the median and percentiles as if the data were ungrouped. Distribution is described by two statistics, Skewness and Kurtosis. Skewness gives a measure of symmetry for the data. Kurtosis measures whether the peak of the distribution is like that of a normal curve. In both cases a normal distribution will have a value of 0. A significant positive skew indicates a long right tail in the distribution. When a data set is skewed negatively it has a long left tail. A positive measure of kurtosis indicates clustering in the center with longer tails than a normal distribution. A negative kurtosis indicates shorter tails and less clustering than the normal curve. When finished selecting statistics for the Frequencies analysis click Continue. Now you are ready to run the analysis. Select OK. The output will now appear in the Viewer window.
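The sign conventions for skewness and kurtosis can be verified on small data sets. The sketch below uses the simple moment-based definitions; SPSS applies sample-size adjustments to both statistics, so its output will not match these numbers exactly. The function names and the two data sets are invented for illustration, and `statistics.quantiles` uses Python's own default interpolation method, which may differ from the one SPSS uses for quartiles.

```python
from statistics import mean, quantiles

def central_moment(scores, power):
    """Average of the deviations from the mean raised to the given power."""
    m = mean(scores)
    return sum((x - m) ** power for x in scores) / len(scores)

def skewness(scores):
    """Moment-based skewness: 0 for symmetric data, positive for a right tail."""
    return central_moment(scores, 3) / central_moment(scores, 2) ** 1.5

def excess_kurtosis(scores):
    """Moment-based kurtosis minus 3, so a normal curve scores 0."""
    return central_moment(scores, 4) / central_moment(scores, 2) ** 2 - 3

symmetric = [1, 2, 3, 4, 5]
right_tailed = [1, 2, 2, 3, 3, 3, 4, 10]

print(skewness(symmetric), excess_kurtosis(symmetric))
print(skewness(right_tailed), excess_kurtosis(right_tailed))

# Quartiles: three cut points that divide the data into four groups.
print(quantiles(right_tailed, n=4))
```

The evenly spread data set has zero skewness and a negative kurtosis (flatter than normal), while the data set with one large score has a positive skew, matching the long right tail described above.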
  33. Descriptives: Next under the Descriptive Statistics option is Descriptives. Selecting Descriptives opens a dialog box that is set up much like the Frequencies dialog box. The main difference is the Options button. Display Order is the only new option in Descriptives. It lets you choose the order in which the descriptive statistics are displayed in the output.
  34. Explore: Explore can be used to run descriptive statistics on data that are grouped. The Dependent List holds the variables you want to explore. The Factor List creates the groups. The options are not very different from those of the other descriptive procedures.
  35. Crosstabs: The Crosstabs option lets you count how often each combination of categories occurs, with each combination forming one cell of a larger table. One of the options in the Statistics dialog box lets you compute a chi-square statistic along with the analysis. There are some other useful analysis procedures that can be run through Explore.
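As an illustration of what a crosstabulation with a chi-square statistic computes, the following Python sketch counts the cells of a two-variable table and compares them with the counts expected if the variables were unrelated. The function names and the sample data are hypothetical; this is a sketch of the standard Pearson chi-square calculation, not a reproduction of SPSS output.

```python
from collections import Counter


def crosstab(row_values, col_values):
    """Count how often each (row category, column category) pair occurs."""
    return Counter(zip(row_values, col_values))


def chi_square_independence(row_values, col_values):
    """Pearson chi-square statistic for a two-way table of counts."""
    n = len(row_values)
    cells = crosstab(row_values, col_values)
    row_totals = Counter(row_values)
    col_totals = Counter(col_values)
    chi_square = 0.0
    for r, row_total in row_totals.items():
        for c, col_total in col_totals.items():
            # Expected count if row and column variables are independent.
            expected = row_total * col_total / n
            observed = cells[(r, c)]  # Counter returns 0 for empty cells
            chi_square += (observed - expected) ** 2 / expected
    return chi_square


# Hypothetical data: group membership and a yes/no answer for four cases.
group = ["a", "a", "b", "b"]
answer = ["yes", "yes", "no", "no"]
print(crosstab(group, answer))
print(chi_square_independence(group, answer))
```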
  36. Compare Means: The next selection in the Analyze menu is Compare Means. Select Compare Means. There are three types of t tests to choose from, as well as a Means option and One-Way ANOVA. The type of research comparison determines which t test should be performed. We will focus on t tests. The one-sample t test compares the mean of one variable to a predetermined test value, either known or hypothesized. Its objective is to see whether the mean of a single variable differs from some specified constant. The test is used with a single mean and is non-directional. The output reports the difference between the mean and the constant. Select One-Sample T Test. The One-Sample T Test window will appear on the screen. Select the variable to be analyzed and set the test value against which you are testing the variable. The Options button lets you change the Confidence Interval from 95 to whatever value is desired. Missing Values lets you exclude cases analysis by analysis or exclude any case with a missing value on any of the variables. Click OK to run the t test.
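The arithmetic behind the one-sample t test described above can be sketched in a few lines of Python. The function name `one_sample_t` and the sample scores are hypothetical, and the sketch returns only the t statistic and degrees of freedom; it does not compute the significance value or confidence interval that SPSS reports.

```python
import math
import statistics


def one_sample_t(values, test_value):
    """Return (t, degrees of freedom) for a one-sample t test."""
    n = len(values)
    mean = statistics.fmean(values)
    # Standard error of the mean, using the sample standard deviation (n - 1).
    standard_error = statistics.stdev(values) / math.sqrt(n)
    return (mean - test_value) / standard_error, n - 1


# Hypothetical scores tested against a constant of 2.
t, df = one_sample_t([1, 2, 3, 4, 5], test_value=2)
print(t, df)
```

A t of zero means the sample mean equals the test value exactly; larger absolute values indicate a larger difference relative to the standard error.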
  37. Other t tests follow a similar pattern of analysis. The Independent-Samples T Test compares the means of two groups. This analysis should be used with two separate groups that have been randomly assigned. This time go to the Analyze menu, choose Compare Means, and select Independent-Samples T Test. The window that appears is very similar to the One-Sample T Test window. The main difference is the Grouping Variable section. The grouping variable should be dichotomous or categorical so that it divides the cases into groups. The options for this t test are the same as for the others.
  38. A Paired-Samples T Test is used when two variables are being compared for the same group. This test computes the difference between the two values for each case and tests whether the mean difference differs significantly from zero. This is often the appropriate test for a one-group pre-post test design. t-test in more detail: There are three types of t-tests: independent, dependent, and single sample.
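For the dependent (paired-samples) case, the calculation reduces to a one-sample t test on the per-case differences. The Python sketch below assumes two equally long lists of scores in matching case order; the function name `paired_t` and the pre/post scores are hypothetical.

```python
import math
import statistics


def paired_t(first, second):
    """Return (t, degrees of freedom) for a paired-samples t test."""
    if len(first) != len(second):
        raise ValueError("paired samples must have the same number of cases")
    # Difference score for each case (second measurement minus first).
    differences = [b - a for a, b in zip(first, second)]
    n = len(differences)
    mean_difference = statistics.fmean(differences)
    standard_error = statistics.stdev(differences) / math.sqrt(n)
    # Test whether the mean difference departs from zero.
    return mean_difference / standard_error, n - 1


# Hypothetical pre-test and post-test scores for four people.
pre = [1, 2, 3, 4]
post = [2, 4, 5, 5]
print(paired_t(pre, post))
```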
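  39. The t distribution is related to the z distribution:

      $$z = \frac{\bar{X} - \mu_x}{\sigma_{\bar{X}}} \qquad t = \frac{\bar{X} - \mu_x}{s_{\bar{X}}}$$

      Here $\sigma_{\bar{X}}$ is the standard error of the mean based on a known population standard deviation, and $s_{\bar{X}}$ is its estimate from the sample, where $SS_x$ is the sum of squared deviations from the mean:

      $$s_{\bar{X}} = \sqrt{\frac{SS_x}{n(n-1)}} \qquad t = \frac{\bar{X} - \mu_x}{\sqrt{\dfrac{SS_x}{n(n-1)}}}$$

      The independent t-test is used when you have two separate groups that you want to compare. The best example of this t-test is the true experiment: does the mean of a variable in one condition differ from its mean in another condition?

      $$t = \frac{\bar{X} - \bar{Y}}{\sqrt{\dfrac{SS_x + SS_y}{(n_x - 1) + (n_y - 1)}\left(\dfrac{1}{n_x} + \dfrac{1}{n_y}\right)}}$$

      $$t = \frac{(\bar{X} - \bar{Y}) - (\mu_x - \mu_y)_{hyp}}{\sqrt{\dfrac{SS_x + SS_y}{(n_x - 1) + (n_y - 1)}\left(\dfrac{1}{n_x} + \dfrac{1}{n_y}\right)}}$$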
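The independent t-test formula above, with its pooled sums of squares, translates directly into code. This Python sketch assumes equal variances in the two groups, as the pooled formula does; the function names and the sample data are hypothetical.

```python
import math


def sum_of_squares(values):
    """Sum of squared deviations from the mean (SS)."""
    mean = sum(values) / len(values)
    return sum((x - mean) ** 2 for x in values)


def independent_t(x, y, hypothesized_difference=0.0):
    """Return (t, degrees of freedom) using the pooled-variance formula."""
    n_x, n_y = len(x), len(y)
    df = (n_x - 1) + (n_y - 1)
    # Pooled variance: combined SS divided by combined degrees of freedom.
    pooled_variance = (sum_of_squares(x) + sum_of_squares(y)) / df
    standard_error = math.sqrt(pooled_variance * (1 / n_x + 1 / n_y))
    mean_difference = sum(x) / n_x - sum(y) / n_y
    return (mean_difference - hypothesized_difference) / standard_error, df


# Hypothetical scores for two separate groups.
t, df = independent_t([1, 2, 3], [4, 5, 6])
print(t, df)
```

Leaving `hypothesized_difference` at zero gives the first form of the formula; passing a nonzero value gives the second form with the hypothesized difference in population means.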
  40. Correlate: The next type of analysis to be discussed is the correlation. Under the Analyze menu, open the Correlate option and choose Bivariate. The Bivariate Correlations dialog box will open, presenting several options. Correlation Coefficients gives you three choices: Pearson, Kendall's tau-b, and Spearman. If the variables you are correlating are continuous and normally distributed, the Pearson correlation coefficient can be selected.
  41. Test of Significance gives a two-tailed and a one-tailed option. Flag significant correlations is a good option for picking out the relationships that are significant. At the bottom of the dialog box is an Options button. When it is clicked, a new dialog box appears. Selecting the Statistics options (Means and standard deviations, Cross-product deviations and covariances) lets you compute other information along with the correlation; it will appear in the Viewer window. Missing Values gives options for handling missing data. Partial correlations are computed the same way, except that they let you control for the influence of other variables on the variables being correlated. Equations for correlations:

      $$Y = A + BX$$

      $$r = \frac{\sum (X - \bar{X})(Y - \bar{Y})}{\sqrt{(SS_x)(SS_y)}}$$

      $$r = \frac{N \sum XY - \left(\sum X\right)\left(\sum Y\right)}{\sqrt{\left[N \sum X^2 - \left(\sum X\right)^2\right]\left[N \sum Y^2 - \left(\sum Y\right)^2\right]}}$$
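The two equations for r above are algebraically equivalent: the first works from deviations around the means, the second from raw sums. The Python sketch below implements both so the agreement can be checked; the function names and data are hypothetical, and only the Pearson coefficient is covered (not Kendall's tau-b or Spearman).

```python
import math


def pearson_r_deviation(x, y):
    """Pearson r from deviation scores: sum of cross products over sqrt(SSx * SSy)."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cross_products = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return cross_products / math.sqrt(ss_x * ss_y)


def pearson_r_raw(x, y):
    """Pearson r from raw sums (the computational formula)."""
    n = len(x)
    sum_x, sum_y = sum(x), sum(y)
    sum_xy = sum(a * b for a, b in zip(x, y))
    sum_x2 = sum(a * a for a in x)
    sum_y2 = sum(b * b for b in y)
    numerator = n * sum_xy - sum_x * sum_y
    denominator = math.sqrt((n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2))
    return numerator / denominator


# Hypothetical paired observations.
x = [1, 2, 3, 4, 5]
y = [2, 1, 4, 3, 5]
print(pearson_r_deviation(x, y), pearson_r_raw(x, y))
```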
  42. Regression: A simple linear regression is similar to a correlation, except that it has the advantage of prediction. The regression can be computed by selecting Analyze, Regression, Linear. To perform a linear regression, place one variable in the Dependent box and one or more variables in the Independent(s) box. The Statistics button lets you select what you want to appear in the output Viewer. The prediction equation is $Y' = A + BX$.
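For a single predictor, the constants A and B in the prediction equation can be found with ordinary least squares: B is the sum of cross products divided by the sum of squares of X, and A is chosen so that the line passes through the two means. The following Python sketch shows this for one independent variable only; the function name `fit_line` and the data are hypothetical.

```python
def fit_line(x, y):
    """Return (A, B) for the least-squares line Y' = A + B * X."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cross_products = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    slope = cross_products / ss_x          # B
    intercept = mean_y - slope * mean_x    # A
    return intercept, slope


# Hypothetical data that fall exactly on a line.
a, b = fit_line([1, 2, 3], [3, 5, 7])
predicted = a + b * 4  # predicted Y for a new X of 4
print(a, b, predicted)
```

This is the sense in which regression adds prediction to correlation: once A and B are estimated, a Y' can be computed for any new value of X.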
  43. Nonparametric Tests and Other Analyses: Nonparametric tests can be invaluable when variables do not fit the normal curve or in some other way violate the assumptions of parametric statistics. Probably the most frequently used nonparametric test is the chi-square. A chi-square test answers questions about data in the form of frequencies. It compares the values observed to the values that would be expected. For example, if 50% of your population is male and 50% is female, you expect a sample to have approximately the same frequencies. If your sample of 100 people yielded 75 females and 25 males, you could test this using chi-square. To perform the chi-square analysis, select Analyze from the main menu, go to Nonparametric Tests, and select Chi-Square. Select your test variable. Compute by selecting OK. There are many other analyses available to you in SPSS for Windows. Understanding the proper use of the statistical technique being used is important when performing any type of research. Statistics such as Multiple Regression, Discriminant Analysis, and Factor Analysis need to be understood before being used. If you do not understand the analysis on the way in, you will not understand the results when they come out.
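The 75 female / 25 male example above can be worked through by hand: with 100 people and a 50/50 expectation, each expected count is 50, and the chi-square statistic is the sum of (observed - expected)² / expected over the categories. The Python sketch below carries out that calculation; the function name is hypothetical, and the sketch stops at the statistic rather than computing a significance value.

```python
def chi_square_goodness_of_fit(observed, expected_proportions):
    """Return (chi-square, degrees of freedom) for observed category counts."""
    total = sum(observed)
    chi_square = 0.0
    for count, proportion in zip(observed, expected_proportions):
        expected = total * proportion
        chi_square += (count - expected) ** 2 / expected
    return chi_square, len(observed) - 1


# 75 females and 25 males observed; 50% of each expected.
chi_square, df = chi_square_goodness_of_fit([75, 25], [0.5, 0.5])
print(chi_square, df)
```

Here each category contributes (25)² / 50 = 12.5, for a chi-square of 25 with 1 degree of freedom. That is far above 3.84, the critical value at the .05 level for 1 degree of freedom, so this sample would be judged to differ from the expected 50/50 split.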
  44. Graphs: Raw data can often be confusing and hard to manage. Being able to view the patterns in your data helps both in choosing a type of analysis and in interpreting the data set. Graphs are a useful visual tool for the researcher and for the researcher's target audience. One of the easiest graphs to develop is the bar graph. SPSS provides three basic types of bar graphs: Simple, Clustered, and Stacked. A simple bar graph can be obtained by selecting Bar from the Graphs menu. The Bar Charts dialog box will open. Select Simple and click the Define button. Line charts are similar to bar charts. Area charts shade the area under the line. Pie charts divide the variable into slices. High-low charts display several variables with error bars.
  45. Bar Graph: Select Bar from the Graphs menu. Select Simple and click Define. Basic bar: place the variable of interest in the Category Axis box. Click OK.
  46. Pie Chart: Select Pie from the Graphs menu. Select Summaries for groups of cases and click Define. Basic pie: place the variable of interest in the Define Slices by box. Click OK.
  47. Histogram: Select Histogram from the Graphs menu. Basic histogram: add the variable of interest to the Variable box. Check the Display normal curve box. Click OK.
  48. Scatter: Select Scatter from the Graphs menu. Select Simple Scatter and click Define. Select one variable for the X axis and another for the Y axis. Click OK.
  49. HELP!!! One of the most useful features of SPSS is the Help menu. You can gain access to some very powerful learning tools by clicking the Help option on the main menu. Every window in SPSS has a Help menu. When Help is used from one of the main windows (Data Editor, Viewer, Syntax), several options are available. The first four options in the Help menu can help you get started in SPSS and further your abilities with the product. The first option is Topics. Selecting Topics gives you access to the Contents, Index, and Find tabs. Contents lets you select Help topics. Index provides a search option as well as a way to browse topics. The Find tab looks for instances of the word, or part of a word, that is typed in the window. The second Help option is the Tutorial. It is set up like the Topics window, except that the contents section holds tutorials to help you use SPSS. The Statistics Coach option can walk you through analysis steps and help you select the appropriate statistic. As with any analysis, if you are uncertain of the correct analysis to use, check with an expert.
  50. Remember that the accuracy of a statistic relies on the proper information being run. You may need the CD-ROM to access the Syntax Guide in the Help menu. This Help tool can be useful if you ever need to write syntax in SPSS. SPSS is available online and can be reached by clicking the SPSS Home Page option in the Help menu. Results Coach can be accessed in the Viewer window by double-clicking a chart or table and selecting Results Coach from the Help menu. If you are having difficulty interpreting your data, this may be of assistance to you. If you are still uncertain of your results and how to interpret them, please consult an expert. Help can also be used during any analysis or procedure in SPSS that opens a new window; doing so opens a window with help on the procedure you are currently working on. You are now ready to start exploring SPSS.
  51. Additional Topics. Restructuring Your Data: Version 11 of SPSS has a very useful tool for restructuring data. Under the Data menu, select Restructure. The Restructure Data Wizard will open. This wizard is convenient for Windows users. Before it was created, the only two ways to restructure your data were to write syntax code or to manipulate the data manually in the Data Editor. The Restructure Data Wizard allows for most types of data restructuring. The first option is to stack variables. The second option is to split existing variables into multiple variables. The last option is to transpose variables; it takes you to the Transpose option under the Data menu. Reports: The Analyze menu has a Reports option that can be used to simplify data and provide readable information. There are several ways to extract information about variables from SPSS. First, you can use the File Info option: select Utilities, then File Info. The SPSS Viewer will display all file information about the variables. This report can be cut and pasted into Microsoft Word or Excel for editing. The second way to extract variable information is simply to go to the Variable View tab and cut and paste the information into a spreadsheet such as Excel. You can also use the Display Data Info… option under File. This does the same thing as File Info, except that you can request information on unopened files.