# Basic statistics for the utterly confused

Published in: Technology

### Basic statistics for the utterly confused

1. 1. Basic Statistics for the University of Pretoria Faculty of Economic & Management Science 10, 11, 14 & 15 September 2009 Presented by Sumari O’Neil sumari.oneil@up.ac.za
2. 2. Table of Contents

    - 1. Statistics and all that jazz
    - 2. Descriptive statistics
        - 2.1 Frequencies
        - 2.2 Central tendency
        - 2.3 Statistics for variability
        - 2.4 Working with percentages
    - 3. Parametric and non-parametric statistics
        - 3.1 Testing the assumption of normality
        - 3.2 Equality of variances
    - 4. From questionnaire to dataset
    - 5. Screening and cleaning your data
    - 6. Manipulating your data
        - 6.1 Calculating the total scores of scales or indexes
        - 6.2 Reversing negatively worded items
        - 6.3 Collapsing a continuous variable into groups
    - 7. Correlation analysis
        - 7.1 Statistics to test relations between variables
        - 7.2 How to interpret the results of the correlations
        - 7.3 The coefficient of determination (r2)
        - 7.4 How to write up the results of a correlation analysis in a research report
        - 7.5 Graphically representing the relationship between variables
        - 7.6 Other analysis that is grounded in correlation analysis
    - 8. Testing differences between groups (causal relationships)
        - 8.1 What does “testing for differences between groups” mean?
        - 8.2 Testing differences between two independent groups: t-test for independent groups
        - 8.3 The non-parametric alternative for the t-test for independent samples: Mann-Whitney U test
        - 8.4 Testing differences between two dependent / related samples
        - 8.4 The non-parametric alternative to the t-test for dependent/related samples: Wilcoxon Signed-Rank Test
        - 8.5 Testing differences between more than 2 groups on one variable: One-way Analysis of Variance (One-way ANOVA)
        - 8.6 The non-parametric alternatives for the One-way ANOVA
    - References
3. 3. 1. Statistics and all that jazz Statistics is used in quantitative research to analyse and interpret the data collected during the data collection process. (Although very elementary statistics such as frequency counts are sometimes used in qualitative research, most hard core qualitative researchers will RATHER DIE than use any form of statistics!) In short, it implies that you collected data from the “real world” by means of a questionnaire (most commonly used) and now you want to tell the story of the “real world” by using statistics. Field (2009) explains this by saying that we are actually building statistical “models” of reality. When you look at the model you would like to be able to say that “this is what reality looks like”! Of course, when you build a model you want to use the best material to depict reality as accurately as possible. In terms of statistics, this material refers firstly to your data and secondly to the statistics used. The data comes first. Here, the garbage-in, garbage-out principle applies. Make sure before the study that 1. your data answers the research question, 2. the data comes from a representative sample and 3. the data meets the requirements of the statistics you want to use. In terms of the latter, every statistic has a set of criteria to be met for optimal usage. Should your data not meet the criteria for the statistic needed to answer your research question, you will end up with a statistic with very little power and little validity, or you may not be able to compute the statistic at all. Then, in terms of the statistics, you have to make sure that if your data is good, you choose the best possible statistic to depict the “reality of your research question”. There is probably more than one option to consider when selecting a statistic. Make sure you choose the best one to increase the accuracy of your results.
It should be clear by now that although statistics is used for the analysis of the data, it should actually be considered from the start of the research process. Generally, research topics for exploratory research (topics not yet explored in great depth) are better answered through qualitative research. By its nature, quantitative research gives more answers in terms of the breadth of a problem, for instance the prevalence of HIV/AIDS in South Africa. Qualitative research gives a better depiction of the depth of a problem, e.g. the experience of cancer survivors. After finding a topic, a research question should be stated at some point, and out of the question flows the purpose of the research. Some research questions are better answered by quantitative research. For instance, questions that revolve around determination (such as the prediction of
4. 4. one event by means of another), validity (e.g. the validity of a questionnaire) and causal relationships between variables (e.g. whether gender is the cause of a negative attitude) are all better answered through quantitative methods. When stating the research question, you should already have an idea of what type of analysis you can possibly use. Most statistical analyses have some data requirements, for instance requirements of sample size and level of measurement (e.g. most parametric statistics require data to be at least on an interval scale of measurement). [Fig. 1: The research process. Flowchart with the box labels: Find a topic; Research question; Design (plan for measurement, sampling plan and procedures, data analysis); Data analysis & interpretation; Interpretation of results; Conclusion & recommendations. Guiding questions shown alongside: Has the topic been explored in depth and breadth? What methodology would answer the research question best? Does the design fit the criteria for the statistical analysis? Is the sample big enough for the statistical analysis?]
5. 5. [Box 1: Different approaches to research] 2. Descriptive Statistics Descriptive statistics tell you what your data looks like. Say, for instance, you used a questionnaire to gather data. Let’s say the questionnaire asked biographical questions about the managers who completed it (e.g. age, years of experience, gender), as well as questions about their management style. By doing descriptive statistics you will be able to draw a profile of the managers that took part in your research. You will also be able to get an idea of the management styles they use. The first step of statistical analysis usually involves descriptive statistics. You can use them to describe the sample, to check if the data is fit for a specific analysis or to answer a specific descriptive or exploratory research question.
6. 6. For different types of data, different descriptive statistics are used. In other words, different descriptive statistics are used for data from different levels of measurement. Nominal and ordinal data are henceforth referred to as categorical data/variables, since these two levels of measurement indicate different categorical answers in your data set. Interval and ratio level data, on the other hand, are referred to as scale data/continuous variables, because they indicate respondent answers on a scale from 0/1, 2, 3, through to x. A special type of categorical variable is the dichotomous variable. This is a variable that represents only 2 categories. For instance, the variable of gender represents male and female. Descriptive statistics include frequencies/frequency counts, statistics of central tendency and statistics that indicate variability/dispersion. 2.1 Frequencies Frequencies indicate the number of cases (respondents) that fall into each of the available categories. Frequencies can be displayed in terms of counts or percentages. They are usually displayed by means of frequency tables, but can also be displayed graphically in graphs and charts. Suitable graphs to display frequencies for categorical data are bar charts or pie charts. Example of a frequency table (VOTE FOR CLINTON, BUSH, PEROT):

    | Valid   | Frequency | Percent | Valid Percent | Cumulative Percent |
    |---------|-----------|---------|---------------|--------------------|
    | Bush    | 661       | 35.8    | 35.8          | 35.8               |
    | Perot   | 278       | 15.1    | 15.1          | 50.8               |
    | Clinton | 908       | 49.2    | 49.2          | 100.0              |
    | Total   | 1847      | 100.0   | 100.0         |                    |

    In this example, I wanted to see the frequency of people that voted for each one of the three candidates in the US presidential election of 1992. In my interpretation, it is clear that the largest group of voters (908, or 49.2%) voted for Clinton. Example of bar chart:
7. 7. [Bar chart: VOTE FOR CLINTON, BUSH, PEROT, with frequency on the y-axis; Bush 661, Perot 278, Clinton 908.] Here I drew up a bar chart of the frequency distribution in the above-mentioned example. Example of pie chart: [Pie chart: VOTE FOR CLINTON, BUSH, PEROT; Bush 35.79%, Perot 15.05%, Clinton 49.16%.] Here is a pie chart displaying the percentages of the frequencies. For a scale variable, one would display a frequency distribution graphically by means of a histogram and not a bar chart or pie chart. In SPSS you also have the option of adding a normal curve to the histogram to get an idea of the normality of the distribution. Another option is to represent it graphically by means of a frequency polygon. Although a frequency polygon is appropriate for ordinal data as well as scale data, it is not appropriate for nominal data.
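The frequency table in section 2.1 can be rebuilt by hand to see where each column comes from. The following is a minimal Python sketch, not SPSS output: the list `votes` is hypothetical raw data constructed to match the counts in the table, and the function name `frequency_table` is my own.

```python
from collections import Counter

# Hypothetical raw responses, built to match the counts in the frequency table.
votes = ["Bush"] * 661 + ["Perot"] * 278 + ["Clinton"] * 908


def frequency_table(responses, order):
    """Return rows of (category, count, percent, cumulative percent)."""
    counts = Counter(responses)
    total = sum(counts.values())
    rows = []
    cumulative = 0
    for category in order:
        cumulative += counts[category]
        rows.append((
            category,
            counts[category],
            100 * counts[category] / total,   # percent of all cases
            100 * cumulative / total,         # running (cumulative) percent
        ))
    return rows


table = frequency_table(votes, order=["Bush", "Perot", "Clinton"])
for category, count, percent, cum_percent in table:
    print(f"{category:<8}{count:>6}{percent:>8.1f}{cum_percent:>8.1f}")
```

Rounded to one decimal, the percent and cumulative percent columns agree with the table in the example (35.8, 15.1, 49.2 and 35.8, 50.8, 100.0).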
8. 8. 2.2 Central tendency For variables measured on a nominal scale, the statistic for central tendency is the mode. The mode indicates the category with the greatest number of cases. Say, for instance, your question asked people’s occupation from a list. If most of the people indicated they were medical doctors, that would be the mode of the dataset for that question. As an example of the mode, look at the following (from Glosser (2004) http://www.mathgoodies.com/lessons/vol8/mode.html): Example 1: The following is the number of problems that Ms. Matty assigned for homework on 10 different days. What is the mode? 8, 11, 9, 14, 9, 15, 18, 6, 9, 10 Solution: Ordering the data from least to greatest, we get: 6, 8, 9, 9, 9, 10, 11, 14, 15, 18 Answer: The mode is 9. For ordinal level data the best indicator of central tendency is the median. The median is the exact middle point of the data set: it indicates the value above and below which half of the cases fall (from http://www.uwsp.edu/psych/stat/5/CT-Var.htm). For interval and ratio data, one uses the mean (average score) as the indicator of central tendency. Thus, with categorical data the mode and median have the same function as the mean. The mean
9. 9. is not used with interval data if the distribution is skewed (not normal). In this case you will use the median. 2.3 Statistics for variability As mentioned above, another type of measure that can be used to summarise a data set is the measure of dispersion or variability. These measures summarise the size of the differences between the scores. There are three measures of variability: • Range: the difference between the largest and smallest score. • Variance: the extent of the differences among scores. The greater the differences, the more the mean fails to represent the data set. The range takes into account only the largest and smallest score; the variance takes into account every score. • Standard deviation: the square root of the variance. It expresses the deviation of the scores from the mean in the same measurement unit as the original scores. Since categorical variables have a restricted range (it will always be bound to the number of categories), variability is often not used as a description. One can rather look at the minimum and maximum scores in the data set, or the range. For scale data, the standard deviation is used. Take note that if the standard deviation = 0, all the scores are the same (there is no variability). The higher the standard deviation, the higher the variability. 2.4 Working with percentages In order to compare frequencies, most researchers work out the percentage of the frequency in each category. Percentages represent the proportion of responses within each category in your dataset, and serve two purposes: 1) they simplify the data by reducing the numbers to a range from 0 to 100 and 2) they translate the data into a standard form for relative comparison. To calculate the percentage, you need to know the number of observations in the category and the total number of observations in the data set.
The formula for percentages is: $\text{Percentage} = \frac{f}{N} \times 100\%$, where $f$ = the number of observations in the category and $N$ = the total number of observations in the data set. $N$ can also be described as the “base”, “total” or “universe”.
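The measures in sections 2.2 to 2.4 can be checked with a few lines of Python. This is a minimal sketch using only the standard library; it reuses the homework data from the mode example and the Clinton count from the frequency table, and it assumes the sample formulas (division by n - 1) for the variance and standard deviation. The helper name `percentage` is my own.

```python
import statistics

# Homework data from the mode example (Glosser, 2004).
scores = [8, 11, 9, 14, 9, 15, 18, 6, 9, 10]

mode = statistics.mode(scores)            # most frequent value
median = statistics.median(scores)        # middle point of the ordered scores
mean = statistics.mean(scores)            # arithmetic average
score_range = max(scores) - min(scores)   # largest minus smallest score
variance = statistics.variance(scores)    # sample variance (divides by n - 1)
std_dev = statistics.stdev(scores)        # square root of the sample variance


def percentage(f, n):
    """Percentage = f / N * 100, with f the category count and N the base."""
    return f / n * 100


# Share of Clinton voters from the frequency table (908 out of 1847).
clinton_share = percentage(908, 1847)

print(mode, median, mean, score_range)
print(round(variance, 2), round(std_dev, 2), round(clinton_share, 2))
```

Note that the median (9.5) falls between two observed scores because the data set has an even number of cases, and that the standard deviation is in the same unit as the scores (homework problems), while the variance is in squared units.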
10. 10. For example, the number of people in poverty in Johannesburg is 400 000. The total number of people living in Johannesburg is 132 000 000. What is the percentage of people living in poverty? Of the total of poor people, 260 000 are women. What is the percentage of poor women living in Johannesburg? There are some rules when it comes to interpreting percentages: 1. Percentages cannot be averaged unless each is weighted by the size of the group from which it is computed. This is referred to as a weighted average. 2. When a very small base is used (say, a percentage out of 5), it is easy to overestimate the importance of the percentage. For instance, 60% seems like a large share, while it may represent only 3 out of 5 cases. 3. Parametric and Non-parametric statistics When we need to use inferential statistics, the best option is to use parametric statistics. To use parametric tests, the data that we use should meet a number of assumptions. If it does not meet the assumptions, the results may be inaccurate. As such, it is extremely important that you test the assumptions of a specific statistic before you continue with the analysis. Specific statistics have specific assumptions; however, they generally include: • Normally distributed data: It is assumed that the data is from a normally distributed population. If you remember that inferential statistics are done to show that results are applicable to an entire population, you should also understand that the population’s distribution should be normal. This assumption does, however, differ depending on the context in which it is used.
• Equal variances / homogeneity of variances: If two or more groups are compared, or used in the research, they should have equal variances (spread of scores). • Independence: There must be independence of observations, except when the data are paired (paired data refers to data from the same respondents over more than one measurement, as in pre- and post-measurements, or from respondents that are in some way related to each other). How do we know if there is independence of observations? Well, you will have to look at the design of the research. Where did the data come from? Was it observations of two entirely different groups, or was it a pre-post measurement of the same group? How do we show that this assumption was met? Easy! By describing and explaining the research design. There are statistical ways to test independence of observations; however, they are not used for the type of statistics we will go through in this course.
11. 11. • Interval data: The variables (specifically the independent variable) should be on at least an interval level of measurement (or, if categorical, should have a minimum of 7 categories). This assumption is tested by common sense and not through a statistical analysis. When the assumptions of parametric tests are not met, we should look at the non-parametric alternative to the parametric test (we will also look at non-parametric alternatives with each type of statistic in following tasks). Although non-parametric statistics also have some assumptions, there are fewer restrictions on the data that can be used. The general assumptions of non-parametric statistics are: • Independence of observations, except when paired • Few assumptions concerning the population’s distribution • The scale of measurement of the dependent variable may be categorical or ordinal • The primary focus is either the rank ordering or the frequencies of the data • Sample size requirements are less stringent than for parametric tests. If we look at the assumptions above, it is clear why non-parametric statistics are often referred to as statistics for small samples and distribution-free tests. 3.1 Testing the assumption of normality What is a normal distribution? The normal distribution has the following characteristics: • It is unimodal: it has only one hump, with the mode in the middle of the distribution • The mean, mode and median are equal • It is symmetrical (not skewed) • It is asymptotic (the tails approach but never touch the x-axis) • It is neither too peaked nor too flat; thus the kurtosis is equal to 0. [Figure: An illustration of the normal distribution] The statistics to look at when you check for normality of the distribution include: • Skewness
12. 12. • Kurtosis • Kolmogorov-Smirnov (or K-S from now on) (the vodka statistic) • Shapiro-Wilk test • Q-Q Plots • Box-and-whiskers plots • Histogram Skewness refers to the lack of symmetry. A distribution with a long tail to the right is positively skewed, and vice versa. How to see the skewness of a distribution with SPSS: From the menu, choose: Analyse > Descriptive statistics > Descriptives > From the options… box, select skewness. The output will give you a number, e.g. -5.845. The +/- in front of the number indicates the direction of the skewness, and the number indicates how skewed the distribution is. The higher the number, the more skewed the distribution. Kurtosis, on the other hand, measures the flatness or peakedness of the distribution. Very peaked distributions have positive kurtosis and very flat curves have negative kurtosis. A perfect normal distribution has kurtosis = 0. To check the kurtosis, you can follow the same procedure as for skewness, but instead of selecting “skewness”, select “kurtosis”. (Both skewness and kurtosis can also be computed by SPSS under the “Frequencies” option of “Analyse”.) To use skewness and kurtosis to see if the distribution is normal, you have to convert the given skewness and kurtosis scores to z-scores. Use the following formulas: $z_{\text{skewness}} = \frac{S - 0}{SE_{\text{skewness}}}$ or $z_{\text{kurtosis}} = \frac{K - 0}{SE_{\text{kurtosis}}}$,
13. 13. where $S$ = skewness; $K$ = kurtosis; $SE$ = standard error (of skewness or kurtosis). If the absolute value of the z-score is smaller than 1.96, the distribution can be regarded as normal. In larger samples, this value should be increased to 2.58, and in very large samples it should be increased to 3.29. When a sample is larger than 200, one should look at the shape of the histogram rather than use significance testing. Significance tests of skewness and kurtosis should not be used with large samples, because they are likely to be significant even when skewness and kurtosis are not too different from normal (Field, 2009, p. 139). Skewness and kurtosis give us numerical values by which we can judge whether a distribution is normal or not. When you draw up a histogram, you can see graphically if the distribution is skewed, flat or peaked. [Example of SPSS output: histogram of “Counselling for Mental Problems” (Mean = 1.94, Std. Dev. = 0.232, N = 1,013), with the statistics N valid = 1013, missing = 504; Skewness = -3.817, Std. Error of Skewness = .077; Kurtosis = 12.594, Std. Error of Kurtosis = .154.] How to draw a histogram with SPSS: From the menu, choose: Analyse > Descriptive statistics > Frequencies > From the charts options… box, select histogram and select the tickbox with normal curve. Another plot that can be used is the P-P Plot (probability-probability plot). A normal distribution on a P-P Plot should be a diagonal straight line.
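The z-score conversion for skewness and kurtosis is simple enough to do by hand. The sketch below is a minimal Python illustration using the values from the SPSS output for “Counselling for Mental Problems” (skewness -3.817 with standard error .077; kurtosis 12.594 with standard error .154); the function names `z_score` and `looks_normal` are my own.

```python
def z_score(statistic, standard_error):
    """z = (statistic - 0) / SE, as in the skewness and kurtosis formulas."""
    return (statistic - 0) / standard_error


def looks_normal(z, critical=1.96):
    """True when |z| is below the critical value (1.96, 2.58 or 3.29)."""
    return abs(z) < critical


# Values read from the example SPSS output.
z_skew = z_score(-3.817, 0.077)
z_kurt = z_score(12.594, 0.154)

print(round(z_skew, 2), looks_normal(z_skew))
print(round(z_kurt, 2), looks_normal(z_kurt))
```

Both z-scores are far beyond 1.96 (and beyond 3.29), so neither value is consistent with a normal distribution. Because this sample has more than 200 valid cases, the histogram should still be inspected rather than relying on the significance test alone.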
14. 14. Drawing a P-P Plot with SPSS: From the menu, choose: Analyse > Descriptive statistics > P-P Plots Box 3: Describing the different groups in your sample: Using the split file command Most of the time there are different subpopulations represented in the sample. In these cases you would most likely want to explore each of the subpopulations. One of the functions in SPSS that can help you do this is the split file function. The split file function allows you to identify a grouping variable (a variable that is used to specify categories of people). When you select the split file function, any subsequent procedure that you do in SPSS will be carried out, in turn, on each category specified by the grouping variable. For this reason it is important to turn off the split file function after you have completed the computations you wanted done in that way. (To switch it off, follow the same path (given below) and click on the reset button.) To select the split file command: From the menu, choose > Data > Split file. The split file dialogue box will open: Select “Organise output by groups”, and then select the grouping variable (e.g. sex). Then click OK. Another way in which normality can be tested is by means of the Kolmogorov-Smirnov (K-S) and the Shapiro-Wilk tests. These tests compare the distribution with a comparable normal
15. 15. distribution. In both these tests, we are actually testing a hypothesis. This hypothesis is that the distribution of the sample is the same as the distribution of a normally distributed population with the same mean and standard deviation. Remember that we always test statistically to reject or accept the null hypothesis. The null hypothesis in this case will be: There is no difference between the distributions of the sample and the population (thus they are equal). If this is true (if we accept the null hypothesis), it means that the sample is normally distributed. The Shapiro-Wilk test is used for small sample sizes (less than 50); otherwise use the K-S test. The limitation of these tests is similar to that of the skewness and kurtosis significance tests: if the sample size is large, they will easily show significant differences (non-normality). For this reason one should always plot the data and use the graphs in combination with any other test used. Kolmogorov-Smirnov & Shapiro-Wilk in SPSS: How to do this? Well, it is actually easy with SPSS. From the menu, choose Analyse > Explore… In the dependent list, put all the variables of interest to you (that you want to test). If any of the variables is a grouping variable, you can put it in the factor list. This will split the file so that your computations will be done for different subgroups (e.g. for males and females). Click on Statistics, select Descriptives and click on Continue. Click on Plots; under Boxplots, select the option “Factor levels together”. Under Descriptive, select Stem-and-leaf. Also select “Normality plots with tests” and click on Continue. Then click OK. There is a lot of output, but only some of it is of importance specifically for the K-S and Shapiro-Wilk tests. You may look at the descriptives per variable if you haven’t drawn up any originally. The important statistic is the tests for normality.
    Example of normality tests output (Tests of Normality):

    | Variable                    | Respondent's Sex | K-S (a) Statistic | K-S df | K-S Sig. | Shapiro-Wilk Statistic | Shapiro-Wilk df | Shapiro-Wilk Sig. |
    |-----------------------------|------------------|-------------------|--------|----------|------------------------|-----------------|-------------------|
    | To Be Well Liked or Popular | Male             | .398              | 408    | .000     | .643                   | 408             | .000              |
    | To Be Well Liked or Popular | Female           | .444              | 574    | .000     | .548                   | 574             | .000              |
    | To Obey                     | Male             | .227              | 408    | .000     | .865                   | 408             | .000              |
    | To Obey                     | Female           | .266              | 574    | .000     | .857                   | 574             | .000              |

    a. Lilliefors Significance Correction

    How do you interpret these tests? The statistic is the actual K-S statistic and the df is the degrees of freedom (should be the same as the sample size). The one we look at to judge whether to accept or reject the null hypothesis is the sig. or significance value. If the sig. is less than 0.05, there is a significant difference between the population and sample
16. 16. distribution; therefore we reject the null hypothesis and say that the distribution is not normal. In the case of the above table, I will report: The Kolmogorov-Smirnov statistic was significant (p<0.05) and therefore the distribution is not normal. You will see that the normality output will also include Q-Q plots, stem-and-leaf plots and even box-plots (box-and-whiskers plots). 3.2 Equality of variances You can see the variances by using the descriptive and frequency commands in SPSS. However, these give you only an indication of the variance of the different groups; you do not know if the differences at face value are statistically significant. There are other statistics that tell us to what extent there are significant differences between the variances of different samples. The most common of these are Levene’s test of homogeneity of variance and Bartlett’s test for homogeneity of variance. Levene’s test in SPSS Explore: Go to Analyse > Descriptive statistics > Explore… Put the dependent variable in the “Dependent List”. The grouping variable should be in the “Factor List”. Under the “Plots” options select Histograms with normality plots and “Untransformed” under “Spread vs Level with Levene’s test”.

    Test of Homogeneity of Variance:

    | Age                                  | Levene Statistic | df1 | df2    | Sig. |
    |--------------------------------------|------------------|-----|--------|------|
    | Based on Mean                        | .070             | 1   | 53     | .792 |
    | Based on Median                      | .033             | 1   | 53     | .856 |
    | Based on Median and with adjusted df | .033             | 1   | 52.457 | .856 |
    | Based on trimmed mean                | .052             | 1   | 53     | .820 |

    Read the statistics based on the mean. If the significance is smaller than 0.05, it indicates that the variances are not equal. A significance larger than 0.05 indicates that the variances can be assumed to be equal. To report the results of Levene’s test: Levene’s test is denoted by the letter F. F as well as the degrees of freedom (df) should be
17. 17. mentioned in the report. The general form of reporting is: F(df1; df2) = value, sig. E.g. F(1; 53) = 0.070, p = 0.792. 4. From questionnaire to dataset The data collected during the research needs to be coded and entered into SPSS to create a dataset with which you can work. For the purposes of using statistical programmes, you have to define and label the variables you measured during data collection. For instance, if I measured level of statistical knowledge, the label may be STATKNOW, and the levels of that variable were measured on a 5-point scale where 1 was no knowledge, 2 was some knowledge, 3 was average knowledge, 4 was above expected and 5 was exceeding knowledge. So the levels would be the codes that I will use to indicate the levels of statistical knowledge. When you are measuring a lot of variables it is very easy to become confused with codes and labels. For this reason, researchers create codebooks. The codebook lists all the variables included for the statistics, as well as their labels and the codes ascribed to each answer category. For instance, if I measured gender in a questionnaire (in other words, a question asking each respondent’s gender), “Gender” will be the variable name. In the SPSS data file, I will refer to gender as “SEX”, and the codes that identify each respondent’s gender are 1 or 2, where 1 indicates “Female” and 2 indicates “Male”. In my codebook, I will illustrate this as:

    | Variable | SPSS Variable Name | Coding Instruction   |
    |----------|--------------------|----------------------|
    | Gender   | SEX                | 1 = Female, 2 = Male |

    The codebook can be created as soon as your data collection tool is finalised, provided it contains only closed answer categories. In the case where you want to use a qualitative data collection tool, such as an open-ended questionnaire, you will have to wait until after you have collected your data. Variable names: • Must be unique • Must begin with a letter (not a number) • Cannot include full stops, blanks or other characters
18. 18. • Cannot include words used as commands by SPSS (all, ne, eq, to, lt, by, or, gt, and, not, ge, with) • Cannot exceed 64 characters The responses must all be coded with numbers. Otherwise you would not be able to do any statistics with them. Even open-ended questions should be transformed to numerical codes to be used in SPSS. Before you can analyse data with a statistics programme like SPSS, you will need to create some form of dataset for it to work on. You will need to read the data you collected into the chosen programme for it to work with the data. For this course we are using SPSS (Statistical Package for the Social Sciences). But you may decide to use MS Excel or SAS (Statistical Analysis System) for the data analysis, in which case you would have to read the data into that programme. Since you will be working in SPSS, you will need to open or create an SPSS dataset. When you are working with raw data (the answers of the respondents are on the questionnaires only) you need to create a template and insert the data into the SPSS spreadsheet. If your data is in electronic form, it can be opened in SPSS. (Note that data should be in an Excel spreadsheet or a text file to open with SPSS.) 5. Screening and cleaning your data Sally did research on managers’ stress level and blood pressure. She collected the data on stress using the General Stress Inventory, and a registered nurse took the blood pressure levels. As soon as Sally had all the data she read it into SPSS and started the analysis. To her amazement she found inconsistent results. Luckily for her, she went back and checked her data before she started writing the report. It turned out that Sally had made a lot of mistakes while reading in the data, and that caused the inconsistent results! As with Sally, it often happens that mistakes are made when capturing data. When the dataset is faulty, it can lead to wrong conclusions and therefore invalid and unreliable research!
For this reason, the first step after capturing data is to screen and clean the dataset. To screen data means that you explore the dataset for any errors, find the errors and correct them. To identify errors, you have to know what the correct data will look like, right?
19. 19. This is easy: you know what the data should look like, since a codebook is available that shows you what the range of the data should be for each variable. For instance, if you measured the variable of home language in South Africa with a closed-ended question with 11 answer options (one for each language), you know that the variable of language has a range of 1 to 11. Anything outside this range will be a mistake. See, easy! Now, how do you screen for errors in SPSS? Well, basically you want SPSS to describe the data. And what if you find that a variable is not in the range you expected? How will you know which one of the cases is wrong? You can either search the variable or you can do more detailed descriptive statistics. Your choice! As soon as you have identified the error you can replace it with the correct value by going back to the raw data (questionnaires). If you do not know what the correct value is, you need to delete the value and replace it with a missing value (or just keep the cell empty). 6. Manipulating your data With SPSS one can add up scores, for instance adding the scores on individual items of a questionnaire to get a scale score. Continuous scores may need to be collapsed into categories to create a categorical variable, or, if too few responses of a specific category are present, the number of categories on a questionnaire can be reduced. Skewed distributions can also be transformed if needed. 6.1 Calculating the total scores of scales or indexes In some questionnaires a number of questions (items) measure a specific construct. In other words, you will not look at the single items alone. If this is the case, we would like to add the responses on these items to obtain a total for each person. We may also use a scale, in which case we want to add the responses of all the items together to obtain a scale score. To do this in SPSS go to > Transform > Compute variable.
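Outside SPSS, the same two steps (screening values against the codebook range, then adding items into a total score) can be sketched in a few lines of Python. Everything in this example is hypothetical: the 4-item scale, its 1 to 5 coding, the three respondents and the function names are made up for illustration, and the choice to leave the total missing when an item is missing is one possible rule, not the only one.

```python
# Hypothetical codebook: each item of a made-up 4-item scale is coded 1 to 5.
VALID_RANGE = {"item1": (1, 5), "item2": (1, 5), "item3": (1, 5), "item4": (1, 5)}

# Hypothetical captured data; respondent 3 has a capturing error (55).
respondents = [
    {"id": 1, "item1": 4, "item2": 5, "item3": 3, "item4": 4},
    {"id": 2, "item1": 2, "item2": 1, "item3": 2, "item4": 3},
    {"id": 3, "item1": 5, "item2": 55, "item3": 4, "item4": 4},
]


def find_errors(rows, valid_range):
    """List (respondent id, variable, value) for values outside the codebook range."""
    errors = []
    for row in rows:
        for variable, (low, high) in valid_range.items():
            value = row[variable]
            if value is not None and not low <= value <= high:
                errors.append((row["id"], variable, value))
    return errors


def total_score(row, items):
    """Sum the item responses; return None if any item is missing."""
    values = [row[item] for item in items]
    if any(value is None for value in values):
        return None
    return sum(values)


errors = find_errors(respondents, VALID_RANGE)
for respondent_id, variable, _ in errors:
    # The correct value is unknown in this sketch, so it becomes a missing value.
    next(r for r in respondents if r["id"] == respondent_id)[variable] = None

totals = {r["id"]: total_score(r, VALID_RANGE) for r in respondents}
print(errors)
print(totals)
```

In practice you would first go back to the questionnaire of respondent 3 to look up the correct answer, and only fall back on a missing value if it cannot be recovered.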
6.2 Reversing negatively worded items In some scales, the wording of particular items has been reversed to help prevent response bias. Using the “Transform” function in SPSS, the item can be recoded positively.
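The recoding itself follows a simple rule: on a scale that runs from a lowest to a highest code, the reversed score is (lowest + highest) minus the original score. A minimal Python sketch, assuming a 5-point item coded 1 to 5 (the function name `reverse_score` and the sample responses are hypothetical):

```python
def reverse_score(score, low=1, high=5):
    """Reverse an item on a low..high scale: new = (low + high) - old."""
    if not low <= score <= high:
        raise ValueError(f"score {score} is outside the range {low}..{high}")
    return (low + high) - score


# Hypothetical responses to a negatively worded item on a 1-5 scale.
responses = [1, 2, 3, 4, 5]
reversed_responses = [reverse_score(s) for s in responses]
print(reversed_responses)  # [5, 4, 3, 2, 1]
```

Reverse the negatively worded items before adding items into a total score, otherwise high and low answers cancel each other out.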
20. 20. 6.3 Collapsing a continuous variable into groups

Sometimes you will need to divide your sample into groups according to scores. For instance, if the questionnaire asked respondents to write in their income, you may want to create categories of low income, middle income and high income. The written-in answers give you a continuous variable of income, but say you want to compare the three income groups on, for instance, the variable of hope. In such cases, you will transform the continuous variable into a categorical variable.

You may ask, why not use categories from the beginning? Well, using an interval or ratio level of measurement gives you much more detail to work with. If you ask age in categories, every person in your sample will just fall into a category, but if you ask the specific age, you have much more detail on your sample's age. It also gives you a wider variety of analyses to work with, since, if needed, you can always collapse the continuous variable into a categorical one. To do this in SPSS go to > Transform > Recode into Different Variables.

7. Correlation analysis

When we talk about relationships between variables, we mean that the variables vary together. Take note, this does not imply a causal relationship! If ice cream sales in Bloemfontein are very high this month, and the number of drownings is also very high, there will be a correlation or relationship between ice cream sales and drownings. Does this mean that ice cream sales cause drownings? Or does it maybe mean that drownings cause ice cream sales? Of course not! There is no logical or theoretical link between these two events! So a relationship implies only that at a given time, in a given context, the rate or frequency of occurrence of two variables (in this case ice cream sales and drownings) increases together. Relationships between variables are also referred to as associations between variables.
The nature of a relationship/association refers to its strength and its direction. The strength of a relationship is indicated by a correlation coefficient (the symbol r is used to indicate the correlation coefficient in statistics output). The correlation coefficient is a number between –1 and +1; its size, ignoring the sign (0 to 1), indicates how strong the relationship between the variables is. A coefficient of 0 indicates no linear relationship and a coefficient of 1 (or –1) indicates a perfect relationship.
21. 21. The direction is whether the relationship is positive or negative. A positive relationship implies that if the scores on the one variable increase, the scores on the other one will also increase. Or, if the scores on the one decrease, the scores on the other will also decrease. THUS, a positive relationship means that the variables co-vary in the same direction. A positive relationship is also referred to as a direct relationship. A negative relationship means that if the scores on one variable increase, the scores on the other variable decrease. THUS, a negative relationship means that the variables co-vary in opposite directions. A negative correlation is also referred to as an inverse relationship. The positive and negative correlations refer to linear relationships; in other words, both of them are fitted on a straight diagonal line (see the scatter plot examples below: you will see that a positive and a negative correlation both fit on a straight diagonal line).

In statistics, a correlation analysis is used to test the nature of the relationships between variables. Therefore, relationships are also referred to as correlations: positive and negative correlations. As an example, if I want to know if students with higher order thinking skills understand statistics better, I will do a correlation analysis. That is, I will ask: Is there a positive relationship between higher order thinking skills and students’ understanding of statistics? For relationship questions, I will conduct a correlation analysis. If the correlation is positive and significant, it will tell me that the better the higher order thinking skills, the better students understand statistics. It does however not tell me that higher order thinking causes statistics understanding! There is a difference. Questions about relationships between variables are usually descriptive research.
In other words, the aim of the research when you are using correlations is to describe the relationships that exist between a and b.

7.1 Statistics to test relations between variables

Different statistics are used to test the relationship between variables. They are all referred to as types of correlation analysis, but are used for different types of data. They include:
• Pearson / product-moment;
• Spearman;
• Point-biserial;
22. 22. • Phi coefficient and so forth.

7.1.1 The Pearson / Product-Moment correlation

A Pearson correlation coefficient is used when you are working with continuous data, in other words, data on the interval or ratio level of measurement. The Pearson correlation is also a parametric test or a parametric statistic. In short, in statistics we have two legs or two kinds of statistics: those that are parametric and those that are non-parametric. Parametric indicates that there are certain assumptions or parameters (borders) that the data should adhere to in order to qualify for parametric statistics. Should the data not adhere to the parameters or assumptions, the equivalent but NON-parametric alternative should be used.

To use the Pearson product-moment correlation your data should adhere to the following assumptions or parameters:
• Data must be on at least interval level
• A linear relationship must exist (can be indicated by means of a scatter plot)
• The distributions must be similar (thus, if they are skewed, they must be skewed in the same direction), but preferably normal
• Outliers must be identified and omitted from the computation (please note: if you delete the outliers, delete only the cell with the outlier value)
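To make the statistic itself less of a black box, the sketch below computes a Pearson coefficient by hand in Python: the sum of the cross-products of the deviations from the means, divided by the square root of the product of the two sums of squared deviations. The data are invented for illustration; SPSS does the same arithmetic and adds the significance test, which this sketch does not compute.

```python
import math


def pearson_r(x, y):
    """Pearson product-moment correlation between two equally long lists of scores."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length (at least 2 cases)")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of cross-products and sums of squares of the deviations from the means.
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("r is undefined when a variable has no variation")
    return sxy / math.sqrt(sxx * syy)


# Hypothetical scores for five respondents on two interval-level variables.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
r = pearson_r(x, y)
print(round(r, 4))
```

The function assumes complete pairs of scores; cases with a missing value on either variable would have to be dropped before calling it.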
23. 23. How do I know if there are outliers?

To see if there are any outliers, we draw up a box-and-whiskers plot and a stem-and-leaf plot. Both of these can be drawn up under Analyze > Descriptive Statistics > Explore. Under Statistics select Outliers and under Plots select Stem-and-leaf. To read stem-and-leaf plots use the following link: http://www.cmh.edu/stats/definitions/stem.htm.

The box plot gives you a good idea of the outliers and the identity of the outliers. In other words, it does not only show you the outlier, but also which case in the data set has that particular value. From top to bottom, the box plot shows:
• Outliers or extreme values that do not fit with the rest of the distribution
• The maximum value that is not an outlier
• The 1st to 3rd quartile of the distribution (the box); in a normal distribution this is the bell part
• The median (the line inside the box)
• The minimum value that is not an outlier

Outliers cannot be included in the analysis. There are different ways to deal with outliers:
1. Outliers can be removed.
2. Data can be transformed: outliers skew distributions. The skewness can be reduced somewhat by transformations of the dataset. (See Field (2009) p. 155 for a short and understandable description of different transformation options.)
3. Change the score: should the transformation fail, the value can be replaced by:
a. The next highest score in the dataset plus 1
b. The mean plus two standard deviations
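As an illustration of what a box plot flags, the Python sketch below lists the cases that fall more than 1.5 times the interquartile range beyond the quartiles. The 1.5 × IQR rule is the usual box-plot convention and is assumed here; the handout does not state which rule is used, and statistical packages compute quartiles in slightly different ways, so borderline cases may be classified differently. The scores are invented.

```python
import statistics


def boxplot_outliers(values, k=1.5):
    """Return (case_index, value) pairs lying more than k * IQR outside the quartiles."""
    q1, _median, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    lower_fence = q1 - k * iqr
    upper_fence = q3 + k * iqr
    return [
        (index, value)
        for index, value in enumerate(values)
        if value < lower_fence or value > upper_fence
    ]


# Hypothetical scores; the last case looks suspicious.
scores = [12, 14, 15, 15, 16, 17, 18, 19, 20, 45]
outliers = boxplot_outliers(scores)
print(outliers)
```

Returning the case index as well as the value mirrors what the box plot does: it tells you which case to go back to in the raw data.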
24. 24. To conduct a Pearson correlation, the following steps should be used in SPSS:

From the menu bar select Analyze > Correlate. The options that you can choose from at this stage are Bivariate, Partial and Distance. A bivariate correlation is a correlation between 2 variables. In the following box, select the variables that you want to correlate. Select “Pearson” under Correlation coefficients. Under Test of significance, two-tailed means that the hypothesis does not specify the direction of the correlation. We will mostly work with this one. One-tailed is only chosen when you have specified the direction of the effect (relationship), in other words, when you have a directional hypothesis. In the bottom left-hand corner, you can select “Flag significant correlations”. This tells SPSS to mark the significant correlations on the output.

7.1.2 The Spearman Rank-Order correlation / Spearman’s Rho

Spearman’s Rho is the non-parametric alternative to the Pearson correlation coefficient. It is used when one or both of the variables are measured on an ordinal scale (if only one, the other should be at least on interval scale). Spearman’s Rho is indicated as rs. To do this in SPSS use the same procedure as with the Pearson correlation, but select the Spearman option instead.

7.1.3 Kendall’s Tau

Kendall’s Tau is another non-parametric correlation and it should be used rather than Spearman’s coefficient when you have a small data set (50 or less). It is stricter, and if you do both the Tau and the Rho, you will probably find that the Tau is a bit lower than the Rho. To do this in SPSS use the same procedure as with the Pearson correlation, but select the Kendall’s Tau option instead.

7.1.4 The Point-biserial correlation

This statistic is computed when you want to see the relationship between a continuous variable and a dichotomous variable. For example,
females and males report the total number of years of education they have had, and we want to know whether there is any correlation
25. 25. between gender and years of education. It is indicated by rpb. The assumptions that your data must meet to compute a point-biserial correlation are:
• The dichotomous variable has mutually exclusive groups whose values have been coded 1 and 0
• The two groups created by the dichotomous variable are normally distributed
• The two groups created by the dichotomous variable have equal variances
• The continuous variable has equal variances across each level of the dichotomous variable

To compute an rpb you use a normal Pearson correlation procedure.

How to test for equality of variances in SPSS?

An easy way to test for equality of variances is to select from the menu bar Analyze > Compare Means > Independent-Samples T Test. The grouping variable will obviously be the dichotomous variable, and the continuous variable is the one for which you want to test differences. Then click OK. This procedure will give you a table in the output that looks like this:

Independent Samples Test

| Variable | Variances | Levene's F | Levene's Sig. | t | df | Sig. (2-tailed) | Mean Difference | Std. Error Difference | 95% CI Lower | 95% CI Upper |
|---|---|---|---|---|---|---|---|---|---|---|
| HIGHEST YEAR OF SCHOOL COMPLETED | Equal variances assumed | 3.090 | .079 | 1.943 | 1843 | .052 | .259 | .133 | -.002 | .521 |
| | Equal variances not assumed | | | 1.929 | 1677.298 | .054 | .259 | .134 | -.004 | .523 |
| RS HIGHEST DEGREE | Equal variances assumed | 5.685 | .017 | 1.154 | 1845 | .248 | .065 | .057 | -.046 | .177 |
| | Equal variances not assumed | | | 1.147 | 1680.978 | .252 | .065 | .057 | -.047 | .177 |

Look under Levene’s test for equality of variances. If the significance value is more than 0.05, equal variances can be assumed for the two groups. In the output above this is the case for highest year of school completed (Sig. = .079) but not for highest degree (Sig. = .017).

7.1.5 The Phi-coefficient

When both variables are dichotomous the phi-coefficient is used (indicated as rphi).
The assumptions that the data must meet to utilise the Phi coefficient are:
• Variables must be dichotomous
• Observations are independent
• The observations are in the form of frequencies and not scores
• There must be at least 5 counts in each category for each variable.
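For a 2 × 2 frequency table the phi coefficient can be worked out by hand, as sketched below in Python. With cell counts a, b (first row) and c, d (second row), phi = (ad – bc) / √((a+b)(c+d)(a+c)(b+d)). The counts in the example are invented, and the sketch gives only the coefficient, not its significance.

```python
import math


def phi_coefficient(a, b, c, d):
    """Phi for a 2x2 frequency table with rows (a, b) and (c, d)."""
    row_totals = (a + b, c + d)
    column_totals = (a + c, b + d)
    denominator = math.sqrt(
        row_totals[0] * row_totals[1] * column_totals[0] * column_totals[1]
    )
    if denominator == 0:
        raise ValueError("phi is undefined when a row or column total is zero")
    return (a * d - b * c) / denominator


# Hypothetical counts: rows = group (0 / 1), columns = answer (no / yes).
phi = phi_coefficient(30, 20, 15, 35)
print(round(phi, 4))
```

The sign of phi depends on how the two variables were coded (which category is 0 and which is 1), so it is usually the size of the coefficient that is interpreted.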
26. 26. To compute the phi coefficient with SPSS:

From the menu bar select Analyze > Descriptive Statistics > Crosstabs. Go through the same process as you would with cross tabulations. However, go to the Statistics option, select “Phi and Cramer’s V”, then click Continue and OK. The output should give you a table like this:

Symmetric Measures

| | | Value | Approx. Sig. |
|---|---|---|---|
| Nominal by Nominal | Phi | .208 | .136 |
| | Cramer's V | .208 | .136 |
| N of Valid Cases | | 1847 | |

a. Not assuming the null hypothesis.
b. Using the asymptotic standard error assuming the null hypothesis.

If the significance value is less than 0.05 there is a significant relationship between the two variables. You look at the Phi statistic only.

7.1.6 Cramer’s V Coefficient

When you want to test the association between two categorical variables that are not dichotomous, you use the Cramer’s V statistic. Obtain this by the same steps as above.

7.2 How to interpret the results of the correlations

A correlation coefficient tells you two things: 1) the strength of the relationship between the variables and 2) the direction of that relationship. It does not tell you whether that relationship is statistically significant or not. Here is a rough guide to interpret the correlation coefficients in terms of strength of relationship:

| Correlation coefficient (r) | Strength of relationship |
|---|---|
| 0.0 – 0.2 | Very weak, negligible |
| 0.2 – 0.4 | Weak, low |
| 0.4 – 0.7 | Moderate |
| 0.7 – 0.9 | Strong, high, marked |
| 0.9 – 1.0 | Very strong, very high |

You have to remember to look at the direction of the correlation as well. You can only interpret the correlation in terms of strength if the correlation is statistically significant.
27. 27. What is statistical significance?

A statistical concept indicating that the result is very unlikely to be due to chance and, therefore, likely represents a true relationship between the variables. Statistical significance is usually indicated by the probability value (p-value), which should be smaller than a chosen significance level (alpha). For most research studies a significance level of 0.05 or 0.01 is used, meaning that a result this strong would occur by chance alone less than 5% or 1% of the time if there were no true relationship. In SPSS, we look at the p-value to tell us whether results are statistically significant or not. If the p-value is smaller than 0.05, we know the results are statistically significant at the 0.05 level.

7.3 The coefficient of determination (r²)

When the correlation coefficient is squared, it gives us an indication of the amount of variability in the one variable that is explained by the other. For example, if the correlation coefficient between age and social intelligence is 0.78 (p < 0.05), then r² = 0.78 × 0.78 = 0.6084. This can then be interpreted as: the amount of variability in social intelligence that can be explained by means of age is 61%. This r² is also called the coefficient of determination (see http://www2.chass.ncsu.edu/garson/pa765/correl.htm).

7.4 How to write up the results of a correlation analysis in a research report

Mostly you will write something like: “The results of the chi-square analysis indicated a significant but weak association between group membership and post intervention fear (Chi-square = 0.40, p = 0.03)”, depending on which analysis you used. Remember to interpret it in terms of the practical value of the research.

7.5 Graphically representing the relationship between variables

It is probably the easiest to see if a relationship exists by drawing up scatter plots of the different variables that you would like to test. A scatter plot shows how the scores on the variables co-vary (go together).
Since the scatter plot gives you such a good picture of what to expect from a correlation coefficient, drawing one up is the first step of a correlation analysis.
28. 28. Examples of scatter plots: A positive relationship A negative relationship No Relationship To draw up a scatter plot in SPSS: From the menu bar select graphs > Select Scatter. A box with the different scatter plot options should appear - We will use the simple scatter plot for now. This type of scatter plot looks at the relationship between two variables. Click on Define > Select the variables for the analysis and place as x and y-axis > If there is a grouping variable that defines different categories you may place it in the “Set markers by” block > Select the Titles option below to give headings to the plot. The different types of scatter plots that can be drawn up are: • The simple scatter plot (as indicated above), • The overlay scatter plot, • The Matrix scatter plot and • The 3-D scatter plot. With the overlay scatter plot option, you can display the covariance between several variables on the same axis / diagram. The Matrix scatter plot does the same but rather than drawing it up on the same diagram, it is drawn up in a matrix. The 3D scatter plot is used to draw a diagram of the relationship between 3 variables. - 26 -
29. 29. 7.6 Other analysis that is grounded in correlation analysis A lot of multivariate statistics is grounded in the logic of correlation analysis. They include Factor analysis, cluster analysis, regression analysis, and reliability analysis, to name but a few commonly used ones. While correlation analysis tests whether relationship exists and the strength of that relationship, regression analysis assess the predictive ability of an independent variable on a continues dependent variable. For instance, if we take high school achievement and university achievement, one can use a regression analysis to determine the extent to which the high school achievement can be used to predict achievement at university. While simple regression will assess the functional relationship between one dependent (criterion/outcome measure) one independent (predictor) variable, multiple regression is used when you want to test the predictive value of a number of predictors to a single criterion (outcome measure), where the criterion should be a scale variable (continues variable on at least interval level of measurement). When the criterion is not on interval level of measurement, logistic regression should be used. For more information on correlations and regression, see: o http://bmj.bmjjournals.com/collections/statsbk/11.shtml o Correlation and regression analysis for curve fitting find @ http://helios.bto.ed.ac.uk/bto/statistics/tress11.html o Sykes, A.O. (ND) An Introduction to Regression Analysis. Retrieved from: http://www.law.uchicago.edu/Lawecon/WkngPprs_01-25/20.Sykes.Regression.pdf o http://www.valuebasedmanagement.net/methods_regression_analysis.html o http://www.investorwords.com/4136/regression_analysis.html o http://www.blackwellpublishing.com/specialarticles/jcn_10_462.pdf o http://www.telecom.csuhayward.edu/~esuess/Links/Software/RegressionExplained/re gression_explained.doc o DAU Stats Refresher @ http://www.cne.gmu.edu/modules/dau/stat/dau2_frm.html o Dallal, G.E. 
(2004). The Little Handbook of Statistical Practice @ http://www.tufts.edu/~gdallal/LHSP.HTM > Select Regression pages on the menu page. o http://www2.sjsu.edu/faculty/gerstman/StatPrimer/regression.pdf
30. 30. Another procedure, which is based on the logic of correlations, is factor analysis. With a factor analysis you can determine the underlying structure of a large data set. In other words, when you have 10000 variables in your dataset and want to look at how these variables fall together, you can use a factor analysis. On such a dataset the factor analysis will group the variables that fall together with each other, thus indicating the underlying structure (or reduced number of latent variables) present in the data set.

One analysis which is very important when questionnaires are used in research is the reliability analysis. One of the main principles of selecting a data collection instrument is that it should measure what you need it to measure and it should be a reliable indicator of whatever it is you are measuring. In other words, the validity and reliability of your data collection instrument are important. While a factor analysis can assess the construct validity of an instrument, Cronbach’s alpha is one way to assess the reliability of a questionnaire. This method tests the internal consistency of the items that are supposed to measure the same thing. All of the reliability analysis options are under Analyze > Scale.

8. Testing differences between groups (causal relationships)

Sometimes we hypothesise that one variable (the independent variable) may cause a change in another variable (the dependent variable). For instance, we think that gender can influence vocational interest. In other words, if you are a male, you will have certain interests that differ from the interests of females. Thus, to test your hypothesis you have to show that the career interests of males and females differ from each other. In other words, you have to compare groups (in this case males and females).

8.1 What does “testing for differences between groups,” mean?
Researchers often want to test the similarities or differences of the properties or characteristics between groups. Take the following example:

Example 1: A researcher wants to know whether there is a difference in the personalities of sales consultants and sales managers. This would give important information for the recruitment of both groups. The research question for this study would be: Is there a difference between the personality profiles of sales consultants and sales managers?
31. 31. Of course the researcher has to define each of the variables included in the study. They are: Type of post that is the independent variable or grouping variable, which is either sales consultant or sales manager, and personality profile (the dependent variable). The researcher defines a sales consultant as a person who is responsible for the sales of a specific product of a company. He is directly involved with the prospective buyer. The sales manager is a person who is responsible for the sales of sales consultants within a specific division of an organisation. He is not directly involved with the prospective buyer, but rather with the management of sales consultants. A personality profile is a profile that defines the personality dimensions important for a specific group. The above mentioned is the conceptualisations or conceptual definitions of the variables. However, the researcher needs to measure these concepts and therefore will specify the operational definitions/operationalise the variables. For this, he will for instance define the groupings as: For a person to fall into the category of a sales consultant, he or she has to be in a sales consultant post for at least 1 year. And a sales manager has to be in a post specified as sales manager for at least 1 year. Personality profiles are measured by means of the 16 Personality Factor questionnaires. The researcher will of course specify here what this instrument measures and how. The hypotheses will be set out as follows: H0 (null hypothesis): There is no difference in the personality profiles of sales consultants and sales managers. H1 (alternative hypothesis): There is a difference between the personality profiles of sales consultants and sales managers. The researcher can go so far as to set specific sub hypotheses for H1. These sub hypotheses will specify how the personality profiles will differ. For instance, the researcher can say that: - 29 -
32. 32. H1 (a): A sales consultant will score high on dimension A, F and Q4, and low on Q2.
H1 (b): A sales manager will score high on dimension C, D and E, and lower on A and F.

If sub hypotheses are specified, the researcher will have to substantiate from previous research why and how he arrived at these hypotheses.

8.2 Testing differences between two independent groups: t-test for independent groups

When a researcher wants to see if statistically significant differences exist between two different groups with regard to a dependent variable, he will use the t-test for independent groups. For instance, if you want to test if there is a difference in the level of language skills between a group of matriculants from Gauteng and a group from Limpopo, you will use the t-test for independent groups. The t-test is a parametric statistic. The following assumptions must be met:
1. The t-test uses the means to compare for differences. This implies that the data for the dependent variable must be on at least interval scale.
2. It is not essential for this procedure that the sample sizes of the two groups are the same. However, for the t-test, the sample size should be at least 30 per group.
3. Equal variances are assumed. For this, Levene’s test for homogeneity of variances is used. This is given with the t-test output. You will remember from previous SA’s that the significance value of Levene’s test should be more than 0.05. This indicates that the variances are equal.
4. The data for each of the two groups must be distributed normally. This can be tested by means of the descriptions of skewness and kurtosis or the Q-Q plots (or any other test for normality).

To do an independent samples t-test on SPSS select ANALYSE > COMPARE MEANS > INDEPENDENT SAMPLES T-TEST. Select the dependent variable for the dependent list and the grouping variable under grouping variable. You have to define the groups: use the codes of the data set, e.g. group 1 = 0; group 2 = 1. Run the analysis.
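To show what SPSS calculates for the “equal variances assumed” case, here is a minimal Python sketch of the pooled-variance t statistic and its degrees of freedom. The two tiny groups are invented and far smaller than the 30 per group recommended above; they are only there to keep the arithmetic easy to follow. The sketch does not compute Levene’s test or the p-value, which need a statistics package or a t table.

```python
import math
from statistics import mean, variance


def independent_t(group_a, group_b):
    """Pooled-variance t statistic and degrees of freedom for two independent groups."""
    n_a, n_b = len(group_a), len(group_b)
    df = n_a + n_b - 2
    # Pooled variance: the two sample variances weighted by their degrees of freedom.
    pooled_variance = (
        (n_a - 1) * variance(group_a) + (n_b - 1) * variance(group_b)
    ) / df
    standard_error = math.sqrt(pooled_variance * (1 / n_a + 1 / n_b))
    t_value = (mean(group_a) - mean(group_b)) / standard_error
    return t_value, df


# Hypothetical language-skill scores for two independent groups.
group_1 = [2, 4, 6, 8, 10]
group_2 = [1, 3, 5, 7, 9]
t_value, df = independent_t(group_1, group_2)
print(t_value, df)
```

Here the group means are 6 and 5 and the standard error of the difference is 2, so t = 0.5 with 8 degrees of freedom.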
33. 33. The output will typically look like this:

Group Statistics

| | Gender | N | Mean | Std. Deviation | Std. Error Mean |
|---|---|---|---|---|---|
| Income before the program | Male | 493 | 8.9939 | 1.68866 | .07605 |
| | Female | 507 | 8.9152 | 1.58510 | .07040 |

The first table shows the number of cases (N) for each group, the mean score for each group, the standard deviation and the standard error of the mean.

Independent Samples Test (Income before the program)

| Variances | Levene's F | Levene's Sig. | t | df | Sig. (2-tailed) | Mean Difference | Std. Error Difference | 95% CI Lower | 95% CI Upper |
|---|---|---|---|---|---|---|---|---|---|
| Equal variances assumed | .421 | .517 | .760 | 998 | .447 | .0787 | .10354 | -.12446 | .28191 |
| Equal variances not assumed | | | .760 | 989.773 | .448 | .0787 | .10363 | -.12464 | .28209 |

How to read the second table:
• Levene’s test output: in this case, the test shows that the variances are equal (F = 0.421; p = 0.517).
• Use the row of output corresponding to the outcome of Levene’s test. E.g. if Levene’s test indicates homogeneity of variances, as in this case, use the upper row of output for the t-test.
• The t-value is in this case 0.760.
• The degrees of freedom for the upper row are the total number of cases minus 2 (here 1000 – 2 = 998).
• The significance value should be less than 0.05 to indicate a significant difference. For this example no significant difference exists.
• The mean difference is the mean of the second group subtracted from the mean of the first group (here 8.9939 – 8.9152 = 0.0787).

When you write up the results of a t-test you must indicate the t-value as well as the significance value (p). In this example, Levene’s test showed that homogeneity of variances could be assumed. Thus, from the results, it is evident that there is no significant difference between the groups (t(998) = 0.760; p = 0.447). This reports the statistical significance. Lately it is required to report
34. 34. the effect sizes of statistical results as well. There are different methods for calculating effect sizes. The most common, however, are r and Cohen’s d. Pearson's r can vary in magnitude from −1 to 1, with −1 indicating a perfect negative linear relation, 1 indicating a perfect positive linear relation, and 0 indicating no linear relation between two variables. Cohen gives the following guidelines for the social sciences: small effect size, r = 0.1 – 0.23; medium, r = 0.24 – 0.36; large, r = 0.37 or larger.

Box 3: Practical and statistical significance

In SPSS, we look at the p-value to tell us whether results are statistically significant or not. If the p-value is smaller than 0.05, we know the results are statistically significant at the 0.05 level.

What is statistical significance? A statistical concept indicating that the result is very unlikely to be due to chance and, therefore, likely represents a true relationship between the variables. Statistical significance is usually indicated by the probability value (p-value), which should be smaller than a chosen significance level (alpha). For most research studies a significance level of 0.05 or 0.01 is used, meaning that a result this strong would occur by chance alone less than 5% or 1% of the time if there were no true effect.

We test the significance of our statistics by looking at the probability that our results may be due to chance. If this probability is larger than 5% we generally do not accept the result as “significant”. When it is smaller than 5%, we do accept it as being “significant”. However, this significance does not necessarily mean that the result is important. Statistical significance can sometimes be due to large samples. For this reason we also calculate the effect sizes of significant statistics.

8.3 The nonparametric alternative for the t-test for independent samples: Mann-Whitney U test

The Mann-Whitney U test is used if the assumptions of the t-test for independent samples are not met, i.e.
• Data is not normally distributed
• Dependent variable is measured on ordinal scale
• Sample sizes are small (smaller than 30 but larger than 5 per group).

The hypotheses for a Mann-Whitney test will look like:
H0: there are no differences between the means of the samples (μ1 = μ2) (median1 = median2 for non-parametric)
H1: there is a difference between the means of the two samples (μ1 ≠ μ2) (median1 ≠ median2)

The output will typically look like this:

Mann-Whitney Test
35. 35. Ranks

| | Marital status | N | Mean Rank | Sum of Ranks |
|---|---|---|---|---|
| Level of education | Unmarried | 504 | 512.13 | 258114.01 |
| | Married | 496 | 488.68 | 242386.00 |
| | Total | 1000 | | |

Test Statistics (a)

| | Level of education |
|---|---|
| Mann-Whitney U | 119130.00 |
| Wilcoxon W | 242386.00 |
| Z | -1.389 |
| Asymp. Sig. (2-tailed) | .165 |

a. Grouping Variable: Marital status

When you report the results you must mention the Z score and the significance level. In the example above, the difference between married and unmarried employees with regard to level of education is not significant (z = -1.389; p = 0.165).

8.4 Testing differences between two dependent / related samples

In some research designs, a researcher has two measurements of the same group taken at two different points in time, for instance a pre- and post-measurement. In such cases the researcher would like to see if there is a difference between the two measurements. A good example of such a design is when a researcher wants to test the effectiveness of a communication skills training programme. If the training programme is effective, the logical deduction would be that the scores on the second measurement (after the training programme) will be higher than on the first (before the training programme). In the case described above, the researcher will use the t-test for related/dependent samples. The assumptions are the same as for the t-test for independent samples, except for the independence of observations.

Take the following example: We compared the mean test scores before (pre-test) and after (post-test) the subjects completed a test preparation course. We want to see if our test preparation course improved people's scores on the test.
36. 36. First, we see the descriptive statistics for both variables. The post-test mean scores are higher. However this is just on face value – we still do not know if this difference is statistically significant. Next, we see the correlation between the two variables. Remember, the groups are paired / the same and therefore, we assume that there is a correlation between the first and second measurement. There is a strong positive correlation. People who did well on the pre-test also did well on the post-test. Finally, we see the results of the Paired Samples T Test. Remember; this test is based on the difference between the two variables. Under "Paired Differences" we see the descriptive statistics for the difference between the two variables. To the right of the Paired Differences, we see the T, degrees of freedom, and significance. - 34 -
37. 37. The T value = -2.171. We have 11 degrees of freedom. Our significance is .053.

If the significance value is less than .05, there is a significant difference. If the significance value is greater than .05, there is no significant difference. Here, we see that the significance value is approaching significance, but it is not a significant difference. We therefore cannot conclude that pre- and post-test scores differ: there is no statistical evidence that the test preparation course helped!

To conduct a t-test for related samples on SPSS you follow the same route as with the t-test for unrelated samples, but select the paired samples option (dependent samples t-test).

8.4 The non-parametric alternative to the t-test for dependent/related samples: Wilcoxon Signed-Rank Test

When the level of measurement for a one-group pre-post test design is on ordinal scale, data is not normally distributed, or sample sizes are small, the Wilcoxon Signed-Rank Test is used to test differences. Where the t-test uses the mean to test for differences, the Wilcoxon Signed-Rank Test uses the median. For more info on the Wilcoxon Signed-Rank procedure see: http://learn.lboro.ac.uk/sci/ma/mlsc/documents/wsrt.pdf

8.5 Testing differences between more than 2 groups on one variable: One-way Analysis of Variance (One way ANOVA)

Sometimes a researcher wants to compare the differences and similarities between more than 2 groups.
38. 38. Example: A researcher thinks that students' research skills are influenced by their time management skills. The research question here is: Do time management skills influence students' research skills? For this study, time management skill is the independent variable and will therefore be the grouping variable. Research skill is the dependent variable. She measures time management skills by means of the Kubic Time-management questionnaire. This questionnaire categorises a person's time management skills as Low, Low-to-moderate, Moderate-to-high, or High. Research skills are measured by means of the student's mark on a master's-level dissertation. The hypotheses are as follows:

H0: There is no difference in performance on the master's dissertation between students with low, low-to-moderate, moderate-to-high and high time management abilities.
H1: Students with high time management ability will perform significantly better on their master's dissertations than students with moderate-to-high, low-to-moderate and low time management skills.
H2: Students with moderate-to-high time management skills will perform better on their master's dissertations than students with low-to-moderate and low time management skills, but worse than students with high time management skills.
(H3 and H4 follow the same pattern.)

To test these hypotheses, a one-way ANOVA can be performed. Note the use of "one-way" in the name of this statistic. It indicates that there is only one independent variable (grouping variable) or factor involved. This is important because there are also two-way and three-way ANOVAs (factorial ANOVAs), which are computed when more than one factor or grouping variable is used in the comparison. This is, however, beyond the scope of this module. As the name indicates, the ANOVA analyses variances: it compares the variation between the different groups with the variation within the groups.
39. 39. If the result is significant, we conclude that there are differences somewhere between the means of the different groups. In other words, at least two of the group means differ significantly from each other, and H0 can be rejected. The ANOVA output itself only tells you whether or not there is a difference somewhere. It does not tell you between which groups the differences lie. To see between which groups differences exist, post-hoc tests are used. There are different post-hoc tests. The most commonly used is Tukey's Honestly Significant Difference (HSD) test. The Bonferroni test is also used since it controls for the Type I error (finding significant differences when there are none). The chance of a Type I error increases because repeated comparisons are made between groups. Both these tests are conducted when equal variances of groups are assumed (parametric assumption).

The one-way ANOVA as explained above is a parametric test. The assumptions or requirements for the data are the same as for the t-test for independent groups:
1. All observations must be independent of each other.
2. The dependent variable must be measured on an interval or ratio scale.
3. The dependent variable must be normally distributed in the population for each group being compared.
4. The variances of all the groups must be the same (homogeneity of variances).
5. Sample sizes need not be equal, but should preferably be larger than 30 for each group.

When equal variances are not assumed, but all other assumptions are met, SPSS gives you a choice of post-hoc tests that adjust for the differences between group variances. For this, you may select from the Tamhane, Dunnett or Games-Howell post-hoc tests.

SPSS output will typically give you the following:

a. Descriptive statistics:

Descriptives: Income before the program

| Level of education | N | Mean | Std. Deviation | Std. Error | 95% CI Lower Bound | 95% CI Upper Bound | Minimum | Maximum |
|---|---|---|---|---|---|---|---|---|
| Did not complete high school | 459 | 7.6776 | .82043 | .03829 | 7.6023 | 7.7528 | 6.00 | 10.00 |
| High school degree | 348 | 9.2500 | .72273 | .03874 | 9.1738 | 9.3262 | 8.00 | 11.00 |
| Some college | 193 | 11.4560 | 1.02030 | .07344 | 11.3111 | 11.6008 | 10.00 | 14.00 |
| Total | 1000 | 8.9540 | 1.63663 | .05175 | 8.8524 | 9.0556 | 6.00 | 14.00 |
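The standard error and confidence interval columns of the descriptives table can be reproduced from the mean, standard deviation and N of a group. The sketch below is an approximation under a stated assumption: it uses the normal distribution (about 1.96 standard errors for 95%), whereas SPSS uses the t distribution, so the last decimals can differ slightly. The function name `mean_ci` is made up for this example.

```python
import math
from statistics import NormalDist


def mean_ci(mean, sd, n, confidence=0.95):
    """Return (standard_error, lower, upper) for a group mean.

    Uses the normal distribution as a large-sample approximation; SPSS uses
    the t distribution, so results can differ slightly in the last decimals.
    """
    se = sd / math.sqrt(n)  # standard error of the mean
    z = NormalDist().inv_cdf(0.5 + confidence / 2)  # about 1.96 for 95%
    return se, mean - z * se, mean + z * se


# Values from the "Did not complete high school" row of the table above.
se, lower, upper = mean_ci(mean=7.6776, sd=0.82043, n=459)
print(f"SE = {se:.5f}, 95% CI = [{lower:.4f}, {upper:.4f}]")
```

With groups this large the approximation agrees with the SPSS columns to about three decimals; for small groups the t-based interval is noticeably wider.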
40. 40. The descriptive statistics include the number of respondents per group (N), the mean or average score per group, the standard deviation, the standard error of the mean, and the minimum and maximum scores.

b. Test for homogeneity of variance:

Test of Homogeneity of Variances: Income before the program

| Levene Statistic | df1 | df2 | Sig. |
|---|---|---|---|
| 18.420 | 2 | 997 | .000 |

As with the t-test, the ANOVA output will, if you request it, give you the results of Levene's test for homogeneity of variances. If Levene's test is significant (sig./p < 0.05), the null hypothesis (which states that the variances are equal) is rejected. Thus, the variances of the groups are not equal. This tells you which post-hoc tests should be interpreted.

c. The ANOVA / F-test:

ANOVA: Income before the program

| | Sum of Squares | df | Mean Square | F | Sig. |
|---|---|---|---|---|---|
| Between Groups | 1986.479 | 2 | 993.240 | 1436.399 | .000 |
| Within Groups | 689.405 | 997 | .691 | | |
| Total | 2675.884 | 999 | | | |

This is what you look at to decide whether there are any differences between the groups. The first column shows the source of the variation: either between groups (that is, variation between the group means) or within groups (that is, the amount of variation that exists within each of the groups). The amount of variation for each of these is computed by means of the SUM OF SQUARES and the DEGREES OF FREEDOM. The df stands for DEGREES OF FREEDOM: the number of groups minus 1 for between groups (3 - 1 = 2), N minus the number of groups for within groups (1000 - 3 = 997), and N - 1 for the total (999). MS is the MEAN SQUARE (the variance), which is computed as SS/df. The F is the F-ratio. The ANOVA uses the F-test/F-distribution to test for differences between groups. The F-ratio is computed as between MS / within MS. For differences to be significant, the between MS should be much larger than the within MS.

41. 41. If the grouping variable has an effect (in other words, when there is a difference between groups), the F-ratio should be larger than 1. To see if the differences are statistically significant, you need to look at the sig. (significance value). If the sig. < 0.05, it indicates that there are significant differences between the groups. The interpretation for this table will be written as: A significant difference exists between groups (F(2, 997) = 1436.399; p < .001).

d. Results of the post-hoc tests:

Multiple Comparisons. Dependent Variable: Income before the program

| Test | (I) Level of education | (J) Level of education | Mean Difference (I-J) | Std. Error | Sig. | 95% CI Lower Bound | 95% CI Upper Bound |
|---|---|---|---|---|---|---|---|
| Tukey HSD | Did not complete high school | High school degree | -1.5724* | .05911 | .000 | -1.7112 | -1.4337 |
| Tukey HSD | Did not complete high school | Some college | -3.7784* | .07134 | .000 | -3.9458 | -3.6110 |
| Tukey HSD | High school degree | Did not complete high school | 1.5724* | .05911 | .000 | 1.4337 | 1.7112 |
| Tukey HSD | High school degree | Some college | -2.2060* | .07463 | .000 | -2.3811 | -2.0308 |
| Tukey HSD | Some college | Did not complete high school | 3.7784* | .07134 | .000 | 3.6110 | 3.9458 |
| Tukey HSD | Some college | High school degree | 2.2060* | .07463 | .000 | 2.0308 | 2.3811 |
| Bonferroni | Did not complete high school | High school degree | -1.5724* | .05911 | .000 | -1.7142 | -1.4307 |
| Bonferroni | Did not complete high school | Some college | -3.7784* | .07134 | .000 | -3.9495 | -3.6073 |
| Bonferroni | High school degree | Did not complete high school | 1.5724* | .05911 | .000 | 1.4307 | 1.7142 |
| Bonferroni | High school degree | Some college | -2.2060* | .07463 | .000 | -2.3849 | -2.0270 |
| Bonferroni | Some college | Did not complete high school | 3.7784* | .07134 | .000 | 3.6073 | 3.9495 |
| Bonferroni | Some college | High school degree | 2.2060* | .07463 | .000 | 2.0270 | 2.3849 |
| Tamhane | Did not complete high school | High school degree | -1.5724* | .05447 | .000 | -1.7028 | -1.4421 |
| Tamhane | Did not complete high school | Some college | -3.7784* | .08283 | .000 | -3.9773 | -3.5795 |
| Tamhane | High school degree | Did not complete high school | 1.5724* | .05447 | .000 | 1.4421 | 1.7028 |
| Tamhane | High school degree | Some college | -2.2060* | .08304 | .000 | -2.4053 | -2.0066 |
| Tamhane | Some college | Did not complete high school | 3.7784* | .08283 | .000 | 3.5795 | 3.9773 |
| Tamhane | Some college | High school degree | 2.2060* | .08304 | .000 | 2.0066 | 2.4053 |
| Games-Howell | Did not complete high school | High school degree | -1.5724* | .05447 | .000 | -1.7004 | -1.4445 |
| Games-Howell | Did not complete high school | Some college | -3.7784* | .08283 | .000 | -3.9735 | -3.5833 |
| Games-Howell | High school degree | Did not complete high school | 1.5724* | .05447 | .000 | 1.4445 | 1.7004 |
| Games-Howell | High school degree | Some college | -2.2060* | .08304 | .000 | -2.4015 | -2.0104 |
| Games-Howell | Some college | Did not complete high school | 3.7784* | .08283 | .000 | 3.5833 | 3.9735 |
| Games-Howell | Some college | High school degree | 2.2060* | .08304 | .000 | 2.0104 | 2.4015 |

*. The mean difference is significant at the .05 level.

The post-hoc tests compare the specific groups with each other. Pairs of groups that differ significantly are usually flagged with an asterisk (*). The significance value for such a comparison should also be smaller than 0.05.
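To make the link between the sums of squares, the degrees of freedom and the F-ratio concrete, the ANOVA table can be recomputed by hand. The sketch below is an illustration only: `one_way_anova` is a made-up helper name, the three small groups are hypothetical data, and the last lines simply redo the arithmetic using the sums of squares and df reported in the SPSS ANOVA table above.

```python
import statistics


def one_way_anova(groups):
    """Return the sums of squares, df, mean squares and F for a list of groups."""
    all_scores = [x for g in groups for x in g]
    grand_mean = statistics.mean(all_scores)
    k, n_total = len(groups), len(all_scores)

    # Between groups: how far each group mean lies from the grand mean.
    ss_between = sum(len(g) * (statistics.mean(g) - grand_mean) ** 2 for g in groups)
    # Within groups: how far each score lies from its own group mean.
    ss_within = sum((x - statistics.mean(g)) ** 2 for g in groups for x in g)

    df_between, df_within = k - 1, n_total - k
    ms_between = ss_between / df_between  # mean square = SS / df
    ms_within = ss_within / df_within
    return {
        "ss_between": ss_between, "ss_within": ss_within,
        "df_between": df_between, "df_within": df_within,
        "ms_between": ms_between, "ms_within": ms_within,
        "F": ms_between / ms_within,
    }


# Hypothetical scores for three small groups.
result = one_way_anova([[1, 2, 3], [2, 3, 4], [5, 6, 7]])
print(f"hypothetical data: F = {result['F']:.2f}")

# The same arithmetic with the values reported in the SPSS ANOVA table.
f_table = (1986.479 / 2) / (689.405 / 997)
print(f"SPSS table: F = {f_table:.1f}")
```

Dividing the rounded mean squares shown in the table (993.240 / .691) gives a slightly different figure, because .691 has already been rounded; working from the sums of squares avoids this.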
42. 42. For this example, I selected both post-hoc tests that assume homogeneity of variances and tests that do not. Because Levene's test was significant, I can now say that the assumption of homogeneity of variances has not been met. Therefore, I need to look at the results of either the Tamhane or the Games-Howell post-hoc test. Both these tests indicate that there are statistically significant differences between all the groups. The hypothesis that I tested in this example was that significant differences exist in the income levels of people who did not complete school, people who completed school, and people with post-matric training. I can now say that the ANOVA showed that significant differences exist. From the means plot I can see which of the groups has the highest income:

[Means plot: Mean of Income before the program (y-axis, 7 to 12) by Level of education (x-axis: Did not complete high school, High school degree, Some college)]

8.6 The non-parametric alternatives for the One-way ANOVA

When the data do not meet the requirements/assumptions of the parametric one-way ANOVA, the Kruskal-Wallis H test, the Median test and the Jonckheere-Terpstra test can be used. For the purpose of this module, we will only look at the Kruskal-Wallis H test. The Kruskal-Wallis H test is an extension of the Mann-Whitney U test. It is generally the more powerful of these alternatives and is therefore the preferred non-parametric test to use. Where the ANOVA uses the F-ratio, the Kruskal-Wallis test uses the H statistic to assess whether differences exist. It ranks all the scores and compares the ranks of the groups, which is usually interpreted as comparing the medians of the samples/groups.

Data used for the Kruskal-Wallis test should meet the following requirements:
• Groups must be independent
• More than 5 respondents per group (preferably 10)
• Sample sizes should be equal or as equal as possible
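The way the Kruskal-Wallis H statistic is built from ranks can be sketched as follows. This is a minimal illustration under stated assumptions: the scores are hypothetical, the function names are made up for this example, and the sketch leaves out the correction for tied scores that statistical packages normally apply. H is compared with 5.991, the chi-square critical value for alpha = .05 with 2 degrees of freedom (number of groups minus 1).

```python
def average_ranks(values):
    """Rank values from 1 to N, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of ranks i+1 .. j+1
        for pos in range(i, j + 1):
            ranks[order[pos]] = avg
        i = j + 1
    return ranks


def kruskal_wallis_h(groups):
    """Return H for a list of groups (no correction for ties)."""
    pooled = [x for g in groups for x in g]
    ranks = average_ranks(pooled)
    n_total = len(pooled)
    total, start = 0.0, 0
    for g in groups:
        rank_sum = sum(ranks[start:start + len(g)])  # sum of ranks in this group
        total += rank_sum ** 2 / len(g)
        start += len(g)
    return 12 / (n_total * (n_total + 1)) * total - 3 * (n_total + 1)


# Hypothetical dissertation marks for three time-management groups.
low = [41, 45, 50, 52, 48, 55]
moderate = [53, 58, 61, 57, 63, 60]
high = [62, 66, 70, 59, 68, 72]

CHI2_CRITICAL_DF2 = 5.991  # alpha = .05, df = 3 groups - 1
h = kruskal_wallis_h([low, moderate, high])
print(f"H = {h:.3f}")
print("significant at .05" if h > CHI2_CRITICAL_DF2 else "not significant at .05")
```

Because only the ranks enter the calculation, extreme scores have far less influence on H than they would have on the F-ratio.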
43. 43. The distribution need not be normal and variances need not be equal.

In some situations, you will want to compare groups on more than one independent variable. In other words, you would like to see if there are differences based on more than one variable. For example: do Black, Asian and White South Africans differ in terms of demographical location, number of children, and number of people living within one household? When two independent variables are included we make use of the two-way ANOVA; when three independent variables are included we make use of the three-way ANOVA. An ANOVA in which more than one "factor" is tested for an effect can also be called a factorial ANOVA. See more at:
o http://davidmlane.com/hyperstat/A134930.html
o http://pluto.fss.buffalo.edu/classes/psy/segal/2072001/anova2/ANOVA2.html
o http://arts.uwaterloo.ca/~djbrown/psych391/Test2/Factorial-Variance1.pdf

When more than one dependent variable is included, the multivariate analysis of variance (MANOVA) is used. See:
o http://userwww.sfsu.edu/~efc/classes/biol710/manova/manovanew.htm
o http://www.utexas.edu/cc/docs/stat38.html

Remember when to use a partial correlation? You want to hold the effect of one variable constant, to see what the relationship between two other variables is without its interference. Sometimes when we want to test differences with an ANOVA, we may also want to control for the effect of another variable. In such cases we use the Analysis of Covariance (ANCOVA). See also: http://www-users.cs.umn.edu/~ludford/Stat_Guide/ANCOVA.htm

References:
Field, A. (2009). Discovering statistics using SPSS. SAGE Publications.
Huysamen, G.K. (1998). Descriptive statistics for the social and behavioral sciences. JL van Schaik Academic: Pretoria.
Pallant, J. (2003). SPSS survival manual. Open University Press.