Experimental design data analysis



  1. Experimental Design
     Data analysis
     GROUP 5
  3. Introduction
     A paired t-test is used to compare two population means when you have two samples in which observations in one sample can be paired with observations in the other sample.
     For example: a diagnostic test was given before studying a particular module and then again after completing the module. We want to find out if, in general, our teaching leads to improvements in students' knowledge/skills.
  4. First, we see the descriptive statistics for both variables. The post-test mean scores are higher.
  5. Next, we see the correlation between the two variables. There is a strong positive correlation. People who did well on the pre-test also did well on the post-test.
  6. Finally, we see the t value, degrees of freedom, and significance. Our significance is .053.
     If the significance value is less than .05, there is a significant difference.
     If the significance value is greater than .05, there is no significant difference.
     Here, the significance value is approaching significance, but the difference is not significant. We therefore cannot conclude that post-test scores differ from pre-test scores: these data give no significant evidence that our test preparation course helped.
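The t value and degrees of freedom on this slide come from the per-student differences between the two tests. A minimal sketch of that calculation follows. The scores are hypothetical, made up for illustration (the slides do not list the raw data), and the variable names are our own.

```python
import math
import statistics

# Hypothetical pre- and post-test scores for the same five students
# (made up for illustration; NOT the data behind the slides).
pre_scores = [10, 12, 9, 14, 11]
post_scores = [12, 13, 9, 17, 12]

# A paired t-test works on the per-student differences.
differences = [post - pre for pre, post in zip(pre_scores, post_scores)]

n = len(differences)
mean_diff = statistics.mean(differences)
sd_diff = statistics.stdev(differences)   # sample SD (n - 1 denominator)
standard_error = sd_diff / math.sqrt(n)

t_statistic = mean_diff / standard_error
degrees_of_freedom = n - 1

print(f"t({degrees_of_freedom}) = {t_statistic:.3f}")
```

The significance value is then read from the t-distribution with n - 1 degrees of freedom; a statistics package (for example `scipy.stats.ttest_rel`) reports it directly.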
  8. Outline
     1. Introduction
     2. Hypothesis for the independent t-test
     3. What do you need to run an independent t-test?
     4. Formula
     5. Example (Calculating + Reporting)
  9. Introduction
     The independent t-test, also called the two-sample t-test or Student's t-test, is an inferential statistical test that determines whether there is a statistically significant difference between the means in two unrelated groups.
  10. Hypothesis for the independent t-test
      The null hypothesis for the independent t-test is that the population means from the two unrelated groups are equal:
      $H_0: \mu_1 = \mu_2$
      In most cases, we are looking to see if we can reject the null hypothesis and accept the alternative hypothesis, which is that the population means are not equal:
      $H_A: \mu_1 \neq \mu_2$
      To do this we need to set a significance level (alpha) that allows us to either reject or fail to reject the null hypothesis. Most commonly, this value is set at 0.05.
  11. What do you need to run an independent t-test?
      In order to run an independent t-test you need the following:
      1. One independent, categorical variable that has two levels.
      2. One dependent variable.
  12. Formula
      M: mean (the average score of the group)
      SD: standard deviation
      N: number of scores in each group
      Exp: experimental group
      Con: control group
  13. Formula
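The formula itself did not survive in this transcript. As a stand-in, here is a minimal sketch of one common form, the pooled-variance (equal variances assumed) independent t-test, written in terms of the M, SD and N legend above. This is an assumption, not necessarily the formula shown on the slide; the function name and the summary statistics passed to it are hypothetical. It also returns Cohen's d, one widely used effect size.

```python
import math

def independent_t(mean_exp, sd_exp, n_exp, mean_con, sd_con, n_con):
    """Pooled-variance t statistic, df, and Cohen's d from summary statistics.

    Assumes equal population variances (the classic Student's t-test).
    """
    df = n_exp + n_con - 2
    pooled_var = ((n_exp - 1) * sd_exp**2 + (n_con - 1) * sd_con**2) / df
    standard_error = math.sqrt(pooled_var * (1 / n_exp + 1 / n_con))
    t = (mean_exp - mean_con) / standard_error
    d = (mean_exp - mean_con) / math.sqrt(pooled_var)   # Cohen's d
    return t, df, d

# Hypothetical summary statistics, for illustration only.
t, df, d = independent_t(mean_exp=80, sd_exp=10, n_exp=20,
                         mean_con=74, sd_con=12, n_con=20)
print(f"t({df}) = {t:.2f}, d = {d:.2f}")
```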
  14. Example
  15. Example
  16. Effect Size
  17. Reporting the Result of an Independent T-Test
      When reporting the result of an independent t-test, you need to include the t-statistic value, the degrees of freedom (df) and the significance value of the test (p-value). The format of the test result is: t(df) = t-statistic, p = significance value.
  18. Example result (APA Style)
      An independent samples t-test is presented the same as the one-sample t-test:
      t(75) = 2.11, p = .02 (one-tailed), d = .48
        75: degrees of freedom
        2.11: value of statistic
        .02: significance of statistic
        (one-tailed): include if test is one-tailed
        d = .48: effect size, if available
      Example: Survey respondents who were employed by the federal, state, or local government had significantly higher socioeconomic indices (M = 55.42, SD = 19.25) than survey respondents who were employed by a private employer (M = 47.54, SD = 18.94), t(255) = 2.363, p = .01 (one-tailed).
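The reporting format above can be produced mechanically. The helper below is a small sketch with hypothetical names (`format_t_result`, `_no_leading_zero`); it follows the pattern shown on the slide, including the dropped leading zero for p and d, and is not an official APA tool.

```python
def _no_leading_zero(value, places=2):
    """Format a number the way the slide does, e.g. 0.02 -> '.02'."""
    text = f"{value:.{places}f}"
    return text.replace("0.", ".", 1) if abs(value) < 1 else text

def format_t_result(t, df, p, one_tailed=False, d=None):
    """Build a 't(df) = ..., p = ...' string in the style shown above."""
    parts = [f"t({df}) = {t:.2f}", f"p = {_no_leading_zero(p)}"]
    if one_tailed:
        parts[-1] += " (one-tailed)"      # include only for one-tailed tests
    if d is not None:
        parts.append(f"d = {_no_leading_zero(d)}")   # effect size, if available
    return ", ".join(parts)

result = format_t_result(2.11, 75, 0.02, one_tailed=True, d=0.48)
print(result)
```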
  19. Analysis of Variance (ANOVA)
      PRESENTER: MINH SANG
  20. Introduction
      We already learned about the chi-square test for independence, which is useful for data that is measured at the nominal or ordinal level of analysis.
      If we have data measured at the interval level, we can compare two or more population groups in terms of their population means using a technique called analysis of variance, or ANOVA.
  21. Completely randomized design
                  Population 1    Population 2    ...    Population k
      Mean        $\mu_1$         $\mu_2$         ...    $\mu_k$
      Variance    $\sigma_1^2$    $\sigma_2^2$    ...    $\sigma_k^2$
      We want to know something about how the populations compare. Do they have the same mean? We can collect random samples from each population, which gives us the following data.
  22. Completely randomized design
      Mean        $M_1$           $M_2$           ...    $M_k$
      Variance    $s_1^2$         $s_2^2$         ...    $s_k^2$
      Cases       $N_1$           $N_2$           ...    $N_k$
      Suppose we want to compare 3 college majors in a business school by the average annual income people make 2 years after graduation. We collect the following data (in $1000s) based on random surveys.
  23. Completely randomized design
      Accounting    Marketing    Finance
      27            23           48
      22            36           35
      33            27           46
      25            44           36
      38            39           28
      29            32           29
  24. Completely randomized design
      Can the dean conclude that there are differences among the majors' incomes?
      $H_0: \mu_1 = \mu_2 = \mu_3$
      $H_A$: not all of $\mu_1, \mu_2, \mu_3$ are equal
      In this problem we must take into account:
      1) The variance between samples, or the actual differences by major. This is called the sum of squares for treatment (SST).
  25. Completely randomized design
      2) The variance within samples, or the variance of incomes within a single major. This is called the sum of squares for error (SSE).
      Recall that when we sample, there will always be a chance of getting something different from the population. We account for this through #2, or the SSE.
  26. F-Statistic
      For this test, we will calculate an F statistic, which is used to compare variances.
      $F = \dfrac{SST/(k-1)}{SSE/(n-k)}$
      SST = sum of squares for treatment
      SSE = sum of squares for error
      k = the number of populations
      n = total sample size
  27. F-statistic
      Intuitively, the F statistic is:
      $F = \dfrac{\text{explained variance}}{\text{unexplained variance}}$
      Explained variance is the difference between majors.
      Unexplained variance is the difference based on random sampling for each group (see Figure 10-1, page 327).
  28. Calculating SST
      $SST = \sum_i n_i (M_i - \bar{X})^2$
      $\bar{X}$ = grand mean: the sum of all values for all groups divided by total sample size, which equals $\sum_i M_i / k$ when the samples are of equal size
      $M_i$ = mean for each sample
      $n_i$ = size of each sample
      k = the number of populations
  29. Calculating SST
      By major:
      Accounting: $M_1 = 29$, $n_1 = 6$
      Marketing: $M_2 = 33.5$, $n_2 = 6$
      Finance: $M_3 = 37$, $n_3 = 6$
      $\bar{X} = (29 + 33.5 + 37)/3 = 33.17$
      $SST = (6)(29 - 33.17)^2 + (6)(33.5 - 33.17)^2 + (6)(37 - 33.17)^2 = 193$
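The SST arithmetic can be checked directly from the income table on slide 23. This is a minimal sketch; the dictionary layout and variable names are our own.

```python
# Incomes in $1000s, copied from the table on slide 23.
incomes = {
    "Accounting": [27, 22, 33, 25, 38, 29],
    "Marketing": [23, 36, 27, 44, 39, 32],
    "Finance": [48, 35, 46, 36, 28, 29],
}

# Grand mean: sum of all values divided by the total sample size.
all_values = [x for group in incomes.values() for x in group]
grand_mean = sum(all_values) / len(all_values)

# SST: each group's size times the squared distance of its mean
# from the grand mean, summed over groups.
sst = sum(
    len(group) * (sum(group) / len(group) - grand_mean) ** 2
    for group in incomes.values()
)

print(f"grand mean = {grand_mean:.2f}, SST = {sst:.1f}")
```

Run on the slide data, this reproduces the grand mean of 33.17 and SST of 193.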
  30. Calculating SST
      Note that when $M_1 = M_2 = M_3$, then SST = 0, which would support the null hypothesis.
      In this example, the samples are of equal size, but we can also run this analysis with samples of varying size.
  31. Calculating SSE
      $SSE = \sum_i \sum_t (X_{it} - M_i)^2$
      In other words, it is the sum of squared deviations from the sample mean, computed within each sample and then added together.
      $SSE = \sum_t (X_{1t} - M_1)^2 + \sum_t (X_{2t} - M_2)^2 + \sum_t (X_{3t} - M_3)^2$
      $SSE = [(27-29)^2 + (22-29)^2 + \dots + (29-29)^2] + [(23-33.5)^2 + (36-33.5)^2 + \dots] + [(48-37)^2 + (35-37)^2 + \dots + (29-37)^2]$
      $SSE = 819.5$
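The same check works for SSE. The sketch below again uses the slide 23 data, with variable names of our own choosing, and keeps the per-major contributions so each bracket in the sum can be inspected.

```python
# Incomes in $1000s, copied from the table on slide 23.
incomes = {
    "Accounting": [27, 22, 33, 25, 38, 29],
    "Marketing": [23, 36, 27, 44, 39, 32],
    "Finance": [48, 35, 46, 36, 28, 29],
}

# Within-group sum of squared deviations, one entry per major.
within_ss = {}
for major, values in incomes.items():
    group_mean = sum(values) / len(values)
    within_ss[major] = sum((x - group_mean) ** 2 for x in values)

# SSE is the total of the within-group sums of squares.
sse = sum(within_ss.values())

for major, ss in within_ss.items():
    print(f"{major}: {ss:.1f}")
print(f"SSE = {sse:.1f}")
```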
  32. Statistical Output
      When you estimate this information in a computer program, it will typically be presented in a table as follows:
      Source of variation    df     Sum of squares     Mean squares         F-ratio
      Treatment              k-1    SST                MST = SST/(k-1)      F = MST/MSE
      Error                  n-k    SSE                MSE = SSE/(n-k)
      Total                  n-1    SS = SST + SSE
  33. Calculating F for our example
      $F = \dfrac{193/2}{819.5/15} = 1.77$
      Our calculated F is compared to the critical value $F_{\alpha,\,k-1,\,n-k}$ from the F-distribution, with
      k-1 (numerator df)
      n-k (denominator df)
  34. The Results
      For 95% confidence ($\alpha = .05$), our critical F is 3.68 (averaging the table values for 14 and 16 denominator degrees of freedom).
      In this case, 1.77 < 3.68, so we fail to reject the null hypothesis.
      The dean is puzzled by these results because just by eyeballing the data, it looks like finance majors make more money.
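The F ratio and the decision rule fit in a few lines. In this sketch the function name `f_ratio` is hypothetical, and the critical value 3.68 is simply the figure quoted on the slide rather than something computed here.

```python
def f_ratio(sst, sse, k, n):
    """F = MST / MSE for a completely randomized design."""
    mst = sst / (k - 1)   # mean square for treatment
    mse = sse / (n - k)   # mean square for error
    return mst / mse

# Values from the worked example: 3 majors, 18 graduates in total.
f_value = f_ratio(sst=193, sse=819.5, k=3, n=18)

critical_f = 3.68         # quoted on the slide for alpha = .05
reject_null = f_value > critical_f

print(f"F = {f_value:.2f}, reject H0: {reject_null}")
```

With a statistics package available, the critical value could instead be looked up from the F-distribution with (2, 15) degrees of freedom.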
  35. The Results
      Many other factors may determine the salary level, such as GPA. The dean decides to collect new data, selecting one student randomly from each major at each of the following average grades.
  36. New data
      Average    Accounting       Marketing       Finance          M(b)
      A+         41               45              51               M(b1) = 45.67
      A          36               38              45               M(b2) = 39.67
      B+         27               33              31               M(b3) = 30.33
      B          32               29              35               M(b4) = 32
      C+         26               31              32               M(b5) = 29.67
      C          23               25              27               M(b6) = 25
      M(t)       M(t)1 = 30.83    M(t)2 = 33.5    M(t)3 = 36.83    grand mean = 33.72
  37. Randomized Block Design
      Now the data in the 3 samples are not independent; they are matched by GPA levels. Just like before, matched samples are superior to unmatched samples because they provide more information. In this case, we have added a factor that may account for some of the SSE.
  38. Two-way ANOVA
      Now SS(total) = SST + SSB + SSE,
      where SSB = the variability among blocks, and a block is a matched group of observations from each of the populations.
      We can calculate a two-way ANOVA to test our null hypothesis. We will talk about this next week.
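The slides defer the two-way calculation, so the following is only a sketch of the decomposition SS(total) = SST + SSB + SSE applied to the GPA-matched data on slide 36. It assumes the standard randomized-block definitions (SSB is the number of treatments times the squared distance of each block mean from the grand mean, and SSE is what remains), which the slides do not spell out; the variable names are our own.

```python
# Rows are GPA blocks (A+, A, B+, B, C+, C);
# columns are Accounting, Marketing, Finance (slide 36).
blocks = [
    [41, 45, 51],
    [36, 38, 45],
    [27, 33, 31],
    [32, 29, 35],
    [26, 31, 32],
    [23, 25, 27],
]

b = len(blocks)       # number of blocks
k = len(blocks[0])    # number of treatments (majors)

grand_mean = sum(sum(row) for row in blocks) / (b * k)
treatment_means = [sum(row[j] for row in blocks) / b for j in range(k)]
block_means = [sum(row) / k for row in blocks]

# Variability among treatments and among blocks.
sst = b * sum((m - grand_mean) ** 2 for m in treatment_means)
ssb = k * sum((m - grand_mean) ** 2 for m in block_means)

# Total variability, and the error term left after removing SST and SSB.
ss_total = sum((x - grand_mean) ** 2 for row in blocks for x in row)
sse = ss_total - sst - ssb

print(f"SST = {sst:.2f}, SSB = {ssb:.2f}, SSE = {sse:.2f}")
```

Most of the total variability here lands in SSB, which illustrates the point of slide 37: blocking on GPA removes variation that would otherwise be counted as error.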