# Statistics for interpreting test scores

The slides briefly touch on statistical procedures employed for interpreting test results. The file is especially useful for ESL teachers.

### Statistics for interpreting test scores

1. 1. Kinds of Statistics

    | Kind of statistics | Purpose | Target | Characteristic |
    |---|---|---|---|
    | Descriptive statistics | Summarizing, describing | Sample | Statistic |
    | Inferential statistics | Analyzing, generalizing | Population | Parameter |

2. 4. Tabulation of Data
    - Ungrouped data: 15, 9, 16, 13, 11, 10, 8, 12
    - Grouped data: 16, 15, 13, 12, 11, 10, 9, 8
3. 5. Frequency Distribution
    - A graphic description of how many times a score or group of scores occurs in a sample
    - The common symbol is “f”
4. 6. Absolute Frequency

    | Score (X) | Frequency (f) |
    |---|---|
    | 16 | 1 |
    | 15 | 1 |
    | 13 | 2 |
    | 12 | 5 |
    | 11 | 4 |
    | 10 | 4 |
    | 9 | 1 |
    | 8 | 2 |

5. 7. Frequency distribution

    | Score (X) | Absolute Frequency | Relative Frequency |
    |---|---|---|
    | 16 | 1 | 0.05 |
    | 15 | 1 | 0.05 |
    | 13 | 2 | 0.10 |
    | 12 | 5 | 0.25 |
    | 11 | 4 | 0.20 |
    | 10 | 4 | 0.20 |
    | 9 | 1 | 0.05 |
    | 8 | 2 | 0.10 |
    | ΣX = 94 | N = 20 | 1.00 |

6. 8. Frequency distribution

    | Score (X) | Absolute Frequency | Relative Frequency | Percentage |
    |---|---|---|---|
    | 16 | 1 | 0.05 | 0.05 × 100 = 5 |
    | 15 | 1 | 0.05 | 0.05 × 100 = 5 |
    | 13 | 2 | 0.10 | 0.10 × 100 = 10 |
    | 12 | 5 | 0.25 | 0.25 × 100 = 25 |
    | 11 | 4 | 0.20 | 0.20 × 100 = 20 |
    | 10 | 4 | 0.20 | 0.20 × 100 = 20 |
    | 9 | 1 | 0.05 | 0.05 × 100 = 5 |
    | 8 | 2 | 0.10 | 0.10 × 100 = 10 |
    | ΣX = 94 | N = 20 | 1.00 | 100 |
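
The relative frequency and percentage columns can be reproduced with a few lines of Python. This is a minimal sketch, not part of the original slides; the names `FREQ` and `frequency_table` are illustrative assumptions, and the data is the score/frequency table from the slides.

```python
# Score -> absolute frequency, copied from the slides (N = 20).
FREQ = {16: 1, 15: 1, 13: 2, 12: 5, 11: 4, 10: 4, 9: 1, 8: 2}


def frequency_table(freq):
    """Return rows of (score, f, relative frequency, percentage), highest score first."""
    n = sum(freq.values())  # total number of scores
    rows = []
    for score in sorted(freq, reverse=True):
        f = freq[score]
        rows.append((score, f, f / n, 100 * f / n))
    return rows


if __name__ == "__main__":
    for score, f, rel, pct in frequency_table(FREQ):
        print(f"{score:>5} {f:>3} {rel:>6.2f} {pct:>5.0f}")
```

The relative frequencies always sum to 1 and the percentages to 100, which is a quick check that no score has been dropped.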
7. 9. Cumulative Frequency
    - A cumulative frequency distribution is a graphic depiction of how many times groups of scores appear in a sample
    - The common symbol is “cf”
    - “cf” is used to compute percentile scores
8. 10. Cumulative frequency

    | X | f | cf |
    |---|---|---|
    | 16 | 1 | 20 |
    | 15 | 1 | 19 |
    | 13 | 2 | 18 |
    | 12 | 5 | 16 |
    | 11 | 4 | 11 |
    | 10 | 4 | 7 |
    | 9 | 1 | 3 |
    | 8 | 2 | 2 |

9. 11. Percentile score
    - Shows relative standing in a distribution
    - Shows what percentage of scores are higher and lower than a certain score
    - Percentile computation: $P = 100 \times \frac{cf}{N}$, where $N$ is the number of scores and $cf$ is the cumulative frequency
    - For a score with $cf = 16$: $P = 100 \times \frac{16}{20} = 80$
    - For a score with $cf = 20$: $P = 100 \times \frac{20}{20} = 100$
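
The cumulative frequency column and the percentile formula P = 100 × cf / N can be sketched in Python as follows. The function names are illustrative assumptions; the data is the frequency table from the slides, and cf is taken, as in the slides, to count the scores at or below a given score.

```python
# Score -> absolute frequency, copied from the slides (N = 20).
FREQ = {16: 1, 15: 1, 13: 2, 12: 5, 11: 4, 10: 4, 9: 1, 8: 2}


def cumulative_frequencies(freq):
    """Map each score to the number of scores at or below it."""
    running = 0
    cf = {}
    for score in sorted(freq):  # accumulate from the lowest score upward
        running += freq[score]
        cf[score] = running
    return cf


def percentile(score, freq):
    """Percentile of a score: P = 100 * cf / N."""
    n = sum(freq.values())
    return 100 * cumulative_frequencies(freq)[score] / n


if __name__ == "__main__":
    print(cumulative_frequencies(FREQ))
    print(percentile(12, FREQ), percentile(16, FREQ))
```

A score of 12 has cf = 16, so it sits at the 80th percentile; the top score of 16 has cf = 20 and sits at the 100th.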
10. 12. Bar graph
11. 14. Frequency Polygon
12. 15. Measures of Central Tendency
    - Mean: the arithmetic average of all scores in a distribution
    - Median: the point at which exactly half of the scores in a distribution are below and half are above
    - Mode: the most frequently occurring score(s)
13. 16. Measures of central tendency: mean (arithmetic average)
    - $\bar{X} = \frac{\sum X}{N}$, where $\sum X$ is the sum of all scores and $N$ is the number of scores
    - Example: 13 + 14 + 15 + 16 + 17 = 75, so $\sum X = 75$ and $N = 5$
    - $\bar{X} = \frac{\sum X}{N} = \frac{75}{5} = 15$
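14. 17. Median and mode
    - Median
        1. Odd number of scores: 13, 15, 16, 17, 19. The median is the middle score, 16.
        2. Even number of scores: 12, 13, 15, 17, 18, 19. The median is the average of the two middle scores, (15 + 17) / 2 = 16.
    - Mode: the score with the highest frequency. In the table below it is 12 (f = 5).

    | X | f |
    |---|---|
    | 16 | 1 |
    | 15 | 1 |
    | 13 | 2 |
    | 12 | 5 |
    | 11 | 4 |
    | 10 | 4 |
    | 9 | 1 |
    | 8 | 2 |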
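
The three measures of central tendency can be checked with Python's standard `statistics` module. This is a small sketch using the numbers from the slides; the helper `expand` and the variable names are illustrative assumptions.

```python
import statistics

# Score -> absolute frequency, copied from the slides.
FREQ = {16: 1, 15: 1, 13: 2, 12: 5, 11: 4, 10: 4, 9: 1, 8: 2}


def expand(freq):
    """Turn a score -> frequency mapping into a flat list of scores."""
    return [score for score, f in freq.items() for _ in range(f)]


mean_value = statistics.mean([13, 14, 15, 16, 17])          # sum 75 over N = 5
median_odd = statistics.median([13, 15, 16, 17, 19])        # middle score
median_even = statistics.median([12, 13, 15, 17, 18, 19])   # average of 15 and 17
modes = statistics.multimode(expand(FREQ))                  # most frequent score(s)

if __name__ == "__main__":
    print(mean_value, median_odd, median_even, modes)
```

`multimode` returns a list because a distribution can have more than one mode; here it contains the single score 12.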
15. 18. Measures of Variability
    - These describe the spread, or dispersion, of scores in a distribution
    - These measures describe the nature and extent to which scores vary
    - The three most commonly used measures are:
        1. Range
        2. Variance
        3. Standard deviation
16. 19. Measures of variability
    1. Range: for the scores 13, 15, 16, 17, 19, the range is 19 − 13 = 6
    2. Variance (V): $V = \frac{\sum (X - \bar{X})^2}{N - 1}$
    3. Standard deviation (S): $S = \sqrt{V}$
17. 20. Example

    | X | X − X̄ | (X − X̄)² |
    |---|---|---|
    | 19 | +3 | 9 |
    | 18 | +2 | 4 |
    | 17 | +1 | 1 |
    | 16 | 0 | 0 |
    | 15 | −1 | 1 |
    | 14 | −2 | 4 |
    | 13 | −3 | 9 |
    | ΣX = 112 | 0 | 28 |

    $\bar{X} = \frac{112}{7} = 16$, $V = \frac{\sum (X - \bar{X})^2}{N - 1} = \frac{28}{7 - 1} \approx 4.67$, $S = \sqrt{V} \approx 2.16$
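
The worked example can be reproduced step by step in Python. This is a minimal sketch with the seven scores from the slide; the variable names are assumptions. It uses the sample formula with N − 1 in the denominator, which is also what `statistics.variance` and `statistics.stdev` compute.

```python
import statistics

scores = [19, 18, 17, 16, 15, 14, 13]

mean = sum(scores) / len(scores)                   # 112 / 7
deviations = [x - mean for x in scores]            # X minus the mean
sum_of_squares = sum(d ** 2 for d in deviations)   # sum of squared deviations
variance = sum_of_squares / (len(scores) - 1)      # divide by N - 1
std_dev = variance ** 0.5                          # square root of the variance

if __name__ == "__main__":
    print(mean, sum(deviations), sum_of_squares)
    print(round(variance, 2), round(std_dev, 2))
    # Cross-check against the standard library.
    print(round(statistics.variance(scores), 2), round(statistics.stdev(scores), 2))
```

Rounded to two decimals, 28 / 6 is 4.67 and its square root is 2.16. Taking the square root of a truncated 4.6 instead gives 2.14, so rounding should be left to the last step.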
18. 21. Normal / bell-shaped curve
19. 22. Properties of the normal / bell-shaped curve
    - It is a symmetrical distribution.
    - Most of the scores tend to occur near the center, while more extreme scores on either side of the center become increasingly rare.
    - As the distance from the center increases, the frequency of scores decreases.
    - The mean, median, and mode are the same.
20. 23. Normal Probability Curve
    - Describes an expected distribution of scores in a population or sample
    - More than 2/3 of scores cluster in the middle of the curve
    - Scores that are extremely high or low are sometimes called outliers
    - Shaped like a bell
21. 24. Normal / bell-shaped curve
22. 25. Symmetrical vs. asymmetrical distribution
    - In a symmetrical distribution, the part of the histogram on the left side of the fold would be the mirror image of the part on the right side of the fold.
    - In an asymmetrical distribution, the two sides will not be mirror images of each other.
    - True symmetric distributions include what we will later call the normal distribution.
23. 26. An asymmetric distribution is either positively or negatively skewed.
    - In a positively skewed distribution, the scores cluster toward the lower end of the scale (that is, the smaller numbers), with increasingly fewer scores at the upper end of the scale (that is, the larger numbers).
    - In a negatively skewed distribution, most of the scores occur toward the upper end of the scale, while increasingly fewer scores occur toward the lower end.
24. 28. Kurtosis Distribution
    - Mesokurtic distribution: a normal distribution of scores
    - Leptokurtic distribution: packed, with low variability of scores
    - Platykurtic distribution: flat, with high variability of scores
25. 29. Examples based on the curve
    - Adult intelligence: mean = 100, SD = 15
    - Mean ± 1 SD covers about 68% of scores
    - 100 ± 15 gives the range 85 to 115
    - About 34% of scores fall between 100 and 115
    - About 34% of scores fall between 85 and 100
26. 30. Mean ± 2 SD covers about 95% of scores
    - 100 ± (2 × 15) gives the range 70 to 130
    - 115 + 15 = 130
    - About 13.5% of scores fall between 115 and 130
    - 85 − 15 = 70
    - About 13.5% of scores fall between 70 and 85
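
The areas under the curve quoted for the intelligence example can be checked with `statistics.NormalDist` (Python 3.8 or later). This is a sketch under the assumption that scores are exactly normally distributed with mean 100 and SD 15; the names `IQ` and `share_between` are illustrative.

```python
from statistics import NormalDist

# Assumed model: a normal distribution with mean 100 and SD 15.
IQ = NormalDist(mu=100, sigma=15)


def share_between(dist, low, high):
    """Proportion of the distribution lying between low and high."""
    return dist.cdf(high) - dist.cdf(low)


if __name__ == "__main__":
    for low, high in [(85, 115), (70, 130), (100, 115), (115, 130)]:
        pct = 100 * share_between(IQ, low, high)
        print(f"{low}-{high}: {pct:.1f}%")
```

The exact shares are roughly 68.3% within one SD and 95.4% within two, so each band between one and two SDs holds about 13.6% of scores. Figures such as 34% and 13% are rounded values.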
27. 31. Standard Scores
    - Used to compare scores on different measurement scales
    - Z-scores: the most common standard score
    - Z-score properties
        - Shows how many standard deviations a score is above or below the mean
        - The mean is set at zero
        - The SD is set at one
28. 32. Standard Scores
    - T-score: a standard score whose distribution has a mean of 50 and a standard deviation of 10
    - Advantages of the T-score
        - It enables us to work with whole numbers
        - It avoids describing subjects’ performances with negative numbers
29. 34. An example: $Z = \frac{X - \bar{X}}{SD}$

    | Student | X | SD | X̄ |
    |---|---|---|---|
    | Ali | 25 | 5 | 20 |
    | Ahmad | 30 | 5 | 35 |

    - Ali’s Z-score: $Z = \frac{25 - 20}{5} = +1$, better than 84% of the class
    - Ahmad’s Z-score: $Z = \frac{30 - 35}{5} = -1$, better than only 16% of the class
30. 35. An example: $Z = \frac{X - \bar{X}}{SD}$

    | Student | X | SD | X̄ |
    |---|---|---|---|
    | Ali | 76 | 20 | 54 |
    | Ahmad | 82 | 15 | 72 |

    - Ali: $Z = \frac{76 - 54}{20} = +1.1$
    - Ahmad: $Z = \frac{82 - 72}{15} = +0.67$
    - So, using standard scores, Ali did better than Ahmad, because Ali’s mark was more standard deviations above his class mean than Ahmad’s mark was above his own class mean.
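
The two examples can be reproduced with a short Python sketch. The function names are assumptions. The T-score conversion T = 50 + 10Z follows from the stated mean of 50 and SD of 10, and the "better than" percentages assume normally distributed class scores.

```python
from statistics import NormalDist


def z_score(x, mean, sd):
    """Number of standard deviations that x lies above (+) or below (-) the mean."""
    return (x - mean) / sd


def t_score(z):
    """Rescale a Z-score to a distribution with mean 50 and SD 10."""
    return 50 + 10 * z


def percent_below(z):
    """Percentage of a normal distribution falling below z (assumes normality)."""
    return 100 * NormalDist().cdf(z)


if __name__ == "__main__":
    # First example: same SD, different class means.
    for name, x, sd, mean in [("Ali", 25, 5, 20), ("Ahmad", 30, 5, 35)]:
        z = z_score(x, mean, sd)
        print(name, z, t_score(z), round(percent_below(z)))
    # Second example: different tests, compared through Z-scores.
    print(round(z_score(76, 54, 20), 2), round(z_score(82, 72, 15), 2))
```

Ahmad has the higher raw mark in the second example (82 against 76), but Ali has the higher Z-score, which is why standard scores are used to compare results across tests.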
31. 38. Skewed Distribution

    | Negative skew | Positive skew |
    |---|---|
    | Test items were easy. | Test items were difficult. |
    | Testees performed well. | Testees performed poorly. |
    | The scores are far from zero. | The scores are near zero. |
32. 39. The Coefficient of Correlation, r
    - The coefficient of correlation (r) is a measure of the strength of the relationship between two variables. It requires interval or ratio-scaled data.
    - It can range from -1.00 to 1.00.
    - Values of -1.00 or 1.00 indicate perfect and strong correlation.
    - Values close to 0.0 indicate weak correlation.
    - Negative values indicate an inverse relationship and positive values indicate a direct relationship.
33. 40. Perfect Correlation
34. 41. Correlation Coefficient - Interpretation
35. 42. Correlation
    - Go-togetherness of variables
    - No cause-effect relationship
    - Between -1 and +1
    - Types of correlation:
        - Pearson product-moment: used for interval data
36. 43. Correlation Coefficient - Formula
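
The formula on this slide did not survive in the transcript. As a hedged illustration, the sketch below uses the standard definitional form of the Pearson product-moment coefficient, r = Σ(x − x̄)(y − ȳ) / √(Σ(x − x̄)² · Σ(y − ȳ)²), which may differ in layout from the version the slide showed. The function name and the two small data sets are hypothetical, chosen only to show the perfect positive and perfect negative cases.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson product-moment correlation (definitional form)."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equally long lists with at least two values")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sum of cross-products and sums of squared deviations.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    # Note: undefined (division by zero) if either variable is constant.
    return sxy / sqrt(sxx * syy)


if __name__ == "__main__":
    x = [1, 2, 3, 4, 5]              # hypothetical scores
    print(pearson_r(x, [2, 4, 6, 8, 10]))   # direct relationship
    print(pearson_r(x, [10, 8, 6, 4, 2]))   # inverse relationship
```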
37. 44. Spearman rank-order correlation, rho: used for ranked or ordinal data

    | Students | Teacher 1 | Teacher 2 | D | D² |
    |---|---|---|---|---|
    | A | 1 | 5 | 4 | 16 |
    | B | 2 | 4 | 2 | 4 |
    | C | 3 | 3 | 0 | 0 |
    | D | 4 | 2 | 2 | 4 |
    | E | 5 | 1 | 4 | 16 |
    | N = 5 | 15 | 15 | 12 | 40 |

38. 45. $\rho = 1 - \frac{6 \sum D^2}{N(N^2 - 1)} = 1 - \frac{6(40)}{5(25 - 1)} = 1 - \frac{240}{120} = 1 - 2 = -1$
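
The rank-order computation, ρ = 1 − 6ΣD² / (N(N² − 1)), can be sketched in Python with the two teachers' rankings from the slides. The function name is an assumption, and this simple formula assumes there are no tied ranks.

```python
def spearman_rho(ranks_a, ranks_b):
    """Spearman rank-order correlation for two untied rankings."""
    if len(ranks_a) != len(ranks_b):
        raise ValueError("both rankings must cover the same students")
    n = len(ranks_a)
    # Sum of squared rank differences (the D-squared column).
    sum_d2 = sum((a - b) ** 2 for a, b in zip(ranks_a, ranks_b))
    return 1 - 6 * sum_d2 / (n * (n ** 2 - 1))


if __name__ == "__main__":
    teacher_1 = [1, 2, 3, 4, 5]  # ranks given to students A-E
    teacher_2 = [5, 4, 3, 2, 1]
    print(spearman_rho(teacher_1, teacher_2))
```

The two teachers rank the students in exactly opposite order, so the coefficient takes its minimum value of −1.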
39. 46. Standard error of measurement (SEM)
    - True score = Raw score − Random errors
    - How to estimate, where X = score and n = number of items
40. 47. An example: X = 79, n = 100
    - $SEM_X = \sqrt{\frac{X(n - X)}{n - 1}} = \sqrt{\frac{79(100 - 79)}{100 - 1}} \approx 4$
    - 79 ± 1 SEM = 75 to 83 (68%)
    - 79 ± 2 SEM = 71 to 87 (95%)
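
The SEM example can be checked numerically. The sketch below assumes the formula carries a square root, SEM = √(X(n − X) / (n − 1)), since only then does the computation give the quoted result of about 4; without the root, 79 × 21 / 99 is about 16.8. The function names are illustrative assumptions.

```python
from math import sqrt


def sem_from_raw_score(x, n):
    """Estimate the SEM for raw score x on an n-item test (square root assumed)."""
    return sqrt(x * (n - x) / (n - 1))


def score_band(x, sem, k=1):
    """Band of k SEMs on either side of the observed score."""
    return (x - k * sem, x + k * sem)


if __name__ == "__main__":
    sem = sem_from_raw_score(79, 100)
    print(round(sem, 2))
    rounded = round(sem)  # the slides work with the rounded value
    print(score_band(79, rounded, 1), score_band(79, rounded, 2))
```

With the SEM rounded to 4, one SEM on either side of 79 gives 75 to 83, and two SEMs give 71 to 87.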
41. 48. Point-biserial correlation:
    - $r_{pbi}$ = point-biserial correlation coefficient
    - $M_p$ = whole-test mean for students answering the item correctly (i.e., those coded as 1s)
    - $M_q$ = whole-test mean for students answering the item incorrectly (i.e., those coded as 0s)
    - $S_t$ = standard deviation for the whole test
    - $p$ = proportion of students answering correctly (i.e., those coded as 1s)
    - $q$ = proportion of students answering incorrectly (i.e., those coded as 0s)
42. 49. As another example, where the whole-test mean for Ss answering correctly is 30; the whole-test mean for Ss answering incorrectly is 45; the standard deviation for the whole test is still 8.29; the proportion of Ss answering correctly is still .50; and the proportion answering incorrectly is still .50 .
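
Neither the formula nor the result of this example survives in the transcript. Assuming the usual form that matches the symbols defined for the point-biserial coefficient, r_pbi = ((Mp − Mq) / St) × √(pq), the example can be worked through as follows. The function name is hypothetical.

```python
from math import sqrt


def point_biserial(mean_correct, mean_incorrect, sd_total, p):
    """Point-biserial correlation: ((Mp - Mq) / St) * sqrt(p * q), with q = 1 - p."""
    q = 1 - p
    return (mean_correct - mean_incorrect) / sd_total * sqrt(p * q)


if __name__ == "__main__":
    # Values from the example: Mp = 30, Mq = 45, St = 8.29, p = q = .50
    r = point_biserial(30, 45, 8.29, 0.50)
    print(round(r, 2))
```

Under that assumed formula the coefficient is strongly negative (about −0.90): students who answered the item correctly scored lower on the whole test than those who answered it incorrectly, which would flag the item for review.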
43. 50. Types of Correlation Coefficients

    | Type of correlation coefficient | Types of scales |
    |---|---|
    | Pearson product-moment | Both scales interval (or ratio) |
    | Spearman rank-order | Both scales ordinal |
    | Phi | Both scales naturally dichotomous (nominal) |
    | Biserial | One scale artificially dichotomous (nominal), one scale interval (or ratio) |
    | Point-biserial | One scale naturally dichotomous (nominal), one scale interval (or ratio) |
    | Gamma | One scale nominal, one scale ordinal |
44. 51. Correction for guessing
    - A student has taken a test of 100 four-choice items. As he has no knowledge, he marks choice A for all the items, getting 25 items right and 75 wrong. His true (corrected) score is:
    - Corrected score $= 25 - \frac{75}{4 - 1} = 25 - 25 = 0$
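
The correction can be written as a small function: corrected score = right − wrong / (options − 1). This is a sketch with assumed parameter names; the figures are those of the example (25 right, 75 wrong, 4 choices per item).

```python
def corrected_score(right, wrong, options):
    """Correction for guessing: right - wrong / (options - 1)."""
    if options < 2:
        raise ValueError("an item needs at least two options")
    return right - wrong / (options - 1)


if __name__ == "__main__":
    # 100 four-choice items answered blindly: 25 right, 75 wrong.
    print(corrected_score(25, 75, 4))
```

Omitted items count as neither right nor wrong, so a test taker who leaves items blank is not penalized by the formula.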
45. 52. Some other factors affecting a person’s score:
    - Practice effect
    - Coaching effect
    - Ceiling effect
    - Test compromise
    - Test method
    - Test taker’s characteristics
46. 53. In a normal distribution, what percentage of scores fall between the mean and one standard deviation?
    - a. 35% b. 50% c. 68% d. 95%
    - A test has been given to 100 students. Twenty students have obtained the score of 50. What is the percentage of this score?
    - a. 10 b. 15 c. 20 d. 30
    - $P_x = \frac{f}{N} \times 100 = \frac{20}{100} \times 100 = 0.20 \times 100 = 20$
47. 54. A test is administered to 100 students. The cumulative frequency of the score of 50 is 40. How many students have scores below the score of 50?
    - a. about 25 b. about 20 c. about 40 d. about 30
    - $N = 100$, $cf = 40$
    - $P = 100 \times \frac{cf}{N} = 100 \times \frac{40}{100} = 40$
48. 55. In a test, eight of the students obtained a score of 85. This score has the highest frequency. What is the label used for this score?
    - a. Mean b. Mode c. Median d. Range
49. 56. Fry’s Readability Graph: Directions for Use
    - Randomly select three 100-word passages from a book or an article.
    - Plot the average number of syllables and the average number of sentences per 100 words on the graph to determine the grade level of the material.
    - Choose more passages per book if great variability is observed, and conclude that the book has uneven readability.
    - Few books will fall into the solid black area, but when they do, grade level scores are invalid.
50. 57. Additional Directions
    - Randomly select three sample passages and count exactly 100 words, beginning with the beginning of a sentence. Don't count numbers. Do count proper nouns.
    - Count the number of sentences in the hundred words, estimating the length of the fraction of the last sentence to the nearest 1/10th.
    - Count the total number of syllables in the 100-word passage.
    - Enter the graph with the average sentence length and number of syllables; plot a dot where the two lines intersect. The area where the dot is plotted will give you the approximate grade level.
    - If a great deal of variability is found, putting more sample counts into the average is desirable.
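
The averaging step in these directions can be sketched as below. The counts in `SAMPLES` are hypothetical, given as (syllables, sentences) per 100-word passage, and the function name is an assumption. The code only produces the two coordinates; the grade level still has to be read off Fry's graph, which is not reproduced here.

```python
def fry_coordinates(samples):
    """Average syllable and sentence counts over the 100-word samples."""
    if not samples:
        raise ValueError("at least one sample passage is required")
    n = len(samples)
    avg_syllables = sum(syllables for syllables, _ in samples) / n
    avg_sentences = sum(sentences for _, sentences in samples) / n
    return avg_syllables, avg_sentences


if __name__ == "__main__":
    # Hypothetical counts for three passages: (syllables, sentences to the nearest 1/10)
    SAMPLES = [(124, 6.6), (141, 5.5), (158, 6.8)]
    syllables, sentences = fry_coordinates(SAMPLES)
    print(round(syllables, 1), round(sentences, 1))
```

If the individual counts vary widely, as the directions note, more passages can simply be appended to the list before averaging.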