Measurement, Testing & Evaluation


Major concepts and terminology in second language testing

Published in: Education, Business, Technology

  1. 1.
  2. 2. Measurement: Fundamental Concepts & Preliminaries
  3. 3. Importance of Testing
        In educational situations:
        - To determine the progress of students
        - To ascertain the achievement of educational objectives
        - To make sound decisions based on evaluation
        - To know how much learning has taken place
        Teaching and testing relationship:
        - Testing at the service of teaching
        - Washback (backwash) effect: positive or negative
  4. 4. Concepts & Terms
        Test: A procedure designed to elicit certain behavior from which one can make inferences about certain characteristics of an individual.
        Assessment: An ongoing process and a kind of measurement that encompasses a wider domain than a test and is carried out in direct and indirect ways.
  5. 5. Concepts & Terms
        Measurement: The process of quantifying individuals' characteristics according to specific rules and procedures.
        Evaluation: The systematic gathering of information for the purpose of making decisions.
        Qualitative vs. quantitative evaluation
  6. 6. Teaching–Assessment Relation
  7. 7.
  8. 8. 1. Non-test, non-measure evaluation
  9. 9. Qualitative description of students' performance
  10. 10. 2. Non-test measure for evaluation
  11. 11. Teacher's ranking for assigning grades
  12. 12. 3. Test for evaluative purposes
  13. 13. Achievement testing
  14. 14. 4. Test for non-evaluative purposes
  15. 15. Proficiency test for research
  16. 16. 5. Non-test measure for non-evaluative purposes
  17. 17. Assigning code numbers to subjects for research
  18. 18. Nominal Scale
        Not really a "scale", because it does not order objects along any dimension. It simply labels objects and gives the researcher the least information about participants.
        Examples of coding:
        - Gender: Male = 1, Female = 2
        - Religious affiliation: Catholic = 1, Protestant = 2, Jewish = 3, Muslim = 4, Other = 5
        Other nominal variables:
        - Yes/no responses
        - Hair colour
        - Marital status
        - Race
        - Political party affiliation
        - College major
        - Birthplace
        Nominal data are often generated by questionnaire studies that use closed, forced-choice questions, e.g. type of pet (cat, dog, rat, etc.).
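A minimal Python sketch of what nominal coding allows, using the religious-affiliation codes from the slide. The sample responses and the function names are invented for illustration. The point is that counting categories and finding the mode are meaningful operations, while averaging the codes would not be.

```python
from collections import Counter

# Codes from the slide: the numbers are labels only, not quantities.
AFFILIATION = {1: "Catholic", 2: "Protestant", 3: "Jewish", 4: "Muslim", 5: "Other"}

# Hypothetical questionnaire responses (invented for illustration).
responses = [1, 2, 2, 4, 5, 2, 1, 3]


def frequency_table(codes, labels):
    """Count how many respondents fall into each labelled category."""
    counts = Counter(codes)
    return {labels[code]: counts.get(code, 0) for code in labels}


def modal_category(codes, labels):
    """Return the most frequent category, the only 'average' valid for nominal data."""
    code, _ = Counter(codes).most_common(1)[0]
    return labels[code]


table = frequency_table(responses, AFFILIATION)
mode = modal_category(responses, AFFILIATION)
print(table)
print(mode)
```

Swapping the codes (say, Catholic = 5 and Other = 1) would change a mean of the codes but leave the frequency table and the mode untouched, which is why only the latter are used here.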
  19. 19. Ordinal Scale
        Numbers are used to place objects in order, but they carry no information about the differences (intervals) between points on the scale.
        Examples:
        - Symptoms of depression from a psychiatric assessment: None = 0, Mild = 1, Moderate = 2, Severe = 3
        - Ranking students according to frequency of spelling errors
        - Scores on a Likert questionnaire: Strongly agree = 5, Agree = 4, No opinion = 3, Disagree = 2, Strongly disagree = 1
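The ordinal examples above can be sketched in Python as follows. The student names, error counts, and Likert responses are hypothetical, and the ranking helper assumes there are no tied scores. The sketch shows that ranks keep the order of students but discard the size of the gaps between them, and that the median is a safe summary for Likert codes.

```python
from statistics import median

# Hypothetical spelling-error counts per student (invented for illustration).
errors = {"Ali": 3, "Sara": 7, "Reza": 1, "Mina": 12}


def rank_by_errors(error_counts):
    """Rank 1 = fewest errors. Assumes no ties, to keep the sketch short."""
    ordered = sorted(error_counts, key=error_counts.get)
    return {name: position for position, name in enumerate(ordered, start=1)}


ranks = rank_by_errors(errors)

# Likert codes as on the slide: 5 = Strongly agree ... 1 = Strongly disagree.
likert_responses = [5, 4, 4, 2, 1, 4, 3]
typical_response = median(likert_responses)

print(ranks)
print(typical_response)
```

Sara and Mina are one rank apart although their error counts differ by five, while Reza and Ali are also one rank apart with a difference of only two. The ranks alone cannot recover those distances.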
  20. 20. Interval Scale
        An interval scale is one on which equal intervals between objects represent equal, meaningful differences.
  21. 21. Determining scores on a grammar test
        A 10-degree difference has the same meaning anywhere along the scale.
  22. 22. Ratio Scale
        Ratio scales have a true zero point, so ratios between values are meaningful.
        Physical scales of:
        - Time
        - Length
        - Weight
        - Speed
        - Absolute temperature (Kelvin scale)
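A short Python sketch contrasting interval and ratio scales with temperature. The choice of Celsius as the interval scale and the two sample readings are assumptions made for illustration; the slides mention only a "10-degree difference" and the Kelvin scale. Differences agree on both scales, but only the Kelvin ratio is physically meaningful because Kelvin has a true zero.

```python
def celsius_to_kelvin(celsius):
    """Convert a Celsius reading to Kelvin (absolute temperature)."""
    return celsius + 273.15


# Hypothetical readings (invented for illustration).
cold_c, warm_c = 10.0, 20.0
cold_k, warm_k = celsius_to_kelvin(cold_c), celsius_to_kelvin(warm_c)

# Interval property: a 10-degree difference is the same size on both scales.
diff_c = warm_c - cold_c
diff_k = warm_k - cold_k

# Ratio property: only meaningful where zero means "none of the quantity".
ratio_c = warm_c / cold_c  # numerically 2.0, but 20 °C is not "twice as hot" as 10 °C
ratio_k = warm_k / cold_k  # about 1.035, the physically meaningful ratio

print(diff_c, round(diff_k, 2))
print(ratio_c, round(ratio_k, 3))
```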
  23. 23. The categories of the variable:
  24. 24. Test Genres
        Test battery: A group of tests standardized on the same population to yield comparable results and to produce a single score.
        Contrasting test types:
        - Traditional vs. computer-adaptive
        - Discrete-point vs. global/integrative
        - Pragmatic vs. functional/communicative
        - Norm-referenced vs. criterion-referenced
        - Direct vs. indirect
        - Subjective vs. objective
        - Summative vs. formative
        - Power vs. speed
  25. 25. NRT vs. CRT
        (CRT = criterion-referenced test; NRT = norm-referenced test)
        Purpose
        - CRT: To determine whether each student has achieved specific skills or concepts. To find out how much students know before and after instruction.
        - NRT: To rank each student with respect to the achievement of others in broad areas of knowledge. To discriminate between high and low achievers.
        Content
        - CRT: Measures specific skills making up a designated curriculum and identified by teachers and curriculum experts. Each skill is expressed as an instructional objective.
        - NRT: Measures broad skill areas sampled from a variety of textbooks, syllabi, and the judgments of curriculum experts.
        Item characteristics
        - CRT: Each skill is tested by at least four items to obtain an adequate sample of performance and to minimize the guessing effect. The items that test any given skill are parallel in difficulty.
        - NRT: Each skill is usually tested by fewer than four items. Items vary in difficulty. Selected items show high discrimination indexes.
        Score interpretation
        - CRT: Test takers are compared with a preset standard for acceptable achievement. The performance of other examinees is irrelevant. A student's score is usually expressed as a percentage. Student achievement is reported for individual skills.
        - NRT: Test takers are compared with other examinees and assigned a score, usually expressed as a percentile, a grade-equivalent score, or a stanine. Student achievement is reported for broad skill areas, although some norm-referenced tests do report student achievement for individual skills.
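The two kinds of score interpretation described above can be sketched in Python. Everything numeric here is hypothetical: the 80% cutoff, the 40-point maximum, and the norm-group scores are invented for illustration, and the percentile rank uses the common convention of counting half of any tied scores. The same raw score is read against a preset standard (CRT) and against other examinees (NRT).

```python
def percentage_score(raw_score, max_score):
    """CRT-style score: share of the maximum, as a percentage."""
    return 100.0 * raw_score / max_score


def meets_criterion(raw_score, max_score, cutoff_percent=80.0):
    """CRT decision: compare with a preset standard; other examinees are irrelevant.

    The 80% cutoff is a hypothetical standard, not one given in the slides.
    """
    return percentage_score(raw_score, max_score) >= cutoff_percent


def percentile_rank(raw_score, group_scores):
    """NRT-style score: percentage of the norm group scoring below raw_score,
    counting half of the scores tied with it."""
    below = sum(1 for score in group_scores if score < raw_score)
    tied = sum(1 for score in group_scores if score == raw_score)
    return 100.0 * (below + 0.5 * tied) / len(group_scores)


# Hypothetical norm group and examinee (invented for illustration).
norm_group = [12, 15, 18, 20, 22, 25, 27, 30, 33, 36]
raw, maximum = 33, 40

crt_percent = percentage_score(raw, maximum)
crt_pass = meets_criterion(raw, maximum)
nrt_percentile = percentile_rank(raw, norm_group)

print(crt_percent, crt_pass, nrt_percentile)
```

If every score in the norm group rose by five points, the percentile rank of a raw score of 33 would drop, but the criterion-referenced decision would stay the same.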
  26. 26. Test Items
        Alternate-response items:
        - True / False
        - Yes / No
        - Agree / Disagree
        - Right / Wrong
        Fixed (closed-ended) response items:
        - Multiple-choice
        - Matching
        Free (open-ended) response items:
        - Short answer
        - Gap-fill
        - Essay
  27. 27. Teacher-made vs. Standardized Tests
        Teacher-made (classroom) tests: Small-scale classroom tests, generally prepared, administered, and scored by one teacher.
        Standardized tests: Tests with fixed content, constant administration and scoring procedures, and statistically acceptable characteristics.
        Differences between teacher-made and standardized tests:
  28. 28. Administration & scoring
  29. 29. Content sampling
  30. 30. Test construction
  31. 31. Norms & standards
  32. 32. Purpose and use
        Self-assessment (true/false)
        1. A test refers to a standard set of items to be answered.
        2. Evaluation uses both tests and informal pieces of evidence for making a value judgment and decision.
        4. Measurement refers to any device for obtaining information in a quantitative manner.
        5. If a person knows how to teach, he may not necessarily be able to judge the ability of his pupils.
        Answers as listed on the slide: T, F, T, T
        Mohd. Pazhouhesh
  33. 33. 6. Educational decisions can be made without measurement or evaluation.
        7. Summative evaluation involves the use of tests and quizzes for the purpose of determining the effectiveness of instructional programs.
        Answers as listed on the slide: F, T
  34. 34. The process of gathering information to make proper decisions is called ----------.
        a. measurement
        b. testing
        c. evaluation
        d. examination
        The subjective judgment of a teacher about a student's performance is a kind of ---------- evaluation.
        a. quantitative
        b. standard
        c. qualitative
        d. comprehensive