By Miguel Angel Carranza, MsE.
 
Quotes on Evaluation
“It cannot be denied that a great deal of language testing is of very poor quality... they (the tests) often fail to measure accurately whatever it is they are intended to measure” (Hughes, 2000).
“Many beginning teachers believe that their most important or only responsibility is to select effective instructional procedures” (Sax, 2002).
 
Characteristics of a Good Test
1. Validity: the extent to which a test measures what it is supposed to measure; the truthfulness or accuracy of a test score or of the interpretation of an experiment. How should a reading or listening test be designed? What does the TOEFL evaluate?
2. Reliability (consistency): the extent to which individual differences are measured consistently. Does a classroom, its conditions, or the environment affect students' results? E.g., a marching band… Health or personal problems: a fever. Taking the listening part in a noisy place? With poor equipment? Being drunk or sober?
3. Practicality: financial limitations, time constraints, ease of administration, and ease of scoring and interpretation.
 
 
 
Types of Tests
A) Placement test
B) Diagnostic test
C) Progress or achievement test*
D) Proficiency test
__ designed to measure how much a student has learned at a given moment in a course.
__ designed to get a general picture of a student’s knowledge and ability.
__ designed to measure a student’s strengths and weaknesses in order to improve things.
__ designed to place new students in a course according to their level of language competence.
But why do we evaluate?
To measure students' learning.
To measure teaching effectiveness, and to improve it!
To provide feedback to students, to teachers, and to parents!
To diagnose students' weaknesses and strengths.
To identify which students are in need of remedial or advanced work.
“To teach without evaluation is a contradiction in terms” (Sax, 1997).
Key Terms in Evaluation
Test: a task or series of tasks, usually a paper-and-pencil procedure, used to obtain systematic observations about students' performance.
Measurement: the process that assigns numbers to attributes or characteristics of persons, objects, or events according to explicit formulations or rules. More quantitative.
Assessment vs. Evaluation
Assessment: the process of collecting, synthesizing, and interpreting information to aid in decision making. More qualitative.
Evaluation: the process of determining the value or worth of a program, course, or other initiative, toward the ultimate goal of making decisions about adopting, rejecting, or revising the innovation.
Evaluation is the more inclusive term, often making use of assessment data in addition to many other data sources.
Discrete-Point Testing vs. Integrative Testing
Discrete-point testing refers to measuring one element at a time, item by item: a series of items, each testing a particular grammatical structure.
Integrative testing requires the candidate to combine many language elements in the completion of a task.
Discrete-point testing will be indirect, and integrative testing will be direct.
Direct and Indirect Test Items
Direct items: an item is direct if it asks candidates to perform the communicative skill that is being tested (for example, through a composition or an oral interview).
Indirect items: these measure a student's knowledge and ability by getting at what lies beneath their receptive and productive skills (controlled items, for example the multiple-choice type).

Evaluation.2011intro
