Norm-Referenced & Criterion-Referenced Tests
© Fariba Chamani, 2015
Testing in Language Programs
by James Dean Brown (2005)
Chapter one: Types and Uses of Language Tests
Different Types of Tests
1. Norm-Referenced Tests
• Measure global language abilities.
• Compare students’ performance with that of other students.
2. Criterion-Referenced Tests
• Measure specific instructional objectives.
• Compare students’ performance with a specific criterion (objectives or standards).
NRT & CRT
NRT
1. Administrators
2. Proficiency tests (TOEFL)
3. Percentile
4. Normal distribution
5. Unpredictability of form & content
CRT
1. Teachers
2. Achievement tests
3. Percentage
4. Criterion (objectives, standards)
5. Predictability of form & content
NRT & CRT Differences
1. Type of Interpretation
2. Type of Measurement
3. Purpose of Testing
4. Distribution of Scores
5. Test Structure
6. Knowledge of Questions
Type of Interpretation
NRT: Relative
A student’s performance is compared with that of
other students, in percentile terms.
CRT: Absolute
A student’s performance is compared only with the
amount, or percentage, of material learned.
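The contrast between the two interpretations can be made concrete with a little arithmetic. The sketch below is illustrative only: the raw scores, the 40-item test length, and the function names are hypothetical, and percentile rank is computed with one common definition (the percentage of scores in the group that fall strictly below the student's score).

```python
def percentile_rank(score, group_scores):
    """NRT-style (relative) result: percentage of the group scoring below `score`."""
    below = sum(1 for s in group_scores if s < score)
    return 100.0 * below / len(group_scores)


def percentage_correct(items_correct, items_total):
    """CRT-style (absolute) result: percentage of the tested material answered correctly."""
    return 100.0 * items_correct / items_total


# Hypothetical raw scores of ten students on a 40-item test.
group_scores = [12, 15, 18, 20, 22, 25, 27, 30, 33, 35]
student_score = 30

print("Percentile rank:", percentile_rank(student_score, group_scores))
print("Percentage correct:", percentage_correct(student_score, 40))
```

The same raw score yields two different numbers: the percentile rank changes whenever the comparison group changes, while the percentage depends only on the student's own answers.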
Type of Measurement
NRT: Measures general language
abilities or proficiency
CRT: Measures specific, objective-based language points
Purpose of Testing
NRT: Spreading students out along a
continuum of general abilities or
proficiency
CRT: Assessing the amount of material
learned by each student
Distribution of Scores
NRT: Normal distribution of scores
around the mean
CRT: Varies, often non-normal
Test Structure
NRT: A few long subtests with a
variety of item contents
CRT: A series of short, well-defined subtests
with similar item contents
Knowledge of Questions
NRT: Students have little or no idea what
content to expect in test items
CRT: Students know exactly what
content to expect in test items
Matching Tests to Decision
Purposes
Basic Decisions:
Proficiency, Placement, Achievement,
Diagnostic
Decision Purposes:
Program level, classroom level
Program Level Decisions
1. Program Level Proficiency Decisions
2. Program Level Placement Decisions
1. Program Level Proficiency
Decisions
• Proficiency tests assess the general knowledge or
skills required for entry into a group of similar
institutions.
• Proficiency decisions require knowing each student’s
general level of proficiency in comparison with other
students, so that the students who meet the standards
of a specific institution can be admitted.
Thus
Proficiency tests should be norm-referenced.
Fairness in Program Level
Proficiency Decisions
Since proficiency decisions can have a drastic effect on
students’ lives, the issue of fairness must be considered
with extreme care.
• Proficiency tests may seem unfair because of the
arbitrary way they are handled in some settings, but
they are often necessary.
• Proficiency tests may be unfair if they are used to
make comparisons among different programs
(Solution = program-fair tests).
2. Program Level Placement
Decisions
• Placement decisions are intended to group students of
similar ability levels together so that classes are
homogeneous.
• To do so, students’ knowledge or skill levels must be
assessed relative to those of other students.
Thus
NRT is needed for program level placement decisions.
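A minimal sketch of why placement relies on relative standing: students are ranked by score and then split into consecutive groups. The equal-sized split, the student names, the scores, and the level labels are all assumptions made for illustration; a real program would set its own cut points.

```python
def place_into_levels(scores, level_names):
    """Rank students by score and split them into consecutive, roughly equal groups.

    `scores` maps student name to placement-test score; `level_names` lists
    the levels from lowest to highest. Both are hypothetical inputs.
    """
    ranked = sorted(scores.items(), key=lambda pair: pair[1])
    # Ceiling division so every student falls into one of the listed levels.
    group_size = -(-len(ranked) // len(level_names))
    placement = {}
    for index, (name, _score) in enumerate(ranked):
        placement[name] = level_names[index // group_size]
    return placement


# Hypothetical placement-test scores.
scores = {"Ana": 41, "Ben": 78, "Chen": 55, "Dara": 90, "Eli": 62, "Fay": 30}
levels = ["beginner", "intermediate", "advanced"]

for student, level in place_into_levels(scores, levels).items():
    print(student, "->", level)
```

Each student's level depends on where the score falls among the other students' scores, not on how much of any particular syllabus has been mastered.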
Proficiency Tests & Placement
Tests
• Proficiency tests are very general, while
placement tests are tied to one particular
program.
• The purpose of the program, the range of
abilities within it, and the type of students
involved are factors that may make a
proficiency test inappropriate for placement
purposes.
Classroom Level Decisions
1. Classroom level Achievement Decisions
2. Classroom level Diagnostic Decisions
1. Classroom level Achievement
Decisions
• Achievement decisions are decisions about the
amount of learning that students have
accomplished.
• Achievement tests should be designed with
very specific reference to a particular course
and its objectives.
Thus
Achievement tests should be criterion-referenced.
2. Classroom level Diagnostic
Decisions
• Diagnostic decisions are made at the beginning or
middle of the term to foster achievement by building
on students’ strengths and eliminating their
weaknesses.
• Diagnostic tests are used to determine the degree
to which the specific instructional objectives of the
course have been accomplished.
Thus
Diagnostic tests should be criterion-referenced.
A Single Test to Fulfill All
Test Functions
• No single test can fulfill all four functions:
proficiency, placement, achievement, and
diagnosis.
Because:
1. The ranges of ability tested by the four
types of tests are very different.
2. The content varies in each type of test.
Differences in the Range of
Abilities
• NRT proficiency tests measure a very wide range
of abilities. For example, TOEFL measures from
virtually no English to native-like ability.
• Placement tests vary in the range of abilities they
assess, depending on the abilities handled by the
particular institution involved. The range of such
tests is therefore much narrower than that of TOEFL.
Then:
A TOEFL test cannot be used for placement decisions,
nor can a placement test stand in for TOEFL.
Differences in Variety of
Content
• The content of a proficiency test covers the whole
range of content types and ability levels found
across many institutions.
• The content of a placement test, by contrast,
should be narrower, to meet the needs of one
specific institution.
• The content of a diagnostic or achievement test
should be defined even more narrowly, to reflect the
exact content of the course as expressed in the
objectives for that course.
THANK YOU
