Criterion-referenced Language Testing
Chapter 1
Alternative paradigms
In this chapter
1. Definition of norm-referenced tests and criterion-referenced tests
2. The differences and similarities of norm-referenced and criterion-referenced approaches
3. The place of criterion-referenced tests in language testing theory and research
NRT definition
Any test that is primarily designed to disperse the performances of students in a
normal distribution based on their general abilities, or proficiencies, for purposes
of categorizing the students into levels or comparing students’ performances to
the performances of the others who formed the normative group.
The interpretation given an examinee’s score is called a relative decision because
it is understood as that examinee’s position relative to the scores of all of the
other examinees who took the test.
Criterion-referenced tests
They yield absolute decisions because each examinee's score is meaningful without reference to the scores of
the other examinees.
Glaser (1963): criterion-referenced measures indicate the content of the behavioral repertory, and the
correspondence between what an individual does and the underlying continuum of achievement. These
measures provide information as to the degree of competence attained by a particular student which is
independent of reference to the performance of others.
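The contrast between a relative and an absolute decision can be sketched with one score read two ways. This is only an illustration: the norm group, the 40-item test length, the 80 percent cut, and the function names are all hypothetical, and percentile rank is computed here with one common definition (percent of the group scoring strictly below).

```python
def percentile_rank(score, group_scores):
    """Percent of the norm group scoring strictly below `score` (one common definition)."""
    below = sum(1 for s in group_scores if s < score)
    return 100.0 * below / len(group_scores)


def percent_correct(score, n_items):
    """Percent of items answered correctly; needs no other examinees."""
    return 100.0 * score / n_items


# Hypothetical data: raw scores out of 40 for a ten-person norm group.
norm_group = [12, 15, 18, 20, 22, 25, 27, 30, 33, 36]
examinee_score = 30
n_items = 40
cut_percent = 80.0  # hypothetical mastery standard

relative = percentile_rank(examinee_score, norm_group)  # NRT-style reading
absolute = percent_correct(examinee_score, n_items)     # CRT-style reading
is_master = absolute >= cut_percent

print(f"Relative decision: scored above {relative:.0f}% of the norm group")
print(f"Absolute decision: {absolute:.0f}% of the content, master = {is_master}")
```

The relative reading changes if the norm group changes; the absolute reading depends only on the examinee's own responses and the chosen standard.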
Variations of CRTs
Domain-referenced tests
Hively, Patterson, and Page (1968): DRTs are based on item forms; item forms: the documents which
delineate a domain of student behaviors and content-area material to which test items are then referenced.
Osburn (1968): universe-defined test; any test constructed and administered in a way that an examinee's score
on the test provides an unbiased estimate of his score on some explicitly defined universe of item content.
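Osburn's idea of an unbiased estimate can be sketched with a small simulation. It assumes, purely for illustration, a universe of 200 items of which the examinee can answer 150, and forms built by simple random sampling; the numbers and the function name are hypothetical and not taken from the source.

```python
import random

random.seed(0)  # fixed seed so the sketch is repeatable

# Hypothetical universe: 200 items, of which this examinee can answer 150.
universe = [1] * 150 + [0] * 50
true_domain_score = sum(universe) / len(universe)  # 0.75


def sampled_test_score(universe, n_items):
    """Proportion correct on a test of n_items drawn at random from the universe."""
    items = random.sample(universe, n_items)
    return sum(items) / n_items


# Any single 20-item form may land above or below the domain score...
estimates = [sampled_test_score(universe, 20) for _ in range(2000)]
# ...but across many randomly drawn forms the estimates centre on it.
mean_estimate = sum(estimates) / len(estimates)

print(f"true domain score: {true_domain_score:.2f}")
print(f"mean of sampled estimates: {mean_estimate:.3f}")
```

The estimate is only unbiased here because items are drawn at random from an explicitly defined universe; hand-picking items would break that property.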
Mastery-referenced tests
Tests that link different mastery decisions to specific instructional contexts with specific instructional
objectives that are not necessarily related to any external domain of knowledge.
The differences among the CRT applications tend to focus on types of sampling generalizability:
whether scores are generalized to a domain, to an instructional set of objectives, to a mastery decision, etc.
Objectives-referenced tests
Are constructed so that subsets of the items measure the specific objectives of a course, program of
study, or other clearly delineated subject-matter area.
They can be CRTs if: a) objectives are written to define a domain; b) items are representative samples
of behavior from this domain.
Differences and similarities between NRTs and CRTs
in educational settings
Why were criterion-referenced tests developed?
CRTs developed in response to problems, or weaknesses, that were perceived in the pervasive
norm-referenced testing of the day. The problems with NRTs:
1. Teaching/testing mismatches.
2. Lack of instructional sensitivity.
3. Lack of curricular relevance.
4. Restriction to the normal distribution.
5. Restriction to items that discriminate.
Teaching/testing mismatches
In cases where large-scale, standardized examinations are used, material tested in the
examination may not be directly related to the teaching going on at the particular
institution involved. Such mismatches may arise because of the general nature of the
material that is typically tested on an NRT or because the content of the test is not
directly related to the curriculum at the institution.
Lack of instructional sensitivity
Because of their general and abstracted nature and putative global applicability across a
variety of instructional settings, NRTs are not suited to measuring the specific learning
points and skills developed in a particular program. As a result, NRTs cannot be
expected to measure the amount of knowledge or skill that a student has within a
well-defined content area. Furthermore, NRTs cannot be expected to be effective for
diagnosis of deficiencies with reference to particular courses or programs.
Lack of curricular relevance
Because they can cause teaching/testing mismatches and generally lack sensitivity to
instruction, some educators feel that NRTs are not effective for evaluating the effects
of curriculum change on student achievement. Thus, NRTs are not felt to be
particularly well-suited to assessing the strengths and weaknesses of a given program,
or for suggesting useful areas of instructional amelioration in a specific program, or for
comparing the relative strengths and weaknesses of different language programs.
Restriction to the normal distribution
NRTs are designed, statistically analyzed, and revised with the purpose of creating a
normal distribution of scores. Thus, one of the first issues that must be addressed in
analyzing NRT results is the degree to which the distribution of scores is normal.
However, if all of the students know all of the material, a test that reflects that
knowledge would be desirable. On such a test, a normal distribution of scores could not
reasonably be expected to appear.
Restriction to items that discriminate
To create an NRT, those items are selected that about half of the students cannot
answer correctly on average. Thus, the focus of the test is on content that the students
(or at least 50 percent of them) do not know rather than on what they have learned (as
on a CRT). As a result, test designers sometimes have a tendency to select items simply
because they discriminate well between high-achieving and low-achieving students
rather than because the items are related to the curriculum or anything that the students
are learning.
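The selection rule described above can be sketched with two classical item statistics. The slides do not give formulas, so this assumes the conventional definitions: item facility as the proportion answering correctly, and item discrimination as facility in the top-scoring group minus facility in the bottom-scoring group. The response matrix and the group size of two are hypothetical.

```python
# Rows are examinees, columns are items; 1 = correct, 0 = incorrect (hypothetical data).
responses = [
    [1, 1, 1, 1],
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 1, 0],
    [1, 0, 0, 0],
    [1, 0, 0, 0],
]


def item_facility(rows):
    """Proportion of examinees answering each item correctly."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]


def item_discrimination(rows, group_size):
    """Facility in the top-scoring group minus facility in the bottom-scoring group."""
    ranked = sorted(rows, key=sum, reverse=True)
    upper, lower = ranked[:group_size], ranked[-group_size:]
    return [u - l for u, l in zip(item_facility(upper), item_facility(lower))]


facility = item_facility(responses)
discrimination = item_discrimination(responses, group_size=2)

for i, (f, d) in enumerate(zip(facility, discrimination), start=1):
    print(f"item {i}: facility = {f:.2f}, discrimination = {d:.2f}")
```

In this toy data the first item, which every examinee answers correctly, has zero discrimination. Under NRT-style selection it would tend to be dropped, even if it reflects exactly what was taught.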
CRTs are different
CRTs can be used to circumvent all of the complaints mentioned previously.
CRTs can be expected to have these characteristics:
Emphasis on teaching/testing matches; Focus on instructional sensitivity; Curricular relevance;
Absence of normal distribution restrictions; No item discrimination restriction.
Comparisons between NRTs and CRTs
Hudson and Lynch (1984): relative standing and absolute standing
NRM: broad, less descriptive indication of relative standing
CRM: gaps in coverage, more descriptive information, absolute standing
Fundamental distinctions between CRTs and NRTs
NRTs and CRTs both:
Require specification of the achievement domain to be measured; Require a relevant and representative
sample of test items; Use the same types of test items; Use the same rules for item writing (except for
item difficulty); Are judged by the same qualities of goodness (validity and reliability); Are useful in
educational measurement.
Davies' principles
1. CRTs cannot be constructed in a completely separate way from NRTs without "the usual canons of
item discreteness and discrimination".
Counter claim: CRT items are certainly concerned with discriminating between very different features
than those in NRT development.
Davies' principles (continued)
2. Because teachers are concerned with very small groups of learners, what they require is a
criterion-referenced use of a norm-referenced test that "does not discriminate greatly among their students but
which does establish an adequate dichotomy… between plus success and minus success". Further, for
every CRT there must be a population for whom the tests could be norm-referenced.
Counter claim: this principle is on the face of it an inverse statement of what goes on in classrooms.
Davies' principles (continued)
3. Criterion-referencing is linked to exercises while norm-referencing is linked to tests.
Counter claim: this proposed definition of "test" and "exercise" is not found elsewhere and is only argued
by Davies himself.
The place of CRTs in language testing theory and
research
Four questions are raised
1. What makes language testing special?
a. Language and language acquisition are different in nature from other educational content areas such
as mathematics, etc.
b. Language is interactional;
c. Language is situated.
2. What is language proficiency?
There are different perspectives on and definitions of this complicated construct.
3. What is communicative language ability?
Communicative competence was proposed by Hymes (1972) and Campbell and Wales
(1970) as a broader view of Chomsky’s linguistic competence.
Several models of communicative competence were proposed. These models differ
mostly in terms of what they include. The differing views of communicative
competence and communicative language ability produce differing views of how
language tests can best assess the communicative abilities of examinees.
3. What is communicative language ability? (continued)
Morrow (1979) asserts that language tests must reflect the following features of language use:
1. Language is used in interaction.
2. Interactions are usually unpredictable.
3. Language has a context.
4. Language is used for a purpose.
5. There is a need to examine performance.
6. Language is authentic, not simplified.
7. Language success is behavior based.
To satisfy these requirements, a test should:
1. Be criterion-referenced against the operational performance of a set of language tasks.
2. Be concerned with validating itself against those criteria and be concerned with content,
construct and predictive validity, not concurrent validity.
3. Rely on modes of assessment which are not directly quantitative, but which are instead qualitative.
4. Subordinate reliability to face validity.
3. What is communicative language ability? (continued)
Bachman (1990) refers to the 2 primary approaches to identifying what is meant by
‘‘authentic’’ within language assessment:
Real-life approach (represented also by Morrow’s concerns) focuses on the degree to
which a test represents language performance in non-testing situational use.
Interactional ability approach is concerned more with the distinguishing characteristics
of communicative language use.
3. What is communicative language ability? (continued)
Cziko (1984) examines several models of communicative language ability and addresses 3 problematic issues in
the development and interpretation of the models of communicative competence:
1. Correlational analyses are problematic. If two skills have a low correlation coefficient, they would be
considered relatively independent. However, a low correlation might also be the result of little within-skill
variation on one or both of the skill tests.
2. Variable language skill exposure in a group may lead to misleading interpretations.
3. Heterogeneous groups of subjects may show high within-skill variance in language proficiency while
homogeneous groups may show low within-skill variance.
Given these issues, he concludes that in order to understand models of communicative competence, research must
involve the use of CRTs.
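Cziko's first and third points can be illustrated with a small simulation of range restriction. The model is an assumption made only for this sketch: two skill scores that both depend on one shared ability plus independent noise, and a "homogeneous" group defined as examinees within a narrow ability band. None of the numbers come from the source.

```python
import random

random.seed(1)  # fixed seed so the sketch is repeatable


def pearson_r(xs, ys):
    """Pearson product-moment correlation of two equal-length lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5


# Hypothetical model: both skill scores depend on one shared ability plus noise.
people = []
for _ in range(2000):
    ability = random.gauss(0, 1)
    skill_a = ability + random.gauss(0, 0.5)
    skill_b = ability + random.gauss(0, 0.5)
    people.append((ability, skill_a, skill_b))

# Heterogeneous group: everyone. Homogeneous group: a narrow ability band.
narrow = [p for p in people if abs(p[0]) < 0.25]

r_full = pearson_r([p[1] for p in people], [p[2] for p in people])
r_narrow = pearson_r([p[1] for p in narrow], [p[2] for p in narrow])

print(f"heterogeneous group (n={len(people)}): r = {r_full:.2f}")
print(f"homogeneous group   (n={len(narrow)}): r = {r_narrow:.2f}")
```

The two skills are related in exactly the same way in both groups, yet the correlation is much weaker in the homogeneous group, so a low coefficient alone does not show that the skills are independent.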
4. What problems do CRT developers face?
Several practical questions:
1. How can item analysis be performed when: (a) no comparison group is designated as instructed or
uninstructed group; (b) no externally identified masters and non-masters are defined; or (c) when
mastery groups are defined and available?
2. How dependable are the decisions made on the basis of the test? How generalizable are the scores
and analyses to those of other examinees on other forms of the test?
3. How can a standard, or cut-point, be rationally set?
4. What advantages and disadvantages accrue from application of the statistical approaches provided
by NRT or CRT analyses?
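Questions 1 and 2 can be made concrete with a short sketch. The slides do not name any statistics, so two simple ones are assumed here: a difference index (item facility in an instructed group minus item facility in an uninstructed group) and the proportion of examinees classified the same way on two forms at a given cut score. All data, the cut score of 14, and the function names are hypothetical, and the sketch does not address how the cut-point itself should be set.

```python
def item_facility(rows):
    """Proportion of examinees answering each item correctly."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]


# Hypothetical responses (rows = examinees, columns = items; 1 = correct).
uninstructed = [[0, 0, 1], [0, 1, 1], [1, 0, 1], [0, 1, 0]]
instructed = [[1, 1, 1], [1, 0, 1], [1, 1, 1], [0, 0, 1]]

# Difference index: gain in facility from the uninstructed to the instructed group.
difference_index = [
    post - pre
    for pre, post in zip(item_facility(uninstructed), item_facility(instructed))
]


def decision_agreement(form_a, form_b, cut):
    """Share of examinees given the same master/non-master label on both forms."""
    same = sum((a >= cut) == (b >= cut) for a, b in zip(form_a, form_b))
    return same / len(form_a)


# Hypothetical total scores for five examinees on two forms, with a cut score of 14.
form_a = [18, 12, 15, 9, 20]
form_b = [17, 14, 13, 10, 19]
agreement = decision_agreement(form_a, form_b, cut=14)

print("difference index per item:", difference_index)
print("decision agreement:", agreement)
```

An item with a difference index near zero, like the second item here, shows no gain with instruction, which is the kind of information an NRT-style discrimination index is not designed to provide.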
thank you!

Brown&hudson,chap1
