This document discusses different types of language assessments including achievement tests, performance assessments, and proficiency tests. It describes how achievement tests measure curriculum-specific, memorized content while proficiency tests measure non-specific, novel responses. The document also discusses criterion-referenced versus norm-referenced scoring and how formative assessments are used for learning during a course while summative assessments are used for assigning grades at the end. Finally, different item types like constructed response, multiple choice, and matching are presented along with their advantages and disadvantages.
1. Aligning Assessment Type and Purpose with Use
National Chinese Language Conference 2013
David P. Ellis, PhD
National Foreign Language Center
University of Maryland
6. Achievement vs. Performance vs. Proficiency
              Content               Response
Achievement   Curriculum-Specific   Memorized/Rehearsed
Performance   Domain-Specific       Applied/Semi-Rehearsed
Proficiency   Non-Specific          Novel/Non-Rehearsed
7. Activity
Either alone or in pairs, classify the following assessments as achievement, performance, or proficiency.
8. Test Type Activity
Test                         Type          Test                Type
Driving test (road)                        SAT
Driving test (written)                     Semester final
Vocabulary quiz                            Writing placement
Role play (known prompt)                   Typing test
Role play (unknown prompt)                 Translation test
9. Test Type Activity Answers
Test                         Type          Test                Type
Driving test (road)          Performance   SAT                 Proficiency
Driving test (written)       Achievement   Semester final      Achievement
Vocabulary quiz              Achievement   Writing placement   Proficiency
Role play (known prompt)     Performance   Typing test         Performance
Role play (unknown prompt)   Proficiency   Translation test    Performance
14. Test Scoring Considerations
Criterion-Referenced: Absolute       Norm-Referenced: Relative
Test Reliability: Precision          Test Validity: Accuracy
Intra-Rater Reliability: Within      Inter-Rater Reliability: Across
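The absolute/relative contrast on this slide can be made concrete with a short sketch. Everything specific here is an assumption made for illustration and is not from the talk: the cohort scores, the cut score of 70, and the function names. The code reads the same raw score two ways, once against a fixed standard (criterion-referenced) and once against the other test takers (norm-referenced).

```python
from bisect import bisect_left

def criterion_referenced(score, cut_score=70):
    """Absolute interpretation: compare a score to a fixed standard.

    The cut score of 70 is a hypothetical value chosen for this example.
    """
    return "pass" if score >= cut_score else "fail"

def percentile_rank(score, group_scores):
    """Relative interpretation: percent of the group scoring strictly below."""
    ordered = sorted(group_scores)
    below = bisect_left(ordered, score)  # count of scores lower than `score`
    return 100.0 * below / len(ordered)

# Hypothetical cohort of ten test takers.
group = [55, 60, 62, 65, 68, 72, 75, 80, 88, 95]

print(criterion_referenced(68))    # fail
print(percentile_rank(68, group))  # 40.0
```

In this made-up cohort, a score of 68 fails the fixed standard yet still outranks 40% of the group. A stronger or weaker cohort would change the percentile rank but not the pass/fail decision, which is the practical difference between the two scoring approaches.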
15. Activity
Either alone or in pairs, classify the scoring method of the following assessments as either criterion-referenced or norm-referenced.
16. Test Scoring Activity
Test                               Scoring Type           Test                Scoring Type
Driving test                                              SAT
Oral Proficiency Interview (OPI)                          Semester final
Vocabulary quiz                                           Writing placement
Role play                                                 TOEFL
17. Test Scoring Activity Answers
Test                               Scoring Type           Test                Scoring Type
Driving test                       Criterion-Referenced   SAT                 Norm-Referenced
Oral Proficiency Interview (OPI)   Criterion-Referenced   Semester final      Depends
Vocabulary quiz                    Criterion-Referenced   Writing placement   Depends
Role play                          Depends                TOEFL               Norm-Referenced
20. Test Use (Decision Purpose)
Reason       Example
Diagnostic   MRI after head injury
Placement    Writing test for incoming ESL students
Progress     Pop quiz on last night’s reading material
Mastery      Driving test
Selection    Employment test
Prediction   SAT
25. Test Use (Decision Type)
       Formative                       Summative
What   Assessment FOR learning         Assessment OF learning
When   DURING a course/section         At the END of a course/section
Why    To INFORM subsequent teaching   To ASSIGN a score/grade
Who    Teacher, Self, Peers            Teacher
27. Test Design (Item Types)
Item Type              Advantage                                                    Disadvantage
Constructed response   Requires a novel response                                    Cannot be machine scored
Multiple choice        Increases ability to create reliable tests                   Does not require a novel response
True/False             Easy to create items                                         50-50 chance w/guessing
Check all that apply   Allows for numerous answer combinations                      Can be perceived as unfair
Cloze                  Can test multiple concepts (vocab) within a single context   Unclear what skill is being tested
Matching               Allows demonstration of pairwise relationships               Correct answers can be found by process of elimination
Categorization         Allows demonstration of semantic relationships               Scoring is complex
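The "50-50 chance" disadvantage listed for true/false items can be quantified with a small binomial calculation. The sketch below is illustrative only: the 10-item test length, the pass mark of 7, and the function names are assumptions, not figures from the talk. It compares blind guessing on two-option (true/false) items with blind guessing on four-option multiple-choice items.

```python
from math import comb

def expected_guess_score(n_items, n_options):
    """Average number of items answered correctly by blind guessing."""
    return n_items / n_options

def p_pass_by_guessing(n_items, n_options, pass_mark):
    """Probability of at least `pass_mark` correct answers by blind guessing.

    Assumes each item is independent and every option is equally likely.
    """
    p = 1.0 / n_options
    return sum(
        comb(n_items, k) * p**k * (1 - p) ** (n_items - k)
        for k in range(pass_mark, n_items + 1)
    )

# Hypothetical 10-item test with a pass mark of 7.
print(expected_guess_score(10, 2))             # 5.0  (true/false)
print(expected_guess_score(10, 4))             # 2.5  (four options)
print(round(p_pass_by_guessing(10, 2, 7), 4))  # 0.1719
print(round(p_pass_by_guessing(10, 4, 7), 4))  # 0.0035
```

Under these assumptions, a pure guesser passes the true/false version about 17% of the time but the four-option version well under 1% of the time, which is one reason longer tests or more options per item are used to offset guessing.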
Take advantage of the collective expertise in the room and have a discussion rather than a lecture. Nevertheless, this talk is oriented to experienced teachers who have limited formal training in assessment.
In the sciences, precision vs. accuracy (bullseye, scale)
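The scale analogy in this note, paired with the slide's mapping of reliability to precision and validity to accuracy, can be sketched numerically. The readings, the 100 g reference weight, and the function names below are all invented for illustration. Spread across repeated readings stands in for precision (reliability), and the distance of the average from the true value stands in for accuracy (validity).

```python
from statistics import mean, stdev

TRUE_WEIGHT = 100.0  # hypothetical known reference weight, in grams

# Hypothetical repeated readings of the same weight on two scales.
scale_a = [104.9, 105.0, 105.1, 105.0, 105.0]  # tightly clustered, off target
scale_b = [97.0, 103.0, 99.0, 101.0, 100.0]    # centered on target, scattered

def bias(readings, true_value):
    """Accuracy (validity analogue): distance of the average from the truth."""
    return mean(readings) - true_value

def spread(readings):
    """Precision (reliability analogue): sample standard deviation of readings."""
    return stdev(readings)

for name, readings in (("A", scale_a), ("B", scale_b)):
    print(name, round(bias(readings, TRUE_WEIGHT), 2), round(spread(readings), 2))
```

Scale A is precise but inaccurate: its readings barely vary, yet they sit about 5 g above the true weight. Scale B is accurate on average but imprecise. By the same analogy, a test can produce consistent scores without measuring the intended skill, so reliability alone does not establish validity.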