Chapter 11: Assessment of Academic Achievement with Multiple-Skill Devices
Achievement Tests 
• Achievement Tests 
– Norm-referenced 
• Allow for comparisons between students 
– Criterion-referenced 
• Allow for comparisons between individual students and a 
skill benchmark. 
• Why do we use achievement tests? 
– Assist teachers in determining skills students do and 
do not have 
– Inform instruction 
– Academic screening 
– Progress evaluation
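The difference between the two interpretations can be sketched in a few lines of Python. This is only an illustration: the function names, the norm-group scores, and the benchmark of 25 are all hypothetical, and percentile rank is computed here as the percentage of the norm group scoring below the student, which is one of several conventions in use.

```python
def norm_referenced_percentile(raw_score, norm_group_scores):
    """Norm-referenced view: where does the student stand relative to peers?

    Returns the percentage of the (hypothetical) norm group scoring
    strictly below the student's raw score.
    """
    below = sum(score < raw_score for score in norm_group_scores)
    return 100.0 * below / len(norm_group_scores)


def criterion_referenced_mastery(raw_score, benchmark):
    """Criterion-referenced view: did the student reach the skill benchmark?"""
    return raw_score >= benchmark


# Hypothetical data, for illustration only.
norm_group = [10, 12, 15, 18, 20, 22, 25, 28, 30, 35]
student_raw = 22

print(norm_referenced_percentile(student_raw, norm_group))
print(criterion_referenced_mastery(student_raw, benchmark=25))
```

The same raw score yields two different kinds of statement: one about standing among peers, the other about whether a fixed benchmark was met.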
Classifying Achievement Tests 
• Diagnostic achievement 
– High: Less efficient administration. Dense content and numerous items allow teachers to uncover specific strengths and weaknesses 
– Low: More efficient administration. Comparisons between students can be made, but there is very little power in determining strengths and weaknesses 
• Number of students who can be tested 
– High: Efficient administration. Typically only quantitative data are available 
– Low: Less efficient administration. Allows for more qualitative information about the student
Considerations for Selecting a Test 
• Four Factors 
– Content validity 
• What the test actually measures should match its intended 
use 
– Stimulus-response modes 
• Students should not be hindered by the manner of test 
administration or required response 
– Standards used in the state 
– Relevant norms 
• Does the student population being assessed match the 
population from which the normative data were acquired?
Tests of Academic Achievement 
• Peabody Individual Achievement Test (PIAT-R/NU) 
• Wide Range Achievement Test 4 (WRAT4) 
• Wechsler Individual Achievement Test 3 (WIAT-III)
Peabody Individual Achievement Test-Revised/Normative Update (PIAT-R/NU) 
• In general… 
– Individually administered; norm-referenced for K-12 students 
• Norm population 
– Most recent update was completed in 1998 
• Representative of each grade level 
– No changes to test structure
PIAT-R/NU 
• Subtests 
– Mathematics: 100 multiple-choice items assess students’ knowledge and application of math concepts and facts 
– Reading recognition: 100 multiple-choice items require students to match and name letters and words 
– General information: 100 questions presented orally. Content areas include social studies, science, sports, and fine arts 
– Reading comprehension: 81 multiple-choice items require students to select an appropriate answer following a reading passage 
– Spelling: 100 items ranging in difficulty from kindergarten (letter naming) to high school (multiple-choice following verbal presentation) 
– Written expression: Split into two levels. Level I assesses pre-writing skills and Level II requires story writing following a picture prompt
PIAT-R/NU 
• Scores 
– For all but one subtest (written expression), response 
to each item is pass/fail 
– Raw scores converted into: 
• Standard scores 
• Percentile ranks 
• Normal curve equivalents 
• Stanines 
– 3 composite scores 
• Total reading 
• Total test 
• Written language
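The derived scores listed above are all re-expressions of the same position in the norm distribution. The Python sketch below shows how they relate under the assumption of a normally distributed standard score with mean 100 and SD 15. It is not the PIAT-R/NU scoring procedure: published tests convert raw scores through norm tables in the manual, and the function name here is hypothetical.

```python
import math
from statistics import NormalDist


def derived_scores(standard_score, mean=100.0, sd=15.0):
    """Convert a standard score to other common derived scores.

    Assumes a normal distribution (an idealization; real tests use
    empirically derived norm tables).
    """
    z = (standard_score - mean) / sd

    # Percentile rank: percentage of the norm group expected to score lower.
    percentile_rank = NormalDist().cdf(z) * 100.0

    # Normal curve equivalent: mean 50, SD 21.06, limited to the 1-99 range.
    nce = min(99.0, max(1.0, 50.0 + 21.06 * z))

    # Stanine: nine bands, each half an SD wide, centered on 5.
    stanine = min(9, max(1, math.floor(2.0 * z + 5.5)))

    return {
        "z": z,
        "percentile_rank": percentile_rank,
        "nce": nce,
        "stanine": stanine,
    }


for score in (70, 100, 115):
    print(score, derived_scores(score))
```

Because every value is derived from the same z-score, reporting several of them adds convenience for different audiences rather than new information about the student.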
PIAT-R/NU 
• Reliability and Validity 
– Despite new norms, reliability and validity data 
are only available for the original PIAT-R (1989) 
– Previous reliability and validity data are likely 
outdated 
• Outdated tests may not be relevant in the current 
educational context
Wide Range Achievement Test 4 (WRAT4) 
• In general… 
– Individually administered 
– Takes 15-45 minutes to administer, depending on age (ages 5-94) 
– Norm-referenced, but covers a limited sample of behaviors in 4 content areas 
• Norm population 
– Stratified across age, gender, ethnicity, geographic 
region, and parental education
WRAT4 
• Subtests 
– Word Reading: The student names letters and reads words 
– Sentence Comprehension: The student is shown sentences and fills in missing words 
– Spelling: The student writes down words as they are read aloud 
– Math Computation: The student solves basic computation problems 
• Scores 
– Raw scores converted to: 
• Standard scores, confidence intervals, percentiles, grade equivalents, and stanines 
• Reading composite available 
• Reliability 
– Internal consistency and alternate-form data are sufficient for screening purposes 
• Validity 
– Performance increases with age 
– WRAT4 validity evidence is linked to other tests that have since been updated; additional evidence is necessary
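Confidence intervals tie the score and reliability points together: the less reliable a test, the wider the band around an obtained score. The sketch below uses the classical standard error of measurement, SEM = SD × sqrt(1 − reliability), with a symmetric interval around the obtained score. The reliability value of 0.91 is a made-up example, not a published WRAT4 coefficient, and the function name is hypothetical; test manuals may use a different interval method.

```python
import math
from statistics import NormalDist


def score_confidence_interval(standard_score, reliability, sd=15.0, level=0.95):
    """Return (lower, upper) bounds around an obtained standard score.

    Uses the classical standard error of measurement and a normal
    critical value. This is an illustrative simplification.
    """
    sem = sd * math.sqrt(1.0 - reliability)
    z_crit = NormalDist().inv_cdf(0.5 + level / 2.0)
    margin = z_crit * sem
    return standard_score - margin, standard_score + margin


# Hypothetical reliability coefficient, for illustration only.
low, high = score_confidence_interval(100, reliability=0.91)
print(f"95% interval: {low:.1f} to {high:.1f}")
```

With these assumed values the SEM is 4.5 points, so a score of 100 is better read as a band of roughly nine points on either side than as a single exact value.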
Wechsler Individual Achievement Test-Third Edition (WIAT-III) 
• General 
– Diagnostic, norm-referenced achievement test 
– Reading, mathematics, written expression, 
listening, and speaking 
– Ages 4-19 
• Norm Population 
– Stratified sampling was used across several common demographic variables: 
• Grade (pre-K through 12), age, race/ethnicity, sex, parent education, geographic region
WIAT-III 
• Subtests and scores 
– 16 subtests arranged into 7 domain composite 
scores and one total achievement score (structure 
provided on next slide) 
– Raw scores converted to: 
• Standard scores, percentile ranks, normal curve 
equivalents, stanines, age and grade equivalents, and 
growth scale value scores.
WIAT-III Subtests 
Composite: Subtests 
• Basic Reading: Word Reading; Pseudoword Decoding 
• Reading Comprehension and Fluency: Reading Comprehension; Oral Reading Fluency; Early Reading Skills 
• Mathematics: Math Problem Solving; Numerical Operations 
• Math Fluency: Math Fluency (Addition, Subtraction, and Multiplication) 
• Written Expression: Alphabet Writing Fluency; Spelling; Sentence Composition; Essay Composition 
• Oral Expression: Listening Comprehension; Oral Expression
WIAT-III 
• Reliability 
– Adequate reliability evidence 
• Split-half 
• Test-retest 
• Interrater agreement 
• Validity 
– Adequate validity evidence 
• Content 
• Construct 
• Criterion 
• Clinical Utility 
• Stronger reliability and validity evidence increases the relevance of information derived from the WIAT-III
Getting the Most Out of an Achievement Test 
• Helpful but not sufficient – most tests allow 
teachers to find an appropriate starting point 
• What is the nature of the behaviors being 
sampled by the test? 
– Need to seek out additional information 
concerning student strengths and weaknesses 
• Which items did the student excel on? Which did he or 
she struggle with? 
• Were there patterns of responding?
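One way to look for patterns of responding is to tally item-level results by the skill each item samples. The Python sketch below assumes the examiner has tagged each item with a skill label; the labels, the responses, and the function name are all hypothetical, and no published test supplies data in this form.

```python
from collections import defaultdict


def accuracy_by_skill(responses):
    """Summarize item responses as proportion correct per skill.

    `responses` is a list of (skill_label, answered_correctly) pairs,
    where the skill labels are assigned by the examiner.
    """
    counts = defaultdict(lambda: [0, 0])  # skill -> [correct, attempted]
    for skill, correct in responses:
        counts[skill][1] += 1
        if correct:
            counts[skill][0] += 1
    return {skill: c / n for skill, (c, n) in counts.items()}


# Hypothetical item-level record for one student.
item_responses = [
    ("addition", True),
    ("addition", True),
    ("subtraction with regrouping", False),
    ("subtraction with regrouping", False),
    ("subtraction with regrouping", True),
]

for skill, proportion in accuracy_by_skill(item_responses).items():
    print(f"{skill}: {proportion:.0%} correct")
```

A summary like this only points to where to look next; with so few items per skill, follow-up assessment is still needed before drawing conclusions.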
