DISCUSSION PAPER
Stuart R. Kahl, Ph.D., CEO | kahl.stuart@measuredprogress.org
Michael Nering, Ph.D., Assistant Vice President | nering.michael@measuredprogress.org
Michael Russell, Ph.D.,Vice President | mike@nimbletools.com
Peter D. Hofman, Vice President | hofman.peter@measuredprogress.org
www.measuredprogress.org | 800.431.8901 | 100 Education Way, Dover, NH 03820
Adaptive Testing, Learning Progressions,
and Students with Disabilities
May 17, 2011
©2011 Measured Progress. All rights reserved. | measuredprogress.org | 800.431.8901
Introduction
This paper responds to the legitimate concern that, during adaptive testing for accountability purposes, a student may not be afforded a sufficient opportunity to demonstrate the on-grade-level knowledge and skills he or she actually has. We summarize below several means of providing all students a valid opportunity to perform at grade level before they are presented with off-grade test content. We also discuss the implications of learning progressions for creating adaptive assessments.
For purposes of this discussion, we assume that sufficient
accommodations are embedded in adaptive testing
systems so that as many students as possible have access to the test content and the ability to respond to items. In other words, the absence of
accommodations is not an obstacle to students’ correctly
responding to grade-level test content. We do recognize
that this assumption will likely require modifications to
most, if not all, adaptive platforms currently available.
With that assumption in mind, three options—which
might be combined—could address the key issue:
• Using a stage- rather than an item-adaptive test design
• Developing multiple adaptive tests at finer grains of detail—the strand or even learning progression level rather than at the content area level
• Employing Monte Carlo simulations to anticipate a wide range of student performance scenarios, then using the results to refine the item pool characteristics and adaptive algorithm to achieve the desired objective
Adaptive Testing and Development
Options
The most commonly used adaptive tests are adaptive at
the total test level and use individual item difficulties to
zero in on an estimate of a student’s overall ability – the
total test score. Answering the first question correctly
leads to a harder second question. Answering that one
incorrectly would lead to the third question being an
easier one. As this process continues, an estimate within
a specified error tolerance eventually emerges. Note that with this type of adaptive test, item difficulty and the correctness of students’ answers typically drive the adaptive algorithm.
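To make these mechanics concrete, the following is a minimal sketch of item-level adaptation, assuming a simple Rasch (1PL) response model, a hypothetical item pool described only by difficulty values, and maximum-information item selection. Operational CAT engines are considerably more sophisticated (exposure control, content constraints, better ability estimators), so this is illustrative only.

```python
import numpy as np

def prob_correct(theta, b):
    """Rasch (1PL) probability of a correct response at ability theta for an item of difficulty b."""
    return 1.0 / (1.0 + np.exp(-(theta - b)))

def item_information(theta, b):
    """Fisher information of a Rasch item at ability theta; highest when b is near theta."""
    p = prob_correct(theta, b)
    return p * (1.0 - p)

def estimate_theta(difficulties, responses, grid=np.linspace(-4, 4, 161)):
    """Crude maximum-likelihood ability estimate over a grid (illustrative only)."""
    log_lik = np.zeros_like(grid)
    for b, u in zip(difficulties, responses):
        p = prob_correct(grid, b)
        log_lik += u * np.log(p) + (1 - u) * np.log(1 - p)
    return grid[np.argmax(log_lik)]

def next_item(pool_difficulties, administered, theta):
    """Select the unused item with the most information at the current ability estimate."""
    candidates = [i for i in range(len(pool_difficulties)) if i not in administered]
    return max(candidates, key=lambda i: item_information(theta, pool_difficulties[i]))
```

After each response, the ability estimate is updated and the next item is chosen near that estimate, which is why a few early wrong answers can pull subsequent items (and the final estimate) downward.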
However, poor performance on a few early items can
prematurely drive the next items and the ultimate ability
estimate too low. One way of reducing the chances of
this happening would be to use stage-adaptive testing
whereby the student takes a cluster of “on-grade” items before the next cluster of items is selected for the student. This approach still assumes the goal is a total
test score for a purpose such as statewide accountability.
And, again, item difficulty generally plays a key role in the
adaptive algorithm.
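A stage-adaptive (multistage) design can be sketched just as simply: the routing decision is made only after a whole cluster of on-grade items, based on the cluster score. The module names and cut points below are hypothetical.

```python
def route_next_module(cluster_score, n_items, cut_low=0.4, cut_high=0.75):
    """Route to the next module after a completed cluster of on-grade items.

    cluster_score: number of items answered correctly in the cluster.
    n_items: number of items in the cluster.
    cut_low / cut_high: hypothetical routing thresholds on proportion correct.
    """
    proportion = cluster_score / n_items
    if proportion < cut_low:
        return "easier_module"   # off-grade content only after a full on-grade cluster
    if proportion > cut_high:
        return "harder_module"
    return "on_grade_module"

# Example: 5 of 12 on-grade items correct keeps the student on grade for the next stage.
print(route_next_module(cluster_score=5, n_items=12))
```

Because the decision rests on a cluster rather than a single item, one or two early stumbles cannot by themselves route the student to off-grade content.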
The psychometric underpinning of adaptive testing as
described above is Item Response Theory, which assumes that every item measures a single underlying general ability, mathematical ability for instance. However,
students’ knowledge and skill in mathematics are not
so neatly ordered. For a variety of reasons, including
the content and quality of instruction, a student might
be higher performing in one area of mathematics than
in another. Separate adaptive tests in each area (tests
that are adaptive at the math strand level) could reveal
this. In other words, a student who overall might be pegged as below grade level could perform at grade level in one or more areas. Still, math strands (e.g.,
geometry and measurement, probability and statistics)
are quite broad. For this reason, consideration should be
given to tests that are adaptive at a still finer level, such as
learning progressions. In fact, if adaptive testing is to yield meaningful information that might be considered diagnostic, this might be the only appropriate approach. The finer
the level at which a test is adaptive, the more diagnostic
the results would be. A student profile of results from a
series of adaptive tests at the learning progression level
could show quite variable results across progressions for
a student, some revealing grade level proficiency, others
perhaps not. Such an approach would ensure that students
would be able to demonstrate their on-grade knowledge
and skills before receiving off-grade test content. And the
results on the individual adaptive tests in the series could
be combined to generate a total test score.
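As a toy illustration of such a profile, the sketch below combines per-progression ability estimates into a single total score. The progression names, scale, and equal weighting are assumptions chosen for illustration, not a recommended reporting model.

```python
# Hypothetical profile of results from a series of progression-level adaptive tests.
profile = {
    "place_value":          {"theta": 0.6,  "on_grade": True},
    "fraction_equivalence": {"theta": -0.9, "on_grade": False},
    "measurement":          {"theta": 0.2,  "on_grade": True},
}

def combined_score(profile, weights=None):
    """Weighted average of progression-level estimates (weights are illustrative)."""
    weights = weights or {name: 1.0 for name in profile}
    total_weight = sum(weights.values())
    return sum(weights[name] * result["theta"] for name, result in profile.items()) / total_weight

print(round(combined_score(profile), 3))  # one overall score built from the profile
```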
A different approach would be to study ahead of time the concern that students might take off-grade items before having an opportunity to demonstrate the knowledge, skills, and/or ability associated with their actual grade level. This approach would make use of Monte Carlo simulations. For example, if there were concern that a low-performing student's initial trouble with the navigation controls in a computer adaptive test (CAT) could result in that student being routed to off-grade (i.e., one grade lower) items, we could study this scenario to see what might happen. CATs have two key components that will affect the results of such a scenario:
the item pool characteristics, and the CAT algorithm.
Ideally, CAT designers/developers optimize the algorithm
relative to the item pool and to scenarios such as the one
above. By carefully constructing the components of the
algorithm, the designers/developers can test various real
world scenarios ahead of CAT administration so that they
can develop a fair and precise administration process for
students across the entire performance continuum. As another example, we may find that students who struggle early in a CAT and are routed to a lower grade level have no chance of recovering and being administered items at their actual grade level. To mitigate this, we may find it necessary to increase the size of the item pool or adjust our selection algorithm – so that we can more precisely determine when a student should be routed to a lower grade level, or give students better opportunities to return to their grade level once they have been routed down. Note that we could use the two distinct adaptive options (stage-adaptive or narrow-scope item-adaptive) as part of a Monte Carlo analysis during the assessment development stage. The process would include pilot testing, since the Monte Carlo analyses are data driven.
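The sketch below illustrates how such a Monte Carlo study might be set up for the navigation-trouble scenario described above. The response model, routing rules, and all parameter values are assumptions chosen purely for illustration.

```python
import numpy as np

rng = np.random.default_rng(42)

def p_correct(theta, b):
    """Rasch probability of a correct response at ability theta for an item of difficulty b."""
    return 1.0 / (1.0 + np.exp(-(theta - b)))

def simulate_student(theta, n_items=30, forced_errors=3,
                     on_grade_b=0.0, off_grade_b=-1.5,
                     drop_rule=3, return_rule=3):
    """One simulated administration under a hypothetical routing rule.

    The first `forced_errors` responses are scored incorrect to mimic early navigation
    trouble. The student is routed off grade after `drop_rule` consecutive errors and
    back on grade after `return_rule` consecutive correct responses. Returns True if
    the student ever returns to on-grade items.
    """
    on_grade, wrong_streak, right_streak, returned = True, 0, 0, False
    for i in range(n_items):
        b = on_grade_b if on_grade else off_grade_b
        correct = False if i < forced_errors else rng.random() < p_correct(theta, b)
        wrong_streak, right_streak = (0, right_streak + 1) if correct else (wrong_streak + 1, 0)
        if on_grade and wrong_streak >= drop_rule:
            on_grade, wrong_streak = False, 0
        elif not on_grade and right_streak >= return_rule:
            on_grade, right_streak, returned = True, 0, True
    return returned

# Share of truly on-grade students (theta = 0) who recover after a rough start.
recovered = [simulate_student(theta=0.0) for _ in range(5000)]
print(f"Returned to on-grade items: {np.mean(recovered):.1%}")
```

Running many such scenarios against a proposed item pool and selection rule is what lets designers tune both components before any student sits for the operational test.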
The Implications of Learning
Progressions
Although the issue is far broader and more complex, when the topic of adaptive testing arises, people might think about the use of learning progressions.
describes the developmental process of acquiring
knowledge and/or building skills one would need in order
to master an area. The individual kernels of knowledge
and/or skills are connected in some fashion that builds
capacity and expertise. The concept implies a linear path, although the process may not be linear at all and, for any particular area, might vary from person to person.
If educators desire to use the results of adaptive
assessment to inform instructional interventions, then it
makes sense to use a learning progression in developing
the adaptive algorithm and selecting items. In theory, learning progressions could be used in both item- and stage-adaptive tests. However, the extent to which learning progressions are not well articulated (in general, or just for some learners) will limit the ability of the adaptive tests to precisely determine where a given student is located on the performance continuum. This is not so much a limitation of adaptive tests per se as it is of
using something as complicated as learning progressions
as a method of adaptation. If a sufficient item pool exists,
and if we were to use different learning progressions in
developing the adaptive algorithms, we might end up
developing distinct tests.
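A minimal sketch of what using a learning progression in the adaptive algorithm might look like is given below, assuming a single, well-articulated progression and a simple mastery flag per step; both the progression and the item pool are hypothetical.

```python
# Hypothetical learning progression, ordered from earlier to later steps.
progression = ["unit_fractions", "equivalent_fractions",
               "fraction_addition", "fraction_multiplication"]

# Hypothetical item pool: item id -> progression step it targets.
item_pool = {"item_01": "unit_fractions",
             "item_02": "equivalent_fractions",
             "item_03": "equivalent_fractions",
             "item_04": "fraction_addition"}

def next_target(mastery):
    """Earliest progression step the student has not yet demonstrated, or None if all are mastered."""
    for step in progression:
        if not mastery.get(step, False):
            return step
    return None

def select_item(mastery, administered):
    """Pick an unadministered item aligned to the current target step, rather than by difficulty alone."""
    target = next_target(mastery)
    for item_id, step in item_pool.items():
        if step == target and item_id not in administered:
            return item_id
    return None

# Example: the student has shown mastery of unit fractions only.
print(select_item({"unit_fractions": True}, administered=set()))  # -> "item_02"
```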
The complexity in using learning progressions arises in
large part from two key factors: (1) their relative newness
and lack of articulation in some content/sub-content
areas, and (2) unresolved questions surrounding the
most accurate/useful way to describe students’ learning
patterns—whether one size can really fit all. Put simply, we
need to be careful about what is not well articulated versus
for whom a learning progression is not well articulated.
Not unexpectedly, but unfortunately, the relative newness
of learning progressions has led to varied views of what
they are. The early Common Core State Standards work
regarded learning progressions as fairly broad. Others
have defined them more narrowly. Various experts
are exploring and developing learning progressions.
Some are quite generalized, while others are extremely
granular. Some experts have suggested teachers take
a practical approach and merely “chunk” content in a
logical progression, whereas others have taken a far more
rigorous, data-driven (i.e., expensive and time-consuming) approach to develop specific, detailed progressions. And, as noted, coverage is incomplete across—and in some cases within—content areas. In general, the broader a sub-domain is, the more paths there are for students to follow
– that is, the order in which they learn things can vary. If
progressions are very narrowly defined, then within one,
the order of learning might be fairly universal. The latter
situation could necessitate a large number of progressions
to cover a sub-domain and might only be applicable in
mathematics and possibly certain sub-domains of science.
Having multiple progressions in a particular area might
greatly increase instructional complexities and challenges,
although that approach might appear to better address the
needs of diverse student populations.
As noted, the varied views of learning progressions
stem from the fact that this is a relatively new topic.
Theoretically, they exist for all knowledge/skill areas and
apply to all students. Different theoretical perspectives
exist on establishing and measuring them for all students.
One such approach is to conceive of a learning progression as a honeycomb, with each cell representing a discrete body of knowledge or skill. Many cells build on or support each other, and there may be a linear progression to the development of those associated skills, but there may also be skills or knowledge that are developed in a non-linear manner. Progression
occurs as more cells are “filled” in, but in this model the
order in which cells are filled in is of less importance
and can vary among students. This intriguing concept
has the advantage of addressing differences in student
learning patterns while not greatly increasing instructional
complexity.
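One way to picture the honeycomb model concretely is as a small graph of cells with support links, where progression is measured by how many cells are filled rather than by the order in which they are filled. The cell names and links below are purely illustrative.

```python
from dataclasses import dataclass, field

@dataclass
class Cell:
    """One honeycomb cell: a discrete body of knowledge or skill."""
    name: str
    supports: list = field(default_factory=list)  # names of cells this one builds on
    filled: bool = False                          # has the student demonstrated it?

cells = {
    "counting":        Cell("counting"),
    "place_value":     Cell("place_value", supports=["counting"]),
    "addition":        Cell("addition", supports=["counting"]),
    "multi_digit_add": Cell("multi_digit_add", supports=["place_value", "addition"]),
}

def progression_status(cells):
    """Share of cells filled; the order in which they were filled can vary by student."""
    filled = sum(cell.filled for cell in cells.values())
    return filled / len(cells)

cells["counting"].filled = True
cells["addition"].filled = True
print(f"{progression_status(cells):.0%} of cells filled")  # -> 50%
```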
Bringing us back full circle, with a sufficient item pool,
even if we found the honeycomb concept of a learning
progression to be most accurate, we could use it to design an
appropriate adaptive assessment. The bottom line is that at
this point we don’t know whether students with disabilities require different learning progressions, or whether a honeycomb or other pattern is more accurate, but we think this is an extremely important issue to resolve.