The document provides an overview of vertically moderated standard setting (VMSS). It describes VMSS as a process that aligns performance levels and standards across multiple grade levels. This helps to smooth out differences in cut scores that often occur between grades. The document outlines some approaches to VMSS, including setting equal percentages of students at proficiency levels across grades or anchoring the lowest and highest grades and adjusting intermediate grades. It also discusses assumptions about student growth over time that underlie VMSS. Limitations are noted around lack of historical data and growth models. Core components and areas for future research are identified.
2. Ch-13
Scheduling Standard Setting Activities
Chapter Goals:
The authors suggest methods for scheduling standard setting in two types of
assessment programs, drawing primarily on their experience with large-scale
credentialing programs and educational assessments, and provide examples of
each type of standard-setting activity.
1. Scheduling standard setting for educational assessments
2. Scheduling standard setting for credentialing programs
3. Scheduling standard setting for educational
assessment
Table 13-1 (pp. 219-221) provides an overview of the main activities to be
completed, along with a timetable for their completion.
A generic version of the table can also be found at
www.sagepub.com/cizek/schedule
This table shows the planning for standard setting beginning two years
before the actual standard setting session.
4. 1. Overall Plan
Establish performance level labels (PLLs) and performance level descriptions (PLDs)
Drafting a standard setting plan before item writing begins is one way to make sure the
test supports the standard-setting activity that is eventually carried out.
Table 13-1 shows a field test exactly one year prior to the first operational administration
of the test. During the first year, a regular testing window would be reserved for field
testing.
The planning should specify: a) a method, b) an agenda, c) training procedures and d)
analysis procedures.
Technical advisory committee (TAC).
Stakeholder review
5. 2. Participants
Identify and recruit the individuals who will participate in the standard setting activity (i.e., the panelists).
For statewide assessments, it is preferable that the panelists be as representative of the state as possible.
Table 13-1 shows the process of identifying these individuals about nine months before standard setting
begins.
Creation of the standard-setting panels is a three-step process
1. Local superintendents or their designees identify potential panelists in accordance with specifications
provided by the state education agency.
2. Candidates are notified before their names are submitted, via an initial letter sent to all
candidates.
3. State agency staff sort the nominations to create the required number of panels, each with the
approved number of panelists.
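The recruitment timeline described in the chapter (identify panelists about nine months out, invitation letters about five months out, final confirmations six weeks out) can be computed backward from the standard-setting date. A minimal sketch; the function name is illustrative, and months are approximated as 30 days:

```python
from datetime import date, timedelta

def recruitment_milestones(standard_setting_date: date) -> dict:
    """Work backward from the standard-setting date to the approximate
    recruitment milestones described in the chapter. Months are
    approximated as 30-day periods."""
    return {
        "identify potential panelists": standard_setting_date - timedelta(days=9 * 30),
        "send invitation letters": standard_setting_date - timedelta(days=5 * 30),
        "send final confirmation letters": standard_setting_date - timedelta(weeks=6),
    }

# Hypothetical standard-setting date:
for task, due in recruitment_milestones(date(2025, 7, 14)).items():
    print(f"{due.isoformat()}  {task}")
```

This kind of backward schedule is easy to regenerate when the standard-setting date shifts, which the chapter notes is common in practice.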
6. 3. Materials
Training materials, forms and data analysis programs
The timing of preparing these materials is crucial
Some can be prepared in advance and some cannot (refer to Tables 13-2 and 13-3).
Final Preparations: everyone involved needs to be thoroughly prepared; all presentations
should be scripted and rehearsed, all rating forms should be double-checked, and all
participant materials should be produced, duplicated, collated, and assembled into
convenient sets.
As a final part of the preparation, the entire standard-setting staff should conduct a dress
rehearsal, making sure that the timing of presentations is consistent with the agenda, that all
forms are correct and usable, and that the flow of events is logical.
7. 4. At the standard setting site and following up
The lead facilitator attends to matters related to conduct of the sessions
Logistics coordinator attends to everything else
Once panelists complete their tasks and turn in their materials, data entry staff take over;
the next morning, the data analysis staff continue the process.
All data entry should be verified by a second person before data analysis begins.
The state education agency responsible for the standard setting should have arranged
time on the agenda of the state board of education as soon as possible after standard
setting in order to have cut scores approved.
Once cut scores are adopted by the board, it is possible to include them in the score
reporting programs and produce score reports.
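The rule that all data entry be verified by a second person before analysis begins can be sketched as a double-keying check. The data layout below (a dict keyed by panelist and item) is a hypothetical illustration, not a format from the chapter:

```python
def verify_double_entry(first_keying, second_keying):
    """Compare two independently keyed copies of the panelists'
    ratings and return the keys where they disagree. Each argument
    is a dict mapping (panelist_id, item_id) -> rating."""
    all_keys = set(first_keying) | set(second_keying)
    return sorted(
        k for k in all_keys
        if first_keying.get(k) != second_keying.get(k)
    )

# Example: one transcription error at panelist 2, item "B".
entry_a = {(1, "A"): 3, (1, "B"): 2, (2, "A"): 4, (2, "B"): 3}
entry_b = {(1, "A"): 3, (1, "B"): 2, (2, "A"): 4, (2, "B"): 5}
print(verify_double_entry(entry_a, entry_b))  # → [(2, 'B')]
```

Any mismatches would be resolved against the paper rating forms before data analysis begins.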
8. Scheduling standard setting for
credentialing programs
Scheduling standard setting for credentialing programs is different from
scheduling it for educational assessment programs. Educational assessment
programs are bound to specific times of the academic year, and tests are
typically given in the spring or fall.
Credentialing programs are not bound by these constraints and have more
flexibility: computer-adaptive testing (CAT) or computer-based testing (CBT)
may permit test administration on any day of the year.
Table 13-4 provides an overview of the major tasks for a credentialing
testing program.
9. Small group activity
In groups of three, review pages 237-245 and post the key components of
scheduling standard setting for credentialing programs, focusing on how it
differs from scheduling standard setting for educational assessments.
Use this website to post your thoughts
http://padlet.com/wall/4qxyguqgnd
10. Recommendations
Planning for standard setting needs to be made an integral part of planning for
test development.
Plans of the standard setting facilitators should be reviewed by test
development staff, and vice versa.
One person with authority over both item developers and standard setters
should have informed oversight over both activities.
Pay particular attention to scoring, especially with open-ended or
constructed-response items.
Finally, test planning, test development, and standard setting are interlinked
parts of a single enterprise.
11. Ch-14
Vertically-Moderated Standard Setting
Chapter Goals:
Describe:
(1) the general concept of VMSS
(2) specific approaches to conduct VMSS
(3) a specific application of VMSS
Provide:
(1) suggestions for a current assessment system and a need for additional
research
12. Linking Test Scores across grades within the
Norm Referenced Testing (NRT) context
Review from Ch-6 (Ryan & Shepard)
Construct of linking: refers to several types of statistical methods that
establish a relationship between the score scales from two tests, so that
results from one test can be compared with results from the other.
Test Score Equating- Used to measure year to year changes over time for
different students in the same grade
Vertical Equating- linking test scores vertically across grade levels and
schooling levels. The tests that are to be linked need to measure the same
construct.
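As a concrete illustration of test score equating, here is a minimal sketch of classical linear equating, a standard method consistent with the description above (the chapter does not prescribe a specific procedure, and the score samples are invented): Form X scores are mapped onto the Form Y scale by matching means and standard deviations.

```python
from statistics import mean, stdev

def linear_equate(x_scores, y_scores):
    """Return a function mapping a Form X raw score onto the Form Y
    scale by matching means and standard deviations:
        y = mu_y + (sigma_y / sigma_x) * (x - mu_x)
    This is classical linear equating under a random-groups design."""
    mu_x, sd_x = mean(x_scores), stdev(x_scores)
    mu_y, sd_y = mean(y_scores), stdev(y_scores)
    return lambda x: mu_y + (sd_y / sd_x) * (x - mu_x)

# Hypothetical score samples from two forms:
form_x = [40, 45, 50, 55, 60]
form_y = [42, 48, 54, 60, 66]
to_y = linear_equate(form_x, form_y)
print(to_y(50))  # the Form X mean (50) maps to the Form Y mean (54.0)
```

Vertical equating applies the same linking idea across grade levels rather than across parallel forms, which is why it requires overlapping items in adjacent test levels.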
13. Interrelated Challenges within the Standards-
Referenced Testing (SRT) context
NCLB requirements for tracking cohort growth & achievement gaps
These newer assessments apply standards-referenced testing (SRT)
Linking test performance standards from two or more grade levels (adjacent and
not adjacent)
The construct measured may be different
The sheer number of performance levels that NCLB requires
The wide test span and developmental range
The panels of educators who participate in standard setting
14. A New Method that Links Standards Across Tests
To address these challenges, there is a need to develop and implement
standard-setting methods that set performance levels across all affected grade
levels, with some method for smoothing out differences between grades.
Suggested approach—VMSS—Vertically Moderated Standard Setting
15. History of VMSS
Introduced by Lissitz & Huynh (2003b)
AYP is based on the percentage of students who meet Proficient and the
expected percentage increases over time.
The purpose of VMSS: arriving at a set of cross-grade standards that
realistically tracks student growth over time and provides a reasonable
expectation of growth from one grade to the next.
The critical issue was defining reasonable expectations; vertical scaling would
generally not produce a satisfactory set of expectations for grade-to-grade
growth.
As an alternative to vertical scaling or equating, Lissitz and Huynh (2003b)
suggested VMSS.
16. What is VMSS?
A process of vertical articulation of standards: aligning scores, scales or
proficiency levels.
Is a procedure or set of procedures, typically carried out after individual
standards have been set that seeks to smooth out the bumps that inevitably
occur across grades.
Reasonable expectations are stated in terms of percentages of students at
or above a consequential performance level, such as Proficient.
Let's discuss the hypothetical scenario using the table on the next slide
(p. 255 in your book).
17. What is VMSS?
Grade   % of Students At or Above Proficient   Difference
3       37
4       41                                     +4%
5       34                                     -7%
6       43                                     +9%
7       29                                     -14%
8       42                                     +13%
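The sawtooth pattern in this hypothetical example can be flagged programmatically. A minimal Python sketch (the function name and data layout are illustrative, not from the chapter) that computes the grade-to-grade differences in the percentage of students at or above Proficient:

```python
def grade_to_grade_differences(pct_proficient):
    """Given a dict mapping grade -> percent of students at or above
    Proficient, return the difference between each grade and the one
    before it. Sign changes signal the 'bumps' VMSS smooths out."""
    grades = sorted(pct_proficient)
    return {
        g: pct_proficient[g] - pct_proficient[prev]
        for prev, g in zip(grades, grades[1:])
    }

# The hypothetical percentages from the chapter's example:
observed = {3: 37, 4: 41, 5: 34, 6: 43, 7: 29, 8: 42}
print(grade_to_grade_differences(observed))
# → {4: 4, 5: -7, 6: 9, 7: -14, 8: 13}
```

The alternating signs (+4, -7, +9, -14, +13) are exactly the kind of implausible grade-to-grade pattern that motivates vertical moderation.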
18. Approaches to VMSS
Focuses on percentages of students at various proficiency levels
Is based on assumptions about growth in achievement over time
Problem: different percentages of students reach a given performance level,
such as Proficient, at different grades.
Solution:
1. Set all standards, by fiat, such that equal percentages of students would be
classified as Proficient at each grade level.
2. Set standards only for the lowest and highest grades, and then align the
percentages of Proficient students in the intermediate grades accordingly.
19. Approaches to VMSS
Grade   % of Students At or Above Proficient
3       37
4       38
5       39
6       40
7       41
8       42
[Chart: percentage of students at or above Proficient by grade, after a linear trend is imposed]
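The second approach (anchoring the lowest and highest grades, then aligning the intermediate grades) can be sketched in Python. Everything here is illustrative: the function names are invented, and the score distribution used to demonstrate cut-score selection is hypothetical.

```python
def interpolate_targets(low_grade, low_pct, high_grade, high_pct):
    """Linearly interpolate target percent-Proficient values for the
    grades between the two anchor grades (inclusive)."""
    span = high_grade - low_grade
    return {
        g: low_pct + (high_pct - low_pct) * (g - low_grade) / span
        for g in range(low_grade, high_grade + 1)
    }

def cut_score_for_target(scores, target_pct):
    """Pick the cut score whose observed percent of examinees at or
    above it is closest to the target percentage."""
    n = len(scores)
    def pct_at_or_above(cut):
        return 100 * sum(s >= cut for s in scores) / n
    return min(set(scores), key=lambda c: abs(pct_at_or_above(c) - target_pct))

# Anchors from the chapter's example: Grade 3 at 37%, Grade 8 at 42%.
targets = interpolate_targets(3, 37, 8, 42)
print(targets)  # → {3: 37.0, 4: 38.0, 5: 39.0, 6: 40.0, 7: 41.0, 8: 42.0}
```

In practice the intermediate-grade cut scores would be chosen from each grade's actual score distribution so that the resulting percentages fall on (or near) this straight line.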
20. Assumptions re: growth over time
Lewis & Haug (2005): the percentage of students classified as at or above
Proficient would be expected to be:
1. Equal across grades or subjects
2. Approximately equal
3. Smoothly decreasing
4. Smoothly increasing
Ferrara, Johnson & Chen (2005): assumptions for standard setting are based on
the intersection of three growth models:
1. Linear growth
2. Remediation
3. Acceleration
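The three growth models can be illustrated as transformations of a score distribution. The gain schedule below is purely an assumption made for illustration; the chapter describes the models conceptually, not as formulas.

```python
def apply_growth_model(scores, model, base_gain=5.0):
    """Illustrate three growth models as transformations of a score
    distribution (the gain schedule is an illustrative assumption):
      - linear: every examinee gains the same fixed amount;
      - remediation: lower-scoring examinees gain more;
      - acceleration: higher-scoring examinees gain more."""
    ranked = sorted(scores)
    n = len(ranked)
    grown = []
    for i, s in enumerate(ranked):
        frac = i / (n - 1) if n > 1 else 0.0  # 0 = lowest, 1 = highest
        if model == "linear":
            gain = base_gain
        elif model == "remediation":
            gain = base_gain * (2 - 2 * frac)  # largest gain at the bottom
        elif model == "acceleration":
            gain = base_gain * (2 * frac)      # largest gain at the top
        else:
            raise ValueError(f"unknown model: {model}")
        grown.append(s + gain)
    return grown

scores = [10, 20, 30, 40, 50]
print(apply_growth_model(scores, "linear"))       # → [15.0, 25.0, 35.0, 45.0, 55.0]
print(apply_growth_model(scores, "remediation"))  # → [20.0, 27.5, 35.0, 42.5, 50.0]
```

Note how linear growth preserves relative positions, while remediation compresses the distribution from below and acceleration stretches it from above.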
21. Alternative procedures
Due to VMSS being a relatively new procedure, it is difficult to pinpoint
limitations and alternative procedures
There have been few thoroughly documented applications of VMSS
Each application has been slightly different from the others
Authors have suggested a common core of elements to VMSS
However, no fixed set of steps has emerged in applications of VMSS so far
Every aspect of any application might be thought of as an alternative procedure
22. Core components of VMSS future applications
1. Grounding in historical data (Lewis & Haug, 2005; Buckendahl et al., 2005)
2. Establishment of performance models
3. Consideration of historical data
4. Cross-grade examination of test content and student performance
5. Polling of participants
6. Follow up review and adjustment
23. Limitations
If the focus of VMSS is the percentages of students at or above a particular
proficiency level, a lack of historical perspective or context is not only
limiting but debilitating.
Any application of VMSS is hampered if it is not supported by a theoretically
or empirically sound model of achievement growth.
Maintaining the meaning of cut scores and fidelity to the PLDs is one of the
most fundamental areas for future research.
Research and development in this area is a growth industry.
Editor's Notes
The version may be easily adapted. This schedule assumes a new testing program. Part of the planning process is establishing the number and nature of the performance levels to be set. It is necessary to bring some precision to the performance level labels (PLLs) and performance level descriptions (PLDs). If these are established by state law or board action, then some of the work has already been done.
PLLs and PLDs: establishing these at the beginning makes it possible to ensure that there are test items that will support these levels. TAC: many assessment programs employ a panel of nationally recognized assessment experts to advise them on technical issues related to those programs. Stakeholder review: stakeholders are individuals or groups with a particular interest in the testing program, such as community members and elected or appointed officials. It is a good idea to know early in the process who these stakeholders are and to obtain their input as early as possible. One very special stakeholder group is the policy board that will actually make the decision to adopt, modify, or reject the cut scores. For licensure and certification testing programs, the policy entity is usually the professional association or credentialing board. For statewide assessments, the policy board is usually the state board of education.
As the overall standard setting plan is being reviewed by various advisory committees and stakeholder groups, the next phase of the plan begins. Identifying potential panelists for a statewide assessment program usually involves working with local officials, typically the local superintendent. Notification letter: should include a form on which the candidate can indicate interest in and availability for a standard-setting meeting on specific dates in a specific city. After the sorting is done, the agency sends a follow-up letter notifying all candidates of their selection or non-selection. The invitation letter is sent out about five months prior to standard setting in order to allow panelists time to schedule it in their calendars. Six weeks prior to the event, a final letter to all panelists should confirm their participation and provide the location and driving directions, a reminder of the purpose of the meeting, and contact phone numbers in case of emergency. Once rooms are confirmed, the sponsoring agency may send a housing confirmation to each panelist. One person should be designated as the lead facilitator, who is responsible for training and other matters. A different person should be designated as the logistics coordinator, who is responsible for anything related to hotel guest rooms, meeting rooms, catering, copying, etc.
Generic-printed materials, visuals, scripts for training etc…
Ability to see potential problems, conflicts or disconnects.
Ch-6 by Ryan & Shepard discusses applications of test linking, the diversity in state testing programs, and challenges to linking tests in accountability systems. Construct of linking: refers to several types of statistical methods used to establish a relationship between the score scales from two tests, so that results from one test can be compared to results on another test. To ensure comparability of tests from one test administration to the next, large-scale assessment programs use a process of test score equating; without this process it would be impossible to measure changes in achievement over time. Test score equating: statistical procedures by which scores on two different tests are related. This is possible when test forms are built to the same specifications and test content, difficulty, reliability, format, purpose, administration, and population are equivalent. It answers the question, "Have this year's 6th graders performed better in reading than last year's 6th graders?" Vertical equating and linking of test scores is successful when test design and item selection within and across grade levels are managed carefully, so that sufficient overlap of items in adjacent test levels enables stable links, as in norm-referenced tests and individual intelligence and achievement tests.
NCLB-era SRT (standards-referenced testing) requires tests built to content specifications that are narrower and tightly matched to specific within-grade content standards, which often do not have considerable across-grade overlap. Therefore, the content standards upon which SRTs are based can militate against the construction of traditional cross-grade scales; vertically linking SRTs requires strong assumptions about the equivalence of the constructs being assessed at different levels. Interrelated challenges: (a) the constructs measured may be different; a continuous developmental construct across grade levels must be empirically determined or theoretically assumed. (b) The sheer number of performance levels that NCLB requires: two levels representing higher achievement (Proficient and Advanced) plus a lower (Basic) level. These multiple levels are compounded by the requirement of performance standards in reading and math in grades 3-8 plus one secondary grade, and in science at three grade levels. (c) The tests span such a wide grade and developmental range.
Introduced by Lissitz and Huynh (2003b) in a background paper prepared for the Arkansas Department of Education, where they spelled out the problem of determining AYP and proposed a solution: VMSS. First, Lissitz and Huynh (2003b) tried to define reasonable expectations using a vertical scaling/equating method. They concluded that vertical scaling would generally not produce a satisfactory set of expectations for grade-to-grade growth. They recommended that new cut scores for each test be set for all grades such that: each achievement level has the same generic meaning across all grades, and the proportion of students in each achievement level follows a growth-curve trend across these grades.
VMSS may be used when there is a need to establish meaningful progressions of standards across levels or to enable reasonable predictions of student classifications over time when traditional vertical equating is not possible.
Let's look at the 4th or 6th graders. If we believed that the groups of students on whom these results were based were typical, we would expect similar results next year with a new group of students. We need to point out that these currently Proficient students would have only about a 75% chance of scoring at the Proficient level the next year. The standards have been set so that 5th and 7th graders have a lower probability of scoring at the Proficient level than do 4th and 6th graders. This means that many Proficient 4th and 6th graders are going to lose ground in the subsequent grades (17% and 33%). To remedy this situation, VMSS requires a reexamination of the cut scores and percentages in light of historical or other corollary information available at the time of standard setting, making adjustments to the cut scores so that we have a reasonable expectation for what should happen next year.
In our last example, we would take the 37% figure for Grade 3 and the 42% figure for Grade 8 and set cut scores for Grades 4-7 so that the resulting percentages of students at or above Proficient would fall on a straight line between 37% and 42%. A linear trend has been imposed on the intervening grade levels to obtain cut scores for those grades. In all cases, VMSS is based on assumptions about growth in achievement over time.
Linear growth: assumes that the proficiency of all examinees increases by a fixed amount and that examinees retain their positions relative to one another. Remediation: assumes the proficiency of examinees at the lower end of the score distribution increases more than that of examinees at the upper end. Acceleration: assumes the proficiency of examinees in the upper portion of the score distribution increases at a greater rate than that of examinees at the lower end of the score distribution.
From the example illustrating a VMSS process implemented for the English Language Development Assessment (ELDA), suggestions for a common core of elements of VMSS include the following. Grounding in historical data: collect and use historical performance data to prepare for and interpret the results of standard setting; collection of these data and planning for their use may include discussions with stakeholders and content experts in advance of standard setting. Establishment of performance models: these should be based on the historical evidence; if this evidence is not available, models should rely on theories of cognitive development, discussions with content experts and stakeholders, or generalization from other tests. Consideration of historical data: when available, these data should be presented to those involved in setting standards, including the participants who work through the multiple rounds of a standard-setting procedure and cross-grade or cross-subject articulation. Cross-grade examination: include some degree of cross-grade review by standard-setting participants; where possible, an all-grade review should be included in a full-scale VMSS for at least one round, either the final round or at some point just prior to it. Polling of participants: two studies of VMSS included the collection of data from participants at the end of the standard-setting activity; this is important not only as validity evidence for the standard-setting activity but also for future standard-setting activities. Follow-up review and adjustment: these follow-ups are important for two reasons: (1) elected or appointed state officials are responsible for the successful implementation of the performance standards, and (2) even with the best intentions and earnest application of standard-setting techniques, participants may still hold fairly disparate notions with regard to where cut scores should be set.