1. New York State
Education Department
Understanding The Process:
Science Assessments and the
New York State Learning Standards
2. January 2002
NYSED
New York State Learning Standards
• In April 1994, the Board of Regents approved a plan to
revise the State assessment system based on
learning standards.
• In July 1996, the Board of Regents approved 28
learning standards in seven standard areas:
– Mathematics, Science, and Technology; Social Studies;
The Arts; English Language Arts; Languages
Other Than English; Career Development and Occupational
Studies; Family and Consumer
Sciences/Health/Physical Education
3.
New York State Learning Standards
• Learning standards outline what students should
know, understand and be able to do in a specific
subject area
• Learning standards contain content and performance
standards
– Content Standards: the knowledge, skills, and
understandings that individuals can habitually
demonstrate over time as a consequence of
instruction and experience
– Performance Standards: levels
of student achievement in domains of study
4.
New York State Learning Standards
• Learning standards consist of performance indicators at
the:
– Elementary (K-4),
– Intermediate (5-8), and
– Commencement (9-12) levels
• Performance indicators are embedded in the learning
standards and are aligned to Science Core Curriculum
Guides and State Assessments
5.
State Assessments
• Provide a uniform measure of student achievement
across all districts, all schools, all classrooms
• State tests assess the extent to which students have
achieved the learning standards in a content area
• Are important indicators of student achievement of
the learning standards
• Are used to understand individual student needs in
conjunction with other appropriate measures
• Drive necessary changes in curriculum and
classroom instruction
6.
Science Assessments
• Elementary Science
– Elementary Science Program Evaluation Test (ESPET)
– Administered at Grade 4
• Intermediate Science
– Intermediate Level Science
– Administered at Grade 8
• Commencement Level
– Regents Science Exams:
Living Environment
Physical Setting/Earth Science
Physical Setting/Chemistry
Physical Setting/Physics
7.
Test Development Process
in Science
• The test development process ensures assessments
created are fair, valid and reliable measures of
student performance in relation to meeting the State
learning standards
• The process involves 19 steps and takes approximately two to
three years to develop a State assessment
8.
Test Development Process
Item Writing
– Solicit item writers
– Train item writers
– Test items are submitted and reviewed
Testing Items
– Pre-test items/forms
– Field test items/forms
– Operational forms/tests
Test Analysis
– Pre-test data/field test data
– Item analysis/test analysis
– Exam review committees/standards setting study
Learning Standards
– CORE Guides/Subject-Specific Content Area
– Test Specifications/Test Blueprint
9.
Test Development …continued
• Review learning standards in subject content area
• Design test specifications (“test blueprint”)
• Solicit and train item writers
• Publish prototypes of items and generic rubrics (sample
tests)
• Review and edit submitted items
• Pre-test items; scan pre-tests; read and score performance
items
• Perform item analysis; review items and data
10.
Test Development …continued
• Field test forms; scan field tests; read and score
performance items
• Perform item and test analysis
• Submit to Statewide Examination Review Committees:
– Sensitivity Review: ensures that all people are depicted
with dignity; certified, trained reviewers review test items and may
reject them
– Bias Analysis: evaluates whether a test question asks the same
thing, at the same level of difficulty, across subgroups of test
takers
• Determine student performance levels (“cut scores”) through a
Standards Setting Study
11.
Test Construction
• New York State teachers and content consultants, in
coordination with Office of State Assessment and
Curriculum and Instruction, determine test specifications
• A “test blueprint” determines the percentage of questions
weighted for each standard and key idea
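A blueprint of this kind can be turned into item counts with simple arithmetic. The sketch below is illustrative only: the standard names and percentages are hypothetical, not taken from any actual State blueprint, and the largest-remainder rounding is an assumption about how fractional items might be handled.

```python
def items_per_standard(blueprint, total_items):
    """Convert blueprint percentages into whole-number item counts.

    blueprint: dict mapping a standard to its percentage weight (must sum to 100).
    Uses largest-remainder rounding so the counts add up to total_items.
    """
    if abs(sum(blueprint.values()) - 100) > 1e-9:
        raise ValueError("blueprint percentages must sum to 100")
    exact = {name: total_items * pct / 100 for name, pct in blueprint.items()}
    counts = {name: int(value) for name, value in exact.items()}  # round down first
    leftover = total_items - sum(counts.values())
    # Hand the remaining items to the standards with the largest fractional parts.
    by_remainder = sorted(exact, key=lambda name: exact[name] - counts[name], reverse=True)
    for name in by_remainder[:leftover]:
        counts[name] += 1
    return counts


# Hypothetical weights for illustration only.
example_blueprint = {"Standard 1": 15, "Standard 4": 70, "Standard 6": 10, "Standard 7": 5}
example_counts = items_per_standard(example_blueprint, 85)
```

The counts always sum to the requested test length, which is the property a blueprint needs to guarantee.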
12.
Item Writing
New York State Teachers & Content Specialists
• Are trained as item writers by New York State
Education Department staff
• Align all test items generated to the State learning
standards contained in the Science Core Curriculum
guides
• Write items and scoring rubrics for State tests in
science
13.
Pre-Tests
• Prospective test items are “pre-tested” by a diverse
sample of students across the State
• Each item is tested on approximately 200 students
• Results from pre-tested items are statistically
analyzed to determine each question’s “item difficulty”
and fairness
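One common way to express item difficulty is the proportion of students who answer the item correctly. The slides do not say which statistic the Department uses, so the sketch below is an assumption: it computes that classical proportion from a small made-up response matrix.

```python
def item_difficulty(responses):
    """Return the proportion of students answering each item correctly.

    responses: list of per-student lists, where 1 = correct and 0 = incorrect.
    A higher proportion means an easier item.
    """
    n_students = len(responses)
    n_items = len(responses[0])
    return [
        sum(student[i] for student in responses) / n_students
        for i in range(n_items)
    ]


# Made-up pre-test data: 4 students, 3 items.
pretest = [
    [1, 1, 0],
    [1, 0, 0],
    [1, 1, 1],
    [1, 0, 0],
]
difficulties = item_difficulty(pretest)
```

With a real pre-test sample of about 200 students per item, the same calculation would simply run over more rows.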
14.
Field Tests
• Field test items are developed from pre-test
questions and administered in “short forms” to a
representative sample of students across the State
(800-1000 students)
• Statistical analysis and student performance are used to
make the different field test forms comparable in
difficulty
15.
Field Tests
• Each field test form is “equated,” meaning two or
more test forms are constructed to cover the same
explicit content, conform to the same statistical
specifications, and be administered under
identical procedures
• “Equating” places two or more essentially parallel
tests on a common scale
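The slides do not name the equating method used. As a minimal sketch of the general idea, the example below uses linear equating (an assumed method chosen for its simplicity), which places a score from one form on another form's scale by matching the two forms' means and standard deviations. The score lists are made up.

```python
from statistics import mean, pstdev


def linear_equate(score_b, scores_a, scores_b):
    """Map a Form B raw score onto the Form A scale.

    Matches the mean and standard deviation of the two score distributions;
    assumes the groups taking each form are comparable in ability.
    """
    slope = pstdev(scores_a) / pstdev(scores_b)
    return mean(scores_a) + slope * (score_b - mean(scores_b))


# Made-up field test scores on two forms.
form_a_scores = [10, 20, 30]
form_b_scores = [5, 10, 15]
equated = linear_equate(15, form_a_scores, form_b_scores)
```

Here a Form B score of 15 sits one standard deviation above the Form B mean, so it maps to the Form A score that sits one standard deviation above the Form A mean.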
16.
Field Tests to Operational Tests
• “State Assessments,” operational tests, are assembled
from Field Test Forms
• Statistical analysis ensures different test forms are
comparable in fairness, validity and reliability
• Operational tests are reported as a “scale score,” a derived
score to which raw scores are converted by numerical
transformation (raw scores to standard scores)
• Full length forms are presented to the State Examinations
Review Committee for sensitivity and bias review
17.
Standard Setting Process
• State Tests assess the extent to which students have met
the learning standards in a content area
• Although scores for the Regents Exams are placed on a
numerical scale based on field test data, there are
essentially three performance levels:
– Does not meet the standards
– Meets the standards
– Meets the standards with distinction
18.
Standard Setting
Performance Levels
• Standard Setting committee members are given
definitions of student performance levels
• Student performance levels are applied to all State
assessments that are developed including Regents tests
19.
Standard Setting…Three Performance Levels
Example: Physical Setting/Earth Science
• Does Not Meet Learning Standards
• Meets Learning Standards
• Meets Learning Standards with Distinction
– The student demonstrates, on demand, proficiency
in Physical Setting/Earth Science content,
concepts, science skills and basic science knowledge
in any or most of the science learning standards and
key ideas that are addressed for productive citizenship,
and has sufficient knowledge and skill for the demands
of most workplaces or postsecondary academic
environments
20.
Setting the “cut score”
• The Board of Regents has determined “65” as passing a NY
State Regents Examination and “85” as passing with
distinction
• “Passing” = “Proficient”: the performance needed to
achieve the learning standards
• To determine the “passing score,” or “65,” a formal
Standards Setting study is conducted based on the
reasoned judgement of subject matter specialists and
student performance data
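The two cut scores divide the scale into the three performance levels. A minimal sketch of that classification follows; the function name is hypothetical, and the "at or above" comparison reflects how cut scores are described elsewhere in this presentation.

```python
def performance_level(scale_score, passing=65, distinction=85):
    """Classify a Regents scale score into one of three performance levels.

    Scores at or above a cut point fall into the higher level.
    """
    if scale_score >= distinction:
        return "Meets the standards with distinction"
    if scale_score >= passing:
        return "Meets the standards"
    return "Does not meet the standards"


levels = [performance_level(s) for s in (64, 65, 84, 85)]
```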
21.
Scoring and Scaling
• Based on statistics from student pre-test and field test
data, items are placed on a logarithmic scale according to
item difficulty level and student ability
• The two points “passing/65” and “passing with
distinction/85” are then algebraically mapped onto the
0-100 scale (a scale score, not a raw score)
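The slides do not give the algebra behind this mapping. One way it could work, shown here purely as an assumption, is a piecewise-linear map that pins the two cut points on the underlying difficulty/ability scale to 65 and 85 and the ends of that scale to 0 and 100. All numeric values below are made up.

```python
def to_scale_score(theta, theta_min, theta_65, theta_85, theta_max):
    """Piecewise-linear map from an ability value to the 0-100 reporting scale.

    theta_65 and theta_85 are the ability values chosen as the two cut points;
    theta_min and theta_max are the (assumed) ends of the ability range.
    """
    anchors = [(theta_min, 0.0), (theta_65, 65.0), (theta_85, 85.0), (theta_max, 100.0)]
    theta = min(max(theta, theta_min), theta_max)  # clamp to the supported range
    for (x0, y0), (x1, y1) in zip(anchors, anchors[1:]):
        if theta <= x1:
            # Interpolate linearly within the segment that contains theta.
            return y0 + (y1 - y0) * (theta - x0) / (x1 - x0)
    return 100.0


# Made-up cut points on the underlying scale.
cuts = dict(theta_min=-4.0, theta_65=0.0, theta_85=1.5, theta_max=4.0)
at_passing = to_scale_score(0.0, **cuts)
at_distinction = to_scale_score(1.5, **cuts)
```

Because the segments have different slopes, equal steps in scale score do not correspond to equal steps in raw score, which is the point the slide makes.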
22.
Standards Setting Committee
• Committee members:
– are knowledgeable in the learning standards for
science
– come from public and nonpublic schools
– are current and former classroom teachers
– represent urban, suburban and rural schools
– include selected members from business and industry
• Each member makes individual judgements informed by
item difficulty, the scaling and equating of field tests,
and professional expertise
23.
Standards Setting Process
• New York State teachers and content experts use the
“bookmarking” method in conjunction with
professional judgement to set a “cut score”
• In the “bookmarking” procedure, multiple-choice and
constructed-response items are ordered in terms of
their item difficulty
• Test items corresponding to various points on the
scale are presented as examples of test items at that
difficulty level
• The purpose of the items is to illustrate the meaning of
the difficulty scale at specific points
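The bookmarking idea can be sketched in a few lines. This is a simplified illustration under stated assumptions, not the Department's procedure: each panelist's bookmark is taken to be the position of the last item a just-proficient student should answer correctly, and the panel's cut is the median of the difficulties at those bookmarks. The difficulties and bookmark positions are made up.

```python
from statistics import median


def bookmark_cut(item_difficulties, bookmarks):
    """Return a panel cut point on the difficulty scale.

    item_difficulties: difficulty value for each item, in any order.
    bookmarks: one 1-based position per panelist in the easiest-to-hardest ordering.
    """
    ordered = sorted(item_difficulties)          # the "ordered item booklet"
    cuts = [ordered[b - 1] for b in bookmarks]   # difficulty at each panelist's bookmark
    return median(cuts)                          # assumed aggregation rule


# Made-up difficulties and three panelists' bookmarks.
difficulties_example = [-1.0, 0.5, -0.2, 1.3, 0.1]
panel_cut = bookmark_cut(difficulties_example, bookmarks=[3, 3, 4])
```

Operational bookmark studies typically involve several rounds of discussion and revision; the sketch covers a single round only.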
24.
Standard Setting Process
• Test items used come from an “anchor” form: the
test form on which all cut points are set and to which all
later forms of the test will be equated
• Committee members apply their professional
judgements to these ordered items
• A “cut score,” or performance standard, is a
specified point on the score scale, such as “65,” set so
that scores at or above that point are acted upon
differently from scores below that point
25.
Science Regents
Scoring and Scaling
• The Conversion Chart provided for each test
administration translates raw scores to scale scores
(performance standards) and then maps to a 0-100 scale.
26.
Science Regents Examinations
Scoring
• The test form for each administration is “equated” so that
the same “scale score” represents the same level of
achievement
• Test forms vary somewhat in the mix of easier and more
difficult items, so the relationship between the
raw score and the scale score also varies from one test
administration to the next
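A conversion chart is in effect a lookup table from raw score to scale score, with a separate table for each administration. The sketch below uses two made-up partial charts (the administration names and all values are hypothetical) to show how the raw score needed for a scale score of 65 can differ between forms.

```python
# Hypothetical partial conversion charts: raw score -> scale score.
CONVERSION_CHARTS = {
    "administration_a": {40: 60, 41: 62, 42: 65, 43: 67},  # slightly easier form
    "administration_b": {40: 62, 41: 65, 42: 67, 43: 69},  # slightly harder form
}


def raw_to_scale(administration, raw_score):
    """Look up the scale score for a raw score on a given administration."""
    return CONVERSION_CHARTS[administration][raw_score]


def lowest_passing_raw(administration, passing_scale=65):
    """Return the smallest raw score whose scale score reaches the passing mark."""
    chart = CONVERSION_CHARTS[administration]
    return min(raw for raw, scale in chart.items() if scale >= passing_scale)


needed_a = lowest_passing_raw("administration_a")
needed_b = lowest_passing_raw("administration_b")
```

In this made-up example the harder form requires one fewer raw point to reach 65, while a scale score of 65 is meant to represent the same level of achievement on both.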
27.
Science Regents Examinations
• Syllabus-Based
– Addressed a selective student population
– Assessments were designed from the course of study
– Syllabi contained prescriptive content
• Standards-Based
– Universal access for all students
– Assessments are derived from the standards
– Standards drive the content of the courses designed
28.
Based on field test data, this item
was the easiest question on the ES June 2001 exam.
29.
Test item 9 on the LE June 2001 exam has an item difficulty level
at the passing performance level, “meets the standards”,
based on field test data and the standards setting process.
30.
This item’s difficulty level, based on field test data, is an example
of a test item at the designated passing performance level,
“meets the standards”.
31.
Test item 30 on the LE June 2001 exam has an item difficulty level
at the passing performance level, “meets the standards”,
based on field test data and the standards setting process.
32.
Based on field test data, this item is another
example of a question on the ES June 2001 exam at the “meets the
standards” difficulty level.
33.
Based on field test data, this item was one of the most
difficult questions on the LE June 2001 exam.
34.
Based on field test data, this item was the most difficult
question on the ES June 2001 exam.
35.
Science Regents Examinations
• Old “65”/passing was determined by a “raw score”
• A student’s score was based on a maximum of 100 points
• Test item difficulty varied from one test form to another
• New “65”/passing is determined by a “scale score”
• A student’s score is derived by converting a raw score
to a scale score based on student field test data
• The item difficulty values represent the same level
of difficulty from one test administration to the next
36.
Science Regents Examination
Number of Students Tested - Total State
Regents Test              1997-1998    1998-1999    1999-2000
Biology                     131,992      141,424      149,605
Earth Science                68,405       80,512       75,357
Earth Science pro mod        54,318       63,556       67,114
Earth Science Total         122,723      144,068      142,471
Chemistry                    98,016      104,230      104,763
Physics                      48,345       49,517       50,159
37.
Regents Science Examinations
Statistics
Regents Science           Total # of Students    Increase in # of Students
                          Tested 2001            Tested 1997-2001
Regents Biology                70,387
Living Environment            179,489
  Total                       249,876                117,884
Earth Science                  36,804
Physical Setting/ES           129,564
  Total                       166,368                 43,645
Chemistry                     113,253                 15,237
Physics                        50,663                  2,318
38.
Reliability of State Assessments
• Reliability focuses on the consistency of test scores
(performance) for a group of test takers across
repeated measurements over time
• Reliability is best achieved by evaluating the whole
test before considering smaller portions of the test
• An inter-rater reliability check is conducted after each test
administration (teams of teachers are provided
uniform training and scoring procedures to re-score
10% of the Regents examinations audited)
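An inter-rater check of this kind can be sketched as two steps: draw a 10% sample of papers for re-scoring, then compare the original and re-scored marks. The sampling rule and the agreement statistics below (exact agreement and agreement within one point) are assumptions for illustration; the slides do not specify which statistics are reported.

```python
import random


def sample_for_audit(paper_ids, fraction=0.10, seed=0):
    """Draw a random sample of papers to re-score (10% by default)."""
    rng = random.Random(seed)  # fixed seed so the sample is reproducible
    paper_ids = list(paper_ids)
    k = max(1, round(len(paper_ids) * fraction))
    return rng.sample(paper_ids, k)


def agreement_rates(original, rescored):
    """Return (exact agreement, agreement within one point) for paired scores."""
    pairs = list(zip(original, rescored))
    exact = sum(a == b for a, b in pairs) / len(pairs)
    adjacent = sum(abs(a - b) <= 1 for a, b in pairs) / len(pairs)
    return exact, adjacent


audit_sample = sample_for_audit(range(200))
exact_rate, adjacent_rate = agreement_rates([3, 2, 4, 1], [3, 3, 4, 3])
```

High exact and adjacent agreement would suggest that the uniform training and scoring procedures are producing consistent scores across raters.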