# NCCE Bylsma

NCCE Presentation

Published in: Education

• ESHB 2261 – SBE must develop an Accountability Index to identify schools/districts for recognition and additional state support
• It took two years of detailed conversation with data experts and a wide range of stakeholders to come up with this system, which was easier to understand and more valid than the federal accountability system (NCLB, AYP). 5 outcomes across the top, 4 indicators down the left-hand side (averages computed for each row and column). Simple average of the 20 “inner” cells is the index number (bottom right corner). Elementary/middle schools have 16 cells (vs 37 cells). HS and district have 20 cells (vs 45 and 119 cells).
• Not really “peers” in a strict sense of the term – look at percentages of students in certain categories. Multiple regression to determine a “predicted level” of achievement: positive scores are “beating the odds,” negative scores are underperforming.
• This explains how the Achievement vs Peers indicator works using the elementary math index results from 2007. You’re familiar with scatterplots and how student performance declines as the level of poverty increases. The heavy black trend line is the predicted Learning Index level for schools with that level of low-income students. School A and B have about the same Learning Index (about 2.5). However, one has > 85% FRL and the other has < 25% FRL. The distance to the heavy black line is their “score” when adjusting for socioeconomic status. School A is almost .4 above the line and would be given a 7 for its rating compared to its peers, while School B is almost .4 below the line and would be given a 1 for its rating compared to its peers. Let’s assume this scatterplot represents the results when adjusting for all 4 variables, not just one. The thick dotted trend lines reflect the cutpoints for the highest and lowest ratings (-.20 to +.20). All schools above the upper line would be rated a 7; all schools below the bottom dotted red line are more than .20 points below the predicted level and would get a 1. The other dotted lines are the other cutpoints.
• 4. Since districts do not have access to student-level results statewide, they cannot compute SGP results on their own. The state must compute and report the results. OSPI published 2013 student, school, and district SGP results in December (> 3 months after school began).
• Possible rating scale for school/district accountability: 0 – 24.9%, 25 – 34.9%, 35 – 44.9%, 45 – 55%, 55.1 – 65%, 65.1 – 75%, 75.1 – 100%
### NCCE Bylsma

1. MAKING SENSE OF THE NEW ACCOUNTABILITY INDEX AND STUDENT GROWTH PERCENTILES
   Dr. Pete Bylsma, Director, Assessment/Student Information Services, Renton School District (Past President, Washington Educational Research Association - WERA)
   Dr. Glenn Malone, Executive Director of Assessment, Accountability & Student Success, Puyallup School District (WERA President-Elect)
   NCCE Conference, March 12, 2014
2. SESSION OBJECTIVES
   • Describe changes in federal accountability that prompted changes in the old Index and required student growth measures
   • Describe the old and new Achievement Index that rates schools (assigns labels, identifies high and low performers, basis for State Board of Education/OSPI recognition)
   • Describe and critique the new student growth percentile (SGP) measure used in the new index (and potentially used in staff evaluations)
3. Why Change Accountability System?
   AYP under NCLB started in 2002; the state discarded its existing accountability system
   • AYP used 9 student groups, reading/math proficiency and participation, graduation rate
   • 37 “cells” possible for schools, 111 for district
   • Gradually increasing goal, all groups must meet standard by 2014
   • “Conjunctive” model – not making it in one area means not making AYP
   • Escalating negative sanctions when not making AYP, but only for Title I schools
4. Problems with AYP System
   • System is too complicated, invalid, and unrealistic
     – Different “rules” than those used by state
       • Larger minimum N, margin of error, excludes some students
     – Negative label applied when missing one goal, ELLs must take test despite not knowing English
     – Conjunctive model → all will eventually “fail”
   • Resulted in unintended side effects
     – Focus on “bubble kids,” narrowing curriculum, some states lowered standards so all can pass by 2014
5. New Federal Accountability Rules
   AYP waiver approved in 2012, some rules no longer apply
   • Do not need to have all students meet standard by 2014
   • Do not need to set aside Title I funds
   • School choice or supplemental services not required
   • Still looks at reading & math percent meeting standard, 95% participation rate, graduation rates
   Annual Measurable Objectives (AMO) is new measure
   • Each subgroup in each school has its own annual targets
   • Targets use a 2011 baseline, must cut in half the “proficiency gap” (difference between baseline and 100% meeting standard) by 2017
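The AMO target rule can be worked through in a few lines. This is a minimal sketch that assumes the targets rise in equal yearly steps between the 2011 baseline and the 2017 target; the slide states only the two endpoints, so the even spacing, the function name `amo_targets`, and the example baseline of 60% are illustrative assumptions.

```python
def amo_targets(baseline_pct, start_year=2011, end_year=2017):
    """Yearly targets that close half of the gap to 100% by end_year.

    Assumption: targets rise in equal yearly steps (not stated on the slide).
    """
    gap = 100.0 - baseline_pct              # the "proficiency gap"
    final_target = baseline_pct + gap / 2   # half of the gap must be closed
    years = end_year - start_year
    step = (final_target - baseline_pct) / years
    return {
        start_year + i: round(baseline_pct + step * i, 1)
        for i in range(years + 1)
    }


# Hypothetical subgroup with 60% meeting standard in 2011.
targets = amo_targets(60.0)
for year, pct in targets.items():
    print(year, pct)
```

Under these assumptions a subgroup that starts at 60% has a 40-point gap, so its 2017 target is 80%.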
6. Example of AMOs
   [Line chart: AMO targets by year, 2011 to 2017, on a 0 to 100 scale, with one line per group: Asian, White, Two or More Races, All, Low Income, Black, Pacific Islander, Hispanic, American Indian, Special Education, Limited English]
7. Revised Federal Accountability “Sanctions”
   Instead of “not making AYP,” lowest performing schools are now identified for more support
   3 types of “Persistently Low Achieving” schools
   • Priority: Bottom 5% in “all students” category
   • Focus: Bottom 10% of all subgroups (Asian, black, Hispanic, white, low income, ELL, special education)
   • Emerging: Schools close to becoming Priority or Focus (next lowest 5%/10%)
   No grade-band distinctions (elementary, middle, high, comprehensive, alternative are all in the same rankings)
8. Revised Federal Accountability “Sanctions”
   System to identify low performing schools is badly flawed
   • Applies only to Title I schools, must have N > 30 for three years
   • To identify Focus and Emerging schools, all subgroups are combined and ranked together
   • In 2012, every Focus and Emerging school (186 total) was identified based on ELL or SpEd subgroups (or both)*
   If a school has a large ELL and/or SpEd population and is Title I, the odds of identification are very high
   * A few alternative schools were also identified for low graduation rates
9. Accountability Systems
   Educational accountability systems require:
   (1) measures of effectiveness
   (2) goals to guide improvement efforts
   (3) reports that provide useful information to policymakers, educators, and parents
   (4) a set of consequences that recognize exemplary performance and support those needing more help
   In response to the flawed AYP system, the State Board of Education created an Accountability Index in 2009 to provide a better measure of school effectiveness
10. Original Accountability Index*
   Five Outcomes
   Results from 4 assessments (reading, writing, math, science) aggregated together from all grades and all students, extended graduation rate for all students, minimum N = 10
   Four Indicators
   1. Achievement by non-low income students (% meeting standard/ext. grad rate)
   2. Achievement by low income students (eligible for FRL)
   3. Achievement vs. Peers (make “apples to apples” comparisons by controlling for percent ELL, low-income, special ed, gifted, mobility)
   4. Improvement (change in Learning Index from previous year)
   Creates a 5x4 matrix with 20 outcomes, each rated on a scale of 1-7
   * Required by Legislature in 2009 (ESHB 2261)
11. Original Accountability Index Matrix (multiple measures using available state data)

   | Indicator | Reading | Writing | Math | Science | Ext. G.R. | Avg. |
   |---|---|---|---|---|---|---|
   | Non-low inc. achievement | | | | | | |
   | Low inc. ach. | | | | | | |
   | Ach. vs. peers | | | | | | |
   | Improvement | | | | | | |
   | Average | | | | | | Index* |

   * Simple average of all rated cells (compensatory model)
12. Index Benchmarks and Ratings

   Achievement of non-low income and low income students (% met standard):

   | Rating | % met standard (Reading, Writing, Math, Science) | Rate (Ext. grad rate) |
   |---|---|---|
   | 7 | 90 – 100% | > 95 |
   | 6 | 80 – 89.9% | 90 – 95% |
   | 5 | 70 – 79.9% | 85 – 89.9% |
   | 4 | 60 – 69.9% | 80 – 84.9% |
   | 3 | 50 – 59.9% | 75 – 79.9% |
   | 2 | 40 – 49.9% | 70 – 75% |
   | 1 | < 40% | < 70% |

   Achievement vs. Peers (Learning Index):

   | Rating | Difference in Learning Index (Reading, Writing, Math, Science) | Difference in rate (Ext. grad rate) |
   |---|---|---|
   | 7 | > .20 | > 12 |
   | 6 | .151 to .20 | 6.1 to 12 |
   | 5 | .051 to .15 | 3.1 to 6 |
   | 4 | -.05 to .05 | -3 to 3 |
   | 3 | -.051 to -.15 | -3.1 to -6 |
   | 2 | -.151 to -.20 | -6.1 to -12 |
   | 1 | < -.20 | < -12 |
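The achievement benchmarks on slide 12 amount to a lookup by lower bound. The sketch below is illustrative only: the function names are hypothetical, values that fall between printed bands (for example 89.95) are assigned by lower bound, and a graduation rate of exactly 75, which the slide lists in two bands, is treated as a 3.

```python
# (rating, lower bound) pairs, highest band first.
PCT_MET_FLOORS = [(7, 90), (6, 80), (5, 70), (4, 60), (3, 50), (2, 40)]
GRAD_RATE_FLOORS = [(6, 90), (5, 85), (4, 80), (3, 75), (2, 70)]


def rate_by_floor(value, floors):
    """Return the rating of the first band whose lower bound `value` reaches."""
    for rating, floor in floors:
        if value >= floor:
            return rating
    return 1


def rate_pct_met(pct):
    """Rating for percent meeting standard in a content area."""
    return rate_by_floor(pct, PCT_MET_FLOORS)


def rate_grad_rate(rate):
    """Rating for extended graduation rate; the top band is a strict '> 95'."""
    return 7 if rate > 95 else rate_by_floor(rate, GRAD_RATE_FLOORS)


# Non-low income percentages from the slide 17 example school.
print([rate_pct_met(p) for p in (92.5, 93.7, 58.7, 56.5)])
print(rate_grad_rate(94.9))
```

Applied to the slide 17 example, these functions give the same achievement ratings that the slide shows (7, 7, 3, 3 and a 6 for graduation rate).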
13. Index Benchmarks and Ratings

   Improvement (Learning Index):

   | Rating | Change in Learning Index (Reading, Writing, Math, Science) | Change in rate (Ext. grad rate) |
   |---|---|---|
   | 7 | > .15 | > 6 |
   | 6 | .101 to .15 | 4.1 to 6 |
   | 5 | .051 to .10 | 2.1 to 4 |
   | 4 | -.05 to .05 | -2 to 2 |
   | 3 | -.051 to -.10 | -2.1 to -4 |
   | 2 | -.101 to -.15 | -4.1 to -6 |
   | 1 | < -.15 | < -6 |

   • No Improvement rating given when performing at a very high level (sensitive to “ceiling” effect)
   • Index excluded ELL results in the first 3 years of enrollment (ELLs must still take tests, most exit in 3 years)
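The Improvement scale and the Achievement vs. Peers scale share one symmetric shape, so a single function with three positive cutpoints can cover both. This is a sketch under assumptions: `rate_difference` is a hypothetical name, values that fall between printed bands (for example .0505) are assigned by the cutpoint they exceed, and the "very high level" ceiling rule is not modeled because the slide does not give its threshold.

```python
def rate_difference(diff, inner, middle, outer):
    """Rate a signed difference on the symmetric 1-7 scale.

    inner/middle/outer are the positive cutpoints from the slides, e.g.
    .05/.10/.15 for Improvement (Learning Index), 2/4/6 for change in
    graduation rate, .05/.15/.20 for Achievement vs. Peers.
    """
    if diff > outer:
        return 7
    if diff > middle:
        return 6
    if diff > inner:
        return 5
    if diff >= -inner:
        return 4
    if diff >= -middle:
        return 3
    if diff >= -outer:
        return 2
    return 1


# Improvement values from the slide 17 example school.
learning_index_changes = (0.09, -0.14, -0.26, -0.04)
print([rate_difference(d, 0.05, 0.10, 0.15) for d in learning_index_changes])
print(rate_difference(-2.5, 2, 4, 6))  # change in graduation rate
```

With the slide 17 inputs this yields the Improvement ratings shown there (5, 2, 1, 4 for the content areas and 3 for graduation rate).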
14. Achievement vs. Peers
   • Recognizes context affects outcomes
   • Makes “apples to apples” comparisons (“statistical neighbors”) to control for 5 student variables (percent ELL, low-income, special education, mobile, gifted)
   • Separate analysis for each type of school (e.g., elementary, middle, high, multiple grades)
   • Non-regular schools do not receive a “peer” rating
15. Illustration of Achievement vs. Peers (1 of 5 variables)
   [Scatterplot with linear regression line. X-axis: Pct low income (0 to 100). Y-axis: Math Learning Index, 2007 (0 to 4). Fitted line: Math Learning Index, 2007 = 3.26 + -0.01 * PctLowInc; R-Square = 0.70. Labels on the plot: A, B, 7, 1, 4.]
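The idea behind the illustration can be sketched with a one-variable regression: fit a trend line, then rate each school by its distance from the line. Everything below is a simplified sketch. The school data are made up for illustration, the actual index used multiple regression on several student variables, and the function names are hypothetical. The cutpoints are the Achievement vs. Peers values from slide 12.

```python
def fit_line(xs, ys):
    """Ordinary least squares for one predictor; returns (intercept, slope)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def peer_rating(residual):
    """Map actual-minus-predicted Learning Index to a 1-7 rating."""
    for rating, cut in [(7, 0.20), (6, 0.15), (5, 0.05)]:
        if residual > cut:
            return rating
    for rating, cut in [(4, -0.05), (3, -0.15), (2, -0.20)]:
        if residual >= cut:
            return rating
    return 1


# Hypothetical schools: name -> (percent low income, Learning Index).
schools = {
    "A": (88, 2.5), "B": (22, 2.5), "C": (10, 3.2), "D": (50, 2.7),
    "E": (90, 1.9), "F": (30, 3.1), "G": (70, 2.3),
}

intercept, slope = fit_line(
    [pct for pct, _ in schools.values()],
    [li for _, li in schools.values()],
)
residuals = {
    name: li - (intercept + slope * pct) for name, (pct, li) in schools.items()
}
ratings = {name: peer_rating(r) for name, r in residuals.items()}

for name in schools:
    print(name, round(residuals[name], 2), ratings[name])
```

In this toy data, schools A and B have the same Learning Index, but A sits well above the trend line for its poverty level and B well below it, so they land at opposite ends of the rating scale.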
16. Five Tier Names and Ranges
   Schools/districts assigned to a “tier” based on index score (but some applied A-F labels to these tiers)

   | Tier | Index Range |
   |---|---|
   | Exemplary | 5.50 – 7.00 |
   | Very Good | 5.00 – 5.49 |
   | Good | 4.00 – 4.99 |
   | Fair | 2.50 – 3.99 |
   | Struggling | 1.00 – 2.49 |
17. Example - XXX High School Index (Good)

   Ratings:

   | Indicator | Reading | Writing | Math | Science | Grad Rate | Average |
   |---|---|---|---|---|---|---|
   | Non-low inc. ach. | 7 | 7 | 3 | 3 | 6 | 5.20 |
   | Low-inc. ach. | 6 | 7 | 2 | 2 | 6 | 4.60 |
   | Ach. vs. peers | 4 | 4 | 4 | 4 | 6 | 4.40 |
   | Improvement | 5 | 2 | 1 | 4 | 3 | 3.00 |
   | Average | 5.50 | 5.00 | 2.50 | 3.25 | 6.00 | 4.37 |

   Underlying results:

   | Indicator | Reading | Writing | Math | Science | Grad Rate |
   |---|---|---|---|---|---|
   | Non-low inc. ach.* | 92.5 | 93.7 | 58.7 | 56.5 | 94.9 |
   | Low-inc. ach.* | 87.2 | 91.8 | 44.8 | 40.8 | 94.2 |
   | Ach. vs. peers** | +.05 | +.01 | +.03 | +.05 | +10.3 |
   | Improvement** | +.09 | -.14 | -.26 | -.04 | -2.5 |

   * Percent meeting standard for content areas, extended graduation rate
   ** All students, content areas measured using the Learning Index
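The index arithmetic for this example school can be checked with a short script. One assumption is labeled loudly here: the published Grad Rate average (6.00) and index (4.37) are reproduced only if the Improvement rating for Grad Rate (shown as 3) is left out of the averages. That would be consistent with the ceiling rule on slide 13, but the slide does not say the cell was excluded, so treat it as an inference.

```python
from statistics import mean

OUTCOMES = ["Reading", "Writing", "Math", "Science", "Grad Rate"]

ratings = {
    "Non-low inc. ach.": [7, 7, 3, 3, 6],
    "Low-inc. ach.":     [6, 7, 2, 2, 6],
    "Ach. vs. peers":    [4, 4, 4, 4, 6],
    # Assumption: the Improvement / Grad Rate cell (shown as 3 on the
    # slide) is treated as unrated, marked here with None.
    "Improvement":       [5, 2, 1, 4, None],
}

# (lower bound, tier name) from the tier table on slide 16.
TIERS = [(5.50, "Exemplary"), (5.00, "Very Good"), (4.00, "Good"),
         (2.50, "Fair"), (1.00, "Struggling")]


def rated(cells):
    """Keep only the cells that received a rating."""
    return [c for c in cells if c is not None]


def index_score(matrix):
    """Simple average of all rated cells (compensatory model)."""
    return mean([c for row in matrix.values() for c in rated(row)])


def tier(score):
    """Tier name for an index score rounded to two decimals."""
    for floor, name in TIERS:
        if round(score, 2) >= floor:
            return name
    raise ValueError("index scores range from 1.00 to 7.00")


row_avgs = {name: mean(rated(row)) for name, row in ratings.items()}
column_avgs = {
    outcome: mean(rated([row[i] for row in ratings.values()]))
    for i, outcome in enumerate(OUTCOMES)
}
score = index_score(ratings)

print(row_avgs)
print(column_avgs)
print(round(score, 2), tier(score))
```

The Improvement row average is 3.00 whether or not the Grad Rate cell is counted, so only the Grad Rate column average and the overall index are sensitive to this assumption.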
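18. 2012 Index Results
   [Histogram of index scores, N = 2,081]

   | Index range | Schools | Tier |
   |---|---|---|
   | 1.00 - 1.49 | 41 | Struggling |
   | 1.50 - 1.99 | 38 | Struggling |
   | 2.00 - 2.49 | 75 | Struggling |
   | 2.50 - 2.99 | 119 | Fair |
   | 3.00 - 3.49 | 212 | Fair |
   | 3.50 - 3.99 | 268 | Fair |
   | 4.00 - 4.49 | 400 | Good |
   | 4.50 - 4.99 | 377 | Good |
   | 5.00 - 5.49 | 320 | Very Good |
   | 5.50 - 5.99 | 162 | Exemplary |
   | 6.00 - 6.49 | 51 | Exemplary |
   | 6.50 - 7.00 | 18 | Exemplary |

   Tier shares: Struggling 7.4%, Fair 28.8%, Good 37.3%, Very Good 15.4%, Exemplary 11.1%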
19. Washington Achievement Awards
   OSPI/SBE used 2-year averages from Accountability Index
   • Overall Excellence Award uses the Index score (top 5% by grade band)
   • Special Recognition given “on the edges” when 2-year average is > 6.00: language arts, math, science, graduation rate, Improvement
   [Matrix diagram in the Index layout (outcomes Reading, Writing, Math, Science, G.R. and Average across the top; indicators down the side). Annotations: “Compare”¹ with the non-low income and low income achievement rows; 6.00 in the Improvement row; 6.00 four times in the Average row; “Top 5%”¹ in the index cell.]
   ¹ Overall Excellence is granted only if the average difference in the income gap and the race/ethnicity gap (using a separate matrix) is < 2.5
20. New Accountability Index
   • Federal NCLB waiver required a change to the current Index – it must include subgroups and a growth measure
   • Merges two different accountability systems (state and federal) into one system
   • Has no relationship with AMOs!
   • New index is much more complicated, has different rules compared to previous index
21. New Accountability Index
   • Included in waiver proposal to U.S. Dept. of Education (waiver still not approved)
   • Includes all subgroups (race/ethnicity, programs)
   • N > 20 across grade band (not grade)
   • New rating scales (1-10) and more “labels”
   • No Peer rating
   • Growth based on SGPs, not grade band improvement in Levels
   • Includes all ELL results (including results of students who exited program)
   • Basis for identifying low-performing schools (federal acct.)
   • Sanctions also apply to non-Title I schools
   • Preliminary analyses show high correlation with school % FRL: -.53 (elementary), -.45 (middle), -.60 (high)
22. 6 Labels, Norm-referenced
   • Exemplary: Top 5% of schools using overall index, must have 60% students proficient in all tested subjects (given recognition)
   • Very Good: Next 15% of schools
   • Good: Next 30% of schools
   • Fair: Next 30% of schools
   • Underperforming: Next 5% of schools + 10% with large achievement gaps
   • Priority: Lowest 5% of index
23. Proposed Priority, Focus, Emerging
   • Includes all schools, not just Title I
   • Uses Index to identify schools rather than stacked rankings
   Priority system uses the overall index value
   – Bottom 5% are Priority (“Struggling”)
   – Next 5% from the bottom are Emerging Priority
   Focus system uses index value for each subgroup in each school
   – Bottom 10% are Focus
   – Next 10% from the bottom are Emerging Focus
24. Getting Off the Priority / Focus List*
   • For 3 consecutive years in Math and Reading:
     – Meet or exceed AMOs for all subgroups
     – Have at least 95% participation for all subgroups
     – Not be in the bottom 5% (or 10% for Focus)
     – Decrease % of students in all groups scoring Level 1 or 2 in reading and math. Improvement % must be comparable to top 30% of Title I schools
   • OSPI determines sufficient progress has been made
   * Unclear how Emerging schools get off list
25. New Emphasis on Student Growth
   • Federal waiver submitted in 2011 requires a student growth measure for the Index and for teacher and principal evaluations
   • Index has growth measure but “weak legislation” regarding use of state test results in growth measure puts waiver in jeopardy
   • OSPI amended waiver in July 2013 and requires student growth to be a “substantial factor” in 3 of 8 teacher and principal criteria – brinksmanship occurring right now
   • Many ways to measure growth, State Board only considered Student Growth Percentile (SGP)
26. Achievement vs Growth: What’s the Difference?
   [Two labeled illustrations: Achievement, Growth]
27. Measuring Student Growth
   • Growth, in its simplest form, is a comparison of the assessment results of a student or group of students between two points in time where a positive difference would imply growth.
28. Student Growth Percentiles
   • Problem: Current state assessment system was not designed to measure student growth
     – Only selected grades and subjects are tested
     – Difficulty of passing the test varies from one year to the next (the high school reading and writing HSPE is easy to pass; the bar was lowered due to the graduation requirement)
   • State’s Solution: Use a norm-referenced system that ranks the rate of student growth
29. Student Growth Percentiles
   • SGPs compare the growth rates of students who were at the same scale score level the previous year (their “academic peers”)
     Example: A student earning an SGP of 80 performed as well or better than 80 percent of the students who scored the same score the previous year
   • Does not compare the growth rate of all students to each other or compare the achievement to all students (the usual way to give percentiles)
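The "academic peers" idea can be illustrated with a deliberately simplified calculation: group students by their exact prior-year scale score and rank each student's current score within that group. This is not the state's procedure. Operational SGP models are generally estimated statistically (for example with quantile regression) rather than by exact-score grouping, and the function name, the 1 to 99 clamp, and the sample records below are all illustrative assumptions.

```python
from collections import defaultdict


def simple_growth_percentiles(records):
    """Toy SGP: rank each student among peers with the same prior score.

    records: list of (student_id, prior_score, current_score).
    Returns {student_id: percentile}, the share of same-prior-score
    students scoring at or below this student, clamped to 1-99.
    """
    groups = defaultdict(list)
    for _, prior, current in records:
        groups[prior].append(current)

    result = {}
    for student_id, prior, current in records:
        peers = groups[prior]
        at_or_below = sum(1 for score in peers if score <= current)
        pct = round(100 * at_or_below / len(peers))
        result[student_id] = max(1, min(99, pct))
    return result


# Hypothetical records: five students scored 400 last year, two scored 350.
records = [
    ("s1", 400, 380), ("s2", 400, 390), ("s3", 400, 400),
    ("s4", 400, 410), ("s5", 400, 420),
    ("t1", 350, 340), ("t2", 350, 360),
]
sgps = simple_growth_percentiles(records)
print(sgps)
```

Student `s4` scored as well as or better than four of the five students who started at 400, which gives 80 in this toy version. Student `t2` earns a high percentile with a current score of only 360, because the comparison is made within the group that started at 350 and not against all students.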
30. Student Growth Percentiles
   • SGP trajectory predicts where students will perform in the future, based on their previous growth rate and students who were at the same scale score level the previous year
   • OSPI groups students into three categories

     | Category | Share | Percentile range |
     |---|---|---|
     | High Growth | Top 1/3 | 67th to 99th percentile |
     | Typical Growth | Middle 1/3 | 34th to 66th percentile |
     | Low Growth | Bottom 1/3 | 1st to 33rd percentile |

   • The median SGP for a class, grade, school or district is the “score” (school median SGP is used in the new Index)
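The three growth categories and the median "score" are simple to compute once student SGPs are available. A minimal sketch follows; the function name and the classroom values are hypothetical.

```python
from statistics import median


def growth_category(sgp):
    """OSPI's three bands: 1-33 low, 34-66 typical, 67-99 high."""
    if sgp >= 67:
        return "High Growth"
    if sgp >= 34:
        return "Typical Growth"
    return "Low Growth"


# Hypothetical SGPs for one classroom.
classroom = [12, 35, 48, 50, 66, 67, 91]

categories = [growth_category(s) for s in classroom]
classroom_score = median(classroom)

print(categories)
print(classroom_score)
```

Because the group score is a median, it reflects the middle student only; the spread of SGPs within the class does not affect it.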
31. SGP Student Data
   Student Growth Percentile (SGP) results are available to the public on the OSPI State Longitudinal Data System (SLDS) website¹
   • From OSPI homepage, select “K-12 Data & Reports” on right side
   • Select “Static Data Files”
   • Select “Assessment” menu item, scroll down to find the SGP files and reports
   ¹ http://data.k12.wa.us/PublicDWP/Web/WashingtonWeb/Home.aspx
32. SGP School Data
   Available to the public on the OSPI State Longitudinal Data System (SLDS) website http://data.k12.wa.us/PublicDWP/Web/WashingtonWeb/Home.aspx
   • From OSPI’s homepage, click on the “K-12 Data & Reports” button on the right-hand side, then click on “Static Data Files”
   • Under the “Assessment” menu item, you can scroll down to find the SGP files and reports
33. Takes you to a list of district links
34. SGPs on OSPI’s Web site
   Three types of SGP files available to public
   • Bubble chart with all schools, with district’s schools identified (hover over bubble for results)
   • Individual school results by subgroup (compared to district and state for three years)
   • Excel file with all results for all schools and district (Renton’s file has > 5000 rows and 20 columns)
35. Problems with SGP
   1. Results can be misleading
      Percentile rank is not based on all students, so the 50th percentile is not the middle of the entire distribution, just those who had the same scale score the previous year
   2. SGPs do not provide a measure of adequate (enough) growth or a year’s worth of growth
      A student can be at the 50th percentile and not make a year’s worth of growth or enough growth to meet expectations upon graduation; another student can be at the 50th percentile and make more than a year’s worth of growth
36. Student Report: No growth is “typical”
37. Student Report: Decline is “high growth”
38. Problems with SGP
   3. Results may not reflect an accurate measure of student growth or educator effectiveness
      • SGPs are “highly unstable” and “problematic” for students with very high and low scores because there are relatively few students with those scores to obtain stable rankings¹
      • No standard errors reported
      • Does not control for differences in the student population
   4. Results are not available in a timely manner
   5. SGPs are new and harder to understand than current metrics
   ¹ Castellano, K. and Ho, A. (2013). A Practitioner’s Guide to Growth Models. Washington, DC: Council of Chief State School Officers
39. Alternative Measure of Student Growth
   • Criterion-referenced approach
   • Students are compared to their own growth, not the growth rate of others
   • Encourages cooperation because score doesn’t depend on how other students perform
   • Can be computed quickly and easily – doesn’t require a minimum number of students and doesn’t depend on how other students perform
   • Uses familiar data and concepts, makes it easy to understand
40. Measuring Achievement and Growth
   [Quadrant chart. X-axis: Change in Scale Score from Grade 4 (2012), -100 to 100. Y-axis: 2013 Grade 5 Math Scale Score, banded by level: Above 439 Level 4 (Exceeds standard); 400-439 Level 3 (Meets standard); 375-399 Level 2 (Below standard); Below 375 Level 1 (Far below standard). Quadrant labels: Leading, Slipping, Gaining, Lagging.]
41. 2013 Achievement and Growth from 2012 (Math, Grade 4 and Change from Grade 3)
   [Quadrant scatterplot. X-axis: Change in Scale Score from Grade 3 (2012), -100 to 100. Y-axis: 2013 Grade 4 Math Scale Score, 300 to 500, banded by level: Above 439 Level 4 (Exceeds standard); 400-439 Level 3 (Meets standard); 375-399 Level 2 (Below standard); Below 375 Level 1 (Far below standard). Quadrant labels: Leading, Slipping, Gaining, Lagging.]
   • Quadrant shares shown on the chart: 15.6% (N=142), 50.4% (N=460), 5.9% (N=54), 28.1% (N=257)
   • Average change in scale score: +6.5 (413.1 to 419.6), N = 913, R² = .58
   • 56.3% of the students made at least one year gain (change in scale score > 0)
   • Each dot represents a student who was enrolled in the district in both 2012 and 2013 (scores below 300 were marked as 300, scores above 500 were marked as 500)
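The quadrant chart can be read as a two-way classification of each student. The sketch below assumes the split is "meets standard" (scale score of 400 or higher) against "positive change in scale score"; the slides name the quadrants but do not spell out these rules, so the definitions, the function name, and the sample students are assumptions. The reading is at least consistent with the chart, where two of the quadrant shares (50.4% and 5.9%) add up to the 56.3% who made a gain.

```python
from collections import Counter

MEETS_STANDARD = 400  # Level 3 (Meets standard) starts at 400


def quadrant(current_score, change):
    """Classify a student by achievement level and direction of change."""
    meets = current_score >= MEETS_STANDARD
    gained = change > 0  # the slide counts a gain as change > 0
    if meets:
        return "Leading" if gained else "Slipping"
    return "Gaining" if gained else "Lagging"


# Hypothetical students: (current scale score, change from prior year).
students = [(420, 15), (445, -8), (380, 22), (360, -5), (410, 0)]

counts = Counter(quadrant(score, change) for score, change in students)
share_with_gain = sum(1 for _, change in students if change > 0) / len(students)

print(counts)
print(f"{share_with_gain:.0%} made a gain")
```

Unlike an SGP, this classification for one student does not depend on how any other student performed, which is the point made on slide 39.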
42. Change in Math Scale Scores, 2011 to 2012
   [Two chart panels: Non-Low Income and Low Income (FRL). Annotations: “43% made 1+ years gain” and “60% made 1+ years gain”.]
43. Limitations to Alternative Measure
   • Proficiency cut scores vary slightly from grade to grade
     It’s harder to meet standard in some grades compared to others (like having an easy teacher one year and a hard teacher the next)
   • No “vertical scale” to measure absolute growth
     Smarter Balanced assessments will have a vertical scale and cut scores that align with college/career readiness
   For more details, see WERA Educational Journal, Winter 2014 article, “Using SGPs to Measure Student Growth: Context, Characteristics, and Cautions” www.wera-web.org