SC - Assessment literacy in a teacher evaluation frame 11-12
 


Using a framework for data in teacher evaluations, we answer questions about assessments and data and suggest areas to consider.


Usage Rights

© All Rights Reserved

  • Concept: if we fix schools, we fix education. Schools actually did improve during this period.
    Race to the Top, Gates Foundation, Teach for America…
    Signaled in a number of ways:
    NCLB was about fixing schools: 100% proficient by 2014.
    Punishments for AYP: SES, choice, restructuring.
    Obama switch: Race to the Top.
    Fixing or improving teaching and the teaching profession.
    Recruiting teachers from alternative careers.
    Move from holding schools accountable to holding teachers accountable. Wrong? No. Different? Yes.
    David Brooks, Aug 2010, Atlantic Monthly: teachers are fair game. Teachers are under scrutiny, somewhat unfairly.
    BOEs are asking about test-based accountability.
    Charleston, SC: any teacher without 50% of students on growth norm is on report in Yr 1; in Yr 2, rehired only by approval of the BOE.
    50% in Yr 1, 25% in Yr 2 to be rehired.
    Our goal: make sure you are prepared. Understand the risk. Proper ways to implement, including legal issues. Clarify some of the implications. Very complex. Prepare you and lay out a prudent course.
  • Teacher evaluations, and the use of data in them, can take many forms. You can use them to support teachers and their improvement. You can use them to compensate teachers or groups of teachers differently. Or you can use them in their highest-stakes way, to terminate teachers. The higher the stakes placed on the evaluation, the more risk there is to you and your organization from a political, legal, and equity perspective. Most people naturally respond by increasing the rigor put into designing the process as a way to ameliorate the risk, but the risk can’t be eliminated.
  • Contrast with what value added communicates.
    Plot normal growth for Marcus vs anticipated growth: value added. If you ask whether the teachers provided value added, the answer is yes.
    The other line is what is needed for college readiness.
    The blue line is what is used to evaluate the teacher. Is he on the line the parents want him to be on? Probably not.
    Don’t focus on one at the expense of the other.
    NCLB: AYP vs what the parent really wants for goal setting.
    We can become so focused on measuring teachers that we lose sight of what parents value.
    We are better off moving towards the kids’ aspirations.
    As a parent I didn’t care if the school made AYP. I cared if my kids got the courses that helped them go where they want to go.
  • This is the value-added metric.
    Not easy to make nuanced decisions. Can learn about the ends.
  • Steps are quite important. People tend to skip some of these.
    Kids take a test: it is important that the test is aligned to the instruction being given.
    Metric: look at growth vs the growth norm and calculate a growth index. Two benefits: very transparent and simple.
    People tend to use our growth norms: if you hit 60% for a grade level within a school you are doing well.
    Norms: growth of a kid or group of kids compared to a nationally representative sample of students.
    Why isn’t this value added?
    Not all teachers can be compared to a nationally representative sample because they don’t teach kids that are just like the national sample.
    The third step controls for variables unique to the teacher’s classroom or environment.
    Fourth step: rating. How much below average before the district takes action, or how much above before someone gets performance pay? Particular challenge in NY state right now. Law requires it.
  • State assessments are designed to measure proficiency: many items in the middle, not at the ends.
    Must use multiple points of data over time to measure this. We also believe that a principal should be more in control of the evaluation than the test. Principals and teacher leaders are what change schools.
  • 5th grade IL math cut scores shown
  • Common Core: very ambitious things they want to measure. Tackle the kinds of things on an AP test. Write and show their work.
    A CC assessment to evaluate teachers can be a problem.
    Raise your hand if you know what the capital of Chile is. Santiago. Repeat after me. We will review in a couple of minutes. Facts can be relatively easily acquired and are instructionally sensitive. If you expose kids to facts in meaningful and engaging ways, the result is sensitive to instruction.
  • Problem: insensitive to instruction.
    Prereq skills: writing skills. Given events in N. Africa today, the question requires a lot of prerequisite knowledge. Need to know the story. Put it into writing. Reasoning skills to put it together with events today. And I need to know what is going on today as well. One doesn’t develop this entire set of skills in the 9 months of instruction.
    Common Core is what we want. Just not for teacher evaluation.
    These questions are not that sensitive to instruction. Problematic when we hold teachers accountable for instruction or growth.
  • NCLB required everyone to get above proficient. Message: focus on kids at or near proficient.
    School systems responded.
    MS standards are harder than the elem standards: the MS problem.
    No effort to calibrate them; no effort to project elem to MS standards.
    Start easy and ramp up.
    Proficient in elem and not in MS with normal growth. When you control for the difficulty in the standards, elem and MS performance are the same.
  • Not only are standards different across grades, they are different across states.
    It’s data like this that helps to inspire the Common Core and consistent standards, so we compare apples to apples.
  • Dramatic differences between standards-based vs growth.
    KY 5th grade mathematics.
    Sample of students from a large school system.
    X-axis: fall score. Y-axis: number of kids.
    Blue are the kids who did not change status between the fall and the spring on the state test.
    Red are the kids who declined in performance over spring: descenders.
    Green are the kids who moved above it in performance over the spring: ascenders, the bubble kids.
    About 10%, based on the total number of kids.
    Accountability plans are typically made based on these red and green kids.
  • Same district as before.
    Yellow: did not meet target growth; spread over the entire range of kids.
    Green: did meet growth targets.
    60% vs 40% is doing well. This is a high-performing district with high growth.
    Must attend to all kids, and this is a good thing: the ones in the middle and at both extremes.
    The old one was discriminatory: focus on some in lieu of others.
    Teachers who teach really hard at the standard for years. Teachers need to be able to reach them all.
    This does a lot to move the accountability system to parents and our desires.
  • There are wonderful teachers who teach in very challenging, dysfunctional settings. The setting can impact the growth. HLM embeds the student in a classroom, the classroom in the school, and controls for the school parameters. Is it perfect? No. Is it better? Yes.
    The opposite is true, and learning can be magnified as well.
    What if kids are a challenge, ESL or attendance for instance? It can deflate scores, especially with a low number of kids in the sample being analyzed. Also need to make sure you have a large enough ‘n’ to make this possible; this is especially true in small districts.
    Our position is that a test can inform the decision, but the principal/administrator should collect the bulk of the data that is used in the performance evaluation process.
  • Experts recommend multiple years of data to do the evaluation. Invalid to just use two points, and will testify to it.
    Principals never fire anyone (NY rubber room): myth.
    If they do, it’s not fast enough. Need to speed up the process.
    This won’t make the process faster. Principals doing intense evaluations will.
  • The question we asked: are teachers who are rated poorly or well in one year likely to stay there in the second year? Important if high stakes, where there is a belief that someone won’t improve.
    We did VA assessment in year one and again in year two: 493 teachers.
    40% of people in the bottom quintile moved out.
    Yr 1 and Yr 2 correlations: these results are more highly correlated than most other studies. Ours is the best-case scenario.
    One class can impact results, so you need multiple years of data to get stable results.
  • Measurement error is compounded in test 1 and test 2
  • Green line is their VA estimate and the bar is the error of measure.
    Both on top and bottom, people can be in other quartiles.
    People in the middle can cross quintiles, just based on SEM.
    Cross country: winners spread out. End of the race, spread. In the middle you get a pack. In the middle, moving up makes a big difference in the overall race.
    Instability and narrowness of ranges mean that, when evaluating teachers in the middle, slight changes in performance can be a large change in performance ranking.
  • Non-random assignment.
    Models control for various things: FRL, ethnicity, school effectiveness overall. Beyond this point, the models treat assignment as random.
    1st-year teachers get more discipline problems than teachers who have been there 30 years. Pick the kids they get. If the model doesn’t control for disciplinary record (none have that data), scores are inflated. Makes the model invalid.
    Principals do need to do non-random assignment: sound educational reasons for the placement, matching adults to kids.
  • One or two kids can impact a classroom, but not a grade or school.
    Why? A large n helps reduce the standard error.
  • Use NY point system as the example
  • Assessment is ultimately to serve kids. Be thoughtful. Get help.
    Involve stakeholders in the creation of a comprehensive evaluation system with multiple measures of teacher effectiveness (Rand, 2010).
    Select the measures and VA models carefully.
    Bring as much data to bear as possible to create a body of evidence.
    Start small and learn.
    We wouldn’t be who we are if I didn’t stress using the data for formative purposes. That’s what we really value.
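
Three pieces of arithmetic that the notes above lean on can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the presenters' actual method. The growth index is assumed to be observed fall-to-spring growth minus the growth norm. The standard error of a gain score assumes the errors on the two tests are independent. The standard error of a class mean assumes a simple random sample. All function names and the student records are hypothetical.

```python
import math


def growth_index(fall_score, spring_score, growth_norm):
    """Observed growth minus the norm-group growth (assumed definition)."""
    return (spring_score - fall_score) - growth_norm


def percent_meeting_norm(students):
    """Percent of students whose growth meets or exceeds their growth norm.

    `students` is a list of (fall_score, spring_score, growth_norm) tuples.
    """
    met = sum(1 for fall, spring, norm in students
              if growth_index(fall, spring, norm) >= 0)
    return 100.0 * met / len(students)


def gain_sem(sem_test1, sem_test2):
    """Standard error of a gain score (test 2 minus test 1).

    Assumes the measurement errors of the two tests are independent,
    so their variances add.
    """
    return math.sqrt(sem_test1 ** 2 + sem_test2 ** 2)


def standard_error_of_mean(sd, n):
    """Standard error of a group mean: shrinks as the group size n grows."""
    return sd / math.sqrt(n)


# Hypothetical classroom: (fall score, spring score, growth norm).
students = [
    (200, 210, 8),
    (195, 200, 7),
    (220, 226, 5),
    (205, 209, 6),
    (190, 201, 9),
]

print(percent_meeting_norm(students))       # share of students meeting the norm
print(gain_sem(3.0, 3.0))                   # larger than either test's SEM alone
print(standard_error_of_mean(10.0, 25))     # one classroom
print(standard_error_of_mean(10.0, 100))    # a whole grade
```

With the same spread of scores, the grade-sized group has half the standard error of the classroom-sized group. This is why one or two students can move a classroom result but barely move a grade or school result.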