2010 C Washington (Future Of Assessments) [No Notes] Rev 1 1
1. The future of Assessments: Lessons learned internationally. Washington, 9 March 2010. Andreas Schleicher, Head, Indicators and Analysis Division, OECD Directorate for Education
2. The future of assessments Or the Alchemists’ Stone? The Holy Grail? Know why you are looking You cannot improve what you cannot measure The yardstick for success is no longer just improvement by national standards but the best performing education systems globally Know what you are looking for A new assessment culture Responsive to changing skill requirements Capitalising on methodological advances Not sacrificing validity gains for efficiency gains Know how you will recognise it when you find it Gauging predictive validity Impact on improving learning and teaching Implications and lessons learned
3.
4. A world of change – higher education Expenditure per student at tertiary level (USD) Cost per student Graduate supply Tertiary-type A graduation rate
5. A world of change – higher education Expenditure per student at tertiary level (USD) United States Cost per student Finland Graduate supply Tertiary-type A graduation rate
6. A world of change – higher education Expenditure per student at tertiary level (USD) Australia Finland United Kingdom Tertiary-type A graduation rate
7. A world of change – higher education Expenditure per student at tertiary level (USD) Tertiary-type A graduation rate
12. A world of change – higher education Expenditure per student at tertiary level (USD) United States Australia Finland Tertiary-type A graduation rate
19. Schooling in the industrial age: Uniform learning The challenges today: Universal quality Motivated and self-reliant citizens Risk-taking entrepreneurs, converging and continuously emerging professions tied to globalising contexts and technological advance
20. How the demand for skills has changed: Economy-wide measures of routine and non-routine task input (US) Mean task input as percentiles of the 1960 task distribution The dilemma of assessments: The skills that are easiest to teach and test are also the ones that are easiest to digitise, automate and outsource (Levy and Murnane)
21. Changing skill demands The great collaborators and orchestrators The more complex the globalised world becomes, the more individuals and companies need various forms of co-ordination and management The great synthesisers Conventionally, our approach to problems was breaking them down into manageable bits and pieces, today we create value by synthesising disparate bits together The great explainers The more content we can search and access, the more important the filters and explainers become
22. Changing skill demands The great versatilists Specialists generally have deep skills and narrow scope, giving them expertise that is recognised by peers but not valued outside their domain Generalists have broad scope but shallow skills Versatilists apply depth of skill to a progressively widening scope of situations and experiences, gaining new competencies, building relationships, and assuming new roles. They are capable not only of constantly adapting but also of constantly learning and growing The great personalisers A revival of interpersonal skills, skills that have atrophied to some degree because of the industrial age and the Internet The great localisers Localising the global
23. Education today needs to prepare students… … to deal with more rapid change than ever before… … for jobs that have not yet been created… … using technologies that have not yet been invented… … to solve problems that we don’t yet know will arise It’s about new… Ways of thinking involving creativity, critical thinking, problem-solving and decision-making Ways of working including communication and collaboration Tools for working including the capacity to recognise and exploit the potential of new technologies The capacity to live in a multi-faceted world as active and responsible citizens.
24. Mathematics in PISA The real world The mathematical World Making the problem amenable to mathematical treatment A mathematical model A model of reality Understanding, structuring and simplifying the situation Using relevant mathematical tools to solve the problem A real situation Validating the results Mathematical results Real results Interpreting the mathematical results
31. Participative / internal Formative classroom-based assessments (e.g. Europe, Asia) Efficiency gains Validity gains Large scale and high-stakes summative assessments, typically multiple-choice to contain costs (US, England, Latin America…) Large scale and low-stakes assessments, sample-based administration allows for complex task types (e.g. Northern Europe, Scotland, PISA) Administrative / external
32.
33. Responding to assessments can enhance student learning if tasks are well crafted to incorporate principles of learning
34.
35.
36. Increased likelihood of postsecondary participation at age 19/21 associated with PISA reading proficiency at age 15 (Canada) after accounting for school engagement, gender, mother tongue, place of residence, parental education and family income (reference group PISA Level 1) Odds ratio College entry School marks at age 15 PISA performance at age 15
37. Relationship between test performance and economic outcomes: Annual improved GDP from raising performance by 25 PISA points Percent addition to GDP
39. Implications and lessons learned The medieval alchemists followed the dictates of a well-established science, but one that was built on wrong foundations The search for the Holy Grail was overburdened by false clues and cryptic symbols
40. From assessment-inhibited practice towards outcome driven reform Strong focus on processes Integrated quality management Good will and trust Weak outcome-based management Strong outcome-based management External control, uninformed prescription Deprivation Weak focus on processes
41. Some criteria used in the world Coherence Built on a well-structured conceptual base—an expected learning progression—as the foundation both for large scale and classroom assessments Consistency and complementarity across administrative levels of the system and across grades Comprehensiveness Using a range of assessment methods to ensure adequate measurement of intended constructs and measures of different grain size to serve different decision-making needs Provide productive feedback, at appropriate levels of detail, to fuel accountability and improvement decisions at multiple levels Continuity A continuous stream of evidence that tracks the progress of both individual students .
42. Designing assessments Assessment frameworks A working definition of the domain and its underlying assumptions Organising the domain and identifying key task characteristics that guide task construction Operationalising task characteristics in terms of variables Validating the variables and assessing the contribution they each make to understanding task difficulty Establishing an interpretative scheme .
43. Understanding learning progressions Learning targets Defining what mastery means for a given skill level Progress variables Delineate a pathway that characterises the steps that learners typically follow as they become more proficient Evaluation of students’ reasoning in terms of the correctness of their solutions as well as in terms of their complexity, validity and precision Levels of achievement Describing the breadth and depth of the learner’s understanding of the domain at a particular level of advancement Learning performances The operational definitions of what a student’s understanding would look like at each of the stages of progress. Wilson, ATC21S
53. Knowledge about science Attitudes - Interest in science - Support for scientific enquiry - Responsibility Students demonstrate ability to compare and differentiate among competing explanations by examining supporting evidence. They can formulate arguments by synthesising evidence from multiple sources. Students can point to an obvious feature in a simple table in support of a given statement. They are able to recognise if a set of given characteristics applies to the function of everyday artifacts.
54. Some methodological challenges Can we sufficiently distinguish the role of context from that of the underlying cognitive construct? Do new types of items that are enabled by computers and networks change the constructs that are being measured? Can we drink from the firehose of increasing data streams that arise from new assessment modes? Can we utilise new technologies and new ways of thinking of assessments to gain more information from the classroom without overwhelming the classroom with more assessments? What is the right mix of crowd wisdom and traditional validity information? How can we create assessments that are activators of students’ own learning? Wilson, ATC21S
55. High policy value A real-time assessment environment that bridges the gap between formative and summative assessment . Quick wins Must haves Examine individual, institutional and systemic factors associated with performance Extending the range of competencies through which quality is assessed Monitor educational progress Measuring growth in learning Low feasibility High feasibility Establish the relative standing of students and schools Assuming that every new skill domain is orthogonal to all others Money pits Low-hanging fruits Low policy value
85. www.oecd.org; www.pisa.oecd.org All national and international publications The complete micro-level database email: Andreas.Schleicher@OECD.org Twitter: @SchleicherEDU … and remember: Without data, you are just another person with an opinion Thank you !
Editor's Notes
Let me begin by highlighting some of the transformations we are seeing across OECD countries from the old bureaucratic education systems to the modern enabling systems. In the past, teachers were often left alone in classrooms with a lot of prescription on what to teach; modern education systems set ambitious goals, clarify what students should be able to do, and then provide teachers with the tools to figure out what content and instruction they need to provide to their students. In the past, different students were taught in similar ways; today the challenge is to embrace diversity with differentiated pedagogical practices. The focus of policy is shifting from provision to outcomes, from looking upwards in the bureaucracy to looking outwards to the next teacher, the next school. We talk a lot about equity, but we are seeing systems delivering equity. The goal of the past was standardisation and conformity; it is now about being ingenious, about personalising educational experiences. The past was curriculum-centred, we told teachers what to teach; the future is learner-centred.
Education systems are responding to the challenges, first of all, with rapidly rising output. The pace of change is most clearly visible in higher education, and I want to bring two more dimensions into the picture here. Each dot on this chart represents one country. The horizontal axis shows you the college graduation rate, the proportion of an age group that comes out of the system with a college degree. The vertical axis shows you how much it costs to educate a graduate per year.
Let's now add where the money comes from into the picture: the larger the dot, the larger the share of private spending on college education, such as tuition. The chart shows the US as the country with the highest college graduation rate, and the highest level of spending per student. The US is also among the countries with the largest share of resources generated through the private sector. That allows the US to spend roughly twice as much per student as Europe. US, Finland. The only thing I have not highlighted so far is that this was the situation in 1995. And now watch this closely as you see how this changed between 1995 and 2005.
You see that in 2000, five years later, the picture looked very different. While in 1995 the US was well ahead of any other country (you see that marked by the dotted circle), in 2000 several other countries had reached out to this frontier. Look at Australia, in pink.
But we need to be careful before claiming success: quantity does not provide much of a guarantee for success. Let us go back to the 1960s. The chart shows you the wealth of world regions and the average years of schooling in these regions, which is the most traditional measure of human capital. Have a look at Latin America: it ranked third in wealth and third in years of schooling, so in the 1960s the world seemed pretty much in order.
But when you look at economic growth between 1960 and 2000, you see that something went wrong. Despite the fact that Latin America did well in terms of years of schooling, only Sub-Saharan Africa did worse in terms of economic growth. So in 2000, Latin America had fallen back considerably in terms of GDP per capita. You can draw two conclusions from this: either education is not as important for economic growth as we thought, or we have for a long time been measuring the wrong thing.
Now let me add one additional element, and that is a measure of the quality of education, in the form of the score of the different world regions on international tests like PISA or TIMSS. And you see now that the world looks in order again: there seems to be a close relationship between test scores and economic growth. You can see that even more clearly when you put this into graphical form. And the relationship that you see here holds even when you account for other factors; it even holds when you compare growth in economies with growth in learning outcomes, which is the closest we can come to examining causality. So what this tells you is that it is not simply years of schooling or the number of graduates we produce, but indeed the quality of learning outcomes that counts. That is why it is so crucial to develop assessments that tell us something about the quality of education.
Levy and Murnane show how the composition of the US work force has changed. What they show is that, between 1970 and 2000, work involving routine manual input, the jobs of the typical factory worker, was down significantly. Non-routine manual work, things we do with our hands but in ways that are not so easily put into formal algorithms, was down too, albeit with much less change over recent years, and that is easy to understand because you cannot easily computerise the bus driver or outsource your hairdresser. All that is not surprising, but here is where the interesting story begins: among the skill categories represented here, routine cognitive input, that is, cognitive work that you can easily put into the form of algorithms and scripts, saw the sharpest decline in demand over the last couple of decades, with a decline by almost 8% in the share of jobs. So those middle-class white-collar jobs that involve the application of routine knowledge are most at threat today. And that is where schools still put a lot of their focus, and what we value in multiple-choice accountability systems. The point here is that the skills that are easiest to teach and test are also the skills that are easiest to digitise, automate and offshore. If that is all that we do in school, we are putting our youngsters right up for competition with computers, because those are the things computers can do better than humans, and our kids are going to lose out before they have even started. Where are the winners in this process? They are those who engage in expert thinking, the new literacy of the 21st century, up 8%, and complex communication, up almost 14%.
What this chart tells you is that this is all not simply about more of the same skills, but that the nature of the skills that matter is changing rapidly. Education today needs to prepare students to deal with more rapid change than ever before, for jobs that have not yet been created, using technologies that have not yet been invented, to solve problems that we don’t yet know will arise. It’s about new ways of thinking, involving creativity, critical thinking, problem-solving and decision-making; ways of working, including communication and collaboration; tools for working, including the capacity to recognise and exploit the potential of new technologies; and the capacity to live in a multi-faceted world as active and responsible citizens.
The approaches to assessment that countries are pursuing to get there differ markedly. Many systems seek to figure out how effectively education systems, or parts of them, function as a whole. To do that, they typically assess a sample of schools and students rather than every student in every school. Of course, you then cannot compare the performance of individual schools and build accountability systems around that. On the other hand, you can put more resources into testing fewer students better, rather than sacrificing validity gains for efficiency gains. Other assessment systems focus on school performance. And yet others seek to evaluate teachers and students in classrooms. And of course, all of these systems are complementary and they typically look at what students are able to do. Assessment systems, and the philosophies around which they are built, vary also by their purposes. Is the primary purpose accountability, with results made publicly available and linked to some form of consequences? Those purposes are common in England, the United States and Latin America. Or is the focus diagnosis and improvement, with the results feeding back directly to teachers and learners? That tends to be the primary focus in Northern Europe or Scotland, for example. Again, these are complementary, not competing, objectives, even if there is some tension between objectivity in an assessment and relevance. There are also differences in who is in charge of preparing the evaluation, conducting it and using the results. Systems involve very different actors and stakeholders in their assessment processes; often this is related to the political economy, to maximise the utility and acceptance of assessment processes. Similarly, the types of instruments differ widely, both within and across countries.
When you think about an assessment in the United States, you may imagine a multiple-choice test, but there are other countries where multiple-choice tests are not used at all; they may use open-ended assessments or even oral examinations. Or they may use classroom observations, student or teacher portfolios or other instruments. I will come back to this aspect. And then, of course, assessments differ by what they assess, whether the focus is on inputs, processes or outputs.
But there are also some common trends, most importantly the move towards multi-layered assessment systems that coherently extend from students to schools to states, nations and the international level. These assessments seek not to take learning time away from students, but try to enhance the learning of students, of teachers, of school administrators and policy makers, through building frameworks for lateral accountability. The assessments underline that successful learning is as much about the process as it is about facts and figures; they emphasise that success is not about the reproduction of subject matter content, but about the capacity to integrate, synthesise and creatively extrapolate from what you know and apply that knowledge in novel situations. They try to provide a window into students’ understandings and the conceptual strategies a student uses to solve a problem. They provide dynamic task contexts in which prior actions may stimulate unpredictable reactions that in turn influence subsequent strategies and options. They try to add value for teaching and learning. The process of responding to assessments can enhance student learning if assessment tasks are well crafted to incorporate principles of learning and cognition. For example, assessment tasks can incorporate transfer and authentic applications, and can provide opportunities for students to organise and deepen their understanding through explanation and use of multiple representations. They try to generate information that can be acted upon and that provides productive and usable feedback for all intended users. Teachers need to be able to understand what the assessment reveals about students’ thinking.
And school administrators, policymakers, and teachers need to be able to use this assessment information to determine how to create better opportunities for student learning. Last but not least, those assessments do not operate in a vacuum but are part of a comprehensive set of instruments that extend to instructional material as well as teacher training.
How do we deal with the issue of external validity? The best way to find out whether what students have learned at school matters for their life is to actually watch what happens to them after they leave school. This is exactly what we have done with around 30,000 students in Canada. We tested them in the year 2000, when they were 15 years old, in reading, math and science, and since then we have been following up with them each year on what choices they make and how successful they are in their transition from school to higher education and work. The horizontal axis shows you the PISA level at which 15-year-old Canadians scored in 2000. Level 2 is the baseline level on the PISA reading test and Level 5 the top level in reading. The red bar shows you how many times more successful someone who scored Level 2 at age 15 was at age 19 in making a successful transition to university, as compared to someone who did not make it to the baseline PISA level 1. And to ensure that what you see here is not simply a reflection of social background, gender, immigration or school engagement, we have already statistically accounted for all of these factors. The orange bar. … What would you expect the picture to look like at age 21? We are talking about test scores here, but for a moment, let's go back to the judgements schools make on young people, for example through school marks. You can do the same thing here: you can see how well school marks at age 15 predict the subsequent success of youths. You see that there is some relationship as well, but that it is much less pronounced than when we use the direct measure of skills.
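For readers less familiar with the odds ratios behind those bars, here is a minimal sketch of how an unadjusted odds ratio is computed from two groups. It is an illustration only: the function name `odds_ratio` and all counts are hypothetical, not figures from the Canadian study, and the results reported there were additionally adjusted for background factors (which typically calls for a regression model, not shown here).

```python
def odds_ratio(group_success, group_total, ref_success, ref_total):
    """Unadjusted odds ratio of 'success' for a group relative to a reference group.

    Odds are successes divided by non-successes; the odds ratio divides the
    group's odds by the reference group's odds.
    """
    group_fail = group_total - group_success
    ref_fail = ref_total - ref_success
    if min(group_success, group_fail, ref_success, ref_fail) <= 0:
        raise ValueError("every cell of the 2x2 table must be positive")
    return (group_success / group_fail) / (ref_success / ref_fail)


# Hypothetical counts (NOT data from the study): 60 of 100 students in the
# higher-proficiency group entered university, against 30 of 100 students
# in the reference group.
ratio = odds_ratio(60, 100, 30, 100)
print(round(ratio, 2))
```

An odds ratio above 1 means the group had higher odds of the outcome than the reference group; a value of 1 means no difference.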
It is important that assessments build on conceptually strong assessment frameworks.
There are some tough methodological challenges that need to be addressed: Can we sufficiently distinguish the role of context from that of the underlying cognitive construct? Do new types of items that are enabled by computers and networks change the constructs that are being measured? Can we drink from the firehose of increasing data streams that arise from new assessment modes? Can we utilise new technologies and new ways of thinking of assessments to gain more information from the classroom without overwhelming the classroom with more assessments? What is the right mix of crowd wisdom and traditional validity information? How can we create assessments that are activators of students’ own learning?