Designing an Assessment System
Richard P. Phelps
International Research-to-Practice Conference
Nazarbayev Intellectual Schools AEO
Astana, Kazakhstan
October, 2016
© 2016, Richard P PHELPS International Research-to-Practice Conference, Astana, Kazakhstan, October, 2016 1
“If a thing exists, it
exists in some
amount. If it exists in
some amount, then it
is capable of being
measured.”
-- René Descartes,
Principles of
Philosophy, 1644
Image of Protein Molecules Forming Memories
Albert Einstein College of Medicine, New York, January 2014
Learning Curve
Forgetting Curve (1870s)
Ebbinghaus:
“Learning usually
requires rehearsal
or repetition”
Cognitive Load Theory
John Sweller, 1980s
Working Memory Capacity
George Miller, 1950s
Working Memory:
Ability to temporarily hold and
manipulate information for
cognitive tasks
Working Memory is challenged by:
new, unfamiliar information and
quantity of discrete bits of information
I am thinking of a type of object, what is it?
Description 1:
They are shapes, geometric plane figures,
polygons, quadrilaterals, and parallelograms
with opposite equal acute angles, opposite
equal obtuse angles, and four equal sides
I am thinking of a type of object, what is it?
Description 2:
Two centuries of research on learning concludes…
“…repeated retrieval during learning is the key to
long-term retention.”
— Henry L. “Roddy” Roediger
Cognitive Scientists’ 6 Strategies for Effective Learning
Retrieval Practice
Spaced Practice
Dual Coding
Interleaving
Concrete Examples
Elaboration
Retrieval Practice
Implications for Teachers 1
Most teachers should test more
frequently, …with smaller,
shorter, low-stakes tests
Understand that useful
assessment can be short and
simple.
Implications for Teachers 2
Does the test format
matter?
• multiple-choice?
• essay?
• short answer?
• oral?
• demonstration?
• …etc.?
Not so much.
Implications for Teachers 3
Tests provide
feedback to teachers
about what works
and what does not
Just as students can learn by testing each other,
teachers can help each other by reviewing each
other's tests.
Cognitive Psychology
experiments were
conducted with
“formative” tests in
schools and classrooms
What about systemwide, large-scale tests?
First priority:
do no harm to the
formative testing
programs in schools
and classrooms
The effect of testing on student learning
• 12-year study, read >3,000 documents
• analyzed close to 700 separate studies, and
more than 1,600 separate effects
• 2,000 other studies were reviewed and
found incomplete or inappropriate
• hundreds of other studies remain to be
reviewed
The effect of testing on student learning
245 Qualitative studies
813 Surveys or Polls
640 Quantitative Studies:
Experiments:
School- and classroom-level
Multivariate studies:
Large-scale testing programs
Meta-analysis
A method for
summarizing a large
research literature, with
a single, comparable
measure.
(0.5 effect size ≈ 1 grade level of learning)
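As a rough illustration of the "single, comparable measure" idea, the sketch below computes a standardized mean difference (Cohen's d with a pooled standard deviation) for one hypothetical study, pools several study effects with a simple sample-size weighting, and converts the result using the slide's rule of thumb of 0.5 effect size per grade level. The function names, the weighting scheme, and all numbers are assumptions made for illustration; none of them come from the studies cited in this talk.

```python
from math import sqrt


def cohens_d(mean_treat, mean_ctrl, sd_treat, sd_ctrl, n_treat, n_ctrl):
    """Standardized mean difference using the pooled standard deviation."""
    pooled_var = (
        (n_treat - 1) * sd_treat ** 2 + (n_ctrl - 1) * sd_ctrl ** 2
    ) / (n_treat + n_ctrl - 2)
    return (mean_treat - mean_ctrl) / sqrt(pooled_var)


def weighted_mean_effect(effects, sample_sizes):
    """Sample-size-weighted average of per-study effect sizes.

    This is one simple pooling rule chosen for illustration; real
    meta-analyses often use inverse-variance weights instead.
    """
    total = sum(sample_sizes)
    return sum(d * n for d, n in zip(effects, sample_sizes)) / total


def grade_levels(effect_size, per_grade=0.5):
    """Convert an effect size to grade levels (0.5 per grade, per the slide)."""
    return effect_size / per_grade


# Hypothetical study: tested group averages 75, untested group 70, both SD 10.
d = cohens_d(75, 70, 10, 10, 50, 50)
print("effect size:", d)
print("grade levels:", grade_levels(d))

# Hypothetical pooling of two studies with different sample sizes.
print("pooled effect:", weighted_mean_effect([0.2, 0.8], [100, 300]))
```

Because every study is expressed in standard-deviation units, results from tests with different score scales can be averaged and compared directly.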
Findings from Phelps (2012):
• Survey study effect sizes average >1.0
• Over 90% of qualitative studies positive
• For quantitative studies, univariate effect sizes positive and
stronger when:
– Testing more frequently
– Testing with feedback
– Testing with stakes
Findings from Phelps & Silva (2015)
For quantitative studies, effect sizes vary
between 0.55 and 0.88:
+++ testing more frequently
++ testing with stakes
+ testing with feedback
Effect of scale on testing benefits
• size of study population
– small +0.34 over large
• scale of test administration
– small-scale +0.14 over large-scale
• responsible level of government
– local tests +0.29 over state tests
Large-scale test, tight security
Large-scale test, lax security
Besides, systemwide tests are needed for
other purposes, such as…
…selection to programs with a limited number of places
…monitoring and system diagnosis
…workforce planning
…accountability
…credentialing
That’s enough!
Some large-scale test advantages
On a per-student basis, inexpensive
Cognitive laboratory pre-testing possible
Standardization offers comparisons across schools and regions.
May produce high-quality items that schools and teachers can use.
MOST IMPORTANT:
provides reliable, comparative information to all those not involved in
a particular school
The more systemwide decision points, the better?
Figure 1: Average TIMSS Score and Number of Quality Control Measures Used, by Country
[Figure. x-axis: Number of Quality Control Measures Used; y-axis: Average Percent Correct (grades 7 & 8); series: Top-Performing Countries, Bottom-Performing Countries]
SOURCE: Phelps, Benchmarking to the best in mathematics, Evaluation Review, 2001
Quality control has a proportionally greater effect in poorer countries
Figure 2: Average TIMSS Score and Number of Quality Control Measures Used (each adjusted for GDP/capita), by Country
[Figure. x-axis: Number of Quality Control Measures Used (per GDP/capita); y-axis: Average Percent Correct (grades 7 & 8) (per GDP/capita)]
SOURCE: Phelps, Benchmarking to the best in mathematics, Evaluation Review, 2001
IEA: TIMSS, PIRLS, CIVED, SITES, ICILS, PPP, ECES, TEDS
OECD PISA: PISA, PISA for schools, PISA for development
World Bank: READ, SABER
…provides funding for PISA
The effect of international testing programs
Freedom to design your testing
school tests
international tests
state and national tests
OECD and World Bank are run by economists
How well do economists understand PSYCH-ometrics?
Some interesting examples:
Chile’s national testing
program, funded by the
World Bank
OECD’s “Synergies for
Better Learning” project
Some interesting oddities:
World Bank educational
assessment chiefs are always
Irish nationals affiliated with
Boston College in the USA.
PISA is universally interpreted as
an achievement test, even by
the OECD. In reality, it has been
an unvalidated aptitude test.
Designing an Assessment System
richard {at} nonpartisaneducation {dot} org

 
Digital Artefact 1 - Tiny Home Environmental Design
Digital Artefact 1 - Tiny Home Environmental DesignDigital Artefact 1 - Tiny Home Environmental Design
Digital Artefact 1 - Tiny Home Environmental Design
 
Pollock and Snow "DEIA in the Scholarly Landscape, Session One: Setting Expec...
Pollock and Snow "DEIA in the Scholarly Landscape, Session One: Setting Expec...Pollock and Snow "DEIA in the Scholarly Landscape, Session One: Setting Expec...
Pollock and Snow "DEIA in the Scholarly Landscape, Session One: Setting Expec...
 
Azure Interview Questions and Answers PDF By ScholarHat
Azure Interview Questions and Answers PDF By ScholarHatAzure Interview Questions and Answers PDF By ScholarHat
Azure Interview Questions and Answers PDF By ScholarHat
 
Digital Artifact 1 - 10VCD Environments Unit
Digital Artifact 1 - 10VCD Environments UnitDigital Artifact 1 - 10VCD Environments Unit
Digital Artifact 1 - 10VCD Environments Unit
 
The basics of sentences session 5pptx.pptx
The basics of sentences session 5pptx.pptxThe basics of sentences session 5pptx.pptx
The basics of sentences session 5pptx.pptx
 
How to Manage Your Lost Opportunities in Odoo 17 CRM
How to Manage Your Lost Opportunities in Odoo 17 CRMHow to Manage Your Lost Opportunities in Odoo 17 CRM
How to Manage Your Lost Opportunities in Odoo 17 CRM
 
The simplified electron and muon model, Oscillating Spacetime: The Foundation...
The simplified electron and muon model, Oscillating Spacetime: The Foundation...The simplified electron and muon model, Oscillating Spacetime: The Foundation...
The simplified electron and muon model, Oscillating Spacetime: The Foundation...
 
Lapbook sobre os Regimes Totalitários.pdf
Lapbook sobre os Regimes Totalitários.pdfLapbook sobre os Regimes Totalitários.pdf
Lapbook sobre os Regimes Totalitários.pdf
 
Assessment and Planning in Educational technology.pptx
Assessment and Planning in Educational technology.pptxAssessment and Planning in Educational technology.pptx
Assessment and Planning in Educational technology.pptx
 
RPMS TEMPLATE FOR SCHOOL YEAR 2023-2024 FOR TEACHER 1 TO TEACHER 3
RPMS TEMPLATE FOR SCHOOL YEAR 2023-2024 FOR TEACHER 1 TO TEACHER 3RPMS TEMPLATE FOR SCHOOL YEAR 2023-2024 FOR TEACHER 1 TO TEACHER 3
RPMS TEMPLATE FOR SCHOOL YEAR 2023-2024 FOR TEACHER 1 TO TEACHER 3
 
S1-Introduction-Biopesticides in ICM.pptx
S1-Introduction-Biopesticides in ICM.pptxS1-Introduction-Biopesticides in ICM.pptx
S1-Introduction-Biopesticides in ICM.pptx
 
The History of Stoke Newington Street Names
The History of Stoke Newington Street NamesThe History of Stoke Newington Street Names
The History of Stoke Newington Street Names
 
Main Java[All of the Base Concepts}.docx
Main Java[All of the Base Concepts}.docxMain Java[All of the Base Concepts}.docx
Main Java[All of the Base Concepts}.docx
 
Hindi varnamala | hindi alphabet PPT.pdf
Hindi varnamala | hindi alphabet PPT.pdfHindi varnamala | hindi alphabet PPT.pdf
Hindi varnamala | hindi alphabet PPT.pdf
 
writing about opinions about Australia the movie
writing about opinions about Australia the moviewriting about opinions about Australia the movie
writing about opinions about Australia the movie
 
The basics of sentences session 6pptx.pptx
The basics of sentences session 6pptx.pptxThe basics of sentences session 6pptx.pptx
The basics of sentences session 6pptx.pptx
 
Film vocab for eal 3 students: Australia the movie
Film vocab for eal 3 students: Australia the movieFilm vocab for eal 3 students: Australia the movie
Film vocab for eal 3 students: Australia the movie
 
South African Journal of Science: Writing with integrity workshop (2024)
South African Journal of Science: Writing with integrity workshop (2024)South African Journal of Science: Writing with integrity workshop (2024)
South African Journal of Science: Writing with integrity workshop (2024)
 

Designing an Assessment System

  • 1. Designing an Assessment System Richard P. Phelps International Research-to-Practice Conference Nazarbayev Intellectual Schools AEO Astana, Kazakhstan October, 2016 © 2016, Richard P PHELPS International Research-to-Practice Conference, Astana, Kazakhstan, October, 2016 1
  • 2. “If a thing exists, it exists in some amount. If it exists in some amount, then it is capable of being measured.” −−René Descartes, Principles of Philosophy, 1664
  • 3. Image of Protein Molecules Forming Memories. Albert Einstein College of Medicine, New York, January 2014
  • 4. Image of Protein Molecules Forming Memories. Albert Einstein College of Medicine, New York, January 2014
  • 5. Learning Curve
  • 6. Forgetting Curve (1870s)
  • 7.
  • 8. Ebbinghaus: “Learning usually requires rehearsal or repetition”
  • 9. Cognitive Load Theory: John Sweller, 1980s. Working Memory Capacity: George Miller, 1950s
  • 10. Working Memory: Ability to temporarily hold and manipulate information for cognitive tasks. Working Memory is challenged by: new, unfamiliar information and quantity of discrete bits of information
  • 11. Description 1: I am thinking of a type of object, what is it? They are shapes, geometric plane figures, polygons, quadrilaterals, and parallelograms with opposite equal acute angles, opposite equal obtuse angles, and four equal sides
  • 12. Description 2: I am thinking of a type of object, what is it?
  • 13.
  • 14. Two centuries of research on learning concludes… “…repeated retrieval during learning is the key to long-term retention.” −−Henry L. “Roddy” Roediger
  • 15. Cognitive Scientists’ 6 Strategies for Effective Learning: Retrieval Practice, Spaced Practice, Dual Coding, Interleaving, Concrete Examples, Elaboration
  • 16. Retrieval Practice
  • 17.
  • 18.
  • 19. Implications for Teachers 1: Most teachers should test more frequently, with smaller, shorter, low-stakes tests. Understand that useful assessment can be short and simple.
  • 20. Implications for Teachers 2: Does the test format matter? • multiple-choice? • essay? • short answer? • oral? • demonstration? • …etc.? Not so much.
  • 21. Implications for Teachers 3: Tests provide feedback to teachers about what works and what does not. Just as students can learn by testing each other, teachers can help each other by reviewing each other’s tests.
  • 22. Cognitive psychology experiments were conducted with “formative” tests in schools and classrooms
  • 23. What about systemwide, large-scale tests? First priority: do no harm to the formative testing programs in schools and classrooms
  • 24. The effect of testing on student learning • 12-year study, read >3,000 documents • analyzed close to 700 separate studies, and more than 1,600 separate effects • 2,000 other studies were reviewed and found incomplete or inappropriate • hundreds of other studies remain to be reviewed
  • 25. The effect of testing on student learning • 245 qualitative studies • 813 surveys or polls • 640 quantitative studies (experiments: school- and classroom-level; multivariate studies: large-scale testing programs)
  • 26. Meta-analysis: A method for summarizing a large research literature with a single, comparable measure. (0.5 effect size ≈ 1 grade level of learning)
  • 27. Findings from Phelps (2012): • Survey study effect sizes average >1.0 • Over 90% of qualitative studies positive • For quantitative studies, univariate effect sizes positive and stronger when: – Testing more frequently – Testing with feedback – Testing with stakes
  • 28. Findings from Phelps & Silva (2015): For quantitative studies, effect sizes vary between 0.55 and 0.88: +++ testing more frequently ++ testing with stakes + testing with feedback
  • 29. Effect of scale on testing benefits • size of study population: small +0.34 over large • scale of test administration: small-scale +0.14 over large-scale • responsible level of government: local tests +0.29 over state tests
  • 30. Large-scale test, tight security
  • 31. Large-scale test, lax security
  • 32. Besides, systemwide tests are needed for other purposes, such as… …selection to programs with a limited number of places …monitoring and system diagnosis …workforce planning …accountability …credentialing. That’s enough!
  • 33. Some large-scale test advantages: On a per-student basis, inexpensive. Cognitive laboratory pre-testing possible. Standardization offers comparisons across schools and regions. May produce high-quality items that schools and teachers can use. MOST IMPORTANT: provides reliable, comparative information to all those not involved in a particular school.
  • 34. The more systemwide decision points, the better? Figure 1: Average TIMSS Score and Number of Quality Control Measures Used, by Country. [Figure: x-axis, Number of Quality Control Measures Used; y-axis, Average Percent Correct (grades 7 & 8); series, Top-Performing Countries and Bottom-Performing Countries.] SOURCE: Phelps, Benchmarking to the best in mathematics, Evaluation Review, 2001
  • 35. Quality control has proportionally greater effect in poorer countries. Figure 2: Average TIMSS Score and Number of Quality Control Measures Used (each adjusted for GDP/capita), by Country. [Figure: x-axis, Number of Quality Control Measures Used (per GDP/capita); y-axis, Average Percent Correct (grades 7 & 8) (per GDP/capita).] SOURCE: Phelps, Benchmarking to the best in mathematics, Evaluation Review, 2001
  • 36. IEA: TIMSS, PIRLS, CIVED, SITES, ICILS, PPP, ECES, TEDS. OECD PISA: PISA, PISA for schools, PISA for development. World Bank: READ, SABER …provides funding for PISA
  • 37. The effect of international testing programs. [Figure: axis label, Freedom to design your testing; labels, school tests, international tests, state and national tests.]
  • 38. OECD and World Bank are run by economists. How well do economists understand PSYCH-ometrics? Some interesting examples: Chile’s national testing program, funded by the World Bank; OECD’s “Synergies for Better Learning” project
  • 39. Some interesting oddities: World Bank educational assessment chiefs are always Irish nationals affiliated with Boston College in the USA. PISA is universally interpreted as an achievement test, even by the OECD. In reality, it has been an unvalidated aptitude test.
  • 40. Designing an Assessment System richard {at} nonpartisaneducation {dot} org
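
The meta-analysis slide summarizes each study with a single effect size and gives the rule of thumb that 0.5 effect size ≈ 1 grade level of learning. The slides do not say which effect-size statistic was used, so the sketch below assumes a standardized mean difference (Cohen's d with a pooled standard deviation). The function names and the sample scores are hypothetical and only illustrate the arithmetic.

```python
from math import sqrt
from statistics import mean, stdev


def cohens_d(treatment, control):
    """Standardized mean difference between two groups of scores.

    Assumption: Cohen's d with a pooled standard deviation; the slides
    do not name the statistic used in the underlying studies.
    """
    n1, n2 = len(treatment), len(control)
    s1, s2 = stdev(treatment), stdev(control)
    # Pooled standard deviation, weighting each group by its degrees of freedom.
    pooled_sd = sqrt(((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / (n1 + n2 - 2))
    return (mean(treatment) - mean(control)) / pooled_sd


def grade_levels(effect_size, per_grade=0.5):
    """Convert an effect size to grade levels using the slide's rule of thumb."""
    return effect_size / per_grade


# Hypothetical scores: a frequently tested group versus a comparison group.
tested = [2, 4, 6]
comparison = [1, 3, 5]

d = cohens_d(tested, comparison)
print(f"effect size: {d:.2f}, grade levels: {grade_levels(d):.2f}")
```

Under the same rule of thumb, the 0.55 to 0.88 range reported on the Phelps & Silva (2015) slide would correspond to roughly 1.1 to 1.8 grade levels.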