The effect of testing on student achievement: 1910-2010
Richard P Phelps
Presentation at the 2012 meeting of the International Test Commission, Amsterdam
The effect of testing on student achievement: 1910-2010
1.
The effect of testing on student achievement: 1910-2010
Richard P. PHELPS
© 2012, Richard P. PHELPS. International Test Commission, 8th Conference, Amsterdam
2.
Meta-analysis
• A method for summarizing a large research literature with a single, comparable measure.
3.
The effect of testing on student achievement
• 12-year-long study
• analyzed close to 700 separate studies, and more than 1,600 separate effects
• 2,000 other studies were reviewed and found incomplete or inappropriate
• lacking sufficient time and money, hundreds of other studies will not be reviewed
4.
Looking for studies to include in the meta-analyses
1. Included only those studies that found an effect from testing on student achievement or on teacher instruction…
5.
Studies included in the meta-analyses
2. …when:
• a test is newly introduced, or newly removed
• quantity of testing is increased or reduced
• test stakes are introduced or increased, or removed or reduced
6.
Studies included in the meta-analyses
3. …plus previous research summaries (e.g.)
• Kulik, Kulik, Bangert-Drowns, & Schwalb (1983-1991) on:
  – mastery testing,
  – frequency of testing, and
  – programs for high-risk university students
• Basol & Johanson (2009) on testing frequency
• Jaekyung Lee (2007) on cross-state studies
• W.J. Haynie (2007) in career-tech ed
7.
Number of studies of effects, by methodology type

| Methodology type | Number of studies | Number of effects |
| --- | --- | --- |
| Quantitative | 177 | 640 |
| Surveys and public opinion polls (US & Canada) | 247 | 813 |
| Qualitative | 245 | 245 |
| TOTAL | 669 | 1698 |
8.
Effect size: Cohen’s d
$d = (Y_E - Y_C) / S_{\text{pooled}}$
$Y_E$ = mean, experimental group
$Y_C$ = mean, control group
$S_{\text{pooled}}$ = standard deviation
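As a rough illustration of the Cohen's d formula, the following Python sketch computes d from group summary statistics. The slide only names the pooled term as a standard deviation; the pooled-variance formula in `pooled_sd`, the function names, and the sample numbers are assumptions added for illustration, not values from the study.

```python
import math


def pooled_sd(sd_e, n_e, sd_c, n_c):
    """Pooled standard deviation of two independent groups.

    Assumption: the usual degrees-of-freedom-weighted pooling of variances.
    """
    pooled_var = ((n_e - 1) * sd_e**2 + (n_c - 1) * sd_c**2) / (n_e + n_c - 2)
    return math.sqrt(pooled_var)


def cohens_d(mean_e, mean_c, s_pooled):
    """Difference between group means in pooled standard deviation units."""
    return (mean_e - mean_c) / s_pooled


# Hypothetical classroom data: experimental vs. control group test scores.
s = pooled_sd(sd_e=10.0, n_e=30, sd_c=10.0, n_c=30)
d = cohens_d(mean_e=75.0, mean_c=70.0, s_pooled=s)
print(round(d, 2))
```

With equal standard deviations in both groups, the pooled value equals that common standard deviation, so a 5-point difference on a 10-point spread gives d = 0.5.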
9.
Effect size: Other formulae
$d = t \sqrt{(n_1 + n_2) / (n_1 n_2)}$
$d = 2r / \sqrt{1 - r^2}$
$d = [(Y_{E,\text{post}} - Y_{E,\text{pre}}) - (Y_{C,\text{post}} - Y_{C,\text{pre}})] / S_{\text{pooled, post}}$
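When a study does not report group means directly, d can be recovered from other statistics. The sketch below implements three conversions: from a t statistic, from a correlation r, and from pre/post gains. The function names and sample values are hypothetical. The gain-score version takes post minus pre for each group, an assumed sign convention chosen so that a positive d favors the experimental group, as in Cohen's d.

```python
import math


def d_from_t(t, n1, n2):
    """Convert an independent-samples t statistic to d."""
    return t * math.sqrt((n1 + n2) / (n1 * n2))


def d_from_r(r):
    """Convert a correlation coefficient r to d."""
    return 2 * r / math.sqrt(1 - r**2)


def d_from_gains(e_pre, e_post, c_pre, c_post, s_pooled_post):
    """Difference in pre-to-post gains, in pooled posttest SD units."""
    return ((e_post - e_pre) - (c_post - c_pre)) / s_pooled_post


# Hypothetical inputs, chosen so the arithmetic is easy to follow.
print(round(d_from_t(2.0, 50, 50), 2))          # 2.0 * sqrt(100 / 2500)
print(round(d_from_r(0.6), 2))                  # 1.2 / sqrt(0.64)
print(round(d_from_gains(60, 72, 60, 66, 12), 2))  # (12 - 6) / 12
```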
10.
Effect size: Interpretation
• d between 0.25 and 0.50: weak effect
• d between 0.50 and 0.75: medium effect
• d more than 0.75: strong effect
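The interpretation bands can be expressed as a small lookup. This is only a sketch: the slide's ranges share their endpoints, so the handling of exactly 0.50 and 0.75 here is an assumption, and the function name is hypothetical. Values below 0.25 get no label because the slide gives none.

```python
def interpret_effect_size(d):
    """Map an effect size d to the slide's verbal label, or None if unlabeled."""
    if d > 0.75:
        return "strong"
    if d >= 0.50:   # assumption: 0.50 counts as medium, 0.75 stays medium
        return "medium"
    if d >= 0.25:
        return "weak"
    return None     # the slide assigns no label below 0.25


for value in (0.10, 0.30, 0.55, 0.88):
    print(value, interpret_effect_size(value))
```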
11.
Quantitative studies
(population coverage ≈ 7 million persons)
12.
Quantitative studies: Effect size
• “Bare bones” calculation: d ≈ +0.55 …a medium effect
• Bare bones effect size adjusted for measurement error: d ≈ +0.71 …a stronger effect
• Using same-study-author aggregation: d ≈ +0.88 …a strong effect
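The slides do not spell out the calculations behind these figures. In the meta-analysis literature, a "bare bones" estimate commonly means a sample-size-weighted mean of effect sizes, and an adjustment for measurement error commonly means dividing by the square root of the outcome measure's reliability. The sketch below shows both steps under those assumptions; the study list, the reliability value, and the function names are hypothetical and are not taken from the presentation.

```python
import math


def weighted_mean_d(effects):
    """Sample-size-weighted mean of (d, n) pairs."""
    total_n = sum(n for _, n in effects)
    return sum(d * n for d, n in effects) / total_n


def correct_for_attenuation(d, reliability):
    """Adjust d upward for unreliability in the outcome measure.

    Assumption: a single reliability coefficient applies to the mean effect.
    """
    return d / math.sqrt(reliability)


# Hypothetical studies: (effect size, sample size).
studies = [(0.40, 100), (0.60, 300), (0.80, 100)]
d_bar = weighted_mean_d(studies)
d_adj = correct_for_attenuation(d_bar, reliability=0.81)
print(round(d_bar, 2), round(d_adj, 2))
```

Because reliability is below 1, the adjusted value is always larger in magnitude than the unadjusted mean, which matches the direction of the change reported on the slide.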
13.
Which predictors matter?

| Treatment group… | Mean effect size |
| --- | --- |
| …is made aware of performance, and control group is not | +0.98 |
| …receives targeted instruction (e.g., remediation) | +0.96 |
| …is tested with higher stakes than control group | +0.87 |
| …is tested more frequently than control group | +0.85 |
14.
More Moderators – Source of Test

| Source of test | Number of studies | Mean effect size |
| --- | --- | --- |
| Researcher or Teacher | 87 | 0.93 |
| National | 24 | 0.87 |
| Commercial | 38 | 0.82 |
| State or District | 11 | 0.72 |
| Total | 160 | |
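One way to read a moderator table like this is to combine the subgroup means into an overall mean, weighting each subgroup by its number of studies. The sketch below does that for the source-of-test figures. Weighting by study count is an assumption made for illustration, not a description of how the presentation aggregated its results; the variable and function names are hypothetical.

```python
# (number of studies, mean effect size) per source of test, from the slide.
source_of_test = {
    "Researcher or Teacher": (87, 0.93),
    "National": (24, 0.87),
    "Commercial": (38, 0.82),
    "State or District": (11, 0.72),
}


def studies_weighted_mean(groups):
    """Mean effect size across subgroups, weighted by number of studies."""
    total_studies = sum(k for k, _ in groups.values())
    weighted_sum = sum(k * d for k, d in groups.values())
    return weighted_sum / total_studies


total = sum(k for k, _ in source_of_test.values())
overall = studies_weighted_mean(source_of_test)
print(total, round(overall, 2))
```

The study counts sum to the table's total of 160, and the weighted mean comes out near 0.88, close to the same-study-author aggregate reported for the quantitative studies, although that figure may have been derived differently.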
15.
More Moderators – Sponsor of Test

| Sponsor of test | Number of studies | Mean effect size |
| --- | --- | --- |
| International | 5 | 1.02 |
| Local | 99 | 0.93 |
| National | 45 | 0.81 |
| State | 11 | 0.64 |
| Total | 160 | |
16.
More Moderators – Study Design

| Study design | Number of studies | Mean effect size |
| --- | --- | --- |
| Pre-post | 12 | 0.97 |
| Experiment, Quasi-experiment | 107 | 0.94 |
| Multivariate | 26 | 0.80 |
| Experiment, posttest only | 7 | 0.60 |
| Pre-post (with shadow test) | 8 | 0.58 |
| Total | 160 | |
17.
More Moderators – Scale of Analysis

| Scale of analysis | Number of studies | Mean effect size |
| --- | --- | --- |
| Aggregated | 9 | 1.60 |
| Small-scale | 118 | 0.91 |
| Large-scale | 33 | 0.57 |
| Total | 160 | |
18.
More Moderators – Scale of Administration

| Scale of administration | Number of studies | Mean effect size |
| --- | --- | --- |
| Classroom | 115 | 0.95 |
| Mid-scale | 6 | 0.72 |
| Large-scale | 39 | 0.71 |
| Total | 160 | |
19.
Surveys and opinion polls
20.
Percentage of survey items, by respondent group and type of survey
[Bar chart: percent of survey items (vertical axis, 0 to 50) for education providers and education consumers, by type of survey: public opinion polls and program evaluation surveys*]
21.
Number and percent of survey items, by test stakes and target group

| Test stakes | Number | % |
| --- | --- | --- |
| High | 507 | 62 |
| Medium | 184 | 23 |
| Low | 33 | 4 |
| Unknown | 89 | 11 |
| TOTAL | 813 | |

| Target group | Number | % |
| --- | --- | --- |
| Students | 393 | 46 |
| Schools | 281 | 33 |
| Teachers | 116 | 14 |
| No stakes | 64 | 7 |
| TOTAL | 854 | |
22.
Opinion polls, by year
• 244 between 1958 and 2008, in the U.S. & Canada
• 813 unique question-response combinations
• close to 700,000 individual respondents
[Chart by year, 1960 to 2005; vertical axis 0 to 120]
23.
Surveys and opinion polls: Regular standardized tests, performance tests

| Respondent opinion | Regular tests (N ≈ 125), d | Performance tests (N ≈ 50), d |
| --- | --- | --- |
| Achievement is increased | 1.2 | 1.0 |
| …weighted by size of study population | 1.9 | 0.5 |
| Instruction is improved | 1.0 | 1.4 |
| …weighted by size of study population | 0.9 | 0.9 |
| Tests help align instruction | 1.0 | 1.0 |
| …weighted by size of study population | 0.5 | 0.9 |
24.
Qualitative studies: Summary
(One cannot calculate an effect size.)
25.
Qualitative studies, by methodology type

| Methodology | Number of studies | % |
| --- | --- | --- |
| Case study | 120 | 43 |
| Experiment or pre-post study | 21 | 7 |
| Interviews (individual or group) | 75 | 27 |
| Journal | 2 | 1 |
| Review of official records, documents, reports | 33 | 12 |
| Research review | 8 | 3 |
| Survey | 22 | 8 |
| TOTAL | 281 | 100 |
26.
Qualitative studies: Effect on student achievement
244 studies conducted in the past century in over 30 countries

| Direction of effect | Number of studies | Percent of studies | Percent without the inferred |
| --- | --- | --- | --- |
| Positive | 204 | 84 | 93 |
| Positive inferred | 24 | 10 | |
| Mixed | 5 | 2 | 2 |
| No change | 8 | 3 | 4 |
| Negative | 3 | 1 | 1 |
| TOTAL | 244 | 100 | 100 |
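The two percentage columns differ only in their denominator: the first divides by all 244 studies, the second drops the 24 "positive inferred" studies and divides by the remaining 220. The sketch below recomputes both from the study counts on the slide; the function and variable names are hypothetical.

```python
# Study counts by direction of effect, as given on the slide.
counts = {
    "Positive": 204,
    "Positive inferred": 24,
    "Mixed": 5,
    "No change": 8,
    "Negative": 3,
}


def percent_shares(counts, exclude=()):
    """Whole-number percentage of each category, optionally excluding some."""
    kept = {name: n for name, n in counts.items() if name not in exclude}
    total = sum(kept.values())
    return {name: round(100 * n / total) for name, n in kept.items()}


all_studies = percent_shares(counts)
without_inferred = percent_shares(counts, exclude=("Positive inferred",))
print(all_studies)
print(without_inferred)
```

Both recomputed columns agree with the rounded percentages shown in the table.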
27.
Qualitative studies: Testing improves student achievement and teacher instruction

| Achievement is improved | Number of studies | % |
| --- | --- | --- |
| Yes | 200 | 95 |
| Mixed results | 1 | <1 |
| No | 10 | 5 |
| TOTAL | 211 | 100 |

| Instruction is improved | Number of studies | % |
| --- | --- | --- |
| Yes | 158 | 96 |
| No | 7 | 4 |
| TOTAL | 165 | 100 |
28.
Qualitative studies: Variation by rigor and test stakes

Level of rigor

| Direction of effect | high | medium | low | Total |
| --- | --- | --- | --- | --- |
| Positive | 95 | 67 | 42 | 204 |
| Positive inferred | 10 | 8 | 6 | 24 |
| Mixed | 3 | 1 | 1 | 5 |
| No change | 4 | 3 | 1 | 8 |
| Negative | 1 | 1 | 1 | 3 |
| TOTAL | 113 | 80 | 51 | 244 |

Stakes

| Direction of effect | high | medium | low | unknown | Total |
| --- | --- | --- | --- | --- | --- |
| Positive | 133 | 27 | 38 | 6 | 204 |
| Positive inferred | 12 | 5 | 7 | | 24 |
| Mixed | 4 | | 1 | | 5 |
| No change | 2 | 1 | 5 | | 8 |
| Negative | 3 | | | | 3 |
| TOTAL | 154 | 33 | 51 | 6 | 244 |
29.
Qualitative studies: Regular standardized tests and performance tests

| Study results | Regular tests (N = 176), % | Performance tests (N = 69), % |
| --- | --- | --- |
| Generally positive | 93 | 95 |
| High-stakes tests | 71 | 42 |
| High level of study rigor | 46 | 48 |
| Student attitudes toward test positive | 60 | 71 |
| Teacher attitudes toward test positive | 55 | 80 |
| Student achievement improved | 95 | 95 |
| Instruction improved | 92 | 100 |
| Large-scale testing | 86 | 68 |
30.
An enormous research literature
• But, assertions that it does not exist at all are common
  – Some claims are made by those who oppose standardized testing, and may be wishful thinking
  – Others are “firstness” claims
31.
Dismissive research reviews
• With a dismissive research literature review, a researcher assures all that no other researcher has studied the same topic
32.
Firstness claims
• With a firstness claim, a researcher insists that he or she is the first to ever study a topic
33.
Social costs are enormous
• Research conducted by those without power or celebrity is dismissed, ignored, and lost
• Public policies are skewed, based exclusively on the research results of those with power or celebrity
• Society pays again and again for research that has already been done
34.
The effect of testing on student achievement: 1910-2010
Richard P. PHELPS