The Source of Lake Wobegon
           By Richard P. Phelps

      (c)2007-2012, Richard P. Phelps
“Welcome to Lake Wobegon, where all the women
  are strong, all the men are good-looking, and all
          the children are above average.”
           - Garrison Keillor, A Prairie Home Companion
John J. Cannell, M.D.


• Residency in rural West Virginia, 1980s
• Surprised by claims that state and school district
  scored “above average” on national tests
• Investigated, found that all 50 states claimed to
  be “above average”
Cannell’s suspects


• Outdated or invalid norms
• Lax security
• Deliberate educator manipulation
  – Showing test items to teachers beforehand
  – Keeping test forms around for years
  – Misleading reporting, etc.
CRESST’s suspects


• Outdated or invalid norms
• High stakes that induce “teaching to the test”
  (i.e., test coaching)

  (This hypothesis now generally accepted as accurate
           among K-12 education researchers)
• “We know that tests that are
  used for accountability tend to
  be taught to in ways that
  produce inflated scores.”
      - Dan Koretz, CRESST, 1992


• “Corruption of indicators is a
  continuing problem where tests
  are used for accountability or
  other high-stakes purposes.”
      - Robert Linn, CRESST, 2000
Explanations for Spuriously High Achievement Scores
From Responses to Cannell in Educational Measurement:
               Issues and Practice (1988)

                    Authors: A   B   C   D   E   F

Inadequate norms            X    X   X           X

Outdated norms              X    X   X   X   X

Curriculum alignment        X    X   X

High stakes pressure        X    X

Teaching the test           X    X           X

Incomplete population tested X   X   X

Inappropriate comparisons        X   X
More left-out-variable bias

• Linn (2000) cites higher gains on Title I pre-post testing
  over 9 months than over 12 as evidence of inflation
   – Does not consider 3 months of forgetting


• CRESST study (1991) in one school district also cited as
  evidence of inflation
   – Does not consider curricular misalignment, motivation, test
     security, variation in stakes
Examining the high-stakes-cause-score-inflation hypothesis

• “Strong” version of hypothesis:
  – There are no rival hypotheses

• “Weak” version of hypothesis:
  – More inflation in grades closer to stakes
  – Test coaching increases scores
  – Correlation between stakes and inflation
Defining “test-score inflation”



         State percentile difference between:

             Cannell’s NRTs (late ‘80s)
                        &
              Math NAEP (’90 or ’92)
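
The definition above reduces to one subtraction per state. A minimal sketch, assuming each state's standing on both tests is already expressed as a percentile and that inflation is taken as NRT minus NAEP, so that positive values mean the NRT result looks better. The function name and the example values are hypothetical, not figures from Cannell or NAEP.

```python
def score_inflation(nrt_percentile: float, naep_percentile: float) -> float:
    """Percentile-point gap between a state's NRT standing and its NAEP standing.

    Assumed sign convention: positive means the NRT standing exceeds the NAEP one.
    """
    return nrt_percentile - naep_percentile


# Hypothetical state: 62nd percentile on its NRT, 50th percentile on math NAEP.
print(score_inflation(62.0, 50.0))
```

Under this convention a state can also show negative "inflation" when its NAEP standing is the higher of the two.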
Testing the strong hypothesis 1


  State rotated items?              yes     no
  Average “score inflation”         9.3    10.0


Level of test security             lax    med   tight
Average “score inflation”         10.6    9.7    8.9
Testing the strong hypothesis 2


Moreover…
Cannell found score inflation in elementary school
 tests in dozens of states – none of those tests
 had high stakes.
Cannell also found score inflation in secondary
 school tests in dozens of states – only one had
 high stakes.
Test Security in South Carolina: score-inflated test

Cannell, 1989, p.89:
“Unlike their other two tests, teachers are allowed to look at
  test booklets, teachers may obtain test booklets before
  the day of testing, booklets are not sealed, and testing is
  not routinely monitored by state officials. Outside test
  proctors are not used, test questions have not been
  rotated every year, and answer sheets have not been
  scanned for suspicious erasures or analyzed for cluster
  variance. There are no state regulations that govern test
  security and test administration for norm-referenced
  testing done independently in the local school districts.”
Test Security in South Carolina: two high-stakes tests
Cannell, 1989, p.89:
“South Carolina also administers a graduation exam and a
  criterion referenced test, both of which have significant
  security measures. Teachers are not allowed to look at
  either of these two test booklets, teachers may not
  obtain booklets before the day of testing, the graduation
  test booklets are sealed, testing is routinely monitored by
  state officials, special education students are generally
  included in all tests used in South Carolina unless their
  IEP recommends against testing, outside test proctors
  administer the graduation exam, and most test questions
  are rotated every year on the criterion referenced test.”
Tomāto                           Tomăto



  Is the high-stakes-cause-test-score-inflation
  hypothesis caused by semantic distortion?

“Tests are ‘high-stakes’ when:
  teachers feel judged by the results?”
  parents receive reports of their child’s test scores?”
  test scores are widely reported in the newspapers?”
Standards for Educational and Psychological Testing:

“High-stakes test. A test used to provide results that have
  important, direct consequences for examinees,
  programs, or institutions involved in the testing.” (p.176)

“Low-stakes test. A test used to provide results that have
  only minor or indirect consequences for examinees,
  programs, or institutions involved in the testing.” (p.178)
Shortcomings of Cannell’s studies


• Responses to his survey of state test security
  practices do not always specify which practices
  apply to which tests in states that administered more
  than one

• He calculated score trends for NRTs and, with one
  exception, not for standards-based tests
Testing the weak hypothesis 1


Q. Do grade levels closer to high-stakes event
    (e.g., high school graduation exam) show
    greater score increases?

  Yes, in “washback” studies of: John Bishop (1997),
     Linda Winfield (1990), Norm Frederiksen (1994)

  No, in Cannell’s data
Q. Why disparate results?
A. Low-stakes comparison tests differed


Washback studies used untraceable,
sample-based tests, administered
with tight security (TIMSS, NAEP)




                     Cannell used traceable NRTs
                     administered with lax security
Testing the weak hypothesis 2



Q. Is there direct evidence that test coaching raises test
     scores?

A. No, see Powers (1993), Becker (1990), Powers & Rock
    (1994), Camara (2001), etc.
Testing the weak hypothesis 3


Perhaps low-stakes tests
are subject to score
inflation where a jurisdiction
administers a separate
high-stakes test, thereby
creating a general
environment of high-stakes
pressure?
Q. High-stakes, score inflation related?
A. Maybe negatively.

                     Coef    S.E.   t    p
Intercept            45.70 10.20 4.48 0.0004
NAEP %-ile score     -0.55 0.15 -3.72 0.0020
Item rotation?        0.57   2.94 0.19 0.8501
Level of security?    0.85   1.66 0.52 0.6141
High-stakes?         -6.47 3.51 -1.84 0.0853
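
As a quick consistency check on the table above, each t value should be roughly the coefficient divided by its standard error. The sketch below copies the reported figures and recomputes that ratio. The helper name `t_ratio` is illustrative, and p-values are not recomputed because the slide does not give the degrees of freedom.

```python
# Rows copied from the table above: (term, coefficient, standard error, reported t).
rows = [
    ("Intercept", 45.70, 10.20, 4.48),
    ("NAEP %-ile score", -0.55, 0.15, -3.72),
    ("Item rotation?", 0.57, 2.94, 0.19),
    ("Level of security?", 0.85, 1.66, 0.52),
    ("High-stakes?", -6.47, 3.51, -1.84),
]


def t_ratio(coef: float, se: float) -> float:
    """t statistic for a single coefficient: estimate divided by standard error."""
    return coef / se


for term, coef, se, reported in rows:
    recomputed = t_ratio(coef, se)
    print(f"{term:<20} recomputed t = {recomputed:6.2f}   reported t = {reported:6.2f}")
```

Small gaps between the recomputed and reported values are expected, since the coefficients and standard errors on the slide are rounded to two decimals.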
[Figure: scatter plot of the amount of "inflation" (in percentile points) against average NAEP percentile score. Pink squares: states with a high-stakes test. Blue diamonds: states without any high-stakes test.]
Two types of tests resist score inflation:


1. Those untraceable to individual jurisdictions or schools
   (no incentive to cheat)
2. Those with tight security and ample item rotation (no
   opportunity to cheat)

       Traceable tests lacking security and item
       rotation are candidates for score inflation
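
The two conditions above can be read as a simple decision rule. A minimal sketch, assuming each test is described by three yes/no attributes. The function and parameter names are hypothetical, and treating a traceable test with only one of the two safeguards as a candidate for inflation is an interpretation, not something the slide states.

```python
def resists_inflation(traceable: bool, tight_security: bool, item_rotation: bool) -> bool:
    """Return True if a test falls into one of the two inflation-resistant types."""
    if not traceable:
        # Type 1: results cannot be tied to a jurisdiction or school (no incentive).
        return True
    # Type 2: traceable, but protected by both safeguards (no opportunity).
    # Assumption: a test with only one safeguard does not qualify.
    return tight_security and item_rotation


# A sample-based, untraceable test versus a traceable test with lax administration.
print(resists_inflation(traceable=False, tight_security=True, item_rotation=True))
print(resists_inflation(traceable=True, tight_security=False, item_rotation=False))
```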
Artificial test score gains (score inflation) are
     caused by neglect, incompetence, or
deliberate educator manipulation, but always
        require means and opportunity.

                     • Motive is only present with
                       traceable tests.

                     • Means and opportunity exist
                       only in the absence of
                       security measures and item
                       rotation.
http://www.nonpartisaneducation.org/Review/Articles/v1n2.htm

Source of Lake Wobegon

  • 1. The Source of Lake Wobegon By Richard P. Phelps (c)2007-2012, Richard P. Phelps
  • 2. “Welcome to Lake Wobegon, where all the women are strong, all the men are good-looking, and all the children are above average.” - Garrison Keillor, A Prairie Home Companion
  • 3. John J. Cannell, M.D. • Residency in rural West Virginia, 1980s • Surprised by claims that state and school district scored “above average” on national tests • Investigated, found that all 50 states claimed to be “above average”
  • 4. Cannell’s suspects • Outdated or invalid norms • Lax security • Deliberate educator manipulation – Showing test items to teachers beforehand – Keeping test forms around for years – Misleading reporting, etc.
  • 5. CRESST’s suspects • Outdated or invalid norms • High stakes, that induce “teaching to the test” (i.e., test coaching) (This hypothesis now generally accepted as accurate among K-12 education researchers)
  • 6. • “We know that tests that are used for accountability tend to be taught to in ways that produce inflated scores.” - Dan Koretz, CRESST, 1992 • “Corruption of indicators is a continuing problem where tests are used for accountability or other high-stakes purposes.” - Robert Linn, CRESST, 2000
  • 7. Explanations for Spuriously High Achievement Scores, From Responses to Cannell in Educational Measurement: Issues and Practice (1988)
       Authors:                      A   B   C   D   E   F
       Inadequate norms              X   X   X           X
       Outdated norms                X   X   X   X   X
       Curriculum alignment          X   X   X
       High stakes pressure          X   X
       Teaching the test             X   X           X
       Incomplete population tested  X   X   X
       Inappropriate comparisons         X   X
  • 8. More left-out-variable bias • Linn (2000) cites higher gains on Title 1 pre-post testing over 9 months than over 12 as evidence of inflation – Does not consider 3 months of forgetting • CRESST study (1991) in one school district also cited as evidence of inflation – Does not consider curricular misalignment, motivation, test security, variation in stakes
  • 9. Examining the high-stakes-cause-score-inflation hypothesis • “Strong” version of hypothesis: – There are no rival hypotheses • “Weak” version of hypothesis: – More inflation in grades closer to stakes – Test coaching increases scores – Correlation between stakes and inflation
  • 10. Defining “test-score inflation” State percentile difference between: Cannell’s NRTs (late ‘80s) & Math NAEP (’90 or ’92)
  • 11. Testing the strong hypothesis 1
       State rotated items?         yes     no
       Average “score inflation”    9.3     10.0

       Level of test security       lax     med     tight
       Average “score inflation”    10.6    9.7     8.9
  • 12. Testing the strong hypothesis 2 Moreover… Cannell found score inflation in elementary school tests in dozens of states – none of those tests had high stakes. Cannell also found score inflation in secondary school tests in dozens of states – only one had high stakes.
  • 13. Test Security in South Carolina: score-inflated test Cannell, 1989, p.89: “Unlike their other two tests, teachers are allowed to look at test booklets, teachers may obtain test booklets before the day of testing, booklets are not sealed, and testing is not routinely monitored by state officials. Outside test proctors are not used, test questions have not been rotated every year, and answer sheets have not been scanned for suspicious erasures or analyzed for cluster variance. There are no state regulations that govern test security and test administration for norm-referenced testing done independently in the local school districts.”
  • 14. Test Security In South Carolina: two high-stakes tests Cannell, 1989, p.89: “South Carolina also administers a graduation exam and a criterion referenced test, both of which have significant security measures. Teachers are not allowed to look at either of these two test booklets, teachers may not obtain booklets before the day of testing, the graduation test booklets are sealed, testing is routinely monitored by state officials, special education students are generally included in all tests used in South Carolina unless their IEP recommends against testing, outside test proctors administer the graduation exam, and most test questions are rotated every year on the criterion referenced test.”
  • 15. Tomāto Tomăto. Is the high-stakes-cause-test-score-inflation hypothesis caused by semantic distortion? “Tests are ‘high-stakes’ when: – teachers feel judged by the results? – parents receive reports of their child’s test scores? – test scores are widely reported in the newspapers?”
  • 16. Standards for Educational and Psychological Testing: “High-stakes test. A test used to provide results that have important, direct consequences for examinees, programs, or institutions involved in the testing.” (p.176) “Low-stakes test. A test used to provide results that have only minor or indirect consequences for examinees, programs, or institutions involved in the testing.” (p.178)
  • 17. Shortcomings of Cannell’s studies • Responses to his survey of state test security practices do not always specify which practices apply to which tests in states that administered more than one • He calculated score trends for NRTs and, with one exception, not for standards-based tests
  • 18. Testing the weak hypothesis 1 Q. Do grade levels closer to a high-stakes event (e.g., high school graduation exam) show greater score increases? Yes, in “washback” studies of: John Bishop (1997), Linda Winfield (1990), Norman Frederiksen (1994) No, in Cannell’s data
  • 19. Q. Why disparate results? A. Low-stakes comparison tests differed Washback studies used untraceable, sample-based tests, administered with tight security (TIMSS, NAEP) Cannell used traceable NRTs administered with lax security
  • 20. Testing the weak hypothesis 2 Q. Is there direct evidence that test coaching raises test scores? A. No, see Powers (1993), Becker (1990), Powers & Rock (1994), Camara (2001), etc.
  • 21. Testing the weak hypothesis 3 Perhaps low-stakes tests are subject to score inflation where a jurisdiction administers a separate high-stakes test, thereby creating a general environment of high-stakes pressure?
  • 22. Q. High-stakes, score inflation related? A. Maybe negatively.
                               Coef     S.E.     t       p
       Intercept              45.70    10.20    4.48    0.0004
       NAEP %-ile score       -0.55     0.15   -3.72    0.0020
       Item rotation?          0.57     2.94    0.19    0.8501
       Level of security?      0.85     1.66    0.52    0.6141
       High-stakes?           -6.47     3.51   -1.84    0.0853
  • 23. [Scatterplot: amount of “inflation” (in percentile points) plotted against average NAEP percentile score. Pink squares: states with a high-stakes test. Blue diamonds: states without any high-stakes test.]
  • 24. Two types of tests resist score inflation: 1. Those untraceable to individual jurisdictions or schools (no incentive to cheat) 2. Those with tight security and ample item rotation (no opportunity to cheat) Traceable tests lacking security and item rotation are candidates for score inflation
  • 25. Artificial test score gains (score inflation) are caused by neglect, incompetence, or deliberate educator manipulation, but always require means and opportunity. • Motive is only present with traceable tests. • Means and opportunity exist only in the absence of security measures and item rotation.
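
The inflation measure on slide 10 and the regression table on slide 22 can be sketched in a few lines of Python. This is only an illustrative sketch, not the author's analysis code: the names `score_inflation`, `Term`, and `t_ratio` are hypothetical, and the state percentiles in the usage lines are made up for the example. The coefficients, standard errors, and t values are copied from slide 22; because they are rounded on the slide, the recomputed ratios can differ slightly from the reported ones.

```python
from dataclasses import dataclass


def score_inflation(nrt_percentile: float, naep_percentile: float) -> float:
    """Slide 10's measure: state percentile on the NRT minus state percentile on NAEP."""
    return nrt_percentile - naep_percentile


@dataclass(frozen=True)
class Term:
    """One row of the slide 22 regression table (hypothetical container)."""
    name: str
    coef: float
    std_err: float
    reported_t: float


# Values copied from the slide 22 table.
SLIDE_22 = [
    Term("Intercept", 45.70, 10.20, 4.48),
    Term("NAEP %-ile score", -0.55, 0.15, -3.72),
    Term("Item rotation?", 0.57, 2.94, 0.19),
    Term("Level of security?", 0.85, 1.66, 0.52),
    Term("High-stakes?", -6.47, 3.51, -1.84),
]


def t_ratio(term: Term) -> float:
    """A t statistic is the coefficient divided by its standard error."""
    return term.coef / term.std_err


if __name__ == "__main__":
    # Made-up state: 65th percentile on its NRT, 55th percentile on NAEP.
    print("inflation:", score_inflation(65.0, 55.0))

    # Recompute each t ratio from the rounded slide values and show it
    # next to the value reported on the slide.
    for term in SLIDE_22:
        print(f"{term.name:<20} t = {t_ratio(term):6.2f} (reported {term.reported_t:6.2f})")
```

Read this way, the high-stakes coefficient is negative (about 6.5 percentile points less "inflation") with a p value of 0.0853, which is why slide 22 answers "maybe negatively" rather than claiming a firm effect.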

Editor's Notes

  1. Title
  2. Keillor quote
  3. J.J. Cannell
  4. Cannell’s suspects
  5. CRESST suspects
  6. CRESST quotes
  7. CRESST research design
  8. More LOVB
  9. Testing the hypothesis
  10. Inflation measure
  11. Strong 1
  12. Strong 2
  13. South Carolina inflated
  14. South Carolina high-stakes
  15. Semantic distortion
  16. Standards definition
  17. Cannell study shortcomings
  18. Weak 1
  19. Why disparate results
  20. Weak 2
  21. Weak 3
  22. Stakes and inflation related
  23. Scatterplot
  24. Two resistant test types
  25. Means, motive, opportunity
  26. Fin