IQ Score Interpretation in Atkins MR/ID Death Penalty Cases:  The Good, Bad and the Ugly
 


I presented this at the 2012 Habeas Assistance Training Seminar in Washington DC, Aug, 2012. It reviews a number of psychometric issues in Atkins MR/ID death penalty cases using examples from a recent completed case and other cases as well.

    Presentation Transcript

    • IQ Score Interpretations in Atkins Cases. Kevin S. McGrew, PhD, Director, Institute for Applied Psychometrics (IAP)
    • Additional info re: Kevin McGrew and IAP can be found at the MindHub™ web portal: www.themindhub.com
    • For additional information and to stay current, see the ICDP blog: www.atkinsmrdeathpenaltly.com
    • IQ Score Interpretations in Atkins Cases: A recently successful Atkins case (state agreed to LWOP a few weeks prior to evidentiary hearing) is the basis of this presentation, but it will be augmented with information from other cases
    • Case involved the Flynn Effect, but we will not be covering it today. Recommended article (more at ICDP blog)
    • “Outliers” – why? State expert built argument around the WAIS-R scores being the best estimates of defendant’s true intelligence (underlying “You can’t fake bad” strategy) and dismissed other scores as most likely due to malingering; these arguments were not based on sound and reliable methods of science. State expert failed in professional due diligence to consider scientifically based explanations of the consistencies and inconsistencies in the complete collection of scores
    • Median of all = 68. Strong convergence of indicators. It is statistically or mathematically inappropriate to compute the arithmetic average (mean) of IQ scores. The median is acceptable, under certain circumstances. The only way to compute an average (mean) IQ score is to use a complex equation that incorporates the reliabilities of all scores and the intercorrelations among all scores. Median is acceptable metric
    • Fundamental Issue: Comparability (Exchangeability) of IQ Scores. Intellectual Functioning: Conceptual Issues. Kevin S. McGrew and Keith F. Widaman, AAIDD Death Penalty Manual Chapter (in preparation)
    • Fundamental Issue: Comparability of IQ Scores. “Not all scores obtained on intelligence tests given to the same person will be identical” (AAIDD, 2010, p. 38). The global (full scale) IQ from different tests are frequently similar…Other times the IQ scores will be markedly different…a finding that often produces consternation for examiners and recipients of psychological reports
    • Fundamental Issue: Comparability of IQ Scores. Floyd et al. (2008) used generalizability theory methods to evaluate IQ-IQ exchangeability across ten different IQ battery global composite g-scores (comprised of 6 to 14 individual tests) across approximately 1,000 subjects
    • Fundamental Issue: Comparability of IQ Scores. Average (mdn) r = .76; let's round to .80. Coefficient of determination: r² × 100 = 64% shared variance. [Figure: Test A and Test B, r = .80, with shared common abilities.]
    • Fundamental Issue: Comparability of IQ Scores. [Figure: Test A and Test B, r = .80, with shared common abilities.] “psychologists can anticipate that 1 in 4 individuals taking an intelligence test battery will receive an IQ more than 10 points higher or lower when taking another battery” (Floyd et al., 2008)
    • The standard error of the difference (SEdiff) must be used to ascertain if the scores in question are reliably different. SEdiff = 15 × SQRT[2 - r11 - r22]. Test A reliability = .95; Test B reliability = .93. 1 SEdiff (68% confidence) = 5.2 points; 2 SEdiff (95% confidence) = 10.4 points. Before interpreting the scores from these two IQ tests as being significantly different, an IQ-IQ difference of at least 10+ points would be required. Easier way: use of the confidence band rule-of-thumb
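
The SEdiff arithmetic on the slide above can be checked with a short sketch. It assumes, as the slide's formula does, an IQ metric with a standard deviation of 15; the function name `se_diff` is illustrative only and does not come from the presentation.

```python
import math

def se_diff(r11: float, r22: float, sd: float = 15.0) -> float:
    """Standard error of the difference between two scores on the same metric."""
    return sd * math.sqrt(2.0 - r11 - r22)

# Test A and Test B reliabilities as given on the slide.
one_se = se_diff(0.95, 0.93)
print(round(one_se, 1))      # 1 SEdiff, about 5.2 points
print(round(2 * one_se, 1))  # 2 SEdiff, about 10.4 points
```

Lower reliabilities widen the band, so less reliable scores need a larger gap before a difference counts as reliable.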
    • The standard error of the difference (SEdiff) confidence band rule-of-thumb. If 95% SEM confidence bands for compared scores do not touch, the difference is likely a reliable difference and hypotheses about the difference should be entertained. If 95% SEM confidence bands for compared scores overlap, then the difference is likely not a reliable difference and should not generate significant hypotheses about score differences. [Figure callouts: “e.g., WAIS-R score differences with all other scores represent reliable differences”; “e.g., Not sign. different from each other”; “e.g., None of these 6 tests are sign. different from one another.”] The higher obtained WAIS-R IQ scores are a scientifically based fact in this case. One needs to accept and to explain why.
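
A minimal sketch of the confidence band rule-of-thumb follows. It assumes SEM = 15 × SQRT[1 - reliability] and a band of plus or minus 2 SEM, in line with the 2-SE convention used for 95% confidence elsewhere in the slides. The scores, reliabilities, and function names are hypothetical and are not data from the case.

```python
import math

def sem(reliability: float, sd: float = 15.0) -> float:
    """Standard error of measurement for a score with the given reliability."""
    return sd * math.sqrt(1.0 - reliability)

def band_95(score: float, reliability: float) -> tuple[float, float]:
    """Approximate 95% band as score +/- 2 SEM."""
    half_width = 2.0 * sem(reliability)
    return (score - half_width, score + half_width)

def bands_overlap(a: tuple[float, float], b: tuple[float, float]) -> bool:
    """True when the two intervals share at least one point."""
    return a[0] <= b[1] and b[0] <= a[1]

# Hypothetical scores and reliabilities, for illustration only.
high = band_95(85, 0.95)
low = band_95(68, 0.93)
near = band_95(72, 0.93)
print(bands_overlap(high, low))  # False: bands do not touch, likely a reliable difference
print(bands_overlap(low, near))  # True: bands overlap, likely not a reliable difference
```

The band check is somewhat more conservative than the SEdiff test, because it adds the two SEM half-widths rather than combining them under a square root.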
    • IQ-IQ score differences: Scientific hypotheses that warrant exploration
        • Test administration or scoring errors
        • Practice effects
        • Malingering / effort
        • Norm obsolescence (Flynn effect)
        • Content differences between different tests or different revisions of the same test
        • Little known psychometric problems with some of the “gold standards”
        • Individual/situational factors for person or specific test session
      Today will focus only on select topics: only those relevant to this example case and some of the more unknown or misunderstood issues
    • Unscientific IQ-IQ score difference hypotheses I have seen or read: Voodoo psychometrics. Will focus only on select topics, esp. those relevant to the example case and some of the more unknown or misunderstood issues
    • Outliers – why? Most likely scientific explanations in this case: Ability content differences between different tests or different revisions of the same test
        • “Drilling down” further: changes in g-loadings/saturation of subtests included on WAIS-R and WAIS-III/IV
    • IQ test battery subtest g-loadings or saturation. Individual test g (general intelligence) loadings are derived from factor analysis. Think of a general intelligence pole that is saturated with more g-ness (like magnetism) at the top and less g-ness at the bottom. Factor analysis orders the tests on the pole based on their saturation of g-ness. [Figure: subtests T1 to T10 of an intelligence test battery ordered on a general intelligence (g) pole, from high g at the top to low g at the bottom.]
    • WISC/WISC-R/WAIS/WAIS-R MR/ID subtest g-loading pattern research. Also astounding is the study-by-study consistency in the subtests that emerge as “easy” (Picture Completion, Object Assembly, Block Design) or “hard” (Arithmetic, Vocabulary, Information) for diverse samples of retarded populations (Kaufman, 1979, p. 203) (28 studies)
    • Plot of ________ 1988 and 1993 WAIS-R Subtest Scaled Scores by g (general intelligence) loadings. [Figure: y-axis: ________ WAIS-R subtest scaled scores, from low to high; x-axis: WAIS-R subtest g (general intelligence) loadings, 0.55 to 0.95 (Kaufman, 1990, p. 253), from low g (fair or moderate g; less cognitively abstract/complex) to high g (good or high g; more cognitively abstract/complex). Series: 1988 WAIS-R and 1993 WAIS-R. Subtests plotted: PicA, Dig Spn, PicC, BlkD, Dig Sym, Arith, Cmp, Ob Asm, Sim, Voc, Info.]
    • Plot of _________ WAIS-R Subtest Scaled Scores by g (general intelligence) loadings. Rank-order correlation of ___ 1993 WAIS-R subtest scores with test g-loadings is -.71. Rank-order correlation of ___ 1988 WAIS-R subtest scores with test g-loadings is -.68. This is a form of internal convergence validity evidence for MR/ID Dx. [Figure: y-axis: ___________ WAIS-R subtest scaled scores, from low to high; x-axis: WAIS-R subtest g (general intelligence) loadings, 0.55 to 0.95 (Kaufman, 1990, p. 253), from low g (fair or moderate g; less cognitively abstract/complex) to high g (good or high g; more cognitively abstract/complex). Series: 1988 WAIS-R and 1993 WAIS-R. Subtests plotted: PicA, Dig Spn, PicC, BlkD, Dig Sym, Arith, Cmp, Ob Asm, Sim, Voc, Info.]
    • Plot of _________ WAIS-R Subtest Scaled Scores by g (general intelligence) loadings. [Figure: y-axis: __________ WAIS-R subtest scaled scores, from low to high; x-axis: WAIS-R subtest g (general intelligence) loadings, 0.55 to 0.95 (Kaufman, 1990, p. 253), from low g (fair or moderate g; less cognitively abstract/complex) to high g (good or high g; more cognitively abstract/complex). Series: 1988 WAIS-R and 1993 WAIS-R. Subtests plotted: PicA, Dig Spn, PicC, BlkD, Dig Sym, Arith, Cmp, Ob Asm, Sim, Voc, Info. Subtest callouts: “Dropped from battery in WAIS-IV revision”; “Eliminated from FS IQ in WAIS-IV revision (supplemental subtest)”; “Eliminated from FS IQ in WAIS-III revision (supplemental subtest) & dropped from battery in WAIS-IV revision.”]
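
A rank-order (Spearman) correlation of the kind reported on this slide can be sketched as below. The g-loadings and scaled scores are hypothetical placeholders, not the defendant's data, and the shortcut formula used here assumes there are no tied values.

```python
def ranks(values):
    """Rank values from 1 (smallest) upward; assumes no ties."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0] * len(values)
    for rank, index in enumerate(order, start=1):
        result[index] = rank
    return result

def spearman_rho(x, y):
    """Spearman rank-order correlation via 1 - 6 * sum(d^2) / (n * (n^2 - 1))."""
    n = len(x)
    d_squared = sum((a - b) ** 2 for a, b in zip(ranks(x), ranks(y)))
    return 1.0 - 6.0 * d_squared / (n * (n ** 2 - 1))

# Hypothetical values, for illustration only.
g_loadings = [0.60, 0.65, 0.70, 0.75, 0.80, 0.85]
scaled_scores = [12, 10, 11, 8, 7, 6]
rho = spearman_rho(g_loadings, scaled_scores)
print(round(rho, 2))  # negative: higher g-loading goes with lower scaled score
```

A strongly negative value means the person scores lowest on the most g-saturated subtests, which is the pattern the slide describes as convergent evidence.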
    • The WAIS-III/IV batteries include more complex tests (than the WAIS-R) and are better indicators of general intelligence. The state expert would not recognize (continued to ignore) this scientific fact and held on to the WAIS-R scores as the most accurate, attributing the rest of the lower scores to malingering
    • Outliers – why? Most likely scientific explanations in this case
        • Ability content differences between different tests or different revisions of the same test
        • Little known psychometric problems with some of the “gold standards”
    • CHC IQ Test Batteries DNA Fingerprints
    • The publisher, in both the WAIS-III and WAIS-IV manuals, describes changes in the abilities measured that were made to bring the battery in line with contemporary research. The state expert would not recognize (continued to ignore) this scientific fact and held on to the WAIS-R scores as the most accurate, attributing the rest of the lower scores to malingering
    • Recommended article re: CHC theory of intelligence (Many more at ICDP blog)
    • Continuum of Progress: Intelligence Theories and the Evolution of the Wechsler Adult IQ Battery. [Figure: continuum with column headings General Ability (g); Dichotomous Abilities; Multiple Cognitive Abilities (incomplete; not implicitly or explicitly CHC-organized); Multiple Cognitive Abilities (incomplete; implicitly or explicitly CHC-organized); Multiple Cognitive Abilities (“complete”; implicitly or explicitly CHC-organized). Theories shown: Spearman g; original Gf-Gc; Thurstone PMAs; Cattell-Horn-Carroll (CHC) Theory of Cognitive Abilities (g and broad abilities). Batteries placed on the continuum: W-B (1939; 1946), WAIS-R (1981), WAIS-III (1997), WAIS-IV (2008).] CHC is now considered to be the consensus model of the structure of intelligence. The WAIS-III and WAIS-IV revisions made the battery more consistent with contemporary neurocognitive and intelligence research. They are more valid indicators of general intelligence (supported by WAIS-III/IV tech manuals and independent reviews) than the older WAIS-R. The changes in abilities measured from the WAIS-R to the WAIS-III/IV help explain the WAIS-R “outlier” scores. The WAIS-IV should not be considered “the gold standard” as per the consensus CHC model of intelligence.
    • Continuum of Progress: Intelligence Theories and the Wechsler Adult IQ Battery. [Figure: theory continuum from General Ability (g) to CHC theory, with batteries W-B (1939; 1946), WAIS-R (1981), WAIS-III (1997), WAIS-IV (2008); Stanford-Binet LM (1937; 1960; 1972), SB-IV (1986), SB-V (2003); WJ (1977), WJ-R (1989), WJ III (2001), WJ III NU (2005).] The revisions made to other IQ batteries with adult norms (SB and WJ) also changed the composition of their composite IQ scores, and this is a likely source of score differences that must be considered
    • Continuum of Progress: Intelligence Theories and the Wechsler Adult IQ Battery. [Figure: theory continuum from General Ability (g) to CHC theory, with batteries W-B (1939; 1946), WAIS-R (1981), WAIS-III (1997), WAIS-IV (2008); Stanford-Binet LM (1937; 1960; 1972), SB-IV (1986), SB-V (2003); WJ (1977), WJ-R (1989), WJ III (2001), WJ III NU (2005).] Knowing the ability coverage similarities and differences is important when comparing and understanding possible IQ-IQ differences between the latest versions of these batteries
    • Continuum of Progress: Intelligence Theories and the Wechsler Adult IQ Battery. [Figure: theory continuum from General Ability (g) to CHC theory, with batteries W-B (1939; 1946), WAIS-R (1981), WAIS-III (1997), WAIS-IV (2008); Stanford-Binet LM (1937; 1960; 1972), SB-IV (1986), SB-V (2003); WJ (1977), WJ-R (1989), WJ III (2001), WJ III NU (2005).] IQ-IQ score difference explanations may require an understanding of ability coverage across and within battery revisions. There are many possible scenarios when there is a history of IQ testing within the same battery system or across battery systems
    • Continuum of Progress: Intelligence Theories and Test Batteries. [Figure: theory continuum from General Ability (g) to CHC theory. Primary theories (neuropsych. and psychometric): Spearman g; original Gf-Gc; Thurstone PMAs; Cattell-Horn-Carroll (CHC) Theory of Cognitive Abilities; Simultaneous-Successive; PASS (Planning, Attention, Simultaneous, Successive). Applied IQ batteries: WJ (1977), WJ-R (1989), WJ III (2001), WJ III NU (2005); Stanford-Binet LM (1937; 1960; 1972), SB-IV (1986), SB-V (2003); WPPSI-R (1989), WPPSI-III (2002); WISC-R (1974), WISC-III (1991), WISC-IV (2003); W-B (1939; 1946), WAIS-R (1981), WAIS-III (1997), WAIS-IV (2008); K-ABC (1983), KABC-II (2004); KAIT (1993); CAS (1997); DAS (1990), DAS-II (2007).] When childhood and adult battery scores are available, the interpretation of IQ-IQ differences due to ability coverage differences becomes even more complex
    • Knowledge of CHC ability coverage is critical when brief special-purpose (e.g., nonverbal) IQ scores are reported. [Figure: TONI-2/Ravens: 100% Gf.]
    • The state expert argued that some of the lower subtest scores (after the WAIS-R’s) were further evidence of malingering. Voodoo psychometrics
    • State expert argued that variability in Wechsler subtest scores, esp. lower scores post-Atkins, was an obvious sign of malingering, thus supporting the conclusion that the WAIS-R scores were the best estimate of general intelligence. The implied “You can’t fake smart” strategy or interpretation
    • There is an EXTREME amount of variability in the professional expertise in IQ subtest profile interpretation: Scientific/psychometric vs. “clinical” lore-based interpretation
    • Recall the standard error of the difference (SEdiff) must be used to ascertain if the scores in question are reliably different
    • Plot of ___________ WAIS-R & WAIS-III Similarities scores (± 95% SEM), range of 4. [Figure: scaled score (0 to 20) by test date; 95% SEM band (median = ± 1.7); average (median) = 5.0.] No statistically reliable difference across all scores
    • Plot of ______________ WAIS-R & WAIS-III Comprehension scores (± 95% SEM), range of 4. [Figure: scaled score (0 to 20) by test date; 95% SEM band (median = ± 2.3); average (median) = 5.5.] No statistically reliable difference across all scores
    • Plot of __________ WAIS-R, WAIS-III & WAIS-IV Digit Span scores (± 95% SEM), range of 7. [Figure: scaled score (0 to 20) by test date; 95% SEM band (median = ± 1.9); average (median) = 5.5; 7 point difference.] As reported in the WAIS-R tech. manual, DS has poor reliability (mdn = .81), the 4th weakest in the battery. Thus some variability is to be expected. And the WAIS-IV DS is a three-component and not a two-component test, so they are not measuring the exact same construct. There is a scientific explanation
    • Plot of ________ WAIS-R, WAIS-III & WAIS-IV Picture Completion (PICC) scores (± 95% SEM), range of 6. [Figure: scaled score (0 to 20) by test date; 95% SEM band (median = ± 2.5); average (median) = 4.5.] On the WAIS-R to WAIS-III revision: “Only 50% of the content of Picture Completion and Picture Arrangement was retained from the WAIS-R, and only 60% of the Object Assembly items were retained. In addition, the correlations between WAIS-R and WAIS-III version of these subtests are relatively low (r’s of .59 - .63)” (Kaufman & Lichtenberger, 2002, p. 91), i.e., 35 to 40% shared variance. There is a scientific explanation
    • The state expert proposed an “Expected WAIS-III IQ (based on WAIS-R IQ) minus Actual WAIS-III IQ” discrepancy method to support the malingering hypothesis. Voodoo psychometrics
    • WAIS-R IQ 85 → Expected WAIS-III 81-83 (will use 82 for discussion)
    • WAIS-R IQ 85 → Expected WAIS-III 81-83 (will use 82 for discussion). Obtained WAIS-III scores lower than “expected/predicted” = malingering according to state expert. All other lower scores = malingering as per state expert
    • Major flaws with this method and logic (part of the commonly stated or implied “You can’t fake smart” strategy)
        • There is no need to estimate WAIS-III scores as actual WAIS-III scores exist
        • No scientific or professional evidence or literature suggesting the use or validity of this method
        • The technical manuals do not recommend the use of these tables for this purpose. The purpose for presenting them in the TM is to demonstrate concurrent criterion validity. This information clearly was not presented in the TM to support this type of use
        • If such a procedure were to be used, the study would need to include subjects that had the WAIS-III 9+ years later than the WAIS-R (not an average of 4.7 weeks)
        • The tables do not include the standard error of equating (esp. around the cut score of 70), which would be required as per the Joint Test Standards if the table was intended to be used for this purpose
        • If intended for this purpose, the publisher would have had to conduct a properly designed equating study (rectangular distribution; minimum n recommended is 400 to 1,500, not 192)
        • etc., etc., etc.
    • The only scientifically accepted method for predicting one score from another is to use the correlation and a prediction model. The WAIS-R/WAIS-III correlation of .93 would suggest very accurate prediction, but all prediction has error that can be quantified as the standard error of estimate (SEest)
    • Using WAIS-R IQ scores and a standard prediction model based on WAIS-R/WAIS-III r = .93, the best predicted WAIS-III given WAIS-R scores is 81. But there is prediction error:
        • 1 SEest (68% confidence) = ± 5.5
        • 2 SEest (95% confidence) = ± 11.0
      Thus, given this person’s WAIS-R score, the only scientifically accepted expected/predicted WAIS-III score is 81 ± 11 pts: a 95% confidence band of predicted/expected WAIS-III score of 70 to 92
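
The prediction-error figures on this slide can be reproduced with a small sketch. It assumes SEest = 15 × SQRT[1 - r²] on an IQ metric with a standard deviation of 15, and it takes the predicted score of 81 as given on the slide rather than deriving it; `se_est` is an illustrative name.

```python
import math

def se_est(r: float, sd: float = 15.0) -> float:
    """Standard error of estimate when predicting one score from another."""
    return sd * math.sqrt(1.0 - r ** 2)

r = 0.93          # WAIS-R/WAIS-III correlation from the slide
predicted = 81.0  # predicted WAIS-III score as stated on the slide
one_se = se_est(r)
low, high = predicted - 2 * one_se, predicted + 2 * one_se
print(round(one_se, 1))         # about 5.5
print(round(low), round(high))  # about 70 and 92
```

Even with a correlation as high as .93, the 95% band spans more than 20 points, which is why a single "expected" score cannot serve as a cutoff.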
    • Only appropriate predicted/expected WAIS-III score prediction (95% confidence) is a range from 72 to 90. All actual WAIS-III IQ scores have SEM confidence bands that overlap with the SEest (standard error of estimate, i.e., error of prediction) band based on the WAIS-R score. Thus, all 3 WAIS-III scores are not reliably statistically different from the predicted score
    • The state expert characterized defendant’s measured achievement (WJ III) as “quite impressive” given his level of measured intelligence, i.e., at levels inconsistent with MR/ID Dx. The IQ = ACH fallacy argument. Voodoo psychometrics
    • Problems with “impressive” achievement argument
        • Defendant’s original WJ III achievement scores were based on the original 2001 norms. Failed to rescore and reinterpret in light of the WJ III 2007 Normative Update (WJ III NU)
        • Selective “cherry picking” of relatively high scores and failure to utilize the most “real world” score metrics to establish functional academic skills
        • Ignored cognitive measures on WJ III Ach. Battery consistent with MR/ID
      IQ = ACH fallacy
    • [Figure callouts: “Test authors & pub rec this as best metric”; “State expert focused on these scores”; “Cog measures” (two places).] Hardly “quite impressive”
    • Recall the standard error of the estimate (SEest) must be used to estimate the amount of error in the IQ → ACH prediction
    • The Reality of IQ → Achievement Predicted Scores. IQ-ACH correlation in scientific literature (for adults) reported from .50 to .60. Prediction error (SEest) when r = .50 to .60:
        • 1 SEest (68% confidence) = ± 12/13
        • 2 SEest (95% confidence) = ± 24/26
      State expert used IQ of 73 within the context of his “impressive” conclusion. Using this score, the scientifically accepted range of expected/predicted achievement scores is approximately 72 to 98 (68% confidence) and 59 to 111 (95% confidence). The defendant’s WJ III NU ach. standard scores are well within these expected ranges
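
The expected-achievement ranges can be approximated with a simple regression sketch. It assumes that both IQ and achievement are on a standard-score metric (mean 100, SD 15) and that prediction is linear, predicted = 100 + r × (IQ - 100); neither assumption is spelled out on the slide, and `predict_with_band` is an illustrative name.

```python
import math

def predict_with_band(iq, r, n_se, mean=100.0, sd=15.0):
    """Regression-based predicted score with a +/- n_se * SEest band."""
    predicted = mean + r * (iq - mean)
    se_est = sd * math.sqrt(1.0 - r ** 2)
    return predicted - n_se * se_est, predicted, predicted + n_se * se_est

for r in (0.50, 0.60):
    for n_se in (1, 2):
        low, mid, high = predict_with_band(73, r, n_se)
        print(f"r={r:.2f}, +/-{n_se} SEest: {low:.0f} to {high:.0f} (predicted {mid:.0f})")
```

Under these assumptions SEest comes out at about 13 points for r = .50 and 12 points for r = .60, and the printed bands land close to the slide's approximate ranges.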
    • The IQ → Achievement Fallacy: One cannot achieve above your IQ score
    • The IQ → Achievement Fallacy: One cannot achieve above your IQ score (often used as part of “You can’t fake smart” argument). IQ-ACH correlations of .50 to .60 indicate that IQ accounts for only approximately 25% to 36% of the variance in ach. test scores. Thus, for any given IQ score:
        • Half of all individuals will obtain achievement scores at or below their IQ score.
        • Half of all individuals will obtain achievement scores at or above their IQ score!
    • Other “You can’t fake smart” examples I have seen (not exhaustive list)
        • The use of the National Adult Reading Test (NART), a commonly used measure to predict “premorbid” intelligence in neuropsych settings, to predict expected IQ scores against which an existing score is compared
        • The use of neuropsych “demographically adjusted (Heaton)” norms
    • Other “You can’t fake smart” examples I have seen (not exhaustive list): Use of group aptitude measures (ASVAB; AFQT) as convergent validity evidence
    • Proportional CHC broad ability coverage of ASVAB and ASVAB-derived AFQT score. [Figure: bar chart; y-axis: % CHC broad abilities represented in ASVAB and ASVAB AFQT score (0% to 100%); x-axis: CHC broad abilities, grouped as “Major cognitive ability domains sampled across the major individualized IQ batteries (Wechslers, Stanford-Binet, WJ III/BAT III) which are combined to produce general intelligence (g) full-scale global composite IQ score” and “Other human ability domains (acquired acculturated knowledge) included in the ASVAB differential aptitude test battery.” Data table, in column order: Gf, Gq, Gc, Glr, Ga, Gv, Gsm, Gs, Grw, Gk. ASVAB (%): 15.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 25.0, 30.0, 30.0. ASVAB AFQT (%): 25.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 50.0, 25.0.] Note. ASVAB Verbal tests (Verbal Comp or VL as per CHC model/theory) also tap Gc abilities, but require the subject to read the items, thus involving Grw abilities
    • Other “You can’t fake smart” examples I have seen (not exhaustive list): Unknown problems with some of the older “gold standards”, often due to lack of due diligence and expertise
    • The 1960 SB was not a renorming (data gathered for item ordering work)
        • 1960 SB norms still based on 1932 norming sample
        • Any 1960 SB score may suffer from extreme Flynn effect (e.g., if tested in 1972 with 1960 SB, FE of approximately 12 points)
      The 1986 SB-IV had serious psychometric problems (Reynolds, 1987 & others)
        • Underrepresentative standardization sample (“far below industry standards”)
        • “IQ roulette”
        • “I believe the use of the S-B IV IQs to be logically indefensible, and I certainly would not want to defend their accuracy or validity in a court of law” (Reynolds, 1987; p. 141)
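
The 12-point Flynn effect figure for the 1960 SB can be reproduced with a one-line calculation. The rate of 0.3 IQ points per year (about 3 points per decade) is an assumption here: it is the commonly cited rule of thumb, but the slide itself does not state which rate was used.

```python
# Assumed rate (about 3 IQ points per decade); not stated on the slide.
FLYNN_POINTS_PER_YEAR = 0.3

def flynn_inflation(norm_year: int, test_year: int,
                    rate: float = FLYNN_POINTS_PER_YEAR) -> float:
    """Estimated score inflation from norms that are (test_year - norm_year) years old."""
    return rate * (test_year - norm_year)

# 1960 SB norms rest on the 1932 norming sample; example test date is 1972.
print(round(flynn_inflation(1932, 1972), 1))  # about 12 points
```

The key input is the year of the norming sample, not the publication year of the test form, which is why the 1960 SB behaves like a 1932 instrument.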
    • Other “You can’t fake smart” examples I have seen (not exhaustive list): Unknown problems with some of the older “gold standards”
        • The WAIS-R norm sample for 16 to 19 year olds has been demonstrated to be suspect and “soft.” “Simply put, the WAIS-R norms for 16-19-year-olds are suspect and examiners should interpret [them] with extreme caution. The norms for 16-19-year-olds are ‘soft’ or ‘easy’ because the reference group performed more poorly than 16-to-19-year-olds really perform in the general population. The surprising result is that the IQs of 16- through 19-year-olds tested on the WAIS-R will be spuriously high by 3 to 5 points” (Kaufman, 1990, p. 85, italics added)
    • IQ Score Interpretations in Atkins Cases Kevin S. McGrew, PhD Director Institute for Applied Psychometrics (IAP) www.themindhumb.com