Moving Beyond “Eeny, Meeny, Miny,
Moe”: What Factors Should Guide the
Evaluation of Selection Tests
John M. Ford
CWH Research, Inc.
Overview
• Critical Assumptions
• 7 Important Considerations When
Evaluating Selection Tests
Critical Assumptions
• Tests have value.
• Not all tests are the same.
• Tests are part of an overall organizational
environment—Changing to a new test will
not magically change your organization by
itself.
#1: Don’t let the tail wag the dog—
Take control of your process
• The RFP process is not conducive to
making informed decisions.
– Don’t let test providers decide what
information you should consider.
• Evaluating tests requires professional
judgment.
– You must ask the right questions and evaluate
the evidence.
#1: Don’t let the tail wag the dog—
Take control of your process
• Don’t forget future and hidden costs.
– Inefficient performance
– Increased training/remedial training/retraining
– Lawsuits
– Turnover
– Grievances
– Disciplinary problems
– Accidents
#2: There is no such thing as a
valid test
“Validity refers to the degree to which evidence and
theory support the interpretations of test scores entailed
by proposed uses of tests. Validity is, therefore, the
most fundamental consideration in developing and
evaluating tests. The process of validation involves
accumulating evidence to provide a sound scientific
basis for the proposed score interpretations. It is the
interpretations of test scores required by proposed uses
that are evaluated, not the test itself. When test scores
are used or interpreted in more than one way, each
intended interpretation must be validated” (Standards
for Educational and Psychological Testing, 1999; p. 9).
#3—Not all validation evidence is
equal
• Validity should not be treated as a
categorical variable in your decision-
making.
• Validation evidence should be evaluated
along a continuum.
• This guideline applies to evidence regarding
content relevance (i.e., content validity).
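Example: Not All Content Relevance
Evidence is Equal.
[Figure: overlap between the Job Domain and the Test Domain, shown for Test 1 and Test 2; the job domain and the test domain are not identical (≠).]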
Corollary: Adverse impact is also
not a categorical variable
• Adverse impact should also be evaluated on a
continuum.
– Although they both violate the 4/5ths rule, an AI ratio
of .70 is preferable to .20.
– Similarly, 1.00 is preferable to .80.
• Higher AI ratios provide a variety of benefits:
– More diversity in your organization
– Greater likelihood of meeting the 4/5ths rule in
individual samples
– Lower likelihood of grievances, EEOC investigations,
lawsuits, and bad press
#4: Context matters!!!
• Validity cannot be properly evaluated without
knowledge of the validation process.
– Get the technical report.
– Validation study circumstances should match your
circumstances.
• Every validation study should include a job
analysis or analysis of work.
– Is the job domain appropriately defined?
– Are the job requirements similar to your position?—
This is necessary to transport validation evidence.
– Are the test components defined in a manner consistent
with the job domain?
#4: Context matters!!!
• Use of the test should match your process and needs.
• Validity coefficients are not an island—they
provide very little information without
context.
– Is the sample appropriate for your agency?
– Is the criterion related to important aspects of
the job (and your job)?
– Is the validity coefficient corrected or
uncorrected?
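To make the corrected-versus-uncorrected question concrete, here is a minimal sketch of one common correction: the classical correction for criterion unreliability, which divides the observed coefficient by the square root of the criterion's reliability. The function name and the numbers are illustrative assumptions, not values from any technical report, and a provider may apply other corrections (for example, for range restriction).

```python
import math


def correct_for_criterion_unreliability(observed_r, criterion_reliability):
    """Classical correction for attenuation due to an unreliable criterion.

    observed_r: uncorrected validity coefficient.
    criterion_reliability: reliability of the performance measure, in (0, 1].
    """
    if not 0 < criterion_reliability <= 1:
        raise ValueError("criterion_reliability must be in (0, 1]")
    return observed_r / math.sqrt(criterion_reliability)


# Hypothetical values chosen only for illustration.
observed = 0.30
corrected = correct_for_criterion_unreliability(observed, criterion_reliability=0.64)
print(f"uncorrected r = {observed:.2f}, corrected r = {corrected:.3f}")
```

The same study can therefore be reported with two noticeably different coefficients, which is why it matters to ask which one a provider is quoting.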
#4: Context matters!!!
• Don’t forget complexity.
– Reading level
– Math level
– Skills/Abilities level
• Context is also important in evaluating adverse
impact.
– Adverse impact is influenced by factors unrelated to the
test.
– Consider the sample—Applicant samples are better
indicators of adverse impact than incumbent samples.
Example: Adverse impact is influenced
by factors unrelated to the test
• Total sample size
• Number of minorities in the sample
• Selection ratio
• Correlation between predictors
Example: AI Ratios From a Single-Hurdle
Selection System
[Figure: probability of violating the 4/5ths rule (y-axis, 0 to 0.4) by selection ratio (x-axis, 0.1 to 0.9), with separate lines for samples with 12% and 20% minority applicants; N = 200, d = 0.00. Source: Roth, Bobko, & Switzer, 2006.]
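The point that the 4/5ths rule can be violated by chance alone, even when d = 0.00, can be sketched with an exact calculation. The sketch below assumes the top applicants are selected from one pooled sample in which group membership is unrelated to scores, so the number of minority applicants selected follows a hypergeometric distribution. This is a simplified illustration, not the procedure used by Roth, Bobko, & Switzer (2006); the function name is hypothetical.

```python
from math import comb


def prob_violating_four_fifths(n_total, n_minority, selection_ratio):
    """Chance of an AI ratio below 0.80 when scores are unrelated to group (d = 0)."""
    n_majority = n_total - n_minority
    n_selected = round(n_total * selection_ratio)
    all_outcomes = comb(n_total, n_selected)
    prob = 0.0
    lowest_k = max(0, n_selected - n_majority)
    highest_k = min(n_minority, n_selected)
    for k in range(lowest_k, highest_k + 1):  # k = minority applicants selected
        # AI ratio < 4/5, written with integers to avoid rounding problems:
        # (k / n_minority) / ((n_selected - k) / n_majority) < 4/5
        if 5 * k * n_majority < 4 * (n_selected - k) * n_minority:
            prob += comb(n_minority, k) * comb(n_majority, n_selected - k) / all_outcomes
    return prob


# N = 200 with 12% and 20% minority applicants, as in the figure's conditions.
for n_minority in (24, 40):
    for sr in (0.1, 0.5, 0.9):
        p = prob_violating_four_fifths(200, n_minority, sr)
        print(f"minority n = {n_minority}, selection ratio = {sr}: P(violation) = {p:.3f}")
```

Under these assumptions the chance of a violation is highest when few applicants are selected and few minority applicants are in the sample, even though the test itself shows no group difference.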
Example: Consider the sample when
evaluating adverse impact.
White-Black SD difference, by sample:
Test       Validation Sample   Applicant Sample
Test #1         -.10                .50
Test #2          .15                .54
Test #3         -.66                .24
Test #4         -.16                .41
Test #5         -.13                .69
• Applicant samples generally demonstrate higher
adverse impact than incumbent samples.
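For readers who want to compute a standardized group difference from their own applicant data, the sketch below uses the usual pooled-standard-deviation form of the d statistic. The score lists are made-up placeholders and the function name is hypothetical; the presentation does not say how its SD differences were calculated.

```python
import math
from statistics import mean, variance


def standardized_difference(group_a, group_b):
    """Mean difference (a minus b) divided by the pooled standard deviation."""
    n_a, n_b = len(group_a), len(group_b)
    pooled_var = (
        (n_a - 1) * variance(group_a) + (n_b - 1) * variance(group_b)
    ) / (n_a + n_b - 2)
    return (mean(group_a) - mean(group_b)) / math.sqrt(pooled_var)


# Placeholder scores for two applicant groups (not real data).
group_a_scores = [2, 4, 6]
group_b_scores = [1, 3, 5]
print(standardized_difference(group_a_scores, group_b_scores))
```

Running the same calculation separately on applicant and incumbent samples is one way to see whether the two tell different stories.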
#5: Beware of small samples
• “Ignoring sampling error leads to
disastrous results in the area of personnel
selection.” (Hunter & Hunter, 1984)
• Sampling error arises because a study samples only
part of the entire population.
– Single studies and/or small samples are not
definitive.
– Results from single studies and/or small
samples are not robust.
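One way to see how little a small sample pins down is to put a confidence interval around a validity coefficient. The sketch below uses the standard Fisher z-transformation with a 95% interval; the coefficient of .30 and the sample sizes are illustrative assumptions only.

```python
import math


def correlation_confidence_interval(r, n, z_crit=1.96):
    """Approximate confidence interval for a correlation via Fisher's z."""
    if n <= 3:
        raise ValueError("n must be greater than 3")
    z = math.atanh(r)                 # Fisher z-transform of r
    se = 1.0 / math.sqrt(n - 3)       # standard error on the z scale
    return math.tanh(z - z_crit * se), math.tanh(z + z_crit * se)


# Same observed validity, different sample sizes (hypothetical).
for n in (20, 65, 200):
    low, high = correlation_confidence_interval(0.30, n)
    print(f"n = {n:3d}: r = .30, 95% CI = ({low:.2f}, {high:.2f})")
```

With 20 participants the interval includes zero, so the same observed coefficient is far weaker evidence than it would be with 200.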
Example: Sampling Error, Smaller Samples
[Figure: validity coefficient (y-axis, 0 to 0.7) for a single test validated in multiple samples (#1 through #9 and the total sample); all samples > 20 participants.]
Example: Sampling Error, Larger Samples
[Figure: validity coefficient (y-axis, 0 to 0.8) for the Next Generation Firefighter/EMS Written Aptitude Test in Client #1, Client #2, Client #3, and the combined sample; all samples > 65 participants.]
#5: Beware of small samples
• Capitalizing on chance can result in misleading
validity coefficients.
• Capitalization on chance can occur when:
– Final items on test are determined based on validation
sample.
– Test weights are determined based on validation
sample.
• You should expect lower validity coefficients in the
future under these circumstances.
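A rough sense of how much validity may shrink when weights are fitted to the validation sample can be had from the widely used adjusted R-squared formula, which discounts the squared multiple correlation for the number of predictors relative to the sample size. This is a sketch with hypothetical inputs; it is not the analysis behind the example in this presentation, and cross-validation in a new sample remains the more direct check.

```python
def adjusted_r_squared(r_squared, n, k):
    """Adjusted R-squared for a regression with k predictors fitted on n cases."""
    if n - k - 1 <= 0:
        raise ValueError("need n > k + 1")
    return 1.0 - (1.0 - r_squared) * (n - 1) / (n - k - 1)


# Hypothetical study: R-squared of .25 from 5 weighted components.
for n in (30, 100, 500):
    print(f"n = {n:3d}: adjusted R-squared = {adjusted_r_squared(0.25, n, 5):.3f}")
```

The smaller the sample and the more weights estimated from it, the larger the expected drop.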
Example: Capitalization on
Chance
[Figure: validity (y-axis, 0 to 0.5) of the Next Generation Firefighter/EMS Written Aptitude Test in Client #1, Client #2, Client #3, and the total sample, comparing regression weights based on Client #1 with rational weights.]
#5: Beware of small samples
• Single studies/small samples can also result in misleading
adverse impact ratios.
– The 4/5ths rule is not AI. It is an indicator of
underlying AI—“The 4/5ths rule merely establishes a
numerical basis for drawing an initial inference and for
requiring additional information” (Uniform Guidelines,
Questions & Answers)
• AI ratios can vary substantially over different
administrations.—Again, results from single studies and/or
small samples are not definitive or robust.
Example: One Client’s AI Ratios Over Multiple
Administrations
[Figure: adverse impact ratio (y-axis, 0 to 1.6) across administrations 1 through 8. Data labels: 1.41, 0.84, 0.69, 0.67, 0.87, 0.78, 0.72, 0.44. Combined sample = .75.]
#5: Beware of small samples
• When evaluating samples:
– More weight should be given to evidence from
multiple samples—Cross validation.
– More weight should be given to larger samples.
– More weight should be given to representative
samples.
– More weight should be given to results from
studies that are developed and weighted using
rational models.
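Giving more weight to larger samples can be as simple as a sample-size-weighted average of the validity coefficients from several studies. The sketch below shows that calculation; the study values are invented for illustration and the function name is hypothetical.

```python
def weighted_mean_validity(studies):
    """Sample-size-weighted mean of validity coefficients.

    studies: iterable of (validity_coefficient, sample_size) pairs.
    """
    studies = list(studies)
    total_n = sum(n for _, n in studies)
    if total_n == 0:
        raise ValueError("total sample size must be positive")
    return sum(r * n for r, n in studies) / total_n


# Hypothetical studies: one small sample with a high coefficient,
# one large sample with a more modest coefficient.
studies = [(0.50, 20), (0.20, 180)]
print(f"unweighted mean = {sum(r for r, _ in studies) / len(studies):.2f}")
print(f"weighted mean   = {weighted_mean_validity(studies):.2f}")
```

The weighted figure sits much closer to the large-sample result, which is the behavior the guidance above asks for.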
#6: Don’t forget the O’s
• The concept of KSAs has been expanded to KSAOs
– O’s = Other Characteristics
• Judgment & Common Sense
• Interpersonal Skills
• Emotional Skills
• Leadership
• Personality traits or temperaments
• Interests
• Defining a broader job domain can result in higher
validity and lower adverse impact.
Example: KSAO Importance
Ratings for Firefighter Position
[Figure: mean importance ratings (axis 3 to 4.5) for Basic Educational Skills, Emotional Outlook, Interpersonal Skills, and Practical Skills. Data labels: 3.8, 4.2, 4.2, 4.2. Scale anchors: Very Important = essential for successful performance of the job; Critically Important = failure to perform results in extreme negative consequences.]
Example: Broad Assessments Can Increase
Validity & Reduce Adverse Impact
[Figure: two bar charts comparing Basic Educational Skills alone with the Combined assessment: validity (y-axis, 0 to 0.4) and d-statistic (y-axis, 0 to 0.9). Combined includes Interpersonal Skills, Emotional Outlook, & Practical Skills.]
#6: Don’t forget the O’s
• Using broader assessments early in the process can result
in substantially better hires.
– Some agencies administer a narrow test (e.g., basic
educational skills) in the first stage and measure a broader
range of skills in a later stage (e.g., interview).
– This strategy can screen out more complete candidates
who would be superior employees.
– Measuring a broad range of skills can increase the validity
(i.e., the quality of the candidate pool) and minimize the AI
of your first stage (as well as your total process).
– Measuring a broad range of skills early in your process can
also reduce the cost of later steps.
Example: Which Candidate Would
be the Best Hire?
Candidate     Basic Educational Skills   Interpersonal   Emotional Outlook   Practical
Candidate A             87                    60                60              60
Candidate B             85                    70                70              70
Candidate C             83                    90                90              90
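The candidate table can be turned into a small calculation that ranks the three candidates two ways: on Basic Educational Skills alone, and on a composite of all four areas. Equal weighting of the four areas is an assumption made here for illustration; the presentation does not state how the areas should be combined.

```python
candidates = {
    "Candidate A": {"basic": 87, "interpersonal": 60, "emotional": 60, "practical": 60},
    "Candidate B": {"basic": 85, "interpersonal": 70, "emotional": 70, "practical": 70},
    "Candidate C": {"basic": 83, "interpersonal": 90, "emotional": 90, "practical": 90},
}


def composite(scores):
    """Equal-weight average across all measured areas (an assumed weighting)."""
    return sum(scores.values()) / len(scores)


# Rank on the narrow screen only, then on the broad composite.
rank_by_basic = sorted(candidates, key=lambda c: candidates[c]["basic"], reverse=True)
rank_by_composite = sorted(candidates, key=lambda c: composite(candidates[c]), reverse=True)

print("Basic skills only:", rank_by_basic)
print("Broad composite:  ", rank_by_composite)
```

The two rankings are exact reverses of each other: the candidate placed last by the narrow screen is first on the broad composite.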
Example: Advantage of Measuring a
Broad Range of Skills Early in
Process
Selection   AI Ratio,          AI Ratio,         % of Top Candidates
Ratio       Cognitive Screen   Complete Model    Screened Out by Cognitive Screen
.20         .32                .32               68%
.40         .37                .49               35%
.60         .52                .65               23%
.80         .63                .85               12%
#7: Remember to evaluate the
pass point
• Adverse impact ratios are dependent upon pass
points.
– Adverse Impact Ratio—A substantially different rate of
selection is indicated when the selection rate for a
protected group is less than 4/5ths (80%) of the selection
rate for the group with the highest selection rate.
• Changing the pass point changes the AI ratio.
• Make sure the pass point used by the test provider when
evaluating adverse impact is similar to your expected
pass point.
Example: Adverse impact ratios are dependent
on pass points
[Figure: overlapping score distributions for the protected group and the majority group, with group means 1 standard deviation apart (d = 1.00); one pass point is marked as meeting the 4/5ths rule and another as failing it.]
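The dependence of the AI ratio on the pass point can be sketched numerically. The code below assumes both groups' scores are normally distributed with equal standard deviations and means one standard deviation apart (d = 1.00), matching the figure's setup; the normality assumption and the function name are illustrative, not taken from the presentation.

```python
from statistics import NormalDist


def ai_ratio_at_pass_point(pass_point, d=1.0):
    """AI ratio when everyone scoring above pass_point passes.

    pass_point is expressed in SD units relative to the majority-group mean.
    """
    majority = NormalDist(mu=0.0, sigma=1.0)
    protected = NormalDist(mu=-d, sigma=1.0)
    majority_rate = 1.0 - majority.cdf(pass_point)
    protected_rate = 1.0 - protected.cdf(pass_point)
    return protected_rate / majority_rate


for cut in (-3.0, -2.0, -1.0, 0.0, 1.0):
    ratio = ai_ratio_at_pass_point(cut)
    verdict = "meets" if ratio >= 0.8 else "fails"
    print(f"pass point {cut:+.1f} SD: AI ratio = {ratio:.2f} ({verdict} the 4/5ths rule)")
```

Under these assumptions only very low pass points meet the 4/5ths rule, and the ratio falls steadily as the pass point rises, so an AI ratio reported at one pass point says little about another.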
#7: Remember to evaluate the
pass point
• Remember that your process may have
multiple pass points.
– Those who pass the test
– Those who are ultimately hired
• Although your initial pass point may meet
the 4/5ths rule, the rank-order is a critical
consideration.
Example: Rank order is a critical consideration
[Figure: overlapping score distributions for the protected group and the majority group, with group means 1 standard deviation apart (d = 1.00); an initial pass point and a final pass point are marked.]
Example: Rank order impacts AI Ratio
of your ultimate pass point
Selection Process Results
• 10 of 70 W pass; 4 of 30 B pass
• W pass ratio = 14.3%; B pass ratio = 13.3%
• AI Ratio = 0.93
Rank   Score   Race
1      92      W
2      88      W
3      87      W
4      86      B
5      81      W
6      80      W
7      79      W
8      78      B
9      77      W
10     76      B
11     75      B
12     72      W
13     71      W
14     70      W
If 10 are hired (ranks 1 to 10):
• Hires: 7 W, 3 B
• Hire ratios: W = 10%, B = 10%
• Hire ratio AI = 1.0
• Conclusion: No Adverse Impact
If 5 are hired (ranks 1 to 5):
• Hires: 4 W, 1 B
• Hire ratios: W = 5.7%, B = 3.3%
• Hire ratio AI = 0.58
• Conclusion: Adverse Impact
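The rank-order example can be reproduced with a few lines of code. The sketch below uses the ranked list and applicant counts from the example (70 W and 30 B applicants, 14 passing) and computes the AI ratio for any number of top-down hires; the function and variable names are hypothetical.

```python
# Race of each passing candidate in rank order (ranks 1 through 14).
ranked_race = ["W", "W", "W", "B", "W", "W", "W", "B", "W", "B", "B", "W", "W", "W"]
applicants = {"W": 70, "B": 30}


def hire_ai_ratio(n_hires):
    """AI ratio (B selection rate / W selection rate) when the top n_hires are hired."""
    hired = ranked_race[:n_hires]
    w_rate = hired.count("W") / applicants["W"]
    b_rate = hired.count("B") / applicants["B"]
    return b_rate / w_rate


for n in (14, 10, 5):
    ratio = hire_ai_ratio(n)
    verdict = "meets" if ratio >= 0.8 else "fails"
    print(f"top {n:2d} selected: AI ratio = {ratio:.2f} ({verdict} the 4/5ths rule)")
```

The same passing list yields AI ratios of 0.93, 1.0, and 0.58 depending on how far down the list hiring goes, so it is worth checking the ratio at the number of hires actually expected.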
7 Critical Considerations When
Evaluating Selection Tests
1. Don’t let the tail wag the dog—Take
control of your process.
2. There is no such thing as a valid test.
3. Not all validity evidence is equal.
4. Context matters!!!
5. Beware of small samples.
6. Don’t forget the O’s.
7. Remember to evaluate the pass point.

Call Girls Bangalore Saanvi 7001305949 Independent Escort Service Bangalore
 
(ANIKA) Call Girls Wadki ( 7001035870 ) HI-Fi Pune Escorts Service
(ANIKA) Call Girls Wadki ( 7001035870 ) HI-Fi Pune Escorts Service(ANIKA) Call Girls Wadki ( 7001035870 ) HI-Fi Pune Escorts Service
(ANIKA) Call Girls Wadki ( 7001035870 ) HI-Fi Pune Escorts Service
 
2024: The FAR, Federal Acquisition Regulations - Part 27
2024: The FAR, Federal Acquisition Regulations - Part 272024: The FAR, Federal Acquisition Regulations - Part 27
2024: The FAR, Federal Acquisition Regulations - Part 27
 
Call Girls In Rohini ꧁❤ 🔝 9953056974🔝❤꧂ Escort ServiCe
Call Girls In  Rohini ꧁❤ 🔝 9953056974🔝❤꧂ Escort ServiCeCall Girls In  Rohini ꧁❤ 🔝 9953056974🔝❤꧂ Escort ServiCe
Call Girls In Rohini ꧁❤ 🔝 9953056974🔝❤꧂ Escort ServiCe
 
EDUROOT SME_ Performance upto March-2024.pptx
EDUROOT SME_ Performance upto March-2024.pptxEDUROOT SME_ Performance upto March-2024.pptx
EDUROOT SME_ Performance upto March-2024.pptx
 
“Exploring the world: One page turn at a time.” World Book and Copyright Day ...
“Exploring the world: One page turn at a time.” World Book and Copyright Day ...“Exploring the world: One page turn at a time.” World Book and Copyright Day ...
“Exploring the world: One page turn at a time.” World Book and Copyright Day ...
 
Artificial Intelligence in Philippine Local Governance: Challenges and Opport...
Artificial Intelligence in Philippine Local Governance: Challenges and Opport...Artificial Intelligence in Philippine Local Governance: Challenges and Opport...
Artificial Intelligence in Philippine Local Governance: Challenges and Opport...
 
VIP Kolkata Call Girl Jatin Das Park 👉 8250192130 Available With Room
VIP Kolkata Call Girl Jatin Das Park 👉 8250192130  Available With RoomVIP Kolkata Call Girl Jatin Das Park 👉 8250192130  Available With Room
VIP Kolkata Call Girl Jatin Das Park 👉 8250192130 Available With Room
 
Greater Noida Call Girls 9711199012 WhatsApp No 24x7 Vip Escorts in Greater N...
Greater Noida Call Girls 9711199012 WhatsApp No 24x7 Vip Escorts in Greater N...Greater Noida Call Girls 9711199012 WhatsApp No 24x7 Vip Escorts in Greater N...
Greater Noida Call Girls 9711199012 WhatsApp No 24x7 Vip Escorts in Greater N...
 

Evaluating tests

  • 1. Moving Beyond “Eeny, Meeny, Miny, Moe”: What Factors Should Guide the Evaluation of Selection Tests John M. Ford CWH Research, Inc.
  • 2. Overview • Critical Assumptions • 7 Important Considerations When Evaluating Selection Tests
  • 3. Critical Assumptions • Tests have value. • All tests are not the same. • Tests are part of an overall organizational environment—Changing to a new test will not magically change your organization by itself.
  • 4. #1: Don’t let the tail wag the dog— Take control of your process • The RFP process is not conducive to making informed decisions. – Don’t let test providers decide what information you should consider. • Evaluating tests requires professional judgment. – You must ask the right questions and evaluate the evidence.
  • 5. #1: Don’t let the tail wag the dog— Take control of your process • Don’t forget future and hidden costs. – Inefficient performance – Increased training/remedial training/retraining – Lawsuits – Turnover – Grievances – Disciplinary problems – Accidents
  • 6. #2: There is no such thing as a valid test “Validity refers to the degree to which evidence and theory support the interpretations of test scores entailed by proposed uses of tests. Validity is, therefore, the most fundamental consideration in developing and evaluating tests. The process of validation involves accumulating evidence to provide a sound scientific basis for the proposed score interpretation of test scores required by proposed uses that are evaluated, not the test itself. When test scores are used or interpreted in more than one way, each intended interpretation must be validated” (Standards for Educational and Psychological Testing, 1999; p. 9).
  • 7. #3: Not all validation evidence is equal • Validity should not be treated as a categorical variable in your decision-making. • Validation evidence should be evaluated along a continuum. • This guideline applies to evidence regarding content relevance (i.e., content validity).
  • 8. Example: Not All Content Relevance Evidence is Equal. [Figure: overlap between the Job Domain and the Test Domain, shown for Test 1 and Test 2.]
  • 9. Corollary: Adverse impact is also not a categorical variable • Adverse impact should also be evaluated on a continuum. – Although they both violate the 4/5ths rule, an AI ratio of .70 is preferable to .20. – Similarly, 1.00 is preferable to .80. • Higher AI ratios bring a variety of benefits: – More diversity in your organization – Greater likelihood of meeting the 4/5ths rule in individual samples – Lower likelihood of grievances, EEOC investigations, lawsuits, and bad press
  • 10. #4: Context matters!!! • Validity cannot be properly evaluated without knowledge of the validation process. – Get the technical report. – Validation study circumstances should match your circumstances. • Every validation study should include a job analysis or analysis of work. – Is the job domain appropriately defined? – Are the job requirements similar to your position?— This is necessary to transport validation evidence. – Are the test components defined in a manner consistent with the job domain?
  • 11. #4: Context matters!!! • Use of test should match your process/needs • Validity coefficients are not an island—they provide very little information without context. – Is the sample appropriate for your agency? – Is the criterion related to important aspects of the job (and your job)? – Is the validity coefficient corrected or uncorrected?
  • 12. #4: Context matters!!! • Don’t forget complexity. – Reading level – Math level – Skills/Abilities level • Context is also important in evaluating adverse impact. – Adverse impact is influenced by factors unrelated to the test. – Consider the sample—Applicant samples are better indicators of adverse impact than incumbent samples.
  • 13. Example: Adverse impact is influenced by factors unrelated to the test • Total sample size • Number of minorities in the sample • Selection ratio • Correlation between predictors
  • 14. Example: AI Ratios From a Single-Hurdle Selection System (N = 200, d = 0.00; Roth, Bobko, & Switzer, 2006). [Figure: probability of violating the 4/5ths rule plotted against selection ratio, for samples with 12% and 20% minority applicants.]
  • 15. Example: Consider the sample when evaluating adverse impact. White-Black SD difference (validation sample, applicant sample): Test #1: -.10, .50; Test #2: .15, .54; Test #3: -.66, .24; Test #4: -.16, .41; Test #5: -.13, .69. • Applicant samples generally demonstrate higher adverse impact than incumbent samples.
  • 16. #5: Beware of small samples • “Ignoring sampling error leads to disastrous results in the area of personnel selection.” (Hunter & Hunter, 1984) • Sampling error occurs due to only sampling part of the entire population – Single studies and/or small samples are not definitive. – Results from single studies and/or small samples are not robust.
  • 17. Example: Sampling Error, Smaller Samples. [Figure: validity coefficients for a single test validated in multiple samples (#1 to #9 and total); all samples > 20 participants.]
  • 18. Example: Sampling Error, Larger Samples. [Figure: validity coefficients for the Next Generation Firefighter/EMS Written Aptitude Test for Client #1, Client #2, Client #3, and the combined sample; all samples > 65 participants.]
  • 19. #5: Beware of small samples • Capitalizing on chance can result in misleading validity coefficients. • Capitalization on chance can occur when: – Final items on test are determined based on validation sample. – Test weights are determined based on validation sample. • You should expect lower validity coefficients in the future under these circumstances.
  • 20. Example: Capitalization on Chance. [Figure: validity of the Next Generation Firefighter/EMS Written Aptitude Test for Client #1, Client #2, Client #3, and the total sample, comparing regression weights based on Client #1 with rational weights.]
  • 21. #5: Beware of small samples • Single studies/small samples can also result in misleading adverse impact ratios. – The 4/5ths rule is not AI. It is an indicator of underlying AI: “The 4/5ths rule merely establishes a numerical basis for drawing an initial inference and for requiring additional information” (Uniform Guidelines, Questions & Answers). • AI ratios can vary substantially over different administrations. Again, results from single studies and/or small samples are not definitive or robust.
  • 22. Example: One Client’s AI Ratios Over Multiple Administrations. Adverse impact ratio by administration: 1 = 1.41, 2 = 0.84, 3 = 0.69, 4 = 0.67, 5 = 0.87, 6 = 0.78, 7 = 0.72, 8 = 0.44. Combined sample = .75.
  • 23. #5: Beware of small samples • When evaluating samples: – More weight should be given to evidence from multiple samples—Cross validation. – More weight should be given to larger samples. – More weight should be given to representative samples. – More weight should be given to results from studies that are developed and weighted using rational models.
  • 24. #6: Don’t forget the O’s • The concept of KSAs has been expanded to KSAOs – O’s = Other Characteristics • Judgment & Common Sense • Interpersonal Skills • Emotional Skills • Leadership • Personality traits or temperaments • Interests • Defining a broader job domain can result in higher validity and lower adverse impact.
  • 25. Example: KSAO Importance Ratings for Firefighter Position. Ratings: Basic Educational Skills 3.8, Emotional Outlook 4.2, Interpersonal Skills 4.2, Practical Skills 4.2. Scale anchors shown: Very Important (essential for successful performance of the job); Critically Important (failure to perform results in extreme negative consequences).
  • 26. Example: Broad Assessments Can Increase Validity & Reduce Adverse Impact. [Figure: two bar charts comparing Basic Educational Skills with the Combined assessment, one for validity and one for the d-statistic. Combined includes Interpersonal Skills, Emotional Outlook, & Practical Skills.]
  • 27. #6: Don’t forget the O’s • Using broader assessments early in the process can result in substantially better hires. – Some agencies administer a narrow test (e.g., basic educational skills) in the first stage and measure a broader range of skills in a later stage (e.g., interview). – This strategy will screen out individuals who are more complete candidates and would be superior employees. – Measuring a broad range of skills can increase the validity (i.e., the quality of the candidate pool) and minimize the AI of your first stage (as well as your total process). – Measuring a broad range of skills early in your process can also reduce the cost of later steps.
  • 28. Example: Which Candidate Would be the Best Hire? Scores (Basic Educational Skills, Interpersonal, Emotional Outlook, Practical): Candidate A: 87, 60, 60, 60; Candidate B: 85, 70, 70, 70; Candidate C: 83, 90, 90, 90.
  • 29. Example: Advantage of Measuring a Broad Range of Skills Early in Process. For each selection ratio (AI ratio with cognitive screen, AI ratio with complete model, % of top candidates screened out by cognitive screen): .20: .32, .32, 68%; .40: .37, .49, 35%; .60: .52, .65, 23%; .80: .63, .85, 12%.
  • 30. #7: Remember to evaluate the pass point • Adverse impact ratios are dependent upon pass points. – Adverse Impact Ratio: A substantially different rate of selection is indicated when the selection rate for a protected group is less than 4/5ths (80%) of the selection rate for the group with the highest selection rate. • Changing the pass point changes the AI ratio. • Make sure the pass point used by the test provider when evaluating adverse impact is similar to your expected pass point.
  • 31. Example: Adverse impact ratios are dependent on pass points. [Figure: overlapping score distributions for the protected group and the majority group, with group means 1 standard deviation apart (d = 1.00); one pass point meets the 4/5ths rule and another fails it.]
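The candidate comparison on this slide can be worked through in a few lines. The sketch below is only an illustration: the scores are the ones on the slide, but the equal weighting of the four components is an assumption made here, not a weighting scheme the presentation specifies.

```python
from statistics import mean

# Scores from the slide, in the order:
# (basic educational skills, interpersonal, emotional outlook, practical)
candidates = {
    "A": (87, 60, 60, 60),
    "B": (85, 70, 70, 70),
    "C": (83, 90, 90, 90),
}


def rank_by(score_fn):
    """Return candidate names ordered from highest to lowest score."""
    return sorted(candidates, key=lambda name: score_fn(candidates[name]), reverse=True)


# Ranking on the narrow (cognitive) component alone.
cognitive_only = rank_by(lambda scores: scores[0])

# Ranking on all four components. Equal weights are an assumption for illustration.
equal_weight = rank_by(mean)

print("Cognitive only:", cognitive_only)
print("Equal-weight composite:", equal_weight)
```

Under these assumed weights the two orderings are exact opposites, so a first-stage cut on the cognitive score alone could remove the candidate who is strongest overall.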
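The dependence of the AI ratio on the pass point can be sketched numerically. The example assumes normally distributed scores with a standard deviation of 1 in both groups and a 1 SD difference in means (d = 1.00, as in the figure); the specific cutoffs and the function name `ai_ratio_at_cutoff` are hypothetical choices for illustration.

```python
from statistics import NormalDist


def ai_ratio_at_cutoff(cutoff, d=1.0):
    """AI ratio at a pass point, assuming normal scores with SD 1.

    The majority group has mean 0 and the protected group has mean -d,
    so `cutoff` is expressed in SD units relative to the majority mean.
    """
    majority = NormalDist(mu=0.0, sigma=1.0)
    protected = NormalDist(mu=-d, sigma=1.0)
    majority_rate = 1.0 - majority.cdf(cutoff)    # share of majority group passing
    protected_rate = 1.0 - protected.cdf(cutoff)  # share of protected group passing
    return protected_rate / majority_rate


for cutoff in (-3.0, -2.0, -1.0, 0.0, 1.0):
    ratio = ai_ratio_at_cutoff(cutoff)
    status = "meets" if ratio >= 0.8 else "fails"
    print(f"cutoff {cutoff:+.1f} SD: AI ratio {ratio:.2f} ({status} the 4/5ths rule)")
```

With the same test and the same group difference, a very low pass point meets the 4/5ths rule while a pass point at or above the majority mean does not, which is why a provider's reported AI ratio is only informative if its pass point resembles yours.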
  • 32. #7: Remember to evaluate the pass point • Remember that your process may have multiple pass points. – Those that pass test – Those that are ultimately hired • Although your initial pass point may meet the 4/5ths rule, the rank-order is a critical consideration.
  • 33. Example: Rank order is a critical consideration. [Figure: overlapping score distributions for the protected group and the majority group, with group means 1 standard deviation apart (d = 1.00), marked with an initial pass point and a higher final pass point.]
  • 34. Example: Rank order impacts AI Ratio of your ultimate pass point. Selection process results: 10 of 70 W pass, 4 of 30 B pass; W pass ratio = 14.3%, B pass ratio = 13.3%; AI Ratio = 0.93. Ranked passers (rank, score, race): 1, 92, W; 2, 88, W; 3, 87, W; 4, 86, B; 5, 81, W; 6, 80, W; 7, 79, W; 8, 78, B; 9, 77, W; 10, 76, B; 11, 75, B; 12, 72, W; 13, 71, W; 14, 70, W. Hiring the top 10: 7 W, 3 B; hire ratios W = 10%, B = 10%; hire ratio AI = 1.0; conclusion: no adverse impact. Hiring the top 5: 4 W, 1 B; hire ratios W = 5.7%, B = 3.3%; hire ratio AI = 0.58; conclusion: adverse impact.
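The arithmetic on this slide can be reproduced with a short script. This is a minimal sketch using the slide's own figures (70 White and 30 Black applicants, 14 ranked passers); the helper name `ai_ratio` is hypothetical, and it compares the Black selection rate with the White selection rate as the slide does.

```python
# Race codes of the 14 candidates who passed, in rank order, as listed on the slide.
passers = list("WWWBWWWBWBBWWW")

# Applicant counts from the slide.
APPLICANTS = {"W": 70, "B": 30}


def ai_ratio(selected, applicants=APPLICANTS):
    """Selection rate of Black applicants divided by that of White applicants."""
    rates = {group: selected.count(group) / total for group, total in applicants.items()}
    return rates["B"] / rates["W"]


pass_ai = ai_ratio(passers)        # everyone who passed the test
top10_ai = ai_ratio(passers[:10])  # hiring the 10 highest-ranked passers
top5_ai = ai_ratio(passers[:5])    # hiring the 5 highest-ranked passers

for label, value in (("passed", pass_ai), ("top 10 hired", top10_ai), ("top 5 hired", top5_ai)):
    print(f"{label}: AI ratio = {value:.2f}")
```

The same ranked list gives an AI ratio above .80 at the initial pass point and when 10 are hired, but well below .80 when only 5 are hired, so the rank order matters as much as the initial cut.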
  • 35. 7 Critical Considerations When Evaluating Selection Tests 1. Don’t let the tail wag the dog—Take control of your process. 2. There is no such thing as a valid test. 3. Not all validity evidence is equal. 4. Context matters!!! 5. Beware of small samples. 6. Don’t forget the O’s 7. Remember to evaluate the pass point.

Editor's Notes

  1. Tests have value. Hopefully, since this is IPMAAC, everyone agrees with this. Some agencies I talk to seem to view tests as the enemy: something they are required to do and must get through as painlessly as possible. They couldn’t care less about validity; they just want something easy, cheap, and that will not get them sued. Tests are a wonderful opportunity to help your organization. Using the right test can increase: Job Performance, Efficiency, Diversity, Training Success, Ability to Attain Organizational Goals, & Image in the Community. Using the right test can also reduce Lawsuits, Turnover, Grievances, Disciplinary Problems, Accidents, Employee Costs, Time Spent on Tasks, & Remedial/Repeat Training. All tests are not the same. Some agencies treat tests as interchangeable. They simply select the cheapest or most convenient one. But choosing the right (or the wrong) test can have an important impact on your organization. That being said, tests are part of an overall organizational environment. When there are organizational problems, tests are an easy scapegoat (or an assumed savior, which sets them up to be the scapegoat when expectations aren’t met). However, if you don’t recruit good candidates, train them appropriately, and reward desired behaviors (and discourage undesired behaviors), the best test in the world will not solve your problems.
  2. When test providers write proposals, they are not necessarily focused on helping you make an informed choice—they are trying to get you to purchase their test. Their goal is to emphasize their test’s strengths and minimize (or hide) their test’s weaknesses. This often leads to incomplete, vague, mismatched, and sometimes misleading information, which makes an informed choice next to impossible. You must take control of the process. This is your process. It is your responsibility to ensure that test providers give you the information you want. You must ask the right questions. Professional Judgment. Hopefully, you didn’t come here hoping that this presentation will make selecting a test easier. If anything, I am going to stress the importance of putting more thought and effort into this process. Although I will provide a checklist of things to consider, this is not a magical formula. Although this table can help you organize the information you need, it is not simply a matter of counting the positives or check marks. Not all of the questions are equally important and it is unlikely that any test will have a strong response to every question. And none of the questions are meant to be a decision rule—they are simply designed to help you determine the strength of the validation evidence presented. If a test provider cannot appropriately provide a good answer for one of these questions, it doesn’t mean that they have a bad test or the test isn’t valid. It means that you should take that into account when considering the utility and value of the test. You must use your professional judgment to evaluate the psychometric properties and practical realities of the tests to come to the right decision.
  3. Although we often refer to tests as being valid, technically this is not accurate. Validity actually refers to the uses and interpretations of a test; it is not a property of the test. A test that is valid in one context may not be valid in another (e.g., our entry-level firefighter test is valid for selecting new firefighters, but not for selecting the second baseman of the New York Mets). You must evaluate whether the test is valid for the purpose for which you intend to use it.
  4. There was an interesting discussion on the IPMAAC listserv earlier this year about whether criterion-related validity or content validity is superior. However, I don’t think you can answer that question without actually evaluating the evidence. The question you should ask when evaluating validation evidence is how strongly does it support the inferences that I intend to draw with the test. For both criterion-related studies and content studies, some evidence will be more persuasive than others. Although I am a stats guy and always want to do criterion-related studies whenever possible, I would rather use a test based on a strong content study than a weak criterion-related study. Although we intuitively recognize that criterion-related validation evidence runs along a continuum, this is true of content validity as well. “Even when the validation strategy used does not involve empirical predictor-criterion linkages, such as when a user relies on test content to provide validation evidence, there is still an implied link between the test score and a criterion” (Principles). Some agencies seem to say, well both tests are valid, so we will take the cheaper one. But it is important to compare the strength of the content validity for the tests. Some content validation studies provide much stronger evidence of validity than others.
  5. Technically, both of these tests demonstrate content validity. They both measure constructs that are within the job domain. However, the evidence supporting the validity of Test 2 is much stronger. It would be a mistake to treat them as equal in terms of their validity. Test 2 is likely to have much more utility and more positive results for your organization than Test 1. Do not assume that, because both have some evidence of validity, you can move on and use another factor to make the decision.
  6. The 4/5ths rule does not mean that 20% discrimination is OK.
  7. This may be the most important take-home message from this presentation. You cannot simply compare two validity coefficients and select the highest one. For example, I was once told that an agency selected a certain test to select firefighters because it had the highest validity. When I saw the proposals, it was true that the selected test had the highest validity coefficient. However, the validity study examined unskilled workers and the criterion was whether the workers were still on the job after 3 months. I submit that this is not a very valid test for selecting firefighters. To properly evaluate validity, you must know the process the test provider went through. You should always ask for a copy of the technical report so that you can properly evaluate the validation evidence. The validity evidence should be based on the same job (or a very similar one), should come from a study conducted in a manner consistent with how you intend to use the test, and should predict relevant job outcomes (if it is criterion-related). If any of these are untrue, the validation evidence for your use of the test is substantially weakened.
  8. Complexity should be evaluated. For example, a job analysis for an HR Analyst might identify math as a critical KSAO. However, a test that assessed simple addition, although a math test, would not be a valid test. Example of tests for Accounting Assistant, Accountant, Sr. Accountant, and Auditor. Accounting skills are critical for each of these positions. However, a test that assessed accounting skills at the level of an auditor would not be valid for the accounting assistant position, and a test that assessed accounting skills at the level of an accounting assistant would not be valid for the auditor position.
  9. Selection ratio: 0.1, 0.3, 0.5, 0.7, 0.9. 12% Minority: 0.23, 0.19, 0.11, 0.07, 0.01. 20% Minority: 0.37, 0.16, 0.08, 0.05, 0.
  10. Test #1—SS Fire (Incumbent, W = 654, B = 49; Applicant, W = 45,641, B = 9,090) Test #2-SS Law (Incumbent, W = 411, B = 35; Applicant, W = 15,248, B = 6,401) Test #3—RISP (Incumbent, W = 88, B = 8; Applicant, W = 1,203, B = 120) Test #4—NG MCFRS (Incumbent, W = 43, B = 31; Applicant, W = 879, B = 164) Test #5—NG Den (Incumbent, W = 58, B = 17; Applicant, W = 669, B = 138)
  11. Beware of tests that have only been sampled in one place and that are based on small samples. Do not treat the validity coefficient from a single study as fact. Regardless of the magnitude of the coefficient, it should not carry as much weight as a test that has been validated in several places or with large samples. This does not mean that a single validity study should be ignored. It just doesn’t carry the same weight as multiple validation samples.
  12. Here is an example of a test that was validated in 17 different agencies, 9 of which had samples larger than 20 participants. As you can see, the validity coefficients ranged from .03 to .52. If you were to make decisions based only on the first validity coefficient (.51) or the second coefficient (.03), your decision would be based on incomplete or biased information. Personally, I would give more weight to the .27 coefficient based on 469 in the total sample than to the .51 from a sample of 36. There is a danger that a test provider might only provide information on Sample #1 or cherry-pick validation studies. You should ask test providers to provide information for every study that has been conducted (or on a combined sample). Selection Solutions, Corr Valid Coeff (samples #1 to #9, then total): 0.67, 0.04, 0.14, 0.49, 0.52, 0.18, 0.29, 0.52, 0.05, 0.35. N: 36, 58, 27, 51, 28, 36, 29, 67, 23, 469.
  13. Even with larger samples, there will be variance in validation results. Again, the combined sample provides the best estimate of validity. Adj Val Coeff: Client #1 = 0.47, Client #2 = 0.26, Client #3 = 0.72, Combined Sample = 0.48. N: Client #1 = 65, Client #2 = 96, Client #3 = 69, Combined Sample = 230.
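One way to see why the .27 from 469 participants deserves more weight than the .51 from 36 is to put an approximate confidence interval around each. The sketch below uses the Fisher z-transformation, a standard textbook approximation that is brought in here for illustration and is not a method the presentation itself describes; the function name `correlation_ci` is hypothetical.

```python
import math


def correlation_ci(r, n, z_crit=1.96):
    """Approximate 95% confidence interval for a correlation via Fisher's z.

    Assumes a simple random sample of size n and an uncorrected coefficient r.
    """
    z = math.atanh(r)                 # Fisher z-transform of r
    se = 1.0 / math.sqrt(n - 3)       # approximate standard error of z
    lower = math.tanh(z - z_crit * se)
    upper = math.tanh(z + z_crit * se)
    return lower, upper


small_sample = correlation_ci(0.51, 36)    # first sample quoted in the note
total_sample = correlation_ci(0.27, 469)   # combined sample quoted in the note

for label, (lower, upper) in (("r = .51, N = 36", small_sample),
                              ("r = .27, N = 469", total_sample)):
    print(f"{label}: approx. 95% CI {lower:.2f} to {upper:.2f}")
```

Under these assumptions the interval around the small-sample coefficient is several times wider than the one around the combined-sample coefficient, which is the practical meaning of "single studies and/or small samples are not definitive."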
  14. If the final test is developed based on the validation sample and/or the weighting of the test components is based on the sample, you should expect lower validity coefficients in future samples. As you can see, doing so results in higher validity in the initial sample, but lower validity in future samples. If the validation study capitalized on chance, you should not expect the validity evidence to be quite as strong for your agency.
  15. Combined Sample (3080 Candidates/1751 Whites/480 Blacks) = .75. This cut score is higher than the one actually used by the client. Only administrations with at least 10 Black Candidates were included. 1=(6/06, 198/85/52), 2 = (6/06, 212/82/45), 3=(6/06, 424/222/82), 4=(4/04, 522/261/103), 5=(5/03, 337/142/77), 6=(6/02, 297/144/56), 7=(9/01, 213/87/61), 8=(4/01, 136/105/10). Decisions based on small samples (specifically administrations 1 & 8) would rest on incomplete or faulty information.
  16. Knowledge--Body of information Skills--Proficiency at performing a job task Abilities--Mental or physical capability needed to perform tasks
  17. N = 926. Agencies: PG County, Montgomery County, Denver, Tuscaloosa, Columbus, Austin. Here, the O’s are rated as the most important parts of the job.
  18. Validity: .26, .39. AI: .90, .45.
  19. Knowledge--Body of information Skills--Proficiency at performing a job task Abilities--Mental or physical capability needed to perform tasks
  20. Depending on the cut score, using a cognitive test as the first screen could screen the best candidate out of the process. If final ranking is based on cognitive, you would hire candidate A. If final ranking is based on interview, you would probably hire candidate B.
  21. Based on Next Gen Applicant Sample. Using a cognitive screen as a first step results in higher AI (compared to using the complete model) and screens out a high percentage of individuals that the complete model indicates would be the top candidates. In other words, at a selection ratio of 20%, selecting the top 20% of cognitive scores would exclude 68% of the individuals that the complete model indicates would ultimately be in the top 20%. They never make it to the end of the process.
  22. Stop and discuss the long-term implications of rank order and AI. If everyone who passes is eventually hired, does the rank order matter? What are the implications of rank order from a documentation and tracking perspective?