Test Review Systems in Sweden

Anders Sjöberg
Associate Professor
Department of Psychology

Outline
• The background and development of two test review
systems
• Advantages and disadvantages of each system are
presented from the perspective that validity is (or is
not) a characteristic of a test
• Example of a selection process validity study
• Questions

Swedish Psychological
Association
• The Swedish Psychological Association is the union and
professional organization for the country’s
psychologists
• One of its tasks is to review psychological
assessments used in the work and
organizational area, such as personality and
cognitive ability tests used for selection and
development

Swedish National Board of
Health and Welfare (NBHW)
• The National Board of Health and Welfare is a
government agency in Sweden under the Ministry of
Health and Social Affairs
• One of its tasks is to review different types of
psychological assessments, such as those used to
detect violence in marriage, abuse of alcohol, and
other health-related problems

SPA Review Model
Procedure
• The SPA Review Model for the Description and
Evaluation of Psychological Tests is a procedure
that employs two anonymous reviewers for each
test review, with a third person (the Consulting
Editor) overseeing the review

Swedish Psychological
Association
• EFPA Review Model for the Description and
Evaluation of Psychological Tests. Version 3.42,
(2008)

EFPA Review Model Sources
• British Psychological Society (BPS) Test Review
Evaluation Form
• The Spanish Questionnaire for the Evaluation of
Psychometric Tests (Spanish Psychological
Association)
• The Rating System for Test Quality produced by the
Committee on Testing of the Dutch Association of
Psychologists
• Standards for Educational and Psychological Testing
(American Educational Research Association [AERA],
American Psychological Association [APA], and National
Council on Measurement in Education [NCME])

EFPA Validity
• The framework used to operationalize validity is based on
the Standards for Educational and Psychological Tests
(APA, AERA, NCME, 1954). This
conceptualization of validity holds that there are
three approaches to the validation of tests:
• Content validation (demonstration that test items are
a representative sample of the behaviors)
• Criterion-related validation (demonstration that
scores on a test are related to an outcome)
• Construct validation (collection of evidence that a
psychological concept or construct explains test
performance)

Practice
• EFPA Review Model for the Description and
Evaluation of Psychological Tests. Version 3.42
(2008)

2.10.1 Construct Validity - Overall Adequacy
(This overall rating is obtained by using judgment based on the ratings given for items 2.10.1.2 –
2.10.1.6. Do not simply average numbers to obtain an overall rating.)
2.10.1.2 Sample sizes:
[-2] No information given.
[-1] One inadequate study (e.g. sample size less than 100).
[ 0] One adequate study (e.g. sample size of 100-200).
[ 1] More than one adequate or large sized study.
[ 2] Good range of adequate to large studies.
2.10.1.4 Median and range of the correlations between the test and other similar tests:
[-1] Inadequate (r < 0.55).
[ 0] Adequate (0.55 < r < 0.65).
[ 1] Good (0.65 < r < 0.75).
[ 2] Excellent (r > 0.75).
2.10.1.5 Quality of instruments as criteria or markers:
[-1] Inadequate information given.
[ 0] Adequate quality.
[ 1] Good quality.
[ 2] Excellent quality with wide range of relevant markers for convergent and divergent validation.
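The cut-offs in item 2.10.1.4 can be read as a simple lookup from a median correlation to a rating. The sketch below is only an illustration: the function name and labels are hypothetical, and because the form leaves the exact boundary values (0.55, 0.65, 0.75) unassigned, the code assumes each boundary belongs to the higher category.

```python
# Illustrative mapping for item 2.10.1.4 (median correlation with similar tests).
# Assumption: boundary values (0.55, 0.65, 0.75) fall into the higher category;
# the form itself uses strict inequalities and does not say.

RATING_LABELS = {-1: "Inadequate", 0: "Adequate", 1: "Good", 2: "Excellent"}


def rate_median_correlation(median_r: float) -> int:
    """Return the 2.10.1.4 rating (-1 to 2) for a median correlation."""
    if median_r < 0.55:
        return -1
    if median_r < 0.65:
        return 0
    if median_r < 0.75:
        return 1
    return 2


for r in (0.40, 0.60, 0.70, 0.80):
    rating = rate_median_correlation(r)
    print(f"r = {r:.2f}: [{rating}] {RATING_LABELS[rating]}")
```

A lookup like this covers only one item; the form states that the overall rating for 2.10.1 is a judgment and not an average of the item ratings.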

Swedish National Board of
Health and Welfare (NBHW)
• Standards for Educational and Psychological Testing
(American Educational Research Association [AERA],
American Psychological Association [APA], and National
Council on Measurement in Education [NCME])
• Buros Center for Testing

NBHW Procedure
• The NBHW Test Review Model for the Description and
Evaluation of Assessment is a procedure that
employs two anonymous reviewers for each
assessment review, with one person (the Consulting
Editor) overseeing the review

NBHW Validity
• Validity is defined as the degree to which evidence and
theory support the interpretation of assessment scores
proposed by the service provider of the assessment.
• Instead of talking about different kinds of validity, the
service provider of the assessment must state explicitly
what interpretations are to be derived from a set of scores
and how to use these scores for decision making.
• In this way, the strength of the validity evidence refers to
the probability that the inference is correct.
• Thus, it is critical for service providers of the assessment
designing and conducting validation studies to concentrate
their efforts on ensuring evidence for the inferences they
wish to make, in much the same way that they would
otherwise “defend” their conclusions in a
hypothesis-testing situation.

Practice
• NBHW test Review Model for the Description and
Evaluation of Assessment

Validity
The process of validation involves accumulating evidence to provide a sound scientific basis for the
proposed score interpretations. It is the interpretations of assessment scores required by proposed
uses that are evaluated, not the assessment itself. When test scores are used or interpreted in more
than one way, each intended interpretation must be validated.
Evidence that the interpretations of the assessment scores are correct.
Describe the validity studies
X Evidence that the interpretation of the results is correct cannot be evaluated because information is
lacking or insufficient
X Evidence that the interpretation of the results is correct should be revised and clarified
X Evidence that the interpretation of the results is correct should be supplemented
X Evidence that the interpretation of the results is correct is good
Justification of the rating:
Proposals

As a reviewer
Validity of a test
• Easy to evaluate
• Concentrates on statistics
Validity of the use of a test score
• Difficult to evaluate
• Concentrates on content and evidence

As a client
Validity of a test
• Difficult to evaluate
• Concentrates on statistics
Validity of the use of a test score
• Easy to evaluate
• Concentrates on content and evidence

Selection practice
• SPA model - psychometric properties of the test
• NBHW model – the selection process and decision

Example Selection
• Organization A uses an intelligence test in the selection process
(N=200)
• Organization B uses an intelligence test in the selection process
(N=200)

Results based on the validity
argument
                    Test score
Performance      Low       High
Low               85         15
High              15         85
r = .70
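The reported r = .70 can be reproduced as a phi coefficient for a 2×2 table. The sketch below assumes the cell layout (85 cases in each matching cell: low score with low performance, high score with high performance; 15 in each mismatching cell) and assumes phi is the statistic behind the slide's r; neither is stated explicitly in the slides, and the function name is illustrative.

```python
import math


def phi_coefficient(a: int, b: int, c: int, d: int) -> float:
    """Phi coefficient for the 2x2 table [[a, b], [c, d]].

    a = low score / low performance,  b = high score / low performance,
    c = low score / high performance, d = high score / high performance.
    """
    numerator = a * d - b * c
    denominator = math.sqrt((a + b) * (c + d) * (a + c) * (b + d))
    return numerator / denominator


# Assumed cell counts (N = 200): 85 on the diagonal, 15 off the diagonal.
phi = phi_coefficient(a=85, b=15, c=15, d=85)
print(round(phi, 2))  # 0.7
```

With equal margins of 100, the calculation reduces to (85·85 − 15·15) / 10000 = 0.70.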

Question and Analysis
• The relationship between the test score and the
selection decision (Not selected or Selected)
• Is the selection decision based on the intelligence score?

Organization A
                    Test score
Decision         Low       High
Not selected      60         40
Selected          40         60
r = .20

Organization B
                    Test score
Decision         Low       High
Not selected      95          5
Selected           5         95
r = .90
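The contrast between the two organizations can be checked by expanding each 2×2 table into case-level data and computing an ordinary Pearson correlation between test score (0 = low, 1 = high) and decision (0 = not selected, 1 = selected). This is a sketch under assumptions: the cell layout is inferred from the reported correlations, and all names in the code are illustrative.

```python
from math import sqrt


def pearson_r(x, y):
    """Plain Pearson correlation between two equal-length numeric lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    var_x = sum((xi - mean_x) ** 2 for xi in x)
    var_y = sum((yi - mean_y) ** 2 for yi in y)
    return cov / sqrt(var_x * var_y)


def expand(table):
    """Turn {(score, decision): count} into paired lists of 0/1 values."""
    scores, decisions = [], []
    for (score, decision), count in table.items():
        scores.extend([score] * count)
        decisions.extend([decision] * count)
    return scores, decisions


# Assumed layout: keys are (test score, decision), values are head counts.
organizations = {
    "A": {(0, 0): 60, (1, 0): 40, (0, 1): 40, (1, 1): 60},
    "B": {(0, 0): 95, (1, 0): 5, (0, 1): 5, (1, 1): 95},
}

results = {}
for name, table in organizations.items():
    scores, decisions = expand(table)
    results[name] = round(pearson_r(scores, decisions), 2)
    print(f"Organization {name}: r = {results[name]:.2f}")
```

Under these assumptions the same test yields a decision correlation of .20 in Organization A and .90 in Organization B, which is the point of the example: the test is identical, but only Organization B's decisions actually follow the test score.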

Conclusions
• Psychometric quality is important but not sufficient
to ensure good test use
• Both psychometric quality and practical use of the
test score should be included as criteria in the
review models
• Start to discuss the validity definition in your test-
review models

EFPA Version 3.3: November 2004
• When judging overall validity, it is important to bear in
mind the importance placed on construct validity as
the best indicator of whether a test measures what it
claims to measure. In some cases, the main evidence
of this could be in the form of criterion-related studies.
Such a test might have an ‘adequate’ or better rating
for criterion-related validity and a less than adequate
one for construct validity. In general, if the evidence of
criterion-related validity or the evidence for construct
validity is at least adequate, then, by implication, the
overall rating must also be at least adequate. It
should not be regarded as an average or as the
lowest common denominator.
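The rule in this passage can be expressed as a consistency check on a reviewer's overall rating. The sketch below is one possible reading, not part of the EFPA model: it assumes the numeric scale from the review form (0 = adequate) and uses a hypothetical function name.

```python
ADEQUATE = 0  # assumed scale: -1 inadequate, 0 adequate, 1 good, 2 excellent


def overall_validity_is_consistent(criterion: int, construct: int, overall: int) -> bool:
    """Check an overall validity rating against the rule quoted above.

    If either criterion-related or construct validity is at least adequate,
    the overall rating must also be at least adequate. Otherwise the rule
    places no constraint on the overall rating.
    """
    if max(criterion, construct) >= ADEQUATE:
        return overall >= ADEQUATE
    return True


# Example from the text: good criterion-related evidence, weak construct evidence.
criterion_rating, construct_rating = 1, -1
print(overall_validity_is_consistent(criterion_rating, construct_rating, overall=-1))  # False
print(overall_validity_is_consistent(criterion_rating, construct_rating, overall=0))   # True
```

Taking the lower of the two ratings (the "lowest common denominator") would give an overall rating of -1 here, which is exactly what the rule rules out.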
