VALIDITY
BY NATASHA VELÁSQUEZ MANRIQUE
INTRODUCTION
 1. Content validity
 2. Criterion-related validity
 3. Other forms of evidence for construct validity
 4. Validity in scoring
 5. Face validity
 6. How to make tests more valid
Content validity
 It refers to how accurately an assessment or measurement tool
taps into various aspects of the specific construct in question. In
other words, do the questions really assess the construct in
question?
 A test needs to be related to the content of the class (relevant
content).
 HOW TO JUDGE IT? We need a specification of the skills or
structures that the test is meant to cover.
 Not all the course content needs to appear in the test.
The importance of content validity
 Language specifications provide the test constructor with a basis
for making a principled selection of elements to include in the test.
 A comparison between test specification and test content is the
basis for judgments related to content validity.
 The greater a test’s content validity, the more likely it is to be an
accurate measure of what it is supposed to measure.
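The comparison between test specification and test content can be made concrete. The sketch below is only illustrative: the skill labels and item tags are hypothetical placeholders, not taken from the CEFR or from any real test. It lists which specified skills the test items cover, which they miss, and which items fall outside the specification.

```python
# Hypothetical specification: the skill labels are placeholders, not real CEFR descriptors.
specification = {"greetings", "numbers", "present simple", "personal information"}

# Hypothetical test: each item is tagged with the skill it is meant to assess.
test_items = {
    "item 1": "greetings",
    "item 2": "numbers",
    "item 3": "past perfect",  # not in the specification
    "item 4": "numbers",
}

def compare_with_specification(spec, items):
    """Compare the skills tested by the items with the skills in the specification."""
    tested = set(items.values())
    covered = spec & tested      # specified skills that at least one item tests
    missing = spec - tested      # specified skills that no item tests
    off_spec = tested - spec     # tested skills that the specification does not list
    coverage = len(covered) / len(spec)
    return covered, missing, off_spec, coverage

covered, missing, off_spec, coverage = compare_with_specification(specification, test_items)
print("Covered:", sorted(covered))
print("Missing:", sorted(missing))
print("Outside the specification:", sorted(off_spec))
print("Share of specified skills covered:", coverage)
```

A share below 1 is not a flaw in itself, since not all the course content needs to appear in the test. The judgment is whether the covered skills are a principled, representative selection.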
ACTIVITY 1
1. According to the CEFR, which specification of the language
skills would you need to take into account to test an A1 student?
PLEASE USE YOUR GADGETS TO ACCESS THE CEFR
LANGUAGE SPECIFICATION.
2. Do you think teachers care about those specifications while
testing a student?
Criterion-related validity
 It is the degree to which test results agree with those provided by
some independent and highly dependable assessment of the
candidate’s ability.
 The independent assessment is the criterion measure against
which the test is validated.
 TWO TYPES: CONCURRENT VALIDITY and PREDICTIVE VALIDITY
CONCURRENT VALIDITY
 It refers to the extent to which the results of a particular test or
measurement correspond to those of a previously established
measurement for the same construct.
 Is it possible to test everything you need to test in a short time?
 This will always depend on how many functions are tested in the
component, and how representative they are of the complete
set of functions included in the objectives.
How is the level of agreement measured?
 Using the “correlation coefficient”. This is a mathematical measure
of similarity.
 Perfect agreement = 1
 Total lack of agreement = 0
Whether the level of agreement is regarded as satisfactory depends on
the purpose of the test and the decisions that are made based on it.
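As a minimal sketch of how such a coefficient can be computed, the code below uses the Pearson correlation coefficient. This is an assumption: the slides only say “correlation coefficient” and do not name a specific one. The scores are invented for illustration. Note that the Pearson coefficient can also be negative (down to -1) when high scores on one measure go with low scores on the other.

```python
from math import sqrt

def pearson_correlation(test_scores, criterion_scores):
    """Pearson correlation between two equally long lists of scores."""
    n = len(test_scores)
    if n != len(criterion_scores) or n < 2:
        raise ValueError("need two lists of equal length with at least two scores")
    mean_x = sum(test_scores) / n
    mean_y = sum(criterion_scores) / n
    # Deviations of each score from the mean of its own list.
    dev_x = [x - mean_x for x in test_scores]
    dev_y = [y - mean_y for y in criterion_scores]
    covariance = sum(dx * dy for dx, dy in zip(dev_x, dev_y))
    spread = sqrt(sum(dx * dx for dx in dev_x) * sum(dy * dy for dy in dev_y))
    if spread == 0:
        raise ValueError("correlation is undefined when all scores in a list are equal")
    return covariance / spread

# Invented scores for five candidates (illustration only).
new_test = [10, 12, 14, 16, 18]    # test being validated
criterion = [20, 24, 28, 32, 36]   # criterion measure; ranks the candidates identically
print(pearson_correlation(new_test, criterion))
```

Here the criterion scores are exactly double the test scores, so the two measures agree perfectly. With real data the coefficient falls somewhere below that, and the purpose of the test determines how high it needs to be.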
PREDICTIVE VALIDITY
 This topic concerns the degree to which a test can predict a
candidate’s future performance.
 How helpful is it to use final outcomes as the criterion measure
when so many factors other than ability in English (such as
subject knowledge, intelligence, motivation, health and
happiness) will have contributed to every outcome?
 Example: placement tests
Other forms of evidence for construct validity
 We cannot be sure that the items of the test are measuring what
we expect them to measure.
 Construct validity: “construct” refers to any underlying ability
that is hypothesized in a theory of language ability.
 It is important to establish if distinct abilities exist, if they can be
measured and if they are measured in a test.
 Research is needed for evidence.
Another way of obtaining evidence about the construct validity of a
test is to investigate what test takers actually do when they respond
to an item.
TWO PRINCIPAL METHODS:
THINK ALOUD
Test takers voice their thoughts as they respond to the item.
Problem: The very voicing of thoughts may interfere with what would
be the natural response to the item.
RETROSPECTION
Test takers try to recollect what their thinking was as they responded.
Problem: Thoughts may be forgotten.
VALIDITY IN SCORING
 It is worth pointing out that if a test is to have validity, not only
must the items be valid; the way in which the responses are scored
must also be valid.
 If the test is meant to test one specific skill, but another skill
interferes with the scoring process during grading, then the test is
not valid.
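The difference can be sketched with a small scoring example. Everything in it is hypothetical: the rater judgments, the field names and the penalty size are invented for illustration and do not describe any real marking scheme. The first function scores a reading comprehension answer on comprehension alone; the second also deducts for spelling, so a writing skill interferes with a reading score.

```python
# Hypothetical rater judgments for three reading comprehension answers.
answers = [
    {"shows_comprehension": True, "spelling_errors": 2},
    {"shows_comprehension": True, "spelling_errors": 0},
    {"shows_comprehension": False, "spelling_errors": 0},
]

def score_reading_only(answer):
    """Award the point for comprehension alone; spelling is ignored."""
    return 1.0 if answer["shows_comprehension"] else 0.0

def score_with_spelling_penalty(answer, penalty=0.25):
    """Deduct for spelling, so a second skill (writing) leaks into the reading score."""
    base = 1.0 if answer["shows_comprehension"] else 0.0
    return max(0.0, base - penalty * answer["spelling_errors"])

valid_total = sum(score_reading_only(a) for a in answers)
penalized_total = sum(score_with_spelling_penalty(a) for a in answers)
print(valid_total, penalized_total)
```

The first candidate understood the text but loses half a point under the second scheme, so the penalized total no longer reflects reading ability alone.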
ACTIVITY 2
To discuss:
 Do the PUCE language tests lack scoring validity? Yes or no?
 So, why do teachers take away points when they are grading the
reading and comprehension part?
 What is the reason to do so? Do you think this is fair for the
student?
FACE VALIDITY
 A test is said to have face validity if it looks as if it measures what
it is supposed to measure.
 A test which does not have face validity may not be accepted by
candidates, teachers, education authorities or employers.
How to make tests more valid
 Write explicit specifications for the test which take into account all
that is known about the constructs that are to be measured.
 Make sure that the test includes a representative sample of the
content of these specifications.
 Use direct testing.
 Make sure that the scoring of responses relates directly to what is
being tested.
 Do everything possible to make the test reliable. If the test is not
reliable, it cannot be valid.
ACTIVITY 3
 In pairs, answer the following test. Then check whether it has
enough validity to say that the student knows all the content of the A1
level.
 Would you say that the test is measuring the most important
content of the A1 level?
 Which content would you test instead, and why?
Conclusion
 It is important to take into account the language specifications.
 It is necessary to do research to say that a test has content validity.
