5. Reliability
The degree to which a measure is consistent or dependable;
the degree to which it would give you the same result over
and over again, assuming the underlying phenomenon is
not changing.
True score theory
Every measurement is an additive composite of two components:
the true ability of the respondent and random error.
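True score theory can be sketched numerically. This is a minimal simulation with made-up numbers (a hypothetical true score of 70 and normally distributed error), showing that random error varies from measurement to measurement but has no effect on the average score:

```python
import numpy as np

rng = np.random.default_rng(0)

n = 10_000                          # hypothetical number of repeated measurements
true_score = 70.0                   # hypothetical true ability of the respondent
error = rng.normal(0, 5, size=n)    # random error: mean zero, no systematic bias
observed = true_score + error       # true score theory: observed = true + error

# Random error is noise: individual measurements vary,
# but the error cancels out in the average.
print(round(observed.mean(), 1))    # close to the true score of 70
```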
6. Reliability
Random Error
Random error is caused by any
factors that randomly affect
measurement across the sample
(e.g. mood)
No consistent effects across the
sample; no effect on the average
score.
Random error is sometimes considered noise.
7. Reliability
Systematic Error
Systematic error is caused by any
factors that systematically affect
measurement of the variable across
the sample (e.g. cheating)
Systematic errors tend to be
consistently either positive or
negative
Systematic error is sometimes considered to be bias in measurement.
10. Reliability
Inter-Rater or Inter-Observer Reliability
Used to assess the degree to which different raters/observers give
consistent estimates of the same phenomenon.
12. Reliability
Internal Consistency Reliability
Used to assess the consistency of results across items within a test.
Average Inter-item Correlation
The average inter-item correlation uses all of the items on our instrument that are
designed to measure the same construct.
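As an illustration, here is one way to compute the average inter-item correlation on simulated data. The respondent count, the six items, and the way they are generated from a shared latent construct are all hypothetical:

```python
import numpy as np
from itertools import combinations

rng = np.random.default_rng(1)

# Hypothetical data: 50 respondents, 6 items tapping one construct.
latent = rng.normal(size=50)
items = np.column_stack(
    [latent + rng.normal(0, 0.8, size=50) for _ in range(6)]
)

r = np.corrcoef(items, rowvar=False)      # 6 x 6 correlation matrix
pairs = list(combinations(range(6), 2))   # 15 unique item pairings
avg_r = np.mean([r[i, j] for i, j in pairs])

print(len(pairs))                         # 15 pairings for 6 items
print(round(avg_r, 2))
```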
13. Reliability
Internal Consistency Reliability
Used to assess the consistency of results across items within a test.
Average Item-total Correlation
The item total correlation is the correlation between an individual item and the total
score without that item.
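A sketch of the same idea in code, using the simulated-data setup above (all numbers hypothetical): for each item, correlate it with the total score computed without that item.

```python
import numpy as np

rng = np.random.default_rng(2)
latent = rng.normal(size=50)                 # hypothetical shared construct
items = np.column_stack(
    [latent + rng.normal(0, 0.8, size=50) for _ in range(6)]
)

total = items.sum(axis=1)
for k in range(items.shape[1]):
    rest = total - items[:, k]               # total score WITHOUT item k
    r = np.corrcoef(items[:, k], rest)[0, 1]
    print(f"item {k + 1}: r = {r:.2f}")
```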
14. Reliability
Internal Consistency Reliability
Used to assess the consistency of results across items within a test.
Split-Half Reliability
In split-half reliability we randomly divide all items that purport to measure the same
construct into two sets.
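A minimal sketch on simulated data: randomly split the six hypothetical items into two halves, correlate the half scores, and (as is standard practice) step the half-length correlation up to full length with the Spearman-Brown formula:

```python
import numpy as np

rng = np.random.default_rng(3)
latent = rng.normal(size=100)                # hypothetical shared construct
items = np.column_stack(
    [latent + rng.normal(0, 0.8, size=100) for _ in range(6)]
)

perm = rng.permutation(6)                    # random split of the 6 items
half_a = items[:, perm[:3]].sum(axis=1)
half_b = items[:, perm[3:]].sum(axis=1)

r_half = np.corrcoef(half_a, half_b)[0, 1]
r_full = 2 * r_half / (1 + r_half)           # Spearman-Brown correction
print(round(r_full, 2))
```

Note that each random split gives a slightly different estimate, which is exactly the problem Cronbach's alpha addresses.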
15. Reliability
Internal Consistency Reliability
Used to assess the consistency of results across items within a test.
Cronbach's Alpha (α)
Cronbach's Alpha is mathematically equivalent to the average of all possible split-half estimates.
Most frequently used estimate of internal consistency.
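Cronbach's alpha has a closed form, alpha = k/(k-1) * (1 - sum of item variances / variance of the total score), so it needs no explicit splitting. A sketch on the same kind of simulated data (the function name and data are illustrative, not from any particular library):

```python
import numpy as np

def cronbach_alpha(items):
    """alpha = k/(k-1) * (1 - sum of item variances / variance of total)."""
    k = items.shape[1]
    item_vars = items.var(axis=0, ddof=1).sum()
    total_var = items.sum(axis=1).var(ddof=1)
    return (k / (k - 1)) * (1 - item_vars / total_var)

rng = np.random.default_rng(4)
latent = rng.normal(size=100)                # hypothetical shared construct
items = np.column_stack(
    [latent + rng.normal(0, 0.8, size=100) for _ in range(6)]
)

print(round(cronbach_alpha(items), 2))
```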
16. Threats to Reliability
Subject Error
The subject may respond differently depending upon when they are asked.
Example: when to survey sports fans (before or after the game?)
Researcher Error
Different researchers may take different approaches in data collection
Example: online vs. in person
Subject Bias
Participants may give the responses they think the researcher wants or the
perceived “correct” answer
17. Validity
“How do I know that the method I am using is really
measuring what I want it to measure?”
Validity is the best available approximation of the truth of a
given proposition.
18. Construct Validity
Translation validity
the degree to which you accurately translated your construct into the
operationalization
Face validity
Content validity
Criterion-related validity
Check the performance of the operationalization against some criterion
Predictive validity
Concurrent validity
Convergent validity
Discriminant validity
19. Translation Validity
Face Validity
Does the method appear appropriate to
measure what you want it to measure at first
glance?
Whether on its face it seems like a good
operationalization of the construct
We can improve the quality of face validity
assessment considerably by making it more
systematic
20. Translation Validity
Content Validity
In content validity, you essentially check the
operationalization against the relevant content domain
for the construct.
Often based on assessment by experts in relevant
content domain
21. Criterion-related Validity
Predictive Validity
In predictive validity, we assess the operationalization's ability to predict
something it should theoretically be able to predict.
Example: a measure of math ability should be able to predict how well a person
will do in an engineering-based profession.
Concurrent Validity
In concurrent validity, we assess the operationalization's ability to distinguish
between groups that it should theoretically be able to distinguish between.
Example: fans with high team ID will be more resilient against poor team
performance than those with low team ID
22. Criterion-related Validity
Convergent Validity
In convergent validity, we examine the degree to which the operationalization is similar to (converges on) other operationalizations that it theoretically should be similar to.
Discriminant Validity
In discriminant validity, we examine the degree to which the operationalization is not similar to (diverges from) other operationalizations that it theoretically should not be similar to.
Example: tests of self-esteem and depression should be negatively correlated (discriminant validity).
25. Threats to External Validity
Interaction of selection and treatment
Description: Because of the narrow characteristics of participants in the experiment, the researcher cannot generalize to individuals who do not have the characteristics of participants.
Response: The researcher restricts claims about groups to which the results cannot be generalized, or conducts additional experiments with groups with different characteristics.

Interaction of setting and treatment
Description: Because of the characteristics of the setting of participants in an experiment, a researcher cannot generalize to individuals in other settings.
Response: The researcher needs to conduct additional experiments in new settings to see if the same results occur as in the initial setting.

Interaction of history and treatment
Description: Because results of an experiment are time-bound, a researcher cannot generalize the results to past or future situations.
Response: The researcher needs to replicate the study at later times to determine if the same results occur as in the earlier time.
26. Internal Validity
Internal Validity (causality)
Whether the effects observed in a study are due to the manipulation of the independent variable and not some other factor; that is, changes in the DV can be attributed to the IV.
27. Threats to Internal Validity
History
Description: Because time passes during an experiment, events can occur that unduly influence the outcome beyond the experimental treatment.
Response: The researcher can have both the experimental and control groups experience the same external events.

Maturation
Description: Participants in an experiment may mature or change during the experiment, thus influencing the results.
Response: The researcher can select participants who mature or change at the same rate (e.g., same age) during the experiment.

Regression to the mean
Description: Participants with extreme scores are selected for the experiment. Naturally, their scores will probably change during the experiment; scores, over time, regress toward the mean.
Response: A researcher can select participants who do not have extreme scores as entering characteristics for the experiment.

Selection
Description: Participants can be selected who have certain characteristics that predispose them to have certain outcomes (e.g., they are brighter).
Response: The researcher can select participants randomly so that characteristics have the probability of being equally distributed among the experimental groups.

Mortality (also called study attrition)
Description: Participants drop out during an experiment due to many possible reasons. The outcomes are thus unknown for these individuals.
Response: A researcher can recruit a large sample to account for dropouts, or compare those who drop out with those who continue in terms of the outcome.
28. Threats to Internal Validity
Diffusion of treatment (also called cross-contamination of groups)
Description: Participants in the control and experimental groups communicate with each other. This communication can influence how both groups score on the outcomes.
Response: The researcher can keep the two groups as separate as possible during the experiment.

Compensatory/resentful demoralization
Description: The benefits of an experiment may be unequal or resented when only the experimental group receives the treatment (e.g., the experimental group receives therapy and the control group receives nothing).
Response: The researcher can provide benefits to both groups, such as giving the control group the treatment after the experiment ends or giving the control group some different type of treatment during the experiment.

Compensatory rivalry
Description: Participants in the control group feel that they are being devalued, as compared to the experimental group, because they do not experience the treatment.
Response: The researcher can take steps to create equality between the two groups, such as reducing the expectations of the control group or clearly explaining the value of the control group.

Testing
Description: Participants become familiar with the outcome measure and remember responses for later testing.
Response: The researcher can have a longer time interval between administrations of the outcome, or use different items on a later test than were used in an earlier test.

Instrumentation
Description: The instrument changes between a pretest and posttest, thus impacting the scores on the outcome.
Response: The researcher can use the same instrument for the pretest and posttest measures.
Push-up, deadlift, squat. Running dash, jump
Using a scale to measure my weight and getting 150, 160, 180 on repeated weighings is not reliable.
What if the error is not random?
The level of awesomeness of XXX
For example, if we have six items we will have 15 different item pairings (i.e., 15 correlations). The average inter-item correlation is simply the average or mean of all these correlations.
(64 + 88 + 86 + 87 + 83 + 92) / 6 ≈ 83.3
Think of the center of the target as the concept that you are trying to measure. Imagine that for each person you are measuring, you are taking a shot at the target. If you measure the concept perfectly for a person, you are hitting the center of the target. If you don't, you are missing the center. The more you are off for that person, the further you are from the center.