5. Reliability
The degree to which a measure is consistent or dependable;
the degree to which it would give you the same result over
and over again, assuming the underlying phenomenon is
not changing.
True score theory
Every measurement is an additive composite of two components:
the true ability of the respondent and random error.
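True score theory can be sketched numerically. This is a minimal simulation with made-up numbers (a hypothetical true score of 70 and normally distributed error), showing that random error varies from measurement to measurement but has no effect on the average score:

```python
import numpy as np

rng = np.random.default_rng(0)

n = 10_000                          # hypothetical number of repeated measurements
true_score = 70.0                   # hypothetical true ability of the respondent
error = rng.normal(0, 5, size=n)    # random error: mean zero, no systematic bias
observed = true_score + error       # true score theory: observed = true + error

# Random error is noise: individual measurements vary,
# but the error cancels out in the average.
print(round(observed.mean(), 1))    # close to the true score of 70
```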
6. Reliability
Random Error
Random error is caused by any
factors that randomly affect
measurement across the sample
(e.g. mood)
No consistent effects across the
sample; no effect on the average
score.
Random error is sometimes considered noise.
7. Reliability
Systematic Error
Systematic error is caused by any
factors that systematically affect
measurement of the variable across
the sample (e.g. cheating)
Systematic errors tend to be
consistently either positive or
negative
Systematic error is sometimes considered to be bias in measurement.
10. Reliability
Inter-Rater or Inter-Observer Reliability
Used to assess the degree to which different raters/observers give
consistent estimates of the same phenomenon.
12. Reliability
Internal Consistency Reliability
Used to assess the consistency of results across items within a test.
Average Inter-item Correlation
The average inter-item correlation uses all of the items on our instrument that are
designed to measure the same construct.
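As an illustration, here is one way to compute the average inter-item correlation on simulated data. The respondent count, the six items, and the way they are generated from a shared latent construct are all hypothetical:

```python
import numpy as np
from itertools import combinations

rng = np.random.default_rng(1)

# Hypothetical data: 50 respondents, 6 items tapping one construct.
latent = rng.normal(size=50)
items = np.column_stack(
    [latent + rng.normal(0, 0.8, size=50) for _ in range(6)]
)

r = np.corrcoef(items, rowvar=False)      # 6 x 6 correlation matrix
pairs = list(combinations(range(6), 2))   # 15 unique item pairings
avg_r = np.mean([r[i, j] for i, j in pairs])

print(len(pairs))                         # 15 pairings for 6 items
print(round(avg_r, 2))
```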
13. Reliability
Internal Consistency Reliability
Used to assess the consistency of results across items within a test.
Average Item-total Correlation
The item total correlation is the correlation between an individual item and the total
score without that item.
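A sketch of the same idea in code, using the simulated-data setup above (all numbers hypothetical): for each item, correlate it with the total score computed without that item.

```python
import numpy as np

rng = np.random.default_rng(2)
latent = rng.normal(size=50)                 # hypothetical shared construct
items = np.column_stack(
    [latent + rng.normal(0, 0.8, size=50) for _ in range(6)]
)

total = items.sum(axis=1)
for k in range(items.shape[1]):
    rest = total - items[:, k]               # total score WITHOUT item k
    r = np.corrcoef(items[:, k], rest)[0, 1]
    print(f"item {k + 1}: r = {r:.2f}")
```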
14. Reliability
Internal Consistency Reliability
Used to assess the consistency of results across items within a test.
Split-Half Reliability
In split-half reliability we randomly divide all items that purport to measure the same
construct into two sets.
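A minimal sketch on simulated data: randomly split the six hypothetical items into two halves, correlate the half scores, and (as is standard practice) step the half-length correlation up to full length with the Spearman-Brown formula:

```python
import numpy as np

rng = np.random.default_rng(3)
latent = rng.normal(size=100)                # hypothetical shared construct
items = np.column_stack(
    [latent + rng.normal(0, 0.8, size=100) for _ in range(6)]
)

perm = rng.permutation(6)                    # random split of the 6 items
half_a = items[:, perm[:3]].sum(axis=1)
half_b = items[:, perm[3:]].sum(axis=1)

r_half = np.corrcoef(half_a, half_b)[0, 1]
r_full = 2 * r_half / (1 + r_half)           # Spearman-Brown correction
print(round(r_full, 2))
```

Note that each random split gives a slightly different estimate, which is exactly the problem Cronbach's alpha addresses.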
15. Reliability
Internal Consistency Reliability
Used to assess the consistency of results across items within a test.
Cronbach's Alpha (α)
Cronbach's Alpha is mathematically equivalent to the average of all possible split-half estimates.
Most frequently used estimate of internal consistency.
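Cronbach's alpha has a closed form, alpha = k/(k-1) * (1 - sum of item variances / variance of the total score), so it needs no explicit splitting. A sketch on the same kind of simulated data (the function name and data are illustrative, not from any particular library):

```python
import numpy as np

def cronbach_alpha(items):
    """alpha = k/(k-1) * (1 - sum of item variances / variance of total)."""
    k = items.shape[1]
    item_vars = items.var(axis=0, ddof=1).sum()
    total_var = items.sum(axis=1).var(ddof=1)
    return (k / (k - 1)) * (1 - item_vars / total_var)

rng = np.random.default_rng(4)
latent = rng.normal(size=100)                # hypothetical shared construct
items = np.column_stack(
    [latent + rng.normal(0, 0.8, size=100) for _ in range(6)]
)

print(round(cronbach_alpha(items), 2))
```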
16. Threats to Reliability
Subject Error
The subject may respond differently depending upon when they are asked.
Example: when to survey sports fans (before or after the game?)
Researcher Error
Different researchers may take different approaches in data collection
Example: online vs. in person
Subject Bias
Participants may give the responses they think the researcher wants or the
perceived “correct” answer
17. Validity
“How do I know that the method I am using is really
measuring what I want it to measure?”
Validity is the best available approximation of the truth of a
given proposition.
18. Construct Validity
Translation validity
the degree to which you accurately translated your construct into the
operationalization
Face validity
Content validity
Criterion-related validity
Check the performance of the operationalization against some criterion
Predictive validity
Concurrent validity
Convergent validity
Discriminant validity
19. Translation Validity
Face Validity
Does the method appear appropriate to
measure what you want it to measure at first
glance?
Whether on its face it seems like a good
operationalization of the construct
We can improve the quality of face validity
assessment considerably by making it more
systematic
20. Translation Validity
Content Validity
In content validity, you essentially check the
operationalization against the relevant content domain
for the construct.
Often based on assessment by experts in relevant
content domain
21. Criterion-related Validity
Predictive Validity
In predictive validity, we assess the operationalization's ability to predict
something it should theoretically be able to predict.
Example: a measure of math ability should be able to predict how well a person
will do in an engineering-based profession.
Concurrent Validity
In concurrent validity, we assess the operationalization's ability to distinguish
between groups that it should theoretically be able to distinguish between.
Example: fans with high team ID will be more resilient against poor team
performance than those with low team ID
22. Criterion-related Validity
Convergent Validity
In convergent validity, we examine the degree to which the operationalization is similar to (converges on) other operationalizations that it theoretically should be similar to.
Discriminant Validity
In discriminant validity, we examine the degree to which the operationalization is not similar to (diverges from) other operationalizations that it theoretically should not be similar to.
Example: tests of self-esteem and depression should be negatively correlated (discriminant validity).
25. Threats to External Validity
Interaction of selection and treatment
Description: Because of the narrow characteristics of participants in the experiment, the researcher cannot generalize to individuals who do not have the characteristics of participants.
Response: The researcher restricts claims about groups to which the results cannot be generalized, or conducts additional experiments with groups with different characteristics.

Interaction of setting and treatment
Description: Because of the characteristics of the setting of participants in an experiment, a researcher cannot generalize to individuals in other settings.
Response: The researcher needs to conduct additional experiments in new settings to see if the same results occur as in the initial setting.

Interaction of history and treatment
Description: Because results of an experiment are time-bound, a researcher cannot generalize the results to past or future situations.
Response: The researcher needs to replicate the study at later times to determine if the same results occur as in the earlier time.
26. Internal Validity
Internal Validity (causality)
Whether the effects observed in a study are due to the manipulation of the independent variable and not some other factor; that is, changes in the DV can be attributed to the IV.
27. Threats to Internal Validity
History
Description: Because time passes during an experiment, events can occur that unduly influence the outcome beyond the experimental treatment.
Response: The researcher can have both the experimental and control groups experience the same external events.

Maturation
Description: Participants in an experiment may mature or change during the experiment, thus influencing the results.
Response: The researcher can select participants who mature or change at the same rate (e.g., same age) during the experiment.

Regression to the mean
Description: Participants with extreme scores are selected for the experiment. Naturally, their scores will probably change during the experiment; scores, over time, regress toward the mean.
Response: A researcher can select participants who do not have extreme scores as entering characteristics for the experiment.

Selection
Description: Participants can be selected who have certain characteristics that predispose them to have certain outcomes (e.g., they are brighter).
Response: The researcher can select participants randomly so that characteristics have the probability of being equally distributed among the experimental groups.

Mortality (also called study attrition)
Description: Participants drop out during an experiment due to many possible reasons. The outcomes are thus unknown for these individuals.
Response: A researcher can recruit a large sample to account for dropouts, or compare those who drop out with those who continue in terms of the outcome.
28. Threats to Internal Validity
Diffusion of treatment (also called cross-contamination of groups)
Description: Participants in the control and experimental groups communicate with each other. This communication can influence how both groups score on the outcomes.
Response: The researcher can keep the two groups as separate as possible during the experiment.

Compensatory/resentful demoralization
Description: The benefits of an experiment may be unequal or resented when only the experimental group receives the treatment (e.g., the experimental group receives therapy and the control group receives nothing).
Response: The researcher can provide benefits to both groups, such as giving the control group the treatment after the experiment ends or giving the control group some different type of treatment during the experiment.

Compensatory rivalry
Description: Participants in the control group feel that they are being devalued, as compared to the experimental group, because they do not experience the treatment.
Response: The researcher can take steps to create equality between the two groups, such as reducing the expectations of the control group or clearly explaining the value of the control group.

Testing
Description: Participants become familiar with the outcome measure and remember responses for later testing.
Response: The researcher can have a longer time interval between administrations of the outcome, or use different items on a later test than were used in an earlier test.

Instrumentation
Description: The instrument changes between a pretest and posttest, thus impacting the scores on the outcome.
Response: The researcher can use the same instrument for the pretest and posttest measures.
Push-up, deadlift, squat. Running dash, jump
Using a scale to measure my weight and getting 150, 160, 180 on repeated weighings is not reliable.
What if the error is not random?
The level of awesomeness of XXX
For example, if we have six items we will have 15 different item pairings (i.e., 15 correlations). The average inter-item correlation is simply the average or mean of all these correlations.
(64 + 88 + 86 + 87 + 83 + 92) / 6 ≈ 83.3
Think of the center of the target as the concept that you are trying to measure. Imagine that for each person you are measuring, you are taking a shot at the target. If you measure the concept perfectly for a person, you are hitting the center of the target. If you don't, you are missing the center. The more you are off for that person, the further you are from the center.