External & Internal Validity
• Internal validity is
concerned with the degree
to which the results of the
study are due to the
independent variable(s)
under consideration and
not due to anything else.
• External validity relates to
the degree to which
findings can be
generalized/transferred to
other populations or situations.
3.
Relationship between Internal &
External Validity
• Internal and External Validity are
Interconnected
• Necessary Condition for Causality
4.
Challenges in Research Design
• Complexity of Isolating Variables
• Designing an Effective Study
5.
Goal of Presentation
• To outline common threats to internal validity.
• To explain how they distort research results.
• To offer methods to mitigate these threats.
6.
Extraneous Factors Affecting
Internal and External Validity
• Campbell and Stanley (1963), Cook and
Campbell (1979), and others have identified
multiple factors that threaten both internal
and external validity.
• Gall et al. (1996) summarized these factors
in their research.
• Miles and Huberman (1994) presented a
similar list of validity threats.
7.
Threats to Internal Validity
(by Fred L. Perry)
• A list of threats compiled by Fred L. Perry:
14 primary threats
9 subordinate ones
• He illustrates these extraneous factors with the
Research Minefield and calls the threats
‘mines’.
8.
Threats to Internal Validity
1. History
2. Maturation
3. Differential Selection
4. Statistical Regression
5. Subject Attrition
6. Competing Group
Contamination
7. Testing
8. Researcher and Data Gatherer
Effect
9. Pygmalion Effect
10. Hawthorne Effect
11. Treatment Intervention
12. Accumulative Treatment Effect
13. Treatment Fidelity
14. Treatment Strength-Time
Interaction
9.
1. History
• History refers to the influence on the dependent variable of
events, other than the independent variable, that take place at
different points in time during the study.
• Example of History in Practice: In a study examining the
effect of a new teaching methodology on children’s
second language (L2) learning over several months, an
external event (the airing of a new bilingual TV program)
could influence language behavior, potentially interacting
with the new teaching methodology.
10.
1. History (cont…)
• History as a Threat to Longitudinal Studies
• Azpillaga et al. (2001): The study aimed to investigate the
effects of a drama-based teaching method on English
language achievement over two years. The independent variable
was the type of teaching method (dramatized vs.
nondramatized), and the dependent variable was language
achievement (aural comprehension and oral production).
11.
2. Maturation
• Maturation refers to natural developmental changes in participants
over time that are unrelated to the treatment. This could include
physical, cognitive, or emotional development.
• Example from Piaget’s Theory
• Azpillaga et al. (2001): While the study participants were all the same
age and likely developed at similar rates, maturation could still play a
role. For instance, early pubescent children may respond differently to
drama-based teaching methods than pre-pubescent children. The study
didn’t explore this possibility, but it highlights a potential interaction
between the treatment and participants’ developmental stages.
12.
3. Differential Selection
• This occurs when participants are not randomly selected
and randomly assigned to the different groups (e.g., treatment
vs. control).
• Pre-existing differences between the groups can affect the
results.
• Azpillaga et al. (2001): The researchers did not use
random sampling but took steps to match participants on
various criteria (e.g., shy students, trouble-makers, and
average students). This effort helped control for
preexisting differences between the groups.
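Random assignment is the usual safeguard against differential selection. The sketch below is only an illustration of that idea, not the procedure used by Azpillaga et al. (2001); the participant IDs, the function name `randomly_assign`, and the fixed seed are all assumptions made for the example.

```python
import random

def randomly_assign(participants, seed=None):
    """Shuffle participants and split them into two equal-sized groups.

    Hypothetical helper: with an odd number of participants, the
    control group receives the extra person.
    """
    rng = random.Random(seed)      # seeded so the split can be reproduced
    shuffled = list(participants)  # copy so the caller's list is untouched
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return shuffled[:half], shuffled[half:]

# Hypothetical participant IDs, for illustration only.
participants = [f"P{i:02d}" for i in range(1, 21)]
treatment, control = randomly_assign(participants, seed=42)
print("treatment:", sorted(treatment))
print("control:  ", sorted(control))
```

Because chance alone decides who lands in which group, pre-existing differences tend to balance out as group size grows; matching, as in the study above, is a fallback when random assignment is not possible.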
13.
4. Statistical Regression
• Statistical regression occurs when participants chosen for extreme scores (either
very high or very low) are likely to score closer to the average on subsequent
measurements. This is not due to any intervention but a natural statistical
phenomenon.
• Example: If a study selects participants who perform very poorly on a language
test and later tests them after a treatment, any improvement might not be due to
the treatment. Instead, it could be due to the tendency of extreme scores to move
toward the mean (average).
• In qualitative studies, researchers might specifically choose extreme cases (such
as very low or very high performers) because they provide rich data. In this case,
statistical regression may not be a concern, as the focus is on in-depth
understanding rather than generalizing results.
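A small simulation can make regression toward the mean concrete. The sketch below assumes a simple model that is not taken from any study cited here: each score is a stable ability plus independent test-day noise, and no treatment is given between the two tests. All numbers (means, spreads, sample size, seed) are illustrative assumptions.

```python
import random
import statistics

def simulate_regression_to_mean(n=10_000, cutoff_fraction=0.1, seed=0):
    """Return (pretest mean, posttest mean) for the lowest pretest scorers."""
    rng = random.Random(seed)
    # Observed score = stable ability + independent test-day noise (assumed model).
    ability = [rng.gauss(50, 10) for _ in range(n)]
    pretest = [a + rng.gauss(0, 10) for a in ability]
    posttest = [a + rng.gauss(0, 10) for a in ability]  # no treatment applied

    # Select the participants with the lowest pretest scores.
    k = int(n * cutoff_fraction)
    lowest = sorted(range(n), key=lambda i: pretest[i])[:k]

    pre_mean = statistics.mean(pretest[i] for i in lowest)
    post_mean = statistics.mean(posttest[i] for i in lowest)
    return pre_mean, post_mean

pre_mean, post_mean = simulate_regression_to_mean()
print(f"pretest mean of lowest 10%:  {pre_mean:.1f}")
print(f"posttest mean of same group: {post_mean:.1f}")
```

The selected group scores higher on the second test even though nothing was done to them, which is exactly the gain that could be mistaken for a treatment effect.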
14.
5. Subject Attrition
• Subject attrition refers to the loss of participants during the course of a study, which
can distort the results if the attrition is not random. Those who drop out may differ
systematically from those who remain in the study, affecting the generalizability
and internal validity of the findings.
• Example: A researcher conducts a 12-week study to evaluate the effect of a
new exercise program on weight loss.
Participants are divided into two groups:
Group A: Follows the new exercise program.
Group B: Does not participate in the exercise program (control group).
The researcher measures participants’ weight at the beginning and end of the
study.
• By the end of the study: 30% of participants in Group A drop out, citing reasons
such as the program being too intense or scheduling conflicts. In contrast, only 5%
of participants in Group B drop out.
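The gap between the two dropout rates can be checked with simple arithmetic. The sketch below assumes 100 participants per group, a figure that is not given in the example and is chosen only so that the 30% and 5% rates can be reproduced.

```python
def attrition_rate(enrolled, completed):
    """Fraction of enrolled participants who did not finish the study."""
    if enrolled <= 0:
        raise ValueError("enrolled must be positive")
    return (enrolled - completed) / enrolled

# Assumed group sizes (100 each); only the percentages come from the example.
rate_a = attrition_rate(enrolled=100, completed=70)  # exercise group
rate_b = attrition_rate(enrolled=100, completed=95)  # control group
differential = abs(rate_a - rate_b)

print(f"Group A attrition: {rate_a:.0%}")
print(f"Group B attrition: {rate_b:.0%}")
print(f"Differential attrition: {differential:.0%}")
```

A large differential suggests the groups that finish are no longer comparable: if those who found the program too intense left Group A, the remaining members may show more weight loss than the full group would have.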
15.
6. Competing Group Contamination
• Competing group contamination occurs when there are multiple
treatment groups or a lack of proper control groups in a study. If
the groups are not well controlled or coordinated, external factors
could influence the results, leading to confounded outcomes.
• Azpillaga et al. (2001): The study compared two teaching
methods: dramatized format versus non-dramatized format for
teaching a third language. The treatment was applied consistently
across multiple groups, but the control groups came from
different schools with no coordination between them.
16.
6. Competing Group Contamination
(cont…)
• Competing group contamination can take four different directions:
Competing Group Rivalry (John Henry Effect): participants in competing
groups change their behavior to outdo one another. In
the Azpillaga et al. (2001) study, the control group and the experimental group
came from the same sociogeographical location. Although the possibility of the
John Henry effect was not mentioned, the fact that the experimental group
outperformed the control group suggests that rivalry was not a significant issue.
Experimental Treatment Diffusion (Compromise): participants in competing
groups gain knowledge about the treatment conditions in other groups and
incorporate these factors into their own treatment. In the
Azpillaga et al. (2001) study, while there was no explicit mention of
experimental treatment diffusion, there was a potential for participants in the
control group to become aware of the experimental treatment.
17.
6. Competing Group Contamination
(cont…)
Compensatory equalization of treatments: when researchers, in an
attempt to make the control group feel less disadvantaged, provide them
with extra materials or special treatment that effectively turns them into a
new treatment group. In the Azpillaga et al. (2001) study, since the
experimental group outperformed the control group, compensatory equalization
likely did not occur.
Demoralization (boycott) of the control group: when participants in
the control group feel resentful or demoralized because they perceive that
the treatment group is receiving better or more interesting treatments.
This resentment may lead to decreased effort and motivation in the
control group. In the Azpillaga et al. (2001) study, since both the treatment
and control groups were from the same sociogeographical location, the control
group might have become demoralized if they learned they were not receiving the
new, more interesting dramatized format.
18.
7. Testing
• Testing refers to ways in which measuring the dependent variable(s) can distort the results of
a study.
• Consumers of research need to pay attention to five sources:
1. Instrumentation: when different instruments are used to assess performance at different
stages of the study (e.g., pretest vs. posttest).
2. Measurement-treatment interaction: when the results of an intervention (or treatment)
only become apparent through the use of a specific type of measurement, such as a
particular kind of test or assessment.
3. Pretest effect: when the test administered before the treatment (the pretest) heightens
participants’ awareness of certain material that they might not have paid as much attention
to otherwise.
4. Posttest effect: when the design of the posttest inadvertently helps participants make
associations or connections that they would not have made otherwise, potentially making
the treatment appear more effective than it truly is.
5. Time of measurement effect: when the results depend on how soon after the treatment the
measurements are taken.
19.
8. Researcher and Data Gatherer Effect:
• When the identity or behavior of the person administering
the treatment or collecting data influences the results.
• This could be due to biases or expectations from the
researcher, or the influence of the data-gathering process
itself (e.g., whether a research assistant, a tape recorder, or
a video camera is used).
• The mere presence of a researcher or data collector can
change participants’ behavior or responses.
• Bejarano et al.’s (1997) study.
• Wesche and Paribakht’s (2000) study.
20.
9. Pygmalion Effect (Researcher Effect):
• When the researcher’s expectations influence
their observations and judgments of the
participants.
• If a researcher believes that certain
participants have higher ability, they may
unknowingly treat them more leniently or be
more encouraging, which could lead to biased
results.
21.
10. Hawthorne Effect
• When participants alter their behavior simply because they
are aware that they are part of a study. This awareness can
lead them to act in ways that they wouldn’t in a normal,
non-research environment.
• In Gray’s (1998) study, teacher trainees were asked to
write in interactive diaries outside of class hours. The
students were flattered by their involvement and might
have altered their behavior due to the awareness of being
part of an educational program, which could have affected
the quality of the diaries.
22.
11. Treatment Intervention
• Treatment intervention can affect the results of a study in at
least two undesirable ways: Novelty and Disruption.
• Novelty Effect: New treatments may create an initial boost
in motivation simply because they are novel. This effect could
distort results if it fades over time.
• Disruption Effect: Unfamiliar treatments or tools may
disrupt performance. For example, students using computers for
the first time might perform poorly due to unfamiliarity with the
technology, which may obscure the treatment’s effectiveness.
• Example
23.
12. Accumulative Treatment Effect
(Multiple-Treatment Interference or Order Effect)
• When the order in which treatments are
presented influences the outcomes.
• Mehnert (1998) studied the effects of
planning time on L2 German speakers’ speech
performance.
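One common way to guard against order effects is counterbalancing: rotating participants through every possible order of the treatments. The sketch below illustrates that general technique and is not a description of Mehnert's (1998) design; the condition labels, participant IDs, and the function name `counterbalance` are hypothetical.

```python
from itertools import cycle, permutations

def counterbalance(participants, conditions):
    """Assign each participant one ordering of the conditions.

    Orders are handed out in rotation, so every possible order is used
    equally often when the number of participants is a multiple of the
    number of orders.
    """
    orders = list(permutations(conditions))
    return {p: order for p, order in zip(participants, cycle(orders))}

# Hypothetical labels, for illustration only.
conditions = ["A", "B", "C"]
participants = [f"P{i}" for i in range(1, 7)]
schedule = counterbalance(participants, conditions)

for participant, order in schedule.items():
    print(participant, "->", " then ".join(order))
```

With three conditions there are six possible orders, so six participants (or any multiple of six) cover each order equally; any remaining order effect is then spread evenly across conditions instead of favoring one.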
24.
13. Treatment Fidelity
• Treatment fidelity refers to whether the treatment was
implemented as intended. If the treatment is not applied
consistently or properly, the results may not reflect the true effects
of the treatment.
• Bejarano et al. (1997) trained teachers to use specific group work
techniques and then monitored them through lesson plans and
teacher logs to ensure that both the treatment and control groups
received the intended treatment.
25.
14. Treatment Strength–Time Interaction
• Some treatments may require more time to show their effects.
Short treatment durations may not provide enough time for the
treatment to have a noticeable impact, leading to misleading
conclusions.
• In the study by Rodriguez and Sadoski (2000), participants
were given only one session to learn mnemonic strategies.