This document discusses research design and methodology. It begins by introducing different types of research designs, including experimental and non-experimental. It then covers key aspects of research such as developing hypotheses, defining variables, and establishing validity. Specific challenges of classroom-based research are also addressed. The document provides guidance on selecting an appropriate research design based on the study context and limitations. It emphasizes that every design has tradeoffs, and the best approach depends on the research question and constraints.
2. “Everything that can be counted
does not necessarily count;
everything that counts cannot
necessarily be counted.”
- Albert Einstein
“Measure what is measurable,
and make measurable what is
not so.”
- Galileo Galilei
3. Experimental
Single subject
True experimental
Quasi-experimental
Non-Experimental
Descriptive
Comparative
Correlational
Ex post facto
Causal-comparative
4. Is it stated in declarative form?
Is it consistent with known facts, previous
research, and theory?
Does it state the expected relationship
between two or more variables?
Is it testable?
Is it clear?
Is it concise?
5. Variable: A concept or characteristic that can
take on different values or be divided into
categories.
[Diagram: variables A, B, and C]
6. Conceptual Definitions: use words and
concepts to describe the variable.
Operational Definitions: indicate how the
concept is measured or manipulated.
• How would you define each of
these variables conceptually?
• How would you operationalize
them in a research study?
Gender
Race
Social Class
Leadership Style
Achievement
Efficacy
Depression
7.
8. Independent Variable (IV): The predictor or cause. In
experimental studies, the researcher controls or
manipulates the independent variable (the
treatment or intervention).
Dependent Variable (DV): The outcome or effect that
the researcher measures (e.g., knowledge, skills
or attitudes).
[Diagram labels: A = Independent, B = Independent, C = Dependent]
9. Possible problems related to causality:
The assessment was not measured well.
The intervention was not manipulated well.
Something other than the intervention
caused change in the assessment.
10. Construct Validity:
Am I measuring what I think I am
measuring?
Am I implementing what I think I am
implementing?
Internal Validity:
Did the treatment cause the outcome?
11. It is the inference that is valid or invalid, not
the measure.
An instrument can be valid for one use but
not another.
Validity is a matter of degree.
Validity involves an overall evaluative
judgment based on evidence.
12. A study does not have absolute
validity or absolutely no validity
The level of validity relates to the confidence
in the conclusions
Construct and internal validity are measured
on a continuum
Construct validity does not imply internal
validity (and vice versa)
When a hypothesis is supported, it does not
necessarily mean that the study has either
construct or internal validity
13. A concept, model, or schematic idea
A construct is the global notion of the measure, such as:
▪ Student motivation
▪ Intelligence
▪ Student learning
▪ Student anxiety
The specific method of measuring a construct is called
the operational definition.
For any construct, researchers can choose many
possible operational definitions.
14. Example: What is “productivity”? (Operational definition)
How do we measure productivity? (Proxy measures)
Common measures of productivity:
▪ Work output
▪ Time and face time at work
▪ Absences
Common data collection methods:
▪ Observation
▪ Record review
▪ Self report
Proxy: approximates the real thing
15. Measure learning directly (clear operational
definitions; learning is not the same as
enjoyment or perceived learning).
Measure student learning through student
learning objectives (ensure these are aligned
with assessments).
Use established scales to measure student
attitudes and personality (don’t reinvent the
wheel; see Tests in Print).
16. Know how to score the measure (make sure
you’ve established this before data collection;
know what is reasonable; use rubrics; train scorers;
check inter-rater reliability, IRR).
Determine whether to use graded or
ungraded measures (pros and cons of both).
Minimize participant and researcher
expectancies.
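One common way to check inter-rater reliability for rubric scores is Cohen's kappa, which corrects raw agreement for agreement expected by chance. The slide does not name a specific statistic, so the choice of kappa, the function name, and the sample scores below are illustrative assumptions, not the presenter's method.

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters scoring the same items on a categorical rubric."""
    if len(rater_a) != len(rater_b) or len(rater_a) == 0:
        raise ValueError("Both raters must score the same non-empty set of items")
    n = len(rater_a)
    # Observed agreement: share of items on which the two raters gave the same score.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement: product of each rater's marginal proportions, summed over categories.
    freq_a, freq_b = Counter(rater_a), Counter(rater_b)
    expected = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)
    if expected == 1.0:
        raise ValueError("Kappa is undefined when both raters use a single category")
    return (observed - expected) / (1 - expected)


# Hypothetical rubric scores (1-3) from two trained scorers on six essays.
scores_a = [3, 2, 3, 1, 2, 3]
scores_b = [3, 2, 2, 1, 2, 3]
kappa = cohens_kappa(scores_a, scores_b)
```

Computing this on a pilot batch before full data collection gives an early signal about whether the rubric or the scorer training needs revision.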
17. Determine whether to use multiple
operational definitions (can use multiple
measures).
Use a retention measure to investigate long-term
effects (but treat long-term results with caution,
since other influences may intervene).
18. The treatment (intervention) needs to be
manipulated well to ensure construct validity
The only difference between conditions
should be the treatment
Other variables that are different
between conditions are confounds
To determine construct validity,
treatments need specific operational
definitions
Anything that can affect the results and
cause a difference between students in
treatment and control conditions needs
to be documented
19. Construct validity of the treatment is
questionable in any design that
compares one section of a class with
another
Classes are a social space, and the students
and instructors are interdependent
Students can ask different questions
The class may have a different “tone”
Splitting a class into two groups can minimize
this concern; if students in a split class can be
randomly assigned to a condition, internal
validity will increase
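Random assignment within a split class can be done with a seeded shuffle so the allocation is reproducible and documented. This is a minimal sketch; the function name, the condition labels, and the roster are hypothetical.

```python
import random


def assign_conditions(students, conditions=("control", "treatment"), seed=None):
    """Randomly assign each student to one condition with near-equal group sizes."""
    rng = random.Random(seed)  # a fixed seed makes the assignment reproducible
    shuffled = list(students)
    rng.shuffle(shuffled)
    # Deal the shuffled roster round-robin so group sizes differ by at most one.
    return {
        student: conditions[i % len(conditions)]
        for i, student in enumerate(shuffled)
    }


# Hypothetical roster of 20 students in a single class section.
roster = [f"student_{n:02d}" for n in range(1, 21)]
assignment = assign_conditions(roster, seed=42)
```

Recording the seed alongside the roster lets someone else reconstruct exactly who was assigned to which condition.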
20. Different Types of Comparison in Research Design
Between Participants
How it works: Students in one condition compared to students in another condition (control vs. treatment; multiple treatments)
Strengths: No carryover effects from multiple treatments; no instrumentation or testing effects from multiple assessments
Weaknesses: Selection bias without random assignment; many differences if groups are separate (e.g., two separate classes); lower statistical power
Improve internal validity by: Random assignment; adding covariates
Within Participants: Multiple Treatments
How it works: All students in both control and treatment conditions
Strengths: No selection bias; greater statistical power
Weaknesses: Instrumentation and testing effects; carryover effects
Improve internal validity by: Counterbalancing
Within Participants: Multiple Measures
How it works: Students receive both pre-test (control) and post-test (treatment)
Strengths: No selection bias; greater statistical power
Weaknesses: Instrumentation and testing effects; other confounds that occur between assessments
Improve internal validity by: Increase number of assessments; add a separate no-treatment control condition; use alternative measures for assessment
21. Can the sample used in the study generalize to other groups or populations?
Generally, it is impossible in classroom studies to get a sample that will generalize to all students.
The researcher should report demographic characteristics.
How realistic is the situation?
In a classroom, if the treatment works, external validity is higher.
22.
23. Trying to measure everything
Small number of students = low statistical power
Only a single class; limits type of design
Difficulties in random assignment
Difficulties in determining whether the treatment is
potent enough to have an effect (relates to power)
Conducting an ethical study in a classroom or
training situation
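The link between class size and statistical power can be made concrete with a rough calculation. The sketch below uses a normal approximation to a two-sided, two-group comparison of means; the approximation, the function name, and the effect size of 0.5 are assumptions for illustration, and an exact t-based calculation would give slightly lower power for small groups.

```python
from math import sqrt
from statistics import NormalDist


def approx_power(effect_size, n_per_group, alpha=0.05):
    """Approximate power of a two-sided, two-group test (normal approximation).

    effect_size is the standardized mean difference (Cohen's d).
    """
    z = NormalDist()
    z_crit = z.inv_cdf(1 - alpha / 2)
    # Expected value of the test statistic under the alternative hypothesis.
    shift = effect_size * sqrt(n_per_group / 2)
    # Probability of landing in either rejection region.
    return z.cdf(shift - z_crit) + z.cdf(-shift - z_crit)


# Compare a typical split class with a much larger sample, assuming d = 0.5.
power_small = approx_power(0.5, n_per_group=15)
power_large = approx_power(0.5, n_per_group=64)
```

With these assumed inputs, 15 students per group leaves power well below the conventional 0.80 target, while roughly 64 per group reaches it. That gap is why a weak treatment can go undetected in a single small class.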
24. Don’t Use
Want to make statement
about causality
Have low number of
students
Use
Have single group of
students that cannot be
divided
Have only one session in
which to collect data
Additional Options:
Correlate many variables at the same time
25. Don’t Use
Want to make
statement about
causality
Want to make
comparison to
another group
Use
Desired focus is on
describing treatment
and not assessment
Cannot have pre-test
or control group
Want single group of
students that cannot
be divided
26. Don’t Use
Have low number of
students
Groups are very different
Have different
assessments for each
condition
Use
Concerned about carryover
effects
Concerned about testing and
instrumentation effects
Have multiple groups
Have only one session to
collect data
Additional Options:
• Use random assignment to improve internal validity
• Add post-test to assess long-term change
• Add additional conditions
• Use covariates to improve internal validity and power
27. Don’t Use
Items other than treatment
occur between
assessments
First assessment affects
second
Students likely to change
between assessments with
no treatment
Use
Have low number of
students
Have single group that
cannot be divided
Cannot have control
condition
Additional Options:
• Add post-test to assess long-term change
• Use alternative measures to minimize testing and
instrumentation effects
28. Don’t Use
Have single group of
students that cannot
be divided
Use
Have multiple groups
Additional Options:
• Use random assignment to improve internal validity
• Add post-test to assess long-term change
• Use alternative measures to minimize testing and
instrumentation effects
• Add additional conditions
• Use covariates to improve internal validity and power
29. Don’t Use
Early treatments affect
later treatments
Early assessments affect
later assessments
Use
Have low number of
students
Have single group that
cannot be divided
Additional Options:
• Add additional treatments
• Counterbalance conditions to improve internal validity
• Include pre-test to assess students before any treatment
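Counterbalancing means spreading the possible treatment orders evenly across students so that order effects do not line up with any one treatment. Here is a minimal sketch of full counterbalancing, with hypothetical treatment labels and function name; it assumes the number of treatments is small, since the number of orders grows factorially.

```python
import random
from itertools import permutations


def counterbalance(students, treatments, seed=None):
    """Assign each student a treatment order, cycling through all possible orders."""
    rng = random.Random(seed)
    orders = list(permutations(treatments))  # every possible sequence of treatments
    shuffled = list(students)
    rng.shuffle(shuffled)  # randomize which student gets which order
    return {
        student: orders[i % len(orders)]
        for i, student in enumerate(shuffled)
    }


# Hypothetical example: 12 students, two treatments labeled A and B.
students = [f"student_{n:02d}" for n in range(1, 13)]
schedule = counterbalance(students, ("A", "B"), seed=7)
```

With two treatments, half the students receive A then B and half receive B then A. With more treatments, a Latin square is a common alternative that needs fewer distinct orders.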
30. Don’t Use
First assessment, by itself,
affects second
Have single group of
students that cannot be
divided
Use
Have low number of
students
Have multiple groups
Additional Options:
• Include pre-test to assess before treatment
• Add post-test to examine long-term change
• Use random assignment to improve internal validity
• Use alternative measures to minimize testing and
instrumentation effects
31. Don’t Use
Have only one session
to collect data
Early assessments
affect later
assessments
Use
Have low number of
students
Have single group that
cannot be divided
Want to determine long-
term effects
Additional Options:
• Add control condition to improve internal validity
• Add additional treatment condition, with treatment at
different time to improve internal validity
32. Use multiple treatments to investigate
interactions between treatments
Use moderators to determine when
treatment has effect (concept of aptitude-treatment interaction, ATI)
Use mediators to investigate how treatment
has effect (mixed methods?)
33. Each design has advantages and
disadvantages.
Often, there is no clear right way, although
some designs will be better than others.
There is no single ideal study that eliminates
all potential problems and all alternative
hypotheses.
One study cannot answer all of your
questions!
Editor's Notes
Many SoTL studies take the form: If I change teaching in X, what will be the impact on outcome Y? The change may be in curriculum, instructional format, technology, time, etc. The outcome may be attitudes, motivation, enjoyment, or learning (factual information, skills, problem solving, retention, etc.).
Discuss causality as key consideration
Use existing measures if one matches your operational definition. Consider using an instrument used in similar studies. Increases body of knowledge and connectivity to prior research
Scoring and Pygmalion effect: Look at reported data that are not reasonable
Creating multiple opportunities to find differences, but within hypotheses
Note methods hypothesized to increase retention, and thus availability for use and transfer; ABAB designs in special ed for behavior change investigation
Refers to Treatment Validity
Sample size and diversity issues, random sampling… Look at studies in the aggregate. Describe your sample so that it can be contextualized in the broader literature.
What works in laboratory might not work in real classrooms. Is this an authentic context?