20. (8 points) Two observers observe a child in the classroom
every 30 minutes to record whether he is behaving aggressively.
They use two categories for their observations: yes (aggressive)
or no (not aggressive). Using the data presented below, answer
the following questions.
Calculate and report the observers' interobserver reliability.
Do you think the observers demonstrated acceptable
interobserver reliability? Why or why not?
21. (5 points) A researcher was interested in determining
whether more frequent breaks (i.e., "coffee breaks") in a
business setting would help employees to be more productive.
With the cooperation of the management, employees on one
floor of the corporate offices were allowed to take a 10-minute
break each hour (at any time) between 8:00 and 11:00 A.M. (for
a total of 30 minutes). The comparison group comprised
employees on different floors who followed the usual corporate
policy of taking a 30-minute break sometime during the
morning (at any time). Measures of productivity were gathered
for each employee according to his or her job (e.g., number of
reports written, number of sales made, etc.). A time series
analysis was applied to compare the productivity of both groups
of employees for six months before and after the intervention
(started in July). Quite surprisingly, the productivity of both
groups increased following the onset of the intervention,
suggesting to the researcher that the timing of breaks makes no
difference.
What type of research design was used in this study?
Describe two ways in which contamination may have influenced
the results of this study.
Describe one threat to internal validity that might be present in
this study because the independent variable manipulation was
implemented on different floors of the building.
Research Methods in Psychology
Quasi-Experimental Designs
Characteristics of True Experiments
Manipulate Independent Variable (IV)
Treatment, comparison conditions
High degree of control
Choice of the DVs
Random assignment to conditions
Unambiguous outcome regarding effect of IV on DV
Internal validity
Applied Research
Goals
Test external validity of lab findings
Improve conditions in which people live and work (natural
settings)
Quasi-experiments
Procedures that approximate the conditions of highly controlled
laboratory experiments
Obstacles to Conducting True Experiments in Natural Settings
Permission
Difficult to gain permission to conduct true experiments in
natural settings
Difficult to gain access to participants
Random assignment perceived as unfair
People want a “treatment”
Random assignment is best way to determine whether a
treatment is effective
Use “waiting-list” control group or alternate treatments
Tablets in English and science classes example
Advantage of True Experiments
Threats to internal validity are controlled
8 general threats to internal validity:
history
regression
maturation
selection
testing
subject attrition
instrumentation
additive effects with selection
Threats to Internal Validity
History
When an event occurs at the same time as the treatment and
changes participants’ behavior
Participants’ “history” includes events other than treatment
Difficult to infer treatment has an effect
History Threat, continued
Does a campus recycling awareness campaign influence the
amount of paper, plastic, and cans in campus bins?
History threat: Suppose at week 4 (X = treatment) a popular
celebrity also starts to promote recycling in the media.
Can you conclude the campus campaign was effective?
[Figure: Recycling (kg) by week. Plotted values for weeks 1, 2, 3, 4, X, 5, 6, 7, 8: 30, 35, 30, 35, 40, 55, 55, 60, 55]
Threats to Internal Validity, continued
Maturation
Participants naturally change over time.
These maturational changes, not treatment, may explain any
changes in participants during an experiment.
Maturation Threat, continued
Does a new reading program improve 2nd graders’ reading
comprehension?
Reading comprehension improves naturally as children mature
over the year.
Can you conclude the reading program was effective?
[Figure: Reading Comprehension. Pre = 25, Post = 70]
Threats to Internal Validity, continued
Testing
Taking a test generally affects subsequent testing.
Participants’ performance on a measure at the end of a study
may differ from an initial testing because of their familiarity
with the measure.
Testing Threat, continued
Does teaching a new problem solving strategy influence
people’s ability to solve problems quickly?
If similar problems are used in the pretest, faster problem
solving at post-test may be due to familiarity with the test.
Can we conclude the new strategy improves problem-solving
ability?
[Figure: Minutes (Mean). Pre = 12, Post = 4]
Threats to Internal Validity, continued
Instrumentation
Instruments used to measure participants’ performance may
change over time
Example: observers may become bored or tired
Changes in participants’ performance may be due to changes in
instruments used to measure performance, not to a treatment.
Instrumentation, continued
Suppose a police protection program is implemented to decrease
incidence of assault.
At the same time the program is implemented (X), reporting
laws change such that what constitutes assault is broadened.
Can we conclude the program was effective (or ineffective)?
[Figure: Assaults by week. Plotted values for weeks 1, 2, 3, 4, X, 5, 6, 7, 8: 25, 20, 22, 24, 35, 45, 40, 35, 38]
Threats to Internal Validity, continued
Regression
Individuals sometimes perform very well or very poorly because
of chance (e.g., luck).
Chance factors are not likely present during 2nd testing, so
scores will not be as extreme.
Scores will “regress” (go toward) the mean.
Regression effects, not treatment, may account for changes in
participants’ performance over time.
Regression, continued
Suppose students are selected for an enrichment program
because of their very high scores on a brief test.
Regression: to the extent the test is an unreliable measure of
ability, we can expect their scores to regress to the mean at the
2nd testing.
Can we conclude the enrichment program was effective (or
ineffective)?
[Figure: Test Score (Mean). Pre = 90, Post = 70]
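The regression threat can be illustrated with a small simulation. The sketch below is not from the slides: it assumes hypothetical values (true ability with a mean of 70, chance factors with a standard deviation of 10, and selection of the top 10% on the first test) and applies no treatment at all. Under those assumptions the selected group's mean should still drop at the second testing.

```python
import random
import statistics

def simulate_regression(n_students=1000, top_fraction=0.10, seed=1):
    """Select top scorers on test 1 and re-test them with no treatment.

    All parameter values are hypothetical and chosen only for illustration.
    """
    rng = random.Random(seed)
    # Each observed score = stable ability + chance factors (noise).
    ability = [rng.gauss(70, 5) for _ in range(n_students)]
    test1 = [a + rng.gauss(0, 10) for a in ability]
    test2 = [a + rng.gauss(0, 10) for a in ability]

    # Pick the students with the highest scores on the first test.
    n_top = int(n_students * top_fraction)
    ranked = sorted(range(n_students), key=lambda i: test1[i], reverse=True)
    top = ranked[:n_top]

    pre_mean = statistics.mean(test1[i] for i in top)
    post_mean = statistics.mean(test2[i] for i in top)
    overall_mean = statistics.mean(test1)
    return pre_mean, post_mean, overall_mean

pre_mean, post_mean, overall_mean = simulate_regression()
print(f"Selected group, test 1: {pre_mean:.1f}")
print(f"Selected group, test 2: {post_mean:.1f}")
print(f"All students, test 1:   {overall_mean:.1f}")
```

Because the drop appears without any program, a lower posttest mean for students chosen for their extreme pretest scores cannot by itself show that an enrichment program was ineffective.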
Threats to Internal Validity, continued
Subject attrition
When participants are lost from the study (attrition), the group
equivalence formed at the start of the study may be destroyed.
Differences between treatment and control groups at the end of
the study may be due to natural differences in those who remain
in each group.
Subject Attrition, continued
Suppose an exercise program is offered to employees who
would like to lose weight.
At Time 1, N = 50
M weight = 225 pounds
At Time 2, N = 25 (25 drop out of study)
Suppose the 25 who stayed in program weighed, on average,
150 pounds at Time 1
Did the exercise program help people to lose weight?
[Figure: Mean Weight (pounds). Time 1 = 225, Time 2 = 150]
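The arithmetic behind the attrition example can be worked through directly. This is a minimal sketch using the numbers on the slide (N = 50 and M = 225 at Time 1; the 25 who stayed averaged 150 at Time 1) and the Time 2 mean of 150 shown in the chart; the function and variable names are illustrative only.

```python
def dropout_mean(n_start, mean_start, n_stayers, mean_stayers_start):
    """Time 1 mean of those who later dropped out, from the group totals."""
    n_dropouts = n_start - n_stayers
    total_start = n_start * mean_start
    total_stayers = n_stayers * mean_stayers_start
    return (total_start - total_stayers) / n_dropouts

# Values from the slide; the Time 2 mean is read from the chart.
n_start, mean_time1_all = 50, 225
n_stayers, mean_time1_stayers = 25, 150
mean_time2_stayers = 150

# Comparing Time 2 with the full Time 1 group mixes different people.
naive_change = mean_time2_stayers - mean_time1_all
# Comparing the same 25 people at both times isolates actual change.
stayer_change = mean_time2_stayers - mean_time1_stayers
dropouts_time1 = dropout_mean(
    n_start, mean_time1_all, n_stayers, mean_time1_stayers
)

print(naive_change)     # -75
print(stayer_change)    # 0
print(dropouts_time1)   # 300.0
```

The apparent 75-pound loss comes entirely from who remained in the study: the people who dropped out averaged 300 pounds at Time 1, and the stayers did not change.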
Threats to Internal Validity, continued
Selection
Occurs when differences exist between individuals in treatment
and control groups at the start of a study
These differences become alternative explanations for any
differences observed at the end of the study
Random assignment controls the selection threat
Selection, continued
Suppose a community recycling program is tested. Individuals
who are interested in recycling are encouraged to participate.
Evaluation: Compare the weight of garbage (i.e., not recycled)
from participants in the program with weight of garbage from
those not in the new program.
Can we tell if the new recycling program is effective?
[Figure: Recycle Program (mean lbs/week). In = 15, Not In = 35]
Threats to Internal Validity, continued
Additive effects with selection
When one group of participants in an experiment
Responds differently to an external event (history)
Matures at a different rate
Is measured more sensitively by a test (instrumentation)
These threats (rather than treatment) may account for any group
differences at the end of a study.
Additive effects with selection, continued
Suppose School A starts a program (X) to prevent alcohol abuse
on campus (Week 4). The DV is number of alcohol-related
infractions in student residences.
School B is a comparison.
During Week 4 the newspaper at School A reports a student
death due to intoxication (“local history effect”).
Is the program effective?
[Figure: # Infractions by week. Plotted values for weeks 1, 2, 3, 4, X, 5, 6, 7, 8. School A: 9, 8, 8, 9, 1, 3, 2, 3, 2. School B: 8, 10, 9, 7, 8, 7, 10, 9, 9]
Threats to Internal Validity, continued
With no comparison group, must rule out:
history, maturation, testing, instrumentation, regression, subject
attrition, selection
When there is a comparison group, you must rule out these
threats:
selection, additive effects with selection
Adding a comparison group helps rule out many threats to
internal validity.
Stretching Exercise, page 323
Threats to Internal Validity, continued
Threats even true experiments may not eliminate
Contamination
resentment, rivalry, diffusion of treatments
Experimenter expectancy effects
Novelty effects (including Hawthorne effect)
Threats to external validity
Treatment effects may not generalize
Best way to assess external validity: replication
Quasi-Experiments
“Quasi-” (resembling) experiments
Important alternative when true experiments are not possible
Lack the high degree of control found in true experiments
Often no random assignment
Researchers must seek additional evidence to eliminate threats
to internal validity
The One-Group Pretest-Posttest Design
“Bad experiment” or “pre-experimental design”
Intact group is selected to receive a treatment
e.g., a classroom of children, a group of employees
Pretest is 1st Observation (O1)
Treatment is implemented (X)
Posttest is 2nd Observation (O2)
O1 X O2
One-Group Pretest-Posttest Design, cont.
O1 X O2
None of the threats to internal validity are controlled.
Any change between pretest (O1) and posttest (O2) may be due
to treatment (X) or
History
Maturation
Testing
Or instrumentation, regression, subject attrition, selection
Quasi-Experimental Designs
Nonequivalent Control Group Design
A group similar to the treatment group serves as a comparison
group
Obtain pretest and posttest measures for individuals in both
groups
Random assignment to groups is not used
Pretest scores are used to determine whether the groups are
equivalent
Equivalent only on this dimension
Nonequivalent Control Group Design, continued
O1 X O2 ← treatment group (X = treatment)
--------------------------------------------------
O1 O2 ← nonequivalent control group
O1 = pretest, O2 = posttest
Nonequivalent Control Group Design, continued
Example: Does taking a research methods course improve
reasoning ability?
Compare students in research methods and developmental
psychology courses
DV: 7-item test of methodological and statistical reasoning
ability
Suppose group differences are observed at the posttest
Nonequivalent Control Group Design, continued
By adding a comparison group, rule out these threats to internal
validity:
history
maturation
testing
instrumentation
regression
Assume that these threats affect both groups equally; therefore,
they can’t be used to explain posttest differences.
[Figure: Mean Reasoning Score. Methods: Pre = 3, Post = 5. Developmental: Pre = 2.5, Post = 2.75]
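One way to read the chart is to compare each group's pretest-to-posttest gain. The sketch below uses the four means shown in the chart; the dictionary layout and function name are illustrative assumptions, not part of the original study.

```python
# Mean reasoning scores as shown in the slide's chart.
scores = {
    "methods": {"pre": 3.0, "post": 5.0},
    "developmental": {"pre": 2.5, "post": 2.75},
}

def gain(group):
    """Posttest mean minus pretest mean for one group."""
    return scores[group]["post"] - scores[group]["pre"]

treatment_gain = gain("methods")
comparison_gain = gain("developmental")

# If history, maturation, testing, instrumentation, and regression affect
# both groups equally, they should show up in both gains and cancel here.
difference_in_gains = treatment_gain - comparison_gain

# The groups already differ at pretest, which is why selection remains a threat.
pretest_gap = scores["methods"]["pre"] - scores["developmental"]["pre"]

print(treatment_gain, comparison_gain, difference_in_gains, pretest_gap)
```

A larger gain in the methods group is consistent with a treatment effect, but the pretest gap is a reminder that the groups were not equivalent to begin with.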
Nonequivalent Control Group Design, continued
What threats are not ruled out?
Selection
Without random assignment to conditions, the two groups are
probably not equivalent on many dimensions.
These preexisting differences may account for group differences
at the posttest.
Nonequivalent Control Group Design, continued
Additive effects with selection
The two groups
May have different experiences (selection X history or “local
history effect”)
May mature at different rates (selection X maturation)
May be measured more or less sensitively by the instrument
(selection X instrumentation)
May drop out of the study (courses) at different rates
(differential subject attrition)
May differ in terms of regression to the mean (differential
regression)
Quasi-Experiments, continued
Simple Interrupted Time Series Design
Observe a DV for some time before and after a treatment is
introduced.
Archival data are often used.
Look for clear discontinuity in the time-series data for evidence
of treatment effectiveness.
O1 O2 O3 O4 X O5 O6 O7 O8
Simple Interrupted Time-Series Design, cont.
Example: Study habits
Intervention: An instructional course to change students’ study
habits
Implemented during summer following the sophomore year
(after semester 4)
DV: semester GPA
Suppose a discontinuity is observed when the treatment (X) is
introduced
Simple Interrupted Time-Series Design, cont.
What threats can be ruled out?
Maturation: assume maturational changes are gradual, not
abrupt
Testing (GPA): if testing influences performance, these effects
are likely to show up in initial observations (before X)
Testing effects less likely with archival data
Regression: if scores regress to the mean, they will do so in
initial observations
[Figure: Mean GPA by semester, with a discontinuity after semester 4. Semesters 1-8: 1.9, 2, 1.75, 2, 3, 3.25, 3, 3.5]
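A discontinuity can be described numerically as well as visually. The sketch below uses the eight semester GPA means plotted in the chart (1.9, 2, 1.75, 2, 3, 3.25, 3, 3.5) and compares the jump at the treatment with the ordinary semester-to-semester changes before it. The summary function is a hypothetical helper, not a formal time-series analysis.

```python
import statistics

# Mean GPA for semesters 1-8, as plotted in the chart; X follows semester 4.
gpa = [1.9, 2, 1.75, 2, 3, 3.25, 3, 3.5]
n_pre = 4

def summarize_interruption(series, n_pre):
    """Compare the level before and after the treatment point."""
    pre, post = series[:n_pre], series[n_pre:]
    # Size of ordinary fluctuations between adjacent pre-treatment points.
    pre_steps = [abs(b - a) for a, b in zip(pre, pre[1:])]
    return {
        "pre_mean": statistics.mean(pre),
        "post_mean": statistics.mean(post),
        "jump_at_x": post[0] - pre[-1],
        "largest_pre_step": max(pre_steps),
    }

summary = summarize_interruption(gpa, n_pre)
for name, value in summary.items():
    print(f"{name}: {value:.4f}")
```

Here the jump at X (1 grade point) is several times larger than any change between adjacent pre-treatment semesters (at most 0.25), which is the kind of abrupt shift the design looks for.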
Quasi-Experiments, continued
Time Series with Nonequivalent Control Group Design
Add a comparison group to the simple time-series design
O1 O2 O3 O4 X O5 O6 O7 O8
--------------------------------------------------------------
O1 O2 O3 O4 O5 O6 O7 O8
Time Series with Nonequivalent Control Group Design,
continued
Example: Study habits
Suppose a nonequivalent control group is added—these students
don’t participate in the study habits course
Who could be in the comparison group?
What threats would you be able to rule out?
[Figure: Mean GPA by semester (1-8). Treatment: 1.9, 2, 1.75, 2, 3, 3.25, 3, 3.5. Control: 2, 1.9, 2.1, 2, 2, 2.1, 2.25, 2]
An Example
Study to determine if well-being is increased if nursing home
residents are given the opportunity to make daily personal
decisions (how their room is arranged, visits, movie choices)
Two groups: choice group and no-choice group
Assignment to groups was done by floor in a nursing home
These floors were chosen due to similarity in the residents’
physical and psychological health and prior SES
Questionnaires administered 1 week before and three weeks into
the study
Staff members rated residents before and after treatment
(alertness, sociability, and activity)
Contest—guess the number of jelly beans in a jar.
What is the independent variable?
What is/are the dependent variable(s)?
What type of quasi-experimental design?
Which threats to internal validity are controlled?
Which threats are not controlled?
Research Methods in Psychology
Observation
Observational Research
Researchers cannot observe
All of a person’s behavior
All people’s behavior
Researchers can observe
Samples of individuals
Samples of behavior at particular times
Samples of different settings and conditions
Observational Research
Goal of sampling behavior
Represent larger population of
Behaviors
People
Settings and conditions
Observational Research
Example:
How many hours of television did you watch last week?
Is this number representative of how much you typically watch
tv?
Is the average for the class representative of the number of
hours of tv watched by
all students on campus?
all college students?
all people?
Observational Research
Use data from a sample to represent the population
“Generalize” the findings from sample to population
Sample must be similar to population
External validity
Extent to which a study’s findings may be used to describe
people, settings, conditions beyond those used in the study.
Sampling Behavior
Extent to which observations may be generalized (external
validity)
Depends on how behavior is sampled
Two methods
Time sampling
Situation sampling
Goal: obtain representative sample of behavior
Sampling Behavior, continued
Time Sampling
Choose time intervals for observations
Systematic (first day of each week; third hour of every day;
9:00, 11:00, 1:00 during school day)
Random (random day each week, random hour during the day,
three random ½ hour periods during school day)
EAR (electronically activated recording; every 12.5 minutes, 30
seconds of recording)
Don’t use time sampling for observing rare events (might miss
them)
Use event sampling instead (animals eating; museum patrons
interacting with exhibits; player shooting foul shots)
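The difference between systematic and random time sampling can be sketched as two ways of choosing observation periods from the same list of slots. The example below assumes a hypothetical school day running from 8:00 to 15:00 divided into half-hour periods; the function names and default values are illustrative only.

```python
import random

def half_hour_slots(start_hour=8, end_hour=15):
    """Return the start times of all half-hour periods as 'HH:MM' strings."""
    slots = []
    for minutes in range(start_hour * 60, end_hour * 60, 30):
        slots.append(f"{minutes // 60:02d}:{minutes % 60:02d}")
    return slots

def systematic_sample(slots, every=4, offset=2):
    """Systematic time sampling: take every nth slot from a fixed start."""
    return slots[offset::every]

def random_sample(slots, k=3, seed=None):
    """Random time sampling: draw k distinct slots by chance."""
    rng = random.Random(seed)
    return sorted(rng.sample(slots, k))

slots = half_hour_slots()
print(systematic_sample(slots))      # ['09:00', '11:00', '13:00']
print(random_sample(slots, seed=0))  # three randomly chosen half-hour periods
```

With these defaults the systematic schedule reproduces the 9:00, 11:00, 1:00 pattern from the slide, while the random schedule changes with the seed.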
Sampling Behavior, continued
Situation Sampling
Choose different settings, circumstances, conditions for
observations
If we want to examine how “considerate” a person is, we would
observe that person in several different situations.
What if there are too many behaviors to observe (food
selections in dining hall)?
Use subject sampling to observe only some individuals within a
situation (rules about probability sampling still apply—random
subject sampling of some form would be best).
Exercise
If you wanted to investigate the number and nature of disruptive
behaviors in college classes and how they change over the
semester at Albertus Magnus College, how would you do that?
What type of sampling would you use (and why)?
What if you wanted to investigate the same topic above in
college classes in general?
Classification of Observational Methods
Direct Observation: Observation without Intervention; Observation with Intervention (Participant Observation, Structured Observation, Field Experiment)
Indirect (Unobtrusive) Observation: Physical Traces; Archival Records
Direct Observation without Intervention
Naturalistic Observation
Observation in natural (real-world) setting
No attempt to intervene or change situation
Expert teacher example
Goals
Describe behavior as it normally occurs (bullying)
Examine relationships among naturally occurring variables
Establish external validity of lab findings
Correlation between bullying and establishing relationships
Use when ethical considerations prevent experimental
manipulation (bullying effects on developing peer relationships)
Direct Observation with Intervention
Characterizes most psychological research
Gain control over observations
Three methods in natural settings
Participant observation (note reactivity)
Undisguised—e.g., person gets permission to live with tribe to
observe and record their activities
Disguised—e.g., participants sought admission to psychiatric
hospital complaining of one symptom
Structured observation—between non-intervention and field
experiment; inattentional blindness example
Field experiment—one or more IVs manipulated in natural
setting (clown vs. skateboard)
Indirect (Unobtrusive) Observational Methods
Examine evidence of past behavior
Nonreactive
Two types of methods
Physical traces
Use (natural or controlled) traces
Cigarettes in ashtray; recyclables in garbage; highlighting in
textbook; food left on a plate
Products
Tattoos; bumper stickers; portion size of meals
Archival records
Running records; episodic records
Indirect (Unobtrusive) Observational Methods (continued)
Archival records—public and private documents describing
activities of individuals, groups, institutions, and governments
Running records—those that are continuously kept and updated
Status updates on Facebook; stock market; price of oil; records
of sports teams
episodic records—describe specific events or episodes
Birth certificate; marriage license; subpoena; divorce filing
One can examine the impact of the above events on behavior
(absenteeism, grades, detentions/suspensions)
Unobtrusive Measures
Possible problems in archival records
Selective deposit—not all information is recorded (politicians
speaking to media; Facebook best foot forward)
Selective survival—not all information is kept over time (advice
columnists don’t keep all letters; parents don’t keep all of kids’
grades/artwork)
Spurious relationships—2D:4D finger ratio; ice cream sales and
shark attacks
Measurement Scales
Nominal
Categorize behaviors, events, people
Hair color; height; walking (alone, pairs, listening to music,
playing on phone)
Ordinal
Rank-order behaviors
Least favorite to favorite; fastest to slowest; class rank
Measurement Scales (continued)
Interval
Has values that are meaningful and equally spaced
Temperature; Time on a clock; Likert scale (?)
Ratio
Has equally spaced values and an absolute zero, so ratios of
scale values are meaningful.
Age; ruler measurements; income; response time
Measurement Scales (continued)
Brand of phone you use
Scale to measure weight
Number on a baseball jersey
Miles per hour
Golf score (in relation to par)
Top 25 poll in college football
Eye color
Letter grade in class
Military rank
IQ tests
Number of times getting out of seat
Social security number
Measurement Scales (continued)
Brand of phone you use: Nominal
Scale to measure weight: Ratio
Number on a baseball jersey: Nominal
Miles per hour: Ratio
Golf score (in relation to par): Interval
Top 25 poll in college football: Ordinal
Eye color: Nominal
Letter grade in class: Ordinal
Military rank: Ordinal
IQ tests: Interval
Number of times getting out of seat: Ratio
Social security number: Nominal
Analysis of Observational Data
Method for analysis depends on
Goal of the study
How data are recorded
Measurement scale
Two types of analysis
Qualitative
Quantitative
Analysis of Observational Data, continued
Qualitative Analysis
Data reduction to summarize comprehensive records
Coding: identify units of behavior (including categories or
themes) using specific criteria
Emphasis on verbal summary
Analysis of Observational Data, continued
Quantitative Analysis
Statistical summary of observations
Descriptive statistics depend on measurement scale
Nominal: relative frequency
Ordinal: rank percentages (e.g., ranking priorities for
government action such as education and the economy)
Interval and ratio: mean, standard deviation
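The link between measurement scale and descriptive statistic can be sketched for the nominal and ratio cases. The data below are hypothetical (eight observations of how people are walking, and five response times in seconds); only the choice of statistic per scale comes from the slide, and the ordinal case is left out for brevity.

```python
from collections import Counter
import statistics

def relative_frequencies(categories):
    """Nominal data: proportion of observations in each category."""
    counts = Counter(categories)
    total = len(categories)
    return {category: count / total for category, count in counts.items()}

# Hypothetical nominal observations (categories borrowed from the walking example).
walking = ["alone", "pairs", "alone", "phone", "alone", "pairs", "music", "alone"]

# Hypothetical ratio observations: response times in seconds.
response_times = [1.0, 2.0, 3.0, 4.0, 5.0]

freqs = relative_frequencies(walking)
mean_rt = statistics.mean(response_times)
sd_rt = statistics.stdev(response_times)  # sample standard deviation

print(freqs)
print(mean_rt, round(sd_rt, 3))
```

Computing a mean of the walking categories would be meaningless, which is why the scale has to be identified before the statistic is chosen.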
Analysis of Observational Data, continued
Interobserver reliability
Measure of agreement between observers
Nominal: percent agreement
Ordinal: Spearman rank-order correlation
Interval and Ratio: Pearson correlation
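For nominal data, percent agreement is the number of observation periods on which the observers agree, divided by the total number of periods, times 100. The sketch below uses hypothetical yes/no records for two observers over ten periods; the data and function name are assumptions made for illustration.

```python
def percent_agreement(observer_a, observer_b):
    """Percentage of observation periods on which two observers agree."""
    if len(observer_a) != len(observer_b):
        raise ValueError("Both observers must record the same number of periods.")
    if not observer_a:
        raise ValueError("At least one observation period is required.")
    agreements = sum(a == b for a, b in zip(observer_a, observer_b))
    return 100.0 * agreements / len(observer_a)

# Hypothetical records: was the child behaving aggressively in each period?
observer_1 = ["yes", "no", "no", "yes", "no", "no", "yes", "no", "no", "no"]
observer_2 = ["yes", "no", "yes", "yes", "no", "no", "no", "no", "no", "no"]

agreement = percent_agreement(observer_1, observer_2)
print(agreement)  # 80.0
```

The observers disagree on two of the ten periods, so agreement is 80%. Whether a given value counts as acceptable depends on the standard adopted for the study.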
Analysis of Observational Data, continued
Factors that affect interobserver reliability
Characteristics of the observers
Bored, tired, amount of experience
Train observers and provide feedback
Clearly define events and behaviors to be observed
Clear operational definitions
Provide examples
Thinking Critically About Observational Research
Problems in observational research
Influence of the observer on behavior
Observer bias
Thinking Critically About Observational Research, continued
Influence of the Observer
Reactivity: people change their usual behavior when they know
they’re being observed.
Researchers want to observe people’s usual behavior.
Demand characteristics: people pay attention to cues and
information in the situation to guide their behavior.
Thinking Critically About Observational Research, continued
Controlling reactivity
Conceal observer (videotape, one-way mirror)
Disguised participant observation (cell phone study)
Use indirect (unobtrusive) observation (use traces, products,
archival data)
Adapt participants to observer (lesson study)
Habituation
Reactivity is a potential problem in most psychological
research.
Thinking Critically About Observational Research, continued
Observer bias
Observers often have expectations about behavior.
Example: expectations based on research hypotheses
Expectations can lead observers to look at only particular
behaviors
Example from tipping behavior study