You developed a new 3-item survey measure, the "Attitude Towards Large Computer Monitors" (ATLCM) scale, to assess how participants feel about large computer monitors. To validate it, you run an experiment in which participants are randomly assigned to watch either 2 hours or 0 hours of TV per day for a week and then complete the ATLCM scale. Initial results showing no difference in ATLCM scores between conditions suggest the measure may lack the sensitivity to detect attitude changes from the TV-viewing manipulation. Further validation work is needed to establish the reliability and validity of the ATLCM scale before using it to assess relationships between variables.
1. IS 4800 Empirical Research Methods
for Information Science
Class Notes Feb 8, 2012
Instructor: Prof. Carole Hafner, 446 WVH
hafner@ccs.neu.edu Tel: 617-373-5116
Course Web site: www.ccs.neu.edu/course/is4800sp12/
2. Outline
– Assignment 2: Relational Agents for Patient Education study
– Assignment 3: Descriptive Statistics Report
– Review for test
– Team project 1
– Survey research (cont.)
– Questionnaire construction
– Composite measures
– Validity and reliability
3. Assignment 2: Points to mention
– Respect for persons
– Subjects can opt out; verbal and written consent obtained
– Refusal will not impact medical care (voluntary)
– Study described in detail in recruitment letter (informed)
– Gives procedures to ensure confidentiality
– Participants given a number to call if they have concerns
– Beneficence
– Little or no risk
– Potential for significant public benefit – what benefits:
• Benefit to all diabetes patients
• Use of relational agents for educating elderly/minority/low-literacy people
– Justice
– Participants may benefit personally (health + $)
– Minority patients in urban areas have 3x higher rates of low health literacy and therefore represent a class that would benefit the most.
4. Assignment 2: more points to mention
– Data safety & monitoring plan
– Independent oversight ensures the plan is followed
– Provides extra protection for poor/minority patients (justice)
– Point of the Study Subjects section
– Documents inclusion/exclusion criteria
– Demonstrates there is a sufficient sample size
– Shows the disabled are not over-burdened (justice)
– HIPAA issues
– Use of data to pre-select without consent
– "Opt-out" initial consent process
– Use of phone interview to collect more data
5. Assignment 3
– Results were disappointing
– Frequency tables are only meaningful for categorical measures (gender and job category) unless you create intervals for numeric data
– Histograms are meaningful for numeric measures (experience, call time, customer satisfaction)
– Crosstabs: apparently many could not figure out how to get percentages
– Most were able to get the scatter plot
– About half did the Custom Tables
– Grade of B for all the requested stats plus a minimal discussion
6. Types of Questionnaire Items
• Restricted (closed-ended)
– Respondents are given a list of alternatives and check the desired alternative
• Open-ended
– Respondents are asked to answer a question in their own words
• Partially open-ended
– An "Other" alternative is added to a restricted item, allowing the respondent to write in an alternative
7. Types of Questionnaire Items
• Rating scale
– Respondents circle a number on a scale (e.g., 0 to 10) or check a point on a line that best reflects their opinion
– Two factors need to be considered:
• Number of points on the scale
• How to label ("anchor") the scale (e.g., endpoints only or each point)
• Ranking question
8. Types of Questionnaire Items
– A Likert scale is a scale used to assess attitudes
• Respondents indicate their degree of agreement or disagreement with a series of statements
• I am happy.
  Disagree 1 2 3 4 5 6 7 Agree
– A semantic differential scale allows participants to provide a rating within a bipolar space
• How are you feeling right now?
  Sad 1 2 3 4 5 6 7 Happy
10. Psychological Concepts, aka "Constructs"
– Constructs are general codifications of experience and observations.
– Observe differences in social standing -> concept of social status
– Observe differences in religious commitment -> concept of religiosity
– Most psychological constructs have no ultimate definitions
– Constructs are ad hoc summaries of experience and observations
11. Composite Measures
– Indexes (aka "scales") provide an ordinal ranking of respondents with respect to a construct of interest (e.g., liking of computers)
– Usually assessed through a series of related questions
12. Composite Measures
– It is seldom possible to arrive at a single question that adequately represents a complex variable.
– Any single item is likely to misrepresent some respondents (e.g., church-going)
– A single item may not provide enough variation for your purposes.
– Single items give crude assessments; several items give a more comprehensive and accurate assessment.
14. Operationalization
– The process of specifying empirical observations that are indicators of the concept of interest
– Begin by enumerating all the subdimensions ("factors") of the concept
– Review previous research
– Use common sense
15. Example: religiosity
– Subdimensions/indicators/factors:
– Ritual involvement
• e.g., going to church
– Ideological involvement
• Acceptance of religious beliefs
– Intellectual involvement
• Extent of knowledge about religion
– Experiential involvement
• Range of religious experiences
– Consequential involvement
• Extent to which religion guides social decisions
– (there are many others)
16. Discriminant indicators
– Also think about related measures that should not be indicators of your construct
– In particular, if you will be measuring another related variable, make sure none of your indicators includes any attributes of it.
– Example
– You want to study the relationship between religiosity and attitudes towards war => including a question about adherence to a "peace on earth" doctrine is not a good idea.
17. Picking items for a Composite
– Face validity
– Unidimensionality
– All items measure the same concept
– Items should provide variance in responses
– Don't pick items that classify everyone one way.
– If you are interested in a binary classification (e.g., liberal vs. conservative), each item should split respondents roughly in half
– Negate up to half of the items to avoid response bias.
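Before negatively worded items can be averaged with the rest, they must be reverse-scored so all items point in the same direction. A minimal sketch, assuming a 1-7 Likert response (the function name and values are hypothetical):

```python
# Reverse-score a negatively worded item on a bounded Likert scale:
# a response of `high` maps to `low`, and vice versa.
def reverse_score(x, low=1, high=7):
    return low + high - x

print(reverse_score(7))  # -> 1 (strong agreement with a negated item)
print(reverse_score(2))  # -> 6
print(reverse_score(4))  # -> 4 (the midpoint maps to itself)
```

After this step, a high composite average consistently means a more favorable attitude.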
18. Picking items: bivariate analysis
– Every pair of items should be related, but not too strongly
– Scoring high on item A should increase the likelihood of scoring high on item B
– But if two items are perfectly correlated (e.g., one logically implies the other), then one can be dropped.
– Also look at combinations of more than two items to ensure that each provides additional information.
19. Scoring a Composite Measure
– Average the item scores
– Weight items equally unless you have a compelling reason to do otherwise
– Missing data options:
– Omit the respondent's data
– Impute an average/intermediate score
– "Last value carried forward" for repeated measures
– Many other strategies
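Equal-weight scoring with one of the simpler missing-data strategies (imputing the respondent's own average for skipped items) can be sketched as follows; the function name and data are hypothetical, with None marking a skipped question:

```python
# Score one respondent's composite as the mean of their answered items.
def score_composite(items):
    answered = [x for x in items if x is not None]
    if not answered:  # all items missing: omit this respondent instead
        return None
    # Averaging only the answered items is equivalent to imputing the
    # person's own mean for each missing item and then averaging all items.
    return sum(answered) / len(answered)

print(score_composite([5, 7, None]))        # -> 6.0
print(score_composite([4, 4, 4]))           # -> 4.0
print(score_composite([None, None, None]))  # -> None
```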
21. Designing a Composite Measure
– Literature review: previous measures, theoretical concepts
– Brainstorm on factors
– Brainstorm on items
– Preliminary reliability/validity testing
– Factor analysis
– Reliability testing
– Validity testing
22. Validity and Reliability
– Reliability of a measure
– Validity of a measure
– Especially composite measures of constructs
– Validity of claims about the association of IV and DV
– Internal
– External
23. Internal Validity
– INTERNAL VALIDITY is the degree to which your design tests what it was intended to test
– In an experiment, internal validity means showing that the observed difference in the dependent variable is truly caused by changes in the independent variable
– In correlational research, internal validity means that observed differences in the value of the criterion variable are truly related to changes in the predictor variable
– Internal validity is threatened by extraneous and confounding variables
– Internal validity must be considered during the design phase of research
24. External Validity
– EXTERNAL VALIDITY is the degree to which results generalize beyond your sample and research setting
– External validity is threatened by the use of a highly controlled laboratory setting, restricted populations, pretests, demand characteristics, experimenter bias, and subject selection bias (such as volunteer bias)
– Steps taken to increase internal validity may decrease external validity, and vice versa
– Internal validity may be more important in basic research; external validity, in applied research
25. Factors Affecting External Validity
– Reactive testing: a pretest may affect reactions to an experimental variable.
– Interactions between selection biases and the independent variable: results may apply only to subjects representing a unique group.
– Reactive effects of experimental arrangements: artificial experimental manipulations, or the subject's knowledge that he or she is a research subject, may affect results.
– Multiple treatment interference: exposure to early treatments may affect responses to later treatments.
26. Internal vs. External Validity of a Study
• Internal:
• appropriate methods (well designed)
• conducted properly
• data analyzed correctly
• correct inference
• replicability: could someone else conduct your study and get the same result?
• External:
• generalizability
27. Extraneous and Confounding Variables (impact on internal validity)
– Extraneous variable: influences the DV.
– Confounding variable: correlated with the IV and influences the DV (e.g., ice cream sales and drowning deaths).
– The most dangerous type of extraneous variable
– Must be considered during the design of a study
28. Examples
– Confounding variable (very difficult to address)
– A study of the effect of larger vs. smaller monitors on performance: the larger monitors also have better speakers (correlated with the IV). Perhaps the performance difference is due to the speakers.
– Other extraneous variables (can be addressed by sample restriction, matched group assignment, or statistical methods)
– Task time on two word processors: typing skill. Can control by using only subjects with one skill level, matching skill levels among groups, or multivariate analysis.
30. Example
– You want to evaluate a new sensor to detect whether people are happy or not.
– You hire actors and randomly assign them to act happy or sad, and test your sensor on them.
– What kind of validity (internal/external) might be challenged?
31. Example
– You conduct the "Conversational Agents to Promote Health Literacy" study by assigning the first 30 patients who volunteer to the intervention group and the next 30 to the control group.
– What kind of validity (internal/external) might be challenged?
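Assigning by sign-up order lets early volunteers (who may differ systematically from later ones) cluster in one condition. Random assignment breaks that link; a minimal sketch with hypothetical patient IDs:

```python
import random

random.seed(42)                  # fixed seed only so the sketch is repeatable
volunteers = list(range(1, 61))  # 60 hypothetical patient IDs, in sign-up order
random.shuffle(volunteers)       # order no longer tracks sign-up time
intervention, control = volunteers[:30], volunteers[30:]

print(len(intervention), len(control))  # 30 30
```

Each patient now has the same chance of landing in either group regardless of when they volunteered.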
32. Research Settings
– The laboratory setting
– Affords the greatest control over extraneous variables
– Simulations
• Attempt to recreate the real world in the laboratory
• Realism is an issue
– The field setting
– Study conducted in a real-world environment
• Field experiment: manipulate variables in the field
• High degree of external validity, but internal validity may be low
34. What is a validated measure?
– Has reliability
– Has validity
– For psychological measures, these are collectively referred to as a measure's "psychometrics".
35. Measure Reliability
– A reliable measure produces similar results when repeated measurements are made under identical conditions
– Reliability can be established in several ways:
• Test-retest reliability: administer the same test twice
• Parallel-forms reliability: alternate forms of the same test are used
• Split-half reliability: parallel forms are included on one test and later separated for comparison
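For split-half reliability, the correlation between the two halves understates the reliability of the full-length test, so it is conventionally stepped up with the Spearman-Brown formula; a sketch assuming a hypothetical half-half correlation of 0.6:

```python
# Spearman-Brown step-up: estimate full-test reliability from the
# correlation r between two test halves.
def spearman_brown(r_half):
    return 2 * r_half / (1 + r_half)

print(spearman_brown(0.6))  # -> 0.75
```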
36. Reliability
– For surveys, this also encompasses internal consistency:
– Do all of the questions address the same underlying construct of interest?
– That is, do scores covary?
– A standard measure is Cronbach's alpha
• 0 = no correlation
• 1 = scores always covary in the same way
• 0.7 is used as the conventional threshold
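Cronbach's alpha can be computed directly from the item variances and the variance of the total score: alpha = k/(k-1) * (1 - sum of item variances / total-score variance). A self-contained sketch with hypothetical responses to a 3-item, 7-point scale:

```python
# Cronbach's alpha for a k-item scale; each row is one respondent's answers.
def cronbach_alpha(rows):
    k = len(rows[0])

    def variance(xs):  # sample variance (n - 1 denominator)
        m = sum(xs) / len(xs)
        return sum((x - m) ** 2 for x in xs) / (len(xs) - 1)

    item_vars = [variance([r[i] for r in rows]) for i in range(k)]
    total_var = variance([sum(r) for r in rows])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)

# Hypothetical data: items move together, so alpha should be high.
data = [(7, 6, 7), (2, 3, 2), (5, 5, 6), (1, 2, 1), (6, 7, 6)]
print(round(cronbach_alpha(data), 2))  # -> 0.98, well above 0.7
```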
37. Increasing the Reliability of a Questionnaire
– Check that the items on your questionnaire are clearly written and appropriate for those who will complete it
– Increase the number of items on your questionnaire
– Standardize the conditions under which the test is administered (e.g., timing procedures, lighting, ventilation, instructions)
– Score your questionnaire carefully, eliminating scoring errors
38. Volunteer Bias
• How can it affect external validity?
• Characteristics of volunteers?
• How do you address volunteer bias?
39. Characteristics of Individuals Who Volunteer for Research
Maximum confidence: volunteers
1. tend to be more highly educated than nonvolunteers
2. tend to come from a higher social class than nonvolunteers
3. are of higher intelligence in general, but not when volunteering for atypical research (such as hypnosis or sex research)
4. have a higher need for approval than nonvolunteers
5. are more social than nonvolunteers
40. Considerable confidence
1. Volunteers are more "arousal seeking" than nonvolunteers (especially when the research involves stress)
2. Individuals who volunteer for sex research are more unconventional than nonvolunteers
3. Females are more likely to volunteer than males, except when the research involves physical or emotional stress
4. Volunteers are less authoritarian than nonvolunteers
5. Jews are more likely to volunteer than Protestants; however, Protestants are more likely to volunteer than Catholics
6. Volunteers tend to be less conforming than nonvolunteers, except when the volunteers are female and the research is clinically oriented
Source: Adapted from Rosenthal & Rosnow, 1975.
41. Remedies for Volunteer Bias
• Make your appeal very interesting
• Make your appeal as nonthreatening as possible
• Explicitly state the theoretical and practical importance of your research
• Explicitly state why the target population is relevant to your research
• Offer a small reward for participation
42. Remedies for Volunteer Bias (cont.)
• Have a high-status person make the appeal for participants
• Avoid research that is physically or psychologically stressful
• Have someone known to participants make the appeal
• Use public or private commitment to volunteering when appropriate
43. Ecological Validity
– The degree to which a measure corresponds to what happens in the real world.
– Example:
– Assessing productivity/day in the lab vs.
– Assessing productivity/day in the office
44. Concerns with Measures
– Sensitivity
– Is a dependent measure sensitive enough to detect behavior change?
– An insensitive measure will not detect subtle behaviors
– Range effects
– Occur when a dependent measure has an upper or lower limit
• Ceiling effect: when a dependent measure has an upper limit
• Floor effect: when a dependent measure has a lower limit
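A ceiling effect can be illustrated numerically: if true attitudes sit above the scale maximum, clipping them to the ceiling erases a real group difference (all numbers below are hypothetical):

```python
# Clip a "true" score to a 7-point scale's ceiling.
def clip(x, high=7):
    return min(x, high)

true_tv    = [8.5, 9.0, 8.2]  # true attitudes above the scale maximum
true_no_tv = [7.5, 8.0, 7.2]

obs_tv    = [clip(x) for x in true_tv]     # every score becomes 7
obs_no_tv = [clip(x) for x in true_no_tv]  # every score becomes 7

def mean(xs):
    return sum(xs) / len(xs)

print(mean(true_tv) - mean(true_no_tv))  # real 1.0-point difference
print(mean(obs_tv) - mean(obs_no_tv))    # observed difference: 0.0
```

This is exactly the pattern in the tables on the next slides: every observed score piles up at the top of the scale.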
45. Example
– You want to assess the effect of TV viewing on whether people like large computer monitors or not (yes/no).
– You run an experiment in which participants are randomized to watch either 2 hrs or 0 hrs of TV per day for a week, then answer your question.
– What's going on?

Participant  Condition  LikesLargeMonitors
1            TV         Yes
2            No TV      Yes
3            TV         Yes
4            No TV      Yes
46. Developing a New Measure
– Say you decide you need a new survey measure, "attitude towards large computer monitors" (ATLCM):
– I like big monitors.
– Big monitors make me nervous.
– I prefer small monitors, even if they cost more.
– 7-pt Likert scales
– How would you validate this measure?
47. Example
– You want to assess the effect of TV viewing on attitude towards large computer monitors (ATLCM).
– You run an experiment in which participants are randomized to watch either 2 hrs or 0 hrs of TV per day for a week, then fill out the ATLCM.
– What's going on?

Participant  Condition  ATLCM
1            TV         7.0
2            No TV      6.7
3            TV         6.9
4            No TV      7.0
48. Measure Validity
– A valid measure measures what you intend it to measure
– Very important when using psychological tests (e.g., intelligence, aptitude, (un)favorable attitude)
– Validity can be established in a variety of ways:
• Face validity: assessment of adequacy of content; the least powerful method
• Content validity: how adequately does a variable sample the full range of behavior it is intended to measure?
49. Measure Validity (cont.)
• Criterion-related validity: how adequately does a test score match some criterion score? Takes two forms:
– Concurrent validity: does the test score correlate highly with the score from a measure with known validity?
– Predictive validity: does the test predict behavior known to be associated with the behavior being measured?
50. Measure Validity (cont.)
• Construct validity: do the results of a test correlate with what is theoretically known about the construct being evaluated?
– Convergent validity (subtype): measures of constructs that should be related to each other are, in fact, related
– Discriminant validity (subtype): measures of constructs that should not be related are not
51. Example
(Path diagram relating Seniority, MonitorSize, and Productivity)
– Assume we have good evidence for this model of the world.
– We now propose a new measure for Productivity.
– What would be evidence for convergent validity?
– What would be evidence for discriminant validity?
53. Sampling
– You should obtain a representative sample
– The sample closely matches the characteristics of the population
– A biased sample occurs when your sample characteristics don't match population characteristics
– Biased samples often produce misleading or inaccurate results
– They usually stem from inadequate sampling procedures
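Simple random sampling, in which every population member has the same chance of selection, is the baseline procedure for avoiding such bias; a sketch with hypothetical member IDs:

```python
import random

random.seed(0)                          # fixed seed only so the sketch is repeatable
population = list(range(1, 1001))       # 1,000 hypothetical population member IDs
sample = random.sample(population, 50)  # 50 drawn without replacement, equal odds

print(len(sample), len(set(sample)))  # 50 50 (no duplicates)
```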