The application of anchoring vignettes to the EQ-5D-5L: a possible solution to reporting heterogeneity in PROMs
Paula Lorgelly
Deputy Director
Visiting Professor, Division of Cancer Studies, King's College London
9th June 2016
Acknowledgments
• Australian Research Council Discovery Project Grant
(DP110101426)
• Investigators: Paula Lorgelly, Bruce Hollingsworth, Mark Harris,
Nigel Rice, John Wildman, William Greene
• Researchers: Rachel Knott, Nicole Black (Au)
• BankWest Curtin Economics Centre Research Grants Program
• Mark Harris, Nigel Rice, Paula Lorgelly, Rachel Knott
• Faculty of Business and Economics Research Grant Scheme
• Paula Lorgelly, Rachel Knott, Mark Harris
Background
• Individual and household surveys often rely on self-reported
measures of health
• In general, would you say your health is: excellent, very good,
good, fair or poor?
• Analyses using measures of self-reported health (SRH) rely on
the measure being an accurate reflection of the true health of the
groups or individuals concerned
• But responses to questions on subjective scales will be inaccurate
if groups of individuals systematically differ in their use and/or
interpretation of the response categories
• Systematic variation in the use of response categories is known
as reporting heterogeneity or response scale heterogeneity or
differential item functioning (DIF)
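To make the idea concrete, the following minimal sketch simulates two groups with an identical distribution of latent health but different response thresholds. All numbers (cut-points, sample size, seed) and function names are illustrative assumptions, not estimates from any study.

```python
import random
from collections import Counter

CATEGORIES = ["poor", "fair", "good", "very good", "excellent"]

def report(latent_health, thresholds):
    """Map a latent health value to a category given ascending cut-points."""
    idx = sum(latent_health > t for t in thresholds)  # number of cut-points passed
    return CATEGORIES[idx]

def simulate(thresholds, n=10_000, seed=0):
    """Share of each response category for n draws of standard-normal latent health."""
    rng = random.Random(seed)
    counts = Counter(report(rng.gauss(0.0, 1.0), thresholds) for _ in range(n))
    return {c: counts[c] / n for c in CATEGORIES}

# Hypothetical cut-points: group 2 applies stricter standards (higher thresholds).
group1 = simulate(thresholds=[-1.5, -0.5, 0.5, 1.5])
group2 = simulate(thresholds=[-1.0, 0.0, 1.0, 2.0])

# Same seed, so both groups have identical latent health; only reporting differs.
for c in CATEGORIES:
    print(f"{c:>10}: group 1 {group1[c]:.2f}, group 2 {group2[c]:.2f}")
```

Although true health is the same by construction, group 2 reports "excellent" less often, which is the pattern that would mislead a naive comparison of self-reports.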
EQ-5D
• The EQ-5D is the most commonly used instrument for measuring
preference-based health-related quality of life (HRQoL)
• The responses to the five health domains can be used
descriptively as health profiles (e.g. 12112) or converted to a
preference-weighted summary index that reflects health-related
utility (where 0 is dead and 1 is full health) and can therefore be
used to estimate QALYs
• Most commonly used in economic evaluations, but the EQ-5D is
increasingly being used as a measure of population health status
and is included in a number of population health surveys
• When used to measure and compare health profiles or utilities
across sub-groups of the population, the results will be
misleading if groups systematically differ in use of response
categories
• Could the EQ-5D suffer from DIF like other SRH measures?
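As a rough illustration of the profile-to-index step, the sketch below splits a profile such as 12112 into its five domain levels and applies a toy additive scoring rule. The decrements are hypothetical placeholders invented for this example; a real analysis would use a published country value set, which is typically more complex than a simple sum.

```python
DOMAINS = ["mobility", "self-care", "usual activities",
           "pain/discomfort", "anxiety/depression"]

# HYPOTHETICAL utility decrements per level (level 1 = no problems).
# Placeholders for illustration only, not a published value set.
TOY_DECREMENTS = {1: 0.00, 2: 0.05, 3: 0.10, 4: 0.20, 5: 0.30}

def parse_profile(profile):
    """Split a five-digit EQ-5D-5L profile such as '12112' into domain levels."""
    if len(profile) != 5 or any(ch not in "12345" for ch in profile):
        raise ValueError(f"not a valid EQ-5D-5L profile: {profile!r}")
    return dict(zip(DOMAINS, (int(ch) for ch in profile)))

def toy_index(profile):
    """Additive toy index: 1 (full health) minus the sum of level decrements."""
    levels = parse_profile(profile)
    return 1.0 - sum(TOY_DECREMENTS[lvl] for lvl in levels.values())

print(parse_profile("12112"))
print(round(toy_index("12112"), 2))
```

Because the index is computed from the reported levels, any systematic difference in how groups use the response categories carries straight through to the utility values.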
Differential Item Functioning
[Figure: the underlying latent health scale for mobility, running from high mobility to low mobility, drawn separately for Group 1 and Group 2. In each group, thresholds τ1 to τ4 divide the scale into the five response categories (no problems, slight problems, moderate problems, severe problems, unable to walk). Each group's mean health is marked on the scale.]
Programme of research
• ARC-funded project (starting 2011) assessing reporting
behaviour/heterogeneity and its consequences
• Focus on SRH (generic Likert scale) in panel surveys
• Addition of primary research looking at the relatively new
technique of anchoring vignettes
• Limited research considering DIF in the EQ-5D, none using
anchoring vignette technique
• Programme of research
• Necessary to design vignettes, given identifying assumptions
• Explore feasibility of eliciting responses
• Robustly test whether they can be used to adjust for DIF
• Consider broader applications
Anchoring vignettes
• In order to obtain any meaningful comparison between the health
of groups 1 and 2 it is essential to adjust for DIF
• Anchoring vignettes (King et al. 2004) can be used to adjust for
DIF
• Previously used to address DIF in political efficacy,
job/income/life satisfaction, general/specific health measures
• Vignette - a brief health description of a hypothetical individual
• Respondents are asked to rate the health state described by the
vignette using the same ordered categories they use to rate their
own health
• Since the actual level of health of the people in the vignettes is
the same for all respondents, the variation in ratings can be used
to identify and correct for DIF
Anchoring vignettes
• Example of a vignette for the mobility domain:
Belinda walks for one or two kilometres and climbs three flights
of stairs every day without tiring.
Select the one option that best describes Belinda’s mobility:
She has no problems with walking around
She has slight problems with walking around
She has moderate problems with walking around
She has severe problems with walking around
She is unable to walk around
Anchoring vignettes
• Typically, a series of vignettes is presented for each health
construct of interest, at varying levels of severity
• Suppose we give groups 1 and 2 two vignettes to rate, of
differing severity:
• Vignette 1 – limited problems in walking around
• Vignette 2 – more problems in walking around
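One simple way to use two such vignettes, in the spirit of the nonparametric recoding described by King et al. (2004), is to express each respondent's self-rating relative to their own vignette ratings. The sketch below is a simplified illustration: the function name is hypothetical, and returning None for tied or mis-ordered vignette ratings is an assumption made here to keep the example short.

```python
def relative_rating(self_rating, vignette_ratings):
    """Recode a self-rating relative to the respondent's own vignette ratings.

    Ratings use the EQ-5D-5L levels (1 = no problems ... 5 = unable).
    vignette_ratings must be ordered from the mildest to the most severe vignette.
    Returns an integer from 1 to 2J + 1 for J vignettes, or None when the
    respondent's vignette ratings are tied or out of order (a simplification).
    """
    if any(a >= b for a, b in zip(vignette_ratings, vignette_ratings[1:])):
        return None
    position = 1
    for z in vignette_ratings:
        if self_rating < z:       # fewer problems than this vignette
            break
        if self_rating == z:      # same rating as this vignette
            position += 1
            break
        position += 2             # more problems than this vignette
    return position

# Two hypothetical respondents with different raw self-ratings.
lenient = relative_rating(self_rating=2, vignette_ratings=[1, 3])
harsh = relative_rating(self_rating=3, vignette_ratings=[2, 4])
print(lenient, harsh)
```

Both respondents place themselves between the two vignettes, so they receive the same relative rating even though their raw self-ratings differ by one level.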
Anchoring vignettes
[Figure: the underlying latent health scale for mobility (high mobility to low mobility) for Group 1 and Group 2, each with thresholds τ1 to τ4 dividing the five response categories (no problems, slight problems, moderate problems, severe problems, unable to walk). Vignette 1 and Vignette 2 are marked by horizontal dotted lines at the same points on the latent scale for both groups.]
Necessary assumptions
• Vignette equivalence (VE) holds if all respondents interpret the
health states described by the vignettes in the same way and on
the same uni-dimensional scale, aside from random error.
• VE is demonstrated in the example above by the horizontal
dotted lines
• Response consistency (RC) is where respondents rate the health
of the hypothetical people described in the vignettes in the same
way or using the same underlying scale that they would rate their
own health.
• RC would be violated if, for example, respondents rated the
health described by the vignettes either more or less harshly
than they did their own health
Stage 1: Qualitative assessment of RC
• Initial ARC study question
• Research questions:
• Is the rating of vignettes for the EQ-5D-5L feasible?
• How do the vignette ratings compare by version?
• Informal test for VE – is the ordering of vignettes consistent
with global ordering?
• Understand thought process when rating vignettes – do
respondents rate hypothetical individuals in the same way as
themselves? Does RC hold? (qualitative perspective).
Methods – vignette development
• Gary King has a library of vignettes
http://gking.harvard.edu/vign/eg/
• Used these where possible and amended according to the EQ-5D
attributes: mobility, self-care, usual activities, pain/discomfort,
anxiety/depression
• Version A: 15 vignettes, each describing a single health dimension.
Asks EQ-5D-5L by health dimension
• Version B: 3 vignettes, each describing a combined health state.
Asks EQ-5D-5L as a whole, including the VAS
• Respondents asked to rate the health of people in the vignettes
before rating their own health to help with priming
• Vignette names were gender specific
• Respondents were instructed to assume the hypothetical people
were of the same age and background as themselves
Pluralistic research design
• Online survey: socio-demographic questions, long term illnesses,
health seeking behaviour, objective health measures, vignettes
and EQ-5D-5L (+ SAH).
• Randomisation of survey version (A or B)
• Data collection: April to May 2012
• Phase 1: Online survey + face-to-face interview
• Interview to assess survey (clarity of instructions, wording &
formatting) and feasibility of vignette task (clarity of the
vignettes, level of concentration required, and thought
processes).
• Phase 2: Online survey only
• Additional questions on thoughts during vignette rating task
• Subjects: staff, students and people from a database of past
research participants recruited through Monash University online
newsletter and emails.
Results – feasibility
• Interview feedback:
• Survey was straightforward and easy
• Vignettes were easy to understand and the descriptions
seemed real and imaginable.
• 3 younger respondents (aged 18-24) found some scenarios
difficult to imagine for someone their age.
• Version A: one respondent noted the difficulty in rating a
person’s health “…without any other background or other
knowledge of other aspects of their health” (Male, 30-34).
– Highlights trade-off between simplicity of vignettes and lack
of context in a single health dimension description.
• Version B was as easy to understand as version A
• But version B required more concentration than version A
Response consistency
• Did you assume the people in the vignettes were of the same age
and background as yourself?
• Total sample, yes = 69%
• Interview: If no, why?
• “When I read someone more disabled than myself I thought
they were possibly older and if less disabled, possibly
younger”. (Male, 55-59, VA).
• “Most of them seemed older than me. I probably don’t see
many people with those symptoms my age”. (Male, 18-24,
VA)
Response consistency
• Did you imagine yourself in the health state of the people in the
vignettes (at least for some of them)?
• Online only: yes = 77%
• Many in interview also demonstrated this. For example:
• “I pictured myself in that position. It’s easier to judge whether
something is bad or not if it happens to you.” (Male, 18-24,
VA).
• Others in interview took an external view. For example:
• “I didn’t think of myself as them – I thought they were
another person. I rated myself quite separately from the
vignettes” (Female, 25-29, VB)
• “I was trying to think of a view that a medical or paramedic
person would put on it.” (Male, 70+, VA)
Response consistency
• Did you rate your own health on the same scale as the
hypothetical individuals?
• Online only: 39% strongly agree; 39% somewhat agree; 15%
disagree; 6% unsure
• More people in version B strongly agree (50%) than in version A
(29%).
• Suggests that describing a vignette as a whole health state rather
than as independent health dimensions does a better job of
encouraging response consistency.
• Combined responses (interview and online only) suggest 37%
demonstrated response consistency (28% for version A, 46% for
version B).
Stage 1: Summary
• Evidence that vignettes for the EQ-5D-5L are feasible
• Improvements to the wording are needed in order to
improve response consistency
• Health states age neutral
• … imagine yourself …
• Several avenues to explore in future work
• Au and Lorgelly (2014) Anchoring vignettes for health
comparisons: an analysis of response consistency. Quality of Life Research.
• Knott et al (2016) Response scale heterogeneity in the EQ-5D.
Health Economics
Stage 2: Quantitative exploration
• Second ARC study question plus BankWest study
• Research questions:
• Can the anchoring vignette approach be used to identify DIF in
the EQ-5D-5L?
• Does it pass ‘strong’ tests for RC and VE?
• What is the impact of DIF on inter-group comparisons?
Data
• Two online surveys of a representative sample of Australian
residents, recruited via a survey panel company (April 2014 and
Aug/Sept 2015)
• First survey compared versions A and B (and the priming effect);
the second used only version B. Analysis focuses on the version B
vignettes, of which there were two
• Total n=4,095
Vignette 1
• REBECCA/ROB is able to walk distances of up to 500 metres
without any problems but feels puffed and tired after walking one
kilometre or walking up more than one flight of stairs. She/he is
able to wash, dress and groom her/himself, but it requires some
effort due to an injury from an accident one year ago. Her/his
injury causes her/him to stay home from work or social activities
about once a month. Rebecca/Rob feels some stiffness and pain
in her/his right shoulder most days however her/his symptoms
are usually relieved with low doses of medication, stretching and
massage. She/he feels happy and enjoys things like hobbies or
social activities around half of the time. The rest of the time
she/he worries about the future and feels depressed a couple of
days a month.
Vignette 2
• CHRISTINE/CHRIS is suffering from an injury which causes
her/him a considerable amount of pain. She/he can walk up to a
distance of 50 metres without any assistance, but struggles to
walk up and down stairs. She/he can wash her/his face and comb
her/his hair, but has difficulty washing her/his whole body
without help. She/he needs assistance with putting clothes on the
lower half of her/his body. Since having the injury Christine/Chris
can no longer cook or clean the house her/himself, and needs
someone to do the grocery shopping for her/him. The injury has
caused her/him to experience back pain every day and she/he is
unable to stand or sit for more than half an hour at a time.
She/he is depressed nearly every day and feels hopeless. She/he
also has a low self-esteem and feels that she/he has become a
burden.
Data
• Standard socio-demographic questions, self reports of own health
(EQ-5D-5L and SRH), vignettes, additional health questions
• First survey included ‘objective’ health measures to test RC
• Considered heterogeneity in the following groups
• Age, gender, education and country of birth
Econometric Analysis
• Hierarchical ordered probit (HOPIT) model
• Extension of the ordered probit (OP) that allows the inter-category
thresholds to vary by modelling them as a function of covariates
• We estimated five separate HOPITs, one for each domain of the
EQ-5D-5L
• DIF is tested using likelihood ratio (LR) tests that restrict the
coefficients on the threshold covariates to zero
• Impact of DIF on EQ-5D-5L indices assessed by simulation: latent
health is simulated using the estimated parameters of the HOPIT
mean function and the characteristics of each individual, and the
predicted thresholds at the sample means of the covariates are
then applied
• EQ-5D-5L values from Australian DCE (Norman et al, 2013)
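A common way to write the HOPIT model, following King et al. (2004), is sketched below for a single EQ-5D-5L domain. The notation is assumed here for illustration and is not taken from the slides. Categories are indexed from worst (k = 1) to best (k = 5), so that a higher latent value means better health. The linear threshold specification is only one common choice, and the slides do not state which parameterisation was estimated.

```latex
\begin{align*}
H_i^{*} &= X_i'\beta + \varepsilon_i, \quad \varepsilon_i \sim N(0,1) && \text{(own latent health)} \\
V_{ij}^{*} &= \alpha_j + u_{ij}, \quad u_{ij} \sim N(0,\sigma^2) && \text{(latent health of vignette } j\text{)} \\
\tau_i^{k} &= X_i'\gamma^{k}, \quad k = 1,\dots,4 && \text{(inter-category thresholds)} \\
y_i &= k \iff \tau_i^{k-1} < H_i^{*} \le \tau_i^{k}, \quad \tau_i^{0} = -\infty,\ \tau_i^{5} = +\infty && \text{(reported category)}
\end{align*}
```

In this formulation the vignette ratings are generated from the vignette latent values using the same thresholds as the self-report (response consistency), and the vignette intercepts do not vary across respondents (vignette equivalence). DIF is absent when the slope coefficients in every threshold equation are zero, which is the restriction an LR test would impose.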
Testing the assumptions
• Bago d’Uva et al (2011) developed tests for VE and RC
• RC test based on objective measures
• Inter-category thresholds should be the same across the
health and the vignette equations
• VE test checks that there is no systematic difference in perceptions
of the health states of the vignette persons
• Interactions between individual characteristics and vignette
severity (for all but one vignette)
Results regarding assumptions
                        Degrees of freedom   χ2 test statistic   p-value
Response consistency
  Mobility              13                   15.12               0.300
  Self-care             13                   18.31               0.146
  Usual activities      13                   8.14                0.835
  Pain/discomfort       13                   18.86               0.127
  Anxiety/depression    13                   19.44               0.110
Vignette equivalence
  Mobility              13                   100.06              <0.001
  Self-care             13                   178.69              <0.001
  Usual activities      13                   170.03              <0.001
  Pain/discomfort       13                   241.63              <0.001
  Anxiety/depression    13                   172.44              <0.001
Vignette equivalence – age groups
                        Degrees of freedom   χ2 test statistic   p-value
Age 20-34
  Mobility              8                    21.785              0.005
  Self-care             8                    65.791              <0.001
  Usual activities      8                    54.208              <0.001
  Pain/discomfort       8                    68.995              <0.001
  Anxiety/depression    8                    38.895              <0.001
Age 35-44
  Mobility              8                    28.017              <0.001
  Self-care             8                    75.826              <0.001
  Usual activities      8                    56.664              <0.001
  Pain/discomfort       8                    79.472              <0.001
  Anxiety/depression    8                    45.601              <0.001
Age 45-54
  Mobility              8                    67.563              <0.001
  Self-care             8                    110.842             <0.001
  Usual activities      8                    93.543              <0.001
  Pain/discomfort       8                    129.923             <0.001
  Anxiety/depression    8                    82.278              <0.001
Age 55-65
  Mobility              8                    8.296               0.600
  Self-care             8                    9.427               0.492
  Usual activities      8                    11.675              0.307
  Pain/discomfort       8                    15.076              0.129
  Anxiety/depression    8                    24.061              0.007
Evidence of DIF for 55-65 (N=914)
                      Mobility   Self-care   Usual        Pain/        Anxiety/
                                             activities   discomfort   depression
LR test statistic     94.82      57.71       64.73        74.89        74.57
p-value               0.000      0.043       0.008        0.001        0.001
Degrees of freedom    40         40          40           40           40
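For readers who want to see how an LR statistic maps to a p-value, here is a minimal standard-library sketch. It relies on the closed form of the chi-square upper tail that holds only for even degrees of freedom; the function names are hypothetical, and the code is not the authors' own procedure.

```python
import math

def chi2_sf_even_df(x, df):
    """Upper-tail probability P(X > x) for a chi-square variable with even df.

    Uses the closed form exp(-x/2) * sum_{k < df/2} (x/2)^k / k!,
    which is valid only when df is an even positive integer.
    """
    if df <= 0 or df % 2 != 0:
        raise ValueError("this closed form requires an even, positive df")
    half = x / 2.0
    term, total = 1.0, 1.0          # k = 0 term
    for k in range(1, df // 2):
        term *= half / k            # builds (x/2)^k / k! incrementally
        total += term
    return math.exp(-half) * total

def lr_statistic(loglik_restricted, loglik_unrestricted):
    """LR = 2 * (logL_unrestricted - logL_restricted)."""
    return 2.0 * (loglik_unrestricted - loglik_restricted)

# Mobility domain: LR statistic 94.82 on 40 degrees of freedom.
p_mobility = chi2_sf_even_df(94.82, 40)
print(f"p-value for LR = 94.82 with 40 df: {p_mobility:.6f}")
```

For arbitrary degrees of freedom, a statistics library's chi-square survival function (for example `scipy.stats.chi2.sf`) would be the usual choice.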
Results for the first threshold (between extreme and severe)

                                    Mobility    Self-care   Usual       Pain/       Anxiety/
                                                            activities  discomfort  depression
Female                              -0.165*     -0.005      0.059       0.131***    0.035
                                    (0.087)     (0.052)     (0.046)     (0.050)     (0.047)
Education (base category low)
  Medium                            -0.128      -0.088      0.014       -0.109*     0.047
                                    (0.095)     (0.061)     (0.054)     (0.057)     (0.055)
  High                              -0.251**    -0.168**    -0.073      -0.142**    -0.03
                                    (0.107)     (0.067)     (0.057)     (0.061)     (0.058)
Country of birth (ref. Australia)
  Other English-speaking            0.099       0.125       -0.097      0.188**     0.119
                                    (0.160)     (0.095)     (0.094)     (0.089)     (0.088)
  Asia                              0.168       0.037       0.025       0.055       0.02
                                    (0.105)     (0.073)     (0.065)     (0.070)     (0.066)
  Other                             0.399**     0.159       0.142       0.201       0.118
                                    (0.179)     (0.133)     (0.121)     (0.126)     (0.123)
Marital status (ref. never married)
  Married/de facto                  -0.335***   -0.165**    -0.005      -0.063      0.008
                                    (0.103)     (0.074)     (0.070)     (0.074)     (0.073)
  Divorced/widowed                  -0.259**    -0.123      0.066       -0.034      0.092
                                    (0.123)     (0.084)     (0.079)     (0.084)     (0.081)
Employment status (ref. NILF)
  Employed                          -0.009      -0.032      -0.074      -0.044      -0.087*
                                    (0.084)     (0.053)     (0.048)     (0.051)     (0.048)
  Unemployed                        -0.333      -0.127      0.018       -0.023      -0.269**
                                    (0.265)     (0.128)     (0.102)     (0.113)     (0.120)
DIF adjusted indices
[Figure: EQ-5D index (axis from 0.6 to 1) comparing the index based on self-reports with the DIF-adjusted index. Two annotated differences are shown: 0.049 and 0.095.]
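The logic of a DIF adjustment can be sketched with simulated data: classify each group's latent health once with its own thresholds (the raw self-reports) and once with a common set of thresholds. Everything below is hypothetical, including the threshold values, and the average of the two groups' thresholds merely stands in for thresholds evaluated at the sample means of the covariates.

```python
import random

LABELS = ["unable", "severe", "moderate", "slight", "none"]  # worst to best

def categorise(latent, thresholds):
    """Return the category index (0 = worst) for a latent health value."""
    return sum(latent > t for t in thresholds)

def shares(latents, thresholds):
    """Proportion of latent values falling in each of the five categories."""
    counts = [0] * 5
    for h in latents:
        counts[categorise(h, thresholds)] += 1
    return [c / len(latents) for c in counts]

rng = random.Random(1)
latent_a = [rng.gauss(0.0, 1.0) for _ in range(5000)]  # group A latent health
latent_b = [rng.gauss(0.0, 1.0) for _ in range(5000)]  # group B, same distribution

# HYPOTHETICAL group-specific thresholds (group B is the harsher rater).
tau_a = [-2.0, -1.2, -0.4, 0.4]
tau_b = [-1.6, -0.8, 0.0, 0.8]
# Common thresholds, standing in for "thresholds at the sample means".
tau_common = [(a + b) / 2 for a, b in zip(tau_a, tau_b)]

raw_gap = shares(latent_a, tau_a)[4] - shares(latent_b, tau_b)[4]
adj_gap = shares(latent_a, tau_common)[4] - shares(latent_b, tau_common)[4]
print(f"gap in share reporting '{LABELS[4]}': raw {raw_gap:.3f}, adjusted {adj_gap:.3f}")
```

With identical latent health in both groups, the raw gap reflects reporting behaviour only, and it largely disappears once a common set of thresholds is applied.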
Stage 2: Summary
• Vignettes can be used to identify DIF in the EQ-5D-5L (at least in
certain age groups)
• Failure to adjust for DIF can lead to conclusions that are
misleading
• Further work is needed to achieve vignette equivalence
• Earlier work increased RC (respondents rated vignettes as if
they were themselves and imagined a person of similar age;
age-specific diseases were avoided), but did this come at the
expense of VE?
• Further work required to understand what this means for
economic evaluations
• Knott et al (2016) Differential item functioning in the EQ-5D: an
exploratory analysis using anchoring vignettes. Working paper
Stage 3: external application
• Funded by Monash Faculty of Business grant
• An often-voiced concern is that including vignettes in studies,
particularly clinical trials, is not costless
• Application of vignettes has typically been limited to the datasets
in which they are collected
• Recent work (Harris et al, 2015) showed that it is possible to
correct for DIF using vignette responses collected externally to
the main dataset, using SRH and HILDA
• Research question
• Is it possible to adjust for DIF in the EQ-5D within a dataset
that did not include vignettes?
• If it’s possible, what effect does it have?
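The general idea of carrying reporting behaviour from one sample to another can be pictured as follows. The slides do not spell out the estimation procedure, so this is only a sketch: the threshold coefficients are hypothetical values imagined as having been estimated in a vignette sample, and the covariate names are placeholders.

```python
# HYPOTHETICAL threshold coefficients, imagined as estimated in a vignette sample.
# One dict per threshold (tau1..tau4); keys other than "const" are covariate names.
GAMMA = [
    {"const": -1.8, "female": -0.10, "high_educ": -0.20},
    {"const": -1.0, "female": -0.05, "high_educ": -0.15},
    {"const": -0.2, "female": 0.00, "high_educ": -0.10},
    {"const": 0.6, "female": 0.05, "high_educ": -0.05},
]

def predict_thresholds(person, gamma=GAMMA):
    """Thresholds for one person: tau_k = const_k + sum of coef_k * covariate."""
    return [
        g["const"] + sum(g[name] * value for name, value in person.items())
        for g in gamma
    ]

# Respondents from an external dataset that has covariates but no vignettes.
external_sample = [
    {"female": 1, "high_educ": 0},
    {"female": 0, "high_educ": 1},
]

for person in external_sample:
    taus = predict_thresholds(person)
    print(person, [round(t, 2) for t in taus])
```

The predicted thresholds differ by respondent characteristics, which is what would allow self-reports in the external dataset to be placed on a common scale, provided reporting behaviour is the same in both samples.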
Data sources
• Vignette data as before
• Multi Instrument Comparison (MIC) study recruited 8,000+
respondents in 6 countries to complete 6 of the most common
MAUIs, including the EQ-5D-5L (Richardson et al, 2012)
• Targeting of morbidity groups and the healthy public
• Wave 1 Australian sample N=1,341
• Given RC and VE only hold in the 55+ age group, MIC external
sample N=656 and vignette sample N=914
• Key issue: how similar are the two groups, and how applicable will
the vignette responses in the external data be to the MIC
respondents? Is the DIF problem in this sample the same as in
the other?
Descriptive statistics
                                MIC sample (N = 656)    Vignettes sample (N = 914)
                                Mean      St. Dev.      Mean      St. Dev.
Female                          0.447     0.498         0.497     0.500
Male                            0.553     0.498         0.503     0.500
Aged 55-64                      0.566     0.496         1         -
Aged 65+                        0.435     0.496         0         -
University degree (high)        0.349     0.477         0.309     0.462
Certificate/diploma (medium)    0.245     0.431         0.330     0.471
High school or less (low)       0.405     0.491         0.361     0.481
Born in Australia               0.686     0.465         0.756     0.430
Employed                        0.244     0.430         0.528     0.500
Married                         0.654     0.476         0.650     0.477
Asthma                          0.061     0.240         0.166     0.373
Cancer                          0.200     0.400         0.101     0.301
Respiratory                     0.093     0.291         0.067     0.250
Depression                      0.067     0.250         0.318     0.466
Diabetes                        0.180     0.384         0.149     0.356
DIF adjustment – group differences
[Figure: bar chart of the difference in EQ-5D-5L indices (axis from -0.02 to 0.16) for six group comparisons: Male - Female, High educ - Low educ, Migrant - Born Aus, Employed - Not employed, Married - Alone, Aged 65 plus - Under 65. Two series are plotted, unadjusted scores and DIF-adjusted scores, with a reference line at MID=0.074. Two sets of data labels survive, in comparison order: (-0.004, 0.054, 0.038, 0.093, 0.065, 0.08) and (0.016, 0.079, 0.037, 0.141, 0.096, 0.097).]
Stage 3: summary
• It is possible to correct for DIF using responses to anchoring
vignettes that are collected externally to the main dataset of
interest
• Resulting QALY measures can be considered comparable across
different population groups
• Assuming reporting behaviour in each sample is the same
• Knott & Lorgelly (2016) Adjusting for differential item functioning
in the EQ-5D using externally-collected vignettes. HESG Paper
(Gran Canaria)
Where to next?
• Better understanding of the vignette equivalence failure issue
• Will there always be a trade-off with response consistency?
• Is there value in exploring DIF cross-culturally?
• Multi-national clinical trials often apply one country’s tariff as
if all respondents are within that country
• Is external adjustment as good as (or a close substitute for)
collecting vignettes within a study?
• What does this mean for economic evaluations and the decisions
they inform?
• Could response behaviour change over time?