Quality of Life
in Clinical Trials
Some slides adapted from
Aside: MIXED procedure
“Mixed” effects model in SAS
Includes both fixed and random effects
Estimate a common slope over time
Allow each individual to have his/her own intercept
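A sketch of the random-intercept model described above, written out as an equation. The symbols are assumptions chosen for illustration and do not come from the slides: y_ij is the QoL score of individual i at visit j, t_ij is the time of that visit, beta_1 is the common slope, and b_i is the individual's own shift in intercept.

```latex
\[
  y_{ij} = (\beta_0 + b_i) + \beta_1 t_{ij} + \varepsilon_{ij},
  \qquad b_i \sim N(0, \sigma_b^2),
  \qquad \varepsilon_{ij} \sim N(0, \sigma^2)
\]
```

Here beta_0 and beta_1 are the fixed effects and b_i is the random effect, which is what makes the model "mixed".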
Why are we interested in Quality of Life (QOL)?
The FDA has stated that efficacy with
respect to overall survival and/or
improvements in QOL might provide the
basis for drug approval.
O'Shaughnessy JA, Wittes RE, Burke G et al.
Commentary concerning demonstration of safety and efficacy of
investigational anticancer agents in clinical trials.
Journal of Clinical Oncology 1991; 9: 2225-32
How are you feeling today?
What is QoL?
WHO: “Health is not only the absence of
infirmity and disease, but also a state of
physical, mental and social well-being.”
Multiple domains include: physical,
cognitive, emotional and social
functioning, pain, sexual functioning,
health perceptions, and symptoms such as
nausea and fatigue
Fundamental Principle: QoL IS ASSESSED
BY THE PATIENT
Definition depends on context:
Cancer vs. MI vs. hypertension
Early instruments for measuring QoL were
Later instruments, “general health status”
POMS = Profile of Mood States
SIP = Sickness Impact Profile
Difficulties with concept
No agreement on definition
Lack of standardized measures
One definition (Levine and Croog) has two components:
Social (major component): get along with family and
Physical: perform daily activities
Emotional: stability and self-control
Intellectual: decision-making ability
Life satisfaction: sense of well-being
Health Status: compared to others
Factors influencing QoL
Labeling: diagnosis brings on ‘change’
Non-related life events (e.g. death in the
Rationale in Clinical Trials
QOL assesses effect of intervention/treatment
Primary response (treatment improves symptoms?)
Side effects (treatment toxic?)
Economic aspects (low risk/cost of treatment but high
Another setting: Treatment for pain
Primary response (pain lessened?)
Side effects (interact with disease? Other side effects?)
Data collection can add measurement error or
Mode: self-administered vs. interview
Self-admin: Reading ability, fine-motor skills
Interview: Hearing problems, age/gender/ethnicity
sensitivity, training of interviewer
Instrument validity, sensitivity, specificity
Sensitivity of questions
Frame of reference (cognitive skills, privacy, cultural
Patient vs family vs health care provider
Determine QoL objective
Choose instrument to measure QoL
Reliable, valid, responsive, feasible
Global measures, disease-specific measures,
Select assessment time points
Develop analysis plan
Choosing your instrument
Off-the-shelf (i.e. general) instruments
Designed to distinguish sickness from wellness
May not be sensitive to particular aspect of a
May not be validated or “normed” in population
May ask silly questions for trial population
May take long time to complete
May impact negatively on compliance
Choosing your instrument
“Tailor Made” Instruments
Quick and simple
Standardized but targeted to disease
Validated, normed to trial population
Select subsets of off-the-shelf instruments
Often designed by graduate student or
Often too long
Often not validated or ‘normed’ or field tested
in the patient population of interest
a. CARES-SF (Schag 1991) - 59 item scale which measures rehabilitation and quality of
life in patients with cancer. This has been modified to the HIV Overview of Problems
Evaluation Systems (HOPES, Schag 1992)
b. City of Hope Quality of Life, Cancer Patient Version (Ferrell 1995) – a 41 item ordinal
scale representing the four domains of quality of life: physical well being,
psychological well being, social well being, and spiritual well being.
c. Daily Diary Card-QOL (Gower 1995) - a self-administered card for use in cancer
clinical trials that has been shown to demonstrate short-term changes in quality of life
related to symptoms induced by chemotherapy.
d. EORTC QLQ-C30 (Aaronson 1993) - this instrument is composed of modules to assess
quality of life for specific cancers in clinical trials. The current instrument has 30 items
covering physical function, role function, cognitive function, emotional function, social
function, symptoms, and financial impact.
e. FACT-G (Cella 1993) – a 33 item scale developed to measure quality of life in patients
undergoing cancer treatment.
f. FLIC (Finkelstein 1988) – a 22 item instrument which measures quality of life in the
following domains: physical/occupational function, psychological state, sociability,
and somatic discomfort. This scale was originally proposed as an adjunct measure to
cancer clinical trials.
g. Southwest Oncology Group Quality of Life Questionnaire (Moinpour 1990) – a scale
developed for cancer patients incorporating questions from various function, symptoms,
and global quality of life measures.
Look for measures that are proven to be
Validity: does the measure actually measure
the construct it is intended to measure?
Reliability: how close does our
measure get to the “true” score? (ranges
from 0 to 1)
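One common way to estimate reliability from item responses is Cronbach's alpha. The slides do not say which reliability coefficient is meant, so treat this as an illustrative sketch only. The function name and the response data are hypothetical.

```python
from statistics import variance


def cronbach_alpha(responses):
    """Cronbach's alpha; rows are patients, columns are questionnaire items."""
    n_items = len(responses[0])
    if n_items < 2 or len(responses) < 2:
        raise ValueError("need at least 2 items and 2 respondents")
    # Sample variance of each item across patients
    item_vars = [variance([row[j] for row in responses]) for j in range(n_items)]
    # Sample variance of the summed score across patients
    total_var = variance([sum(row) for row in responses])
    return n_items / (n_items - 1) * (1 - sum(item_vars) / total_var)


# Hypothetical data: 5 patients x 4 items, each item scored 0-4
example = [
    [3, 4, 3, 4],
    [1, 2, 1, 1],
    [4, 4, 3, 4],
    [2, 2, 2, 3],
    [0, 1, 1, 0],
]
alpha = cronbach_alpha(example)
print(f"Cronbach's alpha = {alpha:.2f}")
```

Alpha approaches 1 when the items move together across patients and drops when they do not, which is why it has to be re-estimated in each new sample.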
RELIABILITY AND VALIDITY DEPEND ON THE
SAMPLE TO WHICH YOUR MEASURE IS APPLIED!
The FACT-G has been shown to have a reliability of 0.87
in Americans undergoing chemotherapy (Cella)
Is it still a reliable measure in Japanese men with
Is it a reliable measure in Korean women with breast
If a measure is to be applied to a population
different from the one in which it was validated, it
needs to be re-assessed.
What is the big deal if the reliability is lower in
Low reliability = Poor measure
Low reliability also implies poor validity.
Think of these scales as “surrogate markers” of
quality of life
Would you use surrogate markers that you KNEW
were only weakly related to the true outcome of
If reliability is low, then you are not measuring
what you are trying to measure.
Look for reliabilities above 0.75
What about validity in new population?
The same items/questions may mean different things to
different patient populations or cultures
For some latent variables (e.g. mental disorders), the
variable of interest manifests itself differently in
different cultures or population subgroups.
Translations into different languages can affect results
If there are items in the scale that are irrelevant for
your patient population, then you are compromising
your validity by including them.
QOL measured by multiple indicators
Need validated overall ‘score’
Or, can use fancier multivariate methods
Usually, treat ‘score’ as observed level of
QoL and proceed with analysis.
‘score’ is often not a valid measure of QoL in
the patient population
Score tends to be fraught with measurement
error (reliability tells you about this)
Used FACT-G and FACT-B.
Simply added up the responses to each item
(but that is shown to be valid and reliable)
Treats each item as “exchangeable:” assumes
each item is equally sensitive to changes in
Alternative: develop model to weight each
item relative to how informative it is about QoL
(latent variable methods...).
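A minimal sketch contrasting the simple sum score with a weighted score. The item responses and weights below are hypothetical. In practice the weights would come from a fitted latent variable model, not be chosen by hand.

```python
def sum_score(items):
    """Simple sum: every item counts equally ("exchangeable" items)."""
    return sum(items)


def weighted_score(items, weights):
    """Weighted sum: each item counts in proportion to its weight."""
    if len(items) != len(weights):
        raise ValueError("items and weights must have the same length")
    return sum(w * x for w, x in zip(weights, items))


# Hypothetical responses of one patient to 4 items, each scored 0-4
items = [2, 3, 1, 4]
# Hypothetical weights: items 2 and 4 assumed more informative about QoL
weights = [0.5, 1.5, 0.5, 1.5]

print(sum_score(items))
print(weighted_score(items, weights))
```

With all weights equal to 1 the two scores coincide, so the simple sum is the special case in which every item is treated as equally informative.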
QoL is, by definition, a “latent variable:” it cannot
be directly measured.
We measure it using “symptoms” of QoL
Statistical methods help us make inference about
state of QoL via the symptoms.
Develop models/scales for measuring QoL
We can maximize reliability and evaluate validity.
Issues to consider:
What if our “symptoms” are not tapping into QoL like we
What if patients’ perceptions of the questions we ask are
How can we find out about these things????
Have you had
Latent Variable Depiction
Other latent variables in medical research
Other QOL issues
Often interested in whether or not survival with
poor quality of life is better than death without
“QALY”= Quality Adjusted Life Years
Cancer: many patients would rather not get toxic
therapies and have more enjoyable end of life
The general idea is to down-weight time spent in
periods of poor quality of life.
How to determine the weights?
Different settings might need different weights.
QTWIST: Quality-Adjusted Time Without Symptoms of disease and Toxicity.
Quality Adjusted Survival
Evaluate therapies based on both quantity
and quality of life through survival
Based on QALYs.
Define QOL health states, including
one with good health (minimal
Patients progress through health
states and never back-track.
Partition the area under the Kaplan-
Meier Curve and calculate the average
time spent in each clinical health
Compare treatment regimens using
weighted sums of durations, weights are
Example: 5 year survival
0.6 × 1 + 0.8 × 2 + 0.6 × 1 + 0.2 × 1 = 3 adjusted years of life
Compare the average QTWIST in two treatment groups.
Could be that on treatment A, people live longer, but QOL is worse.
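The 5-year survival example is a weighted sum of durations: weights 0.6, 0.8, 0.6 and 0.2 applied to periods of 1, 2, 1 and 1 years. A minimal sketch of that arithmetic follows. The function name is hypothetical, and how the weights should be chosen is left open, as in the slides.

```python
def quality_adjusted_years(periods):
    """Sum of weight * duration over (weight, duration_in_years) pairs."""
    for weight, duration in periods:
        if not 0 <= weight <= 1:
            raise ValueError("quality weights must lie between 0 and 1")
        if duration < 0:
            raise ValueError("durations must be non-negative")
    return sum(weight * duration for weight, duration in periods)


# (quality weight, years spent in that health state), from the example
periods = [(0.6, 1), (0.8, 2), (0.6, 1), (0.2, 1)]

total_years = sum(duration for _, duration in periods)
adjusted = quality_adjusted_years(periods)
print(f"{adjusted:.1f} quality-adjusted years out of {total_years} years survived")
```

Running the same calculation for each treatment group gives the quantities being compared: a treatment can win on total years but lose on adjusted years.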
Quality of Life for Individual
Fairclough and Gelber, “Quality of Life:
Statistical Issues and Analysis.” From
Quality of Life and Pharmacoeconomics in
Clinical Trials, Second Edition, ed. B.