Introduction to Survival analysis
Ghada Abu-Sheasha, PhD
Associate Prof in Medical Statistics, Alexandria University
DPLM in Health Economic, University of York
MSc in Economic Evaluation for HTA, University of York
What are your expectations?
Outline
• Why do we need SA?
• What is SA?
• Examples for SA
• Objectives of SA
• Terminology
• Types of SA
  • Non-parametric SA
  • Semi-parametric SA
  • Parametric SA
• Types of predictors
  • Time-dependent
  • Time-independent
• Life table and survival curve
• Kaplan-Meier curve
• Tests for comparison
  • Log-rank test
  • Breslow
  • Tarone-Ware
Why do we need SA?
Here I can use a t-test to compare time to event between two groups.
Here I can use a Chi-square test to compare the proportions of events within 1 year.
But for some subjects the event time is never observed:
• The event simply doesn't occur before the end of the study.
• They drop out for reasons unrelated to the study.
These are censored cases.
1. Why not compare mean time-to-event between your groups using a
t-test?
2. Why not compare proportion of events in your groups using
risk/odds ratios or logistic regression?
Censored cases
• Subjects are said to be censored if they are lost to follow up or drop
out of the study, or if the study ends before they die or have an
outcome of interest.
• They are counted as alive or disease-free for the time they were
enrolled in the study.
• If dropout is related to both outcome and treatment, dropouts may bias the
results
What is survival analysis?
• Statistical methods for analyzing longitudinal data on the occurrence
of events.
• Accommodates data from randomized clinical trials or cohort studies
and deals with censoring.
• Events may include death, injury, onset of illness, recovery from
illness (binary variables) or transition above or below the clinical
threshold of a meaningful continuous variable (e.g. CD4 counts).
Lung cancer clinical trial
• Phase III clinical trial of 164 patients with surgically resected
(non-small cell) lung cancer, in which patients were randomized to receive
radiotherapy either with or without adjuvant combination platinum-based
chemotherapy (Lung Cancer Study Group, 1988; Piantadosi, 1997).
• The primary outcome is the time to first relapse (including death from
lung cancer).
• The relapse proportions in the radiotherapy and combination arms
were 81.4% (70 out of 86) and 69.2% (54 out of 78), respectively.
However, these figures are potentially misleading as they
ignore the duration spent in remission before these events
occurred.
[Figure: a random sample of patients with surgically resected lung cancer is randomly assigned to radiotherapy only (81% relapse) or to combination therapy (69% relapse) and followed over time for relapse or no relapse.]
Prospective cohort studies
To determine the unique prognostic ability of various factors on overall
survival.
Prospective cohort data and ovarian cancer
Eight hundred and twenty-five (825) patients were diagnosed with primary
epithelial ovarian carcinoma between January 1990 and December
1999 at the Western General Hospital in Edinburgh. Follow-up data
were available up until the end of December 2000, by which time 550
(75.9%) had died (Clark et al, 2001).
The primary outcome is overall mortality.
[Figure: overall survival by FIGO stage. Overall survival decreases significantly with more advanced disease.]
Objectives of survival analysis
• To estimate time-to-event for a group of individuals, such as time
until second heart attack for a group of MI patients.
• To compare time-to-event between two or more groups, such as
treated vs. placebo MI patients in a randomized controlled trial.
• To assess the relationship of covariables to time-to-event, such
as: do weight, insulin resistance, or cholesterol influence the survival
time of MI patients?
Note: expected time-to-event = 1/incidence rate (assuming the incidence rate is constant over time)
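The note that expected time-to-event equals 1/incidence rate can be checked with a short derivation. This is a sketch under one assumption that is not stated on the slide: the hazard (incidence rate) is a constant, written here as the hypothetical symbol \lambda.

```latex
% Assumption: constant hazard h(t) = \lambda for all t >= 0.
\begin{align}
  S(t) &= \exp\!\left(-\int_0^t \lambda \, du\right) = e^{-\lambda t}, \\
  E[T] &= \int_0^\infty S(t)\, dt = \int_0^\infty e^{-\lambda t}\, dt = \frac{1}{\lambda}.
\end{align}
```

For example, under this assumption an incidence rate of 0.1 events per person-year corresponds to an expected time-to-event of 10 years. When the hazard changes over time, the simple reciprocal no longer applies.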
• Event: what terminates an episode (such as death or adoption of an
innovation). It is the change that causes the subject to transition
from one state to another.
• Time-to-event: the time from entry into a study until a subject has a
particular outcome.
Data structure
Two-variable outcome:
• Time variable: $t_i$ = time at last disease-free observation or time at
event
• Censoring variable: $c_i = 1$ if the subject had the event; $c_i = 0$ if no event by time $t_i$
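As a hedged illustration of this two-variable outcome, the sketch below stores one (time, event) pair per subject. The class name, field names and the four records are hypothetical and chosen only for the example.

```python
from dataclasses import dataclass


@dataclass
class Observation:
    time: float  # t_i: time at event, or at last disease-free observation
    event: int   # c_i: 1 if the subject had the event, 0 if censored


# Hypothetical records, for illustration only
records = [
    Observation(time=6, event=1),
    Observation(time=6, event=0),
    Observation(time=13, event=1),
    Observation(time=20, event=0),
]

n_events = sum(r.event for r in records)
n_censored = len(records) - n_events
total_follow_up = sum(r.time for r in records)

print(n_events, n_censored, total_follow_up)
```

Censored subjects still contribute their follow-up time, which is the point of keeping the two variables together.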
Functions
• Probability of failure, $f(t)$: the probability of the event
occurring at exactly time $t_i$.
$f(t_i) = P(T = t_i)$
• Cumulative failure function, $F(t)$: the probability that a person dies
before time $t_i$.
$F(t_i) = P(T < t_i)$
• Cumulative survival function, $S(t)$: the probability that a person
survives to at least time $t_i$.
$S(t_i) = P(T \ge t_i)$
• Hazard function, $h(t)$: the probability of the event occurring at exactly
time $t_i$, provided the patient has survived up to $t_i$.
$h(t_i) = P(T = t_i \mid T \ge t_i)$
Put another way, it represents the
instantaneous event rate for an individual who
has already survived to time t.
How the random variable is
defined is very important.
Continuous vs discrete
Books in a backpack
• If X is equal to the number of books in a backpack, then
X is a discrete random variable.
• If X is the weight of a book, then X is a continuous
random variable because weights are measured.
Distance in kilometers
• If X is equal to the number of km (to the nearest km) you drive
to work, then X is a discrete random variable. You count the
kilometers.
• If X is the distance you drive to work, then you measure values
of X and X is a continuous random variable.
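The following data show the number of deaths (n) that occurred over a
period of five years in a cohort of 100 subjects. Suppose T = time to
death is measured in years, to the nearest year.
In this version of the table, F(t) is accumulated through year t, so F(t) = P(T ≤ t) and S(t) = 1 − F(t). The two h(t) columns match f(t) divided by the average of S at the start and end of the year.

T   n    N at risk   f(t)    F(t)    S(t)     h(t)    h(t)
                     (PDF)   (CDF)   (1-CDF)  (Def)   f(t)/S(t)
0   0    100         0.00    0.00    1.00     0.00    0.00
1   10   90          0.10    0.10    0.90     0.11    0.11
2   20   70          0.20    0.30    0.70     0.25    0.25
3   30   40          0.30    0.60    0.40     0.55    0.55
4   5    35          0.05    0.65    0.35     0.13    0.13
5   35   0           0.35    1.00    0.00     2.00    2.00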
Probability of failure, $f(t)$
• The probability of the event occurring at exactly time t.
• If T is discrete, $f(t)$ is a probability distribution (mass) function:
$f(t) = P(T = t)$
• If T is continuous, $f(t)$ is a probability density function:
$f(t) = \lim_{\Delta t \to 0} P(t \le T < t + \Delta t) / \Delta t$
What is the probability that someone selected at random will die at T = 2?
From the f(t) column of the table: f(2) = 20/100 = 0.20.
Cumulative failure function, $F(t)$
• The probability that a person dies before time t
• $F(t) = P(T < t)$
What is the probability that someone selected at random will die at $T \le 2$?
Summing f(t) through year 2: 0.00 + 0.10 + 0.20 = 0.30.
What is the probability that someone selected at random will die at $T < 2$?
Summing f(t) over the years before year 2: 0.00 + 0.10 = 0.10.
The table below uses this strict definition, F(t) = P(T < t), so that S(t) = 1 − F(t) = P(T ≥ t).

T   n    N at risk   f(t)    F(t)    S(t)     h(t)    h(t)
                     (PDF)   (CDF)   (1-CDF)  (Def)   f(t)/S(t)
0   0    100         0.00    0.00    1.00     0.00    0.00
1   10   90          0.10    0.00    1.00     0.10    0.10
2   20   70          0.20    0.10    0.90     0.22    0.22
3   30   40          0.30    0.30    0.70     0.43    0.43
4   5    35          0.05    0.60    0.40     0.13    0.13
5   35   0           0.35    0.65    0.35     1.00    1.00
Cumulative survival function, $S(t)$
• $S(t)$ is the probability that a person survives to at least some specified
time t
• $S(t) = P(T \ge t) = 1 - F(t)$
What is the probability that someone selected at random survives to $T \ge 2$?
From the S(t) column of the table above: S(2) = 1 − F(2) = 1 − 0.10 = 0.90.
Hazard function, $h(t)$
The probability of the event occurring at exactly time $t_i$, provided the
patient has survived up to $t_i$ (that is, beyond $t_{i-1}$).
$h(t_i) = P(T = t_i \mid T \ge t_i)$
Put another way, it represents the
instantaneous event rate for an individual who
has already survived to time t.
Hazard from $f(t)$ and $S(t)$
$h(t) = \frac{f(t)}{S(t)}$
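The four functions can be recomputed from the cohort data as a check. The sketch below is a minimal illustration that assumes the strict definitions F(t) = P(T < t) and S(t) = P(T ≥ t); the variable names are hypothetical, and only the death counts come from the table.

```python
# Deaths (n) recorded at T = 0, 1, ..., 5 in the cohort of 100 subjects
deaths = [0, 10, 20, 30, 5, 35]
cohort = sum(deaths)  # 100

# Number of subjects with T >= t, i.e. still alive just before year t
alive_at_start = []
remaining = cohort
for d in deaths:
    alive_at_start.append(remaining)
    remaining -= d

f = [d / cohort for d in deaths]                     # f(t) = P(T = t)
S = [a / cohort for a in alive_at_start]             # S(t) = P(T >= t)
F = [1 - s for s in S]                               # F(t) = P(T < t)
h = [d / a for d, a in zip(deaths, alive_at_start)]  # h(t) = P(T = t | T >= t)

for t in range(len(deaths)):
    print(t, f"{f[t]:.2f}", f"{F[t]:.2f}", f"{S[t]:.2f}", f"{h[t]:.3f}")
```

Computing h(t) from counts and from f(t)/S(t) gives the same values, which is why the last two columns of the table agree.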
Types of survival analysis
• Nonparametric: no assumption about the shape of the hazard function. The hazard function is estimated from the empirical data, showing its change over time. Example: Kaplan-Meier survival analysis.
• Semi-parametric: no assumption about the shape of the hazard function, but an assumption about how covariates affect the hazard function. Example: Cox regression.
• Parametric: the shape of the baseline hazard function and the covariate effects on the hazard function are specified in advance. Used for predictive modelling.
One assumption for all survival analysis: there are no systematic differences between censored and uncensored cases.
Types of predictors
• Time-independent predictors
• Time-dependent predictors
Time-independent variable
• A time-independent variable is any variable whose
value for a given individual does not change over time. An example is
sex.
What about smoking status,
age, weight, treatment?
Types of predictors
• In prospective studies, when individuals are followed over time,
the values of covariates may change with time. Covariates can
thus be divided into fixed and time-dependent.
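One common way to record a time-dependent covariate is to split each subject's follow-up into intervals over which the covariate is constant. The sketch below is only an illustration of that layout: the function name, the tuple layout (id, start, stop, event, covariate value) and the example subject are all hypothetical, and it assumes a single change at a time greater than 0.

```python
def split_at_change(subject_id, follow_up, event, change_time, before, after):
    """Return (id, start, stop, event, value) rows for one subject."""
    # No change during follow-up: a single row, as for a fixed covariate
    if change_time is None or change_time >= follow_up:
        return [(subject_id, 0, follow_up, event, before)]
    # One change: the first row ends without an event; the event indicator
    # is attached only to the row in which follow-up ends
    return [
        (subject_id, 0, change_time, 0, before),
        (subject_id, change_time, follow_up, event, after),
    ]


# Hypothetical subject who stops smoking at week 10 and relapses at week 23
rows = split_at_change("p1", follow_up=23, event=1, change_time=10,
                       before="smoker", after="non-smoker")
for row in rows:
    print(row)
```

A fixed covariate such as sex needs only one row per subject, whereas smoking status, weight or treatment may need several.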
Anderson.sav
• It consists of remission survival times for 42 leukemia
patients, half of whom received a new therapy and the other
half of whom received a standard therapy (Freireich et al., Blood,
1963).
• The exposure variable of interest is treatment status (Rx = 0
if new treatment, Rx = 1 if standard treatment).
• Two other variables for control are log white blood cell
count (i.e., logWBC) and sex.
• Failure status is defined by the relapse variable (0 if
censored, 1 if failure).
Life table
• The life table is a descriptive procedure for examining the distribution of
time-to-event variables.
• We can also compare the distribution by levels of a factor variable.
• The basic idea of life tables is to subdivide the period of observation
into smaller time intervals.
• Then the probability of the event is estimated for each of the intervals.
Life table columns
• Interval Start Time: the beginning value for each interval. Each interval extends from its start time up to the start time of the next interval.
• Number Entering Interval: the number of surviving cases at the beginning of the interval. This number decreases steadily as cases relapse or are censored.
• Number Withdrawing during Interval: the number of censored cases in this interval. They are lost to follow-up while in remission.
• Number Exposed to Risk: the number of surviving cases minus one half the censored cases. This is intended to account for the effect of the censored cases.
• Number of Terminal Events: the number of cases that have had a relapse. These are cases with status = 1.
• Proportion Terminating: the ratio of terminal events to the number exposed to risk (first row: 5/42 = 0.12).
• Proportion Surviving: one minus the proportion terminating (first row: 1 − 0.12 = 0.88).
• Cumulative Proportion Surviving at End of Interval: the proportion of cases surviving from the start of the table to the end of the interval (second row: (42 − 5 − 8)/42 = 0.6905).
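• Probability Density: an estimate of the probability of experiencing the terminal event per time unit during the interval, where $S(t_i)$ is the cumulative proportion surviving at the start of interval i.
$f(t_i) = \frac{S(t_i) - S(t_{i+1})}{t_{i+1} - t_i}$
Second row: $\frac{0.88 - 0.69}{4} = 0.0475 \approx 0.048$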
• Hazard Rate: an estimate of the probability of experiencing the terminal event per time unit during the interval, conditional upon surviving to the start of the interval.
$h(t_i) = \frac{f(t_i)}{\operatorname{avg}[S(t_i), S(t_{i+1})]}$
1st row: $\frac{0.03}{(1 + 0.88)/2} = 0.03$
2nd row: $\frac{0.048}{(0.88 + 0.69)/2} = 0.06$
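The life table columns can be reproduced step by step. The sketch below is a minimal actuarial calculation, not the output of any statistical package: the function and key names are hypothetical, the 42 cases, 4-week interval width and 5 and 8 terminal events come from the worked rows, and the numbers withdrawn per interval (0 and 1) are assumed for illustration.

```python
def life_table(n_start, width, withdrawn, events):
    """Actuarial life table rows for equal-width intervals."""
    rows = []
    entering = n_start
    surv_start = 1.0  # cumulative proportion surviving at interval start
    for w, d in zip(withdrawn, events):
        exposed = entering - w / 2            # censored cases count as half
        prop_terminating = d / exposed
        prop_surviving = 1 - prop_terminating
        surv_end = surv_start * prop_surviving
        density = (surv_start - surv_end) / width
        hazard = density / ((surv_start + surv_end) / 2)
        rows.append({
            "entering": entering,
            "exposed": exposed,
            "prop_terminating": prop_terminating,
            "cum_surviving": surv_end,
            "density": density,
            "hazard": hazard,
        })
        entering -= w + d
        surv_start = surv_end
    return rows


# Withdrawal counts are assumptions made for this example
rows = life_table(n_start=42, width=4, withdrawn=[0, 1], events=[5, 8])
for r in rows:
    print({k: round(v, 3) for k, v in r.items()})
```

Because the cumulative proportion is built as a product of interval-specific survival proportions, it adjusts for censored cases in a way that a simple count of survivors does not.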
Median survival time
It is the time at which half the patients are expected to still be
alive; the chance of surviving beyond that time is 50 percent.
Here, the median survival is approximately 12 weeks.
Survival curve
• The X-axis shows the time to event.
• The Y-axis shows the probability of survival.
• Any point on the survival curve shows the probability that a case remains in remission past that time. For example, the probability of still being in remission after 8 weeks is around 70%.
GW
Compare time to relapse between males and females among patients
with leukemia.
What’s the probability of surviving past 20 weeks among men?
What’s the probability of surviving past 20 weeks among women?
Kaplan-Meier and log-rank test
• K-M provides an estimate of S(t).
• It can be used with or without censored data.
The survival curves give a visual representation of the life tables.
• X-axis: the time to event.
• Y-axis: P(not experiencing a relapse).
• Drops in the survival curve occur whenever a patient experiences the event (here, a relapse).
Horizontal gaps vs vertical gaps
Notice the vertical and horizontal gaps between the curves.
• Horizontal gap (here at S(t) = 0.6): a horizontal gap means that it took longer for one group to experience a certain fraction of deaths. Sixty percent remained alive up to 13 weeks in GP2 compared to 5 weeks in GP3.
• Vertical gap (here at t = 10 weeks): a vertical gap means that, at a specific time point, one group had a greater fraction of subjects surviving. At the 10th week, 63% of GP2 remained alive compared to 20% in GP3.
The KM curves are quite different, with GP 1 having consistently better survival prognosis than GP 2, and GP 2 having consistently better survival prognosis than GP 3.
Note also that the difference between GPs 1 and 2 is about the same over time, whereas GP 2 appears to diverge from GP 3.
• The Kaplan-Meier procedure is a method of estimating time-to-event
models in the presence of censored cases.
• A descriptive procedure for examining the distribution of time-to-
event variables. We also can compare the distribution by levels of a
factor variable or produce separate analyses by levels of a
stratification variable.
• Censored cases (right-censored cases) are those for which the event
of interest has not yet happened.
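The product-limit idea behind the Kaplan-Meier procedure can be sketched in a few lines. This is a minimal illustration rather than the SPSS implementation: the function names and the seven observations are hypothetical, and it assumes the usual convention that a case censored at time t is still counted as at risk for events at t.

```python
def kaplan_meier(times, events):
    """Return [(event time, estimated S(t))] using the product-limit method."""
    data = sorted(zip(times, events))
    at_risk = len(data)
    surv = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        d = 0  # events at time t
        c = 0  # censored at time t
        while i < len(data) and data[i][0] == t:
            if data[i][1] == 1:
                d += 1
            else:
                c += 1
            i += 1
        if d > 0:
            surv *= 1 - d / at_risk  # conditional survival at this event time
            curve.append((t, surv))
        at_risk -= d + c             # events and censored cases leave the risk set
    return curve


def median_survival(curve):
    """First event time at which estimated survival is 50% or less."""
    for t, s in curve:
        if s <= 0.5:
            return t
    return None  # median not reached


# Hypothetical data: time in weeks, event = 1 (relapse) or 0 (censored)
times = [6, 6, 6, 7, 10, 13, 16]
events = [1, 1, 0, 1, 0, 1, 1]
curve = kaplan_meier(times, events)
print(curve)
print(median_survival(curve))
```

The estimate only drops at event times; censored cases reduce the number at risk for later event times without producing a drop themselves.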
Uses
• Intuitive graphical presentation.
• Commonly used to describe survivorship of study population/s.
• Commonly used to compare two study populations.
Breslow by the number of cases at
risk at each time point.
Log rank 
the same.
Tarone-Ware the
square root of the
number of cases at risk at
each time point.
All of them test the equality of survival
functions by weighting all time points …
The full name of Breslow statistic is Gehan-
Breslow-Wilcoxon test (after Edmund Alpheus
Gehan, Norman Edward Breslow and Frank
Wilcoxon).
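The three tests differ only in the weight given to each event time, which the following two-group sketch makes explicit. It is a simplified illustration, not the SPSS routine: the function name, the weight labels and the example data are hypothetical, and the p-value assumes the statistic follows a chi-square distribution with 1 degree of freedom.

```python
import math


def weighted_logrank(times_a, events_a, times_b, events_b, weight="logrank"):
    """Two-group weighted log-rank chi-square statistic and p-value (1 df)."""
    pooled = [(t, e, 0) for t, e in zip(times_a, events_a)]
    pooled += [(t, e, 1) for t, e in zip(times_b, events_b)]
    event_times = sorted({t for t, e, _ in pooled if e == 1})

    numerator = 0.0
    variance = 0.0
    for t in event_times:
        n = sum(1 for u, _, _ in pooled if u >= t)                # at risk, both groups
        n_a = sum(1 for u, _, g in pooled if u >= t and g == 0)   # at risk, group A
        d = sum(1 for u, e, _ in pooled if u == t and e == 1)     # events, both groups
        d_a = sum(1 for u, e, g in pooled if u == t and e == 1 and g == 0)

        if weight == "logrank":
            w = 1.0            # all time points weighted the same
        elif weight == "breslow":
            w = float(n)       # number at risk
        elif weight == "tarone-ware":
            w = math.sqrt(n)   # square root of number at risk
        else:
            raise ValueError(f"unknown weight: {weight}")

        expected_a = d * n_a / n
        numerator += w * (d_a - expected_a)
        if n > 1:
            variance += w ** 2 * d * (n_a / n) * (1 - n_a / n) * (n - d) / (n - 1)

    chi2 = numerator ** 2 / variance if variance > 0 else 0.0
    p_value = math.erfc(math.sqrt(chi2 / 2))  # upper tail of chi-square, 1 df
    return chi2, p_value


# Hypothetical data: group A relapses early, group B late, no censoring
a_times, a_events = [1, 2, 3, 4], [1, 1, 1, 1]
b_times, b_events = [5, 6, 7, 8], [1, 1, 1, 1]
for name in ("logrank", "breslow", "tarone-ware"):
    print(name, weighted_logrank(a_times, a_events, b_times, b_events, name))
```

Weighting by the number at risk puts more emphasis on early event times, when many cases are still under observation, so the Breslow test is more sensitive to early differences and the log-rank test to later ones, with Tarone-Ware in between.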
Tarone, R.E. and Ware, J.
• Pooled over strata: a single test is computed for all factor levels, testing for equality of the survival function across all levels of the factor variable.
• For each stratum: a separate test is computed for each group formed by the stratification variable.
• Pairwise over strata: a separate test is computed for each pair of factor levels when a pooled test shows non-equality of survival functions.
• Pairwise for each stratum: a separate test is computed for each pair of factor levels, for each stratum of the stratification variable.
It shows, for each group, the number of cases, the number that had a relapse, and the number that were censored.
SPSS output
The survival table is a descriptive table that details the time to event. The table is very large: each observation occupies a row, and the table is sectioned by each level of the factor (ln WBC).
• Time: the time at which the event or censoring occurred.
• Status: whether the case relapsed or was censored.
• Cumulative Proportion Surviving at the Time: the proportion of cases surviving from the start of the table until this time.
• N of Cumulative Events: the number of cases that relapsed from the start of the table until this time.
• N of Remaining Cases: the number of cases that, at this time, have not yet relapsed or been censored.
• Survival time is shortest in GP3, followed by GP2, and longest
in GP1.
• The confidence intervals do not overlap between levels; thus,
differences in effect on time to event can be inferred.
This table provides overall tests of the equality of survival
times across groups. Since the significance values of the tests
are all < 0.05, there is a statistically significant difference in the
distribution of time to relapse among the three groups.
Pairwise comparisons showed that the relapse distributions
differed significantly between every pair of groups.
Limitations
• Mainly descriptive
• Doesn’t control for covariates
• Requires categorical predictors
• Can’t accommodate time-dependent variables
Using a suitable graphical representation and
statistical test:
• Compare the effect of treatment on relapse among patients with
leukemia
• Compare the effect of treatment on relapse among males
• Compare the effect of treatment on relapse among females
Any Questions?

1. Introduction to Survival analysis

  • 1.
    Introduction to Survivalanalysis Ghada Abu-Sheasha, PhD Associate Prof in Medical Statistics, Alexandria University DPLM in Health Economic, University of York MSc in Economic Evaluation for HTA, University of York
  • 2.
    What is yourexpectations?
  • 3.
    What are yourexpectations?
  • 4.
    Outline • Why dowe need SA? • What is SA? • Examples for SA • Objectives of SA • Terminology • Types of SA • Non-parametric SA • Semi-parametric SA • Parametric SA • Types of predictors • Time-dependent • Time-independent • Life table and survival curve • Kaplan-Meier curve • Tests for comparison • Log-rank test • Breslow • Tarone-Ware
  • 6.
    Why do weneed SA?
  • 7.
    Here I canuse t-test to compare time to event between two groups
  • 8.
    Here I canuse Chi-square test to compare proportions of event within a year 1 Year
  • 9.
    The event simplydoesn’t occur before the end of study. They drop out for reasons unrelated to the study. 1 Year
  • 10.
    The event simplydoesn’t occur before the end of study. They drop out for reasons unrelated to the study. 1 Year Censored cases
  • 11.
    1. Why notcompare mean time-to-event between your groups using a t-test? 2. Why not compare proportion of events in your groups using risk/odds ratios or logistic regression?
  • 12.
    Censored cases • Subjectsare said to be censored if they are lost to follow up or drop out of the study, or if the study ends before they die or have an outcome of interest. • They are counted as alive or disease-free for the time they were enrolled in the study. • If dropout is related to both outcome and treatment, dropouts may bias the results
  • 13.
    Outline • Why dowe need SA? • What is SA? • Examples for SA • Objectives of SA • Terminology • Types of SA • Non-parametric SA • Semi-parametric SA • Parametric SA • Types of predictors • Time-dependent • Time-independent • Life table and survival curve • Kaplan-Meier curve • Tests for comparison • Log-rank test • Breslow • Tarone-Ware
  • 14.
    Outline • Why dowe need SA? • What is SA? • Examples for SA • Objectives of SA • Terminology • Types of SA • Non-parametric SA • Semi-parametric SA • Parametric SA • Types of predictors • Time-dependent • Time-independent • Life table and survival curve • Kaplan-Meier curve • Tests for comparison • Log-rank test • Breslow • Tarone-Ware
  • 15.
    What is survivalanalysis • Statistical methods for analyzing longitudinal data on the occurrence of events. • Accommodates data from randomized clinical trial or cohort study design and deal with censoring • Events may include death, injury, onset of illness, recovery from illness (binary variables) or transition above or below the clinical threshold of a meaningful continuous variable (e.g. CD4 counts).
  • 16.
    Outline • Why dowe need SA? • What is SA? • Examples for SA • Objectives of SA • Terminology • Types of SA • Non-parametric SA • Semi-parametric SA • Parametric SA • Types of predictors • Time-dependent • Time-independent • Life table and survival curve • Kaplan-Meier curve • Tests for comparison • Log-rank test • Breslow • Tarone-Ware
  • 17.
    Outline • Why dowe need SA? • What is SA? • Examples for SA • Objectives of SA • Terminology • Types of SA • Non-parametric SA • Semi-parametric SA • Parametric SA • Types of predictors • Time-dependent • Time-independent • Life table and survival curve • Kaplan-Meier curve • Tests for comparison • Log-rank test • Breslow • Tarone-Ware
  • 18.
    Lung cancer clinicaltrial • Phase III clinical trial of 164 patients with surgically resected (non- small cell) lung cancer, randomized patients to receive radiotherapy either with or with- out adjuvant combination platinum-based chemotherapy (Lung Cancer Study Group, 1988; Piantadosi, 1997). • The primary outcome is the time to first relapse (including death from lung cancer). • The relapse proportions in the radiotherapy and combination arms were 81.4% (70 out of 86) and 69.2% (54 out of 78), respectively. However, these figures are potentially misleading as they as they ignore the duration spent in remission before these events occurred.
  • 19.
  • 20.
    Prospective cohort studies Todetermine the unique prognostic ability of various factors on overall survival.
  • 21.
    Prospective cohort data and ovarian cancer
    • 825 patients were diagnosed with primary epithelial ovarian carcinoma between January 1990 and December 1999 at the Western General Hospital in Edinburgh.
    • Follow-up data were available until the end of December 2000, by which time 550 (75.9%) had died (Clark et al., 2001).
    • The primary outcome is overall mortality.
  • 22.
    Overall survival by FIGO stage: overall survival decreases significantly with more advanced disease.
  • 25.
    Objectives of survival analysis
    • To estimate time-to-event for a group of individuals, such as time until a second heart attack for a group of MI patients.
    • To compare time-to-event between two or more groups, such as treated vs. placebo MI patients in a randomized controlled trial.
    • To assess the relationship of covariates to time-to-event, such as: do weight, insulin resistance, or cholesterol influence the survival time of MI patients?
    Note: expected time-to-event = 1/incidence rate (this holds when the incidence rate is constant over time).
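    The note on expected time-to-event can be checked numerically. The following is a minimal Python sketch, assuming a constant incidence rate; the function name and the counts (5 events over 100 person-years) are made up for illustration and do not come from the slides.

```python
def expected_time_to_event(events: int, person_time: float) -> float:
    """Expected time-to-event under a constant incidence rate."""
    if events <= 0 or person_time <= 0:
        raise ValueError("events and person_time must be positive")
    incidence_rate = events / person_time  # events per unit of person-time
    return 1.0 / incidence_rate


# Hypothetical numbers: 5 events observed over 100 person-years of follow-up.
mean_years = expected_time_to_event(events=5, person_time=100.0)
print(f"Expected time-to-event: {mean_years:.1f} years")
```

    If the rate changes over time, this shortcut no longer applies and the survival function has to be estimated directly.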
  • 28.
    • Event: what terminates an episode (such as death or adoption of an innovation); it is the change that causes the subject to transition from one state to another.
    • Time-to-event: the time from entry into a study until a subject has a particular outcome.
  • 29.
    Censored cases
    • The event simply doesn't occur before the end of the study.
    • They drop out for reasons unrelated to the study.
  • 30.
    Data structure
    Two-variable outcome:
    • Time variable: $t_i$ = time at last disease-free observation or time at event
    • Censoring variable: $c_i = 1$ if the subject had the event; $c_i = 0$ if no event by time $t_i$
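    The two-variable outcome can be sketched in Python as one record per subject. This is only an illustration: the `Subject` type and the follow-up times are hypothetical, not taken from any dataset in these slides.

```python
from typing import NamedTuple


class Subject(NamedTuple):
    time: float  # t_i: time at event or at last event-free observation
    event: int   # c_i: 1 = had the event, 0 = censored at time t_i


# Hypothetical follow-up times in months (not from any real study).
cohort = [
    Subject(time=3.0, event=1),
    Subject(time=12.0, event=0),  # still event-free when the study ended
    Subject(time=7.5, event=1),
    Subject(time=5.0, event=0),   # dropped out at 5 months
]

n_events = sum(s.event for s in cohort)
n_censored = len(cohort) - n_events
person_time = sum(s.time for s in cohort)

print(f"events={n_events}, censored={n_censored}, person-months={person_time}")
```

    Keeping the censoring indicator next to the time is what lets later methods count a censored subject as event-free for exactly the time observed.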
  • 31.
    Functions
    • Probability of failure, $f(t)$: the probability of the event occurring at exactly time $t_i$. $f(t_i) = P(T = t_i)$
    • Cumulative failure function, $F(t)$: the probability that a person dies before time $t_i$. $F(t_i) = P(T < t_i)$
    • Cumulative survival function, $S(t)$: the probability that a person survives at least to time $t_i$. $S(t_i) = P(T \ge t_i)$
  • 32.
    All definitions
    • Hazard function, $h(t)$: the probability of the event occurring at exactly time $t_i$, provided the patient has survived up to $t_i$ (that is, beyond $t_{i-1}$). $h(t_i) = P(T = t_i \mid T > t_{i-1})$
    Put another way, it represents the instantaneous event rate for an individual who has already survived to time $t$.
  • 35.
    How the random variable is defined is very important: continuous vs. discrete.
  • 36.
    Books in a backpack
    • If X is the number of books in a backpack, then X is a discrete random variable.
    • If X is the weight of a book, then X is a continuous random variable, because weights are measured.
  • 37.
    Distance in kilometers
    • If X is the number of km (to the nearest km) you drive to work, then X is a discrete random variable. You count the kilometers.
    • If X is the distance you drive to work, then you measure values of X, and X is a continuous random variable.
  • 38.
    The following data show the number of deaths (n) that occurred over a period of five years in a cohort of 100 subjects. Suppose T, the time to death, is measured in years to the nearest year.
    T   n    N at risk   f(t)    F(t)    S(t)        h(t) = f(t)/S(t)   h(t) by definition
                         (PDF)   (CDF)   (1 − CDF)
    0   0    100         0.00    0.00    1.00        0.00               0.00
    1   10   90          0.10    0.10    0.90        0.11               0.11
    2   20   70          0.20    0.30    0.70        0.25               0.25
    3   30   40          0.30    0.60    0.40        0.55               0.55
    4   5    35          0.05    0.65    0.35        0.13               0.13
    5   35   0           0.35    1.00    0.00        2.00               2.00
  • 39.
    Probability of failure, $f(t)$
    • The probability of the event occurring at exactly time $t$.
    • If $T$ is discrete, $f(t)$ is a probability distribution (mass) function: $f(t) = P(T = t)$
    • If $T$ is continuous, $f(t)$ is a probability density function: $f(t)\,x \approx P(t < T < t + x)$ for a small interval $x$
  • 40.
    What is the probability that someone selected at random will die at T = 2?
    T   n    N at risk   f(t)    F(t)    S(t)        h(t) = f(t)/S(t)   h(t) by definition
                         (PDF)   (CDF)   (1 − CDF)
    0   0    100         0.00    0.00    1.00        0.00               0.00
    1   10   90          0.10    0.10    0.90        0.11               0.11
    2   20   70          0.20    0.30    0.70        0.25               0.25
    3   30   40          0.30    0.60    0.40        0.55               0.55
    4   5    35          0.05    0.65    0.35        0.13               0.13
    5   35   0           0.35    1.00    0.00        2.00               2.00
  • 42.
    Cumulative failure function, $F(t)$
    • The probability that a person dies before time $t$
    • $F(t) = P(T < t)$
  • 43.
    What is the probability that someone selected at random will die at $T \le 2$?
    T   n    N at risk   f(t)    F(t)    S(t)        h(t) = f(t)/S(t)   h(t) by definition
                         (PDF)   (CDF)   (1 − CDF)
    0   0    100         0.00    0.00    1.00        0.00               0.00
    1   10   90          0.10    0.10    0.90        0.11               0.11
    2   20   70          0.20    0.30    0.70        0.25               0.25
    3   30   40          0.30    0.60    0.40        0.55               0.55
    4   5    35          0.05    0.65    0.35        0.13               0.13
    5   35   0           0.35    1.00    0.00        2.00               2.00
  • 44.
    What is the probability that someone selected at random will die at $T < 2$?
    T   n    N at risk   f(t)    F(t)    S(t)        h(t) = f(t)/S(t)   h(t) by definition
                         (PDF)   (CDF)   (1 − CDF)
    0   0    100         0.00    0.00    1.00        0.00               0.00
    1   10   90          0.10    0.00    1.00        0.10               0.10
    2   20   70          0.20    0.10    0.90        0.22               0.22
    3   30   40          0.30    0.30    0.70        0.43               0.43
    4   5    35          0.05    0.60    0.40        0.13               0.13
    5   35   0           0.35    0.65    0.35        1.00               1.00
  • 46.
    Cumulative survival function, $S(t)$
    • $S(t)$ is the probability that a person survives at least to some specified time $t$
    • $S(t) = P(T \ge t)$
  • 47.
    What is the probability that someone selected at random survives to $T \ge 2$?
    T   n    N at risk   f(t)    F(t)    S(t)        h(t) = f(t)/S(t)   h(t) by definition
                         (PDF)   (CDF)   (1 − CDF)
    0   0    100         0.00    0.00    1.00        0.00               0.00
    1   10   90          0.10    0.00    1.00        0.10               0.10
    2   20   70          0.20    0.10    0.90        0.22               0.22
    3   30   40          0.30    0.30    0.70        0.43               0.43
    4   5    35          0.05    0.60    0.40        0.13               0.13
    5   35   0           0.35    0.65    0.35        1.00               1.00
  • 49.
    Hazard function, $h(t)$
    The probability of the event occurring at exactly time $t_i$, provided the patient has survived beyond $t_{i-1}$ (that is, up to $t_i$):
    $h(t_i) = P(T = t_i \mid T > t_{i-1})$
    Put another way, it represents the instantaneous event rate for an individual who has already survived to time $t$.
  • 50.
    What is the probability that someone selected at random survives to $T \ge 2$?
    T   n    N at risk   f(t)    F(t)    S(t)        h(t) by definition   h(t) = f(t)/S(t)
                         (PDF)   (CDF)   (1 − CDF)
    0   0    100         0.00    0.00    1.00        0.00                 0.00
    1   10   90          0.10    0.00    1.00        0.10                 0.10
    2   20   70          0.20    0.10    0.90        0.22                 0.22
    3   30   40          0.30    0.30    0.70        0.43                 0.43
    4   5    35          0.05    0.60    0.40        0.13                 0.13
    5   35   0           0.35    0.65    0.35        1.00                 1.00
  • 52.
    Hazard from $f(t)$ and $S(t)$:
    $h(t) = \dfrac{f(t)}{S(t)}$
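    The relation between the four functions can be sketched in Python using the yearly death counts of the 100-subject cohort. The function name is hypothetical; the sketch assumes no censoring and the convention $F(t) = P(T < t)$, $S(t) = P(T \ge t)$, so it should reproduce the corresponding version of the table up to rounding.

```python
def discrete_survival_functions(deaths, n_total):
    """Return f(t), F(t), S(t), h(t) for t = 0, 1, 2, ...

    deaths[t] is the number of deaths at time t; nobody is censored.
    Convention: F(t) = P(T < t), S(t) = P(T >= t), h(t) = f(t) / S(t).
    """
    rows = []
    died_before = 0  # deaths strictly before the current time
    for t, d in enumerate(deaths):
        f = d / n_total
        F = died_before / n_total
        S = (n_total - died_before) / n_total
        h = f / S if S > 0 else float("nan")
        rows.append({"t": t, "f": f, "F": F, "S": S, "h": h})
        died_before += d
    return rows


# Deaths per year in the cohort of 100 subjects from the slides.
table = discrete_survival_functions([0, 10, 20, 30, 5, 35], n_total=100)
for row in table:
    print(f"T={row['t']}  f={row['f']:.2f}  F={row['F']:.2f}  "
          f"S={row['S']:.2f}  h={row['h']:.2f}")
```

    Because nobody is censored here, $h(t)$ is simply the number of deaths at time $t$ divided by the number still alive just before $t$.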
  • 53.
    What is the probability that someone selected at random survives to $T \ge 2$?
    T   n    N at risk   f(t)    F(t)    S(t)        h(t) by definition   h(t) = f(t)/S(t)
                         (PDF)   (CDF)   (1 − CDF)
    0   0    100         0.00    0.00    1.00        0.00                 0.00
    1   10   90          0.10    0.00    1.00        0.10                 0.10
    2   20   70          0.20    0.10    0.90        0.22                 0.22
    3   30   40          0.30    0.30    0.70        0.43                 0.43
    4   5    35          0.05    0.60    0.40        0.13                 0.13
    5   35   0           0.35    0.65    0.35        1.00                 1.00
  • 59.
    Types of survival analysis
    • Nonparametric: no assumption about the shape of the hazard function. The hazard function is estimated from empirical data, showing change over time. Example: Kaplan-Meier survival analysis.
    • Semi-parametric: no assumption about the shape of the hazard function, but an assumption about how covariates affect the hazard function. Example: Cox regression.
    • Parametric: the shape of the baseline hazard function and the covariate effects on the hazard function are specified in advance. Used for predictive modelling.
  • 60.
    One assumption for all survival analysis
    No systematic differences between censored and uncensored cases.
  • 63.
    Types of predictors
    • Time-independent predictors
    • Time-dependent predictors
  • 64.
    Time-independent variable
    • A time-independent variable is any variable whose value for a given individual does not change over time. An example is sex.
  • 65.
    What about smoking status, age, weight, treatment?
  • 66.
    Types of predictors
    • In prospective studies, when individuals are followed over time, the values of covariates may change with time. Covariates can thus be divided into fixed and time-dependent.
  • 69.
    Anderson.sav
    • It consists of remission survival times for 42 leukemia patients, half of whom received a new therapy and the other half a standard therapy (Freireich et al., Blood, 1963).
    • The exposure variable of interest is treatment status (Rx = 0 if new treatment, Rx = 1 if standard treatment).
    • Two other variables for control are log white blood cell count (logWBC) and sex.
    • Failure status is defined by the relapse variable (0 if censored, 1 if failure).
  • 70.
  • 72.
    • Life Tables is a descriptive procedure for examining the distribution of time-to-event variables.
    • We can also compare the distribution by levels of a factor variable.
    • The basic idea of life tables is to subdivide the period of observation into smaller time intervals.
    • The probability of the event is then estimated for each interval.
  • 73.
  • 74.
    Life table
    The beginning value for each interval. Each interval extends from its start time up to the start time of the next interval.
  • 75.
    Life table
    The number of surviving cases at the beginning of the interval. This number decreases steadily as cases relapse or are censored.
  • 76.
    Life table
    The number of censored cases in this interval. They are lost to follow-up while in remission.
  • 77.
    Life table
    The number of surviving cases minus one half of the censored cases. This adjustment is intended to account for the effect of the censored cases.
  • 78.
    Life table
    The number of cases that have had a relapse. These are cases with status = 1.
  • 79.
    Life table
    The ratio of terminal events to the number exposed to risk (5/42 = 0.12).
  • 80.
    Life table
    One minus the proportion terminating (1 − 0.12 = 0.88).
  • 81.
    Life table
    The proportion of cases surviving from the start of the table to the end of the interval: (42 − 5 − 8)/42 = 0.6905 (second row).
  • 83.
    Life table
    An estimate of the probability of experiencing the terminal event per time unit during the interval.
    $f(t_i) = \dfrac{S(t_i) - S(t_{i+1})}{t_{i+1} - t_i}$
    2nd row: $\dfrac{0.88 - 0.69}{4} = 0.0475$
  • 84.
    Life table
    An estimate of the probability of experiencing the terminal event per time unit during the interval, conditional upon surviving to the start of the interval.
    $h(t_i) = \dfrac{f(t_i)}{\mathrm{avg}[S(t_i), S(t_{i+1})]}$
    1st row: $\dfrac{0.03}{(0.88 + 1)/2} = 0.03$
    2nd row: $\dfrac{0.048}{(0.69 + 0.88)/2} = 0.06$
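    The life-table columns described on the preceding slides can be sketched in Python as follows. The function and field names are hypothetical, and this is not the SPSS implementation. The input uses the first two intervals of the leukemia example (42 entering, then 5 and 8 relapses, 4-week intervals); zero censored cases in these two intervals is an assumption that is consistent with the 5/42 and (42 − 5 − 8)/42 figures quoted above.

```python
def life_table(intervals, width):
    """Actuarial life table.

    intervals: list of (entering, withdrawn, events) counts per interval.
    width: length of each interval in time units.
    """
    rows = []
    surv_start = 1.0  # cumulative proportion surviving at interval start
    for entering, withdrawn, events in intervals:
        exposed = entering - withdrawn / 2.0    # number exposed to risk
        prop_terminating = events / exposed
        prop_surviving = 1.0 - prop_terminating
        surv_end = surv_start * prop_surviving  # cumulative survival at end
        density = (surv_start - surv_end) / width
        hazard = density / ((surv_start + surv_end) / 2.0)
        rows.append({
            "exposed": exposed,
            "prop_terminating": prop_terminating,
            "prop_surviving": prop_surviving,
            "cum_surviving": surv_end,
            "density": density,
            "hazard": hazard,
        })
        surv_start = surv_end
    return rows


# (entering, withdrawn, events); the zero withdrawals are an assumption.
rows = life_table([(42, 0, 5), (37, 0, 8)], width=4)
for i, r in enumerate(rows, start=1):
    print(f"row {i}: cum_surviving={r['cum_surviving']:.4f}  "
          f"density={r['density']:.4f}  hazard={r['hazard']:.4f}")
```

    With censored cases in an interval, only the `exposed` line changes: half of the withdrawn cases are subtracted from the number entering.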
  • 87.
    Life table: median survival time
    It is the time at which half the patients are expected to be alive, meaning that the chance of surviving beyond that time is 50 percent. Here the median survival is approximately 12 weeks.
  • 88.
    The X-axis shows the time to event. The Y-axis shows the probability of survival.
  • 90.
    Any point on the survival curve shows the probability that a case remains in remission past that time. For example, the probability of remaining in remission past 8 weeks is around 70%.
  • 92.
    GW
    Compare time to relapse between males and females among patients with leukemia.
    • What is the probability of surviving past 20 weeks among men?
    • What is the probability of surviving past 20 weeks among women?
  • 95.
  • 96.
    K-M provides an estimate of S(t). It can be used with or without censored data.
  • 97.
    The survival curves give a visual representation of the life tables.
    • X-axis: the time to event. Drops in the survival curve occur whenever a patient experiences the event (here, a relapse).
    • Y-axis: P(not experiencing a relapse).
  • 98.
    Horizontal gaps vs. vertical gaps
    Notice the vertical and horizontal gaps.
  • 99.
    Horizontal gaps vs. vertical gaps
    Horizontal gap at S(t) = 0.6: a horizontal gap means that it took longer for one group to experience a certain fraction of deaths.
  • 100.
    Horizontal gaps vs. vertical gaps
    Horizontal gap at S(t) = 0.6: sixty percent remained alive up to 13 weeks in GP2, compared to 5 weeks in GP3.
  • 101.
    Horizontal gaps vs. vertical gaps
    Vertical gap at t = 10 weeks: a vertical gap means that at a specific time point, one group had a greater fraction of subjects surviving.
  • 102.
    Horizontal gaps vs. vertical gaps
    Vertical gap at t = 10 weeks: at the 10th week, 63% of GP2 remain alive, compared to 20% in GP3.
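    Reading vertical and horizontal gaps off two step curves can be sketched in Python as below. The helper names and both curves are hypothetical; they are not the GP2/GP3 curves from the slides. Each curve is a list of (time, survival) points at which the curve drops.

```python
def survival_at(curve, t):
    """S(t) for a step curve given as sorted (time, survival) drop points."""
    s = 1.0
    for drop_time, value in curve:
        if drop_time > t:
            break
        s = value
    return s


def time_to_level(curve, level):
    """First time at which the curve falls to `level` or below (None if never)."""
    for drop_time, value in curve:
        if value <= level:
            return drop_time
    return None


# Hypothetical step curves for two groups (time in weeks).
group_x = [(4, 0.9), (9, 0.7), (13, 0.6), (20, 0.4)]
group_y = [(2, 0.8), (5, 0.6), (8, 0.3), (10, 0.2)]

# Vertical gap: difference in survival at a fixed time point.
vertical_gap = survival_at(group_x, 10) - survival_at(group_y, 10)
# Horizontal gap: difference in time needed to reach a fixed survival level.
horizontal_gap = time_to_level(group_x, 0.6) - time_to_level(group_y, 0.6)

print(f"vertical gap at t=10: {vertical_gap:.2f}")
print(f"horizontal gap at S(t)=0.6: {horizontal_gap} weeks")
```

    Calling `time_to_level(curve, 0.5)` gives the median survival time in the same way.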
  • 103.
    The survival curves give a visual representation of the life tables. The KM curves are quite different, with GP1 having a consistently better survival prognosis than GP2, and GP2 having a consistently better survival prognosis than GP3. Note also that the difference between GP1 and GP2 is about the same over time, whereas GP2 appears to diverge from
  • 104.
    • The Kaplan-Meier procedure is a method of estimating time-to-event models in the presence of censored cases.
    • It is a descriptive procedure for examining the distribution of time-to-event variables. We can also compare the distribution by levels of a factor variable or produce separate analyses by levels of a stratification variable.
    • Censored cases (right-censored cases) are those for which the event of interest has not yet happened.
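    How the Kaplan-Meier estimate handles censored cases can be sketched in a few lines of Python. This is a minimal product-limit implementation for illustration, not the SPSS procedure; the function name and the six follow-up times are hypothetical. Subjects censored at an event time are treated as still at risk at that time, which is the usual convention.

```python
def kaplan_meier(times, events):
    """Kaplan-Meier (product-limit) estimate of S(t).

    times: follow-up time for each subject.
    events: 1 if the event was observed at that time, 0 if censored.
    Returns a list of (event_time, survival_probability) pairs.
    """
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        leaving = 0  # everyone whose follow-up ends at time t
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            leaving += 1
            i += 1
        if deaths > 0:
            # The curve only drops at times where an event is observed.
            survival *= 1.0 - deaths / n_at_risk
            curve.append((t, survival))
        n_at_risk -= leaving
    return curve


# Hypothetical data: times in weeks, 1 = relapse, 0 = censored.
km_curve = kaplan_meier([2, 3, 3, 5, 6, 8], [1, 1, 0, 1, 0, 1])
for t, s in km_curve:
    print(f"t={t}  S(t)={s:.3f}")
```

    Censored subjects never cause a drop; they only reduce the number at risk for later event times.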
  • 105.
    Uses
    • Intuitive graphical presentation.
    • Commonly used to describe the survivorship of study population(s).
    • Commonly used to compare two study populations.
  • 107.
    Breslow by thenumber of cases at risk at each time point. Log rank  the same. Tarone-Ware the square root of the number of cases at risk at each time point. All of them test the equality of survival functions by weighting all time points …
  • 108.
    The full name of the Breslow statistic is the Gehan-Breslow-Wilcoxon test (after Edmund Alpheus Gehan, Norman Edward Breslow and Frank Wilcoxon).
  • 109.
  • 110.
    Pooled over strata: a single test is computed for all factor levels, testing for equality of survival functions across all levels of the factor variable.
    For each stratum: a separate test is computed for each group formed by the stratification variable.
  • 111.
    Pairwise over strata: a separate test is computed for each pair of factor levels when a pooled test shows non-equality of survival functions.
    Pairwise for each stratum: a separate test is computed for each pair of factor levels, for each stratum of the stratification variable.
  • 112.
    SPSS output
    The table shows, for each group, the number of cases, the number that had a relapse, and the number that were censored.
  • 113.
    The table is very large. The survival table is a descriptive table that details the time to event. Each observation occupies a row. The table is sectioned by each level of ln WBC.
  • 114.
    Time: the time at which the event or censoring occurred.
    Status: whether the case relapsed or was censored.
    Cumulative Proportion Surviving at the Time: the proportion of cases surviving from the start of the table until this time.
    N of Cumulative Events: the number of cases that relapsed from the start of the table until this time.
    N of Remaining Cases: the number of cases that, at this time, have not yet relapsed or been censored.
  • 115.
    N of Remaining Cases: the number of cases that, at this time, have not yet relapsed or been censored.
  • 116.
    • Survival time is shortest in GP3, followed by GP2, and longest in GP1.
    • The confidence intervals do not overlap between groups; thus, differences in time to event can be inferred.
  • 117.
    This table provides overall tests of the equality of survival times across groups. Since the significance values of the tests are all < 0.05, there is a statistically significant difference in the distribution of time to relapse among the three groups.
  • 118.
    Pairwise comparisons revealed that the relapse distributions differed significantly across all pairs of groups.
  • 121.
    Uses
    • Commonly used to describe the survivorship of study population(s).
    • Commonly used to compare two study populations.
    • Intuitive graphical presentation.
  • 122.
    Limitations
    • Mainly descriptive
    • Doesn't control for covariates
    • Requires categorical predictors
    • Can't accommodate time-dependent variables
  • 124.
    Using a suitable graphic representation and statistical test:
    • Compare the effect of treatment on relapse among patients with leukemia
    • Compare the effect of treatment on relapse among males
    • Compare the effect of treatment on relapse among females
  • 127.

Editor's Notes

  • #19 For example, suppose that despite the treatment being randomized in the lung cancer trial, older patients were assigned more often to the radiotherapy alone group. This group would have a worse baseline prognosis and so the simple analysis may have underestimated its efficacy compared to the combination treatment, referred to as confounding between treatment and age.
  • #22 It is important to note that because overall mortality is the event of interest, nonfatal relapses are ignored, and those who have not died are considered (right) censored. Figure 1 (right) is specific to the outcome or event of interest. Here, death from any cause, often called overall survival, was the outcome of interest. If we were interested solely in ovarian cancer deaths, then patients 5 and 6, those who died from non-ovarian causes, would be censored. In general, it is good practice to choose an end-point that cannot be misclassified. All-cause mortality is a more robust end-point than a specific cause of death. If we were interested in time to relapse, those who did not have a relapse (fatal or nonfatal) would be censored at either the date of death or the date of last follow-up.