This document discusses methods for evaluating discrimination for survival outcomes using time-dependent measures. It connects the time-dependent area under the ROC curve (AUC) to the time-dependent predictiveness curve. The AUC can be estimated based on the predictiveness curve, which plots the risk of an event versus quantiles of a marker over time. Simulation studies assess the impact of model misspecification when estimating the conditional risk function used to derive estimates of the time-dependent AUC.
1. Discrimination measures for survival outcomes:
Connection between the AUC and the predictiveness curve
V. Viallon, A. Latouche
University Lyon 1
University Versailles Saint Quentin
October 13, 2011
Viallon (Univ. Lyon 1) Connecting the pred. curve and the AUC 1 / 22
2. Outline
1 The binary outcome setting
  The AUC
  The predictiveness curve
  Connecting the AUC and the predictiveness curve
2 The survival outcome setting
  Time-dependent outcomes definitions
  Connecting time-dependent AUC and time-dependent predictiveness curve
  Estimates of the time-dependent AUC
  Some synthetic examples
3. AUC for binary outcomes
Let D be a binary (0/1) variable representing the status with respect to a given disease:
D = 1 for diseased individuals;
D = 0 for non-diseased individuals.
Further let X be a continuous marker. For any threshold $c \in \mathbb{R}$:
X > c: the test is positive;
X ≤ c: the test is negative.
$\mathrm{TPR}(c) = \mathbb{P}(X > c \mid D = 1)$ and $\mathrm{FPR}(c) = \mathbb{P}(X > c \mid D = 0)$.
ROC curve: $\{(\mathrm{FPR}(c), \mathrm{TPR}(c)),\ c \in \mathbb{R}\}$.
AUC: $\int \mathrm{TPR}(c)\, d\mathrm{FPR}(c)$, i.e. the area under the ROC curve (FPR running from 0 to 1).
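As a rough illustration of these definitions, the Python sketch below computes an empirical ROC curve and its AUC on simulated data. The simulation settings (unit-variance normal markers with a mean shift of 1 in diseased subjects, prevalence 0.5, n = 400) and all function names are assumptions made for this example; they are not taken from the talk.

```python
import random

def roc_points(x, d):
    """Empirical ROC points (FPR(c), TPR(c)) over all observed thresholds c."""
    cases = [xi for xi, di in zip(x, d) if di == 1]
    controls = [xi for xi, di in zip(x, d) if di == 0]
    # Thresholds run from the largest marker value down to -infinity,
    # so the curve goes from (0, 0) to (1, 1).
    thresholds = sorted(set(x), reverse=True) + [float("-inf")]
    points = []
    for c in thresholds:
        tpr = sum(xi > c for xi in cases) / len(cases)
        fpr = sum(xi > c for xi in controls) / len(controls)
        points.append((fpr, tpr))
    return points

def auc_trapezoid(points):
    """Area under the ROC curve by the trapezoidal rule."""
    area = 0.0
    for (f0, t0), (f1, t1) in zip(points, points[1:]):
        area += (f1 - f0) * (t0 + t1) / 2.0
    return area

def auc_mann_whitney(x, d):
    """Proportion of (case, control) pairs with X_case > X_control (ties count 1/2)."""
    cases = [xi for xi, di in zip(x, d) if di == 1]
    controls = [xi for xi, di in zip(x, d) if di == 0]
    wins = sum((xc > x0) + 0.5 * (xc == x0) for xc in cases for x0 in controls)
    return wins / (len(cases) * len(controls))

random.seed(0)
d = [1 if random.random() < 0.5 else 0 for _ in range(400)]
x = [random.gauss(1.0 if di == 1 else 0.0, 1.0) for di in d]
points = roc_points(x, d)
auc_roc = auc_trapezoid(points)
auc_mw = auc_mann_whitney(x, d)
print(round(auc_roc, 3), round(auc_mw, 3))
```

The two numbers agree because the trapezoidal area under the empirical ROC curve is the Mann-Whitney statistic, which is why the AUC is often read as the probability that a diseased subject has a larger marker value than a non-diseased one.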
4. Binary outcomes: a toy example
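$\mathrm{FPR}(c) = \mathbb{P}(X > c \mid D = 0)$
$\mathrm{TPR}(c) = \mathbb{P}(X > c \mid D = 1)$
[Figure: left panel, marker values (from about −4 to 6) in the groups D = 0 and D = 1; right panel, the ROC curve, TPR versus FPR.]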
9. Predictiveness curve for binary outcomes
Many alternative criteria have been proposed for evaluating discrimination:
proportion of explained variation;
standardized total gain;
risk reclassification measures (Pencina et al., SiM, 2006).
Most of them can be expressed as simple functions of the predictiveness curve (Gu and Pepe, Int. J. Biostatistics, 2009).
Denote by $G^{-1}$ the quantile function of X. For any $q \in [0, 1]$, let
$R(q) = \mathbb{P}(D = 1 \mid X = G^{-1}(q))$
be the risk associated with the q-th quantile of X.
The predictiveness curve plots R(q) versus q.
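To make the definition of R(q) concrete, here is a minimal sketch of a crude, binned estimate of the predictiveness curve: subjects are sorted by marker value, split into ten quantile bins, and the event proportion is computed in each bin. The binning approach, the logistic risk model used to simulate D, and the function name are all assumptions for illustration; the talk does not prescribe an estimator at this point.

```python
import math
import random

def binned_predictiveness(x, d, n_bins=10):
    """Crude estimate of R(q): event proportion within consecutive quantile bins of X."""
    order = sorted(range(len(x)), key=lambda i: x[i])
    size = len(x) // n_bins  # assumes len(x) is a multiple of n_bins
    curve = []
    for b in range(n_bins):
        idx = order[b * size:(b + 1) * size]
        curve.append(sum(d[i] for i in idx) / size)
    return curve

random.seed(2)
n = 1000
x = [random.gauss(0.0, 1.0) for _ in range(n)]
# Hypothetical risk model: P(D = 1 | X = x) = 1 / (1 + exp(-2x)).
d = [1 if random.random() < 1.0 / (1.0 + math.exp(-2.0 * xi)) else 0 for xi in x]
curve = binned_predictiveness(x, d)
prevalence = sum(d) / n
for b, r in enumerate(curve):
    print(f"q in [{b / 10:.1f}, {(b + 1) / 10:.1f}): R = {r:.2f}")
```

With equal-sized bins, the average of the binned curve equals the observed prevalence, which mirrors the identity $p = \int_0^1 R(q)\,dq$.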
10. Predictiveness curves and their corresponding AUC values
With $p = \mathbb{P}(D = 1) = \int_0^1 R(q)\,dq = 0.5$.
[Figure: five predictiveness curves R1 to R5 plotted against the quantiles q, with their AUC values: R1 (AUC = 0.500), R2 (AUC = 0.700), R3 (AUC = 0.833), R4 (AUC = 0.928), R5 (AUC = 1.000).]
11. The relation in the binary outcome setting
Still denote by R the predictiveness curve of marker X,
$R(q) = \mathbb{P}(D = 1 \mid X = G^{-1}(q))$.
Then, denoting by $p = \mathbb{P}(D = 1) = \int_0^1 R(q)\,dq$ the disease prevalence, the AUC of marker X is given by
$$\mathrm{AUC} = \frac{\int_0^1 q R(q)\,dq - p^2/2}{p(1 - p)}.$$
We can check that
AUC = 0.5 when R(q) = p;
AUC = 1 when $R(q) = \mathbf{1}_{[1-p,\,1]}(q)$.
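The two checks can also be reproduced numerically. The sketch below evaluates the formula with a midpoint rule on [0, 1]; the function name, the grid size and the third test curve R(q) = q are assumptions added for illustration.

```python
def auc_from_predictiveness(R, m=10000):
    """AUC computed from a predictiveness curve R on [0, 1] by the midpoint rule."""
    qs = [(k + 0.5) / m for k in range(m)]
    p = sum(R(q) for q in qs) / m            # prevalence: integral of R(q)
    q_r = sum(q * R(q) for q in qs) / m      # integral of q * R(q)
    return (q_r - p ** 2 / 2.0) / (p * (1.0 - p))

p = 0.3
auc_flat = auc_from_predictiveness(lambda q: p)                             # uninformative marker
auc_step = auc_from_predictiveness(lambda q: 1.0 if q >= 1.0 - p else 0.0)  # perfect marker
auc_linear = auc_from_predictiveness(lambda q: q)                           # hypothetical R(q) = q
print(auc_flat, auc_step, auc_linear)
```

For R(q) = q the prevalence is 1/2 and the formula gives (1/3 − 1/8) / (1/4) = 5/6.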
12. Extensions to survival outcomes
In a prospective cohort study, the outcome (e.g., the disease status) can change over time
⇒ we consider time-dependent outcomes, TPR, FPR, ROC curves, AUC and predictiveness curve.
Notation:
$T_i$ and $C_i$: survival and censoring times for subject i;
$(Z_i, \delta_i)_{1 \le i \le n}$ with $Z_i = \min(T_i, C_i)$ and $\delta_i = \mathbf{1}(T_i \le C_i)$;
$D_i(t)$: time-dependent outcome status for subject i at time t.
13. Heagerty and Zheng’s Taxonomy
Today's talk focuses on cumulative cases and dynamic controls:
cumulative cases: $D_i(t) = 1$ if $T_i \le t$;
dynamic controls: $D_i(t) = 0$ if $T_i > t$;
so that $D_i(t) = \mathbf{1}\{T_i \le t\}$.
⇒ discrimination between subjects who had the event prior to time t and those who were still event-free at time t.
14. Cumulative cases and Dynamic controls
For a given evaluation time $t_0$:
Cumulative true positive rates are
$\mathrm{TPR}^{C}(c, t_0) = \mathbb{P}(X > c \mid D(t_0) = 1) = \mathbb{P}(X > c \mid T \le t_0)$;
Dynamic false positive rates are
$\mathrm{FPR}^{D}(c, t_0) = \mathbb{P}(X > c \mid D(t_0) = 0) = \mathbb{P}(X > c \mid T > t_0)$;
$\mathrm{AUC}^{C,D}(t_0) = \int_{-\infty}^{\infty} \mathrm{TPR}^{C}(c, t_0)\, d\big[\mathrm{FPR}^{D}(c, t_0)\big]$ (oriented so that the area is positive).
But $\mathbf{1}(T_i \le t_0)$ is not observed for all i due to censoring!
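The censoring issue can be made concrete with a small sketch: from the observed pair $(Z_i, \delta_i)$, the status $D_i(t_0)$ is known when the subject is still under observation after $t_0$ or had an observed event by $t_0$, and unknown when the subject was censored before $t_0$. The exponential rates and the function name below are hypothetical choices for illustration only.

```python
import random

def status_at(z, delta, t0):
    """Return D(t0) when it can be read off (Z, delta), and None otherwise."""
    if z > t0:
        return 0       # still event-free at t0, whatever delta is
    if delta == 1:
        return 1       # event observed at Z <= t0
    return None        # censored before t0: D(t0) is unknown

random.seed(1)
n, t0 = 1000, 1.0
counts = {1: 0, 0: 0, None: 0}
for _ in range(n):
    t = random.expovariate(1.0)   # survival time T_i (hypothetical rate 1)
    c = random.expovariate(0.5)   # censoring time C_i (hypothetical rate 0.5)
    z, delta = min(t, c), int(t <= c)
    counts[status_at(z, delta, t0)] += 1
print(counts)
```

Subjects counted under `None` are the ones that prevent a direct empirical computation of the cumulative/dynamic TPR and FPR.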
16. Workaround for $\mathrm{AUC}^{C,D}$
Using Bayes' theorem (see, e.g., Chambless & Diao),
$$\mathrm{AUC}^{C,D}(t_0) = \int_{-\infty}^{\infty} \int_{c}^{\infty} \frac{F(t_0; X = x)\,[1 - F(t_0; X = c)]}{[1 - F(t_0)]\,F(t_0)}\, g(x)\, g(c)\, dx\, dc$$
with
$F(t) = \mathbb{P}(T \le t)$ the risk function at time t;
$F(t; X = x) = \mathbb{P}(T \le t \mid X = x)$ the conditional risk function at time t;
g the density function of marker X.
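The step behind this expression can be spelled out as follows. This is a sketch of the standard Bayes argument written in the notation of this slide, not a reproduction of the authors' derivation:

```latex
\begin{aligned}
\mathrm{TPR}^{C}(c, t_0) &= \mathbb{P}(X > c \mid T \le t_0)
  = \frac{\int_{c}^{\infty} F(t_0; X = x)\, g(x)\, dx}{F(t_0)}, \\
\mathrm{FPR}^{D}(c, t_0) &= \mathbb{P}(X > c \mid T > t_0)
  = \frac{\int_{c}^{\infty} [1 - F(t_0; X = x)]\, g(x)\, dx}{1 - F(t_0)}, \\
-\frac{\partial}{\partial c}\, \mathrm{FPR}^{D}(c, t_0)
  &= \frac{[1 - F(t_0; X = c)]\, g(c)}{1 - F(t_0)} .
\end{aligned}
```

Integrating $\mathrm{TPR}^{C}(c, t_0)$ against the last quantity over $c \in \mathbb{R}$ gives the double integral, in which only conditional and marginal risks appear, so that the unobserved statuses $\mathbf{1}(T_i \le t_0)$ are no longer needed.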
17. Predictiveness curve and $\mathrm{AUC}^{C,D}$
Introduce the time-dependent predictiveness curve
$R(t; q) := \mathbb{P}(D(t) = 1 \mid X = G^{-1}(q)) = \mathbb{P}(T \le t \mid X = G^{-1}(q))$.
We established that
$$\mathrm{AUC}^{C,D}(t_0) = \frac{\int_0^1 q R(t_0; q)\,dq - \tfrac{1}{2} F(t_0)^2}{F(t_0)\,[1 - F(t_0)]}.$$
Proper estimation of $R(t_0; q)$ (especially for q close to 1) should yield proper estimation of $\mathrm{AUC}^{C,D}(t_0)$.
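As a numerical sanity check, the sketch below evaluates $\mathrm{AUC}^{C,D}(t_0)$ in two ways, through the Bayes double integral and through the predictiveness-curve formula, and compares the results. The setting is an assumption chosen for convenience: a standard normal marker and an exponential model with conditional risk $F(t; X = x) = 1 - \exp(-t\, e^{\alpha x})$. The integral of $q R(t_0; q)$ is computed on the marker scale as $\int G(x) F(t_0; X = x) g(x)\,dx$.

```python
import math

def cd_auc_two_ways(t0, alpha, m=4000, lo=-8.0, hi=8.0):
    """Return (Bayes double integral, predictiveness-curve formula) for AUC^{C,D}(t0)."""
    h = (hi - lo) / m
    xs = [lo + (k + 0.5) * h for k in range(m)]                                 # midpoint grid
    mass = [math.exp(-x * x / 2.0) / math.sqrt(2.0 * math.pi) * h for x in xs]  # g(x) dx
    risk = [1.0 - math.exp(-t0 * math.exp(alpha * x)) for x in xs]              # F(t0; X = x)
    w = [r * g for r, g in zip(risk, mass)]             # case mass: F g dx
    v = [(1.0 - r) * g for r, g in zip(risk, mass)]     # control mass: (1 - F) g dx
    F = sum(w)                                          # marginal risk F(t0)

    # Bayes form: control mass at c times case mass above c.
    bayes_num, tail = 0.0, F
    for wk, vk in zip(w, v):
        tail -= wk                                      # case mass strictly above this cell
        bayes_num += vk * (tail + 0.5 * wk)             # half weight on the diagonal cell
    auc_bayes = bayes_num / (F * sum(v))

    # Predictiveness form: integral of G(x) F(t0; x) g(x) dx, then the closed formula.
    q_r, G = 0.0, 0.0
    for wk, gk in zip(w, mass):
        q_r += (G + 0.5 * gk) * wk                      # G evaluated at the cell midpoint
        G += gk
    auc_pred = (q_r - F ** 2 / 2.0) / (F * (1.0 - F))
    return auc_bayes, auc_pred

auc_bayes, auc_pred = cd_auc_two_ways(t0=1.0, alpha=1.0)
print(round(auc_bayes, 4), round(auc_pred, 4))
```

The two values coincide up to rounding error, as the relation predicts.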
18. Deriving estimates for $\mathrm{AUC}^{C,D}(t)$
G and g: cdf and pdf of X.
$X_{(i)}$: i-th order statistic of $(X_1, \ldots, X_n)$.
$F_n(t_0; x)$: estimator of the conditional risk $F(t_0; X = x)$.
Using the change of variable $x = G^{-1}(q)$, we have
$\int_0^1 q R(t_0; q)\,dq = \int_{-\infty}^{\infty} G(x) F(t_0; X = x) g(x)\,dx$, so that
$$\frac{1}{n} \sum_{i=1}^{n} \frac{i}{n}\, F_n(t_0; X_{(i)})$$
is the empirical counterpart of $\int_0^1 q R(t_0; q)\,dq$.
19. Deriving estimates for $\mathrm{AUC}^{C,D}(t)$
To estimate the marginal risk function F, we can use the KM estimator $F_{n,(1)}(t_0)$.
Since $F(t_0) = \int F(t_0; x) g(x)\,dx$, we can also use
$$F_{n,(2)}(t_0) = \frac{1}{n} \sum_{i=1}^{n} F_n(t_0; X_i).$$
This yields two estimators for $\mathrm{AUC}^{C,D}(t_0)$, namely, for k = 1, 2,
$$\mathrm{AUC}^{C,D}_{n,(k)}(t_0) = \frac{\frac{1}{n} \sum_{i=1}^{n} \frac{i}{n}\, F_n(t_0; X_{(i)}) - F_{n,(k)}(t_0)^2/2}{F_{n,(k)}(t_0)\,[1 - F_{n,(k)}(t_0)]}.$$
Experimental results (not shown) suggested better performance with k = 2.
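A minimal sketch of the k = 2 estimator follows. The function takes the marker values and any callable standing for $F_n(t_0; \cdot)$; how that conditional risk is fitted is left open. In the usage lines, the true risk of an exponential Cox-type model is plugged in place of a fitted estimate, which is an assumption made purely to keep the example self-contained.

```python
import math
import random

def auc_cd_plugin(x, cond_risk):
    """Plug-in estimate AUC^{C,D}_{n,(2)}(t0) from marker values and a conditional risk.

    cond_risk(x) stands for F_n(t0; x); any fitted model could supply it.
    """
    n = len(x)
    risks_sorted = [cond_risk(xi) for xi in sorted(x)]        # F_n(t0; X_(i))
    q_r = sum((i / n) * r for i, r in enumerate(risks_sorted, start=1)) / n
    f2 = sum(risks_sorted) / n                                # F_{n,(2)}(t0)
    return (q_r - f2 ** 2 / 2.0) / (f2 * (1.0 - f2))

random.seed(3)
x = [random.gauss(0.0, 1.0) for _ in range(2000)]
t0, alpha = 1.0, 1.0
# Assumption: the true risk 1 - exp(-t0 * exp(alpha * x)) replaces a fitted F_n.
auc_hat = auc_cd_plugin(x, lambda xi: 1.0 - math.exp(-t0 * math.exp(alpha * xi)))
auc_null = auc_cd_plugin(x, lambda xi: 0.3)   # constant risk: uninformative marker
print(round(auc_hat, 3), round(auc_null, 3))
```

With a constant conditional risk the estimate is 0.5 up to a term of order 1/n, in line with the binary-outcome check AUC = 0.5 when R(q) = p.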
20. Simulation study
X: continuous marker.
T: survival time generated under
a Cox model $\lambda(t) = \lambda_0(t) \exp(\alpha X)$;
a time-varying coefficient Cox model $\lambda(t) = \lambda_0(t) \exp(\alpha(t) X)$.
Various censoring schemes were considered.
Estimation of the conditional risk $F(t; X = x)$ using
a Cox model;
an Aalen additive model;
a conditional KM estimator.
Goal: assess the effect of model misspecification, when estimating the conditional risk function, on the estimation of $\mathrm{AUC}^{C,D}(t)$.
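For readers who want to reproduce the flavour of such a simulation, the sketch below generates right-censored data under the first data-generating model. The talk does not state its baseline hazard, marker distribution or censoring schemes, so the constant baseline hazard, the standard normal marker, the exponential censoring and all parameter values here are assumptions. The time-varying coefficient model is not covered.

```python
import math
import random

def simulate_cox(n, alpha, lambda0=1.0, censor_rate=0.5, seed=0):
    """Simulate (Z_i, delta_i, X_i) under lambda(t) = lambda0 * exp(alpha * X)."""
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        x = rng.gauss(0.0, 1.0)                              # marker X
        u = 1.0 - rng.random()                               # uniform on (0, 1]
        t = -math.log(u) / (lambda0 * math.exp(alpha * x))   # inverse-transform draw of T
        c = rng.expovariate(censor_rate)                     # independent censoring time C
        data.append((min(t, c), int(t <= c), x))             # (Z, delta, X)
    return data

data = simulate_cox(n=2000, alpha=1.0)
censored = sum(1 - delta for _, delta, _ in data) / len(data)
print(f"proportion censored: {censored:.2f}")
```

Changing `censor_rate` is one simple way to mimic lighter or heavier censoring.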
21. Simulation under a Cox model
[Figure: AUC C/D versus time (0 to 5); curves shown: True, HLP, KM cond., Add. Aalen, Cox.]
22. Simulation under a Time-varying coefficient Cox model
[Figure: AUC C/D versus time (0.2 to 1.2); curves shown: True, HLP, KM cond., Add. Aalen, Cox.]
23. Conclusion
Relation between the predictiveness curve and $\mathrm{AUC}^{C,D}(t)$.
It makes it easy to derive estimates of $\mathrm{AUC}^{C,D}(t)$ given estimates of the conditional risk function.
Correctly specifying the model (when estimating the conditional risk function) is crucial to get proper estimation of $\mathrm{AUC}^{C,D}(t)$.
A similar relation can be obtained to connect the partial AUC (or partial $\mathrm{AUC}^{C,D}(t)$) to the (time-dependent) predictiveness curve.
The conditional risk function, through the predictiveness curve, is the key when assessing the discrimination of prognostic tools.
24. Sketch of the proof
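$$\begin{aligned}
&\int_{-\infty}^{\infty} \int_{c}^{\infty} F(t; X = x)\,[1 - F(t; X = c)]\, g(x)\, g(c)\, dx\, dc \\
&= \int_0^1 \int_v^1 F(t; X = G^{-1}(u))\,[1 - F(t; X = G^{-1}(v))]\, du\, dv \\
&= \int_0^1 \int_v^1 [1 - S(t; X = G^{-1}(u))]\, S(t; X = G^{-1}(v))\, du\, dv \\
&= \int_0^1 (1 - v)\, S(t; X = G^{-1}(v))\, dv - \int_0^1 \int_v^1 S(t; X = G^{-1}(u))\, S(t; X = G^{-1}(v))\, du\, dv \\
&= \int_0^1 (1 - v)\, S(t; X = G^{-1}(v))\, dv - \int_0^1 \int_0^1 S(t; X = G^{-1}(u))\, S(t; X = G^{-1}(v))\, \mathbf{1}(u \ge v)\, du\, dv.
\end{aligned}$$
Setting
$L(u, v) = S(t; X = G^{-1}(u))\, S(t; X = G^{-1}(v))$,
we have $L(u, v) = L(v, u)$ so that
$$\begin{aligned}
&\int_{-\infty}^{\infty} \int_{c}^{\infty} F(t; X = x)\,[1 - F(t; X = c)]\, g(x)\, g(c)\, dx\, dc \\
&= \int_0^1 (1 - v)\, S(t; X = G^{-1}(v))\, dv - \frac{1}{2} \int_0^1 \int_0^1 S(t; X = G^{-1}(u))\, S(t; X = G^{-1}(v))\, du\, dv \\
&= \int_0^1 (1 - v)\, S(t; X = G^{-1}(v))\, dv - \frac{1}{2} \left( \int_0^1 S(t; X = G^{-1}(v))\, dv \right)^2.
\end{aligned}$$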
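One further step, not shown on the slide, recovers the numerator of the $\mathrm{AUC}^{C,D}$ formula. It is written here under the assumption that $S(t; X = x) = 1 - F(t; X = x)$ denotes the conditional survival function, so that $S(t; X = G^{-1}(v)) = 1 - R(t; v)$ and $F(t) = \int_0^1 R(t; v)\,dv$:

```latex
\begin{aligned}
&\int_0^1 (1 - v)\,[1 - R(t; v)]\, dv
  - \frac{1}{2} \left( \int_0^1 [1 - R(t; v)]\, dv \right)^2 \\
&\quad = \frac{1}{2} - F(t) + \int_0^1 v\, R(t; v)\, dv - \frac{1}{2}\,[1 - F(t)]^2 \\
&\quad = \int_0^1 v\, R(t; v)\, dv - \frac{1}{2}\, F(t)^2 .
\end{aligned}
```

Dividing by $F(t)\,[1 - F(t)]$ then gives the expression of $\mathrm{AUC}^{C,D}(t)$ in terms of the time-dependent predictiveness curve.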
25. Simulation under a Time-varying coefficient Cox model
[Figure: two panels showing the predictiveness curve versus the quantile of the marker; curves shown in each panel: True, KM cond., Add. Aalen, Cox.]