JSM2015

Motivation and Goal
NLST Data and Probability Model
Application
Results and Summary
Long term eﬀects and over diagnosis of CT scan
in lung cancer.
Dongfeng Wu
Department of Bioinformatics and Biostatistics
School of Public Health and Information Sciences
University of Louisville
JSM, August 12, 2015
Wu screening eﬀects

Motivation and Goal
Application
Results and Summary
Background and Motivation
Hot debate on over-diagnosis, the diagnosis of cancer that never
would have become symptoms in a person’s lifetime.
We have developed a probability model for evaluating long term
eﬀects and over diagnosis for initially healthy people without any
screening history (Wu et al, Statistica Sinica 2014).
Old people may have gone through some screening exam before and
seems healthy so far. How to extend the original model to people
with a screening history?
We will investigate whether continued cancer screening for people at
risk cause greater chance of over-diagnosis, and to evaluate
long-term outcomes of regular screening for the whole cohort.
Methods will be applied to the computed tomography (CT) arm in
the National Lung Screening Trial (NLST) data.

Motivation and Goal
Application
Results and Summary
How to Evaluate Long-term Effects?
People in periodic screening were categorized into 4 mutually exclusive
groups: Symptom-free-life, No-early-detection, True-early-detection, and
Over-diagnosis, based on diagnosis status and ultimate lifetime disease
status.
Table 1. Definition of outcomes/events in screening
ultimate lifetime disease status
diagnosis status
no symptom before death having symptom before death
not-screen-detected Symptom-free-life No-early-detection
screen-detected Over-diagnosis True-early-detection
All initially superficially healthy participants in a screening program will
fall into one of the above 4 groups.

Motivation and Goal
Application
Results and Summary
Deﬁnition of Groups/Outcomes in Screening
Group 1 (Symptom-free-life(SympF)): A man who took part in
screening exams, no lung cancer was ever diagnosed, and untimately he
died of other causes.
Group 2 (No-early-detection(NoED)): A man who took part in
screening exams, lung cancer manifested itself clinically and was not
detected by scheduled screening exams.
Group 3 (True-early-detection(TrueED)): A man whose lung cancer was
diagnosed at a scheduled screening exam and his clinical symptom
would have appeared before death.
Group 4 (Over-Diagnosis(OverD)): A man who was diagnosed with lung
cancer at a scheduled screening exam BUT his clinical symptoms
would NOT have appeared before his death.
Probability of each group with no screening history has been derived and
estimated (Wu et al, Statistica Sinica 2014). Now we extend that work to older
people with a history.

Motivation and Goal
Application
Results and Summary
The Model
The Progressive disease model:
S0
disease free
Sp
t1
preclinical
Sc
t2
clinical
t
6
- -
sojourn time: t2 − t1.
lead time: t2 − t.
sensitivity: β = P(X = 1|D = 1).

Motivation and Goal
Application
Results and Summary
The NLST Study
About 54,000 Male and Female heavy smokers were enrolled
between 08/2002-04/2004. Data collection ﬁnished by
12/2009.
They were randomized to 2 arms: chest X-ray or low-dose
spiral CT.
Each arm underwent 3 annual screenings; more tumor cases
were diagnosed in the CT arm than that in the chest X-ray.
Initial screening age 55−74.

Motivation and Goal
Application
Results and Summary
Table 2: The NLST Data - Overview
Group within Study atotal subj. bScreen-diag. No. cInterval No.
The NLST: Chest X-ray
Overall 26226 279 177
male smokers 15500 165 107
female smokers 10726 114 70
The NLST: Spiral CT
Overall 26452 649 60
male smokers 15621 384 44
female smokers 10831 265 16
a Total number of people who ever received chest X-ray for lung cancer.
b Total number of subjects diagnosed by regular screening.
c Total number of clinical incident cases between two regular screenings.

Motivation and Goal
Application
Results and Summary
Deﬁnition
Let t0 < t1 < · · · < tk−1 < tk: k ordered screening exam
times.
ni : the number of individuals examined at ti−1
si : screening detected cases at the exam given at ti−1
ri : interval cases, the number of cases found in the clinical
state (Sc) within (ti−1, ti ).
(ni , si , ri ): data stratiﬁed by initial age in the i-th interval.

Motivation and Goal
Application
Results and Summary
Table 3: The NLST - CT group data
Age n1 s1 r1 n2 s2 r2 n3 s3 r3
· · · · · ·
60 1946 16 3 1847 13 1 1797 17 0
61 1786 18 0 1678 14 1 1659 11 3
62 1548 11 1 1452 8 2 1408 12 0
63 1427 14 1 1350 6 2 1320 11 0
64 1352 17 0 1287 18 72 1240 11 3
· · · · · ·

Motivation and Goal
Application
Results and Summary
The Probability Model
βi = β(ti ): sensitivity at age ti .
w(t)dt: transition probability from S0 → Sp at age (t, t + dt)
q(z): pdf of sojourn time S = Sc − Sp.
Q(z) = P(S > z) =
∞
z q(x)dx: survivor function of the
sojourn time.
A person’s life time T ∼ fT (t).
ith generation: people enter Sp during ith interval (ti−1, ti ),
i = 0, · · · , k, t−1 ≡ 0.

Motivation and Goal
Application
Results and Summary
Big picture: How to derive the probability?
0 t0
History ¤
t1 · · · tK1
↑
Current age
tK1+1
Future§
· · · tK1+K−1 T = tK1+K
Let K1 = scr. number in the past, K = scr. number in the future. Define
HK1 =



A man who had screening exams at his ages t0 < t1 < · · · < tK1−1,
no lung cancer has been diagnosed,
and he is asymptomatic at her current age tK1



.
Our Plan:
(a) Derive the probability of each case when K1 = 1, K = 1 when lifetime T is
fixed.
(b) Derive the probability of each case for any screening number K1 and K,
when lifetime T is fixed.
(c) Let T ∼ fT (t) and the future screening number K will be a random
variable!

Motivation and Goal
Application
Results and Summary
When K1 = K = 1:
0 t0
History ¤
t1
↑
Current age
Future§
T = t
Assume an asymptomatic man at current age t1, underwent one exam at
t0(< t1), and his life time T = t(> t1). Deﬁne event
H1 =
A man who had a screening exam at age t0, no lung cancer has been found,
and he is asymptomatic at current age t1
.
Then
P(H1|T ≥ t1) = 1 −
t1
0
w(x)dx + (1 − β0)
t0
0
w(x)Q(t1 − x)dx
+
t1
t0
w(x)Q(t1 − x)dx.

Motivation and Goal
Application
Results and Summary
Symptom-free-life and No-early-detection:K1 = K = 1
Given his lifetime T = t > t1,
P(Case 1: SympF, H1|T = t)
= 1 −
t
0
w(x)dx + (1 − β1)(1 − β0)
t0
0
w(x)Q(t − x)dx
+ (1 − β1)
t1
t0
w(x)Q(t − x)dx +
t
t1
w(x)Q(t − x)dx.
P(Case 2: NoED, H1|T = t)
= (1 − β1)(1 − β0)
t0
0
w(x)[Q(t1 − x) − Q(t − x)]dx
+ (1 − β1)
t1
t0
w(x)[Q(t1 − x) − Q(t − x)]dx
+
t
t1
w(x)[1 − Q(t − x)]dx.

Motivation and Goal
Application
Results and Summary
True-early-detection and Over-diagnosis:K1 = K = 1
P(Case 3: TrueED, H1|T = t)
= β1(1 − β0)
t0
0
w(x)[Q(t1 − x) − Q(t − x)]dx
+ β1
t1
t0
w(x)[Q(t1 − x) − Q(t − x)]dx
P(Case 4: OverD, H1|T = t)
= β1(1 − β0)
t0
0
w(x)Q(t − x)dx + β1
t1
t0
w(x)Q(t − x)dx.
If human lifetime T ∼ fT (t), then
P(Case i, H1|T ≥ t1) =
∞
t1
P(Case i, H1|T = t)fT (t|T ≥ t1)dt, i = 1, 2, 3, 4.
where the conditional pdf fT (t|T > t1) = fT (t)/P(T > t1), if t > t1.

Motivation and Goal
Application
Results and Summary
For Any Fixed Scr. K1 and Future Scr. K
Deﬁne
HK1
=



A man who had screening exams at ages t0 < t1 < · · · < tK1−1,
no lung cancer has been diagnosed,
and he is asymptomatic at his current age tK1



.
Then
P(HK1
|T ≥ tK1
) = 1 −
tK1
0
w(x)dx
+
K1−1
j=0
(1 − βj ) · · · (1 − βK1−1)
tj
tj−1
w(x)Q(tK1
− x)dx
+
tK1
tK1−1
w(x)Q(tK1 − x)dx.

Motivation and Goal
Application
Results and Summary
Symptom-free-life and No-early-detection: K1 & K
P(Case 1, HK1 |T = tK1+K ) = 1 −
tK1+K
0
w(x)dx
+
K1+K−1
j=0
(1 − βj ) · · · (1 − βK1+K−1)
tj
tj−1
w(x)Q(tK1+K − x)dx.
+
tK1+K
tK1+K−1
w(x)Q(tK1+K − x)dx
P(Case 2, HK1 |T = tK1+K ) = IK1+K,K1+1 + IK1+K,K1+2 + · · · + IK1+K,K1+K ,
IK1+K,j =
j−1
i=0
(1 − βi ) · · · (1 − βj−1)
ti
ti−1
w(x)[Q(tj−1 − x) − Q(tj − x)]dx
+
tj
tj−1
w(x)[1 − Q(tj − x)]dx, for ∀j = K1 + 1, · · · , K1 + K.

Motivation and Goal
Application
Results and Summary
True-early-detection and Over-Diagnosis: K1 & K scr.
P(Case 3, HK1
|T = tK1+K ) =
K1+K−1
j=K1
βj



j−1
i=0
(1 − βi ) · · · (1 − βj−1)
×
ti
ti−1
w(x)[Q(tj − x) − Q(tK1+K − x)]dx +
tj
tj−1
w(x)[Q(tj − x) − Q(tK1+K − x)]dx
P(Case 4, HK1
|T = tK1+K ) =
K1+K−1
j=K1
βj



j−1
i=0
(1 − βi ) · · · (1 − βj−1)
×
ti
ti−1
w(x)Q(tK1+K − x)dx +
tj
tj−1
w(x)Q(tK1+K − x)dx

Motivation and Goal
Application
Results and Summary
When Lifetime T is a Random Variable
We can prove that for any scr. number K1 ≥ 1 and K ≥ 1:
4
i=1
P(Case i, HK1
|T = tK1+K ) = P(HK1
|T ≥ tK1
) (1)
For an asymptomatic man at his current age tK1 , his life time is not a
ﬁxed value, but a random variable. If his future screening schedule is
tK1 < tK1+1 < . . . , then the number of screening exams is also a r.v.
K = K(T) = n, if tK1+n−1 < T < tK1+n.
P(Case i, HK1
|T ≥ tK1
) =
∞
tK1
P(Case i, HK1
|K = K(T), T = t)fT (t|T ≥ tK1
)dt,
And
4
i=1
P(Case i|HK1
, T ≥ tK1
) = 1.

Motivation and Goal
Application
Results and Summary
Application to the NLST-CT Data
Sensitivity: β(t) = Prob(Screen + |Sp, age = t),
β(t) =
1
1 + exp(−b0 − b1 ∗ (t − m))
.
Transition density S0 → Sp: 0.3 * lognormal pdf
w(t|µ, σ2
) =
0.3
√
2πσt
exp −(log t − µ)2
/(2σ2
) , σ > 0.
Sojourn time: Log logistic distribution
q(t) =
κtκ−1ρκ
[1 + (tρ)κ]2
, κ > 0, ρ > 0.
The distribution of the life span fT (t) was derived from the
period life table, Social Security Administration.
http://www.ssa.gov/OACT/STATS/table4c6.html

Motivation and Goal
Application
Results and Summary
Figure 1: Conditional PDF of lifetime for both gender in
the US

x
(ft.F60+ft.M60)/2
60 70 80 90 100 110 120
0.00.04
Conditional lifetime PDF for both genders with t_current=60
x
(ft.F70+ft.M70)/2
70 80 90 100 110 120
0.00.04
x
(ft.F80+ft.M80)/2
80 90 100 110 120
0.00.04


Motivation and Goal
Application
Results and Summary
Application to the NLST - CT Data
Let CT = NLST CT data, and θ = (b0, b1, µ, σ2, κ, ρ). The
posterior predictive probability for each outcome is
P(Case i|T > tK1 , HK1 , CT)
= P(Case i, θ|T > tK1 , HK1 , CT)dθ
= P(Case i|T > tK1 , HK1 , θ)f (θ|CT)dθ
≈
1
n
n
j=1
P(Case i|T > tK1 , HK1 , θ∗
j )
θ∗
j : 800 Posterior samples from MCMC.

Motivation and Goal
Application
Results and Summary
Table 4. Projection of lung cancer scr. outcomes using NLST-CT data
(∆1, ∆2)a
Pb
(SympF) P(NoED) P(TrueED) P(OverD)
Initial screen age t0 = 50, current age tK1 = 60
(1 yr, 1 yr) 80.26(0.57) 1.86(0.28) 17.19(0.60) 0.66(0.09)
(2 yr, 1 yr) 80.05(0.56) 1.86(0.27) 17.40(0.60) 0.66(0.09)
(1 yr, 2 yr) 80.50(0.57) 6.31(0.63) 12.74(0.55) 0.42(0.08)
(2 yr, 2 yr) 80.29(0.57) 6.30(0.63) 12.96(0.56) 0.43(0.08)
(1 yr, 1 yr) 86.13(0.39) 1.30(0.24) 11.90(0.42) 0.67(0.09)
(2 yr, 1 yr) 85.64(0.42) 1.31(0.24) 12.38(0.44) 0.67(0.09)
(1 yr, 2 yr) 86.38(0.39) 4.33(0.44) 8.86(0.45) 0.42(0.08)
(2 yr, 2 yr) 85.88(0.41) 4.33(0.44) 9.36(0.50) 0.43(0.08)
(1 yr, 1 yr) 94.20(0.26) 0.55(0.16) 4.73(0.23) 0.52(0.08)
(2 yr, 1 yr) 93.75(0.30) 0.57(0.17) 5.16(0.27) 0.53(0.09)
(1 yr, 2 yr) 94.38(0.25) 1.75(0.21) 3.54(0.26) 0.34(0.07)
(2 yr, 2 yr) 93.92(0.28) 1.76(0.23) 3.97(0.31) 0.36(0.07)
a
∆1, ∆2 are scr. interval in history and in future correspondingly.
b
The mean probability (with standard error) are in percentage.

Motivation and Goal
Application
Results and Summary
Table 5. Estimated probability of over-diagnosis in screen-detected cases
(with 95% C.I.)
(∆1, ∆2) P(TrueED|De
) P(OverD|D)
(1 yr, 1 yr) 96.31 (95.10, 97.12) 3.69 (2.88, 4.90)
(2 yr, 1 yr) 96.35 (95.17, 97.15) 3.65 (2.85, 4.83)
(1 yr, 2 yr) 96.78 (95.62, 97.51) 3.22 (2.49, 4.38)
(2 yr, 2 yr) 96.83 (95.70, 97.54) 3.17 (2.46, 4.30)
(1 yr, 1 yr) 94.68 (93.01, 95.79) 5.32 (4.21, 6.99)
(2 yr, 1 yr) 94.85 (93.26, 95.90) 5.15 (4.10, 6.74)
(1 yr, 2 yr) 95.45 (93.87, 96.46) 4.55 (3.54, 6.13)
(2 yr, 2 yr) 95.62 (94.12, 96.58) 4.38 (3.42, 5.88)
(1 yr, 1 yr) 90.21 (87.32, 92.16) 9.79 (7.84, 12.68)
(2 yr, 1 yr) 90.71 (87.98, 92.51) 9.29 (7.49, 12.02)
(1 yr, 2 yr) 91.29 (88.55, 93.06) 8.71 (6.94, 11.45)
(2 yr, 2 yr) 91.82 (89.26, 93.44) 8.18 (6.56, 10.74)
e
Event D = {cancer was diagnosed at regular scheduled exam}

Motivation and Goal
Application
Results and Summary
Summary of Table 4
The probability of symptom-free-life is stable within each age
group; it increases as current age increases, about 80-95% for all
age groups.
The percentage of Over-diagnosis in the whole population (not
“False Positive”) is small, about 0.34-0.67%. It decreases slightly as
age advances and as screen interval increases.
The probability of true-early-detection decreases as future
screening interval ∆2 increases and as current age increases.
The probability of no-early-detection increases as future ∆2
increases; but it decreases as current age increases.

Motivation and Goal
Application
Results and Summary
Summary of Table 5
Among the screen-detected cases, the probability of
over-diagnosis is increasing as current age increases (from
3% to 9%). It is about 8∼9% for the 80-years-old, with 95%
CI (6%, 12%).
Among the screen-detected cases, the probability of
true-early-detection is decreasing as current age increases,
from 96% to 90%.
This study provides a systematic approach to assess the
outcomes of regular screening for old age with a screening
history.

Motivation and Goal
Application
Results and Summary
Acknowledgements and References
I want to thank Beth Levitt, Tom Riley and Jerome Mabie of
IMS for helping organize the NLST-CT data, and Ruiqi Liu of UL
for providing MCMC posterior samples for this study.
Thank you!
Wu D, Kafadar K, and Rai S. (2015). Inference of future screening
outcomes for older age group with a screening history. Draft.
Wu D and Rosner GL(2014). Inference of long term eﬀects and
over-diagnosis in periodic cancer screening. Statistica Sinica. 24,
815-831.
Wu D, Kafadar K, Rosner GL and Broemeling LD(2012). The lead
time distribution when lifetime is subject to competing risks in
cancer screening. The International Journal of Biostatistics, 8:1,
Article 6, April 2012.

JSM2015

Recommended

Recommended

More Related Content

Similar to JSM2015

Similar to JSM2015 (20)

JSM2015