8. Examinations of Data Structures
Data structure                                        Recurrence   Competing risks   Semi-competing risks
Typical univariate survival data                      ×            ×                 ×
Competing risks data without recurrences              ×            ✓                 ×
Semi-competing risks survival data                    ×            ×                 ✓
Recurrent events data with multiple competing risks   ✓            ✓                 ×
Data structure form of our two motivating examples
10. Typical Univariate Survival Data
• Observed variables:
$X = D \wedge C$
$\delta = I(D \le C)$
[Diagram: initial state → death (time $D$) or censored (time $C$)]
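The mapping from latent times to observed variables can be sketched in a few lines of Python. This is only an illustration: the function name `observe_univariate`, the exponential death rate, and the uniform censoring bound are assumptions, not taken from the slides.

```python
import random

def observe_univariate(d, c):
    """Return (X, delta) with X = min(D, C) and delta = I(D <= C)."""
    x = min(d, c)
    delta = 1 if d <= c else 0
    return x, delta

# Hypothetical latent times: D ~ exponential(rate 0.5), C ~ unif(0, 5).
rng = random.Random(0)
sample = [observe_univariate(rng.expovariate(0.5), rng.uniform(0.0, 5.0))
          for _ in range(5)]
```

Each element of `sample` is one subject's pair (X, δ); δ = 0 marks a censored observation.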
11. Competing Risks Data without Recurrences
• Observed variables:
$\tilde{Y} = Y \wedge C \wedge D$
$\tilde{\Delta} = I(Y \le C)\,\Delta$
[Diagram: initial state → type-1 event or type-2 event (time $Y$), death (time $D$), or censored (time $C$)]
12. Semi-Competing Risks Survival Data
• Observed variables:
$Y \wedge D \wedge C$
$I(Y \le D \wedge C)$
$D \wedge C$
$I(D \le C)$
• Semi-competing risks: $D$ is a competing risk for $Y$, but not vice versa
[Diagram: states "initial state", "rejection" (time $Y$), and "death" (time $D$)]
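The asymmetry between $Y$ and $D$ shows up directly in the four observed variables: death can hide the non-terminal event, while the non-terminal event never hides death. A minimal Python sketch, with the function name `observe_semicompeting` chosen here for illustration:

```python
def observe_semicompeting(y, d, c):
    """Map latent (Y, D, C) to the four observed variables."""
    t1 = min(y, d, c)                      # Y ∧ D ∧ C
    delta1 = 1 if y <= min(d, c) else 0    # I(Y <= D ∧ C)
    t2 = min(d, c)                         # D ∧ C
    delta2 = 1 if d <= c else 0            # I(D <= C)
    return t1, delta1, t2, delta2
```

For example, with latent times (Y, D, C) = (2.5, 2.0, 3.0), death occurs first, so the non-terminal event is unobserved while death is still observed.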
15. Competing Risks
- Classical Definition
• Every human is continuously exposed to many risks of death
• Cancer
• Heart diseases
• Pneumonia
• …
• Because death is not a repetitive event and is usually attributed to a
single cause, these risks compete with one another for the life of a
person.
• What if we are interested in a particular cause of death?
• Identifiability issue: Tsiatis (1975)
• Crude approach
• Net approach
16. Competing Risks
- Crude Approach
• Assumption:
• Recognize the presence of all competing risks
• $Y = Y_1 \wedge \cdots \wedge Y_K$
• $\Delta$ = cause of the event
• Quantity of interest:
• Cumulative incidence function (abbr. CIF)
• CIF: $F_k(t) = \Pr(Y \le t, \Delta = k)$
• Note that $\lim_{t \to \infty} F_k(t) < 1$
• $F_k(t)$ is an improper distribution function
• Remark: This approach is constructed on the real world
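Why the limit stays below 1 can be seen by summing over causes. A short derivation, assuming $Y$ is finite with probability 1 and at least two causes have positive probability:

```latex
\sum_{k=1}^{K} F_k(t) = \Pr(Y \le t),
\qquad
\lim_{t \to \infty} F_k(t) = \Pr(\Delta = k),
\qquad
\sum_{k=1}^{K} \Pr(\Delta = k) = 1 .
```

Each $\Pr(\Delta = k)$ is then strictly less than 1, so no single $F_k$ reaches 1.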
17. Competing Risks
- Net Approach
• Assumptions:
• Other competing risks can be removed
• $Y = Y_1 \wedge \cdots \wedge Y_K$
• $\Delta$ = cause of the event
• Quantity of interest:
• Survival function: $S_k(t) = \Pr(Y_k > t)$
• Note that $\lim_{t \to \infty} S_k(t) = 0$
• $S_k(t)$ is a proper survival function
• Remark: This approach is constructed on a hypothetical world
18. Semi-Competing Risks Data
- Revisited
[Figure legend: what we observed; what exists but is censored]
• The implicit assumption is that the censoring events and the terminal event can both be removed.
• Although we cannot observe $Y$ when $Y > D$, we still assume a hypothetical world in which such $Y$ exists.
20. Modeling Associations
• There are three possible types of association
• Between gap-times
• Between competing risks
• Between recurrences and death
• Association models
• Frailty (random effect) modeling
• Copula modeling
• Frailty model and Archimedean Copula model
• Equivalence under some conditions in theory
• Data generation algorithms are different
21. Modeling Associations
- Frailty and Copula Approaches
• There are three possible types of association
• Between gap-times
• Between competing risks
• Between recurrences and death
• $W$: frailty (unobservable)
• How $W$ affects $T_j$: usually through a Cox PH model
• Conditional independence given $W$, i.e. $T_j \perp T_{j'} \mid W$
• The frailty distribution must be specified: usually a gamma distribution
• For the frailty that controls the association between competing risks:
• We use the copula package in R
• The built-in function rCopula() (named rcopula() in older releases)
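The link between the frailty and copula views can be illustrated by drawing Clayton pairs through a shared gamma frailty (the Marshall-Olkin construction). The sketch below is a Python stand-in, not the R copula package's API. It assumes the standard Clayton parametrization $C(u_1, u_2) = (u_1^{-\theta} + u_2^{-\theta} - 1)^{-1/\theta}$ with $\theta > 0$, which may differ from the parametrization used in the slides. The names `rclayton` and `kendall_tau` are hypothetical.

```python
import random

def rclayton(n, theta, rng):
    """Draw n pairs from a Clayton copula (standard parameter theta > 0,
    an assumption) via a shared gamma frailty."""
    pairs = []
    for _ in range(n):
        h = rng.gammavariate(1.0 / theta, 1.0)   # latent frailty, shape 1/theta
        u1 = (1.0 + rng.expovariate(1.0) / h) ** (-1.0 / theta)
        u2 = (1.0 + rng.expovariate(1.0) / h) ** (-1.0 / theta)
        pairs.append((u1, u2))
    return pairs

def kendall_tau(pairs):
    """Plain O(n^2) sample Kendall's tau, ignoring ties."""
    n = len(pairs)
    score = 0
    for i in range(n):
        for j in range(i + 1, n):
            prod = (pairs[i][0] - pairs[j][0]) * (pairs[i][1] - pairs[j][1])
            if prod > 0:
                score += 1
            elif prod < 0:
                score -= 1
    return score / (n * (n - 1) / 2)
```

Under this parametrization the population Kendall's tau is $\theta / (\theta + 2)$, so `kendall_tau(rclayton(600, 2.0, random.Random(1)))` should land near 0.5.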
22. Frailty Modeling
- before specifying frailty distribution
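• Assume the frailty variable $W$ has a PH effect on the hazard of $T_j$
• $\lambda_j(t \mid w) = w \lambda_{0j}(t) \exp(Z^T \beta)$, which we simplify to $\lambda_j(t \mid w) = w \lambda_{0j}(t)$
• $\Lambda_j(t \mid w) = \int_0^t \lambda_j(u \mid w)\,du = w \Lambda_{0j}(t)$
• $S_j(t \mid w) = \exp\{-\Lambda_j(t \mid w)\} = [\exp\{-\Lambda_{0j}(t)\}]^w = B_j(t)^w$,
where $B_j(t)$ is a continuous baseline survivor function of $T_j$
• Derive the unconditional marginal survivor function of $T_j$
• $S_j(t) = \int_w S_j(t \mid w)\,dF_W(w) = \int_w B_j(t)^w\,dF_W(w) = \int_w e^{w \log B_j(t)}\,dF_W(w) = p(-\log B_j(t))$,
where $p(u) = E[e^{-uW}]$ is the Laplace transform of $W$
[Diagram: $S_j(t \mid w)$ and $S_j(t)$ are linked through $p(u)$ and $p^{-1}(u)$]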
23. Frailty Modeling
- after specifying frailty distribution
• Specify the form of $F_W(w)$ as $W \sim \mathrm{gamma}(\alpha, \beta)$, with shape $\alpha$ and scale $\beta$
• Since its Laplace transform has a closed form
• Derive the Laplace transform as $p(u) = E[e^{-uW}] = (1 + \beta u)^{-\alpha}$
• $W$: frailty (unobservable)
• To ensure the identifiability of the model: $E[W] = 1$
• To allow within-cluster dependence: $\mathrm{Var}(W) = \theta_1 \ge 0$
• To satisfy the above constraints, assume
• $W \sim \mathrm{gamma}(\alpha = 1/\theta_1, \beta = \theta_1)$
• so the Laplace transform is
• $p(u) = E[e^{-uW}] = (1 + \beta u)^{-\alpha} = (1 + \theta_1 u)^{-1/\theta_1}$
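The closed-form Laplace transform can be checked numerically. The following Python sketch compares $(1 + \theta_1 u)^{-1/\theta_1}$ with a Monte Carlo average of $e^{-uW}$; the function names and the values of `theta1`, `u`, and the sample size are illustrative assumptions.

```python
import math
import random

def gamma_laplace(u, theta1):
    """Closed form p(u) = (1 + theta1*u)^(-1/theta1) for
    W ~ gamma(shape = 1/theta1, scale = theta1)."""
    return (1.0 + theta1 * u) ** (-1.0 / theta1)

def mc_laplace(u, theta1, n, rng):
    """Monte Carlo estimate of E[exp(-u W)] from n frailty draws."""
    total = 0.0
    for _ in range(n):
        w = rng.gammavariate(1.0 / theta1, theta1)  # E[W] = 1, Var(W) = theta1
        total += math.exp(-u * w)
    return total / n

closed = gamma_laplace(1.0, 0.5)                       # (1.5)^(-2)
approx = mc_laplace(1.0, 0.5, 20000, random.Random(0))
```

Since $-\log B_j(t) = \Lambda_{0j}(t)$, evaluating `gamma_laplace` at $u = \Lambda_{0j}(t)$ gives the marginal survivor function $S_j(t) = (1 + \theta_1 \Lambda_{0j}(t))^{-1/\theta_1}$.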
25. Sketch of Simulation Study
• Hypothetical world – Net approach
• No terminal event or with terminal event
• Steps:
• Generate multivariate failure times
• Take minimum as observed variables
• Real world – Crude approach
• No terminal event or with terminal event
• Steps:
• Generate data from improper distribution
• Ref: Cheng & Fine (2012)
• We propose to use different frailty variables to account for
different associations.
• Use generated data to evaluate the performance of CIFs
[Diagram: simulation settings, hypothetical world vs. real world, each with no terminal event ($C$) or with a terminal event ($C$, $D$)]
26. Case 1. Data Generation Algorithms
• Under the hypothetical world assumption
• Remove the influence of a terminal event by taking $D = \infty$
• $T_{j(1)}$ and $T_{j(2)}$ compete:
• $T_j = \min\{T_{j(1)}, T_{j(2)}\}$
• $\Delta_j = 1$ if $T_j = T_{j(1)}$; $\Delta_j = 2$ if $T_j = T_{j(2)}$
[Diagram: the competing pair $T_{j(1)}$, $T_{j(2)}$ and the sequence $(T_1, \Delta_1), (T_2, \Delta_2), (T_3, \Delta_3), \cdots$, annotated with frailty $H$ and frailty $W$]
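27. Case 1. Data Generation Algorithms (cont’d)
• Step 1:
For each subject, generate $W \sim \mathrm{gamma}(1/\theta_1, \theta_1)$ and $C \sim \mathrm{unif}(0, K)$
• Step 2:
For each stage $j$, generate $(U_{j(1)}, U_{j(2)}) \sim \mathrm{Clayton}(\theta_0)$, then use $U_{j(k)}$ to generate $T_{j(k)}$ as follows:
$\lambda_j(t \mid w) = w \lambda_{0j}(t) = w \xi_j$
$S_j(t \mid w) = \exp\{-\Lambda_j(t \mid w)\} = [\exp(-\xi_j t)]^w$
$U_{j(k)} = [\exp(-\xi_j T_{j(k)})]^w$
$T_{j(k)} = -\frac{1}{w \xi_j} \log U_{j(k)}$
So, define $T_j = \min\{T_{j(1)}, T_{j(2)}\}$, with $\Delta_j = 1$ if $T_j = T_{j(1)}$ and $\Delta_j = 2$ if $T_j = T_{j(2)}$.
Then, set $Y_j = \sum_{m=1}^{j} T_m$.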
28. Case 1. Data Generation Algorithms (cont’d)
• Step 3:
If $Y_j \le C$, set $\tilde{Y}_j = Y_j$ and $\tilde{T}_j = T_j$.
Repeat Step 2 for $j = 1, \cdots, (M - 1)$, where $M$ satisfies $Y_{M-1} \le C < Y_M$.
Finally, we set $\tilde{Y}_M = C$, $\tilde{T}_M = C - Y_{M-1}$, and $\tilde{\Delta}_M = 0$.
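Steps 1 to 3 for a single subject can be sketched in Python as follows. Several choices here are assumptions made for illustration: a constant baseline hazard `xi` shared by all stages (the slides allow a stage-specific $\xi_j$), the standard Clayton parametrization with `theta0 > 0` sampled through a gamma frailty, and the function names themselves.

```python
import math
import random

def clayton_pair(theta0, rng):
    """One draw (U1, U2) from a Clayton copula, standard parameter theta0 > 0 (assumed)."""
    h = rng.gammavariate(1.0 / theta0, 1.0)
    u1 = (1.0 + rng.expovariate(1.0) / h) ** (-1.0 / theta0)
    u2 = (1.0 + rng.expovariate(1.0) / h) ** (-1.0 / theta0)
    return u1, u2

def generate_subject_case1(theta0, theta1, xi, k_max, rng):
    """Return a list of (Y_tilde, T_tilde, Delta_tilde) records for one subject."""
    w = rng.gammavariate(1.0 / theta1, theta1)   # Step 1: frailty W
    c = rng.uniform(0.0, k_max)                  # Step 1: censoring time C ~ unif(0, K)
    records = []
    y_prev = 0.0
    while True:
        u1, u2 = clayton_pair(theta0, rng)       # Step 2: dependent uniforms
        t1 = -math.log(u1) / (w * xi)            # T_j(1)
        t2 = -math.log(u2) / (w * xi)            # T_j(2)
        t = min(t1, t2)
        delta = 1 if t1 <= t2 else 2
        y = y_prev + t                           # Y_j = T_1 + ... + T_j
        if y <= c:                               # Step 3: event observed before C
            records.append((y, t, delta))
            y_prev = y
        else:                                    # stage M: censored at C
            records.append((c, c - y_prev, 0))
            return records
```

The last record always has $\tilde{\Delta}_M = 0$, and the observed gap times sum to $C$.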
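29. Case 2. Data Generation Algorithms
• Under the hypothetical world assumption
• Consider the influence of a terminal event by taking $D < \infty$
• $T_{j(1)}$ and $T_{j(2)}$ compete:
• $T_j = \min\{T_{j(1)}, T_{j(2)}\}$
• $\Delta_j = 1$ if $T_j = T_{j(1)}$; $\Delta_j = 2$ if $T_j = T_{j(2)}$
[Diagram: the competing pair $T_{j(1)}$, $T_{j(2)}$, the sequence $(T_1, \Delta_1), (T_2, \Delta_2), (T_3, \Delta_3), \cdots$, and the terminal event $D$, annotated with frailty $W$, frailty $H$, and frailty $V$]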
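30. Case 2. Data Generation Algorithms (cont’d)
• Step 1:
For each subject, generate $W \sim \mathrm{gamma}(1/\theta_1, \theta_1)$, $V \sim \mathrm{gamma}(1/\theta_2, \theta_2)$, and $C \sim \mathrm{unif}(0, K)$
• Step 2:
For each subject, generate $U_D \sim \mathrm{unif}(0, 1)$, then use $U_D$ to generate $D$ by
$\lambda_D(t \mid v) = v \lambda_{0D}(t) = v \eta$
$S_D(t \mid v) = \exp\{-\Lambda_D(t \mid v)\} = [\exp(-\eta t)]^v$
$U_D = [\exp(-\eta D)]^v$
$D = -\frac{1}{v \eta} \log U_D$
Moreover, for each stage $j$, generate $(U_{j(1)}, U_{j(2)}) \sim \mathrm{Clayton}(\theta_0)$, then use $U_{j(k)}$ to generate $T_{j(k)}$ as follows:
$\lambda_j(t \mid v, w) = v w \lambda_{0j}(t) = v w \xi_j$
$S_j(t \mid v, w) = \exp\{-\Lambda_j(t \mid v, w)\} = [\exp(-\xi_j t)]^{vw}$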
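The two inversions in this step can be sketched in Python. The sketch covers only what the slide shows: drawing $V$ and $D$, and inverting the conditional survivor function for a gap time. The inversion $T_{j(k)} = -\log U_{j(k)} / (v w \xi_j)$ follows by the same argument as in Case 1. How $D$ then truncates the recurrence process is not specified here and is left out. Function names and parameter values are assumptions.

```python
import math
import random

def draw_terminal_time(theta2, eta, rng):
    """Return (V, D) with V ~ gamma(1/theta2, theta2) and D = -log(U_D) / (V * eta)."""
    v = rng.gammavariate(1.0 / theta2, theta2)   # frailty V, E[V] = 1
    u_d = 1.0 - rng.random()                     # uniform on (0, 1], avoids log(0)
    d = -math.log(u_d) / (v * eta)
    return v, d

def gap_time(u, v, w, xi):
    """Solve U = [exp(-xi * T)]^(v * w) for T."""
    return -math.log(u) / (v * w * xi)

rng = random.Random(7)
v, d = draw_terminal_time(0.5, 1.0, rng)
```

Both frailties multiply the baseline hazard, so a larger $v$ shortens the terminal time and every gap time for the same subject.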
40. Simulation Study
Association (parameter)                Case (i)     Case (ii)     Case (iii)    Case (iv)
Between competing risks ($\theta_0$)   indep. (1)   dep. (1.25)   indep. (1)    dep. (1.25)
Between gap-times ($\theta_1$)         indep. (1)   indep. (1)    dep. (1.25)   dep. (1.25)
• Ref: Ph.D. thesis of Bowen Li (2016)
• IPCW (inverse probability of censoring weighting) to adjust for bias
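One possible reading of the dependence values in the table, offered as an assumption rather than something the slides state: if $\theta_0$ and $\theta_1$ follow the cross-ratio parametrization of Oakes (1989), where $\theta = 1$ means independence, then Kendall's tau for the Clayton model is

```latex
\tau = \frac{\theta - 1}{\theta + 1},
\qquad
\theta = 1 \;\Rightarrow\; \tau = 0,
\qquad
\theta = 1.25 \;\Rightarrow\; \tau = \frac{0.25}{2.25} = \frac{1}{9} \approx 0.11 .
```

Under that reading, the cross ratio relates to the variance of a mean-one gamma frailty by $\theta = 1 + \mathrm{Var}(W)$, so the "dep." cases would correspond to a frailty variance of 0.25.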
45. Conclusions
• Two real-world examples
• Analyze recurrent events data with competing risks
• Literature development
• Review survival data structure
• Deal with competing risks
• Account for the identifiability issue
• “Net” vs. “Crude” approach
• Handle association by frailty approach
• Between gap-times
• Between competing risks
• Between recurrence processes and death
• Simulation
• Propose data generation algorithms
• Conduct simulation analysis to examine the validity
46. References (1)
• Chen, C. M., Chuang, Y. W., & Shen, P. S. (2015). Two‐stage
estimation for multivariate recurrent event data with a dependent
terminal event. Biometrical Journal, 57(2), 215-233.
• Chang, W. H., & Wang, W. (2009). Regression analysis for
cumulative incidence probability under competing risks. Statistica
Sinica, 391-408.
• Cheng, Y., & Fine, J. P. (2012). Cumulative incidence association
models for bivariate competing risks data. Journal of the Royal
Statistical Society: Series B (Statistical Methodology), 74(2), 183-202.
• Goethals, K., Janssen, P., & Duchateau, L. (2008). Frailty models
and copulas: similarities and differences. Journal of Applied
Statistics, 35(9), 1071-1079.
47. References (2)
• Oakes, D. (1982). A model for association in bivariate survival
data. Journal of the Royal Statistical Society. Series B
(Methodological), 414-422.
• Oakes, D. (1989). Bivariate survival models induced by
frailties. Journal of the American Statistical Association, 84(406),
487-493.
• Tsiatis, A. (1975). A nonidentifiability aspect of the problem of
competing risks. Proceedings of the National Academy of
Sciences, 72(1), 20-22.
• Wang, W., & Wells, M. T. (2000). Estimation of Kendall's tau under
censoring. Statistica Sinica, 1199-1215.