Multimorbidity Multistate Model

Research aim
Develop a multi-state model to predict multimorbidity of
cardiovascular disease (CVD), type 2 diabetes (T2D), and
chronic kidney disease (CKD)
By: Manali Ajay Jain
MSc Health Data Science (2022-23)

Many thanks to Dr Glen Martin for guiding me throughout the research

Key elements to understand
01 Prediction models: inform people about the prognosis of their condition
02 Multistate models: a stochastic process that can occupy one of several discrete states at any given time
03 Multimorbidity: the co-existence of two or more chronic illnesses, where one is not necessarily more significant than the others
04 Risk prediction of multimorbidity: a newly emerging area of research

! The missing pieces !
• Multimorbidity studies were mainly performed on groups of individuals
• The validation sample sizes for risk predictions of multimorbidity tend to be smaller

An attempt to fill the gap
• This multistate model analysed every single patient
• The sample size for validation was relatively larger
• The validation approach was to generate stacked transition probability graphs

Data source
Clinical Practice Research Datalink (CPRD)
*Advisory note:
• Access to the CPRD dataset was gained only after completing the CPRD resource module training test
• The UoM also provided a Data Protection and Cyber Security course

Foundation of the dataset
• Data was obtained from the Clinical Practice Research Datalink (CPRD)
• The full dataset contained 2,714,535 observations with 91 variables
• Only the healthy population and the variables required to start the analysis were retained
• This reduced the data to 2,001,735 observations and 74 variables, of which 30 were comorbidities and 8 were demographic variables
• The data was then split in a 70:30 ratio for training and testing respectively
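
A random 70:30 split of this kind could be sketched as follows. This is a minimal illustration in Python, not the R code used in the project; the function name `split_ids`, the seed, and the toy patient ids are all assumptions made for the example.

```python
import random

def split_ids(ids, train_fraction=0.7, seed=2023):
    """Randomly split ids into a training and a testing list (no stratification)."""
    rng = random.Random(seed)      # fixed seed (an assumption) keeps the split reproducible
    shuffled = list(ids)
    rng.shuffle(shuffled)          # purely random ordering, no other criteria
    cut = round(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]

# Hypothetical patient ids 1..1000, standing in for the real cohort
train_ids, test_ids = split_ids(range(1, 1001))
print(len(train_ids), len(test_ids))
```

Splitting on patient ids rather than on rows keeps all records of one patient on the same side of the split.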

Data at a glance
Demographic variables
• Age, BMI, cholesterol ratio, and SBP were continuous variables
• Gender: female, male
• Ethnicity: White, Mixed ethnic group, Asian/Asian British, Black/African/Caribbean/Black British, other ethnic group
• Smoking: never, ex, current
• Index of Multiple Deprivation: 5 levels, from most to least deprived

Pathways diagram
[Figure: state diagram with the states Healthy, 2. CVD, 3. T2D, 4. CKD, 5. CVD+T2D, 6. CVD+CKD, 7. CKD+T2D, 8. CVD+CKD+T2D, and Death]

Pathways
[Figure: the pathways state diagram shown alongside a matrix indicating the 20 possible transitions]
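
One way to arrive at 20 transitions is sketched below in Python. It assumes that a patient can only gain one disease at a time, can die from any state, and never recovers. These rules are an assumption that happens to reproduce the count of 20; the exact matrix on the slide is not recoverable from the transcript, and the helper names are hypothetical.

```python
from itertools import combinations

DISEASES = ("CVD", "T2D", "CKD")

def build_states():
    """All disease combinations as frozensets; the empty set is Healthy."""
    states = [frozenset()]
    for k in (1, 2, 3):
        states.extend(frozenset(c) for c in combinations(DISEASES, k))
    return states

def label(state):
    """Readable state name, e.g. 'CVD+T2D'; the empty set becomes 'Healthy'."""
    return "+".join(d for d in DISEASES if d in state) or "Healthy"

def build_transitions(states):
    """Assumed rules: gain exactly one disease, or die, from every living state."""
    transitions = []
    for src in states:
        for dst in states:
            if len(dst) == len(src) + 1 and src < dst:   # src is a proper subset of dst
                transitions.append((label(src), label(dst)))
        transitions.append((label(src), "Death"))
    return transitions

transitions = build_transitions(build_states())
for number, (src, dst) in enumerate(transitions, start=1):
    print(number, src, "->", dst)
```

Numbering the transitions row by row in this way gives each one an integer label that the long-format data can refer to.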

Technical requirements
XQuartz 2.8.5
R 4.1.3-gcc830
RStudio 1.3.1073
R libraries
• dplyr (Hadley Wickham et al. 2014)
• ggplot (Hadley Wickham et al. 2007)
• mstate (de Wreede et al. 2011)
• calibmsm (Alexander Pate et al. unreleased)

Before we proceed
• No standardization was needed
• No normalization was needed
• Missing values were handled by single stochastic imputation
• The data was split randomly, without any additional criteria
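
The slides do not say which imputation model was used. One common form of single stochastic imputation is stochastic regression imputation, where each missing value is replaced by a regression prediction plus random noise. The Python sketch below illustrates that idea under this assumption; the function name, the choice of age as predictor, and the toy values are all hypothetical.

```python
import math
import random

def stochastic_impute(x, y, seed=1):
    """Fill missing y values (None) with a linear prediction from x plus Gaussian noise."""
    rng = random.Random(seed)
    observed = [(xi, yi) for xi, yi in zip(x, y) if yi is not None]
    n = len(observed)
    mean_x = sum(xi for xi, _ in observed) / n
    mean_y = sum(yi for _, yi in observed) / n
    sxx = sum((xi - mean_x) ** 2 for xi, _ in observed)
    slope = sum((xi - mean_x) * (yi - mean_y) for xi, yi in observed) / sxx
    intercept = mean_y - slope * mean_x
    # Residual standard deviation of the fitted line (n - 2 degrees of freedom)
    sse = sum((yi - intercept - slope * xi) ** 2 for xi, yi in observed)
    resid_sd = math.sqrt(sse / (n - 2))
    return [
        yi if yi is not None else intercept + slope * xi + rng.gauss(0, resid_sd)
        for xi, yi in zip(x, y)
    ]

# Made-up example values, not taken from CPRD
age = [34, 45, 51, 58, 63, 70]
bmi = [23.1, 25.4, None, 27.9, 28.3, None]
completed = stochastic_impute(age, bmi)
print(completed)
```

Adding the noise term keeps the spread of the imputed variable closer to that of the observed values than plain regression imputation would.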

Let’s begin!
• The training dataset was used for model construction

• Population distribution was studied across transitions

• The dataset was converted into an object of class msdata (long format) with the columns:
  • id
  • from
  • to
  • trans
  • Tstart
  • Tstop
  • time
  • status
  • covariates (the 8 demographic variables)
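
The long format can be illustrated with a small Python sketch: each state a patient visits contributes one row per transition that was possible from that state, and `status` marks the transition actually made. The project did this step in R with mstate; the function `to_long`, the reduced transition table, and the example patient are assumptions made for brevity.

```python
# Hypothetical subset of the transition structure:
# {from_state: [(transition_number, to_state), ...]}
TRANSITIONS = {
    "Healthy": [(1, "CVD"), (2, "Death")],
    "CVD": [(3, "Death")],
}

def to_long(patient_id, path, censor_time):
    """path: list of (state, entry_time) in visiting order, starting at time 0."""
    rows = []
    for i, (state, t_start) in enumerate(path):
        if state not in TRANSITIONS:          # absorbing state, nothing to leave
            break
        if i + 1 < len(path):
            next_state, t_stop = path[i + 1]  # the transition that was observed
        else:
            next_state, t_stop = None, censor_time
        for trans, to_state in TRANSITIONS[state]:
            rows.append({
                "id": patient_id, "from": state, "to": to_state, "trans": trans,
                "Tstart": t_start, "Tstop": t_stop, "time": t_stop - t_start,
                "status": int(to_state == next_state),
            })
    return rows

# One made-up patient: CVD at day 400, censored at day 1000
rows = to_long(1, [("Healthy", 0), ("CVD", 400)], censor_time=1000)
for row in rows:
    print(row)
```

The patient appears in three rows: two for the competing transitions out of Healthy and one for the transition out of CVD.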

• Covariates were expanded: dummy variables were generated for each covariate based on the number of transitions
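
The idea behind covariate expansion can be sketched in Python as follows: each covariate gets one column per transition, holding the covariate value on rows of that transition and 0 elsewhere. The function name and the column naming (`age.1`, `age.2`) are assumptions for this illustration, and only 2 transitions are used instead of 20 to keep the output short.

```python
def expand_covariate(rows, name, n_trans):
    """Add columns name.1 ... name.K: the covariate value on its own transition, 0 elsewhere."""
    expanded = []
    for row in rows:
        new_row = dict(row)                     # leave the input rows untouched
        for k in range(1, n_trans + 1):
            new_row[f"{name}.{k}"] = row[name] if row["trans"] == k else 0
        expanded.append(new_row)
    return expanded

# Made-up long-format rows for one patient
long_rows = [
    {"id": 1, "trans": 1, "age": 52},
    {"id": 1, "trans": 2, "age": 52},
]
expanded = expand_covariate(long_rows, "age", n_trans=2)
print(expanded)
```

With transition-specific columns, a single model fit can estimate a separate covariate effect for every transition.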

• Time was converted from days to years
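
A minimal Python sketch of the unit conversion is shown below. The divisor 365.25 is an assumption; the slides do not state which day count was used, and the function name is hypothetical.

```python
DAYS_PER_YEAR = 365.25   # assumed conversion factor

def days_to_years(row):
    """Return a copy of a long-format row with its time columns expressed in years."""
    converted = dict(row)
    for column in ("Tstart", "Tstop", "time"):
        converted[column] = row[column] / DAYS_PER_YEAR
    return converted

example = days_to_years({"id": 1, "Tstart": 0, "Tstop": 730.5, "time": 730.5})
print(example)
```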

• A Cox model was fitted with separate baseline hazards for each transition and no covariates

• Transition hazard estimates and their associated covariances were calculated for each stage
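
When a Cox model has no covariates and a separate baseline per transition, the estimated cumulative baseline hazard for each transition reduces to the Nelson-Aalen estimator. The Python sketch below computes it from long-format rows, together with a simple variance estimate. It is an illustration with made-up data, not the mstate code used in the project, and the function name is hypothetical.

```python
def nelson_aalen(rows, trans):
    """Cumulative hazard of one transition from rows with Tstart, Tstop, status."""
    data = [r for r in rows if r["trans"] == trans]
    event_times = sorted({r["Tstop"] for r in data if r["status"] == 1})
    cum_haz, variance, estimates = 0.0, 0.0, []
    for t in event_times:
        # At risk at t: entered the state before t and not yet left or censored
        at_risk = sum(1 for r in data if r["Tstart"] < t <= r["Tstop"])
        events = sum(1 for r in data if r["status"] == 1 and r["Tstop"] == t)
        cum_haz += events / at_risk
        variance += events / at_risk ** 2
        estimates.append((t, cum_haz, variance))
    return estimates

# Made-up rows for a single transition (times in years)
toy_rows = [
    {"trans": 1, "Tstart": 0.0, "Tstop": 1.0, "status": 1},
    {"trans": 1, "Tstart": 0.0, "Tstop": 2.0, "status": 0},
    {"trans": 1, "Tstart": 0.0, "Tstop": 3.0, "status": 1},
    {"trans": 1, "Tstart": 0.0, "Tstop": 4.0, "status": 0},
]
estimates = nelson_aalen(toy_rows, trans=1)
print(estimates)
```

The `Tstart < t` condition handles delayed entry, which matters for transitions out of states that patients only reach some time after the start of follow-up.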

Healthy stage
• Healthy -> CKD: highest
• Healthy -> T2D: lowest

CVD stage
• CVD -> Death: highest
• CVD -> CVD+T2D: lowest

T2D stage
• T2D -> Death: highest
• T2D -> CVD+T2D: lowest

CKD stage
• CKD -> Death: highest
• CKD -> CKD+T2D: lowest

CVD+T2D stage
• CVD+T2D -> Death: upper line
• CVD+T2D -> CVD+T2D+CKD: lower line

CVD+CKD stage
• CVD+CKD -> Death: upper line
• CVD+CKD -> CVD+T2D+CKD: lower line

CKD+T2D stage
• CKD+T2D -> Death: upper line
• CKD+T2D -> CVD+T2D+CKD: lower line

CVD+T2D+CKD stage
• This plot was created from a simple survfit object, since the stage has just one transition

Diving deeper
Transition probability estimates were also produced

These findings demonstrate how a patient's prognosis is influenced both by their initial
state and by the time point from which the prediction is made

Stacked transition probabilities
• The distance between two adjacent curves represents the probability of being in the corresponding state
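
How a stacked plot encodes state probabilities can be shown with a short Python sketch: the curve heights at one time point are cumulative sums of the state probabilities, so the gap between adjacent curves recovers each probability. The probabilities below are made-up values for three states, not results from the model.

```python
from itertools import accumulate

def stack(probabilities):
    """Curve heights at one time point: running totals of the state probabilities."""
    return list(accumulate(probabilities))

# Made-up state probabilities at a single time point (they sum to 1)
state_probs = [0.5, 0.25, 0.25]
heights = stack(state_probs)

# The distance between adjacent curves gives back the probability of each state
gaps = [upper - lower for lower, upper in zip([0.0] + heights[:-1], heights)]
print(heights, gaps)
```

The top curve always sits at 1, because a patient must be in exactly one state at any time.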

A reduced rank (RR) model was used to create a lower-dimensional representation of the
regression coefficients of the whole model

The RR model generates three outputs:
• Alpha
• Gamma
• Beta
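
In a reduced rank model the full coefficient matrix Beta is factorised into a covariate part (Alpha) and a transition part (Gamma). The Python sketch below shows the rank-1 case, where each Beta entry is the product of one Alpha value and one Gamma value. The rank, the covariate names, and all numbers are assumptions made for illustration, not the project's estimates.

```python
def reduced_rank_beta(alpha, gamma):
    """Rank-1 case: beta[covariate][transition] = alpha[covariate] * gamma[transition]."""
    return {
        cov: {trans: a * g for trans, g in gamma.items()}
        for cov, a in alpha.items()
    }

# Made-up values: one risk-score weight per covariate, one loading per transition
alpha = {"covariate_a": -0.5, "covariate_b": 0.25}
gamma = {"Healthy -> Death": -2.0, "CVD -> Death": -1.0}

beta = reduced_rank_beta(alpha, gamma)
for cov, effects in beta.items():
    print(cov, effects)
```

With a negative Gamma for a transition into death, a negative Alpha yields a positive Beta, which is why lower Alpha values go together with higher death rates.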

Alpha output of RR model
• Lower values for: mixed ethnic group, ex-smokers, more deprived

Gamma output of RR model
• Most of the values are negative
• The Gamma coefficients for this risk score are negative and of substantial size for all transitions into death, and likewise for transitions starting from healthy

Beta output of RR model
• Combines the analysis of Alpha and Gamma
• Lower values of Alpha (for instance mixed ethnicity, ex-smoker, more deprived) correspond to higher death rates

It’s testing time!
• The fitted model was then applied to the test dataset
• Two follow-up durations, 5 years and 10 years, were used to evaluate the transition probabilities
• These could then be compared with the stacked transition probabilities plot from the training dataset

Testing time: after 5 years
• The distance between two adjacent curves represents the probability of being in the corresponding state

Testing time: after 10 years
• The distance between two adjacent curves represents the probability of being in the corresponding state
• Overall, transitions resulting in death become less frequent over the years

Was it worth it?
01 Calibration plots could not be produced, but future prediction probabilities were obtained
02 The work resulted in a meaningful analysis that can potentially benefit the health sector
