Presented at the 41st International Engineering in Medicine and Biology Conference (EMBC 2019), Berlin, Germany, July 23–27, 2019.
Author version of the full paper available at https://www.researchgate.net/publication/334647238_Clustering_Cardiovascular_Risk_Trajectories_of_Patients_with_Type_2_Diabetes_Using_Process_Mining
Clustering Cardiovascular Risk Trajectories of Patients with Type 2 Diabetes Using Process Mining
1. Clustering Cardiovascular Risk Trajectories of Patients with Type 2 Diabetes Using Process Mining
Antonio Martinez-Millana
anmarmil@itaca.upv.es
@anmarmil
Pebesma J., Martinez-Millana A., Sacchi L., Fernandez-Llatas C., De Cata P., Chiovato L., Bellazzi R. and Traver V.
University of Twente, the Netherlands
Universitat Politècnica de València, Spain
University of Pavia, Italy
ICS Maugeri, Pavia, Italy
2. Outline
1. Introduction and objective
2. Materials and methods
• Cardiovascular risk
• Database
• Process Mining
• Design of the experiments
3. Results
4. Discussion
EMBC 2019 2
4. INTRODUCTION
Cardiovascular diseases (CVDs) are the leading cause of morbidity and mortality among people with Type 2 Diabetes Mellitus (T2DM)
Multi-factorial intervention reduced CVD events and mortality in T2DM
The risk of the different CVD states in T2DM patients differs substantially by sex
Shah et al, The Lancet D&End, 2015
Marso et al, NEJM, 2016
Juutilainen et al. Diabetes Care 2004
There are different interpretations of the sex equality in the burden of CVD
Mosca et al. Circulation 2004
5. INTRODUCTION
scikit-learn.org
sas.com
Models are ‘black boxes’ for professionals
Models have an inherent error
Data quality is key
Models are trained with data that are not always compatible
Models are not always applicable in the same way
Data Mining plethora
6. INTRODUCTION
Process mining
A discipline of syntactic data mining that supports the understanding of complex processes in a comprehensive, objective and exploratory way
7. OBJECTIVE
To study the clinical pathways of CVDs as an indicator of risk progression in T2DM patients and to discover differences due to the sex of the subjects
8. MATERIALS AND METHODS
Database
• 1020 subjects with T2DM, 930 of them with CVR estimations
• Clinical follow-up: December 1996 – February 2015 (18 years)
Estimation of Cardiovascular Risk (CVR)
• CVR was estimated using the algorithm proposed and validated in Progetto Cuore
• The algorithm assesses the likelihood of experiencing a first major cardiovascular event
Giampaoli et al 2015
12. MATERIALS AND METHODS
Experiments
Combination of process discovery and conformance.
Group similar CVR pathways to observe understandable behaviours.
Use quality threshold (QT) clustering.
QT requires a distance threshold within each cluster, defined as the similarity index.
Heyer, Kruglyak, & Yooseph 1999
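The QT clustering step can be illustrated with a minimal Python sketch. It is not the authors' implementation: the normalised edit distance between traces, the `max_diameter` and `min_size` parameters, and the toy traces are all assumptions made for illustration, and the similarity index used in the study may be defined differently.

```python
def edit_distance(a, b):
    """Levenshtein distance between two sequences of risk classes."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, start=1):
        curr = [i]
        for j, y in enumerate(b, start=1):
            curr.append(min(prev[j] + 1,               # deletion
                            curr[j - 1] + 1,           # insertion
                            prev[j - 1] + (x != y)))   # substitution
        prev = curr
    return prev[-1]


def distance(a, b):
    """Normalised edit distance in [0, 1]; 0 means identical traces."""
    longest = max(len(a), len(b))
    return edit_distance(a, b) / longest if longest else 0.0


def qt_clustering(traces, max_diameter, min_size=2):
    """Return (clusters, outliers) as lists of indices into `traces`."""
    n = len(traces)
    dist = [[distance(traces[i], traces[j]) for j in range(n)] for i in range(n)]
    remaining = set(range(n))
    clusters = []
    while remaining:
        best = []
        for seed in sorted(remaining):
            candidate = [seed]
            pool = remaining - {seed}
            while pool:
                # Add the trace that keeps the cluster diameter smallest.
                nxt = min(sorted(pool),
                          key=lambda p: max(dist[p][c] for c in candidate))
                if max(dist[nxt][c] for c in candidate) > max_diameter:
                    break
                candidate.append(nxt)
                pool.remove(nxt)
            if len(candidate) > len(best):
                best = candidate
        if len(best) < min_size:
            break  # whatever is left becomes outliers
        clusters.append(sorted(best))
        remaining -= set(best)
    return clusters, sorted(remaining)


# Hypothetical CVR traces (sequences of risk classes), not study data.
traces = [["II", "III"], ["II", "III"], ["II", "III", "II"],
          ["IV", "V"], ["IV", "V", "IV"], ["I"]]
clusters, outliers = qt_clustering(traces, max_diameter=0.4)
strict_clusters, strict_outliers = qt_clustering(traces, max_diameter=0.0)
```

On these toy traces, a threshold of 0.4 gives two clusters and one outlier, while a threshold of 0.0 keeps only the two identical traces together and leaves the other four as outliers. No number of clusters has to be fixed in advance; only the threshold is chosen.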
14. RESULTS
Influence of similarity index
• 1-2 clusters with a lot of outliers and a very simple topology (for example CVR II - III)
• 1-2 clusters with few outliers, but a very heterogeneous topology (for example II - III - IV - V - IV)
16. DISCUSSION
Three different trajectories in the course of cardiovascular risk in patients with type 2 diabetes: high risk, medium risk and low risk.
The similarity index was chosen as a compromise between the number of clusters and the number of outliers.
Representing the time spent at each risk level allowed us to discover significant differences in the sex distribution
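One way to represent the time spent at each risk level is to collapse a patient's dated CVR estimates into episodes with a duration. The sketch below is only an illustration under assumptions: the `(date, risk_class)` visit format, the `to_episodes` helper and the example visits are hypothetical and do not come from the study.

```python
from datetime import date
from itertools import groupby


def to_episodes(visits):
    """Collapse dated CVR estimates into (risk_class, days_in_class) episodes."""
    visits = sorted(visits, key=lambda v: v[0])  # chronological order
    # Consecutive visits in the same risk class form one run.
    runs = [list(group) for _, group in groupby(visits, key=lambda v: v[1])]
    episodes = []
    for k, run in enumerate(runs):
        start = run[0][0]
        # An episode lasts until the first visit of the next run,
        # or until its own last visit when it is the final run.
        end = runs[k + 1][0][0] if k + 1 < len(runs) else run[-1][0]
        episodes.append((run[0][1], (end - start).days))
    return episodes


# Hypothetical follow-up of one patient, not study data.
visits = [(date(2010, 1, 1), "II"), (date(2010, 7, 1), "II"),
          (date(2011, 1, 1), "III"), (date(2012, 1, 1), "II")]
episodes = to_episodes(visits)
```

Here the patient moves from class II to III and back to II, so the episode list keeps both the order of the risk classes and how long each one lasted.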
17. CONCLUSION
PM described the clinical course of CVR effectively and provided clusters grouping the variety of trajectories.
Two of the three clusters showed an unbalanced gender distribution.
One of the clusters (medium risk) showed a balanced distribution.
18. Thank you
Editor's Notes
Type 2 diabetes is defined as a metabolic impairment due to insufficient insulin delivery or decreased insulin action.
Insulin impairment leads to diverse complications at the micro- and macrovascular levels.
, since no a priori number of clusters is known.
The application of Process Mining technologies allowed us to identify three
Quality threshold clustering was chosen because we had no a priori knowledge about the trajectories of possible groups.
The representation of time is crucial in health care, as it can capture back-and-forth clinical situations (response to treatments, effect of follow-up, etc.).
Although two clusters show an unbalanced gender distribution