Data mining Methods for the Stratification of the Arrhythmic Risk in Young and Master Athletes

A study to evaluate the risk of Sudden Arrhythmic Death for young and master athletes.

Published in: Health & Medicine
  1. 1. R. PIZZI (1), S. SICCARDI (1), C. PEDRINAZZI (2), O. DURIN (2) and G. INAMA (2). (1) Computer Science Department, University of Milan (Italy); (2) Department of Cardiology, Hospital of Crema (Italy). Data Mining Methods for the Stratification of the Arrhythmic Risk in Young and Master Athletes
  2. 2. 8th International Conference on APPLIED MATHEMATICS, SIMULATION, MODELLING. Florence, 2014 THE CLINICAL PROBLEM • Sudden death in young athletes is still an open and socially relevant issue. • It hits more than 1000 young athletes (<35) every year in Italy. • Most deaths are due to hidden heart diseases. • Cardiomyopathy is responsible for up to 30% of fatal cases.
  3. 3. THE CLINICAL PROBLEM Physical activity causes • structural remodelling of the ventricles • alteration of the heart loading conditions • predisposition to possibly fatal arrhythmias. Endurance sports (running, bicycling, etc.) may cause • increase of heart rate and stroke volume • reduced vascular resistance • slight increase in blood pressure
  4. 4. THE CLINICAL PROBLEM Power sports (weightlifting, rowing, etc.) • Increase of cardiac output • Increase of heart rate • Increase of vascular resistance and blood pressure • Increase of pressure load • Volume load may lead to dilatation of the left ventricle and increased wall thickness
  5. 5. THE CLINICAL PROBLEM •Few studies on cardiovascular adaptation to sport activity in master athletes •Clinical significance for cardiac rehabilitation after myocardial infarction and surgical procedures •Exercise has a positive effect: – Reduces cardiovascular events – A clinical protocol should include risk evaluation.
  6. 6. THE DATA • Data collected from four groups: – A (18 subjects): athletes <40, 5 females, 13 males – B (19 subjects): athletes >40, 6 females, 13 males – C (8 subjects): non-athletes <40, 3 females, 13 males – D (7 subjects): non-athletes >40, 2 females, 5 males
  7. 7. THE DATA ECG signals from treadmill exercise stress test • HRV (heart rate variability) • RR intervals analysis • Elimination of artifacts. pNNx analysis • pNNx is the percentage of successive RR intervals that differ by more than x ms • pNN20 and pNN50 are the most significant. • pNN20 gives the best discrimination among subjects.
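The pNNx measure on this slide can be sketched in a few lines of Python. This is a minimal illustration, not the authors' code: it assumes the usual HRV definition of pNNx (the percentage of successive RR-interval differences whose absolute value exceeds x ms), and the function name `pnnx` and the RR values are hypothetical.

```python
def pnnx(rr_ms, x):
    """Percentage of successive RR-interval differences larger than x ms.

    rr_ms: RR intervals in milliseconds, artifacts already removed.
    x: threshold in milliseconds (20 for pNN20, 50 for pNN50).
    """
    # Absolute difference between each RR interval and the next one.
    diffs = [abs(b - a) for a, b in zip(rr_ms, rr_ms[1:])]
    if not diffs:
        raise ValueError("need at least two RR intervals")
    return 100.0 * sum(d > x for d in diffs) / len(diffs)


# Hypothetical RR series (ms), for illustration only.
rr_example = [800, 810, 845, 850, 790, 795]
pnn20 = pnnx(rr_example, 20)
pnn50 = pnnx(rr_example, 50)
```

In this toy series the successive differences are 10, 35, 5, 60 and 5 ms, so two of five exceed 20 ms and one of five exceeds 50 ms.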
  8. 8. THE DATA Multiscale Entropy Analysis Sample Entropy assesses the complexity of a time series. • It estimates the conditional probability that two sequences of m data points that are similar (distance <r) remain similar when one more point is included. Multiscale Entropy analysis (MSE) • is the Sample Entropy of a time series at multiple scales, i.e. computed after averaging non-overlapping groups of x points. • MSEx is the MSE with scale factor x • Both pNNx and MSEx have been used satisfactorily in many studies on arrhythmias.
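The Sample Entropy and MSE definitions on this slide can be illustrated with a short Python sketch. It is an assumption-laden illustration rather than the authors' implementation: the tolerance `r` is treated as an absolute value (in practice it is often set as a fraction of the series' standard deviation), the distance is the maximum absolute difference between corresponding points, and the function names are hypothetical.

```python
import math


def sample_entropy(series, m=2, r=0.2):
    """Sample Entropy: -ln(A / B), where B counts pairs of length-m templates
    closer than r and A counts the same pairs extended by one point."""
    n = len(series)

    def count_matches(length):
        # Use the same n - m starting points for both template lengths.
        templates = [series[i:i + length] for i in range(n - m)]
        count = 0
        for i in range(len(templates)):
            for j in range(i + 1, len(templates)):  # no self-matches
                dist = max(abs(a - b) for a, b in zip(templates[i], templates[j]))
                if dist < r:
                    count += 1
        return count

    b = count_matches(m)
    a = count_matches(m + 1)
    if a == 0 or b == 0:
        return float("inf")  # undefined: no matches found at this tolerance
    return -math.log(a / b)


def coarse_grain(series, scale):
    """Average non-overlapping groups of `scale` consecutive points."""
    return [sum(series[i:i + scale]) / scale
            for i in range(0, len(series) - scale + 1, scale)]


def multiscale_entropy(series, scales, m=2, r=0.2):
    """Sample Entropy of the coarse-grained series at each scale factor."""
    return {s: sample_entropy(coarse_grain(series, s), m, r) for s in scales}
```

With this sketch, MSE1, MSE5 and MSE20 would correspond to `multiscale_entropy(rr, [1, 5, 20])`, provided the RR series is long enough to leave a usable number of points at scale 20.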
  9. 9. THE DATA Clustering Hierarchical clustering with Ward's method • Agglomerative hierarchical clustering • The pair of clusters to merge is chosen by minimizing the Error Sum of Squares (ESS) • i cluster • k variable • j observation (case)
  10. 10. THE DATA • Starting from n clusters of size 1, every possible merge into n-1 clusters is considered, the ESS is calculated, and the pair with the smallest ESS forms the first cluster. • Then n-2 clusters are formed, either with two clusters of size 2 or with one cluster of size 3; the ESS is calculated again, and so on. • The algorithm stops when one single cluster of size n is formed.
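The merging procedure described on these two slides can be sketched as a naive Python implementation. This is an illustration under stated assumptions, not the software used in the study: the error sum of squares of a cluster is taken as the sum of squared deviations of its observations from the cluster centroid over all variables, the function names are hypothetical, and the loop stops at a requested number of clusters instead of building the full tree.

```python
def ess(cluster):
    """Error sum of squares of one cluster: squared deviations from its centroid."""
    dims = len(cluster[0])
    centroid = [sum(p[k] for p in cluster) / len(cluster) for k in range(dims)]
    return sum((p[k] - centroid[k]) ** 2 for p in cluster for k in range(dims))


def ward_clustering(points, n_clusters):
    """Agglomerative clustering: repeatedly merge the pair of clusters whose
    merge gives the smallest increase in the total error sum of squares."""
    clusters = [[tuple(p)] for p in points]  # start with n clusters of size 1
    while len(clusters) > max(n_clusters, 1):
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                # Minimizing the total ESS after the merge is equivalent to
                # minimizing the increase caused by merging clusters i and j.
                increase = (ess(clusters[i] + clusters[j])
                            - ess(clusters[i]) - ess(clusters[j]))
                if best is None or increase < best[0]:
                    best = (increase, i, j)
        _, i, j = best
        merged = clusters[i] + clusters[j]
        clusters = [c for idx, c in enumerate(clusters) if idx not in (i, j)]
        clusters.append(merged)
    return clusters


# Hypothetical two-variable observations, for illustration only.
toy_points = [(0, 0), (0, 1), (10, 10), (10, 11)]
toy_clusters = ward_clustering(toy_points, 2)
```

This version recomputes every candidate merge at each step, which is fine for a few dozen subjects; library implementations use update formulas that avoid the recomputation.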
  11. 11. THE DATA
  12. 12. THE DATA Artificial Neural Network • ANNs are effective nonlinear classifiers • Our model: ITSOM • A custom SOM-like (Self-Organizing Map) network that evaluates chaotic attractors within the sequence of winning neurons.
  13. 13. THE ITSOM ARTIFICIAL NEURAL NETWORK
  14. 14. RESULTS Variables: – Athlete yes/no – Age – Gender – MSE1 – MSE5 – MSE20 – pNN20
  15. 15. RESULTS The best clustering highlighted 6 classes: 1. Non-athletes <40, low MSE1 2. Non-athletes >40, highest MSE1, high pNN20 – 1 subject in this group is a non-athlete <40 but with high MSE1. 3. Athletes >40, low MSE1, low pNN20 – 1 subject in this group is an athlete <40 with average MSE1. 4. Athletes with high MSE1 and high pNN20, both >40 and <40 5. Athletes <40 with low MSE1 and low pNN20 6. Athletes <40, very young (mean age 19), dispersed values of MSE1 and pNN20.
  16. 16. RESULTS The ANN results gave further indications. • The ANN clearly distinguishes athletes from non-athletes (separated attractors) • Again sex is not discriminant (there are no attractors separated by sex) • Age is discriminant for athletes but not for non-athletes.
  17. 17. RESULTS There are correspondences between attractors and clusters: • One attractor collects 5 of 7 subjects of cluster 5 (see figure) • One attractor collects the subjects of cluster 2 • One attractor collects the subjects of cluster 1.
  18. 18. FIRST CONCLUSIONS • The two nonlinear procedures (clustering and ANN) are mutually congruent in discriminating subjects • Sensitive variables exist: physical activity, age, MSE1 • Sex is not a discriminating variable • Very young athletes have dispersed values of MSE1. • MSE1 seems to decrease with maturity in athletes, but rises in non-athletes >40: physical activity seems to preserve a low MSE1.
  19. 19. FIRST CONCLUSIONS From the ANN classification: • lack of age stratification within the group of non-athletes • athletes have cardiovascular characteristics differentiated by age.
  20. 20. THE NEW VARIABLES • We wanted to assess whether other variables could identify, within the clusters, groups of subjects with common cardiovascular characteristics. • All the subjects underwent many other blood and clinical tests:
  21. 21. THE NEW VARIABLES • TDD (telediastolic diameter) • IVS (interventricular septum) • PW (posterior wall) thickness • ETT (exercise tolerance test) • EF (ejection fraction) • maximum O2 consumption (VO2peak) • % VO2 • O2 consumption at the anaerobic threshold (VO2AT) • VE/VCO2 slope (indicator of ventilatory response to exercise) • peak RER (respiratory exchange ratio) • maximum workload • HB (hemoglobin) • HCT (hematocrit) • creatinine • cholesterol (total, HDL, LDL) • triglycerides • blood glucose • BNP (brain natriuretic peptide) • BMI (body mass index) • DTS (Duke treadmill score) • mean Holter HR (cardiac frequency) • min-max-mean FC • FC
  22. 22. DATA ANALYSIS • To evaluate the differences in variable values among clusters we used a t-test. • The following table shows many statistically significant differences. • Significances with p<0.001 are indicated with **
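A pairwise comparison of one variable between two clusters can be sketched as follows. The slides do not say which variant of the t-test was used, so this example assumes Welch's unequal-variance version; the function name `welch_t` and the sample values are hypothetical. Only the t statistic and the approximate degrees of freedom are computed here, since turning them into a p-value requires the Student t distribution from a statistics table or library.

```python
import math
from statistics import mean, variance


def welch_t(sample_a, sample_b):
    """Welch's t statistic and Welch-Satterthwaite degrees of freedom
    for two independent samples (assumed variant, see the text above)."""
    na, nb = len(sample_a), len(sample_b)
    # Squared standard error contributed by each sample.
    va = variance(sample_a) / na
    vb = variance(sample_b) / nb
    t = (mean(sample_a) - mean(sample_b)) / math.sqrt(va + vb)
    df = (va + vb) ** 2 / (va ** 2 / (na - 1) + vb ** 2 / (nb - 1))
    return t, df


# Hypothetical values of one variable in two clusters, for illustration only.
cluster_a = [1, 2, 3, 4, 5]
cluster_b = [2, 3, 4, 5, 6]
t_stat, dof = welch_t(cluster_a, cluster_b)
```

Because every pair of clusters is tested on many variables, strict thresholds such as the p<0.001 level marked in the table help limit chance findings.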
  23. 23. DATA ANALYSIS
  24. 24. DATA ANALYSIS • None of the variables is significantly different between the two clusters 1 and 2 (non-athletes) • VO2peak, workload, FCmean, FCmin differ significantly between the non-athlete clusters and various clusters of athletes • TDD, FCmin, DTS differ significantly between cluster 6 (very young athletes) and many other clusters • FCmin, FCmean, VO2peak, %VO2, VO2AT differ significantly between cluster 2 (non-athletes >40) and cluster 3 (athletes >40).
  25. 25. THE NEW CLUSTERING We performed a new clustering using all the variables. The best result identifies 5 clusters: 1. 16 athletes, both < 40 and > 40, 15 males, 1 female 2. 12 athletes, both < 40 and > 40, 10 females, 1 male 3. 12 athletes, both < 40 and > 40, all males 4. 10 non-athletes, both < 40 and > 40, 9 males, 1 female 5. 11 non-athletes, both < 40 and > 40, both males and females.
  26. 26. THE NEW CLUSTERING • The clusters discriminate perfectly athletes from non-athletes. • Although sex was not among the variables considered, the clusters discriminate very well by sex. • The following figures show the dependence of clusters on groups of variables
  27. 27. THE NEW CLUSTERING • Workload: maximum for cluster 1 (male athletes), minimum for cluster 2 (female athletes). [Figure: age, BMI and workload by cluster (clusters 1 to 5), vertical axis from 0 to 1.2]
  28. 28. THE NEW CLUSTERING • Good correlation between: – IVS and PW in clusters 1, 2, 5 – IVS and TDD in clusters 3, 4 – PW and DTS in cluster 3
  29. 29. THE NEW CLUSTERING • Good correlation between HB, HCT in clusters 1,4,5.
  30. 30. THE NEW CLUSTERING • Good correlation between pNN20, MSE1, MSE5 and MSE20, especially in clusters 2 and 4. [Figure: pNN20, MSE1, MSE5 and MSE20 by cluster, vertical axis from 0 to 1.2]
  31. 31. t-TEST EVALUATION • We examined the significance of the differences in variable values among clusters. • The table reports the variables with p<0.01. • Significances with p<0.001 are marked with *.
  32. 32. t-TEST EVALUATION
  33. 33. OBSERVATIONS • Cluster 1 (male athletes <40) differs significantly from all the other clusters for respiratory variables (VO) and at least one FC variable. • Cluster 1 differs in workload from all clusters except cluster 3 (male athletes >40). • Differences between clusters of athletes and non-athletes frequently involve cholesterol or related variables.
  34. 34. IN SUMMARY Starting from the analysis of the ECG stress test signals, we could conclude that: • MSE1 is low in athletes both < and > 40 and in non-athletes <40, • MSE1 is higher in non-athletes > 40. • Thus sport seems to keep the MSE1 parameter low regardless of age. The application of a self-organizing ANN reveals in addition that the non-athlete subjects are not separated by age, but are clearly separated from athletes. • Sex is not a discriminating variable in either the clustering or the ANN classification.
  35. 35. IN SUMMARY The analysis of all the variables together has been able to discriminate: • depending on the VO variables – athletes >40 from non-athletes >40 – non-athletes <40 from athletes >40 – non-athletes >40 from athletes <40 • depending on the workload – non-athletes <40 from all athletes • depending on DTS – athletes <40 from athletes <<40 (very young) • depending on FC – non-athletes >40 from athletes <40. • No variables are significantly different between non-athletes <40 and >40.
  36. 36. IN SUMMARY On the basis of the following variables we could discriminate: • workload: between male athletes and female athletes • VO parameters, workload, FC parameters : between male athletes and non-athletes, both male and female • VO parameters: between male athletes and female athletes.
  37. 37. IN SUMMARY • Male athletes differ from all the other groups, in particular in the VO variables and workload. • Male athletes are distinguished from non-athletes by the FC parameters.
  38. 38. CONCLUSIONS • The study has allowed a stratification of the subjects on the basis of physical activity, age and sex • Significant differences in the cardiovascular status of these groups were shown through the variability of a set of cardiovascular parameters, in particular MSE1, pNN20, VO and FC variables.
  39. 39. CONCLUSIONS • The complex interaction between these variables required the use of nonlinear analysis techniques, namely clustering and ANN. • Taking this stratification into account, it will be possible, by following the subjects over time, to identify cardiovascular prognostic indicators that may help to prevent possibly fatal cardiac arrhythmias.
  40. 40. REFERENCES • O. Durin, C. Pedrinazzi, G. Donato, R. Pizzi, G. Inama, Usefulness of nonlinear analysis of ECG signals for prediction of inducibility of sustained ventricular tachycardia by programmed ventricular stimulation in patients with complex spontaneous ventricular arrhythmias, Annals of Noninvasive Electrocardiology, Vol. 13, No. 3, 2008, pp. 219-227. • R. Pizzi, G. Inama, O. Durin, C. Pedrinazzi, Non-invasive assessment of risk for severe tachyarrhythmias by means of non-linear analysis techniques, Chaos and Complexity Letters, Vol. 3, No. 3, 2007, pp. 229-250. • G. Inama, C. Pedrinazzi, O. Durin, M. Nanetti, R. Pizzi, Microvolt T-wave alternans for risk stratification in athletes with ventricular arrhythmias: correlation with programmed ventricular stimulation, Annals of Noninvasive Electrocardiology, Vol. 13, No. 1, 2008, pp. 14-21. • R. Pizzi, O. Durin, G. Inama, Non-invasive assessment of risk for severe tachyarrhythmias by means of non-linear analysis techniques, in: Developments in Chaos and Complexity Research, Nova Science NY, 2008. • G. Inama, C. Pedrinazzi, O. Durin, M. Nanetti, G. Donato, R. Pizzi, Ventricular arrhythmias in competitive athletes: risk stratification with T-wave alternans, Heart International, Vol. 3, No. 1, 2007, pp. 58-67. • R. Pizzi, S. Siccardi, C. Pedrinazzi, O. Durin, G. Inama, Cardiovascular Modifications and Stratification of the Arrhythmic Risk in Young and Master Athletes, Am. J. Biomed. Eng., Vol. 4, No. 3, 2014, pp. 60-67.
