Michele Tonutti - Scaling is caring - Codemotion Amsterdam 2019

A key challenge we face at Pacmed is quickly calibrating and deploying our tools for clinical decision support in different hospitals, where data formats may vary greatly. Using Intensive Care Units as a case study, I’ll delve into our scalable Python pipeline, which leverages Pandas’ split-apply-combine approach to perform complex feature engineering and automatic quality checks on large time-varying data, e.g. vital signs. I’ll show how we use the resulting flexible and interpretable dataframes to quickly (re)train our models to predict mortality, discharge, and medical complications.

Published in: Technology

  1. 1. Scaling is caring: Building scalable feature engineering pipelines for machine learning in healthcare • April 3, 2019 • Amsterdam 2019
  2. 2. Introductions • Michele Tonutti • Data Scientist at Pacmed • Intensive Care Team • Background in Biomedical Engineering and Robotics
  3. 3. Introductions • Developing machine-learning-driven decision support tools to make healthcare more personal, personalised and precise. • Patients only get care that has the highest probability of success for them. • Focus on oncology, emergency care, chronic diseases, and intensive care.
  4. 4. Pacmed focuses on four applications • Emergency care: What is the urgency level of a patient (how quickly should someone see a doctor)? • Intensive care: Predicting risk of ICU and post-ICU complications to support decision-making • Chronic diseases: What is the best treatment (combination) for patients with hypertension, diabetes and/or chronic kidney failure? • Oncology: What are the optimal treatments for the individual patient with colon, prostate or breast cancer?
  5. 5. Intensive care is most promising and furthest developed • Emergency care: What is the urgency level of a patient (how quickly should someone see a doctor)? • Intensive care: Predicting risk of ICU and post-ICU complications to support decision-making • Chronic diseases: What is the best treatment (combination) for patients with hypertension, diabetes and/or chronic kidney failure? • Oncology: What are the optimal treatments for the individual patient with colon, prostate or breast cancer?
  6. 6. The Intensive Care Unit (ICU)
  7. 7. Pacmed is currently working on four prediction problems in intensive care • Discharge decision: predicting the readmission and mortality risk of patients on discharge (timeline t-3 to Today, horizon t+7; labels: Vital signs, Readmission/mortality) • Extubation decision: predicting the risk of re-intubation of patients if they are extubated (timeline t-3 to Today, horizon t+2; labels: Respiratory parameters, Re-intubation) • Capacity management: predicting the number of full/available beds (timeline t-3 to Today, horizon t+1 and t+2; labels: Patient inflow & outflow, Bed capacity) • Predicting complications: e.g. predicting kidney function (timeline t-3 to Today, horizon t+1; labels: Creatinine, Kidney function)
  8. 8. Machine-learning based decision support software
  9. 9. Explainable prediction of eligibility for discharge from the ICU
  10. 10. Explainable prediction of eligibility for discharge from the ICU (table columns: Feature | Value | Interpretation of value) • SATURATION, max value of the admission | 98% | A max value of 98% is lower than 95% of all discharged patients • SERUM CREATININE, trend in last 24 hours | Increase of 20 ml, from 100 to 120 | The average patient had a stable serum creatinine during the last 24 hours. The increase of +20 is higher than 99% of discharged patients • ALAT, variation in values last 24 hours | Variation of 7 ml, between 5 and 12 | The average patient had a variation of ALAT of 2 in the last 24 hours. A variation of 7 is higher than 76% of all patients. • URINE OUTPUT, average last 24 hours | 240 ml | An average value of last 24 hours. The average discharged patient has a urine output of 250.
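The interpretations on this slide compare one patient's feature value with the distribution of the same feature in a reference population. A minimal sketch of that comparison follows; the function name `share_below` and the reference numbers are hypothetical and invented for illustration, not Pacmed data or code.

```python
import numpy as np


def share_below(reference_values, patient_value):
    """Return the percentage of reference patients with a strictly lower value."""
    reference = np.asarray(reference_values, dtype=float)
    return 100.0 * np.mean(reference < patient_value)


# Hypothetical 24-hour creatinine changes of previously discharged patients.
reference_trends = [-5, 0, 0, 2, 3, 5, 8, 10, 12, 25]
patient_trend = 20

pct = share_below(reference_trends, patient_trend)
print(f"An increase of +{patient_trend} is higher than {pct:.0f}% of the reference patients")
```

Expressing a feature as a position within a known population is one way to make an aggregate readable for a clinician without exposing model internals.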
  11. 11. A pipeline for ICUs that works for both development and production [Diagram labels: Hospital 1, Hospital 2, Hospital 3]
  12. 12. A pipeline for ICUs that works for both development and production [Diagram labels: Hospital 1, Hospital 2, Hospital 3; Development, Production]
  13. 13. A pipeline for ICUs that works for both development and production [Diagram labels: Hospital 1, Hospital 2, Hospital 3; Feature Engineering; Development, Production]
  14. 14. Feature engineering for medical data is an iterative process: Medical knowledge, Feature engineering, Modelling, Validation
  15. 15. Feature engineering for medical data is an iterative process: Medical knowledge, Feature engineering, Modelling, Validation
  16. 16. The issue of variety in medical data 1. High number of unique parameters 2. Differing feature structure for different problems 3. Different parameter distributions between populations 4. Variability of measurements over time
  17. 17. High number of parameters measured in the ICU (categories: Patient and admission characteristics, Clinical observations, Vital signs & device data, Lab values) • Patient and admission characteristics: Age, sex; Height and weight at admission; Department of origin; Length of stay; Number of prior admissions; Time in the hospital before admission; CPR code • Clinical observations: Peripheral temperature; CAM, DOS, RASS, NAS; GCS; Pupil size and reaction; Cough stimulant; Urine output; Number of bronchial toilets • Respiration: Respiratory rate; Mechanical ventilation; Tidal volume; Expiratory minute volume; Respiration mode; PEEP; Peak pressure; Supplemental O2; Fraction of inspired O2; Type of O2 administration; Peripheral O2 saturation • Circulation: Blood pressure (diastolic and systolic, arterial and non-invasive); Pulmonary artery pressure (diastolic and systolic); CVP; PCWP wedge; Heart rate; Cardiac output; Tidal volume (inspiratory and expiratory); Heart rhythm & ectopic; Shock index • Blood gas analysis: Base excess; O2 content in blood; Arterial O2 saturation; pH; Partial pressure (O2 & CO2); Actual bicarbonate • Haematology: Hb, Ht; White blood cell count; MCH, MCV; Erythrocytes; Thrombocytes; Lymphocytes; Leucocytes; Baso, eo and neutro; Reticulocytes; PT, APTT • Cardiac enzymes: CK-MB; Troponin-T • Chemistry: Sodium, Potassium; Chloride; Calcium, ion. Calcium; Magnesium; Phosphate; Creatinine; CK; EST and CRP; Blood glucose; Blood lactate; Amylase; Serum albumin; BUN_creatinine; NT-ProBNP • Liver tests: ALAT and ASAT; GGT, AF; LDH; Bilirubin • Urinalysis: Sodium, Potassium; Urea • Medication categories: Alimentary tract and metabolism; Antibiotics; Blood and blood-forming organs; Cardiovascular; Musculoskeletal system; Nervous system; General (tube feeding) • Other: CVVH; Lines and drains
  18. 18. Measurements can vary widely between hospitals [Figure: number of measurements and mean value of activated partial thromboplastin time (aPTT), Hospital 1 vs Hospital 2]
  19. 19. Parameters are measured at different time scales, with highly varying values and measurement frequencies
  20. 20. What do we need? • A feature engineering pipeline that:
 1. is scalable 2. can be used efficiently for both development and production 3. can be used for multiple outcome measures 4. produces features that are interpretable and useful for both machine learning models and doctors
  21. 21. Challenge: how to turn time series into information relevant for a model (and doctors)?
  22. 22. Challenge: how to turn time series into information relevant for a model (and doctors)? • Recurrent Neural Networks, e.g. (Phased) LSTMs • Frequency domain transforms, e.g. Fourier transform • Embedded representations, e.g. patient2vec
  23. 23. Challenge: how to turn time series into information relevant for a model (and doctors)? • Recurrent Neural Networks, e.g. (Phased) LSTMs • Frequency domain transforms, e.g. Fourier transform • Embedded representations, e.g. patient2vec • Scalable? • Reusable across models? • Interpretable?
  24. 24. Challenge: how to turn time series into information relevant for a model (and doctors)? • Recurrent Neural Networks, e.g. (Phased) LSTMs • Frequency domain transforms, e.g. Fourier transform • Embedded representations, e.g. patient2vec • Scalable? • Reusable across models? • Interpretable?
  25. 25. Extracting interpretable aggregated values from vital parameters [Figure: heart rate (bpm) over time, annotated with the aggregates first, last, minimum, maximum, average, standard deviation, slope, counts, {…}]
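The aggregates named on this slide can be computed from a single time-indexed signal in a few lines. The sketch below is an illustration only: the function name `aggregate_signal`, the feature labels and the heart-rate values are assumptions, and the slope is taken here as a least-squares linear trend per hour, which the slides do not specify.

```python
import numpy as np
import pandas as pd


def aggregate_signal(series):
    """Summarise one time-indexed signal with interpretable aggregates."""
    # Elapsed time in hours since the first measurement, used as x-axis for the slope.
    hours = (series.index - series.index[0]).total_seconds().to_numpy() / 3600.0
    values = series.to_numpy(dtype=float)
    # Assumption: "slope" is the least-squares linear trend (signal units per hour).
    slope = np.polyfit(hours, values, deg=1)[0]
    return pd.Series(
        {
            "first": values[0],
            "last": values[-1],
            "minimum": values.min(),
            "maximum": values.max(),
            "average": values.mean(),
            "standard_deviation": values.std(ddof=1),
            "slope": slope,
            "count": len(values),
        }
    )


# Hypothetical hourly heart-rate measurements (bpm).
heart_rate = pd.Series(
    [80.0, 84.0, 88.0, 92.0],
    index=pd.to_datetime(
        ["2019-04-03 08:00", "2019-04-03 09:00", "2019-04-03 10:00", "2019-04-03 11:00"]
    ),
    name="heart_rate",
)

aggregates = aggregate_signal(heart_rate)
print(aggregates)
```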
  26. 26. We use these aggregated features to capture short-term effects as well as longer-term trends [Figure: aggregation windows First 24h, First 48h, First 72h, {…}]
  27. 27. We use these aggregated features to capture short-term effects as well as longer-term trends [Figure: aggregation windows Whole stay, Day averages, First and last day, {…}]
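One way to picture the windows on these two slides is to compute the same aggregates over progressively longer periods counted from admission. The following sketch assumes a hypothetical single-patient table with an `hours_since_admission` column; the window boundaries, column names and feature names are illustrative choices, not the pipeline's actual schema.

```python
import pandas as pd

# Hypothetical heart-rate measurements for one admission.
measurements = pd.DataFrame(
    {
        "hours_since_admission": [1, 12, 30, 40, 60, 70],
        "heart_rate": [90.0, 86.0, 82.0, 80.0, 76.0, 72.0],
    }
)

# Window name -> upper bound in hours since admission (assumed boundaries).
windows = {"first_24h": 24, "first_48h": 48, "first_72h": 72}

features = {}
for name, limit in windows.items():
    # Keep only the measurements that fall inside the window.
    in_window = measurements.loc[
        measurements["hours_since_admission"] < limit, "heart_rate"
    ]
    features[f"heart_rate__mean__{name}"] = in_window.mean()
    features[f"heart_rate__min__{name}"] = in_window.min()

features = pd.Series(features)
print(features)
```

Short windows react to recent changes, while the longer ones smooth them out, so a model receives both views of the same signal.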
  28. 28. Multiple patients, multiple parameters, continuous time scale
  29. 29. Multiple patients, multiple parameters, continuous time scale
  30. 30. Split - apply - combine 1) Splitting the data into groups based on some criteria. 2) Applying a function to each group independently. 3) Combining the results into a data structure.
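The three steps map directly onto pandas' `groupby` followed by `agg`. A small sketch with invented data; the column names `patient_id`, `parameter` and `value` and the `parameter__aggregate` naming are assumptions made for this example.

```python
import pandas as pd

# Hypothetical long-format table: one row per measurement.
vitals = pd.DataFrame(
    {
        "patient_id": [1, 1, 1, 2, 2],
        "parameter": ["heart_rate", "heart_rate", "spo2", "heart_rate", "heart_rate"],
        "value": [80.0, 90.0, 97.0, 60.0, 70.0],
    }
)

# 1) Split by patient and parameter, 2) apply several aggregations,
# 3) combine the results into one DataFrame.
long_features = vitals.groupby(["patient_id", "parameter"])["value"].agg(
    ["min", "max", "mean"]
)

# Reshape to one row per patient and one column per (parameter, aggregate) pair.
wide_features = long_features.unstack("parameter")
wide_features.columns = [f"{param}__{agg}" for agg, param in wide_features.columns]
print(wide_features)
```

Patient 2 has no `spo2` measurements in this toy data, so the corresponding columns are NaN for that row, which keeps missing parameters explicit.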
  31. 31. Creating features grouped in custom time windows
  32. 32. Creating features grouped in custom time windows
  33. 33. Creating features grouped in custom time windows
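As an illustration of grouping by time window, the sketch below computes day averages per patient by combining a patient key with `pd.Grouper` on a timestamp column. The data and column names are hypothetical, and calendar days are used as windows only for simplicity; the slides do not show how the custom windows are defined.

```python
import pandas as pd

# Hypothetical timestamped heart-rate measurements for two patients.
vitals = pd.DataFrame(
    {
        "patient_id": [1, 1, 1, 2, 2],
        "timestamp": pd.to_datetime(
            [
                "2019-04-01 08:00",
                "2019-04-01 20:00",
                "2019-04-02 09:00",
                "2019-04-01 10:00",
                "2019-04-02 11:00",
            ]
        ),
        "heart_rate": [80.0, 90.0, 70.0, 60.0, 64.0],
    }
)

# Group by patient and by calendar day, then aggregate within each window.
daily = vitals.groupby(
    ["patient_id", pd.Grouper(key="timestamp", freq="D")]
)["heart_rate"].agg(["mean", "count"])
print(daily)
```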
  34. 34. Why not stick to Pandas then? • Interpretable, easy, reliable • Works very well with datetime formats • Most simple aggregations available
  35. 35. Why not stick to Pandas then? • Interpretable, easy, reliable • Works very well with datetime formats • Most simple aggregations available • No out-of-the-box parallelisation • Everything in memory • Custom aggregations can be extremely computationally heavy
  36. 36. Heavy computational load for custom functions
  37. 37. Dask: scalable Pandas • Abstraction over numpy, pandas and scikit-learn allowing you to run operations on them in parallel, using multicore processing
  38. 38. Dask: scalable Pandas
  39. 39. Dask: scalable Pandas
  40. 40. Dask: scalable Pandas • Manipulating large datasets, even when those datasets don’t fit in memory • Distributed computing on large datasets with standard Pandas operations like groupby, join, and time series computations • Scales up to multiple machines auto-magically.
 Scales down: low-memory and fast even on local machines.
  41. 41. Reminder: our goal of scalability ๏ Develop and test on any machine ๏ Re-use the same pipeline for production ๏ For both large and small datasets
  42. 42. Problems with Dask • Not all pandas aggregations are available (e.g. applying custom functions on expanding windows) • Complex to optimise on each machine • Need to manually select the number of workers, partitions, etc. • Performance is highly dependent on settings • Slower for small datasets and certain transformations
  43. 43. Can we do better?
  44. 44. TSFRESH • “Time Series Feature extraction based on scalable hypothesis tests”.
  45. 45. TSFRESH • “Time Series Feature extraction based on scalable hypothesis tests”.
  46. 46. TSFRESH • Same split-apply-combine concept, but feature calculations are done on numpy arrays (vectorized), in parallel
  47. 47. Dealing with time-varying signals [Diagram: pandas Series → numpy array → calculate aggregates in parallel (min(), max(), std(), …) → pandas DataFrame]
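The flow on this slide (pandas Series to numpy array, aggregates on the array, results back into a DataFrame) can be sketched in plain pandas and numpy. This is a conceptual illustration, not TSFRESH's implementation or API: the function `extract_array_features`, the `AGGREGATES` registry and the data are hypothetical, and the loop runs sequentially, although each group is independent and could be handed to a separate worker.

```python
import numpy as np
import pandas as pd

# Hypothetical registry of aggregates that operate on plain numpy arrays.
# Note: np.std computes the population standard deviation (ddof=0).
AGGREGATES = {"min": np.min, "max": np.max, "std": np.std, "mean": np.mean}


def extract_array_features(df, id_col, value_col):
    """Split by id, apply every aggregate to the raw array, combine into a frame."""
    rows = {}
    for group_id, series in df.groupby(id_col)[value_col]:
        values = series.to_numpy()  # pandas Series -> numpy array
        rows[group_id] = {
            f"{value_col}__{name}": func(values) for name, func in AGGREGATES.items()
        }
    # One row per id, one column per aggregate.
    return pd.DataFrame.from_dict(rows, orient="index").rename_axis(id_col)


vitals = pd.DataFrame(
    {
        "patient_id": [1, 1, 2, 2, 2],
        "heart_rate": [80.0, 90.0, 60.0, 70.0, 80.0],
    }
)

array_features = extract_array_features(vitals, "patient_id", "heart_rate")
print(array_features)
```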
  48. 48. Huge list of aggregates available out of the box
  49. 49. Result: clean, interpretable dataframe ready for modelling
  50. 50. Scaling up and down • (Local) multiprocessing • Cluster with Dask
  51. 51. Dealing with time-varying signals • Problem: using numpy arrays means losing the datetime dimension • Solution: custom fork of TSFRESH • The DatetimeIndex of the input pandas dataframe is used only when calculating time-dependent aggregations • Medication data can also be taken into account by exploiting multi-indices (e.g. medications)
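The cost of dropping the datetime dimension shows up as soon as measurements are irregularly spaced. The sketch below, with invented values, compares a trend fitted against the sample position (all a bare numpy array can offer) with a trend fitted against elapsed hours taken from the DatetimeIndex. It illustrates the problem only and is not code from the fork mentioned on the slide.

```python
import numpy as np
import pandas as pd

# Irregularly sampled signal: the last measurement comes 10 hours after the third.
signal = pd.Series(
    [100.0, 102.0, 104.0, 124.0],
    index=pd.to_datetime(
        ["2019-04-03 00:00", "2019-04-03 01:00", "2019-04-03 02:00", "2019-04-03 12:00"]
    ),
)

values = signal.to_numpy()

# On a bare array the only available x-axis is the sample position.
slope_per_sample = np.polyfit(np.arange(len(values)), values, deg=1)[0]

# With the DatetimeIndex the trend can be expressed per hour of elapsed time.
hours = (signal.index - signal.index[0]).total_seconds().to_numpy() / 3600.0
slope_per_hour = np.polyfit(hours, values, deg=1)[0]

print(f"slope per sample: {slope_per_sample:.2f}")
print(f"slope per hour:   {slope_per_hour:.2f}")
```

In this toy signal the value rises steadily over time, yet the position-based fit reports a much steeper trend because it treats the 10-hour gap as a single step.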
  52. 52. Dealing with medications Aggregates: - Total amount - Time since last dose - Time under treatment - Time without treatment
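A rough sketch of how these four medication aggregates could be derived from a table of administrations. The definitions are assumptions made for this example: administrations are non-overlapping intervals, "time since last dose" is measured from the end of the last administration to the prediction time, and "time without treatment" is the remainder of the stay. The column names, times and doses are invented.

```python
import pandas as pd

# Hypothetical administrations of one drug during one admission.
doses = pd.DataFrame(
    {
        "start": pd.to_datetime(["2019-04-01 08:00", "2019-04-01 20:00"]),
        "end": pd.to_datetime(["2019-04-01 12:00", "2019-04-01 22:00"]),
        "amount_mg": [500.0, 250.0],
    }
)
admission = pd.Timestamp("2019-04-01 06:00")
prediction_time = pd.Timestamp("2019-04-02 06:00")

one_hour = pd.Timedelta(hours=1)

# Assumes the administration intervals do not overlap.
time_under_treatment = (doses["end"] - doses["start"]).sum()
time_in_unit = prediction_time - admission

medication_features = {
    "total_amount_mg": doses["amount_mg"].sum(),
    "hours_since_last_dose": (prediction_time - doses["end"].max()) / one_hour,
    "hours_under_treatment": time_under_treatment / one_hour,
    "hours_without_treatment": (time_in_unit - time_under_treatment) / one_hour,
}
print(medication_features)
```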
  53. 53. Summary • Creating features for medical data entails dealing with variety and variability • Quick processing and interpretable features are top priorities • No single tool offers a unique solution
  54. 54. Summary • Pandas works well for quick processing of relatively small datasets • Split-apply-combine • Parallelizing (e.g. through Dask) allows quick computation of aggregates both locally and distributed • Vectorizing the split-apply-combine approach (e.g. with TSFRESH) speeds up computation both for small and large datasets. • Native support for Dask and custom distributors enables scaling
  55. 55. Conclusions • Approach not limited to Python or specific packages • Can be extended to any application that involves time series • Scaling horizontally: we adapted the ICU pipeline for various other projects (e.g. treatment decision based on patients’ clinical history) • No need to re-invent the wheel every time
  56. 56. Key takeaway [Image labels: “FEATURE ENGINEERING”, PANDAS, DATA SCIENTIST]
  57. 57. Questions or feedback? Michele Tonutti michele.tonutti@pacmed.nl
