
Liliana Cruz Lopez - Deep Reinforcement Learning based Insulin Controller for Effective Type-1 Diabetic Care



  1. Deep Reinforcement Learning based Insulin Controller for Effective Type-1 Diabetic Care. Liliana Cruz Lopez, Columbia University, 2019
  2. Background ● More than 100 million adults in the U.S. alone are living with diabetes or prediabetes ● The condition is marked by high blood glucose, caused by the body using insulin ineffectively (Type 2) or producing little or no natural insulin (Type 1) ● Treatment involves maintaining a healthy blood glucose level at all times by injecting the appropriate amount of synthetic insulin at the appropriate time ● Two types of insulin are used ○ basal insulin for continuous blood glucose control ○ bolus insulin, a short-acting insulin, to handle meal disturbances
  3. CGM and Insulin Pump for Diabetic Control Key steps in CGM and insulin pump based blood glucose control ● Continuous glucose level monitoring with a glucose sensor ● An algorithm to determine the amount and type of insulin to be delivered at a specific time ● Delivering insulin into the body through an insulin pump Keeping blood glucose at a healthy level at all times is a complex problem!
  4. Maintaining an Appropriate Blood Glucose Level ● Effective insulin control for a Type-1 diabetic patient requires that a healthy glucose level is maintained throughout the day with minimal fluctuations in either direction ○ Inadequate insulin causes high blood sugar (hyperglycemia), resulting in longer-term complications ○ Excessive insulin leads to low blood sugar (hypoglycemia), which can be fatal if it drops too low ● The goal of insulin-dependent diabetic care is to administer the appropriate amount of insulin at the appropriate time so that glucose is maintained near the target level without reaching hypo- or hyperglycemic levels ● Maintaining the “right” blood glucose level is a very complex problem because of substantial day-to-day variability in a patient’s condition ○ diet/nutrition changes, amount of exercise, sun exposure, daily lifestyle, etc.
  5. Closed Loop Control System for Optimal Insulin ● The interaction between glucose and administered insulin forms a closed loop control system with feedback ○ Insulin control is typically modeled as a PID controller that determines the optimal insulin to administer ● Such model-driven approaches have limitations due to the time-varying, non-linear nature of glucose dynamics ● Instead we approach this as an AI problem: specifically, we consider a Reinforcement Learning model for the insulin controller and evaluate its efficacy
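As a reference point for the model-driven baseline described above, a discrete PID insulin controller can be sketched as follows. This is a minimal sketch: the gains and the 120 mg/dL target are illustrative placeholders, not clinically tuned values, and the class name is hypothetical.

```python
class PIDInsulinController:
    """Discrete PID controller mapping a glucose reading to a basal insulin rate.

    Gains and the 120 mg/dL target are illustrative, not clinically tuned.
    """

    def __init__(self, kp=0.001, ki=0.00001, kd=0.01, target=120.0):
        self.kp, self.ki, self.kd = kp, ki, kd
        self.target = target
        self.integral = 0.0
        self.prev_error = None

    def step(self, glucose, dt=5.0):
        error = glucose - self.target          # positive when glucose is high
        self.integral += error * dt
        deriv = 0.0 if self.prev_error is None else (error - self.prev_error) / dt
        self.prev_error = error
        insulin = self.kp * error + self.ki * self.integral + self.kd * deriv
        return max(insulin, 0.0)               # a pump cannot deliver negative insulin

controller = PIDInsulinController()
rate = controller.step(180.0)  # hyperglycemic reading yields a positive insulin rate
```

Because the gains are fixed in advance, such a controller cannot adapt to the time-varying, non-linear conditions the slide points out, which motivates the learning-based alternative.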
  6. Insulin Control as a Reinforcement Learning Model ● We propose Reinforcement Learning based insulin control where ○ an agent repeatedly interacts with the environment, each time receiving feedback (reward) for its actions ○ the goal of the agent is to learn an optimal policy that maximizes this reward in the long run ● Specifically we propose and evaluate DDPG, a deep RL framework, for the insulin controller ○ suitable for both continuous action and continuous state spaces ○ allows us to explicitly account for both basal and bolus glucose regulation
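The agent-environment loop described above can be sketched generically. The toy glucose dynamics, the 70-180 mg/dL reward band, and the threshold policy below are illustrative assumptions for exposition, not the study's actual model.

```python
import random

def reward(glucose):
    """Toy reward: positive when glucose is inside the 70-180 mg/dL target band."""
    return 1.0 if 70.0 <= glucose <= 180.0 else -1.0

def rollout(env_step, policy, glucose0, steps=10):
    """Generic agent-environment loop: observe, act, receive reward, repeat."""
    glucose, total = glucose0, 0.0
    for _ in range(steps):
        action = policy(glucose)             # insulin dose (continuous action)
        glucose = env_step(glucose, action)  # environment transition
        total += reward(glucose)             # feedback the agent learns from
    return total

def toy_env(glucose, insulin):
    """Hypothetical dynamics: insulin lowers glucose; meals/noise raise it."""
    return glucose - 40.0 * insulin + random.uniform(-5.0, 10.0)

score = rollout(toy_env, policy=lambda g: 0.5 if g > 140 else 0.0, glucose0=160.0)
```

An RL agent replaces the hand-written threshold policy with one learned to maximize the cumulative reward.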
  7. DDPG Background and Why DDPG DDPG Background ● Is an off-policy algorithm ● Works with environments with continuous action spaces ● Is similar to deep Q-learning, extended to continuous action spaces ● Employs an Actor-Critic model ● Learns directly from the observation space through the policy gradient method DDPG as Insulin Controller ● Environment: patient glucose-insulin interaction ● State Space: glucose level and meal amount ● Action Space: insulin amount at each time step, which is in a continuous space
  8. DDPG Formulation - It concurrently learns a Q-function and a policy: Q-function: $Q_\phi(s, a)$ Policy: $\mu_\theta(s)$ - It uses off-policy data and the Bellman equation to learn the Q-function, and uses the Q-function to learn the policy. Bellman equation: $Q^*(s, a) = \mathbb{E}_{s'}\left[ r(s, a) + \gamma\, Q^*(s', \mu(s')) \right]$ Deterministic Policy Gradient (off-policy): $\nabla_\theta J \approx \mathbb{E}_s\left[ \nabla_a Q_\phi(s, a)\big|_{a=\mu_\theta(s)}\, \nabla_\theta \mu_\theta(s) \right]$
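The Bellman backup that the critic minimizes can be illustrated numerically. `td_target` and `critic_loss` are hypothetical helper names, and the scalar Q-values below stand in for network outputs.

```python
GAMMA = 0.99  # discount factor for future reward

def td_target(reward, done, next_q):
    """Bellman target: y = r + gamma * (1 - done) * Q'(s', mu'(s'))."""
    return reward + GAMMA * (1.0 - done) * next_q

def critic_loss(q_values, targets):
    """Mean squared Bellman error the critic is trained to minimize."""
    return sum((q - y) ** 2 for q, y in zip(q_values, targets)) / len(q_values)

y = td_target(reward=1.0, done=0.0, next_q=10.0)  # 1.0 + 0.99 * 10.0 = 10.9
```

In full DDPG the `next_q` term comes from target networks, and the actor is updated by ascending the Q-function's gradient with respect to the action.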
  9. DDPG Algorithm Implementation [Algorithm figure: initializations; Q-function TD targets; exploratory (stochastic) policy; TD target updates of the target networks]
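Two mechanical pieces of the DDPG algorithm, the replay buffer that supplies off-policy data and the soft update of the target networks, can be sketched as follows. The capacity, the update rate `TAU`, and the flat parameter lists are illustrative simplifications.

```python
import random
from collections import deque

TAU = 0.005  # soft-update rate for the target networks

def soft_update(target_params, online_params, tau=TAU):
    """Polyak averaging: theta_target <- tau * theta + (1 - tau) * theta_target."""
    return [tau * o + (1.0 - tau) * t for t, o in zip(target_params, online_params)]

class ReplayBuffer:
    """Fixed-size store of (s, a, r, s', done) transitions for off-policy learning."""

    def __init__(self, capacity=10000):
        self.buf = deque(maxlen=capacity)  # oldest transitions are evicted first

    def add(self, transition):
        self.buf.append(transition)

    def sample(self, batch_size):
        return random.sample(self.buf, batch_size)
```

Sampling minibatches from the buffer decorrelates consecutive glucose readings, and the slow target-network updates keep the TD targets stable during training.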
  10. Implementation and Environment ● We evaluate the performance of the DDPG based insulin controller using synthetic data generated with the well-known UVA/Padova model of human glucose dynamics ● It includes models for 30 virtual patients: 10 adults, 10 adolescents, and 10 children ● The DDPG insulin controller is implemented in Python using the OpenAI Gym framework by extending the SimGlucose simulator ● Simulator parameters consist of the following features ○ Meal frequency and amount ○ Patient age, weight, and height
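A minimal Gym-style environment skeleton mirroring the reset/step interface that SimGlucose exposes might look like this. The linear dynamics, reward band, and termination thresholds are toy stand-ins, not the UVA/Padova patient models.

```python
class GlucoseEnv:
    """Minimal Gym-style environment skeleton for glucose control.

    The dynamics below are a toy stand-in; the real SimGlucose environment
    simulates UVA/Padova virtual patients.
    """

    def __init__(self, start_glucose=140.0):
        self.start_glucose = start_glucose
        self.glucose = start_glucose

    def reset(self):
        self.glucose = self.start_glucose
        return self.glucose

    def step(self, insulin):
        meal = 0.0                           # meal disturbances would enter here
        self.glucose += meal - 30.0 * insulin
        reward = 1.0 if 70.0 <= self.glucose <= 180.0 else -1.0
        done = self.glucose < 40.0 or self.glucose > 400.0  # severe hypo/hyper
        return self.glucose, reward, done, {}
```

Because the agent only sees the reset/step interface, the same DDPG code can be trained against this toy environment or the full simulator.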
  11. Evaluation Framework ● DDPG based insulin controller vs. a baseline model-based controller (BBController), to study how well the DDPG controller performs under varying/dynamic conditions ● We evaluate under three different scenarios by inducing disturbances in blood glucose through ○ a single meal ○ multiple meals taken frequently in short intervals ○ multiple meals spread across longer intervals ● The rationale behind these choices is to understand whether the DDPG based controller has any advantage over a typical model-based insulin controller under such induced glucose disturbances
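The three evaluation scenarios can be encoded as meal schedules. The times and carbohydrate amounts below are hypothetical, chosen only to illustrate the frequent vs. spread-out distinction; the study's actual schedules may differ.

```python
# Hypothetical meal schedules: (time in hours, carbohydrate grams).
SCENARIOS = {
    "single_meal":    [(7.0, 30)],
    "frequent_meals": [(7.0, 30), (9.0, 25), (11.0, 25)],   # short gaps
    "spread_meals":   [(7.0, 30), (13.0, 40), (19.0, 35)],  # long gaps
}

def min_meal_gap(schedule):
    """Smallest gap (hours) between consecutive meals; None for a single meal."""
    times = sorted(t for t, _ in schedule)
    if len(times) < 2:
        return None
    return min(b - a for a, b in zip(times, times[1:]))
```

The gap between meals is the key variable: short gaps force the controller to react before the previous disturbance has settled.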
  12. Experiment I: Single Meal Disturbance Disturbance with a single meal of size 30 (CHO value 10) introduced at 7:00 am. ● Glucose rose much higher with the BBController before stabilizing to the normal range, whereas blood glucose fluctuation is relatively lower with the DDPG controller. ● The DDPG controller achieves tighter control through continuous adjustment to the environment with a purely data-driven approach, instead of the preset model used by the BBController. [Plots: BBController vs. DDPG Controller]
  13. Experiment II: Multi-Meal Frequent Disturbances Frequent multi-meal disturbance, where three meals were taken with short gaps between them ● Blood glucose rose to a higher level (>250) for a longer interval with the BBController, which did not react fast enough after each disturbance compared to the DDPG based controller ● The DDPG controller maintained glucose with relatively less fluctuation and closer to the target level ● The DDPG based controller is more suitable for handling rapid and dynamic blood glucose disturbances, even with basal insulin alone, due to its continuously reactive nature [Plots: BBController vs. DDPG Controller]
  14. Experiment III: Multi-Meal Disturbances Spread Out Spread-out multi-meal disturbance: this scenario evaluates performance with multiple meal disturbances with longer gaps between meals ● The BBController handles these spread-out meal disturbances relatively better than when disturbances are more rapid and dynamic. ● We find that both the model-based and the DDPG controller with basal insulin have similar performance under this scenario. We need to further investigate whether the DDPG controller could achieve superior performance when bolus insulin is also combined. [Plot: BBController]
  15. Conclusion ● Our study is the first to propose and evaluate a deep RL based insulin controller for Type-1 diabetic blood glucose management ● We implement a DDPG based controller using SimGlucose, which supports the OpenAI Gym framework ● We compare it with a model-based baseline insulin controller (Padova) under three scenarios ○ single meal ○ frequent multiple-meal disturbances ○ spread-out multiple-meal disturbances ● Our evaluation indicates that DDPG based controllers are more suitable for handling rapid and dynamic blood sugar disturbances than the model-based controller ● The DDPG controller achieves a more controlled blood glucose level even with basal insulin alone, possibly due to the continuously reactive nature of such algorithms
  16. Future Work ● Results from our initial evaluations are promising: deep RL based insulin controllers could be more effective at handling rapid fluctuations in blood sugar ● Further evaluate the effectiveness of the DDPG based insulin controller by varying environment variables and additional physical factors such as exercise and lifestyle changes ● Incorporate bolus insulin into the DDPG controller's action to further improve its efficacy in reducing blood sugar fluctuations
  17. Acknowledgements Jinyu Xie, SimGlucose creator; Yuan Zhao, Mobiquity Networks; Dr. Chong Li, Columbia University; InsightZen for infrastructure support.