MLconf NYC Samantha Kleinberg

Published in: Technology

  1. 1. Causal Inference from Uncertain Time Series Data Samantha Kleinberg Stevens Institute of Technology
  2. 2. http://discoverysedge.mayo.edu/artificial-pancreas/
  3. 3. insulindependence.org, influxis.com, gigaom.com; Nate Heintzman @ UCSD, Dexcom
  4. 4. Columns: Time, Blood Glucose, Heart Rate, Skin Temp., Energy Use, Weight. Rows (only recorded values appear):
     7:20
     7:25 95.7 26.80 37.23
     7:30 90.5 28.63 12.85
     7:35 82.4 29.69 6.78
     7:40 79.2 30.32 6.41
     7:45 77.0 30.55 8.17
     7:50 73.3 31.19 6.87 155
     7:55 72.7 31.29 6.08
     8:00 130.0 71.4 31.96 9.74
     8:05 203.5 89.8 31.59 24.58
     8:10 208.2 87.9 31.67 10.63
     8:15 209.0 84.8 31.16 15.63
     8:20 207.8 81.0 31.11 11.74
     8:25 205.9 83.7 31.26 16.28
     8:30 204.8 98.3 31.05 23.59
     8:35 214.9 82.2 30.47 11.66
     8:40 225.7 82.9 30.95 16.93
     8:45 231.4 0.1
     8:50 232.8 89.6 29.67 21.96
     8:55 239.4 81.2 30.81 15.86
  5. 5. (Same table as slide 4) Errors
  6. 6. (Same table as slide 4) Errors, Missing Data
  7. 7. (Same table as slide 4) Errors, Missing Data, Different timescales
  8. 8. Discretization [Figure: plots with y-axis 0 to 1 and x-axis 1 to 201; categories: hypo, eu, hyper]
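The hypo/eu/hyper categories on the discretization slide can be illustrated with a minimal sketch. The cut points below (70 and 180 mg/dL) and the function name are assumptions chosen for illustration; the slides do not state which thresholds were used.

```python
from typing import List, Optional

# Hypothetical cut points in mg/dL; not taken from the slides.
HYPO_BELOW = 70.0
HYPER_ABOVE = 180.0


def discretize_glucose(value: Optional[float]) -> Optional[str]:
    """Map one glucose reading to 'hypo', 'eu', or 'hyper'.

    Missing readings (None) are passed through unchanged.
    """
    if value is None:
        return None
    if value < HYPO_BELOW:
        return "hypo"
    if value > HYPER_ABOVE:
        return "hyper"
    return "eu"


# Toy readings, including one missing value.
readings: List[Optional[float]] = [None, 130.0, 203.5, 65.0]
labels = [discretize_glucose(v) for v in readings]
print(labels)
```

A hard threshold like this treats a reading just below a cut point very differently from one just above it, even when the two differ by less than the sensor's error.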
  9. 9. Logic-based causal inference • Complex, temporal relationships • Potential causes occur earlier than their effects and raise their probability. Kleinberg, S. (2012) Causality, Probability, and Time.
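The two conditions for a potential cause (it comes earlier than the effect, and it raises the effect's probability) can be sketched for binary time series as follows. This is a simplified illustration, not the method from the cited book: the function names, the fixed lag window `[lo, hi]`, and the toy data are all assumptions.

```python
from typing import Sequence


def window_hit(e: Sequence[bool], t: int, lo: int, hi: int) -> bool:
    """True if the effect occurs anywhere in time steps t+lo .. t+hi."""
    return any(e[t + lo : t + hi + 1])


def is_potential_cause(c: Sequence[bool], e: Sequence[bool], lo: int, hi: int) -> bool:
    """Check whether c precedes e (lo >= 1) and raises its probability.

    Compares P(e in window | c at t) against the baseline P(e in window)
    over all time points whose full window lies inside the data.
    """
    if lo < 1 or hi < lo:
        raise ValueError("window must start at least one step after the cause")
    usable = range(len(e) - hi)
    base = [window_hit(e, t, lo, hi) for t in usable]
    after_c = [hit for t, hit in zip(usable, base) if c[t]]
    if not after_c:
        return False  # the cause never occurs with a full window after it
    p_e = sum(base) / len(base)
    p_e_given_c = sum(after_c) / len(after_c)
    return p_e_given_c > p_e


# Toy series: e always follows c by exactly one step.
c = [True, False, False, False, True, False, False, False, False, False]
e = [False, True, False, False, False, True, False, False, False, False]
print(is_potential_cause(c, e, lo=1, hi=1))
print(is_potential_cause(e, c, lo=1, hi=1))
```

Using the same window for the baseline and the conditional probability keeps the comparison fair; a wider window would raise both.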
  10. 10. Which causes are significant? • Main idea: looking for better explanations for the effect • Assess average difference cause makes to probability of effect • Determine which ε are statistically significant
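The "average difference a cause makes" can be sketched as the mean, over the other potential causes x of the effect, of P(e | c and x) minus P(e | not c and x). The sketch below is a simplification under stated assumptions: it ignores time windows by treating each record as already lag-aligned, and the names `cond_prob`, `eps_avg`, and the toy records are hypothetical.

```python
from typing import Dict, List, Optional

Record = Dict[str, bool]


def cond_prob(records: List[Record], effect: str, given: Dict[str, bool]) -> Optional[float]:
    """Estimate P(effect | given) by counting; None if the condition never holds."""
    matching = [r for r in records if all(r[k] == v for k, v in given.items())]
    if not matching:
        return None
    return sum(r[effect] for r in matching) / len(matching)


def eps_avg(records: List[Record], cause: str, effect: str, potential: List[str]) -> Optional[float]:
    """Average of P(e | c, x) - P(e | not c, x) over other potential causes x."""
    diffs = []
    for x in potential:
        if x == cause:
            continue
        with_c = cond_prob(records, effect, {cause: True, x: True})
        without_c = cond_prob(records, effect, {cause: False, x: True})
        if with_c is not None and without_c is not None:
            diffs.append(with_c - without_c)
    return sum(diffs) / len(diffs) if diffs else None


# Made-up records for illustration only.
records: List[Record] = (
    [{"exercise": True, "meal": True, "hyper": True}] * 3
    + [{"exercise": True, "meal": True, "hyper": False}] * 1
    + [{"exercise": False, "meal": True, "hyper": True}] * 1
    + [{"exercise": False, "meal": True, "hyper": False}] * 3
)
score = eps_avg(records, "exercise", "hyper", ["exercise", "meal"])
print(score)
```

Deciding which scores are statistically significant is a separate step that this sketch does not cover.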
  11. 11. Causes from EHR data [Figure: x-axis: months before CHF diagnosis (0 to 25); variables: dx_urinary_symptoms, first_antihtn_combo, dx_overweight, dx_diabetes, dx_hypothyroidism; value labels: 8, 9, 17, 17, 18 and 11, 10, 22, 22, 21] S. Kleinberg and N. Elhadad (2013) Lessons Learned in Replicating Data-Driven Experiments in Multiple Medical Systems and Patient Populations. AMIA Annual Symposium.
  12. 12. Adding uncertainty D. Hutchison and S. Kleinberg (2013) Causality and Experimentation in the Sciences.
  13. 13. Carb-heavy meal (c), vigorous exercise (e), blood glucose (g)
  14. 14. Calculate: What’s the effect of exercise on glucose?
  15. 15. Calculate: Strict discretization: What’s the effect of exercise on glucose?
  16. 16. Calculate: Strict discretization: Probabilistic discretization: What’s the effect of exercise on glucose?
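The formulas on these slides did not survive in the transcript, so the following is only a rough sketch of the contrast between strict and probabilistic discretization, not the talk's actual calculation. It assumes Gaussian sensor error with a known standard deviation and a hypothetical hyperglycemia threshold; every constant, name, and data value here is made up for illustration.

```python
import math
from typing import Callable, List, Optional

HYPER_ABOVE = 180.0  # hypothetical threshold in mg/dL
SENSOR_SD = 15.0     # hypothetical sensor error (standard deviation)


def strict_hyper(reading: float) -> float:
    """Strict discretization: 1 if the reading is above the threshold, else 0."""
    return 1.0 if reading > HYPER_ABOVE else 0.0


def prob_hyper(reading: float) -> float:
    """Probabilistic discretization: P(true value > threshold) under Gaussian error."""
    z = (HYPER_ABOVE - reading) / SENSOR_SD
    return 1.0 - 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def p_hyper_given(flags: List[bool], readings: List[float],
                  membership: Callable[[float], float]) -> Optional[float]:
    """Estimate P(hyper | flag) as the mean membership over flagged time points."""
    weights = [membership(g) for f, g in zip(flags, readings) if f]
    return sum(weights) / len(weights) if weights else None


# Toy data: glucose readings and whether exercise preceded each one.
glucose = [178.0, 182.0, 150.0, 230.0]
exercise = [True, True, False, True]

strict_estimate = p_hyper_given(exercise, glucose, strict_hyper)
soft_estimate = p_hyper_given(exercise, glucose, prob_hyper)
print(strict_estimate, soft_estimate)
```

With strict discretization the readings 178 and 182 land in different categories; with the probabilistic version each contributes a weight close to one half, reflecting that the sensor cannot reliably tell them apart.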
  17. 17. Validation Simulated structures • Randomly generated relationships • 6 levels of uncertainty • 20K observations
  18. 18. Results on simulated data
  19. 19. Causes of changes in glucose • Cohort: 17 subjects with T1DM • Sensor data (collected for >72 hours) – Glucose values – Insulin dosage – Activity – Sleep stage – Heart rate – Temperature • With N. Heintzman (UCSD, Dexcom) http://dial.ucsd.edu/what-we-do.php
  20. 20. Results • Very vigorous exercise leads to hyperglycemia (FDR < .01) in 5-30 minutes – Found using both HR (anaerobic activity zone) and METs • Not found with strict discretization • Supported by literature (Marliss and Vranic, 2002; Riddell and Perkins, 2006) S. Kleinberg and N. Heintzman. Inference of Causal Relationships from Uncertain Data and Application to Type 1 Diabetes. (under review)
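The result above is reported at a false discovery rate below .01. The slides do not say which FDR procedure was used, so the sketch below substitutes the Benjamini-Hochberg step-up procedure as a common stand-in; the function name and the p-values are made up for illustration.

```python
from typing import List


def benjamini_hochberg(p_values: List[float], q: float = 0.01) -> List[bool]:
    """Return, for each hypothesis, whether it is rejected at FDR level q.

    Benjamini-Hochberg: sort p-values, find the largest rank k with
    p_(k) <= q * k / m, and reject the k smallest p-values.
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    cutoff_rank = 0
    for rank, i in enumerate(order, start=1):
        if p_values[i] <= q * rank / m:
            cutoff_rank = rank
    rejected = [False] * m
    for i in order[:cutoff_rank]:
        rejected[i] = True
    return rejected


# Made-up p-values for four candidate relationships.
p_vals = [0.0001, 0.2, 0.004, 0.03]
decisions = benjamini_hochberg(p_vals, q=0.01)
print(decisions)
```

Controlling the false discovery rate rather than testing each relationship at a fixed threshold matters here because many candidate causes are tested at once.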
  21. 21. Open problems • Latent variables • Nonstationary time series • Real-time prediction/explanation • Simulation
  22. 22. Thanks! • NSF/CRA Computing Innovation fellowship • NLM Computational Thinking contract • NLM R01 • Columbia – Noemie Elhadad – George Hripcsak – Mitzi Morris – Rimma Pivovarov • Geisinger – Buzz Stewart – Craig Wood • UCSD – Nathaniel Heintzman • Stevens – Dylan Hutchison
