Toss ‘N’ Turn: Smartphone as Sleep and Sleep Quality Detector, at CHI 2014



The rapid adoption of smartphones, along with a growing habit of using these devices as alarm clocks, presents an opportunity to use them as sleep detectors. This adds value to UbiComp and personal informatics in terms of user context and new performance data to collect and visualize, and it benefits healthcare, as sleep is correlated with many health issues. To assess this opportunity, we collected one month of phone sensor data and sleep diary entries from 27 people who have a variety of sleep contexts. We used this data to construct models that detect sleep and wake states, daily sleep quality, and global sleep quality. Our system classifies sleep state with 93.06% accuracy, daily sleep quality with 83.97% accuracy, and overall sleep quality with 81.48% accuracy. Individual models performed better than generally trained models. To perform well, the individual models require 3 days of ground truth data for detecting sleep and 3 weeks of ground truth data for detecting sleep quality. Finally, the features of noise and movement were useful for inferring sleep quality.

Published in: Mobile, Health & Medicine


  1. Toss ‘N’ Turn: Smartphone as Sleep and Sleep Quality Detector. Jun-Ki Min, Afsaneh Doryab, Jason Wiese, Shahriyar Amini, John Zimmerman, Jason I. Hong
  2. Sensing Sleep for… • Personal informatics • UbiComp system • Health monitoring
  3. Current Practices
  4. Opportunities • We already have smartphones • 83% of millennials sleep with their phone (Pew Internet)
  5. How well can a smartphone sense sleep without requiring changes in our behavior? Task 1. Detect bedtime, waketime, and duration Task 2. Infer daily sleep quality Task 3. Classify good or poor sleeper
  6. Toss ‘N’ Turn (Data Collection Ver.) • Inputs: sound amplitude, acceleration, ambient light intensity, screen proximity, processes, battery state, screen state, sleep diary • Components: data preprocessing, feature extraction, database, server
  7. Modeling • Every 30 minutes, run the sleep detection model [Figure: sound and motion streams split into 10-minute windows, each labeled sleeping (1) or not (0); example label sequences …01000000111111111011101011000 and …00000000111111111111111111000; outputs: bedtime, waketime, duration, and good or poor sleep]
  8. User Study • Recruited good and poor sleepers – Living in the US, age > 18 – Paid $2 USD for each diary entry (a maximum of $72) • Collected sleep data for a month • 30 participants signed up and 27 completed – Total of 795 sleep-diary entries
  9. Ground Truthing (User Study) • Global sleep quality = subjective sleep quality + sleep latency + sleep efficiency + sleep duration + use of medication + sleep disturbances • A global score > 5 indicates a subject is having poor sleep
  10. Demographics (User Study) [Charts, counts in extracted order: share bed with (11; 3, 8, 3, 1, 1); disrupting noises in the bedroom (12, 15; yes/no); age (10, 10, 5, 1, 1 for 20, 30, 40, 50, ?); regularly work (22, 5; no/yes); sex (19, 8); legend: poor sleeper (PSQI global score > 5) vs. good sleeper (PSQI global score ≤ 5)]
  11. Evaluation • Classifier – Bayesian network (BN) with correlation-based feature selection • Task 1. Detect bedtime, waketime, and duration • Task 2. Infer daily sleep quality – Train the model individually (leave-one-day-out cross validation) • Task 3. Classify good or poor sleeper – Leave-one-person-out cross validation
  12. Task 1: Sleep Detection (Evaluation) • Detect sleep windows → Detect sleep time • 94.5% accuracy in classifying sleep/not-sleep windows [Chart: average minutes of over (+) and under (-) estimation error, axis from -150 to 150, for bedtime detection, waketime detection, and sleep duration inference; our method vs. baseline (avg. time)]
  13. Task 2: Daily Sleep Quality Inference (Evaluation) • Detect sleep → Classify the quality of sleep • 84.0% accuracy in classifying good/poor sleeps [Chart: accuracy (%) and poor sleep detection (F-score); our method vs. random]
  14. Task 3: Good/Poor Sleeper Classification (Evaluation) • Infer daily qualities → Classify the sleeper type • 81.5% accuracy in classifying good/poor sleepers [Chart: accuracy (%) and poor sleeper detection (F-score); our method vs. random]
  15. Discussion: How well can a smartphone sense sleep without requiring changes in our behavior? Task 1. Detect bedtime, waketime, and duration with errors of 35, 31, and 49 minutes, respectively Task 2. Infer daily sleep quality with 84% accuracy Task 3. Classify good or poor sleeper with 81% accuracy
  16. Top Five Features (Discussion) • Sleep detection – Time – Battery charging / not-charging – Min. movement – Std. sound amplitude – Q3 sound amplitude • Sleep quality inference – Bedtime – Waketime – Sleep duration – Std. movement – Yesterday’s sleep quality
  17. Sleep Detection Errors (Discussion) [Chart: error (minutes) by phone location, for people who sleep alone vs. people who have a sleep partner]
  18. General vs. Individual Models (Discussion) • Sleep detection: 93.06% vs. 94.52% – Need 3 days of ground truthing to train an individual model • Sleep quality inference: 77.23% vs. 83.97% – Need 3 weeks of ground truthing to train an individual model
  19. Limitations (Discussion) • Subjective vs. objective sleep quality – “How was your sleep last night? Rate it on a one to five scale score” does not capture the full extent of a sleep session • People tend to over / underestimate their sleep • Small sample size of poor-quality sleep
  20. Thanks! • More info at or email • Special thanks to: – DARPA, Google
  21. Backup Slides
  22. Data Collection Frequency • Columns: sensed value (frequency); data collection cycle (night, day, Btry<30%, Btry<15%) • Sound amplitude (1 Hz): cont.; every other minute; stop • Acceleration (5 Hz): cont.; every other minute • Light, screen proximity (1/5 Hz): cont.; every other minute • List of running apps (1/10 Hz): when the screen is turned on • Battery states: when the battery level is changed or the power cable is plugged in/out • Screen states: when the screen is turned on/off • Sleep diary: every morning (notification)
  23. Features (category, factor: feature variables) • Sleep detection (32 features for each window) – Noise level: sound amplitudes {Min., Q1, Med., Q3, Max., Avg., Std.} – Movement: acceleration changes {Min., Q1, Med., Q3, Max., Avg., Std.} – Light intensity: light intensities & screen proximities {Min., Q1, Med., Q3, Max., Avg., Std.} – Device state & usage: duration of screen-on time {Min., Q1, Med., Q3, Max., Avg., Std.}, battery state {plug-in/out, charging/not-charging} & alarm app usage – Regular sleep time: timestamp • Daily quality (122 features for each sleep) – Sleep duration: bedtime, waketime & sleep duration (detected) – Sleep latency, efficiency & disturbances: sensor values {#peaks, Avg. width of peaks, Avg. height of peaks, interval of peaks, position of peaks, Min., Q1, Med., Q3, Max., Avg., Std.} & yesterday’s sleep quality (previously inferred) • Global quality (198 features for each participant) – Sleep regularity: bedtimes, waketimes, sleep durations & qualities for a month of sleeps {Med., Avg., Std.} (previously detected and inferred) – Global sleep latency, efficiency & disturbances: daily sleep quality features {Med., Avg., Std.}
  24. Modeling: Data Processing • Rationale for “10 minutes” – Level of granularity when participants report sleep time – Median sleep latency = 10.9 minutes • 90,097 windows, 711 not-sleep and 728 sleep segments [Figure: sound and motion streams split into 10-minute windows labeled sleep or not-sleep, with bedtime and waketime marked]
  25. Sleep Detection & Quality Inference (Modeling) • Detection, every 30 minutes: sound and motion data in 10-minute windows → feature extraction → classification by the sleep window detection model (1 = sleep or 0 = not-sleep), e.g. …0011000001010000011111111011010100001010000100000… → smoothing by the sleep time detection model, e.g. …0000000000000000011111111111111100000000000000000… → bedtime, waketime, duration • Quality inference: bedtime, waketime, duration → feature extraction → classification by the sleep quality inference model → good or poor
  26. Infer Other Contexts • Sleep alone vs. with others – 84.2% • Phone on the bed vs. near the bed vs. elsewhere – 91.9%
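
As a rough illustration of the window-to-sleep-time step on slides 7 and 25, the following Python sketch smooths a sequence of per-window labels (1 = sleep, 0 = not-sleep) and reads off bedtime, waketime, and duration. The slides do not say how the smoothing is done; the sliding majority vote, the `radius` parameter, the longest-run rule, and all function names here are assumptions made for illustration, not the authors' method. Only the 10-minute window length comes from the slides.

```python
from typing import Dict, List, Optional, Tuple

WINDOW_MINUTES = 10  # window length stated on the slides


def smooth(labels: List[int], radius: int = 2) -> List[int]:
    """Sliding majority vote over each window's neighbourhood (assumed rule)."""
    smoothed = []
    for i in range(len(labels)):
        lo = max(0, i - radius)
        hi = min(len(labels), i + radius + 1)
        neighbourhood = labels[lo:hi]
        # A strict majority of 1s is required; ties resolve to not-sleep.
        smoothed.append(1 if 2 * sum(neighbourhood) > len(neighbourhood) else 0)
    return smoothed


def longest_sleep_run(labels: List[int]) -> Optional[Tuple[int, int]]:
    """Return (start, end) window indices, end exclusive, of the longest run of 1s."""
    best: Optional[Tuple[int, int]] = None
    start: Optional[int] = None
    for i, value in enumerate(labels + [0]):  # trailing 0 closes a final run
        if value == 1 and start is None:
            start = i
        elif value != 1 and start is not None:
            if best is None or i - start > best[1] - best[0]:
                best = (start, i)
            start = None
    return best


def sleep_summary(labels: List[int]) -> Optional[Dict[str, int]]:
    """Bedtime, waketime, and duration in minutes from the first window's start."""
    run = longest_sleep_run(smooth(labels))
    if run is None:
        return None  # no sleep detected in this sequence
    start, end = run
    return {
        "bedtime_min": start * WINDOW_MINUTES,
        "waketime_min": end * WINDOW_MINUTES,
        "duration_min": (end - start) * WINDOW_MINUTES,
    }


if __name__ == "__main__":
    noisy = [0, 1, 0, 0, 0, 1, 1, 0, 1, 1, 1, 0, 0, 1, 0, 0]
    print(smooth(noisy))
    print(sleep_summary(noisy))
```

Times are offsets from the start of the label sequence, so a real system would add the timestamp of the first window. A larger `radius` removes more isolated misclassifications but also blurs short awakenings, which is a trade-off the slides do not discuss.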