Actigraph sleep algorithm analysis

Actigraph Algorithm
Contents
Using the .do Files
Interpreting the Output
Specifications
Parameter Estimation
Accuracy
Caveats and Future Work

Using the .do Files
This algorithm is designed to predict sleep onset and wake time given an actigraph
dataset. It can be used as follows:
1. Download the do files
2. Open the file called actigraph_algorithm.do
3. Under the heading “Change Directory Here”, change into the directory in which the do
files are stored using the cd “Enter path here” command
4. Under the heading “Import Data File Here” import your actigraph dataset
5. Under the heading “Enter the hour and minute at which the data starts”, set the global
macro data_start_hour to the hour at which the actigraph was turned on, and set the
global macro data_start_minute to the minute at which the actigraph was turned on
a. The data start time can be found by opening the imported actigraph dataset and
looking at the first entry of the Time variable
b. Note that the start time must be in 24-hour format. For example, the data start hour for
1:00pm is 13
6. Execute the actigraph_algorithm.do file
7. Open the data file
8. Scroll to the rightmost variables of the data file. You will find the estimated sleep onset
and wake times organized by day under the variables sleep_onset_hour,
sleep_onset_min, wake_hour, and wake_min
9. On rare occasions, Stata will give strange estimates for reasons which are opaque to
me. Run the do file again and the estimates should make more sense

Interpreting the Output
The final product can be found in the rightmost variables of the data file, the output of
which will look something like this:
Starting from the top, we can see that on night 1, the participant went to sleep at 23:19
and woke up at 7:10 the following morning. On the second night, the participant went to sleep at
22:22 and woke up at 7:46. And so on.
A technical note about “sleep onset”. Most sleep researchers distinguish between time in
bed and sleep onset - time in bed being the time at which the participant went to bed and sleep
onset being the time at which the participant fell asleep. In actigraphy, time in bed is often
considered to be when activity is just beginning to decline for the night. Sleep onset is
considered to be when activity plateaus near 0 after time in bed. This algorithm does not make
the distinction between time in bed and sleep onset, instead measuring sleep onset as the point
at which activity drops off most sharply for the night. What I call ‘sleep onset’ is likely to be a
point in time between what sleep researchers would consider time in bed and actual sleep
onset. This algorithm can be expected to overestimate sleep, but given that time in bed and
actual sleep onset are often temporally proximate, it should only overestimate sleep by a few
minutes per night.

Specifications
The following steps are executed sequentially:
Time variable
For ease of calculations, the Time variable is converted into a variable called time which
stores the number of minutes since midnight of the day the actigraph was turned on. If, for
instance, the actigraph was turned on at 1:18pm, the first time entry would be 798. At 1:19pm
the time variable would read 799, and so on.
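The conversion can be illustrated with a minimal Python sketch (this is not the Stata code in the do files). It assumes one row per minute, and the function names `minutes_since_midnight` and `to_clock` are hypothetical helpers introduced only for this example.

```python
def minutes_since_midnight(start_hour, start_minute, n_rows):
    """Return the `time` value for each row, assuming one row per minute."""
    start = start_hour * 60 + start_minute
    return [start + i for i in range(n_rows)]


def to_clock(time_value):
    """Convert a `time` value back to (hour, minute) on a 24-hour clock."""
    minutes_in_day = time_value % (24 * 60)
    return minutes_in_day // 60, minutes_in_day % 60


# An actigraph turned on at 1:18pm (13:18 in 24-hour format)
times = minutes_since_midnight(13, 18, 3)
print(times)
print(to_clock(1400))
```

Because `time` keeps counting past midnight, values above 1440 refer to the following day, which is why `to_clock` takes the remainder modulo 1440.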
Capped vector magnitude
The main variable in the actigraph dataset is called VectorMagnitude, which is a
composite measure of movement along the x-, y-, and z-axes at any given minute. For our
calculations, it doesn’t matter how much a participant is moving after a certain point - it’s more
important to simply know whether or not a participant was moving, more like a binary variable. In
fact, using the full range of vector magnitudes introduces noise that makes our calculations less
accurate. To reduce this noise, I use a variable called the capped vector magnitude or
capped_vector_mag, which clips the vector magnitude at a ceiling value of 400. This clipping
procedure is also used by ActiLife, although its ceiling value is 300. The
ceiling value can be adjusted manually in the settings.do file.
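A minimal Python sketch of the clipping step, for illustration only. The function name `cap_vector_magnitude` is hypothetical; the default ceiling of 400 is the value given above.

```python
def cap_vector_magnitude(values, ceiling=400):
    """Clip each per-minute vector magnitude at the ceiling value."""
    return [min(v, ceiling) for v in values]


capped = cap_vector_magnitude([0, 250, 400, 1832])
print(capped)
```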

Activity dropoff
I then derive the variable activity dropoff from the capped vector magnitude. The activity
dropoff at minute m is the sum of activity over the previous hour minus the sum of activity over the
following hour. Intuitively, when a participant goes to sleep, she has been active for the past
hour but will be at rest the following hour, so a high activity dropoff value should predict sleep
onset. Conversely, when a participant wakes, she has been at rest for the past hour but will be
active for the next hour, meaning that a low (large negative value) activity dropoff value should
predict wake time.
The graphs above show activity dropoff values at nighttime (left) and in the morning
(right). Notice that the nighttime graph peaks around 1400 minutes, which corresponds to a time
of 23:20. This falls between Alosias’ estimated time in bed (23:17) and sleep onset (23:24). The
morning graph troughs around 1870 minutes (7:10), which is exactly Alosias’ estimated wake
time.
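The activity dropoff calculation can be sketched in Python as follows. This is an illustration under stated assumptions, not the do-file implementation: it assumes one value per minute, treats the "previous hour" as minutes m-60 through m-1 and the "following hour" as minutes m through m+59, and leaves the value undefined (`None`) wherever a full hour is not available on both sides. The function name `activity_dropoff` is hypothetical.

```python
def activity_dropoff(capped, window=60):
    """Sum of the previous `window` minutes minus the sum of the next `window`."""
    n = len(capped)
    # prefix[k] holds the sum of the first k values, so any window sum is one subtraction
    prefix = [0]
    for v in capped:
        prefix.append(prefix[-1] + v)
    out = [None] * n
    for m in range(window, n - window + 1):
        before = prefix[m] - prefix[m - window]   # minutes m-window .. m-1
        after = prefix[m + window] - prefix[m]    # minutes m .. m+window-1
        out[m] = before - after
    return out


# Two hours of capped activity followed by two hours of rest
series = [400] * 120 + [0] * 120
dropoff = activity_dropoff(series)
peak_minute = max((v, m) for m, v in enumerate(dropoff) if v is not None)[1]
print(peak_minute)
```

In this toy series the dropoff peaks at the exact minute activity stops, which is the behaviour the sleep onset estimate relies on.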

Probability density functions (PDF)
The PDF function models multimodal skew-normal probability
distributions. This is important for generating the prior and posterior values in Bayes’ Theorem,
which we will use later on.
PDF takes in a variable over which the probability density will be calculated, the number
of modes, and the location (epsilon), scale (omega), shape (alpha), and weight parameters of
each mode. For each value of the input variable, PDF generates the probability density of the
distribution specified by the distribution’s parameters and stores it in a variable also called PDF.
In a unimodal distribution, I use the following equations:
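1) $\mathrm{PDF}(x) = \frac{2}{\omega}\,\phi\!\left(\frac{x-\varepsilon}{\omega}\right)\Phi\!\left(\alpha\,\frac{x-\varepsilon}{\omega}\right)$
2) $\phi(x) = \frac{1}{\sqrt{2\pi}}\,e^{-x^{2}/2}$
3) $\Phi(x) = \frac{1}{2}\left[1 + \operatorname{erf}\!\left(\frac{x}{\sqrt{2}}\right)\right]$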
The error function erf is calculated by Maclaurin series. Although the mathematical
expression sums over n to infinity, my program only iterates to n = 30. Because of this
limitation, the series sometimes fails to converge, in which case I automatically set the value to
-1 or 1:
4) $\operatorname{erf}(x) = \frac{2}{\sqrt{\pi}} \sum_{n=0}^{\infty} \frac{(-1)^{n}\,x^{2n+1}}{n!\,(2n+1)}$
For a multimodal distribution, I simply take the weighted average of multiple unimodal
distributions.
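Equations 1 through 4 and the weighted average can be sketched in Python as below. This is an illustrative sketch, not the PDF program in the do files. Several details are assumptions made for the example: the series is summed for n = 0 through 30, the result is set to -1 or 1 whenever |x| exceeds a cutoff of 3 (the text does not say how non-convergence is detected), the `scale` argument is omega rather than omega squared, and mode weights are normalised to sum to 1. All function names are hypothetical.

```python
import math


def erf_series(x, terms=31, cutoff=3.0):
    """Truncated Maclaurin series for erf, with n running from 0 to terms - 1."""
    if abs(x) > cutoff:
        # Assumption: beyond the cutoff the truncated sum is unreliable, so clamp
        return math.copysign(1.0, x)
    total = 0.0
    for n in range(terms):
        total += (-1) ** n * x ** (2 * n + 1) / (math.factorial(n) * (2 * n + 1))
    return 2.0 / math.sqrt(math.pi) * total


def std_normal_pdf(x):
    return math.exp(-x * x / 2.0) / math.sqrt(2.0 * math.pi)


def std_normal_cdf(x):
    return 0.5 * (1.0 + erf_series(x / math.sqrt(2.0)))


def skew_normal_pdf(x, location, scale, shape):
    """Equation 1, with scale = omega (not omega squared)."""
    z = (x - location) / scale
    return 2.0 / scale * std_normal_pdf(z) * std_normal_cdf(shape * z)


def mixture_pdf(x, modes):
    """Weighted average of unimodal densities; modes = [(weight, location, scale, shape)]."""
    total_weight = sum(w for w, _, _, _ in modes)
    return sum(w * skew_normal_pdf(x, loc, sc, sh) for w, loc, sc, sh in modes) / total_weight


print(skew_normal_pdf(0.0, 0.0, 1.0, 0.0))
```

With a shape of 0 the skew-normal density reduces to the ordinary normal density, which gives a quick sanity check on the implementation.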
Probability densities for Bayes’ Theorem
In preparation for Bayes’ Theorem, I create 6 probability distributions - 3 for sleep onset
and 3 for wake time. For sleep onset:
1. P(A) Prior probability of sleep onset at minute m
2. P(B) Probability distribution of activity dropoff values around typical sleep onset times
(i.e. within two standard deviations of mean sleep onset of my calibration data)
3. P(B|A) Probability distribution of activity dropoff values given sleep onset at minute m
For wake time:
1. P(A) Prior probability of wake time at minute m
2. P(B) Probability distribution of activity dropoff values around typical wake times (i.e.
within two standard deviations of mean wake time of my calibration data)
3. P(B|A) Probability distribution of activity dropoff values given wake time at minute m
The probability density values are generated using the PDF function and estimated
parameters. How I obtained these estimates is discussed in the Parameter Estimation section.
Distribution  Sleep/Wake  Location       Scale        Shape
P(A)          Sleep       1395.786155    3425.128674  -0.5261442611
P(B)          Sleep       -3457.665578   86337243.64  3.109131458
P(B|A)        Sleep       18615.14849    13082017.55  -0.7147779391
P(A)          Wake        449.5900616    2353.155443  -0.6673950005
P(B)          Wake        1.459454871    33167983.61  -0.6477603423
P(B|A)        Wake        -22911.05368   55072594.1   27.85464787
Sleep probability densities
Wake probability densities

Bayes’ Theorem
These probability densities are combined using Bayes’ Theorem to form two posterior
distributions P(A|B): one for sleep onset, the other for wake time.
5) $P(A \mid B) = \frac{P(B \mid A)\,P(A)}{P(B)}$
Posterior probability density for sleep onset
Posterior probability density for wake time
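Applied minute by minute, equation 5 amounts to the following Python sketch. It is an illustration rather than the do-file code. The function name `posterior_density` is hypothetical, and returning 0 where P(B) is zero is an assumption made here to avoid division by zero.

```python
def posterior_density(prior, likelihood, evidence):
    """P(A|B) = P(B|A) * P(A) / P(B), evaluated for each minute."""
    return [
        (l * p / e) if e > 0 else 0.0   # assumption: zero evidence gives zero posterior
        for p, l, e in zip(prior, likelihood, evidence)
    ]


posterior = posterior_density(
    prior=[0.2, 0.5],        # P(A) at two example minutes
    likelihood=[0.5, 0.1],   # P(B|A) at the same minutes
    evidence=[0.25, 0.0],    # P(B) at the same minutes
)
print(posterior)
```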

Combining posterior distributions
Earlier versions of the actigraph algorithm incorporated multiple posterior distributions
generated from predictive variables other than activity dropoff, including a measure of
regression discontinuity obtained using a modified version of the Quandt Likelihood Ratio
statistic. The current version uses only one predictive variable, making this step obsolete. I
leave this as an option if you later decide to include another predictive variable.
Expected value calculations
Sleep onset and wake time are calculated using a weighted average of possible sleep
onset and wake times where the weight is the value of the posterior distribution at minute m.
6) $E(\text{sleep onset}) = \frac{\sum_{m=-\infty}^{\infty} w_m \cdot m}{\sum_{m=-\infty}^{\infty} w_m}$
where m is the value of the time variable at the current minute and $w_m$ is the value of the
posterior distribution at the current minute. Wake time is calculated using the same formula, but
using weights from the wake time posterior rather than the sleep onset posterior.
Summing from negative infinity to infinity is obviously impossible, so I trimmed the
summation range to within 3 standard deviations of mean sleep onset and wake times. These
means and standard deviations were obtained from my dataset, which I call the ‘calibration
data’. More on these in the Parameter Estimation section.
The output of these expected value calculations is the estimated sleep onset and wake
time. The algorithm repeats these expected value calculations for each day for which there is
data.
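The trimmed weighted average can be sketched in Python as below. This is only an illustration. The function name `expected_time` and its arguments are hypothetical, and returning `None` when every weight in the window is zero is an assumption of this sketch.

```python
def expected_time(times, posterior, mean, sd, n_sd=3):
    """Posterior-weighted average of `times`, trimmed to mean +/- n_sd * sd."""
    lower, upper = mean - n_sd * sd, mean + n_sd * sd
    pairs = [(m, w) for m, w in zip(times, posterior) if lower <= m <= upper]
    total_weight = sum(w for _, w in pairs)
    if total_weight == 0:
        return None  # assumption: no estimate when the window carries no weight
    return sum(m * w for m, w in pairs) / total_weight


# The large weight at minute 2000 lies outside the window and is ignored
estimate = expected_time(
    times=[1390, 1400, 1410, 2000],
    posterior=[1, 2, 1, 100],
    mean=1400,
    sd=10,
)
print(estimate)
```

The example shows why the trimming matters: without it, a stray posterior value far from typical sleep onset would dominate the average.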

Parameter Estimation
Each probability distribution used in the Bayesian updating step required parameters
which I estimated using data from participants pid1001 through pid1019. Sleep onset and wake
times were initially estimated by Alosias. I know, not as reliable as a PSG (no offense to what
I’m sure are Alosias’ acute estimation abilities), but I had to start somewhere.
I set sleep onset and wake times within a 10 minute search range of Alosias’ estimates.
Within that 10 minute search range, my calibration program looked for the greatest activity
dropoff value for sleep onset or the least (most negative) activity dropoff value for wake time.
These were used as the sample data for estimating parameters for the sleep onset prior P(A),
the wake time prior P(A), the activity dropoff value given sleep onset P(B|A), and the activity
dropoff value given wake time P(B|A).
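The search described above can be sketched in Python as follows. This is not the calibration program itself. It assumes the "10 minute search range" means within 10 minutes on either side of the rater's estimate, and the function name `refine_estimate` is hypothetical.

```python
def refine_estimate(times, dropoff, rater_time, kind, radius=10):
    """Return the minute near `rater_time` with the extreme activity dropoff.

    kind="sleep" picks the greatest dropoff; kind="wake" picks the least.
    """
    candidates = [
        (d, m)
        for m, d in zip(times, dropoff)
        if d is not None and abs(m - rater_time) <= radius  # assumption: +/- radius
    ]
    if not candidates:
        return None
    if kind == "sleep":
        return max(candidates)[1]
    if kind == "wake":
        return min(candidates)[1]
    raise ValueError("kind must be 'sleep' or 'wake'")


times = list(range(1390, 1411))
sleep_dropoff = [-(m - 1403) ** 2 for m in times]   # toy values peaking at 1403
wake_dropoff = [(m - 1396) ** 2 for m in times]     # toy values bottoming out at 1396
print(refine_estimate(times, sleep_dropoff, 1400, "sleep"))
print(refine_estimate(times, wake_dropoff, 1400, "wake"))
```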
Parameters for the probability density functions of activity dropoff P(B) at sleep onset
and wake time were obtained using samples of activity dropoff around mean sleep onset and
wake times respectively. More technically, ‘around’ mean sleep onset and wake times means
within 2 standard deviations.
Each of the estimated distributions was unimodal. I estimated the location (epsilon),
scale (omega^2), and shape (alpha) parameters using the following equations:
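Sample skew
7) $\gamma = \frac{n\sqrt{n-1}}{n-2} \cdot \frac{\sum_{i=1}^{n} (x_i - \bar{x})^{3}}{\left(\sum_{i=1}^{n} (x_i - \bar{x})^{2}\right)^{3/2}}$
Delta
8) $|\delta| = \sqrt{\frac{\pi}{2} \cdot \frac{|\gamma|^{2/3}}{|\gamma|^{2/3} + \left((4-\pi)/2\right)^{2/3}}}$
Shape
9) $\alpha = \frac{\delta}{\sqrt{1-\delta^{2}}}$
Scale
10) $\omega^{2} = \frac{\sigma^{2}}{1 - \frac{2\delta^{2}}{\pi}}$
Location
11) $\varepsilon = \bar{x} - \omega\,\delta\,\sqrt{\frac{2}{\pi}}$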
Using these equations, I calculated the three parameters (location, scale, and shape) for
each of the probability distribution functions. The do files in which I implemented these
equations, as well as the original data, can be found in the Calibration folder.
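For readers without Stata, the method-of-moments calculation can be sketched in Python. This is a hedged illustration, not the code in the Calibration folder. It assumes the variance is the sample variance with an n - 1 denominator, gives delta the sign of the sample skew, and caps |delta| at 0.999 so that the shape stays finite when the sample skew exceeds what a skew-normal distribution can represent. None of these choices are stated in the text, and the function name `fit_skew_normal` is hypothetical.

```python
import math


def fit_skew_normal(sample):
    """Return (location, scale squared, shape) estimated from sample moments."""
    n = len(sample)
    if n < 3:
        raise ValueError("need at least 3 observations")
    mean = sum(sample) / n
    m2 = sum((x - mean) ** 2 for x in sample)
    m3 = sum((x - mean) ** 3 for x in sample)
    if m2 == 0:
        raise ValueError("sample has no variation")
    skew = n * math.sqrt(n - 1) / (n - 2) * m3 / m2 ** 1.5
    variance = m2 / (n - 1)                       # assumption: n - 1 denominator
    g = abs(skew) ** (2 / 3)
    c = ((4 - math.pi) / 2) ** (2 / 3)
    delta = math.sqrt(math.pi / 2 * g / (g + c))
    delta = min(delta, 0.999)                     # assumption: keep |delta| below 1
    delta = math.copysign(delta, skew)            # delta takes the sign of the skew
    shape = delta / math.sqrt(1 - delta ** 2)
    scale_sq = variance / (1 - 2 * delta ** 2 / math.pi)
    location = mean - math.sqrt(scale_sq) * delta * math.sqrt(2 / math.pi)
    return location, scale_sq, shape


print(fit_skew_normal([1, 2, 3, 4, 5]))
```

For a symmetric sample the skew is zero, so the shape is zero and the location and squared scale reduce to the sample mean and variance.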

Accuracy
For a rough test of accuracy, I compared the output of my algorithm with the Alosias
method. I define agreement and disagreement as follows:
● For the algorithm and Alosias to agree on a sleep onset time, the algorithm’s sleep onset
time must be within 10 minutes of Alosias’ time in bed or sleep onset estimation
● For the algorithm and Alosias to agree on a wake time, the algorithm’s wake time must
be within 10 minutes of Alosias’ wake time estimation
● For the algorithm and Alosias to disagree on a sleep onset time, the algorithm’s sleep
onset time must be greater than 20 minutes away from Alosias’ time in bed or sleep
onset estimation
● For the algorithm and Alosias to disagree on a wake time, the algorithm’s wake time
must be greater than 20 minutes away from Alosias’ wake time estimation
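These definitions can be written as a small Python function. It is a sketch under two assumptions: "within 10 minutes" includes exactly 10, and for sleep onset the distance is measured to whichever of Alosias’ two estimates (time in bed or sleep onset) is closer. Estimates between 10 and 20 minutes away count as neither agreement nor disagreement. The function name `classify` is hypothetical.

```python
def classify(algorithm_time, rater_times, agree_within=10, disagree_beyond=20):
    """Label one algorithm estimate against one or more rater estimates (in minutes)."""
    gap = min(abs(algorithm_time - t) for t in rater_times)
    if gap <= agree_within:
        return "agree"
    if gap > disagree_beyond:
        return "disagree"
    return "neither"


# Sleep onset: compare against both time in bed (1397) and sleep onset (1404)
print(classify(1400, [1397, 1404]))
# Wake time: compare against the single wake time estimate
print(classify(1900, [1870]))
```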
The comparison of my algorithm with Alosias’ estimates for participants pid1001 through
pid1019 can be summarized as:
             Agree   Disagree
Sleep onset  82.3%   10.4%
Wake time    78.4%   8%
Total        80.4%   9.2%
Comparison to Cole-Kripke
Specifications of Cole-Kripke
The Cole-Kripke algorithm is a specific instance of the general sleep onset/wake time
algorithms used by ActiLife. Each minute (or whatever time frame is designated as one ‘epoch’)
is scored as either asleep or awake. ActiLife takes a weighted sum of the Vector Magnitude
(represented by A) at the current minute m, as well as the surrounding minutes (say m-n through
m+p), and compares it to a designated threshold value. If the weighted sum of the activity is
below the threshold, it counts that minute as asleep. If the weighted sum is above the threshold,
it counts that minute as awake.
12) if $\sum_{i=m-n}^{m+p} w_i A_i < \text{threshold}$, participant is asleep
Cole-Kripke specifies both a method of clipping the Vector Magnitude and the weights
for this algorithm. In the original Cole-Kripke paper, the Vector Magnitude was clipped by first
summing the activity level during every 10-second period. The maximum 10-second sum
within each minute is then used as the new value of that minute’s Vector Magnitude. Finally, the new
Vector Magnitude values are fed into the weighted sum function as follows:
13) $D = 0.00001\,(404A_{-4} + 598A_{-3} + 326A_{-2} + 441A_{-1} + 1408A_{0} + 598A_{1} + 350A_{2})$
The sleep/wake threshold was set at 1. That is, if D < 1, the participant was asleep.
Otherwise, the participant was scored as awake.
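The scoring rule can be sketched in Python as below. This is an illustration, not ActiLife's implementation. The weights are copied from equation 13 as printed in this document and should be checked against the original Cole-Kripke paper before use. Treating minutes outside the recording as zero activity is an assumption of this sketch, and the function name `cole_kripke_scores` is hypothetical.

```python
# Weights keyed by offset from the current minute, as printed in equation 13
WEIGHTS = {-4: 404, -3: 598, -2: 326, -1: 441, 0: 1408, 1: 598, 2: 350}
SCALE = 0.00001


def cole_kripke_scores(activity, weights=WEIGHTS, scale=SCALE, threshold=1.0):
    """Return True for each minute scored asleep (D < threshold)."""
    n = len(activity)
    asleep = []
    for m in range(n):
        weighted_sum = 0.0
        for offset, w in weights.items():
            i = m + offset
            if 0 <= i < n:  # assumption: minutes outside the recording add nothing
                weighted_sum += w * activity[i]
        asleep.append(scale * weighted_sum < threshold)
    return asleep


# A single burst of activity at minute 5 in an otherwise still recording
activity = [0] * 10
activity[5] = 100
scores = cole_kripke_scores(activity)
print(scores)
```

In this toy recording only the minute containing the burst is scored awake, because the weight on the current minute is much larger than the weights on its neighbours.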
ActiLife uses a modified version of Cole-Kripke to estimate sleep onset and wake time.
Sleep onset is estimated as the first minute scored as asleep at night. The ActiLife manual does
not specify how wake time is determined, but from the graphs they provide it appears they use
the same method to estimate wake time. That is, the last minute scored as asleep in a given
morning is considered to be the wake time.
Comparison
I unfortunately only have one participant’s Cole-Kripke estimation to compare against
Alosias’ estimates, pid1004. Using the same metrics of agreement and disagreement, I
compared Cole-Kripke and Alosias’ estimates:
             Agree   Disagree
Sleep onset  66.7%   0%
Wake time    33.3%   55.6%
Total        50%     27.8%
For the same participant, my algorithm produced the following level of agreement with
Alosias’ estimates:
             Agree   Disagree
Sleep onset  88.9%   0%
Wake time    55.6%   22.2%
Total        72.2%   11.1%
That is, my algorithm produces higher levels of agreement and lower levels of
disagreement. This is promising, although one participant (9 data points) certainly isn’t enough
to establish that my algorithm agrees with Alosias’ estimates more than Cole-Kripke.

Caveats and Future Work
There are at least four reasons to be wary of how accurate my algorithm will be going
forward. First, the data sets I used for calibrating the location, scale, and shape parameters are
the same data sets I used in calculating the accuracy of my algorithm. Ideally I would test
accuracy using out-of-sample data. Second, the probability distribution parameters might
change systematically between participant groups. For example, if sleep patterns change based
on weather, the prior sleep onset and prior wake time functions would need to be recalibrated.
Third, what I call ‘sleep onset’ is not actual sleep onset as defined in sleep research literature.
Rather, it’s the time point at which activity drops off most sharply, which is likely to be in the
short time window between time in bed and actual sleep onset.
Finally, the probability distribution parameters might (and hopefully should) change
between treatment and control groups. But calculating different priors for treatment and control
groups also has its own methodological problems. Namely, if we find priors for longer sleep time
for the treatment group, it will be difficult to distinguish whether any first stage effects are due to
treatment effects or simply the differences in priors. I think having a single prior for both
treatment and control groups is the lesser of two evils here.
Future work
Unfortunately I don’t have time this summer to distinguish between time in bed and sleep
onset, but if any RAs want to take on this task I can tell you how I would do it if I had the time:
1. Use the calibration algorithm to find parameters for time of sleep onset and time in bed
distributions relative to my estimated ‘sleep onset’
2. Use the calibration algorithm to find the activity dropoff distributions for time in bed and
actual sleep onset
3. Estimate my ‘sleep onset’
4. Use the parameters you found in (1) to create priors for time in bed and actual sleep
onset based on my ‘sleep onset’
5. Use Bayes’ Theorem to combine these priors with the probability density functions you’ll
obtain from the parameters you estimated in (2). This should approximate time in bed
and sleep onset
I would caution against an attempt to use regression discontinuity as an additional
estimation parameter along with activity dropoff. I spent far too much time trying to get a
modified version of the QLR statistic to predict sleep onset with only marginal success.
If sleep patterns change with weather or seasonally, use my calibration algorithm to
calibrate probability distribution parameters for priors that you can use at different times during
the year or in different weather conditions.
Most importantly, test this algorithm on out-of-sample data and compare it to
Cole-Kripke.
