Agenda
● Quick intro to Bayesian Structural Time Series
● Review of 2 toolkits: Prophet & BSTS
● Inference with time series: CausalImpact
Bayesian Structural Time Series (BSTS)
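Frequentist Time Series
● For example, Gaussian ARMA(p, q):
$Y_t = \underbrace{\sum_{i=1}^{p} \phi_i Y_{t-i}}_{\text{AR}} + \underbrace{\sum_{j=1}^{q} \theta_j \varepsilon_{t-j}}_{\text{MA}} + \varepsilon_t$
The “state” in Gaussian BSTS:
● Observation equation: $Y_t = \mu_t + \varepsilon_t$
● State equation: $\mu_{t+1} = \mu_t + \eta_t$
With IID $\varepsilon_t \sim N(0, \sigma^2)$ and $\eta_t \sim N(0, \tau^2)$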
To better understand the model, consider two extreme cases:
1. $\tau^2 = 0$ ⇒ we have IID noise; the best estimator for $Y_{t+1}$ is AVG(Y)
2. $\sigma^2 = 0$ ⇒ we get a random walk; the best estimator for $Y_{t+1}$ is $Y_t$
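The two extreme cases can be checked by simulation. This is a minimal sketch, assuming the simplest structural model (a level that drifts with variance τ² plus observation noise with variance σ²); the function names `simulate_local_level` and `one_step_mse` are made up for this illustration.

```python
import numpy as np

def simulate_local_level(n, sigma, tau, mu0=0.0, seed=0):
    """Simulate y_t = mu_t + eps_t, mu_{t+1} = mu_t + eta_t."""
    rng = np.random.default_rng(seed)
    eps = rng.normal(0.0, sigma, size=n)   # observation noise, sd = sigma
    eta = rng.normal(0.0, tau, size=n)     # level (state) noise, sd = tau
    # mu_0 = mu0, and each later level adds one more eta
    mu = mu0 + np.concatenate(([0.0], np.cumsum(eta[:-1])))
    return mu + eps, mu

def one_step_mse(y, predictor):
    """Mean squared error of predicting y[t+1] from the history y[:t+1]."""
    errs = [(y[t + 1] - predictor(y[: t + 1])) ** 2 for t in range(10, len(y) - 1)]
    return float(np.mean(errs))

# Case 1: tau = 0, the level never moves, so y is IID noise around mu0.
y_iid, mu_iid = simulate_local_level(500, sigma=1.0, tau=0.0, mu0=5.0)
# Case 2: sigma = 0, y equals the level, which is a random walk.
y_rw, mu_rw = simulate_local_level(500, sigma=0.0, tau=1.0, mu0=5.0)

mse_iid_mean = one_step_mse(y_iid, np.mean)            # predict with AVG(Y)
mse_iid_last = one_step_mse(y_iid, lambda h: h[-1])    # predict with Y_t
mse_rw_mean = one_step_mse(y_rw, np.mean)
mse_rw_last = one_step_mse(y_rw, lambda h: h[-1])

print("IID noise:   mean", mse_iid_mean, "last value", mse_iid_last)
print("Random walk: mean", mse_rw_mean, "last value", mse_rw_last)
```

On the IID series the running average should win; on the random walk the last value should win.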
Bayesian Structural Time Series (BSTS)
For a simple “one step back” structure, we can write the conditional distributions,
which helps show the hierarchical nature of the model:
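One plausible way to write these conditionals, assuming the local level model with observation variance σ² and state variance τ² (the priors on the variances are left generic here, since the slide does not name them):

```latex
\begin{aligned}
Y_t \mid \mu_t, \sigma^2 &\sim N(\mu_t, \sigma^2) \\
\mu_t \mid \mu_{t-1}, \tau^2 &\sim N(\mu_{t-1}, \tau^2) \\
\sigma^2 \sim p(\sigma^2), &\quad \tau^2 \sim p(\tau^2)
\end{aligned}
```

Each layer conditions only on the layer below it: data on the current state, the current state on the previous state, and the variances on their priors.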
Data
● Site visits from different sources + searches in
a search engine (rhymes with doodle?)
● We don’t have “years of data”
● Predictions / inference about source1.
Toolkit 1: Prophet
● Wrapper around Stan
● Maintained by Facebook’s Core Data Science team
● Has R & Python bindings
● Strong/opinionated defaults
Prophet mission statement
“Prophet is a procedure for forecasting time series
data based on an additive model where non-linear trends
are fit with yearly, weekly, and daily seasonality,
plus holiday effects. It works best with time series
that have strong seasonal effects and several seasons
of historical data. Prophet is robust to missing data
and shifts in the trend, and typically handles outliers
well.”
What is Prophet doing?
Fire a series of “heavy guns” at once:
● Piecewise linear trend (25 potential changepoints by default!)
● Automatic seasonality (week, year)
With a small effort you can add:
● Preloaded holidays
● More cycles of seasonality
● Extra regressors, added one at a time (no NA’s allowed)
● Custom breakpoints (changepoints)
And the rest is IID noise...
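To make the “piecewise trending” idea concrete, here is a minimal NumPy sketch of a continuous piecewise linear trend whose slope changes at a few breakpoints. It is an illustration only, not Prophet's code: the function name, the breakpoint locations, and the slope changes are all made up. Prophet additionally places a sparsity-inducing prior on the slope changes so that most of the candidate breakpoints end up unused.

```python
import numpy as np

def piecewise_linear_trend(t, k, m, changepoints, deltas):
    """Trend with base slope k, offset m, and slope change deltas[j] at changepoints[j]."""
    t = np.asarray(t, dtype=float)
    s = np.asarray(changepoints, dtype=float)
    d = np.asarray(deltas, dtype=float)
    # a[i, j] = 1 if changepoint j has already happened at time t[i]
    a = (t[:, None] >= s[None, :]).astype(float)
    slope = k + a @ d
    # Offset corrections keep the line continuous at every changepoint
    offset = m + a @ (-s * d)
    return slope * t + offset

t = np.linspace(0.0, 10.0, 101)
trend = piecewise_linear_trend(
    t, k=1.0, m=0.0, changepoints=[3.0, 7.0], deltas=[-0.5, 2.0]
)
print(trend[[0, 50, 100]])  # trend at t = 0, 5 and 10
```

The slope is 1.0 before t = 3, 0.5 between t = 3 and t = 7, and 2.5 afterwards, with no jumps at the breakpoints.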
Toolkit 2: bsts
● R package (only)
● Based on another R package - Boom (Bayesian Object
Oriented Modeling), maintained by the same author
● Long list of optional components for the time series:
seasonality, holidays, trends, AR structures, dynamic
regression, etc.
● Some versions break backwards compatibility
BSTS mission statement
“Our approach combines three statistical methods into an
integrated system we call “Bayesian Structural Time Series”
or BSTS for short:
1) A “basic structural model” for trend and seasonality,
estimated using Kalman filters
2) Spike and slab regression for variable selection
3) Bayesian model averaging over the best performing models
for the final forecast.”[1]
What is BSTS doing?
● Kalman Filter:
● “Spike & Slab”: Multivariate, simultaneous approach to variable selection.
We estimate the joint distribution of the coefficients together with a binary
inclusion vector z_t (so each time we draw z_t, the model changes!)
● Bayesian model averaging: How do we correctly calculate the posterior
distribution of y_{t+1} when it depends on different models? [2]
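The Kalman filter step can be sketched for the simplest structural model, y_t = mu_t + eps_t with mu_{t+1} = mu_t + eta_t. This is a minimal illustration, not the bsts implementation: the variances are treated as known (a full Bayesian fit would sample them), and the function name and the diffuse prior `p0` are assumptions made for the example.

```python
import numpy as np

def local_level_filter(y, sigma2, tau2, m0=0.0, p0=1e6):
    """Kalman filter for y_t = mu_t + eps_t, mu_{t+1} = mu_t + eta_t.

    Returns one-step-ahead predictions of y_t and filtered means of mu_t.
    """
    m, p = m0, p0                      # prior mean and variance of the level
    preds, filtered = [], []
    for obs in y:
        preds.append(m)                # prediction of y_t before seeing it
        gain = p / (p + sigma2)        # Kalman gain: how much to trust obs
        m = m + gain * (obs - m)       # update the level with the observation
        p = (1.0 - gain) * p           # posterior variance of the level
        filtered.append(m)
        p = p + tau2                   # the level drifts before the next step
    return np.array(preds), np.array(filtered)

# Synthetic data: a slowly drifting level observed with noise.
rng = np.random.default_rng(1)
level = 10.0 + np.cumsum(rng.normal(0.0, 0.3, size=200))
y = level + rng.normal(0.0, 1.0, size=200)

preds, filtered = local_level_filter(y, sigma2=1.0, tau2=0.09)
print("raw error:     ", np.mean((y - level) ** 2))
print("filtered error:", np.mean((filtered - level) ** 2))
```

With `sigma2=0` the gain is 1 and the filter simply returns the last observation (random walk); with `tau2=0` it converges to the running average (IID noise).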
Summary
| | Prophet | BSTS |
| Claim to fame | Simple & powerful (fire & forget) | Rich & complex (model selection) |
| Stability | Stable (for now?) | Changes from version to version (breaks code) |
| Defaults | Defaults for strong shrinkage, long & “stationary” series | Allows for shorter & complex situations and emphasizes causality (or, aggressive fit?) |
| Distributions | Gaussian only | Gaussian and (sometimes) Poisson |
And for the rest there’s Stan!
Toolkit 2.5: CausalImpact
● A “counterfactual” machine: what
would have happened if we did not
change anything?
● Fit a BSTS model on “before” data
● Compare actual “after” data to the
simulated counterfactual
● Super simple interface
● Easy to explain to business
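The counterfactual logic can be sketched in a few lines. This is not the CausalImpact API: ordinary least squares on one unaffected control series stands in for the BSTS model, the data are synthetic, and every name below is hypothetical.

```python
import numpy as np

rng = np.random.default_rng(42)
n, n_pre = 120, 90                       # 90 "before" days, 30 "after" days

# Synthetic data: a control series the change did not touch, and a target.
control = 100.0 + np.cumsum(rng.normal(0.0, 1.0, size=n))
target = 20.0 + 0.8 * control + rng.normal(0.0, 1.0, size=n)
true_lift = 5.0
target[n_pre:] += true_lift              # effect of the change, "after" only

# 1. Fit on "before" data only (OLS as a stand-in for BSTS).
X_pre = np.column_stack([np.ones(n_pre), control[:n_pre]])
beta, *_ = np.linalg.lstsq(X_pre, target[:n_pre], rcond=None)

# 2. Predict the counterfactual: "after" as if nothing had changed.
X_post = np.column_stack([np.ones(n - n_pre), control[n_pre:]])
counterfactual = X_post @ beta

# 3. Compare actual "after" data to the counterfactual.
pointwise_effect = target[n_pre:] - counterfactual
avg_effect = float(pointwise_effect.mean())
cumulative_effect = float(pointwise_effect.sum())

print("average effect:   ", avg_effect)
print("cumulative effect:", cumulative_effect)
```

A BSTS-based version would replace the point prediction in step 2 with posterior simulations, which is what gives credible intervals around the estimated effect.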
Anyone interested in building something similar for RStan?
Sources:
[1] Scott SL, Varian HR. Bayesian Variable Selection for Nowcasting Economic Time Series. In: Economic Analysis of the Digital Economy. University of Chicago Press; 2015. https://www.nber.org/chapters/c12995.pdf
[2] For more details on BMA see https://wwwlegacy.stat.washington.edu/www/research/online/hoeting1999.pdf