Introducing
Time Series
BigML Spring 2017 Release
BigML, Inc, Spring Release Webinar - July 2017
Spring 2017 Release
Speaker: CHEE SING LEE Ph.D. - Machine Learning Engineer
Moderator: ATAKAN CETINSOY - VP of Predictive Applications
Questions: Enter questions into chat box – We will answer some via chat; others at the end of the session
Resources: https://bigml.com/releases/spring-2017
Contact: info@bigml.com
Twitter: @bigmlcom
Machine Learning Data
Color Mass PPAP
red 11 pen
green 45 apple
red 53 apple
yellow 0 pen
blue 2 pen
green 422 pineapple
yellow 555 pineapple
blue 7 pen
Discovering patterns within data:
• Color = “red” → Mass < 100
• PPAP = “pineapple” → Color ≠ “blue”
• Color = “blue” → PPAP = “pen”
Machine Learning Data
Color Mass PPAP
green 45 apple
blue 2 pen
green 422 pineapple
blue 7 pen
yellow 0 pen
yellow 555 pineapple
red 53 apple
red 11 pen
Patterns still hold when rows are re-arranged:
• Color = “red” → Mass < 100
• PPAP = “pineapple” → Color ≠ “blue”
• Color = “blue” → PPAP = “pen”
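The claim that row order does not matter for this kind of data can be checked directly. The sketch below encodes the three patterns as implications over the eight rows of the Color / Mass / PPAP table and tests them before and after a shuffle. The function name `patterns_hold` and the fixed shuffle seed are illustrative choices, not part of the webinar.

```python
import random

# The eight rows of the Color / Mass / PPAP table.
rows = [
    ("red", 11, "pen"),
    ("green", 45, "apple"),
    ("red", 53, "apple"),
    ("yellow", 0, "pen"),
    ("blue", 2, "pen"),
    ("green", 422, "pineapple"),
    ("yellow", 555, "pineapple"),
    ("blue", 7, "pen"),
]


def patterns_hold(table):
    """Return True if all three slide patterns hold for every row."""
    return all(
        (color != "red" or mass < 100)                   # red -> mass < 100
        and (ppap != "pineapple" or color != "blue")     # pineapple -> not blue
        and (color != "blue" or ppap == "pen")           # blue -> pen
        for color, mass, ppap in table
    )


shuffled = rows[:]
random.Random(0).shuffle(shuffled)  # any permutation gives the same answer

print(patterns_hold(rows), patterns_hold(shuffled))
```

Each pattern is a per-row condition, so a permutation of the rows cannot change the result.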
Time Series Data
Year Pineapple Harvest (tons)
1986 50.74
1987 22.03
1988 50.69
1989 40.38
1990 29.80
1991 9.90
1992 73.93
1993 22.95
1994 139.09
1995 115.17
1996 193.88
1997 175.31
1998 223.41
1999 295.03
2000 450.53
[Figure: Pineapple Harvest, tons (0 to 500) by year (1986 to 2000), annotated with Trend and Error]
Time Series Data
Year Pineapple Harvest (tons)
1986 139.09
1987 175.31
1988 9.91
1989 22.95
1990 450.53
1991 73.93
1992 40.38
1993 22.03
1994 295.03
1995 50.74
1996 29.80
1997 223.41
1998 115.17
1999 193.88
2000 50.69
[Figure: Pineapple Harvest, tons (0 to 500) by year (1986 to 2000), for the rearranged rows]
Rearranging Disrupts Patterns
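One rough way to see the disruption is to measure how strongly harvest correlates with year in each ordering. The sketch below uses the two pineapple harvest tables from the slides; the Pearson correlation is only an illustrative stand-in for "trend" and is not a measure the webinar itself uses.

```python
years = list(range(1986, 2001))

# Harvest values in their true time order.
original = [50.74, 22.03, 50.69, 40.38, 29.80, 9.90, 73.93, 22.95,
            139.09, 115.17, 193.88, 175.31, 223.41, 295.03, 450.53]

# The same kind of values assigned to years in the rearranged order.
rearranged = [139.09, 175.31, 9.91, 22.95, 450.53, 73.93, 40.38, 22.03,
              295.03, 50.74, 29.80, 223.41, 115.17, 193.88, 50.69]


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


print(round(pearson(years, original), 2))    # strong upward relationship
print(round(pearson(years, rearranged), 2))  # close to zero
```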
Machine Learning Forecasts
if (color = “red” and mass = 11) then PPAP = ?
Extrapolating learned patterns to novel inputs
Machine Learning Forecasts
if (color = “red” and mass = 11) then PPAP = [“apple”; 34.24%]
Forecasts are accompanied by a measure of uncertainty or confidence
Time Series Forecasts
Forecasts: Extrapolating learned patterns to future times
Forecasts are accompanied by a measure of uncertainty or confidence
Linear Versus Random Split
[Two panels, Random Split and Linear Split, each showing the same table]
Year Pineapple Harvest
1986 139.09
1987 175.31
1988 9.91
1989 22.95
1990 450.53
1991 73.93
1992 40.38
1993 22.03
1994 295.03
1995 115.17
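The difference between the two splits can be sketched in a few lines. A linear split holds out the last rows in time order, so the test set lies strictly after the training set; a random split samples held-out rows from anywhere in the series. The function names, the 80% training fraction, and the seed are assumptions made for illustration.

```python
import random

# The ten (year, harvest) rows shown on the slide.
rows = [(1986, 139.09), (1987, 175.31), (1988, 9.91), (1989, 22.95),
        (1990, 450.53), (1991, 73.93), (1992, 40.38), (1993, 22.03),
        (1994, 295.03), (1995, 115.17)]


def linear_split(table, train_fraction=0.8):
    """Train on the earliest rows, test on the most recent ones."""
    cut = int(len(table) * train_fraction)
    return table[:cut], table[cut:]


def random_split(table, train_fraction=0.8, seed=0):
    """Train and test rows are drawn from anywhere in the series."""
    indices = list(range(len(table)))
    random.Random(seed).shuffle(indices)
    cut = int(len(table) * train_fraction)
    train_idx, test_idx = sorted(indices[:cut]), sorted(indices[cut:])
    return [table[i] for i in train_idx], [table[i] for i in test_idx]


train, test = linear_split(rows)
print([year for year, _ in test])   # the two latest years

r_train, r_test = random_split(rows)
print([year for year, _ in r_test])  # two years from anywhere in the range
```

For forecasting, only the linear split mimics the real task of predicting values that come after everything the model has seen.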
Time Series Smoothing
Smoothing techniques remove the
“noise” in a time series, so the
underlying “signal” can be observed
Time Series Smoothing
[Figure: Weight (0 to 0.2) versus Lag (1 to 13)]
Exponential Smoothing
[Figure: Weight (0 to 0.2) versus Lag (1 to 13)]
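The weight-versus-lag plots can be reproduced numerically. The sketch below assumes the textbook definitions: a moving average gives equal weight 1/window to each lag inside the window, and simple exponential smoothing gives lag k the weight α(1 − α)^(k − 1). The window of 5 and α = 0.2 are illustrative values, not read off the slides.

```python
def moving_average_weights(window, max_lag):
    """Uniform weight inside the window, zero outside it."""
    return [1.0 / window if lag <= window else 0.0
            for lag in range(1, max_lag + 1)]


def exponential_weights(alpha, max_lag):
    """Weights decay geometrically with lag and never reach zero."""
    return [alpha * (1 - alpha) ** (lag - 1)
            for lag in range(1, max_lag + 1)]


ma = moving_average_weights(window=5, max_lag=13)
ew = exponential_weights(alpha=0.2, max_lag=13)

for lag, (w_ma, w_ew) in enumerate(zip(ma, ew), start=1):
    print(lag, round(w_ma, 3), round(w_ew, 3))
```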
Smoothing Compared
Moving Average                      Exponential Smoothing
Sum over fixed-width window         Sum over entire series history
Uniformly weighted                  Exponentially weighted
1 integer parameter: window size    2 continuous parameters: initial state and smoothing coefficient
• Continuous parameters → Fine-tuned optimization
• State-space model allows for error estimation
• Extensions for trend and seasonality
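A minimal sketch of the two smoothers, showing the parameters named in the comparison: the moving average takes one integer (`window`), while simple exponential smoothing takes two continuous values (`initial_state` and `alpha`). The function names are illustrative, and this is the basic textbook form rather than BigML's implementation.

```python
def moving_average(series, window):
    """Mean of the last `window` values, for each position where it is defined."""
    return [sum(series[i - window + 1:i + 1]) / window
            for i in range(window - 1, len(series))]


def exponential_smoothing(series, alpha, initial_state):
    """Blend each new value into a running level using coefficient alpha."""
    level = initial_state
    smoothed = []
    for y in series:
        level = alpha * y + (1 - alpha) * level
        smoothed.append(level)
    return smoothed


data = [1, 2, 3, 4, 5]
print(moving_average(data, window=3))
print(exponential_smoothing(data, alpha=0.5, initial_state=1))
```

Because `alpha` and `initial_state` are continuous, they can be tuned by a numeric optimizer, which is the first advantage listed above.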
Trend
• Change or slope between consecutive level values
• Smoothed exponentially like level, coefficient 𝛽
• Affects overall trajectory of time series.
[Figure: Additive Trend; y (0 to 50) versus Time (Apr to Jul)]
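As a sketch of how an additive trend is smoothed alongside the level, the code below follows the standard Holt linear-trend recursions, with `alpha` for the level and `beta` for the trend. The initialization (first value as level, first difference as trend) is a simple assumption for illustration; a fitted model would estimate the initial state.

```python
def holt_linear(series, alpha, beta):
    """Return the final (level, trend) after smoothing the whole series."""
    level = series[0]
    trend = series[1] - series[0]
    for y in series[1:]:
        previous_level = level
        # Level: blend the new value with the one-step-ahead prediction.
        level = alpha * y + (1 - alpha) * (level + trend)
        # Trend: blend the latest change in level with the previous trend.
        trend = beta * (level - previous_level) + (1 - beta) * trend
    return level, trend


def forecast(level, trend, horizon):
    """Additive trend: extend the last level along a straight line."""
    return level + horizon * trend


level, trend = holt_linear([5, 8, 11, 14, 17], alpha=0.5, beta=0.3)
print(forecast(level, trend, horizon=2))
```

On a perfectly linear series the recursions track the line exactly, so the two-step forecast continues it.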
Trend
[Figure: Multiplicative Trend; y (0 to 200) versus Time (Apr to Jul)]
Trend
[Figure: Multiplicative Damped Trend; y (0 to 100) versus Time (Apr to Sep)]
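The trend types differ in how the last level is extended into the future. The sketch below uses the usual textbook forecast functions: an additive trend adds the slope at each step, a multiplicative trend multiplies by a growth factor, and damping with a factor `phi` below 1 shrinks each further step so the curve flattens. The function name and the parameter `phi` are illustrative; the slides do not name the damping coefficient.

```python
def trend_forecast(level, trend, horizon, kind="additive", phi=1.0):
    """Point forecast `horizon` steps ahead; phi < 1 damps the trend."""
    # With phi == 1 this sum equals the horizon (no damping).
    damping = sum(phi ** k for k in range(1, horizon + 1))
    if kind == "additive":
        return level + damping * trend
    if kind == "multiplicative":
        return level * trend ** damping
    raise ValueError(f"unknown trend kind: {kind}")


for h in range(1, 7):
    print(
        h,
        round(trend_forecast(10, 2, h), 2),                               # straight line
        round(trend_forecast(10, 1.5, h, "multiplicative"), 2),           # exponential growth
        round(trend_forecast(10, 1.5, h, "multiplicative", phi=0.7), 2),  # growth that levels off
    )
```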
Seasonality
• Seasonality is a recurring pattern of variation that occurs over periods of consistent length
• Examples:
    • Monthly A/C sales: higher in summer months and lower in winter months; repeats with a period of 12 data points
    • Vehicle traffic in an office district: higher during the workweek, lower during the weekend; period of 7 data points
• May be present in time series with and without trend
• Seasonality is also smoothed exponentially, coefficient 𝛾
Seasonality
[Figure: two panels, Additive and Multiplicative; y versus Time (1 to 19)]
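The difference between the two seasonality types shows up in the size of the seasonal swing. In the sketch below, an additive pattern is added to a rising baseline, so the swing stays the same in every period; a multiplicative pattern scales the baseline, so the swing grows with the level. The period of 4, the baseline, and the seasonal patterns are made-up illustration values.

```python
def seasonal_series(n, level, slope, pattern, kind):
    """Rising baseline combined with a repeating seasonal pattern."""
    period = len(pattern)
    values = []
    for t in range(n):
        base = level + slope * t
        season = pattern[t % period]
        values.append(base + season if kind == "additive" else base * season)
    return values


def swing(values):
    """Peak-to-trough range within one period."""
    return max(values) - min(values)


additive = seasonal_series(20, 50, 5, [0, 10, 0, -10], "additive")
multiplicative = seasonal_series(20, 50, 5, [1.0, 1.2, 1.0, 0.8], "multiplicative")

print(swing(additive[:4]), swing(additive[-4:]))              # same swing in both periods
print(swing(multiplicative[:4]), swing(multiplicative[-4:]))  # swing grows with the level
```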
Error
• Variation in time series values not explained by trend or seasonality
• Can be additive or multiplicative (homoscedastic or heteroscedastic)
• Primarily affects forecast intervals.
Error
[Figure: two panels, Additive and Multiplicative; y versus Time (1 to 19)]
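A small simulation can illustrate the homoscedastic versus heteroscedastic distinction. The sketch assumes additive error enters as level + noise and multiplicative error as level × (1 + noise); the level range, noise scales, and seed are arbitrary illustration values. It compares the spread of the residuals in the second half of the series with the spread in the first half.

```python
import random
import statistics

rng = random.Random(42)
n = 400

# A steadily rising level, shared by both series.
levels = [10 + 190 * t / (n - 1) for t in range(n)]
shocks = [rng.gauss(0, 1) for _ in range(n)]

# Additive error: same noise scale everywhere (homoscedastic).
additive = [mu + 5.0 * e for mu, e in zip(levels, shocks)]
# Multiplicative error: noise scale proportional to the level (heteroscedastic).
multiplicative = [mu * (1 + 0.1 * e) for mu, e in zip(levels, shocks)]


def spread_ratio(series):
    """Residual spread in the second half divided by spread in the first half."""
    residuals = [y - mu for y, mu in zip(series, levels)]
    half = n // 2
    return statistics.pstdev(residuals[half:]) / statistics.pstdev(residuals[:half])


print(round(spread_ratio(additive), 2))        # near 1
print(round(spread_ratio(multiplicative), 2))  # well above 1
```

Wider residuals at higher levels translate into forecast intervals that widen with the level, which is why the error type mainly affects the intervals.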
ETS Nomenclature
We use a shorthand to refer to a specific error, trend, and seasonality
M,Ad,N = Multiplicative Error, Additive Damped Trend, No Seasonality
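The shorthand can be decoded mechanically. The sketch below maps each of the three comma-separated codes to a description, using the codes that appear in the model table (N, A, Ad, M, Md). The function name `describe_ets` and the exact wording of the descriptions are illustrative.

```python
ERROR = {"A": "Additive error", "M": "Multiplicative error"}
TREND = {
    "N": "No trend",
    "A": "Additive trend",
    "Ad": "Additive damped trend",
    "M": "Multiplicative trend",
    "Md": "Multiplicative damped trend",
}
SEASONALITY = {
    "N": "No seasonality",
    "A": "Additive seasonality",
    "M": "Multiplicative seasonality",
}


def describe_ets(shorthand):
    """Expand an 'error,trend,seasonality' shorthand such as 'M,Ad,N'."""
    error, trend, seasonality = (part.strip() for part in shorthand.split(","))
    return ERROR[error], TREND[trend], SEASONALITY[seasonality]


print(describe_ets("M,Ad,N"))
```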
Exponential Smoothing Models
                          Seasonality
Trend                     None            Additive        Multiplicative
None                      A,N,N   M,N,N   A,N,A   M,N,A   A,N,M   M,N,M
Additive                  A,A,N   M,A,N   A,A,A   M,A,A   A,A,M   M,A,M
Additive + Damped         A,Ad,N  M,Ad,N  A,Ad,A  M,Ad,A  A,Ad,M  M,Ad,M
Multiplicative            A,M,N   M,M,N   A,M,A   M,M,A   A,M,M   M,M,M
Multiplicative + Damped   A,Md,N  M,Md,N  A,Md,A  M,Md,A  A,Md,M  M,Md,M
Exponential Smoothing Models
• Some model types are excluded due to numeric instability
• Multiplicative error models are only available when the data is strictly positive
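The model grid and the positivity rule can be sketched as follows. The code enumerates every error, trend, and seasonality combination from the table (2 × 5 × 3 = 30) and drops the multiplicative-error models when the series contains a value that is not strictly positive. It does not model the exclusions for numeric instability, since the slides do not say which models those are; the function names are illustrative and this is not BigML's actual selection logic.

```python
from itertools import product

ERRORS = ["A", "M"]
TRENDS = ["N", "A", "Ad", "M", "Md"]
SEASONALITIES = ["N", "A", "M"]


def all_ets_models():
    """Every error,trend,seasonality shorthand in the table."""
    return [f"{e},{t},{s}" for e, t, s in product(ERRORS, TRENDS, SEASONALITIES)]


def candidate_models(series):
    """Drop multiplicative-error models unless the data is strictly positive."""
    strictly_positive = all(y > 0 for y in series)
    return [m for m in all_ets_models()
            if strictly_positive or not m.startswith("M,")]


print(len(all_ets_models()))
print(len(candidate_models([50.74, 22.03, 50.69])))  # all values positive
print(len(candidate_models([0.0, 22.03, 50.69])))    # contains a zero
```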
Model Performance Metrics
AIC
• Akaike Information Criterion
• It measures the trade-off between the goodness-of-fit and the model’s complexity.
• The smaller the AIC, the better.
AICc
• Corrected Akaike Information Criterion
• It introduces a correction element for smaller datasets.
• For bigger datasets, the AIC and AICc tend to yield similar results.
BIC
• Schwarz Bayesian Information Criterion
• It penalizes the model’s complexity more heavily.
• The lower the BIC, the better.
R-squared
• R squared
• It measures the model’s errors compared to the objective field actual values.
• The higher the R squared, the better.
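For reference, the four metrics can be written out using their standard textbook definitions, where `log_likelihood` is the fitted model's log-likelihood, `k` the number of estimated parameters, and `n` the number of observations. These are the general formulas; the exact values BigML reports may be computed with different conventions.

```python
import math


def aic(log_likelihood, k):
    """Akaike Information Criterion: fit penalized by parameter count."""
    return 2 * k - 2 * log_likelihood


def aicc(log_likelihood, k, n):
    """AIC with a small-sample correction that vanishes as n grows."""
    return aic(log_likelihood, k) + (2 * k * (k + 1)) / (n - k - 1)


def bic(log_likelihood, k, n):
    """Bayesian Information Criterion: penalty grows with log(n)."""
    return k * math.log(n) - 2 * log_likelihood


def r_squared(actual, predicted):
    """1 minus the ratio of residual to total sum of squares."""
    mean = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean) ** 2 for a in actual)
    return 1 - ss_res / ss_tot


print(aic(-100.0, 3), round(aicc(-100.0, 3, 50), 2), round(bic(-100.0, 3, 50), 2))
print(r_squared([1, 2, 3], [1, 2, 3]))
```

With the same fit, BIC exceeds AIC once ln(n) is greater than 2, which is the sense in which it penalizes complexity more heavily.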
Time Series Workflow in BigML
[Workflow diagram with nodes: SOURCE, DATASET, TRAIN, TEST, TIME SERIES, EVALUATION, FORECAST]
Predictive Maintenance
NASA Battery Prognostics Data
Demo #1
Forecasting with Seasonality
Eurostat Agricultural Production Data
Demo #2
Calendar Correction
pounds/month ÷ days/month = pounds/day
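The correction above can be sketched in plain Python with the standard `calendar` module. Dividing each monthly total by the number of days in that month removes the variation caused only by month length, including leap-year Februaries. The function name `per_day` and the sample totals are illustrative; the webinar itself performs this adjustment with WhizzML.

```python
import calendar


def per_day(year, month, monthly_total):
    """Convert a monthly total into a daily rate for that calendar month."""
    days_in_month = calendar.monthrange(year, month)[1]
    return monthly_total / days_in_month


# The same daily rate produces different monthly totals.
print(per_day(2017, 1, 310.0))  # January, 31 days
print(per_day(2017, 2, 280.0))  # February, 28 days
print(per_day(2016, 2, 290.0))  # leap-year February, 29 days
```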
Calendar Adjustment with WhizzML
BigML Time Series
• Industry-standard exponential smoothing time series analysis and forecasting
• Familiar BigML workflow: source → dataset → timeseries → forecast
• BigML’s market-leading visualization and collaboration
• Can be trained on any numeric field
• Train multiple objective fields and ETS model types in parallel.
• Combined modeling and medium-horizon forecasting
• Integrated with WhizzML and API bindings
Learn about Time Series
https://bigml.com/releases/spring-2017
Questions?
@bigmlcom info@bigml.com
