Successfully reported this slideshow.
We use your LinkedIn profile and activity data to personalize ads and to show you more relevant ads. You can change your ad preferences anytime.

BigML Spring 2017 Release

428 views

Published on

Our Spring 2017 release brings Time Series, a supervised learning method for analyzing time based data when historical patterns can explain future behavior. This new resource is available from the BigML Dashboard, API, as well as from WhizzML for its automation. Time Series is commonly used for predicting stock prices, sales forecasting, website traffic, production and inventory analysis as well as weather forecasting, among many other use cases.

Published in: Data & Analytics
  • Be the first to comment

BigML Spring 2017 Release

  1. 1. Introducing Time Series BigML Spring 2017 Release
  2. 2. BigML, Inc 2Spring Release Webinar - July 2017 Spring 2017 Release CHEE SING LEE Ph.D. - Machine Learning Engineer Enter questions into chat box – We will answer some via chat; others at the end of the session https://bigml.com/releases/spring-2017 ATAKAN CETINSOY - VP of Predictive Applications Resources Moderator Speaker Contact info@bigml.com Twitter @bigmlcom Questions
  3. 3. BigML, Inc 3Spring Release Webinar - July 2017 Machine Learning Data Color Mass PPAP red 11 pen green 45 apple red 53 apple yellow 0 pen blue 2 pen green 422 pineapple yellow 555 pineapple blue 7 pen Discovering patterns within data: • Color = “red” Mass < 100 • PPAP = “pineapple” Color ≠ “blue” • Color = “blue” PPAP = “pen”
  4. 4. BigML, Inc 4Spring Release Webinar - July 2017 Color Number PPAP green 45 apple blue 2 pen green 422 pineapple blue 7 pen yellow 0 pen yellow 9 pineapple red 555 apple red 11 pen Patterns still hold when rows re-arranged: • Color = “red” Mass < 100 • PPAP = “pineapple” Color ≠ “blue” • Color = “blue” PPAP = “pen” Machine Learning Data
  5. 5. BigML, Inc 5Spring Release Webinar - July 2017 Time Series Data Year Pineapple Harvest 1986 50,74 1987 22,03 1988 50,69 1989 40,38 1990 29,80 1991 9,90 1992 73,93 1993 22,95 1994 139,09 1995 115,17 1996 193,88 1997 175,31 1998 223,41 1999 295,03 2000 450,53 Pineapple Harvest Tons 0 125 250 375 500 Year 1986 1988 1990 1992 1994 1996 1998 2000 Trend Error
  6. 6. BigML, Inc 6Spring Release Webinar - July 2017 Pineapple Harvest Tons 0 125 250 375 500 Year 1986 1988 1990 1992 1994 1996 1998 2000 Year Pineapple Harvest 1986 139,09 1987 175,31 1988 9,91 1989 22,95 1990 450,53 1991 73,93 1992 40,38 1993 22,03 1994 295,03 1995 50,74 1996 29,8 1997 223,41 1998 115,17 1999 193,88 2000 50,69 Rearranging Disrupts Patterns Time Series Data
  7. 7. BigML, Inc 7Spring Release Webinar - July 2017 if (color = “red” and mass = 11) then PPAP = ? Extrapolating learned patterns to novel inputs Machine Learning Forecasts
  8. 8. BigML, Inc 8Spring Release Webinar - July 2017 if (color = “red” and mass = 11) then PPAP = [“apple”;34.24%] Forecasts accompanied with measure of uncertainty or confidence Machine Learning Forecasts
  9. 9. BigML, Inc 9Spring Release Webinar - July 2017 Time Series Forecasts Forecasts: Extrapolating learned patterns to future times Forecasts accompanied with measure of uncertainty or confidence ?
  10. 10. BigML, Inc 10Spring Release Webinar - July 2017 Linear Versus Random Split Year Pineapple Harvest 1986 139,09 1987 175,31 1988 9,91 1989 22,95 1990 450,53 1991 73,93 1992 40,38 1993 22,03 1994 295,03 1995 115,17 Year Pineapple Harvest 1986 139,09 1987 175,31 1988 9,91 1989 22,95 1990 450,53 1991 73,93 1992 40,38 1993 22,03 1994 295,03 1995 115,17 Random Split Linear Split
  11. 11. BigML, Inc 11Spring Release Webinar - July 2017 Time Series Smoothing Smoothing techniques remove the “noise” in a time series, so the underlying “signal” can be observed
  12. 12. BigML, Inc 12Spring Release Webinar - July 2017 Time Series Smoothing Weight 0 0,05 0,1 0,15 0,2 Lag 1 2 3 4 5 6 7 8 9 10 11 12 13
  13. 13. BigML, Inc 13Spring Release Webinar - July 2017 Exponential Smoothing Weight 0 0,05 0,1 0,15 0,2 Lag 1 2 3 4 5 6 7 8 9 10 11 12 13
  14. 14. BigML, Inc 14Spring Release Webinar - July 2017 Smoothing Compared Moving Average Exponential Smoothing Sum over fixed-width window Sum over entire series history Uniformly Weighted Exponentially Weighted 1 integer parameter: window size 2 continuous parameters: initial state and smoothing coefficient • Continuous parameters → Fine-tuned optimization • State-space model allows for error estimation • Extensions for trend and seasonality
  15. 15. BigML, Inc 15Spring Release Webinar - July 2017 Trend • Change or slope between consecutive level values • Smoothed exponentially like level, coefficient 𝛽 • Affects overall trajectory of time series. y 0 12,5 25 37,5 50 Time Apr May Jun Jul Additive Trend
  16. 16. BigML, Inc 16Spring Release Webinar - July 2017 Trend y 0 50 100 150 200 Time Apr May Jun Jul Multiplicative Trend • Change or slope between consecutive level values • Smoothed exponentially like level, coefficient 𝛽 • Affects overall trajectory of time series.
  17. 17. BigML, Inc 17Spring Release Webinar - July 2017 Trend y 0 25 50 75 100 Time Apr May Jun Jul Aug Sep Multiplicative Damped Trend • Change or slope between consecutive level values • Smoothed exponentially like level, coefficient 𝛽 • Affects overall trajectory of time series.
  18. 18. BigML, Inc 18Spring Release Webinar - July 2017 Seasonality • Seasonality is a recurring pattern of variation that occurs over periods of consistent length • Examples: • Monthly A/C sales: higher in summer months and lower in winter months, repeats with a period of 12 data points • Vehicle traffic in an office district: higher during the workweek, lower during the weekend; period of 7 data points • May be present in time series with and without trend • Seasonality is also smoothed exponentially, coefficient 𝛾
  19. 19. BigML, Inc 19Spring Release Webinar - July 2017 Seasonalityy 0 30 60 90 120 Time 1 3 5 7 9 11 13 15 17 19 y 0 35 70 105 140 Time 1 3 5 7 9 11 13 15 17 19 Additive Multiplicative
  20. 20. BigML, Inc 20Spring Release Webinar - July 2017 Error • Variation in time series values not explained by trend or seasonality • Can be additive or multiplicative (homoscedastic or heteroscedastic) • Primarily affects forecast intervals.
  21. 21. BigML, Inc 21Spring Release Webinar - July 2017 Errory 0 150 300 450 600 Time 1 3 5 7 9 11 13 15 17 19 y 0 125 250 375 500 Time 1 3 5 7 9 11 13 15 17 19 Additive Multiplicative
  22. 22. BigML, Inc 22Spring Release Webinar - July 2017 M,Ad,N We use a shorthand to refer to a specific error, trend, and seasonality Multiplicative Error Additive Damped Trend No Seasonality ETS Nomenclature
  23. 23. BigML, Inc 23Spring Release Webinar - July 2017 Exponential Smoothing Models None Additive Multiplicative None A,N,N M,N,N A,N,A M,N,A A,N,M M,N,M Additive A,A,N M,A,N A,A,A M,A,A A,A,M M,A,M Additive + Damped A,Ad,N M,Ad,N A,Ad,A M,Ad,A A,Ad,M M,Ad,M Multiplicative A,M,N M,M,N A,M,A M,M,A A,M,M M,M,M Multiplicative + Damped A,Md,N M,Md,N A,Md,A M,Md,A A,Md,M M,Md,M Seasonality Trend
  24. 24. BigML, Inc 24Spring Release Webinar - July 2017 Exponential Smoothing Models None Additive Multiplicative None A,N,N M,N,N A,N,A M,N,A A,N,M M,N,M Additive A,A,N M,A,N A,A,A M,A,A A,A,M M,A,M Additive + damped A,Ad,N M,Ad,N A,Ad,A M,Ad,A A,Ad,M M,Ad,M Multiplicative A,M,N M,M,N A,M,A M,M,A A,M,M M,M,M Multiplicative + Damped A,Md,N M,Md,N A,Md,A M,Md,A A,Md,M M,Md,M Seasonality Trend • Some model types excluded due to numeric instability • Multiplicative error models only available when data is strictly positive
  25. 25. BigML, Inc 25Spring Release Webinar - July 2017 Model Performance Metrics AIC AICc BIC R-squared • Akaike Information Criterion • It measures the trade- off between the goodness-of-fit and the model’s complexity. • The smaller the AIC, the better. • Corrected Akaike Information Criterion • It introduces a correction element for smaller datasets. • For bigger datasets, the AIC and AICc tend to yield similar results. • Schwarz Bayesian Information Criterion • It penalizes the model’s complexity more heavily. • The lower the BIC the better. • R squared • It measures the model’s errors compared to the objective field actual values. • The higher the R squared the better.
  26. 26. BigML, Inc 26Spring Release Webinar - July 2017 Time Series Workflow in BigML SOURCE DATASET EVALUATION FORECASTTIME SERIES FORECAST + TEST TRAIN
  27. 27. BigML, Inc 27Spring Release Webinar - July 2017 Predictive Maintenance NASA Battery Prognostics Data Demo #1
  28. 28. BigML, Inc 28Spring Release Webinar - July 2017 Forecasting with Seasonality Eurostat Agricultural Production Data Demo #2
  29. 29. BigML, Inc 29Spring Release Webinar - July 2017 Calendar Correction
  30. 30. BigML, Inc 30Spring Release Webinar - July 2017 Calendar Correction pounds/month ÷ days/month = pounds/day
  31. 31. BigML, Inc 31Spring Release Webinar - July 2017 Calendar Adjustment with WhizzML
  32. 32. BigML, Inc 32Spring Release Webinar - July 2017 BigML Time Series • Industry-standard exponential smoothing time series analysis and forecasting • Familiar BigML workflow: source → dataset → timeseries → forecast • BigML’s market-leading visualization and collaboration • Can be trained on any numeric field • Train multiple objective fields and ETS model types in parallel. • Combined modeling and medium-horizon forecasting • Integrated with WhizzML and API bindings
  33. 33. BigML, Inc 33Spring Release Webinar - July 2017 Learn about Time Series https://bigml.com/releases/spring-2017
  34. 34. Questions? @bigmlcom info@bigml.com

×