1645 track 2 ard_using our laptop

©2016 Micron Technology, Inc. All rights reserved. Information, products, and/or specifications are subject to
change without notice. All information is provided on an “AS IS” basis without warranties of any kind.
Statements regarding products, including regarding their features, availability, functionality, or compatibility,
are provided for informational purposes only and do not modify the warranty, if any, applicable to any
product. Drawings may not be to scale. Micron, the Micron logo, and all other Micron trademarks are the
property of Micron Technology, Inc. All other trademarks are the property of their respective owners.
Demand Forecasting with Machine Learning
Colin Ard

Agenda
- Demand Forecasting at Micron
- Classical Time Series Analysis
- Building Predictive Models for Time Series with Machine Learning
- Demand Forecasting with Machine Learning Ensembles

A Bit About Micron
- Founded in 1978 in Boise, ID
- First fabrication unit completed in 1980
- Growth through expansion and acquisition
- Global company with over 30,000 employees

Demand Forecasting at Micron

- Scope: Tens of thousands of series requiring forecasting
- Scale: Consistent high demand vs sparse low demand
- Structure…
Data Complexities

Univariate Time Series Analysis
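$$Y_t = \alpha + \beta t + e_t, \qquad e_t \sim \mathrm{Normal}(0, \sigma^2)$$
Solve by minimizing the loss function $L(Y, \alpha, \beta)$:
$$L(Y, \alpha, \beta) = \sum_{t=1}^{T} (Y_t - \alpha - \beta t)^2$$
Squared error loss; each term $Y_t - \alpha - \beta t$ is a residual.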
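As a minimal sketch of the trend fit above, the squared error loss for a straight line has a closed-form minimizer. The function names and the demand values below are illustrative assumptions, not taken from the talk.

```python
def fit_linear_trend(y):
    """Least-squares (alpha, beta) for Y_t = alpha + beta*t + e_t, t = 1..T."""
    T = len(y)
    t = range(1, T + 1)
    t_mean = sum(t) / T
    y_mean = sum(y) / T
    sxy = sum((ti - t_mean) * (yi - y_mean) for ti, yi in zip(t, y))
    sxx = sum((ti - t_mean) ** 2 for ti in t)
    beta = sxy / sxx
    alpha = y_mean - beta * t_mean
    return alpha, beta


def squared_error_loss(y, alpha, beta):
    """L(Y, alpha, beta): sum of squared residuals over t = 1..T."""
    return sum((yi - alpha - beta * ti) ** 2 for ti, yi in enumerate(y, start=1))


# Made-up monthly demand, for illustration only.
demand = [12.0, 15.0, 14.0, 18.0, 21.0, 20.0]
alpha, beta = fit_linear_trend(demand)
residuals = [yi - alpha - beta * ti for ti, yi in enumerate(demand, start=1)]
```

The residuals computed at the end are the quantities that the later autocorrelation diagnostics would be applied to.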

$$\mathrm{ACF}_h = \mathrm{Correlation}(Y_t, Y_{t-h})$$
$$\text{Partial ACF}_h = \mathrm{Correlation}(Y_t, Y_{t-h} \mid Y_{t-1}, \ldots, Y_{t-h+1})$$
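The sample versions of these quantities are short to compute. The sketch below uses the usual sample autocorrelation estimator and, for the partial autocorrelation, only the lag-2 case, where conditioning on the single intermediate value gives a closed form. Function names and the example series are assumptions for illustration.

```python
def acf(y, h):
    """Sample autocorrelation of y at lag h (common-mean estimator)."""
    n = len(y)
    mean = sum(y) / n
    denom = sum((v - mean) ** 2 for v in y)
    num = sum((y[t] - mean) * (y[t - h] - mean) for t in range(h, n))
    return num / denom


def pacf_lag2(y):
    """Partial autocorrelation at lag 2, conditioning on the lag-1 value."""
    r1, r2 = acf(y, 1), acf(y, 2)
    return (r2 - r1 ** 2) / (1 - r1 ** 2)


# Made-up series, for illustration only.
series = [3.0, 4.0, 6.0, 5.0, 7.0, 9.0, 8.0, 10.0]
lag1 = acf(series, 1)
partial2 = pacf_lag2(series)
```

Higher-order partial autocorrelations need a recursion or a regression on all intermediate lags, which is omitted here.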

Auto-regression (AR): $Y_t = \delta + \varphi Y_{t-1} + w_t$
Differencing: $\Delta Y_t = Y_t - Y_{t-1}$
Moving average (MA): $Y_t = \mu + \theta w_{t-1} + w_t$
[Figure: non-stationary series; difference ($Y_t \to \Delta Y_t$); AR(1); MA(1); residuals from differenced series]
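Two of these building blocks are easy to sketch directly: first differencing, and an AR(1) fit by ordinary least squares of each value on its predecessor. This is a simplification assumed for illustration; a full ARIMA(1, 1, 1) fit also estimates the MA term, typically by maximum likelihood, and is not shown. Names and data are hypothetical.

```python
def difference(y):
    """First difference: Delta Y_t = Y_t - Y_{t-1}."""
    return [y[t] - y[t - 1] for t in range(1, len(y))]


def fit_ar1(y):
    """Least-squares (delta, phi) for Y_t = delta + phi * Y_{t-1} + w_t."""
    prev, curr = y[:-1], y[1:]
    n = len(prev)
    prev_mean = sum(prev) / n
    curr_mean = sum(curr) / n
    cov = sum((p - prev_mean) * (c - curr_mean) for p, c in zip(prev, curr))
    var = sum((p - prev_mean) ** 2 for p in prev)
    phi = cov / var
    delta = curr_mean - phi * prev_mean
    return delta, phi


# Made-up trending series, for illustration only.
series = [10.0, 12.0, 15.0, 15.5, 18.0, 21.0, 22.0, 25.5]
changes = difference(series)      # removes the trend in level
delta, phi = fit_ar1(changes)     # AR(1) on the differenced series
```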

[Figure: Residuals from ARIMA(1, 1, 1)]

[Figure: Forecasted value, expected value, and residuals from ARIMA(1, 1, 1)]

A Machine Learning Approach

The Bias-Variance Tradeoff
$$L(Y, f; \gamma) = \sum_{i=1}^{N} (Y_i - f(X_i))^2 + \gamma \int (f''(s))^2 \, ds$$
Squared error loss + complexity penalty
$\gamma \ge 0$: tuning parameter
[Figure panels: $\gamma = \infty$; $\gamma = 0$; $0 < \gamma < \infty$]
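To make the role of the tuning parameter concrete, the sketch below evaluates a discrete analogue of this loss for two candidate fits. It assumes evenly spaced inputs and approximates the integral of the squared second derivative by the sum of squared second differences of the fitted values; that approximation, the function name, and the data are assumptions for illustration.

```python
def penalized_loss(y, fitted, gamma):
    """Squared error plus gamma times a discrete roughness penalty."""
    fit_error = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    roughness = sum(
        (fitted[i + 1] - 2 * fitted[i] + fitted[i - 1]) ** 2
        for i in range(1, len(fitted) - 1)
    )
    return fit_error + gamma * roughness


# Made-up noisy observations, for illustration only.
y = [0.0, 2.0, 0.0, 2.0, 0.0]
interpolating_fit = list(y)              # zero error, very wiggly
flat_fit = [sum(y) / len(y)] * len(y)    # some error, zero curvature

for gamma in (0.0, 1.0):
    wiggly = penalized_loss(y, interpolating_fit, gamma)
    smooth = penalized_loss(y, flat_fit, gamma)
    print(f"gamma={gamma}: interpolating={wiggly:.2f}, flat={smooth:.2f}")
```

With no penalty the interpolating fit wins (low bias, high variance); once the penalty is switched on, the smoother fit has the lower loss.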

[Diagram: Demand History; Naïve Forecast; Alternate Forecasts; Machine Learning; Ensemble Forecast]

Ensembling Methods Pt. 1
Final pre-processing steps:
- Cumulative forecast totals
  - Model demand over the next 3 months, as opposed to demand 3 months from now
  - Separate ensemble models trained for each cumulative forecast span
- Feature sorting for suitability in modeling:
  - linear associations
  - interaction-dependent associations
- Feature/Outcome transformation and scaling…
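$$\mathrm{Outcome}_i = \frac{\mathrm{Actual}_i - \mathrm{Naive}_i}{c + \mathrm{Actual}_i + \mathrm{Naive}_i}$$
$$\mathrm{Feature}_{im} = \frac{\mathrm{Forecast}_{im} - \mathrm{Naive}_i}{c + \mathrm{Forecast}_{im} + \mathrm{Naive}_i}$$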
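The outcome and feature transformations share one form: a change relative to the naive forecast, scaled by the sum of the two quantities plus a constant. A small sketch follows; the value of c is not given in the slides, so `c=1.0` is a placeholder assumption, and the inverse function is simply the algebraic solution for the quantity, added here for illustration.

```python
def relative_change(value, naive, c=1.0):
    """(value - naive) / (c + value + naive); c=1.0 is an assumed placeholder."""
    return (value - naive) / (c + value + naive)


def invert_relative_change(change, naive, c=1.0):
    """Recover the quantity from a (predicted) relative change and the naive forecast."""
    return (naive + change * (c + naive)) / (1.0 - change)


# Made-up quantities, for illustration only.
naive_forecast = 100.0
actual_demand = 150.0
alternate_forecast = 120.0

outcome = relative_change(actual_demand, naive_forecast)
feature = relative_change(alternate_forecast, naive_forecast)
recovered = invert_relative_change(outcome, naive_forecast)
```

For non-negative quantities and c > 0 the transformed value stays strictly between -1 and 1, which puts high-volume and sparse low-volume series on a comparable scale.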
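Stacked Generalization
Base models: $f_1(X_i),\ f_2(X_i),\ f_3(X_i),\ \ldots,\ f_M(X_i)$
$$Y_i = F(X_i)$$
$$\Delta Y_i = \tilde{F}\left(\Delta Y_i^{(1)}, \Delta Y_i^{(2)}, \ldots, \Delta Y_i^{(M)}, \ldots\right)$$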
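A stacked model treats the base models' forecasts as inputs to a second-level learner. The sketch below uses the simplest possible second level, a no-intercept linear blend of two base forecasts fit by least squares; the linear form, the restriction to two base models, and all names and numbers are assumptions for illustration, not the combiner used in the talk.

```python
def fit_linear_stack(f1, f2, y):
    """Least-squares weights (w1, w2) for y ~ w1*f1 + w2*f2 via 2x2 normal equations."""
    a11 = sum(a * a for a in f1)
    a12 = sum(a * b for a, b in zip(f1, f2))
    a22 = sum(b * b for b in f2)
    b1 = sum(a * t for a, t in zip(f1, y))
    b2 = sum(b * t for b, t in zip(f2, y))
    det = a11 * a22 - a12 * a12
    w1 = (b1 * a22 - a12 * b2) / det
    w2 = (a11 * b2 - a12 * b1) / det
    return w1, w2


def stacked_forecast(weights, f1_value, f2_value):
    """Second-level prediction from two base forecasts."""
    w1, w2 = weights
    return w1 * f1_value + w2 * f2_value


# Made-up transformed base forecasts and outcomes, for illustration only.
base_1 = [0.10, -0.20, 0.30, 0.00]
base_2 = [0.20, 0.10, -0.10, 0.40]
outcome = [0.15, -0.02, 0.05, 0.25]
weights = fit_linear_stack(base_1, base_2, outcome)
blended = stacked_forecast(weights, 0.05, 0.12)
```

In practice the second-level model should be trained on base forecasts that were produced out of sample, otherwise the blend rewards base models that overfit.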

Ensembling Methods Pt. 2
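[Figure: regions R1, R2, R3, R4 over axes Naïve Forecast: Qty (ticks 100K, 1M) and $\Delta Y$ ARIMA (tick -0.2)]
$$\Delta Y_i = L\left(\boldsymbol{R}, \Delta Y_i^{(1)}, \Delta Y_i^{(2)}, \ldots, \Delta Y_i^{(M)}, \ldots\right)$$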
Boosting Algorithms
$$Y_0^{*} = Y \;\to\; h_0(X) \;\to\; Y_1^{*} \;\to\; h_1(X) \;\to\; \cdots \;\to\; Y_M^{*} \;\to\; h_M(X)$$
$$Y = F_M(X)$$
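One common reading of this chain is boosting under squared error loss: each weak learner is fit to the working response left over by the learners before it, and the final model is their sum. The sketch below assumes that reading, uses one-split regression stumps on a single feature as the weak learners, and adds a learning rate; all three choices, and the names and data, are assumptions for illustration.

```python
def fit_stump(x, target):
    """Best single-split regression stump on one feature under squared error."""
    best = None
    values = sorted(set(x))
    for low, high in zip(values, values[1:]):
        threshold = (low + high) / 2
        left = [t for xi, t in zip(x, target) if xi <= threshold]
        right = [t for xi, t in zip(x, target) if xi > threshold]
        left_mean = sum(left) / len(left)
        right_mean = sum(right) / len(right)
        sse = (sum((t - left_mean) ** 2 for t in left)
               + sum((t - right_mean) ** 2 for t in right))
        if best is None or sse < best[0]:
            best = (sse, threshold, left_mean, right_mean)
    _, threshold, left_mean, right_mean = best
    return lambda v: left_mean if v <= threshold else right_mean


def boost(x, y, n_rounds=20, learning_rate=0.5):
    """Fit stumps sequentially to the working response; return the summed model."""
    learners = []
    working = list(y)                      # the first working response is Y itself
    for _ in range(n_rounds):
        h = fit_stump(x, working)
        learners.append(h)
        # The next working response is what the learners so far leave unexplained.
        working = [w - learning_rate * h(xi) for xi, w in zip(x, working)]
    return lambda v: sum(learning_rate * h(v) for h in learners)


# Made-up feature and response, for illustration only.
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
y = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
model = boost(x, y)
fitted = [model(xi) for xi in x]
```

The number of rounds and the learning rate act as complexity controls, in the same spirit as the tuning parameter in the penalized loss.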

Ensemble Methods Pt. 3
Bootstrap Aggregation
[Diagram: Training Data → T1, T2, T3, …, TB]
Bagged Estimate:
$$Y_i = \frac{1}{B} \sum_{b=1}^{B} Y_i^{(b)}$$
Candidate model inputs for Product i: $(X_{i1}, Y_{i1}),\ (X_{i2}, Y_{i2}),\ \ldots,\ (X_{i,T-h}, Y_{i,T-h})$
Sample: $(X_{i s_i}, Y_{i s_i})$
Full sample at bth training iteration: $(X_{1 s_1}, Y_{1 s_1}),\ (X_{2 s_2}, Y_{2 s_2}),\ \ldots,\ (X_{i s_i}, Y_{i s_i}),\ \ldots,\ (X_{N s_N}, Y_{N s_N})$
Forecasting h months ahead from month T
Goal:
- A generalizable model for change in demand
Challenges:
- Have to assume the fundamentals that drove historical demand are likely to fluctuate over time
- Limited to validation of the algorithm rather than testing of predictions from a specific trained model
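The sampling scheme on this slide appears to draw, at each training iteration, one historical time point per product from that product's candidate inputs, and then to average the predictions of the B resulting models. The sketch below follows that reading; the data layout, the function names, and the trivial mean-predicting base learner are assumptions for illustration.

```python
import random


def draw_training_sample(history, rng):
    """Pick one (x, y) pair per product from its candidate months 1..T-h."""
    return {product: rng.choice(pairs) for product, pairs in history.items()}


def fit_bagged(history, fit, n_models, seed=0):
    """Train n_models base learners, each on a fresh one-point-per-product sample."""
    rng = random.Random(seed)
    models = []
    for _ in range(n_models):
        sample = draw_training_sample(history, rng)
        models.append(fit(list(sample.values())))
    return models


def bagged_predict(models, x):
    """Bagged estimate: the average of the individual model predictions."""
    return sum(model(x) for model in models) / len(models)


def fit_mean(pairs):
    """Stand-in base learner that ignores x and predicts the mean outcome."""
    mean_y = sum(y for _, y in pairs) / len(pairs)
    return lambda x: mean_y


# Made-up history: product -> list of (feature, outcome) pairs.
history = {
    "product_a": [(0.1, 0.2), (0.0, -0.1), (0.3, 0.4)],
    "product_b": [(-0.2, -0.3), (0.1, 0.0), (0.2, 0.1)],
}
models = fit_bagged(history, fit_mean, n_models=25)
estimate = bagged_predict(models, 0.15)
```

Because every iteration sees a different mix of months, no single trained model is the object being validated, which matches the slide's remark that validation applies to the algorithm rather than to one trained model.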

Forecast Accuracy Metrics and Validation
$$\mathrm{wMAPE} = 100 \times \frac{\sum |\mathrm{Actual} - \mathrm{Forecast}|}{\sum \mathrm{Actual}}$$
$$\mathrm{Accuracy} = 100 - \mathrm{wMAPE}$$
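Both metrics are one-liners. The sketch below uses the usual weighted MAPE definition, total absolute error divided by total actual demand; the function names and the numbers are illustrative.

```python
def wmape(actual, forecast):
    """Weighted MAPE in percent: 100 * sum|actual - forecast| / sum(actual)."""
    total_error = sum(abs(a - f) for a, f in zip(actual, forecast))
    return 100.0 * total_error / sum(actual)


def accuracy(actual, forecast):
    """Forecast accuracy in percent: 100 - wMAPE."""
    return 100.0 - wmape(actual, forecast)


# Made-up demand and forecasts for three products, for illustration only.
actual = [100.0, 200.0, 700.0]
forecast = [110.0, 180.0, 700.0]
score = accuracy(actual, forecast)
```

Dividing by total demand means that errors on high-volume products dominate the score, and that the accuracy can go negative when total error exceeds total demand.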
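Train on each window, then predict the corresponding validation point:
- Training data: $(X_{is}, Y_{is}),\ s \in \{1, \ldots, T-h\}$; validation data: $(X_{iT}, Y_{iT})$
- Training data: $(X_{is}, Y_{is}),\ s \in \{2, \ldots, T-h+1\}$; validation data: $(X_{i,T+1}, Y_{i,T+1})$
- Training data: $(X_{is}, Y_{is}),\ s \in \{3, \ldots, T-h+2\}$; validation data: $(X_{i,T+2}, Y_{i,T+2})$
- …
- Training data: $(X_{is}, Y_{is}),\ s \in \{K+1, \ldots, T-h+K\}$; validation data: $(X_{i,T+K}, Y_{i,T+K})$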
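The rolling scheme can be generated mechanically from T, h, and K. The sketch below only produces the month indices for each fold; the function name and the example values are assumptions for illustration.

```python
def rolling_origin_folds(T, h, K):
    """Yield (training_months, validation_month) for shifts k = 0..K.

    Fold k trains on months k+1 .. T-h+k and validates on month T+k,
    so the validation month is always h months past the end of the window.
    """
    for k in range(K + 1):
        training_months = list(range(k + 1, T - h + k + 1))
        yield training_months, T + k


# Example: 12 months of history, 3-month horizon, 2 extra shifts.
for train, valid in rolling_origin_folds(T=12, h=3, K=2):
    print(f"train on months {train[0]}..{train[-1]}, validate on month {valid}")
```

Keeping the window length fixed while it slides means every fold trains on the same amount of history, so accuracy differences between folds are not driven by sample size.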

Forecast Accuracy Validation
[Figure panels: Lag-1; Lag-2 (Cumulative); Lag-3 (Cumulative)]

Conclusions
- Significant gains in forecast accuracy across lags
- Understand the challenge and play to your strengths
  - Business processes, available data, and goals of the analysis
  - Institutional knowledge and human expertise
- Acknowledgements
  - Micron Enterprise Data Science and Demand Management Teams
