SlideShare a Scribd company logo
1 of 34
Download to read offline
Introduction Proposed Approach Empirical Results Concluding Remarks References
Bagging-Clustering Methods to Forecast Time
Series
Tiago Dantas
Fernando Cyrino
Pontifical Catholic University of Rio de Janeiro,Brazil
ISF 2017, Cairns-Australia, June 25-28
Tiago Dantas, Fernando Cyrino ISF 2017
Introduction Proposed Approach Empirical Results Concluding Remarks References
Introduction
Cordeiro and Neves (2009) and Bergmeir et al. (2016), have
proposed new ways to generate forecasts using a very popular
Machine Learning technique, called Bagging (Bootstrap
Aggregating), proposed by Breiman (1996), in combination
with Exponential Smoothing methods to improve forecast
accuracy
The main idea is to use Bootstrap to generate an ensemble of
forecasts that is combined into one single output
Tiago Dantas, Fernando Cyrino ISF 2017
Introduction Proposed Approach Empirical Results Concluding Remarks References
Bagged.BLD.MBB.ETS - Bergmeir et al.(2016)
Best model using Bagging and Exponential Smoothing
methods
Box-Cox transformation (stabilizes variance)
STL decomposition (decompose time series into seasonal, trend and
remainder)
Moving Block Bootstrap (generate new versions of the remainder)
Forecasts are obtained selecting one ETS model for each time series
(original and bootstrap versions)
Final forecast is obtained using the median (other possibilities are
mean, trimmed mean, among others)
Tiago Dantas, Fernando Cyrino ISF 2017
Introduction Proposed Approach Empirical Results Concluding Remarks References
Bagged.BLD.MBB.ETS - Bergmeir et al.(2016)
Tiago Dantas, Fernando Cyrino ISF 2017
Introduction Proposed Approach Empirical Results Concluding Remarks References
Why Bagging tends to work
The Mean Squared Forecast Error (MSFE) can be decomposed
into three terms:
MSFE = Var(yt+1|t) + bias(ˆyt+1|t)2 + Var(ˆyt+1|t)
Tiago Dantas, Fernando Cyrino ISF 2017
Introduction Proposed Approach Empirical Results Concluding Remarks References
Why Bagging tends to work
The average forecast over the Bootstrap samples can be written as:
˜yt+1|t = 1
B
B
i=1
ˆy∗
(i)t+1|t
where the tilde indicates Bagging forecast and B is the total
number of Bootstrap samples.
Tiago Dantas, Fernando Cyrino ISF 2017
Introduction Proposed Approach Empirical Results Concluding Remarks References
Why Bagging tends to work
bias(˜yt+1|t) = 1
B
B
i=1
bias(ˆy∗
(i)t+1|t)
Note that unbiased Bootstrapped versions lead to a relatively
unbiased ensemble
Tiago Dantas, Fernando Cyrino ISF 2017
Introduction Proposed Approach Empirical Results Concluding Remarks References
Why Bagging tends to work
Var(˜yt+1|t) = 1
B2
B
i=1
Var(ˆy∗
(i)t+1|t) + 1
B2
i=i
Cov[ˆy∗
(i)t+1|t, ˆy∗
(i )t+1|t]
Variance tends to be reduced
Tiago Dantas, Fernando Cyrino ISF 2017
Introduction Proposed Approach Empirical Results Concluding Remarks References
Why Bagging tends to work
When applying Bagging and Exponential Smoothing what
happens is variance reduction
If the variances are approximately equal and there is no
correlation:
Var(˜yt+1|t) ≈ 1
B Var(ˆy∗
(1)t+1|t)
Tiago Dantas, Fernando Cyrino ISF 2017
Introduction Proposed Approach Empirical Results Concluding Remarks References
Why Bagging tends to work
Reducing covariance seems to be a good idea
Var(˜yt+1|t) = 1
B2
B
i=1
Var(ˆy∗
(i)t+1|t) + 1
B2
i=i
Cov[ˆy∗
(i)t+1|t, ˆy∗
(i )t+1|t]
The proposed approach tries to use this idea in order to
reduce forecast error
Tiago Dantas, Fernando Cyrino ISF 2017
Introduction Proposed Approach Empirical Results Concluding Remarks References
Proposed Approach
The proposed approach can be divided in two parts:
1. Generation of Bootstrapped series - Algorithm developed by
Bergmeir et al. (2016)
2. The procedure to forecast and aggregate the series - New
developments
Tiago Dantas, Fernando Cyrino ISF 2017
Introduction Proposed Approach Empirical Results Concluding Remarks References
Proposed Approach
Generating bootstrapped series
Bootstrapped series are generated in the same way as
Bagged.BLD.MBB.ETS
Tiago Dantas, Fernando Cyrino ISF 2017
Introduction Proposed Approach Empirical Results Concluding Remarks References
Algorithm 1 Generating bootstrapped series
1: procedure BOOTSTRAP(ts,num.boot)
2: λ ← BoxCox.lambda(ts,min=0,max=1)
3: ts.bc← BoxCox(ts,λ = 1)
4: if ts is seasonal then
5: [trend seasonal remainder] ← stl(ts.bc)
6: else
7: seasonal ← 0
8: [trend,remainder] ← loess(ts.bc)
9: end if
10: recon.series[1] ← ts
11: for i in 2 to num.boot do
12: boot.sample[i] ← MBB(remainder)
13: recon.series.bc[i] ← trend + seasonal +boot.sample[i]
14: recon.series[i] ← InvBoxCox(recon.series.bc[i],λ)
15: end for
16: return recon.series
17: end procedure
Tiago Dantas, Fernando Cyrino ISF 2017
Introduction Proposed Approach Empirical Results Concluding Remarks References
Proposed Approach
Tiago Dantas, Fernando Cyrino ISF 2017
Introduction Proposed Approach Empirical Results Concluding Remarks References
Proposed Approach
The procedure to forecast and aggregate the series
the Proposed approach and Bagged.BLD.MBB.ETS differ in
the way the ensemble is constructed
Bagged.BLD.MBB.ETS consider all of the Bootstrapped
versions to make forecasts
The proposed approach considers a less correlated group of
time series to make forecasts
Tiago Dantas, Fernando Cyrino ISF 2017
Introduction Proposed Approach Empirical Results Concluding Remarks References
Proposed Approach
To create a less correlated ensemble, the proposal is to
generate clusters from the Bootstrapped versions
Cluster procedures maximize similarity within the group and
minimize it between them
The expectation is that selecting series from different clusters
would lead to an ensemble less correlated and, therefore, less
correlated forecasts to be aggregated
Tiago Dantas, Fernando Cyrino ISF 2017
Introduction Proposed Approach Empirical Results Concluding Remarks References
Proposed Approach
Partitioning Around Medoids Algorithm (PAM) and euclidean
distance are used to create the clusters (fast algorithm and
less sensible to outliers)
The number of cluster can be defined using cross-validation or
any other method (e.g. Silhouette Information)
Tiago Dantas, Fernando Cyrino ISF 2017
Introduction Proposed Approach Empirical Results Concluding Remarks References
Proposed Approach
Tiago Dantas, Fernando Cyrino ISF 2017
Introduction Proposed Approach Empirical Results Concluding Remarks References
Proposed Approach
Tiago Dantas, Fernando Cyrino ISF 2017
Introduction Proposed Approach Empirical Results Concluding Remarks References
Proposed Approach
The user has to define the total number of series to aggregate
(100 was the choice made by Bergmeir and colleagues)
The number of series to be selected in each cluster is defined
as proportionally equal to the size of each cluster
Example: B =1000 and the total number of series to be
aggregated is 100. If cluster 1 has 20 series, therefore 2 series
would be selected (10%). But, Which 2 time series?
Tiago Dantas, Fernando Cyrino ISF 2017
Introduction Proposed Approach Empirical Results Concluding Remarks References
A validation set is defined
The time series with best results (sMAPE) in the validation
set are the ones selected in each cluster
The final forecast is obtained taking the median of the
forecasts (other possibilities are the mean, trimmed mean)
Tiago Dantas, Fernando Cyrino ISF 2017
Introduction Proposed Approach Empirical Results Concluding Remarks References
Proposed Approach
Tiago Dantas, Fernando Cyrino ISF 2017
Introduction Proposed Approach Empirical Results Concluding Remarks References
MSE decomposition
Bagged.BLD.MBB.ETS (black) and the Proposed Approach (blue)
- Forecast (up to 18 steps ahead) - Time series 1083 - M3
competition
Tiago Dantas, Fernando Cyrino ISF 2017
Introduction Proposed Approach Empirical Results Concluding Remarks References
Empirical Results
The proposed approach was validated on public available time
series from the M3 competition (1428 monthly, 756 quarterly
and 645 yearly time series)
The experiment was conducted using R and majorly the
forecast package (version 8.0)
The results for Bagged.BLD.MBB.ETS were obtained using
baggedETS()
Tiago Dantas, Fernando Cyrino ISF 2017
Introduction Proposed Approach Empirical Results Concluding Remarks References
Monthly data
Tiago Dantas, Fernando Cyrino ISF 2017
Introduction Proposed Approach Empirical Results Concluding Remarks References
Methods Rank sMAPE Mean sMAPE Median sMAPE
Proposed Approach 11.15 13.62 8.74
Bagged.BLD.MBB.ETS 11.30 13.65 8.85
THETA 11.53 13.89 8.92
ForecastPro 11.56 13.90 8.81
COMB S-H-D 12.54 14.47 9.37
ForcX 12.76 14.47 9.21
HOLT 12.78 15.79 9.28
WINTER 13.06 15.93 9.30
RBF 13.27 14.76 9.21
AAM1 13.48 15.67 9.67
DAMPEN 13.48 14.58 9.44
AutoBox2 13.60 15.73 9.28
B-J auto 13.69 14.80 9.32
AutoBox1 13.69 15.81 9.27
SMARTFCS 13.82 15.01 9.52
Flors-Pearc2 13.84 15.19 9.61
AAM2 13.85 15.94 9.62
Auto-ANN 13.91 15.03 9.62
PP-Autocast 14.13 15.33 9.90
ARARMA 14.20 15.83 9.80
AutoBox3 14.21 16.59 9.40
Flors-Pearc1 14.54 15.99 9.96
THETAsm 14.58 15.38 9.65
ROBUST-Trend 14.79 18.93 9.73
SINGLE 15.22 15.30 10.03
NAIVE2 16.04 16.89 10.12
Tiago Dantas, Fernando Cyrino ISF 2017
Introduction Proposed Approach Empirical Results Concluding Remarks References
Quarterly data
Tiago Dantas, Fernando Cyrino ISF 2017
Introduction Proposed Approach Empirical Results Concluding Remarks References
Methods Rank sMAPE Mean sMAPE Median sMAPE
THETA 11.39 8.96 5.37
COMB S-H-D 12.18 9.22 5.32
ROBUST-Trend 12.44 9.79 5.00
DAMPEN 12.66 9.36 5.59
PP-Autocast 12.81 9.39 5.26
ForcX 12.86 9.54 5.62
Bagged.BLD.MBB.ETS 12.96 9.80 5.81
B-J auto 13.16 10.26 5.69
ForecastPro 13.20 9.82 5.84
Proposed Approach 13.24 9.89 5.82
HOLT 13.27 10.94 5.71
RBF 13.30 9.57 5.67
AutoBox2 13.38 10.00 5.59
WINTER 13.38 10.84 5.71
Flors-Pearc1 13.48 9.95 5.61
ARARMA 13.49 10.19 6.11
Auto-ANN 13.89 10.20 6.28
THETAsm 14.18 9.82 5.65
AAM1 14.25 10.16 6.36
SMARTFCS 14.27 10.15 5.71
Flors-Pearc2 14.30 10.43 6.22
AutoBox3 14.38 11.19 6.15
AAM2 14.41 10.26 6.44
SINGLE 14.66 9.72 6.18
AutoBox1 14.69 10.96 6.14
NAIVE2 14.80 9.95 6.18
Tiago Dantas, Fernando Cyrino ISF 2017
Introduction Proposed Approach Empirical Results Concluding Remarks References
Yearly data
Tiago Dantas, Fernando Cyrino ISF 2017
Introduction Proposed Approach Empirical Results Concluding Remarks References
Methods Rank sMAPE Mean sMAPE Median sMAPE
ForcX 11.15 16.48 11.34
RBF 11.46 16.42 10.74
AutoBox2 11.48 16.59 11.31
Flors-Pearc1 11.57 17.21 10.72
THETA 11.58 16.97 11.25
ForecastPro 11.73 17.27 11.05
ROBUST-Trend 11.81 17.03 11.30
PP-Autocast 11.87 17.13 10.83
Bagged.BLD.MBB.ETS 11.89 17.40 11.20
DAMPEN 11.92 17.36 10.95
COMB S-H-D 11.99 17.07 11.68
Proposed Approach 12.21 17.56 11.42
SMARTFCS 12.38 17.71 11.83
HOLT 12.64 20.02 11.77
WINTER 12.64 20.02 11.77
Flors-Pearc2 13.02 17.84 12.55
ARARMA 13.03 18.36 11.35
B-J auto 13.04 17.73 11.70
Auto-ANN 13.32 18.57 13.08
AutoBox3 13.52 20.88 12.89
THETAsm 13.55 17.92 12.21
AutoBox1 13.82 21.59 12.75
NAIVE2 14.16 17.88 12.37
SINGLE 14.21 17.82 12.44
Tiago Dantas, Fernando Cyrino ISF 2017
Introduction Proposed Approach Empirical Results Concluding Remarks References
Concluding Remarks
This proposed approach make forecasts combining Bagging,
Exponential Smoothing and Cluster methods
The empirical results demonstrate the approach was capable
of generating highly accurate forecasts for monthly time series
The so far, not explicitly addressed, covariance effect on the
combination of Bagging and Exponential Smoothing, is
probably resposible for reducing the forecast error
The method doesn’t seem to work well on short time series
(such as the case of yearly and quarterly time series from the
M3 competition)
Tiago Dantas, Fernando Cyrino ISF 2017
Introduction Proposed Approach Empirical Results Concluding Remarks References
Concluding Remarks
Future work
Other weighting schemes for selected series
Other decomposition and forecasting methods
Tiago Dantas, Fernando Cyrino ISF 2017
Introduction Proposed Approach Empirical Results Concluding Remarks References
Thank You!
t.mendesdantas@gmail.com
Tiago Dantas, Fernando Cyrino ISF 2017
Introduction Proposed Approach Empirical Results Concluding Remarks References
References
Bergmeir, C., Hyndman, R.J., Benitez,J.M. (2016). Bagging
exponential smoothing methods using STL decomposition and
Box-Cox transformation.International Journal of Forecasting,
32(2), 303-312.
Breiman, L. (1996). Bagging Predictors.Machine Learning, 24,
123-140.
Cordeiro, C.,Neves,M. (2009). Forecasting time series with
BOOT.EXPOS procedure. REVSTAT-Statistical Journal, 7(2),
135-149.
Liao, T.W. (2005). Clustering of time series data - a
survey.Journal of Pattern Recognition, 38(11), 1857-1874.
Monteiro,P.,Vilar, J.A. (2016). TSclust: An R Package for TimeTiago Dantas, Fernando Cyrino ISF 2017

More Related Content

What's hot

Lec7 deeprlbootcamp-svg+scg
Lec7 deeprlbootcamp-svg+scgLec7 deeprlbootcamp-svg+scg
Lec7 deeprlbootcamp-svg+scgRonald Teo
 
The slides of my Ph.D. defense
The slides of my Ph.D. defenseThe slides of my Ph.D. defense
The slides of my Ph.D. defenseApostolos Chalkis
 
Refining Bayesian Data Analysis Methods for Use with Longer Waveforms
Refining Bayesian Data Analysis Methods for Use with Longer WaveformsRefining Bayesian Data Analysis Methods for Use with Longer Waveforms
Refining Bayesian Data Analysis Methods for Use with Longer WaveformsJames Bell
 
Genealized Diagonal band copula wifth two-sided power densities
Genealized Diagonal band copula wifth two-sided power densitiesGenealized Diagonal band copula wifth two-sided power densities
Genealized Diagonal band copula wifth two-sided power densitiesLeon Adams
 
PAC-Bayesian Bound for Deep Learning
PAC-Bayesian Bound for Deep LearningPAC-Bayesian Bound for Deep Learning
PAC-Bayesian Bound for Deep LearningMark Chang
 
Cordial Labelings in the Context of Triplication
Cordial Labelings in the Context of  TriplicationCordial Labelings in the Context of  Triplication
Cordial Labelings in the Context of TriplicationIRJET Journal
 
On Convolution of Graph Signals and Deep Learning on Graph Domains
On Convolution of Graph Signals and Deep Learning on Graph DomainsOn Convolution of Graph Signals and Deep Learning on Graph Domains
On Convolution of Graph Signals and Deep Learning on Graph DomainsJean-Charles Vialatte
 
PAC Bayesian for Deep Learning
PAC Bayesian for Deep LearningPAC Bayesian for Deep Learning
PAC Bayesian for Deep LearningMark Chang
 
Nearly optimal average case complexity of counting bicliques under seth
Nearly optimal average case complexity of counting bicliques under sethNearly optimal average case complexity of counting bicliques under seth
Nearly optimal average case complexity of counting bicliques under sethNobutaka Shimizu
 
dingo: a python package to analyzes metabolic networks
dingo: a python package to analyzes metabolic networksdingo: a python package to analyzes metabolic networks
dingo: a python package to analyzes metabolic networksApostolos Chalkis
 
High-Performance Approach to String Similarity using Most Frequent K Characters
High-Performance Approach to String Similarity using Most Frequent K CharactersHigh-Performance Approach to String Similarity using Most Frequent K Characters
High-Performance Approach to String Similarity using Most Frequent K CharactersHolistic Benchmarking of Big Linked Data
 
A new practical algorithm for volume estimation using annealing of convex bodies
A new practical algorithm for volume estimation using annealing of convex bodiesA new practical algorithm for volume estimation using annealing of convex bodies
A new practical algorithm for volume estimation using annealing of convex bodiesVissarion Fisikopoulos
 
Information in the Weights
Information in the WeightsInformation in the Weights
Information in the WeightsMark Chang
 
Total Dominating Color Transversal Number of Graphs And Graph Operations
Total Dominating Color Transversal Number of Graphs And Graph OperationsTotal Dominating Color Transversal Number of Graphs And Graph Operations
Total Dominating Color Transversal Number of Graphs And Graph Operationsinventionjournals
 
A multi-objective optimization framework for a second order traffic flow mode...
A multi-objective optimization framework for a second order traffic flow mode...A multi-objective optimization framework for a second order traffic flow mode...
A multi-objective optimization framework for a second order traffic flow mode...Guillaume Costeseque
 

What's hot (20)

Lec7 deeprlbootcamp-svg+scg
Lec7 deeprlbootcamp-svg+scgLec7 deeprlbootcamp-svg+scg
Lec7 deeprlbootcamp-svg+scg
 
The slides of my Ph.D. defense
The slides of my Ph.D. defenseThe slides of my Ph.D. defense
The slides of my Ph.D. defense
 
Refining Bayesian Data Analysis Methods for Use with Longer Waveforms
Refining Bayesian Data Analysis Methods for Use with Longer WaveformsRefining Bayesian Data Analysis Methods for Use with Longer Waveforms
Refining Bayesian Data Analysis Methods for Use with Longer Waveforms
 
Genealized Diagonal band copula wifth two-sided power densities
Genealized Diagonal band copula wifth two-sided power densitiesGenealized Diagonal band copula wifth two-sided power densities
Genealized Diagonal band copula wifth two-sided power densities
 
QMC Program: Trends and Advances in Monte Carlo Sampling Algorithms Workshop,...
QMC Program: Trends and Advances in Monte Carlo Sampling Algorithms Workshop,...QMC Program: Trends and Advances in Monte Carlo Sampling Algorithms Workshop,...
QMC Program: Trends and Advances in Monte Carlo Sampling Algorithms Workshop,...
 
PAC-Bayesian Bound for Deep Learning
PAC-Bayesian Bound for Deep LearningPAC-Bayesian Bound for Deep Learning
PAC-Bayesian Bound for Deep Learning
 
Cordial Labelings in the Context of Triplication
Cordial Labelings in the Context of  TriplicationCordial Labelings in the Context of  Triplication
Cordial Labelings in the Context of Triplication
 
On Convolution of Graph Signals and Deep Learning on Graph Domains
On Convolution of Graph Signals and Deep Learning on Graph DomainsOn Convolution of Graph Signals and Deep Learning on Graph Domains
On Convolution of Graph Signals and Deep Learning on Graph Domains
 
Operation Research Techniques in Transportation
Operation Research Techniques in Transportation Operation Research Techniques in Transportation
Operation Research Techniques in Transportation
 
PAC Bayesian for Deep Learning
PAC Bayesian for Deep LearningPAC Bayesian for Deep Learning
PAC Bayesian for Deep Learning
 
Nearly optimal average case complexity of counting bicliques under seth
Nearly optimal average case complexity of counting bicliques under sethNearly optimal average case complexity of counting bicliques under seth
Nearly optimal average case complexity of counting bicliques under seth
 
dingo: a python package to analyzes metabolic networks
dingo: a python package to analyzes metabolic networksdingo: a python package to analyzes metabolic networks
dingo: a python package to analyzes metabolic networks
 
High-Performance Approach to String Similarity using Most Frequent K Characters
High-Performance Approach to String Similarity using Most Frequent K CharactersHigh-Performance Approach to String Similarity using Most Frequent K Characters
High-Performance Approach to String Similarity using Most Frequent K Characters
 
A new practical algorithm for volume estimation using annealing of convex bodies
A new practical algorithm for volume estimation using annealing of convex bodiesA new practical algorithm for volume estimation using annealing of convex bodies
A new practical algorithm for volume estimation using annealing of convex bodies
 
Engineering Economics
Engineering EconomicsEngineering Economics
Engineering Economics
 
Information in the Weights
Information in the WeightsInformation in the Weights
Information in the Weights
 
QMC Program: Trends and Advances in Monte Carlo Sampling Algorithms Workshop,...
QMC Program: Trends and Advances in Monte Carlo Sampling Algorithms Workshop,...QMC Program: Trends and Advances in Monte Carlo Sampling Algorithms Workshop,...
QMC Program: Trends and Advances in Monte Carlo Sampling Algorithms Workshop,...
 
Total Dominating Color Transversal Number of Graphs And Graph Operations
Total Dominating Color Transversal Number of Graphs And Graph OperationsTotal Dominating Color Transversal Number of Graphs And Graph Operations
Total Dominating Color Transversal Number of Graphs And Graph Operations
 
A multi-objective optimization framework for a second order traffic flow mode...
A multi-objective optimization framework for a second order traffic flow mode...A multi-objective optimization framework for a second order traffic flow mode...
A multi-objective optimization framework for a second order traffic flow mode...
 
QMC Program: Trends and Advances in Monte Carlo Sampling Algorithms Workshop,...
QMC Program: Trends and Advances in Monte Carlo Sampling Algorithms Workshop,...QMC Program: Trends and Advances in Monte Carlo Sampling Algorithms Workshop,...
QMC Program: Trends and Advances in Monte Carlo Sampling Algorithms Workshop,...
 

Similar to Bagging-Clustering Methods to Forecast Time Series

Economics of Climate Change Adaptation Training - Session 9
Economics of Climate Change Adaptation Training - Session 9Economics of Climate Change Adaptation Training - Session 9
Economics of Climate Change Adaptation Training - Session 9UNDP Climate
 
Cari2020 Parallel Hybridization for SAT: An Efficient Combination of Search S...
Cari2020 Parallel Hybridization for SAT: An Efficient Combination of Search S...Cari2020 Parallel Hybridization for SAT: An Efficient Combination of Search S...
Cari2020 Parallel Hybridization for SAT: An Efficient Combination of Search S...Mokhtar SELLAMI
 
ForecastCombinations package
ForecastCombinations packageForecastCombinations package
ForecastCombinations packageeraviv
 
Inf-Sup Stable Displacement-Pressure Combinations for Isogeometric Analysis o...
Inf-Sup Stable Displacement-Pressure Combinations for Isogeometric Analysis o...Inf-Sup Stable Displacement-Pressure Combinations for Isogeometric Analysis o...
Inf-Sup Stable Displacement-Pressure Combinations for Isogeometric Analysis o...CHENNAKESAVA KADAPA
 
RDataMining slides-regression-classification
RDataMining slides-regression-classificationRDataMining slides-regression-classification
RDataMining slides-regression-classificationYanchang Zhao
 
MediaEval 2016 - UPMC at MediaEval2016 Retrieving Diverse Social Images Task
MediaEval 2016 - UPMC at MediaEval2016 Retrieving Diverse Social Images TaskMediaEval 2016 - UPMC at MediaEval2016 Retrieving Diverse Social Images Task
MediaEval 2016 - UPMC at MediaEval2016 Retrieving Diverse Social Images Taskmultimediaeval
 
Chapter 16-spreadsheet1 questions and answer
Chapter 16-spreadsheet1  questions and answerChapter 16-spreadsheet1  questions and answer
Chapter 16-spreadsheet1 questions and answerRaajTech
 
A study of the decathlon dataset a student20 february 2
A study of the decathlon dataset a student20 february 2A study of the decathlon dataset a student20 february 2
A study of the decathlon dataset a student20 february 2makdul
 
The Lazy Traveling Salesman Memory Management for Large-Scale Link Discovery
The Lazy Traveling Salesman Memory Management for Large-Scale Link DiscoveryThe Lazy Traveling Salesman Memory Management for Large-Scale Link Discovery
The Lazy Traveling Salesman Memory Management for Large-Scale Link DiscoveryHolistic Benchmarking of Big Linked Data
 
Lecture # 12 measures of profitability ii
Lecture # 12 measures of profitability iiLecture # 12 measures of profitability ii
Lecture # 12 measures of profitability iiBich Lien Pham
 
Programming the Interaction Space Effectively with ReSpecTX
Programming the Interaction Space Effectively with ReSpecTXProgramming the Interaction Space Effectively with ReSpecTX
Programming the Interaction Space Effectively with ReSpecTXStefano Mariani
 
True Global Optimality of the Pressure Vessel Design Problem: A Benchmark for...
True Global Optimality of the Pressure Vessel Design Problem: A Benchmark for...True Global Optimality of the Pressure Vessel Design Problem: A Benchmark for...
True Global Optimality of the Pressure Vessel Design Problem: A Benchmark for...Xin-She Yang
 
9. benefit cost analysis
9. benefit cost analysis9. benefit cost analysis
9. benefit cost analysisMohsin Siddique
 
9 150316005537-conversion-gate01
9 150316005537-conversion-gate019 150316005537-conversion-gate01
9 150316005537-conversion-gate01abidiqbal55
 
Elaboración de las alternativas mutuamente excluyentes
Elaboración de las alternativas mutuamente excluyentesElaboración de las alternativas mutuamente excluyentes
Elaboración de las alternativas mutuamente excluyentesJOe Torres Palomino
 
A Framework for Analyzing the Impact of Business Cycles on Endogenous Growth
A Framework for Analyzing the Impact of Business Cycles on Endogenous GrowthA Framework for Analyzing the Impact of Business Cycles on Endogenous Growth
A Framework for Analyzing the Impact of Business Cycles on Endogenous GrowthGRAPE
 

Similar to Bagging-Clustering Methods to Forecast Time Series (20)

Economics of Climate Change Adaptation Training - Session 9
Economics of Climate Change Adaptation Training - Session 9Economics of Climate Change Adaptation Training - Session 9
Economics of Climate Change Adaptation Training - Session 9
 
Link Discovery Tutorial Part I: Efficiency
Link Discovery Tutorial Part I: EfficiencyLink Discovery Tutorial Part I: Efficiency
Link Discovery Tutorial Part I: Efficiency
 
Cari2020 Parallel Hybridization for SAT: An Efficient Combination of Search S...
Cari2020 Parallel Hybridization for SAT: An Efficient Combination of Search S...Cari2020 Parallel Hybridization for SAT: An Efficient Combination of Search S...
Cari2020 Parallel Hybridization for SAT: An Efficient Combination of Search S...
 
ForecastCombinations package
ForecastCombinations packageForecastCombinations package
ForecastCombinations package
 
Inf-Sup Stable Displacement-Pressure Combinations for Isogeometric Analysis o...
Inf-Sup Stable Displacement-Pressure Combinations for Isogeometric Analysis o...Inf-Sup Stable Displacement-Pressure Combinations for Isogeometric Analysis o...
Inf-Sup Stable Displacement-Pressure Combinations for Isogeometric Analysis o...
 
QMC Opening Workshop, Support Points - a new way to compact distributions, wi...
QMC Opening Workshop, Support Points - a new way to compact distributions, wi...QMC Opening Workshop, Support Points - a new way to compact distributions, wi...
QMC Opening Workshop, Support Points - a new way to compact distributions, wi...
 
GDRR Opening Workshop - Modeling Approaches for High-Frequency Financial Time...
GDRR Opening Workshop - Modeling Approaches for High-Frequency Financial Time...GDRR Opening Workshop - Modeling Approaches for High-Frequency Financial Time...
GDRR Opening Workshop - Modeling Approaches for High-Frequency Financial Time...
 
RDataMining slides-regression-classification
RDataMining slides-regression-classificationRDataMining slides-regression-classification
RDataMining slides-regression-classification
 
MediaEval 2016 - UPMC at MediaEval2016 Retrieving Diverse Social Images Task
MediaEval 2016 - UPMC at MediaEval2016 Retrieving Diverse Social Images TaskMediaEval 2016 - UPMC at MediaEval2016 Retrieving Diverse Social Images Task
MediaEval 2016 - UPMC at MediaEval2016 Retrieving Diverse Social Images Task
 
Chapter 16-spreadsheet1 questions and answer
Chapter 16-spreadsheet1  questions and answerChapter 16-spreadsheet1  questions and answer
Chapter 16-spreadsheet1 questions and answer
 
A study of the decathlon dataset a student20 february 2
A study of the decathlon dataset a student20 february 2A study of the decathlon dataset a student20 february 2
A study of the decathlon dataset a student20 february 2
 
The Lazy Traveling Salesman Memory Management for Large-Scale Link Discovery
The Lazy Traveling Salesman Memory Management for Large-Scale Link DiscoveryThe Lazy Traveling Salesman Memory Management for Large-Scale Link Discovery
The Lazy Traveling Salesman Memory Management for Large-Scale Link Discovery
 
Lecture # 12 measures of profitability ii
Lecture # 12 measures of profitability iiLecture # 12 measures of profitability ii
Lecture # 12 measures of profitability ii
 
Programming the Interaction Space Effectively with ReSpecTX
Programming the Interaction Space Effectively with ReSpecTXProgramming the Interaction Space Effectively with ReSpecTX
Programming the Interaction Space Effectively with ReSpecTX
 
True Global Optimality of the Pressure Vessel Design Problem: A Benchmark for...
True Global Optimality of the Pressure Vessel Design Problem: A Benchmark for...True Global Optimality of the Pressure Vessel Design Problem: A Benchmark for...
True Global Optimality of the Pressure Vessel Design Problem: A Benchmark for...
 
9. benefit cost analysis
9. benefit cost analysis9. benefit cost analysis
9. benefit cost analysis
 
9 150316005537-conversion-gate01
9 150316005537-conversion-gate019 150316005537-conversion-gate01
9 150316005537-conversion-gate01
 
Elaboración de las alternativas mutuamente excluyentes
Elaboración de las alternativas mutuamente excluyentesElaboración de las alternativas mutuamente excluyentes
Elaboración de las alternativas mutuamente excluyentes
 
A Framework for Analyzing the Impact of Business Cycles on Endogenous Growth
A Framework for Analyzing the Impact of Business Cycles on Endogenous GrowthA Framework for Analyzing the Impact of Business Cycles on Endogenous Growth
A Framework for Analyzing the Impact of Business Cycles on Endogenous Growth
 
Ppt econ 9e_one_click_ch08
Ppt econ 9e_one_click_ch08Ppt econ 9e_one_click_ch08
Ppt econ 9e_one_click_ch08
 

Recently uploaded

Data Science Project: Advancements in Fetal Health Classification
Data Science Project: Advancements in Fetal Health ClassificationData Science Project: Advancements in Fetal Health Classification
Data Science Project: Advancements in Fetal Health ClassificationBoston Institute of Analytics
 
VIP High Class Call Girls Bikaner Anushka 8250192130 Independent Escort Servi...
VIP High Class Call Girls Bikaner Anushka 8250192130 Independent Escort Servi...VIP High Class Call Girls Bikaner Anushka 8250192130 Independent Escort Servi...
VIP High Class Call Girls Bikaner Anushka 8250192130 Independent Escort Servi...Suhani Kapoor
 
B2 Creative Industry Response Evaluation.docx
B2 Creative Industry Response Evaluation.docxB2 Creative Industry Response Evaluation.docx
B2 Creative Industry Response Evaluation.docxStephen266013
 
Full night 🥵 Call Girls Delhi New Friends Colony {9711199171} Sanya Reddy ✌️o...
Full night 🥵 Call Girls Delhi New Friends Colony {9711199171} Sanya Reddy ✌️o...Full night 🥵 Call Girls Delhi New Friends Colony {9711199171} Sanya Reddy ✌️o...
Full night 🥵 Call Girls Delhi New Friends Colony {9711199171} Sanya Reddy ✌️o...shivangimorya083
 
Schema on read is obsolete. Welcome metaprogramming..pdf
Schema on read is obsolete. Welcome metaprogramming..pdfSchema on read is obsolete. Welcome metaprogramming..pdf
Schema on read is obsolete. Welcome metaprogramming..pdfLars Albertsson
 
Building on a FAIRly Strong Foundation to Connect Academic Research to Transl...
Building on a FAIRly Strong Foundation to Connect Academic Research to Transl...Building on a FAIRly Strong Foundation to Connect Academic Research to Transl...
Building on a FAIRly Strong Foundation to Connect Academic Research to Transl...Jack DiGiovanna
 
Call Girls In Noida City Center Metro 24/7✡️9711147426✡️ Escorts Service
Call Girls In Noida City Center Metro 24/7✡️9711147426✡️ Escorts ServiceCall Girls In Noida City Center Metro 24/7✡️9711147426✡️ Escorts Service
Call Girls In Noida City Center Metro 24/7✡️9711147426✡️ Escorts Servicejennyeacort
 
VIP High Profile Call Girls Amravati Aarushi 8250192130 Independent Escort Se...
VIP High Profile Call Girls Amravati Aarushi 8250192130 Independent Escort Se...VIP High Profile Call Girls Amravati Aarushi 8250192130 Independent Escort Se...
VIP High Profile Call Girls Amravati Aarushi 8250192130 Independent Escort Se...Suhani Kapoor
 
RA-11058_IRR-COMPRESS Do 198 series of 1998
RA-11058_IRR-COMPRESS Do 198 series of 1998RA-11058_IRR-COMPRESS Do 198 series of 1998
RA-11058_IRR-COMPRESS Do 198 series of 1998YohFuh
 
EMERCE - 2024 - AMSTERDAM - CROSS-PLATFORM TRACKING WITH GOOGLE ANALYTICS.pptx
EMERCE - 2024 - AMSTERDAM - CROSS-PLATFORM  TRACKING WITH GOOGLE ANALYTICS.pptxEMERCE - 2024 - AMSTERDAM - CROSS-PLATFORM  TRACKING WITH GOOGLE ANALYTICS.pptx
EMERCE - 2024 - AMSTERDAM - CROSS-PLATFORM TRACKING WITH GOOGLE ANALYTICS.pptxthyngster
 
Indian Call Girls in Abu Dhabi O5286O24O8 Call Girls in Abu Dhabi By Independ...
Indian Call Girls in Abu Dhabi O5286O24O8 Call Girls in Abu Dhabi By Independ...Indian Call Girls in Abu Dhabi O5286O24O8 Call Girls in Abu Dhabi By Independ...
Indian Call Girls in Abu Dhabi O5286O24O8 Call Girls in Abu Dhabi By Independ...dajasot375
 
Beautiful Sapna Vip Call Girls Hauz Khas 9711199012 Call /Whatsapps
Beautiful Sapna Vip  Call Girls Hauz Khas 9711199012 Call /WhatsappsBeautiful Sapna Vip  Call Girls Hauz Khas 9711199012 Call /Whatsapps
Beautiful Sapna Vip Call Girls Hauz Khas 9711199012 Call /Whatsappssapnasaifi408
 
Spark3's new memory model/management
Spark3's new memory model/managementSpark3's new memory model/management
Spark3's new memory model/managementakshesh doshi
 
100-Concepts-of-AI by Anupama Kate .pptx
100-Concepts-of-AI by Anupama Kate .pptx100-Concepts-of-AI by Anupama Kate .pptx
100-Concepts-of-AI by Anupama Kate .pptxAnupama Kate
 
(PARI) Call Girls Wanowrie ( 7001035870 ) HI-Fi Pune Escorts Service
(PARI) Call Girls Wanowrie ( 7001035870 ) HI-Fi Pune Escorts Service(PARI) Call Girls Wanowrie ( 7001035870 ) HI-Fi Pune Escorts Service
(PARI) Call Girls Wanowrie ( 7001035870 ) HI-Fi Pune Escorts Serviceranjana rawat
 
Data Warehouse , Data Cube Computation
Data Warehouse   , Data Cube ComputationData Warehouse   , Data Cube Computation
Data Warehouse , Data Cube Computationsit20ad004
 

Recently uploaded (20)

Data Science Project: Advancements in Fetal Health Classification
Data Science Project: Advancements in Fetal Health ClassificationData Science Project: Advancements in Fetal Health Classification
Data Science Project: Advancements in Fetal Health Classification
 
VIP High Class Call Girls Bikaner Anushka 8250192130 Independent Escort Servi...
VIP High Class Call Girls Bikaner Anushka 8250192130 Independent Escort Servi...VIP High Class Call Girls Bikaner Anushka 8250192130 Independent Escort Servi...
VIP High Class Call Girls Bikaner Anushka 8250192130 Independent Escort Servi...
 
B2 Creative Industry Response Evaluation.docx
B2 Creative Industry Response Evaluation.docxB2 Creative Industry Response Evaluation.docx
B2 Creative Industry Response Evaluation.docx
 
Full night 🥵 Call Girls Delhi New Friends Colony {9711199171} Sanya Reddy ✌️o...
Full night 🥵 Call Girls Delhi New Friends Colony {9711199171} Sanya Reddy ✌️o...Full night 🥵 Call Girls Delhi New Friends Colony {9711199171} Sanya Reddy ✌️o...
Full night 🥵 Call Girls Delhi New Friends Colony {9711199171} Sanya Reddy ✌️o...
 
Schema on read is obsolete. Welcome metaprogramming..pdf
Schema on read is obsolete. Welcome metaprogramming..pdfSchema on read is obsolete. Welcome metaprogramming..pdf
Schema on read is obsolete. Welcome metaprogramming..pdf
 
Building on a FAIRly Strong Foundation to Connect Academic Research to Transl...
Building on a FAIRly Strong Foundation to Connect Academic Research to Transl...Building on a FAIRly Strong Foundation to Connect Academic Research to Transl...
Building on a FAIRly Strong Foundation to Connect Academic Research to Transl...
 
Deep Generative Learning for All - The Gen AI Hype (Spring 2024)
Deep Generative Learning for All - The Gen AI Hype (Spring 2024)Deep Generative Learning for All - The Gen AI Hype (Spring 2024)
Deep Generative Learning for All - The Gen AI Hype (Spring 2024)
 
Call Girls In Noida City Center Metro 24/7✡️9711147426✡️ Escorts Service
Call Girls In Noida City Center Metro 24/7✡️9711147426✡️ Escorts ServiceCall Girls In Noida City Center Metro 24/7✡️9711147426✡️ Escorts Service
Call Girls In Noida City Center Metro 24/7✡️9711147426✡️ Escorts Service
 
VIP High Profile Call Girls Amravati Aarushi 8250192130 Independent Escort Se...
VIP High Profile Call Girls Amravati Aarushi 8250192130 Independent Escort Se...VIP High Profile Call Girls Amravati Aarushi 8250192130 Independent Escort Se...
VIP High Profile Call Girls Amravati Aarushi 8250192130 Independent Escort Se...
 
RA-11058_IRR-COMPRESS Do 198 series of 1998
RA-11058_IRR-COMPRESS Do 198 series of 1998RA-11058_IRR-COMPRESS Do 198 series of 1998
RA-11058_IRR-COMPRESS Do 198 series of 1998
 
EMERCE - 2024 - AMSTERDAM - CROSS-PLATFORM TRACKING WITH GOOGLE ANALYTICS.pptx
EMERCE - 2024 - AMSTERDAM - CROSS-PLATFORM  TRACKING WITH GOOGLE ANALYTICS.pptxEMERCE - 2024 - AMSTERDAM - CROSS-PLATFORM  TRACKING WITH GOOGLE ANALYTICS.pptx
EMERCE - 2024 - AMSTERDAM - CROSS-PLATFORM TRACKING WITH GOOGLE ANALYTICS.pptx
 
Indian Call Girls in Abu Dhabi O5286O24O8 Call Girls in Abu Dhabi By Independ...
Indian Call Girls in Abu Dhabi O5286O24O8 Call Girls in Abu Dhabi By Independ...Indian Call Girls in Abu Dhabi O5286O24O8 Call Girls in Abu Dhabi By Independ...
Indian Call Girls in Abu Dhabi O5286O24O8 Call Girls in Abu Dhabi By Independ...
 
Russian Call Girls Dwarka Sector 15 💓 Delhi 9999965857 @Sabina Modi VVIP MODE...
Russian Call Girls Dwarka Sector 15 💓 Delhi 9999965857 @Sabina Modi VVIP MODE...Russian Call Girls Dwarka Sector 15 💓 Delhi 9999965857 @Sabina Modi VVIP MODE...
Russian Call Girls Dwarka Sector 15 💓 Delhi 9999965857 @Sabina Modi VVIP MODE...
 
Beautiful Sapna Vip Call Girls Hauz Khas 9711199012 Call /Whatsapps
Beautiful Sapna Vip  Call Girls Hauz Khas 9711199012 Call /WhatsappsBeautiful Sapna Vip  Call Girls Hauz Khas 9711199012 Call /Whatsapps
Beautiful Sapna Vip Call Girls Hauz Khas 9711199012 Call /Whatsapps
 
E-Commerce Order PredictionShraddha Kamble.pptx
E-Commerce Order PredictionShraddha Kamble.pptxE-Commerce Order PredictionShraddha Kamble.pptx
E-Commerce Order PredictionShraddha Kamble.pptx
 
Spark3's new memory model/management
Spark3's new memory model/managementSpark3's new memory model/management
Spark3's new memory model/management
 
100-Concepts-of-AI by Anupama Kate .pptx
100-Concepts-of-AI by Anupama Kate .pptx100-Concepts-of-AI by Anupama Kate .pptx
100-Concepts-of-AI by Anupama Kate .pptx
 
(PARI) Call Girls Wanowrie ( 7001035870 ) HI-Fi Pune Escorts Service
(PARI) Call Girls Wanowrie ( 7001035870 ) HI-Fi Pune Escorts Service(PARI) Call Girls Wanowrie ( 7001035870 ) HI-Fi Pune Escorts Service
(PARI) Call Girls Wanowrie ( 7001035870 ) HI-Fi Pune Escorts Service
 
꧁❤ Aerocity Call Girls Service Aerocity Delhi ❤꧂ 9999965857 ☎️ Hard And Sexy ...
꧁❤ Aerocity Call Girls Service Aerocity Delhi ❤꧂ 9999965857 ☎️ Hard And Sexy ...꧁❤ Aerocity Call Girls Service Aerocity Delhi ❤꧂ 9999965857 ☎️ Hard And Sexy ...
꧁❤ Aerocity Call Girls Service Aerocity Delhi ❤꧂ 9999965857 ☎️ Hard And Sexy ...
 
Data Warehouse , Data Cube Computation
Data Warehouse   , Data Cube ComputationData Warehouse   , Data Cube Computation
Data Warehouse , Data Cube Computation
 

Bagging-Clustering Methods to Forecast Time Series

  • 1. Introduction Proposed Approach Empirical Results Concluding Remarks References Bagging-Clustering Methods to Forecast Time Series Tiago Dantas Fernando Cyrino Pontifical Catholic University of Rio de Janeiro,Brazil ISF 2017, Cairns-Australia, June 25-28 Tiago Dantas, Fernando Cyrino ISF 2017
  • 2. Introduction Proposed Approach Empirical Results Concluding Remarks References Introduction Cordeiro and Neves (2009) and Bergmeir et al. (2016), have proposed new ways to generate forecasts using a very popular Machine Learning technique, called Bagging (Bootstrap Aggregating), proposed by Breiman (1996), in combination with Exponential Smoothing methods to improve forecast accuracy The main idea is to use Bootstrap to generate an ensemble of forecasts that is combined into one single output Tiago Dantas, Fernando Cyrino ISF 2017
  • 3. Introduction Proposed Approach Empirical Results Concluding Remarks References Bagged.BLD.MBB.ETS - Bergmeir et al.(2016) Best model using Bagging and Exponential Smoothing methods Box-Cox transformation (stabilizes variance) STL decomposition (decompose time series into seasonal, trend and remainder) Moving Block Bootstrap (generate new versions of the remainder) Forecasts are obtained selecting one ETS model for each time series (original and bootstrap versions) Final forecast is obtained using the median (other possibilities are mean, trimmed mean, among others) Tiago Dantas, Fernando Cyrino ISF 2017
  • 4. Introduction Proposed Approach Empirical Results Concluding Remarks References Bagged.BLD.MBB.ETS - Bergmeir et al.(2016) Tiago Dantas, Fernando Cyrino ISF 2017
  • 5. Introduction Proposed Approach Empirical Results Concluding Remarks References Why Bagging tends to work The Mean Squared Forecast Error (MSFE) can be decomposed into three terms: MSFE = Var(yt+1|t) + bias(ˆyt+1|t)2 + Var(ˆyt+1|t) Tiago Dantas, Fernando Cyrino ISF 2017
  • 6. Introduction Proposed Approach Empirical Results Concluding Remarks References Why Bagging tends to work The average forecast over the Bootstrap samples can be written as: ˜yt+1|t = 1 B B i=1 ˆy∗ (i)t+1|t where the tilde indicates Bagging forecast and B is the total number of Bootstrap samples. Tiago Dantas, Fernando Cyrino ISF 2017
  • 7. Introduction Proposed Approach Empirical Results Concluding Remarks References Why Bagging tends to work bias(˜yt+1|t) = 1 B B i=1 bias(ˆy∗ (i)t+1|t) Note that unbiased Bootstrapped versions lead to a relatively unbiased ensemble Tiago Dantas, Fernando Cyrino ISF 2017
  • 8. Introduction Proposed Approach Empirical Results Concluding Remarks References Why Bagging tends to work Var(˜yt+1|t) = 1 B2 B i=1 Var(ˆy∗ (i)t+1|t) + 1 B2 i=i Cov[ˆy∗ (i)t+1|t, ˆy∗ (i )t+1|t] Variance tends to be reduced Tiago Dantas, Fernando Cyrino ISF 2017
  • 9. Introduction Proposed Approach Empirical Results Concluding Remarks References Why Bagging tends to work When applying Bagging and Exponential Smoothing what happens is variance reduction If the variances are approximately equal and there is no correlation: Var(˜yt+1|t) ≈ 1 B Var(ˆy∗ (1)t+1|t) Tiago Dantas, Fernando Cyrino ISF 2017
  • 10. Introduction Proposed Approach Empirical Results Concluding Remarks References Why Bagging tends to work Reducing covariance seems to be a good idea Var(˜yt+1|t) = 1 B2 B i=1 Var(ˆy∗ (i)t+1|t) + 1 B2 i=i Cov[ˆy∗ (i)t+1|t, ˆy∗ (i )t+1|t] The proposed approach tries to use this idea in order to reduce forecast error Tiago Dantas, Fernando Cyrino ISF 2017
  • 11. Introduction Proposed Approach Empirical Results Concluding Remarks References Proposed Approach The proposed approach can be divided in two parts: 1. Generation of Bootstrapped series - Algorithm developed by Bergmeir et al. (2016) 2. The procedure to forecast and aggregate the series - New developments Tiago Dantas, Fernando Cyrino ISF 2017
  • 12. Introduction Proposed Approach Empirical Results Concluding Remarks References Proposed Approach Generating bootstrapped series Bootstrapped series are generated in the same way as Bagged.BLD.MBB.ETS Tiago Dantas, Fernando Cyrino ISF 2017
  • 13. Introduction Proposed Approach Empirical Results Concluding Remarks References Algorithm 1 Generating bootstrapped series 1: procedure BOOTSTRAP(ts,num.boot) 2: λ ← BoxCox.lambda(ts,min=0,max=1) 3: ts.bc← BoxCox(ts,λ = 1) 4: if ts is seasonal then 5: [trend seasonal remainder] ← stl(ts.bc) 6: else 7: seasonal ← 0 8: [trend,remainder] ← loess(ts.bc) 9: end if 10: recon.series[1] ← ts 11: for i in 2 to num.boot do 12: boot.sample[i] ← MBB(remainder) 13: recon.series.bc[i] ← trend + seasonal +boot.sample[i] 14: recon.series[i] ← InvBoxCox(recon.series.bc[i],λ) 15: end for 16: return recon.series 17: end procedure Tiago Dantas, Fernando Cyrino ISF 2017
  • 14. Introduction Proposed Approach Empirical Results Concluding Remarks References Proposed Approach Tiago Dantas, Fernando Cyrino ISF 2017
  • 15. Introduction Proposed Approach Empirical Results Concluding Remarks References Proposed Approach The procedure to forecast and aggregate the series the Proposed approach and Bagged.BLD.MBB.ETS differ in the way the ensemble is constructed Bagged.BLD.MBB.ETS consider all of the Bootstrapped versions to make forecasts The proposed approach considers a less correlated group of time series to make forecasts Tiago Dantas, Fernando Cyrino ISF 2017
  • 16. Introduction Proposed Approach Empirical Results Concluding Remarks References Proposed Approach To create a less correlated ensemble, the proposal is to generate clusters from the Bootstrapped versions Cluster procedures maximize similarity within the group and minimize it between them The expectation is that selecting series from different clusters would lead to an ensemble less correlated and, therefore, less correlated forecasts to be aggregated Tiago Dantas, Fernando Cyrino ISF 2017
  • 17. Introduction Proposed Approach Empirical Results Concluding Remarks References Proposed Approach Partitioning Around Medoids Algorithm (PAM) and euclidean distance are used to create the clusters (fast algorithm and less sensible to outliers) The number of cluster can be defined using cross-validation or any other method (e.g. Silhouette Information) Tiago Dantas, Fernando Cyrino ISF 2017
  • 18. Introduction Proposed Approach Empirical Results Concluding Remarks References Proposed Approach Tiago Dantas, Fernando Cyrino ISF 2017
  • 19. Introduction Proposed Approach Empirical Results Concluding Remarks References Proposed Approach Tiago Dantas, Fernando Cyrino ISF 2017
  • 20. Introduction Proposed Approach Empirical Results Concluding Remarks References Proposed Approach The user has to define the total number of series to aggregate (100 was the choice made by Bergmeir and colleagues) The number of series to be selected in each cluster is defined as proportionally equal to the size of each cluster Example: B =1000 and the total number of series to be aggregated is 100. If cluster 1 has 20 series, therefore 2 series would be selected (10%). But, Which 2 time series? Tiago Dantas, Fernando Cyrino ISF 2017
  • 21. Introduction Proposed Approach Empirical Results Concluding Remarks References A validation set is defined The time series with best results (sMAPE) in the validation set are the ones selected in each cluster The final forecast is obtained taking the median of the forecasts (other possibilities are the mean, trimmed mean) Tiago Dantas, Fernando Cyrino ISF 2017
  • 22. Introduction Proposed Approach Empirical Results Concluding Remarks References Proposed Approach Tiago Dantas, Fernando Cyrino ISF 2017
  • 23. Introduction Proposed Approach Empirical Results Concluding Remarks References MSE decomposition Bagged.BLD.MBB.ETS (black) and the Proposed Approach (blue) - Forecast (up to 18 steps ahead) - Time series 1083 - M3 competition Tiago Dantas, Fernando Cyrino ISF 2017
  • 24. Introduction Proposed Approach Empirical Results Concluding Remarks References Empirical Results The proposed approach was validated on public available time series from the M3 competition (1428 monthly, 756 quarterly and 645 yearly time series) The experiment was conducted using R and majorly the forecast package (version 8.0) The results for Bagged.BLD.MBB.ETS were obtained using baggedETS() Tiago Dantas, Fernando Cyrino ISF 2017
  • 25. Introduction Proposed Approach Empirical Results Concluding Remarks References Monthly data Tiago Dantas, Fernando Cyrino ISF 2017
  • 26. Introduction Proposed Approach Empirical Results Concluding Remarks References Methods Rank sMAPE Mean sMAPE Median sMAPE Proposed Approach 11.15 13.62 8.74 Bagged.BLD.MBB.ETS 11.30 13.65 8.85 THETA 11.53 13.89 8.92 ForecastPro 11.56 13.90 8.81 COMB S-H-D 12.54 14.47 9.37 ForcX 12.76 14.47 9.21 HOLT 12.78 15.79 9.28 WINTER 13.06 15.93 9.30 RBF 13.27 14.76 9.21 AAM1 13.48 15.67 9.67 DAMPEN 13.48 14.58 9.44 AutoBox2 13.60 15.73 9.28 B-J auto 13.69 14.80 9.32 AutoBox1 13.69 15.81 9.27 SMARTFCS 13.82 15.01 9.52 Flors-Pearc2 13.84 15.19 9.61 AAM2 13.85 15.94 9.62 Auto-ANN 13.91 15.03 9.62 PP-Autocast 14.13 15.33 9.90 ARARMA 14.20 15.83 9.80 AutoBox3 14.21 16.59 9.40 Flors-Pearc1 14.54 15.99 9.96 THETAsm 14.58 15.38 9.65 ROBUST-Trend 14.79 18.93 9.73 SINGLE 15.22 15.30 10.03 NAIVE2 16.04 16.89 10.12 Tiago Dantas, Fernando Cyrino ISF 2017
  • 27. Introduction Proposed Approach Empirical Results Concluding Remarks References Quarterly data Tiago Dantas, Fernando Cyrino ISF 2017
  • 28. Introduction Proposed Approach Empirical Results Concluding Remarks References Methods Rank sMAPE Mean sMAPE Median sMAPE THETA 11.39 8.96 5.37 COMB S-H-D 12.18 9.22 5.32 ROBUST-Trend 12.44 9.79 5.00 DAMPEN 12.66 9.36 5.59 PP-Autocast 12.81 9.39 5.26 ForcX 12.86 9.54 5.62 Bagged.BLD.MBB.ETS 12.96 9.80 5.81 B-J auto 13.16 10.26 5.69 ForecastPro 13.20 9.82 5.84 Proposed Approach 13.24 9.89 5.82 HOLT 13.27 10.94 5.71 RBF 13.30 9.57 5.67 AutoBox2 13.38 10.00 5.59 WINTER 13.38 10.84 5.71 Flors-Pearc1 13.48 9.95 5.61 ARARMA 13.49 10.19 6.11 Auto-ANN 13.89 10.20 6.28 THETAsm 14.18 9.82 5.65 AAM1 14.25 10.16 6.36 SMARTFCS 14.27 10.15 5.71 Flors-Pearc2 14.30 10.43 6.22 AutoBox3 14.38 11.19 6.15 AAM2 14.41 10.26 6.44 SINGLE 14.66 9.72 6.18 AutoBox1 14.69 10.96 6.14 NAIVE2 14.80 9.95 6.18 Tiago Dantas, Fernando Cyrino ISF 2017
  • 29. Introduction Proposed Approach Empirical Results Concluding Remarks References Yearly data Tiago Dantas, Fernando Cyrino ISF 2017
  • 30. Introduction Proposed Approach Empirical Results Concluding Remarks References Methods Rank sMAPE Mean sMAPE Median sMAPE ForcX 11.15 16.48 11.34 RBF 11.46 16.42 10.74 AutoBox2 11.48 16.59 11.31 Flors-Pearc1 11.57 17.21 10.72 THETA 11.58 16.97 11.25 ForecastPro 11.73 17.27 11.05 ROBUST-Trend 11.81 17.03 11.30 PP-Autocast 11.87 17.13 10.83 Bagged.BLD.MBB.ETS 11.89 17.40 11.20 DAMPEN 11.92 17.36 10.95 COMB S-H-D 11.99 17.07 11.68 Proposed Approach 12.21 17.56 11.42 SMARTFCS 12.38 17.71 11.83 HOLT 12.64 20.02 11.77 WINTER 12.64 20.02 11.77 Flors-Pearc2 13.02 17.84 12.55 ARARMA 13.03 18.36 11.35 B-J auto 13.04 17.73 11.70 Auto-ANN 13.32 18.57 13.08 AutoBox3 13.52 20.88 12.89 THETAsm 13.55 17.92 12.21 AutoBox1 13.82 21.59 12.75 NAIVE2 14.16 17.88 12.37 SINGLE 14.21 17.82 12.44 Tiago Dantas, Fernando Cyrino ISF 2017
  • 31. Introduction Proposed Approach Empirical Results Concluding Remarks References Concluding Remarks This proposed approach make forecasts combining Bagging, Exponential Smoothing and Cluster methods The empirical results demonstrate the approach was capable of generating highly accurate forecasts for monthly time series The so far, not explicitly addressed, covariance effect on the combination of Bagging and Exponential Smoothing, is probably resposible for reducing the forecast error The method doesn’t seem to work well on short time series (such as the case of yearly and quarterly time series from the M3 competition) Tiago Dantas, Fernando Cyrino ISF 2017
  • 32. Introduction Proposed Approach Empirical Results Concluding Remarks References Concluding Remarks Future work Other weighting schemes for selected series Other decomposition and forecasting methods Tiago Dantas, Fernando Cyrino ISF 2017
  • 33. Introduction Proposed Approach Empirical Results Concluding Remarks References Thank You! t.mendesdantas@gmail.com Tiago Dantas, Fernando Cyrino ISF 2017
  • 34. Introduction Proposed Approach Empirical Results Concluding Remarks References References Bergmeir, C., Hyndman, R.J., Benitez,J.M. (2016). Bagging exponential smoothing methods using STL decomposition and Box-Cox transformation.International Journal of Forecasting, 32(2), 303-312. Breiman, L. (1996). Bagging Predictors.Machine Learning, 24, 123-140. Cordeiro, C.,Neves,M. (2009). Forecasting time series with BOOT.EXPOS procedure. REVSTAT-Statistical Journal, 7(2), 135-149. Liao, T.W. (2005). Clustering of time series data - a survey.Journal of Pattern Recognition, 38(11), 1857-1874. Monteiro,P.,Vilar, J.A. (2016). TSclust: An R Package for TimeTiago Dantas, Fernando Cyrino ISF 2017