Constrained Support Vector Quantile Regression
for Conditional Quantile Estimation
By Kostas Hatalis
hatalis@gmail.com
Dept. of Electrical & Computer Engineering
Lehigh University, Bethlehem, PA
2016
Kostas Hatalis Recurrent Neural Network 2016 1 / 19
Challenges in nonparametric probabilistic forecasting
There are several challenges to consider when developing a
forecasting method:
The quantile cross-over problem
Multi-step forecasting
Rolling forecasting
Handling multidimensional features
Support Vector Machines
To address these challenges I developed a support vector machine (SVM)
formulation for forecasting:

\min_{w} \; \lambda \lVert w \rVert^2 + \frac{1}{N} \sum_{i=1}^{N} L(y_i, f(x_i, w)) \qquad (1)
where L(y_i, f(x_i, w)) is a loss function. Why SVM?
A good method to study before moving on to neural networks
Robust to outliers
Prevents over-fitting through regularization
Handles multidimensional features
Supports kernels for nonlinear modeling
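Objective (1) can be evaluated directly as a regularized empirical risk. A minimal numpy sketch (the function name, toy data, and the squared loss stand-in for L are mine, not from the slides):

```python
import numpy as np

def regularized_risk(w, X, y, loss, lam):
    """Objective (1): lam * ||w||^2 + mean loss over the training set."""
    preds = X @ w                        # linear model f(x_i, w) = w . x_i
    data_term = np.mean(loss(y, preds))
    return lam * np.dot(w, w) + data_term

# Toy noiseless data: y is an exact linear function of X.
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 3))
w_true = np.array([1.0, -2.0, 0.5])
y = X @ w_true
sq = lambda y, f: (y - f) ** 2           # squared loss as a stand-in for L
risk_at_zero = regularized_risk(np.zeros(3), X, y, sq, lam=0.1)
```

With the true weights and no regularization the risk is zero, which is a quick sanity check on the implementation.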
Support Vector Machines
Objective Function
An SVM classifier amounts to minimizing the hinge loss function:

\min_{w,b} \; \frac{1}{n} \sum_{i=1}^{n} \max\left(0,\; 1 - y_i (w \cdot x_i + b)\right) + \lambda \lVert w \rVert^2 \qquad (2)

where the parameter λ determines the tradeoff between increasing the
margin size and ensuring that each x_i lies on the correct side of the margin.
We can rewrite the optimization problem as a differentiable objective
function by introducing slack variables ζ_i:

\min_{w,b,\zeta} \; \frac{1}{n} \sum_{i=1}^{n} \zeta_i + \lambda \lVert w \rVert^2 \qquad (3)

subject to

y_i (x_i \cdot w + b) \geq 1 - \zeta_i, \quad \zeta_i \geq 0, \quad \text{for all } i.
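Objective (2) can also be minimized directly with full-batch subgradient descent. A minimal numpy sketch under that assumption (function name, step sizes, and toy data are mine):

```python
import numpy as np

def train_linear_svm(X, y, lam=0.01, lr=0.1, epochs=200):
    """Subgradient descent on (1/n) sum max(0, 1 - y_i(w.x_i + b)) + lam ||w||^2."""
    n, d = X.shape
    w, b = np.zeros(d), 0.0
    for _ in range(epochs):
        margins = y * (X @ w + b)
        active = margins < 1                       # points violating the margin
        grad_w = 2 * lam * w - (y[active, None] * X[active]).sum(axis=0) / n
        grad_b = -y[active].sum() / n
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b

# Separable toy data: the label is the sign of the first feature.
rng = np.random.default_rng(1)
X = rng.normal(size=(100, 2))
y = np.where(X[:, 0] > 0, 1.0, -1.0)
w, b = train_linear_svm(X, y)
acc = np.mean(np.sign(X @ w + b) == y)
```

On this separable toy problem the learned hyperplane should recover the generating rule almost exactly.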
SVQR - Support Vector Regression
Support vector regression:

\min_{w,b} \; \frac{1}{2} \lVert w \rVert^2 + C \sum_{i=1}^{N} (\xi_i^- + \xi_i^+) \qquad (4)

subject to

y_i - f(x_i) \leq \varepsilon + \xi_i^- \quad \forall i
f(x_i) - y_i \leq \varepsilon + \xi_i^+ \quad \forall i
\xi_i^-, \xi_i^+ \geq 0 \quad \forall i

The constant C > 0 determines the trade-off between the flatness of f and
the amount up to which deviations larger than ε are tolerated.
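The loss implied by the constraints in (4) is the ε-insensitive loss: zero inside the ε-tube and linear outside it. A one-function sketch (helper name mine):

```python
import numpy as np

def eps_insensitive(y, f, eps=0.1):
    """epsilon-insensitive loss: zero inside the eps tube, linear outside it."""
    return np.maximum(0.0, np.abs(y - f) - eps)
```

Deviations smaller than ε cost nothing, which is what lets SVR ignore small residuals and keep f flat.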
SVQR - Nonlinear Quantile Regression
Nonlinear Quantile Regression (NQR) projects an input vector x into a
potentially higher-dimensional feature space F using a nonlinear mapping
function φ(·):

Q_y(\tau \mid x) = f_\tau(x) = w_\tau^\top \phi(x)

where Q_y(τ|x) is the τ-th quantile of the distribution of y conditional on
the values of x, and w_τ is a vector of parameters.
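Quantile regression models such as this are trained by minimizing the pinball (quantile) loss, whose minimizer is the τ-quantile. A minimal empirical check of that property (helper name and toy data are mine):

```python
import numpy as np

def pinball(y, q, tau):
    """Pinball loss: penalizes under-prediction with weight tau, over-prediction with 1 - tau."""
    u = y - q
    return np.maximum(tau * u, (tau - 1) * u)

# The tau-quantile minimizes the mean pinball loss; verify on normal samples.
rng = np.random.default_rng(2)
y = rng.normal(size=10000)
grid = np.linspace(-2, 2, 401)
best_q = grid[np.argmin([pinball(y, q, 0.9).mean() for q in grid])]
```

`best_q` should land near the empirical 0.9-quantile of the sample, illustrating why minimizing this loss yields conditional quantile estimates.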
SVQR - Primal
To solve the NQR problem, it can be expressed as a support vector
regression formulation with non-crossing constraints:

\min_{w, \xi^-, \xi^+} \; \sum_{m=1}^{M} \left[ \frac{1}{2} \lVert w_m \rVert^2 + C \sum_{i=1}^{N} \left( \tau_m \xi_{mi}^+ + (1 - \tau_m) \xi_{mi}^- \right) \right]

s.t.

y_i - w_m^\top \phi(x_i) - \xi_{mi}^+ \leq 0, \quad \forall m, \forall i
-y_i + w_m^\top \phi(x_i) - \xi_{mi}^- \leq 0, \quad \forall m, \forall i
\xi_{mi}^-, \xi_{mi}^+ \geq 0, \quad \forall m, \forall i
w_m^\top \phi(x_i) - w_{m+1}^\top \phi(x_i) \leq 0, \quad \forall m, \forall i
SVQR - Dual
The primal is not easy to solve directly, so I form the Lagrangian dual
problem, which is conveniently a quadratic programming problem:

\min_{\alpha^+, \alpha^-, \lambda} \; \sum_{m=1}^{M} \Bigg[ -\frac{1}{2} \sum_{i=1}^{N} \sum_{j=1}^{N} (\alpha_{mi}^+ - \alpha_{mi}^-)(\alpha_{mj}^+ - \alpha_{mj}^-) K(x_i, x_j) + \sum_{i=1}^{N} (\alpha_{mi}^+ - \alpha_{mi}^-) y_i
\quad - \frac{1}{2} \sum_{i=1}^{N} \sum_{j=1}^{N} (\lambda_{mi} - \lambda_{m-1,i})(\lambda_{mj} - \lambda_{m-1,j}) K(x_i, x_j)
\quad + \sum_{i=1}^{N} \sum_{j=1}^{N} (\alpha_{mi}^+ - \alpha_{mi}^-)(\lambda_{mj} - \lambda_{m-1,j}) K(x_i, x_j) \Bigg]

subject to

\lambda_{mi} \geq 0, \quad \forall m, \forall i
\alpha_{mi}^+ \in [0, \tau_m C], \quad \forall m, \forall i
\alpha_{mi}^- \in [0, (1 - \tau_m) C], \quad \forall m, \forall i
SVQR - Quantile Estimation
From this dual formulation, the conditional quantile at τ_m is then given
by

Q_y(\tau_m \mid x) = f_{\tau_m}(x) = \sum_{i=1}^{N} (\alpha_{mi}^+ - \alpha_{mi}^-) K(x, x_i) - \sum_{i=1}^{N} (\lambda_{mi} - \lambda_{m-1,i}) K(x, x_i)
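Given solved dual variables, the quantile estimates above are just kernel expansions. A vectorized sketch (function name is mine, and I assume the boundary convention λ_{0,i} = 0 for the first quantile level, which the slides do not spell out):

```python
import numpy as np

def predict_quantiles(alpha_plus, alpha_minus, lam, K_test):
    """Quantile estimates from the dual variables:
    f_{tau_m}(x) = sum_i (a+_mi - a-_mi) K(x, x_i) - sum_i (lam_mi - lam_{m-1,i}) K(x, x_i).
    alpha_plus, alpha_minus, lam have shape (M, N); K_test has shape (T, N)."""
    M, N = alpha_plus.shape
    lam_prev = np.vstack([np.zeros(N), lam[:-1]])    # lambda_{m-1}, with lambda_0 = 0
    coeffs = (alpha_plus - alpha_minus) - (lam - lam_prev)
    return K_test @ coeffs.T                          # (T, M): column m is the tau_m estimate

# Tiny smoke test: with all lambdas zero the estimate reduces to the first sum.
ap = np.array([[0.2, 0.1], [0.3, 0.0]])
am = np.zeros((2, 2))
lam = np.zeros((2, 2))
out = predict_quantiles(ap, am, lam, np.eye(2))
```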
SVQR - Kernel and Optimization
Given two samples x and x′ represented as feature vectors, the
radial basis function (RBF) kernel is calculated as

K(x, x') = \phi(x)^\top \phi(x') = \exp\left( -\frac{\lVert x - x' \rVert^2}{2\sigma^2} \right)

To quickly solve for conditional quantile estimates, sequential
minimal optimization (SMO) is applied. Previously I tried the interior-point
convex method (common for QP), but it was very slow.
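The RBF Gram matrix can be computed without explicit loops via the squared-distance expansion ‖x − x′‖² = ‖x‖² + ‖x′‖² − 2x·x′ (function name mine):

```python
import numpy as np

def rbf_kernel(X1, X2, sigma=1.0):
    """Gram matrix of K(x, x') = exp(-||x - x'||^2 / (2 sigma^2))."""
    sq_dists = (np.sum(X1**2, axis=1)[:, None]
                + np.sum(X2**2, axis=1)[None, :]
                - 2.0 * X1 @ X2.T)
    # Clamp tiny negative values caused by floating-point cancellation.
    return np.exp(-np.maximum(sq_dists, 0.0) / (2.0 * sigma**2))

X = np.random.default_rng(0).normal(size=(5, 3))
K = rbf_kernel(X, X)   # 5x5 symmetric Gram matrix with unit diagonal
```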
SVQR - Wind Features and Data Selection
Case study: rolling forecasts using GEFCom2014 data.
Testing data: June 2013 to August 2013.
Training data: March 2013 to July 2013.
1 Raw wind speeds at 10m and 100m for U and V directions.
2 Derived wind speeds at 10m and 100m.
3 Derived wind direction at 10m and 100m.
4 Derived wind energy at 10m and 100m.
5 Wind shear.
6 Wind energy difference.
7 Wind direction difference.
All features were normalized between 0 and 1.
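Min-max normalization to [0, 1] is straightforward; the sketch below (helper name mine) scales with statistics from the training set only, a standard precaution in rolling forecasting so test information does not leak into the features:

```python
import numpy as np

def minmax_scale(train, test):
    """Scale each feature to [0, 1] using the training range only."""
    lo, hi = train.min(axis=0), train.max(axis=0)
    span = np.where(hi > lo, hi - lo, 1.0)   # guard against constant columns
    return (train - lo) / span, (test - lo) / span

rng = np.random.default_rng(0)
train = rng.normal(size=(100, 4)) * 10
test = rng.normal(size=(20, 4)) * 10
train_s, test_s = minmax_scale(train, test)
```

Note that test values outside the training range can fall slightly outside [0, 1], which is expected with this scheme.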
SVQR - Benchmarks
Three naive models are commonly used for benchmarking in probabilistic
wind forecasting applications:
Persistence distribution: formed by the most recent observations
such as the past 24 hours of wind power.
Climatology distribution: based on all available past wind power
observations.
Uniform distribution: assumes all wind power values at each time
step occur with equal probability.
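The three benchmarks can be sketched as quantile forecasts for wind power normalized to [0, 1] (function name, 24-hour window default, and toy history are mine):

```python
import numpy as np

def benchmark_quantiles(history, taus, window=24):
    """Naive benchmark quantile forecasts for wind power in [0, 1]:
    persistence uses only the last `window` observations, climatology uses
    the full history, and uniform ignores the data entirely."""
    taus = np.asarray(taus)
    return {
        "persistence": np.quantile(history[-window:], taus),
        "climatology": np.quantile(history, taus),
        "uniform": taus.copy(),   # quantiles of Uniform(0, 1) are the taus themselves
    }

history = np.clip(np.random.default_rng(0).normal(0.5, 0.2, size=1000), 0, 1)
q = benchmark_quantiles(history, taus=[0.1, 0.5, 0.9])
```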
SVQR - Results
Future Work: Deep SVQR
What is Deep SVQR?
There is a lot of research interest in deep kernel learning and neural
network hybrid SVMs.
The aim is to further enhance SVQR performance through better feature selection.
A combination of neural networks for feature learning and SVQR to
forecast quantiles.
Optimization of the neural networks is directly linked to the
optimization of the SVQR objective.
Future Work: Deep SVQR
Smooth Pinball Function
The new objective function to minimize for smooth quantile regression is

\Phi_\alpha(w) = \frac{1}{n} \sum_{i=1}^{n} S_{\tau,\alpha}(y_i - x_i^\top w)

where S_{\tau,\alpha}(u) = \tau u + \alpha \log(1 + e^{-u/\alpha}) is the smooth
approximation of the pinball loss. The gradient vector of the above is

\nabla \Phi_\alpha(w) = \frac{1}{n} \sum_{i=1}^{n} \left( \frac{1}{1 + \exp\left( \frac{y_i - x_i^\top w}{\alpha} \right)} - \tau \right) x_i

For a training iteration m, gradient descent can be applied as

w^{(m)} = w^{(m-1)} - \eta \nabla \Phi_\alpha(w^{(m-1)})

where η is the learning rate.
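A minimal numpy sketch of this gradient descent (function name, hyperparameters, and the intercept-only toy problem are mine). Fitting an intercept alone should recover the τ-quantile of the targets, since that is what the pinball loss minimizes:

```python
import numpy as np

def smooth_pinball_fit(X, y, tau, alpha=0.05, eta=0.1, iters=2000):
    """Gradient descent on Phi_a(w) = (1/n) sum S_{tau,alpha}(y_i - x_i . w)."""
    n, d = X.shape
    w = np.zeros(d)
    for _ in range(iters):
        u = y - X @ w
        # Gradient from the slide: (1/(1 + exp(u/alpha)) - tau) x_i, averaged.
        grad = ((1.0 / (1.0 + np.exp(u / alpha)) - tau)[:, None] * X).mean(axis=0)
        w -= eta * grad
    return w

# Intercept-only model: the fit should approach the 0.9-quantile of y.
rng = np.random.default_rng(3)
y = rng.normal(size=2000)
X = np.ones((2000, 1))
w = smooth_pinball_fit(X, y, tau=0.9)
```

The smoothing parameter α introduces a small bias relative to the exact quantile, which vanishes as α → 0.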
Unconstrained Smooth ε-Insensitive SVQR

\min_{w} \; \sum_{m=1}^{M} \left[ \frac{1}{2} w_m^\top K w_m + C \sum_{i=1}^{N} \left( \tau_m u_i + \alpha \log\left( 1 + \exp\left( -\frac{u_i}{\alpha} \right) \right) \right) \right]

where u_i = y_i - K_i w_m - \varepsilon and \alpha > 0. Quantile estimates are then
given by

f_\tau(x) = \sum_{i=1}^{N} w_{i,\tau} \, k(x_i, x)

There is no need for the dual; the primal can be solved with gradient methods.
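The unconstrained objective is cheap to evaluate, which is what makes gradient methods attractive here. A sketch of the evaluation (function name and toy inputs are mine; `np.logaddexp` computes log(1 + exp(·)) stably):

```python
import numpy as np

def smooth_objective(W, K, y, taus, C=1.0, alpha=0.05, eps=0.1):
    """Evaluate sum_m [ 0.5 w_m' K w_m + C sum_i (tau_m u_i + alpha log(1 + exp(-u_i/alpha))) ]
    with u_i = y_i - K_i . w_m - eps.  W holds one row of coefficients per quantile."""
    total = 0.0
    for w_m, tau in zip(W, taus):
        u = y - K @ w_m - eps
        smooth_term = alpha * np.logaddexp(0.0, -u / alpha)  # stable alpha*log(1+e^{-u/a})
        total += 0.5 * w_m @ K @ w_m + C * np.sum(tau * u + smooth_term)
    return total

rng = np.random.default_rng(0)
y = rng.normal(size=10)
K = np.eye(10)                  # identity stands in for a real kernel Gram matrix
W = np.zeros((2, 10))
val = smooth_objective(W, K, y, taus=[0.5, 0.9])
```

At W = 0 the quadratic term vanishes, so the objective scales linearly in C, a handy sanity check.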
Work Done
Probabilistic Forecasting by Support Vector Machines
[1] Hatalis, Kostas, et al. "Constrained Support Vector Quantile
Regression for Nonparametric Probabilistic Prediction of Wind Power."
AAAI Conference on Artificial Intelligence, Workshop on AI for Smart
Grids and Smart Buildings, July 2017.
