The Impact of Smoothness on Model Class
Selection in Nonlinear System Identification:
An Application of Derivatives in the RKHS
Y. Bhujwalla, V. Laurain, M. Gilson
6th July 2016
yusuf-michael.bhujwalla@univ-lorraine.fr
Yusuf Bhujwalla (Université de Lorraine) IEEE ACC 2016 1 / 23
Introduction
The Data-Generating System
Measured data : $D_N = \{(u_1, y_1), (u_2, y_2), \ldots, (u_N, y_N)\}$.
Describes $S_o$, an unknown nonlinear system with function $f_o : X \to \mathbb{R}$,
$$S_o : \quad y_{o,k} = f_o(x_k), \qquad y_k = y_{o,k} + e_{o,k},$$
where $x_k = [\, y_{k-1} \cdots y_{k-n_a} \;\; u_k \cdots u_{k-n_b} \,]^\top \in X = \mathbb{R}^{n_a+n_b+1}$.
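As a minimal sketch of how the regressor x_k could be assembled from measured data, the following stacks past outputs and current/past inputs row by row. The function name `build_regressors` and the toy signals are illustrative assumptions, not part of the original slides.

```python
import numpy as np

def build_regressors(u, y, na, nb):
    """Stack x_k = [y_{k-1}, ..., y_{k-na}, u_k, ..., u_{k-nb}] for every valid k."""
    u = np.asarray(u, dtype=float)
    y = np.asarray(y, dtype=float)
    start = max(na, nb)  # first index k for which the full history exists
    rows = []
    for k in range(start, len(y)):
        past_y = [y[k - i] for i in range(1, na + 1)]   # y_{k-1} ... y_{k-na}
        past_u = [u[k - j] for j in range(0, nb + 1)]   # u_k ... u_{k-nb}
        rows.append(past_y + past_u)
    X = np.array(rows).reshape(-1, na + nb + 1)
    return X, y[start:]

# Toy signals (assumed for illustration only)
u = np.arange(6.0)          # 0, 1, 2, 3, 4, 5
y = 10.0 * np.arange(6.0)   # 0, 10, 20, 30, 40, 50
X, targets = build_regressors(u, y, na=2, nb=1)
```

Each row of `X` lives in R^(na+nb+1), matching the definition of the regressor space above; `targets` holds the corresponding y_k.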
Parametric Models
Nθ low (fixed)
→ Physically interpretable
Choice of basis function?
→ Combinatorially hard problem ✗
Nonparametric Models
Nθ high (∼ data)
→ Not interpretable ✗
Can define a general model class.
→ Flexibility
Such as kernel methods :
FIGURE: Output against input on [0, 1], showing $y_o$ and the kernel sections $k_x$.
Outline
1 Kernel Methods in Nonlinear Identification
2 The Kernel Selection Problem
3 Smoothness in the RKHS
4 Simulation Examples
1. Kernel Methods in Nonlinear Identification
Reproducing Kernel Hilbert Spaces
Hilbert Spaces
H is a space of functions f : X → R, f ∈ H, equipped with :
· a norm ∥ f ∥H,
· an inner product ⟨ f , g ⟩H.
In system identification, H ⇔ model class.
Reproducing Kernels
H has a unique, associated kernel function, K : X × X → R, spanning the space H.
The Reproducing Property states that f (x) can be explicitly represented as an infinite sum in terms of the kernel function :
$$f(x) = \langle f, K_x \rangle_H = \sum_{i=1}^{\infty} \alpha_i K(x_i, x)$$
Identification in the RKHS
For $\hat f \in H$ close to $f_o$, $\hat f$ should reflect observations :
$$\hat f = \arg\min_f \{\, V(f) = L(x, y, f(x)) \,\}$$
However, infinitely many solutions ⇒ add constraint to model :
$$\hat f = \arg\min_f \{\, V(f) = L(x, y, f(x)) + g(\| f \|_H) \,\}$$
For such cost-functions, f (x) can be reduced to :
$$f(x) = \sum_{i=1}^{N} \alpha_i K(x_i, x), \quad \alpha \in \mathbb{R}^N$$
· f (x) → a finite sum over the observations.
· The Representer Theorem (Schölkopf, Herbrich and Smola, 2001)
A Widely-Used Example
As an example, minimise the squared error :
$$L(x, y, f(x)) = \| y - f(x) \|_2^2,$$
and use regularisation to avoid overparameterisation :
$$g(\| f \|_H) = \lambda_f \| f \|_H^2.$$
Giving :
$$V_f : \; V(f) = \| y - f(x) \|_2^2 + \lambda_f \| f \|_H^2 \;\Rightarrow\; \alpha_f = (K + \lambda_f I)^{-1} y$$
· Solution depends on
I. K and
II. λf
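The closed-form solution for Vf can be sketched in a few lines of NumPy. This is an illustrative implementation under assumptions: a 1D input, the Gaussian kernel exp(−(x − x_i)²/σ²), and a hypothetical noisy sine as test data; the function names are not from the slides.

```python
import numpy as np

def gaussian_kernel(a, b, sigma):
    """Gram matrix K[i, j] = exp(-(a_i - b_j)^2 / sigma^2) for 1D inputs."""
    diff = np.asarray(a, dtype=float)[:, None] - np.asarray(b, dtype=float)[None, :]
    return np.exp(-diff**2 / sigma**2)

def fit_vf(x, y, sigma, lam):
    """Solve (K + lam * I) alpha = y, i.e. alpha_f = (K + lam I)^{-1} y."""
    K = gaussian_kernel(x, x, sigma)
    return np.linalg.solve(K + lam * np.eye(len(x)), y)

def predict(x_train, alpha, x_new, sigma):
    """Evaluate the representer f(x) = sum_i alpha_i K(x_i, x) at new points."""
    return gaussian_kernel(x_new, x_train, sigma) @ alpha

# Hypothetical data: a noisy sine, chosen only to exercise the solver
rng = np.random.default_rng(0)
x = np.sort(rng.uniform(-1.0, 1.0, 50))
y = np.sin(3.0 * x) + 0.1 * rng.standard_normal(50)

alpha = fit_vf(x, y, sigma=0.3, lam=1e-2)
y_hat = predict(x, alpha, x, sigma=0.3)
```

Using `np.linalg.solve` rather than forming the inverse explicitly is a numerical design choice; both σ and λ enter the solution, which is the dependence noted above.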
2. The Kernel Selection Problem
Choosing a Kernel Function
Choosing a kernel function...
K defines the model class
Let X = R, and K be the Gaussian RBF kernel :
$$K(x_i, x) = \exp\left( -\frac{\| x - x_i \|^2}{\sigma^2} \right).$$
Width (σ) defines smoothness of
the kernel function.
Hence σ determines the model
class !
Other kernels have different
hyperparameters, but they will still
influence H.
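To make the role of the width concrete, the sketch below evaluates one kernel section for two values of σ. The specific widths (0.1 and 0.5) and the helper names are assumptions for illustration; the full-width-at-half-maximum formula follows directly from the kernel definition above.

```python
import numpy as np

def gaussian_section(x, centre, sigma):
    """Kernel section K(centre, x) = exp(-(x - centre)^2 / sigma^2)."""
    return np.exp(-(x - centre) ** 2 / sigma**2)

def fwhm(sigma):
    """Full width at half maximum: exp(-d^2 / sigma^2) = 1/2 at d = sigma * sqrt(ln 2)."""
    return 2.0 * sigma * np.sqrt(np.log(2.0))

x = np.linspace(-1.0, 1.0, 201)
narrow = gaussian_section(x, 0.0, sigma=0.1)   # assumed sigma_1
wide = gaussian_section(x, 0.0, sigma=0.5)     # assumed sigma_2 > sigma_1
```

The wider section dominates the narrow one at every point, so functions built from it vary more slowly, which is the sense in which σ fixes the model class.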
FIGURE: Kernel sections $K_x$ against input on [−1, 1] for two widths, σ1 and σ2 > σ1.
Implications of the Hyperparameter Selection
Estimation of 1D switching signal using $V_f = \| y - f(x) \|_2^2 + \lambda_f \| f \|_H^2$.
Many observations ($N = 10^3$).
$u_k \sim U(-1, 1)$.
Significant noise disturbances (SNR = 5 dB).
Two hyperparameters :
I. σ and
II. λ
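A data set with these characteristics could be simulated as sketched below. The slides do not give the switching function, so `switching_signal` is a purely hypothetical stand-in; the SNR is assumed to be defined on mean-square power of the noise-free output.

```python
import numpy as np

def add_noise_at_snr(y_clean, snr_db, rng):
    """Add white Gaussian noise so that 10*log10(P_signal / P_noise) equals snr_db."""
    signal_power = np.mean(y_clean**2)            # assumption: power = mean square
    noise_power = signal_power / 10.0 ** (snr_db / 10.0)
    return y_clean + np.sqrt(noise_power) * rng.standard_normal(y_clean.shape)

def switching_signal(u):
    """Hypothetical piecewise function standing in for the unknown f_o."""
    return np.where(u < 0.0, -10.0 * u - 5.0, 20.0 * u**2 + 10.0)

rng = np.random.default_rng(1)
N = 1000                                # N = 10^3 observations
u = rng.uniform(-1.0, 1.0, N)           # u_k ~ U(-1, 1)
y_o = switching_signal(u)               # noise-free output
y = add_noise_at_snr(y_o, snr_db=5.0, rng=rng)

# Empirical SNR of this realisation, close to but not exactly 5 dB
achieved_snr = 10.0 * np.log10(np.mean(y_o**2) / np.mean((y - y_o) ** 2))
```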
FIGURE: Estimation of 1D switching signal for different hyperparameter values ($f_o(u_k)$ against $u_k$).
FIGURE: Four panels of estimates on [−1, 1], each showing $y_o$, $\hat f_{MEAN}$ and $\hat f_{SD}$, arranged by SMALL λ / LARGE λ and SMALL σ / LARGE σ, and annotated with SMOOTHNESS and FLEXIBILITY.
Summary
$$V_f : \; V(f) = \| y - f(x) \|_2^2 + \lambda_f \| f \|_H^2.$$
Kernel framework very effective :
· flexible,
· well-understood.
However, choice of kernel often compromised (e.g. by noise).
⇒ Trade-off between flexibility and smoothness.
So, why regularise over ∥ f ∥H . . .
. . . when smoothness is often a more interesting property to control?
⇒ Desirable property in many models.
⇒ Characterises many systems.
3. Smoothness in the RKHS
Regularisation Using Derivatives
Proposition
Replace functional regularisation :
$$V_f : \; V(f) = \| y - f(x) \|_2^2 + \lambda_f \| f \|_H^2,$$
with smoothness-enforcing regularisation :
$$V_D : \; V(f) = \| y - f(x) \|_2^2 + \lambda_D \| Df \|_H^2.$$
Now :
· Smoothness is controlled by the regularisation.
· The kernel hyperparameter is removed from the optimisation problem.
Derivatives in the RKHS
For f ∈ H, Df ∈ H (Zhou, 2008).
Hence, a derivative reproducing property can be defined :
$$Df = \langle f, DK_x \rangle_H$$
The Representer Theorem
The representer $f(x) = \sum_{i=1}^{N} \alpha_i K(x_i, x)$ requires g(∥ f ∥H) to be a monotonically increasing function of ∥ f ∥H.
Clearly, $\| Df \|_H \neq g(\| f \|_H)$ ⇒ representer is suboptimal for VD.
However, if the system is well-excited, $f(x) = \sum_{i=1}^{N} \alpha_i K(x_i, x)$ can be used.
However, it loosely preserves the bias-variance properties of Vf :
$$\lim_{\lambda \to \infty} f(x) = 0, \quad \forall x \in \mathbb{R}.$$
A Closed-Form Solution
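Using the derivative reproducing property, ∥Df ∥H can be defined :
$$\| Df \|_H^2 = \alpha^\top D^{(1,1)}K \, \alpha,$$
where
$$D^{(1,1)}K(x_i, x_j) = \frac{\partial^2 K(x_i, x_j)}{\partial x_j \, \partial x_i}.$$
Permitting a closed-form solution :
$$\alpha_D = \left( K^\top K + \lambda_D D^{(1,1)}K \right)^{-1} K^\top y.$$
As per Vf ⇒ $\alpha_f = (K + \lambda_f I)^{-1} y$.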
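The closed form for α_D can be sketched as follows, assuming a 1D input and the Gaussian kernel K = exp(−(x_i − x_j)²/σ²), for which the mixed derivative works out to (2/σ² − 4(x_i − x_j)²/σ⁴)·K. Setting the gradient of ‖y − Kα‖² + λ_D αᵀD⁽¹'¹⁾Kα to zero gives the normal equations solved here. The function names, the step-like test data and the λ values are illustrative assumptions.

```python
import numpy as np

def gaussian_kernel(a, b, sigma):
    """Gram matrix K[i, j] = exp(-(a_i - b_j)^2 / sigma^2)."""
    diff = a[:, None] - b[None, :]
    return np.exp(-diff**2 / sigma**2)

def gaussian_kernel_d11(a, b, sigma):
    """Mixed derivative d^2 K / (dx_j dx_i) of the Gaussian kernel above."""
    diff = a[:, None] - b[None, :]
    K = np.exp(-diff**2 / sigma**2)
    return (2.0 / sigma**2 - 4.0 * diff**2 / sigma**4) * K

def fit_vd(x, y, sigma, lam_d):
    """Solve (K^T K + lam_d * D) alpha = K^T y for the derivative-regularised cost."""
    K = gaussian_kernel(x, x, sigma)
    D = gaussian_kernel_d11(x, x, sigma)
    return np.linalg.solve(K.T @ K + lam_d * D, K.T @ y)

# Hypothetical data: a noisy step on a regular grid, with a small kernel width
x = np.linspace(-1.0, 1.0, 40)
rng = np.random.default_rng(2)
y = np.sign(x) + 0.2 * rng.standard_normal(x.size)
sigma = 0.05

alpha_light = fit_vd(x, y, sigma, lam_d=1e-6)
alpha_heavy = fit_vd(x, y, sigma, lam_d=1e2)

# Roughness alpha^T D alpha, i.e. ||Df||_H^2 under the representer
D = gaussian_kernel_d11(x, x, sigma)
rough_light = alpha_light @ D @ alpha_light
rough_heavy = alpha_heavy @ D @ alpha_heavy
```

Comparing `rough_light` with `rough_heavy` shows λ_D acting directly on the derivative norm while σ stays fixed.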
4. Simulation Examples
Example 1 : Effect of the Regularisation
Estimation of 1D switching signal using Vf and VD.
Many observations ($N = 10^3$).
$u_k \sim U(-1, 1)$.
Significant noise disturbances (SNR = 5 dB).
Gaussian RBF kernel, with σ = 0.01.
Varying levels of regularisation (through λf , λD).
FIGURE: Estimation of 1D switching signal for different λ values ($f_o(u_k)$ against $u_k$).
⇒ Negligible regularisation (very small λf , λD).
⇒ Light regularisation (small λf , λD).
⇒ Moderate regularisation.
⇒ Heavy regularisation (large λf , λD).
⇒ Excessive regularisation (very large λf , λD).
FIGURE: For each regularisation level, two panels, Vf : R( f ) and VD : R(Df ); output against input on [−1, 1], showing $y_o$, $\hat f_{MEAN}$ and $\hat f_{SD}$.
Example 2 : 1D Structural Selection
Identification of two unknown systems (X ∈ [−1, 1], SNR = 10 dB, $N = 10^3$).
Vf : λ, σ optimised using cross-validation.
VD : λ optimised using cross-validation, σ set based on data.
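The slides do not say which cross-validation scheme was used. As one possible sketch for the Vf case, the following runs a shuffled 5-fold grid search over (σ, λ); the fold count, the grids and the sine test data are all assumptions made for illustration.

```python
import numpy as np

def gaussian_kernel(a, b, sigma):
    """Gram matrix K[i, j] = exp(-(a_i - b_j)^2 / sigma^2)."""
    diff = a[:, None] - b[None, :]
    return np.exp(-diff**2 / sigma**2)

def kfold_mse(x, y, sigma, lam, n_folds=5):
    """Mean validation error of the V_f estimate over shuffled folds."""
    idx = np.random.default_rng(0).permutation(len(x))  # same split for every candidate
    errors = []
    for val in np.array_split(idx, n_folds):
        train = np.setdiff1d(idx, val)
        K = gaussian_kernel(x[train], x[train], sigma)
        alpha = np.linalg.solve(K + lam * np.eye(len(train)), y[train])
        pred = gaussian_kernel(x[val], x[train], sigma) @ alpha
        errors.append(np.mean((pred - y[val]) ** 2))
    return float(np.mean(errors))

def grid_search(x, y, sigmas, lams):
    """Score every (sigma, lambda) pair and return the best pair with all scores."""
    scores = {(s, l): kfold_mse(x, y, s, l) for s in sigmas for l in lams}
    return min(scores, key=scores.get), scores

# Hypothetical data, only to exercise the search
rng = np.random.default_rng(3)
x = rng.uniform(-1.0, 1.0, 200)
y = np.sin(4.0 * x) + 0.3 * rng.standard_normal(200)

best, scores = grid_search(x, y, sigmas=[0.05, 0.2, 0.8], lams=[1e-3, 1e-1, 10.0])
```

For VD the same loop would search over λ only, with σ fixed from the data, which is the simplification in hyperparameter optimisation that the approach aims for.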
FIGURE: $S_o^1$ : Smooth ($f_o^1(u_k)$ against $u_k$).
FIGURE: $S_o^2$ : Nonsmooth ($f_o^2(u_k)$ against $u_k$).
Example 2 : Smooth $S_o^1$
Using a small kernel, VD can reconstruct a smooth function.
Not feasible using Vf , which needs the smoothing effect of the kernel.
FIGURE: Vf : R( f ) and VD : R(Df ); output against input, showing $\hat f_{MEAN}$, $\hat f_{SD}$ and $k_x$.
Example 2 : Nonsmooth $S_o^2$
Using a small kernel, VD can detect structural nonlinearity.
However, Vf is too smooth, as σ must counteract noise.
FIGURE: Vf : R( f ) and VD : R(Df ); output against input, showing $\hat f_{MEAN}$, $\hat f_{SD}$ and $k_x$.
Conclusions
RKHS in Nonlinear Identification
Flexible framework : attractive for nonlinear identification.
Smoothness controlled by kernel function and regularisation (σ and λf )
⇒ Constrained kernel function.
Derivatives in the RKHS
Smoothness controlled by regularisation (λD).
⇒ Simpler steering of the smoothness.
Simpler hyperparameter optimisation (just λD) and increased model flexibility.
⇒ Through use of a smaller kernel (small σ).
However, relies on a suboptimal representer.
⇒ Nonetheless, promising results have been obtained.
A. Bibliography
Alternative Smoothness-Enforcing Optimisation Schemes
Sobolev Spaces (Wahba, 1990 ; Pillonetto et al, 2014)
$$\| f \|_{H^k} = \sum_{i=0}^{m} \int_X \left( \frac{d^i f(x)}{dx^i} \right)^2 dx$$
Identification using derivative observations (Zhou, 2008; Rosasco et al, 2010)
$$V_{obvs}(f) = \| y - f(x) \|_2^2 + \gamma_1 \left\| \frac{dy}{dx} - \frac{df(x)}{dx} \right\|_2^2 + \cdots + \gamma_m \left\| \frac{d^m y}{dx^m} - \frac{d^m f(x)}{dx^m} \right\|_2^2 + \lambda \| f \|_H$$
Regularization Using Derivatives (Rosasco et al, 2010; Lauer, Le and Bloch, 2012; Duijkers et al, 2014)
$$V_D(f) = \| y - f(x) \|_2^2 + \lambda \| D^m f \|_p.$$
Literature Review
Kernel Methods in Machine Learning and System Identification
· Kernel methods in system identification, machine learning and function estimation : A survey, G. Pillonetto, F. Dinuzzo, T. Chen, G. De Nicolao and L. Ljung, 2014.
· Learning with Kernels, B. Schölkopf and A. J. Smola, 2002.
· Gaussian Processes for Machine Learning, C. Rasmussen and C. Williams, 2006.
Reproducing Kernel Hilbert Spaces
· Theory of Reproducing Kernels, N. Aronszajn, 1950.
· A Generalized Representer Theorem, B. Schölkopf, R. Herbrich and A. J. Smola, 2001.
· Derivative reproducing properties for kernel methods in learning theory, D. Zhou, 2008.
B. Example 2 : 1D Structural Selection
$S_o^1$ : Smooth
FIGURE: Vf : R( f ) and VD : R(Df ); output against input, showing $\hat f_{MEAN}$, $\hat f_{SD}$ and $k_x$.
$S_o^2$ : Nonsmooth
FIGURE: Vf : R( f ) and VD : R(Df ); output against input, showing $\hat f_{MEAN}$, $\hat f_{SD}$ and $k_x$.
C. Applicability of the Representer
Kernel Density
Applicability of the representer depends on the kernel density, i.e. the ratio of
observations to the kernel width :
FIGURE: Three panels of output against input (x/σ), showing $\hat f$ and $K_x$ ; captions ρk = 0.6, ρk = 0.6 and ρk = 0.4.
Desirable to ensure σ ≈ max(∆x) (where ∆x is the spacing between adjacent
observations).
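The rule of thumb σ ≈ max(∆x) can be computed directly from the observed inputs. The sketch below assumes 1D inputs; the helper name `sigma_from_spacing` and the sample points are illustrative.

```python
import numpy as np

def sigma_from_spacing(x):
    """Return max(delta_x), the largest gap between adjacent sorted observations."""
    gaps = np.diff(np.sort(np.asarray(x, dtype=float)))
    return float(gaps.max())

# Illustrative inputs: the largest gap is between 0.3 and 0.7
x_obs = np.array([0.0, 0.1, 0.25, 0.3, 0.7])
sigma = sigma_from_spacing(x_obs)
```

Choosing σ this way ensures that neighbouring kernel sections still overlap across the widest gap in the data.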
Yusuf Bhujwalla (Université de Lorraine) IEEE ACC 2016 23 / 23

More Related Content

What's hot

Minimax optimal alternating minimization \\ for kernel nonparametric tensor l...
Minimax optimal alternating minimization \\ for kernel nonparametric tensor l...Minimax optimal alternating minimization \\ for kernel nonparametric tensor l...
Minimax optimal alternating minimization \\ for kernel nonparametric tensor l...Taiji Suzuki
 
Robust Image Denoising in RKHS via Orthogonal Matching Pursuit
Robust Image Denoising in RKHS via Orthogonal Matching PursuitRobust Image Denoising in RKHS via Orthogonal Matching Pursuit
Robust Image Denoising in RKHS via Orthogonal Matching PursuitPantelis Bouboulis
 
A Novel Methodology for Designing Linear Phase IIR Filters
A Novel Methodology for Designing Linear Phase IIR FiltersA Novel Methodology for Designing Linear Phase IIR Filters
A Novel Methodology for Designing Linear Phase IIR FiltersIDES Editor
 
Goodfellow, Bengio, Couville (2016) "Deep Learning", Chap. 7
Goodfellow, Bengio, Couville (2016) "Deep Learning", Chap. 7Goodfellow, Bengio, Couville (2016) "Deep Learning", Chap. 7
Goodfellow, Bengio, Couville (2016) "Deep Learning", Chap. 7Ono Shigeru
 
Parallel Coordinate Descent Algorithms
Parallel Coordinate Descent AlgorithmsParallel Coordinate Descent Algorithms
Parallel Coordinate Descent AlgorithmsShaleen Kumar Gupta
 
CS221: HMM and Particle Filters
CS221: HMM and Particle FiltersCS221: HMM and Particle Filters
CS221: HMM and Particle Filterszukun
 
Lecture note4coordinatedescent
Lecture note4coordinatedescentLecture note4coordinatedescent
Lecture note4coordinatedescentXudong Sun
 
Goodfellow, Bengio, Couville (2016) "Deep Learning", Chap. 6
Goodfellow, Bengio, Couville (2016) "Deep Learning", Chap. 6Goodfellow, Bengio, Couville (2016) "Deep Learning", Chap. 6
Goodfellow, Bengio, Couville (2016) "Deep Learning", Chap. 6Ono Shigeru
 
Multilayer Neural Networks
Multilayer Neural NetworksMultilayer Neural Networks
Multilayer Neural NetworksESCOM
 
Convex Optimization Modelling with CVXOPT
Convex Optimization Modelling with CVXOPTConvex Optimization Modelling with CVXOPT
Convex Optimization Modelling with CVXOPTandrewmart11
 
Brief intro : Invariance and Equivariance
Brief intro : Invariance and EquivarianceBrief intro : Invariance and Equivariance
Brief intro : Invariance and Equivariance홍배 김
 
VahidAkbariTalk.pdf
VahidAkbariTalk.pdfVahidAkbariTalk.pdf
VahidAkbariTalk.pdfgrssieee
 
VahidAkbariTalk_v3.pdf
VahidAkbariTalk_v3.pdfVahidAkbariTalk_v3.pdf
VahidAkbariTalk_v3.pdfgrssieee
 
Neural Processes
Neural ProcessesNeural Processes
Neural ProcessesSangwoo Mo
 
Continual reinforcement learning with complex synapses
Continual reinforcement learning with complex synapsesContinual reinforcement learning with complex synapses
Continual reinforcement learning with complex synapsesThyrixYang1
 
Estimating Human Pose from Occluded Images (ACCV 2009)
Estimating Human Pose from Occluded Images (ACCV 2009)Estimating Human Pose from Occluded Images (ACCV 2009)
Estimating Human Pose from Occluded Images (ACCV 2009)Jia-Bin Huang
 
R package 'bayesImageS': a case study in Bayesian computation using Rcpp and ...
R package 'bayesImageS': a case study in Bayesian computation using Rcpp and ...R package 'bayesImageS': a case study in Bayesian computation using Rcpp and ...
R package 'bayesImageS': a case study in Bayesian computation using Rcpp and ...Matt Moores
 

What's hot (20)

ICPR 2012
ICPR 2012ICPR 2012
ICPR 2012
 
UCB 2012-02-28
UCB 2012-02-28UCB 2012-02-28
UCB 2012-02-28
 
Minimax optimal alternating minimization \\ for kernel nonparametric tensor l...
Minimax optimal alternating minimization \\ for kernel nonparametric tensor l...Minimax optimal alternating minimization \\ for kernel nonparametric tensor l...
Minimax optimal alternating minimization \\ for kernel nonparametric tensor l...
 
Robust Image Denoising in RKHS via Orthogonal Matching Pursuit
Robust Image Denoising in RKHS via Orthogonal Matching PursuitRobust Image Denoising in RKHS via Orthogonal Matching Pursuit
Robust Image Denoising in RKHS via Orthogonal Matching Pursuit
 
A Novel Methodology for Designing Linear Phase IIR Filters
A Novel Methodology for Designing Linear Phase IIR FiltersA Novel Methodology for Designing Linear Phase IIR Filters
A Novel Methodology for Designing Linear Phase IIR Filters
 
Mgm
MgmMgm
Mgm
 
Goodfellow, Bengio, Couville (2016) "Deep Learning", Chap. 7
Goodfellow, Bengio, Couville (2016) "Deep Learning", Chap. 7Goodfellow, Bengio, Couville (2016) "Deep Learning", Chap. 7
Goodfellow, Bengio, Couville (2016) "Deep Learning", Chap. 7
 
Parallel Coordinate Descent Algorithms
Parallel Coordinate Descent AlgorithmsParallel Coordinate Descent Algorithms
Parallel Coordinate Descent Algorithms
 
CS221: HMM and Particle Filters
CS221: HMM and Particle FiltersCS221: HMM and Particle Filters
CS221: HMM and Particle Filters
 
Lecture note4coordinatedescent
Lecture note4coordinatedescentLecture note4coordinatedescent
Lecture note4coordinatedescent
 
Goodfellow, Bengio, Couville (2016) "Deep Learning", Chap. 6
Goodfellow, Bengio, Couville (2016) "Deep Learning", Chap. 6Goodfellow, Bengio, Couville (2016) "Deep Learning", Chap. 6
Goodfellow, Bengio, Couville (2016) "Deep Learning", Chap. 6
 
Multilayer Neural Networks
Multilayer Neural NetworksMultilayer Neural Networks
Multilayer Neural Networks
 
Convex Optimization Modelling with CVXOPT
Convex Optimization Modelling with CVXOPTConvex Optimization Modelling with CVXOPT
Convex Optimization Modelling with CVXOPT
 
Brief intro : Invariance and Equivariance
Brief intro : Invariance and EquivarianceBrief intro : Invariance and Equivariance
Brief intro : Invariance and Equivariance
 
VahidAkbariTalk.pdf
VahidAkbariTalk.pdfVahidAkbariTalk.pdf
VahidAkbariTalk.pdf
 
VahidAkbariTalk_v3.pdf
VahidAkbariTalk_v3.pdfVahidAkbariTalk_v3.pdf
VahidAkbariTalk_v3.pdf
 
Neural Processes
Neural ProcessesNeural Processes
Neural Processes
 
Continual reinforcement learning with complex synapses
Continual reinforcement learning with complex synapsesContinual reinforcement learning with complex synapses
Continual reinforcement learning with complex synapses
 
Estimating Human Pose from Occluded Images (ACCV 2009)
Estimating Human Pose from Occluded Images (ACCV 2009)Estimating Human Pose from Occluded Images (ACCV 2009)
Estimating Human Pose from Occluded Images (ACCV 2009)
 
R package 'bayesImageS': a case study in Bayesian computation using Rcpp and ...
R package 'bayesImageS': a case study in Bayesian computation using Rcpp and ...R package 'bayesImageS': a case study in Bayesian computation using Rcpp and ...
R package 'bayesImageS': a case study in Bayesian computation using Rcpp and ...
 

Viewers also liked

An RKHS Approach to Systematic Kernel Selection in Nonlinear System Identific...
An RKHS Approach to Systematic Kernel Selection in Nonlinear System Identific...An RKHS Approach to Systematic Kernel Selection in Nonlinear System Identific...
An RKHS Approach to Systematic Kernel Selection in Nonlinear System Identific...Yusuf Bhujwalla
 
Renrollment a fiduciary imperative
Renrollment a fiduciary imperativeRenrollment a fiduciary imperative
Renrollment a fiduciary imperativeRichard Davies
 
Pablo dominguez de_maria_publication_list
Pablo dominguez de_maria_publication_listPablo dominguez de_maria_publication_list
Pablo dominguez de_maria_publication_listPablo Dom
 
Verbo to te formas neg interr
Verbo to te formas neg interrVerbo to te formas neg interr
Verbo to te formas neg interrRocio Aponza
 
Internet Traffic Forecasting using Time Series Methods
Internet Traffic Forecasting using Time Series MethodsInternet Traffic Forecasting using Time Series Methods
Internet Traffic Forecasting using Time Series MethodsAjay Ohri
 
Diagramación y composición
Diagramación y composiciónDiagramación y composición
Diagramación y composicióngeorgeostro
 
Bristol post ppt
Bristol post pptBristol post ppt
Bristol post pptanna barton
 
Bristol post ppt
Bristol post pptBristol post ppt
Bristol post pptanna barton
 

Viewers also liked (13)

An RKHS Approach to Systematic Kernel Selection in Nonlinear System Identific...
An RKHS Approach to Systematic Kernel Selection in Nonlinear System Identific...An RKHS Approach to Systematic Kernel Selection in Nonlinear System Identific...
An RKHS Approach to Systematic Kernel Selection in Nonlinear System Identific...
 
Renrollment a fiduciary imperative
Renrollment a fiduciary imperativeRenrollment a fiduciary imperative
Renrollment a fiduciary imperative
 
Pablo dominguez de_maria_publication_list
Pablo dominguez de_maria_publication_listPablo dominguez de_maria_publication_list
Pablo dominguez de_maria_publication_list
 
Sadigh Gallery Winter Bargains 2017
Sadigh Gallery Winter Bargains 2017Sadigh Gallery Winter Bargains 2017
Sadigh Gallery Winter Bargains 2017
 
Verbo to te formas neg interr
Verbo to te formas neg interrVerbo to te formas neg interr
Verbo to te formas neg interr
 
Condiciones que favorecen la motivación
Condiciones que favorecen la motivaciónCondiciones que favorecen la motivación
Condiciones que favorecen la motivación
 
Internet Traffic Forecasting using Time Series Methods
Internet Traffic Forecasting using Time Series MethodsInternet Traffic Forecasting using Time Series Methods
Internet Traffic Forecasting using Time Series Methods
 
Week 5 paper
Week 5 paperWeek 5 paper
Week 5 paper
 
Tener razon
Tener razonTener razon
Tener razon
 
Diagramación y composición
Diagramación y composiciónDiagramación y composición
Diagramación y composición
 
Bristol post ppt
Bristol post pptBristol post ppt
Bristol post ppt
 
Falacia de justicia
Falacia de justiciaFalacia de justicia
Falacia de justicia
 
Bristol post ppt
Bristol post pptBristol post ppt
Bristol post ppt
 

Similar to The Impact of Smoothness on Model Class Selection in Nonlinear System Identification

Hierarchical matrices for approximating large covariance matries and computin...
Hierarchical matrices for approximating large covariance matries and computin...Hierarchical matrices for approximating large covariance matries and computin...
Hierarchical matrices for approximating large covariance matries and computin...Alexander Litvinenko
 
From RNN to neural networks for cyclic undirected graphs
From RNN to neural networks for cyclic undirected graphsFrom RNN to neural networks for cyclic undirected graphs
From RNN to neural networks for cyclic undirected graphstuxette
 
Several nonlinear models and methods for FDA
Several nonlinear models and methods for FDASeveral nonlinear models and methods for FDA
Several nonlinear models and methods for FDAtuxette
 
Phase Retrieval: Motivation and Techniques
Phase Retrieval: Motivation and TechniquesPhase Retrieval: Motivation and Techniques
Phase Retrieval: Motivation and TechniquesVaibhav Dixit
 
IVR - Chapter 1 - Introduction
IVR - Chapter 1 - IntroductionIVR - Chapter 1 - Introduction
IVR - Chapter 1 - IntroductionCharles Deledalle
 
Epsrcws08 campbell isvm_01
Epsrcws08 campbell isvm_01Epsrcws08 campbell isvm_01
Epsrcws08 campbell isvm_01Cheng Feng
 
Harmonic Analysis and Deep Learning
Harmonic Analysis and Deep LearningHarmonic Analysis and Deep Learning
Harmonic Analysis and Deep LearningSungbin Lim
 
Neural Networks: Support Vector machines
Neural Networks: Support Vector machinesNeural Networks: Support Vector machines
Neural Networks: Support Vector machinesMostafa G. M. Mostafa
 
EC8553 Discrete time signal processing
EC8553 Discrete time signal processing EC8553 Discrete time signal processing
EC8553 Discrete time signal processing ssuser2797e4
 
Paper.pdf
Paper.pdfPaper.pdf
Paper.pdfDavCla1
 
An Efficient And Safe Framework For Solving Optimization Problems
An Efficient And Safe Framework For Solving Optimization ProblemsAn Efficient And Safe Framework For Solving Optimization Problems
An Efficient And Safe Framework For Solving Optimization ProblemsLisa Muthukumar
 
Thesis oral defense
Thesis oral defenseThesis oral defense
Thesis oral defenseFan Zhitao
 
Supervisory control of discrete event systems for linear temporal logic speci...
Supervisory control of discrete event systems for linear temporal logic speci...Supervisory control of discrete event systems for linear temporal logic speci...
Supervisory control of discrete event systems for linear temporal logic speci...AmiSakakibara
 
HARMONIC ANALYSIS ASSOCIATED WITH A GENERALIZED BESSEL-STRUVE OPERATOR ON THE...
HARMONIC ANALYSIS ASSOCIATED WITH A GENERALIZED BESSEL-STRUVE OPERATOR ON THE...HARMONIC ANALYSIS ASSOCIATED WITH A GENERALIZED BESSEL-STRUVE OPERATOR ON THE...
HARMONIC ANALYSIS ASSOCIATED WITH A GENERALIZED BESSEL-STRUVE OPERATOR ON THE...irjes
 
Existence of Hopf-Bifurcations on the Nonlinear FKN Model
Existence of Hopf-Bifurcations on the Nonlinear FKN ModelExistence of Hopf-Bifurcations on the Nonlinear FKN Model
Existence of Hopf-Bifurcations on the Nonlinear FKN ModelIJMER
 
Functional analysis in mechanics 2e
Functional analysis in mechanics  2eFunctional analysis in mechanics  2e
Functional analysis in mechanics 2eSpringer
 
Functional analysis in mechanics
Functional analysis in mechanicsFunctional analysis in mechanics
Functional analysis in mechanicsSpringer
 

Similar to The Impact of Smoothness on Model Class Selection in Nonlinear System Identification (20)

Hierarchical matrices for approximating large covariance matries and computin...
Hierarchical matrices for approximating large covariance matries and computin...Hierarchical matrices for approximating large covariance matries and computin...
Hierarchical matrices for approximating large covariance matries and computin...
 
From RNN to neural networks for cyclic undirected graphs
From RNN to neural networks for cyclic undirected graphsFrom RNN to neural networks for cyclic undirected graphs
From RNN to neural networks for cyclic undirected graphs
 
Several nonlinear models and methods for FDA
Several nonlinear models and methods for FDASeveral nonlinear models and methods for FDA
Several nonlinear models and methods for FDA
 
Phase Retrieval: Motivation and Techniques
Phase Retrieval: Motivation and TechniquesPhase Retrieval: Motivation and Techniques
Phase Retrieval: Motivation and Techniques
 
IVR - Chapter 1 - Introduction
IVR - Chapter 1 - IntroductionIVR - Chapter 1 - Introduction
IVR - Chapter 1 - Introduction
 
Epsrcws08 campbell isvm_01
Epsrcws08 campbell isvm_01Epsrcws08 campbell isvm_01
Epsrcws08 campbell isvm_01
 
Lecture6.handout
Lecture6.handoutLecture6.handout
Lecture6.handout
 
Harmonic Analysis and Deep Learning
Harmonic Analysis and Deep LearningHarmonic Analysis and Deep Learning
Harmonic Analysis and Deep Learning
 
Neural Networks: Support Vector machines
Neural Networks: Support Vector machinesNeural Networks: Support Vector machines
Neural Networks: Support Vector machines
 
QMC: Operator Splitting Workshop, A New (More Intuitive?) Interpretation of I...
QMC: Operator Splitting Workshop, A New (More Intuitive?) Interpretation of I...QMC: Operator Splitting Workshop, A New (More Intuitive?) Interpretation of I...
QMC: Operator Splitting Workshop, A New (More Intuitive?) Interpretation of I...
 
EC8553 Discrete time signal processing
EC8553 Discrete time signal processing EC8553 Discrete time signal processing
EC8553 Discrete time signal processing
 
Paper.pdf
Paper.pdfPaper.pdf
Paper.pdf
 
An Efficient And Safe Framework For Solving Optimization Problems
An Efficient And Safe Framework For Solving Optimization ProblemsAn Efficient And Safe Framework For Solving Optimization Problems
An Efficient And Safe Framework For Solving Optimization Problems
 
Thesis oral defense
Thesis oral defenseThesis oral defense
Thesis oral defense
 
Supervisory control of discrete event systems for linear temporal logic speci...
Supervisory control of discrete event systems for linear temporal logic speci...Supervisory control of discrete event systems for linear temporal logic speci...
Supervisory control of discrete event systems for linear temporal logic speci...
 
HARMONIC ANALYSIS ASSOCIATED WITH A GENERALIZED BESSEL-STRUVE OPERATOR ON THE...
HARMONIC ANALYSIS ASSOCIATED WITH A GENERALIZED BESSEL-STRUVE OPERATOR ON THE...HARMONIC ANALYSIS ASSOCIATED WITH A GENERALIZED BESSEL-STRUVE OPERATOR ON THE...
HARMONIC ANALYSIS ASSOCIATED WITH A GENERALIZED BESSEL-STRUVE OPERATOR ON THE...
 
Existence of Hopf-Bifurcations on the Nonlinear FKN Model
Existence of Hopf-Bifurcations on the Nonlinear FKN ModelExistence of Hopf-Bifurcations on the Nonlinear FKN Model
Existence of Hopf-Bifurcations on the Nonlinear FKN Model
 
Functional analysis in mechanics 2e
Functional analysis in mechanics  2eFunctional analysis in mechanics  2e
Functional analysis in mechanics 2e
 
Functional analysis in mechanics
Functional analysis in mechanicsFunctional analysis in mechanics
Functional analysis in mechanics
 
Program on Quasi-Monte Carlo and High-Dimensional Sampling Methods for Applie...
Program on Quasi-Monte Carlo and High-Dimensional Sampling Methods for Applie...Program on Quasi-Monte Carlo and High-Dimensional Sampling Methods for Applie...
Program on Quasi-Monte Carlo and High-Dimensional Sampling Methods for Applie...
 

The Impact of Smoothness on Model Class Selection in Nonlinear System Identification

  • 1. The Impact of Smoothness on Model Class Selection in Nonlinear System Identification: An Application of Derivatives in the RKHS Y. Bhujwalla, V. Laurain, M. Gilson 6th July 2016 yusuf-michael.bhujwalla@univ-lorraine.fr Yusuf Bhujwalla (Université de Lorraine) IEEE ACC 2016 1 / 23
  • 2. Introduction: The Data-Generating System. Measured data: $\mathcal{D}_N = \{(u_1, y_1), (u_2, y_2), \ldots, (u_N, y_N)\}$, which describes $S_o$, an unknown nonlinear system with function $f_o : \mathcal{X} \to \mathbb{R}$, $S_o$: $y_{o,k} = f_o(x_k)$, $y_k = y_{o,k} + e_{o,k}$, where $x_k = [\, y_{k-1} \cdots y_{k-n_a} \;\; u_k \cdots u_{k-n_b} \,]^\top \in \mathcal{X} = \mathbb{R}^{n_a+n_b+1}$. Parametric Models: $N_\theta$ low (fixed) → physically interpretable. Choice of basis function? → Combinatorially hard problem (✗). Nonparametric Models: $N_\theta$ high (∼ data) → not interpretable (✗). Can define a general model class → flexibility.
  • 3. Introduction: The Data-Generating System (continued). Nonparametric Models, such as kernel methods. [Figure: output $y_o$ and kernel functions $k_x$ plotted against input on $[0, 1]$.]
  • 4. Outline: 1. Kernel Methods in Nonlinear Identification; 2. The Kernel Selection Problem; 3. Smoothness in the RKHS; 4. Simulation Examples.
  • 5. 1. Kernel Methods in Nonlinear Identification: Reproducing Kernel Hilbert Spaces. Hilbert Spaces: $\mathcal{H}$ is a space over a class of functions $f : \mathcal{X} \to \mathbb{R}$, $f \in \mathcal{H}$, equipped with a norm $\| f \|_{\mathcal{H}}$ and an inner product $\langle f, g \rangle_{\mathcal{H}}$. In system identification, $\mathcal{H}$ ⇔ model class. Reproducing Kernels: $\mathcal{H}$ has a unique associated kernel function $K : \mathcal{X} \times \mathcal{X} \to \mathbb{R}$, spanning the space $\mathcal{H}$. The Reproducing Property states that $f(x)$ can be explicitly represented as an infinite sum in terms of the kernel function: $f(x) = \langle f, K_x \rangle_{\mathcal{H}} = \sum_{i=1}^{\infty} \alpha_i K(x_i, x)$.
  • 6. 1. Kernel Methods in Nonlinear Identification: Identification in the RKHS. For $\hat f \in \mathcal{H}$ close to $f_o$, $\hat f$ should reflect the observations: $\hat f = \arg\min_f \{ V(f) = L(x, y, f(x)) \}$. However, there are infinitely many solutions ⇒ add a constraint to the model: $\hat f = \arg\min_f \{ V(f) = L(x, y, f(x)) + g(\| f \|_{\mathcal{H}}) \}$. For such cost functions, $f(x)$ can be reduced to $f(x) = \sum_{i=1}^{N} \alpha_i K(x_i, x)$, $\alpha \in \mathbb{R}^N$, i.e. a finite sum over the observations. This is the Representer Theorem (Schölkopf, Herbrich and Smola, 2001).
  • 7. 1. Kernel Methods in Nonlinear Identification: A Widely-Used Example. As an example, minimise the squared error, $L(x, y, f(x)) = \| y - f(x) \|_2^2$, and use regularisation to avoid overparameterisation, $g(\| f \|_{\mathcal{H}}) = \lambda \| f \|_{\mathcal{H}}^2$. Giving $V_f : V(f) = \| y - f(x) \|_2^2 + \lambda_f \| f \|_{\mathcal{H}}^2 \Rightarrow \alpha_f = (K + \lambda_f I)^{-1} y$. The solution depends on (I) $K$ and (II) $\lambda_f$.
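The closed-form solution of $V_f$ on slide 7 can be sketched in a few lines of NumPy. This is a minimal illustration, assuming 1-D inputs and the Gaussian RBF kernel that the slides introduce later; the function names (`gaussian_kernel`, `fit_vf`, `predict`) and the toy data are hypothetical and do not come from the presentation.

```python
import numpy as np

def gaussian_kernel(xa, xb, sigma):
    """Gram matrix K[i, j] = exp(-(xa[i] - xb[j])**2 / sigma**2) for 1-D inputs."""
    diff = np.asarray(xa, float)[:, None] - np.asarray(xb, float)[None, :]
    return np.exp(-diff**2 / sigma**2)

def fit_vf(x, y, sigma, lam_f):
    """Solve (K + lam_f * I) alpha = y, i.e. alpha_f = (K + lam_f I)^{-1} y."""
    K = gaussian_kernel(x, x, sigma)
    return np.linalg.solve(K + lam_f * np.eye(len(x)), y)

def predict(x_train, alpha, x_new, sigma):
    """Evaluate the representer f(x) = sum_i alpha_i K(x_i, x) at x_new."""
    return gaussian_kernel(x_new, x_train, sigma) @ alpha

# Toy data (hypothetical, not taken from the slides).
rng = np.random.default_rng(0)
x = np.sort(rng.uniform(-1.0, 1.0, 50))
y = np.sin(3.0 * x) + 0.1 * rng.standard_normal(50)

alpha = fit_vf(x, y, sigma=0.3, lam_f=0.1)
y_hat = predict(x, alpha, x, sigma=0.3)
```

Using `np.linalg.solve` rather than forming the inverse explicitly is a numerical choice made here; it computes the same $\alpha_f$ as the formula on the slide.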
  • 9. 2. The Kernel Selection Problem: Choosing a Kernel Function. $K$ defines the model class. Let $\mathcal{X} = \mathbb{R}$, and $K$ be the Gaussian RBF kernel: $K(x_i, x) = \exp\left( -\frac{\| x - x_i \|^2}{\sigma^2} \right)$. The width ($\sigma$) defines the smoothness of the kernel function. Hence $\sigma$ determines the model class! Other kernels have different hyperparameters, but they will still influence $\mathcal{H}$. [Figure: two panels showing the kernel function $K_x$ against input on $[-1, 1]$, for a width $\sigma_1$ and a larger width $\sigma_2 > \sigma_1$.]
  • 10. 2. The Kernel Selection Problem: Implications of the Hyperparameter Selection. Estimation of a 1D switching signal using $V_f = \| y - f(x) \|_2^2 + \lambda_f \| f \|_{\mathcal{H}}^2$. Many observations ($N = 10^3$). $u_k \sim \mathcal{U}(-1, 1)$. Significant noise disturbances (SNR = 5 dB). Two hyperparameters: (I) $\sigma$ and (II) $\lambda$. [FIGURE: Estimation of 1D switching signal for different hyperparameter values; $f_o(u_k)$ against $u_k$.]
  • 11-12. 2. The Kernel Selection Problem: Implications of the Hyperparameter Selection. [Figure: 2 × 2 grid of estimates on $[-1, 1]$ for small and large $\lambda$ combined with small and large $\sigma$; each panel shows $y_o$, $\hat f_{\mathrm{MEAN}}$ and $\hat f_{\mathrm{SD}}$. The second build of the slide annotates the grid with SMOOTHNESS and FLEXIBILITY.]
  • 13. 2. The Kernel Selection Problem: Summary. $V_f : V(f) = \| y - f(x) \|_2^2 + \lambda_f \| f \|_{\mathcal{H}}^2$. The kernel framework is very effective: flexible and well-understood. However, the choice of kernel is often compromised (e.g. by noise). ⇒ Trade-off between flexibility and smoothness. So, why regularise over $\| f \|_{\mathcal{H}}$ when smoothness is often a more interesting property to control? ⇒ Desirable property in many models. ⇒ Characterises many systems.
  • 15. 3. Smoothness in the RKHS: Regularisation Using Derivatives. Proposition: replace the functional regularisation $V_f : V(f) = \| y - f(x) \|_2^2 + \lambda_f \| f \|_{\mathcal{H}}^2$ with a smoothness-enforcing regularisation $V_D : V(f) = \| y - f(x) \|_2^2 + \lambda_D \| Df \|_{\mathcal{H}}^2$. Now smoothness is controlled by the regularisation, and the kernel hyperparameter is removed from the optimisation problem.
  • 18. 3. Smoothness in the RKHS: Derivatives in the RKHS. For $f \in \mathcal{H}$, $Df \in \mathcal{H}$ (Zhou, 2008). Hence, a derivative reproducing property can be defined: $Df = \langle f, DK_x \rangle_{\mathcal{H}}$. The Representer Theorem: the representer $f(x) = \sum_{i=1}^{N} \alpha_i K(x_i, x)$ requires $g(\| f \|_{\mathcal{H}})$ to be a monotonically increasing function of $\| f \|_{\mathcal{H}}$. Clearly, $\| Df \|_{\mathcal{H}} \neq g(\| f \|_{\mathcal{H}})$ ⇒ the representer is suboptimal for $V_D$. However, if the system is well-excited, $f(x) = \sum_{i=1}^{N} \alpha_i K(x_i, x)$ can be used. However, it loosely preserves the bias-variance properties of $V_f$: $\lim_{\lambda \to \infty} f(x) = 0, \ \forall x \in \mathbb{R}$.
  • 19. 3. Smoothness in the RKHS: Derivatives in the RKHS. A Closed-Form Solution: using the derivative reproducing property, $\| Df \|_{\mathcal{H}}$ can be defined: $\| Df \|_{\mathcal{H}}^2 = \alpha^\top D^{(1,1)}K \, \alpha$, where $D^{(1,1)}K(x_i, x_j) = \frac{\partial^2 K(x_i, x_j)}{\partial x_j \, \partial x_i}$. Permitting a closed-form solution: $\alpha_D = \left( K^\top K + \lambda_D D^{(1,1)}K \right)^{-1} K^\top y$. As per $V_f \Rightarrow \alpha_f = (K + \lambda_f I)^{-1} y$.
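The closed-form solution for $V_D$ on slide 19 can be sketched as follows. Substituting the representer into $V_D$ gives $\| y - K\alpha \|_2^2 + \lambda_D \alpha^\top D^{(1,1)}K \, \alpha$, and setting the gradient with respect to $\alpha$ to zero yields the normal equations solved below. The sketch assumes 1-D inputs and the Gaussian RBF kernel of slide 9, for which the mixed derivative works out to $D^{(1,1)}K(x_i, x_j) = \left( \frac{2}{\sigma^2} - \frac{4 (x_i - x_j)^2}{\sigma^4} \right) K(x_i, x_j)$; the function names and toy data are hypothetical.

```python
import numpy as np

def gaussian_kernel(xa, xb, sigma):
    """Gram matrix K[i, j] = exp(-(xa[i] - xb[j])**2 / sigma**2) for 1-D inputs."""
    diff = xa[:, None] - xb[None, :]
    return np.exp(-diff**2 / sigma**2)

def gaussian_kernel_d11(xa, xb, sigma):
    """Mixed derivative d^2 K(x_i, x_j) / (dx_j dx_i) of the Gaussian kernel."""
    diff = xa[:, None] - xb[None, :]
    return (2.0 / sigma**2 - 4.0 * diff**2 / sigma**4) * np.exp(-diff**2 / sigma**2)

def fit_vd(x, y, sigma, lam_d):
    """Solve (K^T K + lam_d * D11) alpha = K^T y, the closed form for V_D."""
    K = gaussian_kernel(x, x, sigma)
    D11 = gaussian_kernel_d11(x, x, sigma)
    return np.linalg.solve(K.T @ K + lam_d * D11, K.T @ y)

# Toy step signal on an even grid (hypothetical, not taken from the slides).
x = np.linspace(-1.0, 1.0, 40)
y = np.where(x < 0.0, -1.0, 1.0)
sigma = 0.05

alpha_light = fit_vd(x, y, sigma, lam_d=1e-4)   # light smoothing
alpha_heavy = fit_vd(x, y, sigma, lam_d=1.0)    # heavy smoothing

# Penalty ||Df||^2 = alpha^T D11 alpha for each fit.
D11 = gaussian_kernel_d11(x, x, sigma)
pen_light = alpha_light @ D11 @ alpha_light
pen_heavy = alpha_heavy @ D11 @ alpha_heavy
```

Comparing `pen_light` with `pen_heavy` shows how $\lambda_D$ alone steers the derivative penalty while the kernel width stays fixed and small.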
  • 21. 4. Simulation Examples: Example 1: Effect of the Regularisation. Estimation of a 1D switching signal using $V_f$ and $V_D$. Many observations ($N = 10^3$). $u_k \sim \mathcal{U}(-1, 1)$. Significant noise disturbances (SNR = 5 dB). Gaussian RBF kernel, with $\sigma = 0.01$. Varying levels of regularisation (through $\lambda_f$, $\lambda_D$). [FIGURE: Estimation of 1D switching signal for different $\lambda$ values; $f_o(u_k)$ against $u_k$.]
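A data set matching the stated settings of Example 1 ($N = 10^3$, $u_k \sim \mathcal{U}(-1, 1)$, SNR = 5 dB) could be generated along the following lines. The slides do not give the switching function, so `f_switch` below is a purely hypothetical stand-in, and the SNR is assumed to mean the ratio of the variance of $y_o$ to the variance of the noise, expressed in dB.

```python
import numpy as np

def f_switch(u):
    """Hypothetical switching nonlinearity: two linear regimes with a jump at u = 0."""
    return np.where(u < 0.0, -10.0 + 5.0 * u, 10.0 + 5.0 * u)

def add_noise_at_snr(y_clean, snr_db, rng):
    """Add white Gaussian noise scaled so that var(y_clean) / var(noise) hits snr_db."""
    noise_power = np.var(y_clean) / 10.0 ** (snr_db / 10.0)
    noise = rng.standard_normal(y_clean.shape)
    # Rescale by the sample variance so the target SNR holds on this realisation.
    noise *= np.sqrt(noise_power / np.var(noise))
    return y_clean + noise

rng = np.random.default_rng(1)
N = 1000
u = rng.uniform(-1.0, 1.0, N)        # u_k ~ U(-1, 1)
y_o = f_switch(u)                    # noise-free output y_{o,k}
y = add_noise_at_snr(y_o, 5.0, rng)  # measured output y_k at SNR = 5 dB
```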
  • 22. 4. Simulation Examples: Example 1: Effect of the Regularisation ⇒ Negligible regularisation (very small $\lambda_f$, $\lambda_D$). [Figures: $V_f : R(f)$ and $V_D : R(Df)$, each showing $y_o$, $\hat f_{\mathrm{MEAN}}$ and $\hat f_{\mathrm{SD}}$ against input.]
  • 23. 4. Simulation Examples: Example 1: Effect of the Regularisation ⇒ Light regularisation (small $\lambda_f$, $\lambda_D$). [Figures: $V_f : R(f)$ and $V_D : R(Df)$, each showing $y_o$, $\hat f_{\mathrm{MEAN}}$ and $\hat f_{\mathrm{SD}}$ against input.]
  • 24. 4. Simulation Examples: Example 1: Effect of the Regularisation ⇒ Moderate regularisation. [Figures: $V_f : R(f)$ and $V_D : R(Df)$, each showing $y_o$, $\hat f_{\mathrm{MEAN}}$ and $\hat f_{\mathrm{SD}}$ against input.]
  • 25. 4. Simulation Examples: Example 1: Effect of the Regularisation ⇒ Heavy regularisation (large $\lambda_f$, $\lambda_D$). [Figures: $V_f : R(f)$ and $V_D : R(Df)$, each showing $y_o$, $\hat f_{\mathrm{MEAN}}$ and $\hat f_{\mathrm{SD}}$ against input.]
  • 26. 4. Simulation Examples: Example 1: Effect of the Regularisation ⇒ Excessive regularisation (very large $\lambda_f$, $\lambda_D$). [Figures: $V_f : R(f)$ and $V_D : R(Df)$, each showing $y_o$, $\hat f_{\mathrm{MEAN}}$ and $\hat f_{\mathrm{SD}}$ against input.]
  • 27. 4. Simulation Examples: Example 2: 1D Structural Selection. Identification of two unknown systems ($\mathcal{X} \in [-1, 1]$, SNR = 10 dB, $N = 10^3$). $V_f$: $\lambda$, $\sigma$ optimised using cross-validation. $V_D$: $\lambda$ optimised using cross-validation, $\sigma$ set based on data. [Figures: $f_o^1(u_k)$ against $u_k$ for $S_o^1$ (smooth), and $f_o^2(u_k)$ against $u_k$ for $S_o^2$ (nonsmooth).]
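The slides do not say which cross-validation scheme was used to tune $\lambda$ and $\sigma$ for $V_f$. One common option, shown here only as an assumption, is a $k$-fold grid search over both hyperparameters; the grid values, fold count and toy data below are hypothetical.

```python
import numpy as np

def gaussian_kernel(xa, xb, sigma):
    """Gram matrix K[i, j] = exp(-(xa[i] - xb[j])**2 / sigma**2) for 1-D inputs."""
    diff = xa[:, None] - xb[None, :]
    return np.exp(-diff**2 / sigma**2)

def cv_error(x, y, sigma, lam, n_folds=5, seed=0):
    """Mean squared validation error of the V_f estimate over n_folds folds."""
    idx = np.random.default_rng(seed).permutation(len(x))
    folds = np.array_split(idx, n_folds)
    errors = []
    for k in range(n_folds):
        val = folds[k]
        train = np.concatenate([folds[j] for j in range(n_folds) if j != k])
        K = gaussian_kernel(x[train], x[train], sigma)
        alpha = np.linalg.solve(K + lam * np.eye(len(train)), y[train])
        pred = gaussian_kernel(x[val], x[train], sigma) @ alpha
        errors.append(np.mean((y[val] - pred) ** 2))
    return float(np.mean(errors))

def grid_search(x, y, sigmas, lams):
    """Return the (sigma, lam) pair with the lowest CV error, plus all scores."""
    scores = {(s, l): cv_error(x, y, s, l) for s in sigmas for l in lams}
    return min(scores, key=scores.get), scores

# Toy smooth system with noise (hypothetical, not taken from the slides).
rng = np.random.default_rng(2)
x = rng.uniform(-0.5, 0.5, 120)
y = 10.0 * np.sin(4.0 * x) + rng.standard_normal(120)

best, scores = grid_search(x, y, sigmas=[0.01, 0.1, 0.5], lams=[1e-3, 1e-1, 10.0])
```

For $V_D$ the same loop would run over $\lambda_D$ only, with $\sigma$ fixed from the data, which is the reduction in search dimension the slide points to.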
  • 28. 4. Simulation Examples: Example 2: Smooth $S_o^1$. Using a small kernel, $V_D$ can reconstruct a smooth function. This is not feasible using $V_f$, which needs the kernel smoothing effect. [Figures: $V_f : R(f)$ and $V_D : R(Df)$, each showing $\hat f_{\mathrm{MEAN}}$, $\hat f_{\mathrm{SD}}$ and $k_x$ against input.]
  • 29. 4. Simulation Examples: Example 2: Nonsmooth $S_o^2$. Using a small kernel, $V_D$ can detect structural nonlinearity. However, $V_f$ is too smooth, as $\sigma$ must counteract noise. [Figures: $V_f : R(f)$ and $V_D : R(Df)$, each showing $\hat f_{\mathrm{MEAN}}$, $\hat f_{\mathrm{SD}}$ and $k_x$ against input.]
  • 30. Conclusions. RKHS in Nonlinear Identification: Flexible framework, attractive for nonlinear identification. Smoothness controlled by kernel function and regularisation ($\sigma$ and $\lambda_f$) ⇒ constrained kernel function. Derivatives in the RKHS: Smoothness controlled by regularisation ($\lambda_D$) ⇒ simpler steering of the smoothness. Simpler hyperparameter optimisation (just $\lambda_D$) and increased model flexibility ⇒ through use of a smaller kernel (small $\sigma$). However, relies on a suboptimal representer ⇒ nonetheless, promising results have been obtained.
  • 32. A. Bibliography: Alternative Smoothness-Enforcing Optimisation Schemes. Sobolev Spaces (Wahba, 1990; Pillonetto et al., 2014): $\| f \|_{\mathcal{H}_k} = \sum_{i=0}^{m} \int_{\mathcal{X}} \left( \frac{d^i f(x)}{dx^i} \right)^2 dx$. Identification using derivative observations (Zhou, 2008; Rosasco et al., 2010): $V_{\mathrm{obvs}}(f) = \| y - f(x) \|_2^2 + \gamma_1 \left\| \frac{dy}{dx} - \frac{df(x)}{dx} \right\|_2^2 + \cdots + \gamma_m \left\| \frac{d^m y}{dx^m} - \frac{d^m f(x)}{dx^m} \right\|_2^2 + \lambda \| f \|_{\mathcal{H}}$. Regularization Using Derivatives (Rosasco et al., 2010; Lauer, Le and Bloch, 2012; Duijkers et al., 2014): $V_D(f) = \| y - f(x) \|_2^2 + \lambda \| D^m f \|_p$.
  • 33. A. Bibliography: Literature Review. Kernel Methods in Machine Learning and System Identification: · Kernel methods in system identification, machine learning and function estimation: A survey, G. Pillonetto, F. Dinuzzo, T. Chen, G. De Nicolao and L. Ljung, 2014. · Learning with Kernels, B. Schölkopf and A. J. Smola, 2002. · Gaussian Processes for Machine Learning, C. Rasmussen and C. Williams, 2006. Reproducing Kernel Hilbert Spaces: · Theory of Reproducing Kernels, N. Aronszajn, 1950. · A Generalized Representer Theorem, B. Schölkopf, R. Herbrich and A. J. Smola, 2001. · Derivative reproducing properties for kernel methods in learning theory, D. Zhou, 2008.
  • 34. B. Example 2: 1D Structural Selection. $S_o^1$: Smooth. [Figures: $V_f : R(f)$ and $V_D : R(Df)$, each showing $\hat f_{\mathrm{MEAN}}$, $\hat f_{\mathrm{SD}}$ and $k_x$ against input.]
  • 35. B. Example 2: 1D Structural Selection. $S_o^2$: Nonsmooth. [Figures: $V_f : R(f)$ and $V_D : R(Df)$, each showing $\hat f_{\mathrm{MEAN}}$, $\hat f_{\mathrm{SD}}$ and $k_x$ against input.]
  • 36. C. Applicability of the Representer: Kernel Density. Applicability of the representer depends on the kernel density, i.e. the ratio of observations to the kernel width. [Figures: $\hat f$ and $K_x$ against normalised input $x/\sigma$, three panels captioned $\rho_k = 0.6$, $\rho_k = 0.6$ and $\rho_k = 0.4$.] Desirable to ensure $\sigma \approx \max(\Delta x)$ (where $\Delta x$ is the spacing between adjacent observations).
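The guideline $\sigma \approx \max(\Delta x)$ on slide 36 suggests one way of setting the kernel width from the data. The sketch below is a minimal reading of that guideline for 1-D inputs; the helper name `sigma_from_spacing` and the sample data are hypothetical, and the slides do not state that this exact rule was used in the experiments.

```python
import numpy as np

def sigma_from_spacing(x):
    """Return the largest gap between adjacent sorted 1-D observations."""
    xs = np.sort(np.asarray(x, dtype=float))
    return float(np.max(np.diff(xs)))

# Hypothetical observations drawn as in the simulation examples.
rng = np.random.default_rng(3)
u = rng.uniform(-1.0, 1.0, 1000)
sigma = sigma_from_spacing(u)
```

Taking the maximum gap, rather than the mean, keeps at least one neighbouring observation within roughly one kernel width of every sample, which is the situation in which the slides describe the representer as applicable.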