Recursive System Identification Using
Outlier-Robust Local Models
Jessyca A. Bessa and Guilherme A. Barreto
Federal University of Ceará, Graduate Program on Teleinformatics
Engineering, Campus of Pici, Fortaleza, Ceará, Brazil
e-mails: bessa.jessyca@ifce.edu.br, gbarreto@ufc.br
Abstract: In this paper we revisit the design of neural-network based local linear models for
dynamic system identification aiming at extending their use to scenarios contaminated with
outliers. To this purpose, we modify well-known local linear models by replacing their original
recursive rules with outlier-robust variants developed from the M-estimation framework. The
performances of the proposed variants are evaluated in free simulation tasks on three benchmarking datasets. The obtained results corroborate the considerable improvement in the performance of the proposed models in the presence of outliers.
Keywords: System identification, neural networks, local linear models, outliers, M-estimation.
1. INTRODUCTION
Dynamical system identification is a very challenging research area, whose goal is to build a useful model from observed input-output data (Sjöberg et al., 1995). For this purpose, there are several design and modeling methodologies to describe a system, select a suitable structure for the model, and then estimate its parameters. The global approach, for example, is based on the assumption that all the parameters of the approximating model (e.g. a neural network) are estimated using the entire dataset (Narendra and Parthasarathy, 1990). This is not always computationally feasible, especially if a large number of observations is available. An alternative is to use local models, whose parameters are estimated using only a partition of the data (Wang and Tanaka, 1999). Local modeling has a long history of contributions in system identification, with notable contributions originating within the fields of neural networks (Moshou and Ramon, 1997) and fuzzy modeling (Takagi and Sugeno, 1985). Recent applications are still being reported in specialized conferences and journals (Belz et al., 2017; Münker et al., 2017; Barreto and Souza, 2016; Costa et al., 2015).
Regardless of the structure of the model (either global or local), it is desirable that the approximating model be capable of dealing with abnormal samples, commonly called outliers, in order to avoid biased or unstable responses. Bearing this in mind, in this paper we revisit the design of neural-network-based local linear models for the identification of dynamic systems. Our ultimate goal is to extend the recursive learning rules of such local models in order to improve their performance in scenarios with a strong occurrence of outliers in the data. For this purpose, we selected three widely used local linear models and modified their adaptive learning rules with the help of mechanisms grounded in the robust statistical framework known as M-estimation (Huber, 1964). The main goal here is to validate our hypothesis that the use of learning rules based on M-estimators considerably increases the outlier-robustness of the evaluated models. To this end, a comprehensive performance comparison is carried out using three datasets commonly used for benchmarking purposes in the field of system identification.
The remainder of the paper is organized as follows. In Section 2 we describe the three local models evaluated in this article. The fundamentals of the robust framework of M-estimation are briefly described in Section 3. Experiments are described and the results are reported in Section 4. The paper is concluded in Section 5.
2. ANN-BASED LOCAL MODEL APPROACHES
In this section, we describe three ANN-based local linear models for system identification: the local linear map (LLM) (Walter et al., 1990), the radial basis function network (RBFN) (Chen et al., 1990; Yan et al., 2000) and the local model network (LMN) (Belz et al., 2017). For all these approaches we assume that each input vector x(t) ∈ R^{p+q} is defined as

x(t) = [u(t−1), ..., u(t−p), y(t−1), ..., y(t−q)]^T,   (1)

where x(t) is also called the vector of regressors and p + q = n. Also, y(t) = f(x(t)) is the observed output, where the function f : R^n → R is unknown and will be approximated by multiple local linear models.
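For illustration, here is a minimal sketch (in Python/NumPy, our illustration rather than the authors' code) of how the regressor vector of Eq. (1) can be assembled from past input and output samples:

```python
import numpy as np

def build_regressor(u, y, t, p, q):
    """Build x(t) = [u(t-1),...,u(t-p), y(t-1),...,y(t-q)]^T.

    u, y : 1-D arrays of inputs/outputs, indexed so that u[t] = u(t).
    Assumes t >= max(p, q) so that all required lags exist.
    """
    past_u = u[t-p:t][::-1]   # u(t-1), ..., u(t-p)
    past_y = y[t-q:t][::-1]   # y(t-1), ..., y(t-q)
    return np.concatenate([past_u, past_y])
```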
The LLM network approximates the unknown function with a set of linear filters, each of which is constrained to a non-overlapping local partition of the input space R^n. Each partition is associated with a prototype vector and a coefficient vector. The continuous input space is partitioned by a reduced number of prototype vectors, while the coefficients of the linear filter associated with each prototype vector provide a local estimator of the output of the mapping. More formally, the input space is partitioned via the self-organizing map (SOM) algorithm (Kohonen, 2013), with each neuron j owning a prototype vector w_j, where j = 1, ..., S, with S denoting the number of neurons of the SOM. The linear filter associated with the j-th prototype vector w_j is defined by a coefficient vector a_j ∈ R^{p+q}, which plays the role of the coefficients of the j-th local ARX model:

a_j(t) = [a_{j,1}, ..., a_{j,p}, ..., a_{j,p+q}]^T.   (2)
Thus, the adjustable parameters of the LLM model are the set of prototype vectors w_j and their coefficient vectors a_j, for j = 1, ..., S. Given the winner-take-all nature of the SOM, only one neuron per iteration is used to estimate the output of the LLM model. The index of the winning neuron at time t is obtained as follows:

j*(t) = arg min_j ||x(t) − w_j(t)||²,   (3)

where ||·|| denotes the Euclidean norm. The estimate of the output of the LLM is then computed as

ŷ(t) = y_{j*}(t) = a_{j*}^T(t) x(t),   (4)

where a_{j*}(t) is the coefficient vector of the linear filter associated with the current winning neuron j*(t), and it is used to build a local estimate of the output.
The learning rule for the prototype vectors w_j is that of the usual SOM algorithm:

w_j(t+1) = w_j(t) + α(t) h(j*, j; t) [x(t) − w_j(t)],   (5)

while the learning rule of the coefficient vectors a_j(t) is given by

a_j(t+1) = a_j(t) + α′(t) h(j*, j; t) Δa_j(t),   (6)

where h(j*, j; t) is the SOM neighborhood function and 0 < α, α′ ≪ 1 are, respectively, the learning rates of the prototype and coefficient vectors. The correction term Δa_j(t) is computed by means of a variant of the normalized LMS algorithm (Widrow, 2005):

Δa_j(t) = e_j(t) x(t) / ||x(t)||² = [y(t) − a_j^T(t) x(t)] x(t) / ||x(t)||²,   (7)

where e_j(t) is the prediction error of the j-th local model and y(t) is the actual observed output.
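As a concrete illustration, a minimal sketch of one LLM training step (Python/NumPy, not the authors' code) combining the winner search of Eq. (3), the output of Eq. (4), and the updates of Eqs. (5)-(7). The Gaussian neighborhood over a 1-D neuron lattice is a common choice assumed here, not prescribed by the paper:

```python
import numpy as np

def llm_step(W, A, x, y, alpha, alpha_a, sigma_h):
    """One recursive LLM update. W: (S, n) prototypes, A: (S, n) coefficients."""
    dists = np.linalg.norm(W - x, axis=1)
    j_star = np.argmin(dists)                     # Eq. (3): winning neuron
    y_hat = A[j_star] @ x                         # Eq. (4): local linear output
    # Assumed Gaussian neighborhood over neuron indices (1-D SOM lattice).
    h = np.exp(-0.5 * ((np.arange(len(W)) - j_star) / sigma_h) ** 2)
    W += alpha * h[:, None] * (x - W)             # Eq. (5): prototype update
    e = y - A @ x                                 # per-neuron prediction errors e_j(t)
    delta = e[:, None] * x / (x @ x + 1e-12)      # Eq. (7): normalized LMS term
    A += alpha_a * h[:, None] * delta             # Eq. (6): coefficient update
    return y_hat
```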
The RBFN is a classical feedforward neural network architecture with a single hidden layer of neurons (Chen et al., 1990). Hidden neurons have nonlinear activation functions, hereafter referred to as radial basis functions, while output neurons use linear ones. In RBFNs, the j-th basis function comprises two elements: a distance metric d_j(x) = d_j(x; c_j) and the basis function itself z_j = φ(d_j(x)), where c_j denotes the center of the j-th function. A Euclidean distance metric, d_j(x) = ||x − c_j||, and a Gaussian basis function z_j = exp{−d_j²/(2σ_j²)} are common choices, with σ_j denoting the radius (or width) of the basis function.
The design of the RBF network basically involves the specification of the number S of basis functions, the determination of their parameters (c_j, σ_j), j = 1, ..., S, and the computation of the output weights. For this purpose, we follow the approach introduced by Moody and Darken (1989), which consists of three sequentially executed stages: (i) the positions of the S centers are found by means of a vector quantization algorithm, such as the SOM network; (ii) heuristics are used for specifying the radii σ_j of the S basis functions (in this paper, σ_j is computed as half of the distance between the center c_j and the nearest center); and (iii) the output weights are computed by means of the LMS learning rule (a.k.a. the Widrow-Hoff rule).
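A minimal sketch of stage (ii), the half-nearest-center radius heuristic (Python/NumPy; our illustration, under the assumption that all centers are distinct):

```python
import numpy as np

def half_nearest_radii(C):
    """C: (S, n) array of centers. Returns sigma_j = 0.5 * dist(c_j, nearest other center)."""
    D = np.linalg.norm(C[:, None, :] - C[None, :, :], axis=-1)  # pairwise distances
    np.fill_diagonal(D, np.inf)     # exclude each center from its own search
    return 0.5 * D.min(axis=1)
```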
In the current paper, we are interested in identifying a MISO system, so we assume only one output neuron (the proposed approach can be easily extended to handle MIMO systems). Hence, the estimate of the output of the RBFN is computed as

ŷ(t) = w^T(t) z(t) = Σ_{j=1}^{S} w_j(t) z_j(t),   (8)

where w = [w_1 w_2 ··· w_S]^T is the weight vector of the output neuron and z = [z_1 z_2 ··· z_S]^T is the vector of basis function activations. The LMS rule is used to adapt the output weights as

w(t+1) = w(t) + α(t) e(t) z(t) = w(t) + α(t) [y(t) − ŷ(t)] z(t),   (9)

where e(t) = y(t) − ŷ(t) is the model's prediction error. The NARX-RBFN model can be understood as an ANN-based implementation of a zero-order Takagi-Sugeno (TS) model (Takagi and Sugeno, 1985).
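Putting Eqs. (8)-(9) together, one RBFN training step might look like the following sketch (Python/NumPy, under the same illustrative assumptions as above):

```python
import numpy as np

def rbf_step(C, sigma, w, x, y, alpha):
    """One recursive RBFN update. C: (S, n) centers, sigma: (S,) radii, w: (S,) weights."""
    d = np.linalg.norm(C - x, axis=1)        # d_j(x) = ||x - c_j||
    z = np.exp(-d**2 / (2.0 * sigma**2))     # Gaussian activations z_j
    y_hat = w @ z                            # Eq. (8): linear output layer
    w += alpha * (y - y_hat) * z             # Eq. (9): LMS weight update
    return y_hat
```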
The LMN was introduced by Johansen and Foss (1992) as a generalization of the RBF network. It can be viewed as implementing a decomposition of the complex, nonlinear system into a set of locally accurate submodels, which are then smoothly integrated by associated basis functions. This means that a smaller number of local models can cover larger areas of the input space, when compared with the simple RBFN model. In the LMN, the output estimation of the RBF model in Eq. (8) is extended to involve not only a constant weight associated with each basis function, but instead a function f_j(x; w_j) associated with each basis function:

ŷ(t) = Σ_{j=1}^{S} z_j(t) f_j(x; w_j),   (10)

where a common choice for f_j(x; w_j) is the multiple linear regression function f_j(x; w_j) = w_j^T x.
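A minimal sketch of the LMN output of Eq. (10) with linear local models (Python/NumPy; illustration only). Note that, unlike the RBFN, each basis function now carries a full coefficient vector w_j:

```python
import numpy as np

def lmn_output(C, sigma, W, x):
    """Eq. (10): blend of local linear models. W: (S, n), one coefficient vector per unit."""
    d = np.linalg.norm(C - x, axis=1)
    z = np.exp(-d**2 / (2.0 * sigma**2))   # basis function activations z_j
    f = W @ x                              # local linear outputs f_j(x) = w_j^T x
    return z @ f                           # smooth blending of the local models
```

In many LMN formulations the basis activations are normalized to form a partition of unity; the sketch follows Eq. (10) as written.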
Like the NARX-RBF, the NARX-LMN model can be interpreted as an ANN-based implementation of the TS model. In the NARX-LMN model, however, the radial basis functions correspond to the rules of the TS model, while the local function f_j(x; w_j) = w_j^T x is used in the rules' consequents. The LMN is still a common choice for ANN-based system identification and control (König et al., 2014; Costa et al., 2015; Münker et al., 2017; Belz et al., 2017; Maier et al., 2018).
3. OUTLIER-ROBUSTNESS VIA M-ESTIMATION
In real-world applications of recursive estimation, the
available data are often contaminated with outliers, which
are roughly defined as observations differing markedly
from those usually expected in the task of interest. Robust
regression techniques, such as those developed within the
framework of M-estimation (Huber, 1964), seek to develop
estimators that are more resilient to outliers.
In this context, Zou et al. (2000) and Chan and Zhou (2010) introduced modifications to the standard LMS rule in order to robustify it for outlier-rich data. The proposed method was named the least mean M-estimate (LMM), and one of its main virtues is its simplicity, in the sense that no extra computational burden is added to the parameter estimation process. As a consequence, the robust LMM rule works as fast as the original non-robust LMS rule.

Table 1. Summary of the evaluated datasets.

Dataset     Estimation samples   Test samples   L̂_u   L̂_y
Dryer       500                  500            5      5
pH          200                  800            5      5
Silverbox   91072                40000          10     10
As an example, taking the LMS rule shown in Eq. (9), the equivalent LMM rule is simply written as

w(t+1) = w(t) + α(t) q(e(t)) e(t) z(t) = w(t) + α(t) q(e(t)) [y(t) − ŷ(t)] z(t),   (11)

where q(e(t)) is a scalar function that penalizes high values of the error e(t) (usually caused by outliers). A similar change is introduced in the LMS-based rules of the other local models discussed previously.
In this paper, we use the Huber function for q(e(t)):

q(e(t)) = κ/|e(t)|,  if |e(t)| > κ,
        = 1,         otherwise,   (12)

where the constant κ > 0 is a user-defined threshold. When |e(t)| is less than κ, the weight function q(e(t)) is equal to 1 and Eq. (11) reduces to the LMS rule. When |e(t)| is greater than κ, q(e(t)) decays toward zero as |e(t)| → ∞. In this way, the LMM rule effectively reduces the effect of large errors, usually caused by outliers. It is recommended to use κ = 1.345σ, where σ corresponds to the standard deviation of the residuals.
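For concreteness, a minimal sketch of the Huber weight of Eq. (12) and the resulting LMM update of Eq. (11) (Python/NumPy; passing a residual standard deviation estimate sigma_e is our assumption, one of several reasonable ways to set κ online):

```python
import numpy as np

def huber_weight(e, kappa):
    """Eq. (12): q(e) = 1 inside the threshold, kappa/|e| outside."""
    return 1.0 if abs(e) <= kappa else kappa / abs(e)

def lmm_step(w, z, y, alpha, sigma_e):
    """One LMM update of the output weights, Eq. (11)."""
    e = y - w @ z                  # prediction error e(t)
    kappa = 1.345 * sigma_e        # recommended threshold (sigma_e: residual std.)
    q = huber_weight(e, kappa)     # down-weights outlier-sized errors
    w += alpha * q * e * z         # robustified LMS step
    return w, e
```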
As mentioned in the introduction, the main goal of this paper is to replace the original LMS rule used by the local linear models described in Section 2 with the outlier-robust LMM rule. We hypothesize that this replacement endows the local models with greater resilience to outliers. The computational experiments testing this hypothesis and the obtained results are presented in the next section.
4. RESULTS AND DISCUSSION
In this section, we report the results of a comprehensive performance comparison of the local models described in Section 2. Six models are implemented from scratch using Octave (www.gnu.org/software/octave/): the three original versions (LLM/LMS, RBF/LMS and LMN/LMS) and the three proposed outlier-robust variants (LLM/LMM, RBF/LMM and LMN/LMM). We first report results on two benchmarking datasets (Dryer and pH), which are publicly available for download from the DaISy repository website at the Katholieke Universiteit Leuven (http://homes.esat.kuleuven.be/~smc/daisy/).

Additionally, we evaluate all aforementioned models on a large-scale dataset. For this matter, we choose the Silverbox dataset, which was introduced in Schoukens et al. (2003). Some important features of the Dryer, pH and Silverbox datasets are summarized in Table 1.
Fig. 1. RMSE values for validation data as a function of the amount of outliers in training data (Dryer dataset).
All models are evaluated by the root mean square error, RMSE = √((1/N) Σ_{t=1}^{N} e²(t)). It must be pointed out that our goal in the experiments is not to find the best-performing model for each dataset, but rather to confirm the improvement in the performance of all models by the use of the outlier-robust LMM rule. Thus, for each dataset we report the results for only one of the local models described in Section 2.
Training was carried out in one-step-ahead prediction mode, a.k.a. the series-parallel training mode (Ribeiro and Aguirre, 2018). Outliers were artificially introduced in the training data only, in different proportions. For this purpose, we followed the same procedure used by Mattos et al. (2017). The reported results correspond to the post-training evaluation of the 6 models in outlier-free scenarios. The rationale for this approach is to assess how the outliers affect the parameter estimation process (training) and its impact on the model validation (testing) phase. Trained models were tested under the free simulation regime, in which predicted output values were fed back to the input regression vector; a sketch of this loop is given below. During testing, the parameters of the models were not updated. We defined 200 epochs for training all models. The number of neurons was set to S = 5 after some initial experimentation with the data. Higher values of S did not significantly improve the results, while smaller values led to poorer performances for all evaluated models. The initial and final learning rates were defined as α_0 = 0.5 and α_T = 0.001.
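A minimal sketch of the free simulation (parallel) test regime just described (Python/NumPy; `predict` stands for any of the trained local models, and warming up with the true outputs for the first max(p, q) steps is our assumption):

```python
import numpy as np

def free_run(predict, u, y_true, p, q):
    """Free simulation: feed predictions back into the regressor of Eq. (1)."""
    N = len(u)
    y_sim = np.zeros(N)
    y_sim[:max(p, q)] = y_true[:max(p, q)]   # warm-up with observed outputs
    for t in range(max(p, q), N):
        past_u = u[t-p:t][::-1]              # u(t-1), ..., u(t-p)
        past_y = y_sim[t-q:t][::-1]          # fed-back predictions, not y_true
        x = np.concatenate([past_u, past_y])
        y_sim[t] = predict(x)                # parameters frozen during testing
    return y_sim
```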
For the Dryer dataset, we report the results of the LMN models only. This dataset corresponds to a mechanical SISO system from a laboratory setup acting like a hair dryer. In this system, the air temperature y(t) is measured by a thermocouple at the output, while the input u(t) is the voltage over the heating device. From the 1000 available samples of u(t) and y(t), we use the first half for model building and the other half for model evaluation. The RMSE values as a function of the percentage of outliers for the Dryer dataset are shown in Figure 1 and Table 2. One can easily see the improvement in the performance of the LMN model due to the replacement of the original LMS rule by the outlier-robust LMM rule: while the performance of the original LMN-LMS model deteriorates as the number of outliers in the training data increases, the performance of the robust LMN-LMM model is practically insensitive to the presence of outliers.
Fig. 2. Predictive performance of the LMN-LMS (left) and LMN-LMM (right) for 15% of outliers (Dryer dataset).
Table 2. RMSE results for the LMN model on the Dryer dataset.

                       0% outliers             15% outliers
Model                  RMSE       STD          RMSE       STD
LMN-LMS (original)     1.66E-02   3.75E-03     2.68E-02   1.63E-03
LMN-LMM (robust)       1.76E-02   1.52E-03     1.26E-02   3.13E-03
Table 3. RMSE results for the RBF models on the pH dataset.

                       0% outliers             15% outliers
Model                  RMSE       STD          RMSE       STD
RBF-LMS (original)     5.96E-01   1.21E-03     6.53E-01   4.50E-03
RBF-LMM (robust)       5.90E-01   2.97E-04     5.88E-01   1.70E-03
Fig. 3. RMSE values for validation data as a function of the amount of outliers in training data (pH dataset).
We can exemplify the improvement in performance for the Dryer dataset with a typical prediction result of the LMN model, as shown in Figure 2. The improvement in performance of the LMN model due to the replacement of the original LMS rule by the outlier-robust LMM rule is easily observed.
For the pH dataset, we evaluate RBF models only. The
data comes from a pH neutralization process in a constant
volume stirring tank. The control input is the base solution
flow and the output is the pH value of the solution in the
tank. We use the first 200 samples for model building and
parameter estimation and the next 800 samples for model
validation (testing).
A numerical comparison in terms of RMSE values of the two variants of the RBF model for the pH dataset is shown in Table 3 and Figure 3. Two scenarios are tested: the outlier-free scenario and one with 15% outlier contamination. We observe that, while the RMSE values are similar in the outlier-free scenario, the robust RBF model achieved lower RMSE values than the original RBF model in the outlier-contaminated scenario. We also observe in Figure 3 that the deterioration in performance with the increase in contamination levels is much smaller for the RBF-LMM model.
A typical example of the prediction results of the RBF models under the free simulation regime for the pH dataset is shown in Figure 4. As can be seen, both models were able to capture the system dynamics; however, the robust RBF-LMM model was much less influenced by the presence of outliers in the training data.
As a final experiment, we decided to evaluate the performance of the LLM models on a large-scale dataset. To this end, we selected the Silverbox dataset (Schoukens et al., 2003). Obtained from an electrical circuit simulating a mass-spring-damper system, it corresponds to a nonlinear dynamical system with feedback and a dominant linear behavior. This dataset contains a total of 131,072 samples of each sequence u(t) and y(t), of which 91,072 samples were used for model building and 40,000 for model validation (cf. Table 1). Since the Silverbox dataset is very long, training is not repeated for several epochs; in other words, the model is used in a fully online mode with recursive parameter estimation, in which a single pass through the data is enough for the model to converge.
A numerical comparison in terms of RMSE values of the two variants of the LLM model for the Silverbox dataset is shown in Table 4 and Figure 5. Two scenarios were tested: the outlier-free scenario and one with 15% outlier contamination. It can be easily observed that the deterioration in performance with the increase in contamination levels is much smaller for the robust LLM-LMM model.
Fig. 4. Predictive performance of the RBF-LMS (left) and RBF-LMM (right) for 15% of outliers (pH dataset).
Table 4. RMSE results for the LLM models on the Silverbox dataset.

                       0% outliers             15% outliers
Model                  RMSE       STD          RMSE       STD
LLM-LMS (original)     2.05E-05   2.63E-06     3.40E-05   4.05E-06
LLM-LMM (robust)       1.21E-05   1.68E-07     1.93E-05   6.43E-07
Fig. 5. RMSE values for validation data as a function of the amount of outliers in training data (Silverbox dataset).
A typical example of the prediction results of the LLM models under the free simulation regime for the Silverbox dataset is shown in Figure 6. For better visualization of the results, we show only the first 200 output samples of the validation data. As expected, the improvement in performance of the LLM model due to the replacement of the original LMS rule by the outlier-robust LMM rule is clearly observed.

Fig. 6. Predictive performance of the LLM-LMS (left) and LLM-LMM (right) for 15% of outliers (Silverbox dataset).
5. CONCLUSIONS
In this paper, we revisited a class of ANN-based identification models known as local models and addressed the issue of how outliers affect the parameter estimation (i.e. learning) of the approximating model. The evaluated local models use the well-known Widrow-Hoff rule, also called the LMS rule, in the recursive estimation of their parameters. The main goal of the article was, firstly, to evaluate how badly outliers affect the predictive performance of the approximating local models and, secondly, to verify whether the replacement of the original LMS rule by an outlier-robust version, called the LMM rule (Zou et al., 2000), was capable of improving the performances of the local models by providing them with higher resilience to outliers.
The results of a comprehensive comparative study with benchmarking datasets revealed that the simple replacement of the LMS rule by the outlier-robust LMM rule provided a considerable improvement in the performance of the local models evaluated in outlier-contaminated scenarios. This study also showed a considerable improvement for all the robust versions of the local models evaluated, and that the improvement was even more evident as the proportion of outliers in the training set increased.
Currently, we are evaluating the performance of growing robust local models, that is, models in which the prior specification of the number of neurons is not necessary because neurons are allocated over time only when required.
ACKNOWLEDGEMENTS
This study was financed in part by the Coordenação de Aperfeiçoamento de Pessoal de Nível Superior - Brasil (CAPES) - Finance Code 001. The authors also thank IFCE (Campus of Maranguape) and CNPq (grant no. 309451/2015-9) for supporting this research.
REFERENCES
Barreto, G.A. and Souza, L.G.M. (2016). Novel ap-
proaches for parameter estimation of local linear models
for dynamical system identification. Applied Intelli-
gence, 44(1), 149–165.
Belz, J., Münker, T., Heinz, T.O., Kampmann, G., and Nelles, O. (2017). Automatic modeling with local model networks for benchmark processes. IFAC-PapersOnLine, 50(1), 470–475.
Chan, S.C. and Zhou, Y. (2010). On the performance analysis of the least mean M-estimate and normalized least mean M-estimate algorithms with Gaussian inputs and additive Gaussian and contaminated Gaussian noises. Journal of Signal Processing Systems, 60(1), 81–103.
Chen, S., Billings, S.A., Cowan, C.F.N., and Grant, P.M.
(1990). Non-linear systems identification using radial
basis functions. International Journal of Systems Sci-
ence, 21(12), 2513–2539.
Costa, T.V., Fileti, A.M.F., Oliveira-Lopes, L.C., and
Silva, F.V. (2015). Experimental assessment and de-
sign of multiple model predictive control based on local
model networks for industrial processes. Evolving Sys-
tems, 6(4), 243–253.
Huber, P. (1964). Robust estimation of a location pa-
rameter. The Annals of Mathematical Statistics, 35(1),
73–101.
Johansen, T.A. and Foss, B.A. (1992). A NARMAX
model representation for adaptive control based on local
models. Modeling, Identification and Control, 13(1), 25.
Kohonen, T. (2013). Essentials of the self-organizing map.
Neural Networks, 37, 52–65.
König, O., Hametner, C., Prochart, G., and Jakubek, S. (2014). Battery emulation for power-HIL using local model networks and robust impedance control. IEEE Transactions on Industrial Electronics, 61(2), 943–955.
Maier, C.C., Schirrer, A., and Kozek, M. (2018). Real-time
capable nonlinear pantograph models using local model
networks in state-space configuration. Mechatronics, 50,
292–302.
Mattos, C.L.C., Dai, Z., Damianou, A., Barreto, G.A., and Lawrence, N.D. (2017). Deep recurrent Gaussian processes for outlier-robust system identification. Journal of Process Control, 60, 82–94.
Moody, J. and Darken, C.J. (1989). Fast learning in networks of locally-tuned processing units. Neural Computation, 1(2), 281–294.
Moshou, D. and Ramon, H. (1997). Extended self-
organizing maps with local linear mappings for func-
tion approximation and system identification. In Pro-
ceedings of the 1st Workshop on Self-Organizing Maps
(WSOM’97), 1–6.
Münker, T., Heinz, T.O., and Nelles, O. (2017). Hierarchical model predictive control for local model networks. In Proceedings of the American Control Conference (ACC'2017), 5026–5031.
Narendra, K. and Parthasarathy, K. (1990). Identification
and control of dynamical systems using neural networks.
IEEE Transactions on Neural Networks, 1(1), 4–27.
Ribeiro, A.H. and Aguirre, L.A. (2018). Parallel train-
ing considered harmful?: Comparing series-parallel and
parallel feedforward network training. Neurocomputing,
316, 222–231.
Schoukens, J., Nemeth, J.G., Crama, P., Rolain, Y., and
Pintelon, R. (2003). Fast approximate identification of
nonlinear systems. Automatica, 39(7), 1267–1274.
Sjöberg, J., Zhang, Q., Ljung, L., Benveniste, A., Delyon, B., Glorennec, P.Y., Hjalmarsson, H., and Juditsky, A. (1995). Nonlinear black-box modeling in system identification: a unified overview. Automatica, 31(12), 1691–1724.
Takagi, T. and Sugeno, M. (1985). Fuzzy identification of
systems and its applications to modelling and control.
IEEE Transactions on Systems, Man, and Cybernetics,
15(1), 116–132.
Walter, J., Ritter, H., and Schulten, K. (1990). Nonlinear
prediction with self-organizing maps. In Proceedings
of the IEEE International Joint Conference on Neural
Networks (IJCNN’90), 589–594.
Wang, S. and Tanaka, M. (1999). Nonlinear system
identification with piecewise-linear functions. IFAC
Proceedings Volumes, 32(2), 3796–3801.
Widrow, B. (2005). Thinking about thinking: The dis-
covery of the LMS algorithm. IEEE Signal Processing
Magazine, 22(1), 100–106.
Yan, L., Sundararajan, N., and Saratchandran, P. (2000).
Nonlinear system identification using Lyapunov based
fully tuned dynamic RBF networks. Neural Processing
Letters, 12(3), 291–303.
Zou, Y., Chan, S.C., and Ng, T.S. (2000). Least mean
M-estimate algorithms for robust adaptive filtering in
impulse noise. IEEE Transactions on Circuits and Sys-
tems II: Analog and Digital Signal Processing, 47(12),
1564–1569.