1. TSINGHUA SCIENCE AND TECHNOLOGY
ISSN 1007-0214 04/19 pp24-29
Volume 9, Number 1, February 2004
Wavelet Neural Networks for Adaptive Equalization by
Using the Orthogonal Least Square Algorithm*
JIANG Minghu1,**, DENG Beixing2, Georges Gielen3
1. Lab of Computational Linguistics, Department of Chinese Language, Tsinghua University, Beijing 100084, China;
2. Department of Electronic Engineering, Tsinghua University, Beijing 100084, China;
3. Department of Electrical Engineering, K.U.Leuven, Kasteelpark Arenberg 10, B3001 Heverlee, Belgium
Abstract: Equalizers are widely used in digital communication systems for corrupted or time varying
channels. To overcome performance decline for noisy and nonlinear channels, many kinds of neural
network models have been used in nonlinear equalization. In this paper, we propose a new nonlinear
channel equalization, which is structured by wavelet neural networks. The orthogonal least square
algorithm is applied to update the weighting matrix of wavelet networks to form a more compact wavelet
basis unit, thus obtaining good equalization performance. The experimental results show that performance
of the proposed equalizer based on wavelet networks can significantly improve the neural modeling
accuracy and outperform conventional neural network equalization in signal to noise ratio and
non-linearity.
Key words: adaptive equalization; wavelet neural networks (WNNs); orthogonal least square (OLS)
Introduction
The aim of equalization is to find an unknown signal
from a corrupted and imperfect observation. Early
equalization algorithms deal only with the case of lin-
ear operations whereas the equalizer performance of
these algorithms is greatly degraded under a nonlinear
operation. In order to solve the problem, researchers
have used a number of nonlinear neural network
models, such as the multilayer perceptron (MLP)[1], the radial basis function (RBF) networks[2,3], and the high-order function neural networks[4], to carry out nonlinear channel equalization in a digital communication system. In the last decade, wavelet neural networks (WNNs)[5] have been successfully applied to system identification and control, function approximation, pattern recognition, signal detection and compression, and rapid classification of varying signals. In each case many promising results have been reported[6-11] on the powerful nonlinear ability of WNNs and
their ability to sample arbitrarily complex decision re-
gions. WNNs have a single hidden layer structure and
have scaling functions and wavelets as activation
functions. They can be understood as neural structures
which employ a wavelet layer to perform an adaptive
feature extraction in the time-frequency domain.
These wavelet analyses in combination with neural
networks are used in feature extraction and dimension
reduction of the input space. WNNs can be used for
classifying channel equalization disturbances. Not
only weights, but also the parameters of the wavelet
functions (translation, dilation) can be jointly fitted
from the input data. In the current study, the number
of wavelet functions may be chosen by the user and
the parameters are optimized by a learning process.
The more wavelets used, the more precise the classification.

Received: 2002-06-24
Supported by the Tsinghua University Research Foundation, the Excellent Young Teacher Program of the Ministry of Education, and the Returnee Science Research Startup Fund of the Ministry of Education of China
** To whom correspondence should be addressed. E-mail: jiang.mh@tsinghua.edu.cn; Tel: 86-10-62788647
1 Modeling of WNNs for Adaptive Equalization by the OLS Algorithm
Assume that the transmitted sequence x(n) is passed through a dispersive channel, where the channel output y(n) is corrupted by an additive zero-mean white noise u(n) (independent of x(n)) with variance $\sigma_n^2$; then y(n) is equal to:

$$y(n)=g(n)+u(n)=h(n)*x(n)+u(n)=\sum_{i=0}^{N_c}h(i)x(n-i)+u(n) \qquad (1)$$

where g(n) denotes the output of the noise-free channel, and h(i) and $N_c$ are the i-th channel coefficient and the channel order, respectively.
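To make the channel model concrete, the following minimal numpy sketch passes an equiprobable binary sequence through a dispersive FIR channel and adds zero-mean white noise as in Eq. (1). The tap values are those of the H1(z) channel introduced in Section 2; the sequence length, seed, and noise variance are arbitrary choices of this sketch.

```python
import numpy as np

rng = np.random.default_rng(0)

def channel_output(x, h, noise_var):
    """Eq. (1): g(n) = sum_i h(i) x(n-i), y(n) = g(n) + u(n)."""
    g = np.convolve(x, h)[:len(x)]                        # noise-free channel output g(n)
    u = rng.normal(0.0, np.sqrt(noise_var), size=len(x))  # zero-mean white noise u(n)
    return g + u

# equiprobable binary symbols x(n) in {-1, +1}
x = rng.choice([-1.0, 1.0], size=2000)
h = np.array([0.181, 0.272, 0.905, 0.272])                # taps of H1(z) from Section 2
y = channel_output(x, h, noise_var=0.01)
```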
The role of an equalizer is to produce an estimate $\hat x(n-d)$ from the observed data $y(n), y(n-1), \ldots, y(n-m+1)$. The integers m and d are known as the order and the decision delay of the equalizer. The signal x(n) is distorted while it propagates through the nonlinear channel. The estimate $\hat x(n-d)$ is given as

$$\hat x(n-d)=\mathrm{sgn}(\hat v(n))=\mathrm{sgn}\big(f(y(n),y(n-1),\ldots,y(n-m+1))\big) \qquad (2)$$

where $f(\cdot)$ is a nonlinear discriminative function specified by the channel characteristic and noise. The WNNs are adopted to approximate this function by using the minimum mean square error (MSE) criterion.
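The decision rule of Eq. (2) can be sketched as follows; f stands for whatever discriminant function the trained WNN supplies, and the helper names are ours:

```python
def equalizer_input(y, n, m):
    """Observation window [y(n), y(n-1), ..., y(n-m+1)] of Eq. (2)."""
    return y[n - m + 1 : n + 1][::-1]

def equalize(f, y, n, m, d):
    """Eq. (2): decide x_hat(n-d) = sgn(f(y(n), ..., y(n-m+1)))."""
    return np.sign(f(equalizer_input(y, n, m)))
```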
It is well known that wavelet decomposition allows us to decompose any function using a family of functions obtained by dilating and translating a single mother wavelet function $\psi(y)$. Let P be a wavelet frame in the space $L^2(R^1)$[5]:

$$P_{q,r}(y)=a^{r/2}\,\psi(a^{r}y-qb),\qquad r\in Z,\ q\in Z^1 \qquad (3)$$

where b and a are the shifts and dilations for each daughter wavelet, e.g., a = 2, b = 1. For the i-th pattern, the input pattern is defined by $Y^{(i)}=[y(i),y(i-1),\ldots,y(i-m+1)]=[y_1^{(i)},y_2^{(i)},\ldots,y_m^{(i)}]$.
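For illustration, a daughter wavelet of the frame in Eq. (3) can be evaluated as below, using the Mexican hat mother wavelet adopted in Section 2 and the example values a = 2, b = 1 given above:

```python
def mexican_hat(t):
    """Mexican hat mother wavelet (the choice used in Section 2)."""
    return (2.0 / np.sqrt(3.0)) * np.pi ** -0.25 * (1.0 - t ** 2) * np.exp(-t ** 2 / 2.0)

def daughter(y, q, r, a=2.0, b=1.0):
    """Eq. (3): P_{q,r}(y) = a^(r/2) * psi(a^r * y - q * b)."""
    return a ** (r / 2.0) * mexican_hat(a ** r * y - q * b)
```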
WNNs perform a nonlinear mapping on the state
obtained at the input layer such that more compact
clusters are created for the different classes as the out-
put of the network. The equalizer order, channel order
and delay are the determining factors in establishing
the number of wavelet units. Increasing the equalizer
order can improve the performance of the system to
some extent, but too large an order is also concomitant
with increasing noise power in the equalizer input.
Generally, the equalizer order is chosen to be not less
than the channel delay[12]. The constructed WNNs are
expected to have the best performance for a certain
complexity or provide a certain performance level
with a minimum complexity.
The WNNs are constructed to approximate the nonlinear discriminant function $f(\cdot)$:

$$\hat v(n)=\sum_{k=0}^{m}w_k y_k^{(n)}+\sum_{j=1}^{l}w_{j+m}\prod_{k=1}^{m}\psi_{q(j),r(j)}\big(y_k^{(n)}\big)+\sum_{i=1}^{s}w_{i+l+m}\,\psi_{q(i),r(i)}\Big(\sum_{k=1}^{m}y_k^{(n)}\Big)+u(n)=\sum_{k=0}^{K}w_k p_k^{(n)}+u(n) \qquad (4)$$
where K = l+s+m+1 is the number of wavelet basis units; l and s are the numbers of product-form and sum-form wavelet basis units, respectively; and $w_k$ and $p_k$ are the k-th weight coefficient (equalizer coefficient) and wavelet basis. Because the second and third terms in Eq. (4) can easily express any non-linearity, Eq. (4) can realize arbitrary nonlinear transformations within certain accuracy ranges when l and s are large enough. The number of the WNNs units grows exponentially with the channel filter length. Assume that the vector of the hidden layer

$$P^{(n)}=\{p_0^{(n)},p_1^{(n)},\ldots,p_K^{(n)}\}=\Big\{y_1^{(n)},y_2^{(n)},\ldots,y_m^{(n)},\prod_{k=1}^{m}\psi_{q(1),r(1)}\big(y_k^{(n)}\big),\ldots,\psi_{q(s),r(s)}\Big(\sum_{k=1}^{m}y_k^{(n)}\Big)\Big\}$$

is the input vector Y expanded through wavelet functions to higher dimensions. The higher performance of the
WNN structure is easily obtained due to its capability
of forming more complex nonlinear discrimination
functions in the input pattern space and of using
wavelet-hidden spaces with larger dimensions. The
output of WNNs performs a nonlinear mapping on the
state obtained at the input layer such that more com-
pact clusters are created for the different classes. The
expanded model can easily yield a flat network solu-
tion W, which means the model is linearly separable.
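A sketch of the wavelet functional expansion in Eq. (4) follows, reusing daughter() from above. The (q, r) index pairs for the l product-form and s sum-form units are left to the designer, and the leading bias entry reflects one reading of the p_0 term:

```python
def expand(y_vec, prod_params, sum_params):
    """Builds the hidden-layer vector P^(n) of Eq. (4): the m input samples,
    l product-form wavelet units and s sum-form wavelet units,
    preceded by a bias entry (our reading of the p_0 term)."""
    y_vec = np.asarray(y_vec, dtype=float)
    linear = list(y_vec)                                           # y_1 ... y_m
    prods = [np.prod(daughter(y_vec, q, r)) for q, r in prod_params]
    sums = [daughter(np.sum(y_vec), q, r) for q, r in sum_params]
    return np.array([1.0] + linear + prods + sums)
```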
Although wavelet functional expansion can easily ob-
tain higher nonlinear ability, the expanded wavelet
basis is often redundant for estimating v(n). In practice,
such full expansions are computationally inefficient, and the number of relevant channel states should be reduced. It is more reasonable to use a limited expansion in terms of a few basis functions with appropriate parameter values, so that both the training and testing sets can be represented with acceptable accuracy. For the WNNs equalizer, the channel states
are linearly separable. We can make use of the OLS
method to reduce the number of relevant channel
states.
Take Eq. (4) into consideration. If N input and output measurements are available, we have the matrix form

$$\hat V_{N\times 1}=[\hat v(1),\hat v(2),\ldots,\hat v(N)]^{\mathrm T}=P_{N\times(K+1)}W_{(K+1)\times 1}+U_{N\times 1} \qquad (5)$$

where $P_{N\times(K+1)}=[P^{(1)},P^{(2)},\ldots,P^{(N)}]^{\mathrm T}\in R^{N\times(K+1)}$.
$w_k$ can be obtained by minimizing the MSE between the input $x(n-d)$ and the estimated signal $\hat x(n-d)$, i.e.,

$$E=0.5\sum_{n=m}^{N}\big(v(n)-\hat v(n)\big)^2=0.5\sum_{n=m}^{N}\Big(v(n)-\sum_{k=0}^{K}w_k p_k\Big)^2 \qquad (6)$$
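Stacking the expanded vectors row-wise gives the matrix P of Eq. (5), after which the cost of Eq. (6) can be minimized directly by least squares. The sketch below continues the running example; the (q, r) grids are arbitrary illustrative choices sized to the l = 12, s = 9 counts reported in Section 2, and the indexing is one plausible convention, not necessarily the paper's exact one.

```python
# Matrix form of Eq. (5) and direct least-squares minimization of Eq. (6).
m, d = 3, 2
prod_params = [(q, r) for q in range(-2, 2) for r in range(3)]   # l = 12 pairs (assumed grid)
sum_params = [(q, 0) for q in range(-4, 5)]                      # s = 9 pairs (assumed grid)
N = len(y)
P = np.vstack([expand(equalizer_input(y, n, m), prod_params, sum_params)
               for n in range(m - 1, N)])
v = x[m - 1 - d : N - d]                                         # targets x(n-d)
W_ls, *_ = np.linalg.lstsq(P, v, rcond=None)
```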
Minimization of Eq. (6) to determine the parameters $w_k$ can be performed by using error back-propagation of gradient descent[5]. In order to reduce the number of wavelet basis terms and increase the computational efficiency, we used the OLS optimal algorithm to select the wavelet nodes.

OLS algorithms have been used to construct RBF networks and to reduce the number of basis units[13-15].
Although RBF networks are used in adaptive equalization, WNNs are generally not RBF networks, since multi-dimensional scaling functions are not radially symmetric[11]. The training data do not pro-
vide any information for determining the coefficients
of these empty wavelet basis units which should be
eliminated, and therefore, a small number of basis
units may be adequate for determining the solution of
the minimized MSE. The OLS method is a more effi-
cient selection of wavelet basis units than the ran-
dom-based approach. It does not necessarily produce the smallest network for a given approximation accuracy. Instead, it tries to extract those good features which contain only problem-specific information from the input patterns and to remove much of the additional irrelevant information. This approach leads to a simpler and more reasonable structure. Reducing the number of wavelet basis units yields a more practical equalizer design. The equalizer order, channel order and de-
number of wavelet units. We expect to reduce the
number of wavelet basis units needed to achieve the
same performance provided that only those wavelets
which contain useful information are retained. WNNs are established recursively by decreasing the number of units while preserving network convergence at each stage. The algorithm for carrying out the procedure is
described as follows. Equation (5) can be rewritten in
matrix form as
$$\begin{bmatrix}v(1)\\v(2)\\\vdots\\v(N)\end{bmatrix}=\begin{bmatrix}p_0(1)&p_1(1)&\cdots&p_K(1)\\p_0(2)&p_1(2)&\cdots&p_K(2)\\\vdots&\vdots&&\vdots\\p_0(N)&p_1(N)&\cdots&p_K(N)\end{bmatrix}\begin{bmatrix}w_0\\w_1\\\vdots\\w_K\end{bmatrix}+[u] \qquad (7)$$
where $[u]$ is assumed to be a zero-mean white noise sequence which is not correlated with the input and output data. $P_{N\times(K+1)}$ is a wavelet basis matrix that can be decomposed into the product of an orthogonal matrix $H$ and an upper triangular matrix $A$. $W_{(K+1)\times 1}$ represents the unknown parameters to be estimated. Equation (7) can be re-arranged to yield:

$$V_{N\times 1}=P_{N\times(K+1)}W_{(K+1)\times 1}+U_{N\times 1}=H_{N\times(K+1)}A_{(K+1)\times(K+1)}W_{(K+1)\times 1}+U_{N\times 1}=H_{N\times(K+1)}G_{(K+1)\times 1}+U_{N\times 1} \qquad (8)$$
where

$$H=[H_0,H_1,\ldots,H_K]=\begin{bmatrix}h_0(1)&h_1(1)&\cdots&h_K(1)\\h_0(2)&h_1(2)&\cdots&h_K(2)\\\vdots&\vdots&&\vdots\\h_0(N)&h_1(N)&\cdots&h_K(N)\end{bmatrix} \qquad (9)$$
According to the Gram-Schmidt algorithm[16], the orthogonal optimal wavelet basis units are used to construct the WNNs. The set of orthogonal vectors $\{H_k\}$ is constructed from the $\{P_k\}$ vectors by:

$$H_0=P_0,\qquad H_1=P_1-C_{1,0}H_0,\qquad \ldots,\qquad H_K=P_K-\sum_{i=0}^{K-1}C_{K,i}H_i \qquad (10)$$
Equation (10) is equivalent to the following two equations:

$$h_0(l)=p_0(l),\qquad l=1,2,\ldots,N \qquad (11a)$$

$$h_k(l)=p_k(l)-\sum_{i=0}^{k-1}C_{k,i}h_i(l) \qquad (11b)$$
where $C_{k,i}$ is determined by the orthogonality condition over the data record:

$$\langle H_j,H_k\rangle=0,\qquad \langle H_j,P_k\rangle=C_{k,j}\langle H_j,H_j\rangle,\qquad j\neq k \qquad (12)$$

$$C_{k,j}=\frac{\langle H_j,P_k\rangle}{\langle H_j,H_j\rangle}=\frac{\sum_{l=1}^{N}h_j(l)p_k(l)}{\sum_{l=1}^{N}h_j^2(l)},\qquad j=0,1,\ldots,k-1;\ k=1,2,\ldots,K \qquad (13)$$
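A direct implementation of the classical Gram-Schmidt recursion of Eqs. (10)-(13) might look as follows:

```python
def gram_schmidt(P):
    """Eqs. (10)-(13): classical Gram-Schmidt on the columns P_k of P.
    Returns H with orthogonal columns and the coefficients C[k, j] = C_{k,j}."""
    N, K1 = P.shape
    H = np.zeros((N, K1))
    C = np.zeros((K1, K1))
    for k in range(K1):
        H[:, k] = P[:, k]
        for j in range(k):
            C[k, j] = (H[:, j] @ P[:, k]) / (H[:, j] @ H[:, j])  # Eq. (13)
            H[:, k] -= C[k, j] * H[:, j]                         # Eq. (11b)
    return H, C
```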
$$G_{(K+1)\times 1}=\big(H^{\mathrm T}_{(K+1)\times N}H_{N\times(K+1)}\big)^{-1}H^{\mathrm T}_{(K+1)\times N}V_{N\times 1} \qquad (14)$$

where $\langle\cdot,\cdot\rangle$ denotes the inner product, and

$$\big(H^{\mathrm T}H\big)^{-1}=\begin{bmatrix}(H_0^{\mathrm T}H_0)^{-1}&0&\cdots&0\\0&(H_1^{\mathrm T}H_1)^{-1}&\cdots&0\\\vdots&&\ddots&\vdots\\0&0&\cdots&(H_K^{\mathrm T}H_K)^{-1}\end{bmatrix} \qquad (15)$$
is a diagonal matrix. Equation (14) is equivalent to the following form:

$$g_k=\frac{\sum_{l=1}^{N}h_k(l)v(l)}{\sum_{l=1}^{N}h_k^2(l)},\qquad k=0,1,\ldots,K \qquad (16)$$
Taking into consideration the constraint of Eq. (8), we have

$$V=\sum_{k=0}^{K}g_kH_k+[u] \qquad (17)$$

and

$$W_{(K+1)\times 1}=A^{-1}_{(K+1)\times(K+1)}G_{(K+1)\times 1}=A^{-1}\begin{bmatrix}g_0\\g_1\\\vdots\\g_K\end{bmatrix} \qquad (18)$$
$A_k^{-1}$ can be calculated recursively:

$$A_1=1,\qquad A_{k+1}=\begin{bmatrix}A_k&\boldsymbol C_k\\\boldsymbol 0^{\mathrm T}&1\end{bmatrix} \qquad (19)$$

where $\boldsymbol C_k=(C_{k,0},C_{k,1},\ldots,C_{k,k-1})^{\mathrm T}$. According to Eq. (19), we can now easily obtain the inverse matrix:

$$A_{k+1}^{-1}=\begin{bmatrix}A_k^{-1}&-A_k^{-1}\boldsymbol C_k\\\boldsymbol 0^{\mathrm T}&1\end{bmatrix} \qquad (20)$$
After some simple derivations, one can easily arrive at recursive formulas for the unknown weights:

$$w_K=g_K \qquad (21)$$

$$w_k=g_k-\sum_{j=k+1}^{K}C_{j,k}w_j,\qquad k=K-1,K-2,\ldots,1,0 \qquad (22)$$
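The orthogonal coefficients of Eq. (16) and the back-substitution of Eqs. (21)-(22) translate into a few lines:

```python
def ols_weights(H, C, v):
    """Eq. (16) for the orthogonal coefficients g_k, then Eqs. (21)-(22)
    to recover the original weights w_k by back-substitution."""
    K1 = H.shape[1]
    g = np.array([(H[:, k] @ v) / (H[:, k] @ H[:, k]) for k in range(K1)])  # Eq. (16)
    w = np.zeros(K1)
    w[-1] = g[-1]                                          # Eq. (21)
    for k in range(K1 - 2, -1, -1):                        # Eq. (22)
        w[k] = g[k] - sum(C[j, k] * w[j] for j in range(k + 1, K1))
    return g, w
```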
Taking the square of Eq. (17) and summing over the data record yields

$$\sum_{l=1}^{N}v^2(l)=\sum_{k=0}^{K}g_k^2\sum_{l=1}^{N}h_k^2(l)+\sum_{l=1}^{N}u^2(l) \qquad (23)$$
The error reduction ratio of the k-th term is expressed as

$$E_k=\frac{g_k^2\sum_{l=1}^{N}h_k^2(l)}{\sum_{l=1}^{N}v^2(l)}\times 100\% \qquad (24)$$
The value of Ek is computed together with the pa-
rameter estimates to indicate the significance of each
term, and then the terms are ranked according to their
contributions to the overall mean square error. If Ek is
very small, which indicates an insignificant term, then
the k-th term is canceled. The procedure provides an
optimal reduction of the wavelet basis units by re-
moving some redundant units at each stage so as to
retain the highest achievable accuracy with the re-
maining units. The selection procedure is terminated
when a desired error tolerance is achieved:
$$1-\sum_{k=1}^{K}E_k<\varepsilon \qquad (25)$$
The process of selecting terms is continued until the
sum of the error reduction ratios approaches 100%.
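The text describes a stage-wise removal of insignificant terms; the sketch below implements the equivalent forward-selection formulation common in the OLS literature, picking the unit with the largest error reduction ratio of Eq. (24) until the stopping rule of Eq. (25) is met:

```python
def ols_select(P, v, tol=0.01):
    """Greedy OLS selection of wavelet basis units by the error reduction
    ratio of Eq. (24), stopping per Eq. (25) when 1 - sum(E_k) < tol."""
    N, K1 = P.shape
    selected, H_sel = [], []
    err_sum, vv = 0.0, v @ v
    remaining = list(range(K1))
    while remaining and 1.0 - err_sum >= tol:
        best = None
        for k in remaining:
            h = P[:, k].copy()
            for hj in H_sel:                     # orthogonalize against chosen units
                h -= ((hj @ P[:, k]) / (hj @ hj)) * hj
            hh = h @ h
            if hh < 1e-12:                       # empty/redundant unit: no information
                continue
            err = (h @ v) ** 2 / (hh * vv)       # Eq. (24): g_k^2 ||h_k||^2 / ||v||^2
            if best is None or err > best[0]:
                best = (err, k, h)
        if best is None:
            break
        err_sum += best[0]
        selected.append(best[1])
        H_sel.append(best[2])
        remaining.remove(best[1])
    return selected
```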
After application of the OLS algorithm, the basis units
form a more compact representation and enable the
input patterns to be free from redundant or irrelevant
information and correlate well with the process states.
From this point of view, the constructed WNNs will
have the best performance for a certain complexity or
provide a certain performance level with minimum
complexity.
2 Experiments
To show the validity of the proposed method and the
associated learning algorithm, experimental simula-
tions are performed on linear and nonlinear channel
equalization problems. Assume the transmitted se-
quence is real and independent and that equiprobable
binary symbols x(n) pass through a linear or nonlinear
channel to produce the observation signal y(n) and to
obtain the vector $Y^{(i)}=[y(i),y(i-1),\ldots,y(i-m+1)]$. The vector is fed to equalizers based on WNNs, MLP and LLMS, respectively. We design one channel such that its characteristic of dispersion is represented as

$$H_1(z)=0.181+0.272z^{-1}+0.905z^{-2}+0.272z^{-3} \qquad (26)$$
The equalizer input of the nonlinear channel is given by

$$y(n)=g(n)+0.15\,g(n)\tanh(g(n))+u(n) \qquad (27)$$
where u(n) is an additive zero-mean white Gaussian
noise, and g(n) denotes the output of the noise-free
channel. Generally, the equalizer order is chosen to be
not less than the channel delay. For the H1(Z) channel,
Nc = 3 and d = 2 are the channel order and delay. We
choose the equalizer order m=3 which is larger than d.
Here, assuming that the channel order is equal to the equalizer order, the maximum number of different (possible) equalizer inputs is equal to $2^{3+3+1}=128$. We
design another more complex channel; its impulse re-
sponse and the corresponding equalizer input of the
nonlinear channel are expressed as:
$$H_2(z)=0.202+0.364z^{-1}+0.808z^{-2}+0.364z^{-3}+0.202z^{-4} \qquad (28)$$

$$y(n)=g(n)+0.15\cos(2\pi g(n))-0.28g^2(n)+0.09\,\mathrm{sigmoid}^3(g(n))+0.21g^4(n)+u(n) \qquad (29)$$
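The two nonlinear channel maps, as reconstructed above, can be written as:

```python
def sigmoid(t):
    return 1.0 / (1.0 + np.exp(-t))

def nonlinear_channel_1(g, u):
    """Eq. (27): y(n) = g(n) + 0.15 g(n) tanh(g(n)) + u(n)."""
    return g + 0.15 * g * np.tanh(g) + u

def nonlinear_channel_2(g, u):
    """Eq. (29); the minus sign on the 0.28 term and the sigmoid^3
    reading follow our reconstruction of the garbled original."""
    return (g + 0.15 * np.cos(2.0 * np.pi * g) - 0.28 * g ** 2
            + 0.09 * sigmoid(g) ** 3 + 0.21 * g ** 4 + u)
```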
For the latter channel, we use channel order Nc =
4, delay parameter d = 3, and equalizer order m = 4,
giving the maximum number of different equalizer
inputs equal to 512.
We choose the Mexican hat

$$\psi(x)=\frac{2}{\sqrt{3}}\,\pi^{-1/4}\,(1-x^2)\,\mathrm e^{-x^2/2}$$

as the mother wavelet. The signal to noise ratio (SNR) of the equalizer input is defined as $\mathrm{SNR}=\sigma_x^2\sum_i h_i^2/\sigma_n^2$ ($h_i$ is the i-th channel coefficient). The signal sequence of different SNRs is applied as training data and 500 000 training data pairs $(y(n),x(n))$ are used in the training of networks.
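To sweep an SNR axis such as in Figs. 1-4, the definition above can be inverted to obtain the noise variance for a target SNR; the dB conversion is an assumption of this sketch:

```python
def noise_variance(snr_db, h, sigma_x2=1.0):
    """Invert the SNR definition above: sigma_n^2 = sigma_x^2 * sum(h_i^2) / SNR.
    The dB conversion is our addition for sweeping the SNR axis."""
    snr = 10.0 ** (snr_db / 10.0)
    return sigma_x2 * np.sum(np.asarray(h) ** 2) / snr
```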
Performance comparisons among the equalizers based on the WNNs, MLP and the linear LMS (LLMS) are carried out under different noise conditions. The processes are simulated with different noise variances. The LLMS algorithm uses a linear adaptive filter which has no hidden nodes and no nonlinear activation function.
The MLP structure consists of (3+1) input nodes,
(12+1) hidden nodes and 1 output node for channel 1
(i.e., H1(Z)). The MLP structure consists of (4+1) in-
put nodes, (15+1) hidden nodes and 1 output node for
channel 2 (i.e., H2(Z)). Here 1 denotes bias. The num-
bers of selected initial wavelets for H1(Z) and H2(Z)
are 25 (m = 3, l = 12, s = 9) and 30 (m = 4, l = 14, s =
11). The tolerance is set at $\varepsilon$ = 0.005-0.01. After ap-
plication of the OLS algorithm the wavelet basis units
are optimally reduced to 10 and 13 units. Figures 1-4
show the simulation results. Figures 1 and 3 show a
comparison of results for error probability (EP), aver-
aged over 50 independent runs for the linear channel
equalizers (WNNs, MLP and LLMS). 10 000 data
samples are used to train these networks per each run.
Different random initial weights and parameters are
used in each run. Figures 2 and 4 show a comparison
of results for EP averaged over 50 independent runs
for the nonlinear channel equalizers (WNNs, MLP
and LLMS). The simulation results show that the performance on the first channel is better than on the second, and the nonlinear deterioration of the former is also smaller than that of the latter. The
WNNs and MLP are capable of forming complex de-
cision regions in input pattern space. The former ex-
hibits a more powerful nonlinear ability and gives su-
perior performance with the lowest bit error rate
(BER). It can be observed that the WNNs equalizer outperforms the MLP equalizer, and far outperforms the LLMS equalizer over a wide range of SNR conditions.
Fig. 1 BER performance of several equalizers for
channel H1(Z) with different SNRs: Linear channel
Fig. 2 BER performance of several equalizers for
channel H1(Z) with different SNRs: Nonlinear channel
Fig. 3 BER performance of several equalizers for
channel H2(Z) with different SNRs: Linear channel
Fig. 4 BER performance of several equalizers for
channel H2(Z) with different SNRs: Nonlinear channel
The reduced WNNs equalizer using an OLS algo-
rithm is practical to implement and outperforms the
conventional neural network equalizers. The OLS al-
gorithm for WNNs gives a small number of wavelet
units and less BER under any SNR condition for lin-
ear and nonlinear channels. The superior performance
is due to the powerful nonlinear ability of the WNNs
equalizer and to the computational refinement arising
from the use of the OLS optimization algorithm.
3 Conclusions
WNNs are capable of forming arbitrarily complex
nonlinear decision boundaries to solve complex clas-
sification problems. Equalizers based on WNNs are ca-
pable of performing quite well in compensating
nonlinear distortions introduced in a channel. The
OLS algorithm for WNNs gives a small number of
wavelet units (forming a more compact set of basis
units), free from redundant or irrelevant information.
Experimental results show that the proposed WNNs
based equalizer can significantly improve the neural
modeling accuracy, and outperforms conventional
neural networks in signal to noise ratio and channel
non-linearity.
References
[1] Chen S, Gibson G J, Cowan C F N, et al. Adaptive
equalization of finite nonlinear channels using multilayer
perceptrons. Signal Processing, 1990, 20(2): 107-119.
[2] Gan Q, Saratchandran P, Sundararajan N. A complex val-
ued radial basis function network for equalization of fast
time varying channels. IEEE Transactions on Neural Net-
works, 1999, 10(4): 958-960.
[3] Chen S, Mulgrew B, Grant P M. A clustering technique
for digital communications channel equalization using ra-
dial basis function networks. IEEE Transactions on Neu-
ral Networks, 1993, 4(4): 570-579.
[4] Patra J C, Pal R N. A functional link artificial neural net-
work for adaptive channel equalization. Signal Processing,
1995, 43: 181-195.
[5] Zhang Q, Benveniste A. Wavelet networks. IEEE Trans-
actions on Neural Networks, 1992, 3: 889-898.
[6] Fang Y, Chow T W S. Orthogonal wavelet neural net-
works applying to identification of Wiener model. IEEE
Transactions on Circuits and Systems-I, 2000, 47(4):
591-593.
(Continued on page 37)