This undergraduate thesis project compares the performance of several classical multiuser detection strategies with a solution of our choice, based on side-lobe cancellation.
Universidad Politécnica de Madrid
Escuela Técnica Superior de Ingenieros de Telecomunicación
Multiuser detection based on
Generalized Side-lobe Canceller
plus SOVA algorithm
Undergraduate Thesis Project by Aitor López Hernández
to obtain the Bachelor's degree in Ingeniería en Tecnologías y
Servicios de Telecomunicación
Departamento de Señales, Sistemas y Radiocomunicaciones
TUTOR: ZAZO BELLO, SANTIAGO
Chair: GARCÍA OTERO, MARIANO
Member: ZAZO BELLO, SANTIAGO
Secretary: GARCÍA IZQUIERDO, MIGUEL ÁNGEL
Substitute: GRAJAL DE LA FUENTE, JESÚS
Acknowledgements
Hello, Grandma.
It has been a few months since you left us, but Mum and I still miss you very much.
I often remember the radiant smile with which you waited for me at home to celebrate every passed exam, and how happy you were when I got my first job... You were always there, in the good times and in the bad. In a way, you have also been by my side while I wrote this work. And I know you will be while I defend it.
You gave your example and your affection to your whole family, proving at every moment why you are the strongest woman I have ever known.
For that, and for much more, being your grandson is something I carry, and will carry until the last day of my life, with the greatest pride.
I made this for you. I hope you like it.
Aitor
Abstract
This Undergraduate Thesis Project addresses the issue of multiuser detection in digital communications. This topic has become a matter of great interest, since multiuser interference unavoidably occurs in every real system (for instance, in a cell of a mobile network), and it must therefore be taken into account and dealt with accordingly.
There exist many procedures to accomplish this task. For this thesis we have chosen a solution based on the GSLC ("Generalized Side Lobe Canceller") scheme, as it allows us to efficiently suppress this multiuser interference (taking advantage of the vector nature of the incoming signal) by designing a "blocking matrix" fed back through an adaptive system.
Adaptive processing is used because the exact number of simultaneous transmissions at each time instant is unknown. In the cell example described above, this number varies dynamically as users enter and leave the coverage area.
Last but not least, we have also decided to add a convolutional decoder in the last stage of our receiver with the aim of reducing the resulting bit error rate. This can prove especially useful at those time instants when the receiver is not yet ready to properly cancel the multiuser and self-interference, which varies according to the above-mentioned changes in the cell's number of users.
First of all, after an introduction to the topic, the mathematical background of this kind of solution is discussed at length. Moreover, several computer simulations over a multipath scenario support the feasibility of our proposal and compare it with other traditional solutions such as the Zero-Forcing (ZF) and Minimum Mean Square Error (MMSE) criteria.
Keywords: GSLC, ISI, MUI, SOVA, RLS, LMS, adaptive processing.
Resumen
This Bachelor's Thesis addresses the problem of multiuser detection in digital communications. It is a very interesting case study, since multiuser interference inevitably occurs in any real system (such as, for instance, a mobile communications cell) and must therefore be taken into account and combated accordingly.
There are many ways to carry out this task. For this work, however, we have chosen a solution based on a GSLC ("Generalized Side Lobe Canceller") scheme, since it allows us to efficiently cancel this multiuser interference (taking advantage of the vector nature of the received signal) through the design of a "blocking matrix" fed back by means of an adaptive system.
The point of including adaptive processing in our scheme lies in the lack of knowledge of the exact number of simultaneous transmissions taking place at each time instant. In the cell example mentioned above, this number varies dynamically as new users register in, or leave, their corresponding coverage area.
Finally, we have decided to add a convolutional decoder in the last stage of the receiver in order to refine the resulting bit error rate, which can be especially useful at those time instants when the system is not yet "locked", that is, ready to correct the multiuser interference that keeps changing as the cell undergoes variations such as those mentioned above.
First, after an introduction to the problem, we describe the mathematical model that supports this kind of system, accompanied by a series of MATLAB simulations that allow us to validate it against other traditional solutions such as the Zero-Forcing (ZF) and Minimum Mean Square Error (MMSE) criteria.
Keywords: GSLC, ISI, MUI, SOVA, RLS, LMS, adaptive processing.
List of Figures
1.1 Data transmitted by a user and its corresponding (and unique) signature. 16
1.2 Received signal and signal after the decorrelation stage. 17
1.3 Scheme of the Rake receiver. 18
1.4 Scheme of the whole transmission/reception stage including ZF/MMSE filtering. 19
1.5 Variation of the mean square error at the receiver's input with Eb/No for both ZF and MMSE criteria. 20
1.6 CDMA network representation. 22
1.7 Evolution of BER with Eb/No for the receivers studied in section 1.3.2. 24
2.1 Scheme of our GSLC solution. 28
2.2 Decomposition of a vector x[n] into its two orthogonal components xq[n] and xp[n]. 28
3.1 Evolution of BER with Eb/No for the receivers studied in section 3.2. 36
3.2 Representation of the incoming signal's blocking process in the GSLC's lower branch. 37
4.1 Scheme of our proposed solution for the downlink scenario. Reception stage for user k=1. 40
4.2 2D and 3D representation of the quadratic error surface. 42
4.3 RLS functional block diagram. 45
4.4 Performance of LMS/RLS algorithms within the GSLC scheme. 49
4.5 Example of a sequence of most likely states of a 5-state Hidden Markov Model given a 5-length observation sequence. 53
4.6 Block of coded information without tail (same ending and starting states). 55
4.7 Simulation of a situation where the information is transmitted repeatedly in block form. 56
4.8 Evolution of BER with Eb/No using hard/soft decoding. 56
4.9 Evolution of BER with Eb/No of our final system with/without SOVA decoder. 57
A.1 Scheme of a shift register. 61
A.2 Auto/cross-correlation of two M-sequences. 62
A.3 Scheme of the production of a Gold code. 62
A.4 Auto/cross-correlation of two Gold sequences. 63
List of Tables
4.1 Estimated computational cost of the RLS algorithm per iteration for complex-valued data, in terms of real multiplications, real additions and real divisions. 51
A.1 Several pairs of M-sequences used to generate Gold codes. 63
Chapter 1
Introduction
1.1 A (not so) brief introduction to DSSS-CDMA
The multiple access techniques FDMA (Frequency Division Multiple Access) and TDMA (Time Division Multiple Access) assign disjoint resources (of frequency or time, respectively) to each channel. In contrast, CDMA (Code Division Multiple Access) provides each channel with all the available bandwidth during the whole time and over the whole coverage area, allowing the simultaneous transmission of several communications s_k(t) that share the very same resources at the same time.
s(t) = \sum_{k=1}^{K} s_k(t),  0 ≤ t < T,  where  s_k(t) = \sum_n A_k[n] g_k(t − nT)   (1.1)
Consequently, a strong mutual interference is generated, and some mechanism to extract each individual communication from the bunch of mutually interfering signals must be established. This is achieved by multiplying each one by a unique "noise" signal x_k[m], a pseudo-random sequence of −1 and +1 values at a rate N times higher than that of the original signal:
x_k[m] = x_k[0]δ[m] + ... + x_k[N−1]δ[m−(N−1)] = \sum_{l=0}^{N−1} x_k[l] δ[m−l]   (1.2)

g_k(t) = \sum_{m=0}^{N−1} x_k[m] g_c(t − mT_c)   (1.3)
s_k(t) = \sum_n A_k[n] \sum_{l=0}^{N−1} x_k[l] g_c(t − lT_c − nT)   (1.4)
The insertion of this pseudo-random sequence (named "signature") spreads the spectrum of the transmitted signal, giving this technique the name Direct-Sequence Spread Spectrum (DSSS). Through this procedure, the spectral power density shrinks heavily, to the point that the resulting signal resembles white noise and becomes difficult to detect (which made the technique particularly attractive for military purposes). It was later transferred to the civilian world and has enabled high-capacity multiuser transmissions (for instance, in 3G mobile applications) [5].
Figure 1.1: Data transmitted by the user, d(t), and its corresponding (and unique) signature, c(t).
Ultimately, thanks to this resource, the original data can be reconstructed, separated and identified at the receiving end by multiplying the resulting signal by the same signature (since 1 × 1 = 1 and −1 × −1 = 1). This process is known as "de-spreading", and it has proved quite effective against both wide- and narrow-band interference:
• As a consequence of the de-spreading process, narrow-band interference is reduced by a factor of N (N = T/T_c).
• Although wider-band interference remains spread, it can easily be discriminated after multiplication by the corresponding signature.
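The spreading and de-spreading mechanism described above can be sketched in a few lines of NumPy. This is a minimal illustration, not code from the thesis: the spreading factor, the number of symbols and the random ±1 signature are arbitrary choices, and the channel is assumed noiseless.

```python
import numpy as np

rng = np.random.default_rng(0)
N = 8                                    # spreading factor N = T / Tc (illustrative)
n_sym = 100

# Hypothetical +/-1 data symbols and a pseudo-random +/-1 signature
data = rng.choice([-1, 1], size=n_sym)
signature = rng.choice([-1, 1], size=N)

# Spreading: every symbol multiplies the whole signature (chip rate = N x symbol rate)
chips = (data[:, None] * signature[None, :]).ravel()

# De-spreading: multiply by the same signature again and integrate per symbol period
rx = chips.reshape(n_sym, N) * signature[None, :]
recovered = np.sign(rx.sum(axis=1)).astype(int)
```

In this noiseless setting the integrate-and-dump output recovers the data exactly; adding a narrow-band tone to `chips` before de-spreading would illustrate the factor-N interference reduction mentioned above.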
Figure 1.2: Received signal m(t) = d(t)c(t) and signal after the decorrelation stage, m(t)c(t).
1.2 Multiuser detection
Let us consider a scenario similar to the one described above. After chip matched filtering and sampling, the received vector is:
y = ((s(t) + n(t)) * g_c^*(−t))|_{t=nT_c}
  = \sum_{k=1}^{K} A_k x_k + \int_0^T n(t) g_c^*(t) dt
  = \sum_{k=1}^{K} A_k x_k + n
  = XA + n   (1.5)
However, over multipath channels this procedure loses effectiveness and becomes insufficient to detect each communication properly. This is on account of the orthogonality loss between the users' signatures once they are convolved with their corresponding channels:
\tilde{x}_k = x_k[m] * h_k[m]   (1.6)

y = \sum_{k=1}^{K} A_k \tilde{x}_k + n = \tilde{X}A + n   (1.7)
1.2.1 Rake receiver
Figure 1.3: Scheme of the Rake receiver.
Implementation of the Rake receiver simply consists of correlating with these effective waveforms:
z_k = \tilde{x}_k^H y = \tilde{x}_k^H (\sum_{l=1}^{K} A_l \tilde{x}_l + n) = \tilde{x}_k^H \tilde{X}A + \tilde{x}_k^H n   (1.8)
And, for the whole group of them:
z = \tilde{X}^H y = \tilde{X}^H (\sum_{l=1}^{K} A_l \tilde{x}_l + n) = \tilde{X}^H \tilde{X}A + \tilde{X}^H n = \tilde{R}A + \tilde{X}^H n   (1.9)
In this case, \tilde{R} is far from diagonal. In other words, as explained before, the acceptable correlation properties achieved by design of these pseudo-random sequences have been degraded. This leads to the need for additional processing to minimize the inter-symbol interference (ISI) and the multiuser interference (MUI). For this purpose, two equalization strategies come to hand: the Zero-Forcing (ZF) and the Minimum Mean Square Error (MMSE) criteria.
One important matter of study covered in this project is the comparison of these criteria (alternatives widely known for their rather good effectiveness in spite of their simplicity) with our GSLC solution. Both are discussed in the following two subsections.
1.2.2 ZF criteria
The Zero-Forcing criterion consists of applying the inverse of the channel frequency response to the received signal, so that the combination of the channel and the equalizer's impulse response offers a linear phase and a flat frequency response:

v = \hat{A} = G_{ZF} z = G_{ZF} (\tilde{R}A + \tilde{X}^H n) = \tilde{R}^{−1} (\tilde{R}A + \tilde{X}^H n) = A + \tilde{R}^{−1} \tilde{X}^H n   (1.10)
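As an illustration of eq. (1.10), the following sketch applies the ZF detector to a toy synchronous model. The Gaussian columns of `X` stand in for the effective signatures \tilde{x}_k; the sizes and the noise level are arbitrary assumptions, not values from the thesis.

```python
import numpy as np

rng = np.random.default_rng(1)
N, K = 16, 3                             # chips per symbol, number of users (illustrative)

X = rng.standard_normal((N, K))          # stand-in for the effective signatures
A = rng.choice([-1.0, 1.0], size=K)      # transmitted symbols/amplitudes
n = 0.01 * rng.standard_normal(N)        # low-power AWGN

y = X @ A + n                            # received vector, as in eq. (1.7)
z = X.T @ y                              # matched-filter bank output, as in eq. (1.9)
R = X.T @ X                              # correlation matrix R~

v_zf = np.linalg.solve(R, z)             # eq. (1.10): A plus the amplified noise term
```

At this noise level the sign of each entry of `v_zf` matches the transmitted symbol; the noise-amplification drawback of ZF appears when `R` is badly conditioned.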
Figure 1.4: Scheme of the whole transmission/reception stage including ZF/MMSE filtering.
1.2.3 MMSE criteria
Our detector may also be designed by making its output "as similar as possible" to the original data symbol sequence, taking into account both ISI and the distorting effect of noise, and avoiding the amplification of the latter. For this purpose, applying the orthogonality principle to the error at the receiver's output, we have:
E{ê z^H} = E{(\hat{A} − A) z^H} = E{(G_{MMSE} z − A) z^H} = E{(G_{MMSE}(\tilde{R}A + \tilde{X}^H n) − A)(\tilde{R}A + \tilde{X}^H n)^H} = 0   (1.11)
Defining GMMSE as
G_{MMSE} = (σ_A^2 \tilde{R}\tilde{R}^H + σ_n^2 \tilde{R})^{−1} σ_A^2 \tilde{R}^H   (1.12)
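Equation (1.12) can be checked numerically with a toy correlation matrix. The sketch below builds G_MMSE for assumed symbol and noise variances and, as a sanity check, the noiseless limit, where it reduces to the ZF inverse \tilde{R}^{−1}; all matrices here are illustrative stand-ins, not thesis data.

```python
import numpy as np

rng = np.random.default_rng(2)
N, K = 16, 3
X = rng.standard_normal((N, K))          # stand-in for the effective signatures
R = X.T @ X                              # correlation matrix R~ (symmetric here)
sigma_a2, sigma_n2 = 1.0, 0.1            # symbol and noise variances (assumed known)

# eq. (1.12): G_MMSE = (sigma_A^2 R R^H + sigma_n^2 R)^-1 sigma_A^2 R^H
G_mmse = np.linalg.solve(sigma_a2 * R @ R.T + sigma_n2 * R, sigma_a2 * R.T)

# Noiseless limit (sigma_n^2 = 0): G_MMSE reduces to the ZF solution R^-1
G_zf = np.linalg.solve(R @ R.T, R.T)
```

Unlike ZF, the MMSE matrix trades a small bias for reduced noise amplification at finite SNR.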
Figure 1.5: Variation of the mean square error at the receiver's input with Eb/No for both ZF and MMSE criteria.
1.3 Particularizing to our CDMA case
We must take into account that the previous models are remarkably simplified, since they do not consider inter-symbol interference (ISI) side effects. In a real communications system such as the one under study, this phenomenon unavoidably takes place due to many causes (band-limited channels, multipath propagation...). This is why, from now on, we consider it as another handicap that we have to deal with and, when possible, eliminate.
1.3.1 Signal model
In this section we proceed to expand our matter of study to the case of an ad-hoc
real CDMA network. In this case, the received signal at a certain time m will be:
r[m] = SCφAb[m] + SdCφAb[m − 1] + n[m] = r0 + rd + n (1.13)
To simplify both calculations and notation, the expression above "emulates" the effect of a multipath scenario (a convolution) using a simple matrix multiplication, where:
• S = [S_1, ..., S_K] ∈ C^{N×KL} is the matrix of signature vectors and their delayed versions at time instant m.
• S_d = [S_{1d}, ..., S_{Kd}] ∈ C^{N×KL} is the matrix of the signature vectors' delayed versions at time instant m − 1.
• C = diag[c_1, ..., c_K] ∈ C^{KL×K} contains the K channels of L multipath taps each.
• φ = diag[e^{jφ_1}, ..., e^{jφ_K}] ∈ C^{K×K} contains the received phases.
• A = diag[A_1, ..., A_K] ∈ C^{K×K} contains the received data amplitudes.
• b = [b_1, ..., b_K]^T ∈ C^{K×1} contains the user data symbols.
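The matrix form of eq. (1.13) can be instantiated directly. The sketch below is a hypothetical construction (random signatures, a circular shift as a crude stand-in for the delayed matrix S_d, random channel taps); it only illustrates the dimensions and products involved, not the simulator used in the thesis.

```python
import numpy as np

rng = np.random.default_rng(3)
N, K, L = 8, 2, 2                        # spreading factor, users, multipath taps

# Illustrative stand-ins for the matrices of eq. (1.13)
S = rng.choice([-1.0, 1.0], size=(N, K * L))    # signatures and their delayed versions
Sd = np.roll(S, N // 2, axis=0)                 # crude model of previous-symbol spillover
C = np.zeros((K * L, K))
for k in range(K):                              # block-diagonal channel taps c_k
    C[k * L:(k + 1) * L, k] = rng.standard_normal(L)
phi = np.diag(np.exp(1j * rng.uniform(0.0, 2.0 * np.pi, K)))
A = np.diag([1.0, 0.8])                         # received amplitudes (arbitrary)
b_m = rng.choice([-1.0, 1.0], size=K)           # current symbols b[m]
b_prev = rng.choice([-1.0, 1.0], size=K)        # previous symbols b[m-1]

noise = 0.05 * (rng.standard_normal(N) + 1j * rng.standard_normal(N))
r = S @ C @ phi @ A @ b_m + Sd @ C @ phi @ A @ b_prev + noise   # eq. (1.13)
```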
The previous expression (1.13) clearly depicts the potential problem: because of the existence of multipath delays, consecutive transmitted symbols are likely to become involved in this inter-symbol interference (ISI) situation. So, how can this additional difficulty be overcome? Several alternatives to reduce both MUI and ISI will be brought forward in the following subsection.
Figure 1.6: CDMA network representation
1.3.2 Possible implementations
Before presenting a valid alternative to handle both MUI and ISI, we will study the performance of some of the receivers presented in the previous subsection. To do so, we now consider that the received signal is first filtered by the corresponding matched filter bank S^T (S_n^T in the decorrelation receiver's case), obtaining the following results:
1.3.2.1 Decorrelation receiver
The output of this receiver will be:
y[m] = T^{−1} S_n^T r[m]
     = T^{−1} S_n^T S CφA b[m] + T^{−1} S_n^T S_d CφA b[m−1] + T^{−1} S_n^T n[m]
     = DS + MUI_1 + ISI + MUI_2 + Noise   (1.14)
where T = S_n^T S_n is the signatures' correlation matrix without multipath.
As expressed in (1.14), this strategy does not seem very effective against MUI because it does not take into account the multipath effect in the received signal. Because of that, it will be discarded from our posterior analysis.
1.3.2.2 Multipath combining
The output of this receiver will be:
y[m] = C^H S^T r[m]
     = C^H R CφA b[m] + C^H R_M CφA b[m−1] + C^H S^T n[m]
     = DS + MUI_1 + ISI + MUI_2 + Noise   (1.15)
where R = S^T S is the signatures' correlation matrix with multipath and R_M = S^T S_d is the advanced and delayed signatures' cross-correlation matrix with multipath.
This receiver, whose performance is identical to that of the Rake receiver studied in the Introduction, does not cancel MUI_1 either, due to the residual cross-correlation in matrix R.
1.3.2.3 Signature decorrelation + multipath combining
The output of this receiver will be:
y[m] = (C^H R C)^{−1} C^H S^T r[m]
     = φA b[m] + (C^H R C)^{−1} C^H R_M CφA b[m−1] + (C^H R C)^{−1} C^H S^T n[m]
     = DS + ISI + MUI_2 + Noise   (1.16)
This receiver also works the other way round, that is, performing the signature decorrelation after the multipath combining:
y[m] = C^H R^{−1} S^T r[m]
     = C^H CφA b[m] + C^H R^{−1} R_M CφA b[m−1] + C^H R^{−1} S^T n[m]
     = DS + ISI + MUI_2 + Noise   (1.17)
In this case MUI_1 can be completely suppressed; however, this receiver (whose performance is identical to that of the Zero-Forcing criterion) cannot cope with the interference coming from the ISI effect.
As we have just reviewed, these alternatives have proven insufficient for a communications system in which we would like to remove this interference. Therefore, the scope of this project is to propose an alternative strategy to overcome this problem effectively in both scenarios, downlink and uplink (which will be tackled separately in the following chapters), and to demonstrate its feasibility.
Figure 1.7: Evolution of BER with Eb/No for receivers studied in section 1.3.2
Figure 1.7 represents the evolution of BER with Eb/No for the receivers under study in this Introduction. We have picked a spreading factor of N = 8, a multipath channel consisting of L = 2 rays and K = 2 users. Since this is an extremely simplified model, orthogonal codes have been preferred over Gold codes (which are explained in more detail in the Annex) to construct each user's signature, owing to their ideal auto- and cross-correlation properties.
Chapter 2
Fundamentals of array processing
2.1 Prelude
As briefly introduced in the previous chapter, this thesis is concerned with the reduction of both multipath and multiuser interference. For this purpose, we pick a single-sensor multiuser detector that exploits the well-known Generalized Side Lobe Canceller (GSLC) scheme [8, 10].
This receiver is shown to attain essentially single-user performance assuming the receiver knows (or can acquire) the following:
• every user's signature waveform,
• the timing of all users,
• the received amplitudes of all users.
Our proposal is concerned with a synchronous multiple access scheme in a multipath scenario, assuming that every user's timing and channel characteristics have been acquired in a previous stage. Our interference canceller is based on the well-known Generalized Side Lobe Canceller (GSLC) with a proper design of the blocking matrix of the lower branch. This blocking matrix will span the subspace orthogonal to the desired signature and its delayed/advanced versions, in terms of the expected maximum channel time-span.
The block diagram corresponding to our proposed scheme is the following:
Figure 2.1: Scheme of our GSLC solution
Where, recalling the notation from the previous chapter, we have:
• DS stands for "Desired Signal".
• MUI stands for "MultiUser Interference" (that is, the sum of all other users' communications). Furthermore, MUI_B is the multiuser interference after the blocking matrix B.
• N corresponds to the Additive White Gaussian Noise (AWGN) added by the channel.
2.2 Mathematical background
2.2.1 Orthogonal decomposition of a given vector
Figure 2.2: Decomposition of a vector x[n] in its two orthogonal components xq[n]
and xp[n]
A vector x ∈ C^{N×1} may be expressed as the combination of two (or more) orthogonal components. Let us define a subspace through a matrix P ∈ C^{N×L} and its corresponding orthogonal complement Q ∈ C^{N×(N−L)}:

P^H Q = 0   (2.1)
As x may be expressed as
x = x_P + x_Q = Pγ + x_Q   (2.2)
γ is in this case a linear combiner that remains to be determined.
Taking into account that P^H x_Q = 0, we obtain:
P^H x = P^H x_P + P^H x_Q = P^H Pγ + P^H x_Q = P^H Pγ  →  γ = (P^H P)^{−1} P^H x   (2.3)
leading to the following values for x_P and x_Q:
x_P = Pγ = P(P^H P)^{−1} P^H x = P_P x
x_Q = x − x_P = (I − P(P^H P)^{−1} P^H) x = P_Q x   (2.4)
where P_P and P_Q are the projections onto the two orthogonal subspaces (P and its complement Q).
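The decomposition of eq. (2.4) is easy to verify numerically. In the sketch below, `P` is an arbitrary real basis (so the Hermitian transpose reduces to the ordinary transpose) and `PP`/`PQ` are the two projectors P_P and P_Q:

```python
import numpy as np

rng = np.random.default_rng(4)
N, L = 6, 2
P = rng.standard_normal((N, L))          # an arbitrary (illustrative) subspace basis

# Projectors onto the subspace of P and onto its orthogonal complement, eq. (2.4)
PP = P @ np.linalg.solve(P.T @ P, P.T)
PQ = np.eye(N) - PP

x = rng.standard_normal(N)
xP, xQ = PP @ x, PQ @ x                  # x = xP + xQ with P^H xQ = 0
```

The two components add back to x, xQ is orthogonal to every column of P, and both projectors are idempotent, exactly as the derivation requires.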
2.2.2 Approach to GSLC scheme in a beamforming scenario
As explained earlier in this chapter, GSLC performance relies on how accurately we can define the subspace orthogonal to the incoming signature through the blocking matrix of the canceller's lower branch, whose dimension, needless to say, is dim = DOF − restrictions, where:
• DOF stands for Degrees Of Freedom, that is, the number of 'free components' of a vector (which in our CDMA case would be the 'length' of each user's signature).
• The restrictions are the parameters used as intermediate steps to estimate the dimension itself. In the CDMA case under study, this is the number of users plus the length of each user's multipath channel.
To get to know in depth the behaviour of this strategy before implementing it in our scenario, we will start from a simpler case: the study of a narrow-band receiver consisting of an array of antennas, whose objective is to steer its radiation pattern towards a desired direction while reducing as much as possible all interfering signals coming from the remaining ones. In such a setting, the DOF are given by the number of antennas that compose our array, whereas the number of restrictions coincides with the number of these interfering signals.
Our objective consists, as usual, in minimizing the resulting error function e[n] (the difference between our desired signal at the filter's output, d[n], and its estimate y[n]):
min_{w[n]} E{|e[n]|^2} = min_{w[n]} E{|d[n] − y[n]|^2} = min_{w[n]} E{|y[n]|^2} = min_{w[n]} w^H R_x w   (2.5)
Now, taking into account that
C^H w = g   (2.6)
where C ∈ C^{N×M} contains the "steering vectors" of the group of directions that we wish to preserve,
C(φ) = [1, e^{j(2πd/λ) sin φ}, ..., e^{j(2π(N−1)d/λ) sin φ}]^T   (2.7)
and g represents the power gain for each of these directions, this problem can be solved using the Lagrangian:
J(w, λ) = w^H R_x w + λ^H (C^H w − g) + (w^H C − g^H)λ   (2.8)
Computing the gradient of this function with respect to w^H and setting the result to zero,
∇_{w^H} J(w, λ) = ∂J(w, λ)/∂w^H = R_x w + Cλ = 0  →  w = −R_x^{−1} Cλ   (2.9)
and substituting into (2.6), we have:
C^H w = −C^H R_x^{−1} Cλ = g  →  λ = −(C^H R_x^{−1} C)^{−1} g   (2.10)
finally obtaining
w_{opt} = −R_x^{−1} Cλ = R_x^{−1} C (C^H R_x^{−1} C)^{−1} g   (2.11)
Moving to the implementation of our GSLC solution, according to our proposed scheme we will have:
w_{opt} = w_C + w_B   (2.12)
Going back to (2.4), we may express w_C and w_B as:
w_C = P_C w_{opt} = C(C^H C)^{−1} C^H R_x^{−1} C (C^H R_x^{−1} C)^{−1} g = C(C^H C)^{−1} g = w_q   (2.13)
w[n] = w_q + w_B[n] = w_q + P_B w[n] = w_q + B(B^H B)^{−1} B^H w[n]   (2.14)
This enables us to express the previous formula as
w[n] = w_q − B w_a[n],   where   w_a[n] = −(B^H B)^{−1} B^H w[n]   (2.15)
The discussion concerning the possible adaptive processing techniques that may be used to shape the blocking-matrix filter coefficients in the downlink scenario is the matter of study of chapter 4.
Chapter 3
Uplink. A review of several alternatives based on interference cancellation
3.1 Blocking the MUI coming from previous data symbols
Similarly to the previous chapter, the output signal of our solution is the combination of the current-symbol matched filter output minus a blocked version of the complete signal. The N × N blocking matrix B is simply defined as:
B = I − S(S^T S)^{−1} S^T   (3.1)
It can easily be verified that the lower branch of the circuit effectively blocks the current component of the signal vector:
BS = (I − S(S^T S)^{−1} S^T)S = S − S(S^T S)^{−1} S^T S = S − S = 0   (3.2)
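Equation (3.2) can be verified numerically: for any matrix S of full column rank, the blocking matrix of eq. (3.1) annihilates the columns of S. The sketch below uses a random real stand-in for the effective signature matrix:

```python
import numpy as np

rng = np.random.default_rng(5)
N, K, L = 8, 2, 2
S = rng.standard_normal((N, K * L))      # stand-in for the effective signature matrix

# eq. (3.1): projector onto the subspace orthogonal to the columns of S
B = np.eye(N) - S @ np.linalg.solve(S.T @ S, S.T)
```

B is the orthogonal-complement projector from section 2.2.1, so B S = 0 and B is idempotent.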
After blocking the current symbol component in the lower branch, the recombination at the output will be:
z[m] = S^T r_0 + (S^T r_d − WB r_d) + S^T n   (3.3)
At this point, we are ready to determine which value of W minimizes the MUI at the output of our receiver, represented in (3.3) as (S^T r_d − WB r_d). Minimizing the MSE at this output, E{z^H[m] z[m]}, with respect to the recombination matrix W leads to:
min_{W ∈ C^{KL×N}} E{z^H[m] z[m]} = min_W E{(S^T r − WB r_d)^H (S^T r − WB r_d)} = min_W E{(d − Wa)^H (d − Wa)}   (3.4)
E{(d − Wa) a^H} = 0  →  E{d a^H} = W E{a a^H}   (3.5)
where, assuming AWGN of variance σ^2, we have:
a = Br = B r_d + Bn = B S_d CφA b[m−1] + Bn   (3.6)
d = S^T r = S^T S CφA b[m] + S^T S_d CφA b[m−1] + S^T n   (3.7)
and therefore (note that S^T B^T = (BS)^T = 0 by eq. (3.2)):
E{d a^H} = S^T S_d C A^2 C^H S_d^T B^T + σ^2 S^T B^T = S^T S_d C A^2 C^H S_d^T B^T   (3.8)
E{a a^H} = B S_d C A^2 C^H S_d^T B^T + σ^2 B B^T   (3.9)
Denoting now G = C A^2 C^H, a real matrix dependent on the channels, we can finally write:
W = S^T S_d G S_d^T B^T (B S_d G S_d^T B^T + σ^2 B B^T)^P   (3.10)
P in this expression refers to the Moore-Penrose generalized inverse (a generalization of the inverse matrix that is also applicable to non-square matrices):
A^P = (A^H A)^{−1} A^H   if A^H A is invertible
A^P = A^H (A A^H)^{−1}   if A A^H is invertible   (3.11)
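The first case of eq. (3.11) is straightforward to check against NumPy's general pseudo-inverse; the matrix below is an arbitrary full-column-rank example:

```python
import numpy as np

A = np.array([[1.0, 2.0],
              [3.0, 4.0],
              [5.0, 6.0]])               # tall matrix with full column rank

# eq. (3.11), first case: A^P = (A^H A)^-1 A^H when A^H A is invertible
A_pinv = np.linalg.solve(A.T @ A, A.T)
```

For this matrix the closed form coincides with `np.linalg.pinv(A)`, and A^P A = I, which is the left-inverse property used when solving eq. (3.5) for W.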
Substituting this solution into the system's output equation (3.3), the residual MUI term becomes:
S^T r_d − WB r_d = S^T S_d [I − G S_d^T B^T (B S_d G S_d^T B^T + σ^2 B B^T)^P B S_d] CφA b[m−1] ≈ 0   (3.12)
This gives the following output signal after the blocking and matched filtering operations:
z[m] = S^T r_0[m] + (S^T r_d[m] − WB r_d[m]) + (S^T − WB) n[m] = z_0[m] + r_{mui2} + n_b[m] = R CφA b[m] + r_{mui2} + n_b[m]   (3.13)
where mui in lower case represents the residual MUI left after the blocking process because of the noise.
3.2 Behaviour of MUI + ISI free schemes
We can now apply the same strategies from the previous chapter to suppress the MUI in this incoming signal z[m]:
3.2.1 Multipath combining
The output of this receiver will be:
y[m] = C^H z[m] = C^H R CφA b[m] + C^H r_{mui2} + C^H n_b[m] = DS + MUI_1 + mui_2 + Noise   (3.14)
As we can see, this architecture seems insufficient even to suppress the MUI at time instant m.
3.2.2 Signature decorrelation + multipath combining
The output of this receiver will be:
y[m] = C^H R^{−1} z[m] = C^H CφA b[m] + C^H R^{−1} r_{mui2} + C^H R^{−1} n_b[m] = DS + mui_2 + Noise   (3.15)
As in the previous case, this solution also works the other way round (i.e., decorrelating each signature after recombining each user's paths):
y[m] = (C^H R C)^{−1} C^H z[m] = φA b[m] + (C^H R C)^{−1} C^H r_{mui2} + (C^H R C)^{−1} C^H n_b[m] = DS + mui_2 + Noise   (3.16)
Results in both cases show that MUI at time instant m has been completely removed,
leaving a residual mui at time instant m − 1 due to the noise effect.
Figure 3.1: Evolution of BER with Eb/No for receivers studied in section 3.2
The results in figure 3.1 clearly show that GSLC-based receivers behave better than those inspired by traditional strategies such as the ones from the Introduction, in accordance with the mathematical models described just above.
The influence of the residual mui on our final results strongly depends on how 'orthogonal' the corresponding interfering signals are to the subspace generated by the blocking matrix B. Both plots in figure 3.2 illustrate this dependency: the nearer an interfering signal's vector is to the incoming signal's residual component (that is, the one orthogonal to the B matrix), the higher the mui it will generate. In the case under study, the interference in the left-hand subfigure will generate a higher mui than the one in the right-hand subfigure.
Figure 3.2: Representation of incoming signal’s blocking process in GSLC’s lower
branch
Chapter 4
Downlink. Our proposal to eliminate both ISI and MUI
4.1 Introduction and block diagram
In the downlink scenario, i.e., with communication flowing from the base station to each user, the latter is aware neither of the number of 'peers' transmitting simultaneously nor of their channels. Therefore, the only information that each user actually needs is that given by its own signature and its corresponding channel.
Following this philosophy, our intention is to propose a receiver that 'discards' the information coming from the rest of the users (that is, the multiuser interference). Note that this is not the case in the uplink scenario, where the base station knows exactly how many users are allocated in its corresponding cell (as they need to register once they enter it).
A scheme of our solution may be found in the following block diagram:
Figure 4.1: Scheme of our proposed solution for downlink scenario. Reception stage
for user k=1
where \hat{S}_1 = S_1^H (S_1^T in the real-valued case) is the matched filter bank for user 1's signature and its delayed versions. Consequently:
\hat{S}_1 r[m] = \hat{S}_1 (S_1 C_1 φ_1 A_1 b_1[m] + ... + S_K C_K φ_K A_K b_K[m] + S_{d1} C_1 φ_1 A_1 b_1[m−1] + ... + S_{dK} C_K φ_K A_K b_K[m−1] + n[m])   (4.1)
The main advantage of this architecture is that it only 'accepts' the first term of the above expression, suppressing the terms from the rest of the users as long as the cross-correlation properties of the family of signatures in use are good enough. In this way, as mentioned above, each final user needs to know nothing apart from its own signature and corresponding multipath channel. The remaining modules of the block diagram depicted in figure 4.1, as well as their capabilities, are discussed in the following subsections.
4.2 Adaptive processing
4.2.1 LMS algorithm
4.2.1.1 Initial approach
The LMS (Least Mean Squares) algorithm was invented in 1960 by Stanford University professor Bernard Widrow and his student, Ted Hoff.
One of its most important properties is its simplicity, which comes from the fact that the update of the kth coefficient requires only one multiplication and one addition (the value µe[n] need only be computed once and may be used for all the coefficients), requiring neither correlation measurements nor the inversion of the correlation matrix.
The algorithm consists of two basic processes:
1. A filtering process, which involves:
• the generation of an output by a linear filter in response to a certain
input, and
• the estimation of an error function e[n] through the comparison of this
output with a desired signal.
2. An adaptive process, which implies the automatic adjustment of the filter
parameters according to the estimated error.
Nevertheless, the simplicity of this algorithm comes at the cost of converging more
slowly than other algorithms such as RLS. In spite of that, its computational complexity
is so much lower than that of the alternatives that the algorithm has become widely
known and employed.
4.2.1.2 Discussion and recursive algorithm
The Steepest Descent Adaptive Filter
It operates by finding the vector $w_n$ at time n that minimizes the quadratic
function:

$\xi[n] = E\{|e[n]|^2\}$    (4.2)

With $w_n$ being an estimate of the vector that minimizes the mean-square error ξ[n] at
time n, the new estimate at time n + 1 is formed by adding a correction to $w_n$ that
brings it closer to the desired solution. This correction involves taking a step of size
µ in the direction of steepest descent down the quadratic error surface, given by
the gradient function:
Figure 4.2: 2D and 3D representation of the quadratic error surface
$w_{n+1} = w_n - \mu \nabla \xi[n]$    (4.3)

$\nabla \xi[n] = \nabla E\{|e[n]|^2\} = E\{\nabla |e[n]|^2\} = E\{e[n]\, \nabla e^*[n]\} = -E\{e[n]\, x^*[n]\}$    (4.4)

Therefore, with a step size of µ, the steepest descent algorithm results in:

$w_{n+1} = w_n + \mu E\{e[n] x^*[n]\}$    (4.5)
The LMS Algorithm
Given the previous expression (4.5), a practical limitation of this algorithm is that
the expectation $E\{e[n]x^*[n]\}$ is generally unknown. It must consequently be replaced
with an estimate such as the following:

$\hat{E}\{e[n] x^*[n]\} = \frac{1}{L} \sum_{l=0}^{L-1} e[n-l]\, x^*[n-l]$    (4.6)

One particular case of this estimate uses a one-point sample mean (L = 1):

$\hat{E}\{e[n] x^*[n]\} = e[n]\, x^*[n]$    (4.7)
The weight-vector update equation consequently takes the simple form:

$w_{n+1} = w_n + \mu e[n] x^*[n]$    (4.8)
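As a minimal sketch of the update (4.8) (Python, real-valued data, and a hypothetical 3-tap channel chosen only for illustration), LMS can be used to identify an unknown FIR system from its input and output:

```python
import numpy as np

def lms(x, d, M, mu):
    """LMS adaptive filter, update w_{n+1} = w_n + mu*e[n]*x[n]  (real-valued (4.8))."""
    w = np.zeros(M)
    err = np.zeros(len(x))
    for n in range(M - 1, len(x)):
        xn = x[n - M + 1:n + 1][::-1]   # regressor [x[n], x[n-1], ..., x[n-M+1]]
        e = d[n] - w @ xn               # a priori error
        w = w + mu * e * xn             # update; use conj(xn) for complex data
        err[n] = e
    return w, err

# Toy system identification: recover a 3-tap FIR channel (assumed for the demo).
rng = np.random.default_rng(1)
h = np.array([1.0, -0.5, 0.25])
x = rng.standard_normal(5000)
d = np.convolve(x, h)[:len(x)]
w, err = lms(x, d, M=3, mu=0.01)
print(np.round(w, 3))                   # close to h after convergence
```

The single multiply-add per coefficient discussed in the next section is visible in the update line.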
4.2.1.3 Computational cost
It was previously mentioned that the computational cost of the LMS algorithm is
rather small compared with others. In this section we study the significance of this
statement in more depth.
It is important to point out that by an algorithm's computational cost we refer
to the amount of resources needed to execute it. Here we focus on its running time,
that is, the time spent performing each of its mathematical operations. In the case
of the LMS algorithm, going back to (4.8), for each iteration we have:
1. An inner product x[n]w[n − 1] between two vectors of size M each. Assuming
the entries of x and w are complex-valued, and taking into account that one complex
multiplication comprises four real multiplications and that one complex addition
requires two real additions, a single evaluation of this inner product requires
4M real multiplications and 4M − 2 real additions.
2. The scalar term e[n] = d[n] − x[n]w[n − 1] must also be evaluated, which
amounts to one complex addition, i.e., two more real additions.
3. Evaluation of the product µe[n], where µ is a real scalar, requires another two
real multiplications when the data is complex-valued.
4. Immediately afterwards comes the product between the scalar µe[n] and $x^*[n]$.
This again requires M complex multiplications and, consequently, 4M real
multiplications and 2M real additions.
5. Finally, the addition of the vectors w[n − 1] and $\mu e[n] x^*[n]$ requires M complex
(2M real) additions.
Summarizing, LMS requires 8M + 2 real multiplications and 8M real additions per
iteration when processing complex-valued signals. We will see that this cost is far
smaller than that of the other algorithm under study, RLS.
4.2.1.4 Variants of LMS algorithm
The main drawback of the "pure" LMS algorithm is that it is quite sensitive to
the scaling of its input x[n] (what is more, it requires knowledge of the statistics
of this input signal prior to the adaptive filtering operation). A valid alternative
to bypass this inconvenience is to normalise the power of the input. This is the
main philosophy of the Normalised Least Mean Squares filter (NLMS), whose
update equation is the following:

$w_{n+1} = w_n + \beta \, \frac{x^*[n] e[n]}{\|x[n]\|^2}$    (4.9)

where β ∈ (0, 2) is the normalised step size.
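A sketch of (4.9) on the same hypothetical identification task used above, with the input power deliberately scaled up: the normalisation keeps the step size insensitive to that scaling (the small `eps` guard against division by zero is an implementation detail, not part of (4.9)):

```python
import numpy as np

def nlms(x, d, M, beta, eps=1e-8):
    """Normalised LMS: w_{n+1} = w_n + beta * e[n] * x[n] / ||x[n]||^2  (4.9)."""
    w = np.zeros(M)
    for n in range(M - 1, len(x)):
        xn = x[n - M + 1:n + 1][::-1]
        e = d[n] - w @ xn
        w = w + beta * e * xn / (xn @ xn + eps)   # power-normalised update
    return w

# Same toy channel as before, but the input power is scaled by 100: NLMS keeps
# converging with the same beta, where plain LMS would need mu retuned.
rng = np.random.default_rng(2)
h = np.array([1.0, -0.5, 0.25])
x = 10.0 * rng.standard_normal(5000)
d = np.convolve(x, h)[:len(x)]
w = nlms(x, d, M=3, beta=0.5)
print(np.round(w, 3))
```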
4.2.2 RLS algorithm
4.2.2.1 Initial approach
The RLS (Recursive Least Squares) algorithm recursively finds the filter coefficients
that minimize a weighted linear least squares cost function of the input signals.
It was discovered by Gauss in 1821, but ignored until Plackett rediscovered his
work in 1950.
Compared to the LMS algorithm, RLS converges considerably faster; however, this
benefit comes at the cost of increased computational complexity. Moreover, it
usually behaves worse than LMS for non-stationary processes.
Figure 4.3: RLS functional block diagram
4.2.2.2 Discussion and recursive algorithm
Let us reconsider the design of an FIR adaptive Wiener filter and find the filter
coefficients:

$w_n = (w_n[0], w_n[1], \ldots, w_n[p])^T$    (4.10)

The idea behind this algorithm basically consists of appropriately selecting the
filter coefficients and updating their values as new data is received [4, 7].
The desired signal d[n] and the error signal e[n] are defined in the functional
block diagram of Figure 4.3. The error implicitly depends on the filter coefficients
through the estimate $\hat{d}[n]$:

$e[n] = d[n] - \hat{d}[n] = d[n] - w_n^T x[n]$    (4.11)
The cost function C, which we wish to minimize, also depends on the filter
coefficients, since it depends on e[n]:

$C(w_n) = \sum_{i=0}^{n} \lambda^{n-i} e^2[i]$    (4.12)

where 0 < λ ≤ 1 is known as the forgetting factor, and gives exponentially less weight
to the oldest error samples. The smaller λ is, the smaller the contribution of previous
samples; this makes the filter more sensitive to recent samples, which means more
fluctuation in the filter coefficients. The λ = 1 case is referred to as the
growing-window RLS algorithm. In practice, λ is usually chosen between 0.98 and 1.
Now we proceed to minimize this function C:

$\frac{\partial C(w_n)}{\partial w_n[k]} = \sum_{i=0}^{n} 2\lambda^{n-i} e[i] \frac{\partial e[i]}{\partial w_n[k]} = -\sum_{i=0}^{n} 2\lambda^{n-i} e[i]\, x[i-k]$

which, substituting e[i] and equating to zero, yields

$\sum_{i=0}^{n} \lambda^{n-i} \Big[ d[i] - \sum_{l=0}^{p} w_n[l]\, x[i-l] \Big] x[i-k] = 0, \quad k = 0, 1, \ldots, p$    (4.13)
Thus, interchanging the order of summation and rearranging terms we have:

$\sum_{l=0}^{p} w_n[l] \sum_{i=0}^{n} \lambda^{n-i} x[i-l]\, x[i-k] = \sum_{i=0}^{n} \lambda^{n-i} d[i]\, x[i-k], \quad k = 0, 1, \ldots, p$    (4.14)
These equations may also be expressed in matrix form as follows:

$R_x[n]\, w_n = r_{dx}[n]$    (4.15)

where
• $R_x[n] = \sum_{i=0}^{n} \lambda^{n-i} x[i]\, x^T[i] \in \mathbb{R}^{(p+1)\times(p+1)}$ is the exponentially weighted deterministic autocorrelation matrix of x[n],
• $x[i] = [x[i], x[i-1], \ldots, x[i-p]]^T$ is the data vector, and
• $r_{dx}[n] = \sum_{i=0}^{n} \lambda^{n-i} d[i]\, x[i]$ is the deterministic cross-correlation between x[n] and d[n].
Recursive algorithm
Since both $R_x[n]$ and $r_{dx}[n]$ depend on n, instead of solving these equations for
each value of n, we want to derive a recursive solution of the form

$w_n = w_{n-1} + \Delta w_{n-1}$    (4.16)
where $\Delta w_{n-1}$ is a correction applied to the solution at time n − 1. Since

$w_n = R_x^{-1}[n]\, r_{dx}[n]$    (4.17)

we start by expressing the cross-correlation $r_{dx}[n]$ in terms of $r_{dx}[n-1]$:
$r_{dx}[n] = \sum_{i=0}^{n} \lambda^{n-i} d[i]\, x[i] = \sum_{i=0}^{n-1} \lambda^{n-i} d[i]\, x[i] + \lambda^0 d[n]\, x[n] = \lambda r_{dx}[n-1] + d[n]\, x[n]$    (4.18)
Similarly, we express $R_x[n]$ in terms of $R_x[n-1]$:

$R_x[n] = \sum_{i=0}^{n} \lambda^{n-i} x[i]\, x^T[i] = \lambda R_x[n-1] + x[n]\, x^T[n]$    (4.19)
Since it is the inverse of $R_x[n]$ that we are interested in, we may apply Woodbury's
identity to obtain the desired recursion:

$(A + UCV)^{-1} = A^{-1} - A^{-1} U (C^{-1} + V A^{-1} U)^{-1} V A^{-1}$    (4.20)

where

$A = \lambda R_x[n-1] \in \mathbb{R}^{(p+1)\times(p+1)}$    (4.21)
$U = x[n] \in \mathbb{R}^{(p+1)\times 1}$    (4.22)
$V = x^T[n] \in \mathbb{R}^{1\times(p+1)}$    (4.23)
$C = 1 \in \mathbb{R}$    (4.24)

Applying it,

$R_x^{-1}[n] = \big(\lambda R_x[n-1] + x[n]\, x^T[n]\big)^{-1} = \lambda^{-1} R_x^{-1}[n-1] - \lambda^{-1} R_x^{-1}[n-1]\, x[n] \big\{1 + x^T[n]\, \lambda^{-1} R_x^{-1}[n-1]\, x[n]\big\}^{-1} x^T[n]\, \lambda^{-1} R_x^{-1}[n-1]$    (4.25)
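The rank-one application of Woodbury's identity in (4.25) is easy to check numerically (a Python sketch with a randomly generated positive-definite stand-in for $R_x[n-1]$):

```python
import numpy as np

rng = np.random.default_rng(4)
p, lam = 3, 0.98
A0 = np.eye(p + 1) + 0.1 * rng.standard_normal((p + 1, p + 1))
R_prev = A0 @ A0.T                            # a positive-definite R_x[n-1]
x = rng.standard_normal(p + 1)

# Left side: direct inversion of lambda*R[n-1] + x x^T.
lhs = np.linalg.inv(lam * R_prev + np.outer(x, x))

# Right side: Woodbury with A = lam*R[n-1], U = x, V = x^T, C = 1, as in (4.25).
Pi = np.linalg.inv(R_prev)                    # R_x^{-1}[n-1]
rhs = (Pi / lam
       - (Pi / lam) @ np.outer(x, x) @ (Pi / lam)
         / (1.0 + x @ (Pi / lam) @ x))

print(np.allclose(lhs, rhs))                  # True
```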
To simplify notation we define:

$P[n] = R_x^{-1}[n] = \lambda^{-1} P[n-1] - g[n]\, x^T[n]\, \lambda^{-1} P[n-1]$    (4.26)

$g[n] = \lambda^{-1} P[n-1]\, x[n] \big(1 + x^T[n]\, \lambda^{-1} P[n-1]\, x[n]\big)^{-1} = P[n-1]\, x[n] \big(\lambda + x^T[n]\, P[n-1]\, x[n]\big)^{-1}$    (4.27)

and bring g[n] into another form:

$g[n] \big(1 + x^T[n]\, \lambda^{-1} P[n-1]\, x[n]\big) = g[n] + g[n]\, x^T[n]\, \lambda^{-1} P[n-1]\, x[n] = \lambda^{-1} P[n-1]\, x[n]$    (4.28)

Subtracting the second term of the left-hand side, we have

$g[n] = \lambda^{-1} P[n-1]\, x[n] - g[n]\, x^T[n]\, \lambda^{-1} P[n-1]\, x[n] = \lambda^{-1} \big(P[n-1] - g[n]\, x^T[n]\, P[n-1]\big) x[n] = P[n]\, x[n]$    (4.29)
Now we are ready to complete the recursion. As discussed,

$w_n = P[n]\, r_{dx}[n] = \lambda P[n]\, r_{dx}[n-1] + d[n]\, P[n]\, x[n]$    (4.30)

Going back to (4.26), and using $P[n]\, x[n] = g[n]$,

$w_n = \lambda \big(\lambda^{-1} P[n-1] - g[n]\, x^T[n]\, \lambda^{-1} P[n-1]\big) r_{dx}[n-1] + d[n]\, g[n]$
$= P[n-1]\, r_{dx}[n-1] - g[n]\, x^T[n]\, P[n-1]\, r_{dx}[n-1] + d[n]\, g[n]$
$= P[n-1]\, r_{dx}[n-1] + g[n] \big(d[n] - x^T[n]\, P[n-1]\, r_{dx}[n-1]\big)$    (4.31)
Figure 4.4: Performance of LMS/RLS algorithms within GSLC scheme
Finally, since $P[n-1]\, r_{dx}[n-1] = w_{n-1}$, we arrive at the update equation

$w_n = w_{n-1} + g[n] \big(d[n] - x^T[n]\, w_{n-1}\big) = w_{n-1} + g[n]\, \alpha[n]$    (4.32)
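The recursions (4.26), (4.27) and (4.32) translate almost line by line into code. Below is a minimal real-valued sketch (Python) on the same hypothetical channel-identification task used to illustrate LMS; `delta` is the customary large initial value for P, an implementation choice rather than part of the derivation:

```python
import numpy as np

def rls(x, d, M, lam=0.98, delta=100.0):
    """RLS via the recursions: gain g[n] (4.27), P[n] (4.26), update (4.32)."""
    w = np.zeros(M)
    P = delta * np.eye(M)                  # initial inverse-correlation estimate
    for n in range(M - 1, len(x)):
        xn = x[n - M + 1:n + 1][::-1]
        Px = P @ xn
        g = Px / (lam + xn @ Px)           # gain vector (4.27)
        alpha = d[n] - w @ xn              # a priori error alpha[n]
        w = w + g * alpha                  # update (4.32)
        P = (P - np.outer(g, xn) @ P) / lam    # inverse-correlation update (4.26)
    return w

rng = np.random.default_rng(5)
h = np.array([1.0, -0.5, 0.25])
x = rng.standard_normal(500)               # far fewer samples than LMS needed
d = np.convolve(x, h)[:len(x)]
w = rls(x, d, M=3)
print(np.round(w, 3))
```

The much shorter data record (500 samples vs. 5000 for LMS) reflects the faster convergence discussed above, paid for by the O(M²) work in the P-update.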
4.2.2.3 Computational cost
Computational cost of RLS algorithm is one order higher than of LMS. And it is
not only this fact, but also that a reliable RLS implementation will usually require
a higher precision than a LMS one.
Although the order by which the quantities are being computed may differ —and
other ways of carrying out the results may result in a slightly different computational
cost, they will all lead to the same order of magnitude, namely, O(M2
). A detailed
analysis of RLS computational cost may be found in 4.1.
4.2.2.4 Variants of RLS algorithm
As with the LMS algorithm, the RLS algorithm also has several variants which may
further speed up its convergence. Although they are outside the scope of this project,
we briefly mention a couple of them in the following lines:
Lattice Recursive Least Squares filter (LRLS)
The Lattice Recursive Least Squares adaptive filter is related to standard RLS but,
perhaps surprisingly, requires fewer operations per iteration (O(M)). It also offers
several advantages over the conventional LMS algorithm, such as faster convergence
rates and insensitivity to variations in the eigenvalue spread of the input
correlation matrix.
Lattice forms are primarily concerned with order-updating the output estimation
error rather than the weight vector itself. To do so, the forward and backward
prediction errors must also be order-updated.
There are many ways to accomplish this task; in [7], seven different lattice
forms are described. Because of the huge diversity of implementations of this
algorithm, it falls outside our matter of study for this project.
Normalized Lattice Recursive Least Squares filter (NLRLS)
As its name states, this is the normalized form of LRLS. It has fewer recursions
and variables than its predecessor, and is obtained by applying a normalization to
the internal variables of the algorithm (which keeps their magnitude bounded
by one). However, it is not generally used in real-time applications. The reason
lies in the number of divisions and square-root operations, which leads to a high
computational load.
Term                                                              | ×            | +          | /
$x^T w_{n-1}$                                                     | M            | M − 1      |
$d[n] - x^T w_{n-1}$                                              |              | 1          |
$\lambda^{-1} x$                                                  | M            |            |
$P_{i-1}(\lambda^{-1} x)$                                         | M²           | M(M − 1)   |
$x^T P_{i-1}(\lambda^{-1} x)$                                     | M            | M − 1      |
$1 + x^T P_{i-1}(\lambda^{-1} x)$                                 |              | 1          |
$1/[1 + x^T P_{i-1}(\lambda^{-1} x)]$                             |              |            | 1
$(\lambda^{-1} x^T P_{i-1} x) / [1 + x^T P_{i-1}(\lambda^{-1} x)]$ | 1           |            |
$(\lambda^{-1} P_{i-1} x) \cdot (\lambda^{-1} x^T P_{i-1} x) / [1 + x^T P_{i-1}(\lambda^{-1} x)]$ | M | |
$P_i x$                                                           |              | M          |
$P_i x\,[d[n] - x^T w_{n-1}]$                                     | M            |            |
$w_i$                                                             |              | M          |
TOTAL per iteration                                               | M² + 5M + 1  | M² + 3M    | 1

Table 4.1: Estimated computational cost of RLS algorithm per iteration for complex-valued data in terms of real multiplications, real additions and real divisions
4.3 Error correction using SOVA algorithm
As mentioned in our summary, the last stage of our receiver attaches a convolutional
decoder based on the Soft Output Viterbi Algorithm (SOVA).
To include this last element in our analysis, let us suppose that our previous
adaptive filter has already converged. This is in fact the only way to estimate the
resulting BER at the receiver's output, given that, if we proceeded differently, our
BER would be time-dependent.
4.3.1 A review of the (usual) Viterbi Algorithm
4.3.1.1 Background
The Viterbi algorithm, named after Andrew Viterbi (who first proposed it in 1967),
is a dynamic programming algorithm for finding the most likely sequence of hidden
states, called the Viterbi path. This algorithm has found universal application in
decoding the convolutional codes used in CDMA and GSM digital cellular, dial-up
communications and so on.
There are other approaches to extending the Viterbi Algorithm in addition to the
one of our choice. These include the Circular Viterbi Algorithm (CVA) [2] and the
List Viterbi Algorithm (LVA) [9], which produces a rank-ordered list of the L > 1
best paths through the "trellis" (we will discuss the meaning of this word in the
next subsection), making it very useful, for instance, for hybrid FEC/ARQ systems.
4.3.1.2 Algorithm discussion
The channel model depicted in Figure 4.1 generates at its output q[n] = o[n] + z[n],
where

$o[n] = \sum_{k=0}^{K} p[k]\, A[n-k]$    (4.33)

is the channel output without the addition of AWGN. In a discrete channel with
memory such as ours, a key concept to define is the state (ψ). It is defined as the
minimum necessary information to determine the system's output o[n] at each time
instant n, given the value of the input A[n] at the same time. Our state coincides
with the vector formed by the K previously transmitted data symbols:

$\psi[n] = \big(A[n-1], A[n-2], \ldots, A[n-K]\big)^T$    (4.34)

Therefore, o[n] is determined by the system's state ψ[n] and the input A[n]; the
state at the next time instant n + 1 is also determined by these values. Notice that
the number of possible states is finite and equal to $M^K$, as data symbols belong to
a discrete constellation. All systems satisfying this property are called finite-state
machines.
Let us now review how the Viterbi algorithm operates. Given an observation
vector q, the metric associated to each candidate sequence a can be defined as

$l(q, a) = \sum_{n=0}^{N_K - 1} |q[n] - o[n]|^2 = \sum_{n=0}^{N_K - 1} l_n(q, a)$    (4.35)

$l_{0,m}(q, a) = \sum_{n=0}^{m} l_n(q, a)$    (4.36)

Let us define â as the path that minimizes the metric l(q, a) for a given trellis
diagram and observation vector q. Besides, let a′ ≠ â be another surviving path that
reaches a certain node alongside â. In this case, the "accumulated metrics" of â and
a′ verify that:

$l_{0,m}(q, \hat{a}) \le l_{0,m}(q, a')$    (4.37)

$l_{0,m}(q, a) = l_{0,m-1}(q, a) + l_m(q, a)$    (4.38)

Notice that $l_{0,m-1}(q, a)$ corresponds to the "surviving" path from one of the
possible states at the previous time instant (m − 1), while $l_m(q, a)$ corresponds to
the transition towards the current state at time instant m. Consequently, the
surviving path will be the one for which this sum is the smallest [1, 6].
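For a concrete sketch, the minimization of (4.35) over the trellis can be coded for a toy 1-memory ISI channel (Python; the channel taps, noise level and BPSK alphabet are illustrative assumptions, not the thesis system). An exhaustive search over all candidate sequences confirms that the surviving path attains the global minimum of the metric:

```python
import itertools
import numpy as np

def viterbi(q, p, symbols=(-1.0, 1.0)):
    """Min-metric sequence detection over o[n] = p[0]*A[n] + p[1]*A[n-1].
    State = previous symbol; branch metric l_n = |q[n] - o[n]|^2  (4.35)."""
    cost = {s: 0.0 for s in symbols}          # A[-1] left free (any initial state)
    back = []
    for qn in q:
        new_cost, choice = {}, {}
        for a in symbols:                     # hypothesised current symbol A[n]
            best = min(symbols, key=lambda s: cost[s] + (qn - p[0]*a - p[1]*s)**2)
            new_cost[a] = cost[best] + (qn - p[0]*a - p[1]*best)**2
            choice[a] = best                  # surviving predecessor
        back.append(choice)
        cost = new_cost
    a = min(cost, key=cost.get)               # best final state, then trace back
    path = [a]
    for choice in reversed(back[1:]):
        path.insert(0, choice[path[0]])
    return path, cost[a]

rng = np.random.default_rng(6)
p = (1.0, 0.5)
A = rng.choice([-1.0, 1.0], size=8)
q = p[0]*A + p[1]*np.concatenate(([1.0], A[:-1])) + 0.1*rng.standard_normal(8)

path, m = viterbi(q, p)
# Exhaustive check: the Viterbi path attains the global minimum of the metric.
brute = min(
    sum((qn - p[0]*a - p[1]*s)**2
        for qn, a, s in zip(q, seq, (s0,) + seq[:-1]))
    for s0 in (-1.0, 1.0)
    for seq in itertools.product((-1.0, 1.0), repeat=len(q))
)
print(abs(m - brute) < 1e-9)    # True
```

The dynamic program touches only 2 states per step, while the brute-force check enumerates all 2⁸ sequences; this gap is exactly what makes the algorithm practical.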
Figure 4.5: Example of the most likely state sequence of a five-state Hidden Markov
Model given an observation sequence of length 5
4.3.2 Soft Output Viterbi Algorithm
The Soft Output Viterbi Algorithm (SOVA) [3] is a variation of the Viterbi
algorithm that introduces two modifications to the classical version:
• The path metrics used to select the maximum likelihood path through the
trellis are modified to take into account the a priori probabilities of the input
symbols.
• The algorithm is modified to provide a soft output for each decoded bit,
indicating the reliability of the decision.
SOVA relies on the idea that the probability that a given decision is correct is
proportional to how close the algorithm came to selecting the other value (or
values). Once the algorithm has traversed the entire trellis, thereby tracing the
maximum likelihood (ML) path, it traces back from time instant t along this path,
noting all path metric comparisons that could have changed the ML decision at
time t − D.
From among these comparisons, the one in which the difference between the
compared partial path metrics is the smallest is selected. Thus the minimization is
carried out only over those merging paths which would have given a different value
for the bit at time t − D had they been selected as survivors.
4.3.3 Other approaches of extending the Viterbi Algorithm
SOVA is by no means the only variation of the Viterbi Algorithm that may come
in handy for a certain purpose. In the following sections, two other alternatives
(the List Viterbi Algorithm and the Circular Viterbi Algorithm) are briefly presented.
4.3.3.1 List Viterbi Algorithm (LVA)
The List Viterbi Algorithm (LVA) basically consists of producing a rank-ordered
list of the L > 1 best paths through the trellis diagram.
This modification of the Viterbi Algorithm may be used in many scenarios (indeed,
its performance has been proven to be significantly better). For instance, in a
hybrid FEC/ARQ scheme, the retransmission of a frame is only requested by the
receiver once it has verified that all L candidate decodings contain errors.
LVA operates using three information arrays. These are:
• φt(i, k), 0 ≤ i ≤ N, 0 ≤ k ≤ L, the kth lowest cost to reach state i at time t;
• a traceback array giving, for the kth best path passing through state i at time t,
its state at time t − 1;
• and rt(i, k), the corresponding ranking.
4.3.3.2 Circular Viterbi Algorithm (CVA)
The Viterbi decoding algorithm for a block of convolutionally coded data requires
a terminating tail known to the decoder, forcing the final trellis state to be a
known one (usually all zeros).
But it is also possible to perform block-wise decoding without this known tail,
by ensuring that the coded message begins and ends in the same state, which is
unknown to the receiver. This case is widely known as decoding of tailbiting
convolutional codes, and it is widely used in several newer cellular mobile radio
systems such as WiMAX or LTE to avoid the overhead of this all-zeros tail,
improving efficiency.
Figure 4.6: Block of coded information without tail (same ending and starting states)
The idea of the Circular Viterbi Algorithm (CVA) is simple. It consists of applying
the Viterbi Algorithm for continuous decoding to a sequence of repeated received
blocks:
Figure 4.7: Simulation of a situation where the information is transmitted repeatedly
in block form
By connecting the repeated versions of the same block together, we can satisfy the
condition that the ending state and the starting state of the decoding path are the
same; indeed, any decoding path passing through this repeated trellis fulfils this
condition.
Figure 4.8: Evolution of BER with Eb/No using hard/soft decoding
The figure above shows that soft decoding is preferable to hard decoding for
improving BER results.
Figure 4.9: Evolution of BER with Eb/No of our final system with/without SOVA
decoder
It is clear, according to these two final simulations, that a SOVA decoder placed
at the final stage of our downlink reception chain is a highly recommended option
to reduce BER.
Conclusions
This project has been concerned with the description of the mathematical model
and the subsequent comparison of several strategies inspired by traditional multi-user
detection architectures against another valid uplink/downlink alternative based on
the Generalised Side-Lobe Canceller (GSLC) scheme, which, in the light of the results
gathered through several MATLAB simulations, proved to be the most effective way
of dealing with both inter-symbol interference (ISI) and multi-user interference (MUI).
However, there is still much room for improvement in its optimization. The
attached simulations have been performed over a two-ray multipath channel, with a
difference of 10 dB between the direct and the reflected ray. As these models are
mainly concerned with reducing the interference level, the more multipath delays we
introduce, the clearer their effects will be.
Another valid possibility to improve performance could be the use of either a
higher spreading factor or Gold codes. Due to the program's limitations, the current
simulations have been performed with a spreading factor of N = 8, but we could
easily find higher spreading factors in a real network, such as N = 32 and beyond.
On top of that, another issue that must be taken into account in this
implementation is its computational cost. Indeed, although the learning curve of the
RLS algorithm converges much faster than that of LMS, it is precisely because of
this cost that it might not be as adequate as the latter for real-time applications.
From this point, many further lines of research may be followed. We could, for
instance, study in greater depth all the features mentioned but not fully developed
throughout this project, such as NLMS, LRLS or NLRLS, among others. The same
goes for the other Viterbi algorithm variants mentioned, LVA and CVA. Many other
ideas could also serve as more than valid ways to continue this study.
Appendix A
Generation of the spreading codes
From our discussion of this topic in the Introduction, we can infer that spreading
codes play a crucial role in our system's performance. Therefore, we will require
acceptable autocorrelation and cross-correlation properties.
There exist many techniques to generate these spreading sequences, such as
M-sequences, Gold codes, Kasami codes and Walsh codes, among others. The first
ones allow us to reach better performance, and they are usually produced by a shift
register whose contents are combined through logical operations (typically XOR, i.e.,
modulo-2 addition).
Figure A.1: Scheme of a shift register
The system's feedback is obtained according to the following formula:

$a_i = c_1 a_{i-1} + c_2 a_{i-2} + \cdots + c_n a_{i-n}$    (A.1)
The behaviour of the register depends on its connections, which are described by an
expression called the generator polynomial. It is thanks to this polynomial that the
register produces sequences of a certain period. M-sequences are those whose period
is the longest possible, equal to $2^L - 1$, where L is the number of "memories" of the
register. The reason behind the choice of these sequences lies in their autocorrelation
function (equal to a delta (δ)). Nevertheless, their poor cross-correlation properties
have motivated the adoption of other families of codes for which this parameter
presents more suitable values.
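The register recursion (A.1) and the delta-like periodic autocorrelation can be sketched as follows (Python; the delay set {1, 3} for L = 3 is one primitive choice, assumed here only for illustration):

```python
def msequence(delays, L):
    """One period (2^L - 1 chips) of a shift-register sequence following (A.1):
    a_i = a_{i-j1} XOR a_{i-j2} XOR ... for the given tap delays."""
    a = [1] * L                                # register contents: any non-zero seed
    for i in range(L, 2 ** L - 1 + L):
        bit = 0
        for j in delays:                       # modulo-2 addition of tapped stages
            bit ^= a[i - j]
        a.append(bit)
    return a[:2 ** L - 1]                      # one full period

seq = msequence([1, 3], 3)                     # maximal length: period 2^3 - 1 = 7
chips = [1 - 2 * b for b in seq]               # map {0,1} -> {+1,-1}
corr = [sum(chips[n] * chips[(n + t) % 7] for n in range(7)) for t in range(7)]
print(seq, corr)                               # delta-like: 7 at lag 0, -1 elsewhere
```

The two-valued periodic autocorrelation (N at lag 0, −1 at every other lag) is exactly the delta-like property mentioned above.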
Figure A.2: Auto/cross-correlation of two M-sequences
One of the most popular designs to tackle this problem is that of the Gold codes
(named after their inventor, Robert Gold). These are generated through the
combination of two M-sequences, as depicted below:
Figure A.3: Scheme of the production of a Gold code
However, not every M-sequence is eligible for this purpose. Gold selected a subset
of them, which are tabulated and can be easily found. In any case, the higher the
value of L, the lower their cross-correlation and the more suitable these codes are
for our CDMA application.
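As a sketch (Python), the pair [5 2 0], [5 4 3 2 0] from Table A.1 (polynomials $x^5+x^2+1$ and $x^5+x^4+x^3+x^2+1$) can be combined by modulo-2 addition to produce a Gold code; the tap-delay mapping used here is one common LFSR convention, and the three-valued cross-correlation {−9, −1, 7} is the signature of a preferred pair for L = 5:

```python
def msequence(delays, L):
    """One period of an M-sequence following the recursion (A.1)."""
    a = [1] * L
    for i in range(L, 2 ** L - 1 + L):
        bit = 0
        for j in delays:
            bit ^= a[i - j]
        a.append(bit)
    return a[:2 ** L - 1]

# Preferred pair from Table A.1: [5 2 0] and [5 4 3 2 0].
u = msequence([3, 5], 5)              # recursion a_i = a_{i-3} + a_{i-5}
v = msequence([1, 2, 3, 5], 5)        # recursion a_i = a_{i-1}+a_{i-2}+a_{i-3}+a_{i-5}
gold = [ui ^ vi for ui, vi in zip(u, v)]   # one Gold code: u XOR v (zero shift)

# Cross-correlation between the pair over all cyclic shifts.
cu = [1 - 2 * b for b in u]
cv = [1 - 2 * b for b in v]
N = len(u)
xcorr = {sum(cu[n] * cv[(n + t) % N] for n in range(N)) for t in range(N)}
print(sorted(xcorr))                  # values drawn from {-9, -1, 7}
```

Shifting v before the XOR yields the other members of the Gold family, giving many more codes than the M-sequences alone.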
Figure A.4: Auto/cross-correlation of two Gold sequences
Number of memories | Period Nc = 2^L − 1 | Pairs of M-sequences
5                  | 31                  | [5 2 0], [5 4 3 2 0]
6                  | 63                  | [6 1 0], [6 5 2 1 0]
7                  | 127                 | [7 3 0], [7 3 2 1 0]
                   |                     | [7 3 2 1 0], [7 5 4 3 2 1 0]
8                  | 255                 | [8 7 6 1 0], [8 7 6 5 2 1 0]
9                  | 511                 | [9 4 0], [9 6 4 3 0]
                   |                     | [9 6 4 3 0], [9 8 4 1 0]
10                 | 1023                | [10 9 8 7 6 5 4 3 0], [10 9 7 6 4 1 0]
                   |                     | [10 8 5 1 0], [10 7 6 4 2 1 0]
                   |                     | [10 8 7 6 5 4 3 1 0], [10 9 7 6 4 1 0]

Table A.1: Several pairs of M-sequences used to generate Gold codes
Bibliography
[1] Antonio Artés and Fernando Pérez. Comunicaciones digitales. Pearson Prentice Hall, 2007.
[2] Paloma García, Antonio Valdovinos, and Fernando Gutiérrez. Simplified Circular Viterbi Algorithm for Tailbiting Convolutional Codes. In Vehicular Technology Conference (VTC Fall), 2011 IEEE, 2011.
[3] Joachim Hagenauer and Peter Hoeher. A Viterbi algorithm with soft-decision outputs and its applications. In Proc. IEEE Global Telecommunications Conference 1989, 1989.
[4] Simon O. Haykin. Adaptive Filter Theory. Prentice Hall Inc., 1996.
[5] José María Hernando Rábanos. Comunicaciones móviles. Centro de Estudios Ramón Areces, 2004.
[6] José Ignacio Ronda and Santiago Zazo. Apuntes de Transmisión Digital. U.P.M., 2013.
[7] Ali H. Sayed. Adaptive Filters. Wiley, 2008.
[8] Harry L. Van Trees. Optimum Array Processing. Wiley, 2002.
[9] Branka Vucetic and Jinhong Yuan. Turbo Codes: Principles and Applications. Kluwer Publishers, 2001.
[10] Santiago Zazo, Faouzi Baden, and José María Páez-Borrallo. A multiple access/self interference canceller receiver for DS-CDMA multiuser detection over fading channels. In IEEE Vehicular Technology Conference, 2000.