Random Matrix Theory in Signal Processing
Xavier Mestre
xavier.mestre@cttc.cat
Centre Tecnològic de Telecomunicacions de Catalunya (CTTC)
Klagenfurt University (Austria)
February 25, 2019
Centre Tecnològic de Telecomunicacions de Catalunya - CTTC
Parc Mediterrani de la Tecnologia, Castelldefels (Barcelona), Spain, http://www.cttc.cat/
Outline
• Introduction to RMT: Convergence of spectral statistics of the sample covariance matrix.
• First Application: Subspace-based estimation of directions-of-arrival (DoA).
• Second Application: Detection tests of correlation and sphericity.
• Third Application: Large multivariate time series analysis
• Fourth Application: Outlier production characterization in Conditional/Unconditional Maximum
Likelihood parametric estimation.
Xavier Mestre: Random Matrix Theory in Signal Processing. 2/41
The sample covariance matrix
We assume that we collect $N$ independent samples (snapshots) from an array of $M$ antennas. Consider the $M \times N$ observation matrix $\mathbf{Y} = [\mathbf{y}(1), \ldots, \mathbf{y}(N)]$ and the sample covariance matrix
\[ \hat{\mathbf{R}} = \frac{1}{N}\mathbf{Y}\mathbf{Y}^H = \frac{1}{N}\sum_{n=1}^{N}\mathbf{y}(n)\mathbf{y}^H(n). \]
Problem statement and objective of the talk
Typically, one expects $\hat{\mathbf{R}}$ to be a close approximation of $\mathbf{R}$. Under conventional statistical assumptions, we have
\[ \hat{\mathbf{R}} \to \mathbf{R} \]
almost surely when $N \to \infty$ for a fixed $M$. Furthermore, if $f(\cdot)$ is a reasonable function, we also have $f(\hat{\mathbf{R}}) \to f(\mathbf{R})$.
Unfortunately, when $M$, $N$ have the same order of magnitude, this does not hold anymore: the finite sample size effect appears.
In these situations, the regime where $M, N \to \infty$ but $M/N \to c$, $0 < c < \infty$, becomes much more relevant. RMT will help us solve the following two problems in this regime:
• What does $f(\hat{\mathbf{R}})$ converge to?
• How do we design $g(\cdot)$ so that $g(\hat{\mathbf{R}}) \to f(\mathbf{R})$?
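As an illustration of the finite sample size effect, the following sketch evaluates $f(\hat{\mathbf{R}}) = \frac{1}{M}\log\det\hat{\mathbf{R}}$ for $\mathbf{R} = \mathbf{I}$, where $f(\mathbf{R}) = 0$. The choice of $f$, the dimensions and the helper names are assumptions made for this example; the limit used for comparison is the standard Marchenko-Pastur value for $0 < c < 1$.

```python
import numpy as np

def sample_covariance(Y):
    """Sample covariance (1/N) Y Y^H of an M x N snapshot matrix Y."""
    return Y @ Y.conj().T / Y.shape[1]

def normalized_logdet(R):
    """f(R) = (1/M) log det R for a Hermitian positive definite R."""
    _, logdet = np.linalg.slogdet(R)
    return logdet / R.shape[0]

def mp_logdet_limit(c):
    """Limit of (1/M) log det R_hat when R = I and M/N -> c, 0 < c < 1."""
    return (c - 1.0) / c * np.log(1.0 - c) - 1.0

rng = np.random.default_rng(0)
M, N = 100, 200  # c = M/N = 0.5, same order of magnitude
X = (rng.standard_normal((M, N)) + 1j * rng.standard_normal((M, N))) / np.sqrt(2)
value = normalized_logdet(sample_covariance(X))

# f(R) = 0 here, yet f(R_hat) settles near a nonzero deterministic value.
print(value, mp_logdet_limit(M / N))
```

The gap between the printed value and zero does not vanish as $M$ and $N$ grow together; it only disappears when $c \to 0$.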
Conditional versus unconditional model
Typically, the observations are a superposition of some signals plus noise. For example, in array processing the observation consists of the contribution from $K$ signals:
\[ \mathbf{Y} = \mathbf{A}(\Theta)\mathbf{S} + \mathbf{N} \]
where:
• Matrix $\mathbf{S} \in \mathbb{C}^{K \times N}$ contains the contribution of the $K$ signals (one in each of its rows).
• Matrix $\mathbf{A}(\Theta) \in \mathbb{C}^{M \times K}$ contains, in each of its columns, the spatial signature of each source, namely $\mathbf{A}(\Theta) = [\mathbf{a}(\theta_1)\ \mathbf{a}(\theta_2)\ \cdots\ \mathbf{a}(\theta_K)]$.
• Matrix $\mathbf{N} \in \mathbb{C}^{M \times N}$ contains the background noise samples. It is typically modeled as a matrix with i.i.d. entries following a zero-mean Gaussian distribution, $\{\mathbf{N}\}_{ij} \sim \mathcal{CN}(0, \sigma^2)$.
Conditional versus unconditional model (II)
Depending on how the signals are modeled, we differentiate between the conditional and the unconditional models. Let us denote $\mathbf{S} = [\mathbf{s}(1), \ldots, \mathbf{s}(N)]$.
• Conditional Model: The entries of $\mathbf{S}$ are modelled as deterministic unknowns. In this case, the observation can be described as
\[ \mathbf{y}(n) \sim \mathcal{CN}\left(\mathbf{A}(\Theta)\mathbf{s}(n),\ \sigma^2\mathbf{I}_M\right). \]
• Unconditional Model: The entries of $\mathbf{S}$ are modelled as random variables. Typically, we assume that the column vectors $\mathbf{s}(n)$ are independent, circularly symmetric Gaussian random vectors, i.e. $\mathbf{s}(n) \sim \mathcal{CN}(\mathbf{0}, \mathbf{P})$, $\mathbf{P} > 0$. In this case, we have
\[ \mathbf{y}(n) \sim \mathcal{CN}(\mathbf{0}, \mathbf{R}), \qquad \mathbf{R} = \mathbf{A}(\Theta)\mathbf{P}\mathbf{A}^H(\Theta) + \sigma^2\mathbf{I}_M. \]
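The two models can be simulated with the same generator, as in the sketch below. The uniform linear array geometry with half-wavelength spacing, the function names and the numerical values are assumptions made for illustration; they are not prescribed by the model above.

```python
import numpy as np

def ula_steering(theta_deg, M, d=0.5):
    """Steering vectors of a uniform linear array (assumed geometry), M x K."""
    theta = np.deg2rad(np.atleast_1d(theta_deg))
    m = np.arange(M)[:, None]
    return np.exp(2j * np.pi * d * m * np.sin(theta)[None, :])

def complex_gaussian(rng, shape):
    """Array of i.i.d. CN(0, 1) entries."""
    return (rng.standard_normal(shape) + 1j * rng.standard_normal(shape)) / np.sqrt(2)

def snapshots(rng, A, N, sigma2, S=None, P=None):
    """Y = A S + noise. Pass S for the conditional model, or P for the unconditional one."""
    M, K = A.shape
    if S is None:  # unconditional model: s(n) ~ CN(0, P)
        P = np.eye(K) if P is None else P
        S = np.linalg.cholesky(P) @ complex_gaussian(rng, (K, N))
    return A @ S + np.sqrt(sigma2) * complex_gaussian(rng, (M, N))

rng = np.random.default_rng(1)
M, N, sigma2 = 4, 50000, 0.5
A = ula_steering([35.0, 37.0], M)
Y = snapshots(rng, A, N, sigma2)            # unconditional model with P = I
R = A @ A.conj().T + sigma2 * np.eye(M)     # true covariance for P = I
R_hat = Y @ Y.conj().T / N                  # close to R because N >> M
```

Passing a fixed matrix `S` instead reproduces the conditional model, where only the noise is random.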
Conditional versus unconditional model (III)
Depending on the signal model, the structure of the sample covariance matrix (SCM) will be inherently different. Let $\mathbf{X}$ be an $M \times N$ matrix of i.i.d. standardized Gaussian entries, $\{\mathbf{X}\}_{ij} \sim \mathcal{CN}(0, 1)$. The two most important models can be described as:
• Conditional Model (also known as Information plus Noise model): the SCM can be expressed as
\[ \hat{\mathbf{R}} = \frac{1}{N}(\mathbf{V} + \mathbf{X})(\mathbf{V} + \mathbf{X})^H \]
where $\mathbf{V}$ is some deterministic matrix that contains the signal (information) contribution.
• Unconditional Model (also known as Single Side Correlation model): the SCM can be expressed as
\[ \hat{\mathbf{R}} = \mathbf{R}^{1/2}\left(\frac{1}{N}\mathbf{X}\mathbf{X}^H\right)\mathbf{R}^{1/2} \]
where $\mathbf{R}^{1/2}$ is the positive Hermitian square root of $\mathbf{R}$.
In many of the results obtained by RMT, the Gaussian assumption can be dropped.
Uncorrelated signals: the Marchenko-Pastur Law
Consider the simplest case where $\hat{\mathbf{R}} = \frac{1}{N}\mathbf{X}\mathbf{X}^H$, where the entries of $\mathbf{X}$ are zero mean i.i.d. with unit variance. Consider the eigenvalue distribution for different $M$, $N$, but fixed ratio $M/N$.
[Figure: histograms of the eigenvalues of the sample covariance matrix for M=80, N=800 and for M=800, N=8000. Horizontal axis: eigenvalues; vertical axis: number of eigenvalues.]
Uncorrelated signals: the Marchenko-Pastur Law (II)
It turns out that when $M, N \to \infty$, $M/N \to c$, $0 < c < \infty$, the empirical density of eigenvalues converges to a deterministic measure.
[Figure: histogram of the eigenvalues of the sample covariance matrix, M=800, N=8000, with the limiting density. Horizontal axis: eigenvalues; vertical axis: number of eigenvalues.]
For $c < 1$,
\[ f(x) = \frac{1}{2\pi c x}\sqrt{(x - \lambda^-)(\lambda^+ - x)}\ \mathbb{I}_{[\lambda^-, \lambda^+]}(x), \qquad \lambda^- = (1 - \sqrt{c})^2, \quad \lambda^+ = (1 + \sqrt{c})^2. \]
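The convergence can be checked numerically. The following sketch draws one realization of $\hat{\mathbf{R}}$ and compares its eigenvalue histogram with the Marchenko-Pastur density; the dimensions, the bin layout and the names `mp_density` and `max_gap` are choices made for this example.

```python
import numpy as np

def mp_density(x, c):
    """Marchenko-Pastur density for ratio c = M/N < 1 and unit variance."""
    lam_minus, lam_plus = (1 - np.sqrt(c)) ** 2, (1 + np.sqrt(c)) ** 2
    x = np.asarray(x, dtype=float)
    inside = (x > lam_minus) & (x < lam_plus)
    f = np.zeros_like(x)
    xi = x[inside]
    f[inside] = np.sqrt((xi - lam_minus) * (lam_plus - xi)) / (2 * np.pi * c * xi)
    return f

rng = np.random.default_rng(2)
M, N = 200, 2000
c = M / N
X = (rng.standard_normal((M, N)) + 1j * rng.standard_normal((M, N))) / np.sqrt(2)
eigs = np.linalg.eigvalsh(X @ X.conj().T / N)

# Compare the empirical histogram with the limiting density on a coarse grid.
hist, edges = np.histogram(eigs, bins=20, range=(0.4, 2.0), density=True)
centers = 0.5 * (edges[:-1] + edges[1:])
max_gap = np.max(np.abs(hist - mp_density(centers, c)))
print(eigs.min(), eigs.max(), max_gap)
```

The extreme eigenvalues stay close to $\lambda^-$ and $\lambda^+$ even though $\mathbf{R} = \mathbf{I}$ has a single eigenvalue equal to one.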
The sample covariance matrix
We assume that we collect $N$ independent samples (snapshots) from an array of $M$ antennas: $\hat{\mathbf{R}} = \frac{1}{N}\sum_{n=1}^{N}\mathbf{y}(n)\mathbf{y}^H(n)$. Example: $\mathbf{R}$ has 4 eigenvalues $\{1, 2, 3, 7\}$ with equal multiplicity.
[Figure: histograms of the eigenvalues of the sample covariance matrix for M=80, N=800 and for M=400, N=4000. Horizontal axis: lambda; vertical axis: number of eigenvalues.]
The sample covariance matrix: asymptotic properties
When both $M, N \to \infty$, $M/N \to c$, $0 < c < \infty$, the e.d.f. of the eigenvalues of $\hat{\mathbf{R}}$ tends to a deterministic density function. Example: $\mathbf{R}$ has 4 eigenvalues $\{1, 2, 3, 7\}$ with equal multiplicity.
[Figure: asymptotic density of eigenvalues of the sample correlation matrix for c=0.01, c=0.1 and c=1. Horizontal axis: eigenvalues.]
1st Application: Subspace-based estimation of directions of arrival (DoA)
Introduction and signal model
We consider DoA detection based on subspace approaches (MUSIC), which exploit the orthogonality between signal and noise subspaces.
Consider a set of $K$ sources impinging on an array of $M$ sensors/antennas. We work with a fixed number of snapshots $N$,
\[ \{\mathbf{y}(1), \ldots, \mathbf{y}(N)\} \]
assumed i.i.d., with zero mean and covariance $\mathbf{R}$.
The true spatial covariance matrix can be described as
\[ \mathbf{R} = \mathbf{A}(\Theta)\boldsymbol{\Phi}\mathbf{A}^H(\Theta) + \sigma^2\mathbf{I}_M \]
where $\mathbf{A}(\Theta)$ is an $M \times K$ matrix that contains the steering vectors corresponding to the $K$ different sources,
\[ \mathbf{A}(\Theta) = [\mathbf{a}(\theta_1)\ \mathbf{a}(\theta_2)\ \cdots\ \mathbf{a}(\theta_K)] \]
and $\sigma^2$ is the background noise power.
Introduction and signal model (II)
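The eigendecomposition of $\mathbf{R}$ allows us to differentiate between signal and noise subspaces:
\[ \mathbf{R} = [\mathbf{E}_S\ \mathbf{E}_N]\begin{bmatrix}\boldsymbol{\Lambda}_S & \mathbf{0}\\ \mathbf{0} & \sigma^2\mathbf{I}_{M-K}\end{bmatrix}[\mathbf{E}_S\ \mathbf{E}_N]^H. \]
It turns out that $\mathbf{E}_N^H\mathbf{a}(\theta_k) = \mathbf{0}$, $k = 1, \ldots, K$.
Since $\mathbf{R}$ is unknown, one must work with the sample covariance matrix
\[ \hat{\mathbf{R}} = \frac{1}{N}\sum_{n=1}^{N}\mathbf{y}(n)\mathbf{y}^H(n). \]
The MUSIC algorithm uses the sample noise eigenvectors, and searches for the $K$ deepest local minima of the cost function
\[ \hat{\eta}_{MUSIC}(\theta) = \mathbf{a}^H(\theta)\hat{\mathbf{E}}_N\hat{\mathbf{E}}_N^H\mathbf{a}(\theta). \]
It is interesting to investigate the behavior of $\hat{\eta}_{MUSIC}(\theta)$ when $M$, $N$ have the same order of magnitude.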
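A minimal sketch of the MUSIC search is given below, in a comfortable regime with many snapshots and well-separated sources. The ULA geometry with half-wavelength spacing, the grid resolution, the scenario values and all function names are assumptions made for this example.

```python
import numpy as np

def ula_steering(theta_deg, M, d=0.5):
    """Unit-norm ULA steering vectors (assumed array geometry), M x len(theta)."""
    theta = np.deg2rad(np.atleast_1d(theta_deg))
    m = np.arange(M)[:, None]
    return np.exp(2j * np.pi * d * m * np.sin(theta)[None, :]) / np.sqrt(M)

def music_cost(R_hat, K, grid_deg):
    """a^H(theta) E_N E_N^H a(theta), built from the sample noise eigenvectors."""
    M = R_hat.shape[0]
    _, eigvecs = np.linalg.eigh(R_hat)       # eigenvalues in ascending order
    E_noise = eigvecs[:, : M - K]            # eigenvectors of the M - K smallest
    proj = E_noise.conj().T @ ula_steering(grid_deg, M)
    return np.sum(np.abs(proj) ** 2, axis=0)

rng = np.random.default_rng(3)
M, N, K, sigma2 = 20, 2000, 2, 0.1
doas = np.array([20.0, 40.0])
A = ula_steering(doas, M) * np.sqrt(M)
S = (rng.standard_normal((K, N)) + 1j * rng.standard_normal((K, N))) / np.sqrt(2)
noise = (rng.standard_normal((M, N)) + 1j * rng.standard_normal((M, N))) * np.sqrt(sigma2 / 2)
Y = A @ S + noise
grid = np.arange(-90.0, 90.25, 0.25)
cost = music_cost(Y @ Y.conj().T / N, K, grid)

# Keep the K deepest local minima of the cost function.
is_min = (cost[1:-1] < cost[:-2]) & (cost[1:-1] < cost[2:])
candidates = np.where(is_min)[0] + 1
estimates = np.sort(grid[candidates[np.argsort(cost[candidates])[:K]]])
print(estimates)
```

Reducing `N` towards `M`, or moving the two sources closer together, is a simple way to observe the degradation discussed in the talk.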
Asymptotic behavior of MUSIC
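The MUSIC algorithm suffers from the breakdown effect: the performance breaks down when the number of samples or the SNR falls below a certain threshold. Cause: $\hat{\mathbf{E}}_N$ is not a very good estimator of $\mathbf{E}_N$ when $M$, $N$ have the same order of magnitude.
The performance breakdown effect can be easily analyzed using random matrix theory, especially under a noise eigenvalue separation assumption: $|\hat{\eta}_{MUSIC}(\theta) - \bar{\eta}_{MUSIC}(\theta)| \to 0$, where
\[ \bar{\eta}_{MUSIC}(\theta) = \mathbf{a}^H(\theta)\left(\sum_{k=1}^{M} w(k)\,\mathbf{e}_k\mathbf{e}_k^H\right)\mathbf{a}(\theta) \]
\[ w(k) = \begin{cases} 1 - \dfrac{1}{M-K}\displaystyle\sum_{r=M-K+1}^{M}\left(\dfrac{\sigma^2}{\lambda_r - \sigma^2} - \dfrac{\mu_1}{\lambda_r - \mu_1}\right) & k \le M-K \\[2ex] \dfrac{\sigma^2}{\lambda_k - \sigma^2} - \dfrac{\mu_1}{\lambda_k - \mu_1} & k > M-K \end{cases} \]
Here $\mathbf{e}_k$ and $\lambda_k$ are the eigenvectors and eigenvalues of $\mathbf{R}$ (in ascending order of eigenvalue), and $\{\mu_k, k = 1, \ldots, M\}$ are the solutions to
\[ \frac{1}{M}\sum_{r=1}^{M}\frac{\lambda_r}{\lambda_r - \mu} = \frac{1}{c}. \]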
Asymptotic behavior of MUSIC: an example
We consider a scenario with two sources impinging on a ULA (element spacing of 0.5 wavelengths, $M = 20$) from DoAs $35^\circ$, $37^\circ$.
[Figure: MUSIC asymptotic pseudospectrum versus azimuth (deg), M=20, DoAs=[35,37] deg, with a zoom around 32-40 deg for N=25 and N=15 at SNR=12 dB and SNR=17 dB.]
[Figure: position of the two deepest local minima of the asymptotic MUSIC cost function, azimuth (deg) versus SNR (dB), for N=15 and N=25.]
$(M, N)$-consistent subspace detection: G-MUSIC
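We can derive an $(M, N)$-consistent estimator of the cost function $\eta(\theta) = \mathbf{a}^H(\theta)\mathbf{E}_N\mathbf{E}_N^H\mathbf{a}(\theta)$:
\[ \hat{\eta}_{G\text{-}MUSIC}(\theta) = \mathbf{a}^H(\theta)\left(\sum_{m=1}^{M}\phi(m)\,\hat{\mathbf{e}}_m\hat{\mathbf{e}}_m^H\right)\mathbf{a}(\theta) \]
\[ \phi(m) = \begin{cases} 1 + \displaystyle\sum_{k=M-K+1}^{M}\left(\dfrac{\hat{\lambda}_k}{\hat{\lambda}_m - \hat{\lambda}_k} - \dfrac{\hat{\mu}_k}{\hat{\lambda}_m - \hat{\mu}_k}\right) & m \le M-K \\[2ex] -\displaystyle\sum_{k=1}^{M-K}\left(\dfrac{\hat{\lambda}_k}{\hat{\lambda}_m - \hat{\lambda}_k} - \dfrac{\hat{\mu}_k}{\hat{\lambda}_m - \hat{\mu}_k}\right) & m > M-K \end{cases} \]
where $\hat{\lambda}_m$ and $\hat{\mathbf{e}}_m$ are the eigenvalues (in ascending order) and eigenvectors of $\hat{\mathbf{R}}$, and where now $\hat{\mu}_1, \ldots, \hat{\mu}_M$ are the solutions to the equation
\[ \frac{1}{M}\sum_{k=1}^{M}\frac{\hat{\lambda}_k}{\hat{\lambda}_k - \hat{\mu}} = \frac{1}{c}. \]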
Performance evaluation MUSIC vs. G-MUSIC
Comparative evaluation of MUSIC and G-MUSIC via simulations. Scenarios with four ($-20^\circ$, $-10^\circ$, $35^\circ$, $37^\circ$) and two ($35^\circ$, $37^\circ$) sources respectively, ULA ($M = 20$, element spacing of 0.5 wavelengths).
[Figure: example of MUSIC and G-MUSIC cost functions versus angle of arrival (azimuth, degrees), SNR=18 dB, M=20, N=15, DoAs=35, 37, -10, -20 deg, with a zoom around 34-38 deg.]
[Figure: mean squared error versus SNR (dB) for MUSIC, G-MUSIC and the CRB, for M=20, N=15 and M=20, N=75.]
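For readers who want to reproduce such comparisons, the sketch below computes G-MUSIC eigenvector weights of the form $1 + \sum(\cdot)$ on the noise indices and $-\sum(\cdot)$ on the signal indices. It is a sketch under assumptions: it takes $N > M$ with distinct positive sample eigenvalues, uses $1/c = N/M$, and obtains the $\hat{\mu}_k$ as the eigenvalues of a rank-one perturbation of $\mathrm{diag}(\hat{\lambda})$, which is a standard way of solving this type of secular equation rather than something stated in the slides. Function names are illustrative.

```python
import numpy as np

def gmusic_weights(lam, K, N):
    """G-MUSIC weights phi(m); lam ascending, assumes N > M and distinct eigenvalues."""
    M = lam.size
    # Solutions of (1/M) sum_k lam_k / (lam_k - mu) = N / M, computed as the
    # eigenvalues of diag(lam) - (1/N) sqrt(lam) sqrt(lam)^T.
    s = np.sqrt(lam)
    mu = np.linalg.eigvalsh(np.diag(lam) - np.outer(s, s) / N)

    def terms(m, ks):
        return lam[ks] / (lam[m] - lam[ks]) - mu[ks] / (lam[m] - mu[ks])

    noise, signal = np.arange(M - K), np.arange(M - K, M)
    phi = np.empty(M)
    for m in noise:
        phi[m] = 1.0 + np.sum(terms(m, signal))
    for m in signal:
        phi[m] = -np.sum(terms(m, noise))
    return phi

def gmusic_cost(R_hat, K, N, steering):
    """a^H (sum_m phi(m) e_m e_m^H) a for each column a of `steering`."""
    lam, E = np.linalg.eigh(R_hat)
    phi = gmusic_weights(lam, K, N)
    return phi @ (np.abs(E.conj().T @ steering) ** 2)

# Illustrative eigenvalues: three noise-like values and two signal-like values.
lam_demo = np.array([1.0, 1.1, 1.2, 5.0, 9.0])
phi_small_N = gmusic_weights(lam_demo, K=2, N=10)
phi_large_N = gmusic_weights(lam_demo, K=2, N=10**6)
print(phi_small_N, phi_large_N)
```

For very large `N` the weights approach one on the noise eigenvectors and zero on the signal eigenvectors, so the cost reduces to conventional MUSIC; for small `N` the signal eigenvectors receive nonzero weights.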
2nd Application: characterization of sphericity and correlation tests
Problem formulation
We consider two very important tests in signal processing, which try to establish the structure of the covariance matrix of the received signal:
• Sphericity test: seeks to establish whether the received signal is spatio-temporal white noise:
\[ H_0: \mathbf{R} = \sigma^2\mathbf{I}_M \qquad H_1: \mathbf{R} \neq \sigma^2\mathbf{I}_M \]
• Correlation test: seeks to establish whether the signals received from multiple sensors are correlated:
\[ H_0: \mathbf{R} = \mathbf{R} \odot \mathbf{I}_M \qquad H_1: \mathbf{R} \neq \mathbf{R} \odot \mathbf{I}_M \]
In both cases, the true covariance matrix is unknown, so one must work with the sampled version $\hat{\mathbf{R}}$.
Generalized Maximum Likelihood Ratio Test (GLRT)
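In order to address this binary hypothesis problem, one may resort to the Generalized Likelihood Ratio Test (GLRT), which for sphericity and correlation respectively reads
\[ \frac{\sup_{\mathbf{R}}\prod_{n=1}^{N}\Phi(\mathbf{y}(n); \mathbf{R})}{\sup_{\sigma^2}\prod_{n=1}^{N}\Phi(\mathbf{y}(n); \sigma^2\mathbf{I}_M)} \underset{H_0}{\overset{H_1}{\gtrless}} \tau \qquad\qquad \frac{\sup_{\mathbf{R}}\prod_{n=1}^{N}\Phi(\mathbf{y}(n); \mathbf{R})}{\sup_{\mathbf{D}}\prod_{n=1}^{N}\Phi(\mathbf{y}(n); \mathbf{D})} \underset{H_0}{\overset{H_1}{\gtrless}} \tau \]
where $\Phi(\mathbf{y}; \mathbf{R})$ is the pdf of a complex Gaussian with zero mean and covariance $\mathbf{R}$, and $\mathbf{D}$ ranges over diagonal matrices.
For $N \ge M$, the GLRT for sphericity and correlation respectively reject $H_0$ for large values of
\[ \hat{\eta}^{sphr} = \log\left[\frac{1}{M}\mathrm{tr}\big(\hat{\mathbf{R}}\big)\right] - \frac{1}{M}\log\det\big(\hat{\mathbf{R}}\big) \]
\[ \hat{\eta}^{corr} = \log\left[\frac{1}{M}\mathrm{tr}\big(\hat{\mathbf{C}}\big)\right] - \frac{1}{M}\log\det\big(\hat{\mathbf{C}}\big) \]
where $\hat{\mathbf{C}} = \big(\hat{\mathbf{R}} \odot \mathbf{I}_M\big)^{-1/2}\hat{\mathbf{R}}\big(\hat{\mathbf{R}} \odot \mathbf{I}_M\big)^{-1/2}$ is the sample correlation/coherence matrix.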
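Both statistics share the form $\log[\frac{1}{M}\mathrm{tr}(\cdot)] - \frac{1}{M}\log\det(\cdot)$, so one helper suffices. The sketch below is illustrative only; the function names and the white-noise test data are assumptions.

```python
import numpy as np

def glrt_statistic(R):
    """log[(1/M) tr R] - (1/M) log det R for a Hermitian positive definite R."""
    M = R.shape[0]
    _, logdet = np.linalg.slogdet(R)
    return np.log(np.real(np.trace(R)) / M) - logdet / M

def coherence_matrix(R_hat):
    """C_hat = (R_hat o I)^(-1/2) R_hat (R_hat o I)^(-1/2)."""
    d = 1.0 / np.sqrt(np.real(np.diag(R_hat)))
    return d[:, None] * R_hat * d[None, :]

rng = np.random.default_rng(5)
M, N = 10, 40
X = (rng.standard_normal((M, N)) + 1j * rng.standard_normal((M, N))) / np.sqrt(2)
R_hat = X @ X.conj().T / N
eta_sphr = glrt_statistic(R_hat)
eta_corr = glrt_statistic(coherence_matrix(R_hat))
print(eta_sphr, eta_corr)
```

Both values are nonnegative by the arithmetic-geometric mean inequality, and they are strictly positive here even though the data satisfy $H_0$, which is why the distribution of the statistic under $H_0$ matters for setting the threshold.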
Frobenius norm test
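Other, more ad-hoc tests can be constructed using more intuitive reasoning:
• Non-sphericity will manifest in $\hat{\mathbf{R}}$ being far from proportional to the identity.
• Correlation will lead to high absolute values of the off-diagonal elements of $\hat{\mathbf{C}}$.
Therefore, it seems reasonable to design the test to reject $H_0$ for large values of
\[ \hat{\eta}^{sphr} = \frac{1}{M}\left\|\hat{\mathbf{R}} - \frac{1}{M}\mathrm{tr}\big[\hat{\mathbf{R}}\big]\mathbf{I}_M\right\|_F^2 \qquad \hat{\eta}^{corr} = \frac{1}{M}\left\|\hat{\mathbf{C}} - \mathbf{I}_M\right\|_F^2. \]
In both cases, we have
\[ \hat{\eta} = \frac{1}{M}\sum_{m=1}^{M}\hat{\lambda}_m^2 - \left(\frac{1}{M}\sum_{m=1}^{M}\hat{\lambda}_m\right)^2 \]
where $\hat{\lambda}_m$ are the eigenvalues of $\hat{\mathbf{R}}$ or $\hat{\mathbf{C}}$ respectively. These are linear spectral statistics (LSS) with $f(x) = x^2$ and $f(x) = x$.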
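The equivalence between the Frobenius-norm form and the eigenvalue form can be verified numerically, as in the following sketch (function names and test dimensions are chosen for this example only).

```python
import numpy as np

def frobenius_sphericity(R_hat):
    """(1/M) || R_hat - (tr(R_hat)/M) I ||_F^2."""
    M = R_hat.shape[0]
    deviation = R_hat - np.real(np.trace(R_hat)) / M * np.eye(M)
    return np.linalg.norm(deviation, "fro") ** 2 / M

def frobenius_correlation(C_hat):
    """(1/M) || C_hat - I ||_F^2."""
    M = C_hat.shape[0]
    return np.linalg.norm(C_hat - np.eye(M), "fro") ** 2 / M

def spectral_form(eigenvalues):
    """(1/M) sum lambda^2 - ((1/M) sum lambda)^2."""
    return np.mean(eigenvalues ** 2) - np.mean(eigenvalues) ** 2

rng = np.random.default_rng(6)
M, N = 8, 32
X = (rng.standard_normal((M, N)) + 1j * rng.standard_normal((M, N))) / np.sqrt(2)
R_hat = X @ X.conj().T / N
d = 1.0 / np.sqrt(np.real(np.diag(R_hat)))
C_hat = d[:, None] * R_hat * d[None, :]

# Each pair holds (matrix form, eigenvalue form) of the same statistic.
sphr_pair = (frobenius_sphericity(R_hat), spectral_form(np.linalg.eigvalsh(R_hat)))
corr_pair = (frobenius_correlation(C_hat), spectral_form(np.linalg.eigvalsh(C_hat)))
print(sphr_pair, corr_pair)
```

For the correlation test the identity relies on $\frac{1}{M}\mathrm{tr}(\hat{\mathbf{C}}) = 1$, which holds because $\hat{\mathbf{C}}$ has a unit diagonal.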
General study of tests
It is generally difficult to derive the distribution of these tests, so in practice the literature has focused on the case $N \to \infty$ for fixed $M$.
We would like to know the asymptotic behavior of these tests for $M$, $N$ having the same order of magnitude, allowing for the possibility of $M > N$ (undersampled regime).
Fortunately, there is a direct relationship between LSS and the Stieltjes transform:
\[ \hat{\eta} = \frac{1}{M}\sum_{m=1}^{M}f\big(\hat{\lambda}_m\big) = \frac{1}{2\pi j}\oint_{C^-}f(z)\,\hat{m}(z)\,dz \]
where
\[ \hat{m}(z) = \frac{1}{M}\sum_{m=1}^{M}\frac{1}{\hat{\lambda}_m - z} \]
and where the negatively (clockwise) oriented contour $C^-$ encloses all the positive eigenvalues and not zero.
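The contour representation can be checked numerically on a small example. The sketch below uses a circular contour traversed clockwise and the periodic trapezoidal rule; the eigenvalues, the circle and the function name `lss_by_contour` are assumptions made for illustration.

```python
import numpy as np

def lss_by_contour(f, eigenvalues, center, radius, num_points=400):
    """(1/(2 pi j)) times the clockwise integral of f(z) m_hat(z) dz over a circle."""
    t = 2 * np.pi * np.arange(num_points) / num_points
    z = center + radius * np.exp(-1j * t)            # clockwise (negative) orientation
    dz = -1j * radius * np.exp(-1j * t) * (2 * np.pi / num_points)
    m_hat = np.mean(1.0 / (eigenvalues[None, :] - z[:, None]), axis=1)
    return np.sum(f(z) * m_hat * dz) / (2j * np.pi)

eigenvalues = np.array([1.0, 2.0, 3.0])
# The circle spans [0.5, 3.5] on the real axis: it encloses the eigenvalues, not zero.
by_contour = lss_by_contour(lambda z: z ** 2, eigenvalues, center=2.0, radius=1.5)
direct = np.mean(eigenvalues ** 2)
print(by_contour, direct)
```

The clockwise orientation compensates for the sign of $1/(\hat{\lambda}_m - z)$, so each enclosed eigenvalue contributes $+f(\hat{\lambda}_m)$.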
First order convergence
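By replacing $\hat{m}(z)$ with its asymptotic equivalent $\bar{m}(z)$, we obtain the almost sure asymptotic behavior of the test $\hat{\eta}$, in the sense that $|\hat{\eta} - \bar{\eta}| \to 0$ where
\[ \bar{\eta} = \frac{1}{2\pi j}\oint_{C^-}f(z)\,\bar{m}(z)\,dz. \]
Most of the time, we can carry out the integral and find a closed form for $\bar{\eta}$.
For example, for the sphericity GLRT with $c < 1$ we can establish
\[ \bar{\eta}^{sphr} = \eta_\infty + 1 + \frac{1-c}{c}\log|1 - c|, \]
while for $c > 1$ the closed form additionally involves $\log|\mu^*|$ and $\frac{1}{M}\sum_{m=1}^{M}\log\left|\frac{\gamma_m}{\gamma_m - \mu^*}\right|$, where $\mu^* \le 0$ is a solution to a certain equation. Here $\eta_\infty$ is the large-$N$ value of the GLRT,
\[ \eta_\infty = \log\left[\frac{1}{M}\sum_{m=1}^{M}\gamma_m\right] - \frac{1}{M}\sum_{m=1}^{M}\log\gamma_m, \]
with $\gamma_m$ the eigenvalues of $\mathbf{R}$.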
Second order convergence
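Using RMT tools that establish how $\hat{m}(z)$ fluctuates around $\bar{m}(z)$, we may establish a CLT on these tests. Under certain statistical conditions, the LSS $\hat{\eta}$ will asymptotically fluctuate as a Gaussian random variable, in the sense that
\[ \sigma_M^{-1}\big(M(\hat{\eta} - \bar{\eta}) - \mu_M\big) \xrightarrow{\ \mathcal{L}\ } \mathcal{N}(0, 1) \]
where
\[ \mu_M = \frac{1}{2\pi j}\oint_{C_\omega^-}\varphi(\omega)\,\mu(\omega)\,d\omega \qquad \sigma_M^2 = \frac{-1}{4\pi^2}\oint_{C_\omega^-}\oint_{C_\omega^-}\varphi(\omega_1)\varphi(\omega_2)\,\sigma^2(\omega_1, \omega_2)\,d\omega_1 d\omega_2 \]
where $\varphi(\omega)$ is the original test function $f(z)$ after some change of variable, and where the mean $\mu(\omega)$ and variance $\sigma^2(\omega_1, \omega_2)$ are different for the Sphericity and Correlation tests.
These integrals can be computed in closed form, and one can generally approximate
\[ \hat{\eta} \approx \mathcal{N}\left(\bar{\eta} + \frac{\mu_M}{M},\ \frac{\sigma_M^2}{M^2}\right). \]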
Numerical Results: correlation test
Simulations for $10^5$ independent simulation runs (GLRT). Under $H_0$, the diagonal entries of $\mathbf{D}$ take uniform values between 0 and 1. Under $H_1$, $\mathbf{R} = \mathbf{D} + \boldsymbol{\Psi}$ where $\{\boldsymbol{\Psi}\}_{ij} = 0.9^{|i-j|}$.
[Figure: density of the statistic under H0 and under H1 for M=20, N=25, comparing simulated values, theory (large M, N) and theory (large N).]
[Figure: density of the statistic under H0 and under H1 for M=20, N=100, comparing simulated values, theory (large M, N) and theory (large N).]
3rd Application: Large multi-variate time series
Introduction: testing independence of multiple time series
We consider an $M$-variate zero-mean Gaussian time series
\[ \mathbf{y}(n) = [y_1(n), \ldots, y_M(n)]^T \]
where $n = 1, \ldots, N$, and ask ourselves whether the different components of the series are independent.
Motivation
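Consider a certain window of $L$ samples and the extended random vector
\[ \mathbf{y}_L(n) = \big[\underbrace{y_1(n), \ldots, y_1(n+L-1)}_{L\ \text{samples}}, \ldots, \underbrace{y_M(n), \ldots, y_M(n+L-1)}_{L\ \text{samples}}\big]^T \]
and consider the second order statistics of this vector, namely
\[ \mathbb{E}\big[\mathbf{y}_L(n)\mathbf{y}_L^H(n)\big] = \mathbf{R}_L = \begin{bmatrix} \mathbf{R}_L^{(1,1)} & \mathbf{R}_L^{(1,2)} & \cdots & \mathbf{R}_L^{(1,M)} \\ \mathbf{R}_L^{(2,1)} & \mathbf{R}_L^{(2,2)} & \cdots & \mathbf{R}_L^{(2,M)} \\ \vdots & \vdots & \ddots & \vdots \\ \mathbf{R}_L^{(M,1)} & \mathbf{R}_L^{(M,2)} & \cdots & \mathbf{R}_L^{(M,M)} \end{bmatrix} \]
where $\mathbf{R}_L^{(m,m')}$ has dimensions $L \times L$. If the different time series are independent, $\mathbf{R}_L$ becomes block diagonal and
\[ \eta = \frac{1}{ML}\left(\log\det\mathbf{R}_L - \sum_{m=1}^{M}\log\det\mathbf{R}_L^{(m,m)}\right) = 0. \]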
Spatio-temporal sample covariance matrix
In practice, $\mathbf{R}_L$ has to be estimated from the observations $\mathbf{y}(n)$. Using the above formulation, we can estimate the time covariance between series $m$ and $m'$ as
\[ \hat{\mathbf{R}}_L^{(m,m')} = \frac{1}{N}\mathbf{Y}_m\mathbf{Y}_{m'}^H \]
where
\[ \mathbf{Y}_m = \begin{bmatrix} y_m(1) & y_m(2) & \cdots & y_m(N) \\ y_m(2) & y_m(3) & \cdots & y_m(N+1) \\ \vdots & \vdots & & \vdots \\ y_m(L) & y_m(L+1) & \cdots & y_m(N+L-1) \end{bmatrix} \]
has a Hankel structure. Under the null hypothesis (uncorrelation), and assuming stationarity,
\[ \mathbb{E}\big[y_m(n)\,y_{m'}^*(n')\big] = r_m(n - n')\,\delta_{m=m'} \qquad r_m(k) = \int_0^1 \mathcal{S}_m(\nu)\,e^{2i\pi\nu k}\,d\nu. \]
Tests from the spatio-temporal sample covariance matrix
We can therefore build the test
\[ \hat{\eta} = \frac{1}{ML}\left(\log\det\hat{\mathbf{R}}_L - \sum_{m=1}^{M}\log\det\hat{\mathbf{R}}_L^{(m,m)}\right) \]
and ask ourselves how to choose $L$ (time window parameter) to make $\hat{\eta}$ close to zero under the uncorrelation hypothesis. There is a trade-off between choosing $L$ small (so that $\hat{\mathbf{R}}_L$ is close to $\mathbf{R}_L$ in spectral norm) and testing independence over time lags $L$ as large as possible.
For all this, it appears reasonable to investigate the behavior of the eigenvalues of $\hat{\mathbf{R}}_L$ when $M \to \infty$ and $N \to \infty$ at the same rate, so that $c_N = \frac{ML}{N} \to c$, $0 < c < +\infty$. We will assume
that 4
→ 0, the spectral densities are uniformly bounded above and away from zero, and that
\[ \sup_{M}\sum_{k \in \mathbb{Z}}\left(\frac{1}{M}\sum_{m=1}^{M}|r_m(k)|^2\right)^{1/2} < +\infty. \]
Tests from the spatio-temporal sample covariance matrix
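Let $\mathbf{Q}(z)$, $z \in \mathbb{C}^+$, denote the resolvent matrix of $\hat{\mathbf{R}}_L$, that is $\mathbf{Q}(z) = \big(\hat{\mathbf{R}}_L - z\mathbf{I}_{ML}\big)^{-1}$. Under the above assumptions, $\mathbf{Q}(z) \asymp \mathbf{T}(z)$ (deterministic asymptotic equivalent), where $\mathbf{T}(z)$ is the unique solution to
\[ \mathbf{T}(z) = -\frac{1}{z}\left(\mathbf{I}_{ML} + \overline{\Psi}\left(-\frac{1}{z}\big(\mathbf{I}_N + c_N\Psi(\mathbf{T}(z))\big)^{-1}\right)\right)^{-1} \]
in the class of matrix valued Stieltjes transforms, where $\overline{\Psi}: \mathbb{C}^{N \times N} \to \mathbb{C}^{ML \times ML}$ and $\Psi: \mathbb{C}^{ML \times ML} \to \mathbb{C}^{N \times N}$ are the operators
\[ \overline{\Psi}(\mathbf{A}) = \int_0^1 \frac{\mathbf{d}_N^H(\nu)\,\mathbf{A}\,\mathbf{d}_N(\nu)}{N}\,\big(\mathcal{S}(\nu) \otimes \mathbf{d}_L(\nu)\mathbf{d}_L^H(\nu)\big)\,d\nu \]
\[ \Psi(\mathbf{B}) = \frac{1}{M}\sum_{m=1}^{M}\int_0^1 \mathcal{S}_m(\nu)\,\frac{\mathbf{d}_L^H(\nu)\,\mathbf{B}^{(m,m)}\,\mathbf{d}_L(\nu)}{L}\,\mathbf{d}_N(\nu)\mathbf{d}_N^H(\nu)\,d\nu \]
where $\mathbf{d}_R(\nu) = \big[1, e^{2i\pi\nu}, \ldots, e^{2i\pi(R-1)\nu}\big]^T$ for $R \in \{L, N\}$ and $\mathcal{S}(\nu) = \mathrm{diag}\big(\mathcal{S}_1(\nu), \ldots, \mathcal{S}_M(\nu)\big)$.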
4th Application: outlier characterization of Maximum Likelihood estimation
Considered scenario
• Consider an array of $M$ sensors receiving the signal transmitted by $K$ sources with parameters $\bar{\theta} = \big[\bar{\theta}(1), \ldots, \bar{\theta}(K)\big]$, where we assume $K < M$.
• Let $\mathbf{y}(n)$ denote an $M \times 1$ complex vector containing the received samples. We model this observation vector as
\[ \mathbf{y}(n) = \mathbf{A}(\bar{\theta})\mathbf{s}(n) + \mathbf{n}(n) \]
where $\mathbf{s}(n)$ contains the signal transmitted by the $K < M$ sources, $\mathbf{n}(n)$ contains the received noise (assumed i.i.d. and $\mathcal{CN}(0, \sigma^2)$) and
\[ \mathbf{A}(\bar{\theta}) = \big[\mathbf{a}\big(\bar{\theta}(1)\big)\ \cdots\ \mathbf{a}\big(\bar{\theta}(K)\big)\big]. \]
• We assume that a total of $N$ snapshots are available, and that $N > M$.
• Problem: estimate the parameters $\bar{\theta}$ from the observations $\{\mathbf{y}(n), n = 1, \ldots, N\}$.
• We investigate the use of Maximum Likelihood approaches $\Longrightarrow$ highest resolution at the cost of increased computational complexity (multidimensional search).
Maximum likelihood
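• The estimated angles are determined as $\hat{\theta} = \arg\min_{\theta \in \Theta}\hat{\eta}(\theta)$ where $\hat{\eta}(\theta)$ is the negative (concentrated) log-likelihood function.
• The "conditional" (or deterministic) model assumes that the signals $\mathbf{s}(n)$ are deterministic unknowns. In this situation, one must minimize
\[ \hat{\eta}_C(\theta) = \frac{1}{M}\mathrm{tr}\big[\mathbf{P}_A^\perp(\theta)\hat{\mathbf{R}}\big] \]
where $\hat{\mathbf{R}} = \frac{1}{N}\sum_{n=1}^{N}\mathbf{y}(n)\mathbf{y}^H(n)$ is the sample covariance matrix, $\mathbf{P}_A^\perp(\theta) = \mathbf{I}_M - \mathbf{P}_A(\theta)$, and
\[ \mathbf{P}_A(\theta) = \mathbf{A}(\theta)\big(\mathbf{A}^H(\theta)\mathbf{A}(\theta)\big)^{-1}\mathbf{A}^H(\theta) \]
is the orthogonal projection on the column space of $\mathbf{A}(\theta)$.
• The "unconditional" (or stochastic) model assumes that the source signals are random variables, typically $\mathbf{s}(n) \sim \mathcal{CN}(\mathbf{0}, \mathbf{P})$ and i.i.d. in the time domain. In this situation,
\[ \hat{\eta}_U(\theta) = \frac{1}{M}\log\det\big[\hat{\sigma}^2(\theta)\mathbf{P}_A^\perp(\theta) + \mathbf{P}_A(\theta)\hat{\mathbf{R}}\mathbf{P}_A(\theta)\big], \]
where $\hat{\sigma}^2(\theta) = \frac{1}{M-K}\mathrm{tr}\big[\mathbf{P}_A^\perp(\theta)\hat{\mathbf{R}}\big]$ is the corresponding noise power estimate.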
Breakdown effect in maximum likelihood (I)
• General nonlinear parametric estimators exhibit a threshold effect.
• At low SNR, or low $N$, the MSE suddenly departs from the Cramér-Rao Bound. The presence of outliers is the main cause of this behavior.
[Figure: MSE (dB) versus SNR (dB), showing the CRB and the threshold effect.]
Breakdown effect in maximum likelihood (II)
At low values of the SNR and $N$, there exist realizations of the cost functions for which local minima corresponding to outliers become deeper than the intended one.
[Figure: UML cost function over ($\theta_1$, $\theta_2$) in degrees, SNR=0 dB, M=5, N=20, uncorrelated signals, DoA=[16,18] deg, marking the local minima, the intended minimum and the selected minimum.]
Probability of resolution
• We consider here the definition of the resolution probability given in [Athley, 05]. If $\hat{\eta}(\theta)$ is a generic cost function that fluctuates around a deterministic $\bar{\eta}(\theta)$, which has $L + 1$ local minima at the values $\bar{\theta}, \theta_1, \ldots, \theta_L$, the probability of resolution can be defined as
\[ P_{res} = \mathbb{P}\left[\bigcap_{l=1}^{L}\big\{\hat{\eta}(\theta_l) > \hat{\eta}\big(\bar{\theta}\big)\big\}\right]. \]
• It was shown in [Athley, 05] that this definition of $P_{res}$ provides a very accurate description of both the breakdown effect and the expected mean squared error (MSE) of the DoA estimation process.
• Unfortunately, in our ML setting, $P_{res}$ is difficult to analyze for finite values of $M$, $N$ due to the complicated structure of the cost functions, especially $\hat{\eta}_U(\theta)$.
• We propose to use the asymptotic distributions (as $M, N \to \infty$) instead of the actual ones. This gives a very accurate description of the actual probability, even for very low $M$, $N$.
First order behavior
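When $M, N \to \infty$, under several technical conditions, the two ML cost functions become (pointwise) asymptotically close to two deterministic counterparts, namely
\[ |\hat{\eta}_C(\theta) - \bar{\eta}_C(\theta)| \to 0 \qquad |\hat{\eta}_U(\theta) - \bar{\eta}_U(\theta)| \to 0 \]
a.s. pointwise in $\theta$ as $M, N \to \infty$, where
\[ \bar{\eta}_C(\theta) = \frac{1}{M}\mathrm{tr}\big[\mathbf{P}_A^\perp(\theta)\mathbf{R}\big] \]
and
\[ \bar{\eta}_U(\theta) = \frac{1}{M}\log\det\big[\sigma^2(\theta)\mathbf{P}_A^\perp(\theta) + \mathbf{P}_A(\theta)\mathbf{R}\mathbf{P}_A(\theta)\big] + \frac{N-K}{M}\log\left(\frac{N}{N-K}\right) - \frac{K}{M} \]
respectively, where $\mathbf{R}$ is the true covariance matrix of the observations and $\sigma^2(\theta) = \frac{1}{M-K}\mathrm{tr}\big[\mathbf{P}_A^\perp(\theta)\mathbf{R}\big]$.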
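The sketch below evaluates both cost functions at one arbitrary test point and compares them with their deterministic counterparts for a large number of snapshots. The ULA geometry, the scenario values, the test point and all function names are assumptions made for this example; data are drawn from the unconditional model with a Cholesky factor of $\mathbf{R}$, which yields the same distribution as the Hermitian square root.

```python
import numpy as np

def ula_steering(theta_deg, M, d=0.5):
    """ULA steering vectors (assumed array geometry), M x K."""
    theta = np.deg2rad(np.atleast_1d(theta_deg))
    return np.exp(2j * np.pi * d * np.arange(M)[:, None] * np.sin(theta)[None, :])

def projector(A):
    """Orthogonal projector onto the column space of A."""
    return A @ np.linalg.solve(A.conj().T @ A, A.conj().T)

def cml_cost(P, R):
    """(1/M) tr[P_perp R]."""
    M = R.shape[0]
    return np.real(np.trace((np.eye(M) - P) @ R)) / M

def uml_cost(P, R, K):
    """(1/M) log det[sigma2 P_perp + P R P], sigma2 = tr[P_perp R] / (M - K)."""
    M = R.shape[0]
    P_perp = np.eye(M) - P
    sigma2 = np.real(np.trace(P_perp @ R)) / (M - K)
    _, logdet = np.linalg.slogdet(sigma2 * P_perp + P @ R @ P)
    return logdet / M

def uml_correction(M, N, K):
    """Deterministic offset ((N-K)/M) log(N/(N-K)) - K/M of the UML limit."""
    return (N - K) / M * np.log(N / (N - K)) - K / M

rng = np.random.default_rng(7)
M, N, K, sigma2 = 5, 20000, 2, 1.0
A_true = ula_steering([16.0, 18.0], M)
R = A_true @ A_true.conj().T + sigma2 * np.eye(M)
X = (rng.standard_normal((M, N)) + 1j * rng.standard_normal((M, N))) / np.sqrt(2)
Y = np.linalg.cholesky(R) @ X
R_hat = Y @ Y.conj().T / N

P = projector(ula_steering([10.0, 30.0], M))     # arbitrary test point theta
gap_cml = abs(cml_cost(P, R_hat) - cml_cost(P, R))
gap_uml = abs(uml_cost(P, R_hat, K) - (uml_cost(P, R, K) + uml_correction(M, N, K)))
print(gap_cml, gap_uml)
```

With $K$ fixed and $N$ this large the offset term is nearly zero; it becomes visible when $N$ is only a few times larger than $K$.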
Second order behavior
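Let $\theta_1, \ldots, \theta_L$ be a set of multidimensional points (e.g. local minima of $\bar{\eta}_C(\theta)$ or $\bar{\eta}_U(\theta)$). Let
\[ \hat{\boldsymbol{\eta}}_C = [\hat{\eta}_C(\theta_1), \ldots, \hat{\eta}_C(\theta_L)]^T \quad\text{and}\quad \bar{\boldsymbol{\eta}}_C = [\bar{\eta}_C(\theta_1), \ldots, \bar{\eta}_C(\theta_L)]^T \]
and take the equivalent definitions for the UML cost function. Assume that $\mathbf{y}(n) \sim \mathcal{CN}(\mathbf{0}, \mathbf{R})$.
Under certain technical conditions, as $M, N \to \infty$, $M/N \to c$, $0 < c < 1$, we have
\[ M\,\boldsymbol{\Gamma}_C^{-1/2}(\hat{\boldsymbol{\eta}}_C - \bar{\boldsymbol{\eta}}_C) \to \mathcal{N}(\mathbf{0}, \mathbf{I}_L) \quad\text{and}\quad M\,\boldsymbol{\Gamma}_U^{-1/2}(\hat{\boldsymbol{\eta}}_U - \bar{\boldsymbol{\eta}}_U) \to \mathcal{N}(\mathbf{0}, \mathbf{I}_L) \]
for some covariance matrices $\boldsymbol{\Gamma}_C$, $\boldsymbol{\Gamma}_U$ given by
\[ \{\boldsymbol{\Gamma}_C\}_{ij} = \frac{1}{N}\mathrm{tr}\big[\mathcal{P}_i^\perp\mathcal{P}_j^\perp\big] \]
and
\[ \{\boldsymbol{\Gamma}_U\}_{ij} = \frac{1}{\sigma_i^2\sigma_j^2}\frac{1}{N}\mathrm{tr}\big[\mathcal{P}_i^\perp\mathcal{P}_j^\perp\big] + \frac{1}{\sigma_i^2}\frac{1}{N}\mathrm{tr}\big[\mathcal{P}_i^\perp\mathcal{Q}_j\big] + \frac{1}{\sigma_j^2}\frac{1}{N}\mathrm{tr}\big[\mathcal{P}_j^\perp\mathcal{Q}_i\big] - \log\left|1 - \frac{1}{N}\mathrm{tr}\big[\mathcal{Q}_i\mathcal{Q}_j\big]\right| \]
where $\sigma_i^2 = \sigma^2(\theta_i)$, $\mathcal{P}_i = \mathbf{R}^{1/2}\mathbf{P}_A(\theta_i)\mathbf{R}^{1/2}$, $\mathcal{P}_i^\perp = \mathbf{R} - \mathcal{P}_i$, and $\mathcal{Q}_i = \mathbf{R}^{1/2}\mathbf{A}_i\big[\mathbf{A}_i^H\mathbf{R}\mathbf{A}_i\big]^{-1}\mathbf{A}_i^H\mathbf{R}^{1/2}$ with $\mathbf{A}_i = \mathbf{A}(\theta_i)$.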
Application
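• The previous asymptotic result is basically stating that, for sufficiently large $M$, $N$, we are able to approximate the finite-dimensional distributions of the functions $\hat{\eta}_C(\theta)$ and $\hat{\eta}_U(\theta)$ as multivariate Gaussians, namely
\[ \hat{\boldsymbol{\eta}}_C \sim \mathcal{N}\big(\bar{\boldsymbol{\eta}}_C,\ M^{-2}\boldsymbol{\Gamma}_C\big) \qquad \hat{\boldsymbol{\eta}}_U \sim \mathcal{N}\big(\bar{\boldsymbol{\eta}}_U,\ M^{-2}\boldsymbol{\Gamma}_U\big). \]
• It turns out that these results are very good approximations of the finite dimensional reality, even for relatively low values of $M$, $N$.
• This provides a tool to evaluate the resolution probability of the ML method according to
\[ P_{res} = \mathbb{P}\left[\bigcap_{l=1}^{L}\big\{\hat{\eta}(\theta_l) > \hat{\eta}\big(\bar{\theta}\big)\big\}\right] = \mathbb{P}\left[\begin{bmatrix} -1 & 1 & & \\ \vdots & & \ddots & \\ -1 & & & 1 \end{bmatrix}\hat{\boldsymbol{\eta}} > \mathbf{0}\right] \]
where $\hat{\boldsymbol{\eta}}$ stacks $\hat{\eta}(\bar{\theta}), \hat{\eta}(\theta_1), \ldots, \hat{\eta}(\theta_L)$, which can be evaluated by computing the cdf of the Gaussian law.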
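As a rough illustration of this last step, the sketch below estimates the probability by sampling from the Gaussian approximation instead of evaluating the multivariate cdf. The mean vector and covariance matrix are hypothetical numbers chosen only to exercise the function; in practice they would come from the asymptotic expressions above. The function name and the Monte Carlo approach are assumptions of this example.

```python
import numpy as np

def resolution_probability(eta_bar, cov, num_draws=200_000, seed=0):
    """Monte Carlo estimate of P[eta(theta_l) > eta(theta_bar) for all l].

    eta_bar[0] is the mean cost at the intended minimum, eta_bar[1:] at the
    outlier minima; cov is the covariance of the Gaussian approximation.
    """
    rng = np.random.default_rng(seed)
    L = len(eta_bar) - 1
    B = np.hstack([-np.ones((L, 1)), np.eye(L)])   # rows of the form [-1, 0, ..., 1, ..., 0]
    draws = rng.multivariate_normal(eta_bar, cov, size=num_draws)
    return np.mean(np.all(draws @ B.T > 0, axis=1))

# Hypothetical values: one intended minimum and one outlier minimum.
eta_bar = np.array([1.0, 1.5])
cov = np.diag([0.1, 0.15])
p_res = resolution_probability(eta_bar, cov)
print(p_res)
```

In this two-point example the difference of the two costs is Gaussian with mean 0.5 and standard deviation 0.5, so the estimate can be compared with the standard normal cdf evaluated at 1.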
Simulation results
ULA of $M = 5$ elements, two sources coming from 16 and 18 degrees with respect to the broadside.
[Figure: probability of resolution versus SNR (dB), M=5, Theta=[16,18] deg, uncorrelated sources (corr=0), for N=10 and N=100: UML (predicted), UML (simulated), CML (predicted), CML (simulated).]
[Figure: probability of resolution versus SNR (dB), M=5, Theta=[16,18] deg, highly correlated sources (corr=0.95), for N=10 and N=100: UML (predicted), UML (simulated), CML (predicted), CML (simulated).]
Concluding remarks
Conclusions
Random Matrix Theory offers the possibility of analyzing the behavior of different quantities depending
on ˆR when the sample size and the number of sensors/antennas have the same order of magnitude.
The objective is to describe the asymptotic behavior of a certain scalar function of $\hat{\mathbf{R}}$, namely $f(\hat{\mathbf{R}})$.
• Traditional Approach: Assuming that the number of samples is high, we might establish that $f(\hat{\mathbf{R}}) \to f(\mathbf{R})$ in some stochastic sense as $N \to \infty$ while $M$ remains fixed.
• New Approach: In order to characterize the situation where $M$, $N$ have the same order of magnitude, one might consider the limit $M, N \to \infty$, $M/N \to c$, $0 < c < \infty$.
Results obtained under this asymptotic limit turn out to be extremely accurate, even for reasonably low $M$, $N$.
Thank you for your attention!!!