1470 IEEE TRANSACTIONS ON SIGNAL PROCESSING, VOL. 60, NO. 3, MARCH 2012
diagonalization tool for adaptive signal processing for the following
reasons.
1) Singular values of the “diagonalized” covariance matrix $C^d_x$ in (1) are restricted to unity and, hence, do not contain information about the powers in the actual data channels.
2) Singular values of the “diagonalized” pseudocovariance $P^d_x$ in (1) pertain to those of $C_x^{-1/2} P_x C_x^{-T/2}$ and not to those of $P_x$ (see also (3)).
3) From (3), the SUT requires two SVDs, one for $C_x$ and one for $P_x$.
For univariate complex processes, we propose to address these issues
by exploiting the eigenstructure of their correlation matrices, in order to
introduce a new diagonalization procedure that i) requires only a single
SVD, on the pseudocovariance matrix; ii) preserves the original eigenvalues
of $E\{xx^H\}$ and $E\{xx^T\}$, which is beneficial when working
on real-world problems; and iii) helps to simplify, and to make more
intuitive, various statistical analyses and performance bounds in widely
linear signal processing.
It should be mentioned that an efficient technique to compute the
SUT was proposed in [9], referred to as the generalized uncorrelating
transform (GUT). The method finds directly the subspace of
the matrix pseudoproduct of the covariance and pseudocovariance matrices
and reduces the computational complexity to that of a single
SVD, thus resolving issue 3) above. Notice, however, that GUT keeps
the same eigenstructure as SUT and does not address issues 1) and 2)
above, which are crucial in the statistical analysis of adaptive
filtering algorithms.
A. The Approximate Uncorrelating Transform
Recall that for single-channel (univariate) data, the covariance and
pseudocovariance matrices have the form $C_x = E\{xx^H\}$ and $P_x = E\{xx^T\}$.
To show that the diagonalization of both the sample covariance
and pseudocovariance matrices³ (of size $L \times L$) requires only
one SVD, consider the diagonalized pseudocovariance matrix
$\hat{P}^d_x = U^H_p \hat{P}_x U^*_p$, where $U_p$ can be obtained from the Takagi
factorization $\hat{P}_x = U_p \Lambda_p U^T_p$. The quadratic form
$\hat{P}^d_x \hat{P}^{d*}_x$ is guaranteed to be diagonal, and can be
estimated from the data as
$$
\hat{P}^d_x \hat{P}^{d*}_x
= U^H_p \Big[\frac{1}{T}\sum_{k=1}^{T} x(t+k)\,x^T(t+k)\Big]
  \Big[\frac{1}{T}\sum_{l=1}^{T} x^*(t+l)\,x^H(t+l)\Big] U_p
$$
$$
= U^H_p \Big[\frac{1}{T^2}\sum_{k=1}^{T}\sum_{l=1}^{T}
  x(t+k)\,\underbrace{x^T(t+k)\,x^*(t+l)}_{\gamma_{kl}}\,x^H(t+l)\Big] U_p
$$
$$
\approx U^H_p \Big[\frac{1}{T^2}\sum_{k=1}^{T}\sum_{l=1}^{T}
  x(t+k)\,x^H(t+k)\,x(t+l)\,x^H(t+l)\Big] U_p
$$
$$
= U^H_p \Big[\frac{1}{T}\sum_{k=1}^{T} x(t+k)\,x^H(t+k)\Big]
  \Big[\frac{1}{T}\sum_{l=1}^{T} x(t+l)\,x^H(t+l)\Big] U_p
= \hat{C}^d_x \hat{C}^d_x. \quad (4)
$$
³An implicit assumption is second-order stationarity over $T$ samples.
Therefore, the diagonalization of the pseudocovariance matrix at the
same time implies an approximately diagonalized covariance matrix,
requiring only one SVD for their joint diagonalization. This is possible⁴
because for $k = l$ the scalar $\gamma_{kl}$ in (4) is real-valued, allowing us to
substitute $\gamma_{kl} = x^T(t+k)x^*(t+l) = x^H(t+k)x(t+l)$, whereas
for $k \neq l$ and $\forall k, l$, $\gamma_{kl} = x^T(t+k)x^*(t+l) \approx 0$, since $\hat{P}^d_x$ is
diagonal.
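The substitution above can be sanity-checked numerically. The following is a minimal numpy sketch (ours, not part of the paper; all variable names are our own) for the perfectly correlated case $\rho = 1$, where the imaginary part is an exact copy of the real part:

```python
import numpy as np

rng = np.random.default_rng(0)
T, L = 100, 8

# rho = 1: the imaginary part is a perfect copy of the real part,
# so every sample has the form x(t) = x_r(t) * (1 + 1j)
xr = rng.standard_normal((T, L))
x = xr * (1 + 1j)

# gamma_kl = x^T(t+k) x*(t+l) for a pair of lags k != l
k, l = 3, 7
gamma = x[k] @ x[l].conj()

# (1+1j)(1-1j) = 2, so gamma = 2 * sum(xr[k] * xr[l]) is purely real
print(abs(gamma.imag))  # ~0, up to floating-point error
```

The imaginary part vanishes identically here, which is exactly the mechanism that makes the approximation in (4) exact for strongly improper signals.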
To validate this approximation, recall that for $x = x_r + j x_i$, the ‘covariance’
term $xx^* = x_r^2 + x_i^2$ is independent of the improperness of
the signal, the ‘pseudocovariance’ term⁵ is $xx = x_r^2 - x_i^2 + j2x_r x_i$,
and the correlation coefficient between the real and imaginary part is
given by $\rho = E\{x_r x_i\}/(\sigma_{x_r}\sigma_{x_i})$, where $\sigma_{x_r}$ and $\sigma_{x_i}$ denote the respective
standard deviations. Consider now the following two special cases for improper
signals:
1) For $\rho = 1$, that is, a perfect correlation between the real $x_r$ and
imaginary $x_i$ parts of $x$ (a strongly improper signal), the term
$\gamma_{kl} = x^T(t+k)x^*(t+l) = \sum_{\ell=1}^{L} x_\ell(t+k)x^*_\ell(t+l)$ in (4) is
a real scalar and the approximation in (4) holds, because
$$
E\{x(k)x^*(l)\} = E\{x_r(k)x_r(l)\} + E\{x_i(k)x_i(l)\}
 + j\big(E\{x_r(l)x_i(k)\} - E\{x_r(k)x_i(l)\}\big)
$$
$$
= E\{x_r(k)x_r(l)\} + E\{x_i(k)x_i(l)\}
 + j\big(E\{x_i(l)x_i(k)\} - E\{x_i(k)x_i(l)\}\big)
$$
$$
= E\{x_r(k)x_r(l)\} + E\{x_i(k)x_i(l)\} \in \mathbb{R}. \quad (5)
$$
2) For $\rho = 0$, that is, for a doubly white signal, from (5) we have
$E\{x_r(l)x_i(k)\} = E\{x_r(k)x_i(l)\} \approx 0$ and the approximation in
(4) holds; similarly for either $x_i = 0$ or $x_r = 0$, where $\gamma_{kl}$ is
real.
Without loss of generality, the approximation error in a general case can
be expressed as $\epsilon = \gamma_{kl}\sqrt{1-\rho^2}$ (see the Appendix for more details).
Observe that $\epsilon$ can be assumed to be negligible when compared with
the diagonal elements of the correlation matrices; the approximation in
(4) thus holds, which is confirmed in Section IV. Notice that for $\rho = 1$,
the approximation becomes exact.
Remark 1: We have shown that both the pseudocovariance matrix $P_x$
and the covariance matrix $C_x$ can be diagonalized by the same unitary
matrix $Q$, obtained from the singular value decomposition⁶ of
$P_x = Q\Lambda_p Q^T$. Thus, the proposed approximate uncorrelating transform
(AUT) has the form
$$
C_x \approx Q\Lambda_c Q^H \qquad P_x = Q\Lambda_p Q^T. \quad (6)
$$
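The AUT in (6) can be illustrated numerically. The sketch below is ours, not the authors' code; `takagi` is a hypothetical helper built on the standard SVD-based construction of the Takagi factorization, and it assumes distinct singular values. It factorizes the sample pseudocovariance of an improper signal and checks that the resulting unitary matrix also approximately diagonalizes the sample covariance:

```python
import numpy as np

rng = np.random.default_rng(1)

def takagi(P):
    """Takagi factorization P = Up @ diag(s) @ Up.T of a complex symmetric P.
    Via the SVD P = U S V^H: since P = P^T, V* = U D for a diagonal
    unitary D, so Up = U D^{1/2}. Assumes distinct singular values."""
    U, s, Vh = np.linalg.svd(P)
    d = np.diag(U.conj().T @ Vh.T)  # diagonal of D = U^H V*
    Up = U * np.sqrt(d)             # scale each column by the principal root
    return Up, s

# improper univariate signal: correlated real/imaginary parts, MA-filtered
T, L, rho = 20000, 10, 0.95
x1, x2 = rng.standard_normal(T), rng.standard_normal(T)
raw = x1 + 1j * (rho * x1 + np.sqrt(1 - rho**2) * x2)
h = rng.uniform(-0.5, 0.5, 5)
x = np.convolve(raw, h, mode="valid")

# sample covariance C = E{x x^H} and pseudocovariance P = E{x x^T}
X = np.lib.stride_tricks.sliding_window_view(x, L)
Cx = X.T @ X.conj() / len(X)
Px = X.T @ X / len(X)

Up, s = takagi(Px)                  # one SVD on the pseudocovariance only
Cd = Up.conj().T @ Cx @ Up          # should be approximately diagonal
off = Cd - np.diag(np.diag(Cd))
ratio = np.linalg.norm(off) / np.linalg.norm(np.diag(Cd))
print(f"off/diagonal power ratio: {ratio:.4f}")
```

For this strongly improper signal the off-diagonal residual is small, in line with the claim that a single SVD of $\hat{P}_x$ suffices to jointly diagonalize both sample matrices.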
Remark 2: The proposed diagonalization procedure applies to correlation
matrices of univariate processes and not to those of multivariate
data. To show this, consider a $p$-variate signal represented by the $p \times T$
matrix $X = [x_1, \ldots, x_p]$, whose $p \times p$ pseudocovariance matrix is
$P_X = E\{XX^T\}$. Diagonalizing such a matrix implies that the off-diagonal
terms $[\hat{P}_X]_{kl} = \frac{1}{T}\sum_t x_k(t)x_l(t) = 0$, $\forall k \neq l$; however, this
does not mean that all the individual products $x_k(t+i)x_l(t+j) = 0$.
These constitute the off-diagonal elements of the $L \times L$ matrix $X^T X$,
and, hence, the condition $X^T X^* \approx X^H X$ does not hold for multivariate
data, and we require the SUT for such diagonalizations.
⁴These conditions also apply for negative indices, as the autocorrelation function
is symmetric.
⁵For improper (second-order noncircular) signals, either their real and imaginary
parts are correlated, or they are uncorrelated but have different powers.
⁶The SVD of a symmetric matrix, such as $P_x$, is known as the Takagi/Autonne
factorization [10], and it ensures that the singular values are sorted
in descending order.
III. SOME APPLICATIONS OF AUT DIAGONALIZATION
We next briefly illustrate the usefulness of the proposed diagonalization
procedure in some standard optimization problems in statistical
signal processing of noncircular complex random variables, that
is, those with rotation-dependent distributions.
A. Convergence of Widely Linear Adaptive Filters
Consider the augmented complex least mean square (ACLMS) algorithm,
whose output is governed by the widely linear model $y(t) = x^H(t)h(t) + x^T(t)g(t)$,
and whose update is given by [1], [5]
$$
w^a(t+1) = w^a(t) + \mu\, x^a(t)\, e_{wl}(t) \quad (7)
$$
where the filter coefficient vector $w^a(t) = [h^T(t)\; g^T(t)]^T$, the augmented
input $x^a(t) = [x^T(t)\; x^H(t)]^T$, the output error
$e_{wl}(t) = d(t) - x^{aH}(t)w^a(t)$, the desired signal
$d(t) = x^{aH}(t)w^a_o(t) + q(t)$, with $q(t)$ being doubly white Gaussian
noise uncorrelated with $x(t)$, $w^a_o(t)$ the optimum adaptive coefficient
vector that minimizes the cost function $J(t) = |e_{wl}(t)|^2$, and $\mu$ the
step-size. The evolution of the weight error vector
$v(t) = w^a(t) - w^a_o(t)$ is given by
$$
E\{v(t+1)\} = (I - \mu C^a_x)\, E\{v(t)\} \quad (8)
$$
where the augmented covariance matrix
$$
C^a_x = \begin{bmatrix} C_x & P_x \\ P^*_x & C^*_x \end{bmatrix}
$$
comprises both the covariance matrix $C_x$ and the pseudocovariance
matrix $P_x$. When analyzing “convergence in the mean,” we need to diagonalize
$C^a_x = U^a \Lambda^a U^{aH}$ in order to determine the step-size range
as a function of the eigenvalues of $C^a_x$; however, doing this directly does
not allow us to examine how the individual parameters of the covariance
matrices $C_x$ and $P_x$ affect the convergence. On the other hand,
the proposed diagonalization in (6) allows us to express $C^a_x$ in terms
of the eigenvalues of $C_x$ and $P_x$, that is,
$$
C^a_x = \begin{bmatrix} Q & 0 \\ 0 & Q^* \end{bmatrix}
\begin{bmatrix} \Lambda_c & \Lambda_p \\ \Lambda_p & \Lambda_c \end{bmatrix}
\begin{bmatrix} Q^H & 0 \\ 0 & Q^T \end{bmatrix}. \quad (9)
$$
The middle term above can be diagonalized as [5]
$$
\begin{bmatrix} \Lambda_c & \Lambda_p \\ \Lambda_p & \Lambda_c \end{bmatrix}
= \frac{1}{2}
\begin{bmatrix} I & -I \\ I & I \end{bmatrix}
\begin{bmatrix} \Lambda_c + \Lambda_p & 0 \\ 0 & \Lambda_c - \Lambda_p \end{bmatrix}
\begin{bmatrix} I & I \\ -I & I \end{bmatrix}
$$
giving a simple and intuitive demonstration that the convergence of
ACLMS is governed by the condition number of $C^a_x$, with the mismatch
between covariance and pseudocovariance given by
$$
\kappa(C^a_x) = \frac{\lambda_{\max}(c) + \lambda_{\max}(p)}{\lambda_{\min}(c) - \lambda_{\min}(p)}. \quad (10)
$$
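The identity (10) is straightforward to verify numerically. The small numpy check below is ours (the eigenvalues and $Q$ are arbitrary choices); it constructs $C_x$ and $P_x$ exactly in the form (6), with both spectra sorted in descending order so that the extreme eigenvalues pair up:

```python
import numpy as np

rng = np.random.default_rng(2)
L = 3
# unitary Q from the QR decomposition of a random complex matrix
Q, _ = np.linalg.qr(rng.standard_normal((L, L)) + 1j * rng.standard_normal((L, L)))
lam_c = np.array([3.0, 2.0, 1.5])   # eigenvalues of Cx (descending)
lam_p = np.array([1.0, 0.5, 0.2])   # singular values of Px (descending)

Cx = Q @ np.diag(lam_c) @ Q.conj().T   # Cx = Q Lambda_c Q^H
Px = Q @ np.diag(lam_p) @ Q.T          # Px = Q Lambda_p Q^T
Ca = np.block([[Cx, Px], [Px.conj(), Cx.conj()]])

# the eigenvalues of Ca are lam_c[i] +/- lam_p[i], hence (10):
kappa_aut = (lam_c.max() + lam_p.max()) / (lam_c.min() - lam_p.min())
kappa_direct = np.linalg.cond(Ca)      # 2-norm condition number
print(kappa_aut, kappa_direct)         # both ~3.077
```

Here $\kappa = (3 + 1)/(1.5 - 0.2) \approx 3.077$, matching the condition number of the full augmented matrix to machine precision.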
Remark 3: The proposed diagonalization procedure allows us to express
the condition number in terms of the true eigenvalues of $C_x$ and
$P_x$, instead of those of $C^a_x$, which are less straightforward to interpret
physically in terms of, e.g., the degree of circularity or the signal power.
Remark 4: When using the SUT for the same analysis, the imposed constraint
$\lambda_{\max}(c) = \lambda_{\min}(c) = 1$ might not reflect the true dynamics
of ACLMS convergence [5]. This is because the SUT-based condition
number $\kappa(C^a_x)$ is not computed from the eigenvalues of the covariance
matrices of the “actual” data, but from the eigenvalues of the
covariance matrices of the “whitened” data.
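A minimal simulation of the ACLMS recursion (our sketch, not the authors' code; the step-size, noise level, and system vector `w_opt` are arbitrary choices within the stability range implied by the augmented covariance eigenvalues) illustrates the mean convergence described above:

```python
import numpy as np

rng = np.random.default_rng(3)
L, n_iter, mu, rho = 4, 20000, 0.01, 0.9

def improper(n):
    """Zero-mean improper samples with correlated real/imaginary parts."""
    a, b = rng.standard_normal(n), rng.standard_normal(n)
    return a + 1j * (rho * a + np.sqrt(1 - rho**2) * b)

w_opt = rng.standard_normal(2 * L) + 1j * rng.standard_normal(2 * L)  # optimum
w = np.zeros(2 * L, dtype=complex)

for _ in range(n_iter):
    x = improper(L)
    xa = np.concatenate([x, x.conj()])   # augmented input [x^T, x^H]^T
    d = xa.conj() @ w_opt + 1e-2 * (rng.standard_normal() + 1j * rng.standard_normal())
    e = d - xa.conj() @ w                # output error d(t) - x_a^H(t) w^a(t)
    w = w + mu * xa * e                  # ACLMS weight update

rel_err = np.linalg.norm(w - w_opt) / np.linalg.norm(w_opt)
print(f"relative weight error after {n_iter} updates: {rel_err:.4f}")
```

The slowest mode decays at a rate set by the smallest eigenvalue of the augmented covariance, consistent with the condition-number argument in (10).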
B. Performance Advantage of Widely Linear Models
The performance advantage of the widely linear model $y = h^H x + g^H x^*$
over the strictly linear model $y = h^H x$ when modeling improper
processes is given by [11, eq. (2.10)]
$$
\Delta\epsilon^2 = \epsilon^2_l - \epsilon^2_{wl}
= \big(p^*_{yx} - P^*_x C^{-1}_x p_{yx^*}\big)^H
\big(C^*_x - P^*_x C^{-1}_x P_x\big)^{-1}
\big(p^*_{yx} - P^*_x C^{-1}_x p_{yx^*}\big) \quad (11)
$$
and evaluates the difference between the corresponding mean squared
errors of the strictly linear model (denoted by $\epsilon^2_l$) and the widely linear
model (denoted by $\epsilon^2_{wl}$).
Expression (11) is difficult to interpret; at best, it can be simplified
to $\epsilon^2_l = w^H \Pi^{-1} w + \epsilon^2_{wl}$, where $w = p^*_{yx} - P^*_x C^{-1}_x p_{yx^*}$ and
$\Pi = C^*_x - P^*_x C^{-1}_x P_x$ is the Schur complement of $C^a_x$. Since $\Pi$ is a
positive-definite matrix, this guarantees that the widely linear model
always performs no worse than the strictly linear model. We can now
apply the proposed diagonalization in (6) to obtain
$$
\Delta\epsilon^2 = w^H Q^* \big(\Lambda_c - \Lambda_p \Lambda^{-1}_c \Lambda_p\big)^{-1} Q^T w
= \tilde{w}^H \big(\Lambda_c - \Lambda_p \Lambda^{-1}_c \Lambda_p\big)^{-1} \tilde{w},
\qquad \tilde{w} = Q^T w
$$
so that, per eigenmode,
$$
\Delta\epsilon^2_i = |\tilde{w}_i|^2 \Big(\lambda_{c,i} - \frac{\lambda^2_{p,i}}{\lambda_{c,i}}\Big)^{-1}
= |\tilde{w}_i|^2\, \frac{\lambda_{c,i}}{\lambda^2_{c,i} - \lambda^2_{p,i}}. \quad (12)
$$
The result in (11) has now been simplified to (12), providing physical
meaning to the analysis. For instance, it is now clear that as the degree
of noncircularity increases (i.e., as $\lambda_{p,i}$ increases), $\Delta\epsilon^2_i$ increases as
well, indicating a larger performance advantage of the widely linear
over the strictly linear model.
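The equivalence of (11) and (12) under (6) can be checked directly. In this numpy sketch (ours; the eigenvalues, $Q$, and the vector $w$ are arbitrary choices with $\lambda_{c,i} > \lambda_{p,i}$ so that the Schur complement is positive definite), the AUT holds exactly by construction:

```python
import numpy as np

rng = np.random.default_rng(4)
L = 3
Q, _ = np.linalg.qr(rng.standard_normal((L, L)) + 1j * rng.standard_normal((L, L)))
lam_c = np.array([3.0, 2.0, 1.5])
lam_p = np.array([1.0, 0.5, 0.2])
Cx = Q @ np.diag(lam_c) @ Q.conj().T
Px = Q @ np.diag(lam_p) @ Q.T
w = rng.standard_normal(L) + 1j * rng.standard_normal(L)

# direct form: w^H Pi^{-1} w with the Schur complement Pi = Cx* - Px* Cx^{-1} Px
Pi = Cx.conj() - Px.conj() @ np.linalg.inv(Cx) @ Px
direct = (w.conj() @ np.linalg.inv(Pi) @ w).real

# eigenvalue form (12), with w_tilde = Q^T w
wt = Q.T @ w
eig_form = np.sum(np.abs(wt) ** 2 * lam_c / (lam_c ** 2 - lam_p ** 2))
print(direct, eig_form)
```

Both evaluations agree, and increasing any `lam_p[i]` toward `lam_c[i]` visibly inflates the corresponding term of the sum, matching the interpretation of (12).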
C. Numerical Stability and Complexity Reduction
In the context of single-channel modelling, inversion of correlation
matrices is performed routinely, but can be very challenging due to a
possibly high condition number or rank deficiency. To provide insight
into the sensitivity of such inversions, and to retain control over numerical
stability, it is advantageous to factorize such matrices into
their diagonal forms. We now illustrate the usefulness of the proposed
AUT relationships in (6) when inverting the augmented covariance matrix.
The standard inverse of the augmented covariance matrix is given by [7]
$$
\begin{bmatrix} C_x & P_x \\ P^*_x & C^*_x \end{bmatrix}^{-1}
= \begin{bmatrix} A & D \\ D^* & A^* \end{bmatrix} \quad (13)
$$
where $A = \big[C_x - P_x C^{*-1}_x P^*_x\big]^{-1}$ and $D = -A P_x C^{*-1}_x$. To deal with
numerical instability, we can perform a further factorization (e.g., as in
square-root Kalman filters), as
$$
C^a_x = \begin{bmatrix} Q & 0 \\ 0 & Q^* \end{bmatrix}
\begin{bmatrix} \Lambda_c & \Lambda_p \\ \Lambda_p & \Lambda_c \end{bmatrix}
\begin{bmatrix} Q^H & 0 \\ 0 & Q^T \end{bmatrix}
\;\Rightarrow\;
(C^a_x)^{-1} = \begin{bmatrix} Q & 0 \\ 0 & Q^* \end{bmatrix}
\begin{bmatrix} \Lambda_1 & \Lambda_2 \\ \Lambda_2 & \Lambda_1 \end{bmatrix}
\begin{bmatrix} Q^H & 0 \\ 0 & Q^T \end{bmatrix} \quad (14)
$$
where $\Lambda_1 = (\Lambda_c - \Lambda_p \Lambda^{-1}_c \Lambda_p)^{-1}$ and
$\Lambda_2 = -\Lambda_1 \Lambda_p \Lambda^{-1}_c$. Therefore, to guarantee
numerical stability of the inverse, we need to ensure that all diagonal terms
of $\Lambda_1$ and $\Lambda_2$ are nonzero, leading to a full-rank matrix. Since the
submatrices $\Lambda_1$ and $\Lambda_2$ are not only diagonal but also real-valued, this condition
is met.
is met. The additional beneﬁt of this result is its generic nature, as it can
be applied in many applications requiring an inverse of the correlation
matrices such as in Wiener ﬁlters or square root Kalman ﬁlters.
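The block inverse (13) and its diagonalized counterpart can be verified numerically. The sketch below is ours ($Q$ and the eigenvalues are arbitrary, chosen so that the augmented covariance is positive definite); with the AUT holding exactly by construction, $A$ and $D$ reduce to elementwise operations on the eigenvalues:

```python
import numpy as np

rng = np.random.default_rng(5)
L = 3
Q, _ = np.linalg.qr(rng.standard_normal((L, L)) + 1j * rng.standard_normal((L, L)))
lam_c = np.array([3.0, 2.0, 1.5])
lam_p = np.array([1.0, 0.5, 0.2])
Cx = Q @ np.diag(lam_c) @ Q.conj().T
Px = Q @ np.diag(lam_p) @ Q.T

# block inverse (13): A = [Cx - Px Cx*^{-1} Px*]^{-1}, D = -A Px Cx*^{-1}
A = np.linalg.inv(Cx - Px @ np.linalg.inv(Cx.conj()) @ Px.conj())
D = -A @ Px @ np.linalg.inv(Cx.conj())
Ca = np.block([[Cx, Px], [Px.conj(), Cx.conj()]])
Ca_inv = np.block([[A, D], [D.conj(), A.conj()]])

# with the AUT, A and D reduce to *diagonal* operations on the eigenvalues
lam1 = 1.0 / (lam_c - lam_p ** 2 / lam_c)   # (Lc - Lp Lc^{-1} Lp)^{-1}
lam2 = -lam1 * lam_p / lam_c                # -Lambda_1 Lp Lc^{-1}
A_aut = Q @ np.diag(lam1) @ Q.conj().T
D_aut = Q @ np.diag(lam2) @ Q.T
```

The elementwise forms `lam1` and `lam2` replace the matrix inversions inside (13), which is the source of both the stability insight and the complexity reduction discussed in the text.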
Fig. 1. Performance of the approximate diagonalization in (6) in terms of the percentage error. Left: for synthetic data, as a function of the filter length $L$ and the circularity coefficient $\rho$. Right: for the “low”, “medium”, and “high” dynamics of a real-world 2-D wind signal.

The proposed approximate diagonalization also reduces the computational
complexity of the inversion of $C^a_x$. The usual tradeoff between
reducing the computational complexity of an algorithm and its accuracy
applies, i.e., a decrease in the computational complexity leads
to a less accurate Wiener solution, yet a reasonable approximation,
owing to the negligible error (see Section IV). Also, notice
from (17) that when there is perfect correlation ($\rho = 1$) between
the real and imaginary part of a complex number, the approximation
of the widely linear Wiener solution by the proposed AUT technique
becomes exact.
D. Extension to Widely Linear Modeling in the Quaternion Domain
Augmented quaternion statistics employ the covariance matrix $C_x$
and the complementary covariance matrices⁷ $C^\imath_x = E\{xx^{\imath H}\}$,
$C^\jmath_x = E\{xx^{\jmath H}\}$, and $C^\kappa_x = E\{xx^{\kappa H}\}$ [12]. Using (4), it can be shown that
the corresponding cross-term involves $x^H x$ and its involutions
$(x^H x)^\imath$, $(x^H x)^\jmath$, and $(x^H x)^\kappa$, each of which either vanishes or
is real-valued. Following on the proposed result for complex data, one
SVD is adequate to diagonalize all the quaternion-valued correlation
matrices,⁸ which helps in the analysis of quaternion-valued adaptive
filters [13].
IV. SIMULATIONS
The performance index used was the ratio of the powers of the off-diagonal,
$c_{ij}$, versus diagonal, $c_{ii}$, elements of the correlation matrix,
defined as
$$
\eta^2 = \frac{E\{|c_{ij}|^2\}}{E\{c^2_{ii}\}} \times 100\,[\%], \quad \forall i \neq j
$$
and the performance was assessed against: i) the size $L$ of the correlation
matrices and ii) the degree of circularity
$$
\rho = \big|E\{x^2\}\big| \,\big/\, E\{|x|^2\} \quad (15)
$$
where $\rho = 1$ corresponds to a highly second-order noncircular (improper)
signal, whereas $\rho = 0$ corresponds to a circular (proper) signal.
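The degree of circularity in (15) can be estimated directly from samples. A quick numpy check (ours, not from the paper) uses the generation model of the Appendix, for which the sample estimate of (15) should recover the design correlation coefficient $\rho$:

```python
import numpy as np

rng = np.random.default_rng(6)
N, rho = 200_000, 0.7
x1, x2 = rng.standard_normal(N), rng.standard_normal(N)
x = x1 + 1j * (rho * x1 + np.sqrt(1 - rho**2) * x2)   # appendix model (16)

# degree of circularity (15): |E{x^2}| / E{|x|^2}
circ = np.abs(np.mean(x ** 2)) / np.mean(np.abs(x) ** 2)
print(f"estimated degree of circularity: {circ:.3f}")  # ~0.70
```

For this model $E\{x^2\} = 2j\rho$ and $E\{|x|^2\} = 2$, so the ratio equals $\rho$ in the limit of many samples.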
In the first set of simulations, 100 independent trials were averaged
for an MA synthetic signal with a varying degree of noncircularity. It
was generated by filtering a complex doubly white circular or noncircular
Gaussian signal with a moving average filter whose coefficients
were drawn from a uniform distribution within $[-0.5, 0.5]$. The left-hand
⁷The quaternion involution operation $(\cdot)^\eta$, $\forall \eta \in \{\imath, \jmath, \kappa\}$, corresponds to keeping
the real and the $\eta$ imaginary part intact and conjugating the two other imaginary
parts; see [12] for more detail.
⁸Notice that real and complex numbers are special instances of quaternions,
and as such this generic quaternion result decomposes into the complex solution,
provided the same two imaginary parts of the quaternion variables vanish.
plot in Fig. 1 illustrates the excellent performance of the proposed approximation;
the diagonalization error had a very small value of 1.2%
even for a large filter length of $L = 100$ and an only moderately noncircular
signal. The diagonalization error stems from the imaginary
part of the approximation of the term $\gamma_{kl}$ in (4), i.e.,
$\Im\big\{\sum_{\ell=1}^{L} x_\ell(t+k)x^*_\ell(t+l)\big\}$, which explains the behavior in Fig. 1 for large $L$. The
Appendix shows that $\epsilon$ is a decreasing function of the circularity coefficient
$\rho$, that is, $\epsilon \propto \sqrt{1-\rho^2}$.
The right-hand plot in Fig. 1 illustrates the performance of the proposed
diagonalization method for a two-dimensional wind signal $w$,
where the speed $v$ and direction $\varphi$ were measured at 50 Hz using a
2-D ultrasonic anemometer, and the signal was made complex-valued
as $w = v e^{j\varphi}$. The degree of noncircularity of the “high” wind regime
was the highest ($\rho = 0.95$), followed by the “medium” ($\rho = 0.67$)
and “low” ($\rho = 0.54$) regimes. Observe a near-perfect approximate
diagonalization for small sizes of the correlation matrices ($L \leq 20$) and an
excellent approximation for large sizes ($L \to 100$). The approximation
was better for signals with a higher degree of noncircularity, conforming
with the analysis.
V. CONCLUSION
We have demonstrated that a single singular value decomposition of
the pseudocovariance matrix is sufficient to simultaneously diagonalize
both the covariance and pseudocovariance matrices of single-channel
noncircular complex data. This has been achieved by examining the
structure of the quadratic form of the sample pseudocorrelation matrix.
The proposed approximate diagonalization has been shown to preserve
the original eigenvalues of the covariance and pseudocovariance
matrices, and to make several standard analyses of performance bounds
in widely linear modelling simpler and more intuitive. Case studies in
the context of widely linear adaptive filtering, together with illustrative
simulations, support the approach.
APPENDIX
Without loss of generality, assume that the real and imaginary parts
of a complex random variable $x = x_r + j x_i$ are generated from zero-mean,
unit-variance, uncorrelated variables $x_1$ and $x_2$ as
$$
x_r(t) = x_1(t) \qquad x_i(t) = \rho x_1(t) + x_2(t)\sqrt{1-\rho^2} \quad (16)
$$
where the correlation coefficient $\rho = E\{x_r x_i\}/(\sigma_{x_r}\sigma_{x_i})$ allows us to control the
correlation between $x_r$ and $x_i$, from uncorrelated ($\rho = 0$) to fully correlated ($\rho = 1$).