ESTIMATION OF HYPERSPECTRAL COVARIANCE MATRICES
Avishai Ben-David (1) and Charles E. Davidson (2)
(1) Edgewood Chemical Biological Center, USA. (2) Science and Technology Corporation, USA.
Outline
• Why are covariance matrices important?
• What is the difficulty in estimation?
• Our approach
• Example for hyperspectral detection
Why covariance matrices are important
• The covariance matrix C is the engine of most multivariate detection algorithms
• Examples:
  Matched filter: score = $\alpha^T C^{-1} t$
  Anomaly detector: score = $\alpha^T C^{-1} \alpha$
  where $\alpha$ = measurement vector and t = target vector
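A minimal sketch of the two scores in Python with NumPy. The function names and the 3-band numbers are hypothetical and chosen only for illustration, and the measurement vector is assumed to be mean-removed already.

```python
import numpy as np

def matched_filter_score(alpha, C, t):
    """Matched-filter score alpha^T C^{-1} t for one measurement vector."""
    # Solve C x = t rather than forming the explicit inverse of C.
    return float(alpha @ np.linalg.solve(C, t))

def anomaly_score(alpha, C):
    """Anomaly-detector score alpha^T C^{-1} alpha."""
    return float(alpha @ np.linalg.solve(C, alpha))

# Hypothetical 3-band example (not from the slides).
C = np.diag([1.0, 2.0, 4.0])        # covariance matrix
t = np.array([1.0, 0.0, 2.0])       # target vector
alpha = np.array([2.0, 2.0, 4.0])   # mean-removed measurement vector

mf = matched_filter_score(alpha, C, t)
ad = anomaly_score(alpha, C)
```

Both scores depend on the data only through $C^{-1}$, which is why errors in the small eigenvalues of C matter most.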
How do we compute C?
• z is a measurement vector with p spectral bands (i.e., z is a p-by-1 vector), measured when the target is absent (i.e., under the $H_0$ hypothesis)
• We acquire n z-vectors, construct a p-by-n matrix Z, and center it (subtract the mean): $Z \leftarrow Z - E(Z)$
• $C = \mathrm{cov}(Z) = E(ZZ^T) = U \Lambda_x U^T$ (C follows Wishart statistics), where $\Lambda_x$ is the estimated (sampled) eigenvalue matrix and U is the estimated eigenvector matrix, both obtained from the SVD.
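The steps on this slide can be sketched as follows. This is an illustrative reading, not the authors' code: the function name is hypothetical, the 1/n normalization is an assumption (1/(n-1) is also common after mean subtraction), and the random data stand in for real measurements.

```python
import numpy as np

def sample_covariance_eig(Z):
    """Return (C, U, lam) for a p-by-n data matrix Z whose columns are measurements."""
    p, n = Z.shape
    # Center the data: subtract the mean spectrum from every column.
    Zc = Z - Z.mean(axis=1, keepdims=True)
    # Sampled covariance; the 1/n normalization is an assumption of this sketch.
    C = Zc @ Zc.T / n
    # SVD of the centered data: Zc = U S V^T, hence C = U (S^2 / n) U^T.
    U, s, _ = np.linalg.svd(Zc, full_matrices=False)
    lam = s**2 / n  # sampled eigenvalues, sorted in descending order
    return C, U, lam

rng = np.random.default_rng(0)
Z = rng.standard_normal((5, 50))  # hypothetical data: p = 5 bands, n = 50 samples
C, U, lam = sample_covariance_eig(Z)
```

Taking the SVD of the centered data rather than an eigendecomposition of C gives the eigenvalues already sorted from largest to smallest.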
What is the difficulty in estimation
• The problem is that there are not enough measured z-vectors (n is too small)
• Example: sampled eigenvalues from a sampled C (average of 1000 matrices)
• 5 spectral bands (p = 5), i.e., C is a 5-by-5 matrix (very small), with true eigenvalues [1 2 3 4 5]
  (a) n = 50 measurements, n/p = 10 (e.g., p = 150, typical in hyperspectral, requires n = 1,500): sampled eigenvalues = [0.9 1.8 2.8 4.0 5.6]
  (b) n = 10 measurements, n/p = 2 (e.g., the RMB rule in radar): sampled eigenvalues = [0.4 1.1 2.1 4.0 7.3]
  (Reed, Mallett & Brennan, 1974: at n/p = 2 the average SNR loss for the matched filter is a factor of 2)
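The spreading of the sampled eigenvalues can be reproduced with a small simulation. This is a sketch under stated assumptions, not the authors' experiment: it assumes zero-mean Gaussian data, a diagonal population covariance with the eigenvalues from the slide, and a 1/n normalization. The function name, seed, and trial count are hypothetical.

```python
import numpy as np

def mean_sampled_eigenvalues(true_eigs, n, trials=1000, seed=0):
    """Average the sorted eigenvalues of `trials` sampled covariance matrices."""
    rng = np.random.default_rng(seed)
    true_eigs = np.asarray(true_eigs, dtype=float)
    p = true_eigs.size
    total = np.zeros(p)
    for _ in range(trials):
        # Row i of Z has variance true_eigs[i] (diagonal population covariance).
        Z = rng.standard_normal((p, n)) * np.sqrt(true_eigs)[:, None]
        C = Z @ Z.T / n
        total += np.linalg.eigvalsh(C)  # eigenvalues in ascending order
    return total / trials

avg50 = mean_sampled_eigenvalues([1, 2, 3, 4, 5], n=50)  # n/p = 10
avg10 = mean_sampled_eigenvalues([1, 2, 3, 4, 5], n=10)  # n/p = 2
```

In both cases the small eigenvalues are underestimated and the large ones overestimated, while the sum stays close to 15; the distortion grows as n/p shrinks.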
Our solution (general overview)
• Objective: find a simple transformation from the sampled eigenvalues $\Lambda_x$ to the population (true) eigenvalues $\Lambda_\Omega$: $\Lambda = f(\Lambda_x) \approx \Lambda_\Omega$
• The improved covariance matrix is computed as $C = U \Lambda U^T$: we replace the sampled eigenvalues $\Lambda_x$ with the improved estimate $\Lambda$ and keep the sampled eigenvectors U (for lack of knowledge of the population eigenvectors).
• Our solution involves two steps. The 1st step is interpreted as adding energy spectrally. The 2nd step balances the energy between two big blocks: the small-eigenvalue and large-eigenvalue regions. Thus, we “redistribute” energy among the eigenvalues
• We use the theory of the statistical distribution of eigenvalues of Wishart matrices, bounds on the magnitude of eigenvalues, and an energy-conservation constraint.
• We view the sampled eigenvalues “as if” they can be represented by a diagonal of p block matrices, each obeying the Marcenko-Pastur law.
• Each sampled eigenvalue is treated “as if” it were sampled at the mode (i.e., the highest-probability location).
• Sampled eigenvalues are “shifted” (1st step) toward the population eigenvalues.
• We impose energy conservation (2nd step) on the solution, because the sum of the eigenvalues (the trace) is unbiased, i.e., $\mathrm{trace}(\Lambda_x) = \mathrm{trace}(\Lambda)$. The trace is the signal “energy” (total variation).
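Our solution (detailed view)
How simple is it? Multiplication of 3 matrices:
  $\Lambda = f(\Lambda_x, n) = \Lambda_x F E$
• $\Lambda_x$ is the sampled eigenvalue matrix, $\Lambda_x = \mathrm{eig}(C)$, with diagonal entries $s_x(i)$
• Shift the sampled eigenvalues based on the mode, with matrix F and multiplicity $p_i$:
  $F = \mathrm{diag}(1/F_{mode})$; $F_{mode}(i) = \frac{(1 - p_i/n)^2}{1 + p_i/n}$
• Balance the energy with matrix E:
  $E = \begin{pmatrix} E_{large} I_t & 0 \\ 0 & E_{small} I_{p-t} \end{pmatrix}$
  $E_{large} = \frac{\sum_{i=1}^{t} s_x(i)}{\sum_{i=1}^{t} s_x(i)/F_{mode}(i)}$; $E_{small} = \frac{\sum_{i=t+1}^{p} s_x(i)}{\sum_{i=t+1}^{p} s_x(i)/F_{mode}(i)}$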
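A sketch of the three-matrix product acting on the diagonal entries. It assumes the reading $F = \mathrm{diag}(1/F_{mode})$ with $F_{mode}(i) = (1 - p_i/n)^2 / (1 + p_i/n)$, and a block-wise E that restores the original energy of the first t (large) and the remaining p - t (small) eigenvalues separately. The function names are hypothetical, and the multiplicities `p_i` and the split index `t` in the example are made-up inputs; the slides do not state how t is chosen.

```python
import numpy as np

def f_mode(k):
    """Mode factor (1 - k)^2 / (1 + k) for ratio k < 1."""
    return (1.0 - k) ** 2 / (1.0 + k)

def improve_eigenvalues(s_x, p_i, n, t):
    """Apply Lambda = Lambda_x F E to eigenvalues s_x sorted in descending order.

    p_i : apparent multiplicity of each eigenvalue (supplied by the caller)
    t   : number of eigenvalues placed in the 'large' block (assumed given)
    """
    s_x = np.asarray(s_x, dtype=float)
    p_i = np.asarray(p_i, dtype=float)
    F = 1.0 / f_mode(p_i / n)   # step 1: diagonal of the shift matrix F
    shifted = s_x * F
    E = np.empty_like(s_x)      # step 2: diagonal of the balance matrix E
    E[:t] = s_x[:t].sum() / shifted[:t].sum()   # E_large
    E[t:] = s_x[t:].sum() / shifted[t:].sum()   # E_small
    return s_x * F * E

s_x = np.array([7.3, 4.0, 2.1, 1.1, 0.4])  # sampled eigenvalues from the n = 10 example
p_i = np.array([1, 2, 2, 3, 3])            # hypothetical multiplicities
lam_new = improve_eigenvalues(s_x, p_i, n=10, t=2)
```

Because E rescales each block back to its original sum, the trace of the result equals the trace of the sampled eigenvalues.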
Regularization aspect of the solution (enhanced stability)
• The solution is a nonlinear transformation of the sampled eigenvalues $\Lambda_x$
• We can also write the solution in the framework of traditional regularization as $\Lambda = \Lambda_x + \Delta$; $\Delta = \Lambda_x (F E - I)$
• Our correction is potentially different for each eigenvalue (it is a single offset in traditional regularization).
• With our method the condition number of $\Lambda$ improves (decreases) because the magnitudes of the small sampled eigenvalues tend to increase. Thus, $\mathrm{cond}(\Lambda) < \mathrm{cond}(\Lambda_x)$
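To make the per-eigenvalue offset and the condition-number claim concrete, the sketch below compares the sampled eigenvalues of the n = 10 example with a corrected set. As a stand-in for the method's output it uses the population eigenvalues [5 4 3 2 1], i.e., the ideal outcome, which is an assumption made purely for illustration. The function name is hypothetical.

```python
def offsets_and_condition(s_x, s_new):
    """Return per-eigenvalue offsets and the condition numbers before and after."""
    offsets = [new - old for old, new in zip(s_x, s_new)]
    cond_x = max(s_x) / min(s_x)        # condition number of the sampled eigenvalues
    cond_new = max(s_new) / min(s_new)  # condition number after correction
    return offsets, cond_x, cond_new

s_x = [7.3, 4.0, 2.1, 1.1, 0.4]    # sampled eigenvalues (n = 10 example)
s_new = [5.0, 4.0, 3.0, 2.0, 1.0]  # stand-in for the corrected eigenvalues (assumption)
offsets, cond_x, cond_new = offsets_and_condition(s_x, s_new)
```

The offsets differ in sign and size from one eigenvalue to the next, whereas traditional regularization would add the same offset to all of them.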
Eigenvalue estimation for a diagonal matrix: Marcenko-Pastur law
• C is a p-by-p diagonal matrix with $C = \sigma^2 I$ (p eigenvalues, each equal to $\sigma^2$); the sampled matrix follows $C \sim W_p(\sigma^2 I\, n^{-1}, n)$
• The pdf of the sampled eigenvalues is known analytically.
• There is a relationship between the mode of the pdf and the true (population) eigenvalue; the mode is the maximum-likelihood (ML) position:
  $s_x(\mathrm{mode}) = \sigma^2 F_{mode}$; $F_{mode} = \frac{(1 - k)^2}{1 + k}$; $k = p/n$
• Based on the mode location, the sampled eigenvalue is shifted upward (step 1 of the process) toward the population value (the mean)
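The mode relationship can be checked numerically against the standard Marcenko-Pastur density, whose support is $[\sigma^2(1-\sqrt{k})^2, \sigma^2(1+\sqrt{k})^2]$. The sketch below evaluates the density on a grid and compares the location of its maximum with $\sigma^2 (1-k)^2/(1+k)$. The function names, grid, and the values of k and $\sigma^2$ are illustrative assumptions.

```python
import numpy as np

def mp_pdf(x, k, sigma2=1.0):
    """Marcenko-Pastur density for ratio k = p/n < 1 and population variance sigma2."""
    a = sigma2 * (1.0 - np.sqrt(k)) ** 2  # lower edge of the support
    b = sigma2 * (1.0 + np.sqrt(k)) ** 2  # upper edge of the support
    x = np.asarray(x, dtype=float)
    pdf = np.zeros_like(x)
    inside = (x > a) & (x < b)
    xi = x[inside]
    pdf[inside] = np.sqrt((b - xi) * (xi - a)) / (2.0 * np.pi * sigma2 * k * xi)
    return pdf

def mp_mode(k, sigma2=1.0):
    """Closed-form mode: sigma2 * (1 - k)^2 / (1 + k)."""
    return sigma2 * (1.0 - k) ** 2 / (1.0 + k)

k, sigma2 = 0.5, 2.0  # hypothetical values
grid = np.linspace(0.0, 8.0, 800001)
numeric_mode = grid[np.argmax(mp_pdf(grid, k, sigma2))]
# Step 1 of the method: dividing the mode by F_mode recovers sigma2.
recovered_sigma2 = numeric_mode / mp_mode(k, 1.0)
```

Since $F_{mode} < 1$ for $0 < k < 1$, dividing by it always moves the sampled eigenvalue upward.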
Apparent multiplicity $p_i$ for nondiagonal matrices
  $F = \mathrm{diag}(1/F_{mode})$; $F_{mode}(i) = \frac{(1 - p_i/n)^2}{1 + p_i/n}$
• We use theory for the bounds of the sampled eigenvalues: $a(i) \le s_x(i) \le b(i)$, with $a(i) = s_x(i)(1 - \sqrt{k})^2$, $b(i) = s_x(i)(1 + \sqrt{k})^2$, $k = p/n$
• We count the number ($p_i$) of overlapped eigenvalues within $[a_i, b_i]$ for each sampled eigenvalue
• Example: the multiplicity of the 4th eigenvalue is 3 (two neighbors, the 2nd & 3rd, plus itself)
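One possible reading of the counting rule, sketched in Python: for each sampled eigenvalue, count how many sampled eigenvalues (itself included) fall inside its interval $[a(i), b(i)]$, with $a(i) = s_x(i)(1-\sqrt{k})^2$ and $b(i) = s_x(i)(1+\sqrt{k})^2$. Counting eigenvalues inside the interval, rather than overlapping intervals, is an assumption of this sketch; the function name and the numbers are hypothetical.

```python
import numpy as np

def apparent_multiplicity(s_x, n):
    """Count, for each sampled eigenvalue, the eigenvalues lying within its bounds."""
    s_x = np.asarray(s_x, dtype=float)
    k = s_x.size / n                        # k = p / n
    a = s_x * (1.0 - np.sqrt(k)) ** 2       # lower bound a(i)
    b = s_x * (1.0 + np.sqrt(k)) ** 2       # upper bound b(i)
    # inside[i, j] is True when s_x[j] lies within [a(i), b(i)].
    inside = (s_x[None, :] >= a[:, None]) & (s_x[None, :] <= b[:, None])
    return inside.sum(axis=1)               # each eigenvalue counts itself

s_x = [10.0, 8.0, 3.0, 1.0]                 # hypothetical sampled eigenvalues, p = 4
p_i = apparent_multiplicity(s_x, n=100)     # k = 0.04
```

Here the two largest eigenvalues fall inside each other's bounds and get multiplicity 2, while the isolated ones keep multiplicity 1.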
Examples
1. Simulations with many analytical functions & statistics for population eigenvalues (normal, uniform, Gamma)
2. Field data: hyperspectral sensors SEBASS & TELOPS
Figures of merit: ratio of improvement of the solution over the data
• Re = residual
• RA = area
• Rcond = condition number
• Rd = distance in probability
[Figure: SEBASS, n/p = 2, p = 115; curves for data, solution, and truth]
All figures of merit are greater than 1, hence our solution improves over the data.
Probability density functions for TELOPS measurements for selected eigenvalues
[Figure: TELOPS, n/p = 2, p = 135; panels 1 to 9 with curves for truth, data, and solution]
• All figures of merit are greater than 1, hence our solution improves over the data
• Drastic improvement: panels 3, 4, 6, 7, 8, 9 (eigenvalues # 30, 40, 80, 100, 120, 135)
• No difference: panels 1, 5 (eigenvalues # 1, 50)
• Failure: panel 2 (eigenvalue # 10)
Application to Hyperspectral Detection
• Matched filter: score = $\alpha^T C^{-1} t$, where $\alpha$ = measurement vector and t = target vector
• Random target direction
• Probability of detection (Pd) for each choice of C:
  • from data: Pd < 50%
  • with solution: Pd > 60%
  • known eigenvalues (known population) & sampled eigenvectors: Pd > 65%
  • known covariance, clairvoyant C (known population and directions, i.e., true eigenvalues & vectors): Pd > 80%
[Figure legend: data, solution, truth, clairvoyant C]
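The gap between a sampled and a clairvoyant covariance can be illustrated with the normalized matched-filter SNR, $\rho = (t^T \hat{C}^{-1} t)^2 / [(t^T \hat{C}^{-1} C \hat{C}^{-1} t)(t^T C^{-1} t)]$, which equals 1 when $\hat{C} = C$. The sketch below is not the authors' detection experiment: it assumes zero-mean Gaussian data, a diagonal population covariance, and a fixed target vector, all chosen for illustration. The function name, seed, and trial count are hypothetical.

```python
import numpy as np

def mean_snr_loss(C, t, n, trials=2000, seed=0):
    """Average normalized SNR of a matched filter built from a sampled covariance."""
    rng = np.random.default_rng(seed)
    p = C.shape[0]
    L = np.linalg.cholesky(C)              # used to draw samples with covariance C
    best = t @ np.linalg.solve(C, t)       # SNR of the clairvoyant filter
    rho = np.empty(trials)
    for j in range(trials):
        Z = L @ rng.standard_normal((p, n))
        C_hat = Z @ Z.T / n                # sampled covariance (known zero mean)
        w = np.linalg.solve(C_hat, t)      # sampled matched-filter weights
        rho[j] = (w @ t) ** 2 / ((w @ C @ w) * best)
    return rho.mean()

C = np.diag([1.0, 2.0, 3.0, 4.0, 5.0])    # population covariance (p = 5)
t = np.ones(5)                             # hypothetical target vector
loss = mean_snr_loss(C, t, n=10)           # n/p = 2
```

For Gaussian data the Reed, Mallett & Brennan result gives an expected value of (n - p + 2)/(n + 1), which is close to 1/2 at n = 2p when p is large; any improvement in the estimate of C moves this ratio toward 1.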
Summary
• We presented a method to estimate the eigenvalues of a sampled covariance matrix (Wishart distributed) from few samples.
• The method is practical, quick, and simple to implement: a multiplication of three diagonal matrices.
• The method achieves two objectives: improved estimation of the eigenvalues and an improved condition number (i.e., regularization).
• With the method we improve detection performance (ROC curve).