Anna Denisova - Two Realizations of Probability Anomaly Detector with Different Vector Quantization Algorithms
1. Two Realizations of Probability Anomaly Detector with Different Vector Quantization Algorithms
Anna Denisova
Samara State Aerospace University
2. Anomaly detection for hyperspectral images
An anomaly is a small partition of the data whose characteristics differ significantly from the background.
[Diagram: a hyperspectral image is fed into the anomaly detection method, which outputs an anomaly measure image, followed by post-processing. Key properties of the input: (1) high dimension, (2) physical meaning of the image pixels. Key components of the method: (1) no prior information about the target objects, (2) a background model, (3) an anomaly measure.]
3. Anomaly detection methods classification
• Gaussian mixture model (GMM-GLRT, cluster-based anomaly detector)
• Linear spectral mixture (OSP and SSP detectors)
• Local normal model (RXD)
• Non-parametric background model (kernel RX detector)
• Local normal model in feature space (SVDD)
4. PAD with uniform quantization (PAD UQ)
PAD UQ:
Uniform quantization with K levels for each image component l:
q_l = ⌊ K · (x_l − x_min,l) / (x_max,l − x_min,l) ⌋,  q_l ∈ {0, …, K − 1}
Integer hash functions for the quantized vector q = (q_0, …, q_{n−1}):
• modulo hashing: f(q) = ( Σ_{i=0}^{n−1} q_i K^i ) mod M
• multiplicative modulo hashing
• hash functions for strings (Horner's algorithm)
5. PAD with agglomerative clusterization
PAD AC:
A new vector quantization algorithm based on agglomerative clusterization.
Properties:
1. Quantization values are the centers of clusters.
2. The number of clusters M is fixed.
3. Cluster size threshold – ε.
4. Output – a codebook Q of size M.
[Flowchart: initialize Q with the first pixel as a cluster center; for each pixel x, compute the distance d(x, c_m) to the nearest cluster center; if d < ε, include x in cluster C_m and recalculate its center; otherwise add a new cluster with center x; if |Q| > M, merge the clusters with d < ε and increase ε.]
9. Experimental research of aggregation and noise
[Charts: probability value (TP or FP) vs. anomaly size (3×3, 5×5, 7×7) for: PAD UQ with minimum, median, and sigma filter aggregation; PAD AC without aggregation and with minimum and median aggregation; and PAD AC on noised images with SNR = ∞, 250, 150, 15.]
10. Experimental research on real images
[Figures: input image №1, PAD AC (M=15, EPS=1), and RXD detection results. Spectral signatures: not detected signature (red, left chart), detected signature (red, right chart), and background signature (green).]
[ROC chart: TP and FP vs. threshold for PAD AC (M=15, EPSILON=1) and RXD.]
11. Experimental research on real images
[Figures: input image №2, PAD AC (M=90, EPS=1), and RXD detection results. Signatures: region 2 (yellow), region 1 (green), background (red).]
[ROC chart: TP and FP vs. threshold for PAD AC (M=90, EPSILON=1) and RXD.]
12. Conclusion
Two realizations of the PAD anomaly detection algorithm were proposed: PAD UQ and PAD AC.
• PAD UQ is inefficient in the presence of noise and depends strongly on the correlation properties of the background.
• PAD AC is noise resistant and less dependent on image correlation.
• Further development of PAD AC consists in producing modifications with PCA.
• PAD AC requires an automatic procedure for estimating the initial error and the codebook size before it can be applied to real images.
Questions?
Editor's Notes
An anomaly is a small partition of the data with particular characteristics that differ significantly from the background. The field of anomaly detection is very broad and includes different types of methods for many kinds of data, and all of these methods depend strongly on the nature of the data.
My work is devoted to anomaly detection for hyperspectral images. The input is an image acquired by a hyperspectral sensor on board a remote sensing system. The key characteristics of the input data that are crucial for anomaly detection are its high dimensionality and the specific interpretation of the data due to its physical meaning.
A pixel of a hyperspectral image is a vector that represents the spectrum of the underlying surface. Different objects, owing to their different reflectance properties, have different spectral descriptions. So an anomaly in a hyperspectral image is an area with significantly different spectral characteristics, corresponding to objects unusual for the scene.
The anomaly detection problem amounts to target recognition without any a priori knowledge. This task is related to a two-class classification problem where one class is the background and the other is the anomalous regions. The main issues in anomaly detection are the way of describing normal data behavior (the background model) and the criterion for anomalous values, which is called the anomaly measure.
So a concrete algorithm takes a hyperspectral image as input and outputs an anomaly measure image that can be used for further analysis. I also include post-processing in the diagram as a part of the process that I used in the research. Post-processing here means additional operations performed by the user to extract valuable information from the anomaly image.
In this work, these were performance measures of the algorithms, which can be obtained via thresholding or visually.
In collaboration with V. V. Myasnikov, a probability criterion for defining the anomaly measure and a general scheme of the probability anomaly detector were proposed.
The main idea is to use the probability of an image pixel value as the anomaly measure, so that anomalies are the pixels with the lowest probabilities in the image.
This criterion defines a global anomaly method that we called PAD, or probability anomaly detector.
The general scheme consists of the following steps.
To decrease the variety of the input data while keeping its multidimensional structure, we use vector quantization as a first step.
Next, the histogram of the quantized values is calculated. For high-dimensional data it is a big problem to calculate the histogram efficiently, which is why we use a hash table to store all histogram bins. By means of simple hash functions we can organize fast addressing of the unique quantized values and calculate their probabilities.
The probabilities of the quantized values are then used to calculate the final anomaly measure by means of some aggregation function inside a sliding window. Aggregation can be useful for further visual analysis and corresponds to spatial filtering of the anomaly measure values. The simplest aggregation function is the minimum.
Today I present a comparison of two PAD realizations with different vector quantization algorithms.
The first variant is uniform quantization of each image component. The range between the maximum and the minimum in each channel is divided into K levels. The quantized vector is an integer vector with N components.
An integer vector can be represented as an integer number in a positional numeral system with base K. We used several simple hash functions to produce unique indexes for the quantized values, which can be stored as unsigned integers.
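A minimal sketch of this quantization and hashing step, assuming per-component minima and maxima are known (the function names are illustrative, not the authors' code):

```python
import numpy as np

def uniform_quantize(x, x_min, x_max, K):
    """Quantize each component of pixel x into K levels (0..K-1)."""
    q = np.floor(K * (x - x_min) / (x_max - x_min))
    return np.clip(q, 0, K - 1).astype(np.int64)

def modulo_hash(q, K, M):
    """Treat q as the digits of a base-K integer and reduce it mod M.
    Horner's scheme avoids forming the (very large) integer explicitly."""
    h = 0
    for qi in q:
        h = (h * K + int(qi)) % M
    return h
```

With M chosen as the hash-table size, the hash gives fast addressing of unique quantized vectors when accumulating the histogram.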
Another modification is PAD with vector quantization based on agglomerative clusterization. It is a method with a fixed codebook size: the space of pixels is divided into unequal cells, and the cell centers are taken as members of the codebook. There are several vector quantization algorithms with a fixed codebook size, for example LBG and k-means, but they are computationally inefficient because they are iterative and examine all image pixels many times. In this work a new vector quantization algorithm based on agglomerative clusterization is proposed. We initialize the maximum number of clusters and the initial cluster radius. First we create a single cluster centered at the first pixel. Then we examine all other pixels. If the distance between a cluster center and the current pixel is less than the threshold, the current pixel is included into this cluster and the cluster center is recalculated. If the distance between the current pixel and all existing clusters is greater than the threshold, a new cluster is created. If the maximum codebook size is exceeded, the nearest clusters are merged and the threshold is increased.
After the codebook is created, each pixel is substituted by the center of the first cluster in the codebook for which the distance between the pixel and the cluster center is less than the output threshold value. In this case the histogram can be calculated during codebook creation, and the index of the quantized value in the codebook is used as the hash.
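The codebook construction described above might be sketched as follows. The merging step here (merging only the single closest pair and growing ε by a fixed factor) is a simplifying assumption, not necessarily the authors' exact rule.

```python
import numpy as np

def closest_pair(centers):
    """Indices of the two closest cluster centers (brute force)."""
    best, best_d = (0, 1), np.inf
    for i in range(len(centers)):
        for j in range(i + 1, len(centers)):
            d = np.linalg.norm(centers[i] - centers[j])
            if d < best_d:
                best_d, best = d, (i, j)
    return best

def agglomerative_codebook(pixels, M, eps):
    """One-pass VQ codebook with at most M clusters."""
    centers = [pixels[0].astype(float)]   # first pixel starts cluster 1
    counts = [1]
    for x in pixels[1:]:
        d = [np.linalg.norm(x - c) for c in centers]
        m = int(np.argmin(d))
        if d[m] < eps:
            # Include x in the nearest cluster and recalculate its center.
            counts[m] += 1
            centers[m] += (x - centers[m]) / counts[m]
        else:
            # Add a new cluster with center x.
            centers.append(x.astype(float))
            counts.append(1)
            while len(centers) > M:
                # Codebook overflow: merge closest clusters, increase eps.
                i, j = closest_pair(centers)
                w = counts[i] + counts[j]
                centers[i] = (counts[i]*centers[i] + counts[j]*centers[j]) / w
                counts[i] = w
                del centers[j]; del counts[j]
                eps *= 1.5          # growth factor is an assumption
    return np.array(centers), eps
```

Unlike LBG or k-means, this needs only a single pass over the pixels, which is the computational advantage claimed for the method.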
For the experimental research on real images, test images were constructed from two different parts of what was initially one image. These parts, the building mask, and the results of embedding the anomalies are shown on the slide.
The parameters of the images are listed on the right.
To choose the best parameters of the described algorithms, experiments with synthetic hyperspectral images were carried out. The images were generated according to the linear spectral model using the IGCP spectral library. The test images were 128×128 pixels in size and had 99 spectral bands. Small square patterns were used as anomalies; they were embedded into the image in five positions and had a size of 3×3 pixels. Each anomaly had its own spectral signature, which is why the probability of an anomaly was 0.00055.
To investigate the performance of the proposed methods, two types of images with different correlation properties were used: with Gaussian and with biexponential correlation functions for the linear spectral mixture coefficients.
In both algorithms we used the minimum as the aggregation function and a window of one pixel. All anomaly images were converted into binary images by thresholding at 0.999.
To evaluate the performance of the algorithms, the true positive probability (TP) and the false positive probability (FP) were used.
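These two probabilities can be computed from the binary detection mask and the ground-truth anomaly mask, for example:

```python
import numpy as np

def tp_fp(detected, truth):
    """TP: fraction of anomaly pixels detected.
    FP: fraction of background pixels falsely flagged."""
    detected = detected.astype(bool)
    truth = truth.astype(bool)
    tp = (detected & truth).sum() / truth.sum()
    fp = (detected & ~truth).sum() / (~truth).sum()
    return tp, fp
```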
This slide shows the average TP and FP values for different numbers of quantization levels and different hash functions. As you can see, uniform quantization should use only 2 quantization levels for the 99 channels; this is due to the high variability of the data when more quantization levels are used. The false positive probability can be reduced by applying PCA before quantization.
As for the hash functions, the multiplicative modulo hash significantly decreases the probability of false detection, but not all anomalies are recognized. This effect is connected with the leading digits of the big integer number produced from the quantized vector: in multiplicative modulo hashing, the middle part of the spectrum is more significant. Controlling this parameter can produce better results and is part of my future work.
Below you can see the hash images and the binarization results for inputs with different correlation properties. All images have their own regions of low-probability pixel values due to the Gaussian and exponential correlation models. These regions produce spots of different sizes according to the neighbor-pixel correlation coefficient. Visual inspection of the binary anomaly image allows us to draw conclusions about anomalous regions from their shape: the shape of the false detections is regular, and their size depends on the correlation. But if the images are analyzed automatically, these spots become a problem and increase the false detection errors. The most difficult case is correlation that produces local continuous areas of the same size as the anomalies.
For the PAD AC algorithm, several codebook sizes were examined. In all cases the method recognized all anomalies, which is why charts are provided only for FP. The false positive probability increases with the number of clusters, because more small clusters can be produced; the number of clusters should therefore be as small as possible. You can visually estimate the number of different types of common objects in the scene and then set M a little greater. The starting error does not influence the result as much as the codebook size, but it reflects the minimum radius of the sphere of pixels that are defined as anomalies. In the further experiments I used M = 10.
The hash images and binary results are shown at the bottom of the slide. As you can see, this method gives stable results in spite of the correlation of the initial image.
In the presence of noise it is useful to apply some aggregation function. The test images consist of three groups with anomalies of different sizes, from 3×3 to 7×7 pixels.
In the presence of noise I tested different aggregation functions, such as minimum, maximum, median, sum, sigma filter, and Gaussian filter. A window of 3×3 pixels was used.
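Swapping the aggregation function can be sketched generically with NumPy's sliding windows (an illustrative helper; edge padding is an assumption, not stated in the talk):

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def aggregate(measure, func=np.median, window=3):
    """Apply an aggregation function over a sliding window.
    func can be np.min, np.median, np.sum, etc."""
    pad = window // 2
    padded = np.pad(measure, pad, mode='edge')
    # Shape (H, W, window, window): one window per output pixel.
    views = sliding_window_view(padded, (window, window))
    return func(views, axis=(-2, -1))
```

For example, `aggregate(prob, np.min)` reproduces the minimum aggregation used earlier, while `aggregate(prob, np.median)` gives the median filtering compared on this slide.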
As the diagram shows, for PAD UQ median filtering and the sigma filter provide better results than the minimum, and their performance increases with the anomaly size.
But the PAD UQ method cannot be applied in the presence of noise because of its high false detection probability.
PAD AC gives a much higher TP and a much lower FP probability. The results for the best aggregation functions are presented in the chart. PAD AC is also a noise-stable method, as the chart on the right shows: it works well until the signal-to-noise ratio falls below 15. For all test images the false detection probability was 10^(-4).
Here you can see the results of the PAD AC algorithm in comparison with the RXD detector on a real image. The binary images were produced by thresholding, and the threshold was chosen as a quantile of the anomaly measure histogram for both methods.
The ROC curve shows that PAD AC gives good results for lower quantile values. However, it cannot recognize as an anomaly a piece of data whose signature is close to the background. For both algorithms, the false detections include a part of the landscape untypical for this image.
These are the results for the second image.
The ROC curve was built for the best pair of ε and M parameters.
As can be seen from the binary images, PAD has a higher false detection probability than the RXD algorithm. The falsely detected regions include areas 1 and 2 circled on the input image. They also have signatures untypical for this image, which is why they cannot be considered real false detections in terms of PAD.
The signatures of the background, the anomalies, and regions 1 and 2 are shown in the upper-right figure.