Image Analysis & Retrieval
CS/EE 5590 Special Topics (Class Ids: 44873,44874)
Fall 2016,M/W 4-5:15pm@Bloch0012
Lec 08
Feature Aggregation II:
Fisher Vector, Super Vector and AKULA
Zhu Li
Dept of CSEE, UMKC
Office: FH560E,Email: lizhu@umkc.edu, Ph: x 2346.
http://l.web.umkc.edu/lizhu
Z. Li, Image Analysis&Retrv.2016 p.1
Outline
• ReCap of Lecture 07
• Image Retrieval System
• BoW
• VLAD
• Dense SIFT
• Fisher Vector Aggregation
• AKULA
• Summary
Precision, Recall, F-measure
• Precision = TP/(TP + FP)
• Recall (TPR) = TP/(TP + FN)
• FPR = FP/(FP + TN)
• F-measure = 2*(precision*recall)/(precision + recall)
• Precision: the probability that a retrieved document is relevant.
• Recall: the probability that a relevant document is retrieved in a search.
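A minimal sketch of these retrieval metrics, using the standard definitions (precision over retrieved items, recall/TPR over relevant items, FPR over irrelevant items). The function name and the counts in the example query are illustrative assumptions, not values from the lecture.

```python
def retrieval_metrics(tp, fp, fn, tn):
    """Return (precision, recall, fpr, f_measure) from confusion counts."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0      # also called TPR
    fpr = fp / (fp + tn) if (fp + tn) else 0.0
    denom = precision + recall
    f_measure = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, fpr, f_measure


# Hypothetical query: 8 relevant images retrieved, 2 irrelevant retrieved,
# 4 relevant images missed, 86 irrelevant images correctly left out.
precision, recall, fpr, f_measure = retrieval_metrics(tp=8, fp=2, fn=4, tn=86)
print(precision, recall, fpr, f_measure)
```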
Why Aggregation?
• Curse of Dimensionality
• Decision Boundary / Indexing
Bag-of-Words: Histogram Coding
Codebook:
• Feature space: $R^d$, k-means to get k centroids, $\{\mu_1, \mu_2, \ldots, \mu_k\}$
• BoW Hard Encoding:
• For n feature points $\{x_1, x_2, \ldots, x_n\}$, the assignment matrix is k x n, with exactly one non-zero entry in each column
• Aggregated dimension: k
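The hard encoding can be sketched as follows: each feature votes for its nearest centroid, and the votes form a k-bin histogram. This is a minimal illustration with a tiny made-up 2-D codebook; the function names are assumptions, and a real codebook would come from k-means on SIFT descriptors.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b))


def bow_hard_encode(features, centroids):
    """Return the k-bin histogram of nearest-centroid assignments."""
    histogram = [0] * len(centroids)
    for x in features:
        nearest = min(range(len(centroids)),
                      key=lambda k: squared_distance(x, centroids[k]))
        histogram[nearest] += 1   # the single non-zero entry of this column
    return histogram


# Toy codebook (k=3) and four 2-D features, chosen only for illustration.
centroids = [(0.0, 0.0), (1.0, 1.0), (4.0, 0.0)]
features = [(0.1, 0.2), (0.9, 1.2), (3.8, 0.1), (1.1, 0.8)]
histogram = bow_hard_encode(features, centroids)
print(histogram)
```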
Kernel Code Book Soft Encoding
• Kernel affinity: $K(x_j, \mu_k) = e^{-\gamma \|x_j - \mu_k\|^2}$, with kernel scale $\gamma > 0$
• Assignment matrix: $A_{j,k} = K(x_j, \mu_k) / \sum_{k'} K(x_j, \mu_{k'})$
• Encoding, k-dimensional: $X(k) = \frac{1}{n} \sum_j A_{j,k}$
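A minimal sketch of the soft encoding: every feature spreads one unit of mass over all centroids in proportion to its kernel affinity, and the masses are averaged over the n features. The parameter name `gamma` (the constant in the exponent of the affinity) and the toy data are assumptions made for illustration.

```python
import math


def kernel_codebook_encode(features, centroids, gamma=1.0):
    """Return the k-dimensional soft-assignment encoding of the features."""
    k = len(centroids)
    encoding = [0.0] * k
    for x in features:
        # Gaussian kernel affinity of x to every centroid
        affinities = [
            math.exp(-gamma * sum((xi - mi) ** 2 for xi, mi in zip(x, mu)))
            for mu in centroids
        ]
        total = sum(affinities)
        for idx in range(k):
            encoding[idx] += affinities[idx] / total   # one row of A sums to 1
    n = len(features)
    return [value / n for value in encoding]


# A feature halfway between two centroids is shared equally between them.
soft_code = kernel_codebook_encode([(1.0, 0.0)], [(0.0, 0.0), (2.0, 0.0)])
print(soft_code)
```

Unlike the hard histogram, the entries here always sum to 1 and change smoothly when a feature moves across a cell boundary.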
VLAD - Vector of Locally Aggregated Descriptors
• Aggregate feature differences from the codebook
• Hard assignment by finding the NN of each feature $x_j$ among the centroids $\{\mu_k\}$
• Compute aggregated differences: $v_k = \sum_{j:\, NN(x_j) = \mu_k} (x_j - \mu_k)$
• L2 normalize: $v_k = v_k / \|v_k\|_2$
• Final feature: k x d
[Figure: five cells with centroids $\mu_1, \ldots, \mu_5$ and a descriptor x: ① assign descriptors, ② compute $x - \mu_i$, ③ $v_i$ = sum of $x - \mu_i$ for cell i]
VLAD on SIFT
• Example of aggregating SIFT with VLAD
• K=16 codebook entries
• Each cell is a SIFT descriptor, visualized as centroids in blue and VLAD differences in red
• Top row: left image, bottom row: right image, red: code book, blue: encoded VLAD
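The VLAD steps (assign, accumulate differences, L2 normalize) can be sketched as below. This is an illustration on toy 2-D data rather than 128-D SIFT; the function name is an assumption, and normalization is applied per cluster.

```python
import math


def vlad_encode(features, centroids):
    """Return the flattened k*d VLAD vector of the features."""
    k, d = len(centroids), len(centroids[0])
    residuals = [[0.0] * d for _ in range(k)]
    for x in features:
        # step 1: hard-assign x to its nearest centroid
        nearest = min(
            range(k),
            key=lambda c: sum((xi - mi) ** 2 for xi, mi in zip(x, centroids[c])),
        )
        # steps 2-3: accumulate the difference x - mu for that cell
        for dim in range(d):
            residuals[nearest][dim] += x[dim] - centroids[nearest][dim]
    # L2-normalize each cluster's accumulated difference vector
    for v in residuals:
        norm = math.sqrt(sum(value * value for value in v))
        if norm > 0:
            for dim in range(d):
                v[dim] /= norm
    return [value for v in residuals for value in v]


centroids = [(0.0, 0.0), (4.0, 0.0)]
features = [(1.0, 0.0), (0.0, 1.0), (4.0, 3.0)]
vlad = vlad_encode(features, centroids)
print(vlad)
```

Clusters that receive no descriptor keep an all-zero block, which is the "turned off" situation that comes up again in the AKULA motivation.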
One more trick
• Recall that SIFT is a powerful descriptor
• VL_FEAT: vl_dsift
• A dense description of the image, computing SIFT descriptors on a predetermined grid (no scale-space extrema detection)
• Supplements HoG as an alternative texture descriptor
VL_FEAT: vl_dsift
• Compute dense SIFT as a texture descriptor for the image
[f, dsift] = vl_dsift(single(rgb2gray(im)), 'step', 2);
• There's also a FAST option
[f, dsift] = vl_dsift(single(rgb2gray(im)), 'fast', 'step', 2);
• A huge amount of SIFT data will be generated
Fisher Vector
• Fisher Vector and variations:
• Winning in image classification
• Winning in MPEG object re-identification:
o SCFV (Scalable Compressed Fisher Vector) in CDVS
Codebook: Gaussian Mixture Model (GMM)
• GMM is a generative model to express data
• Assume the data is generated from a mixture of K Gaussians with parameters $\{w_k, \mu_k, \sigma_k\}$:
$$x \sim \sum_{k=1}^{K} w_k\, N(\mu_k, \sigma_k)$$
$$N(\mu_k, \sigma_k) = \frac{1}{(2\pi)^{d/2} |\Sigma_k|^{1/2}}\, e^{-\frac{1}{2} (x - \mu_k)' \Sigma_k^{-1} (x - \mu_k)}$$
where $\Sigma_k$ is the covariance of component k.
A bit of Theory: Fisher Kernel
Encode the deviation from the generative model
• Observed feature set, $\{x_1, x_2, \ldots, x_n\}$ in $R^d$, e.g., d=128 for SIFT.
• How do these observations deviate from the given GMM model with parameter set $\lambda = \{w_k, \mu_k, \sigma_k\}$?
o i.e., how should a parameter, e.g., the mean, move to best fit the observations?
[Figure: an observation $X_1$ among GMM component means $\mu_1, \ldots, \mu_4$]
A bit of Theory: Fisher Kernel
Score function w.r.t. the likelihood function $u_\lambda(X)$
• $G_\lambda^X = \nabla_\lambda \log u_\lambda(X)$: the derivative of the log likelihood
• The dimension of the score function is m, where m is the number of generative model parameters; a GMM has 3 parameter types (weight, mean, variance) per component
• Given the observed data X, the score function indicates how the likelihood function parameters (e.g., the mean) should move to better fit the data.
Distance/deviation of two observations X, Y w.r.t. the generative model
• Fisher Information Matrix (roughly the covariance in the Mahalanobis distance):
$$F_\lambda = E_X\left[ G_\lambda^X \, {G_\lambda^X}' \right]$$
• Fisher Kernel distance, normalized by the Fisher Information Matrix:
$$K_{FK}(X, Y) = {G_\lambda^X}' \, F_\lambda^{-1} \, G_\lambda^Y$$
Fisher Vector
• $K_{FK}(X, Y)$ is a measure of similarity w.r.t. the generative model
• Similar to the Mahalanobis distance case, we can decompose this kernel with $F_\lambda^{-1} = L_\lambda' L_\lambda$:
$$K_{FK}(X, Y) = {G_\lambda^X}' \, F_\lambda^{-1} \, G_\lambda^Y = {G_\lambda^X}' \, L_\lambda' L_\lambda \, G_\lambda^Y$$
• That gives us a kernel feature mapping of X to the Fisher Vector, $\mathcal{G}_\lambda^X = L_\lambda G_\lambda^X$
• For observed image features $\{x_t\}$, assumed independent, it can be computed as $\mathcal{G}_\lambda^X = L_\lambda \sum_t \nabla_\lambda \log u_\lambda(x_t)$
GMM Fisher Vector
Encode the deviation from the generative model
• Observed feature set, $\{x_1, x_2, \ldots, x_n\}$ in $R^d$, e.g., d=128 (!) for SIFT.
• How do these observations deviate from the given GMM model with parameter set $\theta = \{a_k, \mu_k, \sigma_k\}$?
• GMM log-likelihood gradient
• Let $w_k = \frac{e^{a_k}}{\sum_j e^{a_j}}$; then we have
[Equations: gradients w.r.t. the weight, mean, and variance parameters]
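GMM Fisher Vector VL_FEAT implementation
• GMM codebook
• For a K-component GMM, we only allow 3K parameters, $\{\pi_k, \mu_k, \sigma_k \mid k = 1..K\}$, i.e., iid Gaussian components with
$$\Sigma_k = \begin{bmatrix} \sigma_k & 0 & \cdots & 0 \\ 0 & \sigma_k & \cdots & 0 \\ \vdots & & \ddots & \vdots \\ 0 & 0 & \cdots & \sigma_k \end{bmatrix}$$
• Posterior probability of feature point $x_i$ belonging to GMM component k: $q_{ik} = \frac{\pi_k N(x_i; \mu_k, \Sigma_k)}{\sum_{j=1}^{K} \pi_j N(x_i; \mu_j, \Sigma_j)}$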
GMM Fisher Vector VL_FEAT implementation
• FV encoding
• Gradient on the mean, for GMM component k, j=1..D
• In the end, we have a 2K x D aggregation of the deviations w.r.t. the means and variances:
$$FV = [u_1, u_2, \ldots, u_K, v_1, v_2, \ldots, v_K]$$
VL_FEAT GMM/FV API
• Compute GMM model with VL_FEAT
• Prepare data:
numPoints = 1000 ; dimension = 2 ;
data = rand(dimension, numPoints) ;   % one data point per column
• Call vl_gmm:
numClusters = 30 ;
[means, covariances, priors] = vl_gmm(data, numClusters) ;
• Visualize:
figure ;
hold on ;
plot(data(1,:), data(2,:), 'r.') ;
for i = 1:numClusters
    % covariances(:,i) holds the diagonal covariance of component i
    vl_plotframe([means(:,i)' covariances(1,i) 0 covariances(2,i)]) ;
end
VL_FEAT API
• FV encoding
encoding = vl_fisher(datatoBeEncoded, means, covariances, priors);
• Bonus points:
• Encode HoG features with Fisher Vector?
• Randomly collect 2~3 images from each class
• Stack all HoG features together into an n x 36 data matrix
• Compute its GMM
• Use this GMM to encode all image HoG features (rather than averaging them)
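To make the encoding step less of a black box, here is a rough sketch of the mean and variance terms a Fisher Vector aggregates for a diagonal-covariance GMM. It follows the form given in the VLFeat documentation, $u_{jk} = \frac{1}{N\sqrt{\pi_k}} \sum_i q_{ik} \frac{x_{ji} - \mu_{jk}}{\sigma_{jk}}$ and $v_{jk} = \frac{1}{N\sqrt{2\pi_k}} \sum_i q_{ik} \left[ \left(\frac{x_{ji} - \mu_{jk}}{\sigma_{jk}}\right)^2 - 1 \right]$, with posteriors $q_{ik}$ from Bayes' rule. It is not a substitute for vl_fisher: the function names are assumptions, features are passed as a list of points, and the optional normalizations are omitted.

```python
import math


def log_gaussian_diag(x, mean, var):
    """Log density of a diagonal-covariance Gaussian evaluated at x."""
    return -0.5 * sum(math.log(2 * math.pi * v) + (xi - m) ** 2 / v
                      for xi, m, v in zip(x, mean, var))


def posteriors(x, means, variances, priors):
    """Posterior probability of each GMM component given x (Bayes' rule)."""
    logs = [math.log(p) + log_gaussian_diag(x, m, v)
            for m, v, p in zip(means, variances, priors)]
    top = max(logs)                       # subtract the max for stability
    weights = [math.exp(value - top) for value in logs]
    total = sum(weights)
    return [w / total for w in weights]


def fisher_vector(features, means, variances, priors):
    """Return [u_1..u_K, v_1..v_K] flattened to length 2*K*D."""
    n, k, d = len(features), len(means), len(means[0])
    u = [[0.0] * d for _ in range(k)]
    v = [[0.0] * d for _ in range(k)]
    for x in features:
        q = posteriors(x, means, variances, priors)
        for c in range(k):
            for j in range(d):
                z = (x[j] - means[c][j]) / math.sqrt(variances[c][j])
                u[c][j] += q[c] * z                 # deviation w.r.t. the mean
                v[c][j] += q[c] * (z * z - 1.0)     # deviation w.r.t. the variance
    for c in range(k):
        for j in range(d):
            u[c][j] /= n * math.sqrt(priors[c])
            v[c][j] /= n * math.sqrt(2.0 * priors[c])
    return ([value for row in u for value in row]
            + [value for row in v for value in row])


# Toy GMM with K=2 components in D=2 dimensions (illustrative values only).
means = [(0.0, 0.0), (5.0, 5.0)]
variances = [(1.0, 1.0), (1.0, 1.0)]
priors = [0.5, 0.5]
fv = fisher_vector([(0.5, -0.5), (5.5, 4.5), (0.0, 0.0)], means, variances, priors)
print(len(fv))
```

With K components and D dimensions the output has 2·K·D entries, matching the 2K x D aggregation stated on the FV encoding slide.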
Super Vector Aggregation - Speaker ID
• Fisher Vector: aggregates features against a GMM
• Super Vector: aggregates a GMM against a GMM
• Ref:
o William M. Campbell, Douglas E. Sturim, Douglas A. Reynolds: Support vector machines using GMM supervectors for speaker verification. IEEE Signal Process. Lett. 13(5): 308-311 (2006)
[Figure: speech sample "Yes, We Can!" → which speaker?]
Super Vector from MFCC
• Motivated by Speaker ID work
• Speech is a continuous evolution of the vocal tract
• Need to extract a sequence of spectra or a sequence of spectral coefficients
• Use a sliding window: 25 ms window, 10 ms shift
[Figure: MFCC extraction: log|X(ω)| → DCT → MFCC]
GMM Model from MFCC
• GMM on MFCC feature
• The acoustic vectors (MFCC) of speaker s are modeled by a probability density function parameterized by
$$\Lambda^{(s)} = \{\lambda_j^{(s)}, \mu_j^{(s)}, \Sigma_j^{(s)}\}_{j=1}^{M}$$
• Gaussian mixture model (GMM) for speaker s, with mixture weights $\lambda_j^{(s)}$:
$$p(x \mid \Lambda^{(s)}) = \sum_{j=1}^{M} \lambda_j^{(s)} \, p(x \mid \mu_j^{(s)}, \Sigma_j^{(s)})$$
Universal Background Model
• UBM GMM Model:
• The acoustic vectors of a general population are modeled by another GMM called the universal background model (UBM):
$$p(x \mid \Lambda^{(ubm)}) = \sum_{j=1}^{M} \lambda_j^{(ubm)} \, p(x \mid \mu_j^{(ubm)}, \Sigma_j^{(ubm)})$$
• Parameters of the UBM:
$$\Lambda^{(ubm)} = \{\lambda_j^{(ubm)}, \mu_j^{(ubm)}, \Sigma_j^{(ubm)}\}_{j=1}^{M}$$
MAP Adaptation
• Given the UBM GMM, how does a new observation deviate from it?
• The adapted mean is given by:
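The adapted-mean equation itself is not in the text above. A commonly used form, from the GMM-UBM speaker verification literature and assumed here rather than taken from the slide, interpolates between the data and the UBM: with soft counts $n_k = \sum_t \Pr(k \mid x_t)$ and first moments $E_k(x) = \frac{1}{n_k} \sum_t \Pr(k \mid x_t)\, x_t$, the adapted mean is $\hat{\mu}_k = \alpha_k E_k(x) + (1 - \alpha_k)\, \mu_k^{ubm}$ with $\alpha_k = n_k / (n_k + r)$. The relevance factor `relevance` (r) and the function name are assumptions of this sketch, and posteriors are taken as given.

```python
def map_adapt_means(features, posteriors, ubm_means, relevance=16.0):
    """Return MAP-adapted means, one per UBM component.

    posteriors[t][k] is Pr(component k | features[t]) under the UBM.
    """
    k, d = len(ubm_means), len(ubm_means[0])
    adapted = []
    for c in range(k):
        n_c = sum(q[c] for q in posteriors)          # soft count for component c
        alpha = n_c / (n_c + relevance)              # data-dependent mixing weight
        new_mean = []
        for j in range(d):
            if n_c > 0:
                first_moment = sum(q[c] * x[j]
                                   for q, x in zip(posteriors, features)) / n_c
            else:
                first_moment = ubm_means[c][j]       # no data: keep the UBM mean
            new_mean.append(alpha * first_moment + (1 - alpha) * ubm_means[c][j])
        adapted.append(new_mean)
    return adapted


# Two 1-D observations, both fully assigned to component 0 (toy values).
adapted = map_adapt_means(
    features=[[2.0], [2.0]],
    posteriors=[[1.0, 0.0], [1.0, 0.0]],
    ubm_means=[[0.0], [10.0]],
    relevance=2.0,
)
print(adapted)
```

Components that see little data stay close to the UBM, so every utterance yields a GMM with the same structure and only the means shifted.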
Supervector Distance
• Assume we have the UBM GMM model $\lambda_{UBM} = \{P_k, \mu_k, \Sigma_k\}$.
• Then for two utterance samples a and b, with GMM models sharing the UBM's priors and covariances,
• $\lambda_a = \{P_k, \mu_k^a, \Sigma_k\}$,
• $\lambda_b = \{P_k, \mu_k^b, \Sigma_k\}$,
the SV distance is
$$K(\lambda_a, \lambda_b) = \sum_k \left( \sqrt{P_k}\, \Sigma_k^{-1/2} \mu_k^a \right)^T \left( \sqrt{P_k}\, \Sigma_k^{-1/2} \mu_k^b \right)$$
• The means of the two models are normalized by the Mahalanobis distance metric induced by the UBM covariances.
• This is also a linear kernel function scaled by the UBM covariances.
Supervector Performance in NIST Speaker ID
• System 5: Gaussian SV
• DCF (Detection Cost Function)
m31491
AKULA – Adaptive KLUster Aggregation
2013/10/25
Abhishek Nagar, Zhu Li, Gaurav Srivastava and Kyungmo Park
Outline
Motivation
Adaptive Aggregation
Results with TM7
Summary
Motivation
Better Aggregation
• Fisher Vector and VLAD type aggregation depends on a global model
• AKULA removes this dependence and directly codes the cluster centroids and SIFT counts
• SCFV and RVD both have situations where clusters are turned off due to no assignment; this can be avoided in AKULA
[Figure: SIFT detection & selection → K-means → AKULA description]
Motivation
Better Subspace Choice
• Both SCFV and RVD do fixed normalization and PCA projection based on heuristics.
• What is the best possible subspace in which to do the aggregation?
• Use a boosting scheme to keep adding subspaces and aggregations iteratively, and tune TPR-FPR to the desired operating point on FPR.
CE2: AKULA - Adaptive KLUster Aggregation
AKULA Descriptor: cluster centroids + SIFT count
$$A_1 = \{yc^1_1, yc^1_2, \ldots, yc^1_k;\; pc^1_1, pc^1_2, \ldots, pc^1_k\}$$
$$A_2 = \{yc^2_1, yc^2_2, \ldots, yc^2_k;\; pc^2_1, pc^2_2, \ldots, pc^2_k\}$$
Distance metric:
• Min centroid distance, weighted by SIFT count
$$d(A_1, A_2) = \frac{1}{k} \sum_{j=1}^{k} d^1_{min}(j)\, w^1_{min}(j) + \frac{1}{k} \sum_{i=1}^{k} d^2_{min}(i)\, w^2_{min}(i)$$
$$d^1_{min}(j) = \min_i d_{j,i}, \qquad w^1_{min}(j) = w_{j,i^*}, \quad i^* = \arg\min_i d_{j,i}$$
$$d^2_{min}(i) = \min_j d_{j,i}, \qquad w^2_{min}(i) = w_{j^*,i}, \quad j^* = \arg\min_j d_{j,i}$$
where $d_{j,i}$ is the distance between centroid j of $A_1$ and centroid i of $A_2$, and $w_{j,i}$ is the corresponding SIFT-count weight.
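The two-way min-distance metric can be sketched as below. The slide does not spell out how the weight $w_{j,i}$ is derived from the SIFT counts, so the default weight used here (the mean of the two normalized counts) is purely a placeholder assumption, as are the function name and the Euclidean centroid distance.

```python
import math


def akula_distance(centroids_1, counts_1, centroids_2, counts_2, weight=None):
    """Two-way weighted min-centroid distance between two AKULA descriptors."""
    if weight is None:
        total_1, total_2 = sum(counts_1), sum(counts_2)

        def weight(j, i):
            # Hypothetical weight: mean of the two normalized SIFT counts.
            return 0.5 * (counts_1[j] / total_1 + counts_2[i] / total_2)

    k1, k2 = len(centroids_1), len(centroids_2)
    # d[j][i]: distance between centroid j of A1 and centroid i of A2
    d = [[math.dist(centroids_1[j], centroids_2[i]) for i in range(k2)]
         for j in range(k1)]

    forward = 0.0                      # each A1 centroid to its nearest in A2
    for j in range(k1):
        i_star = min(range(k2), key=lambda i: d[j][i])
        forward += d[j][i_star] * weight(j, i_star)

    backward = 0.0                     # each A2 centroid to its nearest in A1
    for i in range(k2):
        j_star = min(range(k1), key=lambda j: d[j][i])
        backward += d[j_star][i] * weight(j_star, i)

    return forward / k1 + backward / k2


# Two toy descriptors with k=2 centroids in 2-D and equal SIFT counts.
distance = akula_distance([(0.0, 0.0), (3.0, 0.0)], [1, 1],
                          [(0.0, 1.0), (3.0, 0.0)], [1, 1])
print(distance)
```

Each evaluation fills a k x k distance table, which is why pairwise matching is the expensive part of this scheme.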
AKULA implementation in TM7
Inner loop aggregation
• Dimension is fixed at 8
• Number of clusters nc = 8, 16, 32, to hit 64, 128, and 256 bytes
• Quantization: centroids are scaled by ½ and quantized to int8; the SIFT count is 8 bits; total (nc+1)*dim bytes per aggregation
AKULA implementation in TM7
Outer loop subspace optimization by boosting
• Initial set of subspace models {Ak} computed from SIFT extractions on the MIR FLICKR data set, by k-means clustering of the space into 4096 clusters
• Iterative search on subspaces to generate AKULA aggregations that can improve precision-recall performance
• Notice that aggregation is de-coupled in the subspace iteration, to allow more DoF in aggregation and to find subspaces that provide complementary info.
The algorithm is still being debugged, hence TM7 only has 1st iteration results.
Indexing/hashing is required for AKULA; matching currently involves nc x dim multiplications and additions. A binarization scheme will be considered once performance is optimized in non-binary form.
GD Only TPR-FPR: AKULA vs SCFV
Data set 1:
• AKULA (128 bytes, dim=8, nc=16): distance is just the 1-way dmin1.*wt
• SCFV (512 bytes): a weighted sum is forced on the Hamming distances without 2D decision fitting, i.e., count the Hamming distance between common active clusters and sum up their distances
GD Only TPR-FPR: AKULA vs SCFV
Data set 2, 3:
• AKULA distance is just the 1-way dmin1.*wt
• AKULA = 128 bytes, SCFV = 512 bytes.
3D object set: 4, 5
Data sets 4, 5:
AKULA in PM
FPR performance:
AKULA rates:
pm rates    m     akula rates
512         8     64
1K          16    128
2K          16    128
1K_4K       16    128
2K_4K       16    128
4K          16    128
8K          32    256
16K         32    256
TPR@1% FPR
[Figures: TPR (%) at 1% FPR, TM7 vs AKULA, on data sets 1a, 1b, 1c, 2, 3, 4, 5; one panel per bitrate: 512, 1k, 2k, 1k-4k, 2k-4k, 4k, 8k, 16k]
AKULA Localization
Noticeable improvement: 2.7%
AKULA Summary
Benefits:
• Allows more DoF in aggregation optimization,
o by an outer loop boosting scheme for subspace projection optimization
o and an inner loop adaptive clustering without the constraint of the global GMM model
• Simple weighted distance sum metric, with no need to tune a multi-dimensional decision boundary
• The overall pairwise matching matches TM7 SCFV with its 2-dimensional decision boundary
• In GD-only matching, outperforms the TM7 GD
• Good improvements to the localization accuracy
• Light in extraction, but still heavy in pairwise matching; needs a binarization and/or indexing scheme to work for retrieval
• Future Improvements:
• Supervector AKULA?
Lec 08 Summary
• Fisher Vector
• Aggregate features {Xk} in R^D against a GMM
• Super Vector
• Aggregate a GMM against a global GMM (UBM)
• AKULA
• Direct aggregation
Scaled Eigen Appearance and Likelihood Prunning for Large Scale Video Duplica...United States Air Force Academy
Β 

More from United States Air Force Academy (8)

Lec17 sparse signal processing & applications
Lec17 sparse signal processing & applicationsLec17 sparse signal processing & applications
Lec17 sparse signal processing & applications
Β 
Lec-03 Entropy Coding I: Hoffmann & Golomb Codes
Lec-03 Entropy Coding I: Hoffmann & Golomb CodesLec-03 Entropy Coding I: Hoffmann & Golomb Codes
Lec-03 Entropy Coding I: Hoffmann & Golomb Codes
Β 
Multimedia Communication Lec02: Info Theory and Entropy
Multimedia Communication Lec02: Info Theory and EntropyMultimedia Communication Lec02: Info Theory and Entropy
Multimedia Communication Lec02: Info Theory and Entropy
Β 
ECE 4490 Multimedia Communication Lec01
ECE 4490 Multimedia Communication Lec01ECE 4490 Multimedia Communication Lec01
ECE 4490 Multimedia Communication Lec01
Β 
Tutorial on MPEG CDVS/CDVA Standardization at ICNITS L3 Meeting
Tutorial on MPEG CDVS/CDVA Standardization at ICNITS L3 MeetingTutorial on MPEG CDVS/CDVA Standardization at ICNITS L3 Meeting
Tutorial on MPEG CDVS/CDVA Standardization at ICNITS L3 Meeting
Β 
Light Weight Fingerprinting for Video Playback Verification in MPEG DASH
Light Weight Fingerprinting for Video Playback Verification in MPEG DASHLight Weight Fingerprinting for Video Playback Verification in MPEG DASH
Light Weight Fingerprinting for Video Playback Verification in MPEG DASH
Β 
Subspace Indexing on Grassmannian Manifold for Large Scale Visual Identification
Subspace Indexing on Grassmannian Manifold for Large Scale Visual IdentificationSubspace Indexing on Grassmannian Manifold for Large Scale Visual Identification
Subspace Indexing on Grassmannian Manifold for Large Scale Visual Identification
Β 
Scaled Eigen Appearance and Likelihood Prunning for Large Scale Video Duplica...
Scaled Eigen Appearance and Likelihood Prunning for Large Scale Video Duplica...Scaled Eigen Appearance and Likelihood Prunning for Large Scale Video Duplica...
Scaled Eigen Appearance and Likelihood Prunning for Large Scale Video Duplica...
Β 

Recently uploaded

Difference Between Search & Browse Methods in Odoo 17
Difference Between Search & Browse Methods in Odoo 17Difference Between Search & Browse Methods in Odoo 17
Difference Between Search & Browse Methods in Odoo 17Celine George
Β 
ECONOMIC CONTEXT - PAPER 1 Q3: NEWSPAPERS.pptx
ECONOMIC CONTEXT - PAPER 1 Q3: NEWSPAPERS.pptxECONOMIC CONTEXT - PAPER 1 Q3: NEWSPAPERS.pptx
ECONOMIC CONTEXT - PAPER 1 Q3: NEWSPAPERS.pptxiammrhaywood
Β 
Computed Fields and api Depends in the Odoo 17
Computed Fields and api Depends in the Odoo 17Computed Fields and api Depends in the Odoo 17
Computed Fields and api Depends in the Odoo 17Celine George
Β 
Introduction to AI in Higher Education_draft.pptx
Introduction to AI in Higher Education_draft.pptxIntroduction to AI in Higher Education_draft.pptx
Introduction to AI in Higher Education_draft.pptxpboyjonauth
Β 
β€œOh GOSH! Reflecting on Hackteria's Collaborative Practices in a Global Do-It...
β€œOh GOSH! Reflecting on Hackteria's Collaborative Practices in a Global Do-It...β€œOh GOSH! Reflecting on Hackteria's Collaborative Practices in a Global Do-It...
β€œOh GOSH! Reflecting on Hackteria's Collaborative Practices in a Global Do-It...Marc Dusseiller Dusjagr
Β 
POINT- BIOCHEMISTRY SEM 2 ENZYMES UNIT 5.pptx
POINT- BIOCHEMISTRY SEM 2 ENZYMES UNIT 5.pptxPOINT- BIOCHEMISTRY SEM 2 ENZYMES UNIT 5.pptx
POINT- BIOCHEMISTRY SEM 2 ENZYMES UNIT 5.pptxSayali Powar
Β 
Final demo Grade 9 for demo Plan dessert.pptx
Final demo Grade 9 for demo Plan dessert.pptxFinal demo Grade 9 for demo Plan dessert.pptx
Final demo Grade 9 for demo Plan dessert.pptxAvyJaneVismanos
Β 
MICROBIOLOGY biochemical test detailed.pptx
MICROBIOLOGY biochemical test detailed.pptxMICROBIOLOGY biochemical test detailed.pptx
MICROBIOLOGY biochemical test detailed.pptxabhijeetpadhi001
Β 
Blooming Together_ Growing a Community Garden Worksheet.docx
Blooming Together_ Growing a Community Garden Worksheet.docxBlooming Together_ Growing a Community Garden Worksheet.docx
Blooming Together_ Growing a Community Garden Worksheet.docxUnboundStockton
Β 
ESSENTIAL of (CS/IT/IS) class 06 (database)
ESSENTIAL of (CS/IT/IS) class 06 (database)ESSENTIAL of (CS/IT/IS) class 06 (database)
ESSENTIAL of (CS/IT/IS) class 06 (database)Dr. Mazin Mohamed alkathiri
Β 
Organic Name Reactions for the students and aspirants of Chemistry12th.pptx
Organic Name Reactions  for the students and aspirants of Chemistry12th.pptxOrganic Name Reactions  for the students and aspirants of Chemistry12th.pptx
Organic Name Reactions for the students and aspirants of Chemistry12th.pptxVS Mahajan Coaching Centre
Β 
MARGINALIZATION (Different learners in Marginalized Group
MARGINALIZATION (Different learners in Marginalized GroupMARGINALIZATION (Different learners in Marginalized Group
MARGINALIZATION (Different learners in Marginalized GroupJonathanParaisoCruz
Β 
Painted Grey Ware.pptx, PGW Culture of India
Painted Grey Ware.pptx, PGW Culture of IndiaPainted Grey Ware.pptx, PGW Culture of India
Painted Grey Ware.pptx, PGW Culture of IndiaVirag Sontakke
Β 
call girls in Kamla Market (DELHI) πŸ” >ΰΌ’9953330565πŸ” genuine Escort Service πŸ”βœ”οΈβœ”οΈ
call girls in Kamla Market (DELHI) πŸ” >ΰΌ’9953330565πŸ” genuine Escort Service πŸ”βœ”οΈβœ”οΈcall girls in Kamla Market (DELHI) πŸ” >ΰΌ’9953330565πŸ” genuine Escort Service πŸ”βœ”οΈβœ”οΈ
call girls in Kamla Market (DELHI) πŸ” >ΰΌ’9953330565πŸ” genuine Escort Service πŸ”βœ”οΈβœ”οΈ9953056974 Low Rate Call Girls In Saket, Delhi NCR
Β 
18-04-UA_REPORT_MEDIALITERAΠ‘Y_INDEX-DM_23-1-final-eng.pdf
18-04-UA_REPORT_MEDIALITERAΠ‘Y_INDEX-DM_23-1-final-eng.pdf18-04-UA_REPORT_MEDIALITERAΠ‘Y_INDEX-DM_23-1-final-eng.pdf
18-04-UA_REPORT_MEDIALITERAΠ‘Y_INDEX-DM_23-1-final-eng.pdfssuser54595a
Β 
Pharmacognosy Flower 3. Compositae 2023.pdf
Pharmacognosy Flower 3. Compositae 2023.pdfPharmacognosy Flower 3. Compositae 2023.pdf
Pharmacognosy Flower 3. Compositae 2023.pdfMahmoud M. Sallam
Β 

Recently uploaded (20)

OS-operating systems- ch04 (Threads) ...
OS-operating systems- ch04 (Threads) ...OS-operating systems- ch04 (Threads) ...
OS-operating systems- ch04 (Threads) ...
Β 
Difference Between Search & Browse Methods in Odoo 17
Difference Between Search & Browse Methods in Odoo 17Difference Between Search & Browse Methods in Odoo 17
Difference Between Search & Browse Methods in Odoo 17
Β 
ECONOMIC CONTEXT - PAPER 1 Q3: NEWSPAPERS.pptx
ECONOMIC CONTEXT - PAPER 1 Q3: NEWSPAPERS.pptxECONOMIC CONTEXT - PAPER 1 Q3: NEWSPAPERS.pptx
ECONOMIC CONTEXT - PAPER 1 Q3: NEWSPAPERS.pptx
Β 
Computed Fields and api Depends in the Odoo 17
Computed Fields and api Depends in the Odoo 17Computed Fields and api Depends in the Odoo 17
Computed Fields and api Depends in the Odoo 17
Β 
Introduction to AI in Higher Education_draft.pptx
Introduction to AI in Higher Education_draft.pptxIntroduction to AI in Higher Education_draft.pptx
Introduction to AI in Higher Education_draft.pptx
Β 
β€œOh GOSH! Reflecting on Hackteria's Collaborative Practices in a Global Do-It...
β€œOh GOSH! Reflecting on Hackteria's Collaborative Practices in a Global Do-It...β€œOh GOSH! Reflecting on Hackteria's Collaborative Practices in a Global Do-It...
β€œOh GOSH! Reflecting on Hackteria's Collaborative Practices in a Global Do-It...
Β 
POINT- BIOCHEMISTRY SEM 2 ENZYMES UNIT 5.pptx
POINT- BIOCHEMISTRY SEM 2 ENZYMES UNIT 5.pptxPOINT- BIOCHEMISTRY SEM 2 ENZYMES UNIT 5.pptx
POINT- BIOCHEMISTRY SEM 2 ENZYMES UNIT 5.pptx
Β 
Final demo Grade 9 for demo Plan dessert.pptx
Final demo Grade 9 for demo Plan dessert.pptxFinal demo Grade 9 for demo Plan dessert.pptx
Final demo Grade 9 for demo Plan dessert.pptx
Β 
Model Call Girl in Bikash Puri Delhi reach out to us at πŸ”9953056974πŸ”
Model Call Girl in Bikash Puri  Delhi reach out to us at πŸ”9953056974πŸ”Model Call Girl in Bikash Puri  Delhi reach out to us at πŸ”9953056974πŸ”
Model Call Girl in Bikash Puri Delhi reach out to us at πŸ”9953056974πŸ”
Β 
9953330565 Low Rate Call Girls In Rohini Delhi NCR
9953330565 Low Rate Call Girls In Rohini  Delhi NCR9953330565 Low Rate Call Girls In Rohini  Delhi NCR
9953330565 Low Rate Call Girls In Rohini Delhi NCR
Β 
MICROBIOLOGY biochemical test detailed.pptx
MICROBIOLOGY biochemical test detailed.pptxMICROBIOLOGY biochemical test detailed.pptx
MICROBIOLOGY biochemical test detailed.pptx
Β 
Model Call Girl in Tilak Nagar Delhi reach out to us at πŸ”9953056974πŸ”
Model Call Girl in Tilak Nagar Delhi reach out to us at πŸ”9953056974πŸ”Model Call Girl in Tilak Nagar Delhi reach out to us at πŸ”9953056974πŸ”
Model Call Girl in Tilak Nagar Delhi reach out to us at πŸ”9953056974πŸ”
Β 
Blooming Together_ Growing a Community Garden Worksheet.docx
Blooming Together_ Growing a Community Garden Worksheet.docxBlooming Together_ Growing a Community Garden Worksheet.docx
Blooming Together_ Growing a Community Garden Worksheet.docx
Β 
ESSENTIAL of (CS/IT/IS) class 06 (database)
ESSENTIAL of (CS/IT/IS) class 06 (database)ESSENTIAL of (CS/IT/IS) class 06 (database)
ESSENTIAL of (CS/IT/IS) class 06 (database)
Β 
Organic Name Reactions for the students and aspirants of Chemistry12th.pptx
Organic Name Reactions  for the students and aspirants of Chemistry12th.pptxOrganic Name Reactions  for the students and aspirants of Chemistry12th.pptx
Organic Name Reactions for the students and aspirants of Chemistry12th.pptx
Β 
MARGINALIZATION (Different learners in Marginalized Group
MARGINALIZATION (Different learners in Marginalized GroupMARGINALIZATION (Different learners in Marginalized Group
MARGINALIZATION (Different learners in Marginalized Group
Β 
Painted Grey Ware.pptx, PGW Culture of India
Painted Grey Ware.pptx, PGW Culture of IndiaPainted Grey Ware.pptx, PGW Culture of India
Painted Grey Ware.pptx, PGW Culture of India
Β 
call girls in Kamla Market (DELHI) πŸ” >ΰΌ’9953330565πŸ” genuine Escort Service πŸ”βœ”οΈβœ”οΈ
call girls in Kamla Market (DELHI) πŸ” >ΰΌ’9953330565πŸ” genuine Escort Service πŸ”βœ”οΈβœ”οΈcall girls in Kamla Market (DELHI) πŸ” >ΰΌ’9953330565πŸ” genuine Escort Service πŸ”βœ”οΈβœ”οΈ
call girls in Kamla Market (DELHI) πŸ” >ΰΌ’9953330565πŸ” genuine Escort Service πŸ”βœ”οΈβœ”οΈ
Β 
18-04-UA_REPORT_MEDIALITERAΠ‘Y_INDEX-DM_23-1-final-eng.pdf
18-04-UA_REPORT_MEDIALITERAΠ‘Y_INDEX-DM_23-1-final-eng.pdf18-04-UA_REPORT_MEDIALITERAΠ‘Y_INDEX-DM_23-1-final-eng.pdf
18-04-UA_REPORT_MEDIALITERAΠ‘Y_INDEX-DM_23-1-final-eng.pdf
Β 
Pharmacognosy Flower 3. Compositae 2023.pdf
Pharmacognosy Flower 3. Compositae 2023.pdfPharmacognosy Flower 3. Compositae 2023.pdf
Pharmacognosy Flower 3. Compositae 2023.pdf
Β 

Lec-08 Feature Aggregation II: Fisher Vector, AKULA and Super Vector

  • 1. Image Analysis & Retrieval, CS/EE 5590 Special Topics (Class Ids: 44873, 44874), Fall 2016, M/W 4-5:15pm @Bloch0012. Lec 08, Feature Aggregation II: Fisher Vector, Super Vector and AKULA. Zhu Li, Dept of CSEE, UMKC. Office: FH560E, Email: lizhu@umkc.edu, Ph: x 2346. http://l.web.umkc.edu/lizhu. Z. Li, Image Analysis & Retrv. 2016
  • 2. Outline
      - ReCap of Lecture 07
      - Image Retrieval System
      - BoW
      - VLAD
      - Dense SIFT
      - Fisher Vector Aggregation
      - AKULA
      - Summary
  • 3. Precision, Recall, F-measure
      - Precision = TP/(TP + FP): the probability that a retrieved document is relevant.
      - Recall (TPR) = TP/(TP + FN): the probability that a relevant document is retrieved in a search.
      - FPR = FP/(FP + TN)
      - F-measure = 2*(precision*recall)/(precision + recall)
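The retrieval metrics above can be computed from raw confusion counts. The following is a minimal sketch, not part of the lecture; the function name and the example counts are hypothetical, and FPR uses the standard definition FP/(FP + TN).

```python
def retrieval_metrics(tp, fp, fn, tn):
    """Return (precision, recall, fpr, f_measure) from confusion counts."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0      # recall is also the TPR
    fpr = fp / (fp + tn) if (fp + tn) else 0.0
    denom = precision + recall
    f_measure = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, fpr, f_measure


# Hypothetical query: 8 relevant images retrieved, 2 irrelevant retrieved,
# 4 relevant images missed, 86 irrelevant images correctly left out.
p, r, fpr, f = retrieval_metrics(tp=8, fp=2, fn=4, tn=86)
```

Sweeping the retrieval threshold and recording (FPR, TPR) pairs gives the TPR-FPR curves used later in the deck.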
  • 4. Why Aggregation?
      - Curse of Dimensionality
      - Decision Boundary / Indexing
  • 5. Bag-of-Words: Histogram Coding
      Codebook:
      - Feature space: $R^d$, k-means to get k centroids, $\{\mu_1, \mu_2, \dots, \mu_k\}$
      BoW Hard Encoding:
      - For n feature points $\{x_1, x_2, \dots, x_n\}$, the assignment matrix is k x n, with only one non-zero entry per column
      - Aggregated dimension: k
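The hard encoding above can be sketched in a few lines. This is an illustrative example only: the function names are hypothetical, the codebook is assumed to be already trained by k-means, and the final division by n is an added normalization so that images with different feature counts are comparable.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b))


def bow_histogram(features, centroids):
    """Hard-assign each feature to its nearest centroid and count the hits."""
    hist = [0] * len(centroids)
    for x in features:
        nearest = min(range(len(centroids)), key=lambda k: sq_dist(x, centroids[k]))
        hist[nearest] += 1          # one non-zero entry per assignment column
    n = len(features)
    return [h / n for h in hist]    # assumed normalization: fraction per codeword


codebook = [[0.0, 0.0], [10.0, 10.0]]
descriptors = [[1.0, 0.0], [0.0, 1.0], [9.0, 9.0]]
bow = bow_histogram(descriptors, codebook)
```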
  • 6. Kernel Code Book Soft Encoding
      - Kernel Affinity: $K(x_j, \mu_k) = e^{-\gamma \lVert x_j - \mu_k \rVert^2}$, with kernel scale $\gamma > 0$
      - Assignment Matrix: $A_{j,k} = K(x_j, \mu_k) / \sum_{k'} K(x_j, \mu_{k'})$
      - Encoding, k-dimensional: $X(k) = \frac{1}{n} \sum_j A_{j,k}$
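A small sketch of the soft encoding follows, under the assumption that the kernel scale (called `gamma` here, a name not used on the slide) is chosen by the user. Subtracting the smallest distance before exponentiating leaves the normalized assignment unchanged and only avoids underflow.

```python
import math


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b))


def soft_encode(features, centroids, gamma=1.0):
    """Average the normalized kernel affinities of all features per codeword."""
    k = len(centroids)
    code = [0.0] * k
    for x in features:
        dists = [sq_dist(x, c) for c in centroids]
        d_min = min(dists)
        # exp(-gamma * d) up to a common factor that cancels in the ratio
        aff = [math.exp(-gamma * (d - d_min)) for d in dists]
        total = sum(aff)
        for j in range(k):
            code[j] += aff[j] / total
    n = len(features)
    return [c / n for c in code]


codebook = [[0.0, 0.0], [10.0, 10.0]]
soft = soft_encode([[5.0, 5.0], [0.0, 0.0]], codebook, gamma=1.0)
```

The first feature is equidistant from both codewords and contributes 0.5 to each; the second sits on the first codeword and contributes almost entirely to it.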
  • 7. VLAD: Vector of Locally Aggregated Descriptors
      - Aggregate feature difference from the codebook
      - Hard assignment by finding the NN of feature $\{x_j\}$ to $\{\mu_k\}$
      - Compute aggregated differences: $v_k = \sum_{j:\, NN(x_j) = \mu_k} (x_j - \mu_k)$
      - L2 normalize: $v_k = v_k / \lVert v_k \rVert_2$
      - Final feature: k x d
      [Figure: five cells with centroids $\mu_1, \dots, \mu_5$ and residuals $v_1, \dots, v_5$: (1) assign descriptors, (2) compute $x - \mu_i$, (3) $v_i$ = sum of $x - \mu_i$ for cell i]
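The three VLAD steps can be sketched as below. This is a minimal illustration, assuming a pre-trained codebook and the per-cell L2 normalization written on the slide; cells that receive no descriptor are left at zero, which is a choice made here and not stated in the lecture.

```python
import math


def vlad(features, centroids):
    """Return the k*d VLAD vector: per-cell sums of residuals, L2-normalized per cell."""
    k, d = len(centroids), len(centroids[0])
    v = [[0.0] * d for _ in range(k)]
    for x in features:
        # Step 1: hard-assign the descriptor to its nearest centroid
        nn = min(range(k),
                 key=lambda c: sum((xi - ci) ** 2 for xi, ci in zip(x, centroids[c])))
        # Steps 2 and 3: accumulate the residual x - mu for that cell
        for t in range(d):
            v[nn][t] += x[t] - centroids[nn][t]
    for row in v:
        norm = math.sqrt(sum(e * e for e in row))
        if norm > 0.0:                      # empty cells stay all-zero
            for t in range(d):
                row[t] /= norm
    return [e for row in v for e in row]    # concatenate the k cells


codebook = [[0.0, 0.0], [10.0, 10.0]]
descriptors = [[1.0, 0.0], [3.0, 0.0], [10.0, 12.0]]
code = vlad(descriptors, codebook)
```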
  • 8. VLAD on SIFT
      - Example of aggregating SIFT with VLAD
      - K=16 codebook entries
      - Each cell is a SIFT visualized as centroids in blue, and VLAD difference in red
      - Top row: left image, bottom row: right image, red: code book, blue: encoded VLAD
  • 9. Outline
      - ReCap of Lecture 07
      - Image Retrieval System
      - BoW
      - VLAD
      - Dense SIFT
      - Fisher Vector Aggregation
      - AKULA
      - Summary
  • 10. One more trick
      - Recall that SIFT is a powerful descriptor
      - VL_FEAT: vl_dsift
      - A dense description of the image, computing the SIFT descriptor at a predetermined grid (no spatial-scale space extrema detection)
      - Supplements HoG as an alternative texture descriptor
  • 11. VL_FEAT: vl_dsift
      - Compute dense SIFT as a texture descriptor for the image:
        [f, dsift] = vl_dsift(single(rgb2gray(im)), 'step', 2);
      - There is also a FAST option:
        [f, dsift] = vl_dsift(single(rgb2gray(im)), 'fast', 'step', 2);
      - A huge amount of SIFT data will be generated
  • 12. Fisher Vector
      - Fisher Vector and variations:
      - Winning in image classification
      - Winning in the MPEG object re-identification:
          o SCFV (Scalable Coded Fisher Vector) in CDVS
  • 13. Codebook: Gaussian Mixture Model (GMM)
      - GMM is a generative model to express data
      - Assuming the data is generated from a mixture of K Gaussians with parameters $\{w_k, \mu_k, \sigma_k\}$:
        $x \sim \sum_{k=1}^{K} w_k N(\mu_k, \sigma_k)$
        $N(x; \mu_k, \Sigma_k) = \frac{1}{(2\pi)^{d/2} |\Sigma_k|^{1/2}} e^{-\frac{1}{2} (x - \mu_k)' \Sigma_k^{-1} (x - \mu_k)}$
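To make the mixture density concrete, here is a small sketch that evaluates it for a single point. It assumes diagonal covariances (each component is described by a per-dimension variance list), which is a simplification of the general covariance in the formula above; all names are illustrative.

```python
import math


def diag_gaussian_pdf(x, mean, var):
    """Density of N(mean, diag(var)) at x, evaluated in the log domain."""
    d = len(x)
    log_det = sum(math.log(v) for v in var)
    maha = sum((xi - mi) ** 2 / vi for xi, mi, vi in zip(x, mean, var))
    return math.exp(-0.5 * (d * math.log(2.0 * math.pi) + log_det + maha))


def gmm_pdf(x, weights, means, variances):
    """Weighted sum of the component densities."""
    return sum(w * diag_gaussian_pdf(x, m, v)
               for w, m, v in zip(weights, means, variances))


# Two identical 1-D standard normal components: the mixture equals one of them.
density = gmm_pdf([0.0], [0.5, 0.5], [[0.0], [0.0]], [[1.0], [1.0]])
```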
  • 14. A bit of Theory: Fisher Kernel
      Encode the deviation from the generative model
      - Observed feature set $\{x_1, x_2, \dots, x_n\}$ in $R^d$, e.g., d=128 for SIFT.
      - How do these observations deviate from the given GMM model with the parameter set $\lambda = \{w_k, \mu_k, \sigma_k\}$?
          o i.e., how would a parameter, e.g., the mean, move to best fit the observations?
      [Figure: GMM means $\mu_1, \dots, \mu_4$ and an observed point $X_1$]
  • 15. A bit of Theory: Fisher Kernel
      Score function w.r.t. the likelihood function $u_\lambda(X)$
      - $G_\lambda(X) = \nabla_\lambda \log u_\lambda(X)$: derivative of the log-likelihood
      - The dimension of the score function is m, where m is the number of generative model parameters, m=3 for GMM
      - Given the observed data X, the score function indicates how the likelihood function parameters (e.g., the mean) should move to better fit the data.
      Distance/deviation of two observations X, Y w.r.t. the generative model
      - Fisher Info Matrix (roughly the covariance in the Mahalanobis distance): $F_\lambda = E_X[G_\lambda(X)\, G_\lambda(X)']$
      - Fisher Kernel Distance, normalized by the Fisher Info Matrix: $K_{FK}(X, Y) = G_\lambda(X)'\, F_\lambda^{-1}\, G_\lambda(Y)$
  • 16. Fisher Vector
      - $K_{FK}(X, Y)$ is a measure of similarity w.r.t. the generative model
      - Similar to the Mahalanobis distance case, we can decompose this kernel with $F_\lambda^{-1} = L_\lambda' L_\lambda$:
        $K_{FK}(X, Y) = G_\lambda(X)'\, F_\lambda^{-1}\, G_\lambda(Y) = G_\lambda(X)'\, L_\lambda' L_\lambda\, G_\lambda(Y)$
      - That gives us a kernel feature mapping of X to the Fisher Vector $L_\lambda G_\lambda(X)$
      - For observed image features $\{x_t\}$, it can be computed as: [equation on slide, not captured in the transcript]
  • 17. GMM Fisher Vector
      Encode the deviation from the generative model
      - Observed feature set $\{x_1, x_2, \dots, x_n\}$ in $R^d$, e.g., d=128 (!) for SIFT.
      - How do these observations deviate from the given GMM model with the parameter set $\theta = \{a_k, \mu_k, \sigma_k\}$?
      - GMM Log Likelihood Gradient
      - Let $w_k = e^{a_k} / \sum_j e^{a_j}$; then we have the gradients w.r.t. weight, mean and variance: [equations on slide, not captured in the transcript]
  • 18. GMM Fisher Vector
      VL_FEAT implementation
      - GMM codebook
      - For a K-component GMM, we only allow 3K parameters, $\{\pi_k, \mu_k, \sigma_k \mid k = 1..K\}$, i.e., i.i.d. Gaussian components:
        $\Sigma_k = \mathrm{diag}(\sigma_k, \sigma_k, \dots, \sigma_k)$
      - Posterior prob of feature point $x_i$ to GMM component k: [equation on slide, not captured in the transcript]
  • 19. GMM Fisher Vector
      VL_FEAT implementation
      - FV encoding
      - Gradient on the mean, for GMM component k, j=1..D: [equation on slide, not captured in the transcript]
      - In the end, we have a 2K x D aggregation of the deviations w.r.t. the means and variances:
        $FV = [u_1, u_2, \dots, u_K, v_1, v_2, \dots, v_K]$
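The slide's gradient formulas did not survive in the transcript, so the sketch below follows the mean and variance expressions from the VLFeat Fisher vector documentation as best recalled: $u_{jk} = \frac{1}{N\sqrt{\pi_k}} \sum_i q_{ik} \frac{x_{ji} - \mu_{jk}}{\sigma_{jk}}$ and $v_{jk} = \frac{1}{N\sqrt{2\pi_k}} \sum_i q_{ik} \left[ \left( \frac{x_{ji} - \mu_{jk}}{\sigma_{jk}} \right)^2 - 1 \right]$, with $q_{ik}$ the posterior of component k for feature i. Diagonal covariances are assumed, `variances` holds $\sigma^2$, and no further normalization is applied. Treat it as an illustration, not as the VLFeat implementation.

```python
import math


def fisher_vector(features, priors, means, variances):
    """Return [u_1..u_K, v_1..v_K] flattened to a list of length 2*K*D."""
    n, K, D = len(features), len(priors), len(means[0])
    u = [[0.0] * D for _ in range(K)]
    v = [[0.0] * D for _ in range(K)]
    for x in features:
        # Posterior q_k of each component for x, computed in the log domain
        logp = []
        for k in range(K):
            lp = math.log(priors[k])
            for j in range(D):
                lp -= 0.5 * (math.log(2.0 * math.pi * variances[k][j])
                             + (x[j] - means[k][j]) ** 2 / variances[k][j])
            logp.append(lp)
        top = max(logp)
        e = [math.exp(lp - top) for lp in logp]
        total = sum(e)
        q = [ei / total for ei in e]
        # Accumulate first- and second-order statistics of the whitened residual
        for k in range(K):
            for j in range(D):
                z = (x[j] - means[k][j]) / math.sqrt(variances[k][j])
                u[k][j] += q[k] * z
                v[k][j] += q[k] * (z * z - 1.0)
    for k in range(K):
        for j in range(D):
            u[k][j] /= n * math.sqrt(priors[k])
            v[k][j] /= n * math.sqrt(2.0 * priors[k])
    return [e for row in u for e in row] + [e for row in v for e in row]


# Single 1-D standard normal component and one observation at x = 2.
fv = fisher_vector([[2.0]], [1.0], [[0.0]], [[1.0]])
```

With one component the posterior is always 1, so the mean part is the whitened residual itself and the variance part is $(z^2 - 1)/\sqrt{2}$.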
  • 20. VL_FEAT GMM/FV API
      - Compute a GMM model with VL_FEAT
      - Prepare data:
        numPoints = 1000 ;
        dimension = 2 ;
        data = rand(dimension, numPoints) ;
      - Call vl_gmm:
        numClusters = 30 ;
        [means, covariances, priors] = vl_gmm(data, numClusters) ;
      - Visualize:
        figure ; hold on ;
        plot(data(1,:), data(2,:), 'r.') ;
        for i=1:numClusters
          vl_plotframe([means(:,i)' covariances(1,i) 0 covariances(2,i)]) ;
        end
  • 21. VL_FEAT API
      - FV encoding:
        encoding = vl_fisher(datatoBeEncoded, means, covariances, priors);
      - Bonus points:
          o Encode HoG features with Fisher Vector?
          o Randomly collect 2~3 images from each class
          o Stack all HoG features together into an n x 36 data matrix
          o Compute its GMM
          o Use this GMM to encode all image HoG features (rather than averaging)
  • 22. Super Vector Aggregation: Speaker ID
      - Fisher Vector: aggregates features against a GMM
      - Super Vector: aggregates a GMM against a GMM
      - Ref:
          o William M. Campbell, Douglas E. Sturim, Douglas A. Reynolds: Support vector machines using GMM supervectors for speaker verification. IEEE Signal Process. Lett. 13(5): 308-311 (2006)
      "Yes, We Can!" ?
  • 23. Super Vector from MFCC
      - Motivated by Speaker ID work
      - Speech is a continuous evolution of the vocal tract
      - Need to extract a sequence of spectra or a sequence of spectral coefficients
      - Use a sliding window: 25 ms window, 10 ms shift
      [Figure: MFCC pipeline, |X(ω)| → Log → DCT → MFCC]
  • 24. GMM Model from MFCC
      - GMM on MFCC feature
      - The acoustic vectors (MFCC) of speaker s are modeled by a prob. density function parameterized by $\Lambda^{(s)} = \{w_j^{(s)}, \mu_j^{(s)}, \Sigma_j^{(s)}\}_{j=1}^{M}$
      - Gaussian mixture model (GMM) for speaker s: $p(x \mid \Lambda^{(s)}) = \sum_{j=1}^{M} w_j^{(s)}\, p(x \mid \mu_j^{(s)}, \Sigma_j^{(s)})$
  • 25. Universal Background Model
      - The acoustic vectors of a general population are modeled by another GMM called the universal background model (UBM):
        $p(x \mid \Lambda^{(ubm)}) = \sum_{j=1}^{M} w_j^{(ubm)}\, p(x \mid \mu_j^{(ubm)}, \Sigma_j^{(ubm)})$
      - Parameters of the UBM: $\Lambda^{(ubm)} = \{w_j^{(ubm)}, \mu_j^{(ubm)}, \Sigma_j^{(ubm)}\}_{j=1}^{M}$
  • 26. MAP Adaptation
      - Given the UBM GMM, how does the new observation deviate from it?
      - The adapted mean is given by: [equation on slide, not captured in the transcript]
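Since the adapted-mean formula is missing from the transcript, the sketch below uses the commonly cited mean-only relevance MAP update from the speaker verification literature: $\hat{\mu}_j = \alpha_j E_j(x) + (1 - \alpha_j)\, \mu_j^{(ubm)}$, with $\alpha_j = n_j / (n_j + r)$, $n_j$ the soft count of component j and $E_j(x)$ the posterior-weighted mean of the data. This may differ from what the slide showed. Diagonal covariances, the relevance factor `r` and its default value are assumptions made for the example.

```python
import math


def posteriors(x, weights, means, variances):
    """Posterior probability of each diagonal-covariance component given x."""
    logp = []
    for w, m, var in zip(weights, means, variances):
        lp = math.log(w)
        for xi, mi, vi in zip(x, m, var):
            lp -= 0.5 * (math.log(2.0 * math.pi * vi) + (xi - mi) ** 2 / vi)
        logp.append(lp)
    top = max(logp)
    e = [math.exp(lp - top) for lp in logp]
    total = sum(e)
    return [ei / total for ei in e]


def map_adapt_means(features, weights, means, variances, r=16.0):
    """Shift each UBM mean toward the data, in proportion to its soft count."""
    K, D = len(means), len(means[0])
    n = [0.0] * K
    acc = [[0.0] * D for _ in range(K)]
    for x in features:
        g = posteriors(x, weights, means, variances)
        for j in range(K):
            n[j] += g[j]
            for t in range(D):
                acc[j][t] += g[j] * x[t]
    adapted = []
    for j in range(K):
        if n[j] == 0.0:                       # no evidence: keep the UBM mean
            adapted.append(list(means[j]))
            continue
        alpha = n[j] / (n[j] + r)
        adapted.append([alpha * acc[j][t] / n[j] + (1.0 - alpha) * means[j][t]
                        for t in range(D)])
    return adapted


# One component at 0, two observations at 2, relevance factor 2: alpha = 0.5.
new_means = map_adapt_means([[2.0], [2.0]], [1.0], [[0.0]], [[1.0]], r=2.0)
```

Components that see little data keep means close to the UBM, which is what makes the stacked adapted means comparable across utterances.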
  • 27. Supervector Distance
      - Assume we have the UBM GMM model $\lambda_{UBM} = \{P_k, \mu_k, \Sigma_k\}$, with identical prior and covariance
      - Then for two utterance samples a and b, with GMM models
          o $\lambda_a = \{P_k, \mu_k^a, \Sigma_k\}$,
          o $\lambda_b = \{P_k, \mu_k^b, \Sigma_k\}$,
      - The SV distance is:
        $K(\lambda_a, \lambda_b) = \sum_k \left( \sqrt{P_k}\, \Sigma_k^{-1/2} \mu_k^a \right)^T \left( \sqrt{P_k}\, \Sigma_k^{-1/2} \mu_k^b \right)$
      - It means the means of the two models need to be normalized by the Mahalanobis distance metric induced by the UBM covariance
      - This is also a linear kernel function scaled by the UBM covariances
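The kernel above is a plain dot product once each adapted mean is scaled by $\sqrt{P_k}\, \Sigma_k^{-1/2}$ and the results are stacked. The sketch below shows this under the assumption of diagonal covariances (one variance per dimension); function names are hypothetical.

```python
import math


def supervector(priors, means, variances):
    """Stack sqrt(P_k) * Sigma_k^(-1/2) * mu_k over all components."""
    sv = []
    for p, m, var in zip(priors, means, variances):
        sv.extend(math.sqrt(p) * mi / math.sqrt(vi) for mi, vi in zip(m, var))
    return sv


def sv_kernel(priors, variances, means_a, means_b):
    """Linear kernel between the two normalized supervectors."""
    a = supervector(priors, means_a, variances)
    b = supervector(priors, means_b, variances)
    return sum(x * y for x, y in zip(a, b))


# Shared UBM priors and variances, utterance-specific adapted means.
priors = [0.5, 0.5]
variances = [[1.0], [4.0]]
k_ab = sv_kernel(priors, variances, [[2.0], [4.0]], [[1.0], [2.0]])
```

Because the mapping is explicit, the supervectors can be fed directly to a linear SVM, which is how the cited Campbell et al. paper uses them.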
  • 28. Supervector Performance in NIST Speaker ID
      - System 5: Gaussian SV
      - DCF (Detection Cost Function)
  • 29. m31491 AKULA: Adaptive KLUster Aggregation. 2013/10/25. Abhishek Nagar, Zhu Li, Gaurav Srivastava and Kyungmo Park
  • 30. Outline
      - Motivation
      - Adaptive Aggregation
      - Results with TM7
      - Summary
  • 31. Motivation
      Better Aggregation
      - Fisher Vector and VLAD type aggregations depend on a global model
      - AKULA removes this dependence and directly codes the cluster centroids and SIFT counts
      - SCFV/RVD both have situations where clusters are turned off due to no assignment; this can be avoided in AKULA
      [Figure: SIFT detection & selection → K-means → AKULA description]
  • 32. Motivation
      Better Subspace Choice
      - Both SCFV and RVD do fixed normalization and PCA projection based on heuristics.
      - What is the best possible subspace to do the aggregation?
      - Use a boosting scheme to keep adding subspaces and aggregations in an iterative fashion, and tune TPR-FPR to the desired operating points on FPR.
  • 33. CE2: AKULA, Adaptive KLUster Aggregation
      AKULA Descriptor: cluster centroids + SIFT count
      - $A_1 = \{yc^1_1, yc^1_2, \dots, yc^1_k;\ pc^1_1, pc^1_2, \dots, pc^1_k\}$
      - $A_2 = \{yc^2_1, yc^2_2, \dots, yc^2_k;\ pc^2_1, pc^2_2, \dots, pc^2_k\}$
      Distance metric:
      - Min centroids distance, weighted by SIFT count:
        $d(A_1, A_2) = \frac{1}{k} \sum_{j=1}^{k} d^1_{min}(j)\, w^1_{min}(j) + \frac{1}{k} \sum_{i=1}^{k} d^2_{min}(i)\, w^2_{min}(i)$
      - $d^1_{min}(j) = \min_i d_{j,i}$, $\quad d^2_{min}(i) = \min_j d_{j,i}$
      - $w^1_{min}(j) = w_{j,i^*}$, $i^* = \arg\min_i d_{j,i}$
      - $w^2_{min}(i) = w_{j^*,i}$, $j^* = \arg\min_j d_{j,i}$
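The two-way minimum-distance sum can be sketched as follows. The slide does not define $d_{j,i}$ or $w_{j,i}$, so the example assumes $d_{j,i}$ is the Euclidean distance between centroid j of $A_1$ and centroid i of $A_2$, and it takes the weight as a user-supplied function of the two SIFT counts; the default weight (geometric mean of the counts) is purely hypothetical.

```python
import math


def euclid(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((ai - bi) ** 2 for ai, bi in zip(a, b)))


def akula_distance(cent1, cnt1, cent2, cnt2, weight=None):
    """Two-way min-centroid distance, each term weighted via the SIFT counts."""
    if weight is None:
        weight = lambda p, q: math.sqrt(p * q)     # hypothetical weighting choice
    k1, k2 = len(cent1), len(cent2)
    d = [[euclid(a, b) for b in cent2] for a in cent1]    # d[j][i]
    w = [[weight(p, q) for q in cnt2] for p in cnt1]      # w[j][i]
    forward = 0.0
    for j in range(k1):                            # A1 centroid -> nearest in A2
        i_star = min(range(k2), key=lambda i: d[j][i])
        forward += d[j][i_star] * w[j][i_star]
    backward = 0.0
    for i in range(k2):                            # A2 centroid -> nearest in A1
        j_star = min(range(k1), key=lambda j: d[j][i])
        backward += d[j_star][i] * w[j_star][i]
    return forward / k1 + backward / k2


a1 = ([[0.0, 0.0], [10.0, 0.0]], [1, 1])
a2 = ([[0.0, 3.0], [10.0, 4.0]], [1, 1])
dist = akula_distance(a1[0], a1[1], a2[0], a2[1])
```

Dropping the second sum gives the one-way variant that later slides refer to as dmin1.*wt.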
  • 34. AKULA implementation in TM7
      Inner loop aggregation
      - Dimension is fixed at 8
      - Number of clusters, nc = 8, 16, 32, to hit 64, 128, and 256 bytes
      - Quantization: scale by ½ and quantize to int8; the SIFT count is 8 bits; total (nc+1)*dim bytes per aggregation
  • 35. AKULA implementation in TM7
      Outer loop subspace optimization by boosting
      - Initial set of subspace models {Ak} computed from MIR FLICKR data set SIFT extractions, by k-means clustering of the space into 4096 clusters
      - Iterative search on subspaces to generate AKULA aggregations that can improve precision-recall performance
      - Notice that aggregation is de-coupled in the subspace iteration, to allow more DoF in aggregation and to find subspaces that provide complementary info.
      The algorithm is still being debugged, hence only 1st iteration results are available in TM7
  • 36. AKULA implementation in TM7
      Outer loop subspace optimization by boosting
      - Initial set of subspace models {Ak} computed from MIR FLICKR data set SIFT extractions, by k-means clustering of the space into 4096 clusters
      - Iterative search on subspaces to generate AKULA aggregations that can improve precision-recall performance
      - Notice that aggregation is de-coupled in the subspace iteration, to allow more DoF in aggregation and to find subspaces that provide complementary info.
      The algorithm is still being debugged, hence only 1st iteration results are available in TM7
      Indexing/Hashing is required for AKULA; matching involves nc x dim multiplications and additions at this time. A binarization scheme will be considered once its performance is optimized in non-binary form.
  • 37. GD Only TPR-FPR: AKULA vs SCFV
      Data set 1:
      - AKULA (128 bytes, dim=8, nc=16) distance is just 1-way dmin1.*wt
      - Forcing a weighted sum on SCFV (512 bytes) Hamming distances without 2D decision fitting, i.e., count the Hamming distance between common active clusters, and sum up their distances
  • 38. GD Only TPR-FPR: AKULA vs SCFV
      Data sets 2, 3:
      - AKULA distance is just 1-way dmin1.*wt
      - AKULA = 128 bytes, SCFV = 512 bytes.
  • 39. 3D object sets: 4, 5
      Data sets 4, 5:
  • 40. AKULA in PM
      FPR performance:
      AKULA rates:
        pm rates   m    akula rates
        512        8    64
        1K         16   128
        2K         16   128
        1K_4K      16   128
        2K_4K      16   128
        4K         16   128
        8K         32   256
        16K        32   256
  • 41. TPR@1% FPR [Figure: bar charts of TPR (%) per data set (1a, 1b, 1c, 2, 3, 4, 5), TM7 vs AKULA, at bitrates 512 and 1k]
  • 42. TPR@1% FPR [Figure: bar charts of TPR (%) per data set (1a, 1b, 1c, 2, 3, 4, 5), TM7 vs AKULA, at bitrates 2k and 1k-4k]
  • 43. TPR@1% FPR [Figure: bar charts of TPR (%) per data set (1a, 1b, 1c, 2, 3, 4, 5), TM7 vs AKULA, at bitrates 2k-4k and 4k]
  • 44. TPR@1% FPR [Figure: bar charts of TPR (%) per data set (1a, 1b, 1c, 2, 3, 4, 5), TM7 vs AKULA, at bitrates 8k and 16k]
  • 45. AKULA Localization
      Quite some improvements: 2.7%
  • 46. AKULA Summary
      Benefits:
      - Allows more DoF in aggregation optimization,
          o by an outer loop boosting scheme for subspace projection optimization
          o and an inner loop adaptive clustering without the constraint of the global GMM model
      - Simple weighted distance sum metric, with no need to tune a multi-dimensional decision boundary
      - The overall pairwise matching matches TM7 SCFV with its 2-dimensional decision boundary
      - In GD-only matching, it outperforms the TM7 GD
      - Good improvements to the localization accuracy
      - Light in extraction, but still heavy in pairwise matching; needs a binarization scheme and/or indexing scheme to work for retrieval
      Future Improvements:
      - Supervector AKULA?
  • 47. Lec 08 Summary
      - Fisher Vector
          o Aggregate features $\{X_k\}$ in $R^D$ against a GMM
      - Super Vector
          o Aggregate a GMM against a global GMM (UBM)
      - AKULA
          o Direct aggregation