Large-scale visual recognition
Novel patch aggregation mechanisms
Florent Perronnin, XRCE
Hervé Jégou, INRIA
CVPR tutorial on Large-Scale Visual Recognition
June 16, 2012
Motivation
For large-scale visual recognition, we need image signatures which contain
fine-grained information:
• in retrieval: the larger the dataset, the higher the probability of finding
another similar but irrelevant image for a given query
• in classification: the larger the number of other classes, the higher the
probability of finding a class which is similar to any given class
BOV answer to the problem: increase the visual vocabulary size
→ see previous part on scaling visual vocabularies
How to increase the amount of information without increasing the visual
vocabulary size?
Motivation
BOV is only about counting the number of local descriptors assigned to
each Voronoi region
Why not include other statistics? For instance:
• mean of local descriptors
• (co)variance of local descriptors
http://www.cs.utexas.edu/~grauman/courses/fall2009/papers/bag_of_visual_words.pdf
Outline
A first example: the VLAD
The Fisher Vector
Other higher-order representations
Example results
A first example: the VLAD
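Given a codebook $\{\mu_i, i = 1 \dots N\}$,
e.g. learned with K-means, and a set of
local descriptors $X = \{x_t, t = 1 \dots T\}$:
• assign: $NN(x_t) = \arg\min_{\mu_i} \|x_t - \mu_i\|$
• compute: $v_i = \sum_{x_t : NN(x_t) = \mu_i} (x_t - \mu_i)$
• concatenate the $v_i$'s + normalize
Jégou, Douze, Schmid and Pérez, “Aggregating local descriptors into a compact image representation”, CVPR’10.
[Figure: five Voronoi cells with centroids µ1 to µ5 and aggregated residuals v1 to v5. Steps shown: assign descriptors; compute x − µi; vi = sum of x − µi for cell i.]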
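The three VLAD steps (assign, accumulate residuals, concatenate and normalize) can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: the function name `vlad`, the toy codebook and the toy descriptors are assumptions made for the example, and plain L2 normalization is assumed for the final step.

```python
import math

def vlad(descriptors, codebook):
    """Aggregate local descriptors into a VLAD vector (hard assignment)."""
    dim = len(codebook[0])
    # One residual accumulator v_i per centroid mu_i.
    residuals = [[0.0] * dim for _ in codebook]
    for x in descriptors:
        # Assign x to its nearest centroid (squared Euclidean distance).
        nearest = min(
            range(len(codebook)),
            key=lambda i: sum((a - b) ** 2 for a, b in zip(x, codebook[i])),
        )
        # Accumulate the residual x - mu_i in the cell of that centroid.
        for d in range(dim):
            residuals[nearest][d] += x[d] - codebook[nearest][d]
    # Concatenate the v_i's and L2-normalize the result (assumed normalization).
    flat = [value for v in residuals for value in v]
    norm = math.sqrt(sum(value * value for value in flat))
    return [value / norm for value in flat] if norm > 0 else flat

# Toy data (hypothetical): 2 centroids in 2-D, 3 local descriptors.
codebook = [[0.0, 0.0], [10.0, 10.0]]
descriptors = [[1.0, 0.0], [0.0, 1.0], [9.0, 10.0]]
signature = vlad(descriptors, codebook)
```

The signature has one D-dimensional block per centroid, so its length is N × D regardless of how many descriptors the image contains.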
A first example: the VLAD
A graphical representation of the $v_i$ [figure not recovered]
Jégou, Douze, Schmid and Pérez, “Aggregating local descriptors into a compact image representation”, CVPR’10.
A first example: the VLAD
But in which sense is the VLAD optimal?
Could we add other (higher-order) statistics?
The Fisher vector
Score function
Given a likelihood function $u_\lambda$ with parameters $\lambda$, the score function of a
given sample $X$ is given by:
$G_\lambda^X = \nabla_\lambda \log u_\lambda(X)$
→ Fixed-length vector whose dimensionality depends only on the number of
parameters.
Intuition: direction in which the parameters $\lambda$ of the model should be
modified to better fit the data.
F. Perronnin, H. Jégou, CVPR tutorial on Large-Scale Visual Recognition, June 16, 2012.
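As a small worked illustration of the score function, consider a 1-D Gaussian with parameters (µ, σ), a model chosen here only for simplicity and not taken from the slides. The helper name `gaussian_score` is hypothetical. The point of the sketch is that the score always has one entry per parameter, whatever the number of samples.

```python
def gaussian_score(samples, mu, sigma):
    """Gradient of the log-likelihood of i.i.d. samples w.r.t. (mu, sigma)."""
    # d/d mu    log N(x; mu, sigma) = (x - mu) / sigma^2
    d_mu = sum((x - mu) / sigma ** 2 for x in samples)
    # d/d sigma log N(x; mu, sigma) = ((x - mu)^2 - sigma^2) / sigma^3
    d_sigma = sum(((x - mu) ** 2 - sigma ** 2) / sigma ** 3 for x in samples)
    return [d_mu, d_sigma]

# Two samples or five samples: the score has 2 entries in both cases.
short_score = gaussian_score([1.0, 3.0], mu=0.0, sigma=1.0)
long_score = gaussian_score([1.0, 3.0, -2.0, 0.5, 4.0], mu=0.0, sigma=1.0)
```

In the first call both samples lie above the mean and are more spread out than the model expects, so both gradient entries are positive: increasing µ and σ would fit the data better.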
The Fisher vector
Fisher information matrix
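Fisher information matrix (FIM) or negative expected Hessian of the log-likelihood:
$F_\lambda = E_{x \sim u_\lambda}\left[ G_\lambda^x \, {G_\lambda^x}^\top \right]$
Measure the similarity between two samples $X$ and $Y$ using the Fisher Kernel (FK):
$K(X, Y) = {G_\lambda^X}^\top F_\lambda^{-1} G_\lambda^Y$
→ can be interpreted as a score whitening
As the FIM is PSD, its inverse can be decomposed as:
$F_\lambda^{-1} = L_\lambda^\top L_\lambda$
and the FK can be rewritten as a dot product between Fisher Vectors (FV):
$K(X, Y) = {\mathcal{G}_\lambda^X}^\top \mathcal{G}_\lambda^Y$ with $\mathcal{G}_\lambda^X = L_\lambda G_\lambda^X$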
Jaakkola and Haussler, “Exploiting generative models in discriminative classifiers”, NIPS’98.
The Fisher vector
Application to images
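$X = \{x_t, t = 1 \dots T\}$ is the set of T i.i.d. D-dim local descriptors (e.g.
SIFT) extracted from an image:
$G_\lambda^X = \frac{1}{T} \sum_{t=1}^{T} \nabla_\lambda \log u_\lambda(x_t)$
→ average pooling is a direct consequence of the independence assumption
$u_\lambda(x) = \sum_{i=1}^{N} w_i u_i(x)$ is a Gaussian Mixture Model (GMM)
with parameters $\lambda = \{w_i, \mu_i, \Sigma_i, i = 1 \dots N\}$ trained on a large set of
local descriptors → a probabilistic visual vocabulary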
Perronnin and Dance, “Fisher kernels on visual categories for image categorization”, CVPR’07.
The Fisher vector
Relationship with the BOV
FV formulas (standard form for a GMM with diagonal covariances $\Sigma_i = \mathrm{diag}(\sigma_i^2)$):
• gradient wrt w:
$\mathcal{G}_{w,i}^X = \frac{1}{T \sqrt{w_i}} \sum_{t=1}^{T} \left( \gamma_t(i) - w_i \right)$
→ soft BOV
$\gamma_t(i) = \frac{w_i u_i(x_t)}{\sum_{j=1}^{N} w_j u_j(x_t)}$ = soft-assignment of patch t to Gaussian i
• gradient wrt µ and σ:
$\mathcal{G}_{\mu,i}^X = \frac{1}{T \sqrt{w_i}} \sum_{t=1}^{T} \gamma_t(i) \, \frac{x_t - \mu_i}{\sigma_i}$
$\mathcal{G}_{\sigma,i}^X = \frac{1}{T \sqrt{2 w_i}} \sum_{t=1}^{T} \gamma_t(i) \left[ \frac{(x_t - \mu_i)^2}{\sigma_i^2} - 1 \right]$
→ compared to the BOV, includes higher-order statistics (up to order 2)
Let us denote: D = feature dim, N = number of Gaussians
• BOV = N-dim
• FV = 2DN-dim (gradients wrt µ and σ)
⇒ FV much higher-dim than BOV for a given visual vocabulary size
⇒ FV much faster to compute than BOV for a given signature dim (far fewer Gaussians are needed)
[Figure: soft-assignment of a patch to four Gaussians, with weights 0.5, 0.4, 0.05 and 0.05.]
Perronnin and Dance, “Fisher kernels on visual categories for image categorization”, CVPR’07.
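The gradients with respect to the means and standard deviations can be sketched as follows for a GMM with diagonal covariances. This is a minimal sketch under stated assumptions, not the authors' code: the function name `fisher_vector`, the toy GMM and the toy patches are hypothetical, the normalizers 1/(T√w_i) and 1/(T√(2w_i)) follow the usual closed-form approximation of the Fisher information matrix, and the gradient with respect to the mixture weights is left out so that the output is 2DN-dimensional.

```python
import math

def fisher_vector(descriptors, weights, means, sigmas):
    """FV of a descriptor set under a diagonal-covariance GMM.

    weights[i], means[i][d] and sigmas[i][d] (standard deviations) are the
    GMM parameters. Returns the 2*D*N gradient w.r.t. means and sigmas.
    """
    n, dim, total = len(weights), len(means[0]), len(descriptors)
    grad_mu = [[0.0] * dim for _ in range(n)]
    grad_sigma = [[0.0] * dim for _ in range(n)]
    for x in descriptors:
        # log(w_i * u_i(x)) for each Gaussian i.
        logp = []
        for i in range(n):
            ll = math.log(weights[i])
            for d in range(dim):
                z = (x[d] - means[i][d]) / sigmas[i][d]
                ll -= 0.5 * z * z + math.log(sigmas[i][d]) + 0.5 * math.log(2 * math.pi)
            logp.append(ll)
        # Soft assignment gamma_t(i), computed stably by subtracting the max.
        top = max(logp)
        post = [math.exp(v - top) for v in logp]
        norm = sum(post)
        for i in range(n):
            gamma = post[i] / norm
            for d in range(dim):
                z = (x[d] - means[i][d]) / sigmas[i][d]
                grad_mu[i][d] += gamma * z                 # first-order statistics
                grad_sigma[i][d] += gamma * (z * z - 1.0)  # second-order statistics
    fv = []
    for i in range(n):
        fv += [g / (total * math.sqrt(weights[i])) for g in grad_mu[i]]
    for i in range(n):
        fv += [g / (total * math.sqrt(2 * weights[i])) for g in grad_sigma[i]]
    return fv

# Toy GMM (hypothetical): N = 2 Gaussians in D = 2 dimensions.
weights = [0.5, 0.5]
means = [[0.0, 0.0], [4.0, 4.0]]
sigmas = [[1.0, 1.0], [1.0, 1.0]]
patches = [[0.2, -0.1], [3.9, 4.3], [0.5, 0.4]]
fv = fisher_vector(patches, weights, means, sigmas)  # 2 * D * N = 8 values
```

With N = 2 and D = 2 the BOV histogram would have 2 bins while this FV has 8 entries, which illustrates why a much smaller vocabulary is enough to reach a given signature size.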
The Fisher vector
Dimensionality reduction on local descriptors
Perform PCA on local descriptors:
→ uncorrelated features are more consistent with the diagonal assumption of
covariance matrices in the GMM
→ the FK performs whitening and enhances low-energy (possibly noisy) dimensions
Jégou, Perronnin, Douze, Sánchez, Pérez and Schmid, “Aggregating local descriptors into compact codes”, TPAMI’11.
[Figure: results on INRIA Holidays]
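The Fisher vector
Normalization: TF-IDF effect
Assuming that the $x_t$'s are i.i.d. drawn from a distribution $p$, we have:
$G_\lambda^X \approx \nabla_\lambda E_{x \sim p} \log u_\lambda(x)$
If we assume that $p$ is a mixture of image-dependent information ($q$) and image-independent information ($u_\lambda$):
$p(x) = \omega q(x) + (1 - \omega) u_\lambda(x)$
Then, since the maximum-likelihood estimate of $\lambda$ gives $\nabla_\lambda E_{x \sim u_\lambda} \log u_\lambda(x) \approx 0$, we have:
$G_\lambda^X \approx \omega \nabla_\lambda E_{x \sim q} \log u_\lambda(x)$
⇒ The FV depends only (approximately) on image-specific content (TF-IDF effect)
⇒ L2 normalization removes the dependence on $\omega$
Perronnin, Sánchez and Mensink, “Improving the Fisher kernel for large-scale image classification”, ECCV’10.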
The Fisher vector
Normalization: variance stabilization
FVs can be (approximately) viewed as emissions of a compound Poisson: a
sum of N i.i.d. random variables with N ~ Poisson.
⇒ the variance depends on the mean
⇒ Variance stabilizing transforms of the form:
$f(z) = \mathrm{sign}(z) |z|^\alpha$ (with $\alpha = 0.5$ by default)
can be used on the FV (or the VLAD).
→ Reduce the impact of bursty visual elements
Jégou, Douze, Schmid, “On the burstiness of visual elements”, ICCV’09.
Jégou, Perronnin, Douze, Sánchez, Pérez and Schmid, “Aggregating local descriptors into compact codes”, TPAMI’11.
Perronnin, Sánchez and Mensink, “Improving the Fisher kernel for large-scale image classification”, ECCV’10.
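The signed power transform is a one-liner in practice. The sketch below applies it element-wise and then L2-normalizes the result; chaining the two steps is an assumption of this example (it is a common pipeline in the cited papers), and the function name `power_l2_normalize` is hypothetical.

```python
import math

def power_l2_normalize(vector, alpha=0.5):
    """Signed power transform sign(z) * |z|**alpha, followed by L2 normalization."""
    # Variance-stabilizing step: large (bursty) components are compressed.
    powered = [math.copysign(abs(z) ** alpha, z) for z in vector]
    # L2 step (assumed here): makes the signature unit-length.
    norm = math.sqrt(sum(z * z for z in powered))
    return [z / norm for z in powered] if norm > 0 else powered

# Toy signature (hypothetical values): 4 -> 2, -9 -> -3, 0 -> 0 before the L2 step.
normalized = power_l2_normalize([4.0, -9.0, 0.0])
```

With alpha = 0.5 the ratio between the two non-zero components drops from 9/4 to 3/2, which is how the transform limits the influence of a few dominant entries.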
Other higher-order representations
Revisiting the VLAD
Jégou, Douze, Schmid and Pérez, “Aggregating local descriptors into a compact image representation”, CVPR’10.
But in which sense is the VLAD optimal?
→ The VLAD can be viewed as a non-probabilistic version of the FV:
• gradient with respect to the mean only
• replace GMM clustering by k-means
Could we add other (higher-order) statistics?
→ extension of the VLAD to include 2nd order statistics: VLAT
Picard and Gosselin, “Improving image similarity with vectors of locally aggregated tensors”, ICIP’11.
Other higher-order representations
Super-Vector (SV) coding
$f$ is $\beta$-Lipschitz smooth if:
$\left| f(x) - f(x') - \nabla f(x')^\top (x - x') \right| \le \frac{\beta}{2} \|x - x'\|^2$
Given a codebook $\{\mu_i, i = 1 \dots N\}$ and a patch $x$ whose nearest codeword is $\mu_i$, we have:
$f(x) \approx f(\mu_i) + \nabla f(\mu_i)^\top (x - \mu_i) = W^\top \phi(x)$
with $\phi(x) = [\dots, 0, s, (x - \mu_i)^\top, 0, \dots]^\top$ (non-zero only in the block of codeword $i$, $s$ a constant)
and $W = [\dots, f(\mu_i)/s, \nabla f(\mu_i)^\top, \dots]^\top$ (to be learned)
Average pooling → SV ≈ BOV + VLAD
The bound in the Lipschitz smooth inequality provides an argument for k-means.
Zhou, Yu, Zhang and Huang, “Image classification using super-vector coding of local image descriptors”, ECCV’10.
See also: Ladický and Torr, “Locally linear support vector machines”, ICML’11.
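To see why average pooling of super-vector codes behaves like a BOV part plus a VLAD part, here is a minimal hard-assignment sketch. It is an illustration under assumptions, not the method of the cited paper in full: the function names, the constant `s = 1.0` and the toy data are hypothetical, and each patch code holds the constant `s` followed by the residual to the nearest codeword, in the block of that codeword only.

```python
def super_vector_code(x, codebook, s=1.0):
    """Code of one patch: [s, x - mu_k] in the block of its nearest codeword k."""
    dim = len(x)
    nearest = min(
        range(len(codebook)),
        key=lambda k: sum((a - b) ** 2 for a, b in zip(x, codebook[k])),
    )
    code = [0.0] * (len(codebook) * (dim + 1))
    start = nearest * (dim + 1)
    code[start] = s  # zeroth-order term (counts after pooling)
    for d in range(dim):
        code[start + 1 + d] = x[d] - codebook[nearest][d]  # first-order term
    return code

def average_pool(codes):
    """Average the per-patch codes into one image-level signature."""
    count = len(codes)
    return [sum(values) / count for values in zip(*codes)]

# Toy data (hypothetical): 2 codewords in 2-D, 3 patches.
codebook = [[0.0, 0.0], [10.0, 10.0]]
patches = [[1.0, 0.0], [0.0, 1.0], [9.0, 10.0]]
pooled = average_pool([super_vector_code(x, codebook) for x in patches])
```

In each block of `pooled`, the first entry is `s` times the fraction of patches assigned to that codeword (a BOV histogram bin) and the remaining entries are the averaged residuals (the VLAD block, up to scaling).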
Examples
Retrieval
Example on Holidays:
⇒ second order statistics are not essential for retrieval
⇒ even for the same feature dim, the FV/VLAD can beat the BOV
⇒ soft assignment + whitening of the FV helps when the number of Gaussians increases
⇒ after dim-reduction however, the FV and VLAD perform similarly
From: Jégou, Perronnin, Douze, Sánchez, Pérez and Schmid, “Aggregating local descriptors into compact codes”, TPAMI’11.
Examples
Classification
Example on PASCAL VOC 2007:
Method   Feature dim   mAP
VQ       25K           55.30
KCB      25K           56.26
LLC      25K           57.27
SV       41K           58.16
FV       132K          61.69
→ FV outperforms BOV-based techniques including:
• VQ: plain vanilla BOV
• KCB: BOV with soft assignment
• LLC: BOV with sparse coding
→ including 2nd order information is important for classification
From: Chatfield, Lempitsky, Vedaldi and Zisserman, “The devil is in the details: an evaluation of recent feature encoding methods”, BMVC’11.
Packages
The INRIA package:
http://lear.inrialpes.fr/src/inria_fisher/
The Oxford package (soon to be released):
http://www.robots.ox.ac.uk/~vgg/research/encoding_eval/
Questions?

More Related Content

Similar to 4 new-patch-agggregation.pptx

30REGRESSION Regression is a statistical tool that a.docx
30REGRESSION  Regression is a statistical tool that a.docx30REGRESSION  Regression is a statistical tool that a.docx
30REGRESSION Regression is a statistical tool that a.docxtarifarmarie
 
Ibica2014(p15)image fusion based on broveywavelet
Ibica2014(p15)image fusion based on broveywaveletIbica2014(p15)image fusion based on broveywavelet
Ibica2014(p15)image fusion based on broveywaveletAboul Ella Hassanien
 
Ibica2014(p15)image fusion based on broveywavelet
Ibica2014(p15)image fusion based on broveywaveletIbica2014(p15)image fusion based on broveywavelet
Ibica2014(p15)image fusion based on broveywaveletAboul Ella Hassanien
 
Wsd as distributed constraint optimization problem
Wsd as distributed constraint optimization problemWsd as distributed constraint optimization problem
Wsd as distributed constraint optimization problemlolokikipipi
 

Similar to 4 new-patch-agggregation.pptx (7)

Image captions.pptx
Image captions.pptxImage captions.pptx
Image captions.pptx
 
30REGRESSION Regression is a statistical tool that a.docx
30REGRESSION  Regression is a statistical tool that a.docx30REGRESSION  Regression is a statistical tool that a.docx
30REGRESSION Regression is a statistical tool that a.docx
 
Ibica2014(p15)image fusion based on broveywavelet
Ibica2014(p15)image fusion based on broveywaveletIbica2014(p15)image fusion based on broveywavelet
Ibica2014(p15)image fusion based on broveywavelet
 
Ibica2014(p15)image fusion based on broveywavelet
Ibica2014(p15)image fusion based on broveywaveletIbica2014(p15)image fusion based on broveywavelet
Ibica2014(p15)image fusion based on broveywavelet
 
Matlab abstract 2016
Matlab abstract 2016Matlab abstract 2016
Matlab abstract 2016
 
Wsd as distributed constraint optimization problem
Wsd as distributed constraint optimization problemWsd as distributed constraint optimization problem
Wsd as distributed constraint optimization problem
 
Sharath copy
Sharath   copySharath   copy
Sharath copy
 

More from mustafa sarac

Uluslararasilasma son
Uluslararasilasma sonUluslararasilasma son
Uluslararasilasma sonmustafa sarac
 
Real time machine learning proposers day v3
Real time machine learning proposers day v3Real time machine learning proposers day v3
Real time machine learning proposers day v3mustafa sarac
 
Latka december digital
Latka december digitalLatka december digital
Latka december digitalmustafa sarac
 
Axial RC SCX10 AE2 ESC user manual
Axial RC SCX10 AE2 ESC user manualAxial RC SCX10 AE2 ESC user manual
Axial RC SCX10 AE2 ESC user manualmustafa sarac
 
Array programming with Numpy
Array programming with NumpyArray programming with Numpy
Array programming with Numpymustafa sarac
 
Math for programmers
Math for programmersMath for programmers
Math for programmersmustafa sarac
 
TEGV 2020 Bireysel bagiscilarimiz
TEGV 2020 Bireysel bagiscilarimizTEGV 2020 Bireysel bagiscilarimiz
TEGV 2020 Bireysel bagiscilarimizmustafa sarac
 
How to make and manage a bee hotel?
How to make and manage a bee hotel?How to make and manage a bee hotel?
How to make and manage a bee hotel?mustafa sarac
 
Cahit arf makineler dusunebilir mi
Cahit arf makineler dusunebilir miCahit arf makineler dusunebilir mi
Cahit arf makineler dusunebilir mimustafa sarac
 
How did Software Got So Reliable Without Proof?
How did Software Got So Reliable Without Proof?How did Software Got So Reliable Without Proof?
How did Software Got So Reliable Without Proof?mustafa sarac
 
Staff Report on Algorithmic Trading in US Capital Markets
Staff Report on Algorithmic Trading in US Capital MarketsStaff Report on Algorithmic Trading in US Capital Markets
Staff Report on Algorithmic Trading in US Capital Marketsmustafa sarac
 
Yetiskinler icin okuma yazma egitimi
Yetiskinler icin okuma yazma egitimiYetiskinler icin okuma yazma egitimi
Yetiskinler icin okuma yazma egitimimustafa sarac
 
Consumer centric api design v0.4.0
Consumer centric api design v0.4.0Consumer centric api design v0.4.0
Consumer centric api design v0.4.0mustafa sarac
 
State of microservices 2020 by tsh
State of microservices 2020 by tshState of microservices 2020 by tsh
State of microservices 2020 by tshmustafa sarac
 
Uber pitch deck 2008
Uber pitch deck 2008Uber pitch deck 2008
Uber pitch deck 2008mustafa sarac
 
Wireless solar keyboard k760 quickstart guide
Wireless solar keyboard k760 quickstart guideWireless solar keyboard k760 quickstart guide
Wireless solar keyboard k760 quickstart guidemustafa sarac
 
State of Serverless Report 2020
State of Serverless Report 2020State of Serverless Report 2020
State of Serverless Report 2020mustafa sarac
 
Dont just roll the dice
Dont just roll the diceDont just roll the dice
Dont just roll the dicemustafa sarac
 

More from mustafa sarac (20)

Uluslararasilasma son
Uluslararasilasma sonUluslararasilasma son
Uluslararasilasma son
 
Real time machine learning proposers day v3
Real time machine learning proposers day v3Real time machine learning proposers day v3
Real time machine learning proposers day v3
 
Latka december digital
Latka december digitalLatka december digital
Latka december digital
 
Axial RC SCX10 AE2 ESC user manual
Axial RC SCX10 AE2 ESC user manualAxial RC SCX10 AE2 ESC user manual
Axial RC SCX10 AE2 ESC user manual
 
Array programming with Numpy
Array programming with NumpyArray programming with Numpy
Array programming with Numpy
 
Math for programmers
Math for programmersMath for programmers
Math for programmers
 
The book of Why
The book of WhyThe book of Why
The book of Why
 
BM sgk meslek kodu
BM sgk meslek koduBM sgk meslek kodu
BM sgk meslek kodu
 
TEGV 2020 Bireysel bagiscilarimiz
TEGV 2020 Bireysel bagiscilarimizTEGV 2020 Bireysel bagiscilarimiz
TEGV 2020 Bireysel bagiscilarimiz
 
How to make and manage a bee hotel?
How to make and manage a bee hotel?How to make and manage a bee hotel?
How to make and manage a bee hotel?
 
Cahit arf makineler dusunebilir mi
Cahit arf makineler dusunebilir miCahit arf makineler dusunebilir mi
Cahit arf makineler dusunebilir mi
 
How did Software Got So Reliable Without Proof?
How did Software Got So Reliable Without Proof?How did Software Got So Reliable Without Proof?
How did Software Got So Reliable Without Proof?
 
Staff Report on Algorithmic Trading in US Capital Markets
Staff Report on Algorithmic Trading in US Capital MarketsStaff Report on Algorithmic Trading in US Capital Markets
Staff Report on Algorithmic Trading in US Capital Markets
 
Yetiskinler icin okuma yazma egitimi
Yetiskinler icin okuma yazma egitimiYetiskinler icin okuma yazma egitimi
Yetiskinler icin okuma yazma egitimi
 
Consumer centric api design v0.4.0
Consumer centric api design v0.4.0Consumer centric api design v0.4.0
Consumer centric api design v0.4.0
 
State of microservices 2020 by tsh
State of microservices 2020 by tshState of microservices 2020 by tsh
State of microservices 2020 by tsh
 
Uber pitch deck 2008
Uber pitch deck 2008Uber pitch deck 2008
Uber pitch deck 2008
 
Wireless solar keyboard k760 quickstart guide
Wireless solar keyboard k760 quickstart guideWireless solar keyboard k760 quickstart guide
Wireless solar keyboard k760 quickstart guide
 
State of Serverless Report 2020
State of Serverless Report 2020State of Serverless Report 2020
State of Serverless Report 2020
 
Dont just roll the dice
Dont just roll the diceDont just roll the dice
Dont just roll the dice
 

Recently uploaded

Model Call Girl in Narela Delhi reach out to us at 🔝8264348440🔝
Model Call Girl in Narela Delhi reach out to us at 🔝8264348440🔝Model Call Girl in Narela Delhi reach out to us at 🔝8264348440🔝
Model Call Girl in Narela Delhi reach out to us at 🔝8264348440🔝soniya singh
 
High Profile Call Girls Nagpur Isha Call 7001035870 Meet With Nagpur Escorts
High Profile Call Girls Nagpur Isha Call 7001035870 Meet With Nagpur EscortsHigh Profile Call Girls Nagpur Isha Call 7001035870 Meet With Nagpur Escorts
High Profile Call Girls Nagpur Isha Call 7001035870 Meet With Nagpur Escortsranjana rawat
 
the ladakh protest in leh ladakh 2024 sonam wangchuk.pptx
the ladakh protest in leh ladakh 2024 sonam wangchuk.pptxthe ladakh protest in leh ladakh 2024 sonam wangchuk.pptx
the ladakh protest in leh ladakh 2024 sonam wangchuk.pptxhumanexperienceaaa
 
Gfe Mayur Vihar Call Girls Service WhatsApp -> 9999965857 Available 24x7 ^ De...
Gfe Mayur Vihar Call Girls Service WhatsApp -> 9999965857 Available 24x7 ^ De...Gfe Mayur Vihar Call Girls Service WhatsApp -> 9999965857 Available 24x7 ^ De...
Gfe Mayur Vihar Call Girls Service WhatsApp -> 9999965857 Available 24x7 ^ De...srsj9000
 
Architect Hassan Khalil Portfolio for 2024
Architect Hassan Khalil Portfolio for 2024Architect Hassan Khalil Portfolio for 2024
Architect Hassan Khalil Portfolio for 2024hassan khalil
 
chaitra-1.pptx fake news detection using machine learning
chaitra-1.pptx  fake news detection using machine learningchaitra-1.pptx  fake news detection using machine learning
chaitra-1.pptx fake news detection using machine learningmisbanausheenparvam
 
Introduction and different types of Ethernet.pptx
Introduction and different types of Ethernet.pptxIntroduction and different types of Ethernet.pptx
Introduction and different types of Ethernet.pptxupamatechverse
 
(MEERA) Dapodi Call Girls Just Call 7001035870 [ Cash on Delivery ] Pune Escorts
(MEERA) Dapodi Call Girls Just Call 7001035870 [ Cash on Delivery ] Pune Escorts(MEERA) Dapodi Call Girls Just Call 7001035870 [ Cash on Delivery ] Pune Escorts
(MEERA) Dapodi Call Girls Just Call 7001035870 [ Cash on Delivery ] Pune Escortsranjana rawat
 
SPICE PARK APR2024 ( 6,793 SPICE Models )
SPICE PARK APR2024 ( 6,793 SPICE Models )SPICE PARK APR2024 ( 6,793 SPICE Models )
SPICE PARK APR2024 ( 6,793 SPICE Models )Tsuyoshi Horigome
 
(SHREYA) Chakan Call Girls Just Call 7001035870 [ Cash on Delivery ] Pune Esc...
(SHREYA) Chakan Call Girls Just Call 7001035870 [ Cash on Delivery ] Pune Esc...(SHREYA) Chakan Call Girls Just Call 7001035870 [ Cash on Delivery ] Pune Esc...
(SHREYA) Chakan Call Girls Just Call 7001035870 [ Cash on Delivery ] Pune Esc...ranjana rawat
 
Introduction to IEEE STANDARDS and its different types.pptx
Introduction to IEEE STANDARDS and its different types.pptxIntroduction to IEEE STANDARDS and its different types.pptx
Introduction to IEEE STANDARDS and its different types.pptxupamatechverse
 
High Profile Call Girls Nagpur Meera Call 7001035870 Meet With Nagpur Escorts
High Profile Call Girls Nagpur Meera Call 7001035870 Meet With Nagpur EscortsHigh Profile Call Girls Nagpur Meera Call 7001035870 Meet With Nagpur Escorts
High Profile Call Girls Nagpur Meera Call 7001035870 Meet With Nagpur EscortsCall Girls in Nagpur High Profile
 
(ANJALI) Dange Chowk Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
(ANJALI) Dange Chowk Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...(ANJALI) Dange Chowk Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
(ANJALI) Dange Chowk Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...ranjana rawat
 
(ANVI) Koregaon Park Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
(ANVI) Koregaon Park Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...(ANVI) Koregaon Park Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
(ANVI) Koregaon Park Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...ranjana rawat
 
OSVC_Meta-Data based Simulation Automation to overcome Verification Challenge...
OSVC_Meta-Data based Simulation Automation to overcome Verification Challenge...OSVC_Meta-Data based Simulation Automation to overcome Verification Challenge...
OSVC_Meta-Data based Simulation Automation to overcome Verification Challenge...Soham Mondal
 
Call Girls Service Nagpur Tanvi Call 7001035870 Meet With Nagpur Escorts
Call Girls Service Nagpur Tanvi Call 7001035870 Meet With Nagpur EscortsCall Girls Service Nagpur Tanvi Call 7001035870 Meet With Nagpur Escorts
Call Girls Service Nagpur Tanvi Call 7001035870 Meet With Nagpur EscortsCall Girls in Nagpur High Profile
 
Call Girls Delhi {Jodhpur} 9711199012 high profile service
Call Girls Delhi {Jodhpur} 9711199012 high profile serviceCall Girls Delhi {Jodhpur} 9711199012 high profile service
Call Girls Delhi {Jodhpur} 9711199012 high profile servicerehmti665
 

Recently uploaded (20)

Exploring_Network_Security_with_JA3_by_Rakesh Seal.pptx
Exploring_Network_Security_with_JA3_by_Rakesh Seal.pptxExploring_Network_Security_with_JA3_by_Rakesh Seal.pptx
Exploring_Network_Security_with_JA3_by_Rakesh Seal.pptx
 
Model Call Girl in Narela Delhi reach out to us at 🔝8264348440🔝
Model Call Girl in Narela Delhi reach out to us at 🔝8264348440🔝Model Call Girl in Narela Delhi reach out to us at 🔝8264348440🔝
Model Call Girl in Narela Delhi reach out to us at 🔝8264348440🔝
 
High Profile Call Girls Nagpur Isha Call 7001035870 Meet With Nagpur Escorts
High Profile Call Girls Nagpur Isha Call 7001035870 Meet With Nagpur EscortsHigh Profile Call Girls Nagpur Isha Call 7001035870 Meet With Nagpur Escorts
High Profile Call Girls Nagpur Isha Call 7001035870 Meet With Nagpur Escorts
 
the ladakh protest in leh ladakh 2024 sonam wangchuk.pptx
the ladakh protest in leh ladakh 2024 sonam wangchuk.pptxthe ladakh protest in leh ladakh 2024 sonam wangchuk.pptx
the ladakh protest in leh ladakh 2024 sonam wangchuk.pptx
 
Gfe Mayur Vihar Call Girls Service WhatsApp -> 9999965857 Available 24x7 ^ De...
Gfe Mayur Vihar Call Girls Service WhatsApp -> 9999965857 Available 24x7 ^ De...Gfe Mayur Vihar Call Girls Service WhatsApp -> 9999965857 Available 24x7 ^ De...
Gfe Mayur Vihar Call Girls Service WhatsApp -> 9999965857 Available 24x7 ^ De...
 
Architect Hassan Khalil Portfolio for 2024
Architect Hassan Khalil Portfolio for 2024Architect Hassan Khalil Portfolio for 2024
Architect Hassan Khalil Portfolio for 2024
 
★ CALL US 9953330565 ( HOT Young Call Girls In Badarpur delhi NCR
★ CALL US 9953330565 ( HOT Young Call Girls In Badarpur delhi NCR★ CALL US 9953330565 ( HOT Young Call Girls In Badarpur delhi NCR
★ CALL US 9953330565 ( HOT Young Call Girls In Badarpur delhi NCR
 
chaitra-1.pptx fake news detection using machine learning
chaitra-1.pptx  fake news detection using machine learningchaitra-1.pptx  fake news detection using machine learning
chaitra-1.pptx fake news detection using machine learning
 
Introduction and different types of Ethernet.pptx
Introduction and different types of Ethernet.pptxIntroduction and different types of Ethernet.pptx
Introduction and different types of Ethernet.pptx
 
(MEERA) Dapodi Call Girls Just Call 7001035870 [ Cash on Delivery ] Pune Escorts
(MEERA) Dapodi Call Girls Just Call 7001035870 [ Cash on Delivery ] Pune Escorts(MEERA) Dapodi Call Girls Just Call 7001035870 [ Cash on Delivery ] Pune Escorts
(MEERA) Dapodi Call Girls Just Call 7001035870 [ Cash on Delivery ] Pune Escorts
 
SPICE PARK APR2024 ( 6,793 SPICE Models )
SPICE PARK APR2024 ( 6,793 SPICE Models )SPICE PARK APR2024 ( 6,793 SPICE Models )
SPICE PARK APR2024 ( 6,793 SPICE Models )
 
(SHREYA) Chakan Call Girls Just Call 7001035870 [ Cash on Delivery ] Pune Esc...
(SHREYA) Chakan Call Girls Just Call 7001035870 [ Cash on Delivery ] Pune Esc...(SHREYA) Chakan Call Girls Just Call 7001035870 [ Cash on Delivery ] Pune Esc...
(SHREYA) Chakan Call Girls Just Call 7001035870 [ Cash on Delivery ] Pune Esc...
 
Introduction to IEEE STANDARDS and its different types.pptx
Introduction to IEEE STANDARDS and its different types.pptxIntroduction to IEEE STANDARDS and its different types.pptx
Introduction to IEEE STANDARDS and its different types.pptx
 
9953056974 Call Girls In South Ex, Escorts (Delhi) NCR.pdf
9953056974 Call Girls In South Ex, Escorts (Delhi) NCR.pdf9953056974 Call Girls In South Ex, Escorts (Delhi) NCR.pdf
9953056974 Call Girls In South Ex, Escorts (Delhi) NCR.pdf
 
High Profile Call Girls Nagpur Meera Call 7001035870 Meet With Nagpur Escorts
High Profile Call Girls Nagpur Meera Call 7001035870 Meet With Nagpur EscortsHigh Profile Call Girls Nagpur Meera Call 7001035870 Meet With Nagpur Escorts
High Profile Call Girls Nagpur Meera Call 7001035870 Meet With Nagpur Escorts
 
(ANJALI) Dange Chowk Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
(ANJALI) Dange Chowk Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...(ANJALI) Dange Chowk Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
(ANJALI) Dange Chowk Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
 
(ANVI) Koregaon Park Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
(ANVI) Koregaon Park Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...(ANVI) Koregaon Park Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
(ANVI) Koregaon Park Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
 
OSVC_Meta-Data based Simulation Automation to overcome Verification Challenge...
OSVC_Meta-Data based Simulation Automation to overcome Verification Challenge...OSVC_Meta-Data based Simulation Automation to overcome Verification Challenge...
OSVC_Meta-Data based Simulation Automation to overcome Verification Challenge...
 
Call Girls Service Nagpur Tanvi Call 7001035870 Meet With Nagpur Escorts
Call Girls Service Nagpur Tanvi Call 7001035870 Meet With Nagpur EscortsCall Girls Service Nagpur Tanvi Call 7001035870 Meet With Nagpur Escorts
Call Girls Service Nagpur Tanvi Call 7001035870 Meet With Nagpur Escorts
 
Call Girls Delhi {Jodhpur} 9711199012 high profile service
Call Girls Delhi {Jodhpur} 9711199012 high profile serviceCall Girls Delhi {Jodhpur} 9711199012 high profile service
Call Girls Delhi {Jodhpur} 9711199012 high profile service
 

4 new-patch-agggregation.pptx

  • 1. Large-scale visual recognition Novel patch aggregation mechanisms Florent Perronnin, XRCE Hervé Jégou, INRIA CVPR tutorial on Large-Scale Visual Recognition June 16, 2012
  • 2. Motivation For large-scale visual recognition, we need image signatures which contain fine-grained information: •! in retrieval: the larger the dataset size, the higher the probability to find another similar but irrelevant image to a given query •! in classification: the larger the number of other classes, the higher the probability to find a class which is similar to any given class BOV answer to the problem: increase visual vocabulary size ! see previous part on scaling visual vocabularies How to increase amount of information without increasing the visual vocabulary size?
  • 3. Motivation BOV is only about counting the number of local descriptors assigned to each Voronoi region Why not including other statistics? http://www.cs.utexas.edu/~grauman/courses/fall2009/papers/bag_of_visual_words.pdf
  • 4. Motivation BOV is only about counting the number of local descriptors assigned to each Voronoi region Why not including other statistics? For instance: •! mean of local descriptors http://www.cs.utexas.edu/~grauman/courses/fall2009/papers/bag_of_visual_words.pdf
  • 5. Motivation BOV is only about counting the number of local descriptors assigned to each Voronoi region Why not including other statistics? For instance: •! mean of local descriptors •! (co)variance of local descriptors http://www.cs.utexas.edu/~grauman/courses/fall2009/papers/bag_of_visual_words.pdf
  • 6. Outline A first example: the VLAD The Fisher Vector Other higher-order representations Example results
  • 7. Outline A first example: the VLAD The Fisher Vector Other higher-order representations Example results
  • 8. A first example: the VLAD Given a codebook , e.g. learned with K-means, and a set of local descriptors : •! ! assign: •! "# compute: •! concatenate vi’s + normalize Jégou, Douze, Schmid and Pérez, “Aggregating local descriptors into a compact image representation”, CVPR’10. µ 3 x v1 v2 v3 v4 v5 µ1 µ 4 µ 2 µ 5 assign descriptors compute x- µ i vi=sum x- µ i for cell i
  • 9. A first example: the VLAD A graphical representation of Jégou, Douze, Schmid and Pérez, “Aggregating local descriptors into a compact image representation”, CVPR’10.
  • 10. A first example: the VLAD But in which sense is the VLAD optimal? Could we add other (higher-order) statistics?
  • 11. Outline A first simple example: the VLAD The Fisher Vector Other higher-order representations Example results
  • 12. The Fisher vector Score function Given a likelihood function with parameters ", the score function of a given sample X is given by: ! Fixed-length vector whose dimensionality depends only on # parameters. Intuition: direction in which the parameters " of the model should we modified to better fit the data. Page 12 F. Perronnin, H. Jégou, CVR tutorial on LSVRC, June 16, 2012.
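  • 13. The Fisher vector: Fisher information matrix. Fisher information matrix (FIM) or negative Hessian: $F_\lambda = E_{x \sim u_\lambda}[\nabla_\lambda \log u_\lambda(x) \, \nabla_\lambda \log u_\lambda(x)^\top]$. Measure similarity between two samples $X$ and $Y$ using the Fisher Kernel (FK): $K(X, Y) = (G_\lambda^X)^\top F_\lambda^{-1} G_\lambda^Y$ → can be interpreted as a score whitening. As the FIM is PSD, its inverse can be decomposed as $F_\lambda^{-1} = L_\lambda^\top L_\lambda$, and the FK can be rewritten as a dot product between Fisher Vectors (FV): $\mathcal{G}_\lambda^X = L_\lambda G_\lambda^X$. Jaakkola and Haussler, “Exploiting generative models in discriminative classifiers”, NIPS’98.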
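The rewriting of the Fisher kernel as a dot product can be checked numerically. The sketch below uses a random symmetric positive-definite matrix as a stand-in for the FIM and random vectors as stand-ins for the scores; all names and values are illustrative assumptions. It relies on the Cholesky factorization F = C Cᵀ, for which L = C⁻¹ satisfies F⁻¹ = Lᵀ L.

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.normal(size=(5, 5))
fim = A @ A.T + 5 * np.eye(5)     # toy symmetric positive-definite "FIM"
g_x = rng.normal(size=5)          # toy score of sample X
g_y = rng.normal(size=5)          # toy score of sample Y

# Fisher kernel computed directly: K(X, Y) = g_x' F^{-1} g_y
k_direct = g_x @ np.linalg.solve(fim, g_y)

# Fisher vectors: with F = C C' (Cholesky), L = C^{-1} gives F^{-1} = L' L.
C = np.linalg.cholesky(fim)       # lower-triangular factor
fv_x = np.linalg.solve(C, g_x)    # L g_x
fv_y = np.linalg.solve(C, g_y)    # L g_y
k_dot = fv_x @ fv_y               # plain dot product between Fisher vectors

print(np.isclose(k_direct, k_dot))   # True
```

Once the whitening is folded into the vectors, any linear method (dot-product retrieval, linear classifiers) operates on them directly.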
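  • 14. The Fisher vector: Application to images. $X = \{x_t, t = 1 \ldots T\}$ is the set of $T$ i.i.d. $D$-dim local descriptors (e.g. SIFT) extracted from an image: $G_\lambda^X = \frac{1}{T} \sum_{t=1}^{T} \nabla_\lambda \log u_\lambda(x_t)$ (the score, normalized by the number of descriptors $T$) → average pooling is a direct consequence of the independence assumption. $u_\lambda(x) = \sum_{i=1}^{N} w_i u_i(x)$ is a Gaussian Mixture Model (GMM) with parameters $\lambda = \{w_i, \mu_i, \Sigma_i, i = 1 \ldots N\}$ trained on a large set of local descriptors → a probabilistic visual vocabulary. Perronnin and Dance, “Fisher kernels on visual categories for image categorization”, CVPR’07.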
  • 15. The Fisher vector: Relationship with the BOV. FV formulas: [Figure: example soft-assignment weights 0.5, 0.4, 0.05, 0.05.] Perronnin and Dance, “Fisher kernels on visual categories for image categorization”, CVPR’07.
  • 16. The Fisher vector: Relationship with the BOV. FV formulas: gradient wrt w → soft BOV, where $\gamma_t(i)$ = soft-assignment of patch $t$ to Gaussian $i$. [Figure: example soft-assignment weights 0.5, 0.4, 0.05, 0.05.] Perronnin and Dance, “Fisher kernels on visual categories for image categorization”, CVPR’07.
  • 17. The Fisher vector: Relationship with the BOV. FV formulas: gradient wrt w → soft BOV, where $\gamma_t(i)$ = soft-assignment of patch $t$ to Gaussian $i$; gradient wrt µ and σ → compared to BOV, includes higher-order statistics (up to order 2). Let us denote: D = feature dim, N = number of Gaussians. BOV = N-dim; FV = 2DN-dim. [Figure: example soft-assignment weights 0.5, 0.4, 0.05, 0.05.] Perronnin and Dance, “Fisher kernels on visual categories for image categorization”, CVPR’07.
  • 18. The Fisher vector: Relationship with the BOV. FV formulas: gradient wrt w → soft BOV, where $\gamma_t(i)$ = soft-assignment of patch $t$ to Gaussian $i$; gradient wrt µ and σ → compared to BOV, includes higher-order statistics (up to order 2). → FV much higher-dim than BOV for a given visual vocabulary size. → FV much faster to compute than BOV for a given feature dim. [Figure: example soft-assignment weights 0.5, 0.4, 0.05, 0.05.] Perronnin and Dance, “Fisher kernels on visual categories for image categorization”, CVPR’07.
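The FV formulas themselves did not survive in this transcript. As a hedged sketch, the code below uses the closed forms commonly given for a GMM with diagonal covariances (as in Perronnin, Sánchez and Mensink, ECCV’10, cited later in this deck), which may differ in detail from what the slide showed: with $\gamma_t(i)$ the posterior of Gaussian $i$ for patch $t$, the mean part is $\frac{1}{T\sqrt{w_i}} \sum_t \gamma_t(i) \frac{x_t - \mu_i}{\sigma_i}$ and the standard-deviation part is $\frac{1}{T\sqrt{2 w_i}} \sum_t \gamma_t(i) \left[ \frac{(x_t - \mu_i)^2}{\sigma_i^2} - 1 \right]$. The function name, the toy GMM parameters and the data are assumptions.

```python
import numpy as np

def fisher_vector(x, w, mu, sigma):
    """x: (T, D) descriptors; w: (N,) mixture weights;
    mu, sigma: (N, D) means and standard deviations (diagonal covariances).
    Returns the 2*D*N-dim FV (mean and std-dev parts) and the (T, N) posteriors."""
    T, D = x.shape
    z = (x[:, None, :] - mu[None, :, :]) / sigma[None, :, :]      # (T, N, D)
    # log( w_i * N(x_t; mu_i, diag(sigma_i^2)) ) for every patch/Gaussian pair
    log_p = (np.log(w)[None, :]
             - 0.5 * (z ** 2).sum(axis=2)
             - np.log(sigma).sum(axis=1)[None, :]
             - 0.5 * D * np.log(2 * np.pi))
    log_p -= log_p.max(axis=1, keepdims=True)     # numerical stability
    gamma = np.exp(log_p)
    gamma /= gamma.sum(axis=1, keepdims=True)     # soft assignments gamma_t(i)
    g_mu = (gamma[:, :, None] * z).sum(axis=0) / (T * np.sqrt(w)[:, None])
    g_sigma = ((gamma[:, :, None] * (z ** 2 - 1)).sum(axis=0)
               / (T * np.sqrt(2 * w)[:, None]))
    return np.concatenate([g_mu.ravel(), g_sigma.ravel()]), gamma

# Toy GMM (hypothetical values): N = 3 Gaussians, D = 4 dims, T = 20 patches.
rng = np.random.default_rng(0)
N, D, T = 3, 4, 20
w = np.array([0.5, 0.3, 0.2])
mu = rng.normal(size=(N, D))
sigma = np.full((N, D), 1.0)
x = rng.normal(size=(T, D))

fv, gamma = fisher_vector(x, w, mu, sigma)
soft_bov = gamma.mean(axis=0)     # average-pooled posteriors: the "soft BOV"
print(fv.shape)                   # (24,), i.e. 2 * D * N
```

The gradient with respect to the weights is left out here, which matches the 2DN dimensionality quoted on the slide; the pooled posteriors `soft_bov` show what that part would carry.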
  • 19. The Fisher vector: Dimensionality reduction on local descriptors. Perform PCA on local descriptors: → uncorrelated features are more consistent with the diagonal assumption of covariance matrices in the GMM; → the FK performs whitening and enhances low-energy (possibly noisy) dimensions.
  • 20. The Fisher vector: Dimensionality reduction on local descriptors. Perform PCA on local descriptors: → uncorrelated features are more consistent with the diagonal assumption of covariance matrices in the GMM; → the FK performs whitening and enhances low-energy (possibly noisy) dimensions. [Figure: results on INRIA Holidays.] Jégou, Perronnin, Douze, Sánchez, Pérez and Schmid, “Aggregating local descriptors into compact codes”, TPAMI’11.
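The decorrelating effect of PCA on the local descriptors can be illustrated with a short NumPy sketch. The helper names `fit_pca` and `apply_pca`, the toy dimensions (16 reduced to 8) and the synthetic correlated data are assumptions made for the example, not values from the tutorial.

```python
import numpy as np

def fit_pca(x, n_components):
    """Return the mean and the top principal directions of the (T, D) array x."""
    mean = x.mean(axis=0)
    cov = np.cov(x - mean, rowvar=False)
    eigval, eigvec = np.linalg.eigh(cov)              # eigenvalues in ascending order
    order = np.argsort(eigval)[::-1][:n_components]   # keep the largest ones
    return mean, eigvec[:, order]

def apply_pca(x, mean, basis):
    """Center the descriptors and project them onto the PCA basis."""
    return (x - mean) @ basis

# Correlated toy "descriptors": white noise mixed by a random matrix.
rng = np.random.default_rng(0)
mixing = rng.normal(size=(16, 16))
train = rng.normal(size=(2000, 16)) @ mixing

mean, basis = fit_pca(train, 8)
projected = apply_pca(train, mean, basis)

# After projection the covariance is diagonal up to rounding error, which is
# what a GMM with diagonal covariance matrices implicitly assumes.
cov_proj = np.cov(projected, rowvar=False)
off_diag = cov_proj - np.diag(np.diag(cov_proj))
print(np.abs(off_diag).max())    # numerically close to 0
```

Dropping the trailing components also avoids the situation the slide warns about, where the whitening in the FK would amplify low-energy, possibly noisy dimensions.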
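  • 21. The Fisher vector: Normalization: TF-IDF effect. Assuming that the $x_t$'s are iid drawn from a distribution $p$, we have: $G_\lambda^X \approx \nabla_\lambda E_{x \sim p} \log u_\lambda(x) = \nabla_\lambda \int_x p(x) \log u_\lambda(x) \, dx$. If we assume that $p$ is a mixture of image-dependent and image-independent information: $p(x) = \omega q(x) + (1 - \omega) u_\lambda(x)$. Then we have: $G_\lambda^X \approx \omega \nabla_\lambda E_{x \sim q} \log u_\lambda(x)$, since $\nabla_\lambda E_{x \sim u_\lambda} \log u_\lambda(x) \approx 0$ at the maximum-likelihood estimate of $\lambda$. → The FV depends only (approximately) on image-specific content (TF-IDF). → Normalization removes the dependence on $\omega$. Perronnin, Sánchez and Mensink, “Improving the Fisher kernel for large-scale image classification”, ECCV’10.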
  • 22. The Fisher vector: Normalization: variance stabilization. FVs can be (approximately) viewed as emissions of a compound Poisson: a sum of $N$ iid random variables with $N \sim$ Poisson → variance depends on mean. → Variance stabilizing transforms of the form $f(z) = \mathrm{sign}(z) |z|^\alpha$ (with $\alpha = 0.5$ by default) can be used on the FV (or the VLAD). Jégou, Perronnin, Douze, Sánchez, Pérez and Schmid, “Aggregating local descriptors into compact codes”, TPAMI’11. Perronnin, Sánchez and Mensink, “Improving the Fisher kernel for large-scale image classification”, ECCV’10.
  • 23. The Fisher vector: Normalization: variance stabilization. FVs can be (approximately) viewed as emissions of a compound Poisson: a sum of $N$ iid random variables with $N \sim$ Poisson → variance depends on mean. → Variance stabilizing transforms of the form $f(z) = \mathrm{sign}(z) |z|^\alpha$ (with $\alpha = 0.5$ by default) can be used on the FV (or the VLAD). → Reduce impact of bursty visual elements. Jégou, Douze, Schmid, “On the burstiness of visual elements”, ICCV’09. Jégou, Perronnin, Douze, Sánchez, Pérez and Schmid, “Aggregating local descriptors into compact codes”, TPAMI’11.
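A minimal sketch of the two normalizations discussed on the last slides: the signed power transform sign(z)|z|^α applied component-wise, followed by L2 normalization. The function names and the toy vector are assumptions; α = 0.5 is the default mentioned on the slide.

```python
import numpy as np

def power_normalize(v, alpha=0.5):
    """Variance-stabilizing transform: sign(z) * |z|**alpha, component-wise."""
    return np.sign(v) * np.abs(v) ** alpha

def l2_normalize(v):
    """Scale v to unit Euclidean norm (left unchanged if it is all zeros)."""
    n = np.linalg.norm(v)
    return v / n if n > 0 else v

v = np.array([9.0, -4.0, 0.0, 0.25])        # toy aggregated vector
stabilized = power_normalize(v)             # values 3, -2, 0, 0.5
normalized = l2_normalize(stabilized)

# A global scale factor on the input (such as the proportion of image-specific
# content) disappears after L2 normalization.
same = np.allclose(l2_normalize(power_normalize(0.3 * v)), normalized)
print(same)   # True
```

With α = 0.5 a component that is 36 times larger than another ends up only 6 times larger, which is how the transform damps bursty visual elements.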
  • 24. Outline: A first example: the VLAD; The Fisher Vector; Other higher-order representations; Example results
  • 25. Other higher-order representations: Revisiting the VLAD. But in which sense is the VLAD optimal? Could we add other (higher-order) statistics? Jégou, Douze, Schmid and Pérez, “Aggregating local descriptors into a compact image representation”, CVPR’10.
  • 26. Other higher-order representations: Revisiting the VLAD. But in which sense is the VLAD optimal? → The VLAD can be viewed as a non-probabilistic version of the FV: gradient with respect to the mean only; GMM clustering replaced by k-means. Could we add other (higher-order) statistics? → Extension of the VLAD to include 2nd-order statistics: VLAT. Picard and Gosselin, “Improving image similarity with vectors of locally aggregated tensors”, ICIP’11.
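  • 27. Other higher-order representations: Super-Vector (SV) coding. $f$ is Lipschitz smooth if: $|f(x) - f(x') - \nabla f(x')^\top (x - x')| \le \frac{\beta}{2} \|x - x'\|^2$. Given a codebook $\{\mu_i, i = 1 \ldots N\}$ and a patch $x_t$ we have: $f(x_t) \approx w^\top \phi(x_t)$, with $\phi(x_t) = [\ldots, s, (x_t - \mu_i)^\top, \ldots]^\top$ (nonzero only in the block of the codeword $\mu_i$ nearest to $x_t$) and $w = [\ldots, f(\mu_i)/s, \nabla f(\mu_i)^\top, \ldots]^\top$ (to be learned). Zhou, Yu, Zhang and Huang, “Image classification using super-vector coding of local image descriptors”, ECCV’10.
  • 28. Other higher-order representations: Super-Vector (SV) coding. $f$ is Lipschitz smooth if: $|f(x) - f(x') - \nabla f(x')^\top (x - x')| \le \frac{\beta}{2} \|x - x'\|^2$. Given a codebook $\{\mu_i, i = 1 \ldots N\}$ and a patch $x_t$ we have: $f(x_t) \approx w^\top \phi(x_t)$, with $\phi(x_t) = [\ldots, s, (x_t - \mu_i)^\top, \ldots]^\top$. Average pooling → SV ≈ BOV + VLAD. The bound in the Lipschitz smooth inequality provides an argument for k-means. Zhou, Yu, Zhang and Huang, “Image classification using super-vector coding of local image descriptors”, ECCV’10. See also: Ladický and Torr, “Locally linear support vector machines”, ICML’11.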
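The statement "average pooling → SV ≈ BOV + VLAD" can be made concrete with a simplified sketch. It assumes hard assignment to the nearest codeword and a per-patch code whose only nonzero block is [s, x − µ_i]; the constant `s`, the function name and the toy data are assumptions, and any extra per-codeword weighting used in the original paper is left out.

```python
import numpy as np

def super_vector(descriptors, codebook, s=1.0):
    """Average-pooled super-vector code, returned as an (N, D + 1) array.

    Row i holds [s * fraction of patches in cell i, mean-pooled residuals]."""
    T, D = descriptors.shape
    N = codebook.shape[0]
    d2 = ((descriptors[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=2)
    nearest = d2.argmin(axis=1)                  # hard assignment
    phi = np.zeros((T, N, D + 1))
    rows = np.arange(T)
    phi[rows, nearest, 0] = s                                  # constant (BOV-like) part
    phi[rows, nearest, 1:] = descriptors - codebook[nearest]   # residual (VLAD-like) part
    return phi.mean(axis=0), nearest             # average pooling over patches

rng = np.random.default_rng(0)
codebook = rng.normal(size=(4, 8))
descriptors = rng.normal(size=(50, 8))
sv, nearest = super_vector(descriptors, codebook, s=2.0)

# Compare with a BOV histogram and with unnormalized VLAD residual sums.
T, N = len(descriptors), len(codebook)
bov = np.bincount(nearest, minlength=N) / T
residual_sums = np.zeros_like(codebook)
np.add.at(residual_sums, nearest, descriptors - codebook[nearest])

print(np.allclose(sv[:, 0], 2.0 * bov), np.allclose(sv[:, 1:], residual_sums / T))
```

Under these assumptions the first column of the pooled code is a scaled BOV histogram and the remaining columns are the VLAD residual sums divided by T.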
  • 29. Outline: A first example: the VLAD; The Fisher Vector; Other higher-order representations; Example results
  • 30. Examples: Retrieval. Example on Holidays. From: Jégou, Perronnin, Douze, Sánchez, Pérez and Schmid, “Aggregating local descriptors into compact codes”, TPAMI’11.
  • 31. Examples: Retrieval. Example on Holidays: → second-order statistics are not essential for retrieval. From: Jégou, Perronnin, Douze, Sánchez, Pérez and Schmid, “Aggregating local descriptors into compact codes”, TPAMI’11.
  • 32. Examples: Retrieval. Example on Holidays: → second-order statistics are not essential for retrieval; → even for the same feature dim, the FV/VLAD can beat the BOV. From: Jégou, Perronnin, Douze, Sánchez, Pérez and Schmid, “Aggregating local descriptors into compact codes”, TPAMI’11.
  • 33. Examples: Retrieval. Example on Holidays: → second-order statistics are not essential for retrieval; → even for the same feature dim, the FV/VLAD can beat the BOV; → soft assignment + whitening of the FV helps when the number of Gaussians increases. From: Jégou, Perronnin, Douze, Sánchez, Pérez and Schmid, “Aggregating local descriptors into compact codes”, TPAMI’11.
  • 34. Examples: Retrieval. Example on Holidays: → second-order statistics are not essential for retrieval; → even for the same feature dim, the FV/VLAD can beat the BOV; → soft assignment + whitening of the FV helps when the number of Gaussians increases; → after dim-reduction, however, the FV and VLAD perform similarly. From: Jégou, Perronnin, Douze, Sánchez, Pérez and Schmid, “Aggregating local descriptors into compact codes”, TPAMI’11.
  • 35. Examples: Classification. Example on PASCAL VOC 2007. From: Chatfield, Lempitsky, Vedaldi and Zisserman, “The devil is in the details: an evaluation of recent feature encoding methods”, BMVC’11.
      Method   Feature dim   mAP
      VQ       25K           55.30
      KCB      25K           56.26
      LLC      25K           57.27
      SV       41K           58.16
      FV       132K          61.69
  • 36. Examples: Classification. Example on PASCAL VOC 2007. From: Chatfield, Lempitsky, Vedaldi and Zisserman, “The devil is in the details: an evaluation of recent feature encoding methods”, BMVC’11. → FV outperforms BOV-based techniques including: VQ (plain vanilla BOV); KCB (BOV with soft assignment); LLC (BOV with sparse coding).
      Method   Feature dim   mAP
      VQ       25K           55.30
      KCB      25K           56.26
      LLC      25K           57.27
      SV       41K           58.16
      FV       132K          61.69
  • 37. Examples: Classification. Example on PASCAL VOC 2007. From: Chatfield, Lempitsky, Vedaldi and Zisserman, “The devil is in the details: an evaluation of recent feature encoding methods”, BMVC’11. → FV outperforms BOV-based techniques including: VQ (plain vanilla BOV); KCB (BOV with soft assignment); LLC (BOV with sparse coding). → Including 2nd-order information is important for classification.
      Method   Feature dim   mAP
      VQ       25K           55.30
      KCB      25K           56.26
      LLC      25K           57.27
      SV       41K           58.16
      FV       132K          61.69
  • 38. Packages
      The INRIA package: http://lear.inrialpes.fr/src/inria_fisher/
      The Oxford package (soon to be released): http://www.robots.ox.ac.uk/~vgg/research/encoding_eval/