Large Scale Online Learning of Image Similarity Through Ranking


Presentation of the paper "Large Scale Online Learning of Image Similarity Through Ranking" for the Synchromedia Seminar, 15. 9. 2012.


  1. Large Scale Online Learning of Image Similarity Through Ranking
     from G. Chechik, V. Sharma, U. Shalit, S. Bengio – JMLR 2010
     presented by Lukas Tencer
  2. Motivation
     • Needed for applications that compare any kind of data: images, video, web pages, documents
     • Two levels of similarity: features (visual, for images) and semantics
     • Large-scale learning is limited by computational cost, not by availability of data
     • Which similarity does the user want to express, visual or semantic?
     • The presented approach learns semantic similarity once visual similarity is available
     • Similarity learning usually requires pairwise distances, which are not always available
     • Instead of pairwise distances, use relative distances; two images are close:
       – if they are returned by the same query
       – if they have the same label
  3. Example of a query
     • Especially a problem in QVE (Query by Visual Example)
     • Query: "mount royal park"
     • Compare images retrieved for the query vs. visually similar images
  4. Motivation II
     • Relationship to classification:
       – a similarity measure can be used as a metric for classification
       – good classification infers labels, which induce similarity across images
     • Constraining the similarity matrix to be positive semidefinite:
       – for small data, prevents overfitting
       – for big data, with enough samples, the constraint can be dropped to reduce computational cost
  5. Problem Statement
     • Learn a pairwise similarity function S on the given data from relative similarities of image pairs
     • Given data P and relative similarities r_ij = r(p_i, p_j)
     • We do not have access to all values of r; where a value is unavailable, it equals 0
     • S(p_i, p_j) must then satisfy:
       S(p_i, p_i^+) > S(p_i, p_i^-) for all p_i, p_i^+, p_i^- in P such that r(p_i, p_i^+) > r(p_i, p_i^-)
     • Parametric form: S_W(p_i, p_j) = p_i^T W p_j, where W in R^(d x d)
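The parametric form above is just a bilinear product; a minimal NumPy sketch (function name mine) makes the role of W concrete:

```python
import numpy as np

def bilinear_similarity(W, p_i, p_j):
    """S_W(p_i, p_j) = p_i^T W p_j for a learned d x d matrix W."""
    return p_i @ W @ p_j

# With W initialized to the identity, S_W reduces to the plain dot
# product, i.e. raw visual similarity on the feature vectors.
d = 4
W = np.eye(d)
p_i = np.array([1.0, 0.0, 1.0, 0.0])
p_j = np.array([1.0, 1.0, 0.0, 0.0])
assert bilinear_similarity(W, p_i, p_j) == p_i @ p_j
```

Learning then amounts to moving W away from the identity so that semantically related pairs score higher.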
  6. Online Algorithm
     • From the Passive-Aggressive family of online (iterative) learning algorithms
       – PA-I: w_(t+1) = argmin_(w in R^n) 1/2 ||w - w_t||^2, such that l(w; (x_t, y_t)) = 0
       – passive if the loss is 0
       – aggressive if the loss is positive: enforces l(w; (x_t, y_t)) = 0 regardless of the step size
       – PA-II: trades off proximity to the previous solution against the desired margin; a constrained optimization problem
  7. Online Algorithm II
     • We search for S that satisfies a safety margin of 1:
       S_W(p_i, p_i^+) > S_W(p_i, p_i^-) + 1
     • The hinge loss is defined as:
       l_W(p_i, p_i^+, p_i^-) = max{0, 1 - S_W(p_i, p_i^+) + S_W(p_i, p_i^-)}
       L_W = sum of l_W(p_i, p_i^+, p_i^-) over all triplets (p_i, p_i^+, p_i^-) in P
     • The PA-II constrained optimization problem at step i is then:
       W^i = argmin_W 1/2 ||W - W^(i-1)||^2_Fro + C*xi, such that l_W(p_i, p_i^+, p_i^-) <= xi and xi >= 0
       where C is the parameter that controls the trade-off between margin enforcement and proximity to the previous solution
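A sketch of a single passive-aggressive step consistent with this loss (function names and the value of C are mine): when the margin is violated, W moves along the outer-product direction just far enough to reduce the loss, capped by C.

```python
import numpy as np

def hinge_loss(W, p, p_pos, p_neg):
    """l_W = max(0, 1 - S_W(p, p+) + S_W(p, p-)) with S_W(a, b) = a^T W b."""
    return max(0.0, 1.0 - p @ W @ p_pos + p @ W @ p_neg)

def pa_update(W, p, p_pos, p_neg, C=0.1):
    """One passive-aggressive step on one triplet."""
    loss = hinge_loss(W, p, p_pos, p_neg)
    if loss == 0.0:                 # passive: margin already satisfied
        return W
    V = np.outer(p, p_pos - p_neg)  # gradient of the loss w.r.t. W
    tau = min(C, loss / (np.linalg.norm(V, "fro") ** 2))
    return W + tau * V              # aggressive: step toward zero loss

rng = np.random.default_rng(0)
d = 8
W = np.eye(d)
p, p_pos, p_neg = rng.random((3, d))
W_new = pa_update(W, p, p_pos, p_neg)
# the step never increases the loss on the current triplet
assert hinge_loss(W_new, p, p_pos, p_neg) <= hinge_loss(W, p, p_pos, p_neg)
```

Because the step direction V is the loss gradient, the loss on the current triplet is non-increasing, which is the "passive if satisfied, aggressive if violated" behaviour described above.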
  8. Online Algorithm III
     • A loss bound can be derived by rewriting the problem as a linear classification problem
  9. Sampling strategy
     • Uniformly sample p_i from P
     • Uniformly sample p_i^+ from images with the same category as p_i
     • Uniformly sample p_i^- from images that do not share a category with p_i
       – p_i^- can be chosen at random from all images if the number of categories and queries is very large
     • If the relevance feedback r(p_i, p_j) is not just a binary function, the sampling of positive examples can be changed to prioritize samples with higher relevance
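The category-based sampling can be sketched as follows (the `images_by_class` mapping and function name are my own simplification of the setup):

```python
import random

def sample_triplet(images_by_class, rng=random):
    """Sample (p_i, p_i+, p_i-): p_i and p_i+ share a class, p_i- does not.

    `images_by_class` maps a class label to its list of image ids.
    """
    # pick a class with at least two images so p_i != p_i+
    pos_class = rng.choice(
        [c for c, imgs in images_by_class.items() if len(imgs) >= 2])
    p_i, p_pos = rng.sample(images_by_class[pos_class], 2)
    # the negative comes from any other class
    neg_class = rng.choice([c for c in images_by_class if c != pos_class])
    p_neg = rng.choice(images_by_class[neg_class])
    return p_i, p_pos, p_neg

data = {"dog": ["d1", "d2", "d3"], "cat": ["c1", "c2"]}
p_i, p_pos, p_neg = sample_triplet(data)
```

With many categories, drawing the negative uniformly from all images (as the slide notes) is a cheap approximation, since a collision with p_i's category is then unlikely.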
  10. Image representation
     • Bag-of-words approach (bag of local descriptors):
       – get regions of interest
       – compute local descriptors
       – treat them independently
     • Divide the image into overlapping square blocks
     • Extract color and edge descriptors:
       – Edge: uniform Local Binary Patterns – differences of intensities over a circular neighborhood
         • 2^8 possible sequences = 256-bin histogram
         • non-uniform sequences can be merged into one bin, giving a 59-bin histogram
       – Color: histograms from k-means clustering
         • train a color codebook and map block pixels to the closest value in the codebook
       – Concatenate the descriptors at the end
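The 256-to-59 bin reduction follows from the "uniform" criterion: a circular bit pattern with at most two 0/1 transitions. A small sketch (function name mine) verifies the count:

```python
def is_uniform(pattern, bits=8):
    """A circular binary pattern is 'uniform' if it contains at most two
    0->1 / 1->0 transitions; 58 of the 256 8-bit patterns are uniform,
    and merging the non-uniform rest into one bin gives 58 + 1 = 59 bins."""
    transitions = sum(
        ((pattern >> i) & 1) != ((pattern >> ((i + 1) % bits)) & 1)
        for i in range(bits)
    )
    return transitions <= 2

uniform_count = sum(is_uniform(p) for p in range(256))
assert uniform_count == 58
```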
  11. Image representation II
     • Aim for a high-dimensional sparse vector representation
     • Each local descriptor is treated as a visual term; an image is represented as a binary vector indicating the presence/absence of visual terms
     • Visual terms are weighted by term frequency and inverse document frequency (tf-idf)
     • Parameters of the setup:
       – 20 bins for colors
       – 10,000-visterm vocabulary (approx. 70 non-zero values per image)
       – blocks of 64x64 pixels, overlapping every 32 pixels
       – blocks extracted at different scales, downscaling the image by a factor of 1.25 until fewer than 10 blocks remain
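The tf-idf weighting over visterms can be sketched as below; representing an image as a list of visterm ids, and returning only the non-zero weights, is my simplification of the sparse representation:

```python
import math
from collections import Counter

def tfidf_visterms(image_visterms, all_images):
    """Weight each visterm of one image by tf * idf over the image corpus.

    `image_visterms`: list of visterm ids found in one image.
    `all_images`: list of such lists, one per image in the corpus.
    Returns a sparse dict {visterm id: weight}.
    """
    n_docs = len(all_images)
    df = Counter()
    for img in all_images:
        df.update(set(img))          # document frequency per visterm
    tf = Counter(image_visterms)
    total = len(image_visterms)
    return {t: (tf[t] / total) * math.log(n_docs / df[t]) for t in tf}

corpus = [[1, 2, 2, 3], [2, 3], [3, 4]]
weights = tfidf_visterms(corpus[0], corpus)
# visterm 3 appears in every image, so its idf (and weight) is zero
assert weights[3] == 0.0
```

A dict of roughly 70 non-zero entries out of a 10,000-term vocabulary matches the sparsity figure quoted above.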
  12. Experiments and evaluation
     • Tested in two settings:
       – Caltech256 dataset (30k images)
       – web-scale experiment (2.7M images)
       – (other databases for image-retrieval testing: MIRFLICKR-1M, Corel5k, Corel30k, UCID)
     • Web-scale experiment:
       – queries from Google Image Search with relevance feedback
       – the stopping condition for training is the value of mean average precision (160M iterations, ~4000 min on a single CPU)
       – evaluation criteria: mAP and precision at top k
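For reference, the two evaluation criteria can be sketched in a few lines (function names mine; mAP is the mean of average precision over all queries):

```python
def precision_at_k(ranked, relevant, k):
    """Fraction of the top-k retrieved items that are relevant."""
    return sum(1 for item in ranked[:k] if item in relevant) / k

def average_precision(ranked, relevant):
    """Average of precision@k over the ranks k where a relevant item appears."""
    hits, precisions = 0, []
    for k, item in enumerate(ranked, start=1):
        if item in relevant:
            hits += 1
            precisions.append(hits / k)
    return sum(precisions) / len(relevant) if relevant else 0.0

ranked = ["a", "x", "b", "y"]
relevant = {"a", "b"}
assert precision_at_k(ranked, relevant, 2) == 0.5
assert average_precision(ranked, relevant) == (1/1 + 2/3) / 2
```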
  13. Failure cases
  14. Scalability
     • Comparison with Large Margin Nearest Neighbour (LMNN)
     • Scales linearly with the number of images
  15. Caltech256 test
  16. Discussion
     • Metric learning can help capture semantic relationships once visual similarity is available
     • Relevance feedback or a semantic similarity measure (class modeling) is required to capture semantic similarity
     • Compared to raw visual similarity, precision at top k and mAP increase
       – recall is hard to measure for databases that are not fully annotated
     • Online metric learning is an ongoing problem (Davis 2007; Jain 2008; Chechik 2010); although applied to images here, it can be used in other fields to capture semantic similarity:
       – images: object semantics vs. visual features
       – documents: topics vs. textual features (tf, tf-idf)
       – SBIR: relative object mapping vs. sketch features
  17. Thank you for your attention. Available at: