Large Scale Online Learning of Image Similarity Through Ranking
 

Presentation of the paper "Large Scale Online Learning of Image Similarity Through Ranking" for the Synchromedia Seminar on 15 September 2012.

    Presentation Transcript

    • Large Scale Online Learning of Image Similarity Through Ranking
      Paper by G. Chechik, V. Sharma, U. Shalit, S. Bengio – JMLR 2010
      Presented by Lukas Tencer
    • Motivation
      • Needed for applications that compare any kind of data: images, video, web pages, documents
      • Two levels of similarity:
        – Features (visual, for images)
        – Semantic
      • Large-scale learning is limited by computational cost, not by the availability of data
      • Which similarity does the user want to express: visual or semantic?
      • The presented approach deals with semantic similarity once visual similarity is available
      • Similarity learning usually requires pairwise distances, which are not always available
      • Instead of pairwise distances, use relative distances: two images are close
        – if they are returned by the same query
        – if they have the same label
    • Example of query
      • The problem is especially pronounced in QVE (Query by Visual Example)
      • Query: “mount royal park”
      • Images retrieved for the query vs. visually similar images
    • Motivation II
      • Relationship to classification:
        – The similarity measure can be used as a metric for classification
        – Good classification infers labels, which induce similarity across images
      • Constraining the similarity matrix to be positive semidefinite:
        – for small data, prevents overfitting
        – for big data, with enough samples, the constraint can be dropped to reduce computational cost
    • Problem Statement
      • Learn a pairwise similarity function S on the given data from relative pairs of image similarities
      • Given data P and relative similarities r_ij = r(p_i, p_j)
      • We do not have access to all values of r; where a value is not available, it equals 0
      • The goal is a similarity S such that
        S(p_i, p_i+) > S(p_i, p_i−) for all p_i, p_i+, p_i− ∈ P such that r(p_i, p_i+) > r(p_i, p_i−)
      • S is parametrized as the bilinear form S_W(p_i, p_j) = p_i^T W p_j, where W ∈ R^(d×d)
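As a small illustration of the bilinear similarity S_W(p_i, p_j) = p_i^T W p_j defined above, here is a minimal Python sketch (function and variable names are my own, not from the paper):

```python
import numpy as np

def bilinear_similarity(W, p_i, p_j):
    """S_W(p_i, p_j) = p_i^T W p_j for a learned d x d matrix W.

    W is not required to be symmetric or positive semidefinite;
    the slides note that the constraint can be dropped at scale."""
    return p_i @ W @ p_j

# Toy check: with W = I the score reduces to the plain dot product.
W = np.eye(4)
p_i = np.array([1.0, 0.0, 2.0, 0.0])
p_j = np.array([0.5, 1.0, 1.0, 0.0])
score = bilinear_similarity(W, p_i, p_j)   # 1*0.5 + 2*1.0 = 2.5
```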
    • Online Algorithm
      • Based on the Passive-Aggressive (PA) family of online (iterative) learning algorithms
      • PA-I:
        w_{t+1} = argmin_{w ∈ R^n} (1/2) ||w − w_t||^2, such that l(w; (x_t, y_t)) = 0
        – Passive if the loss is 0
        – Aggressive if the loss is positive: the constraint l(w; (x_t, y_t)) = 0 is enforced regardless of the step size
      • PA-II: trades off proximity to the previous solution against the desired margin – a constrained optimization problem
    • Online Algorithm II
      • We search for S_W that satisfies a safety margin of 1:
        S_W(p_i, p_i+) ≥ S_W(p_i, p_i−) + 1
      • The hinge loss per triplet is
        l_W(p_i, p_i+, p_i−) = max{0, 1 − S_W(p_i, p_i+) + S_W(p_i, p_i−)}
        and the total loss is L_W = Σ_{(p_i, p_i+, p_i−) ∈ P} l_W(p_i, p_i+, p_i−)
      • The PA-II constrained optimization problem at step i is then
        W^i = argmin_W (1/2) ||W − W^{i−1}||_Fro^2 + C ξ, such that l_W(p_i, p_i+, p_i−) ≤ ξ and ξ ≥ 0
        where C is the parameter that controls the trade-off between margin enforcement and proximity of the solution
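The optimization above has a closed-form per-triplet step in the paper: passive when the hinge loss is zero, otherwise W moves along V = p_i (p_i+ − p_i−)^T with step size τ = min{C, l_W / ||V||²_Fro}. A minimal sketch with a toy triplet (names and data are illustrative):

```python
import numpy as np

def oasis_update(W, p, p_pos, p_neg, C=0.1):
    """One passive-aggressive OASIS-style step on a triplet.

    Passive when the hinge loss is zero; otherwise move W just far
    enough toward zero loss, with the step size capped by C."""
    loss = max(0.0, 1.0 - p @ W @ p_pos + p @ W @ p_neg)
    if loss == 0.0:
        return W, loss                        # passive: nothing to do
    V = np.outer(p, p_pos - p_neg)            # direction of the update
    tau = min(C, loss / np.sum(V * V))        # ||V||_Fro^2 in the denominator
    return W + tau * V, loss

# Toy triplet that violates the margin under W = I.
W = np.eye(3)
p     = np.array([1.0, 0.0, 0.0])
p_pos = np.array([0.0, 1.0, 0.0])   # hypothetically same category as p
p_neg = np.array([1.0, 0.0, 0.0])   # hypothetically different category
W_new, loss_before = oasis_update(W, p, p_pos, p_neg)
loss_after = max(0.0, 1.0 - p @ W_new @ p_pos + p @ W_new @ p_neg)
```

One update strictly reduces the hinge loss on this triplet, which is the "aggressive" half of the passive-aggressive behavior.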
    • Online Algorithm III
      • A loss bound can be derived by rewriting the problem as a linear classification problem
    • Sampling strategy
      • Uniformly sample p_i from P
      • Uniformly sample p_i+ from images in the same category
      • Uniformly sample p_i− from images that do not share a category with p_i
        – p_i− can be chosen at random from all images if the number of categories and the number of queries are very large
      • If the relevance feedback r(p_i, p_j) is not just a binary function, the sampling of positive examples can be changed to prioritize samples with higher relevance
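The sampling steps above can be sketched as follows (the dict-of-labels layout, names, and toy data are illustrative assumptions, not the paper's code):

```python
import random

def sample_triplet(images_by_label, rng=random):
    """Uniformly sample (p_i, p_i+, p_i-) as described on the slide.

    images_by_label: hypothetical dict mapping a category label to
    the list of image ids carrying that label."""
    labels = sorted(images_by_label)
    label = rng.choice(labels)
    p_i = rng.choice(images_by_label[label])
    p_pos = rng.choice(images_by_label[label])        # shares p_i's category
    neg_label = rng.choice([l for l in labels if l != label])
    p_neg = rng.choice(images_by_label[neg_label])    # no shared category
    return p_i, p_pos, p_neg

data = {"cat": ["c1", "c2"], "dog": ["d1", "d2"]}
p_i, p_pos, p_neg = sample_triplet(data)
```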
    • Image representation
      • Bag-of-words approach (bag of local descriptors):
        – find regions of interest
        – calculate local descriptors
        – treat them independently
      • Divide the image into overlapping square blocks
      • Extract color and edge descriptors:
        – Edge: uniform Local Binary Patterns – differences of intensities in a circular neighborhood
          • 2^8 possible sequences = 256-bin histogram
          • Non-uniform sequences can be merged → 59-bin histogram
        – Color: histograms from k-means clustering
          • Train a color codebook and map block pixels to the closest value in the codebook
        – Concatenate the descriptors in the end
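To make the edge descriptor concrete, here is a minimal sketch of the basic 8-neighbour LBP code for a single 3×3 patch (illustrative only: the descriptor on the slide uses a circular interpolated neighbourhood and merges the non-uniform patterns into the 59-bin histogram):

```python
def lbp_code(patch):
    """Basic Local Binary Pattern code for the centre pixel of a
    3x3 patch: threshold the 8 neighbours against the centre and
    read the resulting bits as one byte (0-255)."""
    c = patch[1][1]
    neighbours = [patch[0][0], patch[0][1], patch[0][2], patch[1][2],
                  patch[2][2], patch[2][1], patch[2][0], patch[1][0]]
    return sum(1 << i for i, v in enumerate(neighbours) if v >= c)

# Bright top row, dark elsewhere: only the first three bits are set.
code = lbp_code([[9, 9, 9],
                 [0, 5, 0],
                 [0, 0, 0]])   # 1 + 2 + 4 = 7
```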
    • Image representation II
      • Aim for a high-dimensional sparse vector representation
      • Each local descriptor becomes a visual term (visterm); an image is represented as a binary vector indicating the presence/absence of each visterm
      • Visterms are weighted by term frequency and inverse document frequency (tf-idf)
      • Parameters of the setup:
        – 20 bins for colors
        – visterm vocabulary size of 10,000 (approx. 70 non-zero values per image)
        – blocks of 64×64 pixels, overlapping every 32 pixels
        – blocks extracted at different scales, by downscaling the image by a factor of 1.25 until fewer than 10 blocks remain
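The tf-idf weighting of visterms mentioned above can be sketched as follows (data layout and names are illustrative assumptions):

```python
import math
from collections import Counter

def tfidf_weights(image_visterms, doc_freq, n_images):
    """Weight each visterm by term frequency times inverse document
    frequency; a visterm occurring in every image gets weight 0."""
    tf = Counter(image_visterms)
    return {t: tf[t] * math.log(n_images / doc_freq[t]) for t in tf}

# "a" occurs in 1 of 2 images (informative), "b" in both (uninformative).
weights = tfidf_weights(["a", "a", "b"], doc_freq={"a": 1, "b": 2}, n_images=2)
```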
    • Experiments and evaluation
      • Tested in two settings:
        – Caltech256 dataset (30k images)
        – Web-scale experiment (2.7M images)
        – (other datasets for image-retrieval testing: MIRFLICKR-1M, Corel5k, Corel30k, UCID)
      • Web-scale experiment:
        – Queries from Google Image Search and relevance feedback
        – The stopping condition for training is a target mean average precision (160M iterations, ~4000 min on a single CPU)
        – Evaluation criteria: mAP and precision at top k
    • Failure cases
    • Scalability
      • Compared with Large Margin Nearest Neighbor (LMNN)
      • Scales linearly with the number of images
    • Caltech 256 test
    • Discussion
      • Metric learning can help capture semantic relationships once visual similarity is available
      • Relevance feedback or a semantic similarity measure (class modeling) is required to capture semantic similarity
      • Compared to raw visual similarity comparison, precision at top k and mAP increase
      • Recall is hard to measure for databases that are not fully annotated
      • Online metric learning is an ongoing research problem (Davis 2007, Jain 2008, Chechik 2010); even though applied to images here, it can be used in other fields to capture semantic similarity:
        – Images: object semantics vs. visual features
        – Documents: topics vs. textual features (dtf, tf-idf)
        – SBIR: relative object mapping vs. sketch features
    • Thank you for your attention
      Available at: http://www.slideshare.net/lukastencer