    Modern features-part-4-evaluation: Presentation Transcript

    • HARVEST Programme. Feature evaluation: old and new benchmarks, and new software. Andrea Vedaldi, University of Oxford.
    • Benchmarking: why and how
      - Dozens of feature detectors and descriptors have been proposed.
      - Benchmarks compare methods empirically and help select the best method for a task.
      - Public benchmarks enable reproducible research and simplify your life!
      - Ingredients of a benchmark: theory, data, software.
    • Outline
      - Indirect evaluation: repeatability and matching score. Data: affine covariant testbed.
      - Direct evaluation: image retrieval. Data: Oxford 5K.
      - Software: VLBenchmarks.
    • Indirect feature evaluation
      - Intuition: test how well features persist and can be matched across image transformations.
      - Data: must be representative of the transformations (viewpoint, illumination, noise, etc.).
      - Performance measures: repeatability (persistence of features) and matching score (matchability of features).
      K. Mikolajczyk, T. Tuytelaars, C. Schmid, A. Zisserman, J. Matas, F. Schaffalitzky, T. Kadir, and L. Van Gool. A comparison of affine region detectors. IJCV, 65(1):43–72, 2005.
    • [Figure: example of a feature frame]
    • Affine Testbed: viewpoint, scale, rotation. [Figure: sample test images]
    • Affine Testbed: lighting, compression, blur. [Figure: sample test images]
    • Detector repeatability: intuition
      - Two pairs of features correspond; another pair of features does not.
      - Repeatability = 2/3.
    • Region overlap: formal definition
      Given the homography H between the two images and regions Ra and Rb,
        overlap(a, b) = |Ra ∩ HRb| / |Ra ∪ HRb| = area of intersection / area of union.
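      A brute-force way to compute this measure is to rasterise both regions on a common pixel grid and take the ratio of intersecting to united pixels. The sketch below assumes each region is an ellipse (x−c)' S⁻¹ (x−c) ≤ 1 with centre c and 2×2 positive-definite shape matrix S, and that Rb has already been mapped into the reference image through H (in practice via an affine approximation of H around the region centre, so that HRb stays an ellipse); the function names and the representation are illustrative, not VLBenchmarks' actual implementation.

        % Approximate ellipse overlap by rasterisation (illustrative sketch).
        function ov = ellipseOverlap(ca, Sa, cb, Sb)
          % Bounding box generously covering both ellipses.
          r = 2 * sqrt(max([eig(Sa); eig(Sb)]));
          lo = min(ca, cb) - r; hi = max(ca, cb) + r;
          [x, y] = meshgrid(linspace(lo(1), hi(1), 500), linspace(lo(2), hi(2), 500));
          inA = inEllipse(x, y, ca, Sa);
          inB = inEllipse(x, y, cb, Sb);
          ov = nnz(inA & inB) / nnz(inA | inB);  % area intersection / area union
        end

        function in = inEllipse(x, y, c, S)
          A = inv(S);  % region is (x-c)' * inv(S) * (x-c) <= 1
          dx = x - c(1); dy = y - c(2);
          in = A(1,1)*dx.^2 + 2*A(1,2)*dx.*dy + A(2,2)*dy.^2 <= 1;
        end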
    • Region overlap: intuition
      [Figure, from "A Comparison of Affine Region Detectors": examples of ellipses overlapping by different amounts; overlap error = 1 − overlap, arising from differences in the size, orientation and position of the ellipses]
      - Usually features are tested at 40% overlap error (= 60% overlap).
    • Normalised region overlap: intuition
      - Larger scale = better overlap, so rescale such that the reference region has an area of 30² pixels.
    • Normalised region overlap: formal definition
      1. Detect: regions Ra, Rb and homography H.
      2. Warp: Rb → HRb.
      3. Normalise: sRa, sHRb with s = √(30² / |Ra|), so that |sRa| = 30².
      4. Intersection over union: overlap(a, b) = |sRa ∩ sHRb| / |sRa ∪ sHRb|.
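      The normalisation only fixes a common scale factor from the reference region's area. A one-line sketch, using the ellipse representation of the previous snippet (the area of an ellipse with shape matrix Sa is π√det(Sa), and scaling lengths by s scales areas by s²):

        areaRa = pi * sqrt(det(Sa));  % area of the reference ellipse Ra
        s = sqrt(30^2 / areaRa);      % apply s to both regions before computing the overlap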
    • Detector repeatability: formal definition
      1. Find the features in the common area (H = homography): {Ra : a ∈ A}, {Rb : b ∈ B}.
      2. Threshold the overlap score: sab = overlap(a, b) if overlap(a, b) ≥ 1 − εo, and −∞ otherwise.
      3. Find geometric matches (greedy bipartite approximation): M* = argmax over bipartite M of Σ(a,b)∈M sab.
      repeatability(A, B) = |M*| / min{|A|, |B|}
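      The greedy bipartite approximation can be sketched in a few lines: repeatedly pick the highest-overlap admissible pair and retire both features. The precomputed overlap matrix O and the threshold name below are illustrative, not the benchmark's actual code.

        % Greedy one-to-one matching on an overlap matrix O (|A| x |B|).
        function rep = greedyRepeatability(O, epsO)
          O(O < 1 - epsO) = -inf;           % threshold the overlap scores
          [nA, nB] = size(O);
          nMatches = 0;
          while true
            [best, idx] = max(O(:));        % best remaining pair
            if ~isfinite(best), break; end  % no admissible pairs left
            [a, b] = ind2sub(size(O), idx);
            nMatches = nMatches + 1;
            O(a, :) = -inf; O(:, b) = -inf; % each feature matches at most once
          end
          rep = nMatches / min(nA, nB);
        end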
    • Descriptor matching score: intuition
      - In addition to being stable, features must be visually distinctive.
      - The descriptor matching score is similar to repeatability, but matches are constructed by comparing descriptors.
    • Descriptor matching score: formal definition
      1. Find the features in the common area (H = homography): {Ra : a ∈ A}, {Rb : b ∈ B}.
      2. Descriptor distances: dab = ‖da − db‖₂.
      3. Descriptor matches: M*d = argmin over bipartite M of Σ(a,b)∈M dab.
      4. Geometric matches (as before): M* = argmax over bipartite M of Σ(a,b)∈M sab.
      match-score(A, B) = |M*d ∩ M*| / min{|A|, |B|}
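      The same greedy matcher can serve both matchings, run once on (negated) descriptor distances and once on overlaps, counting the pairs selected by both. A sketch under the same illustrative conventions as above (pdist2 is from the Statistics Toolbox):

        % Matching score from descriptors Da (|A| x d), Db (|B| x d)
        % and the thresholded overlap matrix O of the repeatability sketch.
        function ms = matchingScore(Da, Db, O)
          D  = pdist2(Da, Db);                 % dab = ||da - db||_2
          Md = greedyMatch(-D);                % descriptor matches: smallest distance first
          Mg = greedyMatch(O);                 % geometric matches: largest overlap first
          common = intersect(Md, Mg, 'rows');  % pairs selected by both matchings
          ms = size(common, 1) / min(size(Da, 1), size(Db, 1));
        end

        function M = greedyMatch(S)            % greedy one-to-one matching on scores S
          M = zeros(0, 2);
          while true
            [best, idx] = max(S(:));
            if ~isfinite(best), break; end
            [a, b] = ind2sub(size(S), idx);
            M(end+1, :) = [a, b];              %#ok<AGROW>
            S(a, :) = -inf; S(:, b) = -inf;
          end
        end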
    • Example of a repeatability graph
      [Figure: repeatability vs. viewpoint angle (20–60 degrees) on the graf sequence, for VL DoG, VL Hessian, VL HessianLaplace, VL HarrisLaplace (each also at double resolution), VLFeat SIFT, CMP Hessian, VGG hes, VGG har]
    • Outline (recap): next, direct evaluation via image retrieval on the Oxford 5K data.
    • Indirect evaluation
      - A "synthetic" performance measure in a "synthetic" setting.
      - The good: independent of a specific application/implementation; allows evaluating single components, e.g.
        ▪ repeatability of a detector
        ▪ matching score of a descriptor
      - The bad: difficult to design well; unclear correlation with performance in applications.
    • Direct evaluation
      - Performance of a real system using a feature: object instance retrieval, object category recognition, object detection, text recognition, semantic segmentation, ...
      - The good: tied to the "real" performance of the feature.
      - The bad: tied to one application and, worse, to one implementation; difficult to evaluate single aspects of a feature.
      - In the following we focus on object instance retrieval.
    • Image retrieval: used here to evaluate features. [Figure: example images]
    • Image retrieval pipeline: represent images as bags of features
      input image → detector → frames {f1, ..., fn} → descriptor → descriptors {d1, ..., dn}
      - Detectors: Harris (Laplace), Hessian (Laplace), DoG, MSER, Harris Affine, Hessian Affine, ...
      - Descriptors: SIFT, LIOP, BRIEF, Jets, ...
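      With the VLFeat toolbox, for example, the detector and descriptor boxes collapse into one call; vl_sift is VLFeat's DoG detector plus SIFT descriptor, and the file name below is illustrative.

        % Bag of features for one image: DoG frames + SIFT descriptors.
        im = imread('query.jpg');                   % illustrative file name
        if size(im, 3) == 3, im = rgb2gray(im); end
        [frames, descrs] = vl_sift(im2single(im));  % frames: 4 x n, descrs: 128 x n uint8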
    • Image retrieval pipeline, step 1: find the neighbours of each query descriptor in the database, sorted by increasing descriptor distance.
      H. Jégou, M. Douze, and C. Schmid. Exploiting descriptor distances for precise image search. Technical Report 7656, INRIA, 2011.
    • Image retrieval pipeline, step 2: each query descriptor casts a vote for each DB image. The neighbour at rank i, with distance di, votes with strength max{dk − di, 0}, where dk is the distance of the k-th (last retained) neighbour.
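      As a sketch, the votes contributed by a single query descriptor might be accumulated as follows (the function and argument names are illustrative):

        function votes = castVotes(dists, imageIds, numDbImages)
          % dists:    k x 1 distances to the k nearest DB descriptors, sorted ascending
          % imageIds: k x 1 index of the DB image each neighbour comes from
          dk = dists(end);  % distance of the k-th neighbour
          votes = accumarray(imageIds(:), max(dk - dists(:), 0), [numDbImages, 1]);
        end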
    • Image retrieval pipeline, step 3: sort the DB images by decreasing total votes.
      [Figure: ranked results marked ✔/✗ and the resulting precision-recall curve; the Average Precision (AP), the area under the curve, is 35% in the example]
    • Image retrieval pipeline, step 4: overall performance score. Each query's ranked retrieval results yield an AP (35%, 100%, 75%, ... in the example); averaging over all queries gives the Mean Average Precision (mAP), 53% here.
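      AP itself is easy to compute from the ranked relevance labels; the standard formula below (precision averaged at the rank of each relevant image) is a sketch with illustrative names, not the benchmark's exact scoring code.

        function ap = averagePrecision(isRelevant, numRelevant)
          % isRelevant:  logical vector over the ranked DB images (true = correct match)
          % numRelevant: total number of relevant DB images for this query
          ranks = find(isRelevant(:))';          % ranks at which the hits occur
          precisionAtHits = (1:numel(ranks)) ./ ranks;
          ap = sum(precisionAtHits) / numRelevant;
        end

        % mAP is then the mean of ap over all queries.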
    • Oxford 5K data: a retrieval benchmark dataset
      [Figure: a query and retrieved images marked ✔/✗]
      - ~5K images of Oxford; for each of 58 queries, about XX matching images and about XX confounder images.
      - Larger datasets are possible, but slow for extensive evaluation.
      - The relative ranking of features seems to be representative.
    • Outline (recap): next, software: VLBenchmarks.
    • VLBenchmarks: a new easy-to-use benchmarking suite
      http://www.vlfeat.org/benchmarks/index.html
      - A novel MATLAB framework for feature evaluation:
        ▪ repeatability and matching scores (VGG affine testbed)
        ▪ image retrieval (Oxford 5K)
      - Goodies: simple-to-use MATLAB code; automatically downloads datasets and runs evaluations; backward compatible with published results.
    • VLBenchmarks: obtaining and installing the code
      - Installation: download the latest version, unpack the archive, launch MATLAB and type
        >> install
      - Requirements: MATLAB R2008a (7.6) or later, and a C compiler (e.g. Visual Studio, GCC, or Xcode). Do not forget to set up MATLAB to use your C compiler:
        >> mex -setup
    • Example usage
        import datasets.*; import localFeatures.*; import benchmarks.*;
        detector  = VggAffine('Detector', 'haraff');      % choose a detector (Harris Affine, VGG version)
        dataset   = VggAffineDataset('Category', 'graf'); % choose a dataset (graffiti sequence)
        benchmark = RepeatabilityBenchmark();             % choose a test (detector repeatability)
        repeatability = benchmark.testFeatureExtractor(detector, ...
            dataset.getTransformation(2), ...
            dataset.getImagePath(1), ...
            dataset.getImagePath(2));                     % run the evaluation (repeatability = 0.66)
    • Testing on a sequence of images
        import datasets.*; import localFeatures.*; import benchmarks.*;
        detector  = VggAffine();
        dataset   = VggAffineDataset('Category', 'graf');
        benchmark = RepeatabilityBenchmark();
        parfor j = 1:5                                    % use parfor on a cluster!
          repeatability(j) = benchmark.testFeatureExtractor(detector, ...
              dataset.getTransformation(j), ...
              dataset.getImagePath(1), ...
              dataset.getImagePath(j));
        end
    • [Figure: repeatability vs. image number for the sequence]
    • Comparing two features
        import datasets.*; import localFeatures.*; import benchmarks.*;
        detectors{1} = VlFeatSift();
        detectors{2} = VlFeatCovdet('EstimateAffineShape', true);
        dataset   = VggAffineDataset('Category', 'graf');
        benchmark = RepeatabilityBenchmark();
        for d = 1:2
          for j = 1:5
            repeatability(j,d) = ...
                benchmark.testFeatureExtractor(detectors{d}, ...
                                               dataset.getTransformation(j), ...
                                               dataset.getImagePath(1), ...
                                               dataset.getImagePath(j));
          end
        end
        clf; plot(repeatability, 'linewidth', 4);
        xlabel('image number'); ylabel('repeatability'); grid on;
    • [Figure: repeatability vs. image number for the two detectors; the curve with affine adaptation is higher]
    • Example: compare SIFT, MSER, and features on a grid on the Graffiti sequence, for repeatability and number of correspondences.
      [Figure: repeatability and number of correspondences vs. viewpoint angle (20–60 degrees) for the three feature types]
    • Backward compatible
      - Previously published results can be easily reproduced; if interested, try the script reproduceIjcv05.m.
      [Figure: front page of K. Mikolajczyk, T. Tuytelaars, C. Schmid, A. Zisserman, J. Matas, F. Schaffalitzky, T. Kadir, and L. Van Gool, "A Comparison of Affine Region Detectors", IJCV, DOI 10.1007/s11263-005-3848-x]
    • Other useful tricks
      - Compare different parameter settings:
        detectors{1} = VggAffine('Detector', 'haraff', 'Threshold', 500);
        detectors{2} = VggAffine('Detector', 'haraff', 'Threshold', 1000);
      - Visualising matches:
        [~, ~, matches, reprojFrames] = benchmark.testFeatureExtractor(...);
        benchmarks.helpers.plotFrameMatches(matches, reprojFrames);
      [Figure: matched and unmatched reference/test image frames, for SIFT and for a mean-variance-median descriptor, on image 4 of the VggAffineDataset-graf dataset]
    • Other benchmarks
      - Detector matching score:
        benchmark = RepeatabilityBenchmark('mode', 'MatchingScore');
      - Image retrieval (example: Oxford 5K lite, mAP evaluation):
        dataset   = VggRetrievalDataset('Category', 'oxbuild', 'BadImagesNum', 100);
        benchmark = RetrievalBenchmark();
        mAP = benchmark.testFeatureExtractor(detectors{d}, dataset);
    • Summary
      http://www.vlfeat.org/benchmarks/index.html
      - Benchmarks: indirect (repeatability and matching score) and direct (image retrieval).
      - VLBenchmarks: a simple-to-use, convenient MATLAB framework.
      - The future: existing measures have many shortcomings; hopefully better benchmarks will be available soon, and they will be added to VLBenchmarks for your convenience.
    • Credits: Karel Lenc, Varun Gulshan, Krystian Mikolajczyk, Tinne Tuytelaars, Jiri Matas, Cordelia Schmid, Andrew Zisserman. HARVEST Programme.
    • Thank you for coming!
      VLFeat: http://www.vlfeat.org/
      VLBenchmarks: http://www.vlfeat.org/benchmarks/