An Evaluation Methodology for Stereo Correspondence Algorithms

An Evaluation Methodology for
Stereo Correspondence Algorithms
Ivan Cabezas, Maria Trujillo and Margaret Florian
ivan.cabezas@correounivalle.edu.co

February 25th 2012
International Conference on Computer Vision Theory and Applications, VISAPP 2012, Rome - Italy

Multimedia and Vision Laboratory
 MMV is a research group of the Universidad del Valle in Cali, Colombia

Ivan Maria et al.

An Evaluation Methodology for Stereo Correspondence Algorithms, VISAPP 2012, Rome - Italy Slide 2

Ayax Inc.
 Ayax Inc. offers informatics solutions for decision analysis

Margaret


Content

 Stereo Vision
 Canonical Stereo Geometry and Disparity
 Ground-truth Based Evaluation
 Quantitative Evaluation Methodologies
 Middlebury’s Methodology
 A* Methodology
 A* Groups Methodology
 Experimental Results
 Final Remarks


Stereo Vision
 The stereo vision problem is to recover the 3D structure of the scene using
two or more images
3D World

Optics
Problem
Camera Inverse
System Problem

Disparity Map Reconstruction
Algorithm
2D Images

Left Right
Correspondence
Stereo Images Algorithm 3D Model

Yang Q. et al., Stereo Matching with Colour-Weighted Correlation, Hierarchical Belief Propagation, and Occlusion Handling, IEEE PAMI 2009


Canonical Stereo Geometry and Disparity
 Disparity is the distance between corresponding points

Accurate Estimation Inaccurate Estimation
P P

P’

Z Z’
pl pr pl pr
πl πr πl πr
pr ’
f f

Cl B Cr Cl B Cr

Trucco, E. and Verri A., Introductory Techniques for 3D Computer Vision, Prentice Hall 1998


Ground-truth Based Evaluation
 Ground-truth based evaluation is based on the comparison using disparity
ground-truth data

Scharstein, D. and Szeliski, R., High-accuracy Stereo Depth Maps using Structured Light, CVPR 2003
Tola, E., Lepetit, V. and Fua, P., A Fast Local Descriptor for Dense Matching, CVPR 2008
Strecha, C., et al. On Benchmarking Camera Calibration and Multi-View Stereo for High Resolution Imagery, CVPR 2008
http://www.zf-usa.com/products/3d-laser-scanners/

Quantitative Evaluation Methodologies

 The use of a methodology allows to:

 Assert specific components and procedures

 Tune algorithm's parameters

 Support decision for researchers and
practitioners

 Measure the progress on the field

Szeliski, R., Prediction Error as a Quality Metric for Motion and Stereo, ICCV 2000
Kostliva, J., Cech, J., and Sara, R., Feasibility Boundary in Dense and Semi-Dense Stereo Matching, CVPR 2007
Tomabari, F., Mattoccia, S., and Di Stefano, L., Stereo for robots: Quantitative Evaluation of Efficient and Low-memory Dense Stereo Algorithms, ICCARV 2010
Cabezas, I. and Trujillo M., A Non-Linear Quantitative Evaluation Approach for Disparity Estimation, VISAPP 2011

Middlebury’s Methodology

Select Test Bed Images Select Error Criteria

nonocc all disc

Select and Apply Stereo Algorithms Select Error Measures

ObjectStereo GC+SegmBorder PUTv3

Compute Error Measures
PatchMatch ImproveSubPix OverSegmBP

Scharstein, D. and Szeliski, R., http://vision.middlebury.edu/stereo/eval/, 2012

Middlebury’s Methodology (ii)

Select and Apply Stereo Algorithms Apply Evaluation Model

Algorithm nonocc all disc
ObjectStereo 2.20 1 6.99 2 6.36 1

GC+SegmBorder 4.99 6 5.78 1 8.66 5
Compute Error Measures PUTv3 2.40 2 9.11 6 6.56 2

PatchMatch 2.47 3 7.80 3 7.11 3

ImproveSubPix 2.96 4 8.22 4 8.55 4
OverSegmBP 3.19 5 8.81 5 8.89 6
ObjectStereo 2.20 6.99 6.36
GC+SegmBorder 4.99 5.78 8.66
PUTv3 2.40 9.11 6.56 Algorithm Average Final
PatchMatch 2.47 7.80 7.11 Rank Ranking
ImproveSubPix 2.96 8.22 8.55 ObjectStereo 1.33 1
OverSegmBP 3.19 8.81 8.89 PatchMatch 3.00 2
PUTv3 3.33 3
GC+SegmBorder 4.00 4
ImproveSubPix 4.00 5
OverSegmBP 5.33 6

Middlebury’s Methodology (iii)

Apply Evaluation Model Interpret Results

The ObjectStereo algorithm
produces accurate results
Middlebury’s
Evaluation Model

Algorithm Average Final
Rank Ranking
ObjectStereo 1.33 1
PatchMatch 3.00 2
PUTv3 3.33 3
GC+SegmBorder 4.00 4
ImproveSubPix 4.00 5
OverSegmBP 5.33 6

Scharstein, D. and Szeliski, R., A Taxonomy and Evaluation of Dense Two-Frame Stereo Correspondence Algorithms, IJCV 2002

Middlebury’s Methodology (iv): Weaknesses

 The Middlebury’s evaluation model have some shortcomings

 In some cases, the ranks are assigned arbitrarily

 The same average ranking does not imply the same performance (and
vice versa)

 The cardinality of the set of top-performer algorithms is a free parameter

 It operates values related to incommensurable measures


Middlebury’s Methodology (v): Weaknesses
 The BMP percentage measures the quantity of disparity estimation errors
exceeding a threshold

 The BMP measure have some shortcomings:

 It is sensitive to the threshold selection

 It ignores the error magnitude

 It ignores the inverse relation between depth and disparity

 It may conceal estimation errors of a large magnitude, and, also it may
penalise errors of small impact in the final 3D reconstruction

Cabezas, I., Padilla, V., and Trujillo M., A Measure for Accuracy Disparity Maps Evaluation, CIARP 2011
Gallup, D., et al. Variable Baseline/Resolution Stereo, CVPR, 2008

A* Methodology
 The A* evaluation methodology brings a theoretical background for the
comparison of stereo correspondence algorithms
 The set of algorithms under evaluation

 The set of estimated maps to be compared

 The function that produces a vector of error measures

 The set of vectors of error measures


A* Methodology (ii)
 The evaluation model of the A* methodology addresses the comparison of
stereo correspondence algorithms as a multi-objective optimisation problem
 It defines a partition over the set A (the decision space)

 Subject to:

 where ≺ denotes the Pareto Dominance relation:
Let p and q be two algorithms
Let Vp and Vq be a pair of vectors belonging to the objective space

Thus, three possible relations are considered


A* Methodology (iii): Pareto Dominance
 The Pareto Dominance defines a partial order relation

VGC+SegmBorder = < 4.99, 5.78, 8.66 >
VPatchMatch = < 2.47, 7.80, 7.11 >
VImproveSubPix = < 2.96, 8.22, 8.55 >

VGC+SegmBorder VPatchMatch
< 4.99, 5.78, 8.66 > < 2.47, 7.80, 7.11 >

GC+SegmBorder ~ PatchMatch

VPatchMatch VImproveSubPix
< 2.47, 7.80, 7.11 > < 2.96, 8.22, 8.55 >

Patchmatch ≺ ImproveSubPix

Van Veldhuizen, D., et al., Considerations in Engineering Parallel Multi-objective Evolutionary Algorithms, Trans in Evolutionary Computing 2003

A* Methodology (iv): Illustration


nonocc all disc





A* Methodology (v): Illustration
 The evaluation model performs the partitioning and the grouping of stereo
algorithms under evaluation, based on the Pareto Dominance relation

Compute Error Measures Apply Evaluation Model

ObjectStereo 2.20 6.99 6.36
ObjectStereo , GC+SegmBorder
GC+SegmBorder 4.99 5.78 8.66
PUTv3 2.40 9.11 6.56 PatchMatch , , ,
PUTv3 ImproveSubPix OverSegmBP

PatchMatch 2.47 7.80 7.11
ImproveSubPix 2.96 8.22 8.55 Algorithm nonocc all disc Set
OverSegmBP 3.19 8.81 8.89 ObjectStereo 2.20 6.99 6.36 A*
GC+SegmBorder 4.99 5.78 8.66 A*
PUTv3 2.40 9.11 6.56 A’
PatchMatch 2.47 7.80 7.11 A’
ImproveSubPix 2.96 8.22 8.55 A’
OverSegmBP 3.19 8.81 8.89 A’


A* Methodology (vi): Illustration
 Interpretation of results is based on the cardinality of the set A*


A* Evaluation Model

The Objectstereo and the
GC+SegmBorder algorithms
Algorithm nonocc all disc Set are, comparable among them, and have a
ObjectStereo 2.20 6.99 6.36 A* superior performance to the rest of
algorithms
GC+SegmBorder 4.99 5.78 8.66 A*
PUTv3 2.40 9.11 6.56 A’
PatchMatch 2.47 7.80 7.11 A’
ImproveSubPix 2.96 8.22 8.55 A’
OverSegmBP 3.19 8.81 8.89 A’


A* Methodology (vii): Strength and Weakness
 Strength: It allows a formal interpretation of results, based on the cardinality
of the set A*, and in regard to considered imagery test-bed

 Weakness: It does not allow an exhaustive evaluation of the entire set of
algorithms under evaluation
 It computes the set A* just once, and does not bring information about A’


A* Groups Methodology
 It extends the evaluation model of the A* methodology, incorporating the
capability of performing an exhaustive evaluation

subject to:

 It introduces the partitioningAndGrouping algorithm
A = Set ( { } );
A.load( “Algorithms.dat” );
A* = Set ( { } );
A’ = Set ( { } );
group = 1;
do {
computePartition( A, A*, A’, g, ≺ );
A*.save ( “A*_group_”+group );
group++;
A.update ( A’ ); // A = A / A*
A*.removeAll ( ); // A* = { }
A’.removeAll ( ); // A’ = { }
}while ( ! A.isEmpty ( ) );

A* Groups Methodology (ii): Sigma-Z-Error

 The A* Groups methodology uses the Sigma-Z-Error
(SZE) measure

 The SZE measure has the following properties:
 It is inherently related to depth reconstruction in a stereo system

 It is based on the inverse relation between depth and disparity

 It considers the magnitude of the estimation error

 It is threshold free

Cabezas, I., Padilla, V., and Trujillo M., A Measure for Accuracy Disparity Maps Evaluation, CIARP 2011

A* Groups Methodology (iii): Illustration
 The evaluation process of selected algorithms by using the proposal


nonocc all disc





A* Groups Methodology (iv): Illustration
 The evaluation model performs the partitioning and the grouping of stereo
algorithms under evaluation, based on the Pareto Dominance relation

Compute Error Measures Apply Evaluation Model

ObjectStereo 73.88 117.90 36.25
GC+SegmBorder ,, PatchMatch
GC+SegmBorder 50.48 64.90 24.33
PUTv3 99.67 333.37 53.79 ObjectStereo , PUTv3 , ImproveSubPix , OverSegmBP
PatchMatch 49.95 261.84 32.85
Algorithm nonocc all disc Group
ImproveSubPix 50.66 97.94 32.01
OverSegmBP 58.65 108.60 34.58 GC+SegmBorder 50.48 64.90 24.33 1
PatchMatch 49.95 261.84 32.85 1
PUTv3 99.67 333.37 53.79
ImproveSubPix 50.66 97.94 32.01
OverSegmBP 58.65 108.60 34.58
ObjectStereo 73.88 117.90 36.25


A* Groups Methodology (v): Illustration

Apply Evaluation Model

ObjectStereo , PUTv3, ImproveSubPix , OverSegmBP
ObjectStereo, PUTv3 , OverSegmBP

PUTv3 99.67 333.37 53.79 Algorithm nonocc all disc
ImproveSubPix 50.66 97.94 32.01 PUTv3 99.67 333.37 53.79
OverSegmBP 58.65 108.60 34.58 OverSegmBP 58.65 108.60 34.58
ObjectStereo 73.88 117.90 36.25 ObjectStereo 73.88 117.90 36.25

OverSegmBP

ImproveSubPix
PUTv3 , ObjectStereo

ObjectStereo , PUTv3 , OverSegmBP
Algorithm nonocc all disc Group

Algorithm nonocc all disc Group OverSegmBP 58.65 108.60 34.58 3
PUTv3 99.67 333.37 53.79
ImproveSubPix 50.66 97.94 32.01 2
ObjectStereo 73.88 117.90 36.25
PUTv3 99.67 333.37 53.79
ObjectStereo 73.88 117.90 36.25
OverSegmBP 58.65 108.60 34.58
And so on …

A* Groups Methodology (vi): Illustration
 Interpretation of results is based on the cardinality of each group


A* Groups
Evaluation Model

There are 5 groups of different performance

The GC+SegmBorder and the PatchMatch
algorithms are, comparable among them,
and have a superior performance to the rest
Algorithm nonocc all disc Group of algorithms
GC+SegmBorder 50.48 64.90 24.33 1
The ImproveSubPix algorithm is superior to
PatchMatch 49.95 261.84 32.85 1 the OverSegmBP, the ObjectStereo, and
ImproveSubPix 50.66 97.94 32.01 2 the PUTv3 algorithms
OverSegmBP 58.65 108.60 34.58 3
…
ObjectStereo 73.88 117.90 36.25 4
PUTv3 99.67 333.37 53.79 5 The PUTv3 algorithm has the lowest
performance


Experimental Results
 The conducted evaluation involves the following elements:

Test Bed Images

Error Criteria nonocc , all , disc

Error Measures SZE , BMP

Stereo Algorithms 112 algorithms from the Middlebury’s repository

Evaluation Models A* Groups Middlebury


Experimental Results (ii)

Algorithm Group Middlebury’s
Ranking
ADCensus 2 1
AdaptingBP 2 2
CoopRegion 2 3
DoubleBP 1 4
RDP 2 5
OutlierConf 2 6
Algorithm Strategy Group Middlebury’s SubPixDoubleBP 2 7
Ranking SurfaceStereo 2 8
DoubleBP Global 1 4 WarpMat 2 9
PatchMatch Local 1 11 ObjectStereo 2 10
GC+SegmBorder Global 1 13 PatchMatch 1 11
FeatureGC Global 1 18 Undr+OverSeg 2 12
Segm+Visib Global 1 29 GC+SegmBorder 1 13
MultiresGC Global 1 30 InfoPermeable 2 14
DistinctSM Local 1 34 CostFilter 2 15
GC+occ Global 1 67
MultiCamGC Global 1 68


Final Remarks
 The use of the A* Groups methodology allows to perform an exhaustive
evaluation, as well as an objective interpretation of results

 Innovative results in regard to the comparison of stereo correspondence
algorithms were obtained using proposed methodology and the SZE error
measure

 The introduced methodology offers advantages over the conventional
approaches to compare stereo correspondence algorithms

 Authors are already working in order to provide to the research community an
accessible way to use the introduced methodology


An Evaluation Methodology for
Stereo Correspondence Algorithms
Ivan Cabezas, Maria Trujillo and Margaret Florian
ivan.cabezas@correounivalle.edu.co

February 25th 2012
International Conference on Computer Vision Theory and Applications, VISAPP 2012, Rome, Italy

An Evaluation Methodology for Stereo Correspondence Algorithms

Recommended

Recommended

More Related Content

What's hot

What's hot (15)

Viewers also liked

Viewers also liked (14)

Similar to An Evaluation Methodology for Stereo Correspondence Algorithms

Similar to An Evaluation Methodology for Stereo Correspondence Algorithms (20)

An Evaluation Methodology for Stereo Correspondence Algorithms