The University of Ontario
CS 433/557
Algorithms for Image Analysis
Template Matching
Acknowledgements: Dan Huttenlocher
Matching and Registration
Template Matching
• intensity based (correlation measures)
• feature based (distance transforms)
Flexible Templates
• pictorial structures
– Dynamic Programming on trees
– generalized distance transforms
Extra Material:
Intensity Based Template Matching
Basic Idea
Left ventricle template
Find best template “position” in the image
Face template
Intensity-Based
Rigid Template matching
[Figure: image coordinate system and template coordinate system related by shift s; pixel p in template T maps to pixel p+s in the image]
For each position s of the template compute
some goodness of “match” measure Q(s)
  Q(s) = 1 / (1 + α · ∑_{p∈T} |I(p+s) − T(p)|²)

e.g. based on the sum of squared differences, summed over all pixels p in template T
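As a sketch, the SSD-based measure Q(s) above can be implemented directly, with a brute-force search over all shifts s (function names are mine, not from the slides):

```python
import numpy as np

def match_quality_ssd(image, template, s, alpha=1.0):
    """Q(s) = 1 / (1 + alpha * sum_{p in T} |I(p+s) - T(p)|^2)."""
    h, w = template.shape
    y, x = s
    window = image[y:y + h, x:x + w].astype(float)
    ssd = np.sum((window - template.astype(float)) ** 2)
    return 1.0 / (1.0 + alpha * ssd)

def best_position(image, template):
    """Exhaustive search over all shifts s; returns the argmax of Q(s)."""
    H, W = image.shape
    h, w = template.shape
    scores = {(y, x): match_quality_ssd(image, template, (y, x))
              for y in range(H - h + 1) for x in range(W - w + 1)}
    return max(scores, key=scores.get)
```

A perfect match gives ssd = 0 and hence Q(s) = 1, the maximum possible value.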
Intensity-Based
Rigid Template matching
[Figure: two template positions s1 and s2 in the image coordinate system]

  Q(s1) < Q(s2)
Search over all plausible positions s and find the optimal
one that has the largest goodness of match value Q(s)
Intensity-Based
Rigid Template matching
What if intensities of your image are not exactly the
same as in the template? (e.g. may happen due to
different gain setting at image acquisition)
Other intensity based
goodness of match measures
Normalized correlation

  Q(s) = ∑_{p∈T} I(p+s)·T(p) / √( ∑_{p∈T} I(p+s)² · ∑_{p∈T} T(p)² )

Mutual Information (next slide)
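A minimal sketch of the normalized-correlation measure (the square root in the denominator is the standard normalization; it makes Q(s) invariant to a global gain factor, which addresses the gain-setting problem above; names are mine):

```python
import numpy as np

def match_quality_ncc(image, template, s):
    """Q(s) = sum I(p+s)*T(p) / sqrt(sum I(p+s)^2 * sum T(p)^2)."""
    h, w = template.shape
    y, x = s
    win = image[y:y + h, x:x + w].astype(float)
    T = template.astype(float)
    denom = np.sqrt(np.sum(win ** 2) * np.sum(T ** 2))
    return np.sum(win * T) / denom if denom > 0 else 0.0
```

Scaling the image intensities by any positive constant leaves Q(s) unchanged.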
Other goodness of match measures :
Mutual Information
Will work even in extreme cases
In this example the spatial structure of template
and image object are similar while actual
intensities are completely different
Other goodness of match measures :
Mutual Information
Fix s and consider the joint histogram of intensity "pairs" (T_p, I_{p+s}) for p ∈ T
• Mutual information between template T and image I (for a given transformation s) describes the "peakedness" of the joint histogram
• measures how well spatial structures in T and I align

[Figure: joint histograms of T vs. I intensities for two positions: the joint histogram is spread out for s1 and more concentrated (peaked) for s2]
Mutual Information
(technical definition)
Assuming two random variables X and Y, their mutual information is

  MI(X, Y) = e(X) + e(Y) − e(X, Y)

where the entropy and joint entropy e for random variables X and Y are

  e(X) = −∑_{x ∈ range(X)} Pr(x)·ln Pr(x)        (marginal histogram/distribution)
  e(X, Y) = −∑_{x,y} Pr(x, y)·ln Pr(x, y)        (joint histogram/distribution)

Entropy measures the "peakedness" of a histogram/distribution.
Mutual Information
Computing MI for a given position s
We want to find s that maximizes MI, which can be written as

  MI = ∑_{x,y} Pr(x, y) · ln( Pr(x, y) / (Pr(x)·Pr(y)) )

where Pr(x, y) is the joint distribution (normalized joint histogram) of template intensities x = T_p and image intensities y = I_{p+s} for a fixed given s, and the marginal distributions Pr(x) and Pr(y) are

  Pr(x) = ∑_y Pr(x, y),    Pr(y) = ∑_x Pr(x, y)

NOTE: one has to be careful when computing this. For example, what if H(x, y) = 0 for a given pair (x, y)? (Use the convention 0·ln 0 = 0.)
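A small sketch of computing MI from the joint histogram, skipping empty bins via the 0·ln 0 = 0 convention (the bin count and function names are my choices):

```python
import numpy as np

def mutual_information(t_vals, i_vals, bins=8):
    """MI of template/image intensity pairs from their joint histogram."""
    H, _, _ = np.histogram2d(t_vals, i_vals, bins=bins)
    pxy = H / H.sum()          # joint distribution Pr(x, y)
    px = pxy.sum(axis=1)       # marginal Pr(x)
    py = pxy.sum(axis=0)       # marginal Pr(y)
    nz = pxy > 0               # avoid ln(0): convention 0*ln(0) = 0
    ratio = pxy[nz] / (px[:, None] * py[None, :])[nz]
    return float(np.sum(pxy[nz] * np.log(ratio)))
```

A constant template carries no information about the image (MI = 0), while identical intensity sequences give a peaked joint histogram and positive MI.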
Finding optimal template position s
Need to search over all feasible values of s
• Template T could be large
– The bigger the template T the more time we spend
computing goodness of match measure at each s
• Search space (of feasible positions s) could be huge
– Besides translation/shift, position s could include scale,
rotation angle, and other parameters (e.g. shear)
Q: Efficient search over all s?
Finding optimal template position s
One possible solution: Hierarchical Approach
1. Subsample both template and image.
Note that the search space can be
significantly reduced. The template
size is also reduced.
2. Once a good solution(s) is found at a
coarser scale, go to a finer scale. Refine
the search in the neighborhood of the
coarser-scale solution.
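The two steps above can be sketched roughly as follows, with a single 2× subsampling level and an SSD cost (the refinement radius and all names are my choices):

```python
import numpy as np

def subsample(a):
    """Halve resolution by taking every second pixel."""
    return a[::2, ::2]

def search(image, template, candidates):
    """Return the shift with the lowest SSD cost among candidate shifts."""
    h, w = template.shape
    best, best_cost = None, np.inf
    for (y, x) in candidates:
        if 0 <= y <= image.shape[0] - h and 0 <= x <= image.shape[1] - w:
            cost = np.sum((image[y:y + h, x:x + w] - template) ** 2)
            if cost < best_cost:
                best, best_cost = (y, x), cost
    return best

def coarse_to_fine(image, template):
    # 1. match on subsampled copies: 4x fewer positions, 4x smaller template
    small = subsample(image)
    coarse = search(small, subsample(template),
                    [(y, x) for y in range(small.shape[0])
                            for x in range(small.shape[1])])
    # 2. refine at full resolution in a neighborhood of the coarse solution
    cy, cx = 2 * coarse[0], 2 * coarse[1]
    neighborhood = [(cy + dy, cx + dx)
                    for dy in range(-2, 3) for dx in range(-2, 3)]
    return search(image, template, neighborhood)
```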
Feature-Based Template Matching
Features: edges, corners,… (found via filtering)
Distance transforms of binary images
Chamfer and Hausdorff matching
Iterated Closest Points
Feature-based Binary Templates/Models
What are they?
What are features?
• Object edges, corners, junctions, etc.
– Features can be detected by the corresponding image filters
• Intensity can also be considered a feature but it may not
be very robust (e.g. due to illumination changes)
A model (binary template) is a set of feature points

  M = {M_1, ..., M_n} ⊂ R^N

in N-dimensional space (also called feature space)
• Each feature is defined by a descriptor (vector) M_i ∈ R^N
Binary Feature Templates (Models)
2D example
• Links may represent neighborhood relationships
between the features of the model
• Model’s features are represented by points
– descriptor could be a 2D vector
specifying feature position with respect to
model’s coordinate system
– Feature spaces could be 3D (or higher).
E.g., the position of an edge in a medical
volume is a 3D vector. But even in 2D
images edge features can be described by
3D vectors (add the edge's angular orientation
to its 2D location)

[Figure: model features M_i, M_j and a reference point in a 2D feature space]

For simplicity, we will mainly concentrate on 2D feature space examples
Matching Binary Template to Image
L – model's positioning
L ⊕ M_i – position of feature i

At a fixed position L we can compute the match quality Q(L) using some goodness-of-match criterion.

Example: Q(L) = number of (exact) matches (in red) between model and image features (e.g. edges).

The object is detected at all positions L̂ that are local maxima of the function Q(L) such that Q(L̂) > K, where K is some presence threshold.
Exact feature matching is not robust
Counting exact matches may be sensitive to even minor deviation
in shape between the model and the actual object appearance
Distance Transform
More robust goodness of match measures use
distance transform of image features
1. Detect desirable image features
(edges, corners, etc.) using
appropriate filters
2. For all image pixels p find
distance D(p) to the nearest
image feature
[Figure: D(p) = 0 for a pixel p on a feature; D(q) > 0 off the features; D(s) > D(q) for a pixel s farther away]
Distance Transform

[Figure: binary image features (2D) and the corresponding Distance Transform values on the pixel grid]
The Distance Transform is a function D_I(·) that for each image
pixel p assigns a non-negative number D_I(p) corresponding to
the distance from p to the nearest feature in the image I
Distance Transform

The Distance Transform D_I can be visualized as a gray-scale image.

[Figure: image features (edges) and the corresponding Distance Transform D_I]
Distance Transform
can be very efficiently computed
Metric properties of
discrete Distance Transforms

Manhattan (L1) metric: the two-pass algorithm uses a forward mask (weight 1 for the top and left neighbors) and a backward mask (weight 1 for the bottom and right neighbors); the set of equidistant points is a diamond.

Euclidean (L2) metric: adding the diagonal neighbors with weight 1.4 to the masks gives a better approximation of the Euclidean metric. The exact Euclidean Distance Transform can also be computed fairly efficiently (in linear time) without bigger masks: www.cs.cornell.edu/~dph/matchalgs/
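A sketch of the two-pass chamfer distance transform with the masks described above (straight weight 1, optional diagonal weight 1.4; names are mine):

```python
import numpy as np

def chamfer_dt(features, straight=1.0, diagonal=None):
    """Two-pass chamfer distance transform of a binary feature image.
    diagonal=None uses 4-neighbors only (Manhattan / L1 metric);
    diagonal=1.4 better approximates the Euclidean metric."""
    INF = float("inf")
    H, W = features.shape
    D = np.where(features, 0.0, INF)
    offs_f = [(-1, 0, straight), (0, -1, straight)]   # forward mask: top, left
    offs_b = [(1, 0, straight), (0, 1, straight)]     # backward mask: bottom, right
    if diagonal is not None:
        offs_f += [(-1, -1, diagonal), (-1, 1, diagonal)]
        offs_b += [(1, -1, diagonal), (1, 1, diagonal)]
    for y in range(H):                                # forward pass (raster order)
        for x in range(W):
            for dy, dx, w in offs_f:
                yy, xx = y + dy, x + dx
                if 0 <= yy < H and 0 <= xx < W:
                    D[y, x] = min(D[y, x], D[yy, xx] + w)
    for y in range(H - 1, -1, -1):                    # backward pass (reverse order)
        for x in range(W - 1, -1, -1):
            for dy, dx, w in offs_b:
                yy, xx = y + dy, x + dx
                if 0 <= yy < H and 0 <= xx < W:
                    D[y, x] = min(D[y, x], D[yy, xx] + w)
    return D
```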
Goodness of Match via
Distance Transforms
At each model position one can “probe” distance transform
values at locations specified by model (template) features
[Figure: model (template) features overlaid on the Distance Transform grid; the model "probes" the DT values under its features]
Use distance transform
values as evidence of
proximity to image features.
Goodness of Match Measures
using Distance Transforms
Chamfer Measure
• sum distance transform values “probed” by template features
Hausdorff Measure
• k-th largest value of the distance transform at locations
“probed” by template features
• (Equivalently) number of template features with “probed”
distance transform values less than fixed (small) threshold
– Count template features “sufficiently” close to image features
Spatially coherent matching
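The first two measures might be probed as in this sketch, where `D` is a precomputed distance transform and `probes` are the image locations L ⊕ M_i of the model features (all names are mine):

```python
import numpy as np

def chamfer_measure(D, probes):
    """Sum of the distance-transform values probed by the template features."""
    return sum(D[y, x] for (y, x) in probes)

def hausdorff_measure(D, probes, k=1):
    """k-th largest probed distance-transform value."""
    return sorted((D[y, x] for (y, x) in probes), reverse=True)[k - 1]

def hausdorff_fraction(D, probes, tau=1.0):
    """Equivalent view: number of template features whose probed DT value
    is below a fixed (small) threshold tau."""
    return sum(1 for (y, x) in probes if D[y, x] < tau)
```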
Hausdorff Matching

Counting matches with a dilated set of image features.
Spatial Coherence of Feature Matches

[Figure: two model positions L′ and L″, each with 50% of features matched, illustrating spatially incoherent matches]

Spatial coherence:
• Few "discontinuities" between neighboring features
• Neighborhood is defined by links between template/model features
Spatially Coherent Matching

Separate template/model features into three subsets:
• Matchable (red): near image features
• Boundary (blue circle): matchable but "near" un-matchable features (links define "near" for model features)
• Un-matchable (gray): far from image features

Count the number of non-boundary matchable features.
Spatially Coherent Matching

[Figure: two model positions L; the percentage of non-boundary matchable features (spatially coherent matches) is 0% at one position and ≈50% at the other]
Comparing different match measures
[Figure: binary model (edges) and a 5% clutter image]

• Monte Carlo experiments with known object location and synthetic clutter and occlusion
  – matching edge locations
• Varying percent clutter
  – probability of an edge pixel 2.5–15%
• Varying occlusion
  – single missing interval 10–25% of the boundary
• Search over location, scale, orientation
Comparing different match measures:
ROC curves
Probability of false alarm versus detection
- 10% and 15% of occlusion with 5% clutter
-Chamfer is lowest, Hausdorff (f=0.8) is highest
-Chamfer truncated distance better than trimmed
ROC's for
Spatial Coherence Matching

[Figure: four ROC plots (false-alarm rate FA vs. correct-detection rate CD, both on [0, 1]) for β = 0 and β > 0: clutter 3% / occlusion 20%, clutter 5% / occlusion 20%, clutter 5% / occlusion 40%, clutter 3% / occlusion 40%]

• The parameter β defines the degree of connectivity between model features
• If β = 0 then model features are not connected at all. In this case, spatially coherent matching reduces to plain Hausdorff matching.
Edge Orientation Information
Match edge orientation (in addition to location)
• Edge normals or gradient direction
3D model feature space (2D location + orientation)
Extract 3D (edge) features from image as well.
Requires 3D distance transform of image features
• weight orientation versus location
• fast forward-backward pass algorithm applies
Increases detection robustness and speeds up matching
• better able to discriminate object from clutter
• better able to eliminate cells in branch and bound search
ROC's for Oriented Edge Pixels

Vast improvement for moderate clutter
• Images with 5% randomly generated contours
• Good for 20–25% occlusion rather than 2–5%

[Figure: ROC curves for oriented edges vs. location only]
Efficient search for
good matching positions L
Distance transform of observed image features needs to
be computed only once (fast operation).
Need to compute match quality for all possible
template/model locations L (global search)
• Use hierarchical approach to efficiently prune the search space.
Alternatively, gradient descent from a given initial
position (e.g. Iterative Closest Point algorithm, …later)
• Easily gets stuck at local minima
• Sensitive to initialization
Global Search
Hierarchical Search Space Pruning
Assume that the entire box might be pruned out if the
match quality is sufficiently bad in the center of the box
(how? … in a moment)
Global Search
Hierarchical Search Space Pruning
If a box is not pruned then subdivide it into smaller
boxes and test the centers of these smaller boxes.
Global Search
Hierarchical Search Space Pruning
•Continue in this fashion until the object is localized.
Pruning a Box
(preliminary technicality)
Location L′ is uniformly better than L″ if

  D_I(L′ ⊕ M_i) ≤ D_I(L″ ⊕ M_i)   for all model features i

A uniformly better location is guaranteed to have better match quality!

[Figure: distance-transform values probed by the model at two positions L′ and L″]
Pruning a Box
(preliminary technicality)

[Figure: a hypothetical location λ whose probed distance-transform values lower-bound those of every location in the box]

Assume that λ is uniformly better than any location L ∈ Box. Then the match quality satisfies Q(λ) ≥ Q(L) for any L ∈ Box.

If the presence test fails at λ (Q(λ) < K for a given threshold K) then any location L ∈ Box must also fail the test.

The entire box can be pruned by one test at λ !!!
Building λ for a Box of "Radius" n

Take λ at the center of the box.

• the value of the distance transform changes at most by 1 between neighboring pixels
• so the value D(p_i) probed at the box center can decrease by at most n (the box radius) at other box positions:

  D(p_i) − n ≤ D(L ⊕ M_i)   for any L ∈ Box

• therefore let λ probe the values max{ D(p_i) − n, 0 }

[Figure: distance-transform values probed at the box center and the corresponding lower bounds max{D(p_i) − n, 0} for λ]
Global Hierarchical Search
(Branch and Bound)
Hierarchical search works in more
general case where “position” L
includes translation, scale, and
orientation of the model
• N-dimensional search space
Guaranteed or admissible search
heuristic
• Bound on how good answer could be in
unexplored region
– can not miss an answer
• In worst case won’t rule anything out
In practice rule out vast majority of
template locations (transformations)
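One possible sketch of the box-pruning search over translations, using the max{D − n, 0} bound built above with chamfer cost (lower is better, presence test cost < K). The conservative L1 box radius and the restriction to non-negative model offsets are my simplifying assumptions:

```python
import numpy as np

def chamfer_cost(D, offsets, pos):
    """Sum of DT values probed by the model placed at pos (lower = better)."""
    y, x = pos
    return sum(D[y + dy, x + dx] for (dy, dx) in offsets)

def box_lower_bound(D, offsets, center, n):
    """Cost of a hypothetical location uniformly better than every location
    in a box of L1 radius n around `center`: since the DT changes by at
    most 1 between neighboring pixels, probe max(D - n, 0)."""
    y, x = center
    return sum(max(D[y + dy, x + dx] - n, 0.0) for (dy, dx) in offsets)

def branch_and_bound(D, offsets, K):
    """All translations whose chamfer cost passes the test cost < K,
    pruning whole boxes with one bound evaluation at the box center.
    Assumes non-negative model offsets (dy, dx)."""
    H, W = D.shape
    ymax = H - max(dy for dy, _ in offsets)
    xmax = W - max(dx for _, dx in offsets)
    found = []

    def visit(y0, y1, x0, x1):                 # half-open box of shifts
        cy, cx = (y0 + y1) // 2, (x0 + x1) // 2
        n = (y1 - y0) + (x1 - x0)              # conservative L1 radius
        if box_lower_bound(D, offsets, (cy, cx), n) >= K:
            return                             # the whole box fails the test
        if y1 - y0 == 1 and x1 - x0 == 1:
            if chamfer_cost(D, offsets, (y0, x0)) < K:
                found.append((y0, x0))
            return
        ys = [(y0, cy), (cy, y1)] if y1 - y0 > 1 else [(y0, y1)]
        xs = [(x0, cx), (cx, x1)] if x1 - x0 > 1 else [(x0, x1)]
        for ya, yb in ys:                      # subdivide surviving boxes
            for xa, xb in xs:
                visit(ya, yb, xa, xb)

    visit(0, ymax, 0, xmax)
    return found
```

The over-estimated radius only weakens the bound, so the search never misses an answer; in the worst case it simply prunes nothing.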
Local Search (gradient descent):
Iterated Closest Point algorithm
ICP: Iterate until convergence
1. Estimate the correspondence between each template feature i and
some image feature located at F(i) (Fitzgibbon: use the DT)
2. Move the model to minimize the sum of distances between the
corresponding features (like chamfer matching):

  ΔL ∼ −∇ ∑_i || L ⊕ M_i − F(i) ||²

Alternatively, find a local move ΔL of the model
improving the DT-based match quality function Q(L):

  ΔL ∼ −∇Q(L)
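A minimal ICP sketch for a 2D point model restricted to pure translation, using brute-force nearest-neighbor correspondences rather than a DT (all names are mine):

```python
import numpy as np

def icp_translation(model, features, iters=20):
    """Minimal ICP for a 2D point model under pure translation:
    1. pair each model point with its nearest image feature F(i),
    2. move the model by the mean residual (the closed-form minimizer
       of sum ||model_i + t - F(i)||^2 over translations t),
    and iterate until the step vanishes."""
    model = np.asarray(model, float)
    features = np.asarray(features, float)
    t = np.zeros(2)
    for _ in range(iters):
        moved = model + t
        # nearest-feature correspondence F(i) for every model point i
        d2 = ((moved[:, None, :] - features[None, :, :]) ** 2).sum(-1)
        F = features[np.argmin(d2, axis=1)]
        step = (F - moved).mean(axis=0)
        if np.linalg.norm(step) < 1e-9:
            break
        t += step
    return t
```

As the slides note, this only converges to a local minimum near the initialization.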
Problems with ICP
and gradient descent matching
Slow
• Can take many iterations
• ICP: each iteration is slow due to search for
correspondences
– Fitzgibbon: improve this by using the DT
No convergence guarantees
• Can get stuck in local minima
– Not much to do about this
– Can be improved by using robust distance measures (e.g.
truncated Euclidean measure)
Observations on DT-based matching

Main point of DT: it allows measuring match quality
without explicitly finding correspondences between pairs
of model and image features (a hard problem!)
Hierarchical search over entire transformation space
Important to use robust distance
• Straight Chamfer very sensitive to outliers
• Truncated DT can be computed very fast
Fast exact or approximate methods for DT (L2 metric)
For edge features use orientation too
• edge normals or intensity gradients
Rigid 2D templates
Should we really care?
So far we studied matching in case of 2D images
and rigid 2D templates/models of objects
• When do rigid 2D templates work?
– there are rigid 2D objects (e.g. fingerprints)
– 3D object may be imaged from the same view point:
• controlled image-based data bases (e.g. photos of employees,
criminals)
• 2D satellite images always view 3D objects from above
• X-rays, microscope photography, etc.
More general 3D objects
3D image volumes and 3D objects
• Distance transforms, DT-based matching criteria,
and hierarchical search techniques easily generalize
• Mainly medical applications
2D image and 3D objects
• 3D objects may be represented by a collection of 2D
templates (e.g. tree-structured templates, next slide)
• 3D objects may be represented by flexible 2D
templates (soon)
Tree-structured templates
Larger pair-wise differences higher in tree
Tree-structured templates
- Rule out multiple templates simultaneously
- Speeds up matching
- Coarse-to-fine search where coarse granularity can rule out many templates
- Applies to a variety of DT-based matching measures: Chamfer, Hausdorff, robust Chamfer
Flexible Templates
• parts connected by springs and
appearance models for each part
• Used for human bodies, faces
• Fischler & Elschlager, 1973 –
considerable recent work (e.g.
Felzenszwalb & Huttenlocher, 2003 )
Flexible Template combines
a number of rigid templates
connected by flexible
strings
Flexible Templates
Why?

To account for significant deviations between the proportions of a generic model (e.g. an average face template) and the multitude of actual object appearances.

Non-rigid (3D) objects may consist of multiple rigid parts with (relatively) view-independent 2D appearance.
Flexible Templates:
Formal Definition
Set of parts V = {v_1, ..., v_n}
Positioning configuration L = {l_1, ..., l_n}
• specifies the locations of the parts
Appearance model m_i(l_i)
• matching quality of part i at location l_i
Edges e_ij = (v_i, v_j) ∈ E for connected parts
• explicit dependency between edge-connected parts
Interaction/connection energy C_ij(l_i, l_j)
• e.g. elastic energy C_ij(l_i, l_j) = ||l_i − l_j||²
Flexible Templates:
Formal Definition
Find the configuration L (locations of all parts) that
minimizes

  E(L) = ∑_i m_i(l_i) + ∑_{ij ∈ E} C_ij(l_i, l_j)

Difficulty depends on the graph structure
• Which parts are connected (E) and how (C)
General case: exponential time
Flexible Templates:
simplistic example from the past
Discrete Snakes
• What graph?
• What appearance model?
• What connection/interaction model?
• What optimization algorithm?
  E(L) = ∑_i m_i(l_i) + ∑_{ij ∈ E} C_ij(l_i, l_j)

[Figure: discrete snake with control points v_1 ... v_6]
Flexible Templates:
special cases
Pictorial Structures
• What graph?
• What appearance model?
-intensity based match measure
-DT based match measure (binary templates)
• What connection/interaction model?
-elastic springs
• What optimization algorithm?
  E(L) = ∑_i m_i(l_i) + ∑_{ij ∈ E} C_ij(l_i, l_j)
Dynamic Programming for
Flexible Template Matching
DP can be used for minimization
of E(L) for tree graphs (no loops!)
  E(L) = ∑_i m_i(l_i) + ∑_{ij ∈ E} C_ij(l_i, l_j)

[Figure: tree-structured model with parts v_1 ... v_11 (no loops)]
Dynamic Programming for
Flexible Template Matching
DP algorithm on trees
• Choose a post-order traversal for any selected "root" site/part
• Compute E_i(l) = m_i(l) for all "leaf" parts
• Process a part after its children are processed:
  – if part i has only one child a:
      E_i(l) = m_i(l) + min_{l_a} { E_a(l_a) + C_{a,i}(l_a, l) }
  – if part i has two (or more) children a, b, ...:
      E_i(l) = m_i(l) + min_{l_a} { E_a(l_a) + C_{a,i}(l_a, l) } + min_{l_b} { E_b(l_b) + C_{b,i}(l_b, l) } + ...
• Select the best-energy position for the "root" and backtrack to the "leaves"

[Figure: tree with parts v_1 ... v_11 and v_11 chosen as root]
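The recursions above might be sketched as follows, with parts given by a children map, positions as plain indices, and `C` taking position indices (all names are my choices):

```python
def dp_tree(children, m, C, root):
    """Minimize E(L) = sum_i m[i][l_i] + sum_{edges} C(l_i, l_j) on a tree.
    children: dict part -> list of child parts; m: per-part cost lists.
    Returns (minimal energy, optimal position of every part)."""
    npos = len(m[root])
    choice = {}                                  # backtracking pointers

    def energy(i):
        # E_i(l) = m_i(l) + sum over children a of min_la {E_a(la) + C(la, l)}
        E = list(m[i])
        for a in children.get(i, []):
            Ea = energy(a)
            for l in range(npos):
                la = min(range(npos), key=lambda x: Ea[x] + C(x, l))
                choice[(a, l)] = la
                E[l] += Ea[la] + C(la, l)
        return E

    Eroot = energy(root)
    lroot = min(range(npos), key=Eroot.__getitem__)
    L = {root: lroot}                            # backtrack root -> leaves
    stack = [root]
    while stack:
        i = stack.pop()
        for a in children.get(i, []):
            L[a] = choice[(a, L[i])]
            stack.append(a)
    return Eroot[lroot], L
```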
Dynamic Programming for
Flexible Template Matching
DP's complexity on trees: O(n·m²)
(same as for 1D snakes)
– n parts, m positions
– OK complexity for local search where "m" is relatively small (e.g. in snakes), e.g. for tracking a flexible model from frame to frame in a video sequence
Local Search
Tracking Flexible Templates
Searching in the whole image (large m)

m = image size, or m = image size × rotations

Then the O(n·m²) complexity is not good.

For some interaction models C this can be improved to O(n·m) based on the
Generalized Distance Transform (from Computational Geometry).

This is an amazing complexity for matching n dependent parts:
note that n·m is the number of operations for finding n independent matches.
Generalized Distance Transform

Idea: improve the efficiency of the key computational step
(performed for each parent-child pair, n times):

  Ê(y) = min_x { E(x) + C(x, y) }        (m² operations)

Intuitively: if x and y describe all feasible positions of "parts" in
the image, then the energy functions E(x) and Ê(y) can be thought
of as gray-scale images (e.g. like responses of the original image
to some filters).
Generalized Distance Transform

Idea: improve the efficiency of the key computational step
(m² operations performed for each parent-child pair):

  Ê(y) = min_x { E(x) + C(x, y) }

Let C(x, y) = α·||x − y|| (a distance between x and y) – a
reasonable interaction model!

Then Ê(y) is called a
Generalized Distance Transform of E(x).
From Distance Transform to
Generalized Distance Transform
Assuming

  E(x) = 0 if x is an image feature, +∞ otherwise

then

  Ê(y) = min_x { E(x) + ||x − y|| }

is the standard Distance Transform (of the image features).

[Figure: E(x) equals 0 at the locations of binary image features and +∞ elsewhere]
From Distance Transform to
Generalized Distance Transform
For a general E(x) and any fixed α,

  Ê(y) = min_x { E(x) + α·||x − y|| }

is called the Generalized Distance Transform of E(x).

E(x) may represent non-binary image features (e.g. image intensity gradient).

Depending on α, Ê(y) may prefer the strength of E(x) to proximity.
Algorithm for computing
Generalized Distance Transform
Straightforward generalization of the
forward-backward pass algorithm for
standard Distance Transforms:
• Initialize to E(x) instead of
    δ(x) = 0 if x is an image feature, +∞ otherwise
• Use α instead of 1
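A 1D sketch of this generalization: the same two passes as the standard DT, but initialized to E(x) and stepping by α instead of 1 (names are mine):

```python
def generalized_dt(E, alpha=1.0):
    """1D Generalized Distance Transform:
    Ehat(y) = min_x { E(x) + alpha * |x - y| },
    computed by the standard forward/backward passes,
    initialized to E(x) and stepping by alpha instead of 1."""
    Ehat = list(E)
    for x in range(1, len(Ehat)):              # forward pass
        Ehat[x] = min(Ehat[x], Ehat[x - 1] + alpha)
    for x in range(len(Ehat) - 2, -1, -1):     # backward pass
        Ehat[x] = min(Ehat[x], Ehat[x + 1] + alpha)
    return Ehat
```

With E set to 0 at features and +∞ elsewhere and α = 1, this reduces exactly to the standard distance transform, as the previous slides state.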
Flexible Template Matching
Complexity
Computing

  Ê(y) = min_x { E(x) + α·||x − y|| }

via the Generalized Distance Transform takes O(m) operations, previously O(m²)
(m = the number of positions x and y).

This improves the complexity of Flexible Template Matching
to O(n·m) in the case of interactions C(x, y) = α·||x − y||.
“Simple” Flexible Template Example:
Central Part Model
Consider the special case in which parts translate
with respect to a common origin
• E.g., useful for faces

• Parts V = {v_1, ..., v_n}
• Distinguished central part v_1
• Connect each v_i to v_1
• Elastic spring costs

NOTE: for simplicity (only) we consider part positions l_i that are
translations only (no rotation or scaling of parts)
Central Part Model example

The "ideal" location of l_i w.r.t. l_1 is given by T_i(l_1) = l_1 + o_i,
where o_i is a fixed translation vector for each i > 1.

[Figure: central part v_1 at l_1 with offset vectors o_2, o_3 pointing to parts v_2, v_3 at locations l_2, l_3]

String cost for deformation from this "ideal" location:

  α_i · || l_i − T_i(l_1) ||

Whole template energy:

  m_1(l_1) + ∑_{i>1} { m_i(l_i) + α_i·|| l_i − T_i(l_1) || }
Central Part Model
Summary of search algorithm
Matching cost:

  m_1(l_1) + ∑_{i>1} { m_i(l_i) + α_i·|| l_i − T_i(l_1) || }

1. For each non-central part i > 1 compute the matching cost m_i(l_i)
   for all possible positions l_i of that part in the image
2. For each i > 1 compute the Generalized DT of m_i:
     Ê_i(y) = min_x { m_i(x) + α_i·||x − y|| }
3. For all possible positions l_1 of the central part compute the energy
     Ê_1(l_1) = m_1(l_1) + ∑_{i>1} Ê_i(T_i(l_1))
4. Select the best location l_1, or select all locations whose match
   quality passes a fixed threshold

Overall complexity: O(n·m)
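The four steps might be sketched as follows for 2D cost images under the L1 metric (a uniform α for all parts and the out-of-bounds handling are my simplifications):

```python
import numpy as np

def generalized_dt_2d(E, alpha=1.0):
    """2D Generalized DT under the L1 metric,
    Ehat(y) = min_x { E(x) + alpha * ||x - y||_1 },
    via one forward and one backward raster pass."""
    Ehat = E.astype(float)
    H, W = Ehat.shape
    for y in range(H):                          # forward pass
        for x in range(W):
            if y > 0: Ehat[y, x] = min(Ehat[y, x], Ehat[y - 1, x] + alpha)
            if x > 0: Ehat[y, x] = min(Ehat[y, x], Ehat[y, x - 1] + alpha)
    for y in range(H - 1, -1, -1):              # backward pass
        for x in range(W - 1, -1, -1):
            if y < H - 1: Ehat[y, x] = min(Ehat[y, x], Ehat[y + 1, x] + alpha)
            if x < W - 1: Ehat[y, x] = min(Ehat[y, x], Ehat[y, x + 1] + alpha)
    return Ehat

def central_part_match(m, offsets, alpha=1.0):
    """Central part model: E1(l1) = m1(l1) + sum_{i>1} Ehat_i(l1 + o_i).
    `m` is a list of per-part matching-cost images (m[0] is the central
    part), `offsets` are the fixed translations o_i. Returns the best l1."""
    H, W = m[0].shape
    E1 = m[0].astype(float)
    for mi, (oy, ox) in zip(m[1:], offsets):    # steps 1-2: costs and their GDTs
        Ehat = generalized_dt_2d(mi, alpha)
        for y in range(H):                      # step 3: probe GDT at l1 + o_i
            for x in range(W):
                yy, xx = y + oy, x + ox
                E1[y, x] += Ehat[yy, xx] if 0 <= yy < H and 0 <= xx < W else np.inf
    return np.unravel_index(np.argmin(E1), E1.shape)   # step 4: best l1
```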
Central Part Model
for face detection
Search Algorithm for tree-based pictorial structures – O(n·m)

• The algorithm is basically the same as for the Central Part Model.
• Each "parent" part knows the ideal positions of its "child" parts.
• String deformations are accounted for by the Generalized Distance Transform
  of the children's positioning energies.

Lec10 matching

  • 1.
    The University of Ontario CS433/557 Algorithms for Image Analysis Template Matching Acknowledgements: Dan Huttenlocher
  • 2.
    The University of Ontario CS433/557 Algorithms for Image Analysis Matching and Registration Template Matching • intensity based (correlation measures) • feature based (distance transforms) Flexible Templates • pictorial structures – Dynamic Programming on trees – generalized distance transforms Extra Material:
  • 3.
    The University of Ontario IntensityBased Template Matching Basic Idea Left ventricle template Find best template “position” in the image Face template image image
  • 4.
    The University of Ontario Intensity-Based RigidTemplate matching image coordinate system s template coordinate system pixel p in template T pixel p+s in image For each position s of the template compute some goodness of “match” measure Q(s) 2 |)()(|1 1 )( pTspI sQ Tp ∑∈ −+⋅+ = α e.g. sum of squared differences Sum over all pixels p in template T
  • 5.
    The University of Ontario Intensity-Based RigidTemplate matching image coordinate system s1 template coordinate system s2 )2()1( sQsQ < Search over all plausible positions s and find the optimal one that has the largest goodness of match value Q(s)
  • 6.
    The University of Ontario Intensity-Based RigidTemplate matching What if intensities of your image are not exactly the same as in the template? (e.g. may happen due to different gain setting at image acquisition)
  • 7.
    The University of Ontario Otherintensity based goodness of match measures Normalized correlation Mutual Information (next slide) )()()()( )()( )( pTpTspIspI pTspI sQ Tp ⋅⋅+⋅+ ⋅+ = ∑∈
  • 8.
    The University of Ontario Othergoodness of match measures : Mutual Information Will work even in extreme cases In this example the spatial structure of template and image object are similar while actual intensities are completely different
  • 9.
    The University of Ontario Othergoodness of match measures : Mutual Information Fix s and consider joint histogram of intensity “pairs”: TpforIT spp ∈+ ),( s1 •Mutual information between template T and image I (for given transformation s) describes “peakedness” of the joint histogram •measures how well spatial structures in T and I align s2 Joint histogram is more concentrated (peaked) for s2 T I T I Joint histogram is spread-out for s1 T I
  • 10.
    The University of Ontario MutualInformation (technical definition) ),()()(),( YXeYeXeYXMI −+= ∑∈ ⋅−= )( )Pr(ln)Pr()( Xrangex xxXe ∑ ⋅−= yx yxyxYXe , ),Pr(ln),Pr(),( entropy and joint entropy e for random variables X and Y measures “peakedness” of histogram/distribution Assuming two random variables X and Y their mutual information is joint histogram (distribution) marginal histogram (distribution)
  • 11.
    The University of Ontario MutualInformation Computing MI for a given position s ∑ ⋅ yx yx yx yx , )Pr()Pr( ),Pr( ln),Pr( •We want to find s that maximizes MI that can be written as T I joint distribution Pr(x,y) (normalized histogram) for a fixed given s NOTE: has to be careful when computing. For example, what if H(x,y)=0 for a given pair (x,y)? ∑= y yxx ),Pr()Pr( ∑= x yxy ),Pr()Pr( marginal distributions Pr(x) and Pr(y)
  • 12.
    The University of OntarioFindingoptimal template position s Need to search over all feasible values of s • Template T could be large – The bigger the template T the more time we spend computing goodness of match measure at each s • Search space (of feasible positions s) could be huge – Besides translation/shift, position s could include scale, rotation angle, and other parameters (e.g. shear) Q: Efficient search over all s?
  • 13.
    The University of OntarioFindingoptimal template position s One possible solution: Hierarchical Approach 1. Subsample both template and image. Note that the search space can be significantly reduced. The template size is also reduced. 2. Once a good solution(s) is found at a corser scale, go to a finer scale. Refine the search in the neighborhood of the courser scale solution.
  • 14.
    The University of OntarioFeatureBased Template Matching Features: edges, corners,… (found via filtering) Distance transforms of binary images Chamfer and Housdorff matching Iterated Closed Points
  • 15.
    The University of Ontario Feature-basedBinary Templates/Models What are they? What are features? • Object edges, corners, junctions, e.t.c. – Features can be detected by the corresponding image filters • Intensity can also be a considered a feature but it may not be very robust (e.g. due to illumination changes) A model (binary template) is a set of feature points in N-dimensional space (also called feature space) • Each feature is defined by a descriptor (vector) { } N n RMMM ⊂= ,...,1 N i RM ∈
  • 16.
    The University of Ontario BinaryFeature Templates (Models) 2D example • Links may represent neighborhood relationships between the features of the model • Model’s features are represented by points – descriptor could be a 2D vector specifying feature position with respect to model’s coordinate system – Feature spaces could be 3D (or higher). E.g., position of an edge in a medical volumes is a 3D vector. But even in 2D images edge features can be described by 3D vectors (add edge’s angular orientation to its 2D location) reference point iM jM iM 2D feature space For simplicity, we will mainly concentrate on 2D feature space examples iM
  • 17.
    The University of OntarioMatchingBinary Template to Image iM L L - model’s positioning iML⊕ - position of feature i At fixed position L we can compute match quality Q(L) using some goodness of match criteria. Example: Q(L) = number of (exact) matches (in red) between model and image features (e.g. edges). Object is detected at all positions which are local maxima of function Q(L) such that where K is some presence threshold Lˆ KLQ >)ˆ(
  • 18.
    The University of OntarioExactfeature matching is not robust iM L Counting exact matches may be sensitive to even minor deviation in shape between the model and the actual object appearance
  • 19.
    The University of OntarioDistanceTransform More robust goodness of match measures use distance transform of image features 1. Detect desirable image features (edges, corners, e.t.c.) using appropriate filters 2. For all image pixels p find distance D(p) to the nearest image feature p 0)( =pD q 0)( >qD )()( qDsD > s
  • 20.
    The University of OntarioDistanceTransform 3 4 2 3 2 3 5 4 4 2 2 3 1 1 2 2 1 1 2 1 1 0 0 1 2 1 0 0 0 1 232101 1 0 1 2 3 3 2 1 0 1 1 1 0 1 2 1 0 1 2 3 4 3 2 1 0 1 2 2 Distance TransformImage features (2D) Distance Transform is a function that for each image pixel p assigns a non-negative number corresponding to distance from p to the nearest feature in the image I )(⋅ID )(pDI
  • 21.
    The University of OntarioDistanceTransform Image features (edges) Distance Transform Distance Transform can be visualized as a gray-scale image ID ID
  • 22.
    The University of Ontario DistanceTransform can be very efficiently computed
  • 23.
    The University of Ontario DistanceTransform can be very efficiently computed
  • 24.
    The University of Ontario Metricproperties of discrete Distance Transforms - 1 1 0 0 1 1 - Forward mask Backward mask Manhattan (L1) metric Set of equidistant points Metric 1.4 1 1 0 1.4 0 1.4 1 1 1.4 Better approximation of Euclidean metric Exact Euclidean Distance transform can be computed fairly efficiently (in linear time) without bigger masks. www.cs.cornell.edu/~dph/matchalgs/ Euclidean (L2) metric
  • 25.
    The University of Ontario Goodnessof Match via Distance Transforms At each model position one can “probe” distance transform values at locations specified by model (template) features 3 4 2 3 2 3 5 4 4 2 2 3 1 1 2 2 1 1 2 1 1 0 0 1 2 1 0 0 0 1 232101 1 0 1 2 3 3 2 1 0 1 1 1 0 1 2 1 0 1 2 3 4 3 2 1 0 1 2 2 Use distance transform values as evidence of proximity to image features.
  • 26.
The University of Ontario
Goodness of Match Measures using Distance Transforms
Chamfer Measure
• sum of the distance-transform values "probed" by template features
Hausdorff Measure
• k-th largest value of the distance transform at the locations "probed" by template features
• (equivalently) the number of template features whose "probed" distance-transform values are less than a fixed (small) threshold
  – count template features "sufficiently" close to image features
Spatially coherent matching
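Given a precomputed distance transform, both measures reduce to probing it at the template's shifted feature locations. A minimal sketch with hypothetical names (`D` is a 2D distance-transform array, `template` a list of feature coordinates):

```python
def chamfer_score(D, template, shift):
    """Chamfer measure: sum of distance-transform values probed by the
    template features placed at offset `shift` (lower is better)."""
    sr, sc = shift
    return sum(D[r + sr][c + sc] for r, c in template)

def hausdorff_score(D, template, shift, k):
    """Partial Hausdorff measure: k-th largest probed value."""
    sr, sc = shift
    probes = sorted((D[r + sr][c + sc] for r, c in template), reverse=True)
    return probes[k - 1]

def fraction_close(D, template, shift, tau):
    """Equivalent thresholded view: fraction of template features probing
    values below tau, i.e. "sufficiently close" to image features."""
    sr, sc = shift
    return sum(D[r + sr][c + sc] < tau for r, c in template) / len(template)
```

Searching over `shift` and keeping the best score is exactly the rigid-template matching loop from the earlier slides.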
The University of Ontario
Hausdorff Matching
Counting matches against a dilated set of image features.
The University of Ontario
Spatial Coherence of Feature Matches
[figure: two template locations L' and L'', each with 50% of features matched — one set of matches spatially coherent, the other spatially incoherent]
Spatial coherence:
• few "discontinuities" between neighboring features
• the neighborhood is defined by links between template/model features
The University of Ontario
Spatially Coherent Matching
Separate template/model features into three subsets:
• Matchable (red) – near image features
• Boundary (blue circle) – matchable but "near" un-matchable
  – links define "near" for model features
• Un-matchable (gray) – far from image features
Count the number of non-boundary matchable features.
The University of Ontario
Spatially Coherent Matching
[figure: locations L' and L'' with percentages of non-boundary matchable features ≈50% and ≈0%]
Percentage of non-boundary matchable features (spatially coherent matches).
The University of Ontario
Comparing different match measures
[figure: binary model (edges) and a 5% clutter image]
• Monte Carlo experiments with known object location and synthetic clutter and occlusion
  – matching edge locations
• Varying percent clutter
  – probability of an edge pixel 2.5–15%
• Varying occlusion
  – a single missing interval, 10–25% of the boundary
• Search over location, scale, orientation
The University of Ontario
Comparing different match measures: ROC curves
Probability of false alarm versus probability of detection
• 10% and 15% occlusion with 5% clutter
• Chamfer is lowest, Hausdorff (f = 0.8) is highest
• Chamfer with a truncated distance is better than with a trimmed one
The University of Ontario
ROC’s for Spatial Coherence Matching
[figure: four ROC panels (detection CD vs. false alarms FA) for clutter 3%/5% × occlusion 20%/40%, each comparing β = 0 with β > 0]
• The parameter β defines the degree of connectivity between model features
• If β = 0 then model features are not connected at all; in this case spatially coherent matching reduces to plain Hausdorff matching
The University of Ontario
Edge Orientation Information
Match edge orientation (in addition to location)
• edge normals or gradient direction
3D model feature space (2D location + orientation)
• extract 3D (edge) features from the image as well
• requires a 3D distance transform of the image features
  – weight orientation versus location
  – the fast forward-backward pass algorithm still applies
Increases detection robustness and speeds up matching
• better able to discriminate the object from clutter
• better able to eliminate cells in branch-and-bound search
The University of Ontario
ROC’s for Oriented Edge Pixels
Vast improvement for moderate clutter
• images with 5% randomly generated contours
• good for 20–25% occlusion rather than 2–5%
[figure: ROC curves for oriented edges versus location-only matching]
The University of Ontario
Efficient search for good matching positions
The distance transform of the observed image features needs to be computed only once (a fast operation).
We need to compute the match quality for all possible template/model locations L (global search)
• use a hierarchical approach to efficiently prune the search space
Alternatively, use gradient descent from a given initial position (e.g. the Iterative Closest Point algorithm, …later)
• easily gets stuck at local minima
• sensitive to initialization
The University of Ontario
Global Search: Hierarchical Search Space Pruning
An entire box of locations might be pruned out if the match quality is sufficiently bad at the center of the box (how? … in a moment).
The University of Ontario
Global Search: Hierarchical Search Space Pruning
If a box is not pruned, subdivide it into smaller boxes and test the centers of these smaller boxes.
The University of Ontario
Global Search: Hierarchical Search Space Pruning
Continue in this fashion until the object is localized.
The University of Ontario
Pruning a Box (preliminary technicality)
[figure: two distance-transform grids with template probes at locations L' and L'']
Location L' is uniformly better than L'' if for all model features i
    D_I(L' ⊕ M_i) ≤ D_I(L'' ⊕ M_i)
A uniformly better location is guaranteed to have better match quality!
The University of Ontario
Pruning a Box (preliminary technicality)
[figure: distance-transform grid with a hypothetical location λ]
Assume that λ is uniformly better than any location L ∈ Box; then the match quality satisfies Q(λ) ≥ Q(L) for any L ∈ Box.
If the presence test fails at λ (Q(λ) < K for a given threshold K), then any location L ∈ Box must also fail the test.
The entire box can be pruned by one test at λ!
The University of Ontario
Building “λ” for a Box of “Radius” n
[figure: distance-transform values probed at the box center and at the hypothetical location λ]
• the value of the distance transform changes by at most 1 between neighboring pixels
• so a value D(p_i) probed at the center of the box can decrease by at most n (the box radius) at other box positions:
    max{D(p_i) − n, 0} ≤ D(L ⊕ M_i)   for any L ∈ Box
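This bound gives a one-test box-pruning rule. The sketch below states it in chamfer-cost form (prune when even the optimistic lower bound on the summed probes exceeds a cost threshold K), mirroring the slide's goodness-form test; the function names and threshold convention are illustrative.

```python
def probe_lower_bound(d_center, n):
    """The distance transform changes by at most 1 per pixel, so a probe
    reading d_center at the box center reads at least max(d_center - n, 0)
    anywhere inside a box of radius n."""
    return max(d_center - n, 0)

def can_prune_box(center_probes, n, K):
    """Chamfer-style presence test for a whole box: sum the per-feature
    lower bounds; if even this optimistic total exceeds the cost
    threshold K, no location in the box can pass, so prune the box."""
    bound = sum(probe_lower_bound(d, n) for d in center_probes)
    return bound > K
```

A single evaluation at the box center thus decides the fate of every location inside the box, which is what makes the hierarchical search fast.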
The University of Ontario
Global Hierarchical Search (Branch and Bound)
Hierarchical search works in the more general case where the "position" L includes translation, scale, and orientation of the model
• N-dimensional search space
Guaranteed or admissible search heuristic
• a bound on how good the answer could be in an unexplored region
  – cannot miss an answer
• in the worst case it won't rule anything out
• in practice it rules out the vast majority of template locations (transformations)
The University of Ontario
Local Search (gradient descent): Iterated Closest Point algorithm
ICP: iterate until convergence
1. Estimate the correspondence between each template feature i and some image feature located at F(i) (Fitzgibbon: use the DT)
2. Move the model to minimize the sum of distances between the corresponding features (like chamfer matching):
    ΔL ~ −∇_L Σ_i ||(L ⊕ M_i) − F(i)||²
Alternatively, find a local move of the model improving the DT-based match quality function Q(L):
    ΔL ~ −∇Q(L)
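The two ICP steps can be sketched for the simplest case: 2D point sets and a translation-only model. This is an illustrative toy, with a brute-force nearest-neighbor search standing in for the distance-transform lookup the slide attributes to Fitzgibbon.

```python
def icp_translation(model, scene, iters=20):
    """Translation-only ICP on 2D point lists.
    Step 1: match each (shifted) model point to its closest scene point.
    Step 2: move the model by the mean residual, which is the
    least-squares translation for the fixed correspondences."""
    tx, ty = 0.0, 0.0
    for _ in range(iters):
        residuals = []
        for mx, my in model:
            px, py = mx + tx, my + ty
            # Brute-force closest point (a DT probe would replace this).
            qx, qy = min(scene, key=lambda q: (q[0] - px) ** 2 + (q[1] - py) ** 2)
            residuals.append((qx - px, qy - py))
        dx = sum(r[0] for r in residuals) / len(residuals)
        dy = sum(r[1] for r in residuals) / len(residuals)
        tx, ty = tx + dx, ty + dy
        if abs(dx) + abs(dy) < 1e-9:   # converged: no further improvement
            break
    return tx, ty
```

Note how the result depends entirely on the correspondences found in step 1, which is why ICP inherits the local-minima and initialization problems listed on the next slide.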
The University of Ontario
Problems with ICP and gradient descent matching
Slow
• can take many iterations
• ICP: each iteration is slow due to the search for correspondences
  – Fitzgibbon: improve this by using the DT
No convergence guarantees
• can get stuck in local minima
  – not much to do about this
  – can be improved by using robust distance measures (e.g. a truncated Euclidean measure)
The University of Ontario
Observations on DT-based matching
The main point of the DT: it allows measuring match quality without explicitly finding correspondences between pairs of model and image features (a hard problem!)
Hierarchical search over the entire transformation space
Important to use a robust distance
• straight Chamfer is very sensitive to outliers
• a truncated DT can be computed very fast
Fast exact or approximate methods for the DT (L2 metric)
For edge features use orientation too
• edge normals or intensity gradients
The University of Ontario
Rigid 2D templates: should we really care?
So far we have studied matching for 2D images and rigid 2D templates/models of objects.
• When do rigid 2D templates work?
  – there are rigid 2D objects (e.g. fingerprints)
  – a 3D object may be imaged from the same view point:
    • controlled image-based databases (e.g. photos of employees, criminals)
    • 2D satellite images always view 3D objects from above
    • X-rays, microscope photography, etc.
The University of Ontario
More general 3D objects
3D image volumes and 3D objects
• distance transforms, DT-based matching criteria, and hierarchical search techniques generalize easily
• mainly medical applications
2D images and 3D objects
• 3D objects may be represented by a collection of 2D templates (e.g. tree-structured templates, next slide)
• 3D objects may be represented by flexible 2D templates (soon)
The University of Ontario
Tree-structured templates
Larger pair-wise differences appear higher in the tree.
The University of Ontario
Tree-structured templates
Rule out multiple templates simultaneously
• speeds up matching
• coarse-to-fine search, where coarse granularity can rule out many templates
• applies to a variety of DT-based matching measures: Chamfer, Hausdorff, robust Chamfer
The University of Ontario
Flexible Templates
A flexible template combines a number of rigid templates connected by flexible springs
• parts connected by springs, with an appearance model for each part
• used for human bodies, faces
• Fischler & Elschlager, 1973
  – considerable recent work (e.g. Felzenszwalb & Huttenlocher, 2003)
The University of Ontario
Flexible Templates
Why?
• to account for significant deviation between the proportions of a generic model (e.g. an average face template) and the multitude of actual object appearances
• non-rigid (3D) objects may consist of multiple rigid parts with (relatively) view-independent 2D appearance
The University of Ontario
Flexible Templates: Formal Definition
Set of parts V = {v_1, …, v_n}
Positioning configuration L = {l_1, …, l_n}
• specifies the locations of the parts
Appearance model m_i(l_i)
• matching quality of part i at location l_i
Edge e_ij = (v_i, v_j) ∈ E for connected parts
• explicit dependency between edge-connected parts
Interaction/connection energy C_ij(l_i, l_j)
• e.g. elastic energy C_ij(l_i, l_j) = ||l_i − l_j||²
The University of Ontario
Flexible Templates: Formal Definition
Find the configuration L (locations of all parts) that minimizes
    E(L) = Σ_i m_i(l_i) + Σ_{(i,j)∈E} C_ij(l_i, l_j)
The difficulty depends on the graph structure
• which parts are connected (E) and how (C)
General case: exponential time
The University of Ontario
Flexible Templates: a simplistic example from the past
Discrete Snakes
    E(L) = Σ_i m_i(l_i) + Σ_{(i,j)∈E} C_ij(l_i, l_j)
• What graph?
• What appearance model?
• What connection/interaction model?
• What optimization algorithm?
The University of Ontario
Flexible Templates: special cases
Pictorial Structures
    E(L) = Σ_i m_i(l_i) + Σ_{(i,j)∈E} C_ij(l_i, l_j)
• What graph?
• What appearance model?
  – intensity-based match measure
  – DT-based match measure (binary templates)
• What connection/interaction model?
  – elastic springs
• What optimization algorithm?
The University of Ontario
Dynamic Programming for Flexible Template Matching
DP can be used to minimize
    E(L) = Σ_i m_i(l_i) + Σ_{(i,j)∈E} C_ij(l_i, l_j)
for tree graphs (no loops!)
[figure: a tree of parts v_1, …, v_11]
The University of Ontario
Dynamic Programming for Flexible Template Matching
DP algorithm on trees
• choose a post-order traversal for any selected "root" part
• compute E_i(l) = m_i(l) for all "leaf" parts
• process a part only after its children are processed:
  – if part i has only one child a:
        E_i(l) = m_i(l) + min_{l_a} {E_a(l_a) + C_{i,a}(l, l_a)}
  – if part i has two (or more) children a, b, …:
        E_i(l) = m_i(l) + min_{l_a} {E_a(l_a) + C_{i,a}(l, l_a)} + min_{l_b} {E_b(l_b) + C_{i,b}(l, l_b)} + …
• select the best-energy position for the "root" and backtrack to the leaves
The University of Ontario
Dynamic Programming for Flexible Template Matching
DP's complexity on trees: O(n·m²) (same as for 1D snakes)
• n parts, m positions per part
• acceptable complexity for local search where m is relatively small (e.g. in snakes)
• e.g. for tracking a flexible model from frame to frame in a video sequence
The University of Ontario
Local Search: Tracking Flexible Templates
[figure: frames of a video sequence with the flexible template tracked from frame to frame]
The University of Ontario
Searching in the whole image (large m)
m = image size, or m = image size × rotations
Then the complexity O(n·m²) is not good.
For some interactions this can be improved to O(n·m), based on the Generalized Distance Transform (from Computational Geometry).
This is an amazing complexity for matching n dependent parts: note that n·m is already the number of operations for finding n independent matches.
The University of Ontario
Generalized Distance Transform
Idea: improve the efficiency of the key computational step (performed for each parent-child pair, n times):
    Ê(y) = min_x {E(x) + C(x, y)}      (m² operations)
Intuitively: if x and y describe all feasible positions of "parts" in the image, then the energy functions E(x) and Ê(y) can be thought of as gray-scale images (e.g. like responses of the original image to some filters).
The University of Ontario
Generalized Distance Transform
Idea: improve the efficiency of the key computational step (m² operations performed for each parent-child pair):
    Ê(y) = min_x {E(x) + C(x, y)}
Let C(x, y) = α·||x − y|| (distance between x and y) — a reasonable interaction model!
Then Ê(y) is called a Generalized Distance Transform of E(x).
The University of Ontario
From Distance Transform to Generalized Distance Transform
Assuming
    E(x) = 0 if x is an image feature, +∞ otherwise
then
    Ê(y) = min_x {E(x) + ||x − y||}
is the standard Distance Transform (of the image features).
The University of Ontario
From Distance Transform to Generalized Distance Transform
For a general E(x) and any fixed α,
    Ê(y) = min_x {E(x) + α·||x − y||}
is called a Generalized Distance Transform of E(x).
• E(x) may represent non-binary image features (e.g. image intensity gradient)
• α weighs the strength of E(x) against proximity
The University of Ontario
Algorithm for computing the Generalized Distance Transform
A straightforward generalization of the forward-backward pass algorithm for standard Distance Transforms:
• initialize to E(x) instead of δ(x) = 0 for image features, +∞ otherwise
• use step cost α instead of 1
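The two modifications can be sketched in 1D for clarity — a minimal illustration assuming positions on a line and C(x, y) = α·|x − y|; the same two raster passes as the standard DT, but initialized to E(x) and adding α per step.

```python
def generalized_dt(E, alpha):
    """Return Ehat with Ehat[y] = min_x (E[x] + alpha * |x - y|)."""
    n = len(E)
    Ehat = list(E)                      # initialize to E(x), not 0 / inf
    for x in range(1, n):               # forward pass
        Ehat[x] = min(Ehat[x], Ehat[x - 1] + alpha)
    for x in range(n - 2, -1, -1):      # backward pass
        Ehat[x] = min(Ehat[x], Ehat[x + 1] + alpha)
    return Ehat
```

With E set to 0 on features and infinity elsewhere and α = 1, this reduces exactly to the standard distance transform, as the previous slides state; either way it is O(m) rather than O(m²).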
The University of Ontario
Flexible Template Matching Complexity
Computing
    Ê(y) = min_x {E(x) + α·||x − y||}
via the Generalized Distance Transform takes O(m) operations instead of the previous O(m²) (m = number of positions x and y).
This improves the complexity of Flexible Template Matching to O(n·m) in the case of interactions C(x, y) = α·||x − y||.
The University of Ontario
“Simple” Flexible Template Example: Central Part Model
Consider the special case in which parts translate with respect to a common origin
• e.g., useful for faces
• parts V = {v_1, …, v_n}
• distinguished central part v_1
• connect each v_i to v_1
• elastic spring costs
NOTE: for simplicity (only) we consider part positions l_i that are translations only (no rotation or scaling of parts)
The University of Ontario
Central Part Model example
[figure: central part v_1 at location l_1 with parts v_2, v_3 at offsets o_2, o_3]
The "ideal" location of part i w.r.t. l_1 is given by T_i(l_1) = l_1 + o_i, where o_i is a fixed translation vector for each i > 1.
Whole template energy:
    m_1(l_1) + Σ_{i>1} { m_i(l_i) + α_i·||l_i − T_i(l_1)|| }
where α_i·||l_i − T_i(l_1)|| is the "spring" cost for deformation from this "ideal" location.
The University of Ontario
Central Part Model
Summary of the search algorithm for the matching cost
    m_1(l_1) + Σ_{i>1} { m_i(l_i) + α_i·||l_i − T_i(l_1)|| }
1. For each non-central part i > 1 compute the matching cost m_i(l_i) for all possible positions l_i of that part in the image
2. For each i > 1 compute the Generalized DT of m_i:
    Ê_i(y) = min_x { m_i(x) + α_i·||x − y|| }
3. For all possible positions l_1 of the central part compute the energy
    Ê_1(l_1) = m_1(l_1) + Σ_{i>1} Ê_i(T_i(l_1))
4. Select the best location l_1, or select all locations with Ê_1(l_1) better than a fixed threshold
Overall complexity: O(n·m)
The University of Ontario
Central Part Model for face detection
The University of Ontario
Search Algorithm for tree-based pictorial structures: O(n·m)
• the algorithm is basically the same as for the Central Part Model
• each "parent" part knows the ideal positions of its "child" parts
• spring deformations are accounted for by the Generalized Distance Transform of the children's positioning energies