Template matching is a computer vision technique for finding the sub-image
of a target image that matches a template image. It is widely used in
object detection applications such as vehicle tracking, robotics,
medical imaging, and manufacturing.
The crucial point is to adopt an appropriate “measure” to quantify similarity
or matching. The method is computationally expensive, however, because the
matching process moves the template image to every possible position in a
larger target image and computes, at each position, a numerical index
that indicates how well the template matches the image there. The
problem is thus treated as an optimization problem.
A reasonable first step in approaching such a task is to define a measure or
a cost that quantifies the “distance” or the “similarity” between the (known)
reference patterns and the (unknown) test pattern, in order to perform the
matching operation known as template matching.
Pattern Recognition – TM Page | 2
2. Template Matching Types
Template matching has been performed at the pixel level and also at higher
levels.
A. Pixel Level Template Matching:
Pixel templates come in four types:
a) Total templates: Template is the same size as the input image.
There is no rotation or translation invariance.
b) Partial templates: Template is free from the background. Multiple
matches are allowed. Partial matches may also be allowed. Care
must be taken in this case -- an F template could easily match to
c) Piece templates: Templates that match one feature of a figure.
These templates break a pattern into its component segments so,
for example, "A" can be broken down into "/", "\" and "-". The
order in which templates are compared to the scene is important:
the largest templates must be tried first, since they contain the
most information and may subsume smaller templates.
d) Flexible templates: These templates can handle stretching,
misorientation and other possible deviations. A good prototype of
a known object is first obtained and represented parametrically.
B. High Level Template Matching
Pixel-based matching is fairly cheap and simple to implement, but it copes
poorly with rotation and translation. In addition, images are rarely
perfect: they suffer from blurring, stretching and other distortions, and
are peppered with noise.
High level template matching methods operate on an image that has
typically been segmented into regions of interest. Regions can be described
in terms of area, average intensity, rate of change of intensity, curvature and
also compared -- bigger than, adjacent to, above, distance between.
Templates are described in relationships between regions. Production rules
and other linguistic representations have been used. Also statistical methods
(relaxation based techniques) have been applied to perform the matching.
a) Feature-based Matching: When the template image has strong
features, a feature-based approach may be considered; the approach
may prove further useful if the match in the search image might be
transformed in some fashion. Since this approach does not consider
the entirety of the template image, it can be more computationally
efficient when working with source images of larger resolution.
b) Template-based Matching: For templates without strong features, or
for when the bulk of the template image constitutes the matching
image, a template-based approach may be effective. Because template-based
matching may require sampling a large number of points, the number of
sampling points can be reduced by lowering the resolution of the search
and template images by the same factor and performing the operation on the
resulting downsized images (multi-resolution, or pyramid, image processing).
An image pyramid is a series of images, each the result of downsampling
(scaling down, by a factor of two in this case) the previous image.
At each level of the pyramid, we will need appropriately downsampled
picture of the reference template, i.e. both input image pyramid and
template image pyramid (Pyramid Processing) should be computed.
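The pyramid construction described above can be sketched in plain Python. This is a minimal illustration, not a reference implementation: the function names `downsample_by_two` and `build_pyramid` are hypothetical, images are assumed to be lists of rows of gray levels, and averaging each 2x2 block is only one of several possible downsampling filters.

```python
def downsample_by_two(image):
    """Halve width and height by averaging each 2x2 block of pixels."""
    rows, cols = len(image) // 2, len(image[0]) // 2
    return [
        [
            (image[2 * r][2 * c] + image[2 * r][2 * c + 1]
             + image[2 * r + 1][2 * c] + image[2 * r + 1][2 * c + 1]) / 4.0
            for c in range(cols)
        ]
        for r in range(rows)
    ]


def build_pyramid(image, levels):
    """Return [level 0 (original), level 1, ...], each half the previous size."""
    pyramid = [image]
    for _ in range(levels - 1):
        current = pyramid[-1]
        if len(current) < 2 or len(current[0]) < 2:
            break  # too small to downsample further
        pyramid.append(downsample_by_two(current))
    return pyramid


# Toy 8x8 image and a 4x4 patch of it used as the template (assumed data).
image = [[float(r * 8 + c) for c in range(8)] for r in range(8)]
template = [row[2:6] for row in image[2:6]]

# Both pyramids are needed: one for the input image, one for the template.
image_pyramid = build_pyramid(image, 3)        # 8x8, 4x4, 2x2
template_pyramid = build_pyramid(template, 3)  # 4x4, 2x2, 1x1
```

A search would start at the coarsest level, where few positions exist, and refine only the promising candidates at the finer levels.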
Although in some of the applications the orientation of the objects is
uniform and fixed (as we have seen in the plug example), it is often the
case that the objects that are to be detected appear rotated. In Template
Matching algorithms the classic pyramid search is adapted to allow multi-angle
matching, i.e. identification of rotated instances of the template.
This is achieved by computing not just one template image pyramid, but a
set of pyramids - one for each possible rotation of the template. During the
pyramid search on the input image the algorithm identifies the
pairs (template position, template orientation) rather than sole template
positions. As in the original scheme, at each level of the search the
algorithm verifies only those (position, orientation) pairs that scored well at
the previous level (i.e. seemed to match the template in the lower-resolution
image).
The technique of pyramid matching together with multi-angle search
constitutes the Grayscale-based Template Matching method.
Edge-based Matching enhances the previously discussed Grayscale-based
Matching using one crucial observation: the shape of any object is
defined mainly by the shape of its edges. Therefore, instead of matching
the whole template, we can extract its edges and match only the
nearby pixels, thus avoiding some unnecessary computations. In common
applications the achieved speed-up is usually significant.
3. Template Matching Measures
A measure of match between two images is a metric that indicates the
degree of similarity or dissimilarity between them. Unless specifically
stated otherwise, this metric may either increase or decrease with the
degree of similarity. Where the metric is specifically stated to be a measure
of mismatch, it is a quantity that increases with the degree of dissimilarity.
3.1 Measures of Match (similarity)
1) MEASURES BASED ON OPTIMAL PATH SEARCHING TECHNIQUES
Representation: Represent the template by a sequence of measurements
r(1), r(2), ..., r(I)
and the test pattern by the sequence
t(1), t(2), ..., t(J)
Form a grid with I points (template) in the horizontal direction and J points
(test) in the vertical direction.
Each point (i, j) of the grid measures the distance between r(i) and t(j).
Path: A path through the grid, from an initial node (i_0, j_0) to a final
node (i_f, j_f), is an ordered set of nodes
(i_0, j_0), (i_1, j_1), (i_2, j_2), …, (i_k, j_k), …, (i_f, j_f)
Each path is associated with a cost
$$D = \sum_{k=0}^{K-1} d(i_k, j_k)$$
where K is the number of nodes along the path.
The optimal path (blue) is constructed by searching among all allowable
paths. The optimal node correspondence, between the test and reference
patterns, is unraveled by backtracking the optimal path.
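The search for the optimal path and the backtracking step can be sketched with dynamic programming. Several things here are assumptions made only for illustration: the measurements are scalars, the node cost d(i, j) is the absolute difference |r(i) - t(j)|, the path runs from the first grid node to the last, and each node may be reached from its left, lower, or diagonal neighbour. The name `optimal_path` is hypothetical.

```python
def optimal_path(reference, test):
    """Return (minimum path cost D, optimal path) through the I x J grid."""
    I, J = len(reference), len(test)
    # Node cost d(i, j): distance between r(i) and t(j) (assumed: absolute difference).
    d = [[abs(reference[i] - test[j]) for j in range(J)] for i in range(I)]

    INF = float("inf")
    cost = [[INF] * J for _ in range(I)]   # cheapest cost of reaching each node
    prev = [[None] * J for _ in range(I)]  # predecessor on that cheapest path
    cost[0][0] = d[0][0]

    for i in range(I):
        for j in range(J):
            if i == 0 and j == 0:
                continue
            # Assumed allowable predecessors: left, below, and diagonal.
            candidates = [(i - 1, j), (i, j - 1), (i - 1, j - 1)]
            valid = [(a, b) for a, b in candidates if a >= 0 and b >= 0]
            best = min(valid, key=lambda node: cost[node[0]][node[1]])
            cost[i][j] = cost[best[0]][best[1]] + d[i][j]
            prev[i][j] = best

    # Backtrack from the final node to recover the node correspondence.
    path, node = [], (I - 1, J - 1)
    while node is not None:
        path.append(node)
        node = prev[node[0]][node[1]]
    path.reverse()
    return cost[I - 1][J - 1], path


# The test pattern repeats one measurement, so one reference node is matched twice.
total, path = optimal_path([1, 2, 3, 4], [1, 2, 2, 3, 4])
```

Each pair (i, j) on the returned path says that r(i) was matched to t(j); a different set of allowable transitions would change which paths are searched.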
2) Euclidean Distance
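Let I be a gray level image and g be a gray-value template of size n × m.
The Euclidean distance between g and the image window at position (r, c) is
$$d(I, g, r, c) = \sqrt{\sum_{i=0}^{n-1} \sum_{j=0}^{m-1} \bigl(I(r+i, c+j) - g(i, j)\bigr)^2}$$
In this formula (r, c) denotes the top left corner of template g.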
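As a small sketch, the distance at one position can be computed by summing the squared gray-level differences over the window covered by the template and taking the square root. The function name `euclidean_distance` is hypothetical, and the code assumes that images are lists of rows, that n counts rows and m counts columns, and that indices start at zero.

```python
import math


def euclidean_distance(image, template, r, c):
    """Distance between the template and the image window with top-left corner (r, c)."""
    n, m = len(template), len(template[0])
    total = 0.0
    for i in range(n):
        for j in range(m):
            diff = image[r + i][c + j] - template[i][j]
            total += diff * diff
    return math.sqrt(total)


# Toy data (assumed): the template is an exact copy of the window at (1, 1).
image = [
    [0, 0, 0, 0],
    [0, 5, 6, 0],
    [0, 7, 8, 0],
    [0, 0, 0, 0],
]
template = [
    [5, 6],
    [7, 8],
]

exact = euclidean_distance(image, template, 1, 1)   # perfect match
offset = euclidean_distance(image, template, 0, 0)  # shifted window
```

A distance of zero indicates a perfect match, and the distance grows as the window differs more from the template.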
3) The Edit Distance
Deals with patterns that consist of sets of ordered symbols. For example, if
these symbols are letters, then the patterns are words from a written text.
Such problems arise in automatic editing and text retrieval applications.
Other examples of symbol strings occur in structural pattern recognition.
Once the symbols of a (test) pattern have been identified, for example, via a
reading device, the task is to recognize the pattern, searching for the best
match of it against a set of reference patterns.
Errors of the following kinds may occur in the test pattern:
■ Wrongly identified symbol (e.g., “befuty” instead of “beauty”)
■ Insertion error (e.g., “bearuty”)
■ Deletion error (e.g., “beuty”)
The similarity between two patterns is based on the “cost” associated with
converting one pattern to the other. If the patterns are of the same length,
then the cost is directly related to the number of symbols that have to be
changed in one of them so that the other pattern results.
The Edit distance between two string patterns A and B, denoted D(A, B), is
defined as the minimum total number of changes C, insertions I, and
deletions R required to change pattern A into pattern B,
$$D(A, B) = \min_{j} \bigl[ C(j) + I(j) + R(j) \bigr]$$
where j runs over all possible variations of symbols, in order to convert A
into B.
Computation of the Edit distance with (a) an insertion, (b) a change, (c) a
deletion, and (d) an equality.
Allowable predecessors and costs
1. Diagonal transitions:
$$d(i, j \mid i-1, j-1) = \begin{cases} 0, & \text{if } t(i) = r(j) \\ 1, & \text{if } t(i) \neq r(j) \end{cases}$$
2. Horizontal and vertical transitions:
$$d(i, j \mid i-1, j) = 1$$
$$d(i, j \mid i, j-1) = 1$$
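The Edit distance can be computed with a dynamic-programming grid in which a diagonal step costs 0 for equal symbols and 1 for a change, and a horizontal or vertical step (an insertion or a deletion) costs 1. The sketch below is one common way to write it; the function name `edit_distance` is hypothetical.

```python
def edit_distance(a, b):
    """Minimum number of changes, insertions and deletions turning a into b."""
    rows, cols = len(a) + 1, len(b) + 1
    # D[i][j] = Edit distance between the first i symbols of a and the first j of b.
    D = [[0] * cols for _ in range(rows)]
    for i in range(rows):
        D[i][0] = i  # delete all i symbols
    for j in range(cols):
        D[0][j] = j  # insert all j symbols

    for i in range(1, rows):
        for j in range(1, cols):
            change = 0 if a[i - 1] == b[j - 1] else 1  # diagonal transition cost
            D[i][j] = min(
                D[i - 1][j - 1] + change,  # equality or change
                D[i - 1][j] + 1,           # deletion
                D[i][j - 1] + 1,           # insertion
            )
    return D[-1][-1]


# The three error types, each one edit away from the reference word.
wrong_symbol = edit_distance("beauty", "befuty")
insertion = edit_distance("beauty", "bearuty")
deletion = edit_distance("beauty", "beuty")
```

To recognize a test pattern, the distance would be computed against every reference pattern and the reference with the smallest distance selected.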
4) MEASURES BASED ON CORRELATIONS
The major task here is to find whether a specific known reference pattern
resides within a given block of data. Such problems arise in applications
such as target detection, robot vision, and video coding. There are two
basic steps in such a procedure:
Step 1: Move the reference pattern to all possible positions within the
block of data. For each position, compute the “similarity” between the
reference pattern and the respective part of the block of data.
Step 2: Compute the best matching value.
$$\mathrm{cor} = \frac{\sum_{i=0}^{N-1} (x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\sum_{i=0}^{N-1} (x_i - \bar{x})^2 \, \sum_{i=0}^{N-1} (y_i - \bar{y})^2}}$$
where
x is the template gray level image,
$\bar{x}$ is the average gray level in the template image,
y is the source image section,
$\bar{y}$ is the average gray level in the source image section,
N is the number of pixels in the section image
(N = template image size = columns * rows).
The value cor is between –1 and +1,
with larger values representing a stronger relationship between the two images.
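The correlation value for one position can be sketched as follows, with the template and the source image section flattened into two equal-length lists of gray levels. The function name `correlation` is hypothetical, and returning 0 for a completely flat patch (where the denominator vanishes) is a convention chosen here, not something the measure itself defines.

```python
import math


def correlation(x, y):
    """Normalized correlation between template x and image section y (flat lists)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    numerator = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    denominator = math.sqrt(
        sum((a - mean_x) ** 2 for a in x) * sum((b - mean_y) ** 2 for b in y)
    )
    if denominator == 0:
        return 0.0  # flat patch: correlation undefined, 0 by convention here
    return numerator / denominator


template = [10, 20, 30, 40]
brighter = [2 * v + 5 for v in template]  # same pattern, different gain and offset
inverted = [50 - v for v in template]     # contrast-reversed pattern

cor_same = correlation(template, brighter)
cor_inverted = correlation(template, inverted)
```

Because the means are subtracted and the result is normalized, a section that differs from the template only in brightness or contrast still correlates strongly, while a contrast-reversed section gives a strongly negative value.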
3.2 Measures of Mismatch (dissimilarity)
These measures of mismatch are based on the pixel-by-pixel intensity
differences between the two images f and g.
1) Root mean square distance (RMS): The RMS distance metric is a
common measure of mismatch between two digital images. For two images
with N pixels each, it is given by
$$\mathrm{RMS}(f, g) = \sqrt{\frac{1}{N} \sum_{x, y} \bigl(f(x, y) - g(x, y)\bigr)^2}$$
2) Sum of absolute differences (SAD): compares the intensities of the
pixels to handle translation problems on images, using template
matching.
A pixel in the search image with coordinates (xs, ys) has intensity Is(xs, ys) and
a pixel in the template with coordinates (xt, yt) has intensity It(xt, yt). Thus
the absolute difference in the pixel intensities is defined as
Diff(xs, ys, xt, yt) = | Is(xs, ys) – It(xt, yt) |.
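Summing these absolute differences over every template pixel gives the SAD score for one placement of the template, and the placement with the lowest score is taken as the best match. The sketch below assumes images stored as lists of rows, so a pixel at coordinates (x, y) is read as `image[y][x]`; the names `sad_at` and `best_match` are hypothetical.

```python
def sad_at(search, template, x, y):
    """SAD between the template and the search-image window with top-left corner (x, y)."""
    return sum(
        abs(search[y + yt][x + xt] - template[yt][xt])
        for yt in range(len(template))
        for xt in range(len(template[0]))
    )


def best_match(search, template):
    """Try every position where the template fits; return ((x, y), lowest SAD)."""
    t_rows, t_cols = len(template), len(template[0])
    s_rows, s_cols = len(search), len(search[0])
    best_pos, best_sad = None, float("inf")
    for y in range(s_rows - t_rows + 1):
        for x in range(s_cols - t_cols + 1):
            score = sad_at(search, template, x, y)
            if score < best_sad:
                best_pos, best_sad = (x, y), score
    return best_pos, best_sad


# Toy data (assumed): the template appears exactly once, at x = 2, y = 1.
search = [
    [1, 1, 1, 1, 1],
    [1, 1, 9, 8, 1],
    [1, 1, 7, 6, 1],
    [1, 1, 1, 1, 1],
]
template = [
    [9, 8],
    [7, 6],
]

position, score = best_match(search, template)
```

The exhaustive loop makes the cost of the method visible: every admissible position is visited, and at each one every template pixel is compared.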
4. Problems with template matching
1) The template represents the object as we expect to find it in the image
2) The object can indeed be scaled or rotated
3) This technique requires a separate template for each scale and
orientation
4) Template matching thus becomes too expensive, especially for large
5) Sensitive to:
5. Template Matching Applications:
1) Template matching with various average face pyramid levels.
2) 3D reconstruction.
3) Motion detection.
4) Object recognition.
5) Panorama reconstruction.