Name: Muhammad Irsyadi Firdaus
Student ID: P66067055
Course: Digital Photogrammetry
Questions for Digital Photogrammetry Course
Chapter 4
1. Please explain why and how SIFT/SURF can achieve scale, rotation, and illumination
invariance during image matching.
SIFT (Scale-Invariant Feature Transform), proposed by Lowe, handles image
rotation, affine transformation, intensity change, and viewpoint change when matching
features. The SIFT algorithm has four basic steps. The first is scale-space
extrema detection, which uses the Difference of Gaussians (DoG) to identify potential interest
points that are invariant to scale and orientation. The DoG is used instead of the
scale-normalized Laplacian of Gaussian to improve computation speed.
D(x, y, σ) = (G(x, y, kσ) − G(x, y, σ)) ∗ I(x, y) = L(x, y, kσ) − L(x, y, σ)
Given a digital image I(x, y), its scale-space representation is L(x, y, σ) = G(x, y, σ) ∗ I(x, y),
where G(x, y, σ) is the variable-scale Gaussian kernel with standard deviation σ. Each pixel
is compared with its 26 neighbors in a 3×3×3 region, spanning the current and the two
adjacent scales, to detect the local maxima and minima of D(x, y, σ).
Second is keypoint localization, where the keypoint candidates are localized
and refined by eliminating low-contrast points. A Hessian matrix is used to
compute the principal curvatures, and keypoints whose ratio of principal
curvatures exceeds a threshold are eliminated as edge responses. Third, a keypoint
orientation is assigned based on the local image gradient, and lastly a descriptor
generator computes a local image descriptor for each keypoint from the image
gradient magnitudes and orientations at each image sample point in a region centered
on the keypoint.
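The scale-space extrema step above can be sketched in pure NumPy. The separable Gaussian blur, the toy blob image, σ = 1.6, and k = √2 are illustrative choices for the sketch, not values fixed by this text:

```python
import numpy as np

def gaussian_blur(img, sigma):
    """Separable Gaussian blur with a truncated kernel (radius = 3*sigma)."""
    r = max(1, int(3 * sigma))
    x = np.arange(-r, r + 1)
    g = np.exp(-x**2 / (2.0 * sigma**2))
    g /= g.sum()
    # blur along rows, then along columns
    tmp = np.apply_along_axis(lambda v: np.convolve(v, g, mode="same"), 1, img)
    return np.apply_along_axis(lambda v: np.convolve(v, g, mode="same"), 0, tmp)

def difference_of_gaussians(img, sigma, k=2**0.5):
    """D(x, y, sigma) = L(x, y, k*sigma) - L(x, y, sigma)."""
    img = img.astype(float)
    return gaussian_blur(img, k * sigma) - gaussian_blur(img, sigma)

# toy image: a single bright blob
img = np.zeros((32, 32))
img[14:18, 14:18] = 1.0
dog = difference_of_gaussians(img, sigma=1.6)
# the blob centre gives a strong (negative) DoG response; flat areas give none
```

In a full SIFT implementation this DoG image is stacked over several scales, and each pixel is then tested against its 26 neighbors in the 3×3×3 scale-space block described above.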
The SURF (Speeded-Up Robust Features) algorithm is based on multi-scale space
theory, and its feature detector is based on the Hessian matrix, which offers good
performance and accuracy. For a given point x = (x, y) in an image I, the Hessian
matrix H(x, σ) at x and scale σ is defined as:
H(x, σ) = | Lxx(x, σ)  Lxy(x, σ) |
          | Lxy(x, σ)  Lyy(x, σ) |
where Lxx(x, σ) is the convolution of the second-order Gaussian derivative
∂²g(σ)/∂x² with the image I at the point x, and similarly for Lxy(x, σ) and Lyy(x, σ).
SURF approximates these Gaussian derivative filters with box filters. Instead of
Gaussian smoothing, box-shaped averages are used, since convolution with a box
filter is very fast when an integral image is used; moreover, this can be done in
parallel for different scales. The SURF approach can be divided into three main
steps. First, keypoints are selected at distinctive locations in the image, such as
corners, blobs, and T-junctions. Next, the neighborhood of every keypoint is
represented by a feature vector. This descriptor has to be distinctive and, at the
same time, robust to noise, detection errors, and geometric and photometric
deformations. Finally, the descriptor vectors are matched among the different
images. Keypoints are found using the so-called Fast-Hessian detector, which is
based on the approximation of the Hessian matrix at a given image point.
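The speed-up from box filters rests on the integral image: once it is built, the sum over any axis-aligned rectangle costs four array look-ups, regardless of the rectangle's size. A minimal sketch (the array and the rectangle are illustrative):

```python
import numpy as np

def integral_image(img):
    """ii[y, x] = sum of img[0:y, 0:x]; padded with a zero row and column."""
    ii = np.zeros((img.shape[0] + 1, img.shape[1] + 1))
    ii[1:, 1:] = img.cumsum(axis=0).cumsum(axis=1)
    return ii

def box_sum(ii, y0, x0, y1, x1):
    """Sum of img[y0:y1, x0:x1] in O(1) via four look-ups."""
    return ii[y1, x1] - ii[y0, x1] - ii[y1, x0] + ii[y0, x0]

img = np.arange(16.0).reshape(4, 4)
ii = integral_image(img)
# box over rows 1..2, cols 1..2 equals img[1:3, 1:3].sum()
```

Because `box_sum` is constant-time, evaluating the box-filter approximations of Lxx, Lyy, and Lxy costs the same at every scale, which is what allows SURF to process the scales in parallel without rescaling the image.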
Figure 1. The matching of varying intensity images using (a) SIFT (b) SURF
Figure 2. The matching of the original image with its rotated image using: (a) SIFT (b) SURF
Figure 3. The matching of the original image with its scaled image using: (a) SIFT (b) SURF
2. Please describe the concept and procedure of Semi-Global Matching.
Semi-Global Matching (SGM) is a robust stereo method that has proven its
usefulness in various applications ranging from aerial image matching to driver
assistance systems. It supports pixelwise matching for maintaining sharp object
boundaries and fine structures and can be implemented efficiently on different
computation hardware. Furthermore, the method is not sensitive to the choice of
parameters.
The Semi-Global Matching method performs a pixel-wise matching that efficiently
preserves object boundaries and fine details. The algorithm works on a pair
of images with known interior and exterior orientation and epipolar geometry (i.e.
it assumes that corresponding points lie on the same horizontal image line). It
minimizes a global smoothness-constrained cost by combining matching costs along
independent one-dimensional paths through the image.
The first scanline-based approaches, which exploited a single global matching cost
for each individual image line, were prone to streaking effects, since the optimal
solution of each scanline is not connected with the neighboring ones. The SGM
algorithm overcomes this problem with the idea of symmetrically computing
the pixel matching cost along several paths through the image. The costs
extracted along each path are summed for each pixel and disparity value.
Finally, the algorithm chooses the matching solution with the minimum
cost, usually using a dynamic programming approach. The cost L'_r(p, d) of the pixel
p at disparity d, along the path direction r, is defined as:

L'_r(p, d) = C(p, d) + min( L_r(p − r, d),
                            L_r(p − r, d − 1) + P1,
                            L_r(p − r, d + 1) + P1,
                            min_i L_r(p − r, i) + P2 ) − min_k L_r(p − r, k)
where the first term is the similarity cost (i.e. a value that penalizes, using an
appropriate metric, solutions where different radiometric values are encountered in
the neighborhoods of the corresponding points), whereas the remaining terms evaluate
the regularity of the disparity field, adding a small penalty P1 for disparity changes
of one level and a larger penalty P2 for all larger disparity changes with respect to
the previous point along the considered matching path. The two penalty values allow
the method to describe slanted or curved surfaces and to preserve disparity
discontinuities, respectively. Since the cost would otherwise grow steadily during
aggregation along the path, the last term subtracts the minimum path cost of the
previous pixel; this keeps the values bounded without changing the position of the minimum.
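The per-path recursion can be sketched for a single left-to-right path in pure NumPy. The cost volume C, P1, and P2 below are illustrative; a real implementation repeats this along 8 or 16 directions and sums the results:

```python
import numpy as np

def path_cost_lr(C, P1=1.0, P2=4.0):
    """L'_r along one left-to-right path for a W x D matching-cost slice C."""
    W, D = C.shape
    L = np.zeros_like(C, dtype=float)
    L[0] = C[0]
    for x in range(1, W):
        prev = L[x - 1]
        m = prev.min()
        # candidate predecessors: same d, d-1 and d+1 (penalty P1), any d (penalty P2)
        same = prev
        minus = np.roll(prev, 1) + P1    # from disparity d-1
        minus[0] = np.inf                # no d-1 below the first disparity
        plus = np.roll(prev, -1) + P1    # from disparity d+1
        plus[-1] = np.inf                # no d+1 above the last disparity
        jump = np.full(D, m + P2)        # from any disparity, large change
        L[x] = C[x] + np.minimum.reduce([same, minus, plus, jump]) - m
    return L

# tiny example: 5 pixels, 4 disparity levels, true disparity = 1 everywhere
C = np.full((5, 4), 5.0)
C[:, 1] = 0.0
L = path_cost_lr(C)
# the minimum of the aggregated cost stays at disparity 1 for every pixel
```

The final `- m` line is exactly the subtraction of the previous pixel's minimum path cost described above: it bounds the accumulated values without shifting where the minimum lies.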
The minimization is performed efficiently with dynamic programming (Van
Meerbergen, et al., 2002) but, in order to avoid streaking effects, the SGM strategy
introduced the idea of computing the optimization by combining several
individual paths, symmetrically from all directions through the image. Summing the
path costs over all directions and searching for the disparity with the minimal cost
at each image pixel produces the final disparity map. The aggregated cost is defined as

S(p, d) = Σ_r L_r(p, d)
and, for a sub-pixel estimate of the final disparity, the position of the
minimum is computed by fitting a quadratic curve through the cost values of the
neighboring disparities. A similar approach, in which surface reconstruction is solved
as an energy minimization problem, was evaluated in (Pierrot-Deseilligny
& Paparoditis, 2006), who implemented a Semi-Global Matching-like method
based on an energy function E(d) described as:

E(d) = Σ A(x, y, d(x, y)) + α · F(∇d)
where
o d is the disparity function;
o A(x, y, d(x, y)) represents the similarity term;
o F(∇d) is the positive function expressing the prior constraints which
characterize the surface regularity;
o α represents the weight that adapts the regularization to the image content (i.e.
the weight of the disparity regularization enforcement).
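The quadratic sub-pixel refinement mentioned above reduces, for equally spaced disparity levels, to a closed-form vertex formula over the three costs around the integer minimum. A small sketch (the sample costs and disparity are illustrative):

```python
def subpixel_disparity(c_minus, c_0, c_plus, d):
    """Vertex of the parabola through (d-1, c_minus), (d, c_0), (d+1, c_plus)."""
    denom = c_minus - 2.0 * c_0 + c_plus
    if denom == 0:          # flat cost profile: no refinement possible
        return float(d)
    return d + 0.5 * (c_minus - c_plus) / denom

# costs 4.0, 1.0, 3.0 around the integer disparity d = 7
d_sub = subpixel_disparity(4.0, 1.0, 3.0, 7)
```

Because the left cost (4.0) is higher than the right one (3.0), the fitted parabola's vertex sits slightly to the right of the integer minimum, giving a disparity a little above 7.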
3. Please compare the differences between Feature-based Matching and Dense Image
Matching, including their characteristics and suitability to certain kinds of
applications.
The real innovation introduced in several dense image matching
methods in recent years is the integration of different basic correlation
algorithms, consistency measures, and constraints into a multi-step procedure, which
in many cases works through a multi-resolution approach. Indeed, local correlation
algorithms assume constant disparity within a correlation window. The larger this
window, the more robust the matching; but the implicit assumption of constant
disparity inside the window is violated at geometric discontinuities, which leads to
blurred object boundaries and smoothed results. Furthermore, the matching phase,
being commonly based on intensity differences, is very sensitive to recording and
illumination differences and is not reliable in poorly textured or homogeneous
regions.
A dense matching algorithm should be able to extract 3D points at a resolution
sufficient to describe the object's surface and its discontinuities. Two critical issues
should be considered for an optimal approach: (i) the point resolution must be
adaptively tuned to preserve edges and to avoid too many points in flat areas; (ii) the
reconstruction must also be guaranteed in regions with poor texture or with
illumination and scale changes.
A rough surface model of the object is often required by some techniques in
order to initialize the matching procedure. Such models can be derived in different
ways, e.g. by using a point cloud interpolated on the basis of tie points obtained
from the orientation stage, from already existing 3D models, or from low-resolution
range data. Other methods are organized in a hierarchical framework which
generates first a rough surface reconstruction, which is refined and made denser at a
later stage.
Many algorithms are based on normalized, distortion-free images, whose
adoption simplifies and speeds up the search for correspondences. Possible outliers
are generally removed following one of two opposite strategies: (i) using multi-image
techniques to discard possible blunders by intersecting the homologous rays of the
matched points in object space; or (ii) computing a surface model as dense as
possible without any special care for outliers and then applying different
filtering/smoothing methods.
The most intuitive classification of image matching algorithms is based on the
primitives used: image intensity patterns (windows of grey values around a point of
interest) or features (e.g. edges and regions), which are then transformed into 3D
information through a mathematical model (e.g. the collinearity model or the camera
projection matrix). According to these primitives, the resulting matching algorithms
are generally classified as area-based matching (ABM) or feature-based matching
(FBM). FBM is often used as an alternative to, or in combination with, ABM. FBM
techniques are more flexible with respect to surface discontinuities, less sensitive
to image noise, and require fewer approximate values. Because of the sparse and
irregularly distributed nature of the extracted features, the matching results are in
general sparse point clouds, which are then often used as seeds to grow additional
matches.
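The sparse FBM stage can be sketched as nearest-neighbour descriptor matching with Lowe's ratio test. The toy 2-D "descriptors" below are illustrative; real pipelines match 128-D SIFT or 64-D SURF descriptors, usually with a k-d tree rather than brute force:

```python
import numpy as np

def match_descriptors(desc1, desc2, ratio=0.8):
    """Return (i, j) index pairs where desc1[i]'s nearest neighbour in desc2
    is clearly better than its second-nearest (Lowe's ratio test)."""
    matches = []
    for i, d in enumerate(desc1):
        dists = np.linalg.norm(desc2 - d, axis=1)
        j, j2 = np.argsort(dists)[:2]
        if dists[j] < ratio * dists[j2]:
            matches.append((i, j))
    return matches

# toy descriptors: the first two have clear partners, the third is ambiguous
desc1 = np.array([[0.0, 0.0], [10.0, 0.0], [5.0, 5.0]])
desc2 = np.array([[0.1, 0.0], [10.1, 0.0], [5.0, 4.9], [5.0, 5.1]])
m = match_descriptors(desc1, desc2)
```

The ratio test discards the third descriptor because its two nearest candidates are almost equally close; such rejection of ambiguous matches is one reason FBM output is sparse but reliable enough to seed further dense matching.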
The following properties are important for utilizing a feature detector in
computer vision applications:
• Robustness: the feature detection algorithm should be able to detect the
same feature locations independent of scaling, rotation, shifting, photometric
deformations, compression artifacts, and noise.
• Repeatability: the feature detection algorithm should be able to detect the
same features of the same scene or object repeatedly under a variety of
viewing conditions.
• Accuracy: the feature detection algorithm should accurately localize the
image features (same pixel locations), especially for image matching tasks,
where precise correspondences are needed to estimate the epipolar
geometry.
• Generality: the feature detection algorithm should be able to detect features
that can be used in different applications.
• Efficiency: the feature detection algorithm should be able to detect features
in new images quickly enough to support real-time applications.
• Quantity: the feature detection algorithm should be able to detect all or most
of the features in the image, and the density of detected features should
reflect the information content of the image, providing a compact image
representation.