Panoramic Imaging using SURF and SIFT of
OpenCV
Eric Jansen – 2210105030
Computer Engineering and Telematics
Department of Electrical Engineering
Institut Teknologi Sepuluh Nopember
November 5th, 2011
Abstract
Techniques for producing good panoramic (mosaic) images have existed
for several years and remain an active topic of development today. In
this experiment, several images are stitched together by finding their
matching keypoints. Matching features across images is difficult when
the scale changes between them. To solve this problem, the concept of
scale-invariant features has been introduced in computer vision. The
main idea is to associate a scale factor with each detected feature
point. In recent years, several scale-invariant features have been
proposed, and this paper presents two of them, the SURF and SIFT
features. SURF stands for Speeded Up Robust Features and, as we will
see, these features are not only scale-invariant but can also be
computed very efficiently. SURF was developed as an efficient variant
of another well-known scale-invariant feature detector, SIFT (Scale
Invariant Feature Transform). SIFT also detects features as local
maxima in image and scale space, but uses the Laplacian filter response
instead of the Hessian determinant. This Laplacian is computed at
different scales using differences of Gaussian filters. OpenCV has a
wrapper class that detects SIFT features, and it is called in the same
way as the one for SURF features.
1 Preface
Matching points between two images are determined from the intensity
of the images. Before the intensities are compared, the images are
converted to grayscale, which combines the three RGB channels into a
single channel ranging from black to white (0 to 255). After the
conversion, and with the Hessian method of SURF, the images are
normalized for better results. OpenCV provides all of these operations,
including the matching of keypoints between the images that are going
to be warped. With RANSAC (Random Sample Consensus), the matched
keypoints are then filtered so that only the matches consistent with a
single geometric relation between the images are kept.
2 Methodology
The algorithms for both SURF and SIFT are described as follows:
2.1 Detecting the scale-invariant
SURF and SIFT features
The OpenCV implementations of the SURF and SIFT features are the
cv::SurfFeatureDetector and cv::SiftFeatureDetector classes. Both
implement the cv::FeatureDetector interface, so they are used in the
same way.
std::vector<cv::KeyPoint> keypoints;
// Construct the SURF feature detector object
// (2500 is the Hessian threshold)
cv::Ptr<cv::FeatureDetector> feature =
    new cv::SurfFeatureDetector(2500);
// or, alternatively, the SIFT feature detector object:
// cv::Ptr<cv::FeatureDetector> feature =
//     new cv::SiftFeatureDetector;
// Detect the SURF (or SIFT) features
feature->detect(image,keypoints);
The OpenCV function cv::drawKeypoints is used to show the scale factor
associated with each feature:
// Draw the keypoints with scale
cv::drawKeypoints(image,keypoints,featureImage,
    cv::Scalar(255,255,255),
    cv::DrawMatchesFlags::DRAW_RICH_KEYPOINTS);
Figure 1. img03.jpg
Figure 2. img04.jpg
Figure 3. SIFT-method keypoints-img01.jpg
Figure 4. SIFT-method keypoints-img02.jpg
2.2 Feature Matching
When the same scene point is visible in both images, a SURF or SIFT
feature is detected at that location in each of them, and the two
corresponding circles (of different sizes) contain the same visual
elements. Based on the std::vector of cv::KeyPoint instances obtained
from feature detection, the descriptors are obtained as follows:
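// Construct the SURF descriptor extractor
cv::Ptr<cv::DescriptorExtractor> m_pExtractor =
    new cv::SurfDescriptorExtractor;
// or, alternatively, the SIFT descriptor extractor:
// cv::Ptr<cv::DescriptorExtractor> m_pExtractor =
//     new cv::SiftDescriptorExtractor;
// Compute one descriptor (one matrix row) per keypoint
m_pExtractor->compute(image1,keypoints1,desc1);
m_pExtractor->compute(image2,keypoints2,desc2);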
The result is a matrix which will contain as many rows as the number
of elements in the keypoints vector. Each of these rows is an
N-dimensional descriptor vector. In the case of the SURF descriptor, by
default, it has a size of 64. This vector characterizes the intensity
pattern surrounding a feature point. The more similar the two feature
points, the closer their descriptor vectors should be.
These descriptors are particularly useful in image matching. Suppose,
for example, that two images of the same scene are to be matched. This
can be accomplished by first detecting features on each image, and then
extracting the descriptors of these features. Each feature descriptor
vector in the first image is then compared to all feature descriptors
in the second image. The pair that obtains the best score (that is, the
smallest distance between descriptors) is then kept as the best match
for that feature. This process is repeated for all features in the
first image. This is the most basic scheme, and it is implemented in
OpenCV as the cv::BruteForceMatcher. It is used as follows:
cv::BruteForceMatcher< cv::L2<float> >
matcher;
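
The brute-force scheme described above can be sketched without OpenCV. This is only an illustration under stated assumptions: the Descriptor type and the l2Distance and bruteForceMatch functions are hypothetical names introduced here, and each descriptor is taken to be a plain vector of floats compared with the Euclidean (L2) distance.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <limits>
#include <vector>

// Hypothetical stand-in for one row of the descriptor matrix.
typedef std::vector<float> Descriptor;

// Euclidean (L2) distance between two descriptors of equal length.
float l2Distance(const Descriptor& a, const Descriptor& b) {
    assert(a.size() == b.size());
    float sum = 0.0f;
    for (std::size_t i = 0; i < a.size(); ++i) {
        const float d = a[i] - b[i];
        sum += d * d;
    }
    return std::sqrt(sum);
}

// For each descriptor of the first image, return the index of the
// closest descriptor of the second image (-1 if the second set is empty).
std::vector<int> bruteForceMatch(const std::vector<Descriptor>& desc1,
                                 const std::vector<Descriptor>& desc2) {
    std::vector<int> best(desc1.size(), -1);
    for (std::size_t i = 0; i < desc1.size(); ++i) {
        float bestDist = std::numeric_limits<float>::max();
        for (std::size_t j = 0; j < desc2.size(); ++j) {
            const float d = l2Distance(desc1[i], desc2[j]);
            if (d < bestDist) {
                bestDist = d;
                best[i] = static_cast<int>(j);
            }
        }
    }
    return best;
}
```

The cost grows with the product of the two descriptor counts, which is why this is described as the most basic scheme.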
Matching is performed in both directions, from image 1 to image 2 and
from image 2 to image 1, keeping the k nearest neighbours of each
descriptor with k=2. This is accomplished by the
cv::BruteForceMatcher::knnMatch method, as follows:
std::vector< std::vector<cv::DMatch> >
matches1;
matcher.knnMatch(desc1,desc2,matches1,2);
std::vector< std::vector<cv::DMatch> >
matches2;
matcher.knnMatch(desc2,desc1,matches2,2);
As shown in figures 3 and 4, the white lines linking the matched
points show that even if there are a large number of good matches, a
significant number of false matches have survived. Therefore, a second
test is performed to filter out more false matches.
The following method, called the symmetrical matching scheme, imposes
that, for a match pair to be accepted, each point must be the best
matching feature of the other:
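// Matches accepted in both directions
std::vector<cv::DMatch> symMatches;
// For all matches image 1 -> image 2
for (std::vector< std::vector<cv::DMatch> >::
         const_iterator matIt1 = matches1.begin();
     matIt1 != matches1.end(); ++matIt1) {
    // Ignore entries without 2 neighbours
    if (matIt1->size() < 2)
        continue;
    // For all matches image 2 -> image 1
    for (std::vector< std::vector<cv::DMatch> >::
             const_iterator matIt2 = matches2.begin();
         matIt2 != matches2.end(); ++matIt2) {
        if (matIt2->size() < 2)
            continue;
        // Symmetry test: each point is the other's best match
        if ((*matIt1)[0].queryIdx == (*matIt2)[0].trainIdx &&
            (*matIt2)[0].queryIdx == (*matIt1)[0].trainIdx) {
            // Keep the symmetrical match
            symMatches.push_back(
                cv::DMatch((*matIt1)[0].queryIdx,
                           (*matIt1)[0].trainIdx,
                           (*matIt1)[0].distance));
            break; // next match in image 1 -> image 2
        }
    }
}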
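
The symmetry test can also be illustrated on plain index tables, independently of OpenCV. The sketch below is an assumption-laden simplification rather than the listing above: the symmetricMatches function and its two input tables are hypothetical, and it uses a direct lookup instead of the nested loop, which gives the same accepted pairs when every feature has at most one best match.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// best12[i] = index in image 2 of the best match of feature i of image 1
// best21[j] = index in image 1 of the best match of feature j of image 2
// A negative index means "no match". Returns the pairs (i, j) that are
// each other's best match.
std::vector< std::pair<int,int> > symmetricMatches(
        const std::vector<int>& best12,
        const std::vector<int>& best21) {
    std::vector< std::pair<int,int> > kept;
    for (std::size_t i = 0; i < best12.size(); ++i) {
        const int j = best12[i];
        if (j < 0 || static_cast<std::size_t>(j) >= best21.size())
            continue; // no valid match for feature i
        if (best21[j] == static_cast<int>(i))
            kept.push_back(std::make_pair(static_cast<int>(i), j));
    }
    return kept;
}
```

For example, if features 0 and 2 of image 1 both choose feature 1 of image 2, but that feature chooses feature 0 back, only the pair (0, 1) survives.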
2.3 Outlier Rejection using
RANSAC
This test is based on the RANSAC method, which can compute the
fundamental matrix even when outliers are still present in the match
set. The RANSAC algorithm aims at estimating a given mathematical
entity from a data set that may contain a number of outliers. The idea
is to randomly select some data points from the set and perform the
estimation only with these. The number of selected points should be the
minimum number of points required to estimate the mathematical entity.
The central idea behind the RANSAC algorithm is that the larger the
support set (the set of matches that agree with the computed matrix),
the higher the probability that the computed matrix is the right one.
Obviously, if one or more of the randomly selected matches is a wrong
match, then the computed fundamental matrix will also be wrong, and its
support set is expected to be small. This process is repeated a number
of times, and at the end, the matrix with the largest support is
retained as the most probable one.
When using the cv::findFundamentalMat function with RANSAC, two extra
parameters are provided. One is the confidence level, which determines
the number of iterations to be made. The other is the maximum distance
to the epipolar line for a point to be considered as an inlier. All
matched pairs in which a point is at a distance from its epipolar line
larger than the one specified will be reported as outliers. Therefore,
the function also returns a std::vector of uchar values, each
indicating whether the corresponding match has been identified as an
outlier (0) or as an inlier (1).
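// Convert the keypoints of each match into Point2f
std::vector<cv::Point2f> points1, points2;
for (std::vector<cv::DMatch>::const_iterator
         it = matches.begin();
     it != matches.end(); ++it) {
    // Position of the keypoint in image 1
    float x = keypoints1[it->queryIdx].pt.x;
    float y = keypoints1[it->queryIdx].pt.y;
    points1.push_back(cv::Point2f(x,y));
    // Position of the keypoint in image 2
    x = keypoints2[it->trainIdx].pt.x;
    y = keypoints2[it->trainIdx].pt.y;
    points2.push_back(cv::Point2f(x,y));
}
// Compute the fundamental matrix using RANSAC
std::vector<uchar> inliers(points1.size(),0);
cv::Mat fundamental = cv::findFundamentalMat(
    cv::Mat(points1),cv::Mat(points2), // matching points
    inliers,         // match status (1 = inlier, 0 = outlier)
    CV_FM_RANSAC,    // RANSAC method
    m_lfDistance,    // max distance to the epipolar line
    m_lfConfidence); // confidence level
// Copy the surviving (inlier) matches into outMatches,
// a std::vector<cv::DMatch>
std::vector<uchar>::const_iterator
    itIn = inliers.begin();
std::vector<cv::DMatch>::const_iterator
    itM = matches.begin();
for (; itIn != inliers.end(); ++itIn,++itM) {
    if (*itIn) // it is a valid match
        outMatches.push_back(*itM);
}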
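
The inlier criterion, the distance of a point to its epipolar line, can be made concrete with a small sketch. It assumes the convention that a point p1 of image 1 induces the line l = F (x1, y1, 1)^T in image 2. The Pt type and the epipolarDistance and isInlier functions are hypothetical names, and this is not necessarily the exact test that OpenCV applies internally.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical 2D image point.
struct Pt { double x, y; };

// Distance from p2 (image 2) to the epipolar line a*x + b*y + c = 0
// induced in image 2 by p1 (image 1). F is a row-major 3x3 matrix.
double epipolarDistance(const double F[3][3], Pt p1, Pt p2) {
    const double a = F[0][0] * p1.x + F[0][1] * p1.y + F[0][2];
    const double b = F[1][0] * p1.x + F[1][1] * p1.y + F[1][2];
    const double c = F[2][0] * p1.x + F[2][1] * p1.y + F[2][2];
    const double norm = std::sqrt(a * a + b * b);
    assert(norm > 0.0); // degenerate line: no finite distance
    return std::fabs(a * p2.x + b * p2.y + c) / norm;
}

// A match is treated as an inlier when the distance is small enough.
bool isInlier(const double F[3][3], Pt p1, Pt p2, double maxDistance) {
    return epipolarDistance(F, p1, p2) <= maxDistance;
}
```

With the fundamental matrix of a purely horizontal camera translation, the epipolar lines are the image rows, so the distance reduces to the difference in y between the two matched points.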
The more good matches found in the initial match set, the higher the
probability that RANSAC will give the correct fundamental matrix. This
is why several filters are applied to the match set before calling the
cv::findFundamentalMat function.
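
The link between the confidence level, the quality of the match set, and the number of iterations can be sketched with the standard RANSAC estimate N = log(1 - p) / log(1 - w^s), where p is the confidence, w the fraction of good matches, and s the number of matches drawn per sample. The ransacIterations function below is a hypothetical helper, and the formula is the textbook one, which may differ from the exact update rule used inside OpenCV.

```cpp
#include <cassert>
#include <cmath>

// Number of RANSAC iterations needed so that, with probability
// `confidence`, at least one random sample of `sampleSize` matches
// contains only inliers, given the fraction `inlierRatio` of inliers.
int ransacIterations(double confidence, double inlierRatio, int sampleSize) {
    assert(confidence > 0.0 && confidence < 1.0);
    assert(inlierRatio > 0.0 && inlierRatio < 1.0);
    assert(sampleSize > 0);
    // Probability that one random sample contains only inliers
    const double allInliers = std::pow(inlierRatio, sampleSize);
    return static_cast<int>(
        std::ceil(std::log(1.0 - confidence) / std::log(1.0 - allInliers)));
}
```

For a fixed confidence, a match set with a higher fraction of good matches needs far fewer iterations, which is the practical benefit of filtering the matches first.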
2.4 Recovering Homography from
Correspondences
The previous step computed the fundamental matrix of an image pair
from a set of matches. Another mathematical entity can also be computed
from match pairs: a homography. Like the fundamental matrix, the
homography is a 3×3 matrix with special properties, and it applies to
two-view images in specific situations. The projective relation between
a 3D point and its image on a camera is expressed by a 3×4 matrix. In
the special case where two views of a scene are separated by a pure
rotation, the fourth column of the extrinsic matrix is made of all
zeros. As a result, the projective relation in this special case
becomes a 3×3 matrix. This matrix is called a homography, and it
implies that, under special circumstances, e.g. a pure rotation, the
image of a point in one view is related to the image of the same point
in another view by a linear relation:
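\[
\begin{bmatrix} s\,x' \\ s\,y' \\ s \end{bmatrix}
= H \begin{bmatrix} x \\ y \\ 1 \end{bmatrix}
\]
Here (x, y) is the point in the first view and (x', y') is the same
point in the second view.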
In homogeneous coordinates, this relation holds up to a scale factor,
represented here by the scalar value s. Once this matrix is estimated,
all points in one view can be transferred to the second view using this
relation. Note that, as a side effect of the homography relation for
pure rotation, the fundamental matrix becomes undefined in this case.
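
As a minimal sketch of how a point is transferred with this relation, the code below multiplies the homogeneous point (x, y, 1) by H and then divides by the scale factor s. The Point2 type and the applyHomography function are hypothetical names introduced for illustration, with H stored as a row-major 3×3 array.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical 2D image point.
struct Point2 { double x, y; };

// Transfer p through the row-major 3x3 homography H:
// (s*x', s*y', s)^T = H * (x, y, 1)^T, then divide by s.
Point2 applyHomography(const double H[3][3], Point2 p) {
    const double sx = H[0][0] * p.x + H[0][1] * p.y + H[0][2];
    const double sy = H[1][0] * p.x + H[1][1] * p.y + H[1][2];
    const double s  = H[2][0] * p.x + H[2][1] * p.y + H[2][2];
    assert(std::fabs(s) > 1e-12); // otherwise the point maps to infinity
    Point2 q = { sx / s, sy / s };
    return q;
}
```

Because of the division by s, multiplying every entry of H by the same non-zero constant gives the same transferred point, which is what "up to a scale factor" means in practice.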
Suppose two images are separated by a pure rotation. These two images
can be matched using the RobustMatcher class, followed by a RANSAC step
that estimates a homography from a match set that may still contain a
good number of outliers. This is done by using the cv::findHomography
function:
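// Find the homography between image 1 and image 2
std::vector<uchar> inliers(points1.size(),0);
cv::Mat homography = cv::findHomography(
    cv::Mat(points1),cv::Mat(points2), // corresponding points
    inliers,    // output inlier flags
    CV_RANSAC,  // RANSAC method
    1);         // max distance to the reprojected point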
The resulting inliers that comply with the found homography are drawn
on the two images by the following loops:
// Draw the inlier points on image 1
std::vector<cv::Point2f>::const_iterator
    itPts = points1.begin();
std::vector<uchar>::const_iterator
    itIn = inliers.begin();
while (itPts != points1.end()) {
    // Draw a circle at each inlier location
    if (*itIn)
        cv::circle(image1,*itPts,3,
                   cv::Scalar(255,255,255),2);
    ++itPts;
    ++itIn;
}
// Draw the inlier points on image 2
itPts = points2.begin();
itIn = inliers.begin();
while (itPts != points2.end()) {
    if (*itIn)
        cv::circle(image2,*itPts,3,
                   cv::Scalar(255,255,255),2);
    ++itPts;
    ++itIn;
}
2.5 Image Compositing
Once the homography is computed, it can be used to transfer image
points from one image to the other. In fact, this can be done for all
pixels of an image, and the result is this image transformed into the
other view. There is an OpenCV function that does exactly this:
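// Warp image 1 into the view of image 2
cv::Mat result;
cv::warpPerspective(image1,  // input image
    result,                  // output image
    homography,              // homography
    cv::Size(2*image1.cols,image1.rows)); // size of output image
// Copy image 2 onto the first half of the full image
cv::Mat half(result,cv::Rect(0,0,image2.cols,image2.rows));
image2.copyTo(half);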
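
The listing above uses a fixed output size of twice the width of image1. As a hedged illustration of where the warped image actually lands, the sketch below maps the four corners of an image through the homography and returns their bounding box. The Bounds type and the warpedBounds function are hypothetical and are not part of the method described here. The estimate is only meaningful when no corner is mapped to infinity.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>

// Hypothetical axis-aligned bounding box in output-image coordinates.
struct Bounds { double minX, minY, maxX, maxY; };

// Bounding box of a width x height image after mapping its four corners
// through the row-major 3x3 homography H.
Bounds warpedBounds(const double H[3][3], double width, double height) {
    const double cx[4] = { 0.0, width, 0.0, width };
    const double cy[4] = { 0.0, 0.0, height, height };
    Bounds b = { 0.0, 0.0, 0.0, 0.0 };
    for (int i = 0; i < 4; ++i) {
        const double s = H[2][0] * cx[i] + H[2][1] * cy[i] + H[2][2];
        assert(std::fabs(s) > 1e-12); // corner would map to infinity
        const double x = (H[0][0] * cx[i] + H[0][1] * cy[i] + H[0][2]) / s;
        const double y = (H[1][0] * cx[i] + H[1][1] * cy[i] + H[1][2]) / s;
        if (i == 0) {
            b.minX = b.maxX = x;
            b.minY = b.maxY = y;
        } else {
            b.minX = std::min(b.minX, x);
            b.maxX = std::max(b.maxX, x);
            b.minY = std::min(b.minY, y);
            b.maxY = std::max(b.maxY, y);
        }
    }
    return b;
}
```

A negative minX or minY indicates that part of the warped image falls to the left of or above the output canvas and would be cropped, while regions of the canvas outside the box receive no pixels from the warped image.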
Figure 5. SIFT-method matching keypoints between two images
3 Conclusion
Over several experiments in merging these images, the SIFT algorithm
gave more compatible matches than SURF, although the panoramic image
was produced faster with the SURF algorithm. With its Hessian threshold
set to 1000, SURF could still be used to combine several images, but
with less accurate matching points.
Figure 6. Panoramic Image