6. Invariant local features
- Your eyes don’t see everything at once; they jump around, and you see only about 2 degrees of the visual field at high resolution
- Find features that are invariant to transformations
– geometric invariance: translation, rotation, scale
– photometric invariance: brightness, exposure, …
Feature Descriptors
Image Features II SIFT 6
7. How to achieve invariance
Need both of the following:
1. Make sure your detector is invariant
– Harris is invariant to translation and rotation
– Scale is trickier
• common approach is to detect features at many scales using a Gaussian pyramid
• More sophisticated methods find “the best scale” to represent each feature (e.g., SIFT)
2. Design an invariant feature descriptor
– A descriptor captures the information in a region around the
detected feature point
– The simplest descriptor: a square window of pixels
• What’s this invariant to?
– better approaches exist now …
8. Feature descriptors
We know how to detect good points
Next question: How to match them?
9. Feature descriptors
We know how to detect good points
Next question: How to match them?
Lots of possibilities (this is a popular research area)
– Simple option: match square windows around the point
– Better approach: SIFT
• David Lowe, UBC http://www.cs.ubc.ca/~lowe/keypoints/
10. SIFT Background
• Scale-Invariant Feature Transform
– SIFT detects and describes local features in an image.
– Proposed by David Lowe at ICCV 1999 and refined in IJCV 2004.
– Cited more than 60,000 times to date.
– Widely used in image search, object recognition, video
tracking, gesture recognition, etc.
11. Why is SIFT so popular?
• Desirable properties of SIFT:
– Invariant to scale changes
– Invariant to rotation
– Invariant to illumination changes
– Robust to additive noise
– Robust to a substantial range of affine transformations
– Robust to changes in 3D viewpoint
– Highly distinctive, for discrimination
12. How to extract SIFT
Given a test image:
• Detector: where are the local features?
• Descriptor: how do we describe them?
13. SIFT Algorithm Steps
• Step 1: Constructing a scale space
• Step 2: Laplacian of Gaussian
approximation
• Step 3: Finding Keypoints
• Step 4: Eliminate edges and low contrast
regions
• Step 5: Assign an orientation to the
keypoints
• Step 6: Generate SIFT features
14. Step 1: Constructing a scale space
• To create a scale space, take the original image and generate
progressively blurred versions of it using a Gaussian blur. This
lets us focus on certain objects and suppress other detail in
the scene.
15. Gaussian Blur
The blurred image is L(x, y, σ) = G(x, y, σ) ∗ I(x, y), where:
• L is the blurred image
• G is the Gaussian Blur operator
• I is the input image
• x, y are the location coordinates
• σ is the “scale” parameter. Think of it as the amount of blur:
the greater the value, the greater the blur.
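As a rough illustration of the blurring above, here is a minimal pure-Python sketch (not Lowe's implementation): a separable Gaussian blur over a list-of-lists image, used to build one set of progressively blurred scale-space images. The function names and parameter defaults are illustrative choices, not from the slides.

```python
import math

def gaussian_kernel(sigma, radius=None):
    # 1D Gaussian kernel, normalized to sum to 1
    if radius is None:
        radius = max(1, int(3 * sigma))
    k = [math.exp(-(i * i) / (2 * sigma * sigma)) for i in range(-radius, radius + 1)]
    s = sum(k)
    return [v / s for v in k]

def blur_1d(row, kernel):
    r = len(kernel) // 2
    out = []
    for x in range(len(row)):
        acc = 0.0
        for j, w in enumerate(kernel):
            # clamp indices at the borders (replicate edge pixels)
            xx = min(max(x + j - r, 0), len(row) - 1)
            acc += w * row[xx]
        out.append(acc)
    return out

def gaussian_blur(img, sigma):
    # separable blur: filter every row, then every column
    k = gaussian_kernel(sigma)
    rows = [blur_1d(r, k) for r in img]
    cols = [blur_1d(list(c), k) for c in zip(*rows)]
    return [list(r) for r in zip(*cols)]

def scale_space(img, k=math.sqrt(2), sigma0=1.0, levels=5):
    # progressively blurred copies: sigma0, k*sigma0, k^2*sigma0, ...
    return [gaussian_blur(img, sigma0 * (k ** i)) for i in range(levels)]
```

Blurring a single bright pixel shows the expected behaviour: the peak spreads out and shrinks as σ grows.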
16. Here’s an example:
• Look at how the cat’s helmet loses detail. So do its whiskers.
17. • SIFT takes scale spaces to the next level: resize the original
image to half size, generate the blurred images again, and keep
repeating.
18. 4 octaves and 5 blur levels
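The octave structure described above (repeatedly halving the image and re-blurring) can be sketched as follows. This is an illustrative skeleton, not Lowe's code: the blur function is passed in as a parameter so the octave bookkeeping stays self-contained.

```python
def downsample(img):
    # keep every second pixel in each direction: half-size image
    return [row[::2] for row in img[::2]]

def build_pyramid(img, blur, n_octaves=4, n_levels=5):
    # pyramid[o][i] holds the i-th blur level of octave o; each octave
    # starts from a half-size copy of the previous octave's base image
    pyramid = []
    base = img
    for _ in range(n_octaves):
        octave = [blur(base, 2 ** (i / 2)) for i in range(n_levels)]
        pyramid.append(octave)
        base = downsample(base)
    return pyramid
```

For example, with 4 octaves and 5 blur levels (as on the slide) and an 8×8 input, the octave base sizes come out as 8, 4, 2, 1. The identity "blur" below is just a placeholder to show the structure.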
19. Step 2: Laplacian of Gaussian approximation
• The Laplacian of Gaussian (LoG) operation goes like this: take
an image, blur it a little, and then calculate second-order
derivatives on it (the “Laplacian”). These are good for finding
keypoints.
• The problem is that calculating all those second-order
derivatives is computationally intensive.
• Solution: use the Difference of Gaussians (DoG).
– We use the scale space (from the previous step).
– We calculate the difference between two consecutive scales.
– These DoG images are great for finding interesting keypoints
in the image.
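The "simple subtraction" is exactly that. A minimal sketch, assuming the scale-space images are same-sized lists of rows:

```python
def difference_of_gaussians(scale_space_imgs):
    # DoG level i = blur level i+1 minus blur level i (pixel-wise)
    dogs = []
    for a, b in zip(scale_space_imgs, scale_space_imgs[1:]):
        dogs.append([[pb - pa for pa, pb in zip(ra, rb)]
                     for ra, rb in zip(a, b)])
    return dogs
```

Note that n blur levels yield n − 1 DoG images per octave.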
20. These Difference of Gaussian images are approximately
equivalent to the Laplacian of Gaussian, and we’ve replaced a
computationally intensive process with a simple subtraction
(fast and efficient).
22. Step 3: Finding Keypoints
• Iterate through each pixel and check all its neighbours. The
check is done within the current DoG image, and also the ones
above and below it. (In the figure, X marks the current pixel
and the green circles mark the neighbours.)
• X is marked as a keypoint if it is the greatest or least of all
26 neighbours.
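The 26-neighbour test can be written directly. A sketch, assuming `dog` is a stack of DoG images indexed as `dog[scale][row][col]` and the pixel is not on a border:

```python
def is_extremum(dog, s, y, x):
    # compare the pixel to its 26 neighbours: the 8 around it in the
    # same DoG image, plus 9 each in the scales above and below
    v = dog[s][y][x]
    neighbours = []
    for ds in (-1, 0, 1):
        for dy in (-1, 0, 1):
            for dx in (-1, 0, 1):
                if ds == dy == dx == 0:
                    continue
                neighbours.append(dog[s + ds][y + dy][x + dx])
    return v > max(neighbours) or v < min(neighbours)
```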
23. Step 4: Eliminate edges and low contrast regions
• The previous step produces a lot of keypoints. Some of them lie
along an edge, or they don’t have enough contrast. In both cases
they are not useful as features, so we need to get rid of them.
• Reject points with bad contrast:
– |DoG| smaller than 0.03 (image values in [0, 1])
• Reject edges:
– Use the Harris detector and keep only corners
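The contrast test is a one-line filter. A sketch (the keypoint tuple layout is an assumption for illustration):

```python
def filter_low_contrast(keypoints, dog, threshold=0.03):
    # keep only keypoints whose |DoG| response reaches the threshold;
    # keypoints are (scale, y, x) tuples, dog values are in [0, 1]
    return [(s, y, x) for (s, y, x) in keypoints
            if abs(dog[s][y][x]) >= threshold]
```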
27. Step 5: Assign an orientation to the keypoints
• The idea is to collect gradient magnitudes and orientations
around each keypoint, figure out the most prominent
orientation(s) in that region, and assign this orientation(s)
to the keypoint.
• This orientation provides rotation invariance.
• For a keypoint, let L be the image with the closest scale.
– Compute the gradient using finite differences:
GradientVector = ( L(x+1, y) − L(x−1, y), L(x, y+1) − L(x, y−1) )
28. Step 5: Assign an orientation to the keypoints (cont.)
• The magnitude and orientation are calculated for all pixels
around the keypoint. Then a histogram is created in which the
360 degrees of orientation are broken into 36 bins (10 degrees
each). The histogram will have a peak at some point.
• Above, the histogram peaks at 20-29 degrees, so the keypoint is
assigned orientation 3 (the third bin). The “amount” added to a
bin is proportional to the magnitude of the gradient at that
point.
• Also, any peak above 80% of the highest peak is converted into
a new keypoint. This new keypoint has the same location and
scale as the original, but its orientation is equal to the other
peak. So orientation assignment can split one keypoint into
multiple keypoints.
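The 36-bin, magnitude-weighted histogram and the 80% peak rule can be sketched as follows. This is an illustrative simplification: the window `radius` is an assumed parameter, and the Gaussian weighting of samples that Lowe also applies is omitted for brevity.

```python
import math

def orientation_histogram(L, cy, cx, radius=4, bins=36):
    # 36-bin gradient-orientation histogram around (cy, cx), each
    # sample weighted by its gradient magnitude
    hist = [0.0] * bins
    for y in range(cy - radius, cy + radius + 1):
        for x in range(cx - radius, cx + radius + 1):
            if not (1 <= y < len(L) - 1 and 1 <= x < len(L[0]) - 1):
                continue  # skip samples whose finite difference leaves the image
            dx = L[y][x + 1] - L[y][x - 1]
            dy = L[y + 1][x] - L[y - 1][x]
            mag = math.hypot(dx, dy)
            theta = math.degrees(math.atan2(dy, dx)) % 360.0
            hist[int(theta // (360 // bins)) % bins] += mag
    return hist

def dominant_orientations(hist, ratio=0.8):
    # the keypoint's orientation(s): every bin above 80% of the peak
    peak = max(hist)
    return [i for i, v in enumerate(hist) if v >= ratio * peak and v > 0]
```

On an image that is a simple horizontal ramp, every gradient points along +x, so the histogram puts all its mass in bin 0.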
34. Step 6: Generate SIFT features
• Each point so far has x, y, σ, m, θ
– Location x,y
– Scale: σ
– gradient magnitude and orientation: m, θ
• Now we need a descriptor for the region
– Could sample intensities around point, but…
• Sensitive to lighting changes
• Sensitive to slight errors in x, y, θ
35. Making descriptor rotation invariant
• Rotate patch according to its dominant gradient orientation
• This puts the patches into a canonical orientation.
36. Step 6: Generate SIFT features
• So far we have scale and rotation invariance. Now we create a
fingerprint for each keypoint, to identify it.
• To do this, take a 16×16 window around the keypoint and break
it into sixteen 4×4 windows.
• Within each 4×4 window, gradient magnitudes and orientations
are calculated and put into an 8-bin histogram. The amount added
to a bin depends on the magnitude of the gradient, and also on
the distance from the keypoint, so gradients far away from the
keypoint add smaller values to the histogram.
37. Step 6: Generate SIFT features (cont.)
• Do this for all sixteen 4×4 regions, so you end up with
4×4×8 = 128 numbers. Once you have all 128 numbers, you
normalize them.
• These 128 numbers form the “feature vector”, which uniquely
identifies the keypoint.
=> Feature vector (128-D)
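The 16×16-window, sixteen-cell, 8-bin construction above can be sketched end to end. This is a simplified illustration, not Lowe's exact descriptor: it skips the rotation of the window to the keypoint orientation and the trilinear interpolation between bins, but it shows the 128-number layout, the distance weighting, and the final normalization.

```python
import math

def sift_descriptor(L, cy, cx):
    # 16x16 window -> sixteen 4x4 cells -> 8-bin histograms -> 128-D
    desc = []
    for cell_y in range(4):
        for cell_x in range(4):
            hist = [0.0] * 8
            for i in range(4):
                for j in range(4):
                    y = cy - 8 + cell_y * 4 + i
                    x = cx - 8 + cell_x * 4 + j
                    dx = L[y][x + 1] - L[y][x - 1]
                    dy = L[y + 1][x] - L[y - 1][x]
                    mag = math.hypot(dx, dy)
                    theta = math.degrees(math.atan2(dy, dx)) % 360.0
                    # Gaussian weight: samples far from the keypoint
                    # contribute less to the histogram
                    d2 = (y - cy) ** 2 + (x - cx) ** 2
                    w = math.exp(-d2 / (2 * 8.0 ** 2))
                    hist[int(theta // 45) % 8] += w * mag
            desc.extend(hist)
    # normalize to unit length to reduce illumination effects
    norm = math.sqrt(sum(v * v for v in desc)) or 1.0
    return [v / norm for v in desc]
```

The result is always a 128-dimensional unit-length vector of non-negative histogram entries.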
38. Numeric Example (by Yao Lu)
(Figure: a 4×4 grid of sample pixel intensities, including 0.37,
0.79, 0.97, 0.98, …, used to work through the gradient computation
by hand on the following slides.)
39. (Figure by Yao Lu: the 3×3 neighbourhood of L(x, y), with
neighbours L(x−1, y−1) … L(x+1, y+1) and their sample values.)
magnitude(x, y) = √[ (L(x+1, y) − L(x−1, y))² + (L(x, y+1) − L(x, y−1))² ]
θ(x, y) = atan( (L(x, y+1) − L(x, y−1)) / (L(x+1, y) − L(x−1, y)) )
40. Orientations in each of the 16 pixels of the cell: the
orientations all ended up in two bins, 11 in one and 5 in the
other (rough count), giving the 8-bin histogram
5 11 0 0 0 0 0 0
41. Summary of SIFT Features
• Descriptor: 128-D
– 4×4 patches, each with an 8-D gradient angle histogram:
4×4×8 = 128
– Normalized to reduce the effects of illumination change.
• Position: (x, y)
– Where the feature is located.
• Scale
– Controls the region size for descriptor extraction.
• Orientation
– To achieve a rotation-invariant descriptor.
43. Properties of SIFT
• Extraordinarily robust matching technique
– Can handle changes in viewpoint
• Up to about 30 degrees of out-of-plane rotation
– Can handle significant changes in illumination
• Sometimes even day vs. night (below)
– Fast and efficient—can run in real time
– Various code available
• http://www.cs.ubc.ca/~lowe/keypoints/
44. Example: NASA Mars Rover images with SIFT feature matches
(figure by Noah Snavely)
45. Example: Object Recognition
Lowe, IJCV04
SIFT is extremely powerful for object instance
recognition, especially for well-textured objects
50. Matching with Features
•Detect feature points in both images
•Find corresponding pairs
•Use these matching pairs to align images - the
required mapping is called a homography.
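Finding corresponding pairs between two images is typically done by nearest-neighbour search on the 128-D descriptors with Lowe's ratio test: accept a match only if the best candidate is clearly closer than the second best. A minimal sketch (the 0.8 threshold is Lowe's suggested value; brute-force search is used here for clarity):

```python
import math

def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def match_features(desc1, desc2, ratio=0.8):
    # Lowe's ratio test: keep a match only when the nearest descriptor
    # is much closer than the second nearest
    matches = []
    for i, d in enumerate(desc1):
        dists = sorted((euclidean(d, e), j) for j, e in enumerate(desc2))
        if len(dists) >= 2 and dists[0][0] < ratio * dists[1][0]:
            matches.append((i, dists[0][1]))
    return matches
```

The accepted pairs then feed the homography estimation mentioned above.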
52. Recognition of specific objects, scenes
Rothganger et al. 2003 Lowe 2002
Schmid and Mohr 1997 Sivic and Zisserman, 2003
Kristen Grauman
53. Example: 3D Reconstructions
• Photosynth (also called Photo Tourism)
developed at UW by Noah Snavely, Steve Seitz,
Rick Szeliski and others
http://www.youtube.com/watch?v=p16frKJLVi0
• Building Rome in a day, developed at UW by
Sameer Agarwal, Noah Snavely, Steve Seitz
and others
http://www.youtube.com/watch?v=kxtQqYLRaSQ&feature=player_embedded
54. When does the SIFT descriptor fail?
Patches SIFT thought were the same but aren’t:
55. References
• David G. Lowe, “Distinctive image features from scale-invariant
keypoints,” International Journal of Computer Vision, 2004.
http://www.cs.ubc.ca/~lowe/papers/ijcv04.pdf
• SIFT: Scale-Invariant Feature Transform.
http://www.aishack.in/2010/05/sift-scale-invariant-feature-transform/
• Implementing SIFT in OpenCV.
http://www.aishack.in/2010/07/implementing-sift-in-opencv/