The document discusses implementing and modifying the Histogram of Oriented Gradients (HOG) descriptor algorithm to improve pedestrian detection performance. It implements HOG using OpenCV and trains a support vector machine (SVM) on the INRIA dataset. Various modifications to HOG are tested, including different cell sizes, histogram bin configurations, and levels of Gaussian blur, to determine their effect on detection rate and false positive rate. The best performance is achieved with 9 signed histogram bins, small cell sizes, and no Gaussian blur.
Implementing HOG Descriptor for Pedestrian Detection
Objective
I provide an implementation of the Histograms of Oriented Gradients (HOG) descriptor algorithm.
Modifications and variations of HOG are examined in order to improve performance, that is, to maximize the number of true positives detected and minimize the number of false positives detected.
Using the INRIA training dataset, I train a support vector machine (SVM) for classification. I then classify the INRIA test dataset with the trained SVM.
Methods
I begin by implementing the Scale-Invariant Feature Transform (SIFT), another popular feature descriptor.
Second, I implement my own version of HOG, making use of available image processing functions within the OpenCV library.
I then train a linear SVM with the training images in the INRIA dataset before classifying the test images.
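The poster does not show the training code or say which SVM implementation was used. As a library-free stand-in, the sketch below fits a linear SVM by batch subgradient descent on the regularized hinge loss. The function names, the hyperparameters, and the toy two-dimensional "feature vectors" are all illustrative assumptions; real HOG vectors would take their place.

```python
import numpy as np


def train_linear_svm(features, labels, c=1.0, learning_rate=0.01, epochs=200):
    """Minimize 0.5*||w||^2 + c * sum(hinge loss) by batch subgradient descent.

    features: (n_samples, n_dims) array; labels: array of +1 / -1.
    All hyperparameter defaults are assumptions chosen for the toy data below.
    """
    features = np.asarray(features, dtype=np.float64)
    labels = np.asarray(labels, dtype=np.float64)
    weights = np.zeros(features.shape[1])
    bias = 0.0
    for _ in range(epochs):
        margins = labels * (features @ weights + bias)
        violated = margins < 1.0  # samples inside the margin or misclassified
        signed_rows = labels[violated][:, None] * features[violated]
        grad_w = weights - c * signed_rows.sum(axis=0)
        grad_b = -c * labels[violated].sum()
        weights -= learning_rate * grad_w
        bias -= learning_rate * grad_b
    return weights, bias


def svm_response(features, weights, bias):
    """Signed distance-like score; positive means the 'pedestrian' class here."""
    return np.asarray(features, dtype=np.float64) @ weights + bias


# Made-up, linearly separable toy data (not INRIA features).
toy_features = np.array([[2, 2], [3, 3], [2, 3], [-2, -2], [-3, -3], [-2, -3]])
toy_labels = np.array([1, 1, 1, -1, -1, -1])
weights, bias = train_linear_svm(toy_features, toy_labels)
responses = svm_response(toy_features, weights, bias)
```

The raw response, rather than just its sign, is what a detection threshold would later be applied to.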
I can then vary my HOG implementation to alter the feature vectors and subsequent classification.
Test cases:
Gaussian blur: no blur, 2x2 blur, 3x3 blur, 4x4 blur.
Cell size: 6x6 pixels, 8x8 pixels, 9x9 pixels, 16x16 pixels.
Histogram size/orientation: 6 bins (signed/unsigned), 9 bins (signed/unsigned), 12 bins (signed/unsigned).
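One way to keep these test cases organized is a small configuration object. The sketch below assumes that each parameter is varied on its own around a baseline of no blur, 8x8 cells, and 9 unsigned bins. The poster states neither the baseline nor whether parameters were combined, so both are assumptions, as are the names `HogConfig` and `enumerate_cases`.

```python
from dataclasses import dataclass, replace
from typing import Optional


@dataclass(frozen=True)
class HogConfig:
    blur_kernel: Optional[int] = None  # None means no blur; 2 means "2x2 blur"
    cell_size: int = 8                 # cell side length in pixels
    num_bins: int = 9                  # orientation bins per cell histogram
    signed: bool = False               # True: 0-360 degrees, False: 0-180


BASELINE = HogConfig()  # assumed baseline, not stated in the poster


def enumerate_cases(baseline=BASELINE):
    """List every test case, varying one parameter at a time."""
    cases = []
    for kernel in (None, 2, 3, 4):
        cases.append(replace(baseline, blur_kernel=kernel))
    for cell in (6, 8, 9, 16):
        cases.append(replace(baseline, cell_size=cell))
    for bins in (6, 9, 12):
        for signed in (True, False):
            cases.append(replace(baseline, num_bins=bins, signed=signed))
    # The baseline appears in each sweep; keep only the first copy.
    return list(dict.fromkeys(cases))


for case in enumerate_cases():
    print(case)
```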
Introduction
Pedestrian detection is a relevant and significant task, applied to driver-assistance systems as well as security and video surveillance systems.
There are many challenges involved in pedestrian detection, mainly due to occlusion and to variation in illumination and appearance.
Feature descriptors are algorithms that break down and describe the important characteristics (known as features) of an image.
These descriptors are then used to classify images. Classification is performed using machine learning techniques such as support vector machines (SVMs).
Conclusion
In all cases, the detection rate and the false positive rate rise and fall together as the SVM response threshold is varied.
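The coupling between the two rates can be seen by sweeping a threshold over SVM responses. The sketch below assumes a window is labeled "pedestrian" when its response is at or above the threshold; under that convention both rates fall as the threshold is raised and rise as it is lowered. The scores and labels are made-up toy values, not results from the poster.

```python
def rates_at_threshold(scores, labels, threshold):
    """Return (true positive rate, false positive rate) for one threshold.

    labels use 1 for pedestrian and 0 for non-pedestrian (an assumed encoding).
    """
    true_pos = sum(1 for s, y in zip(scores, labels) if y == 1 and s >= threshold)
    false_pos = sum(1 for s, y in zip(scores, labels) if y == 0 and s >= threshold)
    num_pos = sum(1 for y in labels if y == 1)
    num_neg = len(labels) - num_pos
    return true_pos / num_pos, false_pos / num_neg


def roc_points(scores, labels, thresholds):
    """Evaluate both rates at each threshold, in the order given."""
    return [(t,) + rates_at_threshold(scores, labels, t) for t in thresholds]


# Toy SVM responses and ground-truth labels, for illustration only.
toy_scores = [-1.2, -0.4, 0.1, 0.3, 0.9, 1.5]
toy_truth = [0, 0, 1, 0, 1, 1]
for threshold, tpr, fpr in roc_points(toy_scores, toy_truth, [-1.0, 0.0, 1.0]):
    print(f"threshold={threshold:+.1f}  TPR={tpr:.2f}  FPR={fpr:.2f}")
```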
HOG performs best with little to no blur. Blurring smooths away the edge information and gradient data that the descriptor relies on.
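A one-dimensional toy example illustrates the effect: smoothing a sharp step edge lowers the peak gradient. The kernel size, the sigma value, and the helper names below are assumptions for illustration, and NumPy is used here in place of an OpenCV blur call.

```python
import numpy as np


def gaussian_kernel_1d(size, sigma):
    """Normalized 1-D Gaussian kernel with `size` taps."""
    offsets = np.arange(size) - (size - 1) / 2.0
    kernel = np.exp(-(offsets ** 2) / (2.0 * sigma ** 2))
    return kernel / kernel.sum()


def peak_gradient(signal):
    """Largest absolute centered difference, signal[i+1] - signal[i-1]."""
    return float(np.max(np.abs(signal[2:] - signal[:-2])))


# An ideal step edge: 16 dark samples followed by 16 bright samples.
step = np.concatenate([np.zeros(16), np.ones(16)])
blurred = np.convolve(step, gaussian_kernel_1d(3, 1.0), mode="same")

print("peak gradient, no blur:", peak_gradient(step))
print("peak gradient, blurred:", peak_gradient(blurred))
```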
Generally speaking, HOG performs best with more histogram bins and with bins that use signed orientations. The exception is the case of 6 unsigned bins. The best case uses 9 signed bins.
HOG functions best with a small cell size. The feature vector should ideally be as large as possible.
An important consideration is that adding bins enlarges the feature vector and therefore increases computational cost. The same applies to smaller cell sizes, which also produce larger feature vectors.
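To make the growth concrete, the sketch below counts feature vector entries. It assumes a 64x128 detection window, blocks of 2x2 cells, and a block stride of one cell, which are common HOG settings; the poster does not state the window or block geometry used, so these are assumptions.

```python
def hog_feature_length(cell_size, num_bins, window=(64, 128), block_cells=2):
    """Number of entries in a HOG feature vector.

    Assumes blocks of block_cells x block_cells cells that slide one cell
    at a time; partial cells at the window border are discarded.
    """
    cells_x = window[0] // cell_size
    cells_y = window[1] // cell_size
    blocks_x = cells_x - block_cells + 1
    blocks_y = cells_y - block_cells + 1
    return blocks_x * blocks_y * block_cells * block_cells * num_bins


for cell in (6, 8, 9, 16):
    print(f"cell {cell}x{cell}, 9 bins: {hog_feature_length(cell, 9)} values")
for bins in (6, 9, 12):
    print(f"cell 8x8, {bins} bins: {hog_feature_length(8, bins)} values")
```

Under these assumptions an 8x8 cell with 9 bins gives 3780 values, while a 16x16 cell gives 756.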
References
[1] Garg, Sanyam. 2014. Creating The HOG Feature Descriptor. Image. http://3.bp.blogspot.com/-BD3mYU2gVNM/U2K1C_UKHeI/AAAAAAAAAxs/WM8oCbYkj_Y/s1600/blog2_9.PNG.
[2] Ko, ByoungChul. 2012. 'Wildfire Smoke Detection Using Temporospatial Features And Random Forest Classifiers'. Optical Engineering 51 (1): 017208. doi:10.1117/1.oe.51.1.017208.

Histogram of Oriented Gradients
HOG is one of the most popular and best-performing descriptors. There is a lot of interest in improving the current algorithm.
HOG decomposes the image into small cells.
Within each of these cells, the gradients are calculated.
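A minimal sketch of the gradient step, assuming centered differences (the [-1, 0, 1] filter) on a grayscale image. The poster does not state which filter its OpenCV-based implementation used, so the filter choice and the function name are assumptions.

```python
import numpy as np


def image_gradients(gray):
    """Return per-pixel gradient magnitude and orientation in degrees [0, 360).

    Border pixels are left with zero gradient for simplicity.
    """
    gray = np.asarray(gray, dtype=np.float64)
    grad_x = np.zeros_like(gray)
    grad_y = np.zeros_like(gray)
    grad_x[:, 1:-1] = gray[:, 2:] - gray[:, :-2]   # horizontal centered difference
    grad_y[1:-1, :] = gray[2:, :] - gray[:-2, :]   # vertical centered difference
    magnitude = np.hypot(grad_x, grad_y)
    angle = np.degrees(np.arctan2(grad_y, grad_x)) % 360.0
    return magnitude, angle


# Toy image: brightness increases by 1 per pixel from left to right.
ramp = np.tile(np.arange(8.0), (8, 1))
magnitude, angle = image_gradients(ramp)
print(magnitude[4, 4], angle[4, 4])
```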
A block is formed by combining adjacent cells, and the gradients of the cells within it are normalized. This provides a degree of invariance to illumination.
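The normalization step can be sketched as follows, assuming the L2 norm over the concatenated cell histograms of a block. The poster does not say which norm was used, so the L2 choice, the epsilon value, and the function name are assumptions.

```python
import numpy as np


def normalize_block(cell_histograms, epsilon=1e-6):
    """L2-normalize the concatenated histograms of the cells in one block.

    cell_histograms: array of shape (cells_in_block, num_bins).
    epsilon guards against division by zero in flat image regions.
    """
    vector = np.asarray(cell_histograms, dtype=np.float64).ravel()
    return vector / np.sqrt(np.sum(vector ** 2) + epsilon ** 2)


# Toy block of two cells with three bins each (made-up values).
block = np.array([[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]])
dim = normalize_block(block)
bright = normalize_block(2.0 * block)  # same scene with doubled contrast
print(np.allclose(dim, bright))
```

Scaling every gradient in the block by the same factor leaves the normalized vector essentially unchanged, which is where the illumination invariance comes from.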
The histogram of each cell is then computed. The gradient directions are quantized into the orientation bins of the histogram.
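The quantization step, including the signed/unsigned distinction tested in this work, might look like the sketch below. It assumes each pixel votes with its gradient magnitude into a single bin; implementations often interpolate votes between neighboring bins instead, and the poster does not say which was used. The function name and defaults are illustrative.

```python
import numpy as np


def cell_histogram(magnitude, angle_deg, num_bins=9, signed=False):
    """Magnitude-weighted orientation histogram for one cell.

    signed=True keeps directions over 0-360 degrees; signed=False folds
    opposite directions together over 0-180 degrees.
    """
    magnitude = np.asarray(magnitude, dtype=np.float64)
    angle_deg = np.asarray(angle_deg, dtype=np.float64)
    span = 360.0 if signed else 180.0
    folded = np.mod(angle_deg, span)
    bin_width = span / num_bins
    # Clamp guards against rounding that lands exactly on the upper edge.
    index = np.minimum((folded // bin_width).astype(int), num_bins - 1)
    histogram = np.zeros(num_bins)
    np.add.at(histogram, index.ravel(), magnitude.ravel())
    return histogram


# Two gradients pointing in opposite directions (10 and 190 degrees).
mags = np.array([1.0, 1.0])
angles = np.array([10.0, 190.0])
print(cell_histogram(mags, angles, num_bins=9, signed=False))
print(cell_histogram(mags, angles, num_bins=9, signed=True))
```

With unsigned bins the two opposite gradients fall into the same bin; with signed bins they are kept apart.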
The histograms of the entire image are concatenated to form the feature vector.
Regions of Convergence
Performance
[Figure: Cell Size. True positive rate vs. false positive rate for cell sizes 6x6, 8x8, 9x9, 16x16.]
[Figure: Gaussian Blur. True positive rate vs. false positive rate for no blur, 2x2 blur, 3x3 blur, 4x4 blur.]
[Figure: Histogram Bin Size/Sign. True positive rate vs. false positive rate for 6 signed, 6 unsigned, 9 signed, 9 unsigned, 12 signed, 12 unsigned.]
[Figure: Gaussian Blur - False Positives. % false positives for no blur, 2x2 blur, 3x3 blur, 4x4 blur.]
[Figure: Gaussian Blur - True Positives. % true positives for no blur, 2x2 blur, 3x3 blur, 4x4 blur.]
Performance (cont.)
[Figure: Histogram Bin Size/Sign - True Positives. % true positives for 6 signed, 6 unsigned, 9 signed, 9 unsigned, 12 signed, 12 unsigned.]
[Figure: Histogram Bin Size/Sign - False Positives. % false positives for 6 signed, 6 unsigned, 9 signed, 9 unsigned, 12 signed, 12 unsigned.]
[Figure: Cell Size - False Positives. % false positives for cell sizes 6x6, 8x8, 9x9, 16x16.]
[Figure: Cell Size - True Positives. % true positives for cell sizes 6x6, 8x8, 9x9, 16x16.]
[1]
[2]