For vehicle autonomy, driver assistance and situational awareness, it is necessary to operate at day and night, and in all weather conditions. In particular, long wave infrared (LWIR) sensors that receive predominantly emitted radiation have the capability to operate at night as well as during the day. In this work, we employ a polarised LWIR (POL-LWIR) camera to acquire data from a mobile vehicle, to compare and contrast four different convolutional neural network (CNN) configurations to detect other vehicles in video sequences. We evaluate two distinct and promising approaches, two-stage detection (Faster-RCNN) and one-stage detection (SSD), in four different configurations. We also employ two different image decompositions: the first based on the polarisation ellipse and the second on the Stokes parameters themselves. To evaluate our approach, the experimental trials were quantified by mean average precision (mAP) and processing time, showing a clear trade-off between the two factors. For example, the best mAP result of 80.94 % was achieved using Faster-RCNN, but at a frame rate of 6.4 fps. In contrast, MobileNet SSD achieved only 64.51 % mAP, but at 53.4 fps.
3. Introduction
• Most car companies are promising Level 4 autonomy by 2020
• RGB cameras offer poor visibility at night
• Long-wave Infrared (LWIR) sensors are capable of
sensing beyond the visible spectrum and are robust
to falling illumination
• Previous work has shown that polarised LWIR can capture features such as material refractive index, surface orientation and angle of observation, which can lead to better discrimination
Example of vehicles sensed by
POL-LWIR
4. Objectives
• Try the two most promising research directions in
object detection based on deep neural networks
to recognise vehicles in polarised long-wave
infrared
• Two-Stage object detection
○ Faster R-CNN [1]
• One-Stage object detection
○ Single Shot MultiBox Detector (SSD) [2]
• Compare results in terms of mean average
precision (mAP) and processing time
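As an illustration of the accuracy metric, the following is a minimal sketch of average precision (AP) for a single class. It assumes a PASCAL VOC-style protocol with an IoU threshold of 0.5 and all-point interpolation; the slides do not state which protocol was used, and the function names `iou` and `average_precision` are hypothetical. With one class (vehicle), mAP equals this AP.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def average_precision(detections, ground_truth, iou_threshold=0.5):
    """AP for one class (assumed VOC-style matching, not confirmed by the source).

    detections:   list of (image_id, score, box)
    ground_truth: dict mapping image_id -> list of boxes
    """
    n_gt = sum(len(boxes) for boxes in ground_truth.values())
    if n_gt == 0:
        return 0.0
    matched = {img: [False] * len(boxes) for img, boxes in ground_truth.items()}
    tp_flags = []
    # Visit detections from most to least confident.
    for img, _score, box in sorted(detections, key=lambda d: d[1], reverse=True):
        best, best_j = 0.0, -1
        for j, gt in enumerate(ground_truth.get(img, [])):
            overlap = iou(box, gt)
            if overlap > best:
                best, best_j = overlap, j
        # A detection is a true positive only if it claims an unmatched box.
        if best_j >= 0 and best >= iou_threshold and not matched[img][best_j]:
            matched[img][best_j] = True
            tp_flags.append(True)
        else:
            tp_flags.append(False)
    precisions, recalls, tp = [], [], 0
    for k, flag in enumerate(tp_flags, start=1):
        tp += flag
        precisions.append(tp / k)
        recalls.append(tp / n_gt)
    # Make precision non-increasing when read from left to right.
    for k in range(len(precisions) - 2, -1, -1):
        precisions[k] = max(precisions[k], precisions[k + 1])
    # Sum the area under the precision-recall curve.
    ap, prev_recall = 0.0, 0.0
    for p, r in zip(precisions, recalls):
        ap += (r - prev_recall) * p
        prev_recall = r
    return ap
```

Each ground-truth box can be matched only once, so duplicate detections of the same vehicle count as false positives.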
[1] Ren, Shaoqing, et al. "Faster R-CNN: Towards real-time object detection with region proposal networks." Advances in neural information processing systems. 2015.
[2] Liu, Wei, et al. "SSD: Single shot multibox detector." European conference on computer vision. Springer, Cham, 2016.
5. Polarised Infrared
• The Thales Catherine MP LWIR
Polarimetre was used
• Four linear polarisers are built into the sensor (0°, 45°, 90°, 135°)
• It senses wavelengths from 8 μm to 12 μm (approximately 37.5 THz to 25 THz)
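The wavelength band can be converted to frequency with f = c/λ. The snippet below is a small sketch of that arithmetic; the function name `wavelength_um_to_thz` is hypothetical.

```python
C = 299_792_458.0  # speed of light in vacuum, m/s


def wavelength_um_to_thz(wavelength_um):
    """Convert a wavelength in micrometres to a frequency in terahertz."""
    wavelength_m = wavelength_um * 1e-6
    return C / wavelength_m / 1e12


for wavelength in (8.0, 12.0):
    print(f"{wavelength} um -> {wavelength_um_to_thz(wavelength):.1f} THz")
```

For the 8 μm and 12 μm band edges this gives roughly 37.5 THz and 25.0 THz respectively.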
Polarised Infrared Camera
developed by Thales
Electron micrograph image of a
polarisation sensitive
6. Stokes Vector
• A way of representing polarised light is to compute the
Stokes Vector.
• The Stokes Vector can describe properties of the light
7. Stokes Vector
• The Stokes Vector contains 4 components: I, Q, U and V.
• I component measures the total intensity of the radiation.
• Q and U components describe the amount of radiation polarised in a horizontal
direction and in a plane rotated 45° from the horizontal respectively.
• V component describes the amount of right-circularly polarised radiation.
• From the Stokes vector the degree of linear polarisation, P, and the angle of
polarisation, φ, can be calculated.
• Measuring V requires an additional quarter-wave plate, so it was not taken into
consideration in this project.
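The slides do not give the formulas, so the following is a sketch using the standard textbook convention for four linear polariser measurements at 0°, 45°, 90° and 135°. The function names are hypothetical, the inputs are per-pixel intensities, and the actual camera calibration may differ.

```python
import math


def stokes_from_polarisers(i0, i45, i90, i135):
    """Linear Stokes components from four polariser intensities.

    Assumed convention: I is the mean of the two orthogonal-pair sums,
    Q = I0 - I90 and U = I45 - I135. V is not measured.
    """
    i = 0.5 * (i0 + i45 + i90 + i135)
    q = i0 - i90
    u = i45 - i135
    return i, q, u


def linear_polarisation(i, q, u):
    """Degree of linear polarisation P and angle of polarisation phi (radians)."""
    p = math.hypot(q, u) / i if i > 0 else 0.0
    phi = 0.5 * math.atan2(u, q)  # lies in [-pi/2, pi/2]
    return p, phi


# Light fully polarised at 45 degrees with unit total intensity.
i, q, u = stokes_from_polarisers(0.5, 1.0, 0.5, 0.0)
p, phi = linear_polarisation(i, q, u)
print(i, q, u, p, math.degrees(phi))
```

Applying the same two functions to every pixel of the four polariser images yields the I, Q, U and I, P, φ images.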
10. Faster R-CNN
• Faster R-CNN applies a network to the whole image to extract features, from which it
proposes bounding boxes (“Region Proposal Network”)
• Then it uses these proposals and the already generated features as input to a small
network to give final results.
• Faster R-CNN can use any CNN to extract the features of
the input.
○ InceptionNet [Szegedy, Christian, et al. "Going deeper with convolutions." CVPR. 2015.]
○ VGG [Simonyan and Zisserman. "Very deep convolutional networks for large-scale image recognition." 2014.]
○ ResNet [He, Kaiming, et al. "Deep residual learning for image recognition." CVPR. 2016.]
• It runs at 10 frames per second (fps) on an NVIDIA Titan X GPU
Ren, Shaoqing, et al. "Faster R-CNN: Towards real-time
object detection with region proposal networks." Advances
in neural information processing systems. 2015.
11. Single Shot Detector (SSD)
• As mentioned, Faster R-CNN works in two stages. The Single Shot MultiBox
Detector (SSD) instead aims to be an end-to-end object detection network.
• Taking the image as input, SSD learns to output a map of regions with their respective
classes. It performs object detection without separate proposal and recognition networks,
i.e. in just one shot.
• The SSD method can process videos at 40-58 fps on an NVIDIA Titan X GPU
• The first layers of SSD can be based on well-established neural networks such as
VGG, MobileNet and Inception.
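Processing time is the second metric compared in this work. The sketch below shows one way to measure throughput for any detector exposed as a Python callable, and to convert a frame rate into per-frame latency. The names `measure_fps` and `fps_to_ms` and the dummy detector are hypothetical; the slides do not describe how timings were taken.

```python
import time


def measure_fps(detector, frames, warmup=2):
    """Average frames per second of `detector` over `frames`."""
    # Warm-up calls are excluded so one-off start-up costs do not skew the result.
    for frame in frames[:warmup]:
        detector(frame)
    start = time.perf_counter()
    for frame in frames:
        detector(frame)
    elapsed = time.perf_counter() - start
    return len(frames) / elapsed if elapsed > 0 else float("inf")


def fps_to_ms(fps):
    """Per-frame latency in milliseconds for a given frame rate."""
    return 1000.0 / fps


def dummy_detector(frame):
    """Stand-in for a real network: returns an empty list of boxes."""
    return []


print(measure_fps(dummy_detector, list(range(100))))
print(fps_to_ms(6.4), fps_to_ms(53.4))
```

Using the frame rates reported in the abstract, 6.4 fps corresponds to about 156 ms per frame and 53.4 fps to about 19 ms per frame.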
Liu, Wei, et al. "SSD: Single shot multibox detector." European conference on
computer vision. Springer, Cham, 2016.
12. Experimental Method
• These four networks were evaluated in the polarised IR context:
○ SSD MobileNet
○ SSD InceptionV2
○ Faster R-CNN ResNet-50
○ Faster R-CNN ResNet-101
• The input was represented in two different ways: I, Q, U and I, P, ɸ
Figure panels: I, P, ɸ and I, Q, U
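Standard detection backbones expect three-channel 8-bit images, so each representation has to be mapped onto three channels. The sketch below shows one possible per-pixel scaling. The value ranges, the `i_max` parameter and the function names are assumptions made for illustration; the slides do not say how the channels were normalised.

```python
import math


def scale_to_byte(value, lo, hi):
    """Linearly map value from [lo, hi] to an integer in [0, 255], with clipping."""
    if hi <= lo:
        raise ValueError("hi must be greater than lo")
    clipped = min(max(value, lo), hi)
    return round(255 * (clipped - lo) / (hi - lo))


def ipphi_pixel(i, p, phi, i_max):
    """Pack intensity, degree and angle of polarisation into three channels.

    Assumes P lies in [0, 1] and phi in [-pi/2, pi/2] radians.
    """
    return (
        scale_to_byte(i, 0.0, i_max),
        scale_to_byte(p, 0.0, 1.0),
        scale_to_byte(phi, -math.pi / 2, math.pi / 2),
    )


def iqu_pixel(i, q, u, i_max):
    """Pack I, Q, U into three channels; Q and U are signed, so zero maps to mid-grey."""
    return (
        scale_to_byte(i, 0.0, i_max),
        scale_to_byte(q, -i_max, i_max),
        scale_to_byte(u, -i_max, i_max),
    )


print(ipphi_pixel(0.8, 0.25, math.pi / 4, i_max=1.0))
print(iqu_pixel(0.8, 0.2, -0.1, i_max=1.0))
```

A fixed `i_max` keeps the scaling consistent across frames; per-image normalisation is an alternative with different trade-offs.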
13. Dataset
• The dataset was collected by Thales in Glasgow in March, 2013.
• 10659 images for training
• 4553 images for testing
• Training and test data were captured on different days
• All regions with cars were annotated
Example of annotations
17. Conclusions
• Deep neural networks are shown to find robust patterns in the polarised signature.
They perform better than the earlier feature extraction approach developed
by Dickson et al. [1].
• Faster R-CNN achieved the best results using ResNet-101, showing that the feature extraction
task is crucial for good classification. However, it is the slowest of the compared
methods.
• Polarised Infrared Images generate strong signatures from vehicles, especially from the
metallic parts and the window. This is true in day or night conditions; for LWIR the
signature is mainly emissive.
• I, P, ɸ proved to be a better representation for vehicle detection than I alone,
showing that polarisation provides a good signature for vehicles.
[1] Dickson, C. N., Wallace, A. M., Kitchin, M., Connor, B. "Improving infrared vehicle detection with polarisation." Intelligent Signal Processing Conference 2013 (ISP 2013), IET, pp. 1-6.