Perspective Multiscale Detection and Tracking of Persons

Marcos Nieto, Juan Diego Ortega, Andoni Cortés, and Seán Gaines

MMM 2014 – The 20th Anniversary International Conference on
Multimedia Modeling, Dublin (Ireland), 6,7,8-10th January 2014

Outline

1. Motivation
2. Perspective calibration
3. Approach
4. Results
5. Conclusions

Outline

1. Motivation
   1. Object detection in images
   2. Real-time application
   3. Contextual information
2. Perspective calibration
3. Approach
4. Results
5. Conclusions

Motivation

• Object detection in images
  – Multiscale detection
    – Sliding window
    – Spans position & size
    – Bounding boxes
  – Detection-by-classification
    – Supervised learning
    – Feature extraction
    – Binary or multiclass

[Figure: classification examples labeled "Close" and "Open"]

Motivation

• Real-time applications
  – Multiscale detection
    – Kind of brute-force
    – Too many evaluations
    – Some are absurd given the context
  – Parameters
    – Initial (smallest) size
    – Number of scales
    – Factor between scales
    – Offset (stride)
    – …

[Figure: number of evaluations (y-axis "Num. Evaluations", in "Th.", 0 to 100) against number of levels (x-axis "Levels", 1 to 61), one curve per scale factor: 1.02, 1.05, 1.1]

Therefore, some knowledge about the scene must be provided
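
To make the cost of brute-force multiscale concrete, the sketch below counts the window evaluations of a conventional image-pyramid search from the parameters listed above. It is an illustration only: the function name `count_windows`, the pyramid convention (the image shrinks by the scale factor at each level while the window size stays fixed) and the example sizes are assumptions, not taken from the slides.

```python
def count_windows(image_wh, window_wh, scale_factor, levels, stride):
    """Count sliding-window evaluations over an image pyramid.

    Assumed convention: at level i the image is shrunk by scale_factor**i
    and a fixed-size window is moved over it with the given stride.
    """
    img_w, img_h = image_wh
    win_w, win_h = window_wh
    total = 0
    for level in range(levels):
        shrink = scale_factor ** level
        lvl_w, lvl_h = int(img_w / shrink), int(img_h / shrink)
        if lvl_w < win_w or lvl_h < win_h:
            break  # the window no longer fits in the shrunken image
        n_cols = (lvl_w - win_w) // stride + 1
        n_rows = (lvl_h - win_h) // stride + 1
        total += n_cols * n_rows
    return total


if __name__ == "__main__":
    # Hypothetical setup: 640x480 image, 64x128 window, stride of 8 pixels.
    for factor in (1.02, 1.05, 1.1):
        print(factor, count_windows((640, 480), (64, 128), factor, 20, 8))
```

For a fixed number of levels, a factor closer to 1 shrinks the image more slowly, so more windows fit at every level and the total grows.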

Motivation

• Contextual information
  – Color, motion, depth
    – Low generality
    – Particular to each application
  – Perspective of the scene
    – High generality
    – Allows the multiscale technique to be kept
    – Applicable in real-time
    – Two assumptions
      – There is a dominant ground plane (surveillance, ADAS)
      – Objects lie on the plane, and their 3D size is approximately known (vehicles, persons)

Outline

1. Motivation
2. Perspective calibration
   1. Plane view calibration
   2. GUI
   3. Projection of objects
3. Approach
4. Results
5. Conclusions


• Plane view calibration
  – Extrinsics from homography
    – Rotation and translation of the camera
  – Homography calculation
    – 4 points
    – 2 metric references
  – 1-DoF camera model
    – Focal length from homography
    – Refinement using Levenberg-Marquardt
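
The two homography steps can be sketched as follows: a direct linear transform (DLT) estimate of the plane-to-image homography from four point pairs, then a decomposition into rotation and translation. This is a minimal sketch under stated assumptions, not the authors' implementation: the intrinsic matrix `K` is taken as already known (the focal-length estimation and the Levenberg-Marquardt refinement are not shown), and the function names are hypothetical.

```python
import numpy as np


def homography_from_points(plane_pts, image_pts):
    """Estimate H (3x3) such that image ~ H @ plane, from >= 4 point pairs (DLT)."""
    plane_pts = np.asarray(plane_pts, dtype=float)
    image_pts = np.asarray(image_pts, dtype=float)
    if len(plane_pts) < 4 or len(plane_pts) != len(image_pts):
        raise ValueError("need at least 4 matching point pairs")
    rows = []
    for (x, y), (u, v) in zip(plane_pts, image_pts):
        rows.append([-x, -y, -1, 0, 0, 0, u * x, u * y, u])
        rows.append([0, 0, 0, -x, -y, -1, v * x, v * y, v])
    # The homography is the null vector of the stacked constraint matrix.
    _, _, vt = np.linalg.svd(np.array(rows))
    H = vt[-1].reshape(3, 3)
    return H / H[2, 2]  # assumes H[2, 2] != 0 (plane origin not at infinity)


def pose_from_homography(H, K):
    """Recover R, t from H = K [r1 r2 t] (up to scale) for the plane Z = 0."""
    B = np.linalg.inv(K) @ H
    scale = 1.0 / np.linalg.norm(B[:, 0])
    r1, r2, t = scale * B[:, 0], scale * B[:, 1], scale * B[:, 2]
    R = np.column_stack([r1, r2, np.cross(r1, r2)])
    # Project onto the nearest orthonormal matrix to absorb noise.
    U, _, Vt = np.linalg.svd(R)
    return U @ Vt, t
```

Normalizing by `H[2, 2]` fixes the sign of the solution only when the plane origin lies in front of the camera; noisy inputs would normally call for point normalization before the DLT.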



• GUI
  – Useful to calibrate videos
  – Quick (2-5 minutes)
  – Also lens distortion correction


• Projection of objects
Farthest size of object

Closest size of object
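
Once the camera is calibrated, the image size of an object at a given ground position follows from projecting its foot and head points. The sketch below illustrates this under assumptions that are not in the slides: the ground plane is Z = 0 with Z pointing up, the projection matrix is P = K [R | t], and the function name `projected_height_px` is hypothetical.

```python
import numpy as np


def projected_height_px(K, R, t, ground_xy, height):
    """Pixel height of a vertical segment standing on the plane Z = 0.

    ground_xy and height share the world unit used for t.
    """
    P = K @ np.column_stack([R, t])
    x, y = ground_xy
    foot = P @ np.array([x, y, 0.0, 1.0])
    head = P @ np.array([x, y, height, 1.0])
    if foot[2] <= 0 or head[2] <= 0:
        raise ValueError("object is behind the camera")
    foot_px = foot[:2] / foot[2]
    head_px = head[:2] / head[2]
    return float(np.linalg.norm(head_px - foot_px))
```

Evaluating this at the farthest and closest ground positions of interest gives the two object sizes that bound the search.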


Outline

1. Motivation
2. Perspective calibration
3. Approach
   1. Overview
   2. Perspective Multiscale
   3. Perspective Grid
4. Results
5. Conclusions

Approach

• Define the perspective of the scene
  – Camera calibration
  – Intrinsic parameters
  – Homography
  – Camera pose
  – Extrinsic parameters calibration
• Define the 3D size of the object to search
  – Persons: 1700 x 500 x 500
  – Car: 1500 x 1700 x 3500
• A) Calculate the best parameters for multiscale
• B) Define a fixed grid of positions in the plane

Approach

• A) Perspective multiscale
  – Rescale the original image so that the model size fits the farthest object
  – Compute the scale factor so that the model size coincides with the closest object at the smallest image
  – Filter out invalid positions
  – Focused effort: fewer levels are required
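
One way to read the steps above is as a geometric progression of image scales between the farthest and closest projected object sizes. The sketch below follows that reading; the function name `perspective_pyramid`, the choice of a constant factor between levels and the example numbers are assumptions made for illustration.

```python
def perspective_pyramid(model_h, far_h, near_h, levels):
    """Image scales that sweep the detector model from the farthest to the
    closest expected object size.

    model_h: detector model height in pixels
    far_h, near_h: projected object heights (pixels) at the farthest and
        closest ground positions of interest
    Returns (scales, factor): scales[i] is the resize ratio applied to the
    original image at level i, and factor is the ratio between levels.
    """
    if levels < 2:
        raise ValueError("at least two levels are needed")
    if not 0 < far_h <= near_h:
        raise ValueError("expected 0 < far_h <= near_h")
    first = model_h / far_h  # the farthest object matches the model
    factor = (near_h / far_h) ** (1.0 / (levels - 1))
    scales = [first / factor ** i for i in range(levels)]
    return scales, factor


if __name__ == "__main__":
    # Hypothetical numbers: 128 px model, objects between 64 px and 256 px.
    print(perspective_pyramid(128, 64, 256, 3))
```

The last scale equals `model_h / near_h`, so the smallest image is the one in which the closest object matches the model.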


Approach

• It is still necessary to filter out invalid position-size combinations
• The advantage of this approach is that traditional multiscale implementations can still be used with far fewer levels
• Focused effort: fewer levels are required (typically 3 to 5)

[Figure: number of evaluations (y-axis "Num. Evaluations", in "Th.", 0 to 100) against number of levels (x-axis "Levels", 1 to 61), one curve per scale factor: 1.02, 1.05, 1.1]

Approach

• B) Grid of fixed positions
  – Predefine feasible locations of objects
  – No need to filter
  – Cannot be used in multiscale implementations: one evaluation per candidate
  – Much more focused effort

[Figure: "Projected boxes" and the resulting "Bounding boxes"]
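
A grid of fixed positions can be sketched by projecting a 3D box at each ground cell and keeping the image bounding boxes that fall entirely inside the frame. This is an illustrative sketch, not the authors' code: the name `grid_candidates`, the (width, depth, height) ordering of the object size, the Z-up ground plane and the rule of discarding partially visible boxes are all assumptions.

```python
import numpy as np


def grid_candidates(P, xs, ys, size_wdh, image_wh):
    """Image bounding boxes (u0, v0, u1, v1) of a 3D box placed on each
    ground-plane grid cell, keeping only boxes fully inside the image.

    P: 3x4 projection matrix K [R | t]; the ground plane is Z = 0.
    size_wdh: object (width, depth, height) in world units.
    """
    w, d, h = size_wdh
    img_w, img_h = image_wh
    # Eight corners of the box, footprint centred on the origin.
    corners = np.array([[sx * w / 2, sy * d / 2, z, 1.0]
                        for sx in (-1, 1) for sy in (-1, 1) for z in (0.0, h)])
    boxes = []
    for x in xs:
        for y in ys:
            pts = corners + np.array([x, y, 0.0, 0.0])
            proj = pts @ P.T
            if np.any(proj[:, 2] <= 0):
                continue  # some corner lies behind the camera
            uv = proj[:, :2] / proj[:, 2:3]
            u0, v0 = uv.min(axis=0)
            u1, v1 = uv.max(axis=0)
            if u0 >= 0 and v0 >= 0 and u1 < img_w and v1 < img_h:
                boxes.append((float(u0), float(v0), float(u1), float(v1)))
    return boxes
```

Because the grid is computed once from the calibration, each returned box is evaluated exactly once per frame, with no filtering step at detection time.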

Outline

1. Motivation
2. Perspective calibration
3. Approach
4. Results
   1. Case study: person detection
   2. Case study: vehicle detection
5. Conclusions

Results

• Case study: Person detection
  – Full-body and Head & Shoulder SVM-HOG detector
  – Linear multiobject tracking
  – Active Vision Group dataset (1920x1080, 4500 frames, 71460 persons labeled)

Results

• Performance
  – Reduction from 144880 to 46226 (68%) for similar performance
  – Using 3 levels is enough because the perspective effect is soft

[Figure: precision vs. recall plot (x-axis "Recall", y-axis "Precision") for multiscale with L = 3, 5, 7; series FB, FBUB, FBUB* and DAF. Annotations: "Filtering: fewer FP but also some missed detections"; "Tracking: fewer FN"]

Results

• Case study: Vehicle detection
  – Vehicle detection application for an embedded vision system
  – The road can be assumed to be planar at short distances
  – 2-minute ground-truth sequence
  – Grid of fixed positions

Results

• Case study: Vehicle detection
– Detections are sparse and noisy
– Tracking is still necessary


Results

• 1000x fewer evaluations
• 7x speed-up on PC
• Same TP
• 5 times fewer FP

Results

Type            Processor       RAM      CPU       OS                        Language
PC              Intel Core i5   8 GB     3.0 GHz   Windows 7, Ubuntu 12.04   C++
Embedded HW 1   ARM Cortex      512 MB   800 MHz   Xilinx Zynq Linux         C++

[Figure: speed scale from "Slow" to "Fast" comparing brute-force multiscale and perspective multiscale. Timings shown: 2 - 10 ms in PC; 30 - 40 ms in ARM Cortex; 11 - 40 ms in PC; 25 fps real-time]

Conclusions
• Perspective is contextual information available in many situations
• Assumptions: dominant ground plane and known object size
• Its computation is easy (K, R, t) using homographies
• It can be used for object detection to focus the computational effort
• Two ways of applying it:
  – A) Perspective multiscale: wrapping the multiscale function (~60% reduction in a typical surveillance scene)
  – B) Grid of fixed positions: for an even greater reduction in complexity (x7 speed-up in low-perspective scenes such as onboard vehicle detection)

Thank You!
Dr. Marcos Nieto
Researcher
mnieto@vicomtech.org


[Figure: "Offline process" and "Online process"]
