Outline
University
Of Cagliari
Department of Electrical
and Electronic Engineering
• General context
Intelligent Video-Surveillance, and in particular
– Person Re-identification
– Appearance-based People Search
• A framework for constructing descriptors of people
– dissimilarity-based representations and their advantages
– the Multiple Component Dissimilarity (MCD) framework
• MCD and person re-identification
• MCD and people search
• Discussion and conclusions
Intelligent Video Surveillance
Machine Learning
Biometrics and pattern recognition
Novel sensor technologies
Useful tools for operators and forensic investigators
• person identification
• on-line tracking of persons and objects
• detection of events of interest
• detection of suspicious actions
• summarisation of long video footage
…
Person re-identification
Person re-identification is the ability to determine whether an individual has already been observed over a network of video-surveillance cameras
Scenarios
- on-line (e.g. people tracking among different cameras)
- off-line (e.g. retrieve all the frames showing an individual of interest)
Person re-identification
Face recognition cannot be used
- bad quality images (low resolution, blur, …)
- unconstrained pose
Other cues must be used
- clothing appearance (easy to extract, good uniqueness in limited time spans)
- other ones (e.g. gait) are impractical in real-world scenarios
Clothing appearance descriptors
Blob detection and tracking → BG/FG segmentation → Descriptor computation
Descriptor = body part subdivision + appearance features
Each body part is automatically detected and described separately by e.g.
- colour (e.g., histograms)
- texture (e.g., DCT, LBP)
- local/global features
Appearance-based people search
Clothing appearance descriptors can enable another useful
task, appearance-based people search (a novelty in the literature)
Retrieve images of people via a query expressed as a high-level description of the clothes (e.g. “people with red shirt and blue trousers”), instead of as an image
Dissimilarity representations
An alternative way [1] to represent objects in pattern recognition, useful when
- it is unclear how to choose features
- it is difficult to find a good feature set
feature-based representation: Object → feature extraction → feature vector [ x1 x2 … xn ]
dissimilarity-based representation: Object → dissimilarities computation w.r.t. prototypes P1 P2 … Pn → dissimilarity vector [ d1 d2 … dn ]
[1] Pekalska and Duin. The Dissimilarity Representation for Pattern Recognition: Foundations and Applications. World Scientific Publishing, 2005
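The dissimilarity-based representation can be illustrated with a minimal sketch. It assumes that objects and prototypes are plain numeric tuples and that the Euclidean distance is the dissimilarity measure; both are choices made only for this illustration, and the function names are hypothetical.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length vectors (assumed measure)."""
    return math.sqrt(sum((ai - bi) ** 2 for ai, bi in zip(a, b)))


def dissimilarity_vector(obj, prototypes, dissimilarity=euclidean):
    """Represent an object by its dissimilarities to the prototypes P1 ... Pn."""
    return [dissimilarity(obj, p) for p in prototypes]


# Three hypothetical prototypes in a 2-D feature space.
prototypes = [(0.0, 0.0), (3.0, 4.0), (1.0, 1.0)]
d = dissimilarity_vector((0.0, 0.0), prototypes)
print(d)  # one value per prototype: [d1, d2, d3]
```

Any other dissimilarity function with the same signature could be passed in place of `euclidean`; the resulting vector always has one entry per prototype.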
The Multiple Component Dissimilarity framework
Extension of the dissimilarity-based approach to objects represented by
- multiple parts
- multiple local features (components)
Prototypes for body part #1
Prototypes for body part #2
Dissimilarity vectors (one for each body part)
Local appearance
Global appearance
The Multiple Component Dissimilarity framework
Prototype construction
From a design set of images of people
various possible approaches, e.g. clustering
Clustering-based prototype creation example (two body parts)
Design set
- body part 1: create a set of all the components of body part 1 → cluster the set → take centroids as prototypes
- body part 2: create a set of all the components of body part 2 → cluster the set → take centroids as prototypes
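The clustering-based prototype creation above could be sketched as follows. This is only an illustration: it assumes components are numeric tuples, uses a plain k-means written with the standard library as the clustering step (the slides do not name a specific algorithm), and all function names, part names and parameters are hypothetical.

```python
import random


def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b))


def centroid(points):
    """Component-wise mean of a non-empty list of vectors."""
    return tuple(sum(coords) / len(points) for coords in zip(*points))


def kmeans_centroids(components, k, iterations=20, seed=0):
    """Plain k-means (assumed clustering algorithm); returns k centroids."""
    rng = random.Random(seed)
    centroids = rng.sample(components, k)
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for comp in components:
            nearest = min(range(k),
                          key=lambda j: squared_distance(comp, centroids[j]))
            clusters[nearest].append(comp)
        # Keep the previous centroid if a cluster ends up empty.
        centroids = [centroid(c) if c else centroids[j]
                     for j, c in enumerate(clusters)]
    return centroids


def build_prototypes(design_set, k_per_part):
    """For each body part: pool all components, cluster, take the centroids."""
    prototypes = {}
    for part, k in k_per_part.items():
        pooled = [comp for person in design_set for comp in person[part]]
        prototypes[part] = kmeans_centroids(pooled, k)
    return prototypes


# Hypothetical design set: two people, two body parts, 2-D components.
design_set = [
    {"torso": [(0.0, 0.0), (0.0, 1.0)], "legs": [(5.0, 5.0)]},
    {"torso": [(10.0, 10.0), (10.0, 11.0)], "legs": [(6.0, 5.0), (20.0, 20.0)]},
]
prototypes = build_prototypes(design_set, {"torso": 2, "legs": 2})
print(prototypes)
```

Each body part gets its own prototype set, so the number of prototypes can differ from part to part.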
The Multiple Component Dissimilarity framework
MCD representations will be exploited for
- person re-identification
- appearance-based people search
Example MCD descriptor: [ d1,1 d1,2 d1,3 d1,4 d2,1 d2,2 d2,3 ]
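A descriptor with this layout (four dissimilarities for body part 1 followed by three for body part 2) could be computed along the following lines. The slides do not state how the dissimilarity between a set of components and a prototype is measured; the sketch assumes the minimum Euclidean distance between the prototype and any component of the part, which is a stand-in chosen for illustration, and all names are hypothetical.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((ai - bi) ** 2 for ai, bi in zip(a, b)))


def part_dissimilarities(components, prototypes):
    """One value per prototype: distance to the closest component (assumed)."""
    return [min(euclidean(c, p) for c in components) for p in prototypes]


def mcd_descriptor(person, prototypes, part_order):
    """Concatenate the per-part dissimilarity vectors in a fixed part order."""
    descriptor = []
    for part in part_order:
        descriptor.extend(part_dissimilarities(person[part], prototypes[part]))
    return descriptor


# Hypothetical prototypes: 4 for part 1, 3 for part 2 (1-D components).
prototypes = {
    "part1": [(0.0,), (1.0,), (2.0,), (3.0,)],
    "part2": [(0.0,), (5.0,), (10.0,)],
}
person = {"part1": [(0.5,), (2.0,)], "part2": [(4.0,)]}
descriptor = mcd_descriptor(person, prototypes, ["part1", "part2"])
print(descriptor)  # [d1,1 d1,2 d1,3 d1,4 d2,1 d2,2 d2,3]
```

The descriptor length depends only on the number of prototypes, not on how many components each person has.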
MCD and person re-identification
Person re-identification
MCD salient features for person re-identification:
- a very compact representation: descriptors are small real vectors (low storage requirements, fast matching)
- dissimilarity vectors are representation-independent: they can be used to combine different features and modalities
Applications:
1) Speed up person re-identification methods
2) Feature combination for person re-identification
3) Multimodal person re-identification
probe → matching against the template gallery → ranked list of templates (w.r.t. the degree of similarity), e.g. 0.03 0.28 0.33 0.34
MCD-based matching
A novel weighted Euclidean distance for dissimilarity spaces
RATIONALE:
- each dissimilarity is a degree of relevance of the corresponding prototype;
- lower dissimilarity values carry more information; in fact, they encode the most relevant characteristics of the sample.
Weights: where (xi, yi in the range [0,1])
The weighting rule f() is a monotonically increasing
function; its choice governs the difference between
relevant and non-relevant prototypes
x and y: dissimilarity vectors;
W such that
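The exact weight formula is not recoverable from the text above, so the following is only a sketch of one weighted Euclidean distance that is consistent with the stated rationale. It assumes that the relevance of prototype i is 1 - min(xi, yi), that the weight is f(relevance) with f monotonically increasing (identity by default), and that W normalises the weights to sum to 1. All three are assumptions made for illustration, not the formula from the slides.

```python
import math


def weighted_distance(x, y, f=lambda r: r):
    """Weighted Euclidean distance between dissimilarity vectors in [0, 1].

    Assumed weighting: w_i = f(1 - min(x_i, y_i)) / W, with W = sum of the
    unnormalised weights, so that prototypes with low dissimilarity to
    either sample count more.
    """
    if len(x) != len(y):
        raise ValueError("x and y must have the same length")
    raw = [f(1.0 - min(xi, yi)) for xi, yi in zip(x, y)]
    total = sum(raw)
    if total == 0.0:
        # No prototype is considered relevant: fall back to uniform weights.
        raw = [1.0] * len(x)
        total = float(len(x))
    return math.sqrt(sum((w / total) * (xi - yi) ** 2
                         for w, xi, yi in zip(raw, x, y)))


def rank_gallery(probe, gallery, f=lambda r: r):
    """Template identifiers sorted from most to least similar to the probe."""
    return sorted(gallery, key=lambda name: weighted_distance(probe, gallery[name], f))


# Hypothetical probe and template gallery.
probe = [0.1, 0.9, 0.5]
gallery = {"a": [0.1, 0.9, 0.5], "b": [0.8, 0.2, 0.5]}
ranking = rank_gallery(probe, gallery)
print(ranking)
```

Passing a steeper increasing function as `f` (for example `lambda r: r ** 2`) widens the gap between relevant and non-relevant prototypes, which is the role the text assigns to the weighting rule.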
MCD to speed up existing methods
MCD has been applied to an existing method, MCMimpl [2]
MCMimpl in short:
- part subdivision: torso – legs exploiting symmetry and anti-symmetry properties, discarding head
- multiple component representation: for each part, randomly taken and partly overlapping patches
Four data sets of increasing size:
i-LIDS (119 pedestrians) VIPeR-316 (316 pedestrians)
VIPeR-474 (474 pedestrians) VIPeR-632 (632 pedestrians)
[2] Satta, Fumera, Roli, Cristani, and Murino. A Multiple Component Matching Framework for Person Re-Identification. In: ICIAP, 2011
Experimental evaluation
Trade-off between accuracy and computational time
It can be shown that the overall re-identification time* in a practical search
scenario is much lower when using MCD
* sum of processing time plus the average
search time spent by the operator
Fusion of different feature sets by MCD
Prototypes in MCD are representation-independent
MCD dissimilarity vectors can be used to combine different kinds of features, either global or local
each feature set is responsible for a different subset of prototypes
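A minimal sketch of this kind of fusion is given below. It assumes two hypothetical feature sets ("colour" compared with an L1 distance, "texture" compared with a Euclidean distance), each with its own prototypes; the fused descriptor is simply the concatenation of the per-feature-set dissimilarities. Feature names, distances and values are illustrative only.

```python
import math


def l1(a, b):
    """L1 (city-block) distance, assumed here for the colour feature."""
    return sum(abs(ai - bi) for ai, bi in zip(a, b))


def euclidean(a, b):
    """Euclidean distance, assumed here for the texture feature."""
    return math.sqrt(sum((ai - bi) ** 2 for ai, bi in zip(a, b)))


def fused_descriptor(sample, feature_sets):
    """Concatenate dissimilarities to each feature set's own prototypes.

    feature_sets is a list of (feature name, prototypes, dissimilarity).
    """
    fused = []
    for name, prototypes, dissimilarity in feature_sets:
        fused.extend(dissimilarity(sample[name], p) for p in prototypes)
    return fused


feature_sets = [
    ("colour", [(0.5, 0.5), (1.0, 0.0)], l1),
    ("texture", [(1.0, 2.0, 2.0), (1.0, 2.0, 5.0), (4.0, 6.0, 2.0)], euclidean),
]
sample = {"colour": (0.5, 0.5), "texture": (1.0, 2.0, 2.0)}
fused = fused_descriptor(sample, feature_sets)
print(fused)  # 2 colour dissimilarities followed by 3 texture dissimilarities
```

Because every entry is a dissimilarity value, the original features may have different dimensionalities and types without affecting the fused vector.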
Fusion of different feature sets by MCD
This technique has been used to combine five different feature sets
• RandPatchesHSV
• RandPatchesLBP
• FCTH [3]
• EdgeHistogram [4]
• SCD [4]
exploiting a 4-body-parts subdivision
First two feature sets:
200 prototypes per feature set per body part
Last three feature sets:
100 prototypes per feature set per body part
3200 prototypes in total
[3] Chatzichristofis and Boutalis. FCTH: Fuzzy Color and Texture Histogram – a Low Level Feature for
Accurate Image Retrieval. In: WIAMIS, 2008
[4] Sikora. The MPEG-7 Visual Standard for Content Description – an Overview. IEEE Transactions on
Circuits and Systems for Video Technology, 2001
Performance of the single feature sets
i-LIDS: 119 individuals
Comparison with the state-of-the-art
Comparison with two state-of-the-art methods
- SDALF [5]
- CPS [6]
[5] Farenzena, Bazzani, Perina, Murino, and Cristani. Person Re-Identification by Symmetry-Driven
Accumulation of Local Features. In: CVPR, 2010
[6] Cheng, Cristani, Stoppa, Bazzani, and Murino. Custom Pictorial Structures for Re-Identification. In:
BMVC, 2011
Multi-modal person re-identification
• Appearance is a widely used cue for person re-identification
other cues (e.g., gait) pose constraints that limit their applicability
in real world scenarios
• However, the recent introduction of RGB-D sensors makes it
possible to extract anthropometric measures that can be
combined with appearance
Example: MS Kinect™
By processing RGB-D data, it is possible to estimate a 3D model of a person in real-time [7]
From this model, one can extract various anthropometric measures (e.g., height, arm length)
[7] Shotton, Fitzgibbon, Cook, Sharp, Finocchio, Moore, Kipman, and Blake. Real-time Pose Recognition in
Parts from Single Depth Images. In: CVPR, 2011
Multi-modal person re-identification
A proper fusion strategy must be used to combine different modalities.
Score-level fusion: Modality 1 … n → matching score for each modality → Fusion → fusion score
Feature-level fusion: Modality 1 … n → Fusion → Matching
- Performance of score-level fusion is affected by the choice of the fusion rule (e.g., mean, min); a suitable choice for re-id is not trivial
- Feature-level fusion requires homogeneous features
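The sensitivity of score-level fusion to the fusion rule can be shown with a small sketch. It assumes that each modality returns a similarity score in [0, 1] (higher means more similar) and compares the mean and min rules mentioned above; candidate names and score values are hypothetical.

```python
def fuse_scores(scores, rule="mean"):
    """Combine the per-modality matching scores of one candidate."""
    if rule == "mean":
        return sum(scores) / len(scores)
    if rule == "min":
        return min(scores)
    raise ValueError("unknown fusion rule: " + rule)


def rank_by_fused_score(candidates, rule):
    """Candidate names sorted by decreasing fused similarity score."""
    return sorted(candidates,
                  key=lambda name: fuse_scores(candidates[name], rule),
                  reverse=True)


# Hypothetical scores from two modalities for two gallery candidates.
candidates = {"A": [0.2, 0.9], "B": [0.5, 0.5]}
ranking_mean = rank_by_fused_score(candidates, "mean")
ranking_min = rank_by_fused_score(candidates, "min")
print(ranking_mean, ranking_min)
```

With these scores the two rules produce opposite rankings, which illustrates why the choice of rule is not trivial.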
Multi-modal person re-identification
MCD provides a way to combine non-homogeneous modalities at feature level, by exploiting its representation independence
Multi-modal person re-identification
This MCD-based approach has been used to combine appearance with anthropometry
Appearance:
two descriptors, MCMimpl v2 and SDALF
Anthropometry:
three measures from the skeleton:
- normalised height
- normalised average arm length
- normalised average leg length
Experimental evaluation
Experiments have been carried out on a novel dataset acquired using Kinect cameras, Kinect4REID
- video sequences of 80 individuals taken at different locations
- different lighting conditions and viewpoints
- 2 to 7 different video sequences per person
- many persons are carrying bags or accessories
MCD for people search
Implementation by MCD: high-level concepts that describe certain clothing
characteristics (e.g., “red shirt”) may be encoded by one or more visual
prototypes, according to the low-level features and part subdivision used
Prototypes (rectangular patches) extracted from a set of
24 people (upper body part)
Correlation with the presence of the concept “red shirt”
MCD for people search
How to implement people search
(i) define a set of basic queries
(ii) construct a detector for each basic query, using dissimilarity values as input
Complex queries can be built by connecting basic ones through Boolean operators, e.g., “red shirt AND (blue trousers OR black trousers)”
[ d1 d2 … dn ] → Detector → SCORE
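The composition of basic queries could be sketched as follows. The sketch assumes that each detector is a linear function of the dissimilarity vector and that a basic query is satisfied when the detector score exceeds a threshold; the weights, biases, threshold and helper names are hypothetical and chosen only to make the example readable.

```python
def linear_detector(weights, bias):
    """Detector mapping a dissimilarity vector [d1 ... dn] to a score."""
    def score(d):
        return sum(w * di for w, di in zip(weights, d)) + bias
    return score


def basic_query(detector, threshold=0.0):
    """Basic query: true when the detector score exceeds the threshold."""
    return lambda d: detector(d) > threshold


def q_and(*queries):
    return lambda d: all(q(d) for q in queries)


def q_or(*queries):
    return lambda d: any(q(d) for q in queries)


# Hypothetical detectors: each fires when the dissimilarity to one
# concept-related prototype is below 0.5.
red_shirt = basic_query(linear_detector([-1.0, 0.0, 0.0], 0.5))
blue_trousers = basic_query(linear_detector([0.0, -1.0, 0.0], 0.5))
black_trousers = basic_query(linear_detector([0.0, 0.0, -1.0], 0.5))

# "red shirt AND (blue trousers OR black trousers)"
query = q_and(red_shirt, q_or(blue_trousers, black_trousers))
print(query([0.1, 0.9, 0.2]))
```

Combining thresholded decisions is only one option; combining the raw scores before thresholding would be an alternative design.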
Experimental evaluation
Dataset
a subset of 512 images taken from the VIPeR data-set, tagged with respect to 14 different basic queries
Examples:
Three descriptors:
i) MCMimpl
ii) SDALF
iii) MCMimpl-PS, which uses a pictorial structure [8] to subdivide the body into nine parts
body subdivision, MCMimpl and SDALF
body subdivision, MCMimpl-PS
[8] Andriluka, Roth, and Schiele. Pictorial Structures Revisited: People Detection and Articulated Pose Estimation. In: CVPR, 2009
Experimental evaluation
For each basic query:
(i) the VIPeR-Tagged data set is subdivided into training and testing sets of equal size
(ii) a linear SVM is trained on training images to implement a detector
(iii) the P-R curve is evaluated on testing images, by varying the SVM decision threshold
This procedure is repeated ten times
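Step (iii) and the break-even point can be illustrated with a short sketch. It assumes the detector scores and binary ground-truth tags of the testing images are already available, sweeps the decision threshold over the observed scores, and reports the operating point where precision and recall are closest; the scores and labels are made up for the example.

```python
def precision_recall(scores, labels, threshold):
    """Precision and recall when images with score >= threshold are retrieved."""
    retrieved = [s >= threshold for s in scores]
    true_pos = sum(1 for r, l in zip(retrieved, labels) if r and l)
    n_retrieved = sum(retrieved)
    n_relevant = sum(labels)
    precision = true_pos / n_retrieved if n_retrieved else 1.0
    recall = true_pos / n_relevant if n_relevant else 0.0
    return precision, recall


def break_even_point(scores, labels):
    """Value of the P-R curve where precision and recall are closest."""
    best_gap, best_value = None, None
    for threshold in sorted(set(scores), reverse=True):
        p, r = precision_recall(scores, labels, threshold)
        gap = abs(p - r)
        if best_gap is None or gap < best_gap:
            best_gap, best_value = gap, (p + r) / 2
    return best_value


# Hypothetical detector scores and tags (1 = concept present) for one query.
scores = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4]
labels = [1, 1, 0, 1, 0, 0]
bep = break_even_point(scores, labels)
print(bep)
```

In the protocol above this value would be computed once per repetition and per basic query, then averaged over the ten repetitions.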
Break-even points for all classes:
Conclusions
What has been done
(i) MCD, a novel dissimilarity-based framework for describing individuals
(ii) an approach based on MCD to speed up any existing person re-identification method
(iii) a state-of-the-art re-identification method that combines different features obtained through the use of MCD
(iv) a method to perform multi-modal person re-identification based on MCD and using RGB-D cameras, and a novel data set to assess performance of multi-modal re-identification systems
(v) a method that uses MCD to perform the novel task of “appearance-based people search”
Conclusions
What to do next (long list…!)
THE FRAMEWORK
(i) explore the commonalities between MCD and Visual Words and Fisher Vectors
(ii) extend MCD to other domains
MULTIMODAL RE-ID
(i) explore the use of other cues (other anthropometries, skeleton-based gait…)
(ii) extend the approach to support missing cues
PEOPLE SEARCH
(i) address the problem of ambiguity of concepts
(ii) add semantic interpretation (Natural Language Processing) to support queries in natural language