Distance Metric Learning for
Set-based Visual Recognition
Ruiping Wang
Institute of Computing Technology (ICT),
Chinese Academy of Sciences (CAS)
June 7, 2015
CVPR2015 Tutorial on “Distance Metric Learning for Visual Recognition” Part-4
Outline
 Background
 Literature review
 Evaluations
 Summary
Face Recognition with Single Image
 Identification: Who is this celebrity?
 Typical applications
 Photo matching (1:N)
 Watch list screening (1:N+1)
 Performance metric
 Recognition rate FR (@FAR)
 Verification: Are they the same guy?
 Typical applications
 Access control (1:1)
 E-passport (1:1)
 Performance metric
 ROC curve: FRR vs. FAR
Challenges
Pose
Illumination
Challenges
Pose
Illumination
Personal ID
Challenges
Pose
Illumination
Personal ID
Challenges
Pose
Personal ID
Illumination
Probe
Gallery
Same Person
Different Person
Challenges
 Intra-class variation vs. inter-class variation
 Distance measure should reflect semantic meaning
 Sample-based metric learning is made even harder
Pose variation / Illumination variation
D(x, y) = ?
Face Recognition with Videos
 Video surveillance
http://www.youtube.com/watch?v=M80DXI932OE
http://www.youtube.com/watch?v=RfJsGeq0xRA#t=22
Seeking missing children / Searching criminal suspects
Face Recognition with Videos
 Video shot retrieval
[Figure] Smart TV-Series Character Shots Retrieval System ("The Big Bang Theory"): retrieved shots with episode/time stamps (e.g., S01E01 10'48'', S01E06 05'22'', S01E02 04'20'', S01E02 00'21'', S01E03 08'06'', S01E05 09'23'', S01E01 22'20'', S01E04 03'42'')
Treating Video as Image Set
 A new & different problem
Unconstrained acquisition conditions
Complex appearance variations
Two phases: set modeling + set matching
New paradigm: set-based metric learning
Outline
 Background
 Literature review
 Evaluations
 Summary
Overview of previous works
From the view of set modeling:
 Statistics (G1, G2; matched by K-L divergence): [Shakhnarovich, ECCV'02], [Arandjelović, CVPR'05], [Wang, CVPR'12], [Harandi, ECCV'14], [Wang, CVPR'15]
 Linear subspace (S1, S2; matched by principal angles): [Yamaguchi, FG'98], [Kim, PAMI'07], [Hamm, ICML'08], [Huang, CVPR'15]
 Affine/Convex hull (H1, H2; matched by NN matching): [Cevikalp, CVPR'10], [Hu, CVPR'11], [Zhu, ICCV'13]
 Nonlinear manifold (M1, M2; matched by principal angles(+)): [Kim, BMVC'05], [Wang, CVPR'08], [Wang, CVPR'09], [Chen, CVPR'13], [Lu, CVPR'15]
Overview of previous works
 Set modeling
 Linear subspace / Nonlinear manifold
 Affine/Convex hull (affine subspace)
 Parametric PDFs / Statistics
 Set matching: basic distance
 Principal angles-based measure
 Nearest neighbor (NN) matching approach
 K-L divergence / SPD Riemannian metric / …
 Set matching: metric learning
 Learning in Euclidean space
 Learning on Riemannian manifold
Set model I: linear subspace
 Properties
 PCA on the set of image samples to get subspace
 Loose characterization of the set distribution region
 Principal angles-based measure discards the varying importance of different variance directions
 Methods
 MSM [FG’98]
 DCC [PAMI’07]
 GDA [ICML’08]
 GGDA [CVPR’11]
 PML [CVPR’15]
 …
[Figure] Subspaces S1, S2: no distribution boundary, no direction difference
Set model I: linear subspace
 MSM (Mutual Subspace Method) [FG’98]
 Pioneering work on image set classification
 First exploit principal angles as subspace distance
 Metric learning: N/A
[1] O. Yamaguchi, K. Fukui, and K. Maeda. Face Recognition Using Temporal Image
Sequence. IEEE FG 1998.
[Figure] Subspace method vs. mutual subspace method
Set model I: linear subspace
 DCC (Discriminant Canonical Correlations) [PAMI’07]
 Metric learning: in Euclidean space
[1] T. Kim, J. Kittler, and R. Cipolla. Discriminative Learning and Recognition of Image
Set Classes Using Canonical Correlations. IEEE T-PAMI, 2007.
Set 1: X_1, Set 2: X_2
Linear subspace represented by an orthonormal basis matrix P_i:
X_i X_i^T ≃ P_i Λ_i P_i^T
DCC
 Canonical Correlations / Principal Angles
 Canonical vectors: common variation modes between Set 1 (X_1, basis P_1) and Set 2 (X_2, basis P_2)
SVD: P_1^T P_2 = Q_12 Λ Q_21^T
Canonical vectors: U = P_1 Q_12 = [u_1, …, u_d], V = P_2 Q_21 = [v_1, …, v_d]
Λ = diag(cos θ_1, …, cos θ_d)
Canonical correlations: cos θ_i; principal angles: θ_i
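Illustrative sketch (ours, not the MSM/DCC authors' code): build an orthonormal basis for each image set by PCA, then read the canonical correlations cos θ_i off the SVD of P_1^T P_2. Function and variable names below are our own.

```python
import numpy as np

def subspace_basis(X, q):
    """Orthonormal basis (d x q) of the linear subspace of an image set
    X (d x N), taken as the top-q PCA directions."""
    Xc = X - X.mean(axis=1, keepdims=True)
    U, _, _ = np.linalg.svd(Xc, full_matrices=False)
    return U[:, :q]

def canonical_correlations(P1, P2):
    """Canonical correlations cos(theta_i) between two orthonormal bases:
    the singular values of P1^T P2 (basic MSM/DCC similarity)."""
    s = np.linalg.svd(P1.T @ P2, compute_uv=False)
    return np.clip(s, 0.0, 1.0)

# toy usage: two random image sets, 100 samples of dimension 400 each
rng = np.random.default_rng(0)
X1, X2 = rng.standard_normal((400, 100)), rng.standard_normal((400, 100))
P1, P2 = subspace_basis(X1, 10), subspace_basis(X2, 10)
print("largest canonical correlation:", canonical_correlations(P1, P2)[0])
```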
DCC
 Discriminative learning
 Linear transformation
 T: X_i → Y_i = T^T X_i
 Representation
 Y_i Y_i^T = (T^T X_i)(T^T X_i)^T ≃ (T^T P_i) Λ_i (T^T P_i)^T
 Set similarity
 F_ij = max_{Q_ij, Q_ji} tr(M_ij), where M_ij = Q_ij^T P'_i^T T T^T P'_j Q_ji
 Discriminant function
 T = arg max_T tr(T^T S_b T) / tr(T^T S_w T)
Set model I: linear subspace
 GDA [ICML’08] / GGDA [CVPR’11]
 Treat subspaces as points on Grassmann manifold
 Metric learning: on Riemannian manifold
[1] J. Hamm and D. D. Lee. Grassmann Discriminant Analysis: a Unifying View on Subspace-Based
Learning. ICML 2008.
[2] M. Harandi, C. Sanderson, S. Shirazi, B. Lovell. Graph Embedding Discriminant Analysis on
Grassmannian Manifolds for Improved Image Set Matching. IEEE CVPR 2011.
[Figure] Each subspace becomes a point on the Grassmann manifold; the kernel mapping is implicit
GDA/GGDA
 Projection metric
 d_P(Y_1, Y_2) = (Σ_i sin^2 θ_i)^{1/2} = 2^{-1/2} ||Y_1 Y_1^T − Y_2 Y_2^T||_F
 Geodesic distance (Wong, 1967; Edelman et al., 1999): d_G^2(Y_1, Y_2) = Σ_i θ_i^2
 θ_i: principal angles
GDA/GGDA
 Projection kernel
 Projection embedding (isometric)
 Ψ_P: G(m, D) → R^{D×D}, span(Y) ↦ Y Y^T
 The inner product in R^{D×D}
 tr((Y_1 Y_1^T)(Y_2 Y_2^T)) = ||Y_1^T Y_2||_F^2
 Grassmann kernel (positive definite kernel)
 k_P(Y_1, Y_2) = ||Y_1^T Y_2||_F^2
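A minimal sketch of the projection kernel (our own illustration): with orthonormal bases, k_P(Y_1, Y_2) = ||Y_1^T Y_2||_F^2, and the resulting Gram matrix can be plugged into any kernel method such as kernel LDA.

```python
import numpy as np

def projection_kernel(Y1, Y2):
    """Grassmann projection kernel k_P(Y1, Y2) = ||Y1^T Y2||_F^2
    for orthonormal basis matrices of shape (D, m)."""
    return np.linalg.norm(Y1.T @ Y2, "fro") ** 2

def gram_matrix(bases):
    """Kernel (Gram) matrix over a list of orthonormal bases."""
    n = len(bases)
    K = np.empty((n, n))
    for i in range(n):
        for j in range(i, n):
            K[i, j] = K[j, i] = projection_kernel(bases[i], bases[j])
    return K

# toy usage: three random 5-dimensional subspaces of R^50
rng = np.random.default_rng(1)
bases = [np.linalg.qr(rng.standard_normal((50, 5)))[0] for _ in range(3)]
print(gram_matrix(bases))
```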
GDA/GGDA
 Discriminative learning
 Classical kernel methods using the Grassmann kernel
 e.g., kernel LDA / kernel graph embedding
 α* = arg max_α (α^T K W K α) / (α^T K K α), where K is the Grassmann kernel matrix
Set model I: linear subspace
 PML (Projection Metric Learning) [CVPR’15]
 Metric learning: on Riemannian manifold
[1] Z. Huang, R. Wang, S. Shan, X. Chen. Projection Metric Learning on Grassmann Manifold with
Application to Video based Face Recognition. IEEE CVPR 2015.
[Figure] Subspaces Y_1, Y_2, Y_3 on G(q, D). Learning with a kernel in an RKHS H and then in R^d has no explicit mapping, suffers structure distortion, and has high complexity; PML instead learns directly on the manifold, mapping onto a lower-dimensional G(q, d).
PML
 Explicit manifold-to-manifold mapping
 f(Y) = W^T Y ∈ G(q, d), for Y ∈ G(q, D), d ≤ D
 Projection metric on the target Grassmann manifold G(q, d)
 d_p^2(f(Y_i), f(Y_j)) = 2^{-1/2} ||(W^T Y'_i)(W^T Y'_i)^T − (W^T Y'_j)(W^T Y'_j)^T||_F^2 = 2^{-1/2} tr(P^T A_ij A_ij P)
 A_ij = Y'_i Y'_i^T − Y'_j Y'_j^T; P = W W^T is a rank-d symmetric positive semidefinite (PSD) matrix of size D×D (similar in form to a Mahalanobis matrix)
 Y_i needs to be normalized to Y'_i so that the columns of W^T Y'_i are orthonormal
[Figure] f maps Y_1, Y_2, Y_3 on G(q, D) to f(Y_1), f(Y_2), f(Y_3) on G(q, d)
PML
 Discriminative learning
 Discriminant function
 Minimize/maximize the projection distances of all within-class/between-class subspace pairs:
 J = min_P  Σ_{l_i = l_j} tr(P^T A_ij A_ij P) − λ Σ_{l_i ≠ l_j} tr(P^T A_ij A_ij P)
 (first sum: within-class pairs; second sum: between-class pairs)
 Optimization algorithm
 Iterative solution: alternately solve for one of Y' and P while fixing the other
 Normalization of Y by QR decomposition
 Computation of P by the Riemannian Conjugate Gradient (RCG) algorithm on the manifold of PSD matrices
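A hedged sketch of the mapped projection metric used above (ours): project each basis by a candidate W, re-orthonormalize by QR (the normalization step), and evaluate the projection metric on G(q, d). The W below is a random placeholder, not a learned PML solution.

```python
import numpy as np

def projected_distance(Y1, Y2, W):
    """Projection metric between W^T Y1 and W^T Y2 on G(q, d), after QR
    re-orthonormalization of the projected bases."""
    Z1, _ = np.linalg.qr(W.T @ Y1)   # columns made orthonormal
    Z2, _ = np.linalg.qr(W.T @ Y2)
    return (2 ** -0.5) * np.linalg.norm(Z1 @ Z1.T - Z2 @ Z2.T, "fro")

rng = np.random.default_rng(2)
D, d, q = 100, 20, 5
Y1, _ = np.linalg.qr(rng.standard_normal((D, q)))
Y2, _ = np.linalg.qr(rng.standard_normal((D, q)))
W = rng.standard_normal((D, d))      # placeholder for the learned projection
print(projected_distance(Y1, Y2, W))
```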
Set model II: nonlinear manifold
 Properties
 Capture nonlinear complex appearance variation
 Need dense sampling and large amount of data
 Less appealing computational efficiency
 Methods
 MMD [CVPR’08]
 MDA [CVPR’09]
 BoMPA [BMVC’05]
 SANS [CVPR’13]
 MMDML [CVPR’15]
 …
Complex distribution
Large amount of data
Set model II: nonlinear manifold
 MMD (Manifold-Manifold Distance) [CVPR’08]
 Set modeling with nonlinear appearance manifold
 Image set classification as distance computation between two manifolds
 Metric learning: N/A
D(M1,M2)
[1] R. Wang, S. Shan, X. Chen, W. Gao. Manifold-Manifold Distance with Application to Face
Recognition based on Image Set. IEEE CVPR 2008. (Best Student Poster Award Runner-up)
[2] R. Wang, S. Shan, X. Chen, Q. Dai, W. Gao. Manifold-Manifold Distance and Its Application to
Face Recognition with Image Sets. IEEE Trans. Image Processing, 2012.
MMD
 Multi-level MMD framework
Three pattern levels: Point -> Subspace -> Manifold
[Figure] Distances across levels: point-to-point (PPD), point-to-subspace (PSD), subspace-to-subspace (SSD), point-to-manifold (PMD), subspace-to-manifold (SMD), manifold-to-manifold (MMD)
MMD
 Formulation & Solution
SSD between pairs of local linear models C_i ⊂ M_1, C_j ⊂ M_2
Three modules:
 Local model construction
 Local model distance
 Global integration of local distances
d(M_1, M_2) = Σ_{i=1}^{m} Σ_{j=1}^{n} f_ij d(C_i, C_j)
s.t. Σ_{i=1}^{m} Σ_{j=1}^{n} f_ij = 1, f_ij ≥ 0
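A simplified instantiation of this formulation (our own, under stated assumptions): local models come from k-means clustering plus per-cluster PCA, the local distance is the projection distance between local subspaces, and the weights f_ij put all mass on the single closest pair. The published MMD uses more careful local model construction and integration.

```python
import numpy as np
from scipy.cluster.vq import kmeans2

def local_subspaces(X, n_models=3, q=4):
    """Cluster an image set X (d x N) into local linear models and return
    one PCA basis (d x q) per non-trivial cluster."""
    _, labels = kmeans2(X.T, n_models, minit="++", seed=0)
    bases = []
    for c in range(n_models):
        Xc = X[:, labels == c]
        if Xc.shape[1] < 2:          # skip empty/degenerate clusters
            continue
        Xc = Xc - Xc.mean(axis=1, keepdims=True)
        U, _, _ = np.linalg.svd(Xc, full_matrices=False)
        bases.append(U[:, :q])
    return bases

def subspace_distance(P1, P2):
    """Projection distance between two local subspaces."""
    return (2 ** -0.5) * np.linalg.norm(P1 @ P1.T - P2 @ P2.T, "fro")

def mmd_distance(X1, X2):
    """Manifold-manifold distance as the closest pair of local models."""
    B1, B2 = local_subspaces(X1), local_subspaces(X2)
    return min(subspace_distance(p, q) for p in B1 for q in B2)
```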
MMD
 Subspace-subspace distance (SSD)
 Combines mean (cosine correlation) and variance (canonical correlations); θ_0: angle between the set means (exemplars), θ_k: principal angles between the subspaces
 d_E(S_1, S_2) = (1 − cos^2 θ_0)^{1/2} = sin θ_0
 d_P(S_1, S_2) = ((1/r) Σ_{k=1}^{r} sin^2 θ_k)^{1/2} = (1 − (1/r) Σ_{k=1}^{r} cos^2 θ_k)^{1/2}
 d(S_1, S_2) = ((1/2)(sin^2 θ_0 + (1/r) Σ_{k=1}^{r} sin^2 θ_k))^{1/2} = ((1/2)(2 − cos^2 θ_0 − (1/r) Σ_{k=1}^{r} cos^2 θ_k))^{1/2}
[Figure] Local models S_1, S_2 with their exemplars and principal vectors; same-person vs. different-person pairs
 MDA (Manifold Discriminant Analysis ) [CVPR’09]
 Goal: maximize “manifold margin” under Graph
Embedding framework
 Metric learning: in Euclidean space
 Euclidean distance between pair of image samples
[1] R. Wang, X. Chen. Manifold Discriminant Analysis. IEEE CVPR 2009.
[Figure] Intrinsic graph (within-class compactness) and penalty graph (between-class separability) built over manifolds M_1, M_2
Set model II: nonlinear manifold
MDA
 Optimization framework
 Min. within-class scatter (intrinsic graph): S_w = Σ_{m,n} w_{mn} ||v^T x_m − v^T x_n||^2 = 2 v^T X (D − W) X^T v
 Max. between-class scatter (penalty graph): S_b = Σ_{m,n} w'_{mn} ||v^T x_m − v^T x_n||^2 = 2 v^T X (D' − W') X^T v
 Objective function: maximize J(v) = S_b / S_w = (v^T X (D' − W') X^T v) / (v^T X (D − W) X^T v)
 Global linear transformation v
 Solved by generalized eigen-decomposition
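For the objective above, one common solution strategy is a generalized eigenvalue problem; a minimal sketch (ours, assuming the intrinsic and penalty affinity matrices are already built):

```python
import numpy as np
from scipy.linalg import eigh

def mda_projection(X, W_intr, W_pen, n_dims):
    """Solve max_v (v^T X (D'-W') X^T v) / (v^T X (D-W) X^T v) via a
    generalized eigen-decomposition and return the top directions.
    X: d x N data; W_intr, W_pen: N x N graph affinity matrices."""
    L_intr = np.diag(W_intr.sum(axis=1)) - W_intr        # D - W
    L_pen = np.diag(W_pen.sum(axis=1)) - W_pen           # D' - W'
    A = X @ L_pen @ X.T
    B = X @ L_intr @ X.T + 1e-6 * np.eye(X.shape[0])     # small ridge
    evals, evecs = eigh(A, B)                            # ascending order
    return evecs[:, -n_dims:]
```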
 BoMPA (Boosted Manifold Principal Angles) [BMVC’05]
 Goal: optimal fusion of different principal angles
 Exploit Adaboost to learn weights for each angle
[1] T. Kim O. Arandjelovic, R. Cipolla. Learning over Sets using Boosted Manifold Principal Angles
(BoMPA). BMVC 2005.
PAsimilar (common illumination) BoMPAdissimilar
Subject A Subject B
Set model II: nonlinear manifold
 SANS (Sparse Approximated Nearest Subspaces) [CVPR’13]
 Goal: adaptively construct the nearest subspace pair
 Metric learning: N/A
[1] S. Chen, C. Sanderson, M.T. Harandi, B.C. Lovell. Improved Image Set Classification via Joint
Sparse Approximated Nearest Subspaces. IEEE CVPR 2013.
Set model II: nonlinear manifold
SSD (subspace-subspace distance): joint sparse representation (JSR) is applied to approximate the nearest subspace pair over a Grassmann manifold.
 MMDML (Multi-Manifold Deep Metric Learning) [CVPR’15]
 Goal: maximize “manifold margin” under Deep Learning
framework
 Metric learning: in Euclidean space
Set model II: nonlinear manifold
[1] J. Lu, G. Wang, W. Deng, P. Moulin, and J. Zhou. Multi-Manifold Deep Metric Learning for Image
Set Classification. IEEE CVPR 2015.
Class-specific DNN
Model nonlinearity
Objective function
Set model III: affine subspace
 Properties
 Linear reconstruction using: mean + subspace basis
 Synthesized virtual NN-pair matching
 Less characterization of global data structure
 Computationally expensive by NN-based matching
 Methods
 AHISD/CHISD [CVPR’10]
 SANP [CVPR’11]
 PSDML/SSDML [ICCV’13]
 …
H1
H2
Sensitive to noise samples
High computation cost
Set model III: affine subspace
 AHISD/CHISD [CVPR’10], SANP [CVPR’11]
 NN-based matching using sample Euclidean distance
 Metric learning: N/A
 Subspace spanned by all the available samples D = {d_1, …, d_n} in the set
 Affine hull
 H(D) = {Dα = Σ_i α_i d_i | Σ_i α_i = 1}
 Convex hull
 H(D) = {Dα = Σ_i α_i d_i | Σ_i α_i = 1, 0 ≤ α_i ≤ 1}
[1] H. Cevikalp, B. Triggs. Face Recognition Based on Image Sets. IEEE CVPR 2010.
[2] Y. Hu, A.S. Mian, R. Owens. Sparse Approximated Nearest Points for Image Set
Classification. IEEE CVPR 2011.
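A rough sketch of the (unconstrained) affine-hull distance behind AHISD, our own simplification: represent each hull by its mean plus an orthonormal direction basis and solve the closest-point problem by least squares. CHISD additionally enforces 0 ≤ α_i ≤ 1, which needs a QP solver instead.

```python
import numpy as np

def affine_hull(X, q=10):
    """Mean and orthonormal direction basis of the affine hull of X (d x N)."""
    mu = X.mean(axis=1, keepdims=True)
    U, _, _ = np.linalg.svd(X - mu, full_matrices=False)
    return mu, U[:, :q]

def ahisd_distance(X1, X2):
    """min over v1, v2 of ||(mu1 + U1 v1) - (mu2 + U2 v2)||_2."""
    mu1, U1 = affine_hull(X1)
    mu2, U2 = affine_hull(X2)
    A = np.hstack([U1, -U2])                        # joint direction basis
    v, *_ = np.linalg.lstsq(A, (mu2 - mu1).ravel(), rcond=None)
    residual = mu1.ravel() + A @ v - mu2.ravel()
    return np.linalg.norm(residual)
```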
Set model III: affine subspace
 PSDML/SSDML [ICCV’13]
 Metric learning: in Euclidean space
[1] P. Zhu, L. Zhang, W. Zuo, and D. Zhang. From Point to Set: Extend the Learning of
Distance Metrics. IEEE ICCV 2013.
point-to-set distance metric learning (PSDML) vs. set-to-set distance metric learning (SSDML)
PSDML/SSDML
 Point-to-set distance
 Basic distance
 d^2(x, D) = min_α ||x − H(D)||_2^2 = min_α ||x − Dα||_2^2
 Solution: least-squares regression or ridge regression
 Mahalanobis distance (P is a projection matrix, M = P^T P)
 d_M^2(x, D) = min_α ||P(x − Dα)||_2^2 = (x − Dα)^T P^T P (x − Dα) = (x − Dα)^T M (x − Dα)
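A hedged sketch of this point-to-set distance under a given Mahalanobis matrix M = P^T P (ours; M defaults to the identity): the optimal coefficients α follow in closed form from ridge regression.

```python
import numpy as np

def point_to_set_distance(x, D, M=None, lam=1e-3):
    """d_M^2(x, D) = min_a (x - D a)^T M (x - D a) + lam ||a||^2,
    solved in closed form. x: (d,), D: (d, n), M: (d, d) PSD."""
    if M is None:
        M = np.eye(x.shape[0])
    A = D.T @ M @ D + lam * np.eye(D.shape[1])
    alpha = np.linalg.solve(A, D.T @ M @ x)
    r = x - D @ alpha
    return float(r @ M @ r)
```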
PSDML/SSDML
 Point-to-set distance metric learning (PSDML)
 SVM-based method, with d_M^2(x, D) = min_α ||P(x − Dα)||_2^2 = (x − Dα)^T M (x − Dα):
 min_{M, α_j, ξ^N_ij, ξ^P_i, b}  ||M||_F^2 + ν(Σ_{i,j} ξ^N_ij + Σ_i ξ^P_i)
 s.t. d_M(x_i, D_j) + b ≥ 1 − ξ^N_ij, for j ≠ l(x_i)  (negative pairs)
      d_M(x_i, D_{l(x_i)}) + b ≤ −1 + ξ^P_i  (positive pairs)
      M ≽ 0, ∀i, j: ξ^N_ij ≥ 0, ξ^P_i ≥ 0
PSDML/SSDML
 Set-to-set distance
 Basic distance
 d^2(D_1, D_2) = min_{α_1, α_2} ||H(D_1) − H(D_2)||_2^2 = min_{α_1, α_2} ||D_1 α_1 − D_2 α_2||_2^2
 Solution: AHISD/CHISD [Cevikalp, CVPR'10]
 Mahalanobis distance
 d_M^2(D_1, D_2) = min ||P(D_1 α_1 − D_2 α_2)||_2^2 = (D_1 α_1 − D_2 α_2)^T M (D_1 α_1 − D_2 α_2)
PSDML/SSDML
 Set-to-set distance metric learning (SSDML)
 SVM-based method, with d_M^2(D_1, D_2) = min ||P(D_1 α_1 − D_2 α_2)||_2^2 = (D_1 α_1 − D_2 α_2)^T M (D_1 α_1 − D_2 α_2):
 min_{M, α_i, α_j, α_k, ξ^N_ij, ξ^P_ik, b}  ||M||_F^2 + ν(Σ_{i,j} ξ^N_ij + Σ_{i,k} ξ^P_ik)
 s.t. d_M(D_i, D_j) + b ≥ 1 − ξ^N_ij, for l(D_i) ≠ l(D_j)  (negative pairs)
      d_M(D_i, D_k) + b ≤ −1 + ξ^P_ik, for l(D_i) = l(D_k)  (positive pairs)
      M ≽ 0, ∀i, j, k: ξ^N_ij ≥ 0, ξ^P_ik ≥ 0
Set model IV: statistics (COV+)
 Properties
 The natural raw statistics of a sample set
 Flexible model of multiple-order statistical information
 Methods
 CDL [CVPR’12]
 LMKML [ICCV’13]
 DARG [CVPR’15]
 SPD-ML [ECCV’14]
 LEML [ICML’15]
 LERM [CVPR’14]
 HER [CVPR’15 ]
 …
G1
G2
Natural raw statistics
More flexible model
[Shakhnarovich, ECCV’02]
[Arandjelović, CVPR’05]
 CDL (Covariance Discriminative Learning) [CVPR’12]
 Set modeling by Covariance Matrix (COV)
 The 2nd order statistics characterizing set data variations
 Robust to noisy set data, scalable to varying set size
 Metric learning: on the SPD manifold
[1] R. Wang, H. Guo, L.S. Davis, Q. Dai. Covariance Discriminative Learning: A Natural
and Efficient Approach to Image Set Classification. IEEE CVPR 2012.
[Figure] Covariance matrices C_i, C_j of two sets S_i, S_j lie on the SPD manifold and are mapped by log(·) to the tangent space T_I at the identity I. Models data variations; no assumption on the data distribution.
Set model IV: statistics (COV+)
CDL
 Set modeling by Covariance Matrix
Image set: N samples with d-dimensional image features, X = [x_1, x_2, …, x_N] ∈ R^{d×N}
COV: C = (1/(N−1)) Σ_{i=1}^{N} (x_i − x̄)(x_i − x̄)^T, a d×d symmetric positive definite (SPD) matrix*
*: use regularization to tackle the singularity problem
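A minimal sketch of this set model (our own helper): the sample covariance of the image set plus a small ridge so the matrix stays strictly positive definite, following the regularization note above.

```python
import numpy as np

def covariance_representation(X, eps=1e-3):
    """CDL-style set model: regularized sample covariance of an image set
    X (d x N). Returns a d x d SPD matrix."""
    mu = X.mean(axis=1, keepdims=True)
    C = (X - mu) @ (X - mu).T / (X.shape[1] - 1)
    return C + eps * np.trace(C) / X.shape[0] * np.eye(X.shape[0])
```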
CDL
 Set matching on the COV (SPD) manifold
 Riemannian metrics on the SPD manifold
 Affine-invariant distance (AID) [1] (high computational burden):
 d^2(C_1, C_2) = Σ_i ln^2 λ_i(C_1, C_2), or equivalently d(C_1, C_2) = ||log(C_1^{-1/2} C_2 C_1^{-1/2})||_F
 Log-Euclidean distance (LED) [2] (more efficient, more appealing):
 d(C_1, C_2) = ||log(C_1) − log(C_2)||_F
[1] W. Förstner and B. Moonen. A Metric for Covariance Matrices. Technical Report 1999.
[2] V. Arsigny, P. Fillard, X. Pennec and N. Ayache. Geometric Means In A Novel Vector Space
Structure On Symmetric Positive-Definite Matrices. SIAM J. MATRIX ANAL. APPL. Vol. 29, No. 1, pp.
328-347, 2007.
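A small sketch of the log-Euclidean distance (ours), computing the matrix logarithm of an SPD matrix through its eigendecomposition:

```python
import numpy as np

def spd_logm(C):
    """Matrix logarithm of an SPD matrix via eigendecomposition."""
    w, V = np.linalg.eigh(C)
    return (V * np.log(w)) @ V.T

def log_euclidean_distance(C1, C2):
    """LED: d(C1, C2) = ||log(C1) - log(C2)||_F."""
    return np.linalg.norm(spd_logm(C1) - spd_logm(C2), "fro")
```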
CDL
 Set matching on COV manifold (cont.)
 Explicit Riemannian kernel feature mapping with LED (a Mercer kernel):
 k(C_1, C_2) = trace(log(C_1) · log(C_2))
 φ_log: C ↦ log_I(C), mapping the Riemannian manifold M of non-singular COV matrices S_+^d into the tangent space T_I M at the identity matrix I (a Euclidean space R^{d×d})
[Figure] Points X_1, X_2, X_3 and Y_1, Y_2, Y_3 on M are mapped by log_I to x_1, x_2, x_3 and y_1, y_2, y_3 in T_I M
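Continuing the sketch above, the corresponding Gram matrix over a collection of covariance set models can be computed as follows (ours; SciPy's general matrix logarithm is used so the snippet is self-contained):

```python
import numpy as np
from scipy.linalg import logm

def log_euclidean_kernel(covs):
    """Gram matrix K[i, j] = trace(log(C_i) log(C_j)) over SPD matrices:
    the log-Euclidean kernel used in CDL."""
    logs = [logm(C).real for C in covs]
    n = len(logs)
    K = np.empty((n, n))
    for i in range(n):
        for j in range(i, n):
            K[i, j] = K[j, i] = np.trace(logs[i] @ logs[j])
    return K
```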
CDL
 Discriminative learning on COV manifold
 Partial Least Squares (PLS ) regression
 Goal: Maximize the covariance between
observations and class labels
[Figure] PLS: space X (feature vectors, e.g., vectorized log-COV matrices) and space Y (class label vectors, e.g., 1/0 indicators) are linearly projected onto a common latent space T
X = T P', Y = T B C' = X B_pls, where T is the common latent representation
CDL vs. GDA
 COV / SPD manifold (CDL)
 Model: C = (1/(N−1)) Σ_{i=1}^{N} (x_i − x̄)(x_i − x̄)^T
 Mapping: φ_log: C ↦ log_I(C), S_+^d → R^{d×d}
 Metric: d(C_1, C_2) = ||log_I(C_1) − log_I(C_2)||_F
 Kernel: k(C_1, C_2) = trace(log_I(C_1) · log_I(C_2))
 Subspace / Grassmannian (GDA)
 Model: top-m eigenvectors of C, U = [u_1, u_2, …, u_m] ∈ R^{D×m}, spanning the linear subspace
 Mapping: Φ_proj: U ↦ U U^T, G(m, D) → R^{D×D}
 Metric: d_proj(U_1, U_2) = 2^{-1/2} ||U_1 U_1^T − U_2 U_2^T||_F
 Kernel: k_P(U_1, U_2) = ||U_1^T U_2||_F^2
 LMKML (Localized Multi-Kernel Metric Learning) [ICCV’13]
 Exploring multiple order statistics
 Data-adaptive weights for different types of features
 Ignoring the geometric structure of 2nd/3rd-order statistics
 Metric learning: in Euclidean space
[1] J. Lu, G. Wang, and P. Moulin. Image Set Classification Using Holistic Multiple Order Statistics
Features and Localized Multi-Kernel Metric Learning. IEEE ICCV 2013.
Set model IV: statistics (COV+)
Complementary information
(mean vs. covariance)
1st / 2nd / 3rd-order statistics
Objective function
 DARG (Discriminant Analysis on Riemannian manifold of
Gaussian distributions) [CVPR’15]
 Set modeling by mixture of Gaussian distribution (GMM)
 Naturally encode the 1st order and 2nd order statistics
 Metric learning: on Riemannian manifold
[1] W. Wang, R. Wang, Z. Huang, S. Shan, X. Chen. Discriminant Analysis on Riemannian Manifold
of Gaussian Distributions for Face Recognition with Image Sets. IEEE CVPR 2015.
𝓜:Riemannian
manifold of Gaussian
distributions
Non-discriminative
Time-consuming
Set model IV: statistics (COV+)
[Shakhnarovich, ECCV’02]
[Arandjelović, CVPR’05]
DARG
 Framework
[Figure] (a) ℳ → (b) kernel embedding into ℋ → (c) discriminative learning into ℝ^d
ℳ: Riemannian manifold of Gaussian distributions
ℋ: high-dimensional reproducing kernel Hilbert space (RKHS)
ℝ^d: target lower-dimensional discriminant Euclidean subspace
DARG
 Kernels on the Gaussian distribution manifold
 Kernel based on the Lie group
 Distance based on the Lie group (LGD): LGD(P_i, P_j) = ||log P_i − log P_j||_F
 where a Gaussian g ~ N(x | μ, Σ) is embedded as an SPD matrix (according to information geometry):
 P = |Σ|^{−1/(d+1)} [ Σ + μμ^T, μ ; μ^T, 1 ]
 Kernel function: K_LGD(g_i, g_j) = exp(−LGD^2(P_i, P_j) / (2t^2))
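A hedged sketch of this embedding and distance (our own helpers, with the determinant power written as on the slide): each Gaussian (μ, Σ) becomes a (d+1)×(d+1) SPD matrix and LGD is the log-Euclidean distance between the embedded matrices.

```python
import numpy as np
from scipy.linalg import logm

def gaussian_to_spd(mu, Sigma):
    """Embed N(mu, Sigma) as P = |Sigma|^(-1/(d+1)) [[Sigma + mu mu^T, mu],
    [mu^T, 1]], an SPD matrix of size (d+1) x (d+1)."""
    d = mu.shape[0]
    top = np.hstack([Sigma + np.outer(mu, mu), mu[:, None]])
    bot = np.hstack([mu[None, :], np.ones((1, 1))])
    return np.linalg.det(Sigma) ** (-1.0 / (d + 1)) * np.vstack([top, bot])

def lgd(mu1, S1, mu2, S2):
    """Lie-group distance LGD = ||log P1 - log P2||_F."""
    P1, P2 = gaussian_to_spd(mu1, S1), gaussian_to_spd(mu2, S2)
    return np.linalg.norm(logm(P1).real - logm(P2).real, "fro")

def k_lgd(mu1, S1, mu2, S2, t=1.0):
    """K_LGD(g_i, g_j) = exp(-LGD^2 / (2 t^2))."""
    return np.exp(-lgd(mu1, S1, mu2, S2) ** 2 / (2 * t ** 2))
```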
DARG
 Kernels on the Gaussian distribution manifold
 Kernel based on the Lie group (above) vs. kernel based on MD and LED
 Mahalanobis distance (MD) between means: MD(μ_i, μ_j) = ( (μ_i − μ_j)^T (Σ_i^{-1} + Σ_j^{-1}) (μ_i − μ_j) )^{1/2}
 LED between covariance matrices: LED(Σ_i, Σ_j) = ||log Σ_i − log Σ_j||_F
 Kernel function: K_{MD+LED}(g_i, g_j) = γ_1 K_MD(μ_i, μ_j) + γ_2 K_LED(Σ_i, Σ_j)
 with K_MD(μ_i, μ_j) = exp(−MD^2(μ_i, μ_j) / (2t^2)), K_LED(Σ_i, Σ_j) = exp(−LED^2(Σ_i, Σ_j) / (2t^2)); γ_1, γ_2 are the combination coefficients
DARG
 Discriminative learning
Weighted KDA (kernel discriminant analysis)
 incorporating the weights of Gaussian components
J(α) = (α^T B α) / (α^T W α)
W = Σ_{i=1}^{C} (1/ω_i) Σ_{j=1}^{N_i} w_j^{(i)} (k_j^{(i)} − m_i)(k_j^{(i)} − m_i)^T
B = Σ_{i=1}^{C} N_i (m_i − m)(m_i − m)^T
m_i = (1/(N_i ω_i)) Σ_{j=1}^{N_i} w_j^{(i)} k_j^{(i)},  m = (1/Σ_i N_i) Σ_{i=1}^{C} (1/ω_i) Σ_{j=1}^{N_i} w_j^{(i)} k_j^{(i)}
(w_j^{(i)}: weight of the j-th Gaussian component of class i; k_j^{(i)}: its kernel vector)
Set model IV: statistics (COV+)
 Properties
 The natural raw statistics of a sample set
 Flexible model of multiple-order statistical information
 Methods
 CDL [CVPR’12]
 LMKML [ICCV’13]
 DARG [CVPR’15]
 SPD-ML [ECCV’14]
 LEML [ICML’15]
 LERM [CVPR’14]
 HER [CVPR’15 ]
 …
G1
G2
Natural raw statistics
More flexible model
[Shakhnarovich, ECCV’02]
[Arandjelović, CVPR’05]
 SPD-ML (SPD Manifold Learning) [ECCV’14]
 Pioneering work on explicit manifold-to-manifold
dimensionality reduction
 Metric learning: on Riemannian manifold
Set model IV: statistics (COV+)
[1] M. Harandi, M. Salzmann, R. Hartley. From Manifold to Manifold: Geometry-Aware
Dimensionality Reduction for SPD Matrices. ECCV 2014.
Structure distortion
No explicit mapping
SPD-ML
 SPD manifold dimensionality reduction
 Mapping function f: S_++^n × R^{n×m} → S_++^m
 f(X, W) = W^T X W ∈ S_++^m (≻ 0), for X ∈ S_++^n and full-rank W ∈ R^{n×m}
 Affine invariant metrics (AIRM / Stein divergence) on the target SPD manifold S_++^m
 Any full-rank W decomposes as W = W̃ M with M ∈ GL(m) and W̃^T W̃ = I_m, and by affine invariance δ^2(W^T X_i W, W^T X_j W) = δ^2(W̃^T X_i W̃, W̃^T X_j W̃), so W can be restricted to orthonormal matrices
SPD-ML
 Discriminative learning
 Discriminant function
 Graph embedding formalism with an affinity matrix A that encodes intra-class and inter-class SPD distances:
 min_W L(W) = min_W Σ_{ij} A_ij δ^2(W^T X_i W, W^T X_j W)
 s.t. W^T W = I_m (orthogonality constraint)
 Optimization
 An optimization problem on the Stiefel manifold, solved by the nonlinear Conjugate Gradient (CG) method
 LEML (Log-Euclidean Metric Learning) [ICML’15]
 Learning tangent map by preserving matrix symmetric
structure
 Metric learning: on Riemannian manifold
Set model IV: statistics (COV+)
[1] Z. Huang, R. Wang, S. Shan, X. Li, X. Chen. Log-Euclidean Metric Learning on Symmetric
Positive Definite Manifold with Application to Image Set Classification. ICML 2015.
[Figure] Pipeline comparison with CDL [CVPR'12]
LEML
 SPD tangent map learning
 Mapping function: D_F: f(log(S)) = W^T log(S) W, where W has full column rank (analogy to 2DPCA)
 Log-Euclidean distance in the target tangent space, with T = log(S) and Q = (W W^T)^2:
 d_LED(f(T_i), f(T_j)) = ||W^T T_i W − W^T T_j W||_F, measured as tr(Q (T_i − T_j)(T_i − T_j))
[Figure] Tangent map D_F(S): T_S S_+^d → T_{F(S)} S_+^k of the mapping F between SPD manifolds, taking a tangent vector ξ_S along a curve γ(t) to D_F(S)[ξ_S] along F(γ(t))
LEML
 Discriminative learning
 Objective function
 arg min_{Q, ξ}  D_ld(Q, Q_0) + η D_ld(ξ, ξ_0)
 s.t. tr(Q A_ij^T A_ij) ≤ ξ_{c(i,j)}, (i, j) ∈ S (similar pairs)
      tr(Q A_ij^T A_ij) ≥ ξ_{c(i,j)}, (i, j) ∈ D (dissimilar pairs)
 A_ij = log(C_i) − log(C_j); D_ld: LogDet divergence
 Optimization
 Cyclic Bregman projection algorithm [Bregman, 1967]
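Once a PSD matrix Q has been learned by the Bregman projections (not shown), evaluating the resulting metric on a new pair of covariance set models needs only matrix logarithms and a trace; a minimal sketch with our own helper names, where Q is any PSD matrix of matching size:

```python
import numpy as np
from scipy.linalg import logm

def leml_style_distance(C1, C2, Q):
    """d_Q(C1, C2) = tr(Q A^T A) with A = log(C1) - log(C2);
    Q = I recovers the squared log-Euclidean distance."""
    A = logm(C1).real - logm(C2).real
    return float(np.trace(Q @ A.T @ A))
```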
Set model IV: statistics (COV+)
 Properties
 The natural raw statistics of a sample set
 Flexible model of multiple-order statistical information
 Methods
 CDL [CVPR’12]
 LMKML [ICCV’13]
 DARG [CVPR’15]
 SPD-ML [ECCV’14]
 LEML [ICML’15]
 LERM [CVPR’14]
 HER [CVPR’15 ]
 …
G1
G2
Natural raw statistics
More flexible model
[Shakhnarovich, ECCV’02]
[Arandjelović, CVPR’05]
 LERM (Learning Euclidean-to-Riemannian Metric) [CVPR’14]
 Application scenario: still-to-video face recognition
 Metric learning: cross Euclidean space and Riemannian manifold
[1] Z. Huang, R. Wang, S. Shan, X. Chen. Learning Euclidean-to-Riemannian Metric for Point-to-Set
Classification. IEEE CVPR 2014.
[Figure] Watch list screening: a still image (a point in ℝ^D) from the watch list is matched against surveillance video (an image set), i.e., point-to-set classification
Set model IV: statistics (COV+)
LERM
 Point-to-Set Classification
 Euclidean points vs. Riemannian points (heterogeneous)
 Point: Euclidean space (E)
 Set models and their corresponding manifolds:
 Linear subspace [Yamaguchi, FG'98], [Chien, PAMI'02], [Kim, PAMI'07] → Grassmann (G) [Hamm, ICML'08], [Harandi, CVPR'11]
 Affine hull [Vincent, NIPS'01], [Cevikalp, CVPR'10], [Zhu, ICCV'13] → Affine Grassmann (A) [Hamm, NIPS'08]
 Covariance matrix [Wang, CVPR'12], [Vemulapalli, CVPR'13], [Lu, ICCV'13] → SPD (S) [Pennec, IJCV'06], [Arsigny, SIAM'07]
LERM
 Basic idea
 Reduce the Euclidean-to-Riemannian metric to a classical Euclidean metric
 Seek maps F, Φ from ℝ^D and ℳ into a common Euclidean subspace ℝ^d:
 d(x_i, y_j) = (F(x_i) − Φ(y_j))^T (F(x_i) − Φ(y_j))
[Figure] Analogy to conventional metric learning, where f, g map two Euclidean spaces ℝ^{D_1}, ℝ^{D_2} into a common ℝ^d
LERM
 Basic idea
Bridge the Euclidean-to-Riemannian gap
 Hilbert space embedding
 Adheres to Euclidean geometry
 Globally encodes the geometry of the manifold (a tangent space only does so locally)
[Figure] x ∈ ℝ^D is mapped by F into ℝ^d; y ∈ ℳ is embedded by φ_y into a Hilbert space ℋ_y and then mapped by f_y into ℝ^d, so that Φ = f_y ∘ φ_y
69
LERM
 Formulation
 Three mapping modes (same objective function E(F, Φ); different final maps F, Φ)
 Mode 1: single Hilbert space. The Riemannian side is embedded via φ_y into ℋ_y and mapped by f_y; the Euclidean side is mapped directly by f_x; both land in the common ℝ^d.
 Mode 2: double Hilbert spaces (φ_x: ℝ^D → ℋ_x, φ_y: ℳ → ℋ_y), further reducing the gap.
 Mode 3: double Hilbert spaces + cross-space kernel, further exploring correlations across the two spaces.
LERM
 e.g., Mode 1 (single Hilbert space)
 Distance metric: d(x_i, y_j) = (F(x_i) − Φ(y_j))^T (F(x_i) − Φ(y_j))
 Final maps: F = f_x = W_x^T X; Φ = f_y ∘ φ_y = W_y^T K_y
 with Riemannian kernel ⟨φ_y(y_i), φ_y(y_j)⟩ = K_y(i, j) = exp(−d^2(y_i, y_j) / 2σ^2), using Riemannian metrics [ICML'08, NIPS'08, SIAM'06]
 Objective function: min_{F, Φ} E(F, Φ) = min_{F, Φ} { D(F, Φ) + λ_1 G(F, Φ) + λ_2 T(F, Φ) }
 D: discriminate distance; G: preserve geometry; T: normalize transformation
 HER (Hashing across Euclidean and Riemannian) [CVPR’15]
 Application scenario: video-based face retrieval
 Metric learning: Hamming distance learning across heterogeneous spaces
[1] Y. Li, R. Wang, Z. Huang, S. Shan, X. Chen. Face Video Retrieval with Image Query via Hashing
across Euclidean Space and Riemannian Manifold. IEEE CVPR 2015.
Set model IV: statistics (COV+)
HER
 Heterogeneous hash learning
 Previous multiple-modality hash learning methods: two Euclidean spaces (A and B) mapped into a common Hamming space
 The proposed heterogeneous hash learning method: a Euclidean space and a Riemannian manifold mapped into a common Hamming space
HER
 Two-Stage Architecture
 Stage-1: kernel mapping
 Euclidean space (image, vector x_i), Euclidean kernel: K_e^{ij} = exp(−||x_i − x_j||_2^2 / (2σ_e^2))
 Riemannian manifold (video, covariance matrix Y_i), Riemannian kernel: K_r^{ij} = exp(−||log(Y_i) − log(Y_j)||_F^2 / (2σ_r^2))*
 φ, η: mappings into the two reproducing kernel Hilbert spaces (Euclidean / Riemannian), from which codes of subjects A, B, C are hashed into the target Hamming space
[*] Y. Li, R. Wang, Z. Cui, S. Shan, X. Chen. Compact Video Code and Its Application to Robust Face Retrieval in TV-Series. BMVC 2014. (CVC method for video-video face retrieval)
HER
 Two-Stage Architecture
 Stage-2: binary encoding
 Hash functions ω_e (on the Euclidean RKHS) and ω_r (on the Riemannian RKHS) encode both modalities into the target Hamming space
 Hyperplanes chosen for discriminability and stability
HER
 Binary encoding [Rastegari, ECCV'12]
 Discriminability: within-class code distances minus between-class code distances, per modality and across modalities:
 E_e = Σ_{c∈{1:C}} Σ_{m,n∈c} d(B_e^m, B_e^n) − λ_e Σ_{c_1∈{1:C}} Σ_{p∈c_1} Σ_{c_2≠c_1} Σ_{q∈c_2} d(B_e^p, B_e^q)
 E_r = Σ_{c∈{1:C}} Σ_{m,n∈c} d(B_r^m, B_r^n) − λ_r Σ_{c_1∈{1:C}} Σ_{p∈c_1} Σ_{c_2≠c_1} Σ_{q∈c_2} d(B_r^p, B_r^q)
 E_er = Σ_{c∈{1:C}} Σ_{m,n∈c} d(B_e^m, B_r^n) − λ_er Σ_{c_1∈{1:C}} Σ_{p∈c_1} Σ_{c_2≠c_1} Σ_{q∈c_2} d(B_e^p, B_r^q)
 Objective function:
 min_{ω_e, ω_r, ξ_e, ξ_r, B_e, B_r}  λ_1 E_e + λ_2 E_r + λ_3 E_er
   + γ_1 Σ_{k∈{1:K}} ||ω_e^k||^2 + C_1 Σ_{i∈{1:N}} Σ_{k∈{1:K}} ξ_e^{ki}
   + γ_2 Σ_{k∈{1:K}} ||ω_r^k||^2 + C_2 Σ_{i∈{1:N}} Σ_{k∈{1:K}} ξ_r^{ki}
 s.t. B_e^{ki} = sgn(ω_e^{k T} φ(x_i)), B_r^{ki} = sgn(ω_r^{k T} η(Y_i)), ∀k ∈ {1:K}, ∀i ∈ {1:N}
      B_r^{ki} ω_e^{k T} φ(x_i) ≥ 1 − ξ_e^{ki}, ξ_e^{ki} > 0, ∀k, ∀i
      B_e^{ki} ω_r^{k T} η(Y_i) ≥ 1 − ξ_r^{ki}, ξ_r^{ki} > 0, ∀k, ∀i
 Discriminability (LDA) + stability (SVM) + compatibility (cross-training)
Outline
 Background
 Literature review
 Evaluations
 Summary
Evaluations
 Two YouTube datasets
 YouTube Celebrities (YTC) [Kim, CVPR’08]
 47 subjects, 1910 videos from YouTube
 YouTube FaceDB (YTF) [Wolf, CVPR’11]
 3425 videos, 1595 different people
YTC
YTF
Evaluations
 COX Face [Huang, ACCV’12/TIP’15]
 1,000 subjects
 each has 1 high-quality still image and 3 unconstrained video sequences
Images
Videos
http://vipl.ict.ac.cn/resources/datasets/cox-face-dataset/cox-face
Evaluations
 PaSC [Beveridge, BTAS’13]
 Control videos
 1 mounted video camera
 1920*1080 resolution
 Handheld videos
 5 handheld video cameras
 640*480~1280*720 resolution
Control video Handheld video
Evaluations
 Results (reported in our DARG paper*)
[*] W. Wang, R. Wang, Z. Huang, S. Shan, X. Chen. Discriminant Analysis on Riemannian Manifold of
Gaussian Distributions for Face Recognition with Image Sets. IEEE CVPR 2015.
Evaluations
 Results (reported in our DARG paper*)
VR@FAR=0.01 on PaSC / AUC on YTF
Evaluations
 Performance on PaSC Challenge (IEEE FG’15)
 HERML-DeLF
 DCNN learned image feature
 Hybrid Euclidean and Riemannian Metric Learning*
[*] Z. Huang, R. Wang, S. Shan, X. Chen. Hybrid Euclidean-and-Riemannian Metric Learning for Image
Set Classification. ACCV 2014. (**: the key reference describing the method used for the challenge)
Evaluations
 Performance on EmotiW Challenge (ACM ICMI’14)*
 Combination of multiple statistics for video modeling
 Learning on the Riemannian manifold
[*] M. Liu, R. Wang, S. Li, S. Shan, Z. Huang, X. Chen. Combining Multiple Kernel Methods on
Riemannian Manifold for Emotion Recognition in the Wild. ACM ICMI 2014.
Outline
 Background
 Literature review
 Evaluations
 Summary
Route map of our methods
MMD
CVPR’08/TIP’12
MDA
CVPR’09
CDL
CVPR’12
DARG
CVPR’15
LERM
CVPR’14
CVC
BMVC’14
LEML/PML
ICML’15/CVPR’15
HER
CVPR’15
Appearance manifold: complex distribution, large amount of data
Covariance matrix: natural raw statistics, no assumption of data distribution
Binary codes / heterogeneous spaces
Summary
 What we learn from current studies
 Set modeling
 Linear (affine) subspace → Manifold → Statistics
 Set matching
 Non-discriminative → Discriminative
Metric learning
 Euclidean space → Riemannian manifold
 Future directions
 More flexible set modeling for different scenarios
 Multi-model combination
 Learning method should be more efficient
 Set-based vs. sample-based?
Additional references (not listed above)
 [Arandjelović, CVPR’05] O. Arandjelović, G. Shakhnarovich, J. Fisher, R.
Cipolla, and T. Darrell. Face Recognition with Image Sets Using Manifold
Density Divergence. IEEE CVPR 2005.
 [Chien, PAMI’02] J. Chien and C. Wu. Discriminant waveletfaces and nearest
feature classifiers for face recognition. IEEE T-PAMI 2002.
 [Rastegari, ECCV’12] M. Rastegari, A. Farhadi, and D. Forsyth. Attribute
discovery via predictable discriminative binary codes. ECCV 2012.
 [Shakhnarovich, ECCV’02] G. Shakhnarovich, J. W. Fisher, and T. Darrell.
Face Recognition from Long-term Observations. ECCV 2002.
 [Vemulapalli, CVPR’13] R. Vemulapalli, J. K. Pillai, and R. Chellappa. Kernel
learning for extrinsic classification of manifold features. IEEE CVPR 2013.
 [Vincent, NIPS’01] P. Vincent and Y. Bengio. K-local hyperplane and convex
distance nearest neighbor algorithms. NIPS 2001.
Thanks, Q & A
Ruiping Wang, Shiguang Shan, Xilin Chen
Lab of Visual Information Processing and Learning (VIPL) @ICT@CAS
Codes of our methods available at: http://vipl.ict.ac.cn/resources/codes
Zhiwu Huang Yan Li Wen Wang Mengyi Liu
Pile Foundation by Venkatesh Taduvai (Sub Geotechnical Engineering II)-conver...
 
Planning Of Procurement o different goods and services
Planning Of Procurement o different goods and servicesPlanning Of Procurement o different goods and services
Planning Of Procurement o different goods and services
 
Forklift Classes Overview by Intella Parts
Forklift Classes Overview by Intella PartsForklift Classes Overview by Intella Parts
Forklift Classes Overview by Intella Parts
 
Cosmetic shop management system project report.pdf
Cosmetic shop management system project report.pdfCosmetic shop management system project report.pdf
Cosmetic shop management system project report.pdf
 
一比一原版(Otago毕业证)奥塔哥大学毕业证成绩单如何办理
一比一原版(Otago毕业证)奥塔哥大学毕业证成绩单如何办理一比一原版(Otago毕业证)奥塔哥大学毕业证成绩单如何办理
一比一原版(Otago毕业证)奥塔哥大学毕业证成绩单如何办理
 
Swimming pool mechanical components design.pptx
Swimming pool  mechanical components design.pptxSwimming pool  mechanical components design.pptx
Swimming pool mechanical components design.pptx
 
Heap Sort (SS).ppt FOR ENGINEERING GRADUATES, BCA, MCA, MTECH, BSC STUDENTS
Heap Sort (SS).ppt FOR ENGINEERING GRADUATES, BCA, MCA, MTECH, BSC STUDENTSHeap Sort (SS).ppt FOR ENGINEERING GRADUATES, BCA, MCA, MTECH, BSC STUDENTS
Heap Sort (SS).ppt FOR ENGINEERING GRADUATES, BCA, MCA, MTECH, BSC STUDENTS
 
Fundamentals of Electric Drives and its applications.pptx
Fundamentals of Electric Drives and its applications.pptxFundamentals of Electric Drives and its applications.pptx
Fundamentals of Electric Drives and its applications.pptx
 
An Approach to Detecting Writing Styles Based on Clustering Techniques
An Approach to Detecting Writing Styles Based on Clustering TechniquesAn Approach to Detecting Writing Styles Based on Clustering Techniques
An Approach to Detecting Writing Styles Based on Clustering Techniques
 
14 Template Contractual Notice - EOT Application
14 Template Contractual Notice - EOT Application14 Template Contractual Notice - EOT Application
14 Template Contractual Notice - EOT Application
 
Student information management system project report ii.pdf
Student information management system project report ii.pdfStudent information management system project report ii.pdf
Student information management system project report ii.pdf
 
digital fundamental by Thomas L.floydl.pdf
digital fundamental by Thomas L.floydl.pdfdigital fundamental by Thomas L.floydl.pdf
digital fundamental by Thomas L.floydl.pdf
 
Tutorial for 16S rRNA Gene Analysis with QIIME2.pdf
Tutorial for 16S rRNA Gene Analysis with QIIME2.pdfTutorial for 16S rRNA Gene Analysis with QIIME2.pdf
Tutorial for 16S rRNA Gene Analysis with QIIME2.pdf
 
PPT on GRP pipes manufacturing and testing
PPT on GRP pipes manufacturing and testingPPT on GRP pipes manufacturing and testing
PPT on GRP pipes manufacturing and testing
 
Literature Review Basics and Understanding Reference Management.pptx
Literature Review Basics and Understanding Reference Management.pptxLiterature Review Basics and Understanding Reference Management.pptx
Literature Review Basics and Understanding Reference Management.pptx
 
Final project report on grocery store management system..pdf
Final project report on grocery store management system..pdfFinal project report on grocery store management system..pdf
Final project report on grocery store management system..pdf
 
Building Electrical System Design & Installation
Building Electrical System Design & InstallationBuilding Electrical System Design & Installation
Building Electrical System Design & Installation
 
AKS UNIVERSITY Satna Final Year Project By OM Hardaha.pdf
AKS UNIVERSITY Satna Final Year Project By OM Hardaha.pdfAKS UNIVERSITY Satna Final Year Project By OM Hardaha.pdf
AKS UNIVERSITY Satna Final Year Project By OM Hardaha.pdf
 
Technical Drawings introduction to drawing of prisms
Technical Drawings introduction to drawing of prismsTechnical Drawings introduction to drawing of prisms
Technical Drawings introduction to drawing of prisms
 

Distance Metric Learning tutorial at CVPR 2015

  • 1. Distance Metric Learning for Set-based Visual Recognition Ruiping Wang Institute of Computing Technology (ICT), Chinese Academy of Sciences (CAS) June 7, 2015 CVPR2015 Tutorial on “Distance Metric Learning for Visual Recognition” Part-4
  • 3. InstituteofComputingTechnology,ChineseAcademyofSciences  Identification  Typical applications  Photo matching (1:N)  Watch list screening (1:N+1)  Performance metric  FR(@FAR)  Verification  Typical applications  Access control (1:1)  E-passport (1:1)  Performance metric  ROC: FRR+FAR 3 Face Recognition with Single Image Who is this celebrity? Are they the same guy?
  • 8. InstituteofComputingTechnology,ChineseAcademyofSciences 88 Challenges  Intra-class variation vs. inter-class variation  Distance measure semantic meaning  Sample-based metric learning is made even harder Pose variation Illumination variation D(x, y)=?
  • 9. InstituteofComputingTechnology,ChineseAcademyofSciences 99 Face Recognition with Videos  Video surveillance http://www.youtube.com/watch?v=M80DXI932OE http://www.youtube.com/watch?v=RfJsGeq0xRA#t=22 Seeking missing children Searching criminal suspects
  • 10. InstituteofComputingTechnology,ChineseAcademyofSciences 1010 Face Recognition with Videos  Video shot retrieval 1 32 4 5 76 9 Next Page8 S01E01: 10’48’’ S01E06: 05’22’’ S01E02: 04’20’’ S01E02: 00’21’’ S01E03: 08’06’’ S01E05: 09’23’’ S01E01: 22’20’’ S01E04: 03’42’’ Smart TV-Series Character Shots Retrieval System “the Big Bang Theory”
  • 11. InstituteofComputingTechnology,ChineseAcademyofSciences 11 Treating Video as Image Set  A new & different problem Unconstrained acquisition conditions Complex appearance variations Two phases: set modeling + set matching New paradigm: set-based metric learning
  • 13. InstituteofComputingTechnology,ChineseAcademyofSciences 13 Overview of previous works From the view of set modeling Statistics [Shakhnarovich, ECCV’02] [Arandjelović, CVPR’05] [Wang, CVPR’12] [Harandi, ECCV’14] [Wang, CVPR’15] G1 G2 K-L Divergence S1 S2 Linear subspace [Yamaguchi, FG’98] [Kim, PAMI’07] [Hamm, ICML’08] [Huang, CVPR’15] Principal angle H1 H2 Affine/Convex hull [Cevikalp, CVPR’10] [Hu, CVPR’11] [Zhu, ICCV’13] NN matching Nonlinear manifold [Kim, BMVC’05] [Wang, CVPR’08] [Wang, CVPR’09] [Chen, CVPR’13] [Lu, CVPR’15] Principal angle(+) M1 M2
  • 14. InstituteofComputingTechnology,ChineseAcademyofSciences 14 Overview of previous works  Set modeling  Linear subspaceNonlinear manifold  Affine/Convex Hull (affine subspace)  Parametric PDFs  Statistics  Set matching—basic distance  Principal angles-based measure  Nearest neighbor (NN) matching approach  K-L divergenceSPD Riemannian metric…  Set matching—metric learning  Learning in Euclidean space  Learning on Riemannian manifold
  • 15. InstituteofComputingTechnology,ChineseAcademyofSciences 15 Set model I: linear subspace  Properties  PCA on the set of image samples to get subspace  Loose characterization of the set distribution region  Principal angles-based measure discards the varying importance of different variance directions  Methods  MSM [FG’98]  DCC [PAMI’07]  GDA [ICML’08]  GGDA [CVPR’11]  PML [CVPR’15]  … S1 S2 No distribution boundary No direction difference
  • 16. InstituteofComputingTechnology,ChineseAcademyofSciences 16 Set model I: linear subspace  MSM (Mutual Subspace Method) [FG’98]  Pioneering work on image set classification  First exploit principal angles as subspace distance  Metric learning: N/A [1] O. Yamaguchi, K. Fukui, and K. Maeda. Face Recognition Using Temporal Image Sequence. IEEE FG 1998. subspace method Mutual subspace method
  • 17. InstituteofComputingTechnology,ChineseAcademyofSciences 17 Set model I: linear subspace  DCC (Discriminant Canonical Correlations) [PAMI’07]  Metric learning: in Euclidean space [1] T. Kim, J. Kittler, and R. Cipolla. Discriminative Learning and Recognition of Image Set Classes Using Canonical Correlations. IEEE T-PAMI, 2007. Set 1: 𝑿 𝟏 Set 2: 𝑿 𝟐 𝑷 𝟏 𝑷 𝟐 Linear subspace by: orthonormal basis matrix 𝑿𝑖 𝑿𝑖 𝑇 ≃ 𝑷𝑖 𝚲𝑖 𝑷𝑖 𝑇
  • 18. InstituteofComputingTechnology,ChineseAcademyofSciences 18 DCC  Canonical Correlations/Principal Angles  Canonical vectors: common variation modes of the two sets  Set 1: X1 and Set 2: X2 with orthonormal bases P1, P2; SVD of P1^T P2 = Q12 Λ Q21^T  Canonical vectors: U = P1 Q12 = [u_1, …, u_d], V = P2 Q21 = [v_1, …, v_d]  Λ = diag(cos θ_1, …, cos θ_d)  Canonical correlations: cos θ_i; principal angles: θ_i
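The canonical correlations above reduce to an SVD of P1^T P2. Below is a minimal NumPy sketch (not part of the original slides; the set sizes and variable names are illustrative) of how MSM/DCC-style principal angles between two image sets can be computed:

import numpy as np

def pca_basis(X, d):
    # Orthonormal PCA basis of an image set X (D x N): top-d left singular vectors of the centered data.
    Xc = X - X.mean(axis=1, keepdims=True)
    U, _, _ = np.linalg.svd(Xc, full_matrices=False)
    return U[:, :d]

def principal_angles(P1, P2):
    # Singular values of P1^T P2 are the canonical correlations cos(theta_i).
    cos_t = np.clip(np.linalg.svd(P1.T @ P2, compute_uv=False), -1.0, 1.0)
    return cos_t, np.arccos(cos_t)

# Toy usage with random data standing in for two image sets.
rng = np.random.default_rng(0)
P1 = pca_basis(rng.standard_normal((400, 60)), 10)
P2 = pca_basis(rng.standard_normal((400, 80)), 10)
cos_t, theta = principal_angles(P1, P2)
msm_score = cos_t[:3].mean()   # MSM-style similarity: average of the largest canonical correlations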
  • 19. InstituteofComputingTechnology,ChineseAcademyofSciences 19 DCC  Discriminative learning  Linear transformation T: X_i → Y_i = T^T X_i  Representation: Y_i Y_i^T = (T^T X_i)(T^T X_i)^T ≃ (T^T P_i) Λ_i (T^T P_i)^T  Set similarity: F_ij = max over Q_ij, Q_ji of tr(M_ij), with M_ij = Q_ij^T P′_i^T T T^T P′_j Q_ji  Discriminant function: T = arg max_T tr(T^T S_b T) / tr(T^T S_w T)
  • 20. InstituteofComputingTechnology,ChineseAcademyofSciences 20 Set model I: linear subspace  GDA [ICML’08] / GGDA [CVPR’11]  Treat subspaces as points on Grassmann manifold  Metric learning: on Riemannian manifold [1] J. Hamm and D. D. Lee. Grassmann Discriminant Analysis: a Unifying View on Subspace-Based Learning. ICML 2008. [2] M. Harandi, C. Sanderson, S. Shirazi, B. Lovell. Graph Embedding Discriminant Analysis on Grassmannian Manifolds for Improved Image Set Matching. IEEE CVPR 2011. Subspace Subspace Grassmann manifold implicit kernel mapping
  • 21. InstituteofComputingTechnology,ChineseAcademyofSciences 21 GDA/GGDA  Projection metric: d_P(Y1, Y2) = (Σ_i sin² θ_i)^(1/2) = 2^(−1/2) ||Y1 Y1^T − Y2 Y2^T||_F  Geodesic distance (Wong, 1967; Edelman et al., 1999): d_G²(Y1, Y2) = Σ_i θ_i², where θ_i are the principal angles
  • 22. InstituteofComputingTechnology,ChineseAcademyofSciences 22 GDA/GGDA  Projection kernel  Projection embedding (isometric): Ψ_P: G(m, D) → R^(D×D), span(Y) → Y Y^T  Inner product in R^(D×D): tr((Y1 Y1^T)(Y2 Y2^T)) = ||Y1^T Y2||_F²  Grassmann kernel (positive definite): k_P(Y1, Y2) = ||Y1^T Y2||_F²
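Both quantities on slides 21-22 are one-liners once orthonormal bases are available. A small sketch, assuming Y1 and Y2 are D x m matrices with orthonormal columns:

import numpy as np

def projection_metric(Y1, Y2):
    # d_P(Y1, Y2) = 2^(-1/2) ||Y1 Y1^T - Y2 Y2^T||_F
    return np.linalg.norm(Y1 @ Y1.T - Y2 @ Y2.T, 'fro') / np.sqrt(2.0)

def projection_kernel(Y1, Y2):
    # Grassmann projection kernel k_P(Y1, Y2) = ||Y1^T Y2||_F^2
    return np.linalg.norm(Y1.T @ Y2, 'fro') ** 2

The two are directly related: for m-dimensional subspaces, d_P²(Y1, Y2) = m − k_P(Y1, Y2), so the kernel matrix can be plugged into kernel LDA or graph embedding exactly as GDA/GGDA do.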
  • 23. InstituteofComputingTechnology,ChineseAcademyofSciences 23 GDA/GGDA  Discriminative learning  Classical kernel methods using the Grassmann kernel  e.g., Kernel LDA / kernel Graph embedding  𝜶∗ = arg max 𝜶 𝜶 𝑻 𝑲𝑾𝑲𝜶 𝜶 𝑻 𝑲𝑲𝜶 Grassmann kernel
  • 24. InstituteofComputingTechnology,ChineseAcademyofSciences 24 Set model I: linear subspace  PML (Projection Metric Learning) [CVPR’15]  Metric learning: on Riemannian manifold [1] Z. Huang, R. Wang, S. Shan, X. Chen. Projection Metric Learning on Grassmann Manifold with Application to Video based Face Recognition. IEEE CVPR 2015. (b) (c)(a) 𝒀1 𝒀 𝟐 𝒀 𝟑 𝒀 𝟑 𝒀 𝟐 𝒀1 𝒢(𝑞, 𝐷) 𝒀1 𝒀 𝟑 𝒀 𝟐 𝒢(𝑞, 𝑑) (d) 𝒀1 𝒀 𝟑 𝒀 𝟐 ℋ (e) 𝒀 𝟑 𝒀1 𝒀 𝟐 ℝ 𝑑 Learning with kernel in RKHS Learning directly on the manifold Structure distortion No explicit mapping High complexity
  • 25. InstituteofComputingTechnology,ChineseAcademyofSciences 25 PML  Explicit manifold to manifold mapping  𝑓 𝒀 = 𝑾 𝑇 𝒀 ∈ 𝒢 𝑞, 𝑑 , 𝒀 ∈ 𝒢 𝑞, 𝐷 , 𝑑 ≤ 𝐷  Projection metric on target Grassmann manifold 𝒢(𝑞,𝑑)  𝑑 𝑝 2 𝑓 𝒀i , 𝑓 𝒀j = 2−1/2 (𝑾 𝑇 𝒀𝑖 ′ ) 𝑾 𝑇 𝒀𝑖 ′ 𝑇 − (𝑾 𝑇 𝒀𝑗 ′ )(𝑾 𝑇 𝒀𝑗 ′ ) 𝑇 𝐹 2 = 2−1/2 𝑡𝑟 𝑷 𝑇 𝑨𝑖𝑗 𝑨𝑖𝑗 𝑃  𝑨𝑖𝑗 = (𝒀𝑖 ′ 𝒀𝑖 ′ 𝑇 − 𝒀𝑗 ′ 𝒀𝑗 ′ ) 𝑇, 𝑷 = 𝐖𝐖 𝑇 is a rank-𝑑 symmetric positive semidefinite (PSD) matrix of size 𝐷 × 𝐷 (similar form as Mahalanobis matrix )  𝒀i needs to be normalized to 𝒀𝑖 ′ so that the columns of 𝑾 𝑇 𝒀i are orthonormal (b) (c) 𝒀 𝟑 𝒀 𝟐 𝒀1 𝒇(𝒀1) 𝒇(𝒀 𝟑) 𝒇(𝒀 𝟐) 𝒢(𝑞, 𝐷) 𝒢(𝑞, 𝑑) 𝑓
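A rough sketch of the mapped projection distance used by PML, assuming the D x d transformation W is already given (the actual method learns P = W W^T by Riemannian conjugate gradient, which is not reproduced here). Re-orthonormalizing W^T Y by QR plays the role of the Y → Y′ normalization on the slide, since it leaves the spanned subspace unchanged:

import numpy as np

def map_to_lowdim_grassmann(Y, W):
    # Orthonormalize W^T Y so the result is a valid point on G(q, d).
    Q, _ = np.linalg.qr(W.T @ Y)
    return Q

def pml_projection_distance(Y1, Y2, W):
    Z1, Z2 = map_to_lowdim_grassmann(Y1, W), map_to_lowdim_grassmann(Y2, W)
    return np.linalg.norm(Z1 @ Z1.T - Z2 @ Z2.T, 'fro') / np.sqrt(2.0)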
  • 26. InstituteofComputingTechnology,ChineseAcademyofSciences 26 PML  Discriminative learning  Discriminant function: minimize/maximize the projection distances of within-class/between-class subspace pairs  J = min_P [ Σ_{l_i = l_j} tr(P^T A_ij A_ij P) − λ Σ_{l_i ≠ l_j} tr(P^T A_ij A_ij P) ]  Optimization algorithm  Iterative solution for one of Y′ and P by fixing the other  Normalization of Y by QR-decomposition  Computation of P by the Riemannian Conjugate Gradient (RCG) algorithm on the manifold of PSD matrices
  • 27. InstituteofComputingTechnology,ChineseAcademyofSciences 27 Set model II: nonlinear manifold  Properties  Capture nonlinear complex appearance variation  Need dense sampling and large amount of data  Less appealing computational efficiency  Methods  MMD [CVPR’08]  MDA [CVPR’09]  BoMPA [BMVC’05]  SANS [CVPR’13]  MMDML [CVPR’15]  … Complex distribution Large amount of data
  • 28. InstituteofComputingTechnology,ChineseAcademyofSciences 28 Set model II: nonlinear manifold  MMD (Manifold-Manifold Distance) [CVPR’08]  Set modeling with nonlinear appearance manifold  Image set classification distance computation between two manifolds  Metric learning: N/A D(M1,M2) [1] R. Wang, S. Shan, X. Chen, W. Gao. Manifold-Manifold Distance with Application to Face Recognition based on Image Set. IEEE CVPR 2008. (Best Student Poster Award Runner-up) [2] R. Wang, S. Shan, X. Chen, Q. Dai, W. Gao. Manifold-Manifold Distance and Its Application to Face Recognition with Image Sets. IEEE Trans. Image Processing, 2012.
  • 29. InstituteofComputingTechnology,ChineseAcademyofSciences 29 MMD  Multi-level MMD framework Three pattern levels: Point->Subspace-> Manifold S x' x S1 S2 x1 x2 PPD PSD SSD xM x'' x S M M1 M2 PMD SMD MMD
  • 30. InstituteofComputingTechnology,ChineseAcademyofSciences 30 MMD  Formulation & Solution: SSD between pairs of local linear models C_i ⊂ M1, C_j ⊂ M2  Three modules: local model construction; local model distance; global integration of local distances  d(M1, M2) = Σ_{i=1}^m Σ_{j=1}^n f_ij d(C_i, C_j), s.t. Σ_{i=1}^m Σ_{j=1}^n f_ij = 1, f_ij ≥ 0
  • 31. InstituteofComputingTechnology,ChineseAcademyofSciences 31 MMD  Subspace-subspace distance (SSD): mean (cosine correlation) + variance (canonical correlation)  Mean term: d_E(S1, S2) = (1 − cos² θ_0)^(1/2) = sin θ_0  Variance term: d_P(S1, S2) = ((1/r) Σ_{k=1}^r sin² θ_k)^(1/2) = (1 − (1/r) Σ_{k=1}^r cos² θ_k)^(1/2)  Combined: d(S1, S2) = (sin² θ_0 + (1/r) Σ_{k=1}^r sin² θ_k)^(1/2) = (2 − cos² θ_0 − (1/r) Σ_{k=1}^r cos² θ_k)^(1/2)
  • 32. InstituteofComputingTechnology,ChineseAcademyofSciences 32  MDA (Manifold Discriminant Analysis ) [CVPR’09]  Goal: maximize “manifold margin” under Graph Embedding framework  Metric learning: in Euclidean space  Euclidean distance between pair of image samples [1] R. Wang, X. Chen. Manifold Discriminant Analysis. IEEE CVPR 2009. M1 M2 within-class compactness between-class separability M1 M2 Intrinsic graph Penalty graph Set model II: nonlinear manifold
  • 33. InstituteofComputingTechnology,ChineseAcademyofSciences 33 MDA  Optimization framework  Min. within-class scatter (intrinsic graph): S_w = Σ_{m,n} w^w_{mn} (v^T x_m − v^T x_n)² = 2 v^T X (D^w − W^w) X^T v  Max. between-class scatter (penalty graph): S_b = Σ_{m,n} w^b_{mn} (v^T x_m − v^T x_n)² = 2 v^T X (D^b − W^b) X^T v  Objective function: maximize J(v) = S_b / S_w = [v^T X (D^b − W^b) X^T v] / [v^T X (D^w − W^w) X^T v]  Global linear transformation v, solved by generalized eigen-decomposition
  • 34. InstituteofComputingTechnology,ChineseAcademyofSciences 34  BoMPA (Boosted Manifold Principal Angles) [BMVC’05]  Goal: optimal fusion of different principal angles  Exploit Adaboost to learn weights for each angle [1] T. Kim O. Arandjelovic, R. Cipolla. Learning over Sets using Boosted Manifold Principal Angles (BoMPA). BMVC 2005. PAsimilar (common illumination) BoMPAdissimilar Subject A Subject B Set model II: nonlinear manifold
  • 35. InstituteofComputingTechnology,ChineseAcademyofSciences 35  SANS (Sparse Approximated Nearest Subspaces) [CVPR’13]  Goal: adaptively construct the nearest subspace pair  Metric learning: N/A [1] S. Chen, C. Sanderson, M.T. Harandi, B.C. Lovell. Improved Image Set Classification via Joint Sparse Approximated Nearest Subspaces. IEEE CVPR 2013. Set model II: nonlinear manifold SSD (Subspace-Subspace Distance): Joint sparse representation (JSR) is applied to approximate the nearest subspace over a Grassmann manifold.
  • 36. InstituteofComputingTechnology,ChineseAcademyofSciences 36  MMDML (Multi-Manifold Deep Metric Learning) [CVPR’15]  Goal: maximize “manifold margin” under Deep Learning framework  Metric learning: in Euclidean space Set model II: nonlinear manifold [1] J. Lu, G. Wang, W. Deng, P. Moulin, and J. Zhou. Multi-Manifold Deep Metric Learning for Image Set Classification. IEEE CVPR 2015. Class-specific DNN Model nonlinearity Objective function
  • 37. InstituteofComputingTechnology,ChineseAcademyofSciences 37 Set model III: affine subspace  Properties  Linear reconstruction using: mean + subspace basis  Synthesized virtual NN-pair matching  Less characterization of global data structure  Computationally expensive by NN-based matching  Methods  AHISD/CHISD [CVPR’10]  SANP [CVPR’11]  PSDML/SSDML [ICCV’13]  … H1 H2 Sensitive to noise samples High computation cost
  • 38. InstituteofComputingTechnology,ChineseAcademyofSciences 38 Set model III: affine subspace  AHISD/CHISD [CVPR’10], SANP [CVPR’11]  NN-based matching using sample Euclidean distance  Metric learning: N/A  Subspace spanned by all the available samples 𝑫 = {𝑑1, … , 𝑑 𝑛} in the set  Affine hull  𝐻 𝑫 = 𝑫𝜶 = 𝑑𝑖 𝛼𝑖| 𝛼𝑖 = 1  Convex hull  𝐻 𝑫 = {𝑫𝜶 = 𝑑𝑖 𝛼𝑖| 𝛼𝑖 = 1, 0 ≤ 𝛼𝑖 ≤ 1} [1] H. Cevikalp, B. Triggs. Face Recognition Based on Image Sets. IEEE CVPR 2010. [2] Y. Hu, A.S. Mian, R. Owens. Sparse Approximated Nearest Points for Image Set Classification. IEEE CVPR 2011.
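A compact sketch of the unconstrained affine-hull distance behind AHISD; the convex constraints of CHISD and the sparse approximation of SANP are omitted, and the energy threshold is an assumption. Each set is a D x N sample matrix:

import numpy as np

def affine_hull(X, energy=0.98):
    # Affine hull H(X): mean mu plus an orthonormal basis U of the dominant centered directions.
    mu = X.mean(axis=1, keepdims=True)
    U, s, _ = np.linalg.svd(X - mu, full_matrices=False)
    k = int(np.searchsorted(np.cumsum(s ** 2) / np.sum(s ** 2), energy)) + 1
    return mu, U[:, :k]

def affine_hull_distance(X1, X2):
    # min over v1, v2 of ||(mu1 + U1 v1) - (mu2 + U2 v2)||, solved as one least-squares problem.
    (mu1, U1), (mu2, U2) = affine_hull(X1), affine_hull(X2)
    A, b = np.hstack([U1, -U2]), (mu2 - mu1).ravel()
    v, *_ = np.linalg.lstsq(A, b, rcond=None)
    return float(np.linalg.norm(A @ v - b))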
  • 39. InstituteofComputingTechnology,ChineseAcademyofSciences 39 Set model III: affine subspace  PSDML/SSDML [ICCV’13]  Metric learning: in Euclidean space [1] P. Zhu, L. Zhang, W. Zuo, and D. Zhang. From Point to Set: Extend the Learning of Distance Metrics. IEEE ICCV 2013. point-to-set distance metric learning (PSDML) set-to-set distance metric learning (SSDML)
  • 40. InstituteofComputingTechnology,ChineseAcademyofSciences 40 PSDML/SSDML  Point-to-set distance  Basic distance: d²(x, D) = min_α ||x − H(D)||_2² = min_α ||x − Dα||_2²  Solution: least-squares regression or ridge regression  Mahalanobis distance: d_M²(x, D) = min_α ||P(x − Dα)||_2² = (x − Dα)^T P^T P (x − Dα) = (x − Dα)^T M (x − Dα), with projection matrix P
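As a concrete illustration of the basic and Mahalanobis point-to-set distances above, a sketch (this is not the PSDML training procedure: M is assumed to be a given PSD matrix, and a small ridge term keeps the linear system well conditioned):

import numpy as np

def point_to_set_distance(x, D, M=None, lam=1e-2):
    # d_M^2(x, D) = min_alpha (x - D alpha)^T M (x - D alpha) + lam ||alpha||^2; M = I gives the basic distance.
    M = np.eye(x.shape[0]) if M is None else M
    alpha = np.linalg.solve(D.T @ M @ D + lam * np.eye(D.shape[1]), D.T @ M @ x)
    r = x - D @ alpha
    return float(r @ M @ r)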
  • 41. InstituteofComputingTechnology,ChineseAcademyofSciences PSDML/SSDML  Point-to-set distance metric learning (PSDML)  SVM-based method 𝑑 𝑴 2 𝒙, 𝑫 = min 𝑷(𝒙 − 𝑫𝜶) 2 2 = 𝒙 − 𝑫𝜶 𝑇 𝑷 𝑇 𝑷 𝑥 − 𝑫𝜶 = 𝒙 − 𝑫𝜶 𝑇 𝑴 𝒙 − 𝑫𝜶 min 𝑴,𝜶 𝑙 𝒙 𝑖 ,𝜶 𝒋, 𝝃 𝑖𝑗 𝑁 ,𝝃 𝑖 𝑃 ,𝑏 𝑴 𝐹 2 + 𝜈( 𝝃𝑖,𝑗 𝑁 𝑖,𝑗 + 𝝃𝑖 𝑃 𝑖 ) 𝑠. 𝑡. 𝑑 𝑀 𝒙𝑖, 𝑫𝑗 + 𝑏 ≥ 1 − 𝝃𝑖𝑗 𝑁 , 𝑗 ≠ 𝑙 𝒙𝑖 ; (-) 𝑑 𝑀 𝒙𝑖, 𝑫𝑙 𝒙𝒊 + 𝑏 ≤ −1 + 𝝃𝑖 𝑃 ; (+) 𝑴 ≽ 0, ∀𝑖, 𝑗, 𝝃𝑖𝑗 𝑁 ≥ 0, 𝝃𝑖 𝑃 ≥ 0 41
  • 42. InstituteofComputingTechnology,ChineseAcademyofSciences 42 PSDML/SSDML  Set-to-set distance  Basic distance: d²(D1, D2) = min_{α1,α2} ||H(D1) − H(D2)||_2² = min_{α1,α2} ||D1 α1 − D2 α2||_2²  Solution: AHISD/CHISD [Cevikalp, CVPR’10]  Mahalanobis distance: d_M²(D1, D2) = min_{α1,α2} ||P(D1 α1 − D2 α2)||_2² = (D1 α1 − D2 α2)^T M (D1 α1 − D2 α2)
  • 43. InstituteofComputingTechnology,ChineseAcademyofSciences PSDML/SSDML  Set-to-set distance metric learning (SSDML)  SVM-based method 𝑑 𝑴 2 𝑫1, 𝑫2 = min 𝑷(𝑫1 𝜶1 − 𝑫2 𝜶2) 2 2 = 𝑫1 𝜶1 − 𝑫2 𝜶2 𝑇 𝑴 𝑫1 𝜶1 − 𝑫2 𝜶2 min 𝑴,𝜶 𝑖,𝜶 𝑗, 𝜶 𝑘, 𝝃 𝑖𝑗 𝑁 ,𝝃𝑖𝑘 𝑃 ,𝑏 𝑴 𝐹 2 + 𝜈( 𝝃𝑖𝑗 𝑁 𝑖,𝑗 + 𝝃𝑖𝑘 𝑃 𝑖,𝑘 ) 𝑠. 𝑡. 𝑑 𝑀 𝑫𝑖, 𝑫𝑗 + 𝑏 ≥ 1 − 𝝃𝑖𝑗 𝑁 , 𝑙(𝑫𝑖) ≠ 𝑙 𝑫𝑗 ; (-) 𝑑 𝑀 𝑫𝑖, 𝑫 𝑘 + 𝑏 ≤ −1 + 𝝃𝑖𝑘 𝑃 , 𝑙 𝑫𝑖 = 𝑙 𝑫 𝑘 ; (+) 𝑴 ≽ 0, ∀𝑖, 𝑗, 𝑘, 𝝃𝑖𝑗 𝑁 ≥ 0, 𝝃𝑖𝑘 𝑃 ≥ 0 43
  • 44. InstituteofComputingTechnology,ChineseAcademyofSciences 44 Set model IV: statistics (COV+)  Properties  The natural raw statistics of a sample set  Flexible model of multiple-order statistical information  Methods  CDL [CVPR’12]  LMKML [ICCV’13]  DARG [CVPR’15]  SPD-ML [ECCV’14]  LEML [ICML’15]  LERM [CVPR’14]  HER [CVPR’15 ]  … G1 G2 Natural raw statistics More flexible model [Shakhnarovich, ECCV’02] [Arandjelović, CVPR’05]
  • 45. InstituteofComputingTechnology,ChineseAcademyofSciences 45  CDL (Covariance Discriminative Learning) [CVPR’12]  Set modeling by Covariance Matrix (COV)  The 2nd order statistics characterizing set data variations  Robust to noisy set data, scalable to varying set size  Metric learning: on the SPD manifold [1] R. Wang, H. Guo, L.S. Davis, Q. Dai. Covariance Discriminative Learning: A Natural and Efficient Approach to Image Set Classification. IEEE CVPR 2012. TI log Si  Ci Cj Sj I Model data variations No assum. of data distr. Set model IV: statistics (COV+)
  • 46. InstituteofComputingTechnology,ChineseAcademyofSciences 46 CDL  Set modeling by Covariance Matrix  Image set: N samples with d-dimensional image features, X = [x_1, x_2, …, x_N] ∈ R^(d×N)  COV: C = (1/(N−1)) Σ_{i=1}^N (x_i − x̄)(x_i − x̄)^T, a d×d symmetric positive definite (SPD) matrix*  *: use regularization to tackle the singularity problem
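Computing the CDL set representation is essentially a one-liner; a hedged sketch, with a simple trace-scaled regularizer standing in for the singularity fix mentioned on the slide:

import numpy as np

def set_covariance(X, eps=1e-3):
    # X is d x N; returns a regularized sample covariance (an SPD matrix).
    d, N = X.shape
    Xc = X - X.mean(axis=1, keepdims=True)
    C = Xc @ Xc.T / max(N - 1, 1)
    return C + eps * (np.trace(C) / d) * np.eye(d)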
  • 47. InstituteofComputingTechnology,ChineseAcademyofSciences 47 CDL  Set matching on COV manifold  Riemannian metrics on the SPD manifold  Affine-invariant distance (AID) [1]: d²(C1, C2) = ||log(C1^(−1/2) C2 C1^(−1/2))||_F² = Σ_{i=1}^d ln² λ_i(C1, C2)  high computational burden  Log-Euclidean distance (LED) [2]: d(C1, C2) = ||log(C1) − log(C2)||_F  more efficient, more appealing [1] W. Förstner and B. Moonen. A Metric for Covariance Matrices. Technical Report 1999. [2] V. Arsigny, P. Fillard, X. Pennec and N. Ayache. Geometric Means in a Novel Vector Space Structure on Symmetric Positive-Definite Matrices. SIAM J. Matrix Anal. Appl., Vol. 29, No. 1, pp. 328-347, 2007.
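Both Riemannian distances can be evaluated with an eigendecomposition; a short NumPy sketch, assuming the inputs are SPD (e.g. regularized set covariances):

import numpy as np

def spd_log(C):
    # Matrix logarithm of an SPD matrix via its eigendecomposition.
    w, V = np.linalg.eigh(C)
    return (V * np.log(w)) @ V.T

def led_distance(C1, C2):
    # Log-Euclidean distance: ||log(C1) - log(C2)||_F
    return np.linalg.norm(spd_log(C1) - spd_log(C2), 'fro')

def aid_distance(C1, C2):
    # Affine-invariant distance: ||log(C1^(-1/2) C2 C1^(-1/2))||_F
    w, V = np.linalg.eigh(C1)
    C1_inv_sqrt = (V * (w ** -0.5)) @ V.T
    return np.linalg.norm(spd_log(C1_inv_sqrt @ C2 @ C1_inv_sqrt), 'fro')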
  • 48. InstituteofComputingTechnology,ChineseAcademyofSciences 48 CDL  Set matching on COV manifold (cont.)  Explicit Riemannian kernel feature mapping with LED: Φ_log: C ↦ log(C) ∈ R^(d×d), mapping the Riemannian manifold of non-singular COV matrices to the tangent space at the identity matrix I  Kernel (satisfies Mercer’s theorem): k_log(C1, C2) = trace[log(C1) · log(C2)]
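Because the LED feature map is explicit, the kernel is just an inner product of vectorized matrix logarithms. A minimal sketch:

import numpy as np

def spd_log(C):
    # Matrix logarithm of an SPD matrix (as in the sketch above).
    w, V = np.linalg.eigh(C)
    return (V * np.log(w)) @ V.T

def led_feature(C):
    # Explicit feature: vec(log(C)) in the tangent space at the identity.
    return spd_log(C).ravel()

def led_kernel(C1, C2):
    # k_log(C1, C2) = trace(log(C1) log(C2)) = <vec(log C1), vec(log C2)>
    return float(led_feature(C1) @ led_feature(C2))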
  • 49. InstituteofComputingTechnology,ChineseAcademyofSciences 49 CDL  Discriminative learning on COV manifold  Partial Least Squares (PLS ) regression  Goal: Maximize the covariance between observations and class labels Space X Space Y Linear Projection Latent Space T feature vectors, e.g. log-COV matrices class label vectors, e.g. 1/0 indicators X = T * P’ Y = T * B * C’ = X * Bpls T is the common latent representation
  • 50. InstituteofComputingTechnology,ChineseAcademyofSciences 50 CDL vs. GDA  COV → SPD manifold  Model: C = (1/(N−1)) Σ_{i=1}^N (x_i − x̄)(x_i − x̄)^T  Metric: d(C1, C2) = ||log(C1) − log(C2)||_F  Kernel: feature map C ↦ log(C) ∈ R^(d×d), k(C1, C2) = tr(log(C1) · log(C2))  Subspace → Grassmannian  Model: C ≃ U Λ U^T, U_m = [u_1, u_2, …, u_m] ∈ R^(D×m)  Metric: d_proj(U1, U2) = 2^(−1/2) ||U1 U1^T − U2 U2^T||_F  Kernel: feature map U ↦ U U^T ∈ R^(D×D), k(U1, U2) = ||U1^T U2||_F²
  • 51. InstituteofComputingTechnology,ChineseAcademyofSciences 51  LMKML (Localized Multi-Kernel Metric Learning) [ICCV’13]  Exploring multiple order statistics  Data-adaptive weights for different types of features  Ignoring the geometric structure of 2nd/3rd-order statistics  Metric learning: in Euclidean space [1] J. Lu, G. Wang, and P. Moulin. Image Set Classification Using Holistic Multiple Order Statistics Features and Localized Multi-Kernel Metric Learning. IEEE ICCV 2013. Set model IV: statistics (COV+) Complementary information (mean vs. covariance) 1st / 2nd / 3rd-order statistics Objective function
  • 52. InstituteofComputingTechnology,ChineseAcademyofSciences 52  DARG (Discriminant Analysis on Riemannian manifold of Gaussian distributions) [CVPR’15]  Set modeling by mixture of Gaussian distribution (GMM)  Naturally encode the 1st order and 2nd order statistics  Metric learning: on Riemannian manifold [1] W. Wang, R. Wang, Z. Huang, S. Shan, X. Chen. Discriminant Analysis on Riemannian Manifold of Gaussian Distributions for Face Recognition with Image Sets. IEEE CVPR 2015. 𝓜:Riemannian manifold of Gaussian distributions Non-discriminative Time-consuming × Set model IV: statistics (COV+) [Shakhnarovich, ECCV’02] [Arandjelović, CVPR’05]
  • 53. InstituteofComputingTechnology,ChineseAcademyofSciences 53 DARG  Framework (a) (b) Discriminative learning (c) ℳ ℝ 𝑑 Kernel embedding ℋ ℳ: Riemannian manifold of Gaussian distributions ℋ: high-dimensional reproducing kernel Hilbert space (RKHS) ℝ 𝑑 : target lower-dimensional discriminant Euclidean subspace
  • 54. InstituteofComputingTechnology,ChineseAcademyofSciences 54 DARG  Kernels on the Gaussian distribution manifold: kernel based on Lie Group  Each Gaussian g ~ N(x | μ, Σ) is embedded as an SPD matrix according to information geometry: P = |Σ|^(−1/(d+1)) [ Σ + μμ^T, μ; μ^T, 1 ]  Distance based on Lie Group (LGD): LGD(P_i, P_j) = ||log(P_i) − log(P_j)||_F  Kernel function: K_LGD(g_i, g_j) = exp(−LGD²(P_i, P_j) / (2t²))
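A sketch of the Lie-group kernel, assuming each Gaussian component is given as a (mu, Sigma) pair; the SPD embedding follows the formula on the slide, and the bandwidth t is an assumption:

import numpy as np

def spd_log(P):
    # Matrix logarithm of an SPD matrix via its eigendecomposition.
    w, V = np.linalg.eigh(P)
    return (V * np.log(w)) @ V.T

def gaussian_to_spd(mu, Sigma):
    # Embed N(mu, Sigma) as the (d+1)x(d+1) SPD matrix |Sigma|^(-1/(d+1)) * [Sigma + mu mu^T, mu; mu^T, 1].
    d = mu.shape[0]
    P = np.block([[Sigma + np.outer(mu, mu), mu[:, None]],
                  [mu[None, :], np.ones((1, 1))]])
    return np.linalg.det(Sigma) ** (-1.0 / (d + 1)) * P

def lgd_kernel(g1, g2, t=1.0):
    # g1, g2 are (mu, Sigma) pairs.
    P1, P2 = gaussian_to_spd(*g1), gaussian_to_spd(*g2)
    lgd2 = np.linalg.norm(spd_log(P1) - spd_log(P2), 'fro') ** 2
    return np.exp(-lgd2 / (2 * t ** 2))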
  • 55. InstituteofComputingTechnology,ChineseAcademyofSciences 55 DARG  Kernels on the Gaussian distribution manifold kernel based on Lie Group kernel based on MD and LED  Mahalanobis Distance (MD) between mean 𝑀𝐷 𝜇𝑖, 𝜇 𝑗 = 𝜇𝑖 − 𝜇 𝑗 𝑇 Σ𝑖 −1 + Σ𝑗 −1 𝜇𝑖 − 𝜇 𝑗  LED between covariance matrix 𝐿𝐸𝐷 Σ𝑖, Σ𝑗 = log Σ𝑖 − log Σ𝑗 𝐹  Kernel function 𝐾 𝑀𝐷+𝐿𝐸𝐷 𝑔𝑖, 𝑔𝑗 = 𝛾1 𝐾 𝑀𝐷 𝜇𝑖, 𝜇 𝑗 + 𝛾2 𝐾𝐿𝐸𝐷 Σ𝑖, Σ𝑗 𝐾 𝑀𝐷 𝜇𝑖, 𝜇 𝑗 = exp − MD2 𝜇𝑖, 𝜇 𝑗 2𝑡2 𝐾𝐿𝐸𝐷 Σ𝑖, Σ𝑗 = exp − 𝐿𝐸𝐷2 Σ𝑖, Σ𝑗 2𝑡2 𝛾1, 𝛾2 are the combination coefficients
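The second DARG kernel simply combines two Gaussian RBFs, one on the means and one on the log-covariances; a sketch with the combination weights and bandwidth left as free parameters:

import numpy as np

def spd_log(C):
    # Matrix logarithm of an SPD matrix (as in the earlier sketches).
    w, V = np.linalg.eigh(C)
    return (V * np.log(w)) @ V.T

def md_led_kernel(g1, g2, t=1.0, gamma1=0.5, gamma2=0.5):
    # K = gamma1 * K_MD(mu_i, mu_j) + gamma2 * K_LED(Sigma_i, Sigma_j), as on the slide.
    (mu1, S1), (mu2, S2) = g1, g2
    diff = mu1 - mu2
    md2 = float(diff @ (np.linalg.inv(S1) + np.linalg.inv(S2)) @ diff)   # MD^2 between the means
    led2 = np.linalg.norm(spd_log(S1) - spd_log(S2), 'fro') ** 2         # LED^2 between the covariances
    return gamma1 * np.exp(-md2 / (2 * t ** 2)) + gamma2 * np.exp(-led2 / (2 * t ** 2))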
  • 56. InstituteofComputingTechnology,ChineseAcademyofSciences 56 DARG  Discriminative learning: weighted KDA (kernel discriminant analysis), incorporating the weights of the Gaussian components  J(α) = (α^T B α) / (α^T W α)  W = Σ_{i=1}^C (1/ω_i) Σ_{j=1}^{N_i} w_j^(i) (k_j^(i) − m_i)(k_j^(i) − m_i)^T  B = Σ_{i=1}^C N_i (m_i − m)(m_i − m)^T  m_i = (1/(N_i ω_i)) Σ_{j=1}^{N_i} w_j^(i) k_j^(i), m = (1/N) Σ_{i=1}^C (1/ω_i) Σ_{j=1}^{N_i} w_j^(i) k_j^(i)
  • 57. InstituteofComputingTechnology,ChineseAcademyofSciences 57 Set model IV: statistics (COV+)  Properties  The natural raw statistics of a sample set  Flexible model of multiple-order statistical information  Methods  CDL [CVPR’12]  LMKML [ICCV’13]  DARG [CVPR’15]  SPD-ML [ECCV’14]  LEML [ICML’15]  LERM [CVPR’14]  HER [CVPR’15 ]  … G1 G2 Natural raw statistics More flexible model [Shakhnarovich, ECCV’02] [Arandjelović, CVPR’05]
  • 58. InstituteofComputingTechnology,ChineseAcademyofSciences 58  SPD-ML (SPD Manifold Learning) [ECCV’14]  Pioneering work on explicit manifold-to-manifold dimensionality reduction  Metric learning: on Riemannian manifold Set model IV: statistics (COV+) [1] M. Harandi, M. Salzmann, R. Hartley. From Manifold to Manifold: Geometry-Aware Dimensionality Reduction for SPD Matrices. ECCV 2014. Structure distortion No explicit mapping
  • 59. InstituteofComputingTechnology,ChineseAcademyofSciences 59 SPD-ML  SPD manifold dimensionality reduction  Mapping function: f: S++^n × R^(n×m) → S++^m, f(X, W) = W^T X W ∈ S++^m ≻ 0, with X ∈ S++^n and W ∈ R^(n×m) (full rank)  Affine-invariant metrics (AIRM / Stein divergence) on the target SPD manifold S++^m: δ²(W̃^T X_i W̃, W̃^T X_j W̃) = δ²(W^T X_i W, W^T X_j W) for W̃ = MW, M ∈ GL(n), so W can be restricted to W ∈ R^(n×m) with W^T W = I_m
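The bilinear mapping and the Stein divergence it is evaluated under are both cheap to compute; a sketch (learning W on the Stiefel manifold by conjugate gradient is not reproduced, and AIRM could be substituted for the divergence below):

import numpy as np

def spd_map(X, W):
    # SPD-ML mapping f(X, W) = W^T X W: n x n SPD matrix -> m x m SPD matrix (W of size n x m, full rank).
    return W.T @ X @ W

def stein_divergence(X, Y):
    # S(X, Y) = log det((X + Y)/2) - (1/2) log det(X Y), an affine-invariant divergence on SPD matrices.
    _, ld_avg = np.linalg.slogdet((X + Y) / 2.0)
    _, ld_x = np.linalg.slogdet(X)
    _, ld_y = np.linalg.slogdet(Y)
    return ld_avg - 0.5 * (ld_x + ld_y)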
  • 60. InstituteofComputingTechnology,ChineseAcademyofSciences 60 SPD-ML  Discriminative learning  Discriminant function  Graph Embedding formalism with an affinity matrix that encodes intra-class and inter-class SPD distances  min 𝐿 𝑾 = min 𝑨𝑖𝑗 𝛿2(𝑾 𝑇 𝑿𝑖 𝑾, 𝑾 𝑇 𝑿𝑗 𝑾)𝑖𝑗  𝑠. 𝑡. 𝑾 𝑇 𝑾 = 𝑰 𝑚(orthogonality constraint)  Optimization  Optimization problems on Stiefel manifold, solved by nonlinear Conjugate Gradient (CG) method
  • 61. InstituteofComputingTechnology,ChineseAcademyofSciences 61  LEML (Log-Euclidean Metric Learning) [ICML’15]  Learning tangent map by preserving matrix symmetric structure  Metric learning: on Riemannian manifold Set model IV: statistics (COV+) [1] Z. Huang, R. Wang, S. Shan, X. Li, X. Chen. Log-Euclidean Metric Learning on Symmetric Positive Definite Manifold with Application to Image Set Classification. ICML 2015. CDL [CVPR’12] (a) (b2) (c) OR × 𝟐 × 𝟐 × 𝟐 (b1) (d) (e)
  • 62. InstituteofComputingTechnology,ChineseAcademyofSciences 62 LEML  SPD tangent map learning  Mapping function:𝐷𝐹: 𝑓 log(𝑺) = 𝑾 𝑇log(𝑺)𝑾  𝑾 is column full rank  Log-Euclidean distance in the target tangent space  𝑑 𝐿𝐸𝐷 𝑓(𝑻𝑖), 𝑓(𝑻𝑗) = ||𝑾 𝑇 𝑻𝑖 𝑾 − 𝑾 𝑇 𝑻𝑗 𝑾|| 𝐹 = 𝑡𝑟(𝑸(𝑻𝑖 − 𝑻𝑗)(𝑻𝑖 − 𝑻𝑗)) 𝑺 𝜉 𝑆 𝑇𝑺 𝕊+ 𝑑 𝑇𝐹(𝑺) 𝕊+ 𝑘 𝐷𝐹 𝑺 [𝜉 𝑆] 𝕊+ 𝑑 𝕊+ 𝑘 𝐹(𝑺) 𝐹 𝑫𝑭(𝑺) 𝛾(𝑡) 𝐹(𝛾 𝑡 ) analogy to 2DPCA 𝑸 = (𝑾𝑾 𝑇 )2 𝑻 = log(𝑺)
  • 63. InstituteofComputingTechnology,ChineseAcademyofSciences 63 LEML  Discriminative learning  Objective function  arg min 𝑄,𝜉 𝐷𝑙𝑑 𝑸, 𝑸 𝟎 + 𝜂𝐷𝑙𝑑 𝝃, 𝝃 𝟎 s. t. , tr 𝑸𝑨𝑖𝑗 𝑇 𝐀 𝑖𝑗 ≤ 𝜉 𝒄 𝑖,𝑗 , 𝑖, 𝑗 ∈ 𝑺 tr 𝑸𝑨𝑖𝑗 𝑇 𝐀 𝑖𝑗 ≥ 𝜉 𝒄 𝑖,𝑗 , (𝑖, 𝑗) ∈ 𝑫  𝐀 𝑖𝑗 = log 𝑪𝑖 − log (𝑪𝑗), 𝐷𝑙𝑑: LogDet divergence  Optimization  Cyclic Bregman projection algorithm [Bregman’1967]
  • 64. InstituteofComputingTechnology,ChineseAcademyofSciences 64 Set model IV: statistics (COV+)  Properties  The natural raw statistics of a sample set  Flexible model of multiple-order statistical information  Methods  CDL [CVPR’12]  LMKML [ICCV’13]  DARG [CVPR’15]  SPD-ML [ECCV’14]  LEML [ICML’15]  LERM [CVPR’14]  HER [CVPR’15 ]  … G1 G2 Natural raw statistics More flexible model [Shakhnarovich, ECCV’02] [Arandjelović, CVPR’05]
  • 65. InstituteofComputingTechnology,ChineseAcademyofSciences 65  LERM (Learning Euclidean-to-Riemannian Metric) [CVPR’14]  Application scenario: still-to-video face recognition  Metric learning: cross Euclidean space and Riemannian manifold [1] Z. Huang, R. Wang, S. Shan, X. Chen. Learning Euclidean-to-Riemannian Metric for Point-to-Set Classification. IEEE CVPR 2014. ℝ 𝐷 Watch list… Surveillance video Point Set Point-to-Set Classification Watch list screening Set model IV: statistics (COV+)
  • 66. InstituteofComputingTechnology,ChineseAcademyofSciences 66 LERM  Point-to-Set Classification  Euclidean points vs. Riemannian points ℳℝ 𝐷 [Hamm, ICML’08] [Harandi, CVPR’11] [Hamm, NIPS’08] [Pennec, IJCV’06] [Arsigny, SIAM’07]Point Set model Corresponding manifold: 1.Grassmann (G) 2.AffineGrassmann (A) 3.SPD (S) Euclidean space (E) Heterogeneous Linear subspace Affine hull Covariance matrixLinear subspace [Yamaguchi, FG’98] [Chien, PAMI’02] [Kim, PAMI’07] Affine hull [Vincent, NIPS’01] [Cevikalp, CVPR’10] [Zhu, ICCV’13] Covariance matrix [Wang, CVPR’12] [Vemulapalli, CVPR’13] [Lu, ICCV’13]
  • 67. InstituteofComputingTechnology,ChineseAcademyofSciences 67 LERM ℳℳℝ 𝐷ℝ 𝐷  Basic idea  Reduce Euclidean-to-Riemannian metric to classical Euclidean metric  Seek maps 𝐹,Φ to a common Euclidean subspace  𝒅 𝒙𝒊, 𝒚𝒋 = (𝑭 𝒙𝒊 − 𝚽 𝒚𝒋 ) 𝑻(𝑭 𝒙𝒊 − 𝚽 𝒚𝒋 ) ℝ 𝑑 𝐹 Φ ℝ 𝑑 𝑓 𝑔 ℝ 𝐷1 ℝ 𝐷2 Conventional Metric Learning
  • 68. InstituteofComputingTechnology,ChineseAcademyofSciences 68 LERM  Basic idea Bridge Euclidean-to-Riemannian gap  Hilbert space embedding  Adhere to Euclidean geometry  Globally encode the geometry of manifold ℳℳℝ 𝐷ℝ 𝐷 ℝ 𝑑 𝐹 ℋ 𝑦 𝝋 𝒚 𝑓𝒚Φ ℳℳ Tangent space (locally)
  • 69. InstituteofComputingTechnology,ChineseAcademyofSciences 69 LERM  Formulation  Three Mapping Modes 𝝋 𝒚 ℳ ℋ 𝑦 ℝ 𝐷 ℝ 𝑑 𝒇 𝒚 ℋ𝑥 𝜑 𝑥 𝑓𝑥 Mapping Mode 2 Double Hilbert spaces ℝ 𝑑 ℋ𝑥 𝑓𝑥 𝒇 𝒚 ℝ 𝐷 ℳ ℋ 𝑦 𝜑 𝑦 ′ 𝜑 𝑥 ′ Mapping Mode 3 +Cross-space kernel Further reduce the gap ℝ 𝐷 ℳ ℝ 𝑑 𝝋 𝒚 ℋ 𝑦 𝑓𝑥 𝒇 𝒚 Mapping Mode 1 Single Hilbert space Further explore correlations Same: Objective fun. 𝑬(𝑭, 𝝋) Different: Final maps 𝑭,𝚽 𝑭 𝚽 𝑭 𝚽 𝑭 𝚽
  • 70. InstituteofComputingTechnology,ChineseAcademyofSciences 70 LERM  e.g. Mode 1 Further explore correlations ℝ 𝑑 ℋ𝑥 𝑓𝑥 𝒇 𝒚 ℝ 𝐷 ℳ ℋ 𝑦 𝜑 𝑦 ′ 𝜑 𝑥 ′ ℝ 𝑑 ℋ𝑥 𝑓𝑥 𝒇 𝒚 ℝ 𝐷 ℳ ℋ 𝑦 𝜑 𝑦 ′ 𝜑 𝑥 ′ Distance metric: 𝒅 𝒙𝒊, y𝒋 = (𝑭 𝒙𝒊 − 𝚽 𝒚𝒋 ) 𝑻(𝑭 𝒙𝒊 − 𝚽 𝒚𝒋 ) Objective function: 𝑬(𝑭, 𝝋) min 𝐹,𝚽 { 𝑫 𝑭, 𝚽 + 𝜆1 𝑮(𝑭, 𝚽) + 𝜆2 𝑻 𝑭, 𝚽 } Distance Geometry Transformation 𝝋 𝒚 ℳ ℋ 𝑦ℋ𝑥 𝜑 𝑥 ℝ 𝐷 ℝ 𝑑 𝒇 𝒚𝑓𝑥 Final maps: 𝑭 = 𝒇 𝒙 = 𝑾 𝒙 𝑻 𝑿 𝚽 = 𝒇 𝒚 ∘ 𝝋 𝒚 = 𝑾 𝒚 𝑻 𝑲 𝒚 𝝋 𝒚 𝒊 , 𝝋 𝒚 𝒋 = 𝐾 𝑦 𝑖, 𝑗 𝐾 𝑦 𝑖, 𝑗 = exp(−𝑑2(𝑦𝑖, 𝑦𝑗)/2𝜎2) Single Hilbert space Riemannian metrics [ICML’08, NIPS’08, SIAM’06] ℝ 𝐷 ℳ ℝ 𝑑 ℋ 𝑦 𝑓𝑥 𝒇 𝒚 𝝋 𝒚 Normalize Transformation Discriminate Distance Preserve Geometry 𝑭 𝚽
  • 71. InstituteofComputingTechnology,ChineseAcademyofSciences 71  HER (Hashing across Euclidean and Riemannian) [CVPR’15]  Application scenario: video-based face retrieval  Metric learning: hamming distance learning across heter. spaces [1] Y. Li, R. Wang, Z. Huang, S. Shan, X. Chen. Face Video Retrieval with Image Query via Hashing across Euclidean Space and Riemannian Manifold. IEEE CVPR 2015. Set model IV: statistics (COV+)
  • 72. InstituteofComputingTechnology,ChineseAcademyofSciences 72 HER  Heterogeneous Hash Learning The Proposed Heterogeneous Hash Learning Method Previous Multiple Modalities Hash Learning Methods Euclidean Space BEuclidean Space A Hamming Space 001 011 111 110 100000 101 Riemannian ManifoldEuclidean Space Hamming Space 001 011 111 110 100000 101
  • 73. InstituteofComputingTechnology,ChineseAcademyofSciences 73 HER  Two-Stage Architecture  Stage-1: kernel mapping Euclidean Kernel (Image) 𝑲 𝑒 𝑖𝑗 = exp(− 𝒙𝑖 − 𝒙𝑗 2 2 2𝜎𝑒 2 ) Riemannian Kernel (Video) 𝑲 𝑟 𝑖𝑗 = exp(− log(𝔂𝑖) − log(𝔂 𝑗) 𝐹 2 2𝜎𝑟 2 )* Reproducing Kernel Hilbert Space (Euc.) Reproducing Kernel Hilbert Space (Rie.) 011 Target Hamming Space 001 111 110 100000 101 Subject A Subject B Subject C 𝜑 𝜂 [*] Y. Li, R. Wang, Z. Cui, S. Shan, X. Chen. Compact Video Code and Its Application to Robust Face Retrieval in TV-Series. BMVC 2014. (**: CVC method for video-video face retrieval) Euclidean Space (Image-Vector-𝒙𝑖) Riemannian Manifold (Video-Covariance Matrix-𝔂𝑖)
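Stage-1 of HER only needs two Gram matrices: a Euclidean RBF over image feature vectors and a log-Euclidean RBF over video covariance matrices. A sketch under those assumptions (the bandwidths sigma are free parameters):

import numpy as np

def spd_log(C):
    # Matrix logarithm of an SPD matrix (as in the earlier sketches).
    w, V = np.linalg.eigh(C)
    return (V * np.log(w)) @ V.T

def euclidean_rbf_kernel(X, sigma):
    # K_e[i, j] = exp(-||x_i - x_j||^2 / (2 sigma^2)); rows of X are image feature vectors.
    sq = np.sum(X ** 2, axis=1)
    d2 = np.maximum(sq[:, None] + sq[None, :] - 2 * X @ X.T, 0.0)
    return np.exp(-d2 / (2 * sigma ** 2))

def riemannian_rbf_kernel(covs, sigma):
    # K_r[i, j] = exp(-||log(Y_i) - log(Y_j)||_F^2 / (2 sigma^2)); covs is a list of video covariance matrices.
    L = np.stack([spd_log(C).ravel() for C in covs])
    sq = np.sum(L ** 2, axis=1)
    d2 = np.maximum(sq[:, None] + sq[None, :] - 2 * L @ L.T, 0.0)
    return np.exp(-d2 / (2 * sigma ** 2))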
  • 74. InstituteofComputingTechnology,ChineseAcademyofSciences 74 HER  Two-Stage Architecture  Stage-2: binary encoding 𝜑 𝜂 Euclidean Space (Image-Vector-𝒙𝑖) Riemannian Manifold (Video-Covariance Matrix-𝔂𝑖) Reproducing Kernel Hilbert Space (Euc.) Reproducing Kernel Hilbert Space (Rie.) Hyperplane  Discriminability  Stability 011 Target Hamming Space 001 111 110 100000 101 Subject A Subject B Subject C Hash Functions 𝝎 𝑒 Hash Functions 𝝎 𝑟
  • 75. InstituteofComputingTechnology,ChineseAcademyofSciences 75 HER  Binary encoding [Rastegari, ECCV’12] Discriminability 𝐸𝑒 = 𝑑(𝑩 𝑒 𝑚 , 𝑩 𝑒 𝑛 ) 𝑚,𝑛∈𝑐𝑐∈{1:𝐶} − 𝜆 𝑒 𝑑 𝑩 𝑒 𝑝 , 𝑩 𝑒 𝑞 𝑐2∈ 1:𝐶 𝑐1≠𝑐2,𝑞∈𝑐2 𝑐1∈ 1:𝐶 𝑝∈𝑐1 𝐸𝑟 = 𝑑(𝑩 𝑟 𝑚 , 𝑩 𝑟 𝑛 ) 𝑚,𝑛∈𝑐𝑐∈{1:𝐶} − 𝜆 𝑟 𝑑(𝑩 𝑟 𝑝 , 𝑩 𝑟 𝑞 ) 𝑐2∈{1:𝐶} 𝑐1≠𝑐2,𝑞∈𝑐2 𝑐1∈{1:𝐶} 𝑝∈𝑐1 𝐸𝑒𝑟 = 𝑑(𝑩 𝑒 𝑚 , 𝑩 𝑟 𝑛 ) 𝑚,𝑛∈𝑐𝑐∈{1:𝐶} − 𝜆 𝑒𝑟 𝑑(𝑩 𝑒 𝑝 , 𝑩 𝑟 𝑞 ) 𝑐2∈{1:𝐶} 𝑐1≠𝑐2,𝑞∈𝑐2 𝑐1∈{1:𝐶} 𝑝∈𝑐1 Objective Function min 𝝎 𝑒,𝝎 𝑟,𝝃 𝑒,𝝃 𝑟,𝑩 𝑒,𝑩 𝑟 𝜆1 𝐸𝑒 + 𝜆2 𝐸𝑟 + 𝜆3 𝐸𝑒𝑟 + 𝛾1 𝝎 𝑒 𝑘 2 𝑘∈{1:𝐾} + 𝐶1 𝝃 𝑒 𝑘𝑖 𝑘∈{1:𝐾} 𝑖∈{1:𝑁} + 𝛾2 𝝎 𝑟 𝑘 2 𝑘∈{1:𝐾} + 𝐶2 𝝃 𝑟 𝑘𝑖 𝑘∈{1:𝐾} 𝑖∈{1:𝑁} s.t. 𝑩 𝑒 𝑘𝑖 = 𝑠𝑔𝑛 𝝎 𝑒 𝑘 𝑇 𝜑 𝒙𝑖 , ∀𝑘 ∈ 1: 𝐾 , ∀𝑖 ∈ 1: 𝑁 𝑩 𝑟 𝑘𝑖 = 𝑠𝑔𝑛 𝝎 𝑟 𝑘 𝑇 𝜂 𝔂𝑖 , ∀𝑘 ∈ 1: 𝐾 , ∀𝑖 ∈ 1: 𝑁 𝑩 𝑟 𝑘𝑖 𝝎 𝑒 𝑘 𝑇 𝜑 𝒙𝑖 ≥ 1 − 𝝃 𝑒 𝑘𝑖 , 𝝃 𝑒 𝑘𝑖 > 0, ∀𝑘 ∈ 1: 𝐾 , ∀𝑖 ∈ 1: 𝑁 𝑩 𝑒 𝑘𝑖 𝝎 𝑟 𝑘 𝑇 𝜂 𝔂𝑖 ≥ 1 − 𝝃 𝑟 𝑘𝑖 , 𝝃 𝑟 𝑘𝑖 > 0, ∀𝑘 ∈ 1: 𝐾 , ∀𝑖 ∈ 1: 𝑁 Compatibility (Cross-training) Stability (SVM) Discriminability (LDA)
  • 77. InstituteofComputingTechnology,ChineseAcademyofSciences 77 Evaluations  Two YouTube datasets  YouTube Celebrities (YTC) [Kim, CVPR’08]  47 subjects, 1910 videos from YouTube  YouTube FaceDB (YTF) [Wolf, CVPR’11]  3425 videos, 1595 different people YTC YTF
  • 78. InstituteofComputingTechnology,ChineseAcademyofSciences 78 Evaluations  COX Face [Huang, ACCV’12/TIP’15]  1,000 subjects  each has 1 high-quality image and 3 unconstrained video sequences Images Videos http://vipl.ict.ac.cn/resources/datasets/cox-face-dataset/cox-face
  • 79. InstituteofComputingTechnology,ChineseAcademyofSciences 79 Evaluations  PaSC [Beveridge, BTAS’13]  Control videos  1 mounted video camera  1920*1080 resolution  Handheld videos  5 handheld video cameras  640*480~1280*720 resolution Control video Handheld video
  • 80. InstituteofComputingTechnology,ChineseAcademyofSciences 80 Evaluations  Results (reported in our DARG paper*) [*] W. Wang, R. Wang, Z. Huang, S. Shan, X. Chen. Discriminant Analysis on Riemannian Manifold of Gaussian Distributions for Face Recognition with Image Sets. IEEE CVPR 2015.
  • 82. InstituteofComputingTechnology,ChineseAcademyofSciences 82 Evaluations  Performance on PaSC Challenge (IEEE FG’15)  HERML-DeLF  DCNN learned image feature  Hybrid Euclidean and Riemannian Metric Learning* [*] Z. Huang, R. Wang, S. Shan, X. Chen. Hybrid Euclidean-and-Riemannian Metric Learning for Image Set Classification. ACCV 2014. (**: the key reference describing the method used for the challenge)
  • 83. InstituteofComputingTechnology,ChineseAcademyofSciences 83 Evaluations  Performance on EmotiW Challenge (ACM ICMI’14)*  Combination of multiple statistics for video modeling  Learning on the Riemannian manifold [*] M. Liu, R. Wang, S. Li, S. Shan, Z. Huang, X. Chen. Combining Multiple Kernel Methods on Riemannian Manifold for Emotion Recognition in the Wild. ACM ICMI 2014.
  • 85. InstituteofComputingTechnology,ChineseAcademyofSciences 85 Route map of our methods MMD CVPR’08/TIP’12 MDA CVPR’09 CDL CVPR’12 DARG CVPR’15 LERM CVPR’14 CVC BMVC’14 LEML/PML ICML’15/CVPR’15 HER CVPR’15 Complex distribution Large amount of data Appearance manifold Natural raw statistics No assum. of data dist. Covariance matrix binary heter.
  • 86. InstituteofComputingTechnology,ChineseAcademyofSciences 86 Summary  What we learn from current studies  Set modeling  Linear(/affine) subspace  Manifold  Statistics  Set matching  Non-discriminative  Discriminative Metric learning  Euclidean space  Riemannian manifold  Future directions  More flexible set modeling for different scenarios  Multi-model combination  Learning method should be more efficient  Set-based vs. sample-based?
  • 87. InstituteofComputingTechnology,ChineseAcademyofSciences 87 Additional references (not listed above)  [Arandjelović, CVPR’05] O. Arandjelović, G. Shakhnarovich, J. Fisher, R. Cipolla, and T. Darrell. Face Recognition with Image Sets Using Manifold Density Divergence. IEEE CVPR 2005.  [Chien, PAMI’02] J. Chien and C. Wu. Discriminant waveletfaces and nearest feature classifiers for face recognition. IEEE T-PAMI 2002.  [Rastegari, ECCV’12] M. Rastegari, A. Farhadi, and D. Forsyth. Attribute discovery via predictable discriminative binary codes. ECCV 2012.  [Shakhnarovich, ECCV’02] G. Shakhnarovich, J. W. Fisher, and T. Darrell. Face Recognition from Long-term Observations. ECCV 2002.  [Vemulapalli, CVPR’13] R. Vemulapalli, J. K. Pillai, and R. Chellappa. Kernel learning for extrinsic classification of manifold features. IEEE CVPR 2013.  [Vincent, NIPS’01] P. Vincent and Y. Bengio. K-local hyperplane and convex distance nearest neighbor algorithms. NIPS 2001.
  • 88. InstituteofComputingTechnology,ChineseAcademyofSciences 8888 Thanks, Q & A Xilin ChenShiguang ShanRuiping Wang Lab of Visual Information Processing and Learning (VIPL) @ICT@CAS Codes of our methods available at: http://vipl.ict.ac.cn/resources/codes Zhiwu Huang Yan Li Wen Wang Mengyi Liu