Presenter: Yang Liu, Hong Kong Baptist University, Hong Kong
Paper: http://ceur-ws.org/Vol-1984/Mediaeval_2017_paper_20.pdf
Authors: Yang Liu, Zhonglei Gu, Tobey H. Ko
Abstract: In this paper, we describe our model designed for automatic prediction of media interestingness. Specifically, a two-stage learning framework is proposed. In the first stage, supervised dimensionality reduction is employed to discover the key discriminant information embedded in the original feature space. We present a new algorithm dubbed biased discriminant embedding (BDE) to extract discriminant features with discrete labels and use supervised manifold regression (SMR) to extract discriminant features with continuous labels. In the second stage, SVM is utilized for prediction. Experimental results validate the effectiveness of our approaches.
MediaEval 2017 - Interestingness Task: Predicting Media Interestingness via Biased Discriminant Embedding and Supervised Manifold Regression
1. Predicting Media Interestingness via Biased Discriminant Embedding and Supervised Manifold Regression
Yang Liu1,2, Zhonglei Gu1, Tobey H. Ko3
1Department of Computer Science, Hong Kong Baptist University
2Institute of Research and Continuing Education, Hong Kong Baptist University
3Department of Industrial and Manufacturing Systems Engineering, University of Hong Kong
2. Our System
Feature Extraction + Prediction
Step 1: Supervised Feature Extraction
Biased Discriminant Embedding (BDE) for discrete labels
Supervised Manifold Regression (SMR) for continuous labels
Step 2: Classification/Regression
v-Support Vector Classification (v-SVC)
ε-Support Vector Regression (ε-SVR)
3. Feature Extraction for Discrete Labels: Biased Discriminant Embedding
Idea: The biased discriminant information in the reduced subspace should be maximized. (Motivation: we are probably more interested in the interesting class than in the non-interesting one.)
Notation:
W: transformation matrix to be learned
S_b: biased between-class scatter matrix
S_w: biased within-class scatter matrix
Formulation: W* = arg max_W tr(W^T S_b W) / tr(W^T S_w W)
Solution: The problem can be solved by eigen-decomposition (of S_w^{-1} S_b)
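The recipe on this slide (build biased scatter matrices, then keep the top eigenvectors) can be sketched in a few lines of NumPy. The scatter definitions below are one common biased variant (only the interesting class contributes within-class scatter; negatives are scattered against the interesting-class mean), assumed for illustration rather than taken from the paper:

```python
import numpy as np

def biased_scatter_matrices(X, y, positive_label=1):
    # Biased scatter: the "interesting" class gets a within-class scatter,
    # while non-interesting samples are scattered against the interesting mean.
    Xp = X[y == positive_label]
    Xn = X[y != positive_label]
    mu_p = Xp.mean(axis=0)
    Sw = (Xp - mu_p).T @ (Xp - mu_p)      # within-class scatter (interesting)
    Sb = (Xn - mu_p).T @ (Xn - mu_p)      # negatives vs. interesting-class mean
    return Sb, Sw

def bde_transform(X, y, dim, reg=1e-6):
    Sb, Sw = biased_scatter_matrices(X, y)
    Sw += reg * np.eye(Sw.shape[0])       # regularize so Sw is invertible
    # Maximize tr(W^T Sb W) / tr(W^T Sw W): top eigenvectors of Sw^{-1} Sb
    evals, evecs = np.linalg.eig(np.linalg.solve(Sw, Sb))
    order = np.argsort(-evals.real)
    return evecs[:, order[:dim]].real     # D x dim transformation matrix W
```

The reduced features are then simply `Z = X @ W`.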
4. Feature Extraction for Continuous Labels: Supervised Manifold Regression
Idea: The more similar the interestingness levels of two data samples, the closer the two feature vectors should be in the learned subspace.
Notation:
W: transformation matrix to be learned
x_i: the i-th data sample
S_ij: interestingness similarity, i.e., label similarity
A_ij: data similarity in the original space
Formulation: W* = arg min_W Σ_{i,j} ||W^T x_i − W^T x_j||² S_ij
Solution: The problem can be solved by eigen-decomposition
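A graph-Laplacian sketch of this objective, in the style of locality-preserving projections: minimizing the weighted pairwise distances reduces to the smallest eigenvectors of a generalized eigenproblem. The Gaussian label-similarity weight and the `sigma` parameter below are illustrative assumptions, not necessarily the paper's exact choices:

```python
import numpy as np

def smr_transform(X, y, dim, sigma=1.0, reg=1e-6):
    # Label-similarity graph: pairs with close interestingness levels get
    # high weight (a common Gaussian choice for S_ij).
    n, D = X.shape
    S = np.exp(-np.subtract.outer(y, y) ** 2 / (2 * sigma ** 2))
    L = np.diag(S.sum(axis=1)) - S        # graph Laplacian of the label graph
    A = X.T @ L @ X                       # objective: minimize tr(W^T A W)
    B = X.T @ X + reg * np.eye(D)         # scale constraint, regularized
    # Smallest eigenvectors of the generalized problem A w = lambda B w
    evals, evecs = np.linalg.eig(np.linalg.solve(B, A))
    order = np.argsort(evals.real)        # ascending: smallest first
    return evecs[:, order[:dim]].real     # D x dim transformation matrix W
```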
5. Media Interestingness Prediction
Predicting Interesting/Non-interesting
Formulated as a binary classification problem
v-Support Vector Classification (v-SVC)
Predicting Interestingness levels
Formulated as a real-valued regression problem
ε-Support Vector Regression (ε-SVR)
6. Experimental Setting 1
Image feature vectors: 1299-D
128-D color histogram features, 300-D dense SIFT features, 512-D GIST features, 300-D HOG (2x2) features, and 59-D LBP features
Video feature vectors: 2598-D
Calculate the average and standard deviation over all frames of the corresponding shot, and concatenate these two 1299-D feature vectors
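The shot-level descriptor construction maps directly to code: given a (num_frames x 1299) array of per-frame image features, the 2598-D video vector is the concatenation of the per-dimension mean and standard deviation.

```python
import numpy as np

def shot_feature(frame_feats):
    # frame_feats: (num_frames, 1299) array of per-frame image features.
    # Concatenate per-dimension mean and standard deviation over the shot's
    # frames to obtain the 2598-D shot-level descriptor.
    return np.concatenate([frame_feats.mean(axis=0), frame_feats.std(axis=0)])
```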
7. Experimental Setting 2
Run 1 of image data:
Use the normalized 1299-D feature vector as the input of the SVM
Runs 2-5 of image data:
Reduce the original data to 23-D, 25-D, 26-D, and 27-D subspaces via BDE (for discrete labels) and SMR (for continuous labels), respectively
Run 1 of video data:
Use the normalized 2598-D feature vector as the input of the SVM
Runs 2-5 of video data:
Reduce the original data to 23-D, 25-D, 26-D, and 27-D subspaces via BDE (for discrete labels) and SMR (for continuous labels), respectively
8. Experimental Setting 3
𝜈-SVC
Use an RBF kernel with 𝜈 = 0.1 and 𝑔𝑎𝑚𝑚𝑎 = 100
(for image data) / 64 (for video data).
ε-SVR
Use an RBF kernel with 𝑐𝑜𝑠𝑡 = 1, 𝜖 = 0.01, and 𝛾 =
1/𝐷
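These two predictors and hyperparameters map directly onto scikit-learn's NuSVC and SVR (where `gamma="auto"` is exactly 1/D). The toy data below is only there to make the sketch runnable:

```python
import numpy as np
from sklearn.svm import NuSVC, SVR

rng = np.random.default_rng(0)
X = rng.normal(size=(40, 26))                  # e.g. a 26-D reduced subspace
y_cls = np.arange(40) % 2                      # interesting / non-interesting
y_lvl = X[:, 0] + 0.1 * rng.normal(size=40)    # continuous interestingness level

# nu-SVC: RBF kernel, nu = 0.1, gamma = 100 (the image-data setting)
clf = NuSVC(nu=0.1, kernel="rbf", gamma=100).fit(X, y_cls)

# epsilon-SVR: RBF kernel, C = 1, epsilon = 0.01, gamma = 1/D ("auto")
reg = SVR(kernel="rbf", C=1.0, epsilon=0.01, gamma="auto").fit(X, y_lvl)
```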
9. Results and Observations
Observation: For image data, the reduced features perform better than the
original ones
Subspaces learned by BDE and SMR capture important discriminant
information
Observation: For video data, the performance of reduced features is slightly
worse than that of the original ones
Video data are more complex than image data so that such a low-dimensional
representation cannot fully capture the key discriminant information embedded in
the original space.
11. Results and Observations
The contribution of the i-th dimension is defined as
c_i = Σ_j λ_j |w_ij|,
where λ_j denotes the j-th eigenvalue, w_ij denotes the (i, j)-th element of W, and |·| denotes the absolute value operator.
Observation 1: Color histogram and LBP features contribute more than the others, while the GIST features contribute the least in the discrete prediction task (Figures 1(a) and 1(c)).
Observation 2: Color histogram and GIST features contribute the most among the five feature sets in the continuous prediction task (Figures 1(b) and 1(d)).
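The contribution formula is a single eigenvalue-weighted sum of absolute loadings per input dimension; a minimal sketch (the W and eigenvalues here are stand-ins for the learned projection):

```python
import numpy as np

def dimension_contributions(W, eigvals):
    # c_i = sum_j lambda_j * |w_ij|: each input dimension's absolute loadings,
    # weighted by the eigenvalue (importance) of each projection direction.
    return np.abs(W) @ np.asarray(eigvals)
```

Summing c_i over the index ranges of each feature block (color histogram, dense SIFT, GIST, HOG, LBP) then gives the per-feature-set contributions compared in Figure 1.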
12. Outlook
Improve the performance of video interestingness prediction by incorporating the video's temporal information.
Refine the human-labeled ground truth (especially in the continuous case) via machine learning techniques.
The interestingness ground truth (labels) is provided by human annotators; it generally varies from individual to individual and is somewhat subjective.
13. Q & A
Thank You !