UNIFESP Predicting Media Interestingness Using Motion Histograms

UNIFESP at MediaEval 2016:
Predicting Media Interestingness Task
Jurandy Almeida
GIBIS Lab, Institute of Science and Technology, Federal University of São Paulo – UNIFESP
jurandy.almeida@unifesp.br
Introduction
• Developed for the MediaEval 2016 Predicting Media Interestingness Task, and for its video subtask only.
• The goal is to automatically select the video segments that a common viewer would find most interesting.
• The focus is on features derived from audio-visual content or associated textual information.
Proposed Approach
It combines learning-to-rank algorithms and exploits visual information:
1. A simple histogram of motion patterns is used for processing visual information.
2. A majority voting scheme is used for combining machine-learned rankers and predicting the interestingness of videos.
Visual Features
• Low-Level & Mid-Level Features: Not used
• An algorithm is applied to encode visual properties from video segments:
– “Comparison of Video Sequences with Histograms of Motion Patterns” [1].
• It relies on three steps:
1. partial decoding;
2. feature extraction;
3. signature generation.
[Figure: Histograms of Motion Patterns (HMP). Step 1, Partial Decoding: Video Frames, I-frames, Macroblock, Pixel Block, DC coefficient. Step 2, Feature Extraction: Time Series of Macroblocks over the Previous, Current, and Next frames, with Temporal and Spatial components forming a Motion Pattern (e.g., 0101100110010011). Step 3, Signature Generation: Histogram Distribution.]
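The three steps above can be illustrated with a minimal sketch. It assumes that partial decoding has already produced, for each I-frame, a list of blocks holding four DC coefficients each. The bit-encoding rule used here (two temporal comparisons per coefficient, giving an 8-bit pattern) is a simplified stand-in chosen for illustration; it is not the encoding defined in [1], whose patterns appear to be 16 bits long. The names `motion_pattern` and `hmp_signature` are hypothetical.

```python
from collections import Counter

def motion_pattern(prev, curr, nxt):
    """Encode one block position as an 8-bit integer.

    Assumption: each argument is a tuple of 4 DC coefficients for the same
    block position in the previous, current, and next I-frame. The comparison
    rule is illustrative only and is not taken from [1].
    """
    bits = 0
    for p, c, n in zip(prev, curr, nxt):
        bits = (bits << 1) | (1 if c > p else 0)  # brighter than previous frame?
        bits = (bits << 1) | (1 if c > n else 0)  # brighter than next frame?
    return bits

def hmp_signature(frames, num_patterns=256):
    """Build a normalized histogram of motion patterns.

    frames: list of I-frames; each I-frame is a list of blocks;
    each block is a tuple of 4 DC coefficients.
    """
    counts = Counter()
    # Slide a window of three consecutive I-frames over the sequence.
    for t in range(1, len(frames) - 1):
        for prev, curr, nxt in zip(frames[t - 1], frames[t], frames[t + 1]):
            counts[motion_pattern(prev, curr, nxt)] += 1
    total = sum(counts.values())
    if total == 0:
        return [0.0] * num_patterns
    return [counts[k] / total for k in range(num_patterns)]

# Toy input: three I-frames with a single block each (values are arbitrary).
frames = [
    [(106, 111, 100, 88)],
    [(91, 94, 95, 90)],
    [(90, 93, 96, 91)],
]
signature = hmp_signature(frames)
```

The resulting fixed-length vector can be fed to any of the rankers regardless of segment duration, which is the main appeal of a histogram-based signature.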
Learning to Rank Strategies
• Ranking SVM [5]: Uses the traditional SVM classifier to learn a ranking function.
• RankNet [2]: Uses probability distribution metrics as cost functions to be optimized.
• RankBoost [4]: Uses the regression error on weighted distributions of pairwise rankings.
• ListNet [3]: Extension of RankNet that uses a ranked list instead of pairwise rankings.
• Majority Voting [6]: The label with the most votes is selected as the label for a given instance.
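To make the RankNet entry above more concrete, the sketch below computes the pairwise cross-entropy cost from [2] for a single pair of items. The modeled probability that item i ranks above item j is the logistic function of the score difference, and the cost compares it with a target probability. The function name `ranknet_pair_cost` and the default target are illustrative choices, not part of the poster.

```python
import math

def ranknet_pair_cost(score_i, score_j, target_prob=1.0):
    """Cross-entropy cost for one pair of scores.

    With o = score_i - score_j and P = 1 / (1 + exp(-o)), the cost is
    -target * log(P) - (1 - target) * log(1 - P),
    which simplifies to log(1 + exp(o)) - target * o.
    """
    o = score_i - score_j
    # Numerically stable log(1 + exp(o)).
    softplus = max(o, 0.0) + math.log1p(math.exp(-abs(o)))
    return softplus - target_prob * o

# A correctly ordered pair costs less than a wrongly ordered one.
good = ranknet_pair_cost(5.0, 0.0)
bad = ranknet_pair_cost(0.0, 5.0)
```

Training a ranker then amounts to adjusting the scoring function so that the sum of these costs over all labeled pairs decreases.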
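[Figure: Combining rankings. The Input is passed to the rankers R1, R2, ..., RN, which produce the outputs O1, O2, ..., ON; these are combined into the final Output ô.]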
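The fusion step can be sketched as follows, assuming each ranker emits a binary label per video segment (1 for interesting, 0 otherwise). The tie-breaking rule, which falls back to 0 when the votes are evenly split, is an assumption made for this example; the poster does not say how ties are resolved. The name `majority_vote` is hypothetical.

```python
def majority_vote(ranker_labels):
    """Fuse binary labels from several rankers.

    ranker_labels: list with one entry per ranker; each entry is a list of
    0/1 labels, one per video segment, all of the same length.
    Assumption: an even split counts as "not interesting" (0).
    """
    fused = []
    for votes in zip(*ranker_labels):
        # A segment is labeled 1 only when strictly more than half agree.
        fused.append(1 if 2 * sum(votes) > len(votes) else 0)
    return fused

# Four rankers voting on three segments.
outputs = [
    [1, 0, 1],
    [1, 1, 0],
    [0, 0, 1],
    [1, 0, 0],
]
fused = majority_vote(outputs)
```

With an even number of rankers, as in the four used here, ties can occur, so the tie-breaking choice affects the final labels.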
Experimental Protocol
• 4-fold cross validation
• Development data
– 5,054 videos from 52 movie trailers
• Test data
– 2,342 videos from 26 movie trailers
• Mean Average Precision (MAP)
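For reference, the metric can be sketched as below, assuming that Average Precision is computed per movie trailer over the segments sorted by predicted interestingness and that MAP is the mean over trailers. This is the standard definition; the official evaluation tool may differ in details such as tie handling. The function names and the toy data are illustrative.

```python
def average_precision(ranked_labels):
    """AP for one ranked list of 0/1 ground-truth labels (best-ranked first)."""
    hits = 0
    precision_sum = 0.0
    for rank, label in enumerate(ranked_labels, start=1):
        if label:
            hits += 1
            precision_sum += hits / rank  # precision at this relevant item
    return precision_sum / hits if hits else 0.0

def mean_average_precision(per_trailer):
    """per_trailer: dict mapping a trailer id to its ranked 0/1 label list."""
    if not per_trailer:
        return 0.0
    return sum(average_precision(v) for v in per_trailer.values()) / len(per_trailer)

# Toy example with two hypothetical trailers.
toy = {"trailer-a": [1, 0, 1, 0], "trailer-b": [0, 1]}
toy_map = mean_average_precision(toy)
```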
Configurations of Runs
Run   Learning-to-Rank Strategy
1     Ranking SVM
2     RankNet
3     RankBoost
4     ListNet
5     Majority Voting
Experimental Results
[Figures: Two bar charts of MAP (%) per run (Ranking SVM, RankNet, RankBoost, ListNet, Majority Voting): results obtained on the development data, and results of the official submitted runs. Bar values recovered from the extraction, in extraction order: 18.15, 16.17, 16.17, 16.56, 14.35.]
[Figure: AP per movie trailer achieved in each run. Horizontal axis: test trailers video-52 to video-77. Vertical axis: Average Precision (%), from 0 to 70. Series: Ranking SVM, RankNet, RankBoost, ListNet, Majority Voting.]
The learning-to-rank algorithms provide complementary information that can be combined by fusion techniques to produce better results.
Remarks
• The proposed approach has explored only visual properties. Different learning-to-rank strategies were considered, including a fusion of all of them.
• Results demonstrate that the proposed approach is promising. Combining learning-to-rank algorithms can contribute to better results.
Future Work
Future work includes investigating a smarter strategy for combining learning-to-rank algorithms and considering other information sources to include more features semantically related to visual content.
Acknowledgements
This research was supported by Brazilian agencies FAPESP, CAPES, and CNPq.
References
[1] J. Almeida, N. J. Leite, and R. S. Torres. Comparison of video sequences with Histograms of Motion Patterns. In ICIP, pages 3673–3676, 2011.
[2] C. J. C. Burges, T. Shaked, E. Renshaw, A. Lazier, M. Deeds, N. Hamilton, and G. N. Hullender. Learning to rank using gradient descent. In ICML, pages 89–96, 2005.
[3] Z. Cao, T. Qin, T.-Y. Liu, M.-F. Tsai, and H. Li. Learning to rank: from pairwise approach to listwise approach. In ICML, pages 129–136, 2007.
[4] Y. Freund, R. D. Iyer, R. E. Schapire, and Y. Singer. An efficient boosting algorithm for combining preferences. Journal of Machine Learning Research, 4:933–969, 2003.
[5] T. Joachims. Training linear SVMs in linear time. In ACM SIGKDD, pages 217–226, 2006.
[6] L. Lam and C. Y. Suen. Application of majority voting to pattern recognition: an analysis of its behavior and performance. IEEE Trans. Systems, Man, and Cybernetics, Part A, 27(5):553–568, 1997.
