Presenter: Emmanuel Dellandréa, Ecole Centrale de Lyon, France
Paper: http://ceur-ws.org/Vol-1984/Mediaeval_2017_paper_5.pdf
Video: https://youtu.be/nOC83JXWS2E
Authors: Emmanuel Dellandrea, Martijn Huigsloot, Liming Chen, Yoann Baveye, Mats Sjoberg
Abstract: This paper provides a description of the MediaEval 2017 “Emotional Impact of Movies” task, which builds on previous years’ editions. In this year’s task, participants are expected to create systems that automatically predict the emotional impact that video content will have on viewers, in terms of valence, arousal and fear. Here we describe the use case, task challenges, dataset and ground truth, task run requirements, and evaluation metrics.
3. Context
An evolution of previous years’ tasks on violence, affect and emotion prediction from videos
Applications:
Personalized content delivery
Movie recommendation
Video editing supervision
Video summarization
Protection of children from potentially harmful content
MediaEval'17, 13-15 September 2017, Dublin, Ireland
4. Task description
Goal: Deploy multimedia features and models to
automatically predict the emotional impact of movies
Emotion considered in terms of induced valence,
arousal and fear
Long movies are considered, and the emotional impact has to be predicted for consecutive 10-second segments sliding over the whole movie with a shift of 5 seconds
Local prediction of emotion
Should make it possible to benefit from the audio-visual context and temporal dependencies
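The segmentation described above (10-second windows with a 5-second shift) can be sketched as follows; this is a minimal illustration of the windowing scheme, not official task code:

```python
def segment_starts(duration_s, window_s=10, shift_s=5):
    """Start times (in seconds) of consecutive windows sliding over a movie.

    Only windows that fit entirely within the movie are kept.
    """
    starts = []
    t = 0
    while t + window_s <= duration_s:
        starts.append(t)
        t += shift_s
    return starts

# e.g. a 30-second clip yields five overlapping 10 s segments
print(segment_starts(30))  # [0, 5, 10, 15, 20]
```

Because consecutive windows overlap by 5 seconds, a model can exploit the shared audio-visual context between neighboring segments.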
6. Run submissions and evaluation
Up to 5 runs for each subtask
Models can rely on the features provided by the organizers or on any other external data
Standard evaluation metrics:
Valence/arousal prediction subtask (regression problem): Mean Squared Error, Pearson’s Correlation Coefficient
Fear prediction subtask (binary classification problem): Accuracy, Precision, Recall and F1-score
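As a minimal sketch, these standard metrics can be written in plain Python (illustrative only; participants would typically use library implementations such as scikit-learn):

```python
import math

def mse(y_true, y_pred):
    """Mean Squared Error for the valence/arousal regression subtask."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)

def pearson(y_true, y_pred):
    """Pearson's Correlation Coefficient between predictions and ground truth."""
    n = len(y_true)
    mt, mp = sum(y_true) / n, sum(y_pred) / n
    cov = sum((t - mt) * (p - mp) for t, p in zip(y_true, y_pred))
    var_t = sum((t - mt) ** 2 for t in y_true)
    var_p = sum((p - mp) ** 2 for p in y_pred)
    return cov / math.sqrt(var_t * var_p)

def fear_metrics(y_true, y_pred):
    """Accuracy, Precision, Recall, F1 for binary fear prediction (1 = fear)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    acc = (tp + tn) / len(y_true)
    prec = tp / (tp + fp) if tp + fp else 0.0
    rec = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * prec * rec / (prec + rec) if prec + rec else 0.0
    return acc, prec, rec, f1
```

Note that for an unbalanced fear/no-fear distribution (see the conclusion), accuracy alone is misleading, which is why precision, recall and F1-score are also reported.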
7. Dataset: LIRIS-ACCEDE
Development set:
30 movies selected among 160 movies under Creative Commons licenses
Duration between 117s and 4,566s (total duration: ~7 hours)
Continuous induced valence and arousal self-assessments
Test set:
14 other movies selected among the same set of 160 movies
Duration between 210s and 6,260s (total duration: ~8 hours)
Audio and visual features provided:
1582 general-purpose audio features
11 types of visual features (VGG16, LBP, ACC, Tamura, ...)
LIRIS-ACCEDE available at:
http://liris-accede.ec-lyon.fr
24. Conclusion
Participants’ approaches provided encouraging results (better than last year for valence/arousal prediction)
Arousal generally better predicted than valence
(consistent with the literature)
Some submissions rely on features/models to cope with
temporal dependencies
Only half of the registered participants submitted runs
→ is the task too difficult?
Both subtasks remain particularly challenging
High subjectivity of emotions
Unbalanced data for fear prediction