
The MediaEval 2017 Emotional Impact of Movies Task (Overview)


Presenter: Emmanuel Dellandréa, Ecole Centrale de Lyon, France

Paper: http://ceur-ws.org/Vol-1984/Mediaeval_2017_paper_5.pdf

Video: https://youtu.be/nOC83JXWS2E

Authors: Emmanuel Dellandrea, Martijn Huigsloot, Liming Chen, Yoann Baveye, Mats Sjoberg

Abstract: This paper provides a description of the MediaEval 2017 “Emotional Impact of Movies Task”. It continues to build on previous years’ editions. In this year’s task, participants are expected to create systems that automatically predict the emotional impact that video content will have on viewers, in terms of valence, arousal and fear. Here we provide a description of the use case, task challenges, dataset and ground truth, task run requirements and evaluation metrics.


  1. MediaEval 2017 Emotional Impact of Movies Task. Organizers: Emmanuel Dellandréa, Martijn Huigsloot, Liming Chen, Yoann Baveye, Mats Sjöberg. Contact: Emmanuel Dellandréa – emmanuel.dellandrea@ec-lyon.fr. MediaEval'17, 13-15 September 2017, Dublin, Ireland.
  2. Task overview → Participants are expected to create systems that automatically predict the emotional impact that video content will have on viewers, in terms of valence, arousal and fear.
  3. Context: an evolution of previous years' tasks on violence, affect and emotion prediction from videos. Applications: personalized content delivery, movie recommendation, video editing supervision, video summarization, protection of children from potentially harmful content.
  4. Task description. Goal: deploy multimedia features and models to automatically predict the emotional impact of movies. Emotion is considered in terms of induced valence, arousal and fear. Long movies are considered, and the emotional impact has to be predicted for consecutive 10-second segments sliding over the whole movie with a shift of 5 seconds. This local prediction of emotion should make it possible to benefit from the audio-visual context and temporal dependencies.
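The 10-second / 5-second sliding segmentation described on slide 4 can be sketched as follows (a minimal illustration; the function name and its boundary handling are assumptions, not the organizers' exact tooling):

```python
def segment_times(duration, win=10.0, shift=5.0):
    """Yield (start, end) times of consecutive windows sliding over a movie.

    Assumption: windows start every `shift` seconds and a window is kept
    only if it fits entirely within the movie; the task paper defines the
    exact boundary handling.
    """
    t = 0.0
    while t + win <= duration:
        yield (t, t + win)
        t += shift

# For a 30-second clip this yields windows starting at 0, 5, 10, 15 and 20 s.
```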
  5. Task description. Two subtasks: Valence/arousal prediction: predict a score of expected valence and arousal for each consecutive 10-second segment. Fear prediction: predict for each consecutive segment whether it is likely to induce fear or not. Targeted use case: the prediction of frightening scenes, to help systems protect children from potentially harmful video.
  6. Run submissions and evaluation. Up to 5 runs for each subtask. Models can rely on the features provided by the organizers or on any other external data. Standard evaluation metrics: for the valence/arousal prediction subtask (a regression problem), Mean Squared Error and Pearson's Correlation Coefficient; for the fear prediction subtask (a binary classification problem), Accuracy, Precision, Recall and F1-score.
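The two regression metrics can be computed as below (a plain-Python sketch for clarity; in practice a library implementation such as scipy's would be used):

```python
import math

def mse(y_true, y_pred):
    """Mean Squared Error over per-segment scores."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)

def pearson_cc(x, y):
    """Pearson's Correlation Coefficient between two score sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)
```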
  7. Dataset: LIRIS-ACCEDE. Development set: 30 movies selected among 160 movies under Creative Commons licenses; duration between 117 s and 4,566 s (total duration: ~7 hours); continuous induced valence and arousal self-assessments. Test set: 14 other movies selected from the same set of 160 movies; duration between 210 s and 6,260 s (total duration: ~8 hours). Audio and visual features provided: 1,582 general-purpose audio features and 11 types of visual features (VGG16, LBP, ACC, Tamura, ...). LIRIS-ACCEDE available at: http://liris-accede.ec-lyon.fr
  8. Ground truth. Valence/arousal prediction subtask: induced valence and arousal self-assessments from 16 annotators, collected with a modified GTrace interface and a joystick → arousal and valence values for consecutive 10-second segments sliding over the whole movie with a shift of 5 seconds.
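One plausible way to turn the annotator self-assessments into per-segment values is to average across annotators over each 10-second window (a hypothetical sketch: the per-second trace layout and plain averaging are assumptions, not the paper's actual post-processing):

```python
def segment_ground_truth(annotations, win=10, shift=5):
    """annotations: one list of per-second values per annotator
    (assumed layout). Returns, for each 10 s window shifted by 5 s,
    the value averaged over time and over all annotators."""
    n = len(annotations[0])  # movie length in seconds
    out = []
    t = 0
    while t + win <= n:
        vals = [a[s] for a in annotations for s in range(t, t + win)]
        out.append(sum(vals) / len(vals))
        t += shift
    return out
```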
  9. Ground truth. Fear prediction subtask: use of a tool specifically designed for the classification of audio-visual media (NICAM*). Annotations made by two highly experienced NICAM team members trained in the classification of media. Each movie was annotated by one annotator reporting the start and stop times of each sequence in the movie expected to induce fear → segments labeled as fear (value 1) if they intersect one of the fear sequences, and as not fear (value 0) otherwise. (* Netherlands Institute for the Classification of Audio-visual Media)
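The intersection rule for fear labels can be expressed directly (a minimal sketch assuming half-open time intervals; the function name is illustrative):

```python
def label_segments(segments, fear_spans):
    """Label each (start, end) segment 1 if it intersects any annotated
    fear span, else 0. Intervals are treated as half-open [start, end)."""
    def overlaps(a, b):
        return a[0] < b[1] and b[0] < a[1]
    return [1 if any(overlaps(seg, span) for span in fear_spans) else 0
            for seg in segments]
```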
  10. Task participation. 12 teams registered, 5 submitted runs, for a grand total of 39 run submissions. Valence/arousal prediction subtask: 5 teams, 22 runs. Fear prediction subtask: 4 teams, 17 runs.
  11. Teams. MIC-TJU: Yun Yi (Gannan Normal University and Tongji University, China), Hanli Wang, Jiangchuan Wei (Tongji University, China). HKBU: Yang Liu, Zhonglei Gu, Tobey H. Ko (Hong Kong Baptist University, HKSAR, China). THU-HCSI: Zitong Jin, Yuqi Yao, Ye Ma, Mingxing Xu (Tsinghua University, China).
  12. Teams. BOUN-NKU: Nihan Karslioglu, Yasemin Timar, Albert Ali Salah (Bogazici University, Turkey), Heysem Kaya (Namik Kemal University, Turkey). TCNJ-CS: Sejong Yoon (The College of New Jersey, U.S.A.).
  13. Participants' approaches: visual features. General-purpose visual features provided by the organizers: Auto Color Correlogram, Color and Edge Directivity Descriptor (CEDD), Color Layout, Edge Histogram, Fuzzy Color and Texture Histogram (FCTH), Gabor, a joint descriptor combining CEDD and FCTH, Scalable Color, Tamura, Local Binary Patterns, and the fc6 layer of the VGG16 network. Motion Keypoint Trajectory (MKT) features based on Histogram of Oriented Gradients (HOG), Motion Boundary Histogram (MBH), Histogram of Optical Flow (HOF) and Trajectory-Based Covariance (TBC). Two-stream Convolutional Networks (ConvNets). Dense SIFT features.
  14. Participants' approaches: audio features. Features provided by the organizers: 1,582 audio features (EmoBase2010 from openSMILE). Extended Geneva Minimalistic Acoustic Parameter Set (eGeMAPS). High-level features. Lingering features: computationally model the gradually amplifying or decaying emotional flow.
  15. Participants' approaches: feature reduction. Principal Component Analysis, Fisher Vectors, Biased Discriminant Embedding.
  16. Participants' approaches: regression/classification models. Support Vector Regression, Support Vector Classification, Multiple Kernel Learning, AdaBoost, Extreme Learning Machines, Random Forests, Long Short-Term Memory models → approaches quite similar to last year's.
  17. Valence/arousal prediction subtask: valence results (best Pearson's CC last year: 0.14).
  18. Valence/arousal prediction subtask: arousal results (best Pearson's CC last year: 0.23).
  19. Fear prediction subtask. Clarifying the evaluation metrics: True Positives: segments predicted as fear that actually induce fear. True Negatives: segments predicted as not fear that actually do not induce fear. False Positives: segments predicted as fear that actually do not induce fear. False Negatives: segments predicted as not fear that actually induce fear.
  20. Fear prediction subtask: Accuracy = (TP + TN) / (TP + TN + FP + FN)
  21. Fear prediction subtask: Precision = TP / (TP + FP)
  22. Fear prediction subtask: Recall = TP / (TP + FN)
  23. Fear prediction subtask: F1-score = 2 × Precision × Recall / (Precision + Recall)
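The four formulas from the slides combine into one small helper (a straightforward sketch; note that precision and recall are undefined when their denominators are zero, a case not handled here):

```python
def fear_metrics(tp, tn, fp, fn):
    """Compute the four fear-subtask metrics from confusion counts."""
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    return accuracy, precision, recall, f1
```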
  24. Conclusion. Participants' approaches provided encouraging results (better than last year for valence/arousal prediction). Arousal is generally better predicted than valence (consistent with the literature). Some submissions rely on features/models that cope with temporal dependencies. Only half of the registered participants submitted runs → is the task too difficult? Both subtasks remain particularly challenging: high subjectivity of emotions, and unbalanced data for fear prediction.
  25. The future of the Emotional Impact task. This year's development and test sets, an extension of the LIRIS-ACCEDE dataset, are available at http://liris-accede.ec-lyon.fr. Some possible directions of investigation: collect more data for fear prediction; encourage going further in developing approaches to model temporal dependencies; push the study of interplays between valence/arousal and fear; a novel orientation of the task?
  26. Program of the session. THUHCSI in MediaEval 2017 Emotional Impact of Movies Task – presenter: Mingxing Xu, Tsinghua University, China. MIC-TJU in MediaEval 2017 Emotional Impact of Movies Task – presenter (stand-in): Emmanuel Dellandréa, Ecole Centrale de Lyon, France. TCNJ-CS @ MediaEval 2017 Emotional Impact of Movie Task (video). BOUN-NKU in MediaEval 2017 Emotional Impact of Movies Task (video). HKBU at MediaEval 2017 Emotional Impact of Movies Task (video).
  27. Thank you for your attention!
