Affect Task: Violent Scenes Detection
Task overview
October 3, 2012

MediaEval 2012
Guillaume, Mohammad, Cédric & Claire-Hélène
Task definition
   Second year!
   Derives from a Technicolor use case
       Helping users choose movies that are suitable for the children in their family by
        proposing a preview of the most violent segments
   Same definition as in 2011
     “Physical violence or accident resulting in human injury or pain”
      As objective as possible
      But:
          Dead people shown without showing how they came to be dead => Not annotated
          Somebody hurting himself while shaving => Annotated
            (even though it does not match the use case…)




Task definition
   Two types of runs
       Primary and required run at shot level,
           i.e. a violent/non-violent decision must be provided for each movie shot
       Optional run at segment level,
           i.e. violent segments (starting and ending times) must be extracted by the
            participants
       Scores are required to compute the official measure (a run-format sketch follows)
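
A minimal Python sketch of what a shot-level run could look like, assuming a simple whitespace-separated line per shot carrying a violence score (used for ranking and MAP@100) and a binary decision; the actual submission format required by the task may differ:

# Hypothetical shot-level run writer (the official submission format may differ):
# one line per shot with a violence score, used for ranking and MAP@100,
# and a binary violent/non-violent decision.

def write_shot_run(path, shots):
    """shots: iterable of (movie, shot_id, score, is_violent) tuples."""
    with open(path, "w") as f:
        for movie, shot_id, score, is_violent in shots:
            f.write(f"{movie} {shot_id} {score:.4f} {int(is_violent)}\n")

write_shot_run("run_shot_level.txt",
               [("FightClub", 1, 0.12, False), ("FightClub", 2, 0.87, True)])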


   Rules
       Any features automatically extracted from the DVDs can be used
       This includes audio, video and subtitles
       No external additional data (e.g. from the internet)




Data set
   18 Hollywood movies purchased by the participants
         Of different genres (from extremely violent to non-violent), in both the development
          and test sets




Data set – development set

    Movie                      Duration (s)     Shot #   Violence duration (%)   Violent shots (%)
    Armageddon                  8680.16          3562            14.03                 14.6
    Billy Elliot                6349.44          1236            5.14                  4.21
    Eragon                      5985.44          1663            11.02                 16.6
    Harry Potter 5              7953.52          1891            10.46                 13.43
    I Am Legend                 5779.92          1547            12.75                 20.43
    Leon                        6344.56          1547             4.3                  7.24
    Midnight Express            6961.04          1677            7.28                  11.15
    Pirates Carib.              8239.44          2534            11.3                  12.47
    Reservoir Dogs              5712.96          856             11.55                 12.38
    Saving Private Ryan         9751.0           2494            12.92                 18.81
    The Sixth Sense             6178.04          963             1.34                  2.80
    The Wicker Man              5870.44          1638            8.36                  6.72
    Kill Bill 1                 5626.6           1597            17.4                  24.8
    The Bourne Identity         5877.6           1995             7.5                   9.3
    The Wizard of Oz            5415.7           908              5.5                   5.0
    TOTAL                 100725.8 (27h58min)   26108            9.39                 11.99




Data set – test set

        Movie                Duration (s)         Shot #   Violence duration (%)   Violent shots (%)
        Dead Poets Society    7413.24              1583            0.75                  2.15
        Fight Club            8005.72              2335            7.61                  13.28
        Independence Day      8834.32              2652             6.4                  13.99
        TOTAL                24253.28 (6h44min)    6570            4.92                  9.80




Annotations & additional data
   Ground truth manually created by 7 human assessors:
         Segments containing violent events according to the definition
                 One unique violent action per segment wherever possible
                 Or the tag ‘multiple_action_scenes’

         7 high-level video concepts:
                 Presence of blood
                 Presence of fire
                 Presence of guns or similar firearms
                 Presence of cold arms (knives or similar bladed weapons)
                 Fights (1 against 1, small, large, distant attack)
                 Car chases
                 Gory scenes (graphic images of bloodletting and/or tissue damage)

         3 high-level audio concepts:
                 Gunshots, cannon fire
                 Screams, effort noises
                 Explosions

   Automatically generated shot boundaries with keyframes (a shot-labelling sketch follows)
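
To illustrate how the segment-level ground truth and the automatically generated shot boundaries fit together, here is a minimal Python sketch that derives per-shot labels from the annotated violent segments; the overlap rule used here (any temporal overlap counts as violent) is an assumption, not necessarily the rule used to build the official annotations:

# Hedged sketch: derive per-shot violent/non-violent labels by intersecting the
# annotated violent segments (start, end in seconds) with the shot boundaries.
# The overlap rule (any temporal overlap => violent shot) is an assumption.

def label_shots(shots, violent_segments):
    """shots, violent_segments: lists of (start_time, end_time) pairs in seconds."""
    labels = []
    for s_start, s_end in shots:
        overlaps = any(min(s_end, v_end) > max(s_start, v_start)
                       for v_start, v_end in violent_segments)
        labels.append(overlaps)
    return labels

# Example: the second shot overlaps the violent segment [10.0, 14.5] => labelled violent.
print(label_shots([(0.0, 8.0), (8.0, 12.0)], [(10.0, 14.5)]))  # [False, True]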
Results
Evaluation metrics
        Official measure: Mean Average Precision at 100 (MAP@100)
            Average precision over the 100 top-ranked violent shots, over the 3 test movies




        For comparison purposes with 2011, the MediaEval cost:

               C = C_fa * P_fa + C_miss * P_miss,   where C_fa = 1 and C_miss = 10,

        and P_fa, P_miss are the estimated probabilities of false alarm and missed detection




        Additional metrics:
          False alarm rate, missed detection rate, precision, recall, F-measure, MAP@20, MAP
          Detection error trade-off (DET) curves (a scoring sketch follows)
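
A minimal Python sketch of the two main measures, assuming the usual definition of average precision at a cutoff and the cost formula above; this is a sketch, not the official MediaEval scoring tool:

# Sketch of the evaluation measures (not the official scoring tool).

def average_precision_at_k(ranked_labels, k=100):
    """ranked_labels: per-shot labels (True = violent), sorted by decreasing score."""
    hits, precision_sum = 0, 0.0
    for i, is_violent in enumerate(ranked_labels[:k], start=1):
        if is_violent:
            hits += 1
            precision_sum += hits / i
    return precision_sum / hits if hits else 0.0

def mediaeval_cost(n_false_alarms, n_misses, n_nonviolent, n_violent,
                   c_fa=1.0, c_miss=10.0):
    """Pfa and Pmiss estimated as simple rates over the non-violent / violent shots."""
    p_fa = n_false_alarms / n_nonviolent if n_nonviolent else 0.0
    p_miss = n_misses / n_violent if n_violent else 0.0
    return c_fa * p_fa + c_miss * p_miss

# MAP@100 would then be the mean of the per-movie average_precision_at_k values
# over the three test movies.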




Task participation
    Survey:
        35 teams expressed interest in the task (among which 12 were very interested)
        2011: 13 teams
    Registration:
        11 teams = 6 core participants + 1 organizers' team + 4 additional teams
        At least 3 joint submissions, 16 research teams, 9 countries
        3 teams had already worked on the detection of violence in movies
        2011: 6 teams = 4 + 2 organizers, 1 joint submission, 4 countries
    Submission:
        7 teams + 1 organizers' team
        We lost 3 teams (corpus availability, economic issues, low performance)
        Grand total of 36 runs: 35 at shot level and 1 brave submission at segment level!
        2011: 29 runs at shot level, 4 teams + 2 organizers' teams
    Workshop participation:
        6 teams
        2011: 3 teams


Task baseline – random classification


                Movie                MAP@100
                Dead Poets Society     2.17
                Fight Club             13.27
                Independence Day       13.98
                Total                  9.08
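
A minimal sketch of how such a random-classification baseline could be produced, assuming uniform random scores per shot (the organizers' exact procedure may differ):

# Sketch of a random baseline: give every shot a uniform random score, rank by
# that score, and evaluate (e.g. with the average_precision_at_k sketch above).
import random

def random_ranking(shot_labels, seed=0):
    """shot_labels: list of booleans (True = violent). Returns the labels
    re-ordered by a random score, ready for average_precision_at_k."""
    rng = random.Random(seed)
    scored = [(rng.random(), label) for label in shot_labels]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [label for _, label in scored]

# The expected AP@100 of such a baseline is roughly the proportion of violent
# shots in each movie, consistent with the per-movie figures in the table above.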




Task participation

Registration                 Country           Run submission   2011 participation   Workshop participation   MAP@100   MediaEval cost
ARF                          Austria           1 (shot)                                        X               65.05        3.56
                                               1 (segment)                                                     54.82        5.13
DYNI – LSIS                  France            5                        X                                      12.44        7.96
NII - Video Processing Lab   Japan             5                        X                                      30.82        1.28
Shanghai-Hongkong            China             5                                               X               62.38        5.52
TUB - DAI                    Germany           5                        X                      X               18.53        4.20
TUM                          Germany-Austria   5                                               X               48.43        7.83
LIG - MRIM                   France            4                        X                      X               31.37        4.16
TEC*                         France-UK         5                        X                      X               61.82        3.56
Total                        8 teams (23%)     36                       5                   6 (75%)
Rand. classification                                                                                            9.8

        *: task organizer
        Best run per team, selected according to MAP@100.
Learned points
    Features:
        Mostly classic low-level features, either audio or video
        Mostly computed at frame level

    Classification step:
        Mostly supervised machine learning systems
          Mostly SVM-based, 1 NN, 1 BN
        Two systems based on similarity computation (KNN)

    Multimodality:
        Is audio, video, or audio + video more informative? No real convergence
        No use of text features

    Mid-level concepts:
        YES! This year, they were largely used (4 teams out of 8)
        Seem promising, at least some of them (except blood)
        But how to use them? As additional features, or as an intermediate step? (a sketch follows this slide)

    Test set: it seems that…
        Systems worked better on Independence Day, while Dead Poets Society was more difficult
        Due to some similarity with other movies in the dev set?
        A generalization issue?
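
As a concrete illustration of the "additional features" option for mid-level concepts, here is a minimal Python sketch using an SVM (scikit-learn) on concatenated low-level and concept features; the feature names and shapes are illustrative assumptions, not any participant's actual system:

# Hedged sketch: per-shot low-level descriptors concatenated with per-shot
# mid-level concept scores (blood, fire, gunshots, ...) and fed to an SVM,
# used here as a stand-in for the SVM-based systems mentioned above.
import numpy as np
from sklearn.svm import SVC

def train_violence_classifier(low_level, concept_scores, labels):
    """low_level: (n_shots, d) array; concept_scores: (n_shots, n_concepts) array;
    labels: (n_shots,) booleans (True = violent shot)."""
    features = np.hstack([low_level, concept_scores])
    clf = SVC(kernel="rbf", probability=True)  # probabilities reused as ranking scores
    clf.fit(features, labels)
    return clf

# At test time, clf.predict_proba(features_test)[:, 1] gives a violence score per
# shot, which can be thresholded for the binary decision and ranked for MAP@100.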

DET curves (best run per participant, selected by MAP@100)
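
The curves themselves are figures; as a complement, a minimal Python sketch of how the underlying (Pfa, Pmiss) points can be computed by sweeping a decision threshold over per-shot violence scores (plotting, usually on normal-deviate axes, is left out):

# Sketch of the points behind a DET curve.

def det_points(scores, labels):
    """scores: per-shot violence scores; labels: booleans (True = violent)."""
    n_violent = sum(labels)
    n_nonviolent = len(labels) - n_violent
    points = []
    for threshold in sorted(set(scores)):
        decisions = [s >= threshold for s in scores]
        misses = sum(1 for d, l in zip(decisions, labels) if l and not d)
        false_alarms = sum(1 for d, l in zip(decisions, labels) if d and not l)
        points.append((false_alarms / n_nonviolent, misses / n_violent))
    return points  # list of (Pfa, Pmiss) pairs, one per threshold value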




Recall vs. precision (best run per participant, selected by MAP@100)




Conclusions & perspectives
    Success of the task
        Increased number of participants
        Attracted people from the domain
        Quality of results has markedly increased


    MediaEval 2013
        Which task definition?
        How to go one step further with multimodality?
            Text is still not used
        Who will join the organizers' group for next year?




