LIG at MediaEval 2012 affect task: use of a generic method


Published in: Technology

  1. LIG Quaero consortium at MediaEval 2012 Affect Task: Violent Scenes Detection Task. Nadia Derbas, Franck Thollard, Bahjat Safadi and Georges Quénot, UJF-LIG, 4 October 2012
  2. Outline
     • Global system architecture
     • Descriptors with optimization
     • Classification
     • Hierarchical fusion
     • Conceptual feedback
     • Re-ranking
     • Submitted runs
     • Conclusion
     04/10/12 LIG - Nadia Derbas
  3. The classical classification pipeline
     [Figure: a raw signal (bit patterns such as "0101") on one side and its semantics on the other ("Discourse of President Bill Clinton", "President Clinton is basking in some good news"), separated by the semantic gap.]
  4. The LIG classification pipeline
     [Diagram. Inputs: text, audio, image. Stages: descriptor extraction; descriptor transformation; classification; descriptors and classifier variants fusion; conceptual feedback; higher-level hierarchical fusion; re-ranking (re-scoring). Output: classification score.]
  5. Descriptors and variants
     Descriptor extraction:
     ● color: 4 x 4 x 4 RGB histogram;
     ● texture: Gabor transform, 8 orientations x 5 scales;
     ● points of interest: bags of SIFTs, with Harris-Laplace and dense sampling, hard and fuzzy clustering, and color opponent SIFTs (van de Sande);
     ● audio: bags of MFCCs, either MFCCs only or MFCCs plus their first and second derivatives;
     ● motion.
     Descriptor optimization:
     ● power normalization: x ← x^α, α ≈ 0.4; good for sparse descriptors;
     ● principal component analysis: dimensionality reduction and noise removal.
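The power normalization step on this slide can be illustrated with a minimal sketch. Only the rule x ← x^α with α around 0.4 comes from the slide; the function name, the sign handling, and the final L2 normalization are assumptions added for illustration.

```python
import math


def power_normalize(descriptor, alpha=0.4):
    """Apply x <- x**alpha component-wise, then L2-normalize (assumed step)."""
    # Histogram-like descriptors are non-negative; preserving the sign is an
    # added safeguard for descriptors that may contain negative values.
    powered = [math.copysign(abs(x) ** alpha, x) for x in descriptor]
    norm = math.sqrt(sum(v * v for v in powered))
    if norm == 0.0:
        return powered  # all-zero descriptor: nothing to rescale
    return [v / norm for v in powered]


# A sparse 4-bin histogram with one dominant bin and one weak bin.
hist = [0.0, 0.81, 0.0, 0.01]
normalized = power_normalize(hist)
```

With α < 1 the gap between the dominant bin and the weak bin shrinks, which is one reason this transformation tends to help sparse, bursty descriptors.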
  6. Use of multiple classifiers
     • Two different classification methods:
        • kNN
        • MSVM
     • MSVM: use of multiple SVMs to address the unbalanced data problem
     • Improves over a regular SVM on highly imbalanced datasets
     • MSVM is generally better than kNN, but not always
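The slide does not say how the multiple SVMs are built. One common scheme, assumed here and not confirmed by the slides, trains each member on all positive samples plus a different subset of the negatives and averages the member scores. In the sketch below a nearest-centroid scorer stands in for the SVM so that the example stays dependency-free; all names are hypothetical.

```python
def split_negatives(negatives, n_subsets):
    """Deal the negative samples round-robin into n_subsets groups.

    Assumes n_subsets <= len(negatives) so that no group is empty.
    """
    return [negatives[i::n_subsets] for i in range(n_subsets)]


def centroid(samples):
    dim = len(samples[0])
    return [sum(s[d] for s in samples) / len(samples) for d in range(dim)]


def train_stand_in(positives, negatives):
    """Hypothetical stand-in for an SVM: a nearest-centroid scorer."""
    pos_c, neg_c = centroid(positives), centroid(negatives)

    def score(x):
        d_pos = sum((a - b) ** 2 for a, b in zip(x, pos_c))
        d_neg = sum((a - b) ** 2 for a, b in zip(x, neg_c))
        return d_neg - d_pos  # higher means closer to the positive class

    return score


def train_ensemble(positives, negatives, n_subsets):
    """Train one member per negative subset, each seeing all positives."""
    return [train_stand_in(positives, subset)
            for subset in split_negatives(negatives, n_subsets)]


def ensemble_score(models, x):
    """Average the member scores (one possible fusion rule)."""
    return sum(m(x) for m in models) / len(models)


positives = [[1.0, 1.0], [0.9, 1.1]]
negatives = [[0.0, 0.0], [0.1, -0.1], [-0.2, 0.1],
             [0.0, 0.2], [0.2, 0.0], [-0.1, -0.1]]
models = train_ensemble(positives, negatives, n_subsets=3)
```

Each member sees a balanced training set (here 2 positives against 2 negatives), which is the point of splitting the majority class.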
  7. Hierarchical fusion
     • Late fusion of descriptor and classifier variants, to get the maximum from each descriptor type:
        • fuse spatial variants
        • then fuse other variants
        • finally fuse classification results from different classifiers
     • Further hierarchical late fusion across different descriptors of similar types:
        • all color together, all texture together ...
        • then all visual together, all audio together ...
        • finally everything together
     A linear combination of the scores is used, with weights optimized on the MediaEval development set.
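The hierarchy described on this slide can be sketched as a recursive weighted average over a tree of score lists. The tree encoding, the descriptor names, and the weights below are hypothetical; in the original system the weights were optimized on the development set, and normalizing them to sum to one at each level is an assumption made here.

```python
def fuse(node, scores):
    """Recursively fuse a hierarchy of per-shot score lists.

    A node is either a leaf name (str) looked up in `scores`, or a list of
    (weight, child) pairs that are combined linearly.
    """
    if isinstance(node, str):
        return scores[node]
    total = sum(w for w, _ in node)
    fused = None
    for w, child in node:
        child_scores = fuse(child, scores)
        if fused is None:
            fused = [0.0] * len(child_scores)
        for i, s in enumerate(child_scores):
            fused[i] += (w / total) * s
    return fused


# Two shots, three hypothetical descriptor/classifier combinations.
scores = {
    "color_knn": [0.2, 0.8],
    "color_msvm": [0.4, 0.6],
    "audio_msvm": [0.9, 0.1],
}
# First fuse the classifier variants of "color", then fuse color with audio.
hierarchy = [
    (0.75, [(0.5, "color_knn"), (0.5, "color_msvm")]),
    (0.25, "audio_msvm"),
]
fused = fuse(hierarchy, scores)
```

Fusing within a descriptor family first keeps a family with many variants from dominating the final score merely through its number of members.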
  8. Conceptual feedback
     ● Idea: use the probability(-like) scores predicted for the 11 concepts to build a new descriptor
     ● 11-component vector
     ● Used to train classifiers in the same way as the signal-based descriptors
     Late fusion between the original scores and the scores obtained by classifying these original scores yields a small improvement in MAP@100.
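A minimal sketch of how such a feedback descriptor could be assembled and fused back, assuming one score per concept per shot. The concept names are placeholders (the slide does not list the 11 concepts), and the fusion weight is hypothetical.

```python
# Placeholder names: the slides do not enumerate the 11 concepts.
CONCEPTS = [f"concept_{i:02d}" for i in range(11)]


def build_feedback_descriptors(concept_scores, concept_order=CONCEPTS):
    """Stack per-concept scores into one 11-component vector per shot."""
    n_shots = len(concept_scores[concept_order[0]])
    return [[concept_scores[c][shot] for c in concept_order]
            for shot in range(n_shots)]


def fuse_with_feedback(original, feedback, weight=0.2):
    """Late fusion of original scores with scores from the feedback classifier.

    `weight` is a hypothetical value, not taken from the slides.
    """
    return [(1 - weight) * o + weight * f for o, f in zip(original, feedback)]


# Three shots, with toy first-pass scores for every concept.
concept_scores = {name: [0.1 * (i % 3), 0.5, 0.9]
                  for i, name in enumerate(CONCEPTS)}
descriptors = build_feedback_descriptors(concept_scores)
```

The resulting vectors would then go through the same classification stage as any other descriptor, and the scores it produces would be passed to `fuse_with_feedback`.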
  9. Temporal re-ranking
     ● Fact: shots within a video are semantically related, especially if they are temporally close
     ● Idea: update shot scores according to neighbors’ scores
     ● May be done globally (whole video) (Mérialdo 2009) or locally (window of a few shots) (Safadi 2010)
     ● Case of the full video:
        • Compute a global score for the whole video from the scores of all the shots it contains (typically the average or a variant)
        • Update the score of each shot using the global video score (typically a linear combination or a variant)
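Both variants can be sketched in a few lines, taking the "typical" choices named on the slide: an average as the context score and a linear combination as the update. The mixing weight and the window radius are hypothetical parameters, not values from the slides.

```python
def rerank_global(shot_scores, weight=0.5):
    """Mix each shot score with the average score of the whole video."""
    video_score = sum(shot_scores) / len(shot_scores)
    return [(1 - weight) * s + weight * video_score for s in shot_scores]


def rerank_local(shot_scores, radius=1, weight=0.5):
    """Mix each shot score with the average over a window of nearby shots.

    The window covers the shot itself and up to `radius` shots on each side.
    """
    reranked = []
    for i, s in enumerate(shot_scores):
        window = shot_scores[max(0, i - radius):i + radius + 1]
        context = sum(window) / len(window)
        reranked.append((1 - weight) * s + weight * context)
    return reranked


shot_scores = [0.1, 0.9, 0.2]
global_scores = rerank_global(shot_scores)
local_scores = rerank_local(shot_scores)
```

In both cases an isolated low score surrounded by high-scoring shots is pulled up, and an isolated high score is pulled down.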
  10. Submitted runs (MAP@100)
     ● LIG-1: 0.3138
        Hierarchical fusion of all available descriptor/classifier combinations, including the concept score feedback descriptor and temporal re-ranking
     ● LIG-2: 0.3122
        Hierarchical fusion of all available descriptor/classifier combinations, including temporal re-ranking
     ● LIG-3: 0.3138
        Hierarchical fusion of all available descriptor/classifier combinations, including the concept score feedback descriptor
     ● LIG-4: 0.3122
        Hierarchical fusion of all available descriptor/classifier combinations
  11. Submitted runs

     Run      MAP@100   MAP      P@100
     Best     0.6506    0.3183   0.4833
     LIG-1    0.3138    0.1723   0.3167
     LIG-2    0.3122    0.1731   0.3034
     LIG-3    0.3138    0.1307   0.3166
     LIG-4    0.3122    0.1259   0.3033
     Median   0.3122    0.1249   0.2600
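For readers unfamiliar with the rank-truncated metrics in this table, the sketch below shows one common way to compute precision at k and average precision at k from a ranked list of binary relevance labels. The normalization used for average precision (dividing by the smaller of k and the number of relevant items) is an assumption; the official MediaEval evaluation tool may define it differently.

```python
def precision_at_k(relevance, k):
    """Fraction of the top-k ranked items that are relevant."""
    return sum(relevance[:k]) / k


def average_precision_at_k(relevance, k):
    """Average of the precision values at each relevant rank within the top k.

    Normalized by min(k, number of relevant items), an assumed convention.
    """
    hits = 0
    precision_sum = 0.0
    for rank, rel in enumerate(relevance[:k], start=1):
        if rel:
            hits += 1
            precision_sum += hits / rank
    denom = min(k, sum(relevance))
    return precision_sum / denom if denom else 0.0


def mean_average_precision_at_k(per_video_relevance, k):
    """Mean of the per-list average precision values."""
    values = [average_precision_at_k(r, k) for r in per_video_relevance]
    return sum(values) / len(values)


# 1 = violent shot, 0 = non-violent shot, in ranked order.
ranked = [1, 0, 1, 1, 0]
p_at_3 = precision_at_k(ranked, 3)
ap_at_3 = average_precision_at_k(ranked, 3)
```

Setting k to 100 gives the P@100 and MAP@100 columns of the table; the untruncated MAP corresponds to k equal to the list length.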
  12. Conclusion
     ● Temporal re-ranking always improves the result or has no significant effect
     ● Conceptual feedback improves the precision at the head of the returned list (MAP@100, P@100)
     ● Motion descriptors
     ● Audio was used (small contribution) but not ASR
     ● Improvements still possible
  13. Thank you for your attention! Questions?