KIT at MediaEval 2012 – Content–based Genre Classification with Visual Cues

  1. KIT at MediaEval 2012 – Content-based Genre Classification with Visual Cues
     Tomas Semela, Makarand Tapaswi
     MediaEval 2012 Workshop
     Institute for Anthropomatics
     KIT – University of the State of Baden-Wuerttemberg and National Research Center of the Helmholtz Association, www.kit.edu
  2. Motivation
     • Rapid growth of digital multimedia data in the broadcast and web video domain
     • Need for efficient automated indexing and content search
  3. Challenges
     • Broadcast TV domain
       • Channel archives
       • Digital distribution
       • Web offerings
       • Arrangement in genres: highly characteristic, low variance, clear boundaries
     • Web video domain
       • Video portals like YouTube (user content)
       • Arrangement in categories: resemblance to topics (Autos – Animals – Travel)
       • Variation in production values and style
       • Not limited to single genre characteristics
  4. Related work
     • System from University of Torino, Italy
       • Extract video features from aural, visual, cognitive and structural cues
       • Neural network for classification
     • M. Montagnuolo, A. Messina, “Parallel Neural Networks for Multimodal Video Genre Classification”, Multimedia Tools and Appl., 41(1):125–159, 2009.
     • Slide date: 05.10.2012
  5. KIT System
     • Visual feature extraction from keyframes
     • SVM classification system
     • Fusion of results with majority voting
  6. Low-level visual features
     • Color: color moments, HSV histogram, color auto correlogram
     • Texture: wavelet texture, edge histogram, co-occurrence texture
     • Global features for each video
     • H. K. Ekenel, T. Semela, and R. Stiefelhagen, “Content-based video genre classification using multiple cues”, AIEMPro10, pages 21-26, 2010.
  7. SIFT – For each keyframe
     • Interest point detection
       • Dense sampling
       • Spatial pyramid: 1x1 – 2x2 – 1x3
     • SIFT descriptors: SIFT, rgbSIFT, opponentSIFT
     • Bag-of-visual-words
       • Codebook (500-dim.)
       • Codeword histograms
     • K. E. A. van de Sande, T. Gevers, and C. G. M. Snoek, “Empowering Visual Categorization with the GPU”, IEEE Transactions on Multimedia, 13(1):60-70, 2011.
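The bag-of-visual-words step on the slide above can be sketched as follows. This is a minimal illustration, not the authors' implementation (the slide cites a GPU-based toolkit by van de Sande et al.). It assumes each descriptor is assigned to its nearest codeword by Euclidean distance, that keypoint positions are normalized to [0, 1), and that grids are given as (columns, rows), so `1x3` is read as three horizontal bands. The function names `nearest_codeword` and `pyramid_histogram` are hypothetical.

```python
def nearest_codeword(descriptor, codebook):
    """Index of the codeword with the smallest squared Euclidean distance."""
    def sq_dist(word):
        return sum((a - b) ** 2 for a, b in zip(descriptor, word))
    return min(range(len(codebook)), key=lambda i: sq_dist(codebook[i]))


def pyramid_histogram(points, codebook, grids=((1, 1), (2, 2), (1, 3))):
    """Concatenate one codeword histogram per cell of each (cols, rows) grid.

    points: list of (x, y, descriptor) with x and y normalized to [0, 1).
    The grid orientation is an assumption; the slide only says 1x1, 2x2, 1x3.
    """
    k = len(codebook)
    # Quantize every descriptor once, then reuse the assignment for each grid.
    words = [nearest_codeword(desc, codebook) for _, _, desc in points]
    feature = []
    for cols, rows in grids:
        cells = [[0] * k for _ in range(cols * rows)]
        for (x, y, _), word in zip(points, words):
            col = min(int(x * cols), cols - 1)
            row = min(int(y * rows), rows - 1)
            cells[row * cols + col][word] += 1
        for hist in cells:
            feature.extend(hist)
    return feature


# Toy example: a 2-word codebook and two keypoints in opposite corners.
codebook = [(0.0, 0.0), (1.0, 1.0)]
points = [(0.1, 0.1, (0.1, 0.0)), (0.9, 0.9, (0.9, 1.0))]
feature = pyramid_histogram(points, codebook)
```

With the 500-word codebook from the slide, the three grids give 1 + 4 + 3 = 8 cells, so a vector built this way would have 8 × 500 = 4000 dimensions per descriptor type.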
  8. Classification
     • Training of one support vector machine (SVM) for each genre and each feature
       • Binary classification (one vs. all)
       • RBF kernel
       • Cross-validation
     • Fusion at decision level
       • Majority voting (probability output)
     • SIFT: keyframes classified individually, output averaged over video
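The slide does not spell out how the majority vote uses the probability outputs, so the following is only one plausible reading, sketched under stated assumptions: each feature's bank of one-vs-all SVMs yields a per-genre probability, each feature votes for its most probable genre, and ties are broken by the summed probability. The SVMs themselves are not shown; `average_keyframe_scores` and `majority_vote` are hypothetical names.

```python
from collections import Counter


def average_keyframe_scores(keyframe_scores):
    """Average per-genre probabilities over all keyframes of one video."""
    n = len(keyframe_scores)
    genres = keyframe_scores[0].keys()
    return {g: sum(frame[g] for frame in keyframe_scores) / n for g in genres}


def majority_vote(feature_scores):
    """Pick the genre most features rank first.

    feature_scores: dict mapping feature name -> {genre: probability}.
    Tie-breaking by summed probability is an assumption, not from the slides.
    """
    votes = Counter(max(s, key=s.get) for s in feature_scores.values())
    summed = {}
    for scores in feature_scores.values():
        for genre, p in scores.items():
            summed[genre] = summed.get(genre, 0.0) + p
    return max(votes, key=lambda g: (votes[g], summed[g]))


# Toy example with two genres and made-up probabilities.
sift = average_keyframe_scores([
    {"autos": 0.8, "health": 0.2},
    {"autos": 0.6, "health": 0.4},
])
predicted = majority_vote({
    "color": {"autos": 0.7, "health": 0.3},
    "texture": {"autos": 0.4, "health": 0.6},
    "sift": sift,
})
```

Here two of the three features rank `autos` first, so the vote selects it.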
  9. Domain Knowledge
     • Video distribution in the development set:
       • Autos: 8 videos
       • Technology: ~500 videos
     • Use this information in the final prediction of the category as a likelihood of the distribution on blip.tv:
       1. SVM scores for each video are normalized to unit sum
       2. These probabilities are divided by the square root of the number of videos in the development set for each category, to include the a-priori knowledge of the class distribution
       3. Finally, step one is repeated to obtain unit sum
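The three steps above can be sketched as follows. This is a minimal illustration that follows the slide's wording literally, including the division by the square root of the development-set count. The function name `apply_prior` and the example scores are hypothetical, and 500 stands in for the slide's approximate "~500".

```python
import math


def apply_prior(scores, dev_counts):
    """Reweight one video's per-category SVM scores by development-set counts.

    scores: {category: non-negative SVM score}
    dev_counts: {category: number of development-set videos}
    """
    def normalize(values):
        total = sum(values.values())
        if total <= 0:
            raise ValueError("scores must have a positive sum")
        return {k: v / total for k, v in values.items()}

    probs = normalize(scores)                       # step 1: unit sum
    weighted = {k: p / math.sqrt(dev_counts[k])     # step 2: divide by sqrt(count),
                for k, p in probs.items()}          #         as stated on the slide
    return normalize(weighted)                      # step 3: unit sum again


# Counts from the slide (Technology is approximate); scores are made up.
dev_counts = {"autos": 8, "technology": 500}
adjusted = apply_prior({"autos": 0.3, "technology": 0.3}, dev_counts)
```

With equal input scores, the adjusted value for `autos` ends up larger than for `technology`, because the larger count produces the larger divisor.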
  10. Evaluation
     • Blip.tv data with ~9550 clips
     • Two configurations: with and without prior domain knowledge

     Configuration   Run    Features        MAP
     No prior        run1   Visual          0.3008
     No prior        run2   SIFT            0.2329
     No prior        run3   Visual + SIFT   0.3499
     Prior           run4   Visual          0.3461
     Prior           run5   SIFT            0.1448
     Prior           run6   Visual + SIFT   0.3581
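For readers unfamiliar with the metric, mean average precision (MAP) can be sketched as below. This is the generic textbook definition, not the official evaluation tool used for these runs, and it assumes every relevant video appears somewhere in the ranked list. The function names are hypothetical.

```python
def average_precision(ranked_relevance):
    """Average precision for one ranked list.

    ranked_relevance: booleans in rank order, True where the item is relevant.
    Assumes all relevant items are present in the list.
    """
    hits = 0
    precision_sum = 0.0
    for rank, relevant in enumerate(ranked_relevance, start=1):
        if relevant:
            hits += 1
            precision_sum += hits / rank   # precision at this relevant item
    return precision_sum / hits if hits else 0.0


def mean_average_precision(rankings):
    """Mean of the per-category average precision values.

    rankings: {category: ranked relevance list}
    """
    return sum(average_precision(r) for r in rankings.values()) / len(rankings)


# Toy example: relevant items at ranks 1 and 3.
ap = average_precision([True, False, True])
map_score = mean_average_precision({"a": [True], "b": [False, True]})
```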
  11. Evaluation – Run 6
     [Figure not preserved in the transcript]
  12. Evaluation
     Run6 (MAP):
     • Top 4 categories:
       • autos and vehicles (0.812)
       • health (0.668)
       • movies and television (0.602)
       • religion (0.578)
     • Worst 4 categories:
       • citizen journalism (0.158)
       • documentary (0.119)
       • videoblogging (0.100)
       • travel (0.010)
  13. Conclusions & Future Work
     • Conclusions
       • Visual-based classification shows limitations for category tagging
       • Few categories with satisfactory results
       • SIFT: only slight improvement of overall results
       • Prior domain knowledge improves results overall
     • Future Work
       • Temporal features
       • Mid-level semantics
       • Action detection, audio segmentation
       • ASR & metadata integration
       • Individual classification approach & features for each genre
  14. Thank you