
TubeTagger – YouTube-based Concept Detection

Talk at the IEEE ICDM Workshop on Internet Multimedia Mining in Miami, USA

Link to original publication: http://madm.dfki.de/publication&pubid=4492

Published in: Technology, Business

TubeTagger – YouTube-based Concept Detection

  1. TubeTagger – YouTube-based Concept Detection. Adrian Ulges, Markus Koch, Damian Borth, Thomas M. Breuel. German Research Center for Artificial Intelligence (DFKI) & University of Kaiserslautern. December 6 2009. D. Borth: TubeTagger, December 6 2009
  2. Outline: Motivation; TubeTagger System; TubeTagger Web Demo; Summary
  3. Data, data, data
  4. Data, data, data: "...TV, video on demand, Internet video, and P2P video will account for over 91 percent of global consumer traffic by 2013..." ("Cisco Visual Networking Index", 2009)
  5. Data, data, data: "...TV, video on demand, Internet video, and P2P video will account for over 91 percent of global consumer traffic by 2013..." ("Cisco Visual Networking Index", 2009); "...more than 20 hours of video uploaded to YouTube every minute, 1 billion views per day..." (2009)
  6. Data, data, data: "...TV, video on demand, Internet video, and P2P video will account for over 91 percent of global consumer traffic by 2013..." ("Cisco Visual Networking Index", 2009); "...more than 20 hours of video uploaded to YouTube every minute, 1 billion views per day..." (2009); increasing importance of video live streams: Obama's inauguration, Michael Jackson's memorial service, music video debuts... (2009)
  7. Content-based Video Retrieval: How to search in large video databases?
  8. Content-based Video Retrieval: How to search in large video databases?
  9. Content-based Video Retrieval: How to search in large video databases? Example concepts: beach, soccer, interview. Video Concept Detection [MediaMill, Columbia, IBM,...] → as key building block of CBVR
  10. Content-based Video Retrieval: How to search in large video databases? Example concepts: beach, soccer, interview. Video Concept Detection [MediaMill, Columbia, IBM,...] → as key building block of CBVR. Supervised Machine Learning → need labeled training data
  11. Training Data. State-of-the-art Datasets: TRECVID, LSCOM. Acquisition: collaborative [Ayache07]; precise & high quality; time-consuming effort
  12. Training Data. State-of-the-art Datasets: TRECVID, LSCOM. Acquisition: collaborative [Ayache07]; precise & high quality; time-consuming effort. Vocabulary size does not scale → from hundreds to thousands of concepts [Hauptmann07]. Missing new emerging concepts of interest → e.g. "Michael Jackson", "Obama", "iPod" (top-ranked searches in 2009-Q3 according to "Google Insights for Search" for web search, news search and product search, respectively)
  13. Web-video as training data
  14. Web-video as training data. Pros: available at large scale; high variability of data; enriched with tags, comments, ratings; could allow automatic concept learning
  15. Web-video as training data. Pros: available at large scale; high variability of data; enriched with tags, comments, ratings; could allow automatic concept learning. Cons: weakly labeled (incomplete / subjective, coarse); focus of interest; domain change; video portal as black box
  16. TubeTagger System Overview
  17. TubeTagger System Overview. Notation: $t \in T$ := tags (≈ concepts); $x \in X_{db}$ := keyframe of a video; $P(t|x)$ := probability of tag $t$ for keyframe $x$
  18. TubeTagger System Overview
  19. TubeTagger System Overview. Training Data Acquisition: use YouTube as primary training data source; use tags as weak labels; pos. = tagged; neg. = all non-tagged
  20. TubeTagger System Overview. Training Data Acquisition: use YouTube as primary training data source; use tags as weak labels; pos. = tagged; neg. = all non-tagged. Learning Pipelines: Visual Concept Learning; Semantic Concept Learning
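
The weak-labeling rule on the slide above (pos. = tagged, neg. = all non-tagged) can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function name `weak_labels` and the toy metadata are hypothetical, and real YouTube tags would first need to be fetched and normalized.

```python
from typing import Dict, List, Set, Tuple


def weak_labels(video_tags: Dict[str, Set[str]], concept: str) -> Tuple[List[str], List[str]]:
    """Split video ids into weak positives (tagged with the concept)
    and weak negatives (every video not tagged with it)."""
    positives = sorted(v for v, tags in video_tags.items() if concept in tags)
    negatives = sorted(v for v, tags in video_tags.items() if concept not in tags)
    return positives, negatives


# Hypothetical toy metadata: video id -> set of user tags.
videos = {
    "v1": {"soccer", "goal"},
    "v2": {"traffic", "city"},
    "v3": {"soccer", "funny"},
    "v4": {"cats"},
}

pos, neg = weak_labels(videos, "soccer")
print(pos, neg)
```

Because tags are incomplete and subjective, some of the "negatives" may in fact show the concept, which is exactly the label noise the slides list under the cons of web video.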
  21. Visual Concept Learning
  22. Visual Concept Learning. Features: keyframe extraction → $x \in X_{db}$; bag-of-visual-words descriptors [Sivic06]: SIFT [Lowe99], SURF [Bay06]
  23. Visual Concept Learning. Features: keyframe extraction → $x \in X_{db}$; bag-of-visual-words descriptors [Sivic06]: SIFT [Lowe99], SURF [Bay06]. Statistical Models: SVM [Schölkopf01]; PAMIR [Grangier08, Paredes09]; Max. Entropy [Deselaers05]
  24. Visual Concept Learning. Features: keyframe extraction → $x \in X_{db}$; bag-of-visual-words descriptors [Sivic06]: SIFT [Lowe99], SURF [Bay06]. Statistical Models: SVM [Schölkopf01]; PAMIR [Grangier08, Paredes09]; Max. Entropy [Deselaers05] → output: $P(t|x)$
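
To make the bag-of-visual-words step concrete, here is a small sketch of how local descriptors from one keyframe could be quantized against a codebook and turned into a normalized histogram. The slides do not say how the codebook is built or how large it is; the three-word, two-dimensional codebook below is a made-up toy (real SIFT descriptors have 128 dimensions and SURF descriptors 64), and `bovw_histogram` is a hypothetical helper name.

```python
from typing import List, Sequence


def squared_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Squared Euclidean distance between two descriptors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def bovw_histogram(descriptors: List[Sequence[float]],
                   codebook: List[Sequence[float]]) -> List[float]:
    """Assign each descriptor to its nearest visual word and return
    the relative frequency of every word in the keyframe."""
    counts = [0] * len(codebook)
    for d in descriptors:
        nearest = min(range(len(codebook)),
                      key=lambda k: squared_distance(d, codebook[k]))
        counts[nearest] += 1
    total = sum(counts)
    if total == 0:
        return [0.0] * len(codebook)
    return [c / total for c in counts]


# Toy codebook (assumed to come from clustering, e.g. k-means) and descriptors.
codebook = [[0.0, 0.0], [1.0, 1.0], [5.0, 5.0]]
descriptors = [[0.1, 0.0], [0.9, 1.2], [1.1, 0.8], [4.7, 5.2]]

histogram = bovw_histogram(descriptors, codebook)
print(histogram)
```

The resulting histogram is the fixed-length feature vector that a classifier such as an SVM would consume to estimate $P(t|x)$.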
  25. Semantic Concept Learning
  26. Semantic Concept Learning. Query Formulation: concepts = textual queries; mapping of text queries to concept vocabulary: "computer" → Monitor, Windows-Desktop, iPhone; "funny" → Muppets, Cats, Commercials; learning of tag co-occurrences
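
The slide only states that tag co-occurrences are learned. One plausible reading, sketched here as an assumption, is to count how often two tags appear on the same video and to use each tag's row of counts as its bag-of-words vector. The function name and the toy tag sets are hypothetical.

```python
from collections import Counter, defaultdict
from itertools import permutations
from typing import Dict, Iterable, Set


def cooccurrence_vectors(tag_sets: Iterable[Set[str]]) -> Dict[str, Counter]:
    """For every tag, count how often each other tag appears on the same video."""
    vectors: Dict[str, Counter] = defaultdict(Counter)
    for tags in tag_sets:
        # Every ordered pair (a, b) with a != b contributes one co-occurrence.
        for a, b in permutations(sorted(tags), 2):
            vectors[a][b] += 1
    return vectors


# Hypothetical tag sets of four videos.
tag_sets = [
    {"computer", "monitor"},
    {"computer", "monitor", "iphone"},
    {"funny", "cats"},
    {"funny", "cats", "muppets"},
]

vectors = cooccurrence_vectors(tag_sets)
print(vectors["computer"].most_common())
```

With vectors like these, a free-text query such as "computer" can be related to concepts in the vocabulary even when the query word itself is not a trained concept.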
  27. Semantic Concept Learning. Feature: bag-of-words features: $t \in T \to h_t$, $q := h_q$; mapping to concepts as weights: $w(q,t) := \langle h_t, h_q \rangle$
  28. Semantic Concept Learning. Feature: bag-of-words features: $t \in T \to h_t$, $q := h_q$; mapping to concepts as weights: $w(q,t) := \langle h_t, h_q \rangle$. Approach: $T_5$ := the five tags $t$ with the highest $w(q,t)$; final fusion: $P(q|x) = \frac{\sum_{t \in T_5} w(q,t)\, P(t|x)}{\sum_{t \in T_5} w(q,t)}$
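
The fusion rule above can be sketched in a few lines: compute $w(q,t)$ as a dot product of sparse bag-of-words vectors, keep the top five tags, and average the detector outputs $P(t|x)$ with those weights. The sparse-dictionary representation, the function names and all numbers below are illustrative assumptions, not values from the talk.

```python
from typing import Dict


def dot(h_a: Dict[str, float], h_b: Dict[str, float]) -> float:
    """Inner product <h_a, h_b> of two sparse bag-of-words vectors."""
    return sum(value * h_b.get(word, 0.0) for word, value in h_a.items())


def fuse(h_q: Dict[str, float],
         tag_vectors: Dict[str, Dict[str, float]],
         tag_probs: Dict[str, float],
         top_k: int = 5) -> float:
    """Weighted average of P(t|x) over the top_k tags ranked by w(q, t)."""
    weights = {t: dot(h_t, h_q) for t, h_t in tag_vectors.items()}
    top = sorted(weights, key=weights.get, reverse=True)[:top_k]
    norm = sum(weights[t] for t in top)
    if norm == 0:
        return 0.0  # the query shares no words with any concept
    return sum(weights[t] * tag_probs.get(t, 0.0) for t in top) / norm


# Hypothetical query vector, concept vectors and detector outputs for one keyframe.
h_q = {"diving": 1.0, "sea": 1.0}
tag_vectors = {
    "underwater": {"diving": 2.0, "sea": 1.0},  # w = 3
    "fish": {"sea": 1.0},                       # w = 1
    "soccer": {"ball": 3.0},                    # w = 0
}
tag_probs = {"underwater": 0.8, "fish": 0.4, "soccer": 0.9}

score = fuse(h_q, tag_vectors, tag_probs)
print(round(score, 3))
```

Here the unrelated "soccer" detector gets weight zero, so its high score does not leak into the result for the "diving" query.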
  29. Experiments
  30. Experiments. Dataset: 1200 hrs. (= 750k keyframes); 233 concepts; per concept: 150 videos for training, 50 videos for testing; example concepts shown: "soccer", "traffic"
  31. Experiments. Dataset: 1200 hrs. (= 750k keyframes); 233 concepts; per concept: 150 videos for training, 50 videos for testing; example concepts shown: "soccer", "traffic". Clips/Concept Evaluation: 10 concepts; trained on N clips; tested on 200k keyframes; saturation at 100 to 150 clips per concept for SURF+PAMIR
  32. Experiments. Feature & Classifier Evaluation: 81 concepts; video level testing (avg. fusion). MAP (%) by model and feature: SVMs: SURF 20.4, SIFT 22.4; PAMIR: SURF 15.4, SIFT 18.4; MAXENT: SURF 14.1, SIFT 15.5. Results: MAP 22.4% (the SVMs/SIFT value); MAP 15.4% (the PAMIR/SURF value; 6 times faster)
  33. Experiments. Feature & Classifier Evaluation: 81 concepts; video level testing (avg. fusion). MAP (%) by model and feature: SVMs: SURF 20.4, SIFT 22.4; PAMIR: SURF 15.4, SIFT 18.4; MAXENT: SURF 14.1, SIFT 15.5. Results: MAP 22.4% (the SVMs/SIFT value); MAP 15.4% (the PAMIR/SURF value; 6 times faster). Performance Distribution: [chart of per-concept performance; axis labels include origami, soccer, bus, restaurant, operation-room], showing 78 representative concepts out of 233
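
For readers unfamiliar with the evaluation terms, the sketch below shows video-level testing with average fusion (the mean of a video's keyframe scores) followed by average precision for one concept; MAP is then the mean of these values over all concepts. The slides do not state which AP variant was used, so the non-interpolated form here is an assumption, and all scores are invented.

```python
from typing import Dict, List


def video_score(keyframe_scores: List[float]) -> float:
    """Average fusion: a video's score is the mean of its keyframe scores."""
    return sum(keyframe_scores) / len(keyframe_scores)


def average_precision(ranked_relevance: List[bool]) -> float:
    """Non-interpolated AP over a ranked list of relevance flags."""
    hits = 0
    precision_sum = 0.0
    for rank, relevant in enumerate(ranked_relevance, start=1):
        if relevant:
            hits += 1
            precision_sum += hits / rank
    return precision_sum / hits if hits else 0.0


# Hypothetical keyframe scores P(t|x) per test video and ground truth for one concept.
keyframes: Dict[str, List[float]] = {
    "a": [0.9, 0.7],
    "b": [0.6, 0.6],
    "c": [0.5, 0.3],
    "d": [0.1, 0.3],
}
is_relevant = {"a": True, "b": False, "c": True, "d": False}

ranking = sorted(keyframes, key=lambda v: video_score(keyframes[v]), reverse=True)
ap = average_precision([is_relevant[v] for v in ranking])
print(ranking, round(ap, 4))
```

In this toy ranking the relevant videos sit at ranks 1 and 3, giving an AP of (1/1 + 2/3) / 2.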
  34. Insights in Web-based Concept Detection. Good Concepts ("boat/ship", "pyramids"): broad community of YouTube users; often "interesting", "spectacular". Redundant Concepts ("drummer", "fencing"): series of clips, not sufficiently diverse; problem: generalization. Bad Concepts ("fence", "operation-room"): not regularly used as a tag
  35. TubeTagger Web Demo. User Interface: 1. select a concept, 2. or enter a query, 3. switch between video and keyframe level
  36. TubeTagger Web Demo. Deep Tagging: frame-accurate concept detection beyond coarse tags; find a concept (e.g. "Christmas Tree") within a video clip
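
Deep tagging amounts to localizing a concept inside a clip from per-keyframe scores. A simple way to illustrate the idea, offered as an assumption rather than the demo's actual logic, is to threshold $P(t|x)$ per keyframe and merge consecutive detections into time segments. The threshold of 0.5 and the timestamps are hypothetical.

```python
from typing import List, Tuple


def detect_segments(frame_scores: List[Tuple[float, float]],
                    threshold: float = 0.5) -> List[Tuple[float, float]]:
    """Group consecutive keyframes whose score reaches the threshold.

    frame_scores holds (timestamp in seconds, P(t|x)) pairs sorted by time;
    the result is a list of (start, end) timestamps.
    """
    segments: List[Tuple[float, float]] = []
    start = None
    last = None
    for timestamp, prob in frame_scores:
        if prob >= threshold:
            if start is None:
                start = timestamp
            last = timestamp
        elif start is not None:
            segments.append((start, last))
            start = None
    if start is not None:
        segments.append((start, last))
    return segments


# Hypothetical keyframe scores for the concept "Christmas Tree" in one clip.
scores = [(0, 0.1), (5, 0.2), (10, 0.8), (15, 0.9), (20, 0.3), (25, 0.7)]
segments = detect_segments(scores)
print(segments)
```

A coarse clip-level tag would only say that the concept occurs somewhere in the video; the segments say where.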
  37. TubeTagger Web Demo. Text-based Search: matches queries to concepts, e.g. "diving" → "underwater", "shipwreck", "fish", ...
  38. Summary. TubeTagger – YouTube-based Concept Detection: utilize YouTube videos as training source for visual concept detector learning; utilize YouTube tags for semantic model generation; web demo providing: deep tagging, tag recommendation, text-based search
  39. Questions? Project site: www.dfki.de/moonvid; web demo at: http://madm.dfki.de/demo/tubetagger/
