TubeTagger
                        — YouTube-based Concept Detection —


                             Adrian Ulges, Markus...
Outline



    Motivation


    TubeTagger System


    TubeTagger Web Demo


    Summary




D.Borth: : TubeTagger     2 ...
Data, data, data




D.Borth: : TubeTagger   3   December 6 2009
Data, data, data

       ”...TV, video on demand, Internet video, and P2P video will
     account for over 91 percent of g...
Content-based Video Retrieval

                        How to search in large video databases?




D.Borth: : TubeTagger  ...
Training Data
    State-of-the-art Datasets
           TRECVID
           LSCOM

    Acquisition
           collaborative ...
Web-video as training data




    Pros
           available at large scale
           high variability of data
          ...
Web-video as training data




    Pros                                  Cons
           available at large scale         ...
TubeTagger System Overview




    Notation
           t ∈ T := tags (≈concepts)
           x ∈ Xdb := keyframe of a video...
TubeTagger System Overview




    Training Data Acquisition
           use YouTube as primary
           training data so...
TubeTagger System Overview




    Training Data Acquisition                  Learning Pipelines
           use YouTube as...
Visual Concept Learning




    Features
           keyframe extraction
           → x ∈ Xdb
           bag-of-visual-word...
Visual Concept Learning




    Features                            Statistical Models
           keyframe extraction     ...
Semantic Concept Learning




    Query Formulation
           concepts = textual queries
           mapping of text queri...
Semantic Concept Learning




 Feature
     bag-of-words features
               t ∈ T → ht
               q := hq
       ...
Semantic Concept Learning




 Feature                                Approach
     bag-of-words features                 ...
Experiments

 Dataset
        1200 hrs. (= 750k keyframes)
        233 concepts
        per concept:
               150 vi...
Experiments
    Feature & Classifier Evaluation
           81 concepts                                      feature
       ...
Insights in Web-based Concept Detection

 Good Concepts
        ”boat/ship”, ”pyramids”
        broad community of YouTube...
TubeTagger Web Demo




    User Interface
       1. select a concept
       2. or enter a query
       3. switch betwee...
TubeTagger Web Demo




    Deep Tagging
           frame-accurate concept detection beyond coarse tags
           find con...
TubeTagger Web Demo




    Text-based Search
           matches queries to concepts
           e.g. ”diving” → ”underwate...
Summary



    TubeTagger - YouTube-based Concept Detection
           utilize YouTube videos as training source for visua...
questions?


                           project site: www.dfki.de/moonvid

                  web demo at: http://madm.dfki...
TubeTagger – YouTube-based Concept Detection

Talk at the IEEE ICDM Workshop on Internet Multimedia Mining in Miami, USA

Link to original publication: http://madm.dfki.de/publication&pubid=4492

TubeTagger – YouTube-based Concept Detection

  1. 1. TubeTagger — YouTube-based Concept Detection — Adrian Ulges, Markus Koch, Damian Borth, Thomas M. Breuel German Research Center for Artificial Intelligence (DFKI) & University of Kaiserslautern December 6 2009 D.Borth: : TubeTagger 1 December 6 2009
  2. 2. Outline Motivation TubeTagger System TubeTagger Web Demo Summary
  3. 3. Data, data, data
  4. 4. Data, data, data ”...TV, video on demand, Internet video, and P2P video will account for over 91 percent of global consumer traffic by 2013...” ”Cisco Visual Networking Index”, 2009
  5. 5. Data, data, data ”...TV, video on demand, Internet video, and P2P video will account for over 91 percent of global consumer traffic by 2013...” ”Cisco Visual Networking Index”, 2009 ”...more than 20 hours of video uploaded to YouTube every minute, 1 billion views per day...” , 2009
  6. 6. Data, data, data ”...TV, video on demand, Internet video, and P2P video will account for over 91 percent of global consumer traffic by 2013...” ”Cisco Visual Networking Index”, 2009 ”...more than 20 hours of video uploaded to YouTube every minute, 1 billion views per day...” , 2009 Increasing importance of video live streams: Obama’s inauguration, Michael Jackson’s memorial service, music video debuts... , 2009
  7. 7. Content-based Video Retrieval How to search in large video databases?
  8. 8. Content-based Video Retrieval How to search in large video databases?
  9. 9. Content-based Video Retrieval How to search in large video databases? [figure: example concepts ”beach”, ”soccer”, ”interview”] Video Concept Detection [MediaMill, Columbia, IBM,...] → as key building block of CBVR
  10. 10. Content-based Video Retrieval How to search in large video databases? [figure: example concepts ”beach”, ”soccer”, ”interview”] Video Concept Detection [MediaMill, Columbia, IBM,...] → as key building block of CBVR. Supervised Machine Learning → need labeled training data
  11. 11. Training Data State-of-the-art Datasets: TRECVID, LSCOM. Acquisition: collaborative [Ayache07]; precise & high quality; time consuming effort
  12. 12. Training Data State-of-the-art Datasets: TRECVID, LSCOM. Acquisition: collaborative [Ayache07]; precise & high quality; time consuming effort; vocabulary size does not scale → from hundreds to thousands of concepts [Hauptmann07]; missing new emerging concepts of interest → e.g. ”Michael Jackson”, ”Obama”, ”iPod” (top ranked searches 2009-Q3 by “Google Insights for Search” for web search, news search and product search respectively)
  13. 13. Web-video as training data
  14. 14. Web-video as training data Pros: available at large scale; high variability of data; enriched with tags, comments, ratings; could allow automatic concept learning
  15. 15. Web-video as training data Pros: available at large scale; high variability of data; enriched with tags, comments, ratings; could allow automatic concept learning. Cons: weakly labeled (incomplete / subjective, coarse); focus of interest; domain change; video portal as black box
  16. 16. TubeTagger System Overview
  17. 17. TubeTagger System Overview Notation: t ∈ T := tags (≈concepts); x ∈ Xdb := keyframe of a video; P(t|x) := probability of tag t for keyframe x
  18. 18. TubeTagger System Overview
  19. 19. TubeTagger System Overview Training Data Acquisition: use YouTube as primary training data source; use tags as weak labels (pos. = tagged, neg. = all non-tagged)
  20. 20. TubeTagger System Overview Training Data Acquisition: use YouTube as primary training data source; use tags as weak labels (pos. = tagged, neg. = all non-tagged). Learning Pipelines: Visual Concept Learning; Semantic Concept Learning
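
The weak-labeling rule on the slides above (pos. = tagged, neg. = all non-tagged) can be illustrated with a minimal Python sketch. The function name `weak_label_split` and the toy metadata are hypothetical and only serve the illustration; the slides do not describe how the system stores or retrieves tags.

```python
def weak_label_split(video_tags, concept):
    """Split video ids into weak positives (tagged with `concept`)
    and weak negatives (every video not carrying that tag)."""
    positives = sorted(v for v, tags in video_tags.items() if concept in tags)
    negatives = sorted(v for v, tags in video_tags.items() if concept not in tags)
    return positives, negatives


# Hypothetical toy metadata: video id -> set of user-supplied tags.
video_tags = {
    "vid_a": {"soccer", "goal"},
    "vid_b": {"traffic", "city"},
    "vid_c": {"soccer", "stadium"},
}

positives, negatives = weak_label_split(video_tags, "soccer")
print(positives, negatives)
```

Because a tag applies to the whole clip, every keyframe of a positive video inherits the label, which is one source of the "weakly labeled" and "coarse" drawbacks listed among the cons.
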
  21. 21. Visual Concept Learning
  22. 22. Visual Concept Learning Features: keyframe extraction → x ∈ Xdb; bag-of-visual-words descriptors [Sivic06] (SIFT [Lowe99], SURF [Bay06])
  23. 23. Visual Concept Learning Features: keyframe extraction → x ∈ Xdb; bag-of-visual-words descriptors [Sivic06] (SIFT [Lowe99], SURF [Bay06]). Statistical Models: SVM [Schölkopf01]; PAMIR [Grangier08, Paredes09]; Max. Entropy [Deselaers05]
  24. 24. Visual Concept Learning Features: keyframe extraction → x ∈ Xdb; bag-of-visual-words descriptors [Sivic06] (SIFT [Lowe99], SURF [Bay06]). Statistical Models: SVM [Schölkopf01]; PAMIR [Grangier08, Paredes09]; Max. Entropy [Deselaers05] → output: P(t|x)
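
As a rough illustration of the bag-of-visual-words feature named above, the following Python sketch quantizes local descriptors against a codebook and builds a normalized histogram per keyframe. It assumes the codebook is already given (the slides do not say how it was built), and the 2-D vectors are toy stand-ins for real 128-D SIFT or 64-D SURF descriptors.

```python
def nearest_codeword(descriptor, codebook):
    """Index of the codebook vector closest to `descriptor`
    (squared Euclidean distance)."""
    def sq_dist(a, b):
        return sum((ai - bi) ** 2 for ai, bi in zip(a, b))
    return min(range(len(codebook)), key=lambda k: sq_dist(descriptor, codebook[k]))


def bag_of_visual_words(descriptors, codebook):
    """L1-normalized histogram of codeword assignments for one keyframe."""
    counts = [0] * len(codebook)
    for d in descriptors:
        counts[nearest_codeword(d, codebook)] += 1
    total = sum(counts)
    return [c / total for c in counts] if total else [0.0] * len(codebook)


# Toy data (assumption): three visual words, four local descriptors.
codebook = [(0.0, 0.0), (1.0, 1.0), (4.0, 0.0)]
descriptors = [(0.1, 0.2), (0.9, 1.1), (1.2, 0.8), (3.8, 0.1)]

hist = bag_of_visual_words(descriptors, codebook)
print(hist)  # [0.25, 0.5, 0.25]
```

A histogram of this kind is the fixed-length vector that a classifier such as an SVM would take as input to produce P(t|x).
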
  25. 25. Semantic Concept Learning
  26. 26. Semantic Concept Learning Query Formulation: concepts = textual queries; mapping of text queries to concept vocabulary (”computer” → Monitor, Windows-Desktop, iPhone; ”funny” → Muppets, Cats, Commercials); learning of tag co-occurrences
  27. 27. Semantic Concept Learning Feature: bag-of-words features, t ∈ T → $h_t$, q := $h_q$; mapping to concepts as weights $w(q,t) := \langle h_t, h_q \rangle$
  28. 28. Semantic Concept Learning Feature: bag-of-words features, t ∈ T → $h_t$, q := $h_q$; mapping to concepts as weights $w(q,t) := \langle h_t, h_q \rangle$. Approach: $T_5$ := the 5 tags t ∈ T with the largest $w(q,t)$; final fusion $P(q|x) = \sum_{t \in T_5} \frac{w(q,t)}{\sum_{t' \in T_5} w(q,t')} \, P(t|x)$
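
The query-to-concept mapping and final fusion above can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions: bag-of-words vectors are stored as sparse dicts, and the tag features, the query and the detector outputs P(t|x) are invented toy values. The names `dot` and `fuse_query_score` are hypothetical.

```python
def dot(h1, h2):
    """Inner product <h1, h2> of two sparse bag-of-words vectors (dicts)."""
    return sum(v * h2.get(k, 0.0) for k, v in h1.items())


def fuse_query_score(h_q, tag_features, p_t_given_x, top_k=5):
    """P(q|x): weight-normalized sum of P(t|x) over the top_k tags
    ranked by w(q,t) = <h_t, h_q>."""
    weights = {t: dot(h_t, h_q) for t, h_t in tag_features.items()}
    top = sorted(weights, key=weights.get, reverse=True)[:top_k]
    norm = sum(weights[t] for t in top)
    if norm == 0:
        return 0.0  # the query shares no words with any tag feature
    return sum(weights[t] / norm * p_t_given_x[t] for t in top)


# Toy values (assumptions, not taken from the slides).
h_q = {"diving": 1.0}
tag_features = {
    "underwater": {"diving": 3.0, "sea": 1.0},
    "fish": {"diving": 1.0},
    "soccer": {"goal": 2.0},
}
p_t_given_x = {"underwater": 0.5, "fish": 0.25, "soccer": 0.9}

score = fuse_query_score(h_q, tag_features, p_t_given_x, top_k=2)
print(score)
```

With these toy values the weights are 3 for "underwater" and 1 for "fish", so the fused score is 0.75 · 0.5 + 0.25 · 0.25. The high detector score for "soccer" does not contribute because its weight for this query is zero.
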
  29. 29. Experiments
  30. 30. Experiments Dataset: 1200 hrs. (= 750k keyframes); 233 concepts; per concept: 150 videos for training, 50 videos for testing [example keyframes: ”soccer”, ”traffic”]
  31. 31. Experiments Dataset: 1200 hrs. (= 750k keyframes); 233 concepts; per concept: 150 videos for training, 50 videos for testing [example keyframes: ”soccer”, ”traffic”]. Clips/Concept Evaluation: 10 concepts; trained on N clips; tested on 200k keyframes; saturation at 100 − 150 clips/concept for SURF+PAMIR
  32. 32. Experiments Feature & Classifier Evaluation: 81 concepts; video level testing (avg. fusion); results: MAP: 22.4% (the SVMs/SIFT entry of the table); MAP: 15.4% (the PAMIR/SURF entry; 6 times faster)
        model     SURF   SIFT
        SVMs      20.4   22.4
        PAMIR     15.4   18.4
        MAXENT    14.1   15.5
  33. 33. Experiments Feature & Classifier Evaluation: 81 concepts; video level testing (avg. fusion); results: MAP: 22.4% (the SVMs/SIFT entry of the table); MAP: 15.4% (the PAMIR/SURF entry; 6 times faster)
        model     SURF   SIFT
        SVMs      20.4   22.4
        PAMIR     15.4   18.4
        MAXENT    14.1   15.5
      Performance Distribution [figure: per-concept performance; labeled concepts include ”origami”, ”soccer”, ”bus”, ”restaurant”, ”operation-room”] showing 78 representative concepts out of 233
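
For readers unfamiliar with the evaluation terms above, here is a small Python sketch of video-level average fusion followed by average precision for one concept. It assumes the plain non-interpolated definition of average precision; the slides do not state which variant was used, and the scores and labels are toy values. MAP would then be the mean of this value over all evaluated concepts.

```python
def video_score(keyframe_scores):
    """Average fusion: mean of the keyframe-level scores P(t|x) of one video."""
    return sum(keyframe_scores) / len(keyframe_scores)


def average_precision(ranked_relevance):
    """Non-interpolated average precision of a ranked list of 0/1 labels."""
    hits, precisions = 0, []
    for rank, rel in enumerate(ranked_relevance, start=1):
        if rel:
            hits += 1
            precisions.append(hits / rank)
    return sum(precisions) / len(precisions) if precisions else 0.0


# Toy test set (assumption): keyframe scores and ground truth per video.
videos = {
    "v1": ([0.9, 0.7], 1),
    "v2": ([0.6, 0.4], 0),
    "v3": ([0.2, 0.4], 1),
}

# Rank videos by fused score, highest first, then score the ranking.
ranking = sorted(videos, key=lambda v: video_score(videos[v][0]), reverse=True)
ap = average_precision([videos[v][1] for v in ranking])
print(ranking, ap)
```

Here the ranking is v1, v2, v3, so the relevant videos sit at ranks 1 and 3 and the average precision is (1/1 + 2/3) / 2.
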
  34. 34. Insights in Web-based Concept Detection Good Concepts: ”boat/ship”, ”pyramids”; broad community of YouTube users; often ”interesting”, ”spectacular”. Redundant Concepts: ”drummer”, ”fencing”; series of clips, not sufficiently diverse; problem: generalization. Bad Concepts: ”fence”, ”operation-room”; not regularly used as a tag. [example keyframes: ”boat ship”, ”drummer”, ”fence”]
  35. 35. TubeTagger Web Demo User Interface: 1. select a concept 2. or enter a query 3. switch between video or keyframe level
  36. 36. TubeTagger Web Demo Deep Tagging: frame-accurate concept detection beyond coarse tags; find concept (e.g. ”Christmas Tree”) within a video clip
  37. 37. TubeTagger Web Demo Text-based Search: matches queries to concepts, e.g. ”diving” → ”underwater”, ”shipwreck”, ”fish”, . . .
  38. 38. Summary TubeTagger - YouTube-based Concept Detection: utilize YouTube videos as training source for visual concept detector learning; utilize YouTube tags for semantic model generation; web demo providing: deep tagging, tag recommendation, text-based search
  39. 39. questions? project site: www.dfki.de/moonvid web demo at: http://madm.dfki.de/demo/tubetagger/
