Relevance Filtering meets Active Learning
                     — Improving Web-based Concept Detectors —


                   Damian Borth*, Adrian Ulges, Thomas M. Breuel

                      German Research Center for Artificial Intelligence (DFKI) &
                               University of Kaiserslautern, Germany


                                                March 29 2010




D.Borth: : Relevance Filtering meets Active Learning            1                  March 29 2010
Outline



     Introduction


     Approach: Active Relevance Filtering


     Experimental Results


     Summary




Digital Video

     “...about 24 hours of video is uploaded every minute, 1 billion views per day...”
                                                       , 2010

     “...TV, video on demand, Internet video, and P2P video will account for over
      91 percent of global consumer traffic by 2013...”
                                                       , 2009


     Information Overload vs. Video Retrieval
             high demand for automatic machine indexing
             one solution: concept detection [Snoek09], [Smeaton09], ...
             → a key building block of content-based video retrieval (CBVR)

Concept Detection - Framework




             unknown video shot X
             concept vocabulary $t_1, \ldots, t_n$
             statistical model estimating concept presence $P(t_i \mid X)$

Concept Detection - Framework




             expert labels are used as training data
             time consuming effort [Ayache07]
             → datasets are limited in vocabulary size [Hauptmann07],
                 prone to overfitting [Yang08], and limited in flexibility
Concept Detection - Framework




             propose web video as training source      [Ulges07]

             use tags as class labels
             allows autonomous concept learning

Concept Detection - Framework




             label noise problem
                     subjective
                     coarse


Concept Detection - Framework




             relevance filtering
                     adapt concept learning to noisy labels
                     perform label refinement


Relevance Filtering Approaches

     [Diagram: weak labels are turned into filtered labels along three routes: automatic relevance filtering; manual annotation with active learning; active relevance filtering]




             auto. relevance filtering
             active learning
             combination of both → active relevance filtering

Automatic Relevance Filtering



     Idea
             take label noise into account during model training
             identify false positives and filter them out

     Related Work
             joint probabilities of tags and content    [Bernard03], [Feng04]

             neighbor voting            [Snoek09]

             sample reweighting according to inferred relevance                [Ulges08]




Automatic Relevance Filtering


     Approach         [Ulges10]


             training data:            $X = \{x_1, \ldots, x_n\}$
             training labels:          $Y = \{y_1, \ldots, y_n\}$             (known)
             true labels:              $\tilde{Y} = \{\tilde{y}_1, \ldots, \tilde{y}_n\}$     (unknown)
                $y_i = -1$             → $\tilde{y}_i = -1$
                $y_i = 1$              → $\tilde{y}_i \in \{1, -1\}$              (true pos. or false pos.)


             statistical model: kernel densities
             infer $\tilde{y}_i$ by estimating relevance scores $\beta_i = P(\tilde{y}_i = 1 \mid x_i, y_i = 1)$
                     fitted by EM
             model extension: $\phi(X, Y) \to \phi(X, Y, \beta)$
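
     Sketch (illustrative only)
             A minimal sketch of how relevance scores could be fitted by EM on top of kernel densities. It is not the formulation of [Ulges10]: the Gaussian kernel, the fixed prior, the leave-one-out step, and the names estimate_relevance, prior and bandwidth are all assumptions made for illustration.

```python
import numpy as np

def gaussian_kernel_matrix(A, B, bandwidth):
    """Pairwise Gaussian kernel values between the rows of A and the rows of B."""
    sq_dists = ((A[:, None, :] - B[None, :, :]) ** 2).sum(axis=2)
    return np.exp(-0.5 * sq_dists / bandwidth ** 2)

def estimate_relevance(X_pos, X_neg, prior=0.2, bandwidth=1.0, n_iter=20):
    """EM-style estimate of beta_i for the weakly positive samples X_pos.

    Assumption: `prior` is the expected label precision and stays fixed.
    """
    K_pp = gaussian_kernel_matrix(X_pos, X_pos, bandwidth)
    np.fill_diagonal(K_pp, 0.0)  # leave-one-out: a sample does not vote for itself
    k_neg = gaussian_kernel_matrix(X_pos, X_neg, bandwidth).sum(axis=1)
    beta = np.full(len(X_pos), prior)
    eps = 1e-12
    for _ in range(n_iter):
        # M-step: kernel densities weighted by the current relevance scores.
        p_pos = K_pp @ beta / (beta.sum() + eps)
        p_neg = (K_pp @ (1.0 - beta) + k_neg) / ((1.0 - beta).sum() + len(X_neg))
        # E-step: posterior that a weak positive is a true positive.
        num = prior * p_pos
        beta = num / (num + (1.0 - prior) * p_neg + eps)
    return beta
```

             Samples that look like the negatives end up with low scores and can be down-weighted or filtered; samples in a dense region of their own keep high scores.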


Active Learning

     Idea
             select informative samples for manual labeling

     Related Work
             text classification           [Lewis94], [Tong02], ...

             image retrieval          [Tong01], [Chang05], ...

             video data labeling             [Ayache07], [Hua08], ...




     Sample Selection Methods
        1. most relevant sampling
        2. uncertainty sampling
        3. most relevant sampling + density weighted repulsion (DWR)
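
     Sketch (illustrative only)
             The first two selection methods can be sketched as ranking functions over model scores in [0, 1]. The function names and the 0.5 decision boundary are assumptions. DWR is left out because its exact definition is not given on the slides.

```python
def most_relevant(scores, k):
    """Indices of the k pool samples with the highest relevance score."""
    return sorted(range(len(scores)), key=lambda i: -scores[i])[:k]

def uncertainty(scores, k):
    """Indices of the k pool samples whose score is closest to 0.5."""
    return sorted(range(len(scores)), key=lambda i: abs(scores[i] - 0.5))[:k]
```

             Most relevant sampling asks the annotator about likely positives, while uncertainty sampling asks about the samples the model can decide least.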

Active Learning




             pool-based active learning
             selects samples to label according to the current model
             new labeled sample helps further selection
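
     Sketch (illustrative only)
             A minimal sketch of a pool-based loop with uncertainty sampling. The nearest-centroid model is a stand-in chosen for brevity, not the detector used in the talk; fit_centroids, score_centroids, active_learning_loop and the oracle callback are hypothetical names.

```python
import numpy as np

def fit_centroids(X, y):
    """Toy model: one centroid per class (labels are +1 / -1)."""
    return X[y == 1].mean(axis=0), X[y == -1].mean(axis=0)

def score_centroids(model, X):
    """Pseudo-probability of the positive class from the centroid distances."""
    c_pos, c_neg = model
    d_pos = np.linalg.norm(X - c_pos, axis=1)
    d_neg = np.linalg.norm(X - c_neg, axis=1)
    return 1.0 / (1.0 + np.exp(d_pos - d_neg))

def active_learning_loop(X_pool, oracle, seed, n_rounds=5, batch_size=2):
    """`seed` must contain at least one sample of each class."""
    labels = {i: oracle(i) for i in seed}
    for _ in range(n_rounds):
        idx = np.array(sorted(labels))
        y = np.array([labels[i] for i in idx])
        model = fit_centroids(X_pool[idx], y)      # retrain on everything labeled so far
        margin = np.abs(score_centroids(model, X_pool) - 0.5)
        margin[idx] = np.inf                       # never re-select labeled samples
        for i in np.argsort(margin)[:batch_size]:  # most uncertain pool samples
            labels[int(i)] = oracle(int(i))        # manual annotation step
    return labels
```

             Each round retrains on the enlarged labeled set, so every new label changes which samples are selected next.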


Our Approach: Active Relevance Filtering




             active learning + auto. relevance filtering
             selects samples to label according to the filtered model
             new labeled sample helps further filtering & selection
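
     Sketch (illustrative only)
             One way to picture the combination: manual labels pin the relevance score of a sample to 0 or 1, and the filter is refitted around them before the next selection. This is a sketch under assumptions (Gaussian kernel, fixed prior, most relevant sampling), not the authors' implementation; filter_relevance, active_relevance_filtering and the oracle callback are hypothetical names.

```python
import numpy as np

def kernel(A, B, h):
    """Pairwise Gaussian kernel values with bandwidth h."""
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(axis=2)
    return np.exp(-0.5 * d2 / h ** 2)

def filter_relevance(X_pos, X_neg, manual, prior=0.2, h=1.0, n_iter=20):
    """Relevance scores for weak positives; `manual` maps index -> +1 / -1."""
    K_pp = kernel(X_pos, X_pos, h)
    np.fill_diagonal(K_pp, 0.0)                    # leave-one-out
    k_neg = kernel(X_pos, X_neg, h).sum(axis=1)
    fixed = np.array(sorted(manual), dtype=int)
    fixed_val = np.array([1.0 if manual[i] == 1 else 0.0 for i in fixed])
    beta = np.full(len(X_pos), prior)
    for _ in range(n_iter):
        beta[fixed] = fixed_val                    # manual labels override inference
        p_pos = K_pp @ beta / (beta.sum() + 1e-12)
        p_neg = (K_pp @ (1 - beta) + k_neg) / ((1 - beta).sum() + len(X_neg))
        num = prior * p_pos
        beta = num / (num + (1 - prior) * p_neg + 1e-12)
    beta[fixed] = fixed_val
    return beta

def active_relevance_filtering(X_pos, X_neg, oracle, n_rounds=5, batch_size=2):
    manual = {}
    for _ in range(n_rounds):
        beta = filter_relevance(X_pos, X_neg, manual)   # filter with labels so far
        order = [int(i) for i in np.argsort(-beta) if int(i) not in manual]
        for i in order[:batch_size]:                    # most relevant sampling
            manual[i] = oracle(i)                       # manual annotation step
    return filter_relevance(X_pos, X_neg, manual), manual
```

             Because confirmed samples feed back into the densities, each manual label also shifts the scores of its unlabeled neighbours.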


Experiments

 YouTube-22Concepts-Dataset
        100 videos per concept
        keyframes extracted (example concepts: "swimming", "cats")
        features:
                SIFT [Lowe99]
                visual words        [Sivic03]

 Setup
        subset of 10 concepts
        trained on:
                500 noisy pos. samples
                1000 neg. samples
        tested on:
                500 pos. samples
                1500 neg. samples

 Noisy Pos. Samples
        label precision of web video: 20-50% [Ulges10]
        for these experiments: 20%
        500 noisy pos. samples:
                100 true pos. samples
                400 false pos. samples
Experiments - Impact of Label Noise

 [Figure: "Relevance Filtering", mean avg. precision (axis 0.30 to 0.60) for no relevance filtering, automatic relevance filtering, and ground truth]

 System Performance
        Mean Average Precision (MAP)
        auto. relevance filtering helps
        potential gap of improvement remains

        system                      MAP
        noisy data                  0.455
        auto. relevance filtering   0.482
        ground truth                0.557
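
 Sketch (illustrative only)
        For reference, MAP can be computed from ranked result lists as below. This is the common non-interpolated definition; the slides do not state the exact evaluation protocol, so treat it as an assumption. Each ranking is a list of 0/1 relevance flags in rank order, one list per concept.

```python
def average_precision(ranked_relevance):
    """Non-interpolated average precision of one ranked list of 0/1 flags."""
    hits, total = 0, 0.0
    for rank, relevant in enumerate(ranked_relevance, start=1):
        if relevant:
            hits += 1
            total += hits / rank   # precision at this relevant item
    return total / hits if hits else 0.0

def mean_average_precision(rankings):
    """Mean of the per-concept average precisions."""
    return sum(average_precision(r) for r in rankings) / len(rankings)
```

        For example, the ranking [1, 0, 1, 0] has precisions 1/1 and 2/3 at its two relevant items, so its average precision is their mean.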
Experiments - Relevance Filtering

 [Figure, two panels: "Active Learning" and "Active Relevance Filtering". Each plots mean avg. precision (0.46 to 0.54) against labeled samples (0 to 500) for the sampling methods DWR, most relevant, uncertainty, and random, with reference levels for no relevance filtering, automatic relevance filtering, and ground truth labels.]

 Active Learning
        active learning can outperform random selection

 Active Rel. Filtering
        initial auto. relevance filtering helps
        improves active learning further
Experiments - Relevance Filtering

 Direct Comparison (DWR sampling)
        MAP by approach: AL = active learning, ARF = active relevance filtering

        # refined samples    AL       ARF
        0                    0.455    0.482
        50                   0.474    0.541
        250                  0.525    0.557
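
 Worked arithmetic (illustrative only)
        The gain of ARF over AL at each labeling budget follows directly from the direct comparison table. The values below are copied from that table; the names results and arf_gain are made up for this sketch.

```python
# MAP values (AL, ARF) copied from the direct comparison table (DWR sampling).
results = {0: (0.455, 0.482), 50: (0.474, 0.541), 250: (0.525, 0.557)}

def arf_gain(n_refined):
    """Absolute MAP gain of ARF over AL for a given number of refined samples."""
    al, arf = results[n_refined]
    return round(arf - al, 3)

for n in sorted(results):
    print(n, arf_gain(n))
```

        The gap is widest at 50 refined samples (0.067 MAP), and at 250 refined samples ARF reaches 0.557, the MAP reported for ground truth labels.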
Experiments - Top Ranked Keyframes

                                               concept: basketball




       a) no relevance filtering, b) automatic relevance filtering, c) active relevance filtering



Experiments - Top Ranked Keyframes

                                               concept: eiffeltower




       a) no relevance filtering, b) automatic relevance filtering, c) active relevance filtering



Discussion


     Contributions
             concept learning from noisy (= weakly labeled) web video
             evaluation of different refinement approaches
             proposed approach: active relevance filtering

     Experimental Results
             automatic relevance filtering helps but is limited
             active learning outperforms random selection
             active relevance filtering improves active learning further
                     auto. relevance filtering + active learning



Questions?




ChatGPT and the Future of Work - Clark Boyd
Clark Boyd
 
Getting into the tech field. what next
Getting into the tech field. what next Getting into the tech field. what next
Getting into the tech field. what next
Tessa Mero
 
Google's Just Not That Into You: Understanding Core Updates & Search Intent
Google's Just Not That Into You: Understanding Core Updates & Search IntentGoogle's Just Not That Into You: Understanding Core Updates & Search Intent
Google's Just Not That Into You: Understanding Core Updates & Search Intent
Lily Ray
 
How to have difficult conversations
How to have difficult conversations How to have difficult conversations
How to have difficult conversations
Rajiv Jayarajah, MAppComm, ACC
 
Introduction to Data Science
Introduction to Data ScienceIntroduction to Data Science
Introduction to Data Science
Christy Abraham Joy
 
Time Management & Productivity - Best Practices
Time Management & Productivity -  Best PracticesTime Management & Productivity -  Best Practices
Time Management & Productivity - Best Practices
Vit Horky
 
The six step guide to practical project management
The six step guide to practical project managementThe six step guide to practical project management
The six step guide to practical project management
MindGenius
 
Beginners Guide to TikTok for Search - Rachel Pearson - We are Tilt __ Bright...
Beginners Guide to TikTok for Search - Rachel Pearson - We are Tilt __ Bright...Beginners Guide to TikTok for Search - Rachel Pearson - We are Tilt __ Bright...
Beginners Guide to TikTok for Search - Rachel Pearson - We are Tilt __ Bright...
RachelPearson36
 

Featured (20)

2024 State of Marketing Report – by Hubspot
2024 State of Marketing Report – by Hubspot2024 State of Marketing Report – by Hubspot
2024 State of Marketing Report – by Hubspot
 
Everything You Need To Know About ChatGPT
Everything You Need To Know About ChatGPTEverything You Need To Know About ChatGPT
Everything You Need To Know About ChatGPT
 
Product Design Trends in 2024 | Teenage Engineerings
Product Design Trends in 2024 | Teenage EngineeringsProduct Design Trends in 2024 | Teenage Engineerings
Product Design Trends in 2024 | Teenage Engineerings
 
How Race, Age and Gender Shape Attitudes Towards Mental Health
How Race, Age and Gender Shape Attitudes Towards Mental HealthHow Race, Age and Gender Shape Attitudes Towards Mental Health
How Race, Age and Gender Shape Attitudes Towards Mental Health
 
AI Trends in Creative Operations 2024 by Artwork Flow.pdf
AI Trends in Creative Operations 2024 by Artwork Flow.pdfAI Trends in Creative Operations 2024 by Artwork Flow.pdf
AI Trends in Creative Operations 2024 by Artwork Flow.pdf
 
Skeleton Culture Code
Skeleton Culture CodeSkeleton Culture Code
Skeleton Culture Code
 
PEPSICO Presentation to CAGNY Conference Feb 2024
PEPSICO Presentation to CAGNY Conference Feb 2024PEPSICO Presentation to CAGNY Conference Feb 2024
PEPSICO Presentation to CAGNY Conference Feb 2024
 
Content Methodology: A Best Practices Report (Webinar)
Content Methodology: A Best Practices Report (Webinar)Content Methodology: A Best Practices Report (Webinar)
Content Methodology: A Best Practices Report (Webinar)
 
How to Prepare For a Successful Job Search for 2024
How to Prepare For a Successful Job Search for 2024How to Prepare For a Successful Job Search for 2024
How to Prepare For a Successful Job Search for 2024
 
Social Media Marketing Trends 2024 // The Global Indie Insights
Social Media Marketing Trends 2024 // The Global Indie InsightsSocial Media Marketing Trends 2024 // The Global Indie Insights
Social Media Marketing Trends 2024 // The Global Indie Insights
 
Trends In Paid Search: Navigating The Digital Landscape In 2024
Trends In Paid Search: Navigating The Digital Landscape In 2024Trends In Paid Search: Navigating The Digital Landscape In 2024
Trends In Paid Search: Navigating The Digital Landscape In 2024
 
5 Public speaking tips from TED - Visualized summary
5 Public speaking tips from TED - Visualized summary5 Public speaking tips from TED - Visualized summary
5 Public speaking tips from TED - Visualized summary
 
ChatGPT and the Future of Work - Clark Boyd
ChatGPT and the Future of Work - Clark Boyd ChatGPT and the Future of Work - Clark Boyd
ChatGPT and the Future of Work - Clark Boyd
 
Getting into the tech field. what next
Getting into the tech field. what next Getting into the tech field. what next
Getting into the tech field. what next
 
Google's Just Not That Into You: Understanding Core Updates & Search Intent
Google's Just Not That Into You: Understanding Core Updates & Search IntentGoogle's Just Not That Into You: Understanding Core Updates & Search Intent
Google's Just Not That Into You: Understanding Core Updates & Search Intent
 
How to have difficult conversations
How to have difficult conversations How to have difficult conversations
How to have difficult conversations
 
Introduction to Data Science
Introduction to Data ScienceIntroduction to Data Science
Introduction to Data Science
 
Time Management & Productivity - Best Practices
Time Management & Productivity -  Best PracticesTime Management & Productivity -  Best Practices
Time Management & Productivity - Best Practices
 
The six step guide to practical project management
The six step guide to practical project managementThe six step guide to practical project management
The six step guide to practical project management
 
Beginners Guide to TikTok for Search - Rachel Pearson - We are Tilt __ Bright...
Beginners Guide to TikTok for Search - Rachel Pearson - We are Tilt __ Bright...Beginners Guide to TikTok for Search - Rachel Pearson - We are Tilt __ Bright...
Beginners Guide to TikTok for Search - Rachel Pearson - We are Tilt __ Bright...
 

Relevance Filtering meets Active Learning: Improving Web-based Concept Detectors

  • 1. Relevance Filtering meets Active Learning: Improving Web-based Concept Detectors. Damian Borth*, Adrian Ulges, Thomas M. Breuel. German Research Center for Artificial Intelligence (DFKI) & University of Kaiserslautern, Germany. March 29 2010. Slide footer: D. Borth: Relevance Filtering meets Active Learning, March 29 2010.
  • 2. Outline: (1) Introduction, (2) Approach: Active Relevance Filtering, (3) Experimental Results, (4) Summary.
  • 7. Digital Video. "...about 24 hours of video is uploaded every minute, 1 billion views per day..." (2010). "...TV, video on demand, Internet video, and P2P video will account for over 91 percent of global consumer traffic by 2013..." (2009). Information overload vs. video retrieval: high demand for automatic machine indexing. One solution: concept detection [Snoek09], [Smeaton09], ... → as key building block of content-based video retrieval (CBVR).
  • 8. Concept Detection - Framework. Input: unknown video shot $X$. Concept vocabulary: $t_1, \dots, t_n$. A statistical model estimates concept presence $P(t_i \mid X)$.
  • 9. Concept Detection - Framework. Expert labels are used as training data. Labeling is a time-consuming effort [Ayache07], so datasets are limited in vocabulary size [Hauptmann07], prone to overfitting [Yang08], and limited in flexibility.
  • 10. Concept Detection - Framework. Proposal: web video as training source [Ulges07]. Tags are used as class labels, which allows autonomous concept learning.
  • 11. Concept Detection - Framework. Label noise problem: tags are subjective and coarse.
  • 12. Concept Detection - Framework. Relevance filtering: adapt concept learning to noisy labels; perform label refinement.
  • 13. Relevance Filtering Approaches. [Diagram: weak labels → relevance filtering → filtered labels.] Three variants: automatic relevance filtering; manual annotation with active learning; a combination of both → active relevance filtering.
  • 15. Automatic Relevance Filtering. Idea: take label noise into account during model training; identify false positives and filter them. Related work: joint probabilities of tags and content [Bernard03], [Feng04]; neighbor voting [Snoek09]; sample reweighting according to inferred relevance [Ulges08].
  • 17. Automatic Relevance Filtering. Approach [Ulges10]. Training data: $X = \{x_1, \dots, x_n\}$. Training labels: $\tilde{Y} = \{\tilde{y}_1, \dots, \tilde{y}_n\}$ (known). True labels: $Y = \{y_1, \dots, y_n\}$ (unknown). $\tilde{y}_i = -1 \Rightarrow y_i = -1$; $\tilde{y}_i = 1 \Rightarrow y_i \in \{1, -1\}$ (true pos. or false pos.). Statistical model: kernel densities. Infer $y_i$ by estimating relevance scores $\beta_i = P(y_i \mid x_i, \tilde{y}_i = 1)$, fitted by EM. Model extension: $\phi(X, \tilde{Y}) \rightarrow \phi(X, \tilde{Y}, \beta)$.
  • 20. Active Learning. Idea: select informative samples for manual labeling. Related work: text classification [Lewis94], [Tong02], ...; image retrieval [Tong01], [Chang05], ...; video data labeling [Ayache07], [Hua08], ... Sample selection methods: 1. most relevant sampling; 2. uncertainty sampling; 3. most relevant sampling + density weighted repulsion (DWR).
  • 21. Active Learning. Pool-based active learning: the next sample to label is selected according to the model; each newly labeled sample helps further selection.
  • 22. Our Approach: Active Relevance Filtering. Active learning + auto. relevance filtering: the next sample to label is selected according to the filtered model; each newly labeled sample helps further filtering and selection.
  • 26. Experiments. YouTube-22Concepts-Dataset: 100 videos per concept; keyframes extracted; features: SIFT [Lowe99] visual words [Sivic03]; example concepts shown: "swimming", "cats". Setup: subset of 10 concepts; trained on 500 noisy pos. samples and 1000 neg. samples; tested on 500 pos. samples and 1500 neg. samples. Noisy pos. samples: label precision of web video is 20-50% [Ulges10]; for these experiments: 20%, i.e. the 500 noisy pos. samples consist of 100 true pos. samples and 400 false pos. samples.
  • 27. Experiments - Impact of Label Noise. [Figure: bar chart "Relevance Filtering"; y-axis: mean avg. precision; bars: no relevance filtering, automatic relevance filt., ground truth.] System performance, Mean Average Precision (MAP): noisy data 0.455; auto. relevance filtering 0.482; ground truth 0.557. Auto. relevance filtering helps; a potential gap of improvement remains.
  • 28. Experiments - Relevance Filtering. [Figure: two plots, "Active Learning" and "Active Relevance Filtering"; x-axis: labeled samples (0 to 500); y-axis: mean avg. precision; curves: DWR, most relevant, uncertainty, random; reference levels: ground truth labels, automatic relevance filtering, no relevance filtering.] Active learning: can outperform random selection. Active rel. filtering: initial auto. relevance filtering helps and improves active learning further.
  • 29. Experiments - Relevance Filtering. [Figure: same two plots as on slide 28.] Direct comparison, DWR sampling approach, MAP by number of refined samples for active learning (AL) and active relevance filtering (ARF): 0 samples: AL 0.455, ARF 0.482; 50 samples: AL 0.474, ARF 0.541; 250 samples: AL 0.525, ARF 0.557.
  • 30. Experiments - Top Ranked Keyframes. Concept: basketball. a) no relevance filtering, b) automatic relevance filtering, c) active relevance filtering.
  • 33. Experiments - Top Ranked Keyframes. Concept: eiffeltower. a) no relevance filtering, b) automatic relevance filtering, c) active relevance filtering.
  • 37. Discussion. Contributions: concept learning from noisy (= weakly labeled) web video; evaluation of different refinement approaches; proposed approach: active relevance filtering. Experimental results: automatic relevance filtering helps but is limited; active learning outperforms random selection; active relevance filtering (auto. relevance filtering + active learning) is able to improve active learning.
  • 38. Questions?
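
The sample selection methods on slide 20 and the pool-based loop on slides 21 and 22 could be sketched roughly as follows. This is a minimal Python illustration, not the authors' implementation: the function names, the `oracle` callback standing in for the human annotator, and the choice of 0.5 as the point of maximum uncertainty are all assumptions. Density weighted repulsion (DWR) is left out because the slides do not define it, and retraining the detector after each query is only indicated by a comment.

```python
import numpy as np


def select_most_relevant(scores, unlabeled):
    """Most relevant sampling: unlabeled sample with the highest relevance score."""
    candidates = np.flatnonzero(unlabeled)
    return int(candidates[np.argmax(scores[candidates])])


def select_uncertain(scores, unlabeled):
    """Uncertainty sampling: unlabeled sample whose score is closest to 0.5 (assumed threshold)."""
    candidates = np.flatnonzero(unlabeled)
    return int(candidates[np.argmin(np.abs(scores[candidates] - 0.5))])


def refine_labels(scores, oracle, select, budget):
    """Pool-based loop: query up to `budget` samples and overwrite their scores
    with the annotator's answer (1.0 = relevant, 0.0 = false positive)."""
    scores = np.asarray(scores, dtype=float).copy()
    unlabeled = np.ones(len(scores), dtype=bool)
    queried = []
    for _ in range(min(budget, len(scores))):
        i = select(scores, unlabeled)
        scores[i] = 1.0 if oracle(i) else 0.0
        unlabeled[i] = False
        queried.append(i)
        # A full system would retrain the detector here and, for active
        # relevance filtering, re-estimate the remaining scores as well.
    return scores, queried


# Hypothetical relevance scores for five weakly positive keyframes.
beta = np.array([0.9, 0.55, 0.1, 0.48, 0.7])
truth = [True, False, False, True, True]  # what the annotator would answer
refined, order = refine_labels(beta, lambda i: truth[i], select_uncertain, budget=2)
print(order)  # [3, 1]
```

With uncertainty sampling the two keyframes nearest 0.5 are queried first; swapping in `select_most_relevant` would instead query the highest-scoring keyframes. In the active relevance filtering setting of slide 22, the initial scores would come from automatic relevance filtering rather than being fixed by hand as they are here.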