SlideShare a Scribd company logo
1 of 30
Download to read offline
CMU-Informedia @ TRECVID 2013
Multimedia Event Detection
Speaker: Lu Jiang
On behalf of CMU E-LAMP*
Carnegie Mellon University
* Zhen-Zhong Lan, Lu Jiang, Shoou-I Yu, Shourabh Rawat, Yang Cai, Chenqiang Gao,
Shicheng Xu, Haoquan Shen, Xuanchong Li, Yipei Wang, Waito Sze, Yan Yan, Zhigang
Ma, Wei Tong, Yi Yang, Susanne Burger, Florian Metze, Rita Singh, Bhiksha Raj,
Richard Stern, Teruko Mitamura, Eric Nyberg, and Alexander Hauptmann
Acknowledgement
This work was partially supported by Intelligence
Advanced Research Projects Activity (IARPA) via
Department of Interior National Business Center
contract number D11PC20068. The U.S. Government is
authorized to reproduce and distribute reprints for
Governmental purposes notwithstanding any copyright
annotation thereon. Disclaimer: The views and
conclusions contained herein are those of the authors
and should not be interpreted as necessarily
representing the official policies or endorsements,
either expressed or implied, of IARPA, DoI/NBC, or the
U.S. Government.
Outline
 MED EK0 Overview
 Related Work
 Pseudo Relevant Set Construction
 Experiment Results
 Conclusions
Outline
 MED EK0 Overview
 Related Work
 Pseudo Relevant Set Construction
 Experiment Results
 Conclusions
EK0 Pipeline
Name Birthday party
Definition
An individual celebrates a birthday
with other people etc.
Explication
Evidences
Audio
singing "Happy Birthday to You";
laughing, etc.
Activities Singing, eating, opening gifts
Object/
people
Decorations, birthday cake,
candles, gifts, etc.
Scene Indoors or outdoors, etc.
A birthday in this context is the
anniversary of a person's birth etc.
Happy Birthday,
congratulations
Birthday,
birthday party
Cake, candle,
boy, girl, food
Happy Birthday
Song, laughing
Speech
(ASR)
Optical
Character
(OCR)
Visual
Objects
Audio
Concepts
Query
Event Kit Description
(1) Query Generation (2) Retrieval
…
…
…
…
Modality
Rank list
Event Query Generator Event Search
…
Rank list
Fusion
Non-trivial to use the discriminative low-level features.
Low-level features are non-semantic.
EK0 Pipeline
Name Birthday party
Definition
An individual celebrates a birthday
with other people etc.
Explication
Evidences
Audio
singing "Happy Birthday to You";
laughing, etc.
Activities Singing, eating, opening gifts
Object/
people
Decorations, birthday cake,
candles, gifts, etc.
Scene Indoors or outdoors, etc.
A birthday in this context is the
anniversary of a person's birth etc.
Happy Birthday,
congratulations
Birthday,
birthday party
Cake, candle,
boy, girl, food
Happy Birthday
Song, laughing
Speech
(ASR)
Optical
Character
(OCR)
Visual
Objects
Audio
Concepts
Query
Event Kit Description
(1) Query Generation (2) Retrieval
…
…
…
…
Modality
Rank list
Event Query Generator Event Search
…
Rank list
Fusion
No way to use the discriminative low-level features such as SIFT.
Low-level features are non-semantic.
EK0 Pipeline with MMPRF
Name Birthday party
Definition
An individual celebrates a birthday
with other people etc.
Explication
Evidences
Audio
singing "Happy Birthday to You";
laughing, etc.
Activities Singing, eating, opening gifts
Object/
people
Decorations, birthday cake,
candles, gifts, etc.
Scene Indoors or outdoors, etc.
A birthday in this context is the
anniversary of a person's birth etc.
Happy Birthday,
congratulations
Birthday,
birthday party
Cake, candle,
boy, girl, food
Happy Birthday
Song, laughing
Speech
(ASR)
Optical
Character
(OCR)
Visual
Objects
Audio
Concepts
Query
Event Kit Description
Pseudo Label Set
+
+ +
+
+
+
+
- -
-
-
--- -
Fusion Model
(1) Query Generation (2) Retrieval
…
…
…
…
+
+ +
+
+
+
+
- -
-
-
--- -
…
Modality
Rank list
(3) MMPRF
Low-level
features
High-level
features
Feedback
Rank list
Event Query Generator Event Search
Leverage both high-level features (semantic concepts) and low-level
features(Dense trajectory and SIFT)
MultiModal Pseudo Relevance Feedback
(MMPRF)
• MultiModal Pseudo Relevance Feedback (MMPRF) in a
nutshell:
– Construct a pseudo label set.
– Find a fusion model on the pseudo label set using both high-
level and low-level features.
– Feedback the ranked list of the fusion model to establish the
pseudo label set for the next iteration.
• MultiModal: the feedback is carried out on multimodal
data (or multiple ranked lists).
• Pseudo: no ground-truth training data or manual
relevance judgment is used.
• MMPRF is completely automatic event search.
Outline
 MED EK0 Overview
 Related Work
 Pseudo Relevant Set Construction
 Experiment Results
 Conclusions
Pseudo Relevance Feedback(PRF)
• In text retrieval:
– Rocchio algorithm (Joachims, 1996)
– Relevance Model (Lavrenko, 2001)
• In multimedia retrieval:
– Classification-based PRF (Yan, 2003)(Hauptmann, 2008)
– Learning to rank (Liu, 2008)
• In existing methods, the initial feedback ranking score is
obtained from a single modality(a single ranked list).
Classification-based PRF (CPRF)
• Treat top-ranked videos as
pseudo-positives.
• Treat bottom-ranked videos as
pseudo-negatives.
Work reasonably well on unimodal
data(Yan, 2003).
Cause inconsistency on
multimodal data.
… +
+
+
--
Rank list
£1
1
2
3
999
1000
Inconsistency on multimodal data
Considering each modality independently may
cause the inconsistency on multimodal data
… +
+
+
--
Rank list
£1
1
2
3
999
1000
…
+
++
-
-
Rank list
£2
1
2
3
999
1000
Modality 1 Modality 2
Outline
 MED EK0 Overview
 Related Work
 Pseudo Relevant Set Construction
 Experiment Results
 Conclusions
Pseudo Relevant Set Construction
• Each modality has its own preference on which pseudo-
positives to choose.
• The desired label set satisfies the most modalities.
• The selection process is analogous to voting (unavailable in
the single modality)
• Every modality votes for some pseudo-positives.
• The better the label set fits a modality, the higher the vote is.
• The set with the highest votes is selected as the pseudo label
set.
• A principled approach Maximum Likelihood Estimation
(MLE).
MMPRF
• Our objective is to find a pseudo label set that maximizes the
likelihood of all modalities.
arg max
y
mX
i=1
X
dj2Ð
yjµT
i wij s.t.
AT
y · g; y 2 f0; 1gjÐj
MMPRF
• Our objective is to find a pseudo label set that maximizes the
likelihood of all modalities.
• Objective function: The sum of logarithmic likelihood across all
modalities.
arg max
y
mX
i=1
X
dj2Ð
yjµT
i wij s.t.
AT
y · g; y 2 f0; 1gjÐj
MMPRF
• Our objective is to find a pseudo label set that maximizes the
likelihood of all modalities.
• Objective function: The sum of logarithmic likelihood across all
modalities.
• The constraint controls the maximum number of pseudo-positives
to be selected in each modality.
arg max
y
mX
i=1
X
dj2Ð
yjµT
i wij s.t.
AT
y · g; y 2 f0; 1gjÐj
MMPRF
• Our objective is to find a pseudo label set that maximizes the
likelihood of all modalities.
• Objective function: The sum of logarithmic likelihood across all
modalities.
• The constraint controls the maximum number of pseudo-positives
to be selected in each modality.
• The objective function is linear to the y variable  Integer
Programming.
arg max
y
mX
i=1
X
dj2Ð
yjµT
i wij s.t.
AT
y · g; y 2 f0; 1gjÐj
MMPRF
• Our objective is to find a pseudo label set that maximizes the
likelihood of all modalities.
• Objective function: The sum of logarithmic likelihood across all
modalities.
• The constraint controls the maximum number of pseudo-positives
to be selected in each modality.
• The objective function is linear to the y variable  Integer
Programming.
• Relaxed to linear programming if . Efficiently solvable.
Time complexity (by default ). Cost less than 10
seconds on a single core on MEDTest.
arg max
y
mX
i=1
X
dj2Ð
yjµT
i wij s.t.
AT
y · g; y 2 f0; 1gjÐj
0 · y · 1
jÐj3
jÐj = 50
Pseudo Label Set Construction
With Late Fusion
• Can we use late fusion to construct the pseudo label set?
That is first average the scores of all ranked lists and then
select the top-ranked videos as pseudo-positives.
• Yes. If we change the objective function.
• Expectation versus Likelihood.
• Late fusion finds the optimal solution to the problem.
• Theoretical justification for the late fusion.
• Unfortunately however, optimizing expectation is 50%
worse than optimizing likelihood on MEDTest.
argmax
y
E[yjÐ;£i] =
X
dj2Ð
yjP(yjjdj;£i) s.t.
JT
y · k+
1; y 2 f0; 1gjÐj
Modality Weighting
• At most how many pseudo-positive to select in each
modality?
• Estimate using modality accuracy:
– Query likelihood: a modality whose top-ranked videos contain
more query words is supposed to be more important.
– Find indicative words in the event kit description. For example,
the occurrence of words “narration/narrating” and “process” in
the event kit description indicates an “accurate ASR event”.
AT
y · g
Outline
 MED EK0 Overview
 Related Work
 Pseudo Relevant Set Construction
 Experiment Results
 Conclusions
Results on MEDTest
• MMPRF1: w/o modality weighting. MMPRF2: w/ modality weighting.
• Improve the baseline Without PRF by a relative 158% (absolute 6.2%)
on Pre-Specified events and by a relative 107% (absolute 4.3%) on Ad-
Hoc events.
• Statistically significantly better than other baseline methods.
Official Results on PROGAll
EK0 Results FullSys ASRSys AudioSys OCRSys VisualSys
CMU Pre-Specified 3.7 1.8 0.3 2.1 2.4
CMU Ad-Hoc 10.1 3.1 0.2 2.8 5.2
• MMPRF is only used in our FullSys.
• Relative improvement of the FullSys over the best sub-system.
• by a relative 54% on Pre-Specified events.
• by a relative 94% on Ad-Hoc events.
Results on PROGAll
• Relative Improvement of FullSys over the best
SubSys.
Pre-Specified Ad-Hoc
-1.0
-0.8
-0.6
-0.4
-0.2
0.0
0.2
0.4
0.6
0.8
1.0
1.2
Relativeimprovment
-0.3
-0.2
-0.1
0
0.1
0.2
0.3
0.4
0.5
0.6
Relativeimprovment
Outline
 MED EK0 Overview
 Related Work
 Pseudo Relevant Set Construction
 Experiment Results
 Conclusions
Conclusions
 A few things to take away from this talk:
 MultiModal Pseudo Relevance Feedback (MMPRF) is a first
attempt to use both high-level and low-level features in
MED EK0.
 MMPRF offers a solution to conduct PRF on multiple ranked
lists. Empirically it significantly outperforms all baseline
methods on MEDTest.
 Modality weighting is beneficial.
References
• Joachims, Thorsten. A Probabilistic Analysis of the Rocchio Algorithm with TFIDF for Text Categorization.
No. CMU-CS-96-118. CARNEGIE-MELLON UNIV PITTSBURGH PA DEPT OF COMPUTER SCIENCE, 1996.
• Yan, Rong, Alexander G. Hauptmann, and Rong Jin. "Negative pseudo-relevance feedback in content-based
video retrieval." Proceedings of the eleventh ACM international conference on Multimedia. ACM, 2003.
• Lavrenko, Victor, and W. Bruce Croft. "Relevance based language models."Proceedings of the 24th annual
international ACM SIGIR conference on Research and development in information retrieval. ACM, 2001.
• Liu, Yuan, et al. "Learning to video search rerank via pseudo preference feedback." Multimedia and Expo,
2008 IEEE International Conference on. IEEE, 2008.
• Bao, Lei, et al. "Informedia@ trecvid 2011." TRECVID2011, NIST (2011).
• Jiang, Lu, Alexander G. Hauptmann, and Guang Xiang. "Leveraging high-level and low-level features for
multimedia event detection." Proceedings of the 20th ACM international conference on Multimedia. ACM,
2012.
• Tong, Wei, et al. "E-LAMP: integration of innovative ideas for multimedia event detection." Machine Vision
and Applications (2013): 1-11.
• Hauptmann, Alexander G., Michael G. Christel, and Rong Yan. "Video retrieval based on semantic
concepts." Proceedings of the IEEE 96.4 (2008): 602-622.
Name Birthday party
Definition
An individual celebrates a birthday
with other people etc.
Explication
Evidences
Audio
singing "Happy Birthday to You";
laughing, etc.
Activities Singing, eating, opening gifts
Object/
people
Decorations, birthday cake,
candles, gifts, etc.
Scene Indoors or outdoors, etc.
A birthday in this context is the
anniversary of a person's birth etc.
Happy Birthday,
congratulations
Birthday,
birthday party
Cake, candle,
boy, girl, food
Happy Birthday
Song, laughing
Speech
(ASR)
Optical
Character
(OCR)
Visual
Objects
Audio
Concepts
Query
Event Kit Description
Pseudo Label Set
+
+ +
+
+
+
+
- -
-
-
--- -
Fusion Model
(1) Query Generation (2) Retrieval
…
…
…
…
+
+ +
+
+
+
+
- -
-
-
--- -
…
Modality
Rank list
(3) MMPRF
Low-level
features
High-level
features
Feedback
Rank list
Event Query Generator Event Search

More Related Content

Similar to CMU Trecvid med13 nist

840 plenary elder_using his laptop
840 plenary elder_using his laptop840 plenary elder_using his laptop
840 plenary elder_using his laptopRising Media, Inc.
 
know Machine Learning Basic Concepts.pdf
know Machine Learning Basic Concepts.pdfknow Machine Learning Basic Concepts.pdf
know Machine Learning Basic Concepts.pdfhemangppatel
 
Predicting and Optimizing the End Price of an Online Auction using Genetic-Fu...
Predicting and Optimizing the End Price of an Online Auction using Genetic-Fu...Predicting and Optimizing the End Price of an Online Auction using Genetic-Fu...
Predicting and Optimizing the End Price of an Online Auction using Genetic-Fu...Pratheeban Rajendran
 
Scott Clark, Software Engineer, Yelp at MLconf SF
Scott Clark, Software Engineer, Yelp at MLconf SFScott Clark, Software Engineer, Yelp at MLconf SF
Scott Clark, Software Engineer, Yelp at MLconf SFMLconf
 
Meet up september19-final
Meet up september19-finalMeet up september19-final
Meet up september19-finalIdo Rozen
 
WWW 2021report public
WWW 2021report publicWWW 2021report public
WWW 2021report publicTakuma Oda
 
Artificial Agents Without Ontological Access to Reality
Artificial Agents Without Ontological Access to RealityArtificial Agents Without Ontological Access to Reality
Artificial Agents Without Ontological Access to Realityogeorgeon
 
An Early Warning System for Ambient Assisted Living
An Early Warning System for Ambient Assisted LivingAn Early Warning System for Ambient Assisted Living
An Early Warning System for Ambient Assisted LivingAndrea Monacchi
 
Weakly supervised PICO information extraction using Snorkel
Weakly supervised PICO information extraction using SnorkelWeakly supervised PICO information extraction using Snorkel
Weakly supervised PICO information extraction using SnorkelAnjani Dhrangadhariya
 
Research Method Review Report ( Experimentation )
Research Method Review Report ( Experimentation )Research Method Review Report ( Experimentation )
Research Method Review Report ( Experimentation )Jennifer Campbell
 
Saliency Based Hookworm and Infection Detection for Wireless Capsule Endoscop...
Saliency Based Hookworm and Infection Detection for Wireless Capsule Endoscop...Saliency Based Hookworm and Infection Detection for Wireless Capsule Endoscop...
Saliency Based Hookworm and Infection Detection for Wireless Capsule Endoscop...IRJET Journal
 
Using Local Spectral Methods to Robustify Graph-Based Learning
Using Local Spectral Methods to Robustify Graph-Based LearningUsing Local Spectral Methods to Robustify Graph-Based Learning
Using Local Spectral Methods to Robustify Graph-Based LearningDavid Gleich
 

Similar to CMU Trecvid med13 nist (20)

840 plenary elder_using his laptop
840 plenary elder_using his laptop840 plenary elder_using his laptop
840 plenary elder_using his laptop
 
know Machine Learning Basic Concepts.pdf
know Machine Learning Basic Concepts.pdfknow Machine Learning Basic Concepts.pdf
know Machine Learning Basic Concepts.pdf
 
From data lakes to actionable data (adventures in data curation)
From data lakes to actionable data (adventures in data curation)From data lakes to actionable data (adventures in data curation)
From data lakes to actionable data (adventures in data curation)
 
Predicting and Optimizing the End Price of an Online Auction using Genetic-Fu...
Predicting and Optimizing the End Price of an Online Auction using Genetic-Fu...Predicting and Optimizing the End Price of an Online Auction using Genetic-Fu...
Predicting and Optimizing the End Price of an Online Auction using Genetic-Fu...
 
50120140504022
5012014050402250120140504022
50120140504022
 
Breast Cancer
Breast CancerBreast Cancer
Breast Cancer
 
Scott Clark, Software Engineer, Yelp at MLconf SF
Scott Clark, Software Engineer, Yelp at MLconf SFScott Clark, Software Engineer, Yelp at MLconf SF
Scott Clark, Software Engineer, Yelp at MLconf SF
 
Meet up september19-final
Meet up september19-finalMeet up september19-final
Meet up september19-final
 
Ml in genomics
Ml in genomicsMl in genomics
Ml in genomics
 
920 plenary elder
920 plenary elder920 plenary elder
920 plenary elder
 
910 plenary Elder
910 plenary Elder910 plenary Elder
910 plenary Elder
 
WWW 2021report public
WWW 2021report publicWWW 2021report public
WWW 2021report public
 
Artificial Agents Without Ontological Access to Reality
Artificial Agents Without Ontological Access to RealityArtificial Agents Without Ontological Access to Reality
Artificial Agents Without Ontological Access to Reality
 
An Early Warning System for Ambient Assisted Living
An Early Warning System for Ambient Assisted LivingAn Early Warning System for Ambient Assisted Living
An Early Warning System for Ambient Assisted Living
 
Weakly supervised PICO information extraction using Snorkel
Weakly supervised PICO information extraction using SnorkelWeakly supervised PICO information extraction using Snorkel
Weakly supervised PICO information extraction using Snorkel
 
Research Method Review Report ( Experimentation )
Research Method Review Report ( Experimentation )Research Method Review Report ( Experimentation )
Research Method Review Report ( Experimentation )
 
Kaggle kenneth
Kaggle kennethKaggle kenneth
Kaggle kenneth
 
Saliency Based Hookworm and Infection Detection for Wireless Capsule Endoscop...
Saliency Based Hookworm and Infection Detection for Wireless Capsule Endoscop...Saliency Based Hookworm and Infection Detection for Wireless Capsule Endoscop...
Saliency Based Hookworm and Infection Detection for Wireless Capsule Endoscop...
 
Using Local Spectral Methods to Robustify Graph-Based Learning
Using Local Spectral Methods to Robustify Graph-Based LearningUsing Local Spectral Methods to Robustify Graph-Based Learning
Using Local Spectral Methods to Robustify Graph-Based Learning
 
Pro max icdm2012-slides
Pro max icdm2012-slidesPro max icdm2012-slides
Pro max icdm2012-slides
 

Recently uploaded

How to Choose the Right Laravel Development Partner in New York City_compress...
How to Choose the Right Laravel Development Partner in New York City_compress...How to Choose the Right Laravel Development Partner in New York City_compress...
How to Choose the Right Laravel Development Partner in New York City_compress...software pro Development
 
How To Troubleshoot Collaboration Apps for the Modern Connected Worker
How To Troubleshoot Collaboration Apps for the Modern Connected WorkerHow To Troubleshoot Collaboration Apps for the Modern Connected Worker
How To Troubleshoot Collaboration Apps for the Modern Connected WorkerThousandEyes
 
Diamond Application Development Crafting Solutions with Precision
Diamond Application Development Crafting Solutions with PrecisionDiamond Application Development Crafting Solutions with Precision
Diamond Application Development Crafting Solutions with PrecisionSolGuruz
 
call girls in Vaishali (Ghaziabad) 🔝 >༒8448380779 🔝 genuine Escort Service 🔝✔️✔️
call girls in Vaishali (Ghaziabad) 🔝 >༒8448380779 🔝 genuine Escort Service 🔝✔️✔️call girls in Vaishali (Ghaziabad) 🔝 >༒8448380779 🔝 genuine Escort Service 🔝✔️✔️
call girls in Vaishali (Ghaziabad) 🔝 >༒8448380779 🔝 genuine Escort Service 🔝✔️✔️Delhi Call girls
 
HR Software Buyers Guide in 2024 - HRSoftware.com
HR Software Buyers Guide in 2024 - HRSoftware.comHR Software Buyers Guide in 2024 - HRSoftware.com
HR Software Buyers Guide in 2024 - HRSoftware.comFatema Valibhai
 
The Ultimate Test Automation Guide_ Best Practices and Tips.pdf
The Ultimate Test Automation Guide_ Best Practices and Tips.pdfThe Ultimate Test Automation Guide_ Best Practices and Tips.pdf
The Ultimate Test Automation Guide_ Best Practices and Tips.pdfkalichargn70th171
 
Software Quality Assurance Interview Questions
Software Quality Assurance Interview QuestionsSoftware Quality Assurance Interview Questions
Software Quality Assurance Interview QuestionsArshad QA
 
Optimizing AI for immediate response in Smart CCTV
Optimizing AI for immediate response in Smart CCTVOptimizing AI for immediate response in Smart CCTV
Optimizing AI for immediate response in Smart CCTVshikhaohhpro
 
call girls in Vaishali (Ghaziabad) 🔝 >༒8448380779 🔝 genuine Escort Service 🔝✔️✔️
call girls in Vaishali (Ghaziabad) 🔝 >༒8448380779 🔝 genuine Escort Service 🔝✔️✔️call girls in Vaishali (Ghaziabad) 🔝 >༒8448380779 🔝 genuine Escort Service 🔝✔️✔️
call girls in Vaishali (Ghaziabad) 🔝 >༒8448380779 🔝 genuine Escort Service 🔝✔️✔️Delhi Call girls
 
Direct Style Effect Systems - The Print[A] Example - A Comprehension Aid
Direct Style Effect Systems -The Print[A] Example- A Comprehension AidDirect Style Effect Systems -The Print[A] Example- A Comprehension Aid
Direct Style Effect Systems - The Print[A] Example - A Comprehension AidPhilip Schwarz
 
Azure_Native_Qumulo_High_Performance_Compute_Benchmarks.pdf
Azure_Native_Qumulo_High_Performance_Compute_Benchmarks.pdfAzure_Native_Qumulo_High_Performance_Compute_Benchmarks.pdf
Azure_Native_Qumulo_High_Performance_Compute_Benchmarks.pdfryanfarris8
 
How To Use Server-Side Rendering with Nuxt.js
How To Use Server-Side Rendering with Nuxt.jsHow To Use Server-Side Rendering with Nuxt.js
How To Use Server-Side Rendering with Nuxt.jsAndolasoft Inc
 
5 Signs You Need a Fashion PLM Software.pdf
5 Signs You Need a Fashion PLM Software.pdf5 Signs You Need a Fashion PLM Software.pdf
5 Signs You Need a Fashion PLM Software.pdfWave PLM
 
8257 interfacing 2 in microprocessor for btech students
8257 interfacing 2 in microprocessor for btech students8257 interfacing 2 in microprocessor for btech students
8257 interfacing 2 in microprocessor for btech studentsHimanshiGarg82
 
W01_panagenda_Navigating-the-Future-with-The-Hitchhikers-Guide-to-Notes-and-D...
W01_panagenda_Navigating-the-Future-with-The-Hitchhikers-Guide-to-Notes-and-D...W01_panagenda_Navigating-the-Future-with-The-Hitchhikers-Guide-to-Notes-and-D...
W01_panagenda_Navigating-the-Future-with-The-Hitchhikers-Guide-to-Notes-and-D...panagenda
 
Exploring the Best Video Editing App.pdf
Exploring the Best Video Editing App.pdfExploring the Best Video Editing App.pdf
Exploring the Best Video Editing App.pdfproinshot.com
 
Reassessing the Bedrock of Clinical Function Models: An Examination of Large ...
Reassessing the Bedrock of Clinical Function Models: An Examination of Large ...Reassessing the Bedrock of Clinical Function Models: An Examination of Large ...
Reassessing the Bedrock of Clinical Function Models: An Examination of Large ...harshavardhanraghave
 
Tech Tuesday-Harness the Power of Effective Resource Planning with OnePlan’s ...
Tech Tuesday-Harness the Power of Effective Resource Planning with OnePlan’s ...Tech Tuesday-Harness the Power of Effective Resource Planning with OnePlan’s ...
Tech Tuesday-Harness the Power of Effective Resource Planning with OnePlan’s ...OnePlan Solutions
 
Define the academic and professional writing..pdf
Define the academic and professional writing..pdfDefine the academic and professional writing..pdf
Define the academic and professional writing..pdfPearlKirahMaeRagusta1
 

Recently uploaded (20)

How to Choose the Right Laravel Development Partner in New York City_compress...
How to Choose the Right Laravel Development Partner in New York City_compress...How to Choose the Right Laravel Development Partner in New York City_compress...
How to Choose the Right Laravel Development Partner in New York City_compress...
 
How To Troubleshoot Collaboration Apps for the Modern Connected Worker
How To Troubleshoot Collaboration Apps for the Modern Connected WorkerHow To Troubleshoot Collaboration Apps for the Modern Connected Worker
How To Troubleshoot Collaboration Apps for the Modern Connected Worker
 
Diamond Application Development Crafting Solutions with Precision
Diamond Application Development Crafting Solutions with PrecisionDiamond Application Development Crafting Solutions with Precision
Diamond Application Development Crafting Solutions with Precision
 
call girls in Vaishali (Ghaziabad) 🔝 >༒8448380779 🔝 genuine Escort Service 🔝✔️✔️
call girls in Vaishali (Ghaziabad) 🔝 >༒8448380779 🔝 genuine Escort Service 🔝✔️✔️call girls in Vaishali (Ghaziabad) 🔝 >༒8448380779 🔝 genuine Escort Service 🔝✔️✔️
call girls in Vaishali (Ghaziabad) 🔝 >༒8448380779 🔝 genuine Escort Service 🔝✔️✔️
 
HR Software Buyers Guide in 2024 - HRSoftware.com
HR Software Buyers Guide in 2024 - HRSoftware.comHR Software Buyers Guide in 2024 - HRSoftware.com
HR Software Buyers Guide in 2024 - HRSoftware.com
 
The Ultimate Test Automation Guide_ Best Practices and Tips.pdf
The Ultimate Test Automation Guide_ Best Practices and Tips.pdfThe Ultimate Test Automation Guide_ Best Practices and Tips.pdf
The Ultimate Test Automation Guide_ Best Practices and Tips.pdf
 
Software Quality Assurance Interview Questions
Software Quality Assurance Interview QuestionsSoftware Quality Assurance Interview Questions
Software Quality Assurance Interview Questions
 
Optimizing AI for immediate response in Smart CCTV
Optimizing AI for immediate response in Smart CCTVOptimizing AI for immediate response in Smart CCTV
Optimizing AI for immediate response in Smart CCTV
 
call girls in Vaishali (Ghaziabad) 🔝 >༒8448380779 🔝 genuine Escort Service 🔝✔️✔️
call girls in Vaishali (Ghaziabad) 🔝 >༒8448380779 🔝 genuine Escort Service 🔝✔️✔️call girls in Vaishali (Ghaziabad) 🔝 >༒8448380779 🔝 genuine Escort Service 🔝✔️✔️
call girls in Vaishali (Ghaziabad) 🔝 >༒8448380779 🔝 genuine Escort Service 🔝✔️✔️
 
Direct Style Effect Systems - The Print[A] Example - A Comprehension Aid
Direct Style Effect Systems -The Print[A] Example- A Comprehension AidDirect Style Effect Systems -The Print[A] Example- A Comprehension Aid
Direct Style Effect Systems - The Print[A] Example - A Comprehension Aid
 
Azure_Native_Qumulo_High_Performance_Compute_Benchmarks.pdf
Azure_Native_Qumulo_High_Performance_Compute_Benchmarks.pdfAzure_Native_Qumulo_High_Performance_Compute_Benchmarks.pdf
Azure_Native_Qumulo_High_Performance_Compute_Benchmarks.pdf
 
How To Use Server-Side Rendering with Nuxt.js
How To Use Server-Side Rendering with Nuxt.jsHow To Use Server-Side Rendering with Nuxt.js
How To Use Server-Side Rendering with Nuxt.js
 
5 Signs You Need a Fashion PLM Software.pdf
5 Signs You Need a Fashion PLM Software.pdf5 Signs You Need a Fashion PLM Software.pdf
5 Signs You Need a Fashion PLM Software.pdf
 
8257 interfacing 2 in microprocessor for btech students
8257 interfacing 2 in microprocessor for btech students8257 interfacing 2 in microprocessor for btech students
8257 interfacing 2 in microprocessor for btech students
 
W01_panagenda_Navigating-the-Future-with-The-Hitchhikers-Guide-to-Notes-and-D...
W01_panagenda_Navigating-the-Future-with-The-Hitchhikers-Guide-to-Notes-and-D...W01_panagenda_Navigating-the-Future-with-The-Hitchhikers-Guide-to-Notes-and-D...
W01_panagenda_Navigating-the-Future-with-The-Hitchhikers-Guide-to-Notes-and-D...
 
CHEAP Call Girls in Pushp Vihar (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICE
CHEAP Call Girls in Pushp Vihar (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICECHEAP Call Girls in Pushp Vihar (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICE
CHEAP Call Girls in Pushp Vihar (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICE
 
Exploring the Best Video Editing App.pdf
Exploring the Best Video Editing App.pdfExploring the Best Video Editing App.pdf
Exploring the Best Video Editing App.pdf
 
Reassessing the Bedrock of Clinical Function Models: An Examination of Large ...
Reassessing the Bedrock of Clinical Function Models: An Examination of Large ...Reassessing the Bedrock of Clinical Function Models: An Examination of Large ...
Reassessing the Bedrock of Clinical Function Models: An Examination of Large ...
 
Tech Tuesday-Harness the Power of Effective Resource Planning with OnePlan’s ...
Tech Tuesday-Harness the Power of Effective Resource Planning with OnePlan’s ...Tech Tuesday-Harness the Power of Effective Resource Planning with OnePlan’s ...
Tech Tuesday-Harness the Power of Effective Resource Planning with OnePlan’s ...
 
Define the academic and professional writing..pdf
Define the academic and professional writing..pdfDefine the academic and professional writing..pdf
Define the academic and professional writing..pdf
 

CMU Trecvid med13 nist

  • 1. CMU-Informedia @ TRECVID 2013 Multimedia Event Detection Speaker: Lu Jiang On behalf of CMU E-LAMP* Carnegie Mellon University * Zhen-Zhong Lan, Lu Jiang, Shoou-I Yu, Shourabh Rawat, Yang Cai, Chenqiang Gao, Shicheng Xu, Haoquan Shen, Xuanchong Li, Yipei Wang, Waito Sze, Yan Yan, Zhigang Ma, Wei Tong, Yi Yang, Susanne Burger, Florian Metze, Rita Singh, Bhiksha Raj, Richard Stern, Teruko Mitamura, Eric Nyberg, and Alexander Hauptmann
  • 2. Acknowledgement This work was partially supported by Intelligence Advanced Research Projects Activity (IARPA) via Department of Interior National Business Center contract number D11PC20068. The U.S. Government is authorized to reproduce and distribute reprints for Governmental purposes notwithstanding any copyright annotation thereon. Disclaimer: The views and conclusions contained herein are those of the authors and should not be interpreted as necessarily representing the official policies or endorsements, either expressed or implied, of IARPA, DoI/NBC, or the U.S. Government.
  • 3. Outline  MED EK0 Overview  Related Work  Pseudo Relevant Set Construction  Experiment Results  Conclusions
  • 4. Outline  MED EK0 Overview  Related Work  Pseudo Relevant Set Construction  Experiment Results  Conclusions
  • 5. EK0 Pipeline Name Birthday party Definition An individual celebrates a birthday with other people etc. Explication Evidences Audio singing "Happy Birthday to You"; laughing, etc. Activities Singing, eating, opening gifts Object/ people Decorations, birthday cake, candles, gifts, etc. Scene Indoors or outdoors, etc. A birthday in this context is the anniversary of a person's birth etc. Happy Birthday, congratulations Birthday, birthday party Cake, candle, boy, girl, food Happy Birthday Song, laughing Speech (ASR) Optical Character (OCR) Visual Objects Audio Concepts Query Event Kit Description (1) Query Generation (2) Retrieval … … … … Modality Rank list Event Query Generator Event Search … Rank list Fusion Non-trivial to use the discriminative low-level features. Low-level features are non-semantic.
  • 6. EK0 Pipeline Name Birthday party Definition An individual celebrates a birthday with other people etc. Explication Evidences Audio singing "Happy Birthday to You"; laughing, etc. Activities Singing, eating, opening gifts Object/ people Decorations, birthday cake, candles, gifts, etc. Scene Indoors or outdoors, etc. A birthday in this context is the anniversary of a person's birth etc. Happy Birthday, congratulations Birthday, birthday party Cake, candle, boy, girl, food Happy Birthday Song, laughing Speech (ASR) Optical Character (OCR) Visual Objects Audio Concepts Query Event Kit Description (1) Query Generation (2) Retrieval … … … … Modality Rank list Event Query Generator Event Search … Rank list Fusion No way to use the discriminative low-level features such as SIFT. Low-level features are non-semantic.
  • 7. EK0 Pipeline with MMPRF Name Birthday party Definition An individual celebrates a birthday with other people etc. Explication Evidences Audio singing "Happy Birthday to You"; laughing, etc. Activities Singing, eating, opening gifts Object/ people Decorations, birthday cake, candles, gifts, etc. Scene Indoors or outdoors, etc. A birthday in this context is the anniversary of a person's birth etc. Happy Birthday, congratulations Birthday, birthday party Cake, candle, boy, girl, food Happy Birthday Song, laughing Speech (ASR) Optical Character (OCR) Visual Objects Audio Concepts Query Event Kit Description Pseudo Label Set + + + + + + + - - - - --- - Fusion Model (1) Query Generation (2) Retrieval … … … … + + + + + + + - - - - --- - … Modality Rank list (3) MMPRF Low-level features High-level features Feedback Rank list Event Query Generator Event Search Leverage both high-level features (semantic concepts) and low-level features(Dense trajectory and SIFT)
  • 8. MultiModal Pseudo Relevance Feedback (MMPRF) • MultiModal Pseudo Relevance Feedback (MMPRF) in a nutshell: – Construct a pseudo label set. – Find a fusion model on the pseudo label set using both high- level and low-level features. – Feedback the ranked list of the fusion model to establish the pseudo label set for the next iteration. • MultiModal: the feedback is carried out on multimodal data (or multiple ranked lists). • Pseudo: no ground-truth training data or manual relevance judgment is used. • MMPRF is completely automatic event search.
  • 9. Outline  MED EK0 Overview  Related Work  Pseudo Relevant Set Construction  Experiment Results  Conclusions
  • 10. Pseudo Relevance Feedback(PRF) • In text retrieval: – Rocchio algorithm (Joachims, 1996) – Relevance Model (Lavrenko, 2001) • In multimedia retrieval: – Classification-based PRF (Yan, 2003)(Hauptmann, 2008) – Learning to rank (Liu, 2008) • In existing methods, the initial feedback ranking score is obtained from a single modality(a single ranked list).
  • 11. Classification-based PRF (CPRF) • Treat top-ranked videos as pseudo-positives. • Treat bottom-ranked videos as pseudo-negatives. Work reasonably well on unimodal data(Yan, 2003). Cause inconsistency on multimodal data. … + + + -- Rank list £1 1 2 3 999 1000
  • 12. Inconsistency on multimodal data Considering each modality independently may cause the inconsistency on multimodal data … + + + -- Rank list £1 1 2 3 999 1000 … + ++ - - Rank list £2 1 2 3 999 1000 Modality 1 Modality 2
  • 13. Outline  MED EK0 Overview  Related Work  Pseudo Relevant Set Construction  Experiment Results  Conclusions
  • 14. Pseudo Relevant Set Construction • Each modality has its own preference on which pseudo- positives to choose. • The desired label set satisfies the most modalities. • The selection process is analogous to voting (unavailable in the single modality) • Every modality votes for some pseudo-positives. • The better the label set fits a modality, the higher the vote is. • The set with the highest votes is selected as the pseudo label set. • A principled approach Maximum Likelihood Estimation (MLE).
  • 15. MMPRF • Our objective is to find a pseudo label set that maximizes the likelihood of all modalities. arg max y mX i=1 X dj2Ð yjµT i wij s.t. AT y · g; y 2 f0; 1gjÐj
  • 16. MMPRF • Our objective is to find a pseudo label set that maximizes the likelihood of all modalities. • Objective function: The sum of logarithmic likelihood across all modalities. arg max y mX i=1 X dj2Ð yjµT i wij s.t. AT y · g; y 2 f0; 1gjÐj
  • 17. MMPRF • Our objective is to find a pseudo label set that maximizes the likelihood of all modalities. • Objective function: The sum of logarithmic likelihood across all modalities. • The constraint controls the maximum number of pseudo-positives to be selected in each modality. arg max y mX i=1 X dj2Ð yjµT i wij s.t. AT y · g; y 2 f0; 1gjÐj
  • 18. MMPRF • Our objective is to find a pseudo label set that maximizes the likelihood of all modalities. • Objective function: The sum of logarithmic likelihood across all modalities. • The constraint controls the maximum number of pseudo-positives to be selected in each modality. • The objective function is linear to the y variable  Integer Programming. arg max y mX i=1 X dj2Ð yjµT i wij s.t. AT y · g; y 2 f0; 1gjÐj
  • 19. MMPRF • Our objective is to find a pseudo label set that maximizes the likelihood of all modalities. • Objective function: The sum of logarithmic likelihood across all modalities. • The constraint controls the maximum number of pseudo-positives to be selected in each modality. • The objective function is linear to the y variable  Integer Programming. • Relaxed to linear programming if . Efficiently solvable. Time complexity (by default ). Cost less than 10 seconds on a single core on MEDTest. arg max y mX i=1 X dj2Ð yjµT i wij s.t. AT y · g; y 2 f0; 1gjÐj 0 · y · 1 jÐj3 jÐj = 50
  • 20. Pseudo Label Set Construction With Late Fusion • Can we use late fusion to construct the pseudo label set? That is first average the scores of all ranked lists and then select the top-ranked videos as pseudo-positives. • Yes. If we change the objective function. • Expectation versus Likelihood. • Late fusion finds the optimal solution to the problem. • Theoretical justification for the late fusion. • Unfortunately however, optimizing expectation is 50% worse than optimizing likelihood on MEDTest. argmax y E[yjÐ;£i] = X dj2Ð yjP(yjjdj;£i) s.t. JT y · k+ 1; y 2 f0; 1gjÐj
  • 21. Modality Weighting • At most how many pseudo-positive to select in each modality? • Estimate using modality accuracy: – Query likelihood: a modality whose top-ranked videos contain more query words is supposed to be more important. – Find indicative words in the event kit description. For example, the occurrence of words “narration/narrating” and “process” in the event kit description indicates an “accurate ASR event”. AT y · g
  • 22. Outline  MED EK0 Overview  Related Work  Pseudo Relevant Set Construction  Experiment Results  Conclusions
  • 23. Results on MEDTest • MMPRF1: w/o modality weighting. MMPRF2: w/ modality weighting. • Improve the baseline Without PRF by a relative 158% (absolute 6.2%) on Pre-Specified events and by a relative 107% (absolute 4.3%) on Ad- Hoc events. • Statistically significantly better than other baseline methods.
  • 24. Official Results on PROGAll EK0 Results FullSys ASRSys AudioSys OCRSys VisualSys CMU Pre-Specified 3.7 1.8 0.3 2.1 2.4 CMU Ad-Hoc 10.1 3.1 0.2 2.8 5.2 • MMPRF is only used in our FullSys. • Relative improvement of the FullSys over the best sub-system. • by a relative 54% on Pre-Specified events. • by a relative 94% on Ad-Hoc events.
  • 25. Results on PROGAll • Relative Improvement of FullSys over the best SubSys. Pre-Specified Ad-Hoc -1.0 -0.8 -0.6 -0.4 -0.2 0.0 0.2 0.4 0.6 0.8 1.0 1.2 Relativeimprovment -0.3 -0.2 -0.1 0 0.1 0.2 0.3 0.4 0.5 0.6 Relativeimprovment
  • 26. Outline  MED EK0 Overview  Related Work  Pseudo Relevant Set Construction  Experiment Results  Conclusions
  • 27. Conclusions  A few things to take away from this talk:  MultiModal Pseudo Relevance Feedback (MMPRF) is a first attempt to use both high-level and low-level features in MED EK0.  MMPRF offers a solution to conduct PRF on multiple ranked lists. Empirically it significantly outperforms all baseline methods on MEDTest.  Modality weighting is beneficial.
  • 28. References • Joachims, Thorsten. A Probabilistic Analysis of the Rocchio Algorithm with TFIDF for Text Categorization. No. CMU-CS-96-118. CARNEGIE-MELLON UNIV PITTSBURGH PA DEPT OF COMPUTER SCIENCE, 1996. • Yan, Rong, Alexander G. Hauptmann, and Rong Jin. "Negative pseudo-relevance feedback in content-based video retrieval." Proceedings of the eleventh ACM international conference on Multimedia. ACM, 2003. • Lavrenko, Victor, and W. Bruce Croft. "Relevance based language models."Proceedings of the 24th annual international ACM SIGIR conference on Research and development in information retrieval. ACM, 2001. • Liu, Yuan, et al. "Learning to video search rerank via pseudo preference feedback." Multimedia and Expo, 2008 IEEE International Conference on. IEEE, 2008. • Bao, Lei, et al. "Informedia@ trecvid 2011." TRECVID2011, NIST (2011). • Jiang, Lu, Alexander G. Hauptmann, and Guang Xiang. "Leveraging high-level and low-level features for multimedia event detection." Proceedings of the 20th ACM international conference on Multimedia. ACM, 2012. • Tong, Wei, et al. "E-LAMP: integration of innovative ideas for multimedia event detection." Machine Vision and Applications (2013): 1-11. • Hauptmann, Alexander G., Michael G. Christel, and Rong Yan. "Video retrieval based on semantic concepts." Proceedings of the IEEE 96.4 (2008): 602-622.
  • 29.
  • 30. Name Birthday party Definition An individual celebrates a birthday with other people etc. Explication Evidences Audio singing "Happy Birthday to You"; laughing, etc. Activities Singing, eating, opening gifts Object/ people Decorations, birthday cake, candles, gifts, etc. Scene Indoors or outdoors, etc. A birthday in this context is the anniversary of a person's birth etc. Happy Birthday, congratulations Birthday, birthday party Cake, candle, boy, girl, food Happy Birthday Song, laughing Speech (ASR) Optical Character (OCR) Visual Objects Audio Concepts Query Event Kit Description Pseudo Label Set + + + + + + + - - - - --- - Fusion Model (1) Query Generation (2) Retrieval … … … … + + + + + + + - - - - --- - … Modality Rank list (3) MMPRF Low-level features High-level features Feedback Rank list Event Query Generator Event Search