Information Technologies Institute
Centre for Research and Technology Hellas
ITI-CERTH in TRECVID 2016
Ad-hoc Video Search (AVS)
Foteini Markatopoulou, Damianos Galanopoulos, Ioannis Patras,
Vasileios Mezaris
Information Technologies Institute / Centre for Research and Technology Hellas
TRECVID 2016 Workshop, Gaithersburg, MD, USA, November 2016
Highlights
• The objective of the AVS task is to retrieve a list of the 1000 test shots
most related to a given text query
• Our approach: a fully-automatic system
• The system consists of three components
– Video shot processing
– Query processing
– Video shot retrieval
• Both fully-automatic and manually-assisted (with users just
specifying additional cues) runs were submitted
System Overview
Video shot processing
ImageNet 1000
• Five pre-trained DCNNs for 1000 concepts
– AlexNet
– GoogLeNet
– ResNet
– VGG Net
– GoogLeNet trained on 5055 ImageNet concepts (only the subset of 1000 concepts
out of the 5055 was used)
• Late fusion (averaging) on the direct output of the networks
to obtain a single score per concept
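The averaging step can be sketched as follows. This is a minimal illustration, not the authors' code: the function name `late_fusion`, the dictionary layout, and the score values are assumptions made for the example.

```python
from statistics import fmean


def late_fusion(scores_per_network):
    """Average per-concept scores across networks.

    scores_per_network: list of dicts mapping concept name -> score,
    one dict per network, all sharing the same concept names.
    """
    concepts = scores_per_network[0].keys()
    return {c: fmean(net[c] for net in scores_per_network) for c in concepts}


# Hypothetical output scores for one keyframe from three of the networks.
alexnet = {"sewing machine": 0.60, "police car": 0.10}
googlenet = {"sewing machine": 0.80, "police car": 0.20}
resnet = {"sewing machine": 0.70, "police car": 0.30}

# One fused score per concept.
fused = late_fusion([alexnet, googlenet, resnet])
```

Averaging needs no extra training, which is presumably why it suits a setting where the networks are used as they are.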
Video shot processing
TRECVID SIN 345
• Three pre-trained ImageNet networks were fine-tuned (FT) for these concepts,
using three FT strategies with different parameter instantiations from [1];
51 FT networks in total
– AlexNet (1000 ImageNet concepts)
– GoogLeNet (1000 ImageNet concepts)
– GoogLeNet originally trained on 5055 ImageNet concepts
• The best performing FT network (as evaluated on the TRECVID SIN 2013
test dataset) is selected
• Two approaches for using the selected network for shot annotation were examined
– Using the direct output of the FT network
– Linear SVM training with DCNN-based features
[1] N. Pittaras, F. Markatopoulou, V. Mezaris, I. Patras, "Comparison of Fine-tuning and Extension Strategies for Deep Convolutional Neural
Networks", at the 23rd Int. Conf. on MultiMedia Modeling (MMM'17), Reykjavik, Iceland, 4 January 2017. (accepted for publication)
Query processing
• Each query is represented as a vector of related concepts
– We select concepts which are most closely related to the query
– These concepts form the query’s concept vector
– Each element of this vector indicates the degree to which the corresponding
concept is related to the query
• A five-step procedure is used
– Each step selects concepts, from the concept pool, related to the query
Query processing: Step 1
Motivation: Some concepts are semantically close to the
input query and can describe it extremely well
Approach:
– Compare every concept in our pool with the entire input query,
using the Explicit Semantic Analysis (ESA) measure
– If the score between the query and a concept is higher than a
threshold (0.8) then the concept is selected
– If at least one concept is selected in this way, we assume that
the query is very well described and the query processing stops;
otherwise the query processing continues in step 2
Example: the query Find shots of a sewing machine
and the concept sewing machine are semantically
extremely close
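A minimal sketch of this step, assuming a relatedness function is available. The real system uses the ESA measure; here `toy_relatedness` and the values in `TOY_SCORES` are hypothetical stand-ins used only to make the example runnable.

```python
ESA_THRESHOLD = 0.8  # threshold value given on the slide


def step1_select(query, concept_pool, relatedness):
    """Return {concept: score} for concepts scoring above the threshold
    against the entire query."""
    selected = {}
    for concept in concept_pool:
        score = relatedness(query, concept)
        if score > ESA_THRESHOLD:
            selected[concept] = score
    return selected


# Hypothetical scores standing in for a real ESA implementation.
TOY_SCORES = {"sewing machine": 0.95, "tobacco shop": 0.30, "police car": 0.10}


def toy_relatedness(query, concept):
    # The toy lookup ignores the query text; a real measure would not.
    return TOY_SCORES.get(concept, 0.0)


query = "Find shots of a sewing machine"
selected = step1_select(query, TOY_SCORES, toy_relatedness)
stop_here = bool(selected)  # if True, query processing stops at step 1
```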
Query processing: Step 1
The processing stopped in step 1 for 3 out of the 30
queries:
• For Find shots of a sewing machine the concept
sewing machine was selected
• For Find shots of a policeman where a police car is
visible the concept police car was selected
• For Find shots of people shopping the concept
tobacco shop was selected
Query processing: Step 2
Motivation: Some (complex) concepts may describe
the query quite well, but appear in it in a form that
becomes hard to detect once subsequent linguistic
analysis breaks the query down into sub-queries
Approach:
– We check, by string matching, whether any of the concepts
appear in any part of the query
– Any concepts that appear in the query are selected and the
query processing continues in step 3
Example: For the query Find shots of a man with
beard and wearing white robe speaking and
gesturing to camera the concept speaking to camera
was found
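The slides do not state the exact matching rule. The sketch below assumes that a concept matches when every one of its words occurs somewhere in the query, which is consistent with the example above (speaking to camera matches "speaking and gesturing to camera") but may differ from the rule actually used. The function names are hypothetical.

```python
import re


def words(text):
    """Lower-case word tokens of a string."""
    return re.findall(r"[a-z0-9]+", text.lower())


def step2_string_match(query, concept_pool):
    """Return the concepts whose words all appear in the query.

    Assumed rule: word-level containment, not contiguous substring matching.
    """
    query_words = set(words(query))
    return [c for c in concept_pool if set(words(c)) <= query_words]


query = ("Find shots of a man with beard and wearing white robe "
         "speaking and gesturing to camera")
pool = ["speaking to camera", "door opening", "sitting down"]
matched = step2_string_match(query, pool)
```

A plain substring test would miss this example, because the words of the concept are not adjacent in the query.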
Query processing: Step 2
For 5 out of 30 queries concepts were selected
through string matching
• For Find shots of a man with beard and wearing white robe
speaking and gesturing to camera, the concept speaking to
camera was selected
• For Find shots of one or more people opening a door and exiting
through it, the concept door opening was selected
• For Find shots of the 43rd president George W. Bush sitting down
talking with people indoors, the concept sitting down was selected
• For Find shots of military personnel interacting with protesters, the
concept military personnel was selected
• For Find shots of a person sitting down with a laptop visible, the
concept sitting down was selected
Query processing: Step 3
Motivation: Queries are complex sentences; we
decompose them so that their parts can be better
understood and processed
Approach:
– We define a sub-query as a meaningful smaller phrase or term
that is included in the original query, and we automatically
decompose the query into sub-queries
• NLP procedures (e.g. PoS tagging, stop-word removal) and task-specific NLP
rules are used
• For example, the triad Noun-Verb-Noun forms a sub-query
– The ESA score is evaluated for every (sub-query, concept)
pair
– If the score is higher than our step-1 threshold (0.8), then the
concept is selected
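The per-sub-query selection can be sketched as below, taking the sub-queries as already extracted (the PoS-based decomposition itself is not reproduced here). `toy_relatedness` and the scores in `TOY_SCORES` are hypothetical placeholders for the ESA measure.

```python
ESA_THRESHOLD = 0.8  # same threshold as in step 1


def step3_select(sub_queries, concept_pool, relatedness):
    """Map each sub-query to the concepts scoring above the threshold."""
    matches = {}
    for sub_query in sub_queries:
        matches[sub_query] = {}
        for concept in concept_pool:
            score = relatedness(sub_query, concept)
            if score > ESA_THRESHOLD:
                matches[sub_query][concept] = score
    return matches


# Hypothetical (sub-query, concept) scores standing in for ESA.
TOY_SCORES = {
    ("bicycling", "bicycles"): 0.85,
    ("bridge", "suspension bridge"): 1.0,
    ("daytime", "daytime outdoor"): 0.74,
}


def toy_relatedness(sub_query, concept):
    return TOY_SCORES.get((sub_query, concept), 0.0)


pool = ["bicycles", "suspension bridge", "daytime outdoor"]
matches = step3_select(["bicycling", "bridge", "daytime"], pool, toy_relatedness)
```

Sub-queries that end up with an empty dictionary are the ones no concept was selected for.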
Query processing: Step 3
Example: the query Find shots of a diver wearing
diving suit and swimming under water is split into
the following four sub-queries: diver wearing diving
suit, swimming, water
• If at least one concept is selected for every sub-query,
we consider the query completely analyzed and
proceed to the video shot retrieval component
• If no concepts have been selected for a subset of the
sub-queries, we continue to step 4
• If no concepts have been selected for any of the
sub-queries, we continue to step 5
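The three cases above amount to a small routing decision. A sketch, assuming the step-3 output is a dictionary from sub-query to its selected concepts (the function name and return labels are illustrative):

```python
def next_stage(matches):
    """Decide where processing continues after step 3.

    matches: dict mapping each sub-query to a collection of selected
    concepts (empty if none scored above the threshold).
    """
    matched = [sq for sq, concepts in matches.items() if concepts]
    if not matched:
        return "step 5"      # no sub-query has a concept
    if len(matched) == len(matches):
        return "retrieval"   # every sub-query has at least one concept
    return "step 4"          # only a subset of the sub-queries is covered


# Hypothetical step-3 outputs.
all_covered = {"bridge": {"suspension bridge": 1.0}}
partly_covered = {"bridge": {"suspension bridge": 1.0}, "daytime": {}}
none_covered = {"person": {}, "guitar": {}}
```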
Query processing: Step 3
• On average, a query was broken down into 3.7 sub-queries
• For none of the test queries was every sub-query matched
to at least one concept from our pool
• For 17 of the 27 queries that reached step 3, concepts were
matched to a subset of the sub-queries, so the processing
continued to step 4
• For the remaining 10 queries, no concept was
matched to any of their sub-queries, so the
processing continued to step 5
Query processing: Step 4
Motivation: For a subset of the sub-queries no
concepts were selected due to their small semantic
relatedness (i.e., in terms of ESA measure their
relatedness is lower than the 0.8 threshold)
Approach:
– For these sub-queries, the concept with the highest ESA
score is selected, and then we proceed to video shot
retrieval
Example:
Query: Find shots of one or more people walking or bicycling on a bridge during daytime
Steps 2, 3
– Sub-queries: people walking, bicycling, bridge
– Selected concepts (ESA score): walking (1.0), bicycle-built-for-two (1.0),
suspension bridge (1.0), bicycles (0.85), bridges (0.84), bicycling (0.84)
Step 4
– Sub-query: daytime
– Selected concept (ESA score): daytime outdoor (0.74)
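The fallback for uncovered sub-queries is an arg-max over the concept pool. A minimal sketch; `toy_relatedness` is a hypothetical stand-in for the ESA measure, and only the daytime / daytime outdoor score (0.74) comes from the example above, the others are invented for illustration.

```python
def step4_fallback(unmatched_sub_queries, concept_pool, relatedness):
    """For each uncovered sub-query, pick the single best-scoring concept,
    even though its score is below the step-3 threshold."""
    selected = {}
    for sub_query in unmatched_sub_queries:
        best = max(concept_pool, key=lambda c: relatedness(sub_query, c))
        selected[sub_query] = (best, relatedness(sub_query, best))
    return selected


# 0.74 is taken from the slide; the other values are made up.
TOY_SCORES = {
    ("daytime", "daytime outdoor"): 0.74,
    ("daytime", "suspension bridge"): 0.05,
    ("daytime", "walking"): 0.12,
}


def toy_relatedness(sub_query, concept):
    return TOY_SCORES.get((sub_query, concept), 0.0)


pool = ["walking", "suspension bridge", "daytime outdoor"]
fallback = step4_fallback(["daytime"], pool, toy_relatedness)
```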

Query processing: Step 5
Motivation: For some queries none of the above
steps is able to select concepts
Approach:
– Our MED16 000Ex framework is used
– The query title and its sub-queries form an Event Language
Model
– A Concept Language Model is formed for every concept using
retrieved articles from Wikipedia
– A ranked list of the most relevant concepts and the
corresponding scores (semantic correlation between each
query-concept pair) is returned
– We proceed to the video shot retrieval component
Query processing: Step 5
Example: For the query Find shots of a person
playing guitar outdoors the framework returns the
following concepts: outdoor, acoustic guitar, electric
guitar and daytime outdoor
Video shot retrieval
• The query’s concept vector is formed from the scores of the
selected concepts
• If a concept has been selected in steps 1, 3, 4 or 5, the
corresponding element of the vector is set to its
relatedness score (calculated using the ESA measure); if it
has been selected in step 2, the element is set to 1
• Histogram intersection is used to calculate the distance between
the query’s concept vector and the concept vector of
each test keyframe
• The 1000 keyframes with the smallest distance from the query’s
concept vector are retrieved
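A sketch of the retrieval step under two stated assumptions that the slides do not confirm: unselected concepts get a vector element of 0, and the histogram-intersection distance is taken as 1 minus the intersection normalised by the sum of the query vector. All names and numbers below are illustrative.

```python
import heapq


def build_query_vector(concept_pool, selected):
    """selected: dict concept -> (step, score).

    Step-2 concepts are set to 1; other steps keep their relatedness
    score. Unselected concepts are set to 0 (an assumption).
    """
    vector = []
    for concept in concept_pool:
        if concept not in selected:
            vector.append(0.0)
            continue
        step, score = selected[concept]
        vector.append(1.0 if step == 2 else score)
    return vector


def histogram_intersection_distance(query_vec, keyframe_vec):
    """1 - normalised histogram intersection (assumed normalisation)."""
    overlap = sum(min(q, k) for q, k in zip(query_vec, keyframe_vec))
    total = sum(query_vec)
    return 1.0 - overlap / total if total else 1.0


def retrieve(query_vec, keyframes, top_n=1000):
    """keyframes: dict keyframe id -> concept-score vector.
    Returns the ids of the top_n keyframes, closest first."""
    return heapq.nsmallest(
        top_n,
        keyframes,
        key=lambda kid: histogram_intersection_distance(query_vec, keyframes[kid]),
    )


pool = ["walking", "daytime outdoor", "speaking to camera", "dog"]
selected = {
    "walking": (3, 1.0),
    "daytime outdoor": (4, 0.74),
    "speaking to camera": (2, None),  # step 2 carries no score
}
query_vec = build_query_vector(pool, selected)

# Hypothetical keyframe concept scores over the same pool.
keyframes = {
    "a": [0.9, 0.7, 0.8, 0.1],
    "b": [0.1, 0.1, 0.0, 0.9],
    "c": [0.5, 0.74, 0.2, 0.0],
}
ranked = retrieve(query_vec, keyframes, top_n=2)
```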
Submitted Runs
• We submitted both fully-automatic and manually-assisted
runs
• For the manually-assisted ones
– We used the same fully-automatic system, but
– A member of our team who was not involved in the development of our AVS
system examined each query and manually suggested sub-queries for it,
without knowledge of the automatically-generated ones
– The manually defined sub-queries were added to the automatically-generated
ones, and our automatic AVS system was applied
Submitted Runs
ITI-CERTH 1:
– Late fusion of the direct output from 5 DCNNs for ImageNet 1000 concepts
– SVM-based concept detectors for 345 TRECVID SIN concepts
ITI-CERTH 2:
– Late fusion of the direct output from 5 DCNNs for ImageNet 1000 concepts
– The direct output of the FT network for 345 TRECVID SIN concepts
ITI-CERTH 3: ITI-CERTH 1 run without step 4
ITI-CERTH 4: ITI-CERTH 1 run without step 2
Submitted run                 ITI-CERTH 1   ITI-CERTH 2   ITI-CERTH 3   ITI-CERTH 4
MXinfAP (fully-automatic)     0.051         0.042         0.051         0.051
MXinfAP (manually-assisted)   0.043         0.037         0.037         0.043
Results (fully-automatic runs)
Results and conclusions
• Training SVMs on DCNN-based features instead of using the direct
output of the DCNNs, for the 345 TRECVID SIN concepts, improves the
accuracy (i.e., run ITI-CERTH 1 outperforms ITI-CERTH 2)
• In the AVS 2016 dataset
– Step 4 could be omitted for the fully-automatic runs
• Sub-queries without high semantic relatedness can be ignored; ITI-CERTH 1 &
ITI-CERTH 3 achieve the same results
– Step 2 could be omitted
• String matching between the test query and concepts does not improve the
accuracy; semantic relatedness makes the difference
• Fully-automatic runs outperformed the manually-assisted ones
• Our best fully-automatic run was ranked 2nd-best in the fully-automatic
run category; it also outperformed the runs of all but one participant in
the manually-assisted run category
Questions?
More information and contact:
Vasileios Mezaris, http://www.iti.gr/~bmezaris, bmezaris@iti.gr
TRECVID 2016 paper:
F. Markatopoulou, A. Moumtzidou, D. Galanopoulos, T. Mironidis, V. Kaltsa, A. Ioannidou, S. Symeonidis,
K. Avgerinakis, S. Andreadis, I. Gialampoukidis, S. Vrochidis, A. Briassouli, V. Mezaris, I. Kompatsiaris, I. Patras,
"ITI-CERTH participation in TRECVID 2016", Proc. TRECVID 2016 Workshop, Gaithersburg, MD, USA, November 2016.
This work has received funding from the European Union’s Horizon 2020 research and innovation
programme under grant agreements H2020-693092 MOVING and H2020-687786 InVID.

More Related Content

What's hot

Invited Talk: Early Detection of Research Topics
Invited Talk: Early Detection of Research Topics Invited Talk: Early Detection of Research Topics
Invited Talk: Early Detection of Research Topics
Angelo Salatino
 
resume-tina-tingchu-lin
resume-tina-tingchu-linresume-tina-tingchu-lin
resume-tina-tingchu-lin
Ting-Chu Lin
 
Methodological issues in research on MOOCs - ECEL2014
Methodological issues in research on MOOCs - ECEL2014Methodological issues in research on MOOCs - ECEL2014
Methodological issues in research on MOOCs - ECEL2014
Juliana Elisa Raffaghelli
 

What's hot (15)

Data Wrangling Week 7
Data Wrangling Week 7Data Wrangling Week 7
Data Wrangling Week 7
 
Week 11: Programming for Data Analysis
Week 11: Programming for Data AnalysisWeek 11: Programming for Data Analysis
Week 11: Programming for Data Analysis
 
Hadoop in Alibaba Cloud
Hadoop in Alibaba CloudHadoop in Alibaba Cloud
Hadoop in Alibaba Cloud
 
bonino
boninobonino
bonino
 
ICSE12 SEE.ppt
ICSE12 SEE.pptICSE12 SEE.ppt
ICSE12 SEE.ppt
 
[DOLAP2019] Augmented Business Intelligence
[DOLAP2019] Augmented Business Intelligence[DOLAP2019] Augmented Business Intelligence
[DOLAP2019] Augmented Business Intelligence
 
[ADBIS 2021] - Optimizing Execution Plans in a Multistore
[ADBIS 2021] - Optimizing Execution Plans in a Multistore[ADBIS 2021] - Optimizing Execution Plans in a Multistore
[ADBIS 2021] - Optimizing Execution Plans in a Multistore
 
Invited Talk: Early Detection of Research Topics
Invited Talk: Early Detection of Research Topics Invited Talk: Early Detection of Research Topics
Invited Talk: Early Detection of Research Topics
 
resume-tina-tingchu-lin
resume-tina-tingchu-linresume-tina-tingchu-lin
resume-tina-tingchu-lin
 
Learning End to End
Learning End to EndLearning End to End
Learning End to End
 
Methodological issues in research on MOOCs - ECEL2014
Methodological issues in research on MOOCs - ECEL2014Methodological issues in research on MOOCs - ECEL2014
Methodological issues in research on MOOCs - ECEL2014
 
South Tyrol Suggests - STS
South Tyrol Suggests - STSSouth Tyrol Suggests - STS
South Tyrol Suggests - STS
 
Week 7: Object Storage Service Alibaba Cloud- DSA 441 Cloud Computing
Week 7: Object Storage Service Alibaba Cloud- DSA 441 Cloud ComputingWeek 7: Object Storage Service Alibaba Cloud- DSA 441 Cloud Computing
Week 7: Object Storage Service Alibaba Cloud- DSA 441 Cloud Computing
 
Week 11: Cloud Native- DSA 441 Cloud Computing
Week 11: Cloud Native- DSA 441 Cloud ComputingWeek 11: Cloud Native- DSA 441 Cloud Computing
Week 11: Cloud Native- DSA 441 Cloud Computing
 
Parsimonious and Adaptive Contextual Information Acquisition in Recommender S...
Parsimonious and Adaptive Contextual Information Acquisition in Recommender S...Parsimonious and Adaptive Contextual Information Acquisition in Recommender S...
Parsimonious and Adaptive Contextual Information Acquisition in Recommender S...
 

Similar to TRECVID 2016 Ad-hoc Video Search task, CERTH-ITI

An Application-Oriented Approach for Computer Security Education
An Application-Oriented Approach for Computer Security EducationAn Application-Oriented Approach for Computer Security Education
An Application-Oriented Approach for Computer Security Education
Xiao Qin
 

Similar to TRECVID 2016 Ad-hoc Video Search task, CERTH-ITI (20)

An intelligent framework using hybrid social media and market data, for stock...
An intelligent framework using hybrid social media and market data, for stock...An intelligent framework using hybrid social media and market data, for stock...
An intelligent framework using hybrid social media and market data, for stock...
 
Introduction to knowledge discovery
Introduction to knowledge discoveryIntroduction to knowledge discovery
Introduction to knowledge discovery
 
Thesis Presentation V4
Thesis Presentation V4Thesis Presentation V4
Thesis Presentation V4
 
Applications of Machine Learning and Metaheuristic Search to Security Testing
Applications of Machine Learning and Metaheuristic Search to Security TestingApplications of Machine Learning and Metaheuristic Search to Security Testing
Applications of Machine Learning and Metaheuristic Search to Security Testing
 
Machine Learning for Data Extraction
Machine Learning for Data ExtractionMachine Learning for Data Extraction
Machine Learning for Data Extraction
 
Artificial Intelligence Notes Unit 5
Artificial Intelligence Notes Unit 5Artificial Intelligence Notes Unit 5
Artificial Intelligence Notes Unit 5
 
An Application-Oriented Approach for Computer Security Education
An Application-Oriented Approach for Computer Security EducationAn Application-Oriented Approach for Computer Security Education
An Application-Oriented Approach for Computer Security Education
 
2019 DSA 105 Introduction to Data Science Week 3
2019 DSA 105 Introduction to Data Science Week 32019 DSA 105 Introduction to Data Science Week 3
2019 DSA 105 Introduction to Data Science Week 3
 
Video Thumbnail Selector
Video Thumbnail SelectorVideo Thumbnail Selector
Video Thumbnail Selector
 
Multimodal Learning Analytics
Multimodal Learning AnalyticsMultimodal Learning Analytics
Multimodal Learning Analytics
 
How to do science in a large IT company (ICPC World Finals 2021, Moscow)
How to do science in a large IT company (ICPC World Finals 2021, Moscow)How to do science in a large IT company (ICPC World Finals 2021, Moscow)
How to do science in a large IT company (ICPC World Finals 2021, Moscow)
 
Recent Advances in Machine Learning: Bringing a New Level of Intelligence to ...
Recent Advances in Machine Learning: Bringing a New Level of Intelligence to ...Recent Advances in Machine Learning: Bringing a New Level of Intelligence to ...
Recent Advances in Machine Learning: Bringing a New Level of Intelligence to ...
 
Multimodal Learning Analytics
Multimodal Learning AnalyticsMultimodal Learning Analytics
Multimodal Learning Analytics
 
Security Operation Centre Specialist Course Content
Security Operation Centre Specialist Course ContentSecurity Operation Centre Specialist Course Content
Security Operation Centre Specialist Course Content
 
Security operations center_Specialist_training_course_content
Security operations center_Specialist_training_course_contentSecurity operations center_Specialist_training_course_content
Security operations center_Specialist_training_course_content
 
Sad 201 project sparc vision online library-assignment 2
Sad 201  project sparc vision  online library-assignment 2Sad 201  project sparc vision  online library-assignment 2
Sad 201 project sparc vision online library-assignment 2
 
mining sirdar , overman, assistant managerppt.ppt
mining sirdar , overman, assistant managerppt.pptmining sirdar , overman, assistant managerppt.ppt
mining sirdar , overman, assistant managerppt.ppt
 
Machine Learning & Artificial Intelligence - Machine Controlled Data Dispensa...
Machine Learning & Artificial Intelligence - Machine Controlled Data Dispensa...Machine Learning & Artificial Intelligence - Machine Controlled Data Dispensa...
Machine Learning & Artificial Intelligence - Machine Controlled Data Dispensa...
 
Recommender Systems @ Scale, Big Data Europe Conference 2019
Recommender Systems @ Scale, Big Data Europe Conference 2019Recommender Systems @ Scale, Big Data Europe Conference 2019
Recommender Systems @ Scale, Big Data Europe Conference 2019
 
Expert systems
Expert systemsExpert systems
Expert systems
 

More from MOVING Project

More from MOVING Project (20)

Opening up education through digitization. Remarks on recent developments in ...
Opening up education through digitization. Remarks on recent developments in ...Opening up education through digitization. Remarks on recent developments in ...
Opening up education through digitization. Remarks on recent developments in ...
 
MOVING: Applying digital science methodology for TVET
MOVING: Applying digital science methodology for TVETMOVING: Applying digital science methodology for TVET
MOVING: Applying digital science methodology for TVET
 
Learning analytics for reflective learning
Learning analytics for reflective learningLearning analytics for reflective learning
Learning analytics for reflective learning
 
Challenges in Developing Automatic Learning Guidance in Relation to an Inform...
Challenges in Developing Automatic Learning Guidance in Relation to an Inform...Challenges in Developing Automatic Learning Guidance in Relation to an Inform...
Challenges in Developing Automatic Learning Guidance in Relation to an Inform...
 
Unesco mobileweek 2019_frontier_tech_oer-final
Unesco mobileweek 2019_frontier_tech_oer-finalUnesco mobileweek 2019_frontier_tech_oer-final
Unesco mobileweek 2019_frontier_tech_oer-final
 
Inferring knowledge acquisition through Web navigation behaviour
Inferring knowledge acquisition through Web navigation behaviourInferring knowledge acquisition through Web navigation behaviour
Inferring knowledge acquisition through Web navigation behaviour
 
ITI-CERTH participation in TRECVID 2018
ITI-CERTH participation in TRECVID 2018ITI-CERTH participation in TRECVID 2018
ITI-CERTH participation in TRECVID 2018
 
Wissenschaft 2.0 und offene Forschungsmethoden vermitteln– Der MOOC "Science ...
Wissenschaft 2.0 und offene Forschungsmethoden vermitteln– Der MOOC "Science ...Wissenschaft 2.0 und offene Forschungsmethoden vermitteln– Der MOOC "Science ...
Wissenschaft 2.0 und offene Forschungsmethoden vermitteln– Der MOOC "Science ...
 
Wissenschaft 2.0 und offene Forschungsmethoden vermitteln: Der MOOC Science 2...
Wissenschaft 2.0 und offene Forschungsmethoden vermitteln: Der MOOC Science 2...Wissenschaft 2.0 und offene Forschungsmethoden vermitteln: Der MOOC Science 2...
Wissenschaft 2.0 und offene Forschungsmethoden vermitteln: Der MOOC Science 2...
 
VERGE: A Multimodal Interactive Search Engine for Video Browsing and Retrieval
VERGE: A Multimodal Interactive Search Engine for Video Browsing and RetrievalVERGE: A Multimodal Interactive Search Engine for Video Browsing and Retrieval
VERGE: A Multimodal Interactive Search Engine for Video Browsing and Retrieval
 
Temporal Lecture Video Fragmentation using Word Embeddings
Temporal Lecture Video Fragmentation using Word EmbeddingsTemporal Lecture Video Fragmentation using Word Embeddings
Temporal Lecture Video Fragmentation using Word Embeddings
 
The Impact of Blocking and Name-Matching on Author Disambiguation.
The Impact of Blocking and Name-Matching on Author Disambiguation.The Impact of Blocking and Name-Matching on Author Disambiguation.
The Impact of Blocking and Name-Matching on Author Disambiguation.
 
Effective Unsupervised Author Disambiguation with Relative Frequencies
Effective Unsupervised Author Disambiguation with Relative FrequenciesEffective Unsupervised Author Disambiguation with Relative Frequencies
Effective Unsupervised Author Disambiguation with Relative Frequencies
 
What to read next? Challenges and Preliminary Results in Selecting Represen...
What to read next? Challenges and  Preliminary Results in Selecting  Represen...What to read next? Challenges and  Preliminary Results in Selecting  Represen...
What to read next? Challenges and Preliminary Results in Selecting Represen...
 
Qualitative Analysis of Vocabulary Evolution on the Linked Open Data Cloud
Qualitative Analysis of Vocabulary Evolution on the Linked Open Data CloudQualitative Analysis of Vocabulary Evolution on the Linked Open Data Cloud
Qualitative Analysis of Vocabulary Evolution on the Linked Open Data Cloud
 
Analyzing the Evolution of Vocabulary Terms and Their Impact on the LOD Cloud
Analyzing the Evolution of Vocabulary Terms and Their Impact on the LOD CloudAnalyzing the Evolution of Vocabulary Terms and Their Impact on the LOD Cloud
Analyzing the Evolution of Vocabulary Terms and Their Impact on the LOD Cloud
 
Keeping Linked Open Data Caches Up-to-date by Predicting the Life-time of RDF...
Keeping Linked Open Data Caches Up-to-date by Predicting the Life-time of RDF...Keeping Linked Open Data Caches Up-to-date by Predicting the Life-time of RDF...
Keeping Linked Open Data Caches Up-to-date by Predicting the Life-time of RDF...
 
Deep Multi-task Learning with Label Correlation Constraint for Video Concept ...
Deep Multi-task Learning with Label Correlation Constraint for Video Concept ...Deep Multi-task Learning with Label Correlation Constraint for Video Concept ...
Deep Multi-task Learning with Label Correlation Constraint for Video Concept ...
 
Generic to Specific Recognition Models for Membership Analysis in Group Videos
Generic to Specific Recognition Models for Membership Analysis in Group VideosGeneric to Specific Recognition Models for Membership Analysis in Group Videos
Generic to Specific Recognition Models for Membership Analysis in Group Videos
 
MOVING the Industry 4.0
MOVING the Industry 4.0MOVING the Industry 4.0
MOVING the Industry 4.0
 

Recently uploaded

+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
?#DUbAI#??##{{(☎️+971_581248768%)**%*]'#abortion pills for sale in dubai@
 

Recently uploaded (20)

The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024
 
Strategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a FresherStrategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a Fresher
 
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
 
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
 
Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...
 
Automating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps ScriptAutomating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps Script
 
HTML Injection Attacks: Impact and Mitigation Strategies
HTML Injection Attacks: Impact and Mitigation StrategiesHTML Injection Attacks: Impact and Mitigation Strategies
HTML Injection Attacks: Impact and Mitigation Strategies
 
A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)
 
Developing An App To Navigate The Roads of Brazil
Developing An App To Navigate The Roads of BrazilDeveloping An App To Navigate The Roads of Brazil
Developing An App To Navigate The Roads of Brazil
 
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemkeProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
 
GenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationGenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day Presentation
 
Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...
 
Advantages of Hiring UIUX Design Service Providers for Your Business
Advantages of Hiring UIUX Design Service Providers for Your BusinessAdvantages of Hiring UIUX Design Service Providers for Your Business
Advantages of Hiring UIUX Design Service Providers for Your Business
 
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
 
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
 
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot TakeoffStrategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
 
Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024
 
Handwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsHandwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed texts
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected Worker
 
Real Time Object Detection Using Open CV
Real Time Object Detection Using Open CVReal Time Object Detection Using Open CV
Real Time Object Detection Using Open CV
 

TRECVID 2016 Ad-hoc Video Search task, CERTH-ITI

  • 1. 1Information Technologies Institute Centre for Research and Technology Hellas ITI-CERTH in TRECVID 2016 Ad-hoc Video Search (AVS) Foteini Markatopoulou, Damianos Galanopoulos, Ioannis Patras, Vasileios Mezaris Information Technologies Institute / Centre for Research and Technology Hellas TRECVID 2016 Workshop, Gaithersburg, MD, USA, November 2016
  • 2. 2Information Technologies Institute Centre for Research and Technology Hellas Highlights • AVS’s task objective is to retrieve a list of the 1000 most related test shots for a specific text query • Our approach: a fully-automatic system • The system consists of three components – Video shot processing – Query processing – Video shot retrieval • Both fully-automatic and manually-assisted (with users just specifying additional cues) runs were submitted
  • 3. 3Information Technologies Institute Centre for Research and Technology Hellas System Overview
  • 4. Video shot processing
  • 5. Video shot processing: ImageNet 1000
      • Five pre-trained DCNNs for 1000 concepts
        – AlexNet
        – GoogLeNet
        – ResNet
        – VGG Net
        – GoogLeNet trained on 5055 ImageNet concepts (we only considered the subset of 1000 concepts out of the 5055)
      • Late fusion (averaging) of the direct outputs of the networks, to obtain a single score per concept
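The late-fusion step on this slide can be illustrated with a minimal sketch. It assumes each network outputs one score per concept, in the same concept order; the function name `late_fusion` and the toy scores are hypothetical and do not come from the slides.

```python
from statistics import fmean


def late_fusion(scores_per_network):
    """Average the per-concept scores of several networks (late fusion).

    scores_per_network: list of equal-length score lists, one per network,
    all indexed by the same concept order (an assumption of this sketch).
    """
    if not scores_per_network:
        raise ValueError("at least one network output is required")
    n_concepts = len(scores_per_network[0])
    if any(len(scores) != n_concepts for scores in scores_per_network):
        raise ValueError("all networks must score the same concepts")
    # zip(*...) groups the scores of each concept across the networks.
    return [fmean(per_concept) for per_concept in zip(*scores_per_network)]


# Toy example: three networks, two concepts (made-up scores).
fused = late_fusion([[0.2, 0.9], [0.4, 0.7], [0.6, 0.8]])
```

Averaging keeps the fused score on the same scale as the individual network outputs, so no extra normalisation is needed in this sketch.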
  • 6. Video shot processing: TRECVID SIN 345
      • Three pre-trained ImageNet networks, fine-tuned (FT) for these concepts; three FT strategies with different parameter instantiations from [1] give 51 FT networks in total
        – AlexNet (1000 ImageNet concepts)
        – GoogLeNet (1000 ImageNet concepts)
        – GoogLeNet originally trained on 5055 ImageNet concepts
      • The best performing FT network (as evaluated on the TRECVID SIN 2013 test dataset) is selected
      • Two approaches for using this network for shot annotation were examined
        – Using the direct output of the FT network
        – Training linear SVMs on DCNN-based features
      [1] N. Pittaras, F. Markatopoulou, V. Mezaris, I. Patras, "Comparison of Fine-tuning and Extension Strategies for Deep Convolutional Neural Networks", at the 23rd Int. Conf. on MultiMedia Modeling (MMM'17), Reykjavik, Iceland, 4 January 2017. (accepted for publication)
  • 7. Query processing
      • Each query is represented as a vector of related concepts
        – We select the concepts that are most closely related to the query
        – These concepts form the query's concept vector
        – Each element of this vector indicates the degree to which the corresponding concept is related to the query
      • A five-step procedure is used
        – Each step selects concepts related to the query from the concept pool
  • 8. Query processing: Step 1
      Motivation: Some concepts are semantically close to the input query and can describe it extremely well
      Approach:
      – Compare every concept in our pool with the entire input query, using the Explicit Semantic Analysis (ESA) measure
      – If the score between the query and a concept is higher than a threshold (0.8), the concept is selected
      – If at least one concept is selected in this way, we assume that the query is very well described and the query processing stops; otherwise the query processing continues in step 2
      Example: the query "Find shots of a sewing machine" and the concept "sewing machine" are semantically extremely close
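The thresholding rule of step 1 can be sketched as follows. This is only an illustration: the ESA measure itself is not implemented here, so `relatedness` is a caller-supplied function, and `toy_relatedness` with its scores is a made-up stand-in rather than real ESA output.

```python
def step1_select(query, concept_pool, relatedness, threshold=0.8):
    """Select every concept whose relatedness to the whole query exceeds
    the threshold; report whether query processing can stop here."""
    selected = {}
    for concept in concept_pool:
        score = relatedness(query, concept)
        if score > threshold:
            selected[concept] = score
    stop = bool(selected)  # at least one concept selected: stop in step 1
    return selected, stop


# Hypothetical stand-in for ESA: a lookup table of made-up scores.
TOY_SCORES = {"sewing machine": 0.95, "police car": 0.10}


def toy_relatedness(query, concept):
    return TOY_SCORES.get(concept, 0.0)


selected, stop = step1_select(
    "Find shots of a sewing machine",
    ["sewing machine", "police car"],
    toy_relatedness,
)
```

When `stop` is false, the same query would be passed on to step 2.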
  • 9. Query processing: Step 1
      The processing stopped in step 1 for 3 out of the 30 queries:
      • For "Find shots of a sewing machine", the concept "sewing machine" was selected
      • For "Find shots of a policeman where a police car is visible", the concept "police car" was selected
      • For "Find shots of people shopping", the concept "tobacco shop" was selected
  • 10. Query processing: Step 2
      Motivation: Some (complex) concepts may describe the query quite well, but appear in it in a form that becomes hard to detect once the subsequent linguistic analysis breaks the query down into sub-queries
      Approach:
      – We check, by string matching, whether any of the concepts appears in any part of the query
      – Any concepts that appear in the query are selected, and the query processing continues in step 3
      Example: For the query "Find shots of a man with beard and wearing white robe speaking and gesturing to camera", the concept "speaking to camera" was found
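A rough sketch of the string-matching step follows. The slides do not specify the exact matching rule; since the concept "speaking to camera" is not a contiguous substring of the example query, this sketch assumes a concept matches when all of its words occur somewhere in the query. That rule, and the names `tokenize` and `step2_select`, are assumptions made for illustration.

```python
import re


def tokenize(text):
    """Lower-case the text and split it into alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())


def step2_select(query, concept_pool):
    """Select concepts whose words all occur in the query (assumed rule)."""
    query_words = set(tokenize(query))
    selected = []
    for concept in concept_pool:
        concept_words = tokenize(concept)
        # Skip empty concepts; require every concept word in the query.
        if concept_words and set(concept_words) <= query_words:
            selected.append(concept)
    return selected


query = ("Find shots of a man with beard and wearing white robe "
         "speaking and gesturing to camera")
matches = step2_select(query, ["speaking to camera", "police car", "sitting down"])
```

A word-subset rule ignores word order, which is why it would also match "door opening" against a query containing "opening a door"; a stricter substring rule would miss both examples.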
  • 11. Query processing: Step 2
      For 5 out of the 30 queries, concepts were selected through string matching:
      • For "Find shots of a man with beard and wearing white robe speaking and gesturing to camera", the concept "speaking to camera" was selected
      • For "Find shots of one or more people opening a door and exiting through it", the concept "door opening" was selected
      • For "Find shots of the 43rd president George W. Bush sitting down talking with people indoors", the concept "sitting down" was selected
      • For "Find shots of military personnel interacting with protesters", the concept "military personnel" was selected
      • For "Find shots of a person sitting down with a laptop visible", the concept "sitting down" was selected
  • 12. Query processing: Step 3
      Motivation: Queries are complex sentences; we decompose them in order to understand and process their parts better
      Approach:
      – We define a sub-query as a meaningful smaller phrase or term that is included in the original query, and we automatically decompose the query into sub-queries
        • NLP procedures (e.g. PoS tagging, stop-word removal) and task-specific NLP rules are used
        • For example, a Noun-Verb-Noun triad forms a sub-query
      – The ESA relatedness score is evaluated for every (sub-query, concept) pair
      – If the score is higher than the threshold used in step 1 (0.8), the concept is selected
  • 13. Query processing: Step 3
      Example: the query "Find shots of a diver wearing diving suit and swimming under water" is split into the following four sub-queries: diver wearing diving suit, swimming, water
      • If at least one concept is selected for every sub-query, we consider the query completely analyzed and proceed to the video shot retrieval component
      • If no concepts have been selected for a subset of the sub-queries, we continue to step 4
      • If no concepts have been selected for any of the sub-queries, we continue to step 5
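The three-way branching after step 3 can be written down as a small sketch. The function name `next_stage`, the returned labels, and the toy matches are hypothetical; only the branching rule itself is taken from the slide.

```python
def next_stage(matches):
    """Decide where processing continues after step 3.

    matches: dict mapping each sub-query to the list of concepts selected
    for it (an empty list when nothing passed the threshold).
    """
    if not matches:
        raise ValueError("the query must have at least one sub-query")
    matched = [sq for sq, concepts in matches.items() if concepts]
    if len(matched) == len(matches):
        return "retrieval"  # every sub-query has at least one concept
    if matched:
        return "step 4"     # only a subset of the sub-queries is covered
    return "step 5"         # no sub-query is covered


# Toy example (made-up matches): one sub-query is left without concepts.
stage = next_stage({"bridge": ["suspension bridge"], "daytime": []})
```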
  • 14. Query processing: Step 3
      • On average, a query was broken down into 3.7 sub-queries
      • For none of the test queries was every sub-query matched to at least one concept from our pool
      • For 17 of the 27 queries that continued past step 1, concepts were matched to a subset of the sub-queries, thus the processing continued to step 4
      • For the remaining 10 queries, no concept was matched to any of their sub-queries, thus the processing continued to step 5
  • 15. Query processing: Step 4
      Motivation: For a subset of the sub-queries no concepts were selected, because their semantic relatedness to the concepts is low (i.e., their ESA score is below the 0.8 threshold)
      Approach:
      – For each of these sub-queries the concept with the highest ESA score is selected, and then we proceed to video shot retrieval
      Example: for the query "Find shots of one or more people walking or bicycling on a bridge during daytime"
      – Steps 2, 3: sub-queries "people walking", "bicycling", "bridge"; selected concepts (ESA score): walking (1.0), bicycle-built-for-two (1.0), suspension bridge (1.0), bicycles (0.85), bridges (0.84), bicycling (0.84)
      – Step 4: sub-query "daytime"; selected concept (ESA score): daytime outdoor (0.74)
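Step 4 amounts to an arg-max over the concept pool for each uncovered sub-query, which the sketch below illustrates. As in step 1, ESA is not implemented: `relatedness` is supplied by the caller, and the toy table is a hypothetical stand-in in which only the 0.74 score for "daytime outdoor" comes from the slide.

```python
def step4_select(unmatched_subqueries, concept_pool, relatedness):
    """For each sub-query without a selected concept, pick the single
    most related concept, even though its score is below the threshold."""
    if not concept_pool:
        raise ValueError("the concept pool must not be empty")
    selected = {}
    for subquery in unmatched_subqueries:
        best = max(concept_pool, key=lambda c: relatedness(subquery, c))
        selected[subquery] = (best, relatedness(subquery, best))
    return selected


# Hypothetical stand-in for ESA; 0.10 is a made-up score.
TOY_ESA = {
    ("daytime", "daytime outdoor"): 0.74,
    ("daytime", "suspension bridge"): 0.10,
}


def toy_esa(subquery, concept):
    return TOY_ESA.get((subquery, concept), 0.0)


fallback = step4_select(["daytime"], ["suspension bridge", "daytime outdoor"], toy_esa)
```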
  • 16. Query processing: Step 5
      Motivation: For some queries none of the above steps is able to select concepts
      Approach:
      – Our MED16 000Ex framework is used
      – The query title and its sub-queries form an Event Language Model
      – A Concept Language Model is formed for every concept using articles retrieved from Wikipedia
      – A ranked list of the most relevant concepts and the corresponding scores (semantic correlation between each query-concept pair) is returned
      – We proceed to the video shot retrieval component
  • 17. Query processing: Step 5
      Example: For the query "Find shots of a person playing guitar outdoors", the framework returns the following concepts: outdoor, acoustic guitar, electric guitar and daytime outdoor
  • 18. Video shot retrieval
      • The query's concept vector is formed from the scores of the selected concepts
      • If a concept has been selected in steps 1, 3, 4 or 5, the corresponding vector element is assigned the relatedness score (calculated using the ESA measure); if it has been selected in step 2, the element is set equal to 1
      • Histogram intersection is used to calculate the distance between the query's concept vector and the concept vector of each test keyframe
      • The 1000 keyframes with the smallest distance from the query's concept vector are retrieved
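The retrieval step can be sketched as below. Histogram intersection, sum of element-wise minima, is a similarity, and the slides do not say how it is turned into a distance; this sketch therefore assumes that a larger intersection means a smaller distance and ranks by the largest intersection. All function names and toy vectors are hypothetical.

```python
import heapq


def build_query_vector(concept_pool, selected):
    """selected: dict concept -> (score, step). Concepts from step 2 get
    the value 1; others keep their relatedness score; the rest get 0."""
    vector = []
    for concept in concept_pool:
        if concept not in selected:
            vector.append(0.0)
            continue
        score, step = selected[concept]
        vector.append(1.0 if step == 2 else score)
    return vector


def histogram_intersection(a, b):
    """Sum of element-wise minima of two equal-length vectors."""
    return sum(min(x, y) for x, y in zip(a, b))


def retrieve(query_vector, keyframe_vectors, top_k=1000):
    """Return the ids of the top_k keyframes with the largest intersection
    (assumed equivalent to the smallest distance)."""
    return heapq.nlargest(
        top_k,
        keyframe_vectors,
        key=lambda kf: histogram_intersection(query_vector, keyframe_vectors[kf]),
    )


pool = ["walking", "bridges", "speaking to camera"]
query_vector = build_query_vector(pool, {"walking": (1.0, 3), "bridges": (0.84, 3)})
keyframes = {  # made-up concept scores for three keyframes
    "kf_a": [0.9, 0.8, 0.1],
    "kf_b": [0.1, 0.2, 0.9],
    "kf_c": [0.5, 0.9, 0.0],
}
top2 = retrieve(query_vector, keyframes, top_k=2)
```

Using `heapq.nlargest` avoids sorting the full keyframe collection when only the top 1000 results are needed.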
  • 19. Submitted Runs
      • We submitted both fully-automatic and manually-assisted runs
      • For the manually-assisted ones
        – We used the same fully-automatic system, but
        – A member of our team who was not involved in the development of our AVS system looked at each query and manually suggested sub-queries for it, without knowledge of the automatically-generated ones
        – The manually defined sub-queries were added to the automatically-generated ones, and our automatic AVS system was applied
  • 20. Submitted Runs
      ITI-CERTH 1:
      – Late fusion of the direct output from 5 DCNNs for ImageNet 1000 concepts
      – SVM-based concept detectors for 345 TRECVID SIN concepts
      ITI-CERTH 2:
      – Late fusion of the direct output from 5 DCNNs for ImageNet 1000 concepts
      – The direct output of the FT network for 345 TRECVID SIN concepts
      ITI-CERTH 3: ITI-CERTH 1 run without step 4
      ITI-CERTH 4: ITI-CERTH 1 run without step 2

      Submitted run                  ITI-CERTH 1  ITI-CERTH 2  ITI-CERTH 3  ITI-CERTH 4
      MXinfAP (fully-automatic)      0.051        0.042        0.051        0.051
      MXinfAP (manually-assisted)    0.043        0.037        0.037        0.043
  • 21. Results (fully-automatic runs)
  • 22. Results and conclusions
      • Training SVMs on DCNN-based features, instead of using the direct output of the DCNNs, for the 345 TRECVID SIN concepts improves the accuracy (i.e., run ITI-CERTH 1 outperforms ITI-CERTH 2)
      • In the AVS 2016 dataset
        – Step 4 could be omitted for the fully-automatic runs
          • Sub-queries without high semantic relatedness can be ignored; ITI-CERTH 1 and ITI-CERTH 3 achieve the same results
        – Step 2 could be omitted
          • String matching between the test query and concepts does not improve the accuracy; semantic relatedness makes the difference
      • Fully-automatic runs outperformed the manually-assisted ones
      • Our best fully-automatic run was ranked 2nd-best in the fully-automatic run category; it also outperformed the runs of all but one participant in the manually-assisted run category
  • 23. Questions?
      More information and contact: Vasileios Mezaris, http://www.iti.gr/~bmezaris, bmezaris@iti.gr
      TRECVID 2016 paper: F. Markatopoulou, A. Moumtzidou, D. Galanopoulos, T. Mironidis, V. Kaltsa, A. Ioannidou, S. Symeonidis, K. Avgerinakis, S. Andreadis, I. Gialampoukidis, S. Vrochidis, A. Briassouli, V. Mezaris, I. Kompatsiaris, I. Patras, "ITI-CERTH participation in TRECVID 2016", Proc. TRECVID 2016 Workshop, Gaithersburg, MD, USA, November 2016.
      This work has received funding from the European Union's Horizon 2020 research and innovation programme under grant agreements H2020-693092 MOVING and H2020-687786 InVID.