When Textual and Visual Information Join Forces for MultiMedia Retrieval
Bahjat Safadi, Mathilde Sahuguet, Benoit Huet
EURECOM, Multimedia Department
Sophia Antipolis, France
Introduction
EU alone hosts 500+ online video platforms
42.7 million hours of footage in the online archives of broadcasters and
producers (61% of archive footage is online)
User-generated content (UGC) is on the rise:
YouTube receives 60 hours of video per minute
Vine and Instagram video
Internet video is now 40 percent of consumer Internet traffic,
and will reach 62 percent by the end of 2015, 75% in 2017
(source: CISCO)
How to make the content accessible?
Browsing, Searching, Hyperlinking
B Huet - Eurecom - BAMMF - 20/06/2014
Objectives and Contributions
We propose and evaluate a video search framework using visual
information to enrich the classic text-based search for video
retrieval operating at the fragment level.
We investigate the following two questions:
To what extent can visual concepts contribute information when retrieving
videos?
How can we cope with the confidence in visual concept detection?
The framework extends conventional text-based search by fusing
together textual and visual scores.
We address both the semantic and intention gaps:
by automatically mapping the query text to semantic concepts;
with the addition of “visual cues”.
MediaEval Search & Hyperlinking
Information seeking in a video dataset:
retrieving media fragments/anchors
The Video Archive
2323 BBC videos of different genres (440 programs)
~1697h of video + audio
Subtitles (manual)
Two ASR transcripts (LIMSI, LIUM)
Metadata (title, cast, description, ...)
Shot boundaries and key-frames
Search: 50 queries from 29 users
– Textual query + visual cues
Face detection
Concept detection
Text query: Medieval history of why castles were first built
Visual cues: Castle
Text query: Best players of all time; Embarrassing England performances;
Wake up call for English football; Wembley massacre;
Visual cues: Poor camera quality; heavy looking football; unusual goal
celebrations; unusual crowd reactions; dark; grey; overcast; black and white;
The proposed Framework
[Framework diagram. Off-line: the collection (videos, scenes and subtitles) goes through Lucene indexing (scenes + subtitles) and content-based indexing with visual semantic concepts (concept scores per scene). On-line: the user's textual query produces text-based scores; the visual cues lead to selected concepts (mapping still marked "?") and visual-based scores; fusion and ranking produce the ranked list.]
The proposed Framework
[Framework diagram, off-line part: collection (videos, scenes and subtitles), content-based indexing with visual semantic concepts, concept scores per scene.]
No training data for visual concepts on this collection
Use 151 visual concept detectors trained on TRECVID 2012 data
Detector performance on this collection is unknown
Visual concept detector confidence (w)
100 top images for the concept “Animal”
58 out of 100 are manually evaluated as valid
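
One plausible reading of this slide is that the confidence w of a detector is estimated as the fraction of its top-ranked images that a human judges valid. The sketch below assumes exactly that definition (precision over the top 100); the function name and the list of judgments are illustrative and are not taken from the authors' code.

```python
def detector_confidence(judgments):
    """Fraction of top-ranked images judged valid (assumed definition of w)."""
    if not judgments:
        raise ValueError("need at least one judged image")
    return sum(1 for valid in judgments if valid) / len(judgments)


# Hypothetical judgment list reproducing the slide's counts for "Animal":
# 58 valid images among the top 100.
animal_judgments = [True] * 58 + [False] * 42
w_animal = detector_confidence(animal_judgments)
print(f"w(Animal) = {w_animal:.2f}")
```

Under this assumption the "Animal" detector would receive w = 0.58, which can then be used to down-weight its scores relative to more reliable detectors.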
The proposed Framework
[Framework diagram, on-line part: user querying with a textual/visual query.]
<queryText>Children out on poetry trip Exploration of poetry by school children Poem writing</queryText>
<visualCues>House memories Farm exploration A poem on animal and shells </visualCues>
Users are not aware of visual concepts
Mapping visual cues to visual concepts
<queryText>Children out on poetry trip Exploration of poetry by school children Poem writing</queryText>
<visualCues>House memories Farm exploration A poem on animal and shells </visualCues>
Keywords (from the visual cues): Farm, Shells, Exploration, Poem, Animal, House, Memories
WordNet mapping (keywords to visual concepts)
Visual concepts: Animal, Birds, Insect, Cattle, Dogs, Building, School, Church, Flags, Mountain
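
The first step of the mapping, going from the free-text visual cues to keywords, can be sketched as below. This is a minimal illustration: the `<query>` wrapper element and the stop-word list are assumptions added to make the fragment parseable and are not part of the slides.

```python
import xml.etree.ElementTree as ET

# Small illustrative stop-word list (an assumption, not from the slides).
STOP_WORDS = {"a", "an", "the", "on", "of", "and", "by", "in", "out"}


def cue_keywords(query_xml):
    """Return the lower-cased, non-stop-word tokens of the <visualCues> element."""
    root = ET.fromstring(query_xml)
    cues = root.findtext("visualCues", default="")
    return [w.lower() for w in cues.split() if w.lower() not in STOP_WORDS]


# The <query> wrapper is hypothetical; the slide shows only the two child elements.
query_xml = """<query>
<queryText>Children out on poetry trip Exploration of poetry by school children Poem writing</queryText>
<visualCues>House memories Farm exploration A poem on animal and shells </visualCues>
</query>"""

keywords = cue_keywords(query_xml)
print(keywords)
```

For the example query this yields the same seven keywords listed on the slide; each keyword would then be compared against the names of the available visual concepts.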
Mapping visual cues to visual concepts
Concepts mapped to the visual query “Castle”
Semantic similarity (β) computed using the “Lin” measure
Concept   Windows   Plant    Court    Church   Building
β         0.4533    0.4582   0.5115   0.6123   0.701
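
The Lin measure scores two concepts as twice the information content (IC) of their least common subsumer divided by the sum of their own IC values. The sketch below implements that formula on plain numbers and then keeps only the concepts whose similarity reaches a threshold; treating the threshold θ as a lower bound on β is an assumption, and the function names are illustrative.

```python
def lin_similarity(ic_a, ic_b, ic_lcs):
    """Lin similarity from information-content (IC) values.

    ic_lcs is the IC of the least common subsumer of the two concepts.
    """
    denom = ic_a + ic_b
    return 0.0 if denom == 0 else 2.0 * ic_lcs / denom


def select_concepts(similarities, theta):
    """Keep the concepts whose similarity to the cue is at least theta."""
    return {c: b for c, b in similarities.items() if b >= theta}


# Similarity values for the cue "Castle", copied from the table above.
castle = {"Windows": 0.4533, "Plant": 0.4582, "Court": 0.5115,
          "Church": 0.6123, "Building": 0.701}

selected = select_concepts(castle, theta=0.5)
print(sorted(selected))
```

With θ = 0.5, "Court", "Church" and "Building" are retained for "Castle"; raising θ to 0.7 would leave only "Building".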
The proposed Framework
[Framework diagram, on-line part: Lucene indexing gives text-based scores; WordNet similarity gives the selected concepts and visual-based scores; fusion and ranking.]
One text-based score for each scene (t)
One visual-based score for each scene (v):
Computed from the scores of the selected concepts for each scene
$$v_i^q = \sum_{c \in C'_q} w_c \times vs_i^c$$
Fusion of the two scores:
$$f_i = t_i^{\alpha} + v_i^{1-\alpha}$$
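
The two formulas on this slide can be sketched in a few lines. The sketch follows the slide as written: the visual score of a scene is the weighted sum of its scores over the selected concepts, and the fused score is the text score raised to α plus the visual score raised to 1 − α. Scores are assumed to be non-negative, and all names and numeric values are hypothetical.

```python
def visual_score(concept_scores, weights, selected):
    """v_i: weighted sum of the scene's scores over the selected concepts."""
    return sum(weights[c] * concept_scores[c] for c in selected)


def fused_score(t, v, alpha):
    """f_i = t_i**alpha + v_i**(1 - alpha), as written on the slide."""
    return t ** alpha + v ** (1.0 - alpha)


# Hypothetical values for one scene and two selected concepts.
concept_scores = {"Church": 0.25, "Building": 0.5}  # vs_i^c, detector scores
weights = {"Church": 1.0, "Building": 1.0}          # w_c, here the w = 1.0 setting
v = visual_score(concept_scores, weights, ["Church", "Building"])
f = fused_score(t=0.25, v=v, alpha=0.5)
print(v, f)
```

Setting every weight to 1.0 corresponds to the "w = 1.0" configuration, while replacing the weights with per-detector confidences corresponds to "w = confidence(c)".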
Evaluation
To what extent can visual concepts contribute information when
retrieving videos?
How can we cope with the confidence in visual concept detection?
BBC Archive subset provided by the MediaEval 2013 Search and
Hyperlinking task.
Evaluation Measures:
Mean Reciprocal Rank (MRR): assesses the rank of the relevant segment
Mean Generalized Average Precision (mGAP): takes into account starting
time of the segment
Mean Average Segment Precision (MASP): measures both ranking and
segmentation of relevant segments
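
As an illustration of the first measure, the sketch below computes MRR as the average, over queries, of the reciprocal rank of the first relevant item in each ranked list. It is a simplified version that matches results by identifier; the time-window matching used to decide relevance in the benchmark is not modelled, and the segment identifiers are hypothetical.

```python
def mean_reciprocal_rank(ranked_lists, relevant):
    """Average of 1/rank of the first relevant item per query (0 if none is found)."""
    total = 0.0
    for query, ranking in ranked_lists.items():
        for rank, item in enumerate(ranking, start=1):
            if item in relevant[query]:
                total += 1.0 / rank
                break
    return total / len(ranked_lists)


# Hypothetical segment identifiers for two queries.
ranked_lists = {"q1": ["s3", "s7", "s1"], "q2": ["s2", "s9", "s4"]}
relevant = {"q1": {"s7"}, "q2": {"s2"}}
mrr = mean_reciprocal_rank(ranked_lists, relevant)
print(mrr)
```

Here the relevant segment is found at rank 2 for the first query and rank 1 for the second, so the MRR is (1/2 + 1) / 2 = 0.75.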
Retrieval Performance (50 queries)
The visual concept detector confidence (w) has little impact.
A significant improvement can be achieved by combining only the mapped
concepts that pass a threshold θ ≥ 0.3.
The best performance is obtained with θ ≥ 0.8 (gain ≈ 11-12%).
[Two plots: w = 1.0 and w = confidence(c)]
Visual concepts and Query association
Number of concepts associated with the queries for
different values of the threshold θ.
θ               0.0  0.1  0.2  0.3  0.4  0.5  0.6  0.7  0.8  0.9
Min               5    5    5    2    0    0    0    0    0    0
Max              45   45   41   37   25   19   19   12    6    2
Mean             20   19   18   15   11    7    5    3    1    1
#Q (#C'q > 0)    50   50   50   50   49   49   48   44   29   21
Retrieval on queries with visual concepts (21)
Concept mapping significantly improves the performance
of the text-based search task on these queries.
The best performance was achieved with θ ≥ 0.7
(gain ≈ 32-33%).
[Two plots: w = 1.0 and w = confidence(c)]
Conclusion
A novel video search framework using visual information to
enrich a text-based search for video retrieval has been
presented.
We conducted our evaluations on the MediaEval 2013 benchmark, where
we ranked 2nd on Search and 1st on Hyperlinking.
Experimental results show that mapping text-based queries
to visual concepts significantly improves the search system.
When appropriately selecting the relevant visual concepts, a
very significant improvement is achieved (gain ≈ 33%).
Related Publications
B. Safadi, M. Sahuguet and B. Huet, When textual and visual information join forces for
multimedia retrieval, ICMR 2014, ACM International Conference on Multimedia Retrieval, April 1-4,
2014, Glasgow, Scotland
M. Sahuguet and B. Huet. Mining the Web for Multimedia-based Enriching. Multimedia Modeling
MMM 2014, 20th International Conference on MultiMedia Modeling, 8-10th January 2014, Dublin,
Ireland
M. Sahuguet, B. Huet, B. Cervenkova, E. Apostolidis, V. Mezaris, D. Stein, S. Eickeler, J-L. Redondo
Garcia, R. Troncy, L. Pikora. LinkedTV at MediaEval 2013 search and hyperlinking task,
MEDIAEVAL 2013, Multimedia Benchmark Workshop, October 18-19, 2013, Barcelona, Spain
Stein, D.; Öktem, A.; Apostolidis, E.; Mezaris, V.; Redondo García, J. L.; Troncy, R.; Sahuguet, M. &
Huet, B., From raw data to semantically enriched hyperlinking: Recent advances in the
LinkedTV analysis workflow, NEM Summit 2013, Networked & Electronic Media, 28-30 October
2013, Nantes, France
V. Mezaris and B. Huet, “Video Hyperlinking”, Tutorial Accepted at ICIP 2014 (Oct) Paris
B. Safadi, M. Sahuguet and B. Huet, “Linking text and visual concepts semantically for cross
modal multimedia search”, ICIP 2014, Paris 2014.
Questions?
http://www.slideshare.net/huetbenoit/
Thank you.
When Textual and Visual Information
Join Forces
for MultiMedia Retrieval
Benoit Huet
B Huet - Eurecom - BAMMF - p 2020/06/2014

More Related Content

Viewers also liked

Multimedia Information Retrieval
Multimedia Information RetrievalMultimedia Information Retrieval
Multimedia Information Retrieval
Stephane Marchand-Maillet
 
Multimedia Information Retrieval: What is it, and why isn't ...
Multimedia Information Retrieval: What is it, and why isn't ...Multimedia Information Retrieval: What is it, and why isn't ...
Multimedia Information Retrieval: What is it, and why isn't ...webhostingguy
 
Integrated Multimedia Indexing and Retrieval
Integrated Multimedia Indexing and RetrievalIntegrated Multimedia Indexing and Retrieval
Integrated Multimedia Indexing and Retrieval
Rachmat Wahid Saleh Insani
 
Multimedia content based retrieval slideshare.ppt
Multimedia content based retrieval slideshare.pptMultimedia content based retrieval slideshare.ppt
Multimedia content based retrieval slideshare.ppt
govintech1
 
similarity measure
similarity measure similarity measure
similarity measure
ZHAO Sam
 
Opinion mining for social media
Opinion mining for social mediaOpinion mining for social media
Opinion mining for social media
Diana Maynard
 

Viewers also liked (6)

Multimedia Information Retrieval
Multimedia Information RetrievalMultimedia Information Retrieval
Multimedia Information Retrieval
 
Multimedia Information Retrieval: What is it, and why isn't ...
Multimedia Information Retrieval: What is it, and why isn't ...Multimedia Information Retrieval: What is it, and why isn't ...
Multimedia Information Retrieval: What is it, and why isn't ...
 
Integrated Multimedia Indexing and Retrieval
Integrated Multimedia Indexing and RetrievalIntegrated Multimedia Indexing and Retrieval
Integrated Multimedia Indexing and Retrieval
 
Multimedia content based retrieval slideshare.ppt
Multimedia content based retrieval slideshare.pptMultimedia content based retrieval slideshare.ppt
Multimedia content based retrieval slideshare.ppt
 
similarity measure
similarity measure similarity measure
similarity measure
 
Opinion mining for social media
Opinion mining for social mediaOpinion mining for social media
Opinion mining for social media
 

Similar to When textual and visual information join forces for multimedia retrieval

LinkedTV @ MediaEval 2013 Search and Hyperlinking Task
LinkedTV @ MediaEval 2013 Search and Hyperlinking TaskLinkedTV @ MediaEval 2013 Search and Hyperlinking Task
LinkedTV @ MediaEval 2013 Search and Hyperlinking Task
Benoit HUET
 
Semantic Multimedia Remixing - MediaEval 2013 Search and Hyperlinking Task
Semantic Multimedia Remixing - MediaEval 2013 Search and Hyperlinking TaskSemantic Multimedia Remixing - MediaEval 2013 Search and Hyperlinking Task
Semantic Multimedia Remixing - MediaEval 2013 Search and Hyperlinking TaskMediaMixerCommunity
 
Video Coding Enhancements for HTTP Adaptive Streaming
Video Coding Enhancements for HTTP Adaptive StreamingVideo Coding Enhancements for HTTP Adaptive Streaming
Video Coding Enhancements for HTTP Adaptive Streaming
Alpen-Adria-Universität
 
Research@Lunch_Presentation.pdf
Research@Lunch_Presentation.pdfResearch@Lunch_Presentation.pdf
Research@Lunch_Presentation.pdf
Vignesh V Menon
 
What can users do for multimedia?
What can users do for multimedia?What can users do for multimedia?
What can users do for multimedia?
Lora Aroyo
 
Similarity-based retrieval of multimedia content
Similarity-based retrieval of multimedia contentSimilarity-based retrieval of multimedia content
Similarity-based retrieval of multimedia content
Symeon Papadopoulos
 
Video Browser Showdown (VBS) 2012-2019
Video Browser Showdown (VBS) 2012-2019Video Browser Showdown (VBS) 2012-2019
Video Browser Showdown (VBS) 2012-2019
klschoef
 
Media Genre Inference for Predicting Media Interestingness
Media Genre Inference for Predicting Media InterestingnessMedia Genre Inference for Predicting Media Interestingness
Media Genre Inference for Predicting Media Interestingness
Benoit HUET
 
MediaEval 2017 - Interestingness Task: EURECOM @MediaEval 2017: Media Genre I...
MediaEval 2017 - Interestingness Task: EURECOM @MediaEval 2017: Media Genre I...MediaEval 2017 - Interestingness Task: EURECOM @MediaEval 2017: Media Genre I...
MediaEval 2017 - Interestingness Task: EURECOM @MediaEval 2017: Media Genre I...
multimediaeval
 
Multimedia Content Understanding: Bringing Context to Content
Multimedia Content Understanding: Bringing Context to ContentMultimedia Content Understanding: Bringing Context to Content
Multimedia Content Understanding: Bringing Context to Content
Benoit HUET
 
Content_adaptive_video_coding_for_HTTP_Adaptive_Streaming.pdf
Content_adaptive_video_coding_for_HTTP_Adaptive_Streaming.pdfContent_adaptive_video_coding_for_HTTP_Adaptive_Streaming.pdf
Content_adaptive_video_coding_for_HTTP_Adaptive_Streaming.pdf
Vignesh V Menon
 
Content-adaptive Video Coding for HTTP Adaptive Streaming
Content-adaptive Video Coding for HTTP Adaptive StreamingContent-adaptive Video Coding for HTTP Adaptive Streaming
Content-adaptive Video Coding for HTTP Adaptive Streaming
Alpen-Adria-Universität
 
Bridging the gap between web and television
Bridging the gap between web and televisionBridging the gap between web and television
Bridging the gap between web and television
Marius Preda PhD
 
Content Modelling for Human Action Detection via Multidimensional Approach
Content Modelling for Human Action Detection via Multidimensional ApproachContent Modelling for Human Action Detection via Multidimensional Approach
Content Modelling for Human Action Detection via Multidimensional Approach
CSCJournals
 
Remixing Media on the Semantic Web (ISWC 2014 Tutorial) Pt 1 Media Fragment S...
Remixing Media on the Semantic Web (ISWC 2014 Tutorial) Pt 1 Media Fragment S...Remixing Media on the Semantic Web (ISWC 2014 Tutorial) Pt 1 Media Fragment S...
Remixing Media on the Semantic Web (ISWC 2014 Tutorial) Pt 1 Media Fragment S...
LinkedTV
 
Observe: Semantic Context-based Content Recommendation for Adaptive Public Sc...
Observe: Semantic Context-based Content Recommendation for Adaptive Public Sc...Observe: Semantic Context-based Content Recommendation for Adaptive Public Sc...
Observe: Semantic Context-based Content Recommendation for Adaptive Public Sc...
Victor de Boer
 
Flip Video Session
Flip Video SessionFlip Video Session
Flip Video Session
Chris LaBelle
 
Warcnet 2022_final.pptx
Warcnet 2022_final.pptxWarcnet 2022_final.pptx
Warcnet 2022_final.pptx
WARCnet
 
Research and Development at Sound and Vision
Research and Development at Sound and Vision Research and Development at Sound and Vision
Research and Development at Sound and Vision
Victor de Boer
 
Audiovisual content exploitation JTS2010
Audiovisual content exploitation  JTS2010 Audiovisual content exploitation  JTS2010
Audiovisual content exploitation JTS2010
roelandordelman.nl
 

Similar to When textual and visual information join forces for multimedia retrieval (20)

LinkedTV @ MediaEval 2013 Search and Hyperlinking Task
LinkedTV @ MediaEval 2013 Search and Hyperlinking TaskLinkedTV @ MediaEval 2013 Search and Hyperlinking Task
LinkedTV @ MediaEval 2013 Search and Hyperlinking Task
 
Semantic Multimedia Remixing - MediaEval 2013 Search and Hyperlinking Task
Semantic Multimedia Remixing - MediaEval 2013 Search and Hyperlinking TaskSemantic Multimedia Remixing - MediaEval 2013 Search and Hyperlinking Task
Semantic Multimedia Remixing - MediaEval 2013 Search and Hyperlinking Task
 
Video Coding Enhancements for HTTP Adaptive Streaming
Video Coding Enhancements for HTTP Adaptive StreamingVideo Coding Enhancements for HTTP Adaptive Streaming
Video Coding Enhancements for HTTP Adaptive Streaming
 
Research@Lunch_Presentation.pdf
Research@Lunch_Presentation.pdfResearch@Lunch_Presentation.pdf
Research@Lunch_Presentation.pdf
 
What can users do for multimedia?
What can users do for multimedia?What can users do for multimedia?
What can users do for multimedia?
 
Similarity-based retrieval of multimedia content
Similarity-based retrieval of multimedia contentSimilarity-based retrieval of multimedia content
Similarity-based retrieval of multimedia content
 
Video Browser Showdown (VBS) 2012-2019
Video Browser Showdown (VBS) 2012-2019Video Browser Showdown (VBS) 2012-2019
Video Browser Showdown (VBS) 2012-2019
 
Media Genre Inference for Predicting Media Interestingness
Media Genre Inference for Predicting Media InterestingnessMedia Genre Inference for Predicting Media Interestingness
Media Genre Inference for Predicting Media Interestingness
 
MediaEval 2017 - Interestingness Task: EURECOM @MediaEval 2017: Media Genre I...
MediaEval 2017 - Interestingness Task: EURECOM @MediaEval 2017: Media Genre I...MediaEval 2017 - Interestingness Task: EURECOM @MediaEval 2017: Media Genre I...
MediaEval 2017 - Interestingness Task: EURECOM @MediaEval 2017: Media Genre I...
 
Multimedia Content Understanding: Bringing Context to Content
Multimedia Content Understanding: Bringing Context to ContentMultimedia Content Understanding: Bringing Context to Content
Multimedia Content Understanding: Bringing Context to Content
 
Content_adaptive_video_coding_for_HTTP_Adaptive_Streaming.pdf
Content_adaptive_video_coding_for_HTTP_Adaptive_Streaming.pdfContent_adaptive_video_coding_for_HTTP_Adaptive_Streaming.pdf
Content_adaptive_video_coding_for_HTTP_Adaptive_Streaming.pdf
 
Content-adaptive Video Coding for HTTP Adaptive Streaming
Content-adaptive Video Coding for HTTP Adaptive StreamingContent-adaptive Video Coding for HTTP Adaptive Streaming
Content-adaptive Video Coding for HTTP Adaptive Streaming
 
Bridging the gap between web and television
Bridging the gap between web and televisionBridging the gap between web and television
Bridging the gap between web and television
 
Content Modelling for Human Action Detection via Multidimensional Approach
Content Modelling for Human Action Detection via Multidimensional ApproachContent Modelling for Human Action Detection via Multidimensional Approach
Content Modelling for Human Action Detection via Multidimensional Approach
 
Remixing Media on the Semantic Web (ISWC 2014 Tutorial) Pt 1 Media Fragment S...
Remixing Media on the Semantic Web (ISWC 2014 Tutorial) Pt 1 Media Fragment S...Remixing Media on the Semantic Web (ISWC 2014 Tutorial) Pt 1 Media Fragment S...
Remixing Media on the Semantic Web (ISWC 2014 Tutorial) Pt 1 Media Fragment S...
 
Observe: Semantic Context-based Content Recommendation for Adaptive Public Sc...
Observe: Semantic Context-based Content Recommendation for Adaptive Public Sc...Observe: Semantic Context-based Content Recommendation for Adaptive Public Sc...
Observe: Semantic Context-based Content Recommendation for Adaptive Public Sc...
 
Flip Video Session
Flip Video SessionFlip Video Session
Flip Video Session
 
Warcnet 2022_final.pptx
Warcnet 2022_final.pptxWarcnet 2022_final.pptx
Warcnet 2022_final.pptx
 
Research and Development at Sound and Vision
Research and Development at Sound and Vision Research and Development at Sound and Vision
Research and Development at Sound and Vision
 
Audiovisual content exploitation JTS2010
Audiovisual content exploitation  JTS2010 Audiovisual content exploitation  JTS2010
Audiovisual content exploitation JTS2010
 

Recently uploaded

María Carolina Martínez - eCommerce Day Colombia 2024
María Carolina Martínez - eCommerce Day Colombia 2024María Carolina Martínez - eCommerce Day Colombia 2024
María Carolina Martínez - eCommerce Day Colombia 2024
eCommerce Institute
 
Gregory Harris - Cycle 2 - Civics Presentation
Gregory Harris - Cycle 2 - Civics PresentationGregory Harris - Cycle 2 - Civics Presentation
Gregory Harris - Cycle 2 - Civics Presentation
gharris9
 
AWANG ANIQKMALBIN AWANG TAJUDIN B22080004 ASSIGNMENT 2 MPU3193 PHILOSOPHY AND...
AWANG ANIQKMALBIN AWANG TAJUDIN B22080004 ASSIGNMENT 2 MPU3193 PHILOSOPHY AND...AWANG ANIQKMALBIN AWANG TAJUDIN B22080004 ASSIGNMENT 2 MPU3193 PHILOSOPHY AND...
AWANG ANIQKMALBIN AWANG TAJUDIN B22080004 ASSIGNMENT 2 MPU3193 PHILOSOPHY AND...
AwangAniqkmals
 
Presentatie 4. Jochen Cremer - TU Delft 28 mei 2024
Presentatie 4. Jochen Cremer - TU Delft 28 mei 2024Presentatie 4. Jochen Cremer - TU Delft 28 mei 2024
Presentatie 4. Jochen Cremer - TU Delft 28 mei 2024
Dutch Power
 
Doctoral Symposium at the 17th IEEE International Conference on Software Test...
Doctoral Symposium at the 17th IEEE International Conference on Software Test...Doctoral Symposium at the 17th IEEE International Conference on Software Test...
Doctoral Symposium at the 17th IEEE International Conference on Software Test...
Sebastiano Panichella
 
Tom tresser burning issue.pptx My Burning issue
Tom tresser burning issue.pptx My Burning issueTom tresser burning issue.pptx My Burning issue
Tom tresser burning issue.pptx My Burning issue
amekonnen
 
2024-05-30_meetup_devops_aix-marseille.pdf
2024-05-30_meetup_devops_aix-marseille.pdf2024-05-30_meetup_devops_aix-marseille.pdf
2024-05-30_meetup_devops_aix-marseille.pdf
Frederic Leger
 
Gregory Harris' Civics Presentation.pptx
Gregory Harris' Civics Presentation.pptxGregory Harris' Civics Presentation.pptx
Gregory Harris' Civics Presentation.pptx
gharris9
 
Burning Issue Presentation By Kenmaryon.pdf
Burning Issue Presentation By Kenmaryon.pdfBurning Issue Presentation By Kenmaryon.pdf
Burning Issue Presentation By Kenmaryon.pdf
kkirkland2
 
Collapsing Narratives: Exploring Non-Linearity • a micro report by Rosie Wells
Collapsing Narratives: Exploring Non-Linearity • a micro report by Rosie WellsCollapsing Narratives: Exploring Non-Linearity • a micro report by Rosie Wells
Collapsing Narratives: Exploring Non-Linearity • a micro report by Rosie Wells
Rosie Wells
 
International Workshop on Artificial Intelligence in Software Testing
International Workshop on Artificial Intelligence in Software TestingInternational Workshop on Artificial Intelligence in Software Testing
International Workshop on Artificial Intelligence in Software Testing
Sebastiano Panichella
 
Bitcoin Lightning wallet and tic-tac-toe game XOXO
Bitcoin Lightning wallet and tic-tac-toe game XOXOBitcoin Lightning wallet and tic-tac-toe game XOXO
Bitcoin Lightning wallet and tic-tac-toe game XOXO
Matjaž Lipuš
 
somanykidsbutsofewfathers-140705000023-phpapp02.pptx
somanykidsbutsofewfathers-140705000023-phpapp02.pptxsomanykidsbutsofewfathers-140705000023-phpapp02.pptx
somanykidsbutsofewfathers-140705000023-phpapp02.pptx
Howard Spence
 
Obesity causes and management and associated medical conditions
Obesity causes and management and associated medical conditionsObesity causes and management and associated medical conditions
Obesity causes and management and associated medical conditions
Faculty of Medicine And Health Sciences
 
Announcement of 18th IEEE International Conference on Software Testing, Verif...
Announcement of 18th IEEE International Conference on Software Testing, Verif...Announcement of 18th IEEE International Conference on Software Testing, Verif...
Announcement of 18th IEEE International Conference on Software Testing, Verif...
Sebastiano Panichella
 
Presentatie 8. Joost van der Linde & Daniel Anderton - Eliq 28 mei 2024
Presentatie 8. Joost van der Linde & Daniel Anderton - Eliq 28 mei 2024Presentatie 8. Joost van der Linde & Daniel Anderton - Eliq 28 mei 2024
Presentatie 8. Joost van der Linde & Daniel Anderton - Eliq 28 mei 2024
Dutch Power
 
Bonzo subscription_hjjjjjjjj5hhhhhhh_2024.pdf
Bonzo subscription_hjjjjjjjj5hhhhhhh_2024.pdfBonzo subscription_hjjjjjjjj5hhhhhhh_2024.pdf
Bonzo subscription_hjjjjjjjj5hhhhhhh_2024.pdf
khadija278284
 
Supercharge your AI - SSP Industry Breakout Session 2024-v2_1.pdf
Supercharge your AI - SSP Industry Breakout Session 2024-v2_1.pdfSupercharge your AI - SSP Industry Breakout Session 2024-v2_1.pdf
Supercharge your AI - SSP Industry Breakout Session 2024-v2_1.pdf
Access Innovations, Inc.
 
Media as a Mind Controlling Strategy In Old and Modern Era
Media as a Mind Controlling Strategy In Old and Modern EraMedia as a Mind Controlling Strategy In Old and Modern Era
Media as a Mind Controlling Strategy In Old and Modern Era
faizulhassanfaiz1670
 

Recently uploaded (19)

María Carolina Martínez - eCommerce Day Colombia 2024
María Carolina Martínez - eCommerce Day Colombia 2024María Carolina Martínez - eCommerce Day Colombia 2024
María Carolina Martínez - eCommerce Day Colombia 2024
 
Gregory Harris - Cycle 2 - Civics Presentation
Gregory Harris - Cycle 2 - Civics PresentationGregory Harris - Cycle 2 - Civics Presentation
Gregory Harris - Cycle 2 - Civics Presentation
 
AWANG ANIQKMALBIN AWANG TAJUDIN B22080004 ASSIGNMENT 2 MPU3193 PHILOSOPHY AND...
AWANG ANIQKMALBIN AWANG TAJUDIN B22080004 ASSIGNMENT 2 MPU3193 PHILOSOPHY AND...AWANG ANIQKMALBIN AWANG TAJUDIN B22080004 ASSIGNMENT 2 MPU3193 PHILOSOPHY AND...
AWANG ANIQKMALBIN AWANG TAJUDIN B22080004 ASSIGNMENT 2 MPU3193 PHILOSOPHY AND...
 
Presentatie 4. Jochen Cremer - TU Delft 28 mei 2024
Presentatie 4. Jochen Cremer - TU Delft 28 mei 2024Presentatie 4. Jochen Cremer - TU Delft 28 mei 2024
Presentatie 4. Jochen Cremer - TU Delft 28 mei 2024
 
Doctoral Symposium at the 17th IEEE International Conference on Software Test...
Doctoral Symposium at the 17th IEEE International Conference on Software Test...Doctoral Symposium at the 17th IEEE International Conference on Software Test...
Doctoral Symposium at the 17th IEEE International Conference on Software Test...
 
Tom tresser burning issue.pptx My Burning issue
Tom tresser burning issue.pptx My Burning issueTom tresser burning issue.pptx My Burning issue
Tom tresser burning issue.pptx My Burning issue
 
2024-05-30_meetup_devops_aix-marseille.pdf
2024-05-30_meetup_devops_aix-marseille.pdf2024-05-30_meetup_devops_aix-marseille.pdf
2024-05-30_meetup_devops_aix-marseille.pdf
 
Gregory Harris' Civics Presentation.pptx
Gregory Harris' Civics Presentation.pptxGregory Harris' Civics Presentation.pptx
Gregory Harris' Civics Presentation.pptx
 
Burning Issue Presentation By Kenmaryon.pdf
Burning Issue Presentation By Kenmaryon.pdfBurning Issue Presentation By Kenmaryon.pdf
Burning Issue Presentation By Kenmaryon.pdf
 
Collapsing Narratives: Exploring Non-Linearity • a micro report by Rosie Wells
Collapsing Narratives: Exploring Non-Linearity • a micro report by Rosie WellsCollapsing Narratives: Exploring Non-Linearity • a micro report by Rosie Wells
Collapsing Narratives: Exploring Non-Linearity • a micro report by Rosie Wells
 
International Workshop on Artificial Intelligence in Software Testing
International Workshop on Artificial Intelligence in Software TestingInternational Workshop on Artificial Intelligence in Software Testing
International Workshop on Artificial Intelligence in Software Testing
 
Bitcoin Lightning wallet and tic-tac-toe game XOXO
Bitcoin Lightning wallet and tic-tac-toe game XOXOBitcoin Lightning wallet and tic-tac-toe game XOXO
Bitcoin Lightning wallet and tic-tac-toe game XOXO
 
somanykidsbutsofewfathers-140705000023-phpapp02.pptx
somanykidsbutsofewfathers-140705000023-phpapp02.pptxsomanykidsbutsofewfathers-140705000023-phpapp02.pptx
somanykidsbutsofewfathers-140705000023-phpapp02.pptx
 
Obesity causes and management and associated medical conditions
Obesity causes and management and associated medical conditionsObesity causes and management and associated medical conditions
Obesity causes and management and associated medical conditions
 
Announcement of 18th IEEE International Conference on Software Testing, Verif...
Announcement of 18th IEEE International Conference on Software Testing, Verif...Announcement of 18th IEEE International Conference on Software Testing, Verif...
Announcement of 18th IEEE International Conference on Software Testing, Verif...
 
Presentatie 8. Joost van der Linde & Daniel Anderton - Eliq 28 mei 2024
Presentatie 8. Joost van der Linde & Daniel Anderton - Eliq 28 mei 2024Presentatie 8. Joost van der Linde & Daniel Anderton - Eliq 28 mei 2024
Presentatie 8. Joost van der Linde & Daniel Anderton - Eliq 28 mei 2024
 
Bonzo subscription_hjjjjjjjj5hhhhhhh_2024.pdf
Bonzo subscription_hjjjjjjjj5hhhhhhh_2024.pdfBonzo subscription_hjjjjjjjj5hhhhhhh_2024.pdf
Bonzo subscription_hjjjjjjjj5hhhhhhh_2024.pdf
 
Supercharge your AI - SSP Industry Breakout Session 2024-v2_1.pdf
Supercharge your AI - SSP Industry Breakout Session 2024-v2_1.pdfSupercharge your AI - SSP Industry Breakout Session 2024-v2_1.pdf
Supercharge your AI - SSP Industry Breakout Session 2024-v2_1.pdf
 
Media as a Mind Controlling Strategy In Old and Modern Era
Media as a Mind Controlling Strategy In Old and Modern EraMedia as a Mind Controlling Strategy In Old and Modern Era
Media as a Mind Controlling Strategy In Old and Modern Era
 

When textual and visual information join forces for multimedia retrieval

  • 1. When Textual and Visual Information Join Forces for MultiMedia Retrieval Bahjat Safadi, Mathilde Sahuguet, Benoit Huet EURECOM, Multimedia Department Sophia Antipolis, France
  • 2. Introduction EU alone hosts 500+ online video platforms 42.7m hrs of footage in online archives of broadcasters and producers (61% of archive footage is online) UGC on the advance: YouTube receives 60 hrs of video/minute Vine and Instagram video Internet video is now 40 percent of consumer Internet traffic, and will reach 62 percent by the end of 2015, 75% in 2017 (source: CISCO) How to make the content accessible? Browsing, Searching, Hyperlinking B Huet - Eurecom - BAMMF - p 220/06/2014
  • 3. Objectives and Contributions We propose and evaluate a video search framework using visual information to enrich the classic text-based search for video retrieval operating at the fragment level. We investigate the following two questions: To which extent can visual concepts contribute information when retrieving videos? How can we cope with the confidence in visual concept detection? The framework extends conventional text-based search by fusing together textual and visual scores. We address both the semantic and intention gaps By automatically mapping the query text to semantic concepts. With the addition of “visual cues” 20/06/2014 B Huet - Eurecom - BAMMF - p 3
  • 4. MediaEval Search & Hyperlinking Information seeking in a video dataset: retrieving media fragments/anchors B Huet - Eurecom - BAMMF - p 420/06/2014
  • 5. The Video Archive 2323 BBC videos of different genres (440 programs) ~1697h of video + audio Subtitles (manual) Two ASR transcripts (LIMSI,LIUM) Metadata (Title, Cast, Description,..) Shot boundaries and key-frames Search: 50 queries from 29 users – Textual query + visual cues Face detection Concept detection B Huet - Eurecom - BAMMF - p 520/06/2014
  • 6. The Video Archive 2323 BBC videos of different genres (440 programs) ~1697h of video + audio Subtitles (manual) Two ASR transcripts (LIMSI,LIUM) Metadata (Title, Cast, Description,..) Shot boundaries and key-frames Search: 50 queries from 29 users – Textual query + visual cues Face detection Concept detection B Huet - Eurecom - BAMMF - p 620/06/2014 Text query: Medieval history of why castles were first built Visual cues: Castle Text query: Best players of all time; Embarrassing England performances; Wake up call for English football; Wembley massacre; Visual cues: Poor camera quality; heavy looking football; unusual goal celebrations; unusual crowd reactions; dark; grey; overcast; black and white;
  • 7. The proposed Framework B Huet - Eurecom - BAMMF - p 720/06/2014 Videos, scenes and subtitles Collection Scenes Concepts indexing scores Visual semantic concepts Content-based indexing Off-line On-line Textual/visual Query: Textual query Scenes + subtitles Text-based scores Lucene indexing User querying Visual-based scores ? Selected concepts Visual cues Ranking Ranked list Fusion
  • 8. The proposed Framework B Huet - Eurecom - BAMMF - p 820/06/2014 Scenes Concepts indexing scores Videos, scenes and subtitles Collection Visual semantic concepts Content-based indexing No training data for visual concepts Use 151 visual concept detectors trained on TrecVid 2012 data Unknown performance
  • 9. Visual concept detector confidence (w): For the 100 top images for the concept “Animal”, 58 out of 100 are manually evaluated as valid.
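The confidence estimate on this slide can be sketched as a precision over the manually judged top images. This is a minimal sketch, assuming w is simply the fraction of top-ranked images judged valid; the function name `detector_confidence` is hypothetical and not taken from the talk.

```python
def detector_confidence(judgements):
    """Return the fraction of judged images marked valid (precision at k).

    `judgements` is a sequence of booleans, one per top-ranked image.
    Assumption: w for a concept is this fraction; the slides only show
    the 58-out-of-100 example for "Animal".
    """
    if not judgements:
        raise ValueError("need at least one judgement")
    return sum(1 for ok in judgements if ok) / len(judgements)


# Slide 9 example: 58 of the 100 top images for "Animal" judged valid.
animal_judgements = [True] * 58 + [False] * 42
w_animal = detector_confidence(animal_judgements)
```

Under this assumption the “Animal” detector would get w = 0.58, which can then weight that concept's scores in the visual score.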
  • 10. The proposed Framework (user querying): Textual/visual query: <queryText>Children out on poetry trip Exploration of poetry by school children Poem writing</queryText> <visualCues>House memories Farm exploration A poem on animal and shells </visualCues> Users are not aware of visual concepts.
  • 11. Mapping visual cues to visual concepts: <queryText>Children out on poetry trip Exploration of poetry by school children Poem writing</queryText> <visualCues>House memories Farm exploration A poem on animal and shells </visualCues> Keywords: Farm, Shells, Exploration, Poem, Animal, House, Memories. Visual concepts: Animal, Birds, Insect, Cattle, Dogs, Building, School, Church, Flags, Mountain. Keywords are mapped to visual concepts using WordNet.
  • 12. Mapping visual cues to visual concepts: Concepts mapped to the visual query “Castle”. Semantic similarity (β) computed using the “Lin” distance.
      Concept   Windows   Plant    Court    Church   Building
      β         0.4533    0.4582   0.5115   0.6123   0.701
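Selecting the concepts to keep for a query can be sketched as a threshold on the similarity β. This is a minimal sketch that reuses the “Castle” values from slide 12 and assumes a concept is kept when β ≥ θ, with θ being the threshold reported on the later results slides; `select_concepts` is a hypothetical helper name.

```python
def select_concepts(similarities, theta):
    """Keep the concepts whose similarity to the query is at least theta.

    `similarities` maps concept name -> beta (similarity score).
    Assumption: selection is a plain threshold beta >= theta.
    """
    return {c: beta for c, beta in similarities.items() if beta >= theta}


# Similarity values for the visual query "Castle" (slide 12).
castle = {
    "Windows": 0.4533,
    "Plant": 0.4582,
    "Court": 0.5115,
    "Church": 0.6123,
    "Building": 0.701,
}

selected = select_concepts(castle, theta=0.5)
```

With θ = 0.5 this keeps Court, Church and Building. With θ = 0.8 nothing survives for this query, which matches the later observation that many queries have no associated concept at high thresholds.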
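  • 13. The proposed Framework (scoring and fusion): Lucene indexing gives one text-based score $t_i$ for each scene. One visual-based score $v_i^q$ for each scene is computed from the scores of the concepts selected through WordNet similarity: $v_i^q = \sum_{c \in C'_q} w_c \times vs_i^c$, where $C'_q$ is the set of concepts selected for query $q$, $w_c$ is the confidence of the detector for concept $c$, and $vs_i^c$ is the score of concept $c$ for scene $i$. Fusion and ranking use $f_i = t_i^{\alpha} + v_i^{1-\alpha}$.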
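The scoring on this slide can be sketched in a few lines, reading the two formulas as v_i^q = Σ_{c ∈ C'_q} w_c × vs_i^c and f_i = t_i^α + v_i^(1−α). This is a minimal sketch under that reading; the function names, the example scores and the choice α = 0.5 are illustrative assumptions, not values from the talk.

```python
def visual_score(concept_scores, weights):
    """v_i^q: weighted sum of a scene's scores over the selected concepts.

    `concept_scores` maps concept -> detector score vs_i^c for one scene.
    `weights` maps each selected concept c in C'_q -> confidence w_c.
    A concept missing from the scene counts as 0 (an assumption).
    """
    return sum(w * concept_scores.get(c, 0.0) for c, w in weights.items())


def fused_score(t, v, alpha):
    """f_i = t_i^alpha + v_i^(1 - alpha), with alpha in [0, 1]."""
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must be in [0, 1]")
    return t ** alpha + v ** (1.0 - alpha)


def rank_scenes(text_scores, visual_scores, alpha):
    """Return scene ids sorted by fused score, best first."""
    fused = {
        scene: fused_score(t, visual_scores.get(scene, 0.0), alpha)
        for scene, t in text_scores.items()
    }
    return sorted(fused, key=fused.get, reverse=True)


# Hypothetical scores for two scenes and two selected concepts.
weights = {"Building": 0.5, "Church": 0.5}
v_a = visual_score({"Building": 0.8, "Church": 0.48}, weights)
ranking = rank_scenes(
    text_scores={"scene_a": 0.25, "scene_b": 0.36},
    visual_scores={"scene_a": v_a, "scene_b": 0.01},
    alpha=0.5,
)
```

Here scene_a has the weaker text score but the stronger visual evidence, so it ends up ranked first. The parameter α controls how the two modalities are balanced.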
  • 14. Evaluation: To what extent can visual concepts contribute information when retrieving videos? How can we cope with the confidence in visual concept detection? Dataset: the BBC archive subset provided by the MediaEval 2013 Search and Hyperlinking task. Evaluation measures: Mean Reciprocal Rank (MRR), which assesses the rank of the relevant segment; mean Generalized Average Precision (mGAP), which takes into account the starting time of the segment; Mean Average Segment Precision (MASP), which measures both ranking and segmentation of relevant segments.
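Of the three measures, MRR is the simplest to illustrate. This is a minimal sketch of the textbook definition, assuming one relevant segment per query and a rank of `None` when it is not retrieved; the official MediaEval scoring also applies a time window to decide relevance, which is not modelled here.

```python
def mean_reciprocal_rank(ranks):
    """Average of 1/rank over all queries.

    `ranks` holds, per query, the 1-based rank of the relevant segment,
    or None if it was not retrieved (contributing 0 to the mean).
    """
    if not ranks:
        raise ValueError("need at least one query")
    return sum(1.0 / r for r in ranks if r is not None) / len(ranks)


# Four hypothetical queries: found at ranks 1, 2 and 4, one miss.
mrr = mean_reciprocal_rank([1, 2, None, 4])
```

mGAP and MASP extend this idea by also rewarding how close the returned segment boundaries are to the relevant segment.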
  • 15. Retrieval Performance (50 queries) [chart comparing w=1.0 and w=confidence(c)]: Low impact of visual concept detector confidence (w). Significant improvement can be achieved by combining only mapped concepts with θ ≥ 0.3. Best performance is obtained when θ ≥ 0.8 (gain ≈ 11-12%).
  • 16. Visual concepts and Query association: The number of concepts associated with queries at different thresholds θ.
      θ            0.0  0.1  0.2  0.3  0.4  0.5  0.6  0.7  0.8  0.9
      Min            5    5    5    2    0    0    0    0    0    0
      Max           45   45   41   37   25   19   19   12    6    2
      Mean          20   19   18   15   11    7    5    3    1    1
      #Q(#c’q>0)    50   50   50   50   49   49   48   44   29   21
  • 17. Retrieval on queries with visual concepts (21) [chart comparing w=1.0 and w=confidence(c)]: Concept mapping significantly improves the performance of the text-based search task on these queries. The best performance was achieved with θ ≥ 0.7 (gain ≈ 32-33%).
  • 18. Conclusion: A novel video search framework using visual information to enrich text-based search for video retrieval has been presented. We conducted our evaluations on MediaEval 2013, where we achieved the 2nd best result on Search and the 1st on Hyperlinking. Experimental results show that mapping text-based queries to visual concepts significantly improves the search system. When the relevant visual concepts are appropriately selected, a very significant improvement is achieved (gain ≈ 33%).
  • 19. Related Publications:
      B. Safadi, M. Sahuguet and B. Huet. When textual and visual information join forces for multimedia retrieval. ICMR 2014, ACM International Conference on Multimedia Retrieval, April 1-4, 2014, Glasgow, Scotland.
      M. Sahuguet and B. Huet. Mining the Web for Multimedia-based Enriching. MMM 2014, 20th International Conference on MultiMedia Modeling, January 8-10, 2014, Dublin, Ireland.
      M. Sahuguet, B. Huet, B. Cervenkova, E. Apostolidis, V. Mezaris, D. Stein, S. Eickeler, J-L. Redondo Garcia, R. Troncy and L. Pikora. LinkedTV at MediaEval 2013 search and hyperlinking task. MediaEval 2013, Multimedia Benchmark Workshop, October 18-19, 2013, Barcelona, Spain.
      D. Stein, A. Öktem, E. Apostolidis, V. Mezaris, J. L. Redondo García, R. Troncy, M. Sahuguet and B. Huet. From raw data to semantically enriched hyperlinking: Recent advances in the LinkedTV analysis workflow. NEM Summit 2013, Networked & Electronic Media, October 28-30, 2013, Nantes, France.
      V. Mezaris and B. Huet. Video Hyperlinking. Tutorial accepted at ICIP 2014 (October), Paris.
      B. Safadi, M. Sahuguet and B. Huet. Linking text and visual concepts semantically for cross modal multimedia search. ICIP 2014, Paris.
  • 20. Questions? http://www.slideshare.net/huetbenoit/ Thank you. When Textual and Visual Information Join Forces for MultiMedia Retrieval, Benoit Huet