ETH-CVL @ MediaEval 2016: Textual-Visual
Embeddings and Video2GIF for Video Interestingness
Michael Gygli, PhD student @ Computer Vision Lab, ETH Zurich
Work with Arun Balajee Vasudevan, Anna Volokitin and Luc Van Gool
Content
▪ Overview
▪ Textual-visual embedding
▪ Video2GIF
▪ Results
Michael Gygli, PhD student @ ETH Zurich, 06/21/2016
Finding the most interesting and relevant content
▪ Two dominant approaches
▪ Use of textual information (title, category) to obtain a video-specific
model, often using web image priors
E.g. [Khosla et al. CVPR 13], [Liu et al. IJCAI 09], [Song et al. CVPR 15]
▪ Supervised learning with generic features and large training set
E.g. [Potapov et al. ECCV 2014], [Sun et al. ECCV 2014], [Gygli et al.
CVPR 16]
▪ Some methods use both
E.g. [Liu et al. CVPR 15]
Textual-visual embedding
Multi-task deep visual-semantic embedding for video
thumbnail selection [Liu et al. CVPR 15]
▪ Use Bing image search data (query, image, # of clicks) to learn a
joint embedding space for images and text
▪ Compute frame relevance as cosine similarity between the query or
title embedding and the frame embedding
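As a minimal sketch of the relevance computation described above, the following scores each frame by its cosine similarity to a text embedding and ranks the frames. The function names and the toy 3-dimensional vectors are assumptions for illustration; real embeddings would come from the trained text and image models.

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # convention chosen here for zero vectors (assumption)
    return dot / (norm_a * norm_b)

def rank_frames(text_embedding, frame_embeddings):
    """Return (frame indices from most to least relevant, per-frame scores)."""
    scores = [cosine_similarity(text_embedding, f) for f in frame_embeddings]
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return order, scores

# Toy embeddings with made-up values, not produced by any model.
title = [1.0, 0.0, 1.0]
frames = [[0.0, 1.0, 0.0], [1.0, 0.0, 1.0], [1.0, 1.0, 0.0]]
order, scores = rank_frames(title, frames)
print(order)  # frame indices, most relevant first
```

Cosine similarity ignores vector magnitude, so only the direction of the embeddings affects the ranking.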
Our contribution with improvements over Liu et al.
● Siamese network
● Text Embedding Model
○ Words encoded through word2vec
○ LSTM to obtain fixed length embedding
● Convolutional Neural Network model
○ Fine-tuned VGG-19
● Training Data based on learning a ranking of: (query+, image+, image-),
e.g. (“cat”, [positive image], [negative image])
Video2GIF
Video2GIF: Automatic Generation of Animated GIFs
from Video [Gygli et al. CVPR 16]
Approach
▪ Work with segments as units
▪ Obtained through change-point detection [Song et al. CVPR 15]
▪ Train a deep neural network for ranking segments
...
[Figure: example video with its highest-scoring and lowest-scoring segments]
Video2GIF: Automatic Generation of Animated GIFs
from Video
Approach
▪ Train a deep neural network for ranking
segments
▪ Built on C3D network [Tran et al. ICCV
2015]
▪ Objective: score positives higher than
negatives, i.e. h(s+) > h(s-)
h: scoring function
s+: positive segment
s-: negative segment
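One common way to turn the objective h(s+) > h(s-) into a trainable loss is a pairwise hinge loss, sketched below. The slide does not state which loss was used, so the hinge form, the margin of 1.0 and the `toy_h` scoring function are all assumptions for illustration; in the actual system h is a deep network built on C3D.

```python
def pairwise_hinge_loss(score_pos, score_neg, margin=1.0):
    """Zero when the positive outscores the negative by at least `margin`."""
    return max(0.0, margin - score_pos + score_neg)

def mean_ranking_loss(h, pairs, margin=1.0):
    """Average hinge loss of scoring function h over (s+, s-) pairs."""
    losses = [pairwise_hinge_loss(h(s_pos), h(s_neg), margin)
              for s_pos, s_neg in pairs]
    return sum(losses) / len(losses)

def toy_h(segment):
    """Hypothetical stand-in for the network: mean of a feature list."""
    return sum(segment) / len(segment)

# Each pair is (positive segment features, negative segment features).
pairs = [([2.0, 4.0], [0.0, 1.0]),   # correctly ordered with margin: loss 0
         ([1.0, 1.0], [1.0, 2.0])]   # negative scores higher: loss 1.5
loss = mean_ranking_loss(toy_h, pairs)
print(loss)
```

Only pairs that violate the margin contribute to the loss, so training focuses on segments the model currently ranks in the wrong order.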
Video2GIF: Dataset
Available on github.com/gyglim/video2gif_dataset
▪ Large-scale training data: GIFs created from YouTube videos
▪ Align GIF back to video
▪ This part defines a positive example
▪ Assume non-selected parts are less interesting than selected part
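The labeling idea above can be sketched as follows: segments that mostly overlap the aligned GIF become positives, segments with no overlap become negatives, and every positive is paired with every negative. The interval representation, the 0.5 overlap threshold and the choice to skip partially overlapping segments are assumptions made for this sketch, not details taken from the dataset.

```python
def overlap(a, b):
    """Length in seconds of the intersection of two (start, end) intervals."""
    return max(0.0, min(a[1], b[1]) - max(a[0], b[0]))

def make_pairs(segments, gif_interval, min_overlap_ratio=0.5):
    """Build (positive, negative) segment pairs for one video."""
    positives, negatives = [], []
    for seg in segments:
        length = seg[1] - seg[0]
        ratio = overlap(seg, gif_interval) / length if length > 0 else 0.0
        if ratio >= min_overlap_ratio:
            positives.append(seg)   # segment was (mostly) selected for the GIF
        elif ratio == 0.0:
            negatives.append(seg)   # segment was not selected at all
        # Partially overlapping segments are skipped as ambiguous (assumption).
    return [(p, n) for p in positives for n in negatives]

# Hypothetical segment boundaries and GIF alignment, in seconds.
segments = [(0.0, 4.0), (4.0, 9.0), (9.0, 12.0), (12.0, 20.0)]
gif = (5.0, 10.0)
pairs = make_pairs(segments, gif)
print(pairs)
```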
Results
Task   Run  mAP
Image  1    0.1866
Image  2    0.1952
Image  3    0.1858
Video  1    0.1362
Video  2    0.1574
Frame-based:
• Run 1: Visual-semantic embedding trained on Clickture dataset
• Run 2: As Run 1, with fine-tuning on development set
• Run 3: As Run 1, but trained on a larger subset of Clickture
Segment-based:
• Run 1: Video2GIF
• Run 2: Averaged score of Visual-semantic embedding and Video2GIF
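The score averaging used for the segment-based Run 2 can be sketched as a simple late fusion. The slide does not say whether the two models' scores were normalized before averaging, so this sketch takes a plain mean; the function name and the score values are made up for illustration.

```python
def average_scores(scores_a, scores_b):
    """Element-wise mean of two score lists for the same segments."""
    if len(scores_a) != len(scores_b):
        raise ValueError("both models must score the same segments")
    return [(a + b) / 2.0 for a, b in zip(scores_a, scores_b)]

# Hypothetical per-segment scores from the two models.
embedding_scores = [0.2, 0.8, 0.4]
video2gif_scores = [0.6, 0.4, 0.0]

fused = average_scores(embedding_scores, video2gif_scores)
ranking = sorted(range(len(fused)), key=lambda i: fused[i], reverse=True)
print(ranking)  # segment indices, most interesting first
```

If the two models produce scores on different scales, a plain mean lets the larger-scale model dominate, which is why normalizing first is a common design choice.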
Qualitative Results - Run 2
Title: Captives
[Figure: predicted best frame next to true best frame]
Title: After Earth
[Figure: predicted best frame next to true best frame]
More information
▪ Paper Video2GIF: Automatic Generation of Animated GIFs from Video.
M. Gygli, Y. Song, L. Cao, CVPR 2016
▪ Slides on Video Summarization as Subset Selection @ Tutorial on
Optimization Algorithms for Subset Selection and Summarization in Large
Data Sets, CVPR 2016
https://t.co/mQIpxMab3v
▪ Demo website for Video2GIF: http://video2gif.info/autogif
▪ Paper Multi-task deep visual-semantic embedding for video thumbnail
selection. W. Liu, T. Mei, Y. Zhang, C. Che, and J. Luo, CVPR 2015
Questions?
