SlideShare a Scribd company logo
Prof. Laura Leal-Taixé
Dynamic Scene
Understanding
Dynamic Scene Understanding
Understand every pixel of a video
Dynamic Scene Understanding
Understand every pixel of a video
road
tree
Semantic
segmentation
person
car
Dynamic Scene Understanding
Understand every pixel of a video
person 1
tree
Semantic
segmentation
person 2
person 3
Instance-
based
segmentation
road
car
Dynamic Scene Understanding
Understand every pixel of a video
Semantic
segmentation
Instance-
based
segmentation
Multiple
object
tracking
Prof. Laura Leal-Taixé
Video Object
Segmentation without
Temporal Information
Input image First mask
Supervised video object segmentation
Architecture
• Series of convolutional filters
• Upsampling at different scales
• Final summation before prediction à get information from all scales
Boundary Snapping
Snap the foreground mask to accurate contours
3
Foreground Branch
Specific object - Less accurate contours
1
Contour Branch
Accurate contours - Generic objects
2
Architecture: improving boundaries
Boundary Snapping
Snap the foreground mask to accurate contours
3
Foreground Branch
Specific object - Less accurate contours
1
Contour Branch
Accurate contours - Generic objects
2
Architecture: improving boundaries
Boundary Snapping
Snap the foreground mask to accurate contours
3
Foreground Branch
Specific object - Less accurate contours
1
Contour Branch
Accurate contours - Generic objects
2
Architecture: improving boundaries
Select the
superpixels that
are 50% or more
covered by the
foreground mask
Boundary Snapping
Snap the foreground mask to accurate contours
3
Foreground Branch
Specific object - Less accurate contours
1
Contour Branch
Accurate contours - Generic objects
2
ResultsonframeN
oftestsequence
Base Network
Pre-trained on ImageNet
1
Parent Network
Trained on DAVIS training set
2
Test Network
Fine-tuned on frame 1 of test sequence
3
ResultsonframeN
oftestsequence
Base Network
Pre-trained on ImageNet
1
Parent Network
Trained on DAVIS training set
2
Test Network
Fine-tuned on frame 1 of test sequence
3
ResultsonframeN
oftestsequence
Base Network
Pre-trained on ImageNet
1
Parent Network
Trained on DAVIS training set
2
Test Network
Fine-tuned on frame 1 of test sequence
3
Edges and basic
image features
Learns how to
do video
segmentation
Learns which
object to
segment
One-Shot Video Object Segmentation
S. Caelles, K.-K. Maninis, J. Pont-Tuset, L. Leal-Taixé, D. Cremers, L. van Gool. CVPR 2017
FinetuningPre-trained Training
ResultsonframeN
oftestsequence
Base Network
Pre-trained on ImageNet
1
Parent Network
Trained on DAVIS training set
2
Test Network
Fine-tuned on frame 1 of test sequence
3
Learns which
object to
segment
One-Shot Video Object Segmentation
• Learning the appearance of the
foreground and background objects
Finetuning
102ms – Parent network
Finetuning time
DAVIS dataset
11.8 pp.
Experiments: heavy occlusions
Occlusion 1
Occlusion 2
First mask (input)
Experiments: number of annotations
Experiments: number of annotations
Experiments: highly dynamic scenes
Introducing Semantics
• The network does not have the notion of objects
• Broken contours are often seen, especially when occluded parts
becomes visible
Introducing Semantics
Introducing Semantics
Semantic propagation
• OSVOS path: obtain a coarse foreground estimation
Semantic
Instance
Segmentation
Result
Top Matching InstancesInstance Proposals
Input Image
First-Round
Foreground
Estimation
Conditional
Classifier
Semantic
Selection &
Propagation
Semantic Prior
Foreground Estimation
CNN
Appearance Model
Semantic propagation
• Semantic prior branch that gives us proposals to select from
Semantic
Instance
Segmentation
Result
Top Matching InstancesInstance Proposals
Input Image
First-Round
Foreground
Estimation
Conditional
Classifier
Semantic
Selection &
Propagation
Semantic Prior
Foreground Estimation
CNN
Appearance Model
Semantic propagation
• Semantic prior branch that gives us proposals to select from
• Enforce that semantics stay coherent throughout the sequence
Semantic
Instance
Segmentation
Result
Top Matching InstancesInstance Proposals
Input Image
First-Round
Foreground
Estimation
Conditional
Classifier
Semantic
Selection &
Propagation
Semantic Prior
Foreground Estimation
CNN
Appearance Model
Semantic propagation
Semantic Selection
Selected Instances:
Person and Motorbike
Ground Truth
Instance Segmentation Proposals
Semantic Propagation
InstanceSegmentation
Proposals
First-Round
ForegroundEstimation
TopPersonand
Motorbike
Frame 0 Frame 18 Frame 24 Frame 30 Frame 36
Semantic propagation
Semantic Selection
Selected Instances:
Person and Motorbike
Ground Truth
Instance Segmentation Proposals
Semantic Propagation
InstanceSegmentation
Proposals
First-Round
ForegroundEstimation
TopPersonand
Motorbike
Frame 0 Frame 18 Frame 24 Frame 30 Frame 36
Semantic propagation
Semantic Selection
Selected Instances:
Person and Motorbike
Ground Truth
Instance Segmentation Proposals
Semantic Propagation
InstanceSegmentation
Proposals
First-Round
ForegroundEstimation
TopPersonand
Motorbike
Frame 0 Frame 18 Frame 24 Frame 30 Frame 36
Semantic propagation
Semantic Selection
Selected Instances:
Person and Motorbike
Ground Truth
Instance Segmentation Proposals
Semantic Propagation
InstanceSegmentation
Proposals
First-Round
ForegroundEstimation
TopPersonand
Motorbike
Frame 0 Frame 18 Frame 24 Frame 30 Frame 36
Semantic propagation
Semantic Selection
Selected Instances:
Person and Motorbike
Ground Truth
Instance Segmentation Proposals
Semantic Propagation
InstanceSegmentation
Proposals
First-Round
ForegroundEstimation
TopPersonand
Motorbike
Frame 0 Frame 18 Frame 24 Frame 30 Frame 36
Ablation studies on DAVIS 2016
Finetuning on the first mask
Ablation studies on DAVIS 2016
Pre-training the parent network
Ablation studies on DAVIS 2016
Adding semantic information
Ablation studies on DAVIS 2016
Adding semantic information
Ablation studies on DAVIS 2016
Brings a substantial boost in temporal stability
Conclusions
• We can do consistent video object segmentation without the need
of temporal information
• Pose changes show parts of the object the network has not seen
and therefore holes in the result are created
• Semantic priors help in delivering consistent object-type
segmentations
Thank you
Dr. Laura Leal-Taixé

More Related Content

What's hot

Object Detection Methods using Deep Learning
Object Detection Methods using Deep LearningObject Detection Methods using Deep Learning
Object Detection Methods using Deep Learning
Sungjoon Choi
 
Semantic Segmentation - Míriam Bellver - UPC Barcelona 2018
Semantic Segmentation - Míriam Bellver - UPC Barcelona 2018Semantic Segmentation - Míriam Bellver - UPC Barcelona 2018
Semantic Segmentation - Míriam Bellver - UPC Barcelona 2018
Universitat Politècnica de Catalunya
 
Recurrent Instance Segmentation (UPC Reading Group)
Recurrent Instance Segmentation (UPC Reading Group)Recurrent Instance Segmentation (UPC Reading Group)
Recurrent Instance Segmentation (UPC Reading Group)
Universitat Politècnica de Catalunya
 
D1L5 Visualization (D1L2 Insight@DCU Machine Learning Workshop 2017)
D1L5 Visualization (D1L2 Insight@DCU Machine Learning Workshop 2017)D1L5 Visualization (D1L2 Insight@DCU Machine Learning Workshop 2017)
D1L5 Visualization (D1L2 Insight@DCU Machine Learning Workshop 2017)
Universitat Politècnica de Catalunya
 
Object Detection Using R-CNN Deep Learning Framework
Object Detection Using R-CNN Deep Learning FrameworkObject Detection Using R-CNN Deep Learning Framework
Object Detection Using R-CNN Deep Learning Framework
Nader Karimi
 
Object Segmentation (D2L7 Insight@DCU Machine Learning Workshop 2017)
Object Segmentation (D2L7 Insight@DCU Machine Learning Workshop 2017)Object Segmentation (D2L7 Insight@DCU Machine Learning Workshop 2017)
Object Segmentation (D2L7 Insight@DCU Machine Learning Workshop 2017)
Universitat Politècnica de Catalunya
 
Visual Saliency Prediction with Deep Learning - Kevin McGuinness - UPC Barcel...
Visual Saliency Prediction with Deep Learning - Kevin McGuinness - UPC Barcel...Visual Saliency Prediction with Deep Learning - Kevin McGuinness - UPC Barcel...
Visual Saliency Prediction with Deep Learning - Kevin McGuinness - UPC Barcel...
Universitat Politècnica de Catalunya
 
How much position information do convolutional neural networks encode? review...
How much position information do convolutional neural networks encode? review...How much position information do convolutional neural networks encode? review...
How much position information do convolutional neural networks encode? review...
Dongmin Choi
 
Recent Progress on Object Detection_20170331
Recent Progress on Object Detection_20170331Recent Progress on Object Detection_20170331
Recent Progress on Object Detection_20170331
Jihong Kang
 
Object detection - RCNNs vs Retinanet
Object detection - RCNNs vs RetinanetObject detection - RCNNs vs Retinanet
Object detection - RCNNs vs Retinanet
Rishabh Indoria
 
Content-based Image Retrieval - Eva Mohedano - UPC Barcelona 2018
Content-based Image Retrieval - Eva Mohedano - UPC Barcelona 2018Content-based Image Retrieval - Eva Mohedano - UPC Barcelona 2018
Content-based Image Retrieval - Eva Mohedano - UPC Barcelona 2018
Universitat Politècnica de Catalunya
 
Deep Generative Models - Kevin McGuinness - UPC Barcelona 2018
Deep Generative Models - Kevin McGuinness - UPC Barcelona 2018Deep Generative Models - Kevin McGuinness - UPC Barcelona 2018
Deep Generative Models - Kevin McGuinness - UPC Barcelona 2018
Universitat Politècnica de Catalunya
 
Deep Learning for Computer Vision: Segmentation (UPC 2016)
Deep Learning for Computer Vision: Segmentation (UPC 2016)Deep Learning for Computer Vision: Segmentation (UPC 2016)
Deep Learning for Computer Vision: Segmentation (UPC 2016)
Universitat Politècnica de Catalunya
 
Auro tripathy - Localizing with CNNs
Auro tripathy -  Localizing with CNNsAuro tripathy -  Localizing with CNNs
Auro tripathy - Localizing with CNNs
Auro Tripathy
 
Medical Imaging at DCU - Kevin McGuinness - UPC Barcelona 2018
Medical Imaging at DCU - Kevin McGuinness - UPC Barcelona 2018Medical Imaging at DCU - Kevin McGuinness - UPC Barcelona 2018
Medical Imaging at DCU - Kevin McGuinness - UPC Barcelona 2018
Universitat Politècnica de Catalunya
 
Introduction to object detection
Introduction to object detectionIntroduction to object detection
Introduction to object detection
Brodmann17
 
ViT (Vision Transformer) Review [CDM]
ViT (Vision Transformer) Review [CDM]ViT (Vision Transformer) Review [CDM]
ViT (Vision Transformer) Review [CDM]
Dongmin Choi
 
Shai Avidan's Support vector tracking and ensemble tracking
Shai Avidan's Support vector tracking and ensemble trackingShai Avidan's Support vector tracking and ensemble tracking
Shai Avidan's Support vector tracking and ensemble tracking
wolf
 
Fcv rep darrell
Fcv rep darrellFcv rep darrell
Fcv rep darrellzukun
 
#6 PyData Warsaw: Deep learning for image segmentation
#6 PyData Warsaw: Deep learning for image segmentation#6 PyData Warsaw: Deep learning for image segmentation
#6 PyData Warsaw: Deep learning for image segmentation
Matthew Opala
 

What's hot (20)

Object Detection Methods using Deep Learning
Object Detection Methods using Deep LearningObject Detection Methods using Deep Learning
Object Detection Methods using Deep Learning
 
Semantic Segmentation - Míriam Bellver - UPC Barcelona 2018
Semantic Segmentation - Míriam Bellver - UPC Barcelona 2018Semantic Segmentation - Míriam Bellver - UPC Barcelona 2018
Semantic Segmentation - Míriam Bellver - UPC Barcelona 2018
 
Recurrent Instance Segmentation (UPC Reading Group)
Recurrent Instance Segmentation (UPC Reading Group)Recurrent Instance Segmentation (UPC Reading Group)
Recurrent Instance Segmentation (UPC Reading Group)
 
D1L5 Visualization (D1L2 Insight@DCU Machine Learning Workshop 2017)
D1L5 Visualization (D1L2 Insight@DCU Machine Learning Workshop 2017)D1L5 Visualization (D1L2 Insight@DCU Machine Learning Workshop 2017)
D1L5 Visualization (D1L2 Insight@DCU Machine Learning Workshop 2017)
 
Object Detection Using R-CNN Deep Learning Framework
Object Detection Using R-CNN Deep Learning FrameworkObject Detection Using R-CNN Deep Learning Framework
Object Detection Using R-CNN Deep Learning Framework
 
Object Segmentation (D2L7 Insight@DCU Machine Learning Workshop 2017)
Object Segmentation (D2L7 Insight@DCU Machine Learning Workshop 2017)Object Segmentation (D2L7 Insight@DCU Machine Learning Workshop 2017)
Object Segmentation (D2L7 Insight@DCU Machine Learning Workshop 2017)
 
Visual Saliency Prediction with Deep Learning - Kevin McGuinness - UPC Barcel...
Visual Saliency Prediction with Deep Learning - Kevin McGuinness - UPC Barcel...Visual Saliency Prediction with Deep Learning - Kevin McGuinness - UPC Barcel...
Visual Saliency Prediction with Deep Learning - Kevin McGuinness - UPC Barcel...
 
How much position information do convolutional neural networks encode? review...
How much position information do convolutional neural networks encode? review...How much position information do convolutional neural networks encode? review...
How much position information do convolutional neural networks encode? review...
 
Recent Progress on Object Detection_20170331
Recent Progress on Object Detection_20170331Recent Progress on Object Detection_20170331
Recent Progress on Object Detection_20170331
 
Object detection - RCNNs vs Retinanet
Object detection - RCNNs vs RetinanetObject detection - RCNNs vs Retinanet
Object detection - RCNNs vs Retinanet
 
Content-based Image Retrieval - Eva Mohedano - UPC Barcelona 2018
Content-based Image Retrieval - Eva Mohedano - UPC Barcelona 2018Content-based Image Retrieval - Eva Mohedano - UPC Barcelona 2018
Content-based Image Retrieval - Eva Mohedano - UPC Barcelona 2018
 
Deep Generative Models - Kevin McGuinness - UPC Barcelona 2018
Deep Generative Models - Kevin McGuinness - UPC Barcelona 2018Deep Generative Models - Kevin McGuinness - UPC Barcelona 2018
Deep Generative Models - Kevin McGuinness - UPC Barcelona 2018
 
Deep Learning for Computer Vision: Segmentation (UPC 2016)
Deep Learning for Computer Vision: Segmentation (UPC 2016)Deep Learning for Computer Vision: Segmentation (UPC 2016)
Deep Learning for Computer Vision: Segmentation (UPC 2016)
 
Auro tripathy - Localizing with CNNs
Auro tripathy -  Localizing with CNNsAuro tripathy -  Localizing with CNNs
Auro tripathy - Localizing with CNNs
 
Medical Imaging at DCU - Kevin McGuinness - UPC Barcelona 2018
Medical Imaging at DCU - Kevin McGuinness - UPC Barcelona 2018Medical Imaging at DCU - Kevin McGuinness - UPC Barcelona 2018
Medical Imaging at DCU - Kevin McGuinness - UPC Barcelona 2018
 
Introduction to object detection
Introduction to object detectionIntroduction to object detection
Introduction to object detection
 
ViT (Vision Transformer) Review [CDM]
ViT (Vision Transformer) Review [CDM]ViT (Vision Transformer) Review [CDM]
ViT (Vision Transformer) Review [CDM]
 
Shai Avidan's Support vector tracking and ensemble tracking
Shai Avidan's Support vector tracking and ensemble trackingShai Avidan's Support vector tracking and ensemble tracking
Shai Avidan's Support vector tracking and ensemble tracking
 
Fcv rep darrell
Fcv rep darrellFcv rep darrell
Fcv rep darrell
 
#6 PyData Warsaw: Deep learning for image segmentation
#6 PyData Warsaw: Deep learning for image segmentation#6 PyData Warsaw: Deep learning for image segmentation
#6 PyData Warsaw: Deep learning for image segmentation
 

Similar to Video Object Segmentation - Laura Leal-Taixé - UPC Barcelona 2018

Machine learning for newbies
Machine learning for newbiesMachine learning for newbies
Machine learning for newbies
Andrew Nikishaev
 
Object Tracking with Instance Matching and Online Learning
Object Tracking with Instance Matching and Online LearningObject Tracking with Instance Matching and Online Learning
Object Tracking with Instance Matching and Online Learning
Jui-Hsin (Larry) Lai
 
Surveillance scene classification using machine learning
Surveillance scene classification using machine learningSurveillance scene classification using machine learning
Surveillance scene classification using machine learning
Utkarsh Contractor
 
Recognition and tracking moving objects using moving camera in complex scenes
Recognition and tracking moving objects using moving camera in complex scenesRecognition and tracking moving objects using moving camera in complex scenes
Recognition and tracking moving objects using moving camera in complex scenes
IJCSEA Journal
 
CMU Trecvid sed11
CMU Trecvid sed11CMU Trecvid sed11
CMU Trecvid sed11Lu Jiang
 
Deep-learning-for-computer-vision-applications-using-matlab.pdf
Deep-learning-for-computer-vision-applications-using-matlab.pdfDeep-learning-for-computer-vision-applications-using-matlab.pdf
Deep-learning-for-computer-vision-applications-using-matlab.pdf
AubainYro1
 
Deep Learning in Computer Vision
Deep Learning in Computer VisionDeep Learning in Computer Vision
Deep Learning in Computer Vision
Sungjoon Choi
 
D3L4-objects.pdf
D3L4-objects.pdfD3L4-objects.pdf
D3L4-objects.pdf
ssusere945ae
 
深度學習在AOI的應用
深度學習在AOI的應用深度學習在AOI的應用
深度學習在AOI的應用
CHENHuiMei
 
Deep learning fundamental and Research project on IBM POWER9 system from NUS
Deep learning fundamental and Research project on IBM POWER9 system from NUSDeep learning fundamental and Research project on IBM POWER9 system from NUS
Deep learning fundamental and Research project on IBM POWER9 system from NUS
Ganesan Narayanasamy
 
Online video object segmentation via convolutional trident network
Online video object segmentation via convolutional trident networkOnline video object segmentation via convolutional trident network
Online video object segmentation via convolutional trident network
NAVER Engineering
 
On-the-fly Visual Category Search in Web-scale Image Collections
On-the-fly Visual Category Search in Web-scale Image CollectionsOn-the-fly Visual Category Search in Web-scale Image Collections
On-the-fly Visual Category Search in Web-scale Image Collections
Ken Chatfield
 
Image segmentation with deep learning
Image segmentation with deep learningImage segmentation with deep learning
Image segmentation with deep learning
Antonio Rueda-Toicen
 
Talk 2011-buet-perception-event
Talk 2011-buet-perception-eventTalk 2011-buet-perception-event
Talk 2011-buet-perception-eventMahfuzul Haque
 
Maxim Kamensky - Applying image matching algorithms to video recognition and ...
Maxim Kamensky - Applying image matching algorithms to video recognition and ...Maxim Kamensky - Applying image matching algorithms to video recognition and ...
Maxim Kamensky - Applying image matching algorithms to video recognition and ...
Eastern European Computer Vision Conference
 
Training Drone Image Models with Grand Theft Auto
Training Drone Image Models with Grand Theft AutoTraining Drone Image Models with Grand Theft Auto
Training Drone Image Models with Grand Theft Auto
Commonwealth Computer Research, Inc.
 
Ee 417 Senior Design
Ee 417 Senior DesignEe 417 Senior Design
Ee 417 Senior Design
crouchj1
 
Anomaly Detection with Azure and .NET
Anomaly Detection with Azure and .NETAnomaly Detection with Azure and .NET
Anomaly Detection with Azure and .NET
Marco Parenzan
 
Deep learning unsupervised learning diapo
Deep learning unsupervised learning diapoDeep learning unsupervised learning diapo
Deep learning unsupervised learning diapo
Milton Paja
 

Similar to Video Object Segmentation - Laura Leal-Taixé - UPC Barcelona 2018 (20)

Machine learning for newbies
Machine learning for newbiesMachine learning for newbies
Machine learning for newbies
 
Object Tracking with Instance Matching and Online Learning
Object Tracking with Instance Matching and Online LearningObject Tracking with Instance Matching and Online Learning
Object Tracking with Instance Matching and Online Learning
 
Surveillance scene classification using machine learning
Surveillance scene classification using machine learningSurveillance scene classification using machine learning
Surveillance scene classification using machine learning
 
Recognition and tracking moving objects using moving camera in complex scenes
Recognition and tracking moving objects using moving camera in complex scenesRecognition and tracking moving objects using moving camera in complex scenes
Recognition and tracking moving objects using moving camera in complex scenes
 
CMU Trecvid sed11
CMU Trecvid sed11CMU Trecvid sed11
CMU Trecvid sed11
 
Deep-learning-for-computer-vision-applications-using-matlab.pdf
Deep-learning-for-computer-vision-applications-using-matlab.pdfDeep-learning-for-computer-vision-applications-using-matlab.pdf
Deep-learning-for-computer-vision-applications-using-matlab.pdf
 
Deep Learning in Computer Vision
Deep Learning in Computer VisionDeep Learning in Computer Vision
Deep Learning in Computer Vision
 
D3L4-objects.pdf
D3L4-objects.pdfD3L4-objects.pdf
D3L4-objects.pdf
 
深度學習在AOI的應用
深度學習在AOI的應用深度學習在AOI的應用
深度學習在AOI的應用
 
Deep learning fundamental and Research project on IBM POWER9 system from NUS
Deep learning fundamental and Research project on IBM POWER9 system from NUSDeep learning fundamental and Research project on IBM POWER9 system from NUS
Deep learning fundamental and Research project on IBM POWER9 system from NUS
 
Online video object segmentation via convolutional trident network
Online video object segmentation via convolutional trident networkOnline video object segmentation via convolutional trident network
Online video object segmentation via convolutional trident network
 
On-the-fly Visual Category Search in Web-scale Image Collections
On-the-fly Visual Category Search in Web-scale Image CollectionsOn-the-fly Visual Category Search in Web-scale Image Collections
On-the-fly Visual Category Search in Web-scale Image Collections
 
Image segmentation with deep learning
Image segmentation with deep learningImage segmentation with deep learning
Image segmentation with deep learning
 
Talk 2011-buet-perception-event
Talk 2011-buet-perception-eventTalk 2011-buet-perception-event
Talk 2011-buet-perception-event
 
Maxim Kamensky - Applying image matching algorithms to video recognition and ...
Maxim Kamensky - Applying image matching algorithms to video recognition and ...Maxim Kamensky - Applying image matching algorithms to video recognition and ...
Maxim Kamensky - Applying image matching algorithms to video recognition and ...
 
Training Drone Image Models with Grand Theft Auto
Training Drone Image Models with Grand Theft AutoTraining Drone Image Models with Grand Theft Auto
Training Drone Image Models with Grand Theft Auto
 
Ee 417 Senior Design
Ee 417 Senior DesignEe 417 Senior Design
Ee 417 Senior Design
 
CNN
CNNCNN
CNN
 
Anomaly Detection with Azure and .NET
Anomaly Detection with Azure and .NETAnomaly Detection with Azure and .NET
Anomaly Detection with Azure and .NET
 
Deep learning unsupervised learning diapo
Deep learning unsupervised learning diapoDeep learning unsupervised learning diapo
Deep learning unsupervised learning diapo
 

More from Universitat Politècnica de Catalunya

Deep Generative Learning for All - The Gen AI Hype (Spring 2024)
Deep Generative Learning for All - The Gen AI Hype (Spring 2024)Deep Generative Learning for All - The Gen AI Hype (Spring 2024)
Deep Generative Learning for All - The Gen AI Hype (Spring 2024)
Universitat Politècnica de Catalunya
 
Deep Generative Learning for All
Deep Generative Learning for AllDeep Generative Learning for All
Deep Generative Learning for All
Universitat Politècnica de Catalunya
 
The Transformer in Vision | Xavier Giro | Master in Computer Vision Barcelona...
The Transformer in Vision | Xavier Giro | Master in Computer Vision Barcelona...The Transformer in Vision | Xavier Giro | Master in Computer Vision Barcelona...
The Transformer in Vision | Xavier Giro | Master in Computer Vision Barcelona...
Universitat Politècnica de Catalunya
 
Towards Sign Language Translation & Production | Xavier Giro-i-Nieto
Towards Sign Language Translation & Production | Xavier Giro-i-NietoTowards Sign Language Translation & Production | Xavier Giro-i-Nieto
Towards Sign Language Translation & Production | Xavier Giro-i-Nieto
Universitat Politècnica de Catalunya
 
The Transformer - Xavier Giró - UPC Barcelona 2021
The Transformer - Xavier Giró - UPC Barcelona 2021The Transformer - Xavier Giró - UPC Barcelona 2021
The Transformer - Xavier Giró - UPC Barcelona 2021
Universitat Politècnica de Catalunya
 
Learning Representations for Sign Language Videos - Xavier Giro - NIST TRECVI...
Learning Representations for Sign Language Videos - Xavier Giro - NIST TRECVI...Learning Representations for Sign Language Videos - Xavier Giro - NIST TRECVI...
Learning Representations for Sign Language Videos - Xavier Giro - NIST TRECVI...
Universitat Politècnica de Catalunya
 
Open challenges in sign language translation and production
Open challenges in sign language translation and productionOpen challenges in sign language translation and production
Open challenges in sign language translation and production
Universitat Politècnica de Catalunya
 
Generation of Synthetic Referring Expressions for Object Segmentation in Videos
Generation of Synthetic Referring Expressions for Object Segmentation in VideosGeneration of Synthetic Referring Expressions for Object Segmentation in Videos
Generation of Synthetic Referring Expressions for Object Segmentation in Videos
Universitat Politècnica de Catalunya
 
Discovery and Learning of Navigation Goals from Pixels in Minecraft
Discovery and Learning of Navigation Goals from Pixels in MinecraftDiscovery and Learning of Navigation Goals from Pixels in Minecraft
Discovery and Learning of Navigation Goals from Pixels in Minecraft
Universitat Politècnica de Catalunya
 
Learn2Sign : Sign language recognition and translation using human keypoint e...
Learn2Sign : Sign language recognition and translation using human keypoint e...Learn2Sign : Sign language recognition and translation using human keypoint e...
Learn2Sign : Sign language recognition and translation using human keypoint e...
Universitat Politècnica de Catalunya
 
Intepretability / Explainable AI for Deep Neural Networks
Intepretability / Explainable AI for Deep Neural NetworksIntepretability / Explainable AI for Deep Neural Networks
Intepretability / Explainable AI for Deep Neural Networks
Universitat Politècnica de Catalunya
 
Convolutional Neural Networks - Xavier Giro - UPC TelecomBCN Barcelona 2020
Convolutional Neural Networks - Xavier Giro - UPC TelecomBCN Barcelona 2020Convolutional Neural Networks - Xavier Giro - UPC TelecomBCN Barcelona 2020
Convolutional Neural Networks - Xavier Giro - UPC TelecomBCN Barcelona 2020
Universitat Politècnica de Catalunya
 
Self-Supervised Audio-Visual Learning - Xavier Giro - UPC TelecomBCN Barcelon...
Self-Supervised Audio-Visual Learning - Xavier Giro - UPC TelecomBCN Barcelon...Self-Supervised Audio-Visual Learning - Xavier Giro - UPC TelecomBCN Barcelon...
Self-Supervised Audio-Visual Learning - Xavier Giro - UPC TelecomBCN Barcelon...
Universitat Politècnica de Catalunya
 
Attention for Deep Learning - Xavier Giro - UPC TelecomBCN Barcelona 2020
Attention for Deep Learning - Xavier Giro - UPC TelecomBCN Barcelona 2020Attention for Deep Learning - Xavier Giro - UPC TelecomBCN Barcelona 2020
Attention for Deep Learning - Xavier Giro - UPC TelecomBCN Barcelona 2020
Universitat Politècnica de Catalunya
 
Generative Adversarial Networks GAN - Xavier Giro - UPC TelecomBCN Barcelona ...
Generative Adversarial Networks GAN - Xavier Giro - UPC TelecomBCN Barcelona ...Generative Adversarial Networks GAN - Xavier Giro - UPC TelecomBCN Barcelona ...
Generative Adversarial Networks GAN - Xavier Giro - UPC TelecomBCN Barcelona ...
Universitat Politècnica de Catalunya
 
Q-Learning with a Neural Network - Xavier Giró - UPC Barcelona 2020
Q-Learning with a Neural Network - Xavier Giró - UPC Barcelona 2020Q-Learning with a Neural Network - Xavier Giró - UPC Barcelona 2020
Q-Learning with a Neural Network - Xavier Giró - UPC Barcelona 2020
Universitat Politècnica de Catalunya
 
Language and Vision with Deep Learning - Xavier Giró - ACM ICMR 2020 (Tutorial)
Language and Vision with Deep Learning - Xavier Giró - ACM ICMR 2020 (Tutorial)Language and Vision with Deep Learning - Xavier Giró - ACM ICMR 2020 (Tutorial)
Language and Vision with Deep Learning - Xavier Giró - ACM ICMR 2020 (Tutorial)
Universitat Politècnica de Catalunya
 
Image Segmentation with Deep Learning - Xavier Giro & Carles Ventura - ISSonD...
Image Segmentation with Deep Learning - Xavier Giro & Carles Ventura - ISSonD...Image Segmentation with Deep Learning - Xavier Giro & Carles Ventura - ISSonD...
Image Segmentation with Deep Learning - Xavier Giro & Carles Ventura - ISSonD...
Universitat Politècnica de Catalunya
 
Curriculum Learning for Recurrent Video Object Segmentation
Curriculum Learning for Recurrent Video Object SegmentationCurriculum Learning for Recurrent Video Object Segmentation
Curriculum Learning for Recurrent Video Object Segmentation
Universitat Politècnica de Catalunya
 
Deep Self-supervised Learning for All - Xavier Giro - X-Europe 2020
Deep Self-supervised Learning for All - Xavier Giro - X-Europe 2020Deep Self-supervised Learning for All - Xavier Giro - X-Europe 2020
Deep Self-supervised Learning for All - Xavier Giro - X-Europe 2020
Universitat Politècnica de Catalunya
 

More from Universitat Politècnica de Catalunya (20)

Deep Generative Learning for All - The Gen AI Hype (Spring 2024)
Deep Generative Learning for All - The Gen AI Hype (Spring 2024)Deep Generative Learning for All - The Gen AI Hype (Spring 2024)
Deep Generative Learning for All - The Gen AI Hype (Spring 2024)
 
Deep Generative Learning for All
Deep Generative Learning for AllDeep Generative Learning for All
Deep Generative Learning for All
 
The Transformer in Vision | Xavier Giro | Master in Computer Vision Barcelona...
The Transformer in Vision | Xavier Giro | Master in Computer Vision Barcelona...The Transformer in Vision | Xavier Giro | Master in Computer Vision Barcelona...
The Transformer in Vision | Xavier Giro | Master in Computer Vision Barcelona...
 
Towards Sign Language Translation & Production | Xavier Giro-i-Nieto
Towards Sign Language Translation & Production | Xavier Giro-i-NietoTowards Sign Language Translation & Production | Xavier Giro-i-Nieto
Towards Sign Language Translation & Production | Xavier Giro-i-Nieto
 
The Transformer - Xavier Giró - UPC Barcelona 2021
The Transformer - Xavier Giró - UPC Barcelona 2021The Transformer - Xavier Giró - UPC Barcelona 2021
The Transformer - Xavier Giró - UPC Barcelona 2021
 
Learning Representations for Sign Language Videos - Xavier Giro - NIST TRECVI...
Learning Representations for Sign Language Videos - Xavier Giro - NIST TRECVI...Learning Representations for Sign Language Videos - Xavier Giro - NIST TRECVI...
Learning Representations for Sign Language Videos - Xavier Giro - NIST TRECVI...
 
Open challenges in sign language translation and production
Open challenges in sign language translation and productionOpen challenges in sign language translation and production
Open challenges in sign language translation and production
 
Generation of Synthetic Referring Expressions for Object Segmentation in Videos
Generation of Synthetic Referring Expressions for Object Segmentation in VideosGeneration of Synthetic Referring Expressions for Object Segmentation in Videos
Generation of Synthetic Referring Expressions for Object Segmentation in Videos
 
Discovery and Learning of Navigation Goals from Pixels in Minecraft
Discovery and Learning of Navigation Goals from Pixels in MinecraftDiscovery and Learning of Navigation Goals from Pixels in Minecraft
Discovery and Learning of Navigation Goals from Pixels in Minecraft
 
Learn2Sign : Sign language recognition and translation using human keypoint e...
Learn2Sign : Sign language recognition and translation using human keypoint e...Learn2Sign : Sign language recognition and translation using human keypoint e...
Learn2Sign : Sign language recognition and translation using human keypoint e...
 
Intepretability / Explainable AI for Deep Neural Networks
Intepretability / Explainable AI for Deep Neural NetworksIntepretability / Explainable AI for Deep Neural Networks
Intepretability / Explainable AI for Deep Neural Networks
 
Convolutional Neural Networks - Xavier Giro - UPC TelecomBCN Barcelona 2020
Convolutional Neural Networks - Xavier Giro - UPC TelecomBCN Barcelona 2020Convolutional Neural Networks - Xavier Giro - UPC TelecomBCN Barcelona 2020
Convolutional Neural Networks - Xavier Giro - UPC TelecomBCN Barcelona 2020
 
Self-Supervised Audio-Visual Learning - Xavier Giro - UPC TelecomBCN Barcelon...
Self-Supervised Audio-Visual Learning - Xavier Giro - UPC TelecomBCN Barcelon...Self-Supervised Audio-Visual Learning - Xavier Giro - UPC TelecomBCN Barcelon...
Self-Supervised Audio-Visual Learning - Xavier Giro - UPC TelecomBCN Barcelon...
 
Attention for Deep Learning - Xavier Giro - UPC TelecomBCN Barcelona 2020
Attention for Deep Learning - Xavier Giro - UPC TelecomBCN Barcelona 2020Attention for Deep Learning - Xavier Giro - UPC TelecomBCN Barcelona 2020
Attention for Deep Learning - Xavier Giro - UPC TelecomBCN Barcelona 2020
 
Generative Adversarial Networks GAN - Xavier Giro - UPC TelecomBCN Barcelona ...
Generative Adversarial Networks GAN - Xavier Giro - UPC TelecomBCN Barcelona ...Generative Adversarial Networks GAN - Xavier Giro - UPC TelecomBCN Barcelona ...
Generative Adversarial Networks GAN - Xavier Giro - UPC TelecomBCN Barcelona ...
 
Q-Learning with a Neural Network - Xavier Giró - UPC Barcelona 2020
Q-Learning with a Neural Network - Xavier Giró - UPC Barcelona 2020Q-Learning with a Neural Network - Xavier Giró - UPC Barcelona 2020
Q-Learning with a Neural Network - Xavier Giró - UPC Barcelona 2020
 
Language and Vision with Deep Learning - Xavier Giró - ACM ICMR 2020 (Tutorial)
Language and Vision with Deep Learning - Xavier Giró - ACM ICMR 2020 (Tutorial)Language and Vision with Deep Learning - Xavier Giró - ACM ICMR 2020 (Tutorial)
Language and Vision with Deep Learning - Xavier Giró - ACM ICMR 2020 (Tutorial)
 
Image Segmentation with Deep Learning - Xavier Giro & Carles Ventura - ISSonD...
Image Segmentation with Deep Learning - Xavier Giro & Carles Ventura - ISSonD...Image Segmentation with Deep Learning - Xavier Giro & Carles Ventura - ISSonD...
Image Segmentation with Deep Learning - Xavier Giro & Carles Ventura - ISSonD...
 
Curriculum Learning for Recurrent Video Object Segmentation
Curriculum Learning for Recurrent Video Object SegmentationCurriculum Learning for Recurrent Video Object Segmentation
Curriculum Learning for Recurrent Video Object Segmentation
 
Deep Self-supervised Learning for All - Xavier Giro - X-Europe 2020
Deep Self-supervised Learning for All - Xavier Giro - X-Europe 2020Deep Self-supervised Learning for All - Xavier Giro - X-Europe 2020
Deep Self-supervised Learning for All - Xavier Giro - X-Europe 2020
 

Recently uploaded

一比一原版(CBU毕业证)卡普顿大学毕业证成绩单
一比一原版(CBU毕业证)卡普顿大学毕业证成绩单一比一原版(CBU毕业证)卡普顿大学毕业证成绩单
一比一原版(CBU毕业证)卡普顿大学毕业证成绩单
nscud
 
一比一原版(CU毕业证)卡尔顿大学毕业证成绩单
一比一原版(CU毕业证)卡尔顿大学毕业证成绩单一比一原版(CU毕业证)卡尔顿大学毕业证成绩单
一比一原版(CU毕业证)卡尔顿大学毕业证成绩单
yhkoc
 
一比一原版(BU毕业证)波士顿大学毕业证成绩单
一比一原版(BU毕业证)波士顿大学毕业证成绩单一比一原版(BU毕业证)波士顿大学毕业证成绩单
一比一原版(BU毕业证)波士顿大学毕业证成绩单
ewymefz
 
一比一原版(Deakin毕业证书)迪肯大学毕业证如何办理
一比一原版(Deakin毕业证书)迪肯大学毕业证如何办理一比一原版(Deakin毕业证书)迪肯大学毕业证如何办理
一比一原版(Deakin毕业证书)迪肯大学毕业证如何办理
oz8q3jxlp
 
做(mqu毕业证书)麦考瑞大学毕业证硕士文凭证书学费发票原版一模一样
做(mqu毕业证书)麦考瑞大学毕业证硕士文凭证书学费发票原版一模一样做(mqu毕业证书)麦考瑞大学毕业证硕士文凭证书学费发票原版一模一样
做(mqu毕业证书)麦考瑞大学毕业证硕士文凭证书学费发票原版一模一样
axoqas
 
Best best suvichar in gujarati english meaning of this sentence as Silk road ...
Best best suvichar in gujarati english meaning of this sentence as Silk road ...Best best suvichar in gujarati english meaning of this sentence as Silk road ...
Best best suvichar in gujarati english meaning of this sentence as Silk road ...
AbhimanyuSinha9
 
Data Centers - Striving Within A Narrow Range - Research Report - MCG - May 2...
Data Centers - Striving Within A Narrow Range - Research Report - MCG - May 2...Data Centers - Striving Within A Narrow Range - Research Report - MCG - May 2...
Data Centers - Striving Within A Narrow Range - Research Report - MCG - May 2...
pchutichetpong
 
Predicting Product Ad Campaign Performance: A Data Analysis Project Presentation
Predicting Product Ad Campaign Performance: A Data Analysis Project PresentationPredicting Product Ad Campaign Performance: A Data Analysis Project Presentation
Predicting Product Ad Campaign Performance: A Data Analysis Project Presentation
Boston Institute of Analytics
 
一比一原版(YU毕业证)约克大学毕业证成绩单
一比一原版(YU毕业证)约克大学毕业证成绩单一比一原版(YU毕业证)约克大学毕业证成绩单
一比一原版(YU毕业证)约克大学毕业证成绩单
enxupq
 
一比一原版(UofS毕业证书)萨省大学毕业证如何办理
一比一原版(UofS毕业证书)萨省大学毕业证如何办理一比一原版(UofS毕业证书)萨省大学毕业证如何办理
一比一原版(UofS毕业证书)萨省大学毕业证如何办理
v3tuleee
 
Opendatabay - Open Data Marketplace.pptx
Opendatabay - Open Data Marketplace.pptxOpendatabay - Open Data Marketplace.pptx
Opendatabay - Open Data Marketplace.pptx
Opendatabay
 
一比一原版(UVic毕业证)维多利亚大学毕业证成绩单
一比一原版(UVic毕业证)维多利亚大学毕业证成绩单一比一原版(UVic毕业证)维多利亚大学毕业证成绩单
一比一原版(UVic毕业证)维多利亚大学毕业证成绩单
ukgaet
 
Chatty Kathy - UNC Bootcamp Final Project Presentation - Final Version - 5.23...
Chatty Kathy - UNC Bootcamp Final Project Presentation - Final Version - 5.23...Chatty Kathy - UNC Bootcamp Final Project Presentation - Final Version - 5.23...
Chatty Kathy - UNC Bootcamp Final Project Presentation - Final Version - 5.23...
John Andrews
 
The affect of service quality and online reviews on customer loyalty in the E...
The affect of service quality and online reviews on customer loyalty in the E...The affect of service quality and online reviews on customer loyalty in the E...
The affect of service quality and online reviews on customer loyalty in the E...
jerlynmaetalle
 
【社内勉強会資料_Octo: An Open-Source Generalist Robot Policy】
【社内勉強会資料_Octo: An Open-Source Generalist Robot Policy】【社内勉強会資料_Octo: An Open-Source Generalist Robot Policy】
【社内勉強会資料_Octo: An Open-Source Generalist Robot Policy】
NABLAS株式会社
 
一比一原版(UniSA毕业证书)南澳大学毕业证如何办理
一比一原版(UniSA毕业证书)南澳大学毕业证如何办理一比一原版(UniSA毕业证书)南澳大学毕业证如何办理
一比一原版(UniSA毕业证书)南澳大学毕业证如何办理
slg6lamcq
 
社内勉強会資料_LLM Agents                              .
社内勉強会資料_LLM Agents                              .社内勉強会資料_LLM Agents                              .
社内勉強会資料_LLM Agents                              .
NABLAS株式会社
 
一比一原版(CBU毕业证)卡普顿大学毕业证如何办理
一比一原版(CBU毕业证)卡普顿大学毕业证如何办理一比一原版(CBU毕业证)卡普顿大学毕业证如何办理
一比一原版(CBU毕业证)卡普顿大学毕业证如何办理
ahzuo
 
standardisation of garbhpala offhgfffghh
standardisation of garbhpala offhgfffghhstandardisation of garbhpala offhgfffghh
standardisation of garbhpala offhgfffghh
ArpitMalhotra16
 
Levelwise PageRank with Loop-Based Dead End Handling Strategy : SHORT REPORT ...
Levelwise PageRank with Loop-Based Dead End Handling Strategy : SHORT REPORT ...Levelwise PageRank with Loop-Based Dead End Handling Strategy : SHORT REPORT ...
Levelwise PageRank with Loop-Based Dead End Handling Strategy : SHORT REPORT ...
Subhajit Sahu
 

Recently uploaded (20)

一比一原版(CBU毕业证)卡普顿大学毕业证成绩单
一比一原版(CBU毕业证)卡普顿大学毕业证成绩单一比一原版(CBU毕业证)卡普顿大学毕业证成绩单
一比一原版(CBU毕业证)卡普顿大学毕业证成绩单
 
一比一原版(CU毕业证)卡尔顿大学毕业证成绩单
一比一原版(CU毕业证)卡尔顿大学毕业证成绩单一比一原版(CU毕业证)卡尔顿大学毕业证成绩单
一比一原版(CU毕业证)卡尔顿大学毕业证成绩单
 
一比一原版(BU毕业证)波士顿大学毕业证成绩单
一比一原版(BU毕业证)波士顿大学毕业证成绩单一比一原版(BU毕业证)波士顿大学毕业证成绩单
一比一原版(BU毕业证)波士顿大学毕业证成绩单
 
一比一原版(Deakin毕业证书)迪肯大学毕业证如何办理
一比一原版(Deakin毕业证书)迪肯大学毕业证如何办理一比一原版(Deakin毕业证书)迪肯大学毕业证如何办理
一比一原版(Deakin毕业证书)迪肯大学毕业证如何办理
 
做(mqu毕业证书)麦考瑞大学毕业证硕士文凭证书学费发票原版一模一样
做(mqu毕业证书)麦考瑞大学毕业证硕士文凭证书学费发票原版一模一样做(mqu毕业证书)麦考瑞大学毕业证硕士文凭证书学费发票原版一模一样
做(mqu毕业证书)麦考瑞大学毕业证硕士文凭证书学费发票原版一模一样
 
Best best suvichar in gujarati english meaning of this sentence as Silk road ...
Best best suvichar in gujarati english meaning of this sentence as Silk road ...Best best suvichar in gujarati english meaning of this sentence as Silk road ...
Best best suvichar in gujarati english meaning of this sentence as Silk road ...
 
Data Centers - Striving Within A Narrow Range - Research Report - MCG - May 2...
Data Centers - Striving Within A Narrow Range - Research Report - MCG - May 2...Data Centers - Striving Within A Narrow Range - Research Report - MCG - May 2...
Data Centers - Striving Within A Narrow Range - Research Report - MCG - May 2...
 
Predicting Product Ad Campaign Performance: A Data Analysis Project Presentation
Predicting Product Ad Campaign Performance: A Data Analysis Project PresentationPredicting Product Ad Campaign Performance: A Data Analysis Project Presentation
Predicting Product Ad Campaign Performance: A Data Analysis Project Presentation
 
一比一原版(YU毕业证)约克大学毕业证成绩单
一比一原版(YU毕业证)约克大学毕业证成绩单一比一原版(YU毕业证)约克大学毕业证成绩单
一比一原版(YU毕业证)约克大学毕业证成绩单
 
一比一原版(UofS毕业证书)萨省大学毕业证如何办理
一比一原版(UofS毕业证书)萨省大学毕业证如何办理一比一原版(UofS毕业证书)萨省大学毕业证如何办理
一比一原版(UofS毕业证书)萨省大学毕业证如何办理
 
Opendatabay - Open Data Marketplace.pptx
Opendatabay - Open Data Marketplace.pptxOpendatabay - Open Data Marketplace.pptx
Opendatabay - Open Data Marketplace.pptx
 
一比一原版(UVic毕业证)维多利亚大学毕业证成绩单
一比一原版(UVic毕业证)维多利亚大学毕业证成绩单一比一原版(UVic毕业证)维多利亚大学毕业证成绩单
一比一原版(UVic毕业证)维多利亚大学毕业证成绩单
 
Chatty Kathy - UNC Bootcamp Final Project Presentation - Final Version - 5.23...
Chatty Kathy - UNC Bootcamp Final Project Presentation - Final Version - 5.23...Chatty Kathy - UNC Bootcamp Final Project Presentation - Final Version - 5.23...
Chatty Kathy - UNC Bootcamp Final Project Presentation - Final Version - 5.23...
 
The affect of service quality and online reviews on customer loyalty in the E...
The affect of service quality and online reviews on customer loyalty in the E...The affect of service quality and online reviews on customer loyalty in the E...
The affect of service quality and online reviews on customer loyalty in the E...
 
【社内勉強会資料_Octo: An Open-Source Generalist Robot Policy】
【社内勉強会資料_Octo: An Open-Source Generalist Robot Policy】【社内勉強会資料_Octo: An Open-Source Generalist Robot Policy】
【社内勉強会資料_Octo: An Open-Source Generalist Robot Policy】
 
一比一原版(UniSA毕业证书)南澳大学毕业证如何办理
一比一原版(UniSA毕业证书)南澳大学毕业证如何办理一比一原版(UniSA毕业证书)南澳大学毕业证如何办理
一比一原版(UniSA毕业证书)南澳大学毕业证如何办理
 
社内勉強会資料_LLM Agents                              .
社内勉強会資料_LLM Agents                              .社内勉強会資料_LLM Agents                              .
社内勉強会資料_LLM Agents                              .
 
一比一原版(CBU毕业证)卡普顿大学毕业证如何办理
一比一原版(CBU毕业证)卡普顿大学毕业证如何办理一比一原版(CBU毕业证)卡普顿大学毕业证如何办理
一比一原版(CBU毕业证)卡普顿大学毕业证如何办理
 
standardisation of garbhpala offhgfffghh
standardisation of garbhpala offhgfffghhstandardisation of garbhpala offhgfffghh
standardisation of garbhpala offhgfffghh
 
Levelwise PageRank with Loop-Based Dead End Handling Strategy : SHORT REPORT ...
Levelwise PageRank with Loop-Based Dead End Handling Strategy : SHORT REPORT ...Levelwise PageRank with Loop-Based Dead End Handling Strategy : SHORT REPORT ...
Levelwise PageRank with Loop-Based Dead End Handling Strategy : SHORT REPORT ...
 

Video Object Segmentation - Laura Leal-Taixé - UPC Barcelona 2018

  • 1. Prof. Laura Leal-Taixé Dynamic Scene Understanding
  • 2. Dynamic Scene Understanding Understand every pixel of a video
  • 3. Dynamic Scene Understanding Understand every pixel of a video road tree Semantic segmentation person car
  • 4. Dynamic Scene Understanding Understand every pixel of a video person 1 tree Semantic segmentation person 2 person 3 Instance- based segmentation road car
  • 5. Dynamic Scene Understanding Understand every pixel of a video Semantic segmentation Instance- based segmentation Multiple object tracking
  • 6. Prof. Laura Leal-Taixé Video Object Segmentation without Temporal Information
  • 7. Input image First mask Supervised video object segmentation
  • 8. Architecture • Series of convolutional filters • Upsampling at different scales • Final summation before prediction à get information from all scales Boundary Snapping Snap the foreground mask to accurate contours 3 Foreground Branch Specific object - Less accurate contours 1 Contour Branch Accurate contours - Generic objects 2
  • 9. Architecture: improving boundaries Boundary Snapping Snap the foreground mask to accurate contours 3 Foreground Branch Specific object - Less accurate contours 1 Contour Branch Accurate contours - Generic objects 2
  • 10. Architecture: improving boundaries Boundary Snapping Snap the foreground mask to accurate contours 3 Foreground Branch Specific object - Less accurate contours 1 Contour Branch Accurate contours - Generic objects 2
  • 11. Architecture: improving boundaries Select the superpixels that are 50% or more covered by the foreground mask Boundary Snapping Snap the foreground mask to accurate contours 3 Foreground Branch Specific object - Less accurate contours 1 Contour Branch Accurate contours - Generic objects 2
  • 12. ResultsonframeN oftestsequence Base Network Pre-trained on ImageNet 1 Parent Network Trained on DAVIS training set 2 Test Network Fine-tuned on frame 1 of test sequence 3 ResultsonframeN oftestsequence Base Network Pre-trained on ImageNet 1 Parent Network Trained on DAVIS training set 2 Test Network Fine-tuned on frame 1 of test sequence 3 ResultsonframeN oftestsequence Base Network Pre-trained on ImageNet 1 Parent Network Trained on DAVIS training set 2 Test Network Fine-tuned on frame 1 of test sequence 3 Edges and basic image features Learns how to do video segmentation Learns which object to segment One-Shot Video Object Segmentation S. Caelles, K.-K. Maninis, J. Pont-Tuset, L. Leal-Taixé, D. Cremers, L. van Gool. CVPR 2017 FinetuningPre-trained Training
  • 13. ResultsonframeN oftestsequence Base Network Pre-trained on ImageNet 1 Parent Network Trained on DAVIS training set 2 Test Network Fine-tuned on frame 1 of test sequence 3 Learns which object to segment One-Shot Video Object Segmentation • Learning the appearance of the foreground and background objects Finetuning
  • 14. 102ms – Parent network Finetuning time DAVIS dataset 11.8 pp.
  • 15. Experiments: heavy occlusions Occlusion 1 Occlusion 2 First mask (input)
  • 16. Experiments: number of annotations
  • 17. Experiments: number of annotations
  • 19. Introducing Semantics • The network does not have the notion of objects • Broken contours are often seen, especially when occluded parts becomes visible
  • 22. Semantic propagation • OSVOS path: obtain a coarse foreground estimation Semantic Instance Segmentation Result Top Matching InstancesInstance Proposals Input Image First-Round Foreground Estimation Conditional Classifier Semantic Selection & Propagation Semantic Prior Foreground Estimation CNN Appearance Model
  • 23. Semantic propagation • Semantic prior branch that gives us proposals to select from Semantic Instance Segmentation Result Top Matching InstancesInstance Proposals Input Image First-Round Foreground Estimation Conditional Classifier Semantic Selection & Propagation Semantic Prior Foreground Estimation CNN Appearance Model
  • 24. Semantic propagation • Semantic prior branch that gives us proposals to select from • Enforce that semantics stay coherent throughout the sequence Semantic Instance Segmentation Result Top Matching InstancesInstance Proposals Input Image First-Round Foreground Estimation Conditional Classifier Semantic Selection & Propagation Semantic Prior Foreground Estimation CNN Appearance Model
  • 25. Semantic propagation Semantic Selection Selected Instances: Person and Motorbike Ground Truth Instance Segmentation Proposals Semantic Propagation InstanceSegmentation Proposals First-Round ForegroundEstimation TopPersonand Motorbike Frame 0 Frame 18 Frame 24 Frame 30 Frame 36
  • 26. Semantic propagation Semantic Selection Selected Instances: Person and Motorbike Ground Truth Instance Segmentation Proposals Semantic Propagation InstanceSegmentation Proposals First-Round ForegroundEstimation TopPersonand Motorbike Frame 0 Frame 18 Frame 24 Frame 30 Frame 36
  • 27. Semantic propagation Semantic Selection Selected Instances: Person and Motorbike Ground Truth Instance Segmentation Proposals Semantic Propagation InstanceSegmentation Proposals First-Round ForegroundEstimation TopPersonand Motorbike Frame 0 Frame 18 Frame 24 Frame 30 Frame 36
  • 28. Semantic propagation Semantic Selection Selected Instances: Person and Motorbike Ground Truth Instance Segmentation Proposals Semantic Propagation InstanceSegmentation Proposals First-Round ForegroundEstimation TopPersonand Motorbike Frame 0 Frame 18 Frame 24 Frame 30 Frame 36
  • 29. Semantic propagation Semantic Selection Selected Instances: Person and Motorbike Ground Truth Instance Segmentation Proposals Semantic Propagation InstanceSegmentation Proposals First-Round ForegroundEstimation TopPersonand Motorbike Frame 0 Frame 18 Frame 24 Frame 30 Frame 36
  • 30. Ablation studies on DAVIS 2016 Finetuning on the first mask
  • 31. Ablation studies on DAVIS 2016 Pre-training the parent network
  • 32. Ablation studies on DAVIS 2016 Adding semantic information
  • 33. Ablation studies on DAVIS 2016 Adding semantic information
  • 34. Ablation studies on DAVIS 2016 Brings a substantial boost in temporal stability
  • 35. Conclusions • We can do consistent video object segmentation without the need of temporal information • Pose changes show parts of the object the network has not seen and therefore holes in the result are created • Semantic priors help in delivering consistent object-type segmentations
  • 36. Thank you Dr. Laura Leal-Taixé