SlideShare a Scribd company logo
1 of 43
Download to read offline
Online Video Object Segmentation via
Convolutional Trident Network
Seminar at Naver
Won-Dong Jang
Korea University
2017-08-22
Introduction & related works
Video object segmentation
• Clustering pixels in videos into objects or background
• Unsupervised
• Supervised
• Semi-supervised
Segment track
Video object segmentation
• Unsupervised video object segmentation
• Discover and segment a primary object in a video
• Without user annotations
Input videos Ground-truth segmentation labels
Video object segmentation
• Unsupervised video object segmentation
• Saliency detection-based approach
• Visual saliency
• Motion saliency
Examples of saliency detection
Video object segmentation
• Supervised video object segmentation
• User can annotate mislabeled pixels in any frames
• Interaction between an algorithm and a user
Annotation
at the first frame
Segmentation results after adding the annotation
Segmentation results
Additional
user annotation
Video object segmentation
• Semi-supervised video object segmentation
• Discover and segment a primary object in a video
• With user annotations in the first frame
Annotated target object
in the first frame
Segmentation results
Related works
• Object proposal-based algorithm
• Generate object proposals in each frame
• Select object proposals that are similar to the target object
[2015][ICCV][Perazzi] Fully Connected Object Proposals for Video Segmentation
Generated object proposals
Segment track
Related works
• Superpixel-based algorithm
• Over-segment each frame into superpixels
• Trace the target object based on the inter-frame matching
[2015][CVPR][Wen] Joint Online Tracking and Segmentation
Inter-frame matching Segment track
Proposed algorithm
In this presentation
• A semi-supervised online segmentation algorithm
• Online method
• Offline techniques require a huge memory space for a long video
• Deep learning-based approach
• Connection between a convolutional neural network and an MRF
optimization strategy
• Remarkable performance on the DAVIS benchmark dataset
Overview
• Framework
• Inter-frame propagation
• Inference via convolutional trident network (CTN)
• Yields three tailored probability maps for the MRF optimization
• MRF optimization
Inter-frame propagation
• Propagation from the previous frame
• To roughly locate the target object in the current frame
• Using backward optical flow
• From 𝑡 to 𝑡 − 1
Segmentation label map
for frame 𝑡 − 1
Backward motion
(optical flow)
Propagation map
for frame 𝑡
Convolutional trident network
• Infer segmentation information
• The propagation map may be inaccurate
• The inferred information is effectively used to solve a binary
labeling problem
Network architecture
• Encoder-decoder architecture
• Single encoder
• VGG encoder
• Three decoders
• Separative decoder
• Definite foreground decoder
• Definite background decoder
Network architecture
• Encoder-decoder architecture
• Single encoder
• VGG encoder
• Three decoders
• Separative decoder
• Definite foreground decoder
• Definite background decoder
Network architecture
• VGG encoder
• Trained for image classification
• Using ImageNet dataset
22K categories
14M images
Network architecture
• VGG encoder
• Trained for image classification
• Using ImageNet dataset
Convolutional layers
Fully connected
layers
Fox
Cat
Lion
Dog
Zebra
Umbrella
Tank
Lamp
Desk
Orange
Kite
…
Network architecture
• Separative decoder (SD)
• Separate a target object from the background
• Using a down-sampled foreground propagation patch
Conv1_1
Conv1_2
Pooling
Conv2_1
Conv2_2
Pooling
Conv3_1
Conv3_2
Conv3_3
Pooling
Conv4_1
Conv4_2
Conv4_3
Pooling
Conv5_1
Conv5_2
Conv5_3
SD-Dec2
SD-Dec3
SD-Dec4
SD-Dec5
SD-Pred
SD-Dec1
Unpooling
Unpooling
Unpooling
Unpooling
Skip connections
Image patch
Separative
probability patch
Foreground
propagation patch
Network architecture
• Definite foreground and background decoders
• Definite pixels indicate locations that should be labeled as the
foreground or the background indubitably
• Fixing labels in definite pixels improves labeling accuracies
• Inspired by the image matting problem
Input image Tri-map Matting result
Network architecture
• Definite foreground decoder (DFD)
• Identifies definite foreground pixels
Conv1_1
Conv1_2
Pooling
Conv2_1
Conv2_2
Pooling
Conv3_1
Conv3_2
Conv3_3
Pooling
Conv4_1
Conv4_2
Conv4_3
Pooling
Conv5_1
Conv5_2
Conv5_3
DFD-Dec2
DFD-Dec3
DFD-Dec4
DFD-Dec5
DFD-Pred
DFD-Dec1
Unpooling
Unpooling
Unpooling
Unpooling
Image patch
Definite foreground
probability patch
Foreground
propagation patch
Network architecture
• Definite background decoder (DBD)
• Finds definite background pixels
Conv1_1
Conv1_2
Pooling
Conv2_1
Conv2_2
Pooling
Conv3_1
Conv3_2
Conv3_3
Pooling
Conv4_1
Conv4_2
Conv4_3
Pooling
Conv5_1
Conv5_2
Conv5_3
DBD-Dec2
DBD-Dec3
DBD-Dec4
DBD-Dec5
DBD-Pred
DBD-Dec1
Unpooling
Unpooling
Unpooling
Unpooling
Image patch
Definite background
probability patch
Background
propagation patch
Network architecture
• Implementation issues in decoders
• Prediction layer
• Sigmoid layer is used to yield normalized outputs within [0, 1]
• Rectified linear unit (ReLU) + Batch normalization
• Kernel size
• 3 ×3 kernels are used in the prediction layers
• 5 × 5 kernels are used in the other convolution layers
Training phase
• Lack of video object segmentation dataset
• There are several datasets for video object segmentation
• However, each of them consists of a small number of videos from
12 to 59
• Instead,
• We use the PASCAL VOC 2012 dataset
• Object segmentation
• 26,844 object masks
• 11,355 images
Training phase
• Preprocessing of training data
• Ground-truth masks for the DFD and DBD are not available
• Hence, we generate them through simple image processing
Training phase
• Preprocessing of training data
• Degrade the objet mask to imitate propagation errors
• By performing suppression and noise addition
• Synthesize the ground-truth masks for the DFD and DBD
• By applying erosion and dilation
Training phase
• Implementation issues
• Caffe library
• Cross-entropy losses
−
1
𝑛
෍
𝑛=1
𝑁
𝑝 𝑛 log Ƹ𝑝 𝑛 + 1 − 𝑝 𝑛 log 1 − Ƹ𝑝 𝑛
• Minibatch
• with eight training data
• Learning rate
• 1e-3 for the first 55 epochs
• 1e-4 for the next 35 epochs
• Stochastic gradient descent
Inference phase
• Set input data
• By cropping the frame and propagation map
• CTN outputs three probability patches
• Separative probability map 𝑅S
• Definite foreground probability map 𝑅F
• Definite background probability map 𝑅B
Frame 𝑡
Segmentation label
at frame 𝑡 − 1
Optical flow
Inter-frame propagationInput at frame 𝑡
Propagation map
at frame 𝑡
Inference via convolutional encoder-decoder network
Encoder
Background
propagation patch
Foreground
propagation patch
Image patch
Definite
background
decoder
Definite
foreground
decoder
Separative
decoder
Definite background
probability patch
Definite foreground
probability patch
Separative
probability patch
Inference phase
• Classification
• Separative probability map 𝑅S
• If 𝑅S(𝐩) > 𝜃sep, pixel 𝐩 is classified as the foreground
• ℒ be the coordinate set for such foreground pixels
• Definite foreground probability map 𝑅F
• If 𝑅F(𝐩) > 𝜃def, pixel 𝐩 is classified as the definite foreground
• ℱ denotes the set of the definite foreground pixels
• Definite background probability map 𝑅B
• If 𝑅B(𝐩) > 𝜃def, pixel 𝐩 is classified as the definite background
• ℬ indicates the set of the definite background pixels
MRF optimization
• Solve two-class MRF optimization problem
• To improve the segmentation quality further
• Define a graph 𝐺 = (𝑁, 𝐸)
• Nodes are pixels in the current frame
• Each node is connected to its four neighbors by edges
MRF optimization
• MRF energy function
ℰ 𝑆 = ෍
𝐩∈𝑁
𝒟 𝐩, 𝑆 + 𝛾 × ෍
𝐩,𝐪 ∈𝐸
𝒵 𝐩, 𝐪, 𝑆
Definite background
probability patch
Definite foreground
probability patch
Image patch Separative
probability patch
Unary cost 𝒟
- Returns extremely high costs on DF and DB pixels
when they have background and foreground labels
Pairwise cost 𝒵
- Encourage neighboring pixels
to have the same label
MRF optimization
• Unary cost computation
• Build the RGB color Gaussian mixture models (GMMs) of the
foreground and the background, respectively
• 𝐾 = 10 for both GMMs
• Use pixels in ℒ to construct the foreground GMMs
• Use pixels in ℒ 𝑐
to construct the background GMMs
• Gaussian cost
𝜓 𝐩, 𝑠 = min
𝑘
− log 𝑓 𝐩; ℳ𝑠,𝑘
• Unary cost
𝒟 𝐩, 𝑆 = ൞
∞ if 𝑝 ∈ ℱ and 𝑆 𝑝 = 0
∞ if 𝑝 ∈ ℱ and 𝑆 𝑝 = 0
𝜓 𝐩, 𝑆(𝐩) otherwise
MRF optimization
• Pairwise cost computation
• Pairwise cost
𝒵 𝐩, 𝐪, 𝑆 = ቊ
exp −𝑑 𝐩, 𝐪 if 𝑆 𝐩 ≠ 𝑆 𝐪
0 otherwise
• Graph-cut optimization
ℰ 𝑆 = ෍
𝐩∈𝑁
𝒟 𝐩, 𝑆 + 𝛾 × ෍
𝐩,𝐪 ∈𝐸
𝒵 𝐩, 𝐪, 𝑆
Reappearing object detection
• Identification of reappearing parts
• A target object may disappear and be occluded by other
objects
• Use backward-forward motion consistency
Experimental results
Experimental results
• The DAVIS benchmark dataset
• 50 videos
• 854 × 480 resolution
• Number of frames
• From 25 to 104
• Difficulties
• Fast motion
• Occlusion
• Object deformation
Experimental results
• Performance measures
• Region similarity
• Jaccard index
• Contour accuracy
• F-measure
• Statistics
• Mean
• Recall
• Decay
Experimental results
• Performance comparison
Experimental results
• Qualitative results
Experimental results
• The SegTrack dataset
• 5 videos
• Performance comparison
• Jaccard index
Experimental results
• Ablation studies
• Effectiveness of each decoder
• Efficacy of MRF optimization
Experimental results
• Running time analysis
• Prop-Q
• Uses the state-of-the-art optical flow technique
• Prop-F
• Adopts a much faster optical flow technique
• Parameter selection
Conclusions
• A semi-supervised online video object segmentation
algorithm is introduced in this presentation
• Deep learning-based semi-supervised video object
segmentation algorithm
• Tailored network for MRF optimization
• Remarkable performance on the DAVIS dataset
• Q&A
• Thank you

More Related Content

What's hot

Unsupervised visual representation learning overview: Toward Self-Supervision
Unsupervised visual representation learning overview: Toward Self-SupervisionUnsupervised visual representation learning overview: Toward Self-Supervision
Unsupervised visual representation learning overview: Toward Self-SupervisionLEE HOSEONG
 
Super resolution in deep learning era - Jaejun Yoo
Super resolution in deep learning era - Jaejun YooSuper resolution in deep learning era - Jaejun Yoo
Super resolution in deep learning era - Jaejun YooJaeJun Yoo
 
Image Completion using Planar Structure Guidance (SIGGRAPH 2014)
Image Completion using Planar Structure Guidance (SIGGRAPH 2014)Image Completion using Planar Structure Guidance (SIGGRAPH 2014)
Image Completion using Planar Structure Guidance (SIGGRAPH 2014)Jia-Bin Huang
 
Deep Local Parametric Filters for Image Enhancement
Deep Local Parametric Filters for Image EnhancementDeep Local Parametric Filters for Image Enhancement
Deep Local Parametric Filters for Image EnhancementSean Moran
 
YOLOv4: optimal speed and accuracy of object detection review
YOLOv4: optimal speed and accuracy of object detection reviewYOLOv4: optimal speed and accuracy of object detection review
YOLOv4: optimal speed and accuracy of object detection reviewLEE HOSEONG
 
Deep Local Parametric Filters for Image Enhancement
Deep Local Parametric Filters for Image EnhancementDeep Local Parametric Filters for Image Enhancement
Deep Local Parametric Filters for Image EnhancementSean Moran
 
Color and 3D Semantic Reconstruction of Indoor Scenes from RGB-D stream
Color and 3D Semantic Reconstruction of Indoor Scenes from RGB-D streamColor and 3D Semantic Reconstruction of Indoor Scenes from RGB-D stream
Color and 3D Semantic Reconstruction of Indoor Scenes from RGB-D streamNAVER Engineering
 
Talk 2011-buet-perception-event
Talk 2011-buet-perception-eventTalk 2011-buet-perception-event
Talk 2011-buet-perception-eventMahfuzul Haque
 
Deep Learning for Computer Vision: A comparision between Convolutional Neural...
Deep Learning for Computer Vision: A comparision between Convolutional Neural...Deep Learning for Computer Vision: A comparision between Convolutional Neural...
Deep Learning for Computer Vision: A comparision between Convolutional Neural...Vincenzo Lomonaco
 
Object-Region Video Transformers
Object-Region Video TransformersObject-Region Video Transformers
Object-Region Video TransformersSangwoo Mo
 
PR-217: EfficientDet: Scalable and Efficient Object Detection
PR-217: EfficientDet: Scalable and Efficient Object DetectionPR-217: EfficientDet: Scalable and Efficient Object Detection
PR-217: EfficientDet: Scalable and Efficient Object DetectionJinwon Lee
 
Tutorial on Object Detection (Faster R-CNN)
Tutorial on Object Detection (Faster R-CNN)Tutorial on Object Detection (Faster R-CNN)
Tutorial on Object Detection (Faster R-CNN)Hwa Pyung Kim
 
Rethinking Data Augmentation for Image Super-resolution: A Comprehensive Anal...
Rethinking Data Augmentation for Image Super-resolution: A Comprehensive Anal...Rethinking Data Augmentation for Image Super-resolution: A Comprehensive Anal...
Rethinking Data Augmentation for Image Super-resolution: A Comprehensive Anal...JaeJun Yoo
 
Shai Avidan's Support vector tracking and ensemble tracking
Shai Avidan's Support vector tracking and ensemble trackingShai Avidan's Support vector tracking and ensemble tracking
Shai Avidan's Support vector tracking and ensemble trackingwolf
 
Introduction to deep learning based voice activity detection
Introduction to deep learning based voice activity detectionIntroduction to deep learning based voice activity detection
Introduction to deep learning based voice activity detectionNAVER Engineering
 
Deep Learning for Natural Language Processing
Deep Learning for Natural Language ProcessingDeep Learning for Natural Language Processing
Deep Learning for Natural Language ProcessingSangwoo Mo
 
Generative Models for General Audiences
Generative Models for General AudiencesGenerative Models for General Audiences
Generative Models for General AudiencesSangwoo Mo
 
Deep learning fundamental and Research project on IBM POWER9 system from NUS
Deep learning fundamental and Research project on IBM POWER9 system from NUSDeep learning fundamental and Research project on IBM POWER9 system from NUS
Deep learning fundamental and Research project on IBM POWER9 system from NUSGanesan Narayanasamy
 
Passive stereo vision with deep learning
Passive stereo vision with deep learningPassive stereo vision with deep learning
Passive stereo vision with deep learningYu Huang
 

What's hot (20)

Unsupervised visual representation learning overview: Toward Self-Supervision
Unsupervised visual representation learning overview: Toward Self-SupervisionUnsupervised visual representation learning overview: Toward Self-Supervision
Unsupervised visual representation learning overview: Toward Self-Supervision
 
Super resolution in deep learning era - Jaejun Yoo
Super resolution in deep learning era - Jaejun YooSuper resolution in deep learning era - Jaejun Yoo
Super resolution in deep learning era - Jaejun Yoo
 
Image Completion using Planar Structure Guidance (SIGGRAPH 2014)
Image Completion using Planar Structure Guidance (SIGGRAPH 2014)Image Completion using Planar Structure Guidance (SIGGRAPH 2014)
Image Completion using Planar Structure Guidance (SIGGRAPH 2014)
 
Deep Local Parametric Filters for Image Enhancement
Deep Local Parametric Filters for Image EnhancementDeep Local Parametric Filters for Image Enhancement
Deep Local Parametric Filters for Image Enhancement
 
YOLOv4: optimal speed and accuracy of object detection review
YOLOv4: optimal speed and accuracy of object detection reviewYOLOv4: optimal speed and accuracy of object detection review
YOLOv4: optimal speed and accuracy of object detection review
 
Deep Local Parametric Filters for Image Enhancement
Deep Local Parametric Filters for Image EnhancementDeep Local Parametric Filters for Image Enhancement
Deep Local Parametric Filters for Image Enhancement
 
Color and 3D Semantic Reconstruction of Indoor Scenes from RGB-D stream
Color and 3D Semantic Reconstruction of Indoor Scenes from RGB-D streamColor and 3D Semantic Reconstruction of Indoor Scenes from RGB-D stream
Color and 3D Semantic Reconstruction of Indoor Scenes from RGB-D stream
 
Talk 2011-buet-perception-event
Talk 2011-buet-perception-eventTalk 2011-buet-perception-event
Talk 2011-buet-perception-event
 
Background Subtraction Based on Phase and Distance Transform Under Sudden Ill...
Background Subtraction Based on Phase and Distance Transform Under Sudden Ill...Background Subtraction Based on Phase and Distance Transform Under Sudden Ill...
Background Subtraction Based on Phase and Distance Transform Under Sudden Ill...
 
Deep Learning for Computer Vision: A comparision between Convolutional Neural...
Deep Learning for Computer Vision: A comparision between Convolutional Neural...Deep Learning for Computer Vision: A comparision between Convolutional Neural...
Deep Learning for Computer Vision: A comparision between Convolutional Neural...
 
Object-Region Video Transformers
Object-Region Video TransformersObject-Region Video Transformers
Object-Region Video Transformers
 
PR-217: EfficientDet: Scalable and Efficient Object Detection
PR-217: EfficientDet: Scalable and Efficient Object DetectionPR-217: EfficientDet: Scalable and Efficient Object Detection
PR-217: EfficientDet: Scalable and Efficient Object Detection
 
Tutorial on Object Detection (Faster R-CNN)
Tutorial on Object Detection (Faster R-CNN)Tutorial on Object Detection (Faster R-CNN)
Tutorial on Object Detection (Faster R-CNN)
 
Rethinking Data Augmentation for Image Super-resolution: A Comprehensive Anal...
Rethinking Data Augmentation for Image Super-resolution: A Comprehensive Anal...Rethinking Data Augmentation for Image Super-resolution: A Comprehensive Anal...
Rethinking Data Augmentation for Image Super-resolution: A Comprehensive Anal...
 
Shai Avidan's Support vector tracking and ensemble tracking
Shai Avidan's Support vector tracking and ensemble trackingShai Avidan's Support vector tracking and ensemble tracking
Shai Avidan's Support vector tracking and ensemble tracking
 
Introduction to deep learning based voice activity detection
Introduction to deep learning based voice activity detectionIntroduction to deep learning based voice activity detection
Introduction to deep learning based voice activity detection
 
Deep Learning for Natural Language Processing
Deep Learning for Natural Language ProcessingDeep Learning for Natural Language Processing
Deep Learning for Natural Language Processing
 
Generative Models for General Audiences
Generative Models for General AudiencesGenerative Models for General Audiences
Generative Models for General Audiences
 
Deep learning fundamental and Research project on IBM POWER9 system from NUS
Deep learning fundamental and Research project on IBM POWER9 system from NUSDeep learning fundamental and Research project on IBM POWER9 system from NUS
Deep learning fundamental and Research project on IBM POWER9 system from NUS
 
Passive stereo vision with deep learning
Passive stereo vision with deep learningPassive stereo vision with deep learning
Passive stereo vision with deep learning
 

Viewers also liked

딥러닝을 활용한 비디오 스토리 질의응답: 뽀로로QA와 심층 임베딩 메모리망
딥러닝을 활용한 비디오 스토리 질의응답: 뽀로로QA와 심층 임베딩 메모리망딥러닝을 활용한 비디오 스토리 질의응답: 뽀로로QA와 심층 임베딩 메모리망
딥러닝을 활용한 비디오 스토리 질의응답: 뽀로로QA와 심층 임베딩 메모리망NAVER Engineering
 
Introduction of Deep Reinforcement Learning
Introduction of Deep Reinforcement LearningIntroduction of Deep Reinforcement Learning
Introduction of Deep Reinforcement LearningNAVER Engineering
 
알파고 해부하기 1부
알파고 해부하기 1부알파고 해부하기 1부
알파고 해부하기 1부Donghun Lee
 
Deep Learning, Where Are You Going?
Deep Learning, Where Are You Going?Deep Learning, Where Are You Going?
Deep Learning, Where Are You Going?NAVER Engineering
 
Multimodal Sequential Learning for Video QA
Multimodal Sequential Learning for Video QAMultimodal Sequential Learning for Video QA
Multimodal Sequential Learning for Video QANAVER Engineering
 
바둑인을 위한 알파고
바둑인을 위한 알파고바둑인을 위한 알파고
바둑인을 위한 알파고Donghun Lee
 
조음 Goodness-Of-Pronunciation 자질을 이용한 영어 학습자의 조음 오류 진단
조음 Goodness-Of-Pronunciation 자질을 이용한 영어 학습자의 조음 오류 진단조음 Goodness-Of-Pronunciation 자질을 이용한 영어 학습자의 조음 오류 진단
조음 Goodness-Of-Pronunciation 자질을 이용한 영어 학습자의 조음 오류 진단NAVER Engineering
 
알파고 풀어보기 / Alpha Technical Review
알파고 풀어보기 / Alpha Technical Review알파고 풀어보기 / Alpha Technical Review
알파고 풀어보기 / Alpha Technical Review상은 박
 
Step-by-step approach to question answering
Step-by-step approach to question answeringStep-by-step approach to question answering
Step-by-step approach to question answeringNAVER Engineering
 
Finding connections among images using CycleGAN
Finding connections among images using CycleGANFinding connections among images using CycleGAN
Finding connections among images using CycleGANNAVER Engineering
 
RLCode와 A3C 쉽고 깊게 이해하기
RLCode와 A3C 쉽고 깊게 이해하기RLCode와 A3C 쉽고 깊게 이해하기
RLCode와 A3C 쉽고 깊게 이해하기Woong won Lee
 
1시간만에 GAN(Generative Adversarial Network) 완전 정복하기
1시간만에 GAN(Generative Adversarial Network) 완전 정복하기1시간만에 GAN(Generative Adversarial Network) 완전 정복하기
1시간만에 GAN(Generative Adversarial Network) 완전 정복하기NAVER Engineering
 
[2017 PYCON 튜토리얼]OpenAI Gym을 이용한 강화학습 에이전트 만들기
[2017 PYCON 튜토리얼]OpenAI Gym을 이용한 강화학습 에이전트 만들기[2017 PYCON 튜토리얼]OpenAI Gym을 이용한 강화학습 에이전트 만들기
[2017 PYCON 튜토리얼]OpenAI Gym을 이용한 강화학습 에이전트 만들기이 의령
 
알파고 (바둑 인공지능)의 작동 원리
알파고 (바둑 인공지능)의 작동 원리알파고 (바둑 인공지능)의 작동 원리
알파고 (바둑 인공지능)의 작동 원리Shane (Seungwhan) Moon
 
딥러닝과 강화 학습으로 나보다 잘하는 쿠키런 AI 구현하기 DEVIEW 2016
딥러닝과 강화 학습으로 나보다 잘하는 쿠키런 AI 구현하기 DEVIEW 2016딥러닝과 강화 학습으로 나보다 잘하는 쿠키런 AI 구현하기 DEVIEW 2016
딥러닝과 강화 학습으로 나보다 잘하는 쿠키런 AI 구현하기 DEVIEW 2016Taehoon Kim
 
Let Android dream electric sheep: Making emotion model for chat-bot with Pyth...
Let Android dream electric sheep: Making emotion model for chat-bot with Pyth...Let Android dream electric sheep: Making emotion model for chat-bot with Pyth...
Let Android dream electric sheep: Making emotion model for chat-bot with Pyth...Jeongkyu Shin
 
알아두면 쓸데있는 신기한 강화학습 NAVER 2017
알아두면 쓸데있는 신기한 강화학습 NAVER 2017알아두면 쓸데있는 신기한 강화학습 NAVER 2017
알아두면 쓸데있는 신기한 강화학습 NAVER 2017Taehoon Kim
 
what is_tabs_share
what is_tabs_sharewhat is_tabs_share
what is_tabs_shareNAVER D2
 
[132]웨일 브라우저 1년 그리고 미래
[132]웨일 브라우저 1년 그리고 미래[132]웨일 브라우저 1년 그리고 미래
[132]웨일 브라우저 1년 그리고 미래NAVER D2
 
[142] 생체 이해에 기반한 로봇 – 고성능 로봇에게 인간의 유연함과 안전성 부여하기
[142] 생체 이해에 기반한 로봇 – 고성능 로봇에게 인간의 유연함과 안전성 부여하기[142] 생체 이해에 기반한 로봇 – 고성능 로봇에게 인간의 유연함과 안전성 부여하기
[142] 생체 이해에 기반한 로봇 – 고성능 로봇에게 인간의 유연함과 안전성 부여하기NAVER D2
 

Viewers also liked (20)

딥러닝을 활용한 비디오 스토리 질의응답: 뽀로로QA와 심층 임베딩 메모리망
딥러닝을 활용한 비디오 스토리 질의응답: 뽀로로QA와 심층 임베딩 메모리망딥러닝을 활용한 비디오 스토리 질의응답: 뽀로로QA와 심층 임베딩 메모리망
딥러닝을 활용한 비디오 스토리 질의응답: 뽀로로QA와 심층 임베딩 메모리망
 
Introduction of Deep Reinforcement Learning
Introduction of Deep Reinforcement LearningIntroduction of Deep Reinforcement Learning
Introduction of Deep Reinforcement Learning
 
알파고 해부하기 1부
알파고 해부하기 1부알파고 해부하기 1부
알파고 해부하기 1부
 
Deep Learning, Where Are You Going?
Deep Learning, Where Are You Going?Deep Learning, Where Are You Going?
Deep Learning, Where Are You Going?
 
Multimodal Sequential Learning for Video QA
Multimodal Sequential Learning for Video QAMultimodal Sequential Learning for Video QA
Multimodal Sequential Learning for Video QA
 
바둑인을 위한 알파고
바둑인을 위한 알파고바둑인을 위한 알파고
바둑인을 위한 알파고
 
조음 Goodness-Of-Pronunciation 자질을 이용한 영어 학습자의 조음 오류 진단
조음 Goodness-Of-Pronunciation 자질을 이용한 영어 학습자의 조음 오류 진단조음 Goodness-Of-Pronunciation 자질을 이용한 영어 학습자의 조음 오류 진단
조음 Goodness-Of-Pronunciation 자질을 이용한 영어 학습자의 조음 오류 진단
 
알파고 풀어보기 / Alpha Technical Review
알파고 풀어보기 / Alpha Technical Review알파고 풀어보기 / Alpha Technical Review
알파고 풀어보기 / Alpha Technical Review
 
Step-by-step approach to question answering
Step-by-step approach to question answeringStep-by-step approach to question answering
Step-by-step approach to question answering
 
Finding connections among images using CycleGAN
Finding connections among images using CycleGANFinding connections among images using CycleGAN
Finding connections among images using CycleGAN
 
RLCode와 A3C 쉽고 깊게 이해하기
RLCode와 A3C 쉽고 깊게 이해하기RLCode와 A3C 쉽고 깊게 이해하기
RLCode와 A3C 쉽고 깊게 이해하기
 
1시간만에 GAN(Generative Adversarial Network) 완전 정복하기
1시간만에 GAN(Generative Adversarial Network) 완전 정복하기1시간만에 GAN(Generative Adversarial Network) 완전 정복하기
1시간만에 GAN(Generative Adversarial Network) 완전 정복하기
 
[2017 PYCON 튜토리얼]OpenAI Gym을 이용한 강화학습 에이전트 만들기
[2017 PYCON 튜토리얼]OpenAI Gym을 이용한 강화학습 에이전트 만들기[2017 PYCON 튜토리얼]OpenAI Gym을 이용한 강화학습 에이전트 만들기
[2017 PYCON 튜토리얼]OpenAI Gym을 이용한 강화학습 에이전트 만들기
 
알파고 (바둑 인공지능)의 작동 원리
알파고 (바둑 인공지능)의 작동 원리알파고 (바둑 인공지능)의 작동 원리
알파고 (바둑 인공지능)의 작동 원리
 
딥러닝과 강화 학습으로 나보다 잘하는 쿠키런 AI 구현하기 DEVIEW 2016
딥러닝과 강화 학습으로 나보다 잘하는 쿠키런 AI 구현하기 DEVIEW 2016딥러닝과 강화 학습으로 나보다 잘하는 쿠키런 AI 구현하기 DEVIEW 2016
딥러닝과 강화 학습으로 나보다 잘하는 쿠키런 AI 구현하기 DEVIEW 2016
 
Let Android dream electric sheep: Making emotion model for chat-bot with Pyth...
Let Android dream electric sheep: Making emotion model for chat-bot with Pyth...Let Android dream electric sheep: Making emotion model for chat-bot with Pyth...
Let Android dream electric sheep: Making emotion model for chat-bot with Pyth...
 
알아두면 쓸데있는 신기한 강화학습 NAVER 2017
알아두면 쓸데있는 신기한 강화학습 NAVER 2017알아두면 쓸데있는 신기한 강화학습 NAVER 2017
알아두면 쓸데있는 신기한 강화학습 NAVER 2017
 
what is_tabs_share
what is_tabs_sharewhat is_tabs_share
what is_tabs_share
 
[132]웨일 브라우저 1년 그리고 미래
[132]웨일 브라우저 1년 그리고 미래[132]웨일 브라우저 1년 그리고 미래
[132]웨일 브라우저 1년 그리고 미래
 
[142] 생체 이해에 기반한 로봇 – 고성능 로봇에게 인간의 유연함과 안전성 부여하기
[142] 생체 이해에 기반한 로봇 – 고성능 로봇에게 인간의 유연함과 안전성 부여하기[142] 생체 이해에 기반한 로봇 – 고성능 로봇에게 인간의 유연함과 안전성 부여하기
[142] 생체 이해에 기반한 로봇 – 고성능 로봇에게 인간의 유연함과 안전성 부여하기
 

Similar to Online video object segmentation via convolutional trident network

Cvpr 2018 papers review (efficient computing)
Cvpr 2018 papers review (efficient computing)Cvpr 2018 papers review (efficient computing)
Cvpr 2018 papers review (efficient computing)DonghyunKang12
 
Introduction to computer vision with Convoluted Neural Networks
Introduction to computer vision with Convoluted Neural NetworksIntroduction to computer vision with Convoluted Neural Networks
Introduction to computer vision with Convoluted Neural NetworksMarcinJedyk
 
Computer vision-nit-silchar-hackathon
Computer vision-nit-silchar-hackathonComputer vision-nit-silchar-hackathon
Computer vision-nit-silchar-hackathonAditya Bhattacharya
 
Introduction to computer vision
Introduction to computer visionIntroduction to computer vision
Introduction to computer visionMarcin Jedyk
 
Week5-Faster R-CNN.pptx
Week5-Faster R-CNN.pptxWeek5-Faster R-CNN.pptx
Week5-Faster R-CNN.pptxfahmi324663
 
Faster R-CNN - PR012
Faster R-CNN - PR012Faster R-CNN - PR012
Faster R-CNN - PR012Jinwon Lee
 
Object detection at night
Object detection at nightObject detection at night
Object detection at nightSanjay Crúzé
 
NVIDIA 深度學習教育機構 (DLI): Medical image segmentation using digits
NVIDIA 深度學習教育機構 (DLI): Medical image segmentation using digitsNVIDIA 深度學習教育機構 (DLI): Medical image segmentation using digits
NVIDIA 深度學習教育機構 (DLI): Medical image segmentation using digitsNVIDIA Taiwan
 
Using The New Flash Stage3D Web Technology To Build Your Own Next 3D Browser ...
Using The New Flash Stage3D Web Technology To Build Your Own Next 3D Browser ...Using The New Flash Stage3D Web Technology To Build Your Own Next 3D Browser ...
Using The New Flash Stage3D Web Technology To Build Your Own Next 3D Browser ...Daosheng Mu
 
深度學習在AOI的應用
深度學習在AOI的應用深度學習在AOI的應用
深度學習在AOI的應用CHENHuiMei
 
Deep Learning in Computer Vision
Deep Learning in Computer VisionDeep Learning in Computer Vision
Deep Learning in Computer VisionSungjoon Choi
 
HiPEAC 2019 Workshop - Use Cases
HiPEAC 2019 Workshop - Use CasesHiPEAC 2019 Workshop - Use Cases
HiPEAC 2019 Workshop - Use CasesTulipp. Eu
 
Wits presentation 6_28072015
Wits presentation 6_28072015Wits presentation 6_28072015
Wits presentation 6_28072015Beatrice van Eden
 
Cahall Final Intern Presentation
Cahall Final Intern PresentationCahall Final Intern Presentation
Cahall Final Intern PresentationDaniel Cahall
 
Elevation mapping using stereo vision enabled heterogeneous multi-agent robot...
Elevation mapping using stereo vision enabled heterogeneous multi-agent robot...Elevation mapping using stereo vision enabled heterogeneous multi-agent robot...
Elevation mapping using stereo vision enabled heterogeneous multi-agent robot...Aritra Sarkar
 
Introduction to deep learning
Introduction to deep learningIntroduction to deep learning
Introduction to deep learningVishwas Lele
 
“Understanding DNN-Based Object Detectors,” a Presentation from Au-Zone Techn...
“Understanding DNN-Based Object Detectors,” a Presentation from Au-Zone Techn...“Understanding DNN-Based Object Detectors,” a Presentation from Au-Zone Techn...
“Understanding DNN-Based Object Detectors,” a Presentation from Au-Zone Techn...Edge AI and Vision Alliance
 

Similar to Online video object segmentation via convolutional trident network (20)

Cvpr 2018 papers review (efficient computing)
Cvpr 2018 papers review (efficient computing)Cvpr 2018 papers review (efficient computing)
Cvpr 2018 papers review (efficient computing)
 
Introduction to computer vision with Convoluted Neural Networks
Introduction to computer vision with Convoluted Neural NetworksIntroduction to computer vision with Convoluted Neural Networks
Introduction to computer vision with Convoluted Neural Networks
 
Computer vision-nit-silchar-hackathon
Computer vision-nit-silchar-hackathonComputer vision-nit-silchar-hackathon
Computer vision-nit-silchar-hackathon
 
Introduction to computer vision
Introduction to computer visionIntroduction to computer vision
Introduction to computer vision
 
Week5-Faster R-CNN.pptx
Week5-Faster R-CNN.pptxWeek5-Faster R-CNN.pptx
Week5-Faster R-CNN.pptx
 
Faster R-CNN - PR012
Faster R-CNN - PR012Faster R-CNN - PR012
Faster R-CNN - PR012
 
IMAGE PROCESSING
IMAGE PROCESSINGIMAGE PROCESSING
IMAGE PROCESSING
 
Object detection at night
Object detection at nightObject detection at night
Object detection at night
 
NVIDIA 深度學習教育機構 (DLI): Medical image segmentation using digits
NVIDIA 深度學習教育機構 (DLI): Medical image segmentation using digitsNVIDIA 深度學習教育機構 (DLI): Medical image segmentation using digits
NVIDIA 深度學習教育機構 (DLI): Medical image segmentation using digits
 
Using The New Flash Stage3D Web Technology To Build Your Own Next 3D Browser ...
Using The New Flash Stage3D Web Technology To Build Your Own Next 3D Browser ...Using The New Flash Stage3D Web Technology To Build Your Own Next 3D Browser ...
Using The New Flash Stage3D Web Technology To Build Your Own Next 3D Browser ...
 
深度學習在AOI的應用
深度學習在AOI的應用深度學習在AOI的應用
深度學習在AOI的應用
 
Deep Learning in Computer Vision
Deep Learning in Computer VisionDeep Learning in Computer Vision
Deep Learning in Computer Vision
 
HiPEAC 2019 Workshop - Use Cases
HiPEAC 2019 Workshop - Use CasesHiPEAC 2019 Workshop - Use Cases
HiPEAC 2019 Workshop - Use Cases
 
Wits presentation 6_28072015
Wits presentation 6_28072015Wits presentation 6_28072015
Wits presentation 6_28072015
 
Cahall Final Intern Presentation
Cahall Final Intern PresentationCahall Final Intern Presentation
Cahall Final Intern Presentation
 
SPPNet
SPPNetSPPNet
SPPNet
 
Elevation mapping using stereo vision enabled heterogeneous multi-agent robot...
Elevation mapping using stereo vision enabled heterogeneous multi-agent robot...Elevation mapping using stereo vision enabled heterogeneous multi-agent robot...
Elevation mapping using stereo vision enabled heterogeneous multi-agent robot...
 
Introduction to deep learning
Introduction to deep learningIntroduction to deep learning
Introduction to deep learning
 
“Understanding DNN-Based Object Detectors,” a Presentation from Au-Zone Techn...
“Understanding DNN-Based Object Detectors,” a Presentation from Au-Zone Techn...“Understanding DNN-Based Object Detectors,” a Presentation from Au-Zone Techn...
“Understanding DNN-Based Object Detectors,” a Presentation from Au-Zone Techn...
 
NMSL_2017summer
NMSL_2017summerNMSL_2017summer
NMSL_2017summer
 

More from NAVER Engineering

디자인 시스템에 직방 ZUIX
디자인 시스템에 직방 ZUIX디자인 시스템에 직방 ZUIX
디자인 시스템에 직방 ZUIXNAVER Engineering
 
진화하는 디자인 시스템(걸음마 편)
진화하는 디자인 시스템(걸음마 편)진화하는 디자인 시스템(걸음마 편)
진화하는 디자인 시스템(걸음마 편)NAVER Engineering
 
서비스 운영을 위한 디자인시스템 프로젝트
서비스 운영을 위한 디자인시스템 프로젝트서비스 운영을 위한 디자인시스템 프로젝트
서비스 운영을 위한 디자인시스템 프로젝트NAVER Engineering
 
BPL(Banksalad Product Language) 무야호
BPL(Banksalad Product Language) 무야호BPL(Banksalad Product Language) 무야호
BPL(Banksalad Product Language) 무야호NAVER Engineering
 
이번 생에 디자인 시스템은 처음이라
이번 생에 디자인 시스템은 처음이라이번 생에 디자인 시스템은 처음이라
이번 생에 디자인 시스템은 처음이라NAVER Engineering
 
날고 있는 여러 비행기 넘나 들며 정비하기
날고 있는 여러 비행기 넘나 들며 정비하기날고 있는 여러 비행기 넘나 들며 정비하기
날고 있는 여러 비행기 넘나 들며 정비하기NAVER Engineering
 
쏘카프레임 구축 배경과 과정
 쏘카프레임 구축 배경과 과정 쏘카프레임 구축 배경과 과정
쏘카프레임 구축 배경과 과정NAVER Engineering
 
플랫폼 디자이너 없이 디자인 시스템을 구축하는 프로덕트 디자이너의 우당탕탕 고통 연대기
플랫폼 디자이너 없이 디자인 시스템을 구축하는 프로덕트 디자이너의 우당탕탕 고통 연대기플랫폼 디자이너 없이 디자인 시스템을 구축하는 프로덕트 디자이너의 우당탕탕 고통 연대기
플랫폼 디자이너 없이 디자인 시스템을 구축하는 프로덕트 디자이너의 우당탕탕 고통 연대기NAVER Engineering
 
200820 NAVER TECH CONCERT 15_Code Review is Horse(코드리뷰는 말이야)(feat.Latte)
200820 NAVER TECH CONCERT 15_Code Review is Horse(코드리뷰는 말이야)(feat.Latte)200820 NAVER TECH CONCERT 15_Code Review is Horse(코드리뷰는 말이야)(feat.Latte)
200820 NAVER TECH CONCERT 15_Code Review is Horse(코드리뷰는 말이야)(feat.Latte)NAVER Engineering
 
200819 NAVER TECH CONCERT 03_화려한 코루틴이 내 앱을 감싸네! 코루틴으로 작성해보는 깔끔한 비동기 코드
200819 NAVER TECH CONCERT 03_화려한 코루틴이 내 앱을 감싸네! 코루틴으로 작성해보는 깔끔한 비동기 코드200819 NAVER TECH CONCERT 03_화려한 코루틴이 내 앱을 감싸네! 코루틴으로 작성해보는 깔끔한 비동기 코드
200819 NAVER TECH CONCERT 03_화려한 코루틴이 내 앱을 감싸네! 코루틴으로 작성해보는 깔끔한 비동기 코드NAVER Engineering
 
200819 NAVER TECH CONCERT 10_맥북에서도 아이맥프로에서 빌드하는 것처럼 빌드 속도 빠르게 하기
200819 NAVER TECH CONCERT 10_맥북에서도 아이맥프로에서 빌드하는 것처럼 빌드 속도 빠르게 하기200819 NAVER TECH CONCERT 10_맥북에서도 아이맥프로에서 빌드하는 것처럼 빌드 속도 빠르게 하기
200819 NAVER TECH CONCERT 10_맥북에서도 아이맥프로에서 빌드하는 것처럼 빌드 속도 빠르게 하기NAVER Engineering
 
200819 NAVER TECH CONCERT 08_성능을 고민하는 슬기로운 개발자 생활
200819 NAVER TECH CONCERT 08_성능을 고민하는 슬기로운 개발자 생활200819 NAVER TECH CONCERT 08_성능을 고민하는 슬기로운 개발자 생활
200819 NAVER TECH CONCERT 08_성능을 고민하는 슬기로운 개발자 생활NAVER Engineering
 
200819 NAVER TECH CONCERT 05_모르면 손해보는 Android 디버깅/분석 꿀팁 대방출
200819 NAVER TECH CONCERT 05_모르면 손해보는 Android 디버깅/분석 꿀팁 대방출200819 NAVER TECH CONCERT 05_모르면 손해보는 Android 디버깅/분석 꿀팁 대방출
200819 NAVER TECH CONCERT 05_모르면 손해보는 Android 디버깅/분석 꿀팁 대방출NAVER Engineering
 
200819 NAVER TECH CONCERT 09_Case.xcodeproj - 좋은 동료로 거듭나기 위한 노하우
200819 NAVER TECH CONCERT 09_Case.xcodeproj - 좋은 동료로 거듭나기 위한 노하우200819 NAVER TECH CONCERT 09_Case.xcodeproj - 좋은 동료로 거듭나기 위한 노하우
200819 NAVER TECH CONCERT 09_Case.xcodeproj - 좋은 동료로 거듭나기 위한 노하우NAVER Engineering
 
200820 NAVER TECH CONCERT 14_야 너두 할 수 있어. 비전공자, COBOL 개발자를 거쳐 네이버에서 FE 개발하게 된...
200820 NAVER TECH CONCERT 14_야 너두 할 수 있어. 비전공자, COBOL 개발자를 거쳐 네이버에서 FE 개발하게 된...200820 NAVER TECH CONCERT 14_야 너두 할 수 있어. 비전공자, COBOL 개발자를 거쳐 네이버에서 FE 개발하게 된...
200820 NAVER TECH CONCERT 14_야 너두 할 수 있어. 비전공자, COBOL 개발자를 거쳐 네이버에서 FE 개발하게 된...NAVER Engineering
 
200820 NAVER TECH CONCERT 13_네이버에서 오픈 소스 개발을 통해 성장하는 방법
200820 NAVER TECH CONCERT 13_네이버에서 오픈 소스 개발을 통해 성장하는 방법200820 NAVER TECH CONCERT 13_네이버에서 오픈 소스 개발을 통해 성장하는 방법
200820 NAVER TECH CONCERT 13_네이버에서 오픈 소스 개발을 통해 성장하는 방법NAVER Engineering
 
200820 NAVER TECH CONCERT 12_상반기 네이버 인턴을 돌아보며
200820 NAVER TECH CONCERT 12_상반기 네이버 인턴을 돌아보며200820 NAVER TECH CONCERT 12_상반기 네이버 인턴을 돌아보며
200820 NAVER TECH CONCERT 12_상반기 네이버 인턴을 돌아보며NAVER Engineering
 
200820 NAVER TECH CONCERT 11_빠르게 성장하는 슈퍼루키로 거듭나기
200820 NAVER TECH CONCERT 11_빠르게 성장하는 슈퍼루키로 거듭나기200820 NAVER TECH CONCERT 11_빠르게 성장하는 슈퍼루키로 거듭나기
200820 NAVER TECH CONCERT 11_빠르게 성장하는 슈퍼루키로 거듭나기NAVER Engineering
 
200819 NAVER TECH CONCERT 07_신입 iOS 개발자 개발업무 적응기
200819 NAVER TECH CONCERT 07_신입 iOS 개발자 개발업무 적응기200819 NAVER TECH CONCERT 07_신입 iOS 개발자 개발업무 적응기
200819 NAVER TECH CONCERT 07_신입 iOS 개발자 개발업무 적응기NAVER Engineering
 

More from NAVER Engineering (20)

React vac pattern
React vac patternReact vac pattern
React vac pattern
 
디자인 시스템에 직방 ZUIX
디자인 시스템에 직방 ZUIX디자인 시스템에 직방 ZUIX
디자인 시스템에 직방 ZUIX
 
진화하는 디자인 시스템(걸음마 편)
진화하는 디자인 시스템(걸음마 편)진화하는 디자인 시스템(걸음마 편)
진화하는 디자인 시스템(걸음마 편)
 
서비스 운영을 위한 디자인시스템 프로젝트
서비스 운영을 위한 디자인시스템 프로젝트서비스 운영을 위한 디자인시스템 프로젝트
서비스 운영을 위한 디자인시스템 프로젝트
 
BPL(Banksalad Product Language) 무야호
BPL(Banksalad Product Language) 무야호BPL(Banksalad Product Language) 무야호
BPL(Banksalad Product Language) 무야호
 
이번 생에 디자인 시스템은 처음이라
이번 생에 디자인 시스템은 처음이라이번 생에 디자인 시스템은 처음이라
이번 생에 디자인 시스템은 처음이라
 
날고 있는 여러 비행기 넘나 들며 정비하기
날고 있는 여러 비행기 넘나 들며 정비하기날고 있는 여러 비행기 넘나 들며 정비하기
날고 있는 여러 비행기 넘나 들며 정비하기
 
쏘카프레임 구축 배경과 과정
 쏘카프레임 구축 배경과 과정 쏘카프레임 구축 배경과 과정
쏘카프레임 구축 배경과 과정
 
플랫폼 디자이너 없이 디자인 시스템을 구축하는 프로덕트 디자이너의 우당탕탕 고통 연대기
플랫폼 디자이너 없이 디자인 시스템을 구축하는 프로덕트 디자이너의 우당탕탕 고통 연대기플랫폼 디자이너 없이 디자인 시스템을 구축하는 프로덕트 디자이너의 우당탕탕 고통 연대기
플랫폼 디자이너 없이 디자인 시스템을 구축하는 프로덕트 디자이너의 우당탕탕 고통 연대기
 
200820 NAVER TECH CONCERT 15_Code Review is Horse(코드리뷰는 말이야)(feat.Latte)
200820 NAVER TECH CONCERT 15_Code Review is Horse(코드리뷰는 말이야)(feat.Latte)200820 NAVER TECH CONCERT 15_Code Review is Horse(코드리뷰는 말이야)(feat.Latte)
200820 NAVER TECH CONCERT 15_Code Review is Horse(코드리뷰는 말이야)(feat.Latte)
 
200819 NAVER TECH CONCERT 03_화려한 코루틴이 내 앱을 감싸네! 코루틴으로 작성해보는 깔끔한 비동기 코드
200819 NAVER TECH CONCERT 03_화려한 코루틴이 내 앱을 감싸네! 코루틴으로 작성해보는 깔끔한 비동기 코드200819 NAVER TECH CONCERT 03_화려한 코루틴이 내 앱을 감싸네! 코루틴으로 작성해보는 깔끔한 비동기 코드
200819 NAVER TECH CONCERT 03_화려한 코루틴이 내 앱을 감싸네! 코루틴으로 작성해보는 깔끔한 비동기 코드
 
200819 NAVER TECH CONCERT 10_맥북에서도 아이맥프로에서 빌드하는 것처럼 빌드 속도 빠르게 하기
200819 NAVER TECH CONCERT 10_맥북에서도 아이맥프로에서 빌드하는 것처럼 빌드 속도 빠르게 하기200819 NAVER TECH CONCERT 10_맥북에서도 아이맥프로에서 빌드하는 것처럼 빌드 속도 빠르게 하기
200819 NAVER TECH CONCERT 10_맥북에서도 아이맥프로에서 빌드하는 것처럼 빌드 속도 빠르게 하기
 
200819 NAVER TECH CONCERT 08_성능을 고민하는 슬기로운 개발자 생활
200819 NAVER TECH CONCERT 08_성능을 고민하는 슬기로운 개발자 생활200819 NAVER TECH CONCERT 08_성능을 고민하는 슬기로운 개발자 생활
200819 NAVER TECH CONCERT 08_성능을 고민하는 슬기로운 개발자 생활
 
200819 NAVER TECH CONCERT 05_모르면 손해보는 Android 디버깅/분석 꿀팁 대방출
200819 NAVER TECH CONCERT 05_모르면 손해보는 Android 디버깅/분석 꿀팁 대방출200819 NAVER TECH CONCERT 05_모르면 손해보는 Android 디버깅/분석 꿀팁 대방출
200819 NAVER TECH CONCERT 05_모르면 손해보는 Android 디버깅/분석 꿀팁 대방출
 
200819 NAVER TECH CONCERT 09_Case.xcodeproj - 좋은 동료로 거듭나기 위한 노하우
200819 NAVER TECH CONCERT 09_Case.xcodeproj - 좋은 동료로 거듭나기 위한 노하우200819 NAVER TECH CONCERT 09_Case.xcodeproj - 좋은 동료로 거듭나기 위한 노하우
200819 NAVER TECH CONCERT 09_Case.xcodeproj - 좋은 동료로 거듭나기 위한 노하우
 
200820 NAVER TECH CONCERT 14_야 너두 할 수 있어. 비전공자, COBOL 개발자를 거쳐 네이버에서 FE 개발하게 된...
200820 NAVER TECH CONCERT 14_야 너두 할 수 있어. 비전공자, COBOL 개발자를 거쳐 네이버에서 FE 개발하게 된...200820 NAVER TECH CONCERT 14_야 너두 할 수 있어. 비전공자, COBOL 개발자를 거쳐 네이버에서 FE 개발하게 된...
200820 NAVER TECH CONCERT 14_야 너두 할 수 있어. 비전공자, COBOL 개발자를 거쳐 네이버에서 FE 개발하게 된...
 
200820 NAVER TECH CONCERT 13_네이버에서 오픈 소스 개발을 통해 성장하는 방법
200820 NAVER TECH CONCERT 13_네이버에서 오픈 소스 개발을 통해 성장하는 방법200820 NAVER TECH CONCERT 13_네이버에서 오픈 소스 개발을 통해 성장하는 방법
200820 NAVER TECH CONCERT 13_네이버에서 오픈 소스 개발을 통해 성장하는 방법
 
200820 NAVER TECH CONCERT 12_상반기 네이버 인턴을 돌아보며
200820 NAVER TECH CONCERT 12_상반기 네이버 인턴을 돌아보며200820 NAVER TECH CONCERT 12_상반기 네이버 인턴을 돌아보며
200820 NAVER TECH CONCERT 12_상반기 네이버 인턴을 돌아보며
 
200820 NAVER TECH CONCERT 11_빠르게 성장하는 슈퍼루키로 거듭나기
200820 NAVER TECH CONCERT 11_빠르게 성장하는 슈퍼루키로 거듭나기200820 NAVER TECH CONCERT 11_빠르게 성장하는 슈퍼루키로 거듭나기
200820 NAVER TECH CONCERT 11_빠르게 성장하는 슈퍼루키로 거듭나기
 
200819 NAVER TECH CONCERT 07_신입 iOS 개발자 개발업무 적응기
200819 NAVER TECH CONCERT 07_신입 iOS 개발자 개발업무 적응기200819 NAVER TECH CONCERT 07_신입 iOS 개발자 개발업무 적응기
200819 NAVER TECH CONCERT 07_신입 iOS 개발자 개발업무 적응기
 

Recently uploaded

Vertex AI Gemini Prompt Engineering Tips
Vertex AI Gemini Prompt Engineering TipsVertex AI Gemini Prompt Engineering Tips
Vertex AI Gemini Prompt Engineering TipsMiki Katsuragi
 
Pigging Solutions in Pet Food Manufacturing
Pigging Solutions in Pet Food ManufacturingPigging Solutions in Pet Food Manufacturing
Pigging Solutions in Pet Food ManufacturingPigging Solutions
 
AI as an Interface for Commercial Buildings
AI as an Interface for Commercial BuildingsAI as an Interface for Commercial Buildings
AI as an Interface for Commercial BuildingsMemoori
 
Unleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding ClubUnleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding ClubKalema Edgar
 
Human Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR SystemsHuman Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR SystemsMark Billinghurst
 
SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024Scott Keck-Warren
 
Artificial intelligence in cctv survelliance.pptx
Artificial intelligence in cctv survelliance.pptxArtificial intelligence in cctv survelliance.pptx
Artificial intelligence in cctv survelliance.pptxhariprasad279825
 
Powerpoint exploring the locations used in television show Time Clash
Powerpoint exploring the locations used in television show Time ClashPowerpoint exploring the locations used in television show Time Clash
Powerpoint exploring the locations used in television show Time Clashcharlottematthew16
 
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024BookNet Canada
 
Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 365Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 3652toLead Limited
 
My Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 PresentationMy Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 PresentationRidwan Fadjar
 
WordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your BrandWordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your Brandgvaughan
 
My INSURER PTE LTD - Insurtech Innovation Award 2024
My INSURER PTE LTD - Insurtech Innovation Award 2024My INSURER PTE LTD - Insurtech Innovation Award 2024
My INSURER PTE LTD - Insurtech Innovation Award 2024The Digital Insurer
 
Gen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdfGen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdfAddepto
 
Install Stable Diffusion in windows machine
Install Stable Diffusion in windows machineInstall Stable Diffusion in windows machine
Install Stable Diffusion in windows machinePadma Pradeep
 
Streamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project SetupStreamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project SetupFlorian Wilhelm
 
Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!Commit University
 
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...Patryk Bandurski
 

Recently uploaded (20)

Vertex AI Gemini Prompt Engineering Tips
Vertex AI Gemini Prompt Engineering TipsVertex AI Gemini Prompt Engineering Tips
Vertex AI Gemini Prompt Engineering Tips
 
Pigging Solutions in Pet Food Manufacturing
Pigging Solutions in Pet Food ManufacturingPigging Solutions in Pet Food Manufacturing
Pigging Solutions in Pet Food Manufacturing
 
AI as an Interface for Commercial Buildings
AI as an Interface for Commercial BuildingsAI as an Interface for Commercial Buildings
AI as an Interface for Commercial Buildings
 
Unleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding ClubUnleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding Club
 
Human Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR SystemsHuman Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR Systems
 
E-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptx
E-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptxE-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptx
E-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptx
 
SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024
 
Artificial intelligence in cctv survelliance.pptx
Artificial intelligence in cctv survelliance.pptxArtificial intelligence in cctv survelliance.pptx
Artificial intelligence in cctv survelliance.pptx
 
Powerpoint exploring the locations used in television show Time Clash
Powerpoint exploring the locations used in television show Time ClashPowerpoint exploring the locations used in television show Time Clash
Powerpoint exploring the locations used in television show Time Clash
 
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
 
Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 365Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 365
 
My Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 PresentationMy Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 Presentation
 
WordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your BrandWordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your Brand
 
My INSURER PTE LTD - Insurtech Innovation Award 2024
My INSURER PTE LTD - Insurtech Innovation Award 2024My INSURER PTE LTD - Insurtech Innovation Award 2024
My INSURER PTE LTD - Insurtech Innovation Award 2024
 
Gen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdfGen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdf
 
Install Stable Diffusion in windows machine
Install Stable Diffusion in windows machineInstall Stable Diffusion in windows machine
Install Stable Diffusion in windows machine
 
Streamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project SetupStreamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project Setup
 
Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!
 
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
 
DMCC Future of Trade Web3 - Special Edition
DMCC Future of Trade Web3 - Special EditionDMCC Future of Trade Web3 - Special Edition
DMCC Future of Trade Web3 - Special Edition
 

Online video object segmentation via convolutional trident network

  • 1. Online Video Object Segmentation via Convolutional Trident Network Seminar at Naver Won-Dong Jang Korea University 2017-08-22
  • 3. Video object segmentation • Clustering pixels in videos into objects or background • Unsupervised • Supervised • Semi-supervised Segment track
  • 4. Video object segmentation • Unsupervised video object segmentation • Discover and segment a primary object in a video • Without user annotations Input videos Ground-truth segmentation labels
  • 5. Video object segmentation • Unsupervised video object segmentation • Saliency detection-based approach • Visual saliency • Motion saliency Examples of saliency detection
  • 6. Video object segmentation • Supervised video object segmentation • User can annotate mislabeled pixels in any frames • Interaction between an algorithm and a user Annotation at the first frame Segmentation results after adding the annotation Segmentation results Additional user annotation
  • 7. Video object segmentation • Semi-supervised video object segmentation • Discover and segment a primary object in a video • With user annotations in the first frame Annotated target object in the first frame Segmentation results
  • 8. Related works • Object proposal-based algorithm • Generate object proposals in each frame • Select object proposals that are similar to the target object [2015][ICCV][Perazzi] Fully Connected Object Proposals for Video Segmentation Generated object proposals Segment track
  • 9. Related works • Superpixel-based algorithm • Over-segment each frame into superpixels • Trace the target object based on the inter-frame matching [2015][CVPR][Wen] Joint Online Tracking and Segmentation Inter-frame matching Segment track
  • 11. In this presentation • A semi-supervised online segmentation algorithm • Online method • Offline techniques require a huge memory space for a long video • Deep learning-based approach • Connection between a convolutional neural network and an MRF optimization strategy • Remarkable performance on the DAVIS benchmark dataset
  • 12. Overview • Framework • Inter-frame propagation • Inference via convolutional trident network (CTN) • Yields three tailored probability maps for the MRF optimization • MRF optimization
  • 13. Inter-frame propagation • Propagation from the previous frame • To roughly locate the target object in the current frame • Using backward optical flow • From 𝑡 to 𝑡 − 1 Segmentation label map for frame 𝑡 − 1 Backward motion (optical flow) Propagation map for frame 𝑡
  • 14. Convolutional trident network • Infer segmentation information • The propagation map may be inaccurate • The inferred information is effectively used to solve a binary labeling problem
  • 15. Network architecture • Encoder-decoder architecture • Single encoder • VGG encoder • Three decoders • Separative decoder • Definite foreground decoder • Definite background decoder
  • 16. Network architecture • Encoder-decoder architecture • Single encoder • VGG encoder • Three decoders • Separative decoder • Definite foreground decoder • Definite background decoder
  • 17. Network architecture • VGG encoder • Trained for image classification • Using ImageNet dataset 22K categories 14M images
  • 18. Network architecture • VGG encoder • Trained for image classification • Using ImageNet dataset Convolutional layers Fully connected layers Fox Cat Lion Dog Zebra Umbrella Tank Lamp Desk Orange Kite …
  • 19. Network architecture • Separative decoder (SD) • Separate a target object from the background • Using a down-sampled foreground propagation patch Conv1_1 Conv1_2 Pooling Conv2_1 Conv2_2 Pooling Conv3_1 Conv3_2 Conv3_3 Pooling Conv4_1 Conv4_2 Conv4_3 Pooling Conv5_1 Conv5_2 Conv5_3 SD-Dec2 SD-Dec3 SD-Dec4 SD-Dec5 SD-Pred SD-Dec1 Unpooling Unpooling Unpooling Unpooling Skip connections Image patch Separative probability patch Foreground propagation patch
  • 20. Network architecture • Definite foreground and background decoders • Definite pixels indicate locations that should be labeled as the foreground or the background indubitably • Fixing labels in definite pixels improves labeling accuracies • Inspired by the image matting problem Input image Tri-map Matting result
  • 21. Network architecture • Definite foreground decoder (DFD) • Identifies definite foreground pixels Conv1_1 Conv1_2 Pooling Conv2_1 Conv2_2 Pooling Conv3_1 Conv3_2 Conv3_3 Pooling Conv4_1 Conv4_2 Conv4_3 Pooling Conv5_1 Conv5_2 Conv5_3 DFD-Dec2 DFD-Dec3 DFD-Dec4 DFD-Dec5 DFD-Pred DFD-Dec1 Unpooling Unpooling Unpooling Unpooling Image patch Definite foreground probability patch Foreground propagation patch
  • 22. Network architecture • Definite background decoder (DBD) • Finds definite background pixels Conv1_1 Conv1_2 Pooling Conv2_1 Conv2_2 Pooling Conv3_1 Conv3_2 Conv3_3 Pooling Conv4_1 Conv4_2 Conv4_3 Pooling Conv5_1 Conv5_2 Conv5_3 DBD-Dec2 DBD-Dec3 DBD-Dec4 DBD-Dec5 DBD-Pred DBD-Dec1 Unpooling Unpooling Unpooling Unpooling Image patch Definite background probability patch Background propagation patch
  • 23. Network architecture • Implementation issues in decoders • Prediction layer • Sigmoid layer is used to yield normalized outputs within [0, 1] • Rectified linear unit (ReLU) + Batch normalization • Kernel size • 3 ×3 kernels are used in the prediction layers • 5 × 5 kernels are used in the other convolution layers
  • 24. Training phase • Lack of video object segmentation dataset • There are several datasets for video object segmentation • However, each of them consists of a small number of videos from 12 to 59 • Instead, • We use the PASCAL VOC 2012 dataset • Object segmentation • 26,844 object masks • 11,355 images
  • 25. Training phase • Preprocessing of training data • Ground-truth masks for the DFD and DBD are not available • Hence, we generate them through simple image processing
  • 26. Training phase • Preprocessing of training data • Degrade the objet mask to imitate propagation errors • By performing suppression and noise addition • Synthesize the ground-truth masks for the DFD and DBD • By applying erosion and dilation
  • 27. Training phase • Implementation issues • Caffe library • Cross-entropy losses − 1 𝑛 ෍ 𝑛=1 𝑁 𝑝 𝑛 log Ƹ𝑝 𝑛 + 1 − 𝑝 𝑛 log 1 − Ƹ𝑝 𝑛 • Minibatch • with eight training data • Learning rate • 1e-3 for the first 55 epochs • 1e-4 for the next 35 epochs • Stochastic gradient descent
  • 28. Inference phase • Set input data • By cropping the frame and propagation map • CTN outputs three probability patches • Separative probability map 𝑅S • Definite foreground probability map 𝑅F • Definite background probability map 𝑅B Frame 𝑡 Segmentation label at frame 𝑡 − 1 Optical flow Inter-frame propagationInput at frame 𝑡 Propagation map at frame 𝑡 Inference via convolutional encoder-decoder network Encoder Background propagation patch Foreground propagation patch Image patch Definite background decoder Definite foreground decoder Separative decoder Definite background probability patch Definite foreground probability patch Separative probability patch
  • 29. Inference phase • Classification • Separative probability map 𝑅S • If 𝑅S(𝐩) > 𝜃sep, pixel 𝐩 is classified as the foreground • ℒ be the coordinate set for such foreground pixels • Definite foreground probability map 𝑅F • If 𝑅F(𝐩) > 𝜃def, pixel 𝐩 is classified as the definite foreground • ℱ denotes the set of the definite foreground pixels • Definite background probability map 𝑅B • If 𝑅B(𝐩) > 𝜃def, pixel 𝐩 is classified as the definite background • ℬ indicates the set of the definite background pixels
  • 30. MRF optimization • Solve two-class MRF optimization problem • To improve the segmentation quality further • Define a graph 𝐺 = (𝑁, 𝐸) • Nodes are pixels in the current frame • Each node is connected to its four neighbors by edges
  • 31. MRF optimization • MRF energy function ℰ 𝑆 = ෍ 𝐩∈𝑁 𝒟 𝐩, 𝑆 + 𝛾 × ෍ 𝐩,𝐪 ∈𝐸 𝒵 𝐩, 𝐪, 𝑆 Definite background probability patch Definite foreground probability patch Image patch Separative probability patch Unary cost 𝒟 - Returns extremely high costs on DF and DB pixels when they have background and foreground labels Pairwise cost 𝒵 - Encourage neighboring pixels to have the same label
  • 32. MRF optimization • Unary cost computation • Build the RGB color Gaussian mixture models (GMMs) of the foreground and the background, respectively • 𝐾 = 10 for both GMMs • Use pixels in ℒ to construct the foreground GMMs • Use pixels in ℒ 𝑐 to construct the background GMMs • Gaussian cost 𝜓 𝐩, 𝑠 = min 𝑘 − log 𝑓 𝐩; ℳ𝑠,𝑘 • Unary cost 𝒟 𝐩, 𝑆 = ൞ ∞ if 𝑝 ∈ ℱ and 𝑆 𝑝 = 0 ∞ if 𝑝 ∈ ℱ and 𝑆 𝑝 = 0 𝜓 𝐩, 𝑆(𝐩) otherwise
  • 33. MRF optimization • Pairwise cost computation • Pairwise cost 𝒵 𝐩, 𝐪, 𝑆 = ቊ exp −𝑑 𝐩, 𝐪 if 𝑆 𝐩 ≠ 𝑆 𝐪 0 otherwise • Graph-cut optimization ℰ 𝑆 = ෍ 𝐩∈𝑁 𝒟 𝐩, 𝑆 + 𝛾 × ෍ 𝐩,𝐪 ∈𝐸 𝒵 𝐩, 𝐪, 𝑆
  • 34. Reappearing object detection • Identification of reappearing parts • A target object may disappear and be occluded by other objects • Use backward-forward motion consistency
  • 36. Experimental results • The DAVIS benchmark dataset • 50 videos • 854 × 480 resolution • Number of frames • From 25 to 104 • Difficulties • Fast motion • Occlusion • Object deformation
  • 37. Experimental results • Performance measures • Region similarity • Jaccard index • Contour accuracy • F-measure • Statistics • Mean • Recall • Decay
  • 40. Experimental results • The SegTrack dataset • 5 videos • Performance comparison • Jaccard index
  • 41. Experimental results • Ablation studies • Effectiveness of each decoder • Efficacy of MRF optimization
  • 42. Experimental results • Running time analysis • Prop-Q • Uses the state-of-the-art optical flow technique • Prop-F • Adopts a much faster optical flow technique • Parameter selection
  • 43. Conclusions • A semi-supervised online video object segmentation algorithm is introduced in this presentation • Deep learning-based semi-supervised video object segmentation algorithm • Tailored network for MRF optimization • Remarkable performance on the DAVIS dataset • Q&A • Thank you