Deep Learning for Video: Object Detection & Segmentation (UPC 2018)

https://mcv-m6-video.github.io/deepvideo-2018/

Overview of deep learning solutions for video processing. Part of a series of slides covering topics like action recognition, action detection, object tracking, object detection, scene segmentation, language and learning from videos.

Prepared for the Master in Computer Vision Barcelona:
http://pagines.uab.cat/mcv/

Published in: Data & Analytics

  1. Xavier Giró-i-Nieto (@DocXavi), Module 6, Deep Learning for Video: Object Detection & Segmentation, 22nd March 2018 [http://pagines.uab.cat/mcv/]
  2. Deep Learning online courses by UPC: ● MSc course (2017) ● BSc course (2018) ● 1st edition (2016) ● 2nd edition (2017) ● 3rd edition (2018) ● 1st edition (2017) ● 2nd edition (2018). Next edition Autumn 2018. Next edition Winter/Spring 2019. Summer School (late June 2018).
  3. Video Object Detection [ILSVRC 2015 Slides and videos] [Challenge in Kaggle]
  4. Objects: Object Detection. Video Object Detection = Intra-frame Localization + Inter-frame Tracking
  5. Object Detection: T-CNN. Kang, Kai, Hongsheng Li, Junjie Yan, Xingyu Zeng, Bin Yang, Tong Xiao, Cong Zhang et al. "T-CNN: Tubelets with convolutional neural networks for object detection from videos." TCSVT 2017. [code]
  10. Object Detection: T-CNN. Long-term temporal consistency is obtained by running a tracking algorithm between short-term tubelets. (Kang et al., TCSVT 2017) [code]
  12. Feichtenhofer, Christoph, Axel Pinz, and Andrew Zisserman. "Detect to track and track to detect." ICCV 2017.
  13. Object Detection: Detect & Track. Convolutional cross-correlations between feature responses of adjacent frames. Feichtenhofer, Christoph, Axel Pinz, and Andrew Zisserman. "Detect to track and track to detect." ICCV 2017. [talk]
  15. Segmentation: Two-stream & Conv-GRU. Temporal memory implemented with Conv-GRU. Tokmakov, Pavel, Karteek Alahari, and Cordelia Schmid. "Learning video object segmentation with visual memory." ICCV 2017.
  16. Segmentation: Deep Feature Flow. Deep features computed on sparse key frames are propagated to neighbouring frames using the optical flow estimated by a lightweight network. Zhu, Xizhou, Yuwen Xiong, Jifeng Dai, Lu Yuan, and Yichen Wei. "Deep feature flow for video recognition." CVPR 2017.
  17. Segmentation: Unsupervised. Pathak, Deepak, Ross Girshick, Piotr Dollár, Trevor Darrell, and Bharath Hariharan. "Learning features by watching objects move." CVPR 2017.
  18. Questions?
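
Slide 4 decomposes video object detection into intra-frame localization plus inter-frame tracking. A minimal sketch of that decomposition, assuming the per-frame boxes already come from some detector and that a greedy overlap test is an acceptable stand-in for a tracker (the slides do not prescribe one). The names `iou`, `link_detections` and `min_iou` are hypothetical.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def link_detections(frames, min_iou=0.5):
    """Greedily link per-frame boxes into tracks.

    frames: list of lists of boxes, one inner list per frame.
    Returns a list of tracks; each track is a list of (frame_index, box).
    """
    tracks = []   # every track ever started
    active = []   # tracks that were extended in the previous frame
    for t, boxes in enumerate(frames):
        unmatched = list(boxes)
        next_active = []
        for track in active:
            if not unmatched:
                break
            last_box = track[-1][1]
            best = max(unmatched, key=lambda b: iou(last_box, b))
            if iou(last_box, best) >= min_iou:
                track.append((t, best))
                unmatched.remove(best)
                next_active.append(track)
        for box in unmatched:  # boxes with no match start new tracks
            track = [(t, box)]
            tracks.append(track)
            next_active.append(track)
        active = next_active
    return tracks


# Two objects in frames 0 and 1; only the first is detected in frame 2.
frames = [
    [(10, 10, 50, 50), (100, 100, 140, 140)],
    [(12, 11, 52, 51), (101, 102, 141, 142)],
    [(14, 12, 54, 52)],
]
tracks = link_detections(frames)
```

Greedy matching keeps the example short; a real system would more likely solve the assignment jointly and use appearance as well as overlap.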
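
Slide 10 notes that long-term consistency comes from running a tracking algorithm between short-term tubelets. The slide does not say which tracker is used, so the sketch below substitutes a simple rule, purely as an illustration: two tubelets are joined when the second starts shortly after the first ends and their boundary boxes overlap. `stitch_tubelets`, `min_iou` and `max_gap` are hypothetical names.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    union = ((a[2] - a[0]) * (a[3] - a[1])
             + (b[2] - b[0]) * (b[3] - b[1]) - inter)
    return inter / union if union > 0 else 0.0


def stitch_tubelets(tubelets, min_iou=0.5, max_gap=2):
    """Merge short tubelets into longer ones.

    Each tubelet is a list of (frame_index, box) sorted by frame index.
    A tubelet is appended to an existing one when it starts at most
    `max_gap` frames after that one ends and the boundary boxes overlap.
    """
    merged = []
    for tubelet in sorted(tubelets, key=lambda tb: tb[0][0]):
        start_frame, start_box = tubelet[0]
        for long_tubelet in merged:
            end_frame, end_box = long_tubelet[-1]
            gap = start_frame - end_frame
            if 0 < gap <= max_gap and iou(end_box, start_box) >= min_iou:
                long_tubelet.extend(tubelet)
                break
        else:  # no compatible tubelet found: keep it as its own track
            merged.append(list(tubelet))
    return merged


short_tubelets = [
    [(0, (10, 10, 50, 50)), (1, (12, 10, 52, 50))],
    [(2, (14, 10, 54, 50)), (3, (16, 10, 56, 50))],
    [(0, (200, 200, 240, 240)), (1, (201, 200, 241, 240))],
]
long_tubelets = stitch_tubelets(short_tubelets)
```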
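
Slide 13 mentions cross-correlations between the feature responses of adjacent frames. One way to picture this is a local correlation: for every position in frame t, take the dot product of its feature vector with the feature vectors in a small neighbourhood of frame t+1. The sketch below assumes tiny nested-list feature maps and a hypothetical `max_disp` parameter; it illustrates the operation and is not the authors' implementation.

```python
def local_correlation(feat_a, feat_b, max_disp=1):
    """Correlate two feature maps over a local neighbourhood.

    feat_a, feat_b: nested lists shaped [channels][height][width].
    Returns corr[i][j][p][q], the dot product between the feature vector
    of feat_a at (i, j) and that of feat_b at
    (i + p - max_disp, j + q - max_disp). Positions outside the map
    contribute zero.
    """
    channels = len(feat_a)
    height, width = len(feat_a[0]), len(feat_a[0][0])
    size = 2 * max_disp + 1
    corr = [[[[0.0] * size for _ in range(size)]
             for _ in range(width)] for _ in range(height)]
    for i in range(height):
        for j in range(width):
            for p in range(size):
                for q in range(size):
                    y, x = i + p - max_disp, j + q - max_disp
                    if 0 <= y < height and 0 <= x < width:
                        corr[i][j][p][q] = sum(
                            feat_a[c][i][j] * feat_b[c][y][x]
                            for c in range(channels))
    return corr


# A single-channel blob that moves one pixel to the right between frames.
frame_t = [[[0, 0, 0], [0, 1, 0], [0, 0, 0]]]
frame_t1 = [[[0, 0, 0], [0, 0, 1], [0, 0, 0]]]
corr = local_correlation(frame_t, frame_t1)

# Strongest response at the blob's position, converted to an offset.
best = max(((p, q) for p in range(3) for q in range(3)),
           key=lambda pq: corr[1][1][pq[0]][pq[1]])
displacement = (best[0] - 1, best[1] - 1)  # (rows, columns)
```

The peak of the correlation volume at each position hints at how the content moved, which is the cue a tracking branch can regress box displacements from.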
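
Slide 15 states that the temporal memory is implemented with a Conv-GRU, that is, a GRU whose matrix products are replaced by convolutions so the hidden state stays a spatial map. The sketch below follows one common GRU formulation, reduced to single-channel maps and plain Python for readability. The weight keys (`xz`, `hz`, `xr`, `hr`, `xn`, `hn`) and the toy kernel values are assumptions for illustration.

```python
import math


def conv2d_same(image, kernel):
    """Zero-padded 2-D cross-correlation that keeps the input size."""
    height, width = len(image), len(image[0])
    k = len(kernel) // 2
    out = [[0.0] * width for _ in range(height)]
    for i in range(height):
        for j in range(width):
            total = 0.0
            for di in range(-k, k + 1):
                for dj in range(-k, k + 1):
                    y, x = i + di, j + dj
                    if 0 <= y < height and 0 <= x < width:
                        total += image[y][x] * kernel[di + k][dj + k]
            out[i][j] = total
    return out


def combine(fn, *maps):
    """Apply fn element-wise across equally sized 2-D maps."""
    return [[fn(*vals) for vals in zip(*rows)] for rows in zip(*maps)]


def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))


def conv_gru_step(x, h, w):
    """One Conv-GRU update: x is the input map, h the previous state."""
    # Update gate: how much of the state to overwrite.
    z = combine(lambda a, b: sigmoid(a + b),
                conv2d_same(x, w["xz"]), conv2d_same(h, w["hz"]))
    # Reset gate: how much of the old state feeds the candidate.
    r = combine(lambda a, b: sigmoid(a + b),
                conv2d_same(x, w["xr"]), conv2d_same(h, w["hr"]))
    gated_h = combine(lambda a, b: a * b, r, h)
    cand = combine(lambda a, b: math.tanh(a + b),
                   conv2d_same(x, w["xn"]), conv2d_same(gated_h, w["hn"]))
    # Blend the old state and the candidate, weighted by the update gate.
    return combine(lambda zi, hi, ci: (1.0 - zi) * hi + zi * ci, z, h, cand)


kernel = [[0.0, 0.1, 0.0], [0.1, 0.2, 0.1], [0.0, 0.1, 0.0]]
weights = {key: kernel for key in ("xz", "hz", "xr", "hr", "xn", "hn")}
frames = [[[0.0, 1.0], [0.0, 0.0]], [[0.0, 0.0], [1.0, 0.0]]]
state = [[0.0, 0.0], [0.0, 0.0]]
for frame in frames:  # the state carries information across frames
    state = conv_gru_step(frame, state, weights)
```

In practice the maps have many channels and the kernels are learned; a deep learning framework would replace the explicit loops.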
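
Slide 16 describes propagating deep features from sparse frames to their neighbours with optical flow. The propagation step can be pictured as a warp: each position in the current frame reads the key-frame feature at the location the flow points to, with bilinear interpolation. The sketch below assumes the flow field is already given (in the paper it is estimated by a network) and uses a single channel for brevity. `warp_features` and the `(dy, dx)` flow layout are hypothetical.

```python
import math


def warp_features(key_feat, flow):
    """Warp a key-frame feature map to the current frame.

    key_feat: 2-D list [height][width] (one channel for brevity).
    flow: 2-D list of (dy, dx) pairs; the value at current-frame position
    (i, j) is read from key-frame position (i + dy, j + dx) with bilinear
    interpolation. Samples outside the map count as zero.
    """
    height, width = len(key_feat), len(key_feat[0])

    def at(y, x):
        inside = 0 <= y < height and 0 <= x < width
        return key_feat[y][x] if inside else 0.0

    out = [[0.0] * width for _ in range(height)]
    for i in range(height):
        for j in range(width):
            dy, dx = flow[i][j]
            y, x = i + dy, j + dx
            y0, x0 = math.floor(y), math.floor(x)
            wy, wx = y - y0, x - x0  # fractional parts
            out[i][j] = ((1 - wy) * (1 - wx) * at(y0, x0)
                         + (1 - wy) * wx * at(y0, x0 + 1)
                         + wy * (1 - wx) * at(y0 + 1, x0)
                         + wy * wx * at(y0 + 1, x0 + 1))
    return out


key_feat = [[1.0, 2.0, 3.0],
            [4.0, 5.0, 6.0]]

# Content moved one pixel to the right, so each position looks one
# pixel to the left in the key frame.
shift_right = [[(0, -1)] * 3 for _ in range(2)]
warped = warp_features(key_feat, shift_right)

# Half-pixel motion exercises the bilinear interpolation.
half_shift = [[(0, -0.5)] * 3 for _ in range(2)]
warped_half = warp_features(key_feat, half_shift)
```

The appeal of this design is cost: the expensive feature network runs only on key frames, while the other frames need just the flow estimate and a warp.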
