Deep Convolutional Neural Fields for Depth Estimation from a Single Image
Fayao Liu, Chunhua Shen, Guosheng Lin
University of Adelaide, Australia; Australian Centre for Robotic Vision
2016/8/11
Australian Centre for Robotic Vision
• University of Adelaide, Australia
• Chunhua Shen: compressive sensing, tracking, detection, … (Weibo: @沈春華_ADL)
• Fayao Liu (PhD student): depth estimation, image segmentation, CRF learning
• Guosheng Lin: graphical models, hashing
Depth Estimation in Monocular Images
No reliable depth cue:
• No stereo correspondence
• No motion in videos
Previous works
• Enforcing geometric assumptions
  – Hedau et al., ECCV 2010
  – Lee et al., NIPS 2010
  – Gupta et al., ECCV 2010
• Non-parametric methods: candidate image retrieval + scene alignment + depth inference
  – Karsch et al., PAMI 2014
Contributions
Propose to formulate depth estimation as a deep continuous CRF learning problem, without relying on any geometric priors or any extra information:
– joint training of a deep CNN and a graphical model
– the partition function can be calculated analytically, so the log-likelihood can be optimized directly
– the gradients can be calculated exactly in back-propagation training
– inference (a MAP problem) has a closed-form solution
– unary and pairwise potentials of the CRF are trained jointly
Overview
• 𝐱: image
• 𝐲 = (𝑦1, …, 𝑦𝑛) ∈ ℝ^𝑛: continuous depth values corresponding to all 𝑛 superpixels in 𝐱
• conditional probability distribution of the data, where Z(𝐱) is the partition function
• Inference: a maximum a posteriori (MAP) problem
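The probability model and the MAP inference on this slide were equation images in the original deck; a reconstruction from the deck's own definitions (𝐲 the superpixel depths, E the energy introduced on the next slide):

```latex
\Pr(\mathbf{y}\mid\mathbf{x}) \;=\; \frac{1}{Z(\mathbf{x})}\,
  \exp\bigl(-E(\mathbf{y},\mathbf{x})\bigr),
\qquad
Z(\mathbf{x}) \;=\; \int \exp\bigl(-E(\mathbf{y},\mathbf{x})\bigr)\,\mathrm{d}\mathbf{y},
\qquad
\mathbf{y}^{\ast} \;=\; \operatorname*{arg\,max}_{\mathbf{y}} \Pr(\mathbf{y}\mid\mathbf{x}).
```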
Energy Function
• Typical combination of unary and pairwise potentials
• 𝑈 regresses the depth from a single superpixel
• 𝑉 encourages smoothness between neighboring superpixels
• 𝑈 and 𝑉 are jointly learned in a unified CNN framework
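The energy itself was shown as an image; reconstructed from the paper's formulation, with 𝑧_𝑝 the unary network's regressed depth for superpixel 𝑝 and 𝑅_𝑝𝑞 the pairwise network output:

```latex
E(\mathbf{y},\mathbf{x}) \;=\; \sum_{p} U(y_p,\mathbf{x})
  \;+\; \sum_{(p,q)} V(y_p,y_q,\mathbf{x}),
\qquad
U(y_p,\mathbf{x}) = (y_p - z_p)^2,
\qquad
V(y_p,y_q,\mathbf{x}) = \tfrac{1}{2}\,R_{pq}\,(y_p - y_q)^2.
```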
Framework
[architecture figure: unary part, pairwise part, and CRF loss layer]
Unary Potential
• Regress the depth value of each superpixel using a least-squares loss
• [figure: a CNN maps each 224 × 224 superpixel patch to a depth prediction, compared against the ground truth]
Pairwise Potential
• Pairwise potentials are constructed from 𝐾 types of similarity observations
• Here 𝑅𝑝𝑞 is the output of the network
• Only 1 fully connected layer (without activation)
Pairwise Potential
• Only 1 fully connected layer (without activation)
• 𝑆𝑝𝑞^(𝑘): the 𝑘th similarity type, 𝑆𝑝𝑞^(𝑘) = exp(−𝛾‖𝑠𝑝^(𝑘) − 𝑠𝑞^(𝑘)‖)
• 3 types are used in the paper:
  – color difference
  – color histogram difference
  – LBP texture disparity
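A minimal sketch of this pairwise head: the K similarities are computed from hand-crafted features and fed through a single fully connected layer with no activation. The function names, γ, and β values below are illustrative, not from the deck:

```python
import numpy as np

def similarity(s_p, s_q, gamma=1.0):
    # One similarity observation: S_pq^(k) = exp(-gamma * ||s_p^(k) - s_q^(k)||)
    return np.exp(-gamma * np.linalg.norm(np.asarray(s_p) - np.asarray(s_q)))

def pairwise_R(feats_p, feats_q, beta, gamma=1.0):
    # Single fully connected layer without activation:
    #   R_pq = beta . [S_pq^(1), ..., S_pq^(K)]
    # feats_p, feats_q: K feature vectors for superpixels p and q
    # (in the paper: color, color histogram, and LBP texture features)
    S = np.array([similarity(sp, sq, gamma) for sp, sq in zip(feats_p, feats_q)])
    return float(beta @ S)
```

With identical features for p and q, every similarity is 1 and R_pq reduces to the sum of the layer weights.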
Learning
The Energy Function
• The energy
• For ease of expression, we introduce:
  – 𝐈, the 𝑛 × 𝑛 identity matrix
  – 𝐑, the matrix composed of 𝑅𝑝𝑞
  – 𝐃, a diagonal matrix with 𝐷𝑝𝑝 = Σ𝑞 𝑅𝑝𝑞
• We have
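The matrix form that follows from these definitions (shown as an image in the slides) is:

```latex
\mathbf{A} \;=\; \mathbf{I} + \mathbf{D} - \mathbf{R},
\qquad
E(\mathbf{y},\mathbf{x}) \;=\; \mathbf{y}^{\top}\mathbf{A}\,\mathbf{y}
  \;-\; 2\,\mathbf{z}^{\top}\mathbf{y} \;+\; \mathbf{z}^{\top}\mathbf{z}.
```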
Partition and Conditional Probability Distribution
• Recall that
• and the energy
• Due to the quadratic terms in 𝐲 and the positive definiteness of 𝐀, we have
• Gaussian integral (𝑛-dimensional, with a linear term)
• Hence the conditional probability distribution is
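Carrying out that Gaussian integral with the matrix form of the energy (the slide's equations were images; reconstructed here) gives:

```latex
Z(\mathbf{x}) \;=\; \int \exp\!\bigl(-\mathbf{y}^{\top}\mathbf{A}\mathbf{y}
    + 2\,\mathbf{z}^{\top}\mathbf{y} - \mathbf{z}^{\top}\mathbf{z}\bigr)\,\mathrm{d}\mathbf{y}
\;=\; \frac{\pi^{n/2}}{\lvert\mathbf{A}\rvert^{1/2}}
  \exp\!\bigl(\mathbf{z}^{\top}\mathbf{A}^{-1}\mathbf{z} - \mathbf{z}^{\top}\mathbf{z}\bigr),
```

and hence

```latex
\Pr(\mathbf{y}\mid\mathbf{x}) \;=\;
  \frac{\lvert\mathbf{A}\rvert^{1/2}}{\pi^{n/2}}
  \exp\!\bigl(-\mathbf{y}^{\top}\mathbf{A}\mathbf{y}
    + 2\,\mathbf{z}^{\top}\mathbf{y}
    - \mathbf{z}^{\top}\mathbf{A}^{-1}\mathbf{z}\bigr).
```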
Negative log-likelihood
• Given
• The negative log-likelihood is
• During learning, we minimize the negative log-likelihood of the training data with regularization:
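Taking the negative log of the distribution above (the slide's equation was an image) yields, up to the stated constants:

```latex
-\log \Pr(\mathbf{y}\mid\mathbf{x}) \;=\;
  \mathbf{y}^{\top}\mathbf{A}\,\mathbf{y} \;-\; 2\,\mathbf{z}^{\top}\mathbf{y}
  \;+\; \mathbf{z}^{\top}\mathbf{A}^{-1}\mathbf{z}
  \;-\; \tfrac{1}{2}\log\lvert\mathbf{A}\rvert \;+\; \tfrac{n}{2}\log\pi.
```

The training objective sums this over the N training images with a weight-decay regularizer on the network parameters θ (the precise regularizer is as in the paper; λ is a hyperparameter):

```latex
\min_{\theta}\;
  -\sum_{i=1}^{N} \log \Pr\bigl(\mathbf{y}^{(i)}\mid\mathbf{x}^{(i)}\bigr)
  \;+\; \frac{\lambda}{2}\,\lVert\theta\rVert_2^2.
```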
Partial Derivatives
• We then calculate the partial derivatives of the negative log-likelihood
• where 𝐉 = ∂𝐀/∂𝑅𝑝𝑞 is an 𝑛 × 𝑛 matrix
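Differentiating the negative log-likelihood above (the slide's equations were images; reconstructed from that expression) gives:

```latex
\frac{\partial\,(-\log \Pr)}{\partial \mathbf{z}}
  \;=\; 2\bigl(\mathbf{A}^{-1}\mathbf{z} - \mathbf{y}\bigr),
\qquad
\frac{\partial\,(-\log \Pr)}{\partial R_{pq}}
  \;=\; \mathbf{y}^{\top}\mathbf{J}\,\mathbf{y}
  \;-\; \mathbf{z}^{\top}\mathbf{A}^{-1}\mathbf{J}\,\mathbf{A}^{-1}\mathbf{z}
  \;-\; \tfrac{1}{2}\,\mathrm{tr}\bigl(\mathbf{A}^{-1}\mathbf{J}\bigr),
```

where, since 𝐀 = 𝐈 + 𝐃 − 𝐑, the matrix 𝐉 = ∂𝐀/∂𝑅𝑝𝑞 has entries 1 at (𝑝,𝑝) and (𝑞,𝑞), −1 at (𝑝,𝑞) and (𝑞,𝑝), and 0 elsewhere.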
Inference
Depth Prediction
• Prediction solves the MAP inference, for which a closed-form solution exists
• Discussion: if 𝑅𝑝𝑞 = 0 (discarding the pairwise term), then 𝐲* = 𝐳, which is a conventional regression model
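The closed-form MAP solution follows from the quadratic energy: minimizing 𝐲ᵀ𝐀𝐲 − 2𝐳ᵀ𝐲 gives 𝐲* = 𝐀⁻¹𝐳. A small NumPy sketch (function name illustrative):

```python
import numpy as np

def map_inference(z, R):
    # Closed-form MAP prediction of the continuous CRF:
    #   y* = argmax_y Pr(y|x) = A^{-1} z,   A = I + D - R,
    # where D is diagonal with D_pp = sum_q R_pq.
    # z: unary outputs, shape (n,); R: symmetric nonnegative pairwise outputs, (n, n)
    z = np.asarray(z, dtype=float)
    R = np.asarray(R, dtype=float)
    n = z.shape[0]
    A = np.eye(n) + np.diag(R.sum(axis=1)) - R
    # Solve A y = z rather than forming A^{-1} explicitly (more stable)
    return np.linalg.solve(A, z)
```

With R = 0 this reduces to y* = z, the plain regression case noted on the slide; a nonzero R_pq pulls neighboring depths toward each other.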
Experiment Datasets
• Make3D: outdoor scene reconstruction
  – 534 images
• NYU v2: indoor scene reconstruction
  – 1449 RGBD images (795 training; 654 testing)
Evaluation Protocols
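The protocol table on this slide was an image. The metrics conventionally reported for this task, and used in the paper, are average relative error, mean log10 error, and root mean squared error; a sketch (function name illustrative):

```python
import numpy as np

def depth_metrics(pred, gt):
    # Standard single-image depth-estimation metrics:
    #   rel   - average relative error:   mean(|pred - gt| / gt)
    #   log10 - mean absolute log10 error: mean(|log10(pred) - log10(gt)|)
    #   rms   - root mean squared error:  sqrt(mean((pred - gt)^2))
    pred = np.asarray(pred, dtype=float)
    gt = np.asarray(gt, dtype=float)
    rel = float(np.mean(np.abs(pred - gt) / gt))
    log10 = float(np.mean(np.abs(np.log10(pred) - np.log10(gt))))
    rms = float(np.sqrt(np.mean((pred - gt) ** 2)))
    return rel, log10, rms
```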
Baseline Comparisons
• NYU v2
• Make3D
Make3D
[qualitative result figures]
NYU v2
[qualitative result figures]
Thank you.
Deep Convolutional Neural Fields for Depth Estimation from a Single Image
