Paper reading 2015


Guosheng Lin won a Google PhD Fellowship in 2014.

Here E is the energy and Z is the partition function.

In general, Z is difficult to compute. However, in this paper the CRF model is continuous, since the depth values are continuous, and under certain conditions Z can be calculated analytically. We will discuss this later.

N is the set of all superpixels, and S is the set of edges in the graphical model.

- 1. Deep Convolutional Neural Fields for Depth Estimation from a Single Image Fayao Liu, Chunhua Shen, Guosheng Lin University of Adelaide, Australia; Australian Centre for Robotic Vision 2016/8/11 1
- 2. Australian Centre for Robotic Vision • University of Adelaide, Australia Chunhua Shen Compressive sensing, tracking, detection, … Weibo: @沈春華_ADL Fayao Liu (PhD student) Depth estimation, image segmentation, CRF learning Guosheng Lin Graphical models, hashing
- 3. Depth Estimation in Monocular Images No reliable depth cue • No stereo correspondence • No motion in videos
- 4. Previous works • Enforcing geometric assumptions – Hedau et al. ECCV 2010 – Lee et al. NIPS 2010 – Gupta et al. ECCV 2010 • Non-parametric methods – Candidate image retrieval + scene alignment + depth inference – Karsch et al. PAMI 2014
- 5. Contributions Propose to formulate depth estimation as a deep continuous CRF learning problem, without relying on any geometric priors or extra information – joint training of a deep CNN and a graphical model – the partition function can be analytically calculated, so the log-likelihood can be optimized directly – the gradients can be exactly calculated in back-propagation training – inference (the MAP problem) is in closed form – unary and pairwise potentials of the CRF are jointly trained
- 6. Overview • x: image • y = (y_1, …, y_n) ∈ R^n: continuous depth values corresponding to all n superpixels in x • conditional probability distribution of the data • Z(x) is the partition function
- 7. Overview • conditional probability distribution of the data • Z(x) is the partition function • Inference: maximum a posteriori (MAP) problem
- 8. Energy Function • Typical combination of unary and pairwise potentials • U regresses the depth from a single superpixel • V encourages smoothness between neighboring superpixels • U and V are jointly learned in a unified CNN framework
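The energy on this slide can be sketched numerically. A minimal NumPy version, assuming z holds the unary CNN regressions and R the symmetric, non-negative pairwise network outputs (both hypothetical inputs here):

```python
import numpy as np

def crf_energy(y, z, R):
    """E(y, x) = sum_p (z_p - y_p)^2 + 1/2 * sum_{(p,q)} R_pq (y_p - y_q)^2.

    y : (n,) candidate depths for the n superpixels
    z : (n,) depths regressed by the unary CNN part
    R : (n, n) symmetric pairwise weights, zero for non-neighbors;
        the pairwise sum runs over ordered pairs, so each edge counts twice.
    """
    unary = np.sum((z - y) ** 2)
    diff = y[:, None] - y[None, :]          # all pairwise depth differences
    pairwise = 0.5 * np.sum(R * diff ** 2)  # smoothness between neighbors
    return unary + pairwise
```

Counting each edge in both directions is a convention choice; it is what makes this elementwise form match the matrix form A = I + D − R on the learning slides.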
- 9. Framework Unary part Pairwise part CRF loss layer
- 10. Unary Potential • Regress the depth value of each superpixel using a least squares loss Ground-truth prediction 224 × 224
- 11. Pairwise Potential • Pairwise potentials are constructed from K types of similarity observations • Here R_pq is the output of the network • Only 1 fully connected layer (without activation)
- 12. Pairwise Potential • Only 1 fully connected layer (without activation) • S_pq^(k): kth similarity type, S_pq^(k) = exp(−γ‖s_p^(k) − s_q^(k)‖) • 3 types are used in the paper – color difference – color histogram difference – LBP texture disparity
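A sketch of how one similarity type can be computed and fed through the single fully connected layer; the feature vectors and the parameter name `gamma` are illustrative, not the paper's exact pipeline:

```python
import numpy as np

def similarity_matrix(feats, gamma=1.0):
    """S_pq^(k) = exp(-gamma * ||s_p^(k) - s_q^(k)||) for one feature type k.

    feats : (n, d) one descriptor per superpixel
            (e.g. mean color, color histogram, or LBP vector).
    """
    dist = np.linalg.norm(feats[:, None, :] - feats[None, :, :], axis=-1)
    return np.exp(-gamma * dist)

def pairwise_output(sims, beta):
    """R_pq = beta^T [S_pq^(1), ..., S_pq^(K)]: one FC layer, no activation."""
    return sum(b * S for b, S in zip(beta, sims))
```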
- 13. Learning
- 14. The Energy Function • The energy • For ease of expression, we introduce – I is the n × n identity matrix – R is the matrix composed of R_pq – D is a diagonal matrix with D_pp = Σ_q R_pq • We have A = I + D − R
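The matrices above translate directly into code. A sketch assuming R is the symmetric matrix of pairwise outputs: since R_pq ≥ 0, D − R is a graph Laplacian (positive semidefinite), so A = I + D − R is positive definite, which is what makes the Gaussian integral on the next slide finite:

```python
import numpy as np

def build_A(R):
    """A = I + D - R, with D diagonal and D_pp = sum_q R_pq."""
    D = np.diag(R.sum(axis=1))
    return np.eye(R.shape[0]) + D - R
```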
- 15. Partition Function and Conditional Probability Distribution • Recall that • and the energy • Due to the quadratic terms in y and the positive definiteness of A, we have • Gaussian integral (n-dimensional, with linear term) • Hence the conditional probability distribution is
- 16. Negative Log-Likelihood • Given • the negative log-likelihood is • During learning, we minimize the negative log-likelihood of the training data with regularization:
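With A positive definite, the partition function is a Gaussian integral and the negative log-likelihood has a closed form. A sketch of that expression (the constant (n/2)·log π is kept for completeness; regularization is omitted):

```python
import numpy as np

def neg_log_likelihood(y, z, A):
    """-log Pr(y|x) = y'Ay - 2 z'y + z'A^{-1}z - 1/2 log|A| + (n/2) log(pi)."""
    n = y.shape[0]
    sign, logdet = np.linalg.slogdet(A)     # stable log-determinant
    assert sign > 0, "A must be positive definite"
    return (y @ A @ y - 2.0 * z @ y + z @ np.linalg.solve(A, z)
            - 0.5 * logdet + 0.5 * n * np.log(np.pi))
```

With R = 0 (so A = I) this reduces to the plain least squares loss ‖y − z‖² plus a constant, tying the CRF view back to the unary regression.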
- 17. Partial Derivatives • We then calculate the partial derivatives of the negative log-likelihood • where J is an n × n matrix with elements
- 18. Inference
- 19. Depth Prediction • Prediction solves the MAP inference, for which a closed-form solution exists • Discussion: if R_pq = 0 (discarding the pairwise term), then y* = z, which is a conventional regression model.
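The closed-form MAP solution y* = A^{-1} z in code; a small sketch using a linear solve rather than an explicit inverse:

```python
import numpy as np

def predict_depths(z, R):
    """MAP inference: y* = A^{-1} z with A = I + D - R (D_pp = sum_q R_pq)."""
    A = np.eye(len(z)) + np.diag(R.sum(axis=1)) - R
    return np.linalg.solve(A, z)
```

As the slide notes, with R = 0 this returns z itself, i.e. plain unary regression; with non-zero R, neighboring depths are pulled toward each other.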
- 20. Experiments Datasets • Make3D: outdoor scene reconstruction – 534 images • NYU v2: indoor scene reconstruction – 1449 RGB-D images (795 training; 654 testing)
- 21. Evaluation Protocols
- 22. Baseline Comparisons • NYU v2 • Make3D
- 23. Make3D
- 24. Make3D
- 25. NYU v2
- 26. NYU v2
- 27. Thank you. Deep Convolutional Neural Fields for Depth Estimation from a Single Image
