Successfully reported this slideshow.
We use your LinkedIn profile and activity data to personalize ads and to show you more relevant ads. You can change your ad preferences anytime.
DeepVO
Towards End-to-End Visual Odometry with Deep
Recurrent Convolutional Neural Networks
National Chung Cheng Universit...
About this work
DeepVO : Towards Visual Odometry with Deep Learning
Sen Wang1,2, Ronald Clark2, Hongkai Wen2 and Niki Trig...
Contributions
1. Proving that
Monocular VO could
be build by End-to-
End training
2. RCNN architecture
could generalized t...
Related works
4
Visual odometry
Geometric
Sparse Direct
Learning
Related works
Sparse
 PTAM
 ORB-SLAM
Direct
 DTAM
5
Network
 CNN
 RNN
 LSTM
Network design
1. Traditional computer vision learn knowledge from
appearance and image context
2. Visual odometry should ...
Network design
7
DeepVO : Towards Visual Odometry with Deep Learning
8
DeepVO : Towards Visual Odometry with Deep Learning
Preprocessing
 Normalizing inputs (speed up training)
=> subtracting the mean RGB values of the
training set
 Resize ima...
CNN
 What this research mean by learning
“geometric” feature?
=> They stacking two RGB images and feed it
into CNN. Expec...
RNN
 RNN is not suitable to directly learn sequential
representation from high-dimensional raw
data, such as images.
 Hi...
LSTM (Long short-term memory)
12
DeepVO : Towards Visual Odometry with Deep Learning
Need depth to
learn high level
repres...
13
DeepVO : Towards Visual Odometry with Deep Learning
14
Cost function
𝜃∗
= argmin
𝜃
1
𝑁
෍
𝑖=1
𝑁
෍
𝑘=1
𝑡
Ƹ𝑝 𝑘 − 𝑝 𝑘 2
2
+ 𝜘 ො𝜑 𝑘 − 𝜑 𝑘 2
2
Conditional probability of pose
𝑝 𝑌𝑡 ...
Experimental results
DeepVO
VISO2
15
Training & testing
1. Dataset: KITTI VO/SLAM benchmark
(22 sequences of images / 10fps / dynamic object)
2. 7410 training ...
overfitting
 Orientation is more
prone to overfitting
17
DeepVO : Towards Visual Odometry with Deep Learning
Compare with
traditional VO
 Open-source VO library
LIBVISO2
 Monocular / Stereo
18
DeepVO : Towards Visual Odometry wit...
Trajectory (1/2)
19
DeepVO : Towards Visual Odometry with Deep Learning
Trajectory (2/2)
 No ground truth:
Seq11~19
20
DeepVO : Towards Visual Odometry with Deep Learning
21
DeepVO : Towards Visual Odometry with Deep Learning
Dynamic
 This research don’t
know how to deal
with this issue
 Traditional VO –
RANSAC (remove
outlier)
 Get more train...
Conclusion
23
 End-to-end monocular VO based on Deep learning
 Deep RCNN
 No need to carefully tune the parameters of t...
Upcoming SlideShare
Loading in …5
×

DeepVO - Towards Visual Odometry with Deep Learning

Author:
Sen Wang1,2, Ronald Clark2, Hongkai Wen2 and Niki Trigoni2
1. Edinburgh Centre for Robotics, Heriot-Watt University, UK
2. University of Oxford, UK
Download this paper: http://senwang.gitlab.io/DeepVO/#paper
Watch video: http://senwang.gitlab.io/DeepVO/#video

  • Be the first to comment

DeepVO - Towards Visual Odometry with Deep Learning

  1. 1. DeepVO Towards End-to-End Visual Odometry with Deep Recurrent Convolutional Neural Networks National Chung Cheng University, Taiwan Robot Vision Laboratory 2017/11/08 Jacky Liu
  2. 2. About this work DeepVO : Towards Visual Odometry with Deep Learning Sen Wang1,2, Ronald Clark2, Hongkai Wen2 and Niki Trigoni2 1. Edinburgh Centre for Robotics, Heriot-Watt University, UK 2. University of Oxford, UK Download this paper: http://senwang.gitlab.io/DeepVO/#paper Watch video: http://senwang.gitlab.io/DeepVO/#video 2 DeepVO : Towards Visual Odometry with Deep Learning
  3. 3. Contributions 1. Proving that Monocular VO could be build by End-to- End training 2. RCNN architecture could generalized to unseen environment 3. Complex movement could be modeled by RCNN 3 DeepVO : Towards Visual Odometry with Deep Learning
  4. 4. Related works 4 Visual odometry Geometric Sparse Direct Learning
  5. 5. Related works Sparse  PTAM  ORB-SLAM Direct  DTAM 5 Network  CNN  RNN  LSTM
  6. 6. Network design 1. Traditional computer vision learn knowledge from appearance and image context 2. Visual odometry should learn from geometry. This is what RCNN tried to address 6 DeepVO : Towards Visual Odometry with Deep Learning
  7. 7. Network design 7 DeepVO : Towards Visual Odometry with Deep Learning
  8. 8. 8 DeepVO : Towards Visual Odometry with Deep Learning
  9. 9. Preprocessing  Normalizing inputs (speed up training) => subtracting the mean RGB values of the training set  Resize image to 64x  Stack two images to form a tensor 9 DeepVO : Towards Visual Odometry with Deep Learning
  10. 10. CNN  What this research mean by learning “geometric” feature? => They stacking two RGB images and feed it into CNN. Expecting the network to perform feature extraction on the concatenation of two consecutive monocular RGB images. 10 DeepVO : Towards Visual Odometry with Deep Learning
  11. 11. RNN  RNN is not suitable to directly learn sequential representation from high-dimensional raw data, such as images.  Hidden state: ℎ 𝑘 = ℋ 𝑊𝑥ℎ 𝑥 𝑘 + 𝑊ℎℎℎ 𝑘−1 + 𝑏ℎ  Output: 𝑦 𝑘 = 𝑊ℎ𝑦ℎ 𝑘 + 𝑏 𝑦 11 DeepVO : Towards Visual Odometry with Deep Learning 𝑏: bias vector𝑊: weight matrix 𝑘: time index ℋ: activation function Vanishing gradient problem
  12. 12. LSTM (Long short-term memory) 12 DeepVO : Towards Visual Odometry with Deep Learning Need depth to learn high level representation
  13. 13. 13 DeepVO : Towards Visual Odometry with Deep Learning
  14. 14. 14 Cost function 𝜃∗ = argmin 𝜃 1 𝑁 ෍ 𝑖=1 𝑁 ෍ 𝑘=1 𝑡 Ƹ𝑝 𝑘 − 𝑝 𝑘 2 2 + 𝜘 ො𝜑 𝑘 − 𝜑 𝑘 2 2 Conditional probability of pose 𝑝 𝑌𝑡 𝑋𝑡 = 𝑝(𝑦1, … , 𝑦𝑡|𝑥1, … , 𝑥𝑡) 𝜃∗ = argmin 𝜃 𝑝(𝑌𝑡|𝑋𝑡; 𝜃) Ground truth pose (𝑝 𝑘, 𝜑 𝑘) = (position, orientation) 𝑠𝑐𝑎𝑙𝑒 𝑓𝑎𝑐𝑡𝑜𝑟
  15. 15. Experimental results DeepVO VISO2 15
  16. 16. Training & testing 1. Dataset: KITTI VO/SLAM benchmark (22 sequences of images / 10fps / dynamic object) 2. 7410 training samples (image and trajectory pair) 3. Implemented based on Theano 4. Hardware: Nvidia Tesla K40 GPU 5. 200 epochs 6. Learning rate 0.001 7. Regularization: dropout / early stopping 8. CNN: transfer learning from FlowNet 16
  17. 17. overfitting  Orientation is more prone to overfitting 17 DeepVO : Towards Visual Odometry with Deep Learning
  18. 18. Compare with traditional VO  Open-source VO library LIBVISO2  Monocular / Stereo 18 DeepVO : Towards Visual Odometry with Deep Learning
  19. 19. Trajectory (1/2) 19 DeepVO : Towards Visual Odometry with Deep Learning
  20. 20. Trajectory (2/2)  No ground truth: Seq11~19 20 DeepVO : Towards Visual Odometry with Deep Learning
  21. 21. 21 DeepVO : Towards Visual Odometry with Deep Learning
  22. 22. Dynamic  This research don’t know how to deal with this issue  Traditional VO – RANSAC (remove outlier)  Get more training data 22 DeepVO : Towards Visual Odometry with Deep Learning
  23. 23. Conclusion 23  End-to-end monocular VO based on Deep learning  Deep RCNN  No need to carefully tune the parameters of the VO system  It is not expected as a replacement to the classic geometry based approach

×