DEPTH FUSION FROM RGB AND
DEPTH SENSORS IV
Yu Huang
Yu.huang07@gmail.com
Sunnyvale, California
Outline
■ Single-Photon 3D Imaging with Deep Sensor Fusion
■ Deep RGB-D Canonical Correlation Analysis For Sparse Depth Completion
■ Confidence Propagation through CNNs for Guided Sparse Depth Regression
■ Learning Guided Convolutional Network for Depth Completion
■ DFineNet: Ego-Motion Estimation and Depth Refinement from Sparse, Noisy Depth Input with RGB Guidance
■ PLIN: A Network for Pseudo-LiDAR Point Cloud Interpolation
■ Depth Completion from Sparse LiDAR Data with Depth-Normal Constraints
Single-Photon 3D Imaging with Deep
Sensor Fusion
■ Active illumination time-of-flight sensors have become widely used to estimate a 3D representation of a scene.
■ However, the maximum range, density of acquired spatial samples, and overall acquisition time of these sensors are fundamentally limited by the minimum signal required to estimate depth reliably.
■ A data-driven method for photon-efficient 3D imaging which leverages sensor fusion
and computational reconstruction to rapidly and robustly estimate a dense depth map
from low photon counts.
■ This sensor fusion approach uses measurements of single photon arrival times from a
LR single-photon detector array and an intensity image from a conventional HR camera.
■ Using a multi-scale deep convolutional network, it jointly processes the raw measurements from both sensors and outputs a high-resolution depth map.
2018
Single-Photon 3D Imaging with Deep
Sensor Fusion
Single-photon 3D imaging systems measure a spatio-temporal volume containing photon counts (left) that
include ambient light, noise, and photons emitted by a pulsed laser into the scene and reflected back to the
detector. Conventional depth estimation techniques, such as log-matched filtering (center left), estimate a depth
map from these counts. However, depth estimation is a non-convex and challenging problem, especially for
extremely low photon counts observed in fast or long-range 3D imaging systems. Here is a data-driven approach
to solve this depth estimation problem and explore deep sensor fusion approaches that use an intensity image of
the scene to optimize the robustness (center right) and resolution (right) of the depth estimation.
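For reference, below is a minimal NumPy sketch of the log-matched-filter baseline mentioned in the caption, applied to a single pixel's photon-count histogram; the time-bin width, the toy pulse, and the noise levels are illustrative assumptions, not values from the paper.

```python
import numpy as np

def log_matched_filter_depth(counts, pulse, bin_size_s=80e-12, c=3e8):
    # Cross-correlate the histogram with the log of the normalized pulse
    # (the classical ML estimator under Poisson noise) and convert the best
    # time bin into depth, assuming round-trip travel of the laser pulse.
    log_pulse = np.log(pulse / pulse.sum() + 1e-12)
    scores = np.correlate(counts, log_pulse, mode="same")
    t_bin = int(np.argmax(scores))
    return 0.5 * c * t_bin * bin_size_s

# Toy usage: a sparse background with a weak return pulse around bin 120.
rng = np.random.default_rng(0)
pulse = np.exp(-0.5 * ((np.arange(21) - 10) / 3.0) ** 2)
counts = rng.poisson(0.05, size=512).astype(float)
counts[110:131] += rng.poisson(2.0 * pulse)
print(log_matched_filter_depth(counts, pulse))
```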
Single-Photon 3D Imaging with Deep
Sensor Fusion
The denoising branch (left) takes as input the 3D volume of photon counts and processes it at multiple scales using
a series of 3D conv layers. The resulting features from each resolution scale are concatenated together and
optionally concatenated with additional features from an intensity image in a sensor fusion approach. A further set
of 3D conv layers regresses a normalized illumination pulse, censoring the BG photon events. A differentiable
argmax operator is used to localize the ToF of the estimated illumination pulse and determine the depth. In the
image-guided upsampling branch (right), the network predicts HF differences between an upsampled LF depth map
and the HR depth map using multi-scale guidance from HF features of the intensity image. The entire network is
trainable end-to-end for depth estimation and upsampling from raw photon counts and an intensity image.
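A minimal PyTorch sketch of one common differentiable surrogate for the argmax step described above is shown below; the softmax temperature, bin width, and tensor shapes are assumptions for illustration, not the paper's exact operator.

```python
import torch

def soft_argmax_tof(pulse_scores, bin_size_s=80e-12, c=3e8, beta=100.0):
    # Softmax over time bins gives a smooth, differentiable peak; its expected
    # bin index approximates the argmax and converts to depth via time of flight.
    T = pulse_scores.shape[-1]
    weights = torch.softmax(beta * pulse_scores, dim=-1)
    bins = torch.arange(T, dtype=pulse_scores.dtype, device=pulse_scores.device)
    tof_bins = (weights * bins).sum(dim=-1)
    return 0.5 * c * tof_bins * bin_size_s

depth = soft_argmax_tof(torch.randn(4, 1024))  # 4 pixels, 1024 time bins
print(depth.shape)  # torch.Size([4])
```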
Single-Photon 3D Imaging with Deep
Sensor Fusion
(a) photo of setup (b) imaging optics (c) illumination optics
Single-photon imaging prototype. (a) Both the imaging optics (bottom) and illumination optics (top). The
illumination and imaging optics are aligned in a rectified setup to perform energy-efficient epipolar scanning.
(b) A dichroic short-pass filter reflects light above 500 nm to a PointGrey vision camera, and transmits light of
all remaining wavelengths through a 450 nm laser line filter and onto a 1D array of 256 SPAD pixels. The galvo
mirror angle controls the scanline imaging the scene. (c) A cylindrical lens creates a vertical laser line, and the
galvo mirror determines the position of this laser line within the scene.
Single-Photon 3D Imaging with Deep
Sensor Fusion
Reconstruction results for four scenes: checkerboard, elephant, lamp, and bouncing ball.
Deep RGB-D Canonical Correlation
Analysis For Sparse Depth Completion
■ Correlation For Completion Network (CFCNet), an end-to-end deep model for the sparse depth completion task guided by RGB information.
■ A 2D deep canonical correlation analysis is used as a network constraint to ensure that the RGB and depth encoders capture the most similar semantics.
■ It transforms the RGB features to the depth domain, and the complementary RGB info is
used to complete the missing depth info.
■ A completed dense depth map is viewed as composed of two parts.
■ One part is the sparse depth, which is observable and used as the input; the other part is non-observable and is recovered by the task.
■ Also, the corresponding full RGB image of the depth map can be decomposed into two parts,
one is called the sparse RGB, which holds the corresponding RGB values at the observable
locations in the sparse depth, and the other part is complementary RGB, which is the
subtraction of the sparse RGB from the full RGB images.
■ During training, CFCNet learns the relationship between sparse depth and sparse RGB and
uses the learned knowledge to recover non-observable depth from complementary RGB.
2019,6
Deep RGB-D Canonical Correlation
Analysis For Sparse Depth Completion
The input 0-1 sparse mask represents the sparse pattern of depth measurements. The complementary mask is complementary to the
sparse mask. The full image is separated into a sparse RGB and a complementary RGB by the mask, and they are fed together with the masks into the networks.
Deep RGB-D Canonical Correlation
Analysis For Sparse Depth Completion
■ CFCNet takes in a sparse depth map, a sparse RGB image, and a complementary RGB image.
■ Use the Sparsity-aware Attentional Convolutions (SAConv) in VGG16-like encoders.
■ SAConv is inspired by the local attention mask, which introduces a segmentation-aware mask to let the convolution "focus" on signals consistent with the segmentation mask.
■ In order to propagate info from reliable sources, use sparsity masks to make convolution
operations attend on the signals from reliable locations.
■ The difference from the local attention mask is that SAConv does not apply mask normalization.
■ Mask normalization is avoided because repeated normalization produces numerically small features, which harms the stability of the later 2D2CCA calculations.
■ Also, a max-pooling operation is applied to the masks after every SAConv to keep track of visibility.
■ If at least one nonzero value is visible to a convolutional kernel, the max-pooling sets the mask value at that position to 1 (see the sketch after this list).
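A minimal PyTorch sketch of such a sparsity-aware attentional convolution follows: the features are masked (Hadamard product), convolved without mask normalization, and the mask is updated by 3 × 3 max pooling; the channel sizes and sparsity level are assumptions.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SAConv(nn.Module):
    # Sketch of a Sparsity-Aware Attentional Convolution: attend only to
    # reliable locations and track visibility with a max-pooled mask.
    def __init__(self, in_ch, out_ch):
        super().__init__()
        self.conv = nn.Conv2d(in_ch, out_ch, kernel_size=3, stride=1, padding=1)

    def forward(self, x, mask):
        # mask: (B, 1, H, W), 1 at observed locations and 0 elsewhere.
        y = self.conv(x * mask)                                   # no mask normalization
        new_mask = F.max_pool2d(mask, kernel_size=3, stride=1, padding=1)
        return y, new_mask

feat = torch.randn(1, 3, 64, 64)
mask = (torch.rand(1, 1, 64, 64) > 0.95).float()                  # ~5% valid, like sparse depth
out, m = SAConv(3, 32)(feat, mask)
print(out.shape, m.mean().item())
```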
Deep RGB-D Canonical Correlation
Analysis For Sparse Depth Completion
SAConv. The ⊙ is for Hadamard product. The
⊗ is for convolution. The + is for elementwise
addition. The kernel size is 3 × 3 and stride is 1
for both convolution and max-pooling.
2D Deep Canonical Correlation Analysis (2D2CCA): full-rank covariance matrices, the correlation between the two feature sets, and the total loss function.
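As a sketch of what these terms refer to, the generic deep-CCA objective that 2D2CCA adapts to 2D feature maps can be written as follows, with F̄_d and F̄_c the mean-centered depth and RGB features, m samples, and r a small regularizer that keeps the covariances full rank; the exact 2D covariance construction and loss weights follow the paper and are not reproduced here.

```latex
\begin{aligned}
\Sigma_{dd} &= \tfrac{1}{m-1}\,\bar{F}_d \bar{F}_d^{\top} + rI,\qquad
\Sigma_{cc} = \tfrac{1}{m-1}\,\bar{F}_c \bar{F}_c^{\top} + rI,\qquad
\Sigma_{dc} = \tfrac{1}{m-1}\,\bar{F}_d \bar{F}_c^{\top},\\
\operatorname{corr}(F_d, F_c) &=
  \big\lVert \Sigma_{dd}^{-1/2}\,\Sigma_{dc}\,\Sigma_{cc}^{-1/2} \big\rVert_{\mathrm{tr}},\qquad
\mathcal{L}_{\mathrm{2D^2CCA}} = -\operatorname{corr}(F_d, F_c).
\end{aligned}
```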
Deep RGB-D Canonical Correlation
Analysis For Sparse Depth Completion
■ Most multi-modal deep learning approaches simply concatenate or element-wise add bottleneck features.
■ However, when the extracted semantics and the range of feature values differ between modalities, direct concatenation or addition of multi-modal features does not always perform better than a single-modal source.
■ To avoid this problem, use encoders to extract higher-level semantics from two branches.
■ 2D2CCA ensures the extracted features from two branches are maximally correlated.
■ The intuition is to capture the same semantics from the RGB and depth domains.
■ Next, use a transformer network to transform extracted features from RGB domain to
depth domain, making extracted features from different sources share the same
numerical range.
■ During the training phase, use features of sparse depth and corresponding sparse RGB
image to calculate the 2D2CCA loss and transformer loss.
Deep RGB-D Canonical Correlation
Analysis For Sparse Depth Completion
(a) RGB image. (b) 500-point sparse depth as input. (c) Completed depth maps. (d) Results from MIT.
Learning Guided Convolutional Network for
Depth Completion
■ Dense depth perception is critical for autonomous driving and other robotics applications.
■ It is thus necessary to complete the sparse LiDAR data, where a synchronized guidance
RGB image is often used to facilitate this completion.
■ Inspired by guided image filtering, a guided network predicts kernel weights from the guidance image.
■ These predicted kernels are then applied to extract the depth image features.
■ In this way, a network generates content-dependent and spatially-variant kernels for multi-
modal feature fusion.
■ Dynamically generated spatially-variant kernels could lead to prohibitive GPU memory
consumption and computation overhead.
■ A convolution factorization is designed to reduce computation and memory consumption.
■ The GPU memory reduction makes it possible for the feature fusion to work in a multi-stage scheme.
2019,8
Learning Guided Convolutional Network for
Depth Completion
The network architecture includes two sub-networks: GuideNet in orange and DepthNet in blue. A convolution layer is added at the beginning of both GuideNet and DepthNet as well as at the end of DepthNet. The light orange and blue blocks are the encoder stages, while the corresponding dark ones are the decoder stages of GuideNet and DepthNet, respectively. The ResBlock represents the basic residual block structure with two sequential 3 × 3 convolutional layers.
Learning Guided Convolutional Network for
Depth Completion
Guided Convolution Module. (a) The overall pipeline of the guided convolution module: given image features as input, the filter-generation layer dynamically produces guided kernels, which are then applied to the input depth features to output new depth features. (b) The details of the convolution between guided kernels and input depth features, factorized into two-stage convolutions: channel-wise convolution and cross-channel convolution.
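A minimal PyTorch sketch of this two-stage factorization is given below: a kernel-generation layer predicts a spatially-variant 3 × 3 kernel per pixel and per channel from the image features (channel-wise stage), and a shared 1 × 1 convolution mixes the channels (cross-channel stage); the layer widths are assumptions, not the paper's configuration.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class GuidedConvSketch(nn.Module):
    # Stage 1 (channel-wise): a spatially-variant 3x3 kernel, predicted from the
    # guidance features, filters each depth-feature channel independently.
    # Stage 2 (cross-channel): a shared 1x1 convolution mixes the channels.
    def __init__(self, guide_ch, depth_ch, k=3):
        super().__init__()
        self.k = k
        self.kernel_gen = nn.Conv2d(guide_ch, depth_ch * k * k, kernel_size=3, padding=1)
        self.cross_channel = nn.Conv2d(depth_ch, depth_ch, kernel_size=1)

    def forward(self, guide_feat, depth_feat):
        B, C, H, W = depth_feat.shape
        kernels = self.kernel_gen(guide_feat).view(B, C, self.k * self.k, H, W)
        patches = F.unfold(depth_feat, self.k, padding=self.k // 2)
        patches = patches.view(B, C, self.k * self.k, H, W)
        fused = (kernels * patches).sum(dim=2)           # channel-wise stage
        return self.cross_channel(fused)                 # cross-channel stage

g = torch.randn(1, 32, 64, 64)   # guidance (image) features
d = torch.randn(1, 32, 64, 64)   # depth features
print(GuidedConvSketch(32, 32)(g, d).shape)  # torch.Size([1, 32, 64, 64])
```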
Learning Guided Convolutional Network for
Depth Completion
Qualitative comparison with state-of-the-art methods on KITTI test set
DFineNet: Ego-Motion Estimation and Depth Refinement
from Sparse, Noisy Depth Input with RGB Guidance
■ Depth estimation is an important capability for autonomous vehicles to understand and reconstruct 3D environments as well as avoid obstacles during execution.
■ Accurate depth sensors such as LiDARs are often heavy and expensive and can only provide sparse depth, while lighter depth sensors such as stereo cameras are noisier in comparison.
■ DFineNet is an end-to-end learning algorithm that is capable of using sparse, noisy input depth for refinement and depth completion.
■ This model also produces the camera pose as a byproduct, making it a great solution
for autonomous systems.
■ The approach is evaluated on both indoor and outdoor datasets.
■ 2019,8.
DFineNet: Ego-Motion Estimation and Depth
Refinement from Sparse, Noisy Depth Input with RGB
Guidance
An example of sparse, noisy depth input (1st row), the
3D visualization of ground truth of depth (2nd row)
and the 3D visualization of output from our model
(bottom). RGB image (1st) is overlaid with sparse,
noisy depth input for visualization.
DFineNet: Ego-Motion Estimation and Depth Refinement
from Sparse, Noisy Depth Input with RGB Guidance
It refines sparse & noisy depth input (the 3rd row) to output dense depth of high quality (bottom row).
DFineNet: Ego-Motion Estimation and Depth Refinement
from Sparse, Noisy Depth Input with RGB Guidance
Network Architecture
The network consists of two branches: one CNN to learn the function that estimates the depth (ψd),
and one CNN to learn the function that estimates the pose (θp). This network takes as input the image
sequence and corresponding sparse depth maps and outputs the transformation as well as the dense
depth map. During training, the two sets of parameters are simultaneously updated by the training signal. The Depth-CNN is a revised version of the depth network of Ma et al. from MIT.
DFineNet: Ego-Motion Estimation and Depth Refinement
from Sparse, Noisy Depth Input with RGB Guidance
■ Supervised Loss
■ Photometric Loss
■ Masked Photometric Loss
■ Smoothness Loss
– Derived from Sfm-net
■ Total Loss:
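The listed losses take the following generic forms (a sketch with assumed notation: V the set of valid ground-truth pixels, Î_{s→t} the source image warped into the target view using the predicted depth D̂ and the predicted pose, and λ_i loss weights; the exact masking and weights follow the paper):

```latex
\begin{aligned}
\mathcal{L}_{\mathrm{sup}}    &= \tfrac{1}{|V|}\sum_{p\in V}\big\lvert \hat{D}(p)-D_{gt}(p)\big\rvert,\\
\mathcal{L}_{\mathrm{photo}}  &= \tfrac{1}{N}\sum_{p}\big\lvert I_t(p)-\hat{I}_{s\to t}(p)\big\rvert,\\
\mathcal{L}_{\mathrm{smooth}} &= \sum_{p}\big\lvert\partial_x \hat{D}(p)\big\rvert e^{-\lvert\partial_x I_t(p)\rvert}
                               + \big\lvert\partial_y \hat{D}(p)\big\rvert e^{-\lvert\partial_y I_t(p)\rvert},\\
\mathcal{L} &= \lambda_1 \mathcal{L}_{\mathrm{sup}}
             + \lambda_2 \mathcal{L}_{\mathrm{photo}}^{\mathrm{masked}}
             + \lambda_3 \mathcal{L}_{\mathrm{smooth}}.
\end{aligned}
```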
DFineNet: Ego-Motion Estimation and Depth Refinement
from Sparse, Noisy Depth Input with RGB Guidance
Qualitative results of this method (left), of RGB guide & certainty (middle), ranking 1st, and of MIT's Ma et al. (right), ranking 7th.
Confidence Propagation through CNNs
for Guided Sparse Depth Regression
■ 2019,8
■ Generally, convolutional neural networks (CNNs) process data on a regular grid, e.g. data
generated by ordinary cameras.
■ Designing CNNs for sparse and irregularly spaced input data is still an open research
problem with numerous applications in autonomous driving, robotics, and surveillance.
■ An algebraically-constrained normalized convolution layer for CNNs with highly sparse
input that has a smaller number of network parameters compared to related work.
■ Strategies for determining the confidence from the convolution operation and
propagating it to consecutive layers.
■ An objective function that simultaneously minimizes the data error while maximizing the
output confidence.
■ To integrate structural information, fusion strategies to combine depth and RGB
information in the normalized convolution network framework.
■ In addition, the output confidence is used as auxiliary information to improve the results.
Confidence Propagation through CNNs
for Guided Sparse Depth Regression
Scene depth completion pipeline on an example image. The input to the pipeline is a very sparse projected LiDAR
point cloud, an input confidence map which has zeros at missing pixels and ones otherwise, and an RGB image. The
sparse point cloud input and the input confidence are fed to a multi-scale unguided network that acts as a generic
estimator for the data. Afterwards, the continuous output confidence map is concatenated with the RGB image and
fed to a feature extraction network. The outputs from the unguided network and the RGB feature extraction network are concatenated and fed to a fusion network, which produces the final dense depth map.
Confidence Propagation through CNNs
for Guided Sparse Depth Regression
The standard convolution layer in CNN frameworks can be replaced by a normalized convolution layer
with minor modifications. First, the layer takes in two inputs simultaneously, the data and its confidence.
The forward pass is then modified, and the back-propagation is modified to include a derivative term for
the non-negativity enforcement function. To propagate the confidence to consecutive layers, the already-
calculated denominator term is normalized by the sum of the filter elements.
Normalized Convolution layer that takes in two inputs, i.e. data
and confidence and outputs a data term and a confidence term.
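A minimal PyTorch sketch of such a normalized convolution layer with confidence propagation is shown below; the softplus non-negativity enforcement and the layer sizes are one possible choice assumed here, not the paper's exact implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class NormalizedConv2d(nn.Module):
    # Sketch: non-negative kernels, the data term normalized by the convolved
    # confidence, and the output confidence given by the same denominator
    # normalized by the sum of the filter elements.
    def __init__(self, in_ch, out_ch, k=3, eps=1e-8):
        super().__init__()
        self.weight = nn.Parameter(torch.randn(out_ch, in_ch, k, k) * 0.1)
        self.bias = nn.Parameter(torch.zeros(out_ch))
        self.k, self.eps = k, eps

    def forward(self, x, conf):
        w = F.softplus(self.weight)                          # non-negativity enforcement
        num = F.conv2d(x * conf, w, padding=self.k // 2)     # confidence-weighted data
        den = F.conv2d(conf.expand_as(x), w, padding=self.k // 2)
        out = num / (den + self.eps) + self.bias.view(1, -1, 1, 1)
        out_conf = den / w.sum(dim=(1, 2, 3)).view(1, -1, 1, 1)  # propagated confidence
        return out, out_conf

x = torch.randn(1, 1, 32, 32)
c = (torch.rand(1, 1, 32, 32) > 0.9).float()                 # sparse input confidence
y, cy = NormalizedConv2d(1, 16)(x, c)
print(y.shape, cy.shape)
```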
Confidence Propagation through CNNs
for Guided Sparse Depth Regression
The multi-scale architecture for the task of unguided scene depth completion that utilizes normalized convolution
layers. Downsampling is performed using max pooling on confidence maps and the indices of the pooled pixels are
used to select the pixels with highest confidences from the feature maps. Different scales are fused by upsampling
the coarser scale and concatenating it with the finer scale. A normalized convolution layer is then used to fuse the
feature maps based on the confidence information. Finally, a 1 × 1 normalized convolution layer is used to merge
different channels into one channel and produce a dense output and an output confidence map.
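The confidence-driven downsampling step can be sketched as follows, with a single-channel feature map assumed for brevity: max-pool the confidence map and reuse the pooled indices to pick the most confident pixels from the feature map.

```python
import torch
import torch.nn.functional as F

def confidence_pool(feat, conf, kernel=2):
    # Pool confidences, then gather the feature values at the winning positions.
    B, C, H, W = conf.shape                                   # C == 1 in this sketch
    conf_ds, idx = F.max_pool2d(conf, kernel, return_indices=True)
    flat = feat.view(B, C, -1)
    feat_ds = flat.gather(2, idx.view(B, C, -1)).view_as(conf_ds)
    return feat_ds, conf_ds

feat = torch.randn(1, 1, 8, 8)
conf = torch.rand(1, 1, 8, 8)
f2, c2 = confidence_pool(feat, conf)
print(f2.shape, c2.shape)  # torch.Size([1, 1, 4, 4]) for both
```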
Confidence Propagation through CNNs
for Guided Sparse Depth Regression
(a) A multi-stream architecture that contains a stream for depth and another stream for RGB + output confidence feature extraction; afterwards, a fusion network combines both streams to produce the final dense output. (b) A multi-scale encoder-decoder architecture where depth is fed to the unguided network followed by an encoder, and the output confidence and RGB image are concatenated and then fed to a similar encoder; both streams have skip-connections to the decoder between the corresponding scales. (c) is similar to (a) but with early fusion, and (d) is similar to (b) but with early fusion.
Confidence Propagation through CNNs
for Guided Sparse Depth Regression
(a) RGB input, (b) method MS-Net[LF]-L2 (gd), (c) Sparse-to-Dense (gd) and (d) HMS-Net (gd). For each one, the top row shows the prediction. Method MS-Net[LF]-L2 (gd) performs slightly better, while Sparse-to-Dense produces smoother edges due to the use of a smoothness loss.
PLIN: A Network for Pseudo-LiDAR Point
Cloud Interpolation
■ LiDAR can provide dependable 3D spatial information at a low frequency (around 10 Hz) and has been widely applied in the fields of autonomous driving and UAVs.
■ However, the frequency of the camera, which is higher (around 20-30 Hz), has to be decreased so as to match the LiDAR in a multi-sensor system.
■ A Pseudo-LiDAR interpolation network (PLIN) to increase the frequency of LiDAR sensors.
■ PLIN can generate temporally and spatially high-quality point cloud sequences to match the high frequency of cameras.
■ For this goal, use a coarse interpolation stage guided by consecutive sparse depth maps
and motion relationship and a refined interpolation stage guided by the realistic scene.
■ Using this coarse-to-fine cascade structure, this method can progressively perceive
multi-modal info and generate accurate intermediate point clouds.
■ This is the first deep framework for Pseudo-LiDAR point cloud interpolation, which shows
appealing applications in navigation systems equipped with LiDAR and cameras.
2019,9
PLIN: A Network for Pseudo-LiDAR Point
Cloud Interpolation
Overall pipeline of the proposed method.
PLIN aims to address the mismatching
problem of frequency between camera
and LiDAR sensors, generating both
temporally and spatially high-quality
point cloud sequences. This method
takes three consecutive color images
and two sparse depth maps as inputs,
and interpolates an intermediate dense
depth map, which is further transformed
into a Pseudo-LiDAR point cloud using
camera intrinsic parameters.
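The final back-projection into a Pseudo-LiDAR point cloud can be sketched in a few lines of NumPy; the KITTI-like intrinsics below are assumed example values, not the dataset's exact calibration.

```python
import numpy as np

def depth_to_pseudo_lidar(depth, K):
    # Back-project every valid depth pixel (u, v, z) into camera coordinates
    # using the pinhole model: x = (u - cx) * z / fx, y = (v - cy) * z / fy.
    H, W = depth.shape
    u, v = np.meshgrid(np.arange(W), np.arange(H))
    valid = depth > 0
    z = depth[valid]
    x = (u[valid] - K[0, 2]) * z / K[0, 0]
    y = (v[valid] - K[1, 2]) * z / K[1, 1]
    return np.stack([x, y, z], axis=1)          # (N, 3) points in camera coordinates

K = np.array([[721.5, 0.0, 609.6],
              [0.0, 721.5, 172.9],
              [0.0, 0.0, 1.0]])                 # KITTI-like intrinsics (assumed values)
cloud = depth_to_pseudo_lidar(np.random.rand(352, 1216) * 80.0, K)
print(cloud.shape)
```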
PLIN: A Network for Pseudo-LiDAR Point
Cloud Interpolation
Overview of the Pseudo-LiDAR interpolation network (PLIN). The whole architecture consists of three modules,
including the motion guidance module, scene guidance module and transformation module.
PLIN: A Network for Pseudo-LiDAR Point
Cloud Interpolation
Results of interpolated depth maps obtained by PLIN. For each example, it shows the intermediate color image,
sparse depth map, dense depth map, and the result. This method can recover the original depth information and
generate much denser distributions.
PLIN: A Network for Pseudo-LiDAR Point
Cloud Interpolation
It shows the color image, interpolated dense depth map, two views of the generated Pseudo-LiDAR, and
enlarged areas. The complete network produces a more accurate depth map, and the distribution and shape of the Pseudo-LiDAR are more similar to those of the GT point cloud.
Depth Completion from Sparse LiDAR
Data with Depth-Normal Constraints
■ Depth completion aims to recover dense depth maps from sparse depth
measurements.
■ It is of increasing importance for autonomous driving and draws increasing attention
from the vision community.
■ Most existing methods directly train a network to learn a mapping from sparse depth inputs to dense depth maps, which has difficulties in utilizing 3D geometric constraints and handling practical sensor noise.
■ To regularize the depth completion and improve robustness against noise, a unified CNN framework 1) models the geometric constraints between depth and surface normals in a diffusion module and 2) predicts the confidence of sparse LiDAR measurements to mitigate the impact of noise.
■ Specifically, the encoder-decoder backbone simultaneously predicts surface normals, coarse depth, and the confidence of the LiDAR inputs, which are subsequently fed into the diffusion refinement module to obtain the final completion results.
2019,10
Depth Completion from Sparse LiDAR
Data with Depth-Normal Constraints
From sparse LiDAR measurements and color images (a-b), this model first infers the maps of coarse depth
and normal (c-d), and then recurrently refines the initial depth estimation by enforcing the constraints
between depth and normals. Moreover, to address the noise in practical LiDAR measurements (g), a decoder branch is employed to predict the confidences (h) of the sparse inputs for better regularization.
Depth Completion from Sparse LiDAR
Data with Depth-Normal Constraints
The prediction network first predicts maps of surface normal N, coarse depth D and confidence M of the sparse depth input with a
shared-weight encoder and independent decoders. Then, the sparse depth input D̄ and coarse depth D are transformed to the
plane-origin distance space as P̄ and P. Next, the refinement network, an anisotropic diffusion module, refines the coarse depth
map D in the plane-origin distance subspace to enforce the constraints between depth and normal and to incorporate information from
the confident sparse depth inputs. During the refinement, the diffusion conductance depends on the similarity in the guidance feature
map G. Finally, the refined P is inversely transformed back to obtain the refined depth map Dr when the diffusion is finished.
Depth Completion from Sparse LiDAR
Data with Depth-Normal Constraints
Differentiable diffusion block. In each
refinement iteration, high-dimensional feature
vectors (e.g., of dimension 64) in guidance
feature map G are independently transformed via
two different functions f and g (modeled as two
convolution layers followed by normalization).
Then, the conductance from each location xi (in the plane-origin distance map P) to its K neighboring pixels (xj ∈ Ni) is calculated. Finally, the diffusion is performed through a convolution operation with the kernels defined by the previously computed conductance. Through such diffusion,
depth completion results are regularized by the
constraint between depth and normal.
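A minimal PyTorch sketch of one such refinement iteration is given below; the dot-product similarity, the softmax normalization of the conductance, and the embedding width are assumptions about the exact form, not the paper's implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class DiffusionBlockSketch(nn.Module):
    # One iteration: embed the guidance features with f and g, compute the
    # conductance from each pixel to its KxK neighbours from the embedding
    # similarity, then update P as a conductance-weighted neighbour average.
    def __init__(self, guide_ch, embed_ch=64, k=3):
        super().__init__()
        self.f = nn.Conv2d(guide_ch, embed_ch, 1)
        self.g = nn.Conv2d(guide_ch, embed_ch, 1)
        self.k = k

    def forward(self, P, G):
        B, _, H, W = P.shape
        fi = self.f(G)                                               # (B, E, H, W)
        gj = F.unfold(self.g(G), self.k, padding=self.k // 2)        # neighbours of g(G)
        gj = gj.view(B, -1, self.k * self.k, H, W)
        sim = (fi.unsqueeze(2) * gj).sum(dim=1)                      # (B, k*k, H, W)
        cond = torch.softmax(sim, dim=1)                             # conductance weights
        Pn = F.unfold(P, self.k, padding=self.k // 2).view(B, self.k * self.k, H, W)
        return (cond * Pn).sum(dim=1, keepdim=True)                  # diffused P

P = torch.rand(1, 1, 32, 32)    # plane-origin distance map
G = torch.randn(1, 16, 32, 32)  # guidance feature map
print(DiffusionBlockSketch(16)(P, G).shape)  # torch.Size([1, 1, 32, 32])
```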
Depth Completion from Sparse LiDAR
Data with Depth-Normal Constraints
The losses: a negative cosine loss, an L2 reconstruction loss, an L2 depth loss, and an L2 refinement reconstruction loss, combined into the overall loss function. The relation between depth and normal can be established via the tangent plane equation.
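A sketch of that relation, using the standard pinhole back-projection (x̃ the homogeneous pixel coordinate, K the camera intrinsics, D the depth, N the surface normal, and P the plane-origin distance in which the diffusion operates):

```latex
\begin{aligned}
X(x) &= D(x)\,K^{-1}\tilde{x} &&\text{(back-projected 3D point)},\\
N(x)^{\top} X(x) &= P(x)      &&\text{(tangent-plane equation)},\\
\Rightarrow\; P(x) &= D(x)\,N(x)^{\top}K^{-1}\tilde{x}.
\end{aligned}
```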
Depth Completion from Sparse LiDAR
Data with Depth-Normal Constraints
Qualitative comparison with other methods. For each method, the whole completion result is provided as well as zoom-in views of details and error maps for better comparison. The normal prediction and confidence prediction of this method are also provided for better illustration.
Depth Fusion from RGB and Depth Sensors  IV

More Related Content

What's hot

Fisheye Omnidirectional View in Autonomous Driving II
Fisheye Omnidirectional View in Autonomous Driving IIFisheye Omnidirectional View in Autonomous Driving II
Fisheye Omnidirectional View in Autonomous Driving II
Yu Huang
 
Fisheye Omnidirectional View in Autonomous Driving
Fisheye Omnidirectional View in Autonomous DrivingFisheye Omnidirectional View in Autonomous Driving
Fisheye Omnidirectional View in Autonomous Driving
Yu Huang
 
Deep vo and slam ii
Deep vo and slam iiDeep vo and slam ii
Deep vo and slam ii
Yu Huang
 
Deep Learning’s Application in Radar Signal Data II
Deep Learning’s Application in Radar Signal Data IIDeep Learning’s Application in Radar Signal Data II
Deep Learning’s Application in Radar Signal Data II
Yu Huang
 
3-d interpretation from single 2-d image III
3-d interpretation from single 2-d image III3-d interpretation from single 2-d image III
3-d interpretation from single 2-d image III
Yu Huang
 
BEV Semantic Segmentation
BEV Semantic SegmentationBEV Semantic Segmentation
BEV Semantic Segmentation
Yu Huang
 
Driving behaviors for adas and autonomous driving XII
Driving behaviors for adas and autonomous driving XIIDriving behaviors for adas and autonomous driving XII
Driving behaviors for adas and autonomous driving XII
Yu Huang
 
camera-based Lane detection by deep learning
camera-based Lane detection by deep learningcamera-based Lane detection by deep learning
camera-based Lane detection by deep learning
Yu Huang
 
Pose estimation from RGB images by deep learning
Pose estimation from RGB images by deep learningPose estimation from RGB images by deep learning
Pose estimation from RGB images by deep learning
Yu Huang
 
3-d interpretation from stereo images for autonomous driving
3-d interpretation from stereo images for autonomous driving3-d interpretation from stereo images for autonomous driving
3-d interpretation from stereo images for autonomous driving
Yu Huang
 
Driving behaviors for adas and autonomous driving xiv
Driving behaviors for adas and autonomous driving xivDriving behaviors for adas and autonomous driving xiv
Driving behaviors for adas and autonomous driving xiv
Yu Huang
 
Deep VO and SLAM
Deep VO and SLAMDeep VO and SLAM
Deep VO and SLAM
Yu Huang
 
fusion of Camera and lidar for autonomous driving II
fusion of Camera and lidar for autonomous driving IIfusion of Camera and lidar for autonomous driving II
fusion of Camera and lidar for autonomous driving II
Yu Huang
 
Multi sensor calibration by deep learning
Multi sensor calibration by deep learningMulti sensor calibration by deep learning
Multi sensor calibration by deep learning
Yu Huang
 
Pedestrian Behavior/Intention Modeling for Autonomous Driving VI
Pedestrian Behavior/Intention Modeling for Autonomous Driving VIPedestrian Behavior/Intention Modeling for Autonomous Driving VI
Pedestrian Behavior/Intention Modeling for Autonomous Driving VI
Yu Huang
 
LiDAR-based Autonomous Driving III (by Deep Learning)
LiDAR-based Autonomous Driving III (by Deep Learning)LiDAR-based Autonomous Driving III (by Deep Learning)
LiDAR-based Autonomous Driving III (by Deep Learning)
Yu Huang
 
Deep VO and SLAM IV
Deep VO and SLAM IVDeep VO and SLAM IV
Deep VO and SLAM IV
Yu Huang
 
Depth Fusion from RGB and Depth Sensors by Deep Learning
Depth Fusion from RGB and Depth Sensors by Deep LearningDepth Fusion from RGB and Depth Sensors by Deep Learning
Depth Fusion from RGB and Depth Sensors by Deep Learning
Yu Huang
 
Pedestrian behavior/intention modeling for autonomous driving IV
Pedestrian behavior/intention modeling for autonomous driving IVPedestrian behavior/intention modeling for autonomous driving IV
Pedestrian behavior/intention modeling for autonomous driving IV
Yu Huang
 
Deep learning for 3-D Scene Reconstruction and Modeling
Deep learning for 3-D Scene Reconstruction and Modeling Deep learning for 3-D Scene Reconstruction and Modeling
Deep learning for 3-D Scene Reconstruction and Modeling
Yu Huang
 

What's hot (20)

Fisheye Omnidirectional View in Autonomous Driving II
Fisheye Omnidirectional View in Autonomous Driving IIFisheye Omnidirectional View in Autonomous Driving II
Fisheye Omnidirectional View in Autonomous Driving II
 
Fisheye Omnidirectional View in Autonomous Driving
Fisheye Omnidirectional View in Autonomous DrivingFisheye Omnidirectional View in Autonomous Driving
Fisheye Omnidirectional View in Autonomous Driving
 
Deep vo and slam ii
Deep vo and slam iiDeep vo and slam ii
Deep vo and slam ii
 
Deep Learning’s Application in Radar Signal Data II
Deep Learning’s Application in Radar Signal Data IIDeep Learning’s Application in Radar Signal Data II
Deep Learning’s Application in Radar Signal Data II
 
3-d interpretation from single 2-d image III
3-d interpretation from single 2-d image III3-d interpretation from single 2-d image III
3-d interpretation from single 2-d image III
 
BEV Semantic Segmentation
BEV Semantic SegmentationBEV Semantic Segmentation
BEV Semantic Segmentation
 
Driving behaviors for adas and autonomous driving XII
Driving behaviors for adas and autonomous driving XIIDriving behaviors for adas and autonomous driving XII
Driving behaviors for adas and autonomous driving XII
 
camera-based Lane detection by deep learning
camera-based Lane detection by deep learningcamera-based Lane detection by deep learning
camera-based Lane detection by deep learning
 
Pose estimation from RGB images by deep learning
Pose estimation from RGB images by deep learningPose estimation from RGB images by deep learning
Pose estimation from RGB images by deep learning
 
3-d interpretation from stereo images for autonomous driving
3-d interpretation from stereo images for autonomous driving3-d interpretation from stereo images for autonomous driving
3-d interpretation from stereo images for autonomous driving
 
Driving behaviors for adas and autonomous driving xiv
Driving behaviors for adas and autonomous driving xivDriving behaviors for adas and autonomous driving xiv
Driving behaviors for adas and autonomous driving xiv
 
Deep VO and SLAM
Deep VO and SLAMDeep VO and SLAM
Deep VO and SLAM
 
fusion of Camera and lidar for autonomous driving II
fusion of Camera and lidar for autonomous driving IIfusion of Camera and lidar for autonomous driving II
fusion of Camera and lidar for autonomous driving II
 
Multi sensor calibration by deep learning
Multi sensor calibration by deep learningMulti sensor calibration by deep learning
Multi sensor calibration by deep learning
 
Pedestrian Behavior/Intention Modeling for Autonomous Driving VI
Pedestrian Behavior/Intention Modeling for Autonomous Driving VIPedestrian Behavior/Intention Modeling for Autonomous Driving VI
Pedestrian Behavior/Intention Modeling for Autonomous Driving VI
 
LiDAR-based Autonomous Driving III (by Deep Learning)
LiDAR-based Autonomous Driving III (by Deep Learning)LiDAR-based Autonomous Driving III (by Deep Learning)
LiDAR-based Autonomous Driving III (by Deep Learning)
 
Deep VO and SLAM IV
Deep VO and SLAM IVDeep VO and SLAM IV
Deep VO and SLAM IV
 
Depth Fusion from RGB and Depth Sensors by Deep Learning
Depth Fusion from RGB and Depth Sensors by Deep LearningDepth Fusion from RGB and Depth Sensors by Deep Learning
Depth Fusion from RGB and Depth Sensors by Deep Learning
 
Pedestrian behavior/intention modeling for autonomous driving IV
Pedestrian behavior/intention modeling for autonomous driving IVPedestrian behavior/intention modeling for autonomous driving IV
Pedestrian behavior/intention modeling for autonomous driving IV
 
Deep learning for 3-D Scene Reconstruction and Modeling
Deep learning for 3-D Scene Reconstruction and Modeling Deep learning for 3-D Scene Reconstruction and Modeling
Deep learning for 3-D Scene Reconstruction and Modeling
 

Similar to Depth Fusion from RGB and Depth Sensors IV

NeRF: Representing Scenes as Neural Radiance Fields for View Synthesis
NeRF: Representing Scenes as Neural Radiance Fields for View Synthesis NeRF: Representing Scenes as Neural Radiance Fields for View Synthesis
NeRF: Representing Scenes as Neural Radiance Fields for View Synthesis
taeseon ryu
 
TransNeRF
TransNeRFTransNeRF
TransNeRF
NavneetPaul2
 
Neural Radiance Fields & Neural Rendering.pdf
Neural Radiance Fields & Neural Rendering.pdfNeural Radiance Fields & Neural Rendering.pdf
Neural Radiance Fields & Neural Rendering.pdf
NavneetPaul2
 
Depth Fusion from RGB and Depth Sensors II
Depth Fusion from RGB and Depth Sensors IIDepth Fusion from RGB and Depth Sensors II
Depth Fusion from RGB and Depth Sensors II
Yu Huang
 
Single Image Depth Estimation using frequency domain analysis and Deep learning
Single Image Depth Estimation using frequency domain analysis and Deep learningSingle Image Depth Estimation using frequency domain analysis and Deep learning
Single Image Depth Estimation using frequency domain analysis and Deep learning
Ahan M R
 
The single image dehazing based on efficient transmission estimation
The single image dehazing based on efficient transmission estimationThe single image dehazing based on efficient transmission estimation
The single image dehazing based on efficient transmission estimation
AVVENIRE TECHNOLOGIES
 
Large scale 3 d point cloud compression using adaptive radial distance predic...
Large scale 3 d point cloud compression using adaptive radial distance predic...Large scale 3 d point cloud compression using adaptive radial distance predic...
Large scale 3 d point cloud compression using adaptive radial distance predic...
ieeepondy
 
mvitelli_ee367_final_report
mvitelli_ee367_final_reportmvitelli_ee367_final_report
mvitelli_ee367_final_reportMatt Vitelli
 
Automatic Detection of Window Regions in Indoor Point Clouds Using R-CNN
Automatic Detection of Window Regions in Indoor Point Clouds Using R-CNNAutomatic Detection of Window Regions in Indoor Point Clouds Using R-CNN
Automatic Detection of Window Regions in Indoor Point Clouds Using R-CNNZihao(Gerald) Zhang
 
fusion of Camera and lidar for autonomous driving I
fusion of Camera and lidar for autonomous driving Ifusion of Camera and lidar for autonomous driving I
fusion of Camera and lidar for autonomous driving I
Yu Huang
 
Secrets of CryENGINE 3 Graphics Technology
Secrets of CryENGINE 3 Graphics TechnologySecrets of CryENGINE 3 Graphics Technology
Secrets of CryENGINE 3 Graphics Technology
Tiago Sousa
 
A Review On Single Image Depth Prediction with Wavelet Decomposition
A Review On Single Image Depth Prediction with Wavelet DecompositionA Review On Single Image Depth Prediction with Wavelet Decomposition
A Review On Single Image Depth Prediction with Wavelet Decomposition
IRJET Journal
 
DIGITAL IMAGE PROCESSING - Day 5 Applications of DIP
DIGITAL IMAGE PROCESSING - Day 5 Applications of DIPDIGITAL IMAGE PROCESSING - Day 5 Applications of DIP
DIGITAL IMAGE PROCESSING - Day 5 Applications of DIP
vijayanand Kandaswamy
 
NS-CUK Joint Journal Club : S.T.Nguyen, Review on "Graph Neural Networks for ...
NS-CUK Joint Journal Club : S.T.Nguyen, Review on "Graph Neural Networks for ...NS-CUK Joint Journal Club : S.T.Nguyen, Review on "Graph Neural Networks for ...
NS-CUK Joint Journal Club : S.T.Nguyen, Review on "Graph Neural Networks for ...
ssuser4b1f48
 
Learning RGB-D Salient Object Detection using background enclosure, depth con...
Learning RGB-D Salient Object Detection using background enclosure, depth con...Learning RGB-D Salient Object Detection using background enclosure, depth con...
Learning RGB-D Salient Object Detection using background enclosure, depth con...
Benyamin Moadab
 
Dataset creation for Deep Learning-based Geometric Computer Vision problems
Dataset creation for Deep Learning-based Geometric Computer Vision problemsDataset creation for Deep Learning-based Geometric Computer Vision problems
Dataset creation for Deep Learning-based Geometric Computer Vision problems
PetteriTeikariPhD
 
EFFICIENT IMAGE COMPRESSION USING LAPLACIAN PYRAMIDAL FILTERS FOR EDGE IMAGES
EFFICIENT IMAGE COMPRESSION USING LAPLACIAN PYRAMIDAL FILTERS FOR EDGE IMAGESEFFICIENT IMAGE COMPRESSION USING LAPLACIAN PYRAMIDAL FILTERS FOR EDGE IMAGES
EFFICIENT IMAGE COMPRESSION USING LAPLACIAN PYRAMIDAL FILTERS FOR EDGE IMAGES
ijcnac
 
IMAGE FUSION IN IMAGE PROCESSING
IMAGE FUSION IN IMAGE PROCESSINGIMAGE FUSION IN IMAGE PROCESSING
IMAGE FUSION IN IMAGE PROCESSINGgarima0690
 
Review-image-segmentation-by-deep-learning
Review-image-segmentation-by-deep-learningReview-image-segmentation-by-deep-learning
Review-image-segmentation-by-deep-learning
Trong-An Bui
 
WT in IP.ppt
WT in IP.pptWT in IP.ppt
WT in IP.ppt
viveksingh19210115
 

Similar to Depth Fusion from RGB and Depth Sensors IV (20)

NeRF: Representing Scenes as Neural Radiance Fields for View Synthesis
NeRF: Representing Scenes as Neural Radiance Fields for View Synthesis NeRF: Representing Scenes as Neural Radiance Fields for View Synthesis
NeRF: Representing Scenes as Neural Radiance Fields for View Synthesis
 
TransNeRF
TransNeRFTransNeRF
TransNeRF
 
Neural Radiance Fields & Neural Rendering.pdf
Neural Radiance Fields & Neural Rendering.pdfNeural Radiance Fields & Neural Rendering.pdf
Neural Radiance Fields & Neural Rendering.pdf
 
Depth Fusion from RGB and Depth Sensors II
Depth Fusion from RGB and Depth Sensors IIDepth Fusion from RGB and Depth Sensors II
Depth Fusion from RGB and Depth Sensors II
 
Single Image Depth Estimation using frequency domain analysis and Deep learning
Single Image Depth Estimation using frequency domain analysis and Deep learningSingle Image Depth Estimation using frequency domain analysis and Deep learning
Single Image Depth Estimation using frequency domain analysis and Deep learning
 
The single image dehazing based on efficient transmission estimation
The single image dehazing based on efficient transmission estimationThe single image dehazing based on efficient transmission estimation
The single image dehazing based on efficient transmission estimation
 
Large scale 3 d point cloud compression using adaptive radial distance predic...
Large scale 3 d point cloud compression using adaptive radial distance predic...Large scale 3 d point cloud compression using adaptive radial distance predic...
Large scale 3 d point cloud compression using adaptive radial distance predic...
 
mvitelli_ee367_final_report
mvitelli_ee367_final_reportmvitelli_ee367_final_report
mvitelli_ee367_final_report
 
Automatic Detection of Window Regions in Indoor Point Clouds Using R-CNN
Automatic Detection of Window Regions in Indoor Point Clouds Using R-CNNAutomatic Detection of Window Regions in Indoor Point Clouds Using R-CNN
Automatic Detection of Window Regions in Indoor Point Clouds Using R-CNN
 
fusion of Camera and lidar for autonomous driving I
fusion of Camera and lidar for autonomous driving Ifusion of Camera and lidar for autonomous driving I
fusion of Camera and lidar for autonomous driving I
 
Secrets of CryENGINE 3 Graphics Technology
Secrets of CryENGINE 3 Graphics TechnologySecrets of CryENGINE 3 Graphics Technology
Secrets of CryENGINE 3 Graphics Technology
 
A Review On Single Image Depth Prediction with Wavelet Decomposition
A Review On Single Image Depth Prediction with Wavelet DecompositionA Review On Single Image Depth Prediction with Wavelet Decomposition
A Review On Single Image Depth Prediction with Wavelet Decomposition
 
DIGITAL IMAGE PROCESSING - Day 5 Applications of DIP
DIGITAL IMAGE PROCESSING - Day 5 Applications of DIPDIGITAL IMAGE PROCESSING - Day 5 Applications of DIP
DIGITAL IMAGE PROCESSING - Day 5 Applications of DIP
 
NS-CUK Joint Journal Club : S.T.Nguyen, Review on "Graph Neural Networks for ...
NS-CUK Joint Journal Club : S.T.Nguyen, Review on "Graph Neural Networks for ...NS-CUK Joint Journal Club : S.T.Nguyen, Review on "Graph Neural Networks for ...
NS-CUK Joint Journal Club : S.T.Nguyen, Review on "Graph Neural Networks for ...
 
Learning RGB-D Salient Object Detection using background enclosure, depth con...
Learning RGB-D Salient Object Detection using background enclosure, depth con...Learning RGB-D Salient Object Detection using background enclosure, depth con...
Learning RGB-D Salient Object Detection using background enclosure, depth con...
 
Dataset creation for Deep Learning-based Geometric Computer Vision problems
Dataset creation for Deep Learning-based Geometric Computer Vision problemsDataset creation for Deep Learning-based Geometric Computer Vision problems
Dataset creation for Deep Learning-based Geometric Computer Vision problems
 
EFFICIENT IMAGE COMPRESSION USING LAPLACIAN PYRAMIDAL FILTERS FOR EDGE IMAGES
EFFICIENT IMAGE COMPRESSION USING LAPLACIAN PYRAMIDAL FILTERS FOR EDGE IMAGESEFFICIENT IMAGE COMPRESSION USING LAPLACIAN PYRAMIDAL FILTERS FOR EDGE IMAGES
EFFICIENT IMAGE COMPRESSION USING LAPLACIAN PYRAMIDAL FILTERS FOR EDGE IMAGES
 
IMAGE FUSION IN IMAGE PROCESSING
IMAGE FUSION IN IMAGE PROCESSINGIMAGE FUSION IN IMAGE PROCESSING
IMAGE FUSION IN IMAGE PROCESSING
 
Review-image-segmentation-by-deep-learning
Review-image-segmentation-by-deep-learningReview-image-segmentation-by-deep-learning
Review-image-segmentation-by-deep-learning
 
WT in IP.ppt
WT in IP.pptWT in IP.ppt
WT in IP.ppt
 

More from Yu Huang

Application of Foundation Model for Autonomous Driving
Application of Foundation Model for Autonomous DrivingApplication of Foundation Model for Autonomous Driving
Application of Foundation Model for Autonomous Driving
Yu Huang
 
The New Perception Framework in Autonomous Driving: An Introduction of BEV N...
The New Perception Framework  in Autonomous Driving: An Introduction of BEV N...The New Perception Framework  in Autonomous Driving: An Introduction of BEV N...
The New Perception Framework in Autonomous Driving: An Introduction of BEV N...
Yu Huang
 
Data Closed Loop in Simulation Test of Autonomous Driving
Data Closed Loop in Simulation Test of Autonomous DrivingData Closed Loop in Simulation Test of Autonomous Driving
Data Closed Loop in Simulation Test of Autonomous Driving
Yu Huang
 
Techniques and Challenges in Autonomous Driving
Techniques and Challenges in Autonomous DrivingTechniques and Challenges in Autonomous Driving
Techniques and Challenges in Autonomous Driving
Yu Huang
 
BEV Joint Detection and Segmentation
BEV Joint Detection and SegmentationBEV Joint Detection and Segmentation
BEV Joint Detection and Segmentation
Yu Huang
 
BEV Object Detection and Prediction
BEV Object Detection and PredictionBEV Object Detection and Prediction
BEV Object Detection and Prediction
Yu Huang
 
Fisheye based Perception for Autonomous Driving VI
Fisheye based Perception for Autonomous Driving VIFisheye based Perception for Autonomous Driving VI
Fisheye based Perception for Autonomous Driving VI
Yu Huang
 
Fisheye/Omnidirectional View in Autonomous Driving V
Fisheye/Omnidirectional View in Autonomous Driving VFisheye/Omnidirectional View in Autonomous Driving V
Fisheye/Omnidirectional View in Autonomous Driving V
Yu Huang
 
Fisheye/Omnidirectional View in Autonomous Driving IV
Fisheye/Omnidirectional View in Autonomous Driving IVFisheye/Omnidirectional View in Autonomous Driving IV
Fisheye/Omnidirectional View in Autonomous Driving IV
Yu Huang
 
Prediction,Planninng & Control at Baidu
Prediction,Planninng & Control at BaiduPrediction,Planninng & Control at Baidu
Prediction,Planninng & Control at Baidu
Yu Huang
 
Cruise AI under the Hood
Cruise AI under the HoodCruise AI under the Hood
Cruise AI under the Hood
Yu Huang
 
LiDAR in the Adverse Weather: Dust, Snow, Rain and Fog (2)
LiDAR in the Adverse Weather: Dust, Snow, Rain and Fog (2)LiDAR in the Adverse Weather: Dust, Snow, Rain and Fog (2)
LiDAR in the Adverse Weather: Dust, Snow, Rain and Fog (2)
Yu Huang
 
Scenario-Based Development & Testing for Autonomous Driving
Scenario-Based Development & Testing for Autonomous DrivingScenario-Based Development & Testing for Autonomous Driving
Scenario-Based Development & Testing for Autonomous Driving
Yu Huang
 
How to Build a Data Closed-loop Platform for Autonomous Driving?
How to Build a Data Closed-loop Platform for Autonomous Driving?How to Build a Data Closed-loop Platform for Autonomous Driving?
How to Build a Data Closed-loop Platform for Autonomous Driving?
Yu Huang
 
Annotation tools for ADAS & Autonomous Driving
Annotation tools for ADAS & Autonomous DrivingAnnotation tools for ADAS & Autonomous Driving
Annotation tools for ADAS & Autonomous Driving
Yu Huang
 
Simulation for autonomous driving at uber atg
Simulation for autonomous driving at uber atgSimulation for autonomous driving at uber atg
Simulation for autonomous driving at uber atg
Yu Huang
 
Prediction and planning for self driving at waymo
Prediction and planning for self driving at waymoPrediction and planning for self driving at waymo
Prediction and planning for self driving at waymo
Yu Huang
 
Jointly mapping, localization, perception, prediction and planning
Jointly mapping, localization, perception, prediction and planningJointly mapping, localization, perception, prediction and planning
Jointly mapping, localization, perception, prediction and planning
Yu Huang
 
Data pipeline and data lake for autonomous driving
Data pipeline and data lake for autonomous drivingData pipeline and data lake for autonomous driving
Data pipeline and data lake for autonomous driving
Yu Huang
 
Open Source codes of trajectory prediction & behavior planning
Open Source codes of trajectory prediction & behavior planningOpen Source codes of trajectory prediction & behavior planning
Open Source codes of trajectory prediction & behavior planning
Yu Huang
 

More from Yu Huang (20)

Application of Foundation Model for Autonomous Driving
Application of Foundation Model for Autonomous DrivingApplication of Foundation Model for Autonomous Driving
Application of Foundation Model for Autonomous Driving
 
The New Perception Framework in Autonomous Driving: An Introduction of BEV N...
The New Perception Framework  in Autonomous Driving: An Introduction of BEV N...The New Perception Framework  in Autonomous Driving: An Introduction of BEV N...
The New Perception Framework in Autonomous Driving: An Introduction of BEV N...
 
Data Closed Loop in Simulation Test of Autonomous Driving
Data Closed Loop in Simulation Test of Autonomous DrivingData Closed Loop in Simulation Test of Autonomous Driving
Data Closed Loop in Simulation Test of Autonomous Driving
 
Techniques and Challenges in Autonomous Driving
Techniques and Challenges in Autonomous DrivingTechniques and Challenges in Autonomous Driving
Techniques and Challenges in Autonomous Driving
 
BEV Joint Detection and Segmentation
BEV Joint Detection and SegmentationBEV Joint Detection and Segmentation
BEV Joint Detection and Segmentation
 
BEV Object Detection and Prediction
BEV Object Detection and PredictionBEV Object Detection and Prediction
BEV Object Detection and Prediction
 
Fisheye based Perception for Autonomous Driving VI
Fisheye based Perception for Autonomous Driving VIFisheye based Perception for Autonomous Driving VI
Fisheye based Perception for Autonomous Driving VI
 
Fisheye/Omnidirectional View in Autonomous Driving V
Fisheye/Omnidirectional View in Autonomous Driving VFisheye/Omnidirectional View in Autonomous Driving V
Fisheye/Omnidirectional View in Autonomous Driving V
 
Fisheye/Omnidirectional View in Autonomous Driving IV
Fisheye/Omnidirectional View in Autonomous Driving IVFisheye/Omnidirectional View in Autonomous Driving IV
Fisheye/Omnidirectional View in Autonomous Driving IV
 
Prediction,Planninng & Control at Baidu
Prediction,Planninng & Control at BaiduPrediction,Planninng & Control at Baidu
Prediction,Planninng & Control at Baidu
 
Cruise AI under the Hood
Cruise AI under the HoodCruise AI under the Hood
Cruise AI under the Hood
 
LiDAR in the Adverse Weather: Dust, Snow, Rain and Fog (2)
LiDAR in the Adverse Weather: Dust, Snow, Rain and Fog (2)LiDAR in the Adverse Weather: Dust, Snow, Rain and Fog (2)
LiDAR in the Adverse Weather: Dust, Snow, Rain and Fog (2)
 
Scenario-Based Development & Testing for Autonomous Driving
Scenario-Based Development & Testing for Autonomous DrivingScenario-Based Development & Testing for Autonomous Driving
Scenario-Based Development & Testing for Autonomous Driving
 
How to Build a Data Closed-loop Platform for Autonomous Driving?
How to Build a Data Closed-loop Platform for Autonomous Driving?How to Build a Data Closed-loop Platform for Autonomous Driving?
How to Build a Data Closed-loop Platform for Autonomous Driving?
 
Annotation tools for ADAS & Autonomous Driving
Annotation tools for ADAS & Autonomous DrivingAnnotation tools for ADAS & Autonomous Driving
Annotation tools for ADAS & Autonomous Driving
 
Simulation for autonomous driving at uber atg
Simulation for autonomous driving at uber atgSimulation for autonomous driving at uber atg
Simulation for autonomous driving at uber atg
 
Prediction and planning for self driving at waymo
Prediction and planning for self driving at waymoPrediction and planning for self driving at waymo
Prediction and planning for self driving at waymo
 
Jointly mapping, localization, perception, prediction and planning
Jointly mapping, localization, perception, prediction and planningJointly mapping, localization, perception, prediction and planning
Jointly mapping, localization, perception, prediction and planning
 
Data pipeline and data lake for autonomous driving
Data pipeline and data lake for autonomous drivingData pipeline and data lake for autonomous driving
Data pipeline and data lake for autonomous driving
 
Open Source codes of trajectory prediction & behavior planning
Open Source codes of trajectory prediction & behavior planningOpen Source codes of trajectory prediction & behavior planning
Open Source codes of trajectory prediction & behavior planning
 

Recently uploaded

block diagram and signal flow graph representation
block diagram and signal flow graph representationblock diagram and signal flow graph representation
block diagram and signal flow graph representation
Divya Somashekar
 
Railway Signalling Principles Edition 3.pdf
Railway Signalling Principles Edition 3.pdfRailway Signalling Principles Edition 3.pdf
Railway Signalling Principles Edition 3.pdf
TeeVichai
 
ASME IX(9) 2007 Full Version .pdf
ASME IX(9)  2007 Full Version       .pdfASME IX(9)  2007 Full Version       .pdf
ASME IX(9) 2007 Full Version .pdf
AhmedHussein950959
 
Governing Equations for Fundamental Aerodynamics_Anderson2010.pdf
Governing Equations for Fundamental Aerodynamics_Anderson2010.pdfGoverning Equations for Fundamental Aerodynamics_Anderson2010.pdf
Governing Equations for Fundamental Aerodynamics_Anderson2010.pdf
WENKENLI1
 
Design and Analysis of Algorithms-DP,Backtracking,Graphs,B&B
Design and Analysis of Algorithms-DP,Backtracking,Graphs,B&BDesign and Analysis of Algorithms-DP,Backtracking,Graphs,B&B
Design and Analysis of Algorithms-DP,Backtracking,Graphs,B&B
Sreedhar Chowdam
 
一比一原版(IIT毕业证)伊利诺伊理工大学毕业证成绩单专业办理
一比一原版(IIT毕业证)伊利诺伊理工大学毕业证成绩单专业办理一比一原版(IIT毕业证)伊利诺伊理工大学毕业证成绩单专业办理
一比一原版(IIT毕业证)伊利诺伊理工大学毕业证成绩单专业办理
zwunae
 
Fundamentals of Electric Drives and its applications.pptx
Fundamentals of Electric Drives and its applications.pptxFundamentals of Electric Drives and its applications.pptx
Fundamentals of Electric Drives and its applications.pptx
manasideore6
 
Architectural Portfolio Sean Lockwood
Architectural Portfolio Sean LockwoodArchitectural Portfolio Sean Lockwood
Architectural Portfolio Sean Lockwood
seandesed
 
CME397 Surface Engineering- Professional Elective
CME397 Surface Engineering- Professional ElectiveCME397 Surface Engineering- Professional Elective
CME397 Surface Engineering- Professional Elective
karthi keyan
 
Hybrid optimization of pumped hydro system and solar- Engr. Abdul-Azeez.pdf
Hybrid optimization of pumped hydro system and solar- Engr. Abdul-Azeez.pdfHybrid optimization of pumped hydro system and solar- Engr. Abdul-Azeez.pdf
Hybrid optimization of pumped hydro system and solar- Engr. Abdul-Azeez.pdf
fxintegritypublishin
 
Nuclear Power Economics and Structuring 2024
Nuclear Power Economics and Structuring 2024Nuclear Power Economics and Structuring 2024
Nuclear Power Economics and Structuring 2024
Massimo Talia
 
MCQ Soil mechanics questions (Soil shear strength).pdf
MCQ Soil mechanics questions (Soil shear strength).pdfMCQ Soil mechanics questions (Soil shear strength).pdf
MCQ Soil mechanics questions (Soil shear strength).pdf
Osamah Alsalih
 
Hierarchical Digital Twin of a Naval Power System
Hierarchical Digital Twin of a Naval Power SystemHierarchical Digital Twin of a Naval Power System
Hierarchical Digital Twin of a Naval Power System
Kerry Sado
 
Water Industry Process Automation and Control Monthly - May 2024.pdf
Water Industry Process Automation and Control Monthly - May 2024.pdfWater Industry Process Automation and Control Monthly - May 2024.pdf
Water Industry Process Automation and Control Monthly - May 2024.pdf
Water Industry Process Automation & Control
 
Gen AI Study Jams _ For the GDSC Leads in India.pdf
Gen AI Study Jams _ For the GDSC Leads in India.pdfGen AI Study Jams _ For the GDSC Leads in India.pdf
Gen AI Study Jams _ For the GDSC Leads in India.pdf
gdsczhcet
 
WATER CRISIS and its solutions-pptx 1234
WATER CRISIS and its solutions-pptx 1234WATER CRISIS and its solutions-pptx 1234
WATER CRISIS and its solutions-pptx 1234
AafreenAbuthahir2
 
Student information management system project report ii.pdf
Student information management system project report ii.pdfStudent information management system project report ii.pdf
Student information management system project report ii.pdf
Kamal Acharya
 
Top 10 Oil and Gas Projects in Saudi Arabia 2024.pdf
Top 10 Oil and Gas Projects in Saudi Arabia 2024.pdfTop 10 Oil and Gas Projects in Saudi Arabia 2024.pdf
Top 10 Oil and Gas Projects in Saudi Arabia 2024.pdf
Teleport Manpower Consultant
 
Investor-Presentation-Q1FY2024 investor presentation document.pptx
Investor-Presentation-Q1FY2024 investor presentation document.pptxInvestor-Presentation-Q1FY2024 investor presentation document.pptx
Investor-Presentation-Q1FY2024 investor presentation document.pptx
AmarGB2
 
power quality voltage fluctuation UNIT - I.pptx
power quality voltage fluctuation UNIT - I.pptxpower quality voltage fluctuation UNIT - I.pptx
power quality voltage fluctuation UNIT - I.pptx
ViniHema
 

Recently uploaded (20)


Depth Fusion from RGB and Depth Sensors IV

  • 6. Single-Photon 3D Imaging with Deep Sensor Fusion (a) photo of setup (b) imaging optics (c) illumination optics Single-photon imaging prototype. (a) Both the imaging optics (bottom) and illumination optics (top). The illumination and imaging optics are aligned in a rectified setup to perform energy-efficient epipolar scanning. (b) A dichroic short-pass filter reflects light above 500 nm to a PointGrey vision camera, and transmits light of all remaining wavelengths through a 450 nm laser line filter and onto a 1D array of 256 SPAD pixels. The galvo mirror angle controls the scanline imaging the scene. (c) A cylindrical lens creates a vertical laser line, and the galvo mirror determines the position of this laser line within the scene.
  • 7. Single-Photon 3D Imaging with Deep Sensor Fusion Reconstruction results for four scenes: checkerboard, elephant, lamp, and bouncing ball.
  • 8. Deep RGB-D Canonical Correlation Analysis For Sparse Depth Completion ■ Correlation For Completion Network (CFCNet) is an end-to-end deep model for sparse depth completion with RGB information. ■ A 2D deep canonical correlation analysis is used as a network constraint to ensure the RGB and depth encoders capture maximally similar semantics. ■ It transforms the RGB features to the depth domain, and the complementary RGB information is used to complete the missing depth. ■ A completed dense depth map is viewed as composed of two parts. ■ One part is the sparse depth, which is observable and used as the input; the other is non-observable and recovered by the task. ■ Likewise, the full RGB image corresponding to the depth map can be decomposed into two parts: the sparse RGB, which holds the RGB values at the observable locations of the sparse depth, and the complementary RGB, obtained by subtracting the sparse RGB from the full RGB image. ■ During training, CFCNet learns the relationship between sparse depth and sparse RGB and uses the learned knowledge to recover non-observable depth from complementary RGB. 2019,6
  • 9. Deep RGB-D Canonical Correlation Analysis For Sparse Depth Completion The input 0-1 sparse mask represents the sparsity pattern of the depth measurements. The complementary mask is complementary to the sparse mask. The full image is separated into a sparse RGB and a complementary RGB by the masks, and these are fed together with the masks into the network.
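To make this decomposition concrete, here is a minimal sketch (not CFCNet's actual code; tensor and function names are hypothetical) of splitting a full RGB image into sparse and complementary parts with the 0-1 masks:

```python
import torch

def decompose_inputs(rgb, sparse_depth):
    """Split a full RGB image into sparse / complementary parts using the
    0-1 sparsity pattern of the sparse depth map (CFCNet-style inputs).

    rgb:          (B, 3, H, W) full color image
    sparse_depth: (B, 1, H, W) depth, zero at unobserved pixels
    """
    sparse_mask = (sparse_depth > 0).float()   # 1 where depth is observed
    comp_mask = 1.0 - sparse_mask              # complementary mask
    sparse_rgb = rgb * sparse_mask             # RGB at observed locations
    comp_rgb = rgb * comp_mask                 # RGB at missing locations
    return sparse_rgb, comp_rgb, sparse_mask, comp_mask
```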
  • 10. Deep RGB-D Canonical Correlation Analysis For Sparse Depth Completion ■ CFCNet takes in a sparse depth map, a sparse RGB, and a complementary RGB. ■ Sparsity-aware Attentional Convolutions (SAConv) are used in VGG16-like encoders. ■ SAConv is inspired by the local attention mask, which introduces a segmentation-aware mask to let the convolution "focus" on signals consistent with the segmentation mask. ■ To propagate information from reliable sources, sparsity masks make the convolution operations attend to signals from reliable locations. ■ The difference from the local attention mask is that SAConv does not apply mask normalization. ■ Mask normalization is avoided because repeated normalization yields numerically small features, which destabilizes the later 2D2CCA computation. ■ A max-pooling operation is also applied to the masks after every SAConv to keep track of visibility. ■ If at least one nonzero value is visible to a convolutional kernel, the max-pooling sets the mask value at that position to 1.
  • 11. Deep RGB-D Canonical Correlation Analysis For Sparse Depth Completion SAConv: the ⊙ denotes the Hadamard product, the ⊗ denotes convolution, and the + denotes elementwise addition. The kernel size is 3 × 3 and the stride is 1 for both convolution and max-pooling. 2D Deep Canonical Correlation Analysis (2D2CCA): the correlation is computed from the (full-rank) covariance and cross-covariance matrices of the two feature streams, and it enters the total loss function.
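A rough PyTorch-style sketch of an SAConv layer as described on slides 10-11 (not the authors' implementation; the class and argument names are hypothetical): the mask gates the features via a Hadamard product, no mask normalization is applied, and max-pooling over the mask tracks visibility.

```python
import torch
import torch.nn as nn

class SAConv(nn.Module):
    """Sparsity-aware attentional convolution (sketch).
    Convolves mask-gated features and updates the visibility mask by max-pooling."""
    def __init__(self, in_ch, out_ch, kernel_size=3, stride=1, padding=1):
        super().__init__()
        self.conv = nn.Conv2d(in_ch, out_ch, kernel_size, stride, padding, bias=True)
        self.pool = nn.MaxPool2d(kernel_size, stride, padding)

    def forward(self, x, mask):
        # mask: (B, 1, H, W) 0-1 visibility map, broadcast across feature channels.
        gated = x * mask                  # Hadamard product: attend only to reliable locations
        out = self.conv(gated)            # convolution followed by elementwise bias addition
        # No mask normalization (kept un-normalized for 2D2CCA stability).
        new_mask = self.pool(mask)        # 1 if any visible pixel falls under the kernel
        return out, new_mask
```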
  • 12. Deep RGB-D Canonical Correlation Analysis For Sparse Depth Completion ■ Most multi-modal deep learning approaches simply concatenate or element-wise add bottleneck features. ■ However, when the extracted semantics and the range of feature values differ between modalities, direct concatenation and addition of multi-modal data sources do not always yield better performance than a single-modal source. ■ To avoid this problem, encoders are used to extract higher-level semantics from the two branches. ■ 2D2CCA ensures the extracted features from the two branches are maximally correlated. ■ The intuition is to capture the same semantics from the RGB and depth domains. ■ Next, a transformer network transforms the extracted features from the RGB domain to the depth domain, so that features from different sources share the same numerical range. ■ During the training phase, features of the sparse depth and the corresponding sparse RGB image are used to calculate the 2D2CCA loss and the transformer loss.
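For reference, the classical deep CCA correlation objective takes the following form (a standard-formulation sketch, not the exact 2D2CCA loss; the 2D variant applies the same idea to 2D feature maps, and the regularization r used in CFCNet may differ):

```latex
% Standard deep CCA objective (sketch); H_1, H_2 are centered feature matrices with m columns
\Sigma_{11} = \tfrac{1}{m-1} H_1 H_1^{\top} + r I, \quad
\Sigma_{22} = \tfrac{1}{m-1} H_2 H_2^{\top} + r I, \quad
\Sigma_{12} = \tfrac{1}{m-1} H_1 H_2^{\top}
\\[4pt]
T = \Sigma_{11}^{-1/2}\, \Sigma_{12}\, \Sigma_{22}^{-1/2},
\qquad
\operatorname{corr}(H_1, H_2) = \| T \|_{\mathrm{tr}} = \operatorname{tr}\!\big( (T^{\top} T)^{1/2} \big),
\qquad
\mathcal{L}_{\mathrm{CCA}} = -\operatorname{corr}(H_1, H_2)
```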
  • 13. Deep RGB-D Canonical Correlation Analysis For Sparse Depth Completion (a) RGB image (b) 500 points sparse depth as inputs. (c) Completed depth maps. (d) Results from MIT.
  • 14. Learning Guided Convolutional Network for Depth Completion ■ Dense depth perception is critical for autonomous driving and other robotics applications. ■ It is thus necessary to complete the sparse LiDAR data, where a synchronized guidance RGB image is often used to facilitate this completion. ■ Inspired by guided image filtering, a guided network predicts kernel weights from the guidance image. ■ These predicted kernels are then applied to extract the depth image features. ■ In this way, the network generates content-dependent and spatially-variant kernels for multi-modal feature fusion. ■ Dynamically generated spatially-variant kernels could lead to prohibitive GPU memory consumption and computation overhead. ■ A convolution factorization is therefore designed to reduce computation and memory consumption. ■ The GPU memory reduction makes it possible for the feature fusion to work in a multi-stage scheme. 2019,8
  • 15. Learning Guided Convolutional Network for Depth Completion The network architecture includes two sub-networks: GuideNet in orange and DepthNet in blue. A convolution layer is added at the beginning of both GuideNet and DepthNet, as well as at the end of DepthNet. The light orange and blue blocks are the encoder stages, while the corresponding dark ones are the decoder stages of GuideNet and DepthNet, respectively. The ResBlock represents the basic residual block structure with two sequential 3 × 3 convolutional layers.
  • 16. Learning Guided Convolutional Network for Depth Completion Guided Convolution Module. (a) the overall pipeline of the guided convolution module. Given image features as input, the filter generation layer dynamically produces guided kernels, which are then applied to the input depth features to output new depth features. (b) the details of the convolution between guided kernels and input depth features; it is factorized into two stages: a channel-wise convolution and a cross-channel convolution.
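A simplified sketch of such a guided, spatially-variant convolution (assumed shapes and hypothetical names; guidance and depth features are assumed to share the same spatial resolution, and the cross-channel stage is approximated here by an ordinary 1 × 1 convolution rather than dynamically generated kernels):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class GuidedConv(nn.Module):
    """Content-dependent, spatially-variant kernel fusion (sketch).
    Stage 1: per-pixel, per-channel k x k kernels predicted from guidance features.
    Stage 2: plain 1 x 1 convolution mixing channels (simplification)."""
    def __init__(self, guide_ch, depth_ch, k=3):
        super().__init__()
        self.k = k
        self.kernel_gen = nn.Conv2d(guide_ch, depth_ch * k * k, 1)   # filter generation layer
        self.cross = nn.Conv2d(depth_ch, depth_ch, 1)                # cross-channel stage

    def forward(self, guide_feat, depth_feat):
        b, c, h, w = depth_feat.shape
        k = self.k
        # Predict one k*k kernel per channel and per pixel from the guidance features.
        kernels = self.kernel_gen(guide_feat).view(b, c, k * k, h * w)
        # Unfold depth features into k*k neighborhoods, then apply the per-pixel kernels.
        patches = F.unfold(depth_feat, k, padding=k // 2).view(b, c, k * k, h * w)
        fused = (kernels * patches).sum(dim=2).view(b, c, h, w)      # channel-wise stage
        return self.cross(fused)                                      # cross-channel stage
```

Generating only per-channel k × k kernels per pixel, rather than full input-by-output channel kernels, is what keeps the memory of this factorized scheme manageable.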
  • 17. Learning Guided Convolutional Network for Depth Completion Qualitative comparison with state-of-the-art methods on KITTI test set
  • 18. DFineNet: Ego-Motion Estimation and Depth Refinement from Sparse, Noisy Depth Input with RGB Guidance ■ Depth estimation is an important capability for autonomous vehicles to understand and reconstruct 3D environments as well as avoid obstacles during execution. ■ Accurate depth sensors such as LiDARs are often heavy and expensive and can only provide sparse depth, while lighter depth sensors such as stereo cameras are noisier in comparison. ■ This is an end-to-end learning algorithm that is capable of using sparse, noisy input depth for refinement and depth completion. ■ The model also produces the camera pose as a byproduct, making it a great solution for autonomous systems. ■ The approach is evaluated on both indoor and outdoor datasets. 2019,8
  • 19. DFineNet: Ego-Motion Estimation and Depth Refinement from Sparse, Noisy Depth Input with RGB Guidance An example of sparse, noisy depth input (1st row), the 3D visualization of the ground-truth depth (2nd row), and the 3D visualization of the model output (bottom). The RGB image (1st row) is overlaid with the sparse, noisy depth input for visualization.
  • 20. DFineNet: Ego-Motion Estimation and Depth Refinement from Sparse, Noisy Depth Input with RGB Guidance It refines sparse & noisy depth input (the 3rd row) to output dense depth of high quality (bottom row).
  • 21. DFineNet: Ego-Motion Estimation and Depth Refinement from Sparse, Noisy Depth Input with RGB Guidance Network Architecture: the network consists of two branches, one CNN to learn the function that estimates the depth (ψd) and one CNN to learn the function that estimates the pose (θp). The network takes as input the image sequence and corresponding sparse depth maps and outputs the transformation as well as the dense depth map. During training, the two sets of parameters are simultaneously updated by the training signal detailed on the next slide. The depth branch (Depth-CNN) is a revised version of Ma's depth network from MIT.
  • 22. DFineNet: Ego-Motion Estimation and Depth Refinement from Sparse, Noisy Depth Input with RGB Guidance ■ Supervised Loss ■ Photometric Loss ■ Masked Photometric Loss ■ Smoothness Loss (derived from SfM-Net) ■ Total Loss: a combination of the terms above.
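Typical forms of these loss terms look as follows (a hedged sketch rather than DFineNet's exact formulation; \hat{D} is the predicted dense depth, D_s the sparse supervision on the valid set Ω_s, I_t and I_s the target and source frames, π(·) the projection of a pixel into the source view given the estimated pose T_{t→s} and depth, and M a validity mask):

```latex
% Hedged sketch of the listed loss terms (notation assumed, not DFineNet's exact definitions)
\mathcal{L}_{\text{sup}}   = \frac{1}{|\Omega_s|} \sum_{p \in \Omega_s} \big\| \hat{D}(p) - D_s(p) \big\|_1
\qquad
\mathcal{L}_{\text{photo}} = \frac{1}{|\Omega|} \sum_{p \in \Omega} \big| I_t(p) - I_s\big(\pi(T_{t \to s}, \hat{D}(p), p)\big) \big|
\\[4pt]
\mathcal{L}_{\text{m-photo}} = \frac{\sum_{p} M(p)\, \big| I_t(p) - \hat{I}_s(p) \big|}{\sum_{p} M(p)}
\qquad
\mathcal{L}_{\text{smooth}} = \sum_{p} \big| \nabla^2 \hat{D}(p) \big|
\\[4pt]
\mathcal{L}_{\text{total}} = \lambda_1 \mathcal{L}_{\text{sup}} + \lambda_2 \mathcal{L}_{\text{photo}}\ (\text{or } \mathcal{L}_{\text{m-photo}}) + \lambda_3 \mathcal{L}_{\text{smooth}}
```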
  • 23. DFineNet: Ego-Motion Estimation and Depth Refinement from Sparse, Noisy Depth Input with RGB Guidance Qualitative results of this method (left), RGB guide & certainty (middle, ranked 1st), and MIT's Ma (right, ranked 7th).
  • 24. Confidence Propagation through CNNs for Guided Sparse Depth Regression ■ 2019,8 ■ Generally, convolutional neural networks (CNNs) process data on a regular grid, e.g. data generated by ordinary cameras. ■ Designing CNNs for sparse and irregularly spaced input data is still an open research problem with numerous applications in autonomous driving, robotics, and surveillance. ■ An algebraically-constrained normalized convolution layer for CNNs with highly sparse input that has a smaller number of network parameters compared to related work. ■ Strategies for determining the confidence from the convolution operation and propagating it to consecutive layers. ■ An objective function that simultaneously minimizes the data error while maximizing the output confidence. ■ To integrate structural information, fusion strategies combine depth and RGB information within the normalized convolution network framework. ■ In addition, the output confidence is used as auxiliary information to improve the results.
  • 25. Confidence Propagation through CNNs for Guided Sparse Depth Regression Scene depth completion pipeline on an example image. The input to the pipeline is a very sparse projected LiDAR point cloud, an input confidence map which has zeros at missing pixels and ones otherwise, and an RGB image. The sparse point cloud input and the input confidence are fed to a multi-scale unguided network that acts as a generic estimator for the data. Afterwards, the continuous output confidence map is concatenated with the RGB image and fed to a feature extraction network. The output from the unguided network and the RGB feature extraction networks are concatenated and fed to a fusion network which produces the final dense depth map.
  • 26. Confidence Propagation through CNNs for Guided Sparse Depth Regression The standard convolution layer in CNN frameworks can be replaced by a normalized convolution layer with minor modifications. First, the layer takes in two inputs simultaneously, the data and its confidence. The forward pass is then modified, and the back-propagation is modified to include a derivative term for the non-negativity enforcement function. To propagate the confidence to consecutive layers, the already-calculated denominator term is normalized by the sum of the filter elements. Normalized Convolution layer that takes in two inputs, i.e. data and confidence, and outputs a data term and a confidence term.
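A minimal PyTorch-style sketch of this normalized convolution with confidence propagation (hypothetical class; a softplus is used here as one possible non-negativity enforcement function, and the confidence map is assumed to have the same shape as the data):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class NormalizedConv2d(nn.Module):
    """Normalized convolution (sketch): convolves confidence-weighted data and
    divides by the convolved confidence; propagates an output confidence map."""
    def __init__(self, in_ch, out_ch, k=3, eps=1e-8):
        super().__init__()
        self.weight = nn.Parameter(torch.randn(out_ch, in_ch, k, k) * 0.1)
        self.bias = nn.Parameter(torch.zeros(out_ch))
        self.k = k
        self.eps = eps

    def forward(self, x, conf):
        w = F.softplus(self.weight)                      # non-negativity enforcement
        num = F.conv2d(x * conf, w, padding=self.k // 2)
        den = F.conv2d(conf, w, padding=self.k // 2)
        out = num / (den + self.eps) + self.bias.view(1, -1, 1, 1)
        # Propagate confidence: normalize the denominator by the sum of the filter elements.
        out_conf = den / (w.sum(dim=(1, 2, 3)).view(1, -1, 1, 1) + self.eps)
        return out, out_conf
```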
  • 27. Confidence Propagation through CNNs for Guided Sparse Depth Regression The multi-scale architecture for the task of unguided scene depth completion that utilizes normalized convolution layers. Downsampling is performed using max pooling on confidence maps and the indices of the pooled pixels are used to select the pixels with highest confidences from the feature maps. Different scales are fused by upsampling the coarser scale and concatenating it with the finer scale. A normalized convolution layer is then used to fuse the feature maps based on the confidence information. Finally, a 1 × 1 normalized convolution layer is used to merge different channels into one channel and produce a dense output and an output confidence map.
  • 28. Confidence Propagation through CNNs for Guided Sparse Depth Regression (a) A multi-stream architecture that contains a stream for depth and another stream for RGB + output confidence feature extraction; a fusion network then combines both streams to produce the final dense output. (b) A multi-scale encoder-decoder architecture where depth is fed to the unguided network followed by an encoder, and the output confidence and RGB image are concatenated and fed to a similar encoder; both streams have skip-connections to the decoder at the corresponding scales. (c) is similar to (a) but with early fusion, and (d) is similar to (b) but with early fusion.
  • 29. Confidence Propagation through CNNs for Guided Sparse Depth Regression (a) RGB input, (b) method MS-Net[LF]-L2 (gd), (c) Sparse-to-Dense (gd) and (d) HMS-Net (gd). For each one, top: the prediction. Method MS-Net[LF]-L2 (gd) performs slightly better, while Sparse-to-Dense produces smoother edges due to the use of a smoothness loss.
  • 30. PLIN: A Network for Pseudo-LiDAR Point Cloud Interpolation ■ LiDAR can provide dependable 3D spatial information at a low frequency (around 10 Hz) and has been widely applied in autonomous driving and UAVs. ■ However, the higher frame rate of the camera (around 20-30 Hz) has to be decreased to match the LiDAR in a multi-sensor system. ■ A Pseudo-LiDAR interpolation network (PLIN) to increase the frequency of LiDAR sensors. ■ PLIN can generate temporally and spatially high-quality point cloud sequences to match the high frequency of cameras. ■ For this goal, it uses a coarse interpolation stage guided by consecutive sparse depth maps and motion relationships, and a refined interpolation stage guided by the realistic scene. ■ Using this coarse-to-fine cascade structure, the method can progressively perceive multi-modal information and generate accurate intermediate point clouds. ■ This is the first deep framework for Pseudo-LiDAR point cloud interpolation, which shows appealing applications in navigation systems equipped with LiDAR and cameras. 2019,9
  • 31. PLIN: A Network for Pseudo-LiDAR Point Cloud Interpolation Overall pipeline of the proposed method. PLIN aims to address the mismatching problem of frequency between camera and LiDAR sensors, generating both temporally and spatially high-quality point cloud sequences. This method takes three consecutive color images and two sparse depth maps as inputs, and interpolates an intermediate dense depth map, which is further transformed into a Pseudo-LiDAR point cloud using camera intrinsic parameters.
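The depth-to-Pseudo-LiDAR transformation mentioned here is the standard pinhole back-projection; a small illustrative sketch (variable names are not from PLIN):

```python
import numpy as np

def depth_to_pseudo_lidar(depth, K):
    """Back-project a dense depth map into a 3D point cloud (pseudo-LiDAR).

    depth: (H, W) depth in meters (0 = invalid)
    K:     (3, 3) camera intrinsic matrix
    returns: (N, 3) points in the camera frame
    """
    h, w = depth.shape
    fx, fy = K[0, 0], K[1, 1]
    cx, cy = K[0, 2], K[1, 2]
    u, v = np.meshgrid(np.arange(w), np.arange(h))   # pixel coordinates
    z = depth
    valid = z > 0
    x = (u - cx) * z / fx
    y = (v - cy) * z / fy
    return np.stack([x[valid], y[valid], z[valid]], axis=1)
```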
  • 32. PLIN: A Network for Pseudo-LiDAR Point Cloud Interpolation Overview of the Pseudo-LiDAR interpolation network (PLIN). The whole architecture consists of three modules, including the motion guidance module, scene guidance module and transformation module.
  • 33. PLIN: A Network for Pseudo-LiDAR Point Cloud Interpolation Results of interpolated depth maps obtained by PLIN. For each example, it shows the intermediate color image, sparse depth map, dense depth map, and the result. This method can recover the original depth information and generate much denser distributions.
  • 34. PLIN: A Network for Pseudo-LiDAR Point Cloud Interpolation It shows the color image, interpolated dense depth map, two views of the generated Pseudo-LiDAR, and enlarged areas. The complete network produces a more accurate depth map, and the distribution and shape of the Pseudo-LiDAR are more similar to those of the GT point cloud.
  • 35. Depth Completion from Sparse LiDAR Data with Depth-Normal Constraints ■ Depth completion aims to recover dense depth maps from sparse depth measurements. ■ It is of increasing importance for autonomous driving and draws increasing attention from the vision community. ■ Most existing methods directly train a network to learn a mapping from sparse depth inputs to dense depth maps, which has difficulty utilizing 3D geometric constraints and handling practical sensor noise. ■ To regularize the depth completion and improve robustness against noise, a unified CNN framework 1) models the geometric constraints between depth and surface normal in a diffusion module and 2) predicts the confidence of sparse LiDAR measurements to mitigate the impact of noise. ■ Specifically, the encoder-decoder backbone predicts surface normals, coarse depth and the confidence of LiDAR inputs simultaneously, which are then fed into the diffusion refinement module to obtain the final completion results. 2019,10
  • 36. Depth Completion from Sparse LiDAR Data with Depth-Normal Constraints From sparse LiDAR measurements and color images (a-b), this model first infers the maps of coarse depth and normal (c-d), and then recurrently refines the initial depth estimation by enforcing the constraints between depth and normals. Moreover, to address the noises in practical LiDAR measurements (g), employ a decoder branch to predict the confidences (h) of sparse inputs for better regularization.
  • 37. Depth Completion from Sparse LiDAR Data with Depth-Normal Constraints The prediction network first predicts maps of surface normal N, coarse depth D and confidence M of the sparse depth input with a shared-weight encoder and independent decoders. Then, the sparse depth inputs D̄ and coarse depth D are transformed to the plane-origin distance space as P̄ and P. Next, the refinement network, an anisotropic diffusion module, refines the coarse depth map D in the plane-origin distance subspace to enforce the constraints between depth and normal and to incorporate information from the confident sparse depth inputs. During the refinement, the diffusion conductance depends on the similarity in the guidance feature map G. Finally, the refined P is inversely transformed back to obtain the refined depth map Dr when the diffusion is finished.
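The plane-origin distance transform referred to above commonly has the following form (a sketch derived from the tangent-plane relation; the notation may differ slightly from the paper). For a pixel with homogeneous coordinate x̃, intrinsics K, depth D and unit normal N:

```latex
% Back-projection, tangent-plane relation, and plane-origin distance (sketch)
X(x) = D(x)\, K^{-1} \tilde{x}
\\[4pt]
N(x)^{\top} \big( X' - X(x) \big) = 0 \quad \text{for all } X' \text{ on the tangent plane at } X(x)
\\[4pt]
P(x) = N(x)^{\top} X(x) = D(x)\, N(x)^{\top} K^{-1} \tilde{x}
```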
  • 38. Depth Completion from Sparse LiDAR Data with Depth-Normal Constraints Differentiable diffusion block. In each refinement iteration, high-dimensional feature vectors (e.g., of dimension 64) in the guidance feature map G are independently transformed via two different functions f and g (modeled as two convolution layers followed by normalization). Then, the conductance from each location xi (in the plane-origin distance map P) to its K neighboring pixels (xj ∈ Ni) is calculated. Finally, the diffusion is performed through a convolution operation with the kernels defined by the previously computed conductance. Through such diffusion, the depth completion results are regularized by the constraint between depth and normal.
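A rough sketch of one such diffusion iteration (a hypothetical module; the conductance here is a softmax-normalized affinity between embedded guidance features, which may differ in detail from the paper's exact conductance function):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class DiffusionBlock(nn.Module):
    """One anisotropic diffusion step over the plane-origin distance map P,
    with conductance computed from embedded guidance features G (sketch)."""
    def __init__(self, guide_ch, embed_ch=64, k=3):
        super().__init__()
        self.f = nn.Sequential(nn.Conv2d(guide_ch, embed_ch, 1), nn.BatchNorm2d(embed_ch))
        self.g = nn.Sequential(nn.Conv2d(guide_ch, embed_ch, 1), nn.BatchNorm2d(embed_ch))
        self.k = k

    def forward(self, P, G):
        b, _, h, w = P.shape
        k2 = self.k * self.k
        fi = self.f(G).view(b, -1, 1, h * w)                       # center embeddings
        gj = F.unfold(self.g(G), self.k, padding=self.k // 2)      # neighbor embeddings
        gj = gj.view(b, -1, k2, h * w)
        # Conductance: similarity between the center embedding and each neighbor embedding.
        affinity = -(fi - gj).pow(2).sum(dim=1)                    # (b, k2, h*w)
        conductance = F.softmax(affinity, dim=1)
        # Diffusion: convolution of P with the per-pixel conductance kernels.
        P_patches = F.unfold(P, self.k, padding=self.k // 2).view(b, k2, h * w)
        P_new = (conductance * P_patches).sum(dim=1).view(b, 1, h, w)
        return P_new
```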
  • 39. Depth Completion from Sparse LiDAR Data with Depth-Normal Constraints Loss terms: a negative cosine loss on the predicted normals, an L2 reconstruction loss, an L2 depth loss, and an L2 refinement reconstruction loss; the overall loss function combines these terms. The relation between depth and normal is established via the tangent plane equation.
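One plausible reading of these terms, written out for clarity (an assumption-laden sketch, not the paper's exact losses or weights; the "reconstruction" terms are interpreted here as L2 penalties in the plane-origin distance and refined-depth spaces, with N*, D* the ground-truth normal and depth and P̄ the distance from the sparse input):

```latex
% Assumed forms of the listed terms (lambda weights hypothetical)
\mathcal{L}_{N}   = -\frac{1}{n} \sum_{x} N(x)^{\top} N^{*}(x)
\qquad
\mathcal{L}_{P}   = \frac{1}{n} \sum_{x} \big\| P(x) - \bar{P}(x) \big\|_2^2
\\[4pt]
\mathcal{L}_{D}   = \frac{1}{n} \sum_{x} \big\| D(x) - D^{*}(x) \big\|_2^2
\qquad
\mathcal{L}_{D^r} = \frac{1}{n} \sum_{x} \big\| D^{r}(x) - D^{*}(x) \big\|_2^2
\\[4pt]
\mathcal{L} = \mathcal{L}_{N} + \lambda_1 \mathcal{L}_{P} + \lambda_2 \mathcal{L}_{D} + \lambda_3 \mathcal{L}_{D^r}
```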
  • 40. Depth Completion from Sparse LiDAR Data with Depth-Normal Constraints Qualitative comparison with other methods. For each method, the whole completion result is shown, together with zoom-in views of details and error maps for better comparison. The normal prediction and confidence prediction of this method are also shown for better illustration.