Color and 3D Semantic Reconstruction
of Indoor Scenes from RGB-D Streams
Junho Jeon (전준호)
CG Lab. POSTECH
Tech Talk @ NAVER
2018.12.10
3D Reconstruction
• Capture shape and appearance of real objects and environments
• Produce 3D models for applications such as virtual/augmented reality, 3D printing
3D Reconstruction using RGB-D Sensor
• Geometric reconstruction has developed rapidly and is now available for large-scale scenes
▫ But existing methods mainly focus on acquiring accurate geometry
KinectFusion [Newcombe 2011] | Voxel hashing [Nießner 2013] | Elastic fragments [Zhou 2013] | Robust reconstruction [Choi 2015]
Auxiliary Information of 3D Indoor Scene
• Surface color
• Object class
• Lighting condition
• Sound
→ Rich UX → Color and Semantic Reconstruction
Contributions – Color Reconstruction
• Texture Map Generation for 3D Reconstructed Scenes
▫ Reconstruct clean and sharp surface color of the 3D reconstructed scene
▫ Light-weight color representation for reconstructed scenes
▫ Texture coordinates optimization to acquire sharp texture map
Texture map generation for 3D reconstruction
Contributions – Semantic Reconstruction
• Reconstruction of semantically segmented 3D meshes
▫ Predict per-vertex object class of the 3D reconstructed scene
▫ Volumetric semantic fusion of frame-by-frame semantic predictions
▫ Adaptive integration and CRF optimization for robust labeling
3D semantic reconstruction
Texture Map Generation
for 3D Reconstructed Scenes
Junho Jeon, Yeongyu Jung, Haejoon Kim,
Seungyong Lee
The Visual Computer (CGI 2016)
3D Reconstruction using RGB-D Sensor
• Available for very large-scale scenes
▫ But color information is missing or inaccurate!
Robust reconstruction [Choi 2015] | BundleFusion [Dai 2017]
Color Reconstruction
• Naïve color blending introduces blurring, ghosting, etc.
▫ Incorrect camera poses
▫ Lens distortions
▫ Misaligned RGB-D images
• Goal: precisely reconstruct the color from RGB-D stream
Blurry color from volumetric blending
Previous work: Color Map Optimization
• Zhou and Koltun, TOG 2014
▫ Project RGB stream onto mesh to get vertex color
▫ Optimize camera pose & warping function for images → clean vertex color
▫ Limitation: method based on vertex color
▪ Time-consuming optimization
▪ Inefficient rendering
* Images from Zhou’s slides
Result
Optimization takes 5 mins.
Image warping function
Our Approach
• Color reconstruction based on texture mapping
▫ Generating texture map for simplified mesh
▫ Optimize texture map to maximize photometric consistency
▫ GPU-based parallel solver
▪ 100x faster color reconstruction!
▪ Efficient rendering
Our method
Overall Framework
1) Preprocessing
2) Spatiotemporal key frame sampling
3) Texture map generation
4) Texture map optimization
[Figure: pipeline diagram. Labels: RGB-D stream (color, depth); simplified 3D reconstructed mesh; spatio-temporally sampled key frames; sub-textures; global texture; refined global texture map; rendering result]
Preprocessing
• Geometric model reconstruction
▫ Dense scene reconstruction with points of interest [Zhou 2013]
▫ Any other 3D reconstruction method can be used
• Model simplification
▫ Original mesh consists of more than 1M faces
▪ Inefficient texture mapping
▪ Further processing becomes extremely time-consuming
▫ Surface simplification using quadric error metrics [Garland 1997]
Dense scene reconstruction [Zhou 2013] | Mesh simplification (faces 460K to 23K)
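The quadric error metric used for simplification can be sketched as follows. This is a minimal illustration, not the authors' implementation: `plane_quadric` and `vertex_error` are hypothetical helper names, and only the error evaluation is shown (no edge-collapse priority queue).

```python
import numpy as np

def plane_quadric(p0, p1, p2):
    """Fundamental error quadric K = p p^T of the plane through a triangle."""
    n = np.cross(p1 - p0, p2 - p0)
    n = n / np.linalg.norm(n)            # unit normal (a, b, c)
    d = -np.dot(n, p0)                   # plane equation: ax + by + cz + d = 0
    p = np.append(n, d)                  # (a, b, c, d)
    return np.outer(p, p)                # 4x4 symmetric matrix

def vertex_error(Q, v):
    """Sum of squared distances from v to all planes accumulated in Q."""
    vh = np.append(v, 1.0)               # homogeneous coordinates
    return float(vh @ Q @ vh)

# Two triangles lying in the plane z = 0 that share the vertex (0, 0, 0).
tri_a = [np.array(p, dtype=float) for p in [(0, 0, 0), (1, 0, 0), (0, 1, 0)]]
tri_b = [np.array(p, dtype=float) for p in [(0, 0, 0), (0, 1, 0), (-1, 0, 0)]]
Q = plane_quadric(*tri_a) + plane_quadric(*tri_b)

err_on_plane = vertex_error(Q, np.array([0.3, 0.2, 0.0]))    # on both planes
err_off_plane = vertex_error(Q, np.array([0.3, 0.2, 0.5]))   # 0.5 above both planes
```

A simplifier would collapse the edge whose merged vertex has the smallest such error, so flat regions lose faces first while sharp features are preserved.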
Spatiotemporal Key Frame Sampling
• Input color stream
▫ Contains a lot of redundant data; color images suffer from motion blur
• Temporal sampling
▫ Sample the less blurry frames as key frames, based on the blurriness metric of [Crété-Roffet 2007]
• Spatial sampling
▫ Uniqueness: the part of an image that is not covered by any other image
▫ Sample by eliminating the images with minimum uniqueness
Temporal sampling with blurriness | Overlapping (red) and unique region (blue)
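The two sampling stages can be sketched as below, under stated assumptions. The variance of a Laplacian is a stand-in sharpness score, not the blurriness metric of [Crété-Roffet 2007]; coverage is modelled as a set of visible mesh face ids per frame; the function names and the `min_unique` stopping rule are hypothetical.

```python
import numpy as np

def sharpness(gray):
    """Stand-in score: variance of a 4-neighbor Laplacian (higher = sharper)."""
    lap = (-4.0 * gray[1:-1, 1:-1] + gray[:-2, 1:-1] + gray[2:, 1:-1]
           + gray[1:-1, :-2] + gray[1:-1, 2:])
    return float(lap.var())

def temporal_sampling(frames, window):
    """Keep the index of the sharpest frame in each window of consecutive frames."""
    keep = []
    for start in range(0, len(frames), window):
        scores = [sharpness(f) for f in frames[start:start + window]]
        keep.append(start + int(np.argmax(scores)))
    return keep

def spatial_sampling(visible, min_unique):
    """Greedily drop the frame whose uniquely covered face set is smallest.

    visible: dict frame_id -> set of mesh face ids seen by that frame.
    Stops once every remaining frame uniquely covers at least min_unique faces.
    """
    visible = dict(visible)
    while len(visible) > 1:
        unique = {}
        for k, faces in visible.items():
            others = set().union(*(v for j, v in visible.items() if j != k))
            unique[k] = len(faces - others)
        worst = min(unique, key=unique.get)
        if unique[worst] >= min_unique:
            break
        del visible[worst]
    return sorted(visible)
```

In this toy model a frame that sees only faces already covered by other frames has uniqueness 0 and is eliminated first.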
Texture Map Generation
• UV-unwrap the mesh to obtain a global texture map
▫ Get global texture coordinates for every vertex
• Estimate color by blending key frames
▫ Generate a sub-texture map by projecting the mesh onto each key frame's camera
▫ The blended sub-textures become the global texture
[Figure labels: UV unwrapping; mesh projection; sub-textures; weighted blending; global texture map]
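The weighted blending step can be sketched as a per-texel weighted average. The slides do not say how the weights are chosen, so `weights` is left as an input here, and the function name is hypothetical.

```python
import numpy as np

def blend_subtextures(samples, weights):
    """Weighted average of per-key-frame color samples for each texel.

    samples: (K, N, 3) colors of N texels sampled from K key frames.
    weights: (K, N) non-negative blending weights; 0 marks "not visible".
    Returns (N, 3) blended colors; texels seen by no key frame stay black.
    """
    w = weights[..., None]                       # (K, N, 1)
    total = w.sum(axis=0)                        # (N, 1)
    blended = (w * samples).sum(axis=0)          # (N, 3)
    return np.divide(blended, total, out=np.zeros_like(blended), where=total > 0)

# Texel 0 is seen by both key frames, texel 1 by neither.
samples = np.array([[[1.0, 0.0, 0.0], [0.0, 0.0, 0.0]],
                    [[0.0, 0.0, 1.0], [0.0, 0.0, 0.0]]])
weights = np.array([[3.0, 0.0],
                    [1.0, 0.0]])
global_texels = blend_subtextures(samples, weights)
```

When the sub-textures are misaligned, this average mixes colors from different surface points, which is the blurring and ghosting the optimization stage is meant to remove.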
Global Texture Map Optimization
• Generated texture map also suffers from blurring, ghosting, etc.
▫ Inconsistent color blending from different sub-textures
• Optimize sub-texture coordinates to be consistent
→ Sharper & cleaner global texture map
Inconsistent blending | Consistent blending
Global Texture Map Optimization
• Search for new sub-texture coordinates for each vertex
• Energy formulation for photometric consistency
▫ For every face, the blended global texture should be consistent with its sub-textures
▫ Consistency is measured at points sampled on each face
• Non-linear least-squares problem
▫ Can be solved with the Gauss-Newton method
[Equation annotations: sub-texture coordinates (variables); sub-texture (intensity); blended global texture (intensity); sub-textures of face f]
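The equation itself did not survive extraction. One form consistent with the bullets and annotations above is given here as a sketch; the notation is an assumption and may differ from the paper's.

```latex
E(\mathbf{u}) \;=\; \sum_{f} \; \sum_{p \in S_f} \; \sum_{i \in K_f}
  \bigl( C(p) - I_i\bigl(\mathbf{u}_i(p)\bigr) \bigr)^2
```

Here $S_f$ is the set of points sampled on face $f$, $K_f$ the set of sub-textures covering $f$, $\mathbf{u}_i(p)$ the sub-texture coordinate of $p$ in sub-texture $i$ (interpolated from the per-vertex sub-texture coordinates, which are the variables), $I_i(\cdot)$ the sub-texture intensity, and $C(p)$ the blended global texture intensity at $p$. Each term is a squared residual, which is what makes this a non-linear least-squares problem.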
GPU-based Alternating Solver
• Applying the Gauss-Newton method naïvely is non-trivial
▫ Infeasible to solve directly due to the number of variables
• Exploit locality of the problem to parallelize the optimization
▫ With the 1-ring neighborhood of v fixed, the optimization of the sub-texture coordinates uv of v is independent of all other vertices
▫ Schwarz Alternating Method
▪ Update inner variables while keeping boundary variables fixed
▪ Independent optimizations are propagated iteratively
[Figure: 1-ring neighborhood of vertex v (neighbors v1, v2, v3, v4) | Propagation of optimization]
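The locality idea can be illustrated on a toy problem. In this simplified stand-in, each vertex carries a single scalar unknown instead of sub-texture coordinates, the residual function `f` is an arbitrary mild non-linearity chosen for illustration, and every vertex takes one 1-D Gauss-Newton step per iteration while its neighbors are read from the previous iterate. None of this is the authors' GPU solver.

```python
import numpy as np

def f(x):
    """Toy non-linear model standing in for texture sampling (assumption)."""
    return x + 0.1 * x**3

def df(x):
    return 1.0 + 0.3 * x**2

def energy(x, t, edges, lam):
    data = np.sum((f(x) - t) ** 2)
    smooth = sum((x[u] - x[v]) ** 2 for u, v in edges)
    return float(data + lam * smooth)

def parallel_local_gauss_newton(t, edges, lam=1.0, iters=200):
    n = len(t)
    nbrs = [[] for _ in range(n)]
    for u, v in edges:
        nbrs[u].append(v)
        nbrs[v].append(u)
    x = np.zeros(n)
    for _ in range(iters):
        x_old = x.copy()                   # neighbors stay fixed during the sweep
        for v in range(n):                 # independent per vertex: one thread each
            r = f(x_old[v]) - t[v]         # data residual
            j = df(x_old[v])               # its derivative
            jt_r = j * r + lam * sum(x_old[v] - x_old[u] for u in nbrs[v])
            jt_j = j * j + lam * len(nbrs[v])
            x[v] = x_old[v] - jt_r / jt_j  # 1-D Gauss-Newton step
    return x

t = np.array([1.0, -0.5, 0.3, 0.8, -1.0])
edges = [(0, 1), (1, 2), (2, 3), (3, 4)]
x_opt = parallel_local_gauss_newton(t, edges)
```

Because each update only reads the fixed 1-ring values, the inner loop has no data dependencies between vertices, and local improvements spread to the rest of the mesh over successive iterations.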
Experimental Results
• Tested on various 3D reconstructed models
• Intel i7 4.0GHz, 16GB RAM, NVIDIA Titan X
* Models from Zhou et al.
Experimental Results
Volumetric blending | Our method (10k faces; optimization takes 2.6s for 100 iterations)
Volumetric blending | Color map optimization [Zhou 2014] | Our result (before opt.) | Our result (after opt.)
Method | Faces # | File size | Optimization
Color map optimization [Zhou 2014] | 15,853,238 | 713MB | 5 mins
Our method | 10,000 | 217KB + few MB | 2.6 secs
Experimental Results
• A 10K-face mesh takes less than 3s (the original mesh consists of 1M faces)
20K faces | 10K faces | 5K faces
Experimental Results
130k faces: optimization takes 16s
130k faces: optimization takes 18s
195k faces: optimization takes 31s
Summary
• Texture map generation for color reconstruction of 3D indoor scenes
▫ Texture map generation maximizing the photometric consistency of mapping
▫ Spatiotemporal sampling for faster processing and sharper texture maps
▫ Efficient optimization based on a parallel Gauss-Newton solver on GPU
→ Directly applicable to computer graphics
Semantic Reconstruction:
Reconstruction of Semantically Segmented
3D Meshes via Volumetric Semantic Fusion
Junho Jeon, Jinwoong Jung, Jungeon Kim,
Seungyong Lee
Computer Graphics Forum (Pacific Graphics 2018)
Reconstruction of Semantic Information
• Virtual/augmented reality → Interaction with 3D scenes
• Single connected 3D model → not suitable
• Requires individually segmented object models
→ Semantic segmentation on 3D reconstructed scene
Interaction with 3D scene | Single connected 3D model
[Figure labels: Sofa, Floor, Shelves, Wall]
Semantic Segmentation on 2D Image
• Pixel-wise annotation of semantic object class
• Well-established network architectures and datasets
▫ PASCAL, MS COCO, Mapillary, Places, …
• Have shown successful performance
Places dataset | Mapillary dataset
3D Semantic Segmentation
• Point-wise (vertex-wise) annotation on 3D scene model
• Deep learning on 3D data is not straightforward
▫ Unstructured point cloud, mesh with complex topology
• Lack of annotated 3D reconstructed model datasets
▫ Recently, an annotated dataset was released (ScanNet)
[Figure: Reconstructed 3D model | Per-vertex annotation (labels: floor, bed, wall, chair, picture)]
Related Work – 3D CNN-based Methods
• Represent input geometry as a uniform voxel grid
▫ Binary occupancy grid or distance field
• Direct feature extraction and classification w/ 3D CNN
• Higher memory consumption → only low-resolution segmentations
Fully convolutional 3D CNN architecture
Images courtesy of [Qi 2017]
Voxel-based semantic segmentation [Dai 2017]
Related Work – Point-based Methods
• Unstructured point cloud to ordered sequence vector
▫ Point set grouping, slice pooling, max pooling
• Feature extraction and classification w/ MLP or RNN
• Misses geometric detail (may miss small object classes)
RSNet [Huang 2018] | PointNet++ [Qi 2017]
Our Approach: Semantic Reconstruction
• 3D (geometry) reconstruction using fusion of multiple geometric measurements (depth images)
• 3D semantic reconstruction using fusion of multiple 2D semantic predictions
Multiple depth images | Dense surface reconstruction
Multiple semantic predictions | 3D semantic reconstruction
Volumetric Fusion of Semantic Information
• Review: Volumetric Fusion of 3D Geometry
• Geometry representation using a uniform voxel grid
▫ Each voxel stores TSDF value (geometry information)
• Merge noisy measurements into a single voxel grid using estimated camera poses
▫ Volumetric denoising of the reconstructed geometry (TSDF values)
Images courtesy of Newcombe’s slides
Uniform voxel grid | Multi-frame geometric fusion
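The per-voxel merge is commonly implemented as a weighted running average of TSDF values. The sketch below assumes the new frame has already been projected into the voxel grid; the function name and the `max_weight` cap are illustrative assumptions.

```python
import numpy as np

def integrate_tsdf(tsdf, weight, new_tsdf, new_weight, max_weight=64.0):
    """Fold one frame's truncated signed distances into the voxel grid.

    tsdf, weight: current per-voxel value and accumulated weight.
    new_tsdf, new_weight: this frame's measurement; weight 0 = voxel not observed.
    """
    observed = new_weight > 0
    total = weight + new_weight
    averaged = (weight * tsdf + new_weight * new_tsdf) / np.maximum(total, 1e-12)
    fused = np.where(observed, averaged, tsdf)   # unobserved voxels keep their value
    return fused, np.minimum(total, max_weight)

# One voxel observed four times with noisy distances around 0.2.
tsdf, weight = np.zeros(1), np.zeros(1)
for d in [0.3, 0.1, 0.25, 0.15]:
    tsdf, weight = integrate_tsdf(tsdf, weight, np.array([d]), np.array([1.0]))
```

Averaging over frames is what denoises the geometry: the individual measurements scatter, but the fused value settles near their mean.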
Volumetric Fusion of Semantic Information
• Each voxel stores a semantic probability distribution (20 classes)
▫ Volumetric fusion of multi-frame semantic predictions
• Seamless integration into the 3D (geometry) reconstruction process
[Figure: RGB-D stream → CNN-based 2D semantic segmentation → stream of semantic predictions → volumetric semantic fusion]
Semantic Reconstruction
• CNN-based 2D Semantic Segmentation
• Volumetric semantic fusion with adaptive weight
• CRF-based 3D semantic label regularization
[Figure: Overall framework. RGB-D stream → (1) CNN-based 2D semantic segmentation → (2) volumetric semantic fusion, alongside dense surface reconstruction → per-vertex semantic class confidence → (3) CRF-based label regularization → semantically segmented mesh & projected 2D segmentation]
CNN-based 2D Semantic Segmentation
• RGB-D semantic segmentation → RDFNet [Park 2017]
• Stream for 3D reconstruction differs from still images
▫ Captured close to objects, may suffer from motion blur
Images from ScanNet dataset (reconstruction)
Images from NYU-D dataset (still image)
• Fine-tuning on the ScanNet dataset [Dai et al. 2017]
▫ Drastically improves segmentation quality
Input | Original RDFNet [Park 2017] | Fine-tuned RDFNet
Adaptive Volumetric Semantic Fusion
• 2D predictions & camera poses may contain errors
▫ Use a weighted average of class probabilities for each voxel
[Figure: Volumetric semantic fusion]
• Depth-based reliability weight
▫ Network accuracy depends on pixel depth (i.e., the relative scale of objects)
▫ Pixels close to the camera contribute less to the result
• Foreground boundary weight
▫ Predictions are unreliable around misaligned object boundaries
▫ Prohibit wall/floor labels for foreground objects (pixels)
[Figure: Input color | Input depth | Depth weights | Foreground weights; annotations: wall (background), object (foreground), unreliable predictions]
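A minimal sketch of adaptive fusion under stated assumptions. The slides give no formulas for the weights, so the linear ramp in `depth_weight`, its `near`/`far` thresholds, the class indices, and all function names are hypothetical.

```python
import numpy as np

WALL, FLOOR = 0, 1          # hypothetical class indices

def depth_weight(depth_m, near=0.5, far=2.0):
    """Hypothetical ramp: weight 0 at `near` metres or closer, 1 at `far` or beyond."""
    return np.clip((depth_m - near) / (far - near), 0.0, 1.0)

def mask_background(probs, is_foreground):
    """Zero wall/floor probabilities on foreground pixels, then renormalize."""
    masked = probs.copy()
    masked[is_foreground, WALL] = 0.0
    masked[is_foreground, FLOOR] = 0.0
    s = masked.sum(axis=1, keepdims=True)
    return np.divide(masked, s, out=np.zeros_like(masked), where=s > 0)

def fuse(voxel_probs, voxel_weight, probs, w):
    """Weighted running average of class probabilities per voxel.

    voxel_probs: (N, C) fused distribution, voxel_weight: (N,) accumulated weight.
    probs: (N, C) prediction hitting each voxel this frame, w: (N,) its weight.
    """
    total = voxel_weight + w
    numer = voxel_weight[:, None] * voxel_probs + w[:, None] * probs
    fused = np.divide(numer, total[:, None], out=voxel_probs.copy(),
                      where=total[:, None] > 0)
    return fused, total
```

With this ramp, a prediction observed at 2.0 m enters the average with weight 1 and one observed at 1.25 m with weight 0.5, so close-up views influence the fused distribution less.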
Reconstruction of Semantically Labeled 3D Mesh
• Marching cubes to extract a reconstructed 3D mesh from the volumetric representation
▫ Bilinear interpolation of fused probabilities at voxels
▫ Each vertex has 20 object-class probabilities
• Select maximum probability class for each vertex to obtain an initial segmentation
Initial segmentation result
Probability visualization for major classes: Color | Floor | Wall | Others | Bed | Chair
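The step from fused voxel probabilities to an initial per-vertex label can be sketched as follows. Interpolating linearly along a single voxel edge at the TSDF zero crossing is a simplification chosen for illustration; the function names are hypothetical.

```python
import numpy as np

def vertex_probs(p_a, p_b, tsdf_a, tsdf_b):
    """Interpolate class probabilities at the zero crossing between two voxels.

    tsdf_a and tsdf_b must have opposite signs (the surface crosses the edge).
    """
    s = tsdf_a / (tsdf_a - tsdf_b)       # crossing position in [0, 1] from a to b
    return (1.0 - s) * p_a + s * p_b

def initial_labels(vertex_probabilities):
    """Pick the maximum-probability class for every vertex."""
    return np.argmax(vertex_probabilities, axis=1)

# One mesh vertex on the edge between voxel a (outside) and voxel b (inside).
p = vertex_probs(np.array([0.8, 0.2]), np.array([0.0, 1.0]), tsdf_a=0.1, tsdf_b=-0.3)
labels = initial_labels(p[None, :])
```

The vertex sits a quarter of the way from voxel a to voxel b, so its distribution is weighted three to one in favor of voxel a.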
CRF-based Label Regularization
• Integrated, but noisy 3D segmentation results
▫ Each 2D segmentation considers only its own limited FOV
• CRF optimization to determine final class labels
• Consider the global context of the reconstructed scene w/ geometry (surface normals), appearance (colors), and semantic similarity based on the confusion matrix of the CNN
[Figure labels: Input, Naïve integration, Adaptive integration, No CRF, Final result]
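To make the regularization idea concrete, here is a small stand-in: a Potts-style pairwise model on mesh edges, optimized with iterated conditional modes (ICM). The slides do not specify the potentials or the inference method, so the affinity formula, its `sigma` parameters, and the use of ICM are all assumptions; the confusion-matrix term is omitted.

```python
import numpy as np

def edge_weight(n_u, n_v, c_u, c_v, sigma_n=0.3, sigma_c=0.2):
    """Hypothetical affinity: large when unit normals and colors agree."""
    dn = 1.0 - float(np.dot(n_u, n_v))
    dc = float(np.linalg.norm(c_u - c_v))
    return float(np.exp(-(dn / sigma_n) ** 2 - (dc / sigma_c) ** 2))

def icm(probs, edges, weights, iters=10):
    """Minimize unary (-log prob) plus weighted label disagreement over edges."""
    num_classes = probs.shape[1]
    unary = -np.log(np.clip(probs, 1e-9, 1.0))
    labels = probs.argmax(axis=1)
    nbrs = [[] for _ in range(len(probs))]
    for (u, v), w in zip(edges, weights):
        nbrs[u].append((v, w))
        nbrs[v].append((u, w))
    for _ in range(iters):
        for v in range(len(probs)):
            cost = unary[v].copy()
            for u, w in nbrs[v]:
                cost += w * (np.arange(num_classes) != labels[u])  # Potts penalty
            labels[v] = int(cost.argmin())
    return labels

# A weakly mislabeled vertex between two confident neighbors gets corrected.
probs = np.array([[0.9, 0.1], [0.45, 0.55], [0.9, 0.1]])
smoothed = icm(probs, edges=[(0, 1), (1, 2)], weights=[1.0, 1.0])
```

The middle vertex starts with label 1 from its own prediction, but the cost of disagreeing with both neighbors outweighs its small unary preference.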
Experimental Setting
• 2D semantic segmentation: RDFNet for RGB-D stream (Caffe) [Park 2017]
• Camera pose estimation & 3D volumetric fusion: BundleFusion [Dai 2017]
• NVIDIA GeForce Titan X with 12GB VRAM
Experimental Results
Reconstructed scene (color) | Segmented result
Segmentation Result of Large-scale Scenes
• Incremental integration enables semantic reconstruction of large-scale 3D scenes
Reconstructed scene | Segmented result
Visual Comparisons
Input | ScanNet [Dai 2017] | Our results
Quantitative Evaluation
• Global voxel classification accuracy with majority voting
▫ Improves on the previous method by a large margin (+6.86%)
▫ Tested on ScanNet dataset (312 test scenes)
Configuration | Accuracy
Voxel-based labeling [Dai 2017] | 73.0%
Naïve integration without CRF | 79.02%
Adaptive integration without CRF | 79.28%
Naïve integration with CRF | 79.79%
Adaptive integration with CRF | 79.86%
• Adaptive integration & CRF appear to have little effect on this metric
▫ They mainly handle object boundaries, which are visually critical but cover only a small portion of the data
Quantitative Evaluation
• ScanNet dataset is highly unbalanced
▫ Most vertices (61.7%) are wall & floor → imbalanced classes
▫ Evaluate with class-mean intersection-over-union (mIOU) & class-mean accuracy (mAcc)
• Outperforms previous SOTA methods (+2.84% mIOU / +15.3% mAcc)
Configuration | mIOU | mAcc
PointNet [Qi 2017] | 14.69% | 19.90%
PointNet++ [Qi 2017] | 34.26% | 43.77%
RSNet [Huang 2018] | 39.35% | 48.37%
RSNet w/ RGB [Huang 2018] | 41.16% | 50.34%
Ours | 44.00% | 65.64%
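Why class-mean metrics matter on imbalanced data can be seen in a small sketch. The helper names are hypothetical, and the two-class example is made up for illustration; it is not ScanNet data.

```python
import numpy as np

def confusion_matrix(gt, pred, num_classes):
    """Rows are ground-truth classes, columns are predicted classes."""
    idx = gt * num_classes + pred
    return np.bincount(idx, minlength=num_classes**2).reshape(num_classes, num_classes)

def metrics(cm):
    """Return (global accuracy, class-mean accuracy, class-mean IoU)."""
    tp = np.diag(cm).astype(float)
    gt_count = cm.sum(axis=1)
    pred_count = cm.sum(axis=0)
    union = gt_count + pred_count - tp
    present = gt_count > 0                     # ignore classes absent from the ground truth
    global_acc = tp.sum() / cm.sum()
    mean_acc = np.mean(tp[present] / gt_count[present])
    mean_iou = np.mean(tp[present] / union[present])
    return global_acc, mean_acc, mean_iou

# 8 vertices of a dominant class (all correct), 2 of a rare class (1 correct).
gt = np.array([0] * 8 + [1] * 2)
pred = np.array([0] * 8 + [0, 1])
global_acc, mean_acc, mean_iou = metrics(confusion_matrix(gt, pred, 2))
```

Global accuracy is 0.9 here even though half of the rare class is wrong; class-mean accuracy (0.75) and class-mean IoU (about 0.69) expose that failure.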
2D Projection of 3D Segmentation
• Fusion & regularization improve semantic segmentation results
• We can render 2D semantic maps from the segmented 3D model
• Original 2D segmentation vs. rendered 2D results
▫ Tested on ScanNet dataset (53K frames from 312 test scenes)
Method | Pixel Acc. | Mean Acc. | Mean IoU
Original RDFNet | 60.44 | 47.32 | 29.34
Finetuned RDFNet (2D) | 73.55 | 59.82 | 45.60
Our result (rendered 2D) | 77.18 | 63.20 | 50.69
Quantitative comparison
Input image | Results of CNN | Our result
3D Scene Completion and Manipulation
• Class-wise (semantic) 3D scene manipulation
• Scene completion
• Object modification
Input scene | Semantic mesh | Floor filling | Object removal
Summary
• Volumetric semantic fusion integrating 2D semantic predictions
→ exploit success of 2D CNN & data
• Adaptive integration based on depth and scene structure
→ compensate uncertainty of network prediction
• CRF-based label regularization using the geometric and photometric information
→ refine final result
• Limitation
▫ 2D semantic segmentation requires heavy computation
▫ Multiple GPUs are needed to achieve real-time performance
Summary and Future Work
Color and 3D Semantic Reconstruction
of Indoor Scenes from RGB-D Streams
Summary
• 3D Reconstruction of auxiliary information
▫ Beyond the geometric reconstruction of the indoor scene
▫ Useful for a rich user experience in VR/AR applications
Summary
• 3D Color and Semantic Reconstruction of Indoor Scenes from RGB-D Streams
✓ Efficient and accurate color representation
▪ Texture map generation using spatiotemporal key frame sampling and texture coordinate optimization
▪ Optimizing texture map considering geometric and photometric consistency together
✓ Per-vertex dense semantic class information
▪ 3D semantic segmentation of reconstructed scenes via volumetric semantic fusion
▪ 3D instance segmentation of reconstructed scenes for individual object meshes
Texture map generation | Semantic reconstruction
Thank you!
Q & A
Supplementary Slides
Color and 3D Semantic Reconstruction
of Indoor Scenes from RGB-D Streams

More Related Content

What's hot

Depth estimation using deep learning
Depth estimation using deep learningDepth estimation using deep learning
Depth estimation using deep learning
University of Oklahoma
 
Learning to Perceive the 3D World
Learning to Perceive the 3D WorldLearning to Perceive the 3D World
Learning to Perceive the 3D World
NAVER Engineering
 
[OSGeo-KR Tech Workshop] Deep Learning for Single Image Super-Resolution
[OSGeo-KR Tech Workshop] Deep Learning for Single Image Super-Resolution[OSGeo-KR Tech Workshop] Deep Learning for Single Image Super-Resolution
[OSGeo-KR Tech Workshop] Deep Learning for Single Image Super-Resolution
Taegyun Jeon
 
DeconvNet, DecoupledNet, TransferNet in Image Segmentation
DeconvNet, DecoupledNet, TransferNet in Image SegmentationDeconvNet, DecoupledNet, TransferNet in Image Segmentation
DeconvNet, DecoupledNet, TransferNet in Image Segmentation
NamHyuk Ahn
 
Super resolution in deep learning era - Jaejun Yoo
Super resolution in deep learning era - Jaejun YooSuper resolution in deep learning era - Jaejun Yoo
Super resolution in deep learning era - Jaejun Yoo
JaeJun Yoo
 
Single Image Depth Estimation using frequency domain analysis and Deep learning
Single Image Depth Estimation using frequency domain analysis and Deep learningSingle Image Depth Estimation using frequency domain analysis and Deep learning
Single Image Depth Estimation using frequency domain analysis and Deep learning
Ahan M R
 
Intelligent Image Enhancement and Restoration - From Prior Driven Model to Ad...
Intelligent Image Enhancement and Restoration - From Prior Driven Model to Ad...Intelligent Image Enhancement and Restoration - From Prior Driven Model to Ad...
Intelligent Image Enhancement and Restoration - From Prior Driven Model to Ad...
Wanjin Yu
 
AlexNet(ImageNet Classification with Deep Convolutional Neural Networks)
AlexNet(ImageNet Classification with Deep Convolutional Neural Networks)AlexNet(ImageNet Classification with Deep Convolutional Neural Networks)
AlexNet(ImageNet Classification with Deep Convolutional Neural Networks)
Fellowship at Vodafone FutureLab
 
Attentive semantic alignment with offset aware correlation kernels
Attentive semantic alignment with offset aware correlation kernelsAttentive semantic alignment with offset aware correlation kernels
Attentive semantic alignment with offset aware correlation kernels
NAVER Engineering
 
CNN and its applications by ketaki
CNN and its applications by ketakiCNN and its applications by ketaki
CNN and its applications by ketaki
Ketaki Patwari
 
Convolutional Neural Network Models - Deep Learning
Convolutional Neural Network Models - Deep LearningConvolutional Neural Network Models - Deep Learning
Convolutional Neural Network Models - Deep Learning
Mohamed Loey
 
Convolutional Neural Networks : Popular Architectures
Convolutional Neural Networks : Popular ArchitecturesConvolutional Neural Networks : Popular Architectures
Convolutional Neural Networks : Popular Architectures
ananth
 
Deep Learning for Computer Vision: Image Retrieval (UPC 2016)
Deep Learning for Computer Vision: Image Retrieval (UPC 2016)Deep Learning for Computer Vision: Image Retrieval (UPC 2016)
Deep Learning for Computer Vision: Image Retrieval (UPC 2016)
Universitat Politècnica de Catalunya
 
Understanding Convolutional Neural Networks
Understanding Convolutional Neural NetworksUnderstanding Convolutional Neural Networks
Understanding Convolutional Neural Networks
Jeremy Nixon
 
SeRanet introduction
SeRanet introductionSeRanet introduction
SeRanet introduction
Kosuke Nakago
 
Modern Convolutional Neural Network techniques for image segmentation
Modern Convolutional Neural Network techniques for image segmentationModern Convolutional Neural Network techniques for image segmentation
Modern Convolutional Neural Network techniques for image segmentation
Gioele Ciaparrone
 
Content-Based Image Retrieval (D2L6 Insight@DCU Machine Learning Workshop 2017)
Content-Based Image Retrieval (D2L6 Insight@DCU Machine Learning Workshop 2017)Content-Based Image Retrieval (D2L6 Insight@DCU Machine Learning Workshop 2017)
Content-Based Image Retrieval (D2L6 Insight@DCU Machine Learning Workshop 2017)
Universitat Politècnica de Catalunya
 
Semantic Segmentation on Satellite Imagery
Semantic Segmentation on Satellite ImagerySemantic Segmentation on Satellite Imagery
Semantic Segmentation on Satellite Imagery
RAHUL BHOJWANI
 
Enhanced Deep Residual Networks for Single Image Super-Resolution
Enhanced Deep Residual Networks for Single Image Super-ResolutionEnhanced Deep Residual Networks for Single Image Super-Resolution
Enhanced Deep Residual Networks for Single Image Super-Resolution
NAVER Engineering
 
Background Subtraction Based on Phase and Distance Transform Under Sudden Ill...
Background Subtraction Based on Phase and Distance Transform Under Sudden Ill...Background Subtraction Based on Phase and Distance Transform Under Sudden Ill...
Background Subtraction Based on Phase and Distance Transform Under Sudden Ill...
Shanghai Jiao Tong University(上海交通大学)
 

What's hot (20)

Depth estimation using deep learning
Depth estimation using deep learningDepth estimation using deep learning
Depth estimation using deep learning
 
Learning to Perceive the 3D World
Learning to Perceive the 3D WorldLearning to Perceive the 3D World
Learning to Perceive the 3D World
 
[OSGeo-KR Tech Workshop] Deep Learning for Single Image Super-Resolution
[OSGeo-KR Tech Workshop] Deep Learning for Single Image Super-Resolution[OSGeo-KR Tech Workshop] Deep Learning for Single Image Super-Resolution
[OSGeo-KR Tech Workshop] Deep Learning for Single Image Super-Resolution
 
DeconvNet, DecoupledNet, TransferNet in Image Segmentation
DeconvNet, DecoupledNet, TransferNet in Image SegmentationDeconvNet, DecoupledNet, TransferNet in Image Segmentation
DeconvNet, DecoupledNet, TransferNet in Image Segmentation
 
Super resolution in deep learning era - Jaejun Yoo
Super resolution in deep learning era - Jaejun YooSuper resolution in deep learning era - Jaejun Yoo
Super resolution in deep learning era - Jaejun Yoo
 
Single Image Depth Estimation using frequency domain analysis and Deep learning
Single Image Depth Estimation using frequency domain analysis and Deep learningSingle Image Depth Estimation using frequency domain analysis and Deep learning
Single Image Depth Estimation using frequency domain analysis and Deep learning
 
Intelligent Image Enhancement and Restoration - From Prior Driven Model to Ad...
Intelligent Image Enhancement and Restoration - From Prior Driven Model to Ad...Intelligent Image Enhancement and Restoration - From Prior Driven Model to Ad...
Intelligent Image Enhancement and Restoration - From Prior Driven Model to Ad...
 
AlexNet(ImageNet Classification with Deep Convolutional Neural Networks)
AlexNet(ImageNet Classification with Deep Convolutional Neural Networks)AlexNet(ImageNet Classification with Deep Convolutional Neural Networks)
AlexNet(ImageNet Classification with Deep Convolutional Neural Networks)
 
Attentive semantic alignment with offset aware correlation kernels
Attentive semantic alignment with offset aware correlation kernelsAttentive semantic alignment with offset aware correlation kernels
Attentive semantic alignment with offset aware correlation kernels
 
CNN and its applications by ketaki
CNN and its applications by ketakiCNN and its applications by ketaki
CNN and its applications by ketaki
 
Convolutional Neural Network Models - Deep Learning
Convolutional Neural Network Models - Deep LearningConvolutional Neural Network Models - Deep Learning
Convolutional Neural Network Models - Deep Learning
 
Convolutional Neural Networks : Popular Architectures
Convolutional Neural Networks : Popular ArchitecturesConvolutional Neural Networks : Popular Architectures
Convolutional Neural Networks : Popular Architectures
 
Deep Learning for Computer Vision: Image Retrieval (UPC 2016)
Deep Learning for Computer Vision: Image Retrieval (UPC 2016)Deep Learning for Computer Vision: Image Retrieval (UPC 2016)
Deep Learning for Computer Vision: Image Retrieval (UPC 2016)
 
Understanding Convolutional Neural Networks
Understanding Convolutional Neural NetworksUnderstanding Convolutional Neural Networks
Understanding Convolutional Neural Networks
 
SeRanet introduction
SeRanet introductionSeRanet introduction
SeRanet introduction
 
Modern Convolutional Neural Network techniques for image segmentation
Modern Convolutional Neural Network techniques for image segmentationModern Convolutional Neural Network techniques for image segmentation
Modern Convolutional Neural Network techniques for image segmentation
 
Content-Based Image Retrieval (D2L6 Insight@DCU Machine Learning Workshop 2017)
Content-Based Image Retrieval (D2L6 Insight@DCU Machine Learning Workshop 2017)Content-Based Image Retrieval (D2L6 Insight@DCU Machine Learning Workshop 2017)
Content-Based Image Retrieval (D2L6 Insight@DCU Machine Learning Workshop 2017)
 
Semantic Segmentation on Satellite Imagery
Semantic Segmentation on Satellite ImagerySemantic Segmentation on Satellite Imagery
Semantic Segmentation on Satellite Imagery
 
Enhanced Deep Residual Networks for Single Image Super-Resolution
Enhanced Deep Residual Networks for Single Image Super-ResolutionEnhanced Deep Residual Networks for Single Image Super-Resolution
Enhanced Deep Residual Networks for Single Image Super-Resolution
 
Background Subtraction Based on Phase and Distance Transform Under Sudden Ill...
Background Subtraction Based on Phase and Distance Transform Under Sudden Ill...Background Subtraction Based on Phase and Distance Transform Under Sudden Ill...
Background Subtraction Based on Phase and Distance Transform Under Sudden Ill...
 

Similar to Color and 3D Semantic Reconstruction of Indoor Scenes from RGB-D stream

NeRF: Representing Scenes as Neural Radiance Fields for View Synthesis
NeRF: Representing Scenes as Neural Radiance Fields for View Synthesis NeRF: Representing Scenes as Neural Radiance Fields for View Synthesis
NeRF: Representing Scenes as Neural Radiance Fields for View Synthesis
taeseon ryu
 
TransNeRF
TransNeRFTransNeRF
TransNeRF
NavneetPaul2
 
Color and 3D Semantic Reconstruction of Indoor Scenes from RGB-D stream

  • 1. Color and 3D Semantic Reconstruction of Indoor Scenes from RGB-D Streams 전준호 CG Lab. POSTECH Tech Talk @ NAVER 2018.12.10
  • 2. 3D Reconstruction • Capture shape and appearance of real objects and environments • Produce 3D models for applications such as virtual/augmented reality and 3D printing
  • 3. 3D Reconstruction using RGB-D Sensor • Geometric reconstruction has developed rapidly and is now available for large-scale scenes ▫ But existing methods mainly focus on acquiring accurate geometry • Examples: KinectFusion [Newcombe 2011], Voxel hashing [Nießner 2013], Elastic fragments [Zhou 2013], Robust reconstruction [Choi 2015]
  • 4. Auxiliary Information of 3D Indoor Scene • Surface color • Object class • Lighting condition • Sound → Rich UX → Color and Semantic Reconstruction
  • 5. Contributions – Color Reconstruction • Texture Map Generation for 3D Reconstructed Scenes ▫ Reconstruct clean and sharp surface color of the 3D reconstructed scene ▫ Light-weight color representation for reconstructed scenes ▫ Texture coordinate optimization to acquire a sharp texture map [Figure: texture map generation for 3D reconstruction]
  • 6. Contributions – Semantic Reconstruction • Reconstruction of semantically segmented 3D meshes ▫ Predict the per-vertex object class of the 3D reconstructed scene ▫ Volumetric semantic fusion of frame-by-frame semantic predictions ▫ Adaptive integration and CRF optimization for robust labeling [Figure: 3D semantic reconstruction]
  • 7. Texture Map Generation for 3D Reconstructed Scenes Junho Jeon, Yeongyu Jung, Haejoon Kim, Seungyong Lee The Visual Computer (CGI 2016)
  • 8. 3D Reconstruction using RGB-D Sensor • Available for very large-scale scenes ▫ But with no color information, or only inaccurate color • Examples: Robust reconstruction [Choi 2015], BundleFusion [Dai 2017]
  • 9. Color Reconstruction • Naïve color blending introduces blurring, ghosting, etc., caused by: ▫ Incorrect camera poses ▫ Lens distortions ▫ Misaligned RGB-D images • Goal: precisely reconstruct the color from the RGB-D stream [Figure: blurry color from volumetric blending]
  • 10. Previous work: Color Map Optimization • Zhou and Koltun, TOG 2014 ▫ Project the RGB stream onto the mesh to get vertex colors ▫ Optimize camera poses and image warping functions → clean vertex color ▫ Limitation: the method is based on vertex color → time-consuming optimization (takes 5 mins) → inefficient rendering [Figures: result; image warping function. Images from Zhou's slides]
  • 11. Our Approach • Color reconstruction based on texture mapping ▫ Generate a texture map for a simplified mesh ▫ Optimize the texture map to maximize photometric consistency ▫ GPU-based parallel solver → 100x faster color reconstruction → efficient rendering [Figure: our method]
  • 12. Overall Framework 1) Preprocessing 2) Spatiotemporal key frame sampling 3) Texture map generation 4) Texture map optimization [Figure: pipeline diagram with RGB-D stream (color, depth), simplified 3D reconstructed mesh, spatio-temporally sampled key frames, sub-textures, global texture, refined global texture map, rendering result]
  • 13. Preprocessing [Figure: overall framework diagram]
  • 14. Preprocessing • Geometric model reconstruction ▫ Dense scene reconstruction with points of interest [Zhou 2013] ▫ Any other 3D reconstruction method can be used • Model simplification ▫ The original mesh consists of more than 1M faces → inefficient texture mapping → further processing becomes extremely time-consuming ▫ Surface simplification using quadric error metrics [Garland 1997] [Figures: dense scene reconstruction [Zhou 2013]; mesh simplification (460K to 23K faces)]
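
The quadric error metric named on the simplification slide can be sketched in a few lines. This is only an illustration of the per-vertex error that drives edge collapse in [Garland 1997], not the talk's implementation; the function names are hypothetical and the edge-collapse loop itself is omitted.

```python
def plane_from_triangle(a, b, c):
    """Return (nx, ny, nz, d) with a unit normal, so n.x + d = 0 on the plane.

    Assumes the triangle is not degenerate (non-zero area).
    """
    ux, uy, uz = b[0] - a[0], b[1] - a[1], b[2] - a[2]
    vx, vy, vz = c[0] - a[0], c[1] - a[1], c[2] - a[2]
    nx, ny, nz = uy * vz - uz * vy, uz * vx - ux * vz, ux * vy - uy * vx
    norm = (nx * nx + ny * ny + nz * nz) ** 0.5
    nx, ny, nz = nx / norm, ny / norm, nz / norm
    d = -(nx * a[0] + ny * a[1] + nz * a[2])
    return (nx, ny, nz, d)


def quadric(planes):
    """Sum the outer products p p^T of the given planes into a 4x4 matrix."""
    q = [[0.0] * 4 for _ in range(4)]
    for p in planes:
        for i in range(4):
            for j in range(4):
                q[i][j] += p[i] * p[j]
    return q


def quadric_error(q, v):
    """Evaluate h^T Q h for the homogeneous point h = (x, y, z, 1).

    This equals the sum of squared distances from v to the summed planes.
    """
    h = (v[0], v[1], v[2], 1.0)
    return sum(h[i] * q[i][j] * h[j] for i in range(4) for j in range(4))
```

A simplifier would keep one such quadric per vertex (summed over its incident faces) and collapse the edge whose merged vertex has the smallest error first.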
  • 15. Spatiotemporal Key Frame Sampling [Figure: overall framework diagram]
  • 16. Spatiotemporal Key Frame Sampling • Input color stream ▫ Contains a lot of redundant data, and color images suffer from motion blur • Temporal sampling ▫ Sample less blurry key frames based on blurriness [Crété-Roffet 2007] • Spatial sampling ▫ Uniqueness: the part of an image that is not covered by other images ▫ Sample by eliminating the image with minimum uniqueness [Figures: temporal sampling with blurriness; overlapping (red) and unique (blue) regions]
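
The two sampling stages above can be sketched as follows. This is a minimal sketch under stated assumptions: blurriness scores are taken as given (the [Crété-Roffet 2007] metric is not implemented), each frame's coverage is modeled as a set of visible face ids, and the window size and `min_uniqueness` stopping threshold are hypothetical parameters that the slides do not specify.

```python
def temporal_sample(blurriness, window):
    """Pick the index of the least blurry frame in each consecutive window."""
    keys = []
    for start in range(0, len(blurriness), window):
        chunk = range(start, min(start + window, len(blurriness)))
        keys.append(min(chunk, key=lambda i: blurriness[i]))
    return keys


def uniqueness(frame, coverage, kept):
    """Fraction of the frame's covered faces that no other kept frame covers."""
    others = set().union(*(coverage[k] for k in kept if k != frame))
    own = coverage[frame]
    return len(own - others) / len(own) if own else 0.0


def spatial_sample(coverage, kept, min_uniqueness):
    """Repeatedly drop the least unique frame until all are unique enough."""
    kept = list(kept)
    while len(kept) > 1:
        scores = {k: uniqueness(k, coverage, kept) for k in kept}
        worst = min(kept, key=lambda k: scores[k])
        if scores[worst] >= min_uniqueness:
            break
        kept.remove(worst)
    return kept
```

Recomputing uniqueness after each removal matters: dropping one frame makes its neighbors more unique, so a single ranking pass would discard too many frames.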
  • 17. Texture Map Generation [Figure: overall framework diagram]
  • 18. Texture Map Generation • UV unwrapping of the mesh for the global texture map ▫ Get global texture coordinates for every vertex • Estimate color by blending key frames ▫ Build a sub-texture map by projecting the mesh to each camera ▫ The blended sub-textures become the global texture [Figure: UV unwrapping, mesh projection, sub-textures, weighted blending, global texture map]
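
The weighted blending step can be illustrated per texel. This is a sketch only: the slides do not say how the blending weights are chosen, so the weights here are an assumed input (for example, something like a view-angle term), and `blend_texel` is a hypothetical helper name.

```python
def blend_texel(samples):
    """Blend one global-texture texel from its sub-texture samples.

    samples: list of (color, weight) pairs, where color is an (r, g, b) tuple
    sampled from one key frame's sub-texture and weight is non-negative.
    Returns the weighted mean color, or None if no key frame sees the texel.
    """
    total = sum(w for _, w in samples)
    if total <= 0:
        return None
    return tuple(sum(c[k] * w for c, w in samples) / total for k in range(3))
```

If the sub-texture samples for the same surface point disagree (because of pose error or lens distortion), this average mixes different colors, which is the blurring and ghosting that the later optimization stage targets.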
  • 19. Global Texture Map Optimization [Figure: overall framework diagram]
  • 20. Global Texture Map Optimization • The generated texture map also suffers from blurring, ghosting, etc. ▫ Inconsistent color blending from different sub-textures • Optimize sub-texture coordinates to be consistent → sharper and cleaner global texture map [Figures: inconsistent blending; consistent blending]
  • 21. Global Texture Map Optimization • Search for new sub-texture coordinates of each vertex • Energy formulation for photometric consistency ▫ For every face, the blended global texture should be consistent with the sub-textures ▫ Consider consistency of sampled points on each face • Non-linear least squares problem ▫ Needs to be solved with the Gauss-Newton method [Equation labels: sub-texture coordinates (variables), sub-texture (intensity), blended global texture (intensity), sub-textures of face f]
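
The equation on this slide did not survive extraction. One plausible way to write an energy matching the labels above is shown here; all notation is an assumption made for illustration (F is the set of faces, I_f the sub-textures that see face f, S_f the sampled points on f, Γ_i the intensity of sub-texture i, u_i(p) the sub-texture coordinate of sample p interpolated from the per-vertex variables, and w_i the blending weights), and it should not be read as the paper's exact formula.

```latex
% Assumed notation; a sketch of a photometric-consistency energy.
E(\mathbf{u}) = \sum_{f \in F} \sum_{i \in I_f} \sum_{p \in S_f}
    \bigl( C(p) - \Gamma_i(u_i(p)) \bigr)^2,
\qquad
C(p) = \frac{\sum_{i \in I_f} w_i \, \Gamma_i(u_i(p))}{\sum_{i \in I_f} w_i}

% Standard Gauss-Newton step for a least-squares energy with stacked
% residuals r(u) and Jacobian J = \partial r / \partial u:
(J^\top J) \, \Delta\mathbf{u} = -J^\top r(\mathbf{u}),
\qquad
\mathbf{u} \leftarrow \mathbf{u} + \Delta\mathbf{u}
```

The size of J grows with the number of vertices times the number of sub-textures per vertex, which is why solving the full system at once becomes the bottleneck.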
  • 22. GPU-based Alternating Solver • Applying the naïve Gauss-Newton method is non-trivial ▫ Infeasible to solve directly due to the number of variables • Exploit locality of the problem to parallelize the optimization ▫ Assuming the 1-ring neighborhood of v is fixed, optimization of the sub-texture coordinates of v is independent of other vertices ▫ Schwarz Alternating Method → update inner variables while keeping boundary variables fixed → independent optimizations are propagated iteratively [Figures: 1-ring neighborhood of v (v1 to v4); propagation of optimization]
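
The alternating idea can be shown on a toy problem. The sketch below is not the texture energy: it assumes a 1D chain of scalar variables with a quadratic data term and a smoothness term between neighbors, so each local solve has a closed form. Variables of the same parity are not neighbors, so their updates are independent and could run in parallel; a real mesh would need a graph coloring with more than two sets.

```python
def local_update(i, x, data, lam):
    """Minimize the energy over x[i] alone, with its neighbors held fixed."""
    nbrs = [j for j in (i - 1, i + 1) if 0 <= j < len(x)]
    return (data[i] + lam * sum(x[j] for j in nbrs)) / (1.0 + lam * len(nbrs))


def energy(x, data, lam):
    """Data term plus smoothness term between chain neighbors."""
    fit = sum((a - b) ** 2 for a, b in zip(x, data))
    smooth = sum((x[i] - x[i + 1]) ** 2 for i in range(len(x) - 1))
    return fit + lam * smooth


def alternating_solve(data, lam, iterations):
    """Alternate between even and odd variables; each set updates independently."""
    x = list(data)
    for _ in range(iterations):
        for parity in (0, 1):
            # All updates in this set read only fixed neighbors, so they
            # could be computed concurrently (e.g. one GPU thread each).
            updates = {i: local_update(i, x, data, lam)
                       for i in range(parity, len(x), 2)}
            for i, value in updates.items():
                x[i] = value
    return x
```

Each sweep can only lower the energy, and information from one variable reaches distant ones over successive sweeps, which mirrors the "propagated iteratively" remark on the slide.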
  • 23. Experimental Results • Tested on various 3D reconstructed models (models from Zhou et al.) • Intel i7 4.0GHz, 16GB RAM, NVIDIA Titan X
  • 25. Experimental Results • Our method, 10K faces: optimization takes 2.6s (100 iterations)
  • 28. [Figure panels: volumetric blending; color map optimization [Zhou 2014]; our result (before opt.); our result (after opt.)] • Color map optimization [Zhou 2014]: 15,853,238 faces, file size 713MB, optimization takes 5 mins • Ours: 10,000 faces, file size 217KB + few MB, optimization takes 2.6 secs
  • 29. Experimental Results • 10K faces takes less than 3s (original mesh consists of 1M faces) [Figure panels: 20K faces; 10K faces; 5K faces]
  • 33. Experimental Results • 130K faces: optimization takes 18s
  • 36. Summary • Texture map generation for color reconstruction of 3D indoor scenes ▫ Texture map generation maximizing the photometric consistency of the mapping ▫ Spatiotemporal sampling for faster processing and a sharper texture map ▫ Efficient optimization based on a parallel Gauss-Newton solver on GPU → directly applicable to computer graphics
  • 37. Semantic Reconstruction: Reconstruction of Semantically Segmented 3D Meshes via Volumetric Semantic Fusion Junho Jeon, Jinwoong Jung, Jungeon Kim, Seungyong Lee Computer Graphics Forum (Pacific Graphics 2018)
  • 38. Reconstruction of Semantic Information • Virtual/augmented reality → interaction with 3D scenes • Single connected 3D model → not suitable • Requires individually segmented object models → semantic segmentation on 3D reconstructed scene • Figure labels: interaction with 3D scene; single connected 3D model; sofa; floor; shelves; wall
  • 39. Semantic Segmentation on 2D Image • Pixel-wise annotation of semantic object class • Well-established network architectures and datasets ▫ PASCAL, MS COCO, Mapillary, Places, … • Has shown successful performance • Figure labels: Places dataset; Mapillary dataset
  • 40. 3D Semantic Segmentation • Point-wise (vertex-wise) annotation on 3D scene model • Deep learning on 3D data is not straightforward ▫ Unstructured point cloud, mesh with complex topology • Lack of annotated 3D reconstructed model datasets ▫ Recently, an annotated dataset was released (ScanNet) • Figure labels: reconstructed 3D model; per-vertex annotation (floor, bed, wall, chair, picture)
  • 41. Related Work – 3D CNN-based Methods • Represent input geometry as a uniform voxel grid ▫ Binary occupancy grid or distance field • Direct feature extraction and classification w/ 3D CNN • Higher memory consumption → only low-resolution segmentations • Figure labels: fully convolutional 3D CNN architecture (images courtesy of [Qi 2017]); voxel-based semantic segmentation [Dai 2017]
  • 42. Related Work – Point-based Methods • Convert an unstructured point cloud into an ordered sequence vector ▫ Point set grouping, slice pooling, max pooling • Feature extraction and classification w/ MLP or RNN • Misses geometric detail (may miss small object classes) • Figure labels: RSNet [Huang 2018]; PointNet++ [Qi 2017]
  • 43. Our Approach: Semantic Reconstruction • 3D (geometry) reconstruction using fusion of multiple geometry (depth images) • 3D semantic reconstruction using fusion of multiple 2D semantic predictions Multiple depth images Dense surface reconstruction Multiple semantic predictions 3D semantic reconstruction 43
  • 45. Volumetric Fusion of Semantic Information • Review: Volumetric Fusion of 3D Geometry • Geometry representation using a uniform voxel grid ▫ Each voxel stores a TSDF value (geometry information) • Merge noisy measurements on a single voxel grid w/ estimated camera poses ▫ Volumetric denoising of the reconstructed geometry (TSDF values) • Figure labels: multi-frame geometric fusion; uniform voxel grid (images courtesy of Newcombe’s slides)
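The merging of noisy measurements in a voxel is commonly done with a running weighted average of TSDF samples. The sketch below shows that update for a single voxel; the function name `fuse_tsdf` and the `max_weight` cap are assumptions made for illustration rather than details taken from the slides.

```python
def fuse_tsdf(tsdf, weight, sample, sample_weight=1.0, max_weight=100.0):
    """Fold one TSDF measurement into a voxel's running weighted average.

    tsdf, weight  : values currently stored in the voxel
    sample        : truncated signed distance observed in the new frame
    sample_weight : confidence of the new observation (must be > 0)
    max_weight    : assumed cap so that old data cannot dominate forever
    """
    new_weight = weight + sample_weight
    fused = (weight * tsdf + sample_weight * sample) / new_weight
    return fused, min(new_weight, max_weight)

# Three noisy observations of the same surface distance are averaged out.
voxel_tsdf, voxel_weight = 0.0, 0.0
for observed in [0.2, 0.4, 0.3]:
    voxel_tsdf, voxel_weight = fuse_tsdf(voxel_tsdf, voxel_weight, observed)
```

Only two numbers per voxel are stored, so each new depth frame can be integrated incrementally without keeping earlier frames in memory.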
  • 46. Volumetric Fusion of Semantic Information • Each voxel stores a semantic probability distribution (20 classes) ▫ Volumetric fusion of multi-frame semantic predictions • Seamless integration into the 3D (geometry) reconstruction process • Figure labels: RGB-D stream; CNN-based 2D semantic segmentation; stream of semantic predictions; volumetric semantic fusion
  • 47. Semantic Reconstruction • CNN-based 2D Semantic Segmentation • Volumetric semantic fusion with adaptive weight • CRF-based 3D semantic label regularization Overall framework RGB-D Stream Dense Surface Reconstruction (1) CNN-based 2D Semantic Segmentation Per-vertex Semantic Class Confidence Semantically Segmented Mesh & Projected 2D Segmentation (2) Volumetric Semantic Fusion … (3) CRF-based Label Regularization … 47
  • 51. CNN-based 2D Semantic Segmentation • RGB-D semantic segmentation → RDFNet [Park 2017] • A stream captured for 3D reconstruction differs from still images ▫ Captured close to objects, may suffer from motion blur • Figure labels: images from ScanNet dataset (reconstruction); images from NYU-D dataset (still image)
  • 52. CNN-based 2D Semantic Segmentation • RGB-D semantic segmentation → RDFNet [Park 2017] • A stream captured for 3D reconstruction differs from still images ▫ Captured close to objects, may suffer from motion blur • Fine-tuning on ScanNet dataset [Dai et al. 2017] ▫ Drastically improves segmentation quality • Figure labels: input; original RDFNet [Park 2017]; fine-tuned RDFNet
  • 54. Adaptive Volumetric Semantic Fusion • 2D predictions & camera poses may contain errors ▫ Weighted average of class probabilities for each voxel • Depth-based reliability weight ▫ Network accuracy depends on pixel depth (i.e. relative scale) ▫ Pixels close to the camera contribute less to the result • Figure label: volumetric semantic fusion
  • 55. Adaptive Volumetric Semantic Fusion • 2D predictions & camera poses may contain errors ▫ Weighted average of class probabilities for each voxel • Foreground boundary weight ▫ Predictions are unreliable around misaligned object boundaries ▫ Prohibit wall/floor labels for foreground objects (pixels) • Figure labels: input color; input depth; depth weights; foreground weights; wall (background); object (foreground); unreliable predictions
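The weighted averaging of class probabilities described on these slides can be sketched for one voxel as follows. The slides do not give the weight formulas, so `depth_weight` below is a purely hypothetical linear ramp (with assumed `near`, `far` and `floor` parameters) that only reproduces the stated behavior that pixels close to the camera contribute less. A foreground boundary weight could be multiplied in the same way.

```python
def depth_weight(depth_m, near=0.5, far=2.0, floor=0.1):
    """Hypothetical reliability ramp: low weight near the camera, 1.0 at `far`."""
    ramp = (depth_m - near) / (far - near)
    return max(floor, min(1.0, ramp))

def fuse_class_probs(probs, weight, obs_probs, obs_weight):
    """Weighted running average of a voxel's class probability vector."""
    total = weight + obs_weight
    if total == 0.0:
        return list(probs), 0.0
    fused = [(weight * p + obs_weight * q) / total
             for p, q in zip(probs, obs_probs)]
    return fused, total

# One voxel, three classes, two conflicting per-frame predictions.
probs, weight = [0.0, 0.0, 0.0], 0.0
observations = [
    ([0.8, 0.1, 0.1], 2.0),  # seen from 2.0 m: trusted more
    ([0.1, 0.8, 0.1], 0.6),  # seen from 0.6 m: trusted less
]
for obs_probs, depth in observations:
    probs, weight = fuse_class_probs(probs, weight, obs_probs, depth_weight(depth))
```

With these assumed weights the distant observation dominates, so the voxel keeps class 0 as its most likely label while the result remains a valid probability distribution.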
  • 57. Reconstruction of Semantically Labeled 3D Mesh • Marching cubes extracts a reconstructed 3D mesh from the volumetric representation ▫ Bilinear interpolation of fused probabilities at voxels ▫ Each vertex has 20 object class probabilities • Select the maximum-probability class for each vertex to obtain an initial segmentation • Figure labels: color; initial segmentation result; probability visualization for major classes (floor, wall, others, bed, chair)
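The per-vertex labeling step can be illustrated with a small sketch. As a simplifying assumption it interpolates linearly along a single voxel edge, at the point where the TSDF crosses zero, rather than reproducing the interpolation scheme named on the slide. The function names are hypothetical.

```python
def interpolate_probs(p0, p1, d0, d1):
    """Interpolate class probabilities at the TSDF zero crossing of an edge.

    p0, p1 : fused class probabilities at the two voxel corners
    d0, d1 : TSDF values at those corners (assumed to have opposite signs)
    """
    t = d0 / (d0 - d1)  # fraction of the way from corner 0 to corner 1
    return [(1.0 - t) * a + t * b for a, b in zip(p0, p1)]

def vertex_label(probs):
    """Initial segmentation: index of the maximum-probability class."""
    return max(range(len(probs)), key=lambda c: probs[c])

# Two classes; the surface lies closer to corner 0 in the first case
# and closer to corner 1 in the second.
near_corner0 = interpolate_probs([0.7, 0.3], [0.2, 0.8], 0.25, -0.75)
near_corner1 = interpolate_probs([0.7, 0.3], [0.2, 0.8], 0.75, -0.25)
labels = [vertex_label(near_corner0), vertex_label(near_corner1)]
```

The vertex inherits mostly the distribution of the corner it is closest to, so the label can change along an edge that straddles two objects.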
  • 58. CRF-based Label Regularization • Integrated, but noisy 3D segmentation results ▫ 2D Segmentation considers a limited FOV individually • CRF optimization to determine final class labels • Consider global context of reconstructed scene w/ geometry (surface normal), appearance (colors), and semantic similarity using a confusion matrix of CNN Input Naïve integration Adaptive integration No CRF Final result 58
  • 59. Experimental Setting • 2D semantic segmentation: RDFNet for RGB-D stream (Caffe) [Park 2017] • Camera pose estimation & 3D volumetric fusion: BundleFusion [Dai 2017] • NVIDIA GeForce Titan X with 12GB VRAM 59
  • 60. Experimental Results Reconstructed scene (color) Segmented result 60
  • 62. Segmentation Result of Large-scale Scenes • Incremental integration enables semantic reconstruction of large-scale 3D scenes Reconstructed scene Segmented result 62
  • 63. Visual Comparisons Input ScanNet [Dai 2017] Our results 63
  • 64. Visual Comparisons Input ScanNet [Dai 2017] Our results 64
  • 66. Quantitative Evaluation • Global voxel classification accuracy with majority voting ▫ Improves on the previous method by a large margin (+6.86%) ▫ Tested on ScanNet dataset (312 test scenes) • Adaptive integration & CRF appear to have little effect on this metric ▫ They mainly handle object boundaries: visually critical but covering only a small portion of the data • Accuracy by configuration: Voxel-based labeling [Dai 2017] 73.0% | Naïve integration without CRF 79.02% | Adaptive integration without CRF 79.28% | Naïve integration with CRF 79.79% | Adaptive integration with CRF 79.86%
  • 67. Quantitative Evaluation • ScanNet dataset is highly unbalanced ▫ Most vertices (61.7%) are wall & floor → imbalanced classes ▫ Evaluate with class-mean intersection-over-union (mIOU) & accuracy (mAcc) • Outperforms previous SOTAs (+2.84% / +15.3%) • mIOU / mAcc by configuration: PointNet [Qi 2017] 14.69% / 19.90% | PointNet++ [Qi 2017] 34.26% / 43.77% | RSNet [Huang 2018] 39.35% / 48.37% | RSNet w/ RGB [Huang 2018] 41.16% / 50.34% | Ours 44.00% / 65.64%
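Class-mean metrics are used here because a dominant class can hide poor performance on rare ones. The sketch below computes mIOU, mAcc and global accuracy from a confusion matrix using the standard definitions. The matrix values are made up for illustration and are not results from the talk.

```python
def class_metrics(confusion):
    """Return (mIoU, mAcc, global accuracy) from a square confusion matrix.

    confusion[g][p] counts samples of ground-truth class g predicted as class p.
    Classes with no ground-truth samples are skipped in the class means.
    """
    n = len(confusion)
    ious, accs = [], []
    for c in range(n):
        true_pos = confusion[c][c]
        gt_count = sum(confusion[c])
        pred_count = sum(confusion[r][c] for r in range(n))
        if gt_count == 0:
            continue
        union = gt_count + pred_count - true_pos
        ious.append(true_pos / union)
        accs.append(true_pos / gt_count)
    total = sum(sum(row) for row in confusion)
    correct = sum(confusion[c][c] for c in range(n))
    return sum(ious) / len(ious), sum(accs) / len(accs), correct / total

# Illustrative imbalance: class 0 (e.g. wall/floor) has 100 samples, class 1 has 10.
toy_confusion = [
    [90, 10],
    [5, 5],
]
miou, macc, global_acc = class_metrics(toy_confusion)
```

In this toy case global accuracy stays high because the large class is mostly correct, while mIOU and mAcc drop sharply because the small class is labeled poorly.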
  • 68. 2D Projection of 3D Segmentation • Fusion & regularization improve semantic segmentation results • We can render 2D semantic maps from the segmented 3D model • Original 2D segmentation vs. rendered 2D results ▫ Tested on ScanNet dataset (53K frames from 312 test scenes) • Quantitative comparison (Pixel Acc. / Mean Acc. / Mean IoU): Original RDFNet 60.44 / 47.32 / 29.34 | Finetuned RDFNet (2D) 73.55 / 59.82 / 45.60 | Our result (rendered 2D) 77.18 / 63.20 / 50.69 • Figure labels: input image; results of CNN; our result
  • 69. 3D Scene Completion and Manipulation • Class-wise (semantic) 3D scene manipulation • Scene completion • Object modification • Figure labels: input scene; semantic mesh; floor filling; object removal
  • 71. Summary • Volumetric semantic fusion integrating 2D semantic predictions → exploits the success of 2D CNNs & data • Adaptive integration based on depth and scene structure → compensates for the uncertainty of network predictions • CRF-based label regularization using geometric and photometric information → refines the final result • Limitation ▫ 2D semantic segmentation requires heavy computation ▫ Multiple GPUs are needed to achieve real-time performance
  • 72. Summary and Future Work Color and 3D Semantic Reconstruction of Indoor Scenes from RGB-D Streams
  • 73. Summary • 3D Reconstruction of auxiliary information ▫ Beyond the geometric reconstruction of the indoor scene ▫ Useful for rich user experience on VR/AR application 73
  • 74. Summary • 3D Color and Semantic Reconstruction of Indoor Scenes from RGB-D Streams ▫ Efficient and accurate color representation ▫ Texture map generation using spatiotemporal key frame sampling and texture coordinate optimization ▫ Optimizing the texture map considering geometric and photometric consistency together ▫ Per-vertex dense semantic class information ▫ 3D semantic segmentation on reconstructed scenes via volumetric semantic fusion ▫ 3D instance segmentation of the reconstructed scene for individual object meshes • Figure labels: texture map generation; semantic reconstruction
  • 76. Supplementary Slides Color and 3D Semantic Reconstruction of Indoor Scenes from RGB-D Streams