SlideShare a Scribd company logo
1 of 72
Download to read offline
2 0 1 9 / 0 9 / 1 2
Survey of 3D Deep Learning (Robo+R3 Study Group)
Arithmer R3 Div. Takashi Nakano
Self-Introduction
• Takashi Nakano
• Graduate School
• Kyoto University
• Laboratory : Nuclear Theory Group
• Research : Theoretical Physics, Ph.D. (Science)
• Phase structure of the universe
• Theoretical properties of Lattice QCD
• Phase structure of graphene
• Former Job
• KOZO KEIKAKU ENGINEERING Inc.
• Contract analysis / Technical support / Introduction support by using software of Fluid Dynamics /
Powder engineering
• Current Job
• Application of machine learning / deep learning to fluid dynamics
• e.g. https://arithmer.co.jp/2019-12-29-1/
• Application of machine learning / deep learning to 3D data
Purpose
• Purpose of this material
• Overview of 3D deep learning
• Comparison b/w each method of 3D deep learning
• Main papers (In this material, I have summarized the material based on
following materials and cited papers therein.)
• E. Ahmed et al, "A survey on Deep Learning Advances on Different 3D Data Representations",
2018
• M. M. Bronstein et al., "Geometric deep learning: going beyond Euclidean data", 2016
Application
• Application of 3D Deep Learning
Classification Segmentation Correspondence Retrieval
3D data restoration from 2D images,
Pose Estimation, etc.Per-point classification
Each label
at each vertex
same #vertex
at each model
Comparison of Global Feature
[2]
[1]
[1] C. R. Qi, "PointNet: Deep Learning on Point Sets for 3D Classification and Segmentation", 2016
[2] J. Masci et al. "Geodesic convolutional neural networks on Riemannian manifolds", 2015
[1]
[1]
[1]
Agenda
• Methods of 3D Deep Learning
• Euclidean vs Non-Euclidean
• Euclidean Method
• Projections / Multi-View
• Voxel
• Non-Euclidean Method
• Point Cloud / Mesh / Graph
• Accuracy
• Dataset / Material
• Appendix
• Mesh Generation
• Laplacian on Graph
• Correspondence
3D Data
• 3D Data
Point Cloud Mesh
Point Cloud Mesh Graph
Vertex 〇 〇 〇
Face - 〇 -
Edge - - 〇
[ 𝑥0, 𝑦0, 𝑧0 , … , 𝑥 𝑁, 𝑦 𝑁, 𝑧 𝑁 ]
[ 𝑉00, 𝑉01, 𝑉02 , … , 𝑉𝐹0, 𝑉𝐹1, 𝑉𝐹2 ]
[ 𝑉00, 𝑉01 , 𝑉01, 𝑉02 , … , 𝑉𝐸2, 𝑉𝐸1 ]
𝒱 𝒱
ℱ
𝒱
ℰ
Graph
Representation
• Representation of 3D data
[1] E. Ahmed et al, "A survey on Deep Learning Advances on Different 3D Data Representations", 2018
[1]
Representation
• Representation of 3D data
Euclidean
Non-Euclidean
Grid
(Translational invariant)
Non-Grid
(not Translational invariant)
Local / Intrinsic
Global / Extrinsic
Point of view from 3D
Point of view from 2D Surface
3D2D
Rigid
Small deformation
Non-Rigid
Large deformation
3D CNN2D CNN
Non-trivial CNN
〇
×
[1]
[1] J. Masci et al. "Geodesic convolutional neural networks on Riemannian manifolds", 2015
[1]
Euclidean vs Non-Euclidean
• Euclidean
[1] E. Ahmed et al, "A survey on Deep Learning Advances on Different 3D Data Representations", 2018
[1]
Euclidean vs Non-Euclidean
• Euclidean (detail of feature)
Feature Merit Demerit
Descriptors Extraction of 3D topological
feature (SHOT, PFH, etc.)
• Can convert as feature
• Can each problem
The geometric properties
of the shape is lost.
Projections Projection of 3D to 2D - The geometric properties
of the shape is lost.
RGB-D RGB + Depth map • Can use data from
RGB-D sensors
(Kinect/realsence) as
input
• Need depth map.
• Only infer some of the
3D properties based
on the depth.
Volumetric Voxelization • Expansion of 2D CNN • Need Large memories.
• (grid information)
• Need high resolution
for detailed shapes.
(e.g. segmentation)
Multi-View 2D images from multi-angles • Highest accuracy in
Euclidean method
• Need multi-view
images
Euclidean vs Non-Euclidean
• Non-Euclidean
Point Cloud Mesh Graph
Unordered point cloud
No connected information
b/w point cloud
Connected information
b/w point cloud
Graph (Vertex, edge)
Dependence of
noise and density of
point cloud
Need to convert
from point cloud to mesh
Need to create graph type
[ 𝑥0, 𝑦0, 𝑧0 , 𝑥1, 𝑦1, 𝑧1 ]
[ 𝑥1, 𝑦1, 𝑧1 , 𝑥0, 𝑦0, 𝑧0 ]
𝒱 𝒱
ℱ
𝒱
ℰ
[1] R. Hanocka et al., "MeshCNN: A Network with an Edge", 2018
[1]
Euclidean vs Non-Euclidean
• Non-Euclidean (detail of feature)
Feature Merit Demerit
Point
Cloud
• Treat point cloud
• Need to keep translational
and rotational invariance
• Treat unordered point cloud
• No connected information
b/w point cloud
• Original data is often point
cloud.
• e.g. scanned data (No CAD
data, Terrain data)
• Civil engineering, architecture,
medical care, fashion
• Treat noise
• Dependence of density of
point cloud
• Complement b/w point
cloud
• Cannot distinguish b/w
close point cloud
Mesh • Treat mesh data
• Connected information b/w
point cloud
• Convert mesh data to
structure for applying CNN
• CAD data
• e.g. design in manufacturing
• Can keep geometry in few
mesh
• Convert point cloud to mesh
data
Graph • Treat mesh as graph
• Vertex (node)
• Edge (connected information
b/w point cloud)
• Same as Mesh • Create graph type CNN
(non-trivial)
Euclidean vs Non-Euclidean
• Non-Euclidean (detail of feature)
Feature Merit Demerit
Point
Cloud
• Treat point cloud
• Need to keep translational
and rotational invariance
• Treat unordered point cloud
• No connected information
b/w point cloud
• Original data is often point
cloud.
• e.g. scanned data (No CAD
data, Terrain data)
• Civil engineering, architecture,
medical care, fashion
• Treat noise
• Dependence of density of
point cloud
• Complement b/w point
cloud
• Cannot distinguish b/w
close point cloud
Mesh • Treat mesh data
• Connected information b/w
point cloud
• Convert mesh data to
structure for applying CNN
• CAD data
• e.g. design in manufacturing
• Can keep geometry in few
mesh
• Convert point cloud to mesh
data
Graph • Treat mesh as graph
• Vertex (node)
• Edge (connected information
b/w point cloud)
• Same as Mesh • Create graph type CNN
(non-trivial)
Euclidean
• Representation of 3D data
[1] E. Ahmed et al, "A survey on Deep Learning Advances on Different 3D Data Representations", 2018
[1]
Euclidean
• Each Euclidean Method (Projections / RGB-D / Volumetric / Multi-View)
Method Application Link
Deep Pano Classification Paper
Two-stream CNNs on RGB-D Classification Paper
VoxNet Classification Paper
GitHub(Keras)
MVCNN Classification
Retrieval
Paper
GitHub(PyTorch/TensorFlow
etc.)
Euclidean
• Deep Pano [1]
• Projection to Panoramic image
• Row-wise max-pooling for rotational invariant
Panoramic image
[1] B. Shi et al. "DeepPano: Deep Panoramic Representation for 3D Shape Recognition", 2017
[1]
Euclidean
• Two-stream CNNs on RGB-D [1]
• Concatenate CNN of RGB and CNN of depth map
Concatenation[1]
[1] A. Eitel et al. "Multimodal Deep Learning for Robust RGB-D Object Recognition", 2015
Euclidean
• VoxNet [1]
• Voxelization of 3D point cloud to voxel
• Not robust for data loss
Voxelization
Point Cloud
Voxel
[1] D. Maturana et al. "VoxNet: A 3D Convolutional Neural Network for Real-Time Object Recognition", 2015
[1]
Euclidean
• MVCNN [1]
• Merge CNN of each images
[1] H. Su et al. "Multi-view Convolutional Neural Networks for 3D Shape Recognition", 2015
[1]
Non-Euclidean (Point Clouds)
• Representation of 3D data
[1] E. Ahmed et al, "A survey on Deep Learning Advances on Different 3D Data Representations", 2018
[1]
Non-Euclidean (Point Clouds)
• Each Non-Euclidean Method (Point Cloud)
Method Application Link
PointNet Classification
Segmentation
Retrieval
Correspondence
Paper
GitHub (TensorFlow)
PointNet++ Classification
Segmentation
Retrieval
Correspondence
Paper
GitHub (TensorFlow)
PyTorch-geometric (PointConv)
Dynamic Graph CNN
(DGCNN)
Classification
Segmentation
Paper
GitHub (PyTorch/TensorFlow)
PyTorch-geometric
(DynamicEdgeConv)
PointCNN Classification
Segmentation
Paper
GitHub (TensorFlow)
PyTorch-geometric (XConv)
※Some equations from following pages are referred to the documents in PyTorch-geometric.
(https://pytorch-geometric.readthedocs.io/en/latest/modules/nn.html)
I will explain PyTorch-geometric in later page.
Non-Euclidean (Point Clouds)
• PointNet [1]
• Treat unordered point cloud by max-pooling
• Comparison b/w PointNet++
• Detailed information is lost
• Cannot treat different density of point cloud
Part segmentation
(Per-point classification)
Predict Affine transformation
(Transrational, Rotational Invariance)
Similar to Spatial Transformer Networks in 2D
Classification
𝑓 𝑥1, ⋯ , 𝑥 𝑛 = 𝑔(ℎ 𝑥1 , ⋯ , ℎ 𝑥 𝑛 )
Max-poolingInput feature
MLP
Symmetry Function
Global + Local Feature
Randomly
rotating the object
along up-axis,
Normalization
in unit square
Affine
transformation
[1]
[1] C. R. Qi, "PointNet: Deep Learning on Point Sets for 3D Classification and Segmentation", 2016
Non-Euclidean (Point Clouds)
• PointNet
• T-Net [1]
• Similar to Spatial Transformer Networks in 2D
• Spatial Transformer Networks
• Alignment of image (transformation, rotation, distortion etc.) by spatial transformation
• Learn affine transformation from input data (not necessarily special data)
• Can insert this networks at each point b/w networks
Reference Contents
Paper Original Paper
Sample (PyTorch) Dataset : MNIST
[1] M. Jaderberg et al. "Spatial transformer networks",2015
[1]
Non-Euclidean (Point Clouds)
• PointNet
• Spatial Transformer Networks
• Localization net : output parameters 𝜃 to transform for input feature map 𝑈
• Combination of Conv, MaxPool, ReLU, FC
• Output : 2 × 3
• Grid generator : create sampling grid by using the parameters
• Sampler : Output transformed feature map 𝑉
• pixel
Spatial Transformer Networks (2D)
Grid generator
Input map to transformed map
Input
feature map
Output
feature map
𝑥𝑖
𝑠
𝑦𝑖
𝑠 = 𝒯𝜃 𝐺𝑖 = 𝐴 𝜃
𝑥𝑖
𝑡
𝑦𝑖
𝑡
1
2 × 3
[1] M. Jaderberg et al. "Spatial transformer networks",2015
[1][1]
[1]
Non-Euclidean (Point Clouds)
• PointNet
• T-Net
• 3D ver. of Spatial Transformer Networks in 2D
• Not need sampling grid (There are no gird structure in 3D)
• Directly apply transformation to each point cloud
• Output parameter
• 3 × 3 in first T-Net
• 64 × 64 in second T-Net
T-Net
(input feature : 3)
T-Net
(input feature : 64)
[1]
[1] C. R. Qi, "PointNet: Deep Learning on Point Sets for 3D Classification and Segmentation", 2016
Non-Euclidean (Point Clouds)
• PointNet++ [1]
• Comparison b/w PointNet
• Detailed information is kept
• Can treat different density of point cloud
Concatenation of multi-resolution [1] C. R. Qi et al. "Pointnet++: Deep hierarchical feature learning on point sets in a metric space", 2017
[1]
Non-Euclidean (Point Clouds)
• PointNet++
• Set abstraction
• Grouping in one scale + feature extraction
• Sampling Layer : Extraction of sampling points by farthest point sampling (FPS)
• Grouping Layer : Grouping points around sampling points
• PointNet Layer : Applying PointNet
Sampling Layer Grouping Layer
𝑟
Non-Euclidean (Point Clouds)
• PointNet++
• Point Feature Propagation for segmentation
• Interpolation : interpolation from k neighbor points
• Concatenation
Interpolation
𝑓 𝑗 𝑥 =
𝑖=1
𝑘
𝑤𝑖 𝑥 𝑓𝑖
(𝑗)
𝑖=1
𝑘
𝑤𝑖 𝑥
𝑘 = 3
𝑤𝑖 𝑥 =
1
𝑑 𝑥, 𝑥𝑖
2
𝑑
𝑥
𝑥𝑖
𝑥
feature
weight
Inverse of distance[1] C. R. Qi et al. "Pointnet++: Deep hierarchical feature learning on point sets in a metric space", 2017
[1]
Non-Euclidean (Point Clouds)
• PointNet++
• Single scale grouping
• Multi scale/resolution grouping
• Combination of features from different scales
• Robust for non-uniform sampling density
• Modifying architecture in set abstraction level
𝐿𝑖−1 Level
𝐿𝑖 Level
Original Points
multi-resolution grouping (MRG)
Concatenation of information of multi-resolutionHigh computational cost
multi-scale grouping (MSG)
Recommendation
[1] C. R. Qi et al. "Pointnet++: Deep hierarchical feature learning on point sets in a metric space", 2017
[1] [1]
Non-Euclidean (Point Clouds)
• PointNet++
• Detail of architecture
• Note: #vertex is fixed
𝑆𝐴 512, 0.2, 64, 64, 128 → 𝑆𝐴 128, 0.2, 64, 64, 128 → 𝑆𝐴 256,512,1024
#𝑖𝑛𝑝𝑢𝑡 𝑣𝑒𝑟𝑡𝑒𝑥: 1024
→ 𝐹𝐶 512,0.5 → 𝐹𝐶 256,0.5 → 𝐹𝐶(𝐾)
class
→ 𝐹𝑃 256, 256 → 𝐹𝑃 256,128 → 𝐹𝑃(128,128,128,128, 𝐾)
per point segmentation
1024/2 512/4
Architecture for classification
and part segmentation of ModelNet
using single scale grouping
𝑆𝐴 𝐾, 𝑟, ℓ1, ⋯ , ℓ 𝑑
#vertex radius Pointnet (#FC:d)
𝑆𝐴 ℓ1, ⋯ , ℓ 𝑑
Set abstraction level
Global Set
abstraction level #FC:d
Convert single vector by maxpooling
For classification
For part segmentation
Same in cls. and seg.
𝐹𝐶 ℓ, 𝑑𝑝
𝐹𝑃 ℓ1, ⋯ , ℓ 𝑑
Fully Connected
Feature Propagation
Channel
Ratio of dropout
#FC:d
Non-Euclidean (Point Clouds)
• PointNet++
• Detail of architecture
• Note: #vertex is fixed
#𝑖𝑛𝑝𝑢𝑡 𝑣𝑒𝑟𝑡𝑒𝑥: 1024
Architecture classification of ModelNet using multi-resolution grouping (MRG)
𝑆𝐴 512, 0.2, 64, 64, 128 → 𝑆𝐴 64, 0.4, 128, 128, 256
𝑆𝐴 512, 0.4, 64, 128, 256
𝑆𝐴 64, 128, 256,512
𝑆𝐴 256,512,1024
Concat.
Concat.
→ 𝐹𝐶 512,0.5 → 𝐹𝐶 256,0.5 → 𝐹𝐶(𝐾) Same as single scale grouping
class
Non-Euclidean (Point Clouds)
• Dynamic Graph CNN (DGCNN) [1]
• PointNet + w/ Edge Conv.
• Edge Conv.
• Create local edge structure dynamically (not fixed in each layer)
Edge Conv.
PointNet+ w/ Edge Conv.
𝒙𝑖′ =
𝑗∈𝑁(𝑖)
ℎ 𝚯(𝒙𝑖, 𝒙𝑗 − 𝒙𝑖)
global local
Search neighbors in feature space by kNN
[1] Y. Wang, "Dynamic Graph CNN for Learning on Point Clouds", 2018
[1]
[1]
Non-Euclidean (Point Clouds)
• PointCNN [1]
• Downsampling information from neighborhoods into fewer representative
points
Χ-Conv.
Lower resolution, deeper channels
Decreasing #representative points, deeper channels
𝒙𝑖′ = 𝐶𝑜𝑛𝑣 𝑲, 𝛾Θ 𝑷𝑖 − 𝒑𝑖 × ℎΘ 𝑷𝑖 − 𝒑𝑖 , 𝒙𝑖
Input feature
MLP applied
individually on each point
like PointNet
Kernel
Concatenation
[1] Y. Li et al. "PointCNN: Convolution On X-Transformed Points", 2018
[1]
[1]
Non-Euclidean (Mesh)
• Representation of 3D data
[1] E. Ahmed et al, "A survey on Deep Learning Advances on Different 3D Data Representations", 2018
[1]
Non-Euclidean (Mesh)
• Each Non-Euclidean Method (Mesh)
Method Application Link
MeshCNN Classification
Segmentation
Paper
GitHub (PyTorch)
MeshNet Classification Paper
GitHub (PyTorch)
Non-Euclidean (Mesh)
• MeshCNN [1]
• Edge collapse by pooling
• Can apply only the manifold mesh
use in segmentation
EdgeEdge collapse by pooling
Input feature
Angle
Length
Pooling / Unpooling
[1] R. Hanocka et al., "MeshCNN: A Network with an Edge", 2018
[1]
[1] [1]
Non-Euclidean (Mesh)
• MeshNet
• Input feature
• Center, corner, normal, neighbor index
Information of neighborhood of face
Mesh Conv.
(Combination + Aggregation)
[1] Y. Feng et al. "MeshNet: Mesh Neural Network for 3D Shape Representation", 2018
[1]
[1]
[1]
Non-Euclidean (Graph)
• Representation of 3D data
[1] E. Ahmed et al, "A survey on Deep Learning Advances on Different 3D Data Representations", 2018
[1]
Non-Euclidean (Graph)
• Each Non-Euclidean Method (Graph)
• Spectral / Spatial Method
Spectral Spatial
Euclidean
(1D)
Non-Euclidean
(Manifold)
∆𝜙𝑖 = 𝜆𝑖 𝜙𝑖
𝜙𝑖 = 𝑒 𝑖𝜔𝑥
, 𝜆𝑖 = 𝜔2
Local coordinate
𝐷𝑗 𝑥 𝑓 =
𝑦∈𝑁(𝑥)
𝜔𝑗(𝒖 𝑥, 𝑦 )𝑓(𝑦)
𝑓 ∗ 𝑔 𝑥 =
𝑗=1
𝐽
𝑔𝑗 𝐷𝑗 𝑥 𝑓
Patch Operator
Pseudo-coordinate
Convolution
Generalization of
Fourier basis
[1] M. M. Bronstein et al., "Geometric deep learning: going beyond Euclidean data", 2016
[2] J. Masci et al. "Geodesic convolutional neural networks on Riemannian manifolds", 2015
[3] M. Fey et al. "SplineCNN: Fast Geometric Deep Learning with Continuous B-Spline Kernels", 2017
[1] [1]
[3]
[2]
Non-Euclidean (Graph)
• Each Non-Euclidean Method (Graph)
• Spatial method is more useful than spectral method.
Method Structure Feature
Spectral • Fourier basis in manifold
• Laplacian eigenvalue/eigenvector
• Spectral filter coefficients is base dependent in some
method
• No locality in some method
• High computational cost
Spatial • Create local coordinate
• Patch operator + Conv.
• Locality
• Efficient computational cost
※Some equations from following pages are referred to the documents in PyTorch-geometric.
(https://pytorch-geometric.readthedocs.io/en/latest/modules/nn.html)
I will explain PyTorch-geometric in later page.
Non-Euclidean (Graph)
• Each Non-Euclidean Method (Graph)
• Spectral, Spectral free
Method Method Application Link
Spectral CNN Spectral Graph Paper
Chebyshev Spectral
CNN
(ChebNet)
Spectral free Graph Paper
GitHub (TensorFlow)
PyTorch-geometric
(ChebConv)
Graph Convolutional
Network
(GCN)
Spectral free Graph Paper
PyTorch-geometric
(GCNConv)
Graph Neural Network
(GNN)
Spectral free Graph Paper
Non-Euclidean (Graph)
• Spectral CNN [2]
• cannot use different shape
• Spectral filter coefficients is base dependent
• High computational cost
• No locality
∆𝑓 𝑖 ∝
𝑖,𝑗 ∈ℰ
𝜔𝑖𝑗(𝑓𝑖 − 𝑓𝑗)Laplacian
Different shape
-> different basis -> different result
𝒇ℓ
𝑜𝑢𝑡
= 𝜉
ℓ′=1
𝑝
𝚽 𝑘 𝑮ℓ,ℓ′ 𝚽 𝑘
𝑇
𝒇ℓ′
𝑖𝑛
Laplacian eigenvectorReLU
[1] M. M. Bronstein et al., "Geometric deep learning: going beyond Euclidean data", 2016
[2] J. Bruna et al. "Spectral Networks and Locally Connected Networks on Graphs", 2013
[1]
[1]
Non-Euclidean (Graph)
• Chebyshev Spectral CNN (ChebNet) [1]
• Not calculate Laplacian eigenvectors directly
• Locality (K hops)
• Approximate filter as polynomial
• Graph Convolutional Network (GCN) [2]
• Special ver. of ChebNet (𝐾 = 2)
𝑋′
=
𝑘=0
𝐾−1
𝑍(𝑘)
⋅ Θ(𝑘)
𝑍(0) = 𝑋
𝑍(1)
= 𝐿 ⋅ 𝑋
𝑍(𝑘) = 2 ⋅ 𝐿 ⋅ 𝑍 𝑘−1 − 𝑍(𝑘−2)
𝐿:scaled and normalized Laplacian
[1] M. Defferrard et al. "Convolutional Neural Networks on Graphs with Fast Localized Spectral Filtering", 2016
[2] T. N. Kipf et al. "Semi-Supervised Classification with Graph Convolutional Networks", 2016
Non-Euclidean (Graph)
• Each Non-Euclidean Method (Graph)
• Charting
Application Link
Geodesic CNN Mesh
Shape retrieval /
correspondence
Paper
Anisotropic CNN Mesh / point cloud
Shape correspondence
Paper
MoNet Graph / mesh / point cloud
Shape correspondence
Paper
PyTorch-geometric (GMMConv)
SplineCNN Graph / Mesh
Classification
Shape correspondence
Paper
GitHub (PyTorch)
PyTorch-geometric
(SplineConv)
FeaStNet Graph / Mesh
Shape correspondence
Segmentation
Paper
PyTorch-geometric (FeaStConv)
Non-Euclidean (Graph)
• Geodesic CNN (GCNN)[1] ⊂ Anisotropic CNN (ACNN)[2] ⊂ MoNet [3]
MoNet
𝐷𝑗 𝑥 𝑓 =
𝑦∈𝑁(𝑥)
𝜔𝑗(𝒖 𝑥, 𝑦 )𝑓(𝑦)
𝑓 ∗ 𝑔 𝑥 =
𝑗=1
𝐽
𝑔𝑗 𝐷𝑗 𝑥 𝑓
Patch Operator
Pseudo-coordinate
Convolution
𝜔𝑗 𝒖 = exp −
1
2
𝒖 − 𝜇 𝑗
T
Σj
−1
(𝒖 − 𝜇 𝑗)
𝜔𝑗 𝒖 = exp −
1
2
𝒖 𝑇
𝑹 𝜃 𝑗
𝛼 0
0 1
𝑹 𝜃 𝑗
𝑇
𝒖ACNN
GCNN 𝜔𝑗 𝒖 = exp −
1
2
𝒖 − 𝑢𝑗
T 𝜎𝜌
2
0
0 𝜎 𝜃
2
(𝒖 − 𝑢𝑗)
Rotation of 𝜃 to the maximum
curvature direction The degree of anisotropy
covariance (radius, angle direction)
Learning parameters
[1] J. Masci et al. "Geodesic convolutional neural networks on Riemannian manifolds", 2015
[2] D Boscaini et al. "Learning shape correspondence with anisotropic convolutional neural networks", 2016
[3] F. Monti et al. "Geometric deep learning on graphs and manifolds using mixture model CNNs", 2016
[4] M. Fey et al. "SplineCNN: Fast Geometric Deep Learning with Continuous B-Spline Kernels", 2017
[1]
[4]
Non-Euclidean (Graph)
• Geodesic CNN (GCNN)
• Create local coordinate
• Do not verify the meaningful chart (need to create small radius chart)
• Anisotropic CNN (ACNN)
• Fourier basis is based on anisotropic heat diffusion eq.
• MoNet
• Learn filter as parametric kernel
• Generalization of geodesic CNN and anisotropic CNN
Non-Euclidean (Graph)
• SplineCNN [1]
• Filter based on B-spline function
• Efficient computational cost
𝒙𝑖′ =
1
|𝑁 𝑖 |
𝑗∈𝑁(𝑖)
𝒙𝑖 ⋅ ℎ 𝚯(𝒆𝑖,𝑗)
Weighted B-Spline basis
[1] M. Fey et al. "SplineCNN: Fast Geometric Deep Learning with Continuous B-Spline Kernels", 2017
[1]
Non-Euclidean (Graph)
• FeaStNet [1]
• Dynamically determine relation b/w filter weight and local graph of a node
𝒙𝑖′ =
1
|𝑁 𝑖 |
𝑗∈𝑁(𝑖) 𝑚=1
𝑀
𝑞 𝑚 𝒙𝑖, 𝒙𝑗 𝑾 𝑚 𝒙𝑗
Filter
(e.g. 𝑀 = 3 × 3 = 9)
Euclidean FeaStNet
Weight
Input Output Input Output
#neighbor
(e.g. 𝑁 = 6)
𝑞 𝑚 𝒙𝑖, 𝒙𝑗 = 𝑠𝑜𝑓𝑡𝑚𝑎𝑥𝑗 𝒖 𝒎
𝑻
𝒙𝒊 − 𝒙𝒋 + 𝑐 𝑚
𝒙𝑖′ =
𝑚=1
𝑀
𝑾 𝑚 𝒙 𝑛(𝑚,𝑖)
pixel
D input featureE output feature [1] N Verma et al. "FeaStNet: Feature-Steered Graph Convolutions for 3D Shape Analysis", 2017
[1][1]
Non-Euclidean (Graph)
• PyTorch-geometric
• https://github.com/rusty1s/pytorch_geometric
• Library based on PyTorch
• For point cloud, mesh (not only graph)
• Include Point cloud, graph-type approach code
• PointNet++, DGCNN, PointCNN
• ChebNet, GCN, MoNet, SplineCNN, FeaStNet
• Easy to get the famous sample data and transform same data format
• ModelNet, ShapeNet, etc.
• Many example and benchmark
Accuracy
• Accuracy (Classification)
• around 90% in any method (except VoxNet)
Method “ModelNet40”
Overall Acc. [%] / Mean
class Acc. [%]
“SHREC”
Overall Acc. [%]
VoxNet 85.9 / 83.0 −
MVCNN − / 90.1 𝟗𝟔. 𝟎𝟗
PointNet 89.2 / 86.0 −
PointNet++ 90.7 / − −
DGCNN 𝟗𝟐. 𝟗 / 𝟗𝟎. 𝟐 −
PointCNN 92.2 / 88.1 −
MeshNet − / 91.9 −
MeshCNN − / − 91.0
※Please refer the detail in each paper (mentioned in each page)
Accuracy
• Accuracy (Segmentation)
Method Part segmentation
“ShapeNet”
mIoU (mean per-class
part-averaged IoU) [%]
Part
segmentation
“ScanNet”
Acc. [%]
Part
segmentation
“COSEG”
Acc. [%]
Scene
segmentation
“S3DIS”
Acc. [%] /
mIoU [%]
Human body
segmentation
“including
SCAPE, FAUST
etc.”
Acc. [%]
PointNet 80.4 (57.9 − 95.3) 73.9 54.4 − 91.5 78.6 / − 90.77
PointNet++ 81.9 (58.7 − 95.3) 84.5 79.1 − 98.9 − / − −
DGCNN 82.3 (𝟔𝟑. 𝟓 − 𝟗𝟓. 𝟕) − − 𝟖𝟒. 𝟏 / 56.1 −
PointCNN 𝟖𝟒. 𝟔 𝟖𝟓. 𝟏 − − / 𝟔𝟓. 𝟑𝟗 −
MeshCNN − − 𝟗𝟕. 𝟓𝟔 − 𝟗𝟗. 𝟔𝟑 − / − 𝟗𝟐. 𝟑𝟎
FeaStNet 81.5 − − − / − −
※Please refer the detail in each paper (mentioned in each page)
Dataset
• 3D Dataset
Contents Data Format Purpose PyTorch-geometric
ModelNet10/40 3D CAD Model
(10 or 40 classes)
Mesh (.OFF) Classification ModelNet
ShapeNet 3D Shape Point Cloud (.pts) Segmentation ShapeNet
ScanNet Indoor Scan Data Mesh (.ply) Segmentation -
S3DIS
(original, .h5)
Indoor Scan Data Point Cloud Segmentation S3DIS
ScanNet:registration required
S3DIS : registration required (for original)
Dataset
• 3D Dataset
Contents Data Format Purpose PyTorch-geometric
SHREC many type for each
contest
- Retrieval -
SHREC2016 Animal, Human
(Part Data)
Mesh (.OFF) Correspondence SHREC2016
TOSCA Animal, Human Mesh
(same #vertices at
each category,
separate file of
vertices and
triangles)
Correspondence TOSCA
PCPNet 3D Shape Point Cloud (.xyz)
(Including normal,
curvature files.)
Estimation of local
shape (Normal,
curvature)
PCPNet
FAUST Human body Mesh Correspondence FAUST
FAUST(Note) : registration required
Material
• Material of 3D deep learning (3D / point cloud)
Paper Comment
A survey on Deep Learning Advances on
Different 3D Data Representations
• Review of 3D Deep Learning
• Easier to read it
• Written from point of view about Euclidean
and Non-Euclidean method
Paperwithcode • Paper w/ code about 3D
Point Cloud Deep Learning Survey Ver. 2 • Deep learning for point cloud
• Survey of many papers
Material
• Material of 3D deep learning (graph)
Paper Comment
Geometric deep learning: going beyond
Euclidean data
• Review of geometric deep learning
Geometric Deep Learning • summary of paper and code about geometric
deep learning
Geometric Deep Learning on Graphs and
Manifolds (NIPS2017)
• Presentation (youtube) about geometric deep
learning
Summary
• There are many methods of 3D deep learning.
• Two main method
• Euclidean vs Non-Euclidean
• Euclidean Method
• Projections / Multi-View / Voxel
• Non-Euclidean Method
• Point Cloud / Mesh / Graph
• Each method have merit and demerit.
• We need to choose the better method for each data type and application.
• The research about 3D deep learning is growing.
Appendix
• Appendix
• Mesh Generation
• Laplacian on Graph
• Correspondence
Appendix : Mesh Generation
• Mesh Generation
• In this material, I have summarized these materials.
Link Contents
点群面張り(精密工学会) • Surface reconstruction
メッシュ処理(精密工学会) • Mesh processing
CV勉強会@関東発表資料 点群再構成に関するサーベイ • Survey of point cloud reconstruction
Appendix : Mesh Generation
• Difficulty of Mesh Generation
Processing Difficulty
Pre-processing Reduction of Noise / Missing / Abnormal value / density difference of vertices
Post-processing Mesh smoothing / hole filling
Ground Truth Noise
/ Abnormal value
Missing / density
difference of vertices
Mesh smoothing
/ hole filling
Appendix : Mesh Generation
• Kinds of Mesh Generation
Kind Feature Classification of the method
Direct Triangulation Direct mesh generation form point cloud Explicit method
Surface Smoothness Smooth surface mesh from point cloud Implicit method
Direct Triangulation Surface Smoothness
Appendix : Mesh Generation
• Classification of the method
• In general, it is easier to use the implicit method, since there are noise of point
cloud.
Classification of the method Information to use Influence of noise and density of
vertices
Guarantee of accuracy
Explicit method Vertices Large
(error of vertices = error of
meshes)
◎
Implicit method Meshes based on isosurface of
function fields which is
calculated from vertices
Small 〇
Appendix : Mesh Generation
• Kinds of Mesh Generation (Detail)
• Direct Triangulation (example of built-in function in MeshLab)
Method Feature
Voronoi-Based Surface Reconstruction Creation of Delaunay diagram adding the vertices
using Voronoi diagram
Ball-Pivoting Algorithm Roll the ball over the point cloud and generate mesh
from the point cloud located within a certain distance
Appendix : Mesh Generation
• Voronoi-Based Surface Reconstruction
• Voronoi diagram
• Region divided by the bisector of each vertices (in 2D)
• Delaunay triangulation
• Triangulation by connection of vertices
bisector
Vertices (S)
Voronoi
Vertices (V) Delaunay triangulation of S and V
(black + red line)
Example of 2D
Surface
(black line)
[1] N. Amenta et al. "A New Voronoi-Based Surface Reconstruction Algorithm", 1998
[1] [1]
Appendix : Mesh Generation
• Ball-Pivoting Algorithm
Not created
Close point cloudSparse data
Not created
Ideal data
[1] F Bernardini et al. "The Ball-Pivoting Algorithm for Surface Reconstruction", 1999
[1]
Appendix : Mesh Generation
• Kinds of Mesh Generation (Detail)
• Surface Smoothness (example of built-in function in MeshLab)
Method Feature
Signed distance function
+ Marching Cubes
Creation of Signed distance function by using the
distance b/w vertices and surface
+ Mesh generation by using Marching Cubes
Screened Poisson surface reconstruction
(Poisson surface reconstruction)
Distinguish b/w inside and outside of surface by
using Poisson eq.
Appendix : Mesh Generation
• Signed distance function + Marching Cubes
Oriented tangent planes Estimated signed distance
Output of
modified marching cubes
𝑓 𝒑 = 𝒑 − 𝒐 ⋅ 𝒏 𝑓 𝒑 > 0
→ 𝑜𝑢𝑡𝑠𝑖𝑑𝑒
𝑓 𝒑 < 0
→ 𝑖𝑛𝑠𝑖𝑑𝑒
𝒑
𝒑
𝒐
𝒏
𝑓 𝒑 = 0 → 𝑠𝑢𝑟𝑓𝑎𝑐𝑒
[1] H. Hoppe et al. "Surface Reconstruction from Unorganized Points", 1992
[1]
Appendix : Mesh Generation
• Screened Poisson surface reconstruction
• get Indicator Function by solving the Poisson eq.
∆𝜒 ≡ ∇ ⋅ ∇𝜒 = ∇ ⋅ 𝑽
Poisson eq.
Poisson surface reconstruction
Screened Poisson
surface reconstruction
Complement
b/w point cloud
[1] M. Kazhdan et al. "Poisson Surface Reconstruction", 2006
[2]
[1]
[2] M. Kazhdan et al. "Screened Poisson Surface Reconstruction", 2013
Appendix : Laplacian on Graph
• Laplacian on Graph [1]
∆𝑓 𝑖 =
1
𝑎𝑖
𝑖,𝑗 ∈ℰ
𝜔𝑖𝑗(𝑓𝑖 − 𝑓𝑗)Laplacian
(𝒱, ℰ)Graph (undirected)
𝒱 = {1, ⋯ , 𝑛}
ℰ ⊆ 𝒱 × 𝒱
𝑎𝑖
𝜔𝑖𝑗
weight
div. 𝑑𝑖𝑣 𝐹 𝑖 =
1
𝑎𝑖
𝑗: 𝑖,𝑗 ∈ℰ
𝜔𝑖𝑗 𝐹𝑖𝑗
Grad. ∇𝑓 𝑖𝑗 = 𝑓𝑖 − 𝑓𝑗
Mesh
𝜔𝑖𝑗 =
−ℓ𝑖𝑗
2
+ ℓ𝑗𝑘
2
+ ℓ 𝑘𝑖
2
8𝑎𝑖𝑗𝑘
+
−ℓ𝑖𝑗
2
+ ℓ𝑗ℎ
2
+ ℓℎ𝑖
2
8𝑎𝑖𝑗ℎ
=
1
2
(cot 𝛼𝑖𝑗 + cot 𝛽𝑖𝑗)
𝑎𝑖 =
1
3
𝑗𝑘: 𝑖,𝑗,𝑘 ∈ℱ
𝑎𝑖𝑗𝑘
𝑎𝑖𝑗𝑘 = 𝑠𝑖𝑗𝑘(𝑠𝑖𝑗𝑘 − ℓ𝑖𝑗)(𝑠𝑖𝑗𝑘 − ℓ𝑗𝑘)(𝑠𝑖𝑗𝑘 − ℓ 𝑘𝑖)
1/2
𝑠𝑖𝑗𝑘 =
1
2
(𝑎𝑖𝑗 + 𝑎𝑗𝑘 + 𝑎 𝑘𝑖)
𝑓: 𝒱 → ℝ, 𝐹: ℰ → ℝ
∆≡ −𝑑𝑖𝑣 ∇
→Laplacian
eigenvalues 𝜆 > 0
[1] M. M. Bronstein et al., "Geometric deep learning: going beyond Euclidean data", 2016
[1]
Appendix : Laplacian on Graph
• Laplacian on Graph [1]
Δ𝒇 = 𝑨−1 𝑫 − 𝑾 𝒇
𝒇 = 𝑓1, ⋯ , 𝑓𝑛
𝑇
𝑾 = (𝜔𝑖𝑗)
𝑨 = 𝑑𝑖𝑎𝑔(𝑎1, ⋯ , 𝑎 𝑛)
𝑫 = 𝑑𝑖𝑎𝑔
𝑗:𝑗≠𝑖
𝜔𝑖𝑗
Laplacian ∆ Condition
Unnormalized graph
Laplacian
∆= 𝑫 − 𝑾 𝐴 = 𝐼
Normalized
Symmetry Laplacian
∆= 𝑰 − 𝑫−
𝟏
𝟐 𝑾𝑫
𝟏
𝟐
𝐴 = 𝐷
+ Normalization
Random walk
Laplacian
∆= 𝑰 − 𝑫−1 𝑾 𝐴 = 𝐷
Laplacian (as matrix)
[1] M. M. Bronstein et al., "Geometric deep learning: going beyond Euclidean data", 2016
Appendix : Laplacian on Graph
• Laplacian on Graph [1]
• Convolution
(𝑓 ∗ 𝑔)(𝑥) =
𝑖≥0
𝑓𝑖 𝑔𝑖 𝜙𝑖(𝑥)
𝒇 ∗ 𝒈 = 𝚽𝑑𝑖𝑎𝑔 𝑔 𝚽T 𝐟
𝒇 = 𝑓1, ⋯ , 𝑓𝑛
𝑇
𝒈 = ( 𝑔1, ⋯ , 𝑔 𝒏)
𝚽 = (𝜙1, ⋯ , 𝜙 𝑛)
Matrix
Conv.
[1] M. M. Bronstein et al., "Geometric deep learning: going beyond Euclidean data", 2016
Appendix :Correspondence
• Correspondence [1]
Query Reference input
𝑥𝑖 (𝑦1, ⋯ , 𝑦 𝑁)
label
Each query vertex has labels
as all reference vertices
Output
(Probability)
Correct Label
Output
(Probability)
(𝑝1, 𝑝2, ⋯ , 𝑝 𝑁)
(1,0, ⋯ , 0)
One-hot vector
Loss
[1] M. M. Bronstein et al., "Geometric deep learning: going beyond Euclidean data", 2016
[1]
Summary of survey papers on deep learning method to 3D data

More Related Content

What's hot

GANの概要とDCGANのアーキテクチャ/アルゴリズム
GANの概要とDCGANのアーキテクチャ/アルゴリズムGANの概要とDCGANのアーキテクチャ/アルゴリズム
GANの概要とDCGANのアーキテクチャ/アルゴリズムHirosaji
 
[DL輪読会]StackGAN: Text to Photo-realistic Image Synthesis with Stacked Generat...
[DL輪読会]StackGAN: Text to Photo-realistic Image Synthesis with Stacked Generat...[DL輪読会]StackGAN: Text to Photo-realistic Image Synthesis with Stacked Generat...
[DL輪読会]StackGAN: Text to Photo-realistic Image Synthesis with Stacked Generat...Deep Learning JP
 
[DL輪読会]Learning to Simulate Complex Physics with Graph Networks
[DL輪読会]Learning to Simulate Complex Physics with Graph Networks[DL輪読会]Learning to Simulate Complex Physics with Graph Networks
[DL輪読会]Learning to Simulate Complex Physics with Graph NetworksDeep Learning JP
 
[DL輪読会]StyleGAN-NADA: CLIP-Guided Domain Adaptation of Image Generators
[DL輪読会]StyleGAN-NADA: CLIP-Guided Domain Adaptation of Image Generators[DL輪読会]StyleGAN-NADA: CLIP-Guided Domain Adaptation of Image Generators
[DL輪読会]StyleGAN-NADA: CLIP-Guided Domain Adaptation of Image GeneratorsDeep Learning JP
 
論文紹介:Multimodal Learning with Transformers: A Survey
論文紹介:Multimodal Learning with Transformers: A Survey論文紹介:Multimodal Learning with Transformers: A Survey
論文紹介:Multimodal Learning with Transformers: A SurveyToru Tamaki
 
Transformer メタサーベイ
Transformer メタサーベイTransformer メタサーベイ
Transformer メタサーベイcvpaper. challenge
 
【DL輪読会】GradMax: Growing Neural Networks using Gradient Information
【DL輪読会】GradMax: Growing Neural Networks using Gradient Information【DL輪読会】GradMax: Growing Neural Networks using Gradient Information
【DL輪読会】GradMax: Growing Neural Networks using Gradient InformationDeep Learning JP
 
【DL輪読会】"Instant Neural Graphics Primitives with a Multiresolution Hash Encoding"
【DL輪読会】"Instant Neural Graphics Primitives with a Multiresolution Hash Encoding"【DL輪読会】"Instant Neural Graphics Primitives with a Multiresolution Hash Encoding"
【DL輪読会】"Instant Neural Graphics Primitives with a Multiresolution Hash Encoding"Deep Learning JP
 
Deep Learningと画像認識   ~歴史・理論・実践~
Deep Learningと画像認識 ~歴史・理論・実践~Deep Learningと画像認識 ~歴史・理論・実践~
Deep Learningと画像認識   ~歴史・理論・実践~nlab_utokyo
 
NVIDIA Modulus: Physics ML 開発のためのフレームワーク
NVIDIA Modulus: Physics ML 開発のためのフレームワークNVIDIA Modulus: Physics ML 開発のためのフレームワーク
NVIDIA Modulus: Physics ML 開発のためのフレームワークNVIDIA Japan
 
Deformable Part Modelとその発展
Deformable Part Modelとその発展Deformable Part Modelとその発展
Deformable Part Modelとその発展Takao Yamanaka
 
[DL輪読会]Vision Transformer with Deformable Attention (Deformable Attention Tra...
[DL輪読会]Vision Transformer with Deformable Attention (Deformable Attention Tra...[DL輪読会]Vision Transformer with Deformable Attention (Deformable Attention Tra...
[DL輪読会]Vision Transformer with Deformable Attention (Deformable Attention Tra...Deep Learning JP
 
論文紹介「PointNetLK: Robust & Efficient Point Cloud Registration Using PointNet」
論文紹介「PointNetLK: Robust & Efficient Point Cloud Registration Using PointNet」論文紹介「PointNetLK: Robust & Efficient Point Cloud Registration Using PointNet」
論文紹介「PointNetLK: Robust & Efficient Point Cloud Registration Using PointNet」Naoya Chiba
 
ConditionalPointDiffusion.pdf
ConditionalPointDiffusion.pdfConditionalPointDiffusion.pdf
ConditionalPointDiffusion.pdfTakuya Minagawa
 
SSII2019TS: Shall We GANs?​ ~GANの基礎から最近の研究まで~
SSII2019TS: Shall We GANs?​ ~GANの基礎から最近の研究まで~SSII2019TS: Shall We GANs?​ ~GANの基礎から最近の研究まで~
SSII2019TS: Shall We GANs?​ ~GANの基礎から最近の研究まで~SSII
 
Fashion-MNIST: a Novel Image Dataset for Benchmarking Machine Learning Algori...
Fashion-MNIST: a Novel Image Dataset for Benchmarking Machine Learning Algori...Fashion-MNIST: a Novel Image Dataset for Benchmarking Machine Learning Algori...
Fashion-MNIST: a Novel Image Dataset for Benchmarking Machine Learning Algori...Han Xiao
 
Swin Transformer (ICCV'21 Best Paper) を完璧に理解する資料
Swin Transformer (ICCV'21 Best Paper) を完璧に理解する資料Swin Transformer (ICCV'21 Best Paper) を完璧に理解する資料
Swin Transformer (ICCV'21 Best Paper) を完璧に理解する資料Yusuke Uchida
 
これからの Vision & Language ~ Acadexit した4つの理由
これからの Vision & Language ~ Acadexit した4つの理由これからの Vision & Language ~ Acadexit した4つの理由
これからの Vision & Language ~ Acadexit した4つの理由Yoshitaka Ushiku
 
introduction to double deep Q-learning
introduction to double deep Q-learningintroduction to double deep Q-learning
introduction to double deep Q-learningWEBFARMER. ltd.
 
[DL輪読会]Pay Attention to MLPs (gMLP)
[DL輪読会]Pay Attention to MLPs	(gMLP)[DL輪読会]Pay Attention to MLPs	(gMLP)
[DL輪読会]Pay Attention to MLPs (gMLP)Deep Learning JP
 

What's hot (20)

GANの概要とDCGANのアーキテクチャ/アルゴリズム
GANの概要とDCGANのアーキテクチャ/アルゴリズムGANの概要とDCGANのアーキテクチャ/アルゴリズム
GANの概要とDCGANのアーキテクチャ/アルゴリズム
 
[DL輪読会]StackGAN: Text to Photo-realistic Image Synthesis with Stacked Generat...
[DL輪読会]StackGAN: Text to Photo-realistic Image Synthesis with Stacked Generat...[DL輪読会]StackGAN: Text to Photo-realistic Image Synthesis with Stacked Generat...
[DL輪読会]StackGAN: Text to Photo-realistic Image Synthesis with Stacked Generat...
 
[DL輪読会]Learning to Simulate Complex Physics with Graph Networks
[DL輪読会]Learning to Simulate Complex Physics with Graph Networks[DL輪読会]Learning to Simulate Complex Physics with Graph Networks
[DL輪読会]Learning to Simulate Complex Physics with Graph Networks
 
[DL輪読会]StyleGAN-NADA: CLIP-Guided Domain Adaptation of Image Generators
[DL輪読会]StyleGAN-NADA: CLIP-Guided Domain Adaptation of Image Generators[DL輪読会]StyleGAN-NADA: CLIP-Guided Domain Adaptation of Image Generators
[DL輪読会]StyleGAN-NADA: CLIP-Guided Domain Adaptation of Image Generators
 
論文紹介:Multimodal Learning with Transformers: A Survey
論文紹介:Multimodal Learning with Transformers: A Survey論文紹介:Multimodal Learning with Transformers: A Survey
論文紹介:Multimodal Learning with Transformers: A Survey
 
Transformer メタサーベイ
Transformer メタサーベイTransformer メタサーベイ
Transformer メタサーベイ
 
【DL輪読会】GradMax: Growing Neural Networks using Gradient Information
【DL輪読会】GradMax: Growing Neural Networks using Gradient Information【DL輪読会】GradMax: Growing Neural Networks using Gradient Information
【DL輪読会】GradMax: Growing Neural Networks using Gradient Information
 
【DL輪読会】"Instant Neural Graphics Primitives with a Multiresolution Hash Encoding"
【DL輪読会】"Instant Neural Graphics Primitives with a Multiresolution Hash Encoding"【DL輪読会】"Instant Neural Graphics Primitives with a Multiresolution Hash Encoding"
【DL輪読会】"Instant Neural Graphics Primitives with a Multiresolution Hash Encoding"
 
Deep Learningと画像認識   ~歴史・理論・実践~
Deep Learningと画像認識 ~歴史・理論・実践~Deep Learningと画像認識 ~歴史・理論・実践~
Deep Learningと画像認識   ~歴史・理論・実践~
 
NVIDIA Modulus: Physics ML 開発のためのフレームワーク
NVIDIA Modulus: Physics ML 開発のためのフレームワークNVIDIA Modulus: Physics ML 開発のためのフレームワーク
NVIDIA Modulus: Physics ML 開発のためのフレームワーク
 
Deformable Part Modelとその発展
Deformable Part Modelとその発展Deformable Part Modelとその発展
Deformable Part Modelとその発展
 
[DL輪読会]Vision Transformer with Deformable Attention (Deformable Attention Tra...
[DL輪読会]Vision Transformer with Deformable Attention (Deformable Attention Tra...[DL輪読会]Vision Transformer with Deformable Attention (Deformable Attention Tra...
[DL輪読会]Vision Transformer with Deformable Attention (Deformable Attention Tra...
 
論文紹介「PointNetLK: Robust & Efficient Point Cloud Registration Using PointNet」
論文紹介「PointNetLK: Robust & Efficient Point Cloud Registration Using PointNet」論文紹介「PointNetLK: Robust & Efficient Point Cloud Registration Using PointNet」
論文紹介「PointNetLK: Robust & Efficient Point Cloud Registration Using PointNet」
 
ConditionalPointDiffusion.pdf
ConditionalPointDiffusion.pdfConditionalPointDiffusion.pdf
ConditionalPointDiffusion.pdf
 
SSII2019TS: Shall We GANs?​ ~GANの基礎から最近の研究まで~
SSII2019TS: Shall We GANs?​ ~GANの基礎から最近の研究まで~SSII2019TS: Shall We GANs?​ ~GANの基礎から最近の研究まで~
SSII2019TS: Shall We GANs?​ ~GANの基礎から最近の研究まで~
 
Fashion-MNIST: a Novel Image Dataset for Benchmarking Machine Learning Algori...
Fashion-MNIST: a Novel Image Dataset for Benchmarking Machine Learning Algori...Fashion-MNIST: a Novel Image Dataset for Benchmarking Machine Learning Algori...
Fashion-MNIST: a Novel Image Dataset for Benchmarking Machine Learning Algori...
 
Swin Transformer (ICCV'21 Best Paper) を完璧に理解する資料
Swin Transformer (ICCV'21 Best Paper) を完璧に理解する資料Swin Transformer (ICCV'21 Best Paper) を完璧に理解する資料
Swin Transformer (ICCV'21 Best Paper) を完璧に理解する資料
 
これからの Vision & Language ~ Acadexit した4つの理由
これからの Vision & Language ~ Acadexit した4つの理由これからの Vision & Language ~ Acadexit した4つの理由
これからの Vision & Language ~ Acadexit した4つの理由
 
introduction to double deep Q-learning
introduction to double deep Q-learningintroduction to double deep Q-learning
introduction to double deep Q-learning
 
[DL輪読会]Pay Attention to MLPs (gMLP)
[DL輪読会]Pay Attention to MLPs	(gMLP)[DL輪読会]Pay Attention to MLPs	(gMLP)
[DL輪読会]Pay Attention to MLPs (gMLP)
 

Similar to Summary of survey papers on deep learning method to 3D data

Introduction to 3D Computer Vision and Differentiable Rendering
Introduction to 3D Computer Vision and Differentiable RenderingIntroduction to 3D Computer Vision and Differentiable Rendering
Introduction to 3D Computer Vision and Differentiable RenderingPreferred Networks
 
Topological Data Analysis of Complex Spatial Systems
Topological Data Analysis of Complex Spatial SystemsTopological Data Analysis of Complex Spatial Systems
Topological Data Analysis of Complex Spatial SystemsMason Porter
 
High-Performance Graph Analysis and Modeling
High-Performance Graph Analysis and ModelingHigh-Performance Graph Analysis and Modeling
High-Performance Graph Analysis and ModelingNesreen K. Ahmed
 
2D and 3D Presentation of Spatial Data: A Systematic Review
2D and 3D Presentation of Spatial Data: A Systematic Review2D and 3D Presentation of Spatial Data: A Systematic Review
2D and 3D Presentation of Spatial Data: A Systematic ReviewMatthias Trapp
 
Topological Data Analysis
Topological Data AnalysisTopological Data Analysis
Topological Data AnalysisDeviousQuant
 
20191107 deeplearningapproachesfornetworks
20191107 deeplearningapproachesfornetworks20191107 deeplearningapproachesfornetworks
20191107 deeplearningapproachesfornetworkstm1966
 
187186134 5-geometric-modeling
187186134 5-geometric-modeling187186134 5-geometric-modeling
187186134 5-geometric-modelingmanojg1990
 
5 geometric-modeling-ppt-university-of-victoria
5 geometric-modeling-ppt-university-of-victoria5 geometric-modeling-ppt-university-of-victoria
5 geometric-modeling-ppt-university-of-victoriaRaghu Gadde
 
187186134 5-geometric-modeling
187186134 5-geometric-modeling187186134 5-geometric-modeling
187186134 5-geometric-modelingmanojg1990
 
5_Geometric_Modeling.pdf
5_Geometric_Modeling.pdf5_Geometric_Modeling.pdf
5_Geometric_Modeling.pdfKeerthanaP37
 
Enhancing Parallel Coordinates with Curves
Enhancing Parallel Coordinates with CurvesEnhancing Parallel Coordinates with Curves
Enhancing Parallel Coordinates with Curvesmartinjgraham
 
Learning to Perceive the 3D World
Learning to Perceive the 3D WorldLearning to Perceive the 3D World
Learning to Perceive the 3D WorldNAVER Engineering
 
Burnaev and Notchenko. Skoltech. Bridging gap between 2D and 3D with Deep Lea...
Burnaev and Notchenko. Skoltech. Bridging gap between 2D and 3D with Deep Lea...Burnaev and Notchenko. Skoltech. Bridging gap between 2D and 3D with Deep Lea...
Burnaev and Notchenko. Skoltech. Bridging gap between 2D and 3D with Deep Lea...Skolkovo Robotics Center
 
Modelling - Third dimension.pptx
Modelling - Third dimension.pptxModelling - Third dimension.pptx
Modelling - Third dimension.pptxAliya Fathima Ilyas
 
Synthetic Data and Graphics Techniques in Robotics
Synthetic Data and Graphics Techniques in RoboticsSynthetic Data and Graphics Techniques in Robotics
Synthetic Data and Graphics Techniques in RoboticsPrabindh Sundareson
 

Similar to Summary of survey papers on deep learning method to 3D data (20)

Introduction to 3D Computer Vision and Differentiable Rendering
Introduction to 3D Computer Vision and Differentiable RenderingIntroduction to 3D Computer Vision and Differentiable Rendering
Introduction to 3D Computer Vision and Differentiable Rendering
 
Topological Data Analysis of Complex Spatial Systems
Topological Data Analysis of Complex Spatial SystemsTopological Data Analysis of Complex Spatial Systems
Topological Data Analysis of Complex Spatial Systems
 
High-Performance Graph Analysis and Modeling
High-Performance Graph Analysis and ModelingHigh-Performance Graph Analysis and Modeling
High-Performance Graph Analysis and Modeling
 
2D and 3D Presentation of Spatial Data: A Systematic Review
2D and 3D Presentation of Spatial Data: A Systematic Review2D and 3D Presentation of Spatial Data: A Systematic Review
2D and 3D Presentation of Spatial Data: A Systematic Review
 
lecture_16_jiajun.pdf
lecture_16_jiajun.pdflecture_16_jiajun.pdf
lecture_16_jiajun.pdf
 
Topological Data Analysis
Topological Data AnalysisTopological Data Analysis
Topological Data Analysis
 
20191107 deeplearningapproachesfornetworks
20191107 deeplearningapproachesfornetworks20191107 deeplearningapproachesfornetworks
20191107 deeplearningapproachesfornetworks
 
187186134 5-geometric-modeling
187186134 5-geometric-modeling187186134 5-geometric-modeling
187186134 5-geometric-modeling
 
5 geometric-modeling-ppt-university-of-victoria
5 geometric-modeling-ppt-university-of-victoria5 geometric-modeling-ppt-university-of-victoria
5 geometric-modeling-ppt-university-of-victoria
 
5 geometric modeling
5 geometric modeling5 geometric modeling
5 geometric modeling
 
187186134 5-geometric-modeling
187186134 5-geometric-modeling187186134 5-geometric-modeling
187186134 5-geometric-modeling
 
Deep 3D Visual Analysis - Javier Ruiz-Hidalgo - UPC Barcelona 2017
Deep 3D Visual Analysis - Javier Ruiz-Hidalgo - UPC Barcelona 2017Deep 3D Visual Analysis - Javier Ruiz-Hidalgo - UPC Barcelona 2017
Deep 3D Visual Analysis - Javier Ruiz-Hidalgo - UPC Barcelona 2017
 
Deep 3D Analysis - Javier Ruiz-Hidalgo - UPC Barcelona 2018
Deep 3D Analysis - Javier Ruiz-Hidalgo - UPC Barcelona 2018Deep 3D Analysis - Javier Ruiz-Hidalgo - UPC Barcelona 2018
Deep 3D Analysis - Javier Ruiz-Hidalgo - UPC Barcelona 2018
 
5_Geometric_Modeling.pdf
5_Geometric_Modeling.pdf5_Geometric_Modeling.pdf
5_Geometric_Modeling.pdf
 
Enhancing Parallel Coordinates with Curves
Enhancing Parallel Coordinates with CurvesEnhancing Parallel Coordinates with Curves
Enhancing Parallel Coordinates with Curves
 
Learning to Perceive the 3D World
Learning to Perceive the 3D WorldLearning to Perceive the 3D World
Learning to Perceive the 3D World
 
Burnaev and Notchenko. Skoltech. Bridging gap between 2D and 3D with Deep Lea...
Burnaev and Notchenko. Skoltech. Bridging gap between 2D and 3D with Deep Lea...Burnaev and Notchenko. Skoltech. Bridging gap between 2D and 3D with Deep Lea...
Burnaev and Notchenko. Skoltech. Bridging gap between 2D and 3D with Deep Lea...
 
Modelling - Third dimension.pptx
Modelling - Third dimension.pptxModelling - Third dimension.pptx
Modelling - Third dimension.pptx
 
Synthetic Data and Graphics Techniques in Robotics
Synthetic Data and Graphics Techniques in RoboticsSynthetic Data and Graphics Techniques in Robotics
Synthetic Data and Graphics Techniques in Robotics
 
報告
報告報告
報告
 

More from Arithmer Inc.

コーディネートレコメンド
コーディネートレコメンドコーディネートレコメンド
コーディネートレコメンドArithmer Inc.
 
Arithmerソリューション紹介 流体予測システム
Arithmerソリューション紹介 流体予測システムArithmerソリューション紹介 流体予測システム
Arithmerソリューション紹介 流体予測システムArithmer Inc.
 
Weakly supervised semantic segmentation of 3D point cloud
Weakly supervised semantic segmentation of 3D point cloudWeakly supervised semantic segmentation of 3D point cloud
Weakly supervised semantic segmentation of 3D point cloudArithmer Inc.
 
Arithmer NLP 自然言語処理 ソリューション紹介
Arithmer NLP 自然言語処理 ソリューション紹介Arithmer NLP 自然言語処理 ソリューション紹介
Arithmer NLP 自然言語処理 ソリューション紹介Arithmer Inc.
 
Arithmer Robo Introduction
Arithmer Robo IntroductionArithmer Robo Introduction
Arithmer Robo IntroductionArithmer Inc.
 
Arithmer AIチャットボット
Arithmer AIチャットボットArithmer AIチャットボット
Arithmer AIチャットボットArithmer Inc.
 
Arithmer R3 Introduction
Arithmer R3 Introduction Arithmer R3 Introduction
Arithmer R3 Introduction Arithmer Inc.
 
VIBE: Video Inference for Human Body Pose and Shape Estimation
VIBE: Video Inference for Human Body Pose and Shape EstimationVIBE: Video Inference for Human Body Pose and Shape Estimation
VIBE: Video Inference for Human Body Pose and Shape EstimationArithmer Inc.
 
Arithmer Inspection Introduction
Arithmer Inspection IntroductionArithmer Inspection Introduction
Arithmer Inspection IntroductionArithmer Inc.
 
全力解説!Transformer
全力解説!Transformer全力解説!Transformer
全力解説!TransformerArithmer Inc.
 
Arithmer NLP Introduction
Arithmer NLP IntroductionArithmer NLP Introduction
Arithmer NLP IntroductionArithmer Inc.
 
Introduction of Quantum Annealing and D-Wave Machines
Introduction of Quantum Annealing and D-Wave MachinesIntroduction of Quantum Annealing and D-Wave Machines
Introduction of Quantum Annealing and D-Wave MachinesArithmer Inc.
 
Arithmer OCR Introduction
Arithmer OCR IntroductionArithmer OCR Introduction
Arithmer OCR IntroductionArithmer Inc.
 
Arithmer Dynamics Introduction
Arithmer Dynamics Introduction Arithmer Dynamics Introduction
Arithmer Dynamics Introduction Arithmer Inc.
 
ArithmerDB Introduction
ArithmerDB IntroductionArithmerDB Introduction
ArithmerDB IntroductionArithmer Inc.
 
Summarizing videos with Attention
Summarizing videos with AttentionSummarizing videos with Attention
Summarizing videos with AttentionArithmer Inc.
 
3D human body modeling from RGB images
3D human body modeling from RGB images3D human body modeling from RGB images
3D human body modeling from RGB imagesArithmer Inc.
 

More from Arithmer Inc. (20)

コーディネートレコメンド
コーディネートレコメンドコーディネートレコメンド
コーディネートレコメンド
 
Test for AI model
Test for AI modelTest for AI model
Test for AI model
 
最適化
最適化最適化
最適化
 
Arithmerソリューション紹介 流体予測システム
Arithmerソリューション紹介 流体予測システムArithmerソリューション紹介 流体予測システム
Arithmerソリューション紹介 流体予測システム
 
Weakly supervised semantic segmentation of 3D point cloud
Weakly supervised semantic segmentation of 3D point cloudWeakly supervised semantic segmentation of 3D point cloud
Weakly supervised semantic segmentation of 3D point cloud
 
Arithmer NLP 自然言語処理 ソリューション紹介
Arithmer NLP 自然言語処理 ソリューション紹介Arithmer NLP 自然言語処理 ソリューション紹介
Arithmer NLP 自然言語処理 ソリューション紹介
 
Arithmer Robo Introduction
Arithmer Robo IntroductionArithmer Robo Introduction
Arithmer Robo Introduction
 
Arithmer AIチャットボット
Arithmer AIチャットボットArithmer AIチャットボット
Arithmer AIチャットボット
 
Arithmer R3 Introduction
Arithmer R3 Introduction Arithmer R3 Introduction
Arithmer R3 Introduction
 
VIBE: Video Inference for Human Body Pose and Shape Estimation
VIBE: Video Inference for Human Body Pose and Shape EstimationVIBE: Video Inference for Human Body Pose and Shape Estimation
VIBE: Video Inference for Human Body Pose and Shape Estimation
 
Arithmer Inspection Introduction
Arithmer Inspection IntroductionArithmer Inspection Introduction
Arithmer Inspection Introduction
 
全力解説!Transformer
全力解説!Transformer全力解説!Transformer
全力解説!Transformer
 
Arithmer NLP Introduction
Arithmer NLP IntroductionArithmer NLP Introduction
Arithmer NLP Introduction
 
Introduction of Quantum Annealing and D-Wave Machines
Introduction of Quantum Annealing and D-Wave MachinesIntroduction of Quantum Annealing and D-Wave Machines
Introduction of Quantum Annealing and D-Wave Machines
 
Arithmer OCR Introduction
Arithmer OCR IntroductionArithmer OCR Introduction
Arithmer OCR Introduction
 
Arithmer Dynamics Introduction
Arithmer Dynamics Introduction Arithmer Dynamics Introduction
Arithmer Dynamics Introduction
 
ArithmerDB Introduction
ArithmerDB IntroductionArithmerDB Introduction
ArithmerDB Introduction
 
Summarizing videos with Attention
Summarizing videos with AttentionSummarizing videos with Attention
Summarizing videos with Attention
 
3D human body modeling from RGB images
3D human body modeling from RGB images3D human body modeling from RGB images
3D human body modeling from RGB images
 
YOLACT
YOLACTYOLACT
YOLACT
 

Recently uploaded

The work to make the piecework work: An ethnographic study of food delivery w...
The work to make the piecework work: An ethnographic study of food delivery w...The work to make the piecework work: An ethnographic study of food delivery w...
The work to make the piecework work: An ethnographic study of food delivery w...stockholm university
 
Introduction-to-Wazuh-and-its-integration.pptx
Introduction-to-Wazuh-and-its-integration.pptxIntroduction-to-Wazuh-and-its-integration.pptx
Introduction-to-Wazuh-and-its-integration.pptxmprakaash5
 
Digital Tools & AI in Career Development
Digital Tools & AI in Career DevelopmentDigital Tools & AI in Career Development
Digital Tools & AI in Career DevelopmentMahmoud Rabie
 
Long journey of Ruby standard library at RubyConf AU 2024
Long journey of Ruby standard library at RubyConf AU 2024Long journey of Ruby standard library at RubyConf AU 2024
Long journey of Ruby standard library at RubyConf AU 2024Hiroshi SHIBATA
 
MuleSoft Online Meetup Group - B2B Crash Course: Release SparkNotes
MuleSoft Online Meetup Group - B2B Crash Course: Release SparkNotesMuleSoft Online Meetup Group - B2B Crash Course: Release SparkNotes
MuleSoft Online Meetup Group - B2B Crash Course: Release SparkNotesManik S Magar
 
Generative Artificial Intelligence: How generative AI works.pdf
Generative Artificial Intelligence: How generative AI works.pdfGenerative Artificial Intelligence: How generative AI works.pdf
Generative Artificial Intelligence: How generative AI works.pdfIngrid Airi González
 
Bitdefender-CSG-Report-creat7534-interactive
Bitdefender-CSG-Report-creat7534-interactiveBitdefender-CSG-Report-creat7534-interactive
Bitdefender-CSG-Report-creat7534-interactivestartupro
 
Dublin_mulesoft_meetup_API_specifications.pptx
Dublin_mulesoft_meetup_API_specifications.pptxDublin_mulesoft_meetup_API_specifications.pptx
Dublin_mulesoft_meetup_API_specifications.pptxKunal Gupta
 
THE STATE OF STARTUP ECOSYSTEM - INDIA x JAPAN 2023
THE STATE OF STARTUP ECOSYSTEM - INDIA x JAPAN 2023THE STATE OF STARTUP ECOSYSTEM - INDIA x JAPAN 2023
THE STATE OF STARTUP ECOSYSTEM - INDIA x JAPAN 2023Joshua Flannery
 
A PowerPoint Presentation on Vikram Lander pptx
A PowerPoint Presentation on Vikram Lander pptxA PowerPoint Presentation on Vikram Lander pptx
A PowerPoint Presentation on Vikram Lander pptxatharvdev2010
 
Tecnogravura, Cylinder Engraving for Rotogravure
Tecnogravura, Cylinder Engraving for RotogravureTecnogravura, Cylinder Engraving for Rotogravure
Tecnogravura, Cylinder Engraving for RotogravureAntonio de Llamas
 
Laying the Data Foundations for Artificial Intelligence!
Laying the Data Foundations for Artificial Intelligence!Laying the Data Foundations for Artificial Intelligence!
Laying the Data Foundations for Artificial Intelligence!Memoori
 
Green paths: Learning from publishers’ sustainability journeys - Tech Forum 2024
Green paths: Learning from publishers’ sustainability journeys - Tech Forum 2024Green paths: Learning from publishers’ sustainability journeys - Tech Forum 2024
Green paths: Learning from publishers’ sustainability journeys - Tech Forum 2024BookNet Canada
 
HCI Lesson 1 - Introduction to Human-Computer Interaction.pdf
HCI Lesson 1 - Introduction to Human-Computer Interaction.pdfHCI Lesson 1 - Introduction to Human-Computer Interaction.pdf
HCI Lesson 1 - Introduction to Human-Computer Interaction.pdfROWELL MARQUINA
 
QCon London: Mastering long-running processes in modern architectures
QCon London: Mastering long-running processes in modern architecturesQCon London: Mastering long-running processes in modern architectures
QCon London: Mastering long-running processes in modern architecturesBernd Ruecker
 
Arti Languages Pre Seed Pitchdeck 2024.pdf
Arti Languages Pre Seed Pitchdeck 2024.pdfArti Languages Pre Seed Pitchdeck 2024.pdf
Arti Languages Pre Seed Pitchdeck 2024.pdfwill854175
 
Top 10 Hubspot Development Companies in 2024
Top 10 Hubspot Development Companies in 2024Top 10 Hubspot Development Companies in 2024
Top 10 Hubspot Development Companies in 2024TopCSSGallery
 
Tetracrom printing process for packaging with CMYK+
Tetracrom printing process for packaging with CMYK+Tetracrom printing process for packaging with CMYK+
Tetracrom printing process for packaging with CMYK+Antonio de Llamas
 
Dynamical Context introduction word sensibility orientation
Dynamical Context introduction word sensibility orientationDynamical Context introduction word sensibility orientation
Dynamical Context introduction word sensibility orientationBuild Intuit
 

Recently uploaded (20)

The work to make the piecework work: An ethnographic study of food delivery w...
The work to make the piecework work: An ethnographic study of food delivery w...The work to make the piecework work: An ethnographic study of food delivery w...
The work to make the piecework work: An ethnographic study of food delivery w...
 
Introduction-to-Wazuh-and-its-integration.pptx
Introduction-to-Wazuh-and-its-integration.pptxIntroduction-to-Wazuh-and-its-integration.pptx
Introduction-to-Wazuh-and-its-integration.pptx
 
Digital Tools & AI in Career Development
Digital Tools & AI in Career DevelopmentDigital Tools & AI in Career Development
Digital Tools & AI in Career Development
 
Long journey of Ruby standard library at RubyConf AU 2024
Long journey of Ruby standard library at RubyConf AU 2024Long journey of Ruby standard library at RubyConf AU 2024
Long journey of Ruby standard library at RubyConf AU 2024
 
MuleSoft Online Meetup Group - B2B Crash Course: Release SparkNotes
MuleSoft Online Meetup Group - B2B Crash Course: Release SparkNotesMuleSoft Online Meetup Group - B2B Crash Course: Release SparkNotes
MuleSoft Online Meetup Group - B2B Crash Course: Release SparkNotes
 
Generative Artificial Intelligence: How generative AI works.pdf
Generative Artificial Intelligence: How generative AI works.pdfGenerative Artificial Intelligence: How generative AI works.pdf
Generative Artificial Intelligence: How generative AI works.pdf
 
Bitdefender-CSG-Report-creat7534-interactive
Bitdefender-CSG-Report-creat7534-interactiveBitdefender-CSG-Report-creat7534-interactive
Bitdefender-CSG-Report-creat7534-interactive
 
Dublin_mulesoft_meetup_API_specifications.pptx
Dublin_mulesoft_meetup_API_specifications.pptxDublin_mulesoft_meetup_API_specifications.pptx
Dublin_mulesoft_meetup_API_specifications.pptx
 
THE STATE OF STARTUP ECOSYSTEM - INDIA x JAPAN 2023
THE STATE OF STARTUP ECOSYSTEM - INDIA x JAPAN 2023THE STATE OF STARTUP ECOSYSTEM - INDIA x JAPAN 2023
THE STATE OF STARTUP ECOSYSTEM - INDIA x JAPAN 2023
 
A PowerPoint Presentation on Vikram Lander pptx
A PowerPoint Presentation on Vikram Lander pptxA PowerPoint Presentation on Vikram Lander pptx
A PowerPoint Presentation on Vikram Lander pptx
 
Tecnogravura, Cylinder Engraving for Rotogravure
Tecnogravura, Cylinder Engraving for RotogravureTecnogravura, Cylinder Engraving for Rotogravure
Tecnogravura, Cylinder Engraving for Rotogravure
 
Laying the Data Foundations for Artificial Intelligence!
Laying the Data Foundations for Artificial Intelligence!Laying the Data Foundations for Artificial Intelligence!
Laying the Data Foundations for Artificial Intelligence!
 
Green paths: Learning from publishers’ sustainability journeys - Tech Forum 2024
Green paths: Learning from publishers’ sustainability journeys - Tech Forum 2024Green paths: Learning from publishers’ sustainability journeys - Tech Forum 2024
Green paths: Learning from publishers’ sustainability journeys - Tech Forum 2024
 
HCI Lesson 1 - Introduction to Human-Computer Interaction.pdf
HCI Lesson 1 - Introduction to Human-Computer Interaction.pdfHCI Lesson 1 - Introduction to Human-Computer Interaction.pdf
HCI Lesson 1 - Introduction to Human-Computer Interaction.pdf
 
QCon London: Mastering long-running processes in modern architectures
QCon London: Mastering long-running processes in modern architecturesQCon London: Mastering long-running processes in modern architectures
QCon London: Mastering long-running processes in modern architectures
 
Arti Languages Pre Seed Pitchdeck 2024.pdf
Arti Languages Pre Seed Pitchdeck 2024.pdfArti Languages Pre Seed Pitchdeck 2024.pdf
Arti Languages Pre Seed Pitchdeck 2024.pdf
 
BoSEU24 | Bill Thompson | Talk From Another Century
BoSEU24 | Bill Thompson | Talk From Another CenturyBoSEU24 | Bill Thompson | Talk From Another Century
BoSEU24 | Bill Thompson | Talk From Another Century
 
Top 10 Hubspot Development Companies in 2024
Top 10 Hubspot Development Companies in 2024Top 10 Hubspot Development Companies in 2024
Top 10 Hubspot Development Companies in 2024
 
Tetracrom printing process for packaging with CMYK+
Tetracrom printing process for packaging with CMYK+Tetracrom printing process for packaging with CMYK+
Tetracrom printing process for packaging with CMYK+
 
Dynamical Context introduction word sensibility orientation
Dynamical Context introduction word sensibility orientationDynamical Context introduction word sensibility orientation
Dynamical Context introduction word sensibility orientation
 

Summary of survey papers on deep learning method to 3D data

  • 1. 2 0 1 9 / 0 9 / 1 2 Survey of 3D Deep Learning (Robo+R3 Study Group) Arithmer R3 Div. Takashi Nakano
  • 2. Self-Introduction • Takashi Nakano • Graduate School • Kyoto University • Laboratory : Nuclear Theory Group • Research : Theoretical Physics, Ph.D. (Science) • Phase structure of the universe • Theoretical properties of Lattice QCD • Phase structure of graphene • Former Job • KOZO KEIKAKU ENGINEERING Inc. • Contract analysis / Technical support / Introduction support by using software of Fluid Dynamics / Powder engineering • Current Job • Application of machine learning / deep learning to fluid dynamics • e.g. https://arithmer.co.jp/2019-12-29-1/ • Application of machine learning / deep learning to 3D data
  • 3. Purpose • Purpose of this material • Overview of 3D deep learning • Comparison b/w each method of 3D deep learning • Main papers (In this material, I have summarized the material based on following materials and cited papers therein.) • E. Ahmed et al, "A survey on Deep Learning Advances on Different 3D Data Representations", 2018 • M. M. Bronstein et al., "Geometric deep learning: going beyond Euclidean data", 2016
  • 4. Application • Application of 3D Deep Learning Classification Segmentation Correspondence Retrieval 3D data restoration from 2D images, Pose Estimation, etc.Per-point classification Each label at each vertex same #vertex at each model Comparison of Global Feature [2] [1] [1] C. R. Qi, "PointNet: Deep Learning on Point Sets for 3D Classification and Segmentation", 2016 [2] J. Masci et al. "Geodesic convolutional neural networks on Riemannian manifolds", 2015 [1] [1] [1]
  • 5. Agenda • Methods of 3D Deep Learning • Euclidean vs Non-Euclidean • Euclidean Method • Projections / Multi-View • Voxel • Non-Euclidean Method • Point Cloud / Mesh / Graph • Accuracy • Dataset / Material • Appendix • Mesh Generation • Laplacian on Graph • Correspondence
  • 6. 3D Data • 3D Data Point Cloud Mesh Point Cloud Mesh Graph Vertex 〇 〇 〇 Face - 〇 - Edge - - 〇 [ 𝑥0, 𝑦0, 𝑧0 , … , 𝑥 𝑁, 𝑦 𝑁, 𝑧 𝑁 ] [ 𝑉00, 𝑉01, 𝑉02 , … , 𝑉𝐹0, 𝑉𝐹1, 𝑉𝐹2 ] [ 𝑉00, 𝑉01 , 𝑉01, 𝑉02 , … , 𝑉𝐸2, 𝑉𝐸1 ] 𝒱 𝒱 ℱ 𝒱 ℰ Graph
  • 7. Representation • Representation of 3D data [1] E. Ahmed et al, "A survey on Deep Learning Advances on Different 3D Data Representations", 2018 [1]
  • 8. Representation • Representation of 3D data Euclidean Non-Euclidean Grid (Translational invariant) Non-Grid (not Translational invariant) Local / Intrinsic Global / Extrinsic Point of view from 3D Point of view from 2D Surface 3D2D Rigid Small deformation Non-Rigid Large deformation 3D CNN2D CNN Non-trivial CNN 〇 × [1] [1] J. Masci et al. "Geodesic convolutional neural networks on Riemannian manifolds", 2015 [1]
  • 9. Euclidean vs Non-Euclidean • Euclidean [1] E. Ahmed et al, "A survey on Deep Learning Advances on Different 3D Data Representations", 2018 [1]
  • 10. Euclidean vs Non-Euclidean • Euclidean (detail of feature) Feature Merit Demerit Descriptors Extraction of 3D topological feature (SHOT, PFH, etc.) • Can convert as feature • Can each problem The geometric properties of the shape is lost. Projections Projection of 3D to 2D - The geometric properties of the shape is lost. RGB-D RGB + Depth map • Can use data from RGB-D sensors (Kinect/realsence) as input • Need depth map. • Only infer some of the 3D properties based on the depth. Volumetric Voxelization • Expansion of 2D CNN • Need Large memories. • (grid information) • Need high resolution for detailed shapes. (e.g. segmentation) Multi-View 2D images from multi-angles • Highest accuracy in Euclidean method • Need multi-view images
  • 11. Euclidean vs Non-Euclidean • Non-Euclidean Point Cloud Mesh Graph Unordered point cloud No connected information b/w point cloud Connected information b/w point cloud Graph (Vertex, edge) Dependence of noise and density of point cloud Need to convert from point cloud to mesh Need to create graph type [ 𝑥0, 𝑦0, 𝑧0 , 𝑥1, 𝑦1, 𝑧1 ] [ 𝑥1, 𝑦1, 𝑧1 , 𝑥0, 𝑦0, 𝑧0 ] 𝒱 𝒱 ℱ 𝒱 ℰ [1] R. Hanocka et al., "MeshCNN: A Network with an Edge", 2018 [1]
  • 12. Euclidean vs Non-Euclidean • Non-Euclidean (detail of feature) Feature Merit Demerit Point Cloud • Treat point cloud • Need to keep translational and rotational invariance • Treat unordered point cloud • No connected information b/w point cloud • Original data is often point cloud. • e.g. scanned data (No CAD data, Terrain data) • Civil engineering, architecture, medical care, fashion • Treat noise • Dependence of density of point cloud • Complement b/w point cloud • Cannot distinguish b/w close point cloud Mesh • Treat mesh data • Connected information b/w point cloud • Convert mesh data to structure for applying CNN • CAD data • e.g. design in manufacturing • Can keep geometry in few mesh • Convert point cloud to mesh data Graph • Treat mesh as graph • Vertex (node) • Edge (connected information b/w point cloud) • Same as Mesh • Create graph type CNN (non-trivial)
  • 13. Euclidean vs Non-Euclidean • Non-Euclidean (detail of feature) Feature Merit Demerit Point Cloud • Treat point cloud • Need to keep translational and rotational invariance • Treat unordered point cloud • No connected information b/w point cloud • Original data is often point cloud. • e.g. scanned data (No CAD data, Terrain data) • Civil engineering, architecture, medical care, fashion • Treat noise • Dependence of density of point cloud • Complement b/w point cloud • Cannot distinguish b/w close point cloud Mesh • Treat mesh data • Connected information b/w point cloud • Convert mesh data to structure for applying CNN • CAD data • e.g. design in manufacturing • Can keep geometry in few mesh • Convert point cloud to mesh data Graph • Treat mesh as graph • Vertex (node) • Edge (connected information b/w point cloud) • Same as Mesh • Create graph type CNN (non-trivial)
  • 14. Euclidean • Representation of 3D data [1] E. Ahmed et al, "A survey on Deep Learning Advances on Different 3D Data Representations", 2018 [1]
  • 15. Euclidean • Each Euclidean Method (Projections / RGB-D / Volumetric / Multi-View) Method Application Link Deep Pano Classification Paper Two-stream CNNs on RGB-D Classification Paper VoxNet Classification Paper GitHub(Keras) MVCNN Classification Retrieval Paper GitHub(PyTorch/TensorFlow etc.)
  • 16. Euclidean • Deep Pano [1] • Projection to Panoramic image • Row-wise max-pooling for rotational invariant Panoramic image [1] B. Shi et al. "DeepPano: Deep Panoramic Representation for 3D Shape Recognition", 2017 [1]
  • 17. Euclidean • Two-stream CNNs on RGB-D [1] • Concatenate CNN of RGB and CNN of depth map Concatenation[1] [1] A. Eitel et al. "Multimodal Deep Learning for Robust RGB-D Object Recognition", 2015
  • 18. Euclidean • VoxNet [1] • Voxelization of 3D point cloud to voxel • Not robust for data loss Voxelization Point Cloud Voxel [1] D. Maturana et al. "VoxNet: A 3D Convolutional Neural Network for Real-Time Object Recognition", 2015 [1]
  • 19. Euclidean • MVCNN [1] • Merge CNN of each images [1] H. Su et al. "Multi-view Convolutional Neural Networks for 3D Shape Recognition", 2015 [1]
  • 20. Non-Euclidean (Point Clouds) • Representation of 3D data [1] E. Ahmed et al, "A survey on Deep Learning Advances on Different 3D Data Representations", 2018 [1]
  • 21. Non-Euclidean (Point Clouds) • Each Non-Euclidean Method (Point Cloud) Method Application Link PointNet Classification Segmentation Retrieval Correspondence Paper GitHub (TensorFlow) PointNet++ Classification Segmentation Retrieval Correspondence Paper GitHub (TensorFlow) PyTorch-geometric (PointConv) Dynamic Graph CNN (DGCNN) Classification Segmentation Paper GitHub (PyTorch/TensorFlow) PyTorch-geometric (DynamicEdgeConv) PointCNN Classification Segmentation Paper GitHub (TensorFlow) PyTorch-geometric (XConv) ※Some equations from following pages are referred to the documents in PyTorch-geometric. (https://pytorch-geometric.readthedocs.io/en/latest/modules/nn.html) I will explain PyTorch-geometric in later page.
  • 22. Non-Euclidean (Point Clouds) • PointNet [1] • Treat unordered point cloud by max-pooling • Comparison b/w PointNet++ • Detailed information is lost • Cannot treat different density of point cloud Part segmentation (Per-point classification) Predict Affine transformation (Transrational, Rotational Invariance) Similar to Spatial Transformer Networks in 2D Classification 𝑓 𝑥1, ⋯ , 𝑥 𝑛 = 𝑔(ℎ 𝑥1 , ⋯ , ℎ 𝑥 𝑛 ) Max-poolingInput feature MLP Symmetry Function Global + Local Feature Randomly rotating the object along up-axis, Normalization in unit square Affine transformation [1] [1] C. R. Qi, "PointNet: Deep Learning on Point Sets for 3D Classification and Segmentation", 2016
  • 23. Non-Euclidean (Point Clouds) • PointNet • T-Net [1] • Similar to Spatial Transformer Networks in 2D • Spatial Transformer Networks • Alignment of image (transformation, rotation, distortion etc.) by spatial transformation • Learn affine transformation from input data (not necessarily special data) • Can insert this networks at each point b/w networks Reference Contents Paper Original Paper Sample (PyTorch) Dataset : MNIST [1] M. Jaderberg et al. "Spatial transformer networks",2015 [1]
  • 24. Non-Euclidean (Point Clouds) • PointNet • Spatial Transformer Networks • Localization net : output parameters 𝜃 to transform for input feature map 𝑈 • Combination of Conv, MaxPool, ReLU, FC • Output : 2 × 3 • Grid generator : create sampling grid by using the parameters • Sampler : Output transformed feature map 𝑉 • pixel Spatial Transformer Networks (2D) Grid generator Input map to transformed map Input feature map Output feature map 𝑥𝑖 𝑠 𝑦𝑖 𝑠 = 𝒯𝜃 𝐺𝑖 = 𝐴 𝜃 𝑥𝑖 𝑡 𝑦𝑖 𝑡 1 2 × 3 [1] M. Jaderberg et al. "Spatial transformer networks",2015 [1][1] [1]
  • 25. Non-Euclidean (Point Clouds) • PointNet • T-Net • 3D ver. of Spatial Transformer Networks in 2D • Not need sampling grid (There are no gird structure in 3D) • Directly apply transformation to each point cloud • Output parameter • 3 × 3 in first T-Net • 64 × 64 in second T-Net T-Net (input feature : 3) T-Net (input feature : 64) [1] [1] C. R. Qi, "PointNet: Deep Learning on Point Sets for 3D Classification and Segmentation", 2016
  • 26. Non-Euclidean (Point Clouds) • PointNet++ [1] • Comparison b/w PointNet • Detailed information is kept • Can treat different density of point cloud Concatenation of multi-resolution [1] C. R. Qi et al. "Pointnet++: Deep hierarchical feature learning on point sets in a metric space", 2017 [1]
  • 27. Non-Euclidean (Point Clouds) • PointNet++ • Set abstraction • Grouping in one scale + feature extraction • Sampling Layer : Extraction of sampling points by farthest point sampling (FPS) • Grouping Layer : Grouping points around sampling points • PointNet Layer : Applying PointNet Sampling Layer Grouping Layer 𝑟
  • 28. Non-Euclidean (Point Clouds) • PointNet++ • Point Feature Propagation for segmentation • Interpolation : interpolation from k neighbor points • Concatenation Interpolation 𝑓 𝑗 𝑥 = 𝑖=1 𝑘 𝑤𝑖 𝑥 𝑓𝑖 (𝑗) 𝑖=1 𝑘 𝑤𝑖 𝑥 𝑘 = 3 𝑤𝑖 𝑥 = 1 𝑑 𝑥, 𝑥𝑖 2 𝑑 𝑥 𝑥𝑖 𝑥 feature weight Inverse of distance[1] C. R. Qi et al. "Pointnet++: Deep hierarchical feature learning on point sets in a metric space", 2017 [1]
  • 29. Non-Euclidean (Point Clouds) • PointNet++ • Single scale grouping • Multi scale/resolution grouping • Combination of features from different scales • Robust for non-uniform sampling density • Modifying architecture in set abstraction level 𝐿𝑖−1 Level 𝐿𝑖 Level Original Points multi-resolution grouping (MRG) Concatenation of information of multi-resolutionHigh computational cost multi-scale grouping (MSG) Recommendation [1] C. R. Qi et al. "Pointnet++: Deep hierarchical feature learning on point sets in a metric space", 2017 [1] [1]
  • 30. Non-Euclidean (Point Clouds) • PointNet++ • Detail of architecture • Note: #vertex is fixed 𝑆𝐴 512, 0.2, 64, 64, 128 → 𝑆𝐴 128, 0.2, 64, 64, 128 → 𝑆𝐴 256,512,1024 #𝑖𝑛𝑝𝑢𝑡 𝑣𝑒𝑟𝑡𝑒𝑥: 1024 → 𝐹𝐶 512,0.5 → 𝐹𝐶 256,0.5 → 𝐹𝐶(𝐾) class → 𝐹𝑃 256, 256 → 𝐹𝑃 256,128 → 𝐹𝑃(128,128,128,128, 𝐾) per point segmentation 1024/2 512/4 Architecture for classification and part segmentation of ModelNet using single scale grouping 𝑆𝐴 𝐾, 𝑟, ℓ1, ⋯ , ℓ 𝑑 #vertex radius Pointnet (#FC:d) 𝑆𝐴 ℓ1, ⋯ , ℓ 𝑑 Set abstraction level Global Set abstraction level #FC:d Convert single vector by maxpooling For classification For part segmentation Same in cls. and seg. 𝐹𝐶 ℓ, 𝑑𝑝 𝐹𝑃 ℓ1, ⋯ , ℓ 𝑑 Fully Connected Feature Propagation Channel Ratio of dropout #FC:d
  • 31. Non-Euclidean (Point Clouds) • PointNet++ • Detail of architecture • Note: #vertex is fixed #𝑖𝑛𝑝𝑢𝑡 𝑣𝑒𝑟𝑡𝑒𝑥: 1024 Architecture classification of ModelNet using multi-resolution grouping (MRG) 𝑆𝐴 512, 0.2, 64, 64, 128 → 𝑆𝐴 64, 0.4, 128, 128, 256 𝑆𝐴 512, 0.4, 64, 128, 256 𝑆𝐴 64, 128, 256,512 𝑆𝐴 256,512,1024 Concat. Concat. → 𝐹𝐶 512,0.5 → 𝐹𝐶 256,0.5 → 𝐹𝐶(𝐾) Same as single scale grouping class
  • 32. Non-Euclidean (Point Clouds) • Dynamic Graph CNN (DGCNN) [1] • PointNet + w/ Edge Conv. • Edge Conv. • Create local edge structure dynamically (not fixed in each layer) Edge Conv. PointNet+ w/ Edge Conv. 𝒙𝑖′ = 𝑗∈𝑁(𝑖) ℎ 𝚯(𝒙𝑖, 𝒙𝑗 − 𝒙𝑖) global local Search neighbors in feature space by kNN [1] Y. Wang, "Dynamic Graph CNN for Learning on Point Clouds", 2018 [1] [1]
  • 33. Non-Euclidean (Point Clouds) • PointCNN [1] • Downsampling information from neighborhoods into fewer representative points Χ-Conv. Lower resolution, deeper channels Decreasing #representative points, deeper channels 𝒙𝑖′ = 𝐶𝑜𝑛𝑣 𝑲, 𝛾Θ 𝑷𝑖 − 𝒑𝑖 × ℎΘ 𝑷𝑖 − 𝒑𝑖 , 𝒙𝑖 Input feature MLP applied individually on each point like PointNet Kernel Concatenation [1] Y. Li et al. "PointCNN: Convolution On X-Transformed Points", 2018 [1] [1]
  • 34. Non-Euclidean (Mesh) • Representation of 3D data [1] E. Ahmed et al, "A survey on Deep Learning Advances on Different 3D Data Representations", 2018 [1]
  • 35. Non-Euclidean (Mesh) • Each Non-Euclidean Method (Mesh) Method Application Link MeshCNN Classification Segmentation Paper GitHub (PyTorch) MeshNet Classification Paper GitHub (PyTorch)
  • 36. Non-Euclidean (Mesh) • MeshCNN [1] • Edge collapse by pooling • Can apply only the manifold mesh use in segmentation EdgeEdge collapse by pooling Input feature Angle Length Pooling / Unpooling [1] R. Hanocka et al., "MeshCNN: A Network with an Edge", 2018 [1] [1] [1]
  • 37. Non-Euclidean (Mesh) • MeshNet • Input feature • Center, corner, normal, neighbor index Information of neighborhood of face Mesh Conv. (Combination + Aggregation) [1] Y. Feng et al. "MeshNet: Mesh Neural Network for 3D Shape Representation", 2018 [1] [1] [1]
  • 38. Non-Euclidean (Graph) • Representation of 3D data [1] E. Ahmed et al, "A survey on Deep Learning Advances on Different 3D Data Representations", 2018 [1]
  • 39. Non-Euclidean (Graph) • Each Non-Euclidean Method (Graph) • Spectral / Spatial Method Spectral Spatial Euclidean (1D) Non-Euclidean (Manifold) ∆𝜙𝑖 = 𝜆𝑖 𝜙𝑖 𝜙𝑖 = 𝑒 𝑖𝜔𝑥 , 𝜆𝑖 = 𝜔2 Local coordinate 𝐷𝑗 𝑥 𝑓 = 𝑦∈𝑁(𝑥) 𝜔𝑗(𝒖 𝑥, 𝑦 )𝑓(𝑦) 𝑓 ∗ 𝑔 𝑥 = 𝑗=1 𝐽 𝑔𝑗 𝐷𝑗 𝑥 𝑓 Patch Operator Pseudo-coordinate Convolution Generalization of Fourier basis [1] M. M. Bronstein et al., "Geometric deep learning: going beyond Euclidean data", 2016 [2] J. Masci et al. "Geodesic convolutional neural networks on Riemannian manifolds", 2015 [3] M. Fey et al. "SplineCNN: Fast Geometric Deep Learning with Continuous B-Spline Kernels", 2017 [1] [1] [3] [2]
  • 40. Non-Euclidean (Graph) • Each Non-Euclidean Method (Graph) • Spatial method is more useful than spectral method. Method Structure Feature Spectral • Fourier basis in manifold • Laplacian eigenvalue/eigenvector • Spectral filter coefficients is base dependent in some method • No locality in some method • High computational cost Spatial • Create local coordinate • Patch operator + Conv. • Locality • Efficient computational cost ※Some equations from following pages are referred to the documents in PyTorch-geometric. (https://pytorch-geometric.readthedocs.io/en/latest/modules/nn.html) I will explain PyTorch-geometric in later page.
  • 41. Non-Euclidean (Graph) • Each Non-Euclidean Method (Graph) • Spectral, Spectral free Method Method Application Link Spectral CNN Spectral Graph Paper Chebyshev Spectral CNN (ChebNet) Spectral free Graph Paper GitHub (TensorFlow) PyTorch-geometric (ChebConv) Graph Convolutional Network (GCN) Spectral free Graph Paper PyTorch-geometric (GCNConv) Graph Neural Network (GNN) Spectral free Graph Paper
  • 42. Non-Euclidean (Graph) • Spectral CNN [2] • cannot use different shape • Spectral filter coefficients is base dependent • High computational cost • No locality ∆𝑓 𝑖 ∝ 𝑖,𝑗 ∈ℰ 𝜔𝑖𝑗(𝑓𝑖 − 𝑓𝑗)Laplacian Different shape -> different basis -> different result 𝒇ℓ 𝑜𝑢𝑡 = 𝜉 ℓ′=1 𝑝 𝚽 𝑘 𝑮ℓ,ℓ′ 𝚽 𝑘 𝑇 𝒇ℓ′ 𝑖𝑛 Laplacian eigenvectorReLU [1] M. M. Bronstein et al., "Geometric deep learning: going beyond Euclidean data", 2016 [2] J. Bruna et al. "Spectral Networks and Locally Connected Networks on Graphs", 2013 [1] [1]
  • 43. Non-Euclidean (Graph) • Chebyshev Spectral CNN (ChebNet) [1] • Not calculate Laplacian eigenvectors directly • Locality (K hops) • Approximate filter as polynomial • Graph Convolutional Network (GCN) [2] • Special ver. of ChebNet (𝐾 = 2) 𝑋′ = 𝑘=0 𝐾−1 𝑍(𝑘) ⋅ Θ(𝑘) 𝑍(0) = 𝑋 𝑍(1) = 𝐿 ⋅ 𝑋 𝑍(𝑘) = 2 ⋅ 𝐿 ⋅ 𝑍 𝑘−1 − 𝑍(𝑘−2) 𝐿:scaled and normalized Laplacian [1] M. Defferrard et al. "Convolutional Neural Networks on Graphs with Fast Localized Spectral Filtering", 2016 [2] T. N. Kipf et al. "Semi-Supervised Classification with Graph Convolutional Networks", 2016
  • 44. Non-Euclidean (Graph) • Each Non-Euclidean Method (Graph) • Charting Application Link Geodesic CNN Mesh Shape retrieval / correspondence Paper Anisotropic CNN Mesh / point cloud Shape correspondence Paper MoNet Graph / mesh / point cloud Shape correspondence Paper PyTorch-geometric (GMMConv) SplineCNN Graph / Mesh Classification Shape correspondence Paper GitHub (PyTorch) PyTorch-geometric (SplineConv) FeaStNet Graph / Mesh Shape correspondence Segmentation Paper PyTorch-geometric (FeaStConv)
  • 45. Non-Euclidean (Graph) • Geodesic CNN (GCNN)[1] ⊂ Anisotropic CNN (ACNN)[2] ⊂ MoNet [3] MoNet 𝐷𝑗 𝑥 𝑓 = 𝑦∈𝑁(𝑥) 𝜔𝑗(𝒖 𝑥, 𝑦 )𝑓(𝑦) 𝑓 ∗ 𝑔 𝑥 = 𝑗=1 𝐽 𝑔𝑗 𝐷𝑗 𝑥 𝑓 Patch Operator Pseudo-coordinate Convolution 𝜔𝑗 𝒖 = exp − 1 2 𝒖 − 𝜇 𝑗 T Σj −1 (𝒖 − 𝜇 𝑗) 𝜔𝑗 𝒖 = exp − 1 2 𝒖 𝑇 𝑹 𝜃 𝑗 𝛼 0 0 1 𝑹 𝜃 𝑗 𝑇 𝒖ACNN GCNN 𝜔𝑗 𝒖 = exp − 1 2 𝒖 − 𝑢𝑗 T 𝜎𝜌 2 0 0 𝜎 𝜃 2 (𝒖 − 𝑢𝑗) Rotation of 𝜃 to the maximum curvature direction The degree of anisotropy covariance (radius, angle direction) Learning parameters [1] J. Masci et al. "Geodesic convolutional neural networks on Riemannian manifolds", 2015 [2] D Boscaini et al. "Learning shape correspondence with anisotropic convolutional neural networks", 2016 [3] F. Monti et al. "Geometric deep learning on graphs and manifolds using mixture model CNNs", 2016 [4] M. Fey et al. "SplineCNN: Fast Geometric Deep Learning with Continuous B-Spline Kernels", 2017 [1] [4]
  • 46. Non-Euclidean (Graph) • Geodesic CNN (GCNN) • Create local coordinate • Do not verify the meaningful chart (need to create small radius chart) • Anisotropic CNN (ACNN) • Fourier basis is based on anisotropic heat diffusion eq. • MoNet • Learn filter as parametric kernel • Generalization of geodesic CNN and anisotropic CNN
  • 47. Non-Euclidean (Graph) • SplineCNN [1] • Filter based on B-spline function • Efficient computational cost 𝒙𝑖′ = 1 |𝑁 𝑖 | 𝑗∈𝑁(𝑖) 𝒙𝑖 ⋅ ℎ 𝚯(𝒆𝑖,𝑗) Weighted B-Spline basis [1] M. Fey et al. "SplineCNN: Fast Geometric Deep Learning with Continuous B-Spline Kernels", 2017 [1]
  • 48. Non-Euclidean (Graph) • FeaStNet [1] • Dynamically determine relation b/w filter weight and local graph of a node 𝒙𝑖′ = 1 |𝑁 𝑖 | 𝑗∈𝑁(𝑖) 𝑚=1 𝑀 𝑞 𝑚 𝒙𝑖, 𝒙𝑗 𝑾 𝑚 𝒙𝑗 Filter (e.g. 𝑀 = 3 × 3 = 9) Euclidean FeaStNet Weight Input Output Input Output #neighbor (e.g. 𝑁 = 6) 𝑞 𝑚 𝒙𝑖, 𝒙𝑗 = 𝑠𝑜𝑓𝑡𝑚𝑎𝑥𝑗 𝒖 𝒎 𝑻 𝒙𝒊 − 𝒙𝒋 + 𝑐 𝑚 𝒙𝑖′ = 𝑚=1 𝑀 𝑾 𝑚 𝒙 𝑛(𝑚,𝑖) pixel D input featureE output feature [1] N Verma et al. "FeaStNet: Feature-Steered Graph Convolutions for 3D Shape Analysis", 2017 [1][1]
  • 49. Non-Euclidean (Graph) • PyTorch-geometric • https://github.com/rusty1s/pytorch_geometric • Library based on PyTorch • For point cloud, mesh (not only graph) • Include Point cloud, graph-type approach code • PointNet++, DGCNN, PointCNN • ChebNet, GCN, MoNet, SplineCNN, FeaStNet • Easy to get the famous sample data and transform same data format • ModelNet, ShapeNet, etc. • Many example and benchmark
  • 50. Accuracy • Accuracy (Classification) • around 90% in any method (except VoxNet) Method “ModelNet40” Overall Acc. [%] / Mean class Acc. [%] “SHREC” Overall Acc. [%] VoxNet 85.9 / 83.0 − MVCNN − / 90.1 𝟗𝟔. 𝟎𝟗 PointNet 89.2 / 86.0 − PointNet++ 90.7 / − − DGCNN 𝟗𝟐. 𝟗 / 𝟗𝟎. 𝟐 − PointCNN 92.2 / 88.1 − MeshNet − / 91.9 − MeshCNN − / − 91.0 ※Please refer the detail in each paper (mentioned in each page)
  • 51. Accuracy • Accuracy (Segmentation) Method Part segmentation “ShapeNet” mIoU (mean per-class part-averaged IoU) [%] Part segmentation “ScanNet” Acc. [%] Part segmentation “COSEG” Acc. [%] Scene segmentation “S3DIS” Acc. [%] / mIoU [%] Human body segmentation “including SCAPE, FAUST etc.” Acc. [%] PointNet 80.4 (57.9 − 95.3) 73.9 54.4 − 91.5 78.6 / − 90.77 PointNet++ 81.9 (58.7 − 95.3) 84.5 79.1 − 98.9 − / − − DGCNN 82.3 (𝟔𝟑. 𝟓 − 𝟗𝟓. 𝟕) − − 𝟖𝟒. 𝟏 / 56.1 − PointCNN 𝟖𝟒. 𝟔 𝟖𝟓. 𝟏 − − / 𝟔𝟓. 𝟑𝟗 − MeshCNN − − 𝟗𝟕. 𝟓𝟔 − 𝟗𝟗. 𝟔𝟑 − / − 𝟗𝟐. 𝟑𝟎 FeaStNet 81.5 − − − / − − ※Please refer the detail in each paper (mentioned in each page)
  • 52. Dataset • 3D Dataset Contents Data Format Purpose PyTorch-geometric ModelNet10/40 3D CAD Model (10 or 40 classes) Mesh (.OFF) Classification ModelNet ShapeNet 3D Shape Point Cloud (.pts) Segmentation ShapeNet ScanNet Indoor Scan Data Mesh (.ply) Segmentation - S3DIS (original, .h5) Indoor Scan Data Point Cloud Segmentation S3DIS ScanNet:registration required S3DIS : registration required (for original)
  • 53. Dataset • 3D Dataset Contents Data Format Purpose PyTorch-geometric SHREC many type for each contest - Retrieval - SHREC2016 Animal, Human (Part Data) Mesh (.OFF) Correspondence SHREC2016 TOSCA Animal, Human Mesh (same #vertices at each category, separate file of vertices and triangles) Correspondence TOSCA PCPNet 3D Shape Point Cloud (.xyz) (Including normal, curvature files.) Estimation of local shape (Normal, curvature) PCPNet FAUST Human body Mesh Correspondence FAUST FAUST(Note) : registration required
  • 54. Material • Material of 3D deep learning (3D / point cloud) Paper Comment A survey on Deep Learning Advances on Different 3D Data Representations • Review of 3D Deep Learning • Easier to read it • Written from point of view about Euclidean and Non-Euclidean method Paperwithcode • Paper w/ code about 3D Point Cloud Deep Learning Survey Ver. 2 • Deep learning for point cloud • Survey of many papers
  • 55. Material • Material of 3D deep learning (graph) Paper Comment Geometric deep learning: going beyond Euclidean data • Review of geometric deep learning Geometric Deep Learning • summary of paper and code about geometric deep learning Geometric Deep Learning on Graphs and Manifolds (NIPS2017) • Presentation (youtube) about geometric deep learning
  • 56. Summary • There are many methods of 3D deep learning. • Two main method • Euclidean vs Non-Euclidean • Euclidean Method • Projections / Multi-View / Voxel • Non-Euclidean Method • Point Cloud / Mesh / Graph • Each method have merit and demerit. • We need to choose the better method for each data type and application. • The research about 3D deep learning is growing.
  • 57. Appendix • Appendix • Mesh Generation • Laplacian on Graph • Correspondence
  • 58. Appendix : Mesh Generation • Mesh Generation • In this material, I have summarized these materials. Link Contents 点群面張り(精密工学会) • Surface reconstruction メッシュ処理(精密工学会) • Mesh processing CV勉強会@関東発表資料 点群再構成に関するサーベイ • Survey of point cloud reconstruction
  • 59. Appendix : Mesh Generation • Difficulty of Mesh Generation Processing Difficulty Pre-processing Reduction of Noise / Missing / Abnormal value / density difference of vertices Post-processing Mesh smoothing / hole filling Ground Truth Noise / Abnormal value Missing / density difference of vertices Mesh smoothing / hole filling
  • 60. Appendix : Mesh Generation • Kinds of Mesh Generation Kind Feature Classification of the method Direct Triangulation Direct mesh generation form point cloud Explicit method Surface Smoothness Smooth surface mesh from point cloud Implicit method Direct Triangulation Surface Smoothness
  • 61. Appendix : Mesh Generation • Classification of the method • In general, it is easier to use the implicit method, since there are noise of point cloud. Classification of the method Information to use Influence of noise and density of vertices Guarantee of accuracy Explicit method Vertices Large (error of vertices = error of meshes) ◎ Implicit method Meshes based on isosurface of function fields which is calculated from vertices Small 〇
  • 62. Appendix : Mesh Generation • Kinds of Mesh Generation (Detail) • Direct Triangulation (example of built-in function in MeshLab) Method Feature Voronoi-Based Surface Reconstruction Creation of Delaunay diagram adding the vertices using Voronoi diagram Ball-Pivoting Algorithm Roll the ball over the point cloud and generate mesh from the point cloud located within a certain distance
  • 63. Appendix : Mesh Generation • Voronoi-Based Surface Reconstruction • Voronoi diagram • Region divided by the bisector of each vertices (in 2D) • Delaunay triangulation • Triangulation by connection of vertices bisector Vertices (S) Voronoi Vertices (V) Delaunay triangulation of S and V (black + red line) Example of 2D Surface (black line) [1] N. Amenta et al. "A New Voronoi-Based Surface Reconstruction Algorithm", 1998 [1] [1]
  • 64. Appendix : Mesh Generation • Ball-Pivoting Algorithm Not created Close point cloudSparse data Not created Ideal data [1] F Bernardini et al. "The Ball-Pivoting Algorithm for Surface Reconstruction", 1999 [1]
  • 65. Appendix : Mesh Generation • Kinds of Mesh Generation (Detail) • Surface Smoothness (example of built-in function in MeshLab) Method Feature Signed distance function + Marching Cubes Creation of Signed distance function by using the distance b/w vertices and surface + Mesh generation by using Marching Cubes Screened Poisson surface reconstruction (Poisson surface reconstruction) Distinguish b/w inside and outside of surface by using Poisson eq.
  • 66. Appendix : Mesh Generation • Signed distance function + Marching Cubes Oriented tangent planes Estimated signed distance Output of modified marching cubes 𝑓 𝒑 = 𝒑 − 𝒐 ⋅ 𝒏 𝑓 𝒑 > 0 → 𝑜𝑢𝑡𝑠𝑖𝑑𝑒 𝑓 𝒑 < 0 → 𝑖𝑛𝑠𝑖𝑑𝑒 𝒑 𝒑 𝒐 𝒏 𝑓 𝒑 = 0 → 𝑠𝑢𝑟𝑓𝑎𝑐𝑒 [1] H. Hoppe et al. "Surface Reconstruction from Unorganized Points", 1992 [1]
  • 67. Appendix : Mesh Generation • Screened Poisson surface reconstruction • get Indicator Function by solving the Poisson eq. ∆𝜒 ≡ ∇ ⋅ ∇𝜒 = ∇ ⋅ 𝑽 Poisson eq. Poisson surface reconstruction Screened Poisson surface reconstruction Complement b/w point cloud [1] M. Kazhdan et al. "Poisson Surface Reconstruction", 2006 [2] [1] [2] M. Kazhdan et al. "Screened Poisson Surface Reconstruction", 2013
  • 68. Appendix : Laplacian on Graph • Laplacian on Graph [1] ∆𝑓 𝑖 = 1 𝑎𝑖 𝑖,𝑗 ∈ℰ 𝜔𝑖𝑗(𝑓𝑖 − 𝑓𝑗)Laplacian (𝒱, ℰ)Graph (undirected) 𝒱 = {1, ⋯ , 𝑛} ℰ ⊆ 𝒱 × 𝒱 𝑎𝑖 𝜔𝑖𝑗 weight div. 𝑑𝑖𝑣 𝐹 𝑖 = 1 𝑎𝑖 𝑗: 𝑖,𝑗 ∈ℰ 𝜔𝑖𝑗 𝐹𝑖𝑗 Grad. ∇𝑓 𝑖𝑗 = 𝑓𝑖 − 𝑓𝑗 Mesh 𝜔𝑖𝑗 = −ℓ𝑖𝑗 2 + ℓ𝑗𝑘 2 + ℓ 𝑘𝑖 2 8𝑎𝑖𝑗𝑘 + −ℓ𝑖𝑗 2 + ℓ𝑗ℎ 2 + ℓℎ𝑖 2 8𝑎𝑖𝑗ℎ = 1 2 (cot 𝛼𝑖𝑗 + cot 𝛽𝑖𝑗) 𝑎𝑖 = 1 3 𝑗𝑘: 𝑖,𝑗,𝑘 ∈ℱ 𝑎𝑖𝑗𝑘 𝑎𝑖𝑗𝑘 = 𝑠𝑖𝑗𝑘(𝑠𝑖𝑗𝑘 − ℓ𝑖𝑗)(𝑠𝑖𝑗𝑘 − ℓ𝑗𝑘)(𝑠𝑖𝑗𝑘 − ℓ 𝑘𝑖) 1/2 𝑠𝑖𝑗𝑘 = 1 2 (𝑎𝑖𝑗 + 𝑎𝑗𝑘 + 𝑎 𝑘𝑖) 𝑓: 𝒱 → ℝ, 𝐹: ℰ → ℝ ∆≡ −𝑑𝑖𝑣 ∇ →Laplacian eigenvalues 𝜆 > 0 [1] M. M. Bronstein et al., "Geometric deep learning: going beyond Euclidean data", 2016 [1]
  • 69. Appendix : Laplacian on Graph • Laplacian on Graph [1] Δ𝒇 = 𝑨−1 𝑫 − 𝑾 𝒇 𝒇 = 𝑓1, ⋯ , 𝑓𝑛 𝑇 𝑾 = (𝜔𝑖𝑗) 𝑨 = 𝑑𝑖𝑎𝑔(𝑎1, ⋯ , 𝑎 𝑛) 𝑫 = 𝑑𝑖𝑎𝑔 𝑗:𝑗≠𝑖 𝜔𝑖𝑗 Laplacian ∆ Condition Unnormalized graph Laplacian ∆= 𝑫 − 𝑾 𝐴 = 𝐼 Normalized Symmetry Laplacian ∆= 𝑰 − 𝑫− 𝟏 𝟐 𝑾𝑫 𝟏 𝟐 𝐴 = 𝐷 + Normalization Random walk Laplacian ∆= 𝑰 − 𝑫−1 𝑾 𝐴 = 𝐷 Laplacian (as matrix) [1] M. M. Bronstein et al., "Geometric deep learning: going beyond Euclidean data", 2016
  • 70. Appendix : Laplacian on Graph • Laplacian on Graph [1] • Convolution (𝑓 ∗ 𝑔)(𝑥) = 𝑖≥0 𝑓𝑖 𝑔𝑖 𝜙𝑖(𝑥) 𝒇 ∗ 𝒈 = 𝚽𝑑𝑖𝑎𝑔 𝑔 𝚽T 𝐟 𝒇 = 𝑓1, ⋯ , 𝑓𝑛 𝑇 𝒈 = ( 𝑔1, ⋯ , 𝑔 𝒏) 𝚽 = (𝜙1, ⋯ , 𝜙 𝑛) Matrix Conv. [1] M. M. Bronstein et al., "Geometric deep learning: going beyond Euclidean data", 2016
  • 71. Appendix :Correspondence • Correspondence [1] Query Reference input 𝑥𝑖 (𝑦1, ⋯ , 𝑦 𝑁) label Each query vertex has labels as all reference vertices Output (Probability) Correct Label Output (Probability) (𝑝1, 𝑝2, ⋯ , 𝑝 𝑁) (1,0, ⋯ , 0) One-hot vector Loss [1] M. M. Bronstein et al., "Geometric deep learning: going beyond Euclidean data", 2016 [1]