Self-supervised representation learning based on continuous CRF convolution and dual discriminator for 3D point clouds
Youness Abouqora
2022.07
Introduction
Introduction 3D representations
 multi-view images + 2D CNN
 volumetric data + 3D CNN
 point cloud + DL (CNN)?
 mesh data + DL (GNN)?
 depth image + …
Introduction point cloud
Advantages
 raw sensor data, e.g., LiDAR
 simple representation: N × (x, y, z, color, normal, …)
 better 3D shape capture
Why emerging?
 autonomous driving
 AR & VR
 robot manipulation
 geomatics
 3D face & medical applications
 AI-assisted shape design in 3D games and animation, etc.
 still an open problem
LiDAR sensors measure the time from when a pulse is sent to when it is received, producing a cloud of points.
Introduction point cloud analysis
 shape classification
 shape retrieval
 shape correspondence
 semantic segmentation
 normal estimation
 object detection
 keypoint detection
 …
Introduction datasets
 Princeton ModelNet: 1k
 ShapeNet Part: 2k
 PartNet models
Wu et al. 3D ShapeNets: A Deep Representation for Volumetric Shapes. CVPR.
Yi et al. A Scalable Active Framework for Region Annotation in 3D Shape Collections. TOG 2016.
Mo et al. PartNet: A Large-Scale Benchmark for Fine-Grained and Hierarchical Part-Level 3D Object Understanding. CVPR 2019.
Introduction datasets
 Stanford 3D indoor scene: 8k
 Semantic3D: 4 billion points in total
 KITTI: detection
 ScanNet: segmentation + detection
Armeni et al. 3D Semantic Parsing of Large-Scale Indoor Spaces. CVPR 2016.
Hackel et al. Semantic3D.net: A New Large-Scale Point Cloud Classification Benchmark.
Dai et al. ScanNet: Richly-Annotated 3D Reconstructions of Indoor Scenes. CVPR 2017.
Introduction some challenges
 Irregular (unordered): permutation invariance
 Robustness to rigid transformations: rotation, scale, translation
 Robustness to corruption, outliers, noise, and partial data
Related work – PointNet family
Related Work PointNet: permutation invariance
 Shared MLP + max pooling (a symmetric function)
 No local pattern capturing
Qi et al. PointNet: Deep Learning on Point Sets for 3D Classification and Segmentation. CVPR.
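The symmetric-function idea can be sketched in a few lines. This is a toy illustration, not PointNet's actual architecture: a hypothetical single-layer "shared MLP" with random weights, applied identically to every point, followed by max pooling. Because max is symmetric, the global feature is independent of point order.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical weights of a one-layer "shared MLP": the SAME matrix is
# applied to every point independently, so per-point processing is
# order-agnostic by construction.
W = rng.standard_normal((3, 16))

def pointnet_global_feature(points):
    """points: (N, 3) array -> (16,) global feature.

    Shared MLP (one linear layer + ReLU) followed by max pooling,
    a symmetric function: the output is permutation-invariant."""
    per_point = np.maximum(points @ W, 0.0)   # shared MLP, applied per point
    return per_point.max(axis=0)              # symmetric aggregation

cloud = rng.standard_normal((128, 3))
shuffled = cloud[rng.permutation(128)]

# Identical global feature for any ordering of the same points.
assert np.allclose(pointnet_global_feature(cloud),
                   pointnet_global_feature(shuffled))
```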
Related Work PointNet++: local to global
 Sampling + grouping + PointNet
 Only similar to a CNN at the framework level
Qi et al. PointNet++: Deep Hierarchical Feature Learning on Point Sets in a Metric Space. NIPS.
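The sampling step is commonly implemented with farthest point sampling (FPS). A minimal greedy sketch on toy data (the algorithm only, not the paper's implementation): FPS spreads samples over the shape instead of oversampling dense clusters.

```python
import numpy as np

def farthest_point_sampling(points, k):
    """Greedy FPS: iteratively pick the point farthest from those already chosen.
    points: (N, 3) array; returns indices of k sampled points."""
    chosen = [0]                              # start from an arbitrary point
    dist = np.linalg.norm(points - points[0], axis=1)
    for _ in range(k - 1):
        nxt = int(dist.argmax())              # farthest from the current set
        chosen.append(nxt)
        # distance to the nearest chosen point, updated incrementally
        dist = np.minimum(dist, np.linalg.norm(points - points[nxt], axis=1))
    return np.array(chosen)

# A dense cluster near the origin plus spread-out corners: FPS skips the
# redundant cluster points and covers the extremes.
pts = np.array([[0, 0, 0], [0.01, 0, 0], [0.02, 0, 0],
                [1, 0, 0], [0, 1, 0], [1, 1, 0]], dtype=float)
idx = farthest_point_sampling(pts, 4)
assert set(idx.tolist()) == {0, 3, 4, 5}      # cluster points 1, 2 are skipped
```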
Related work – relation modeling
Related Work DGCNN
Dynamic Graph CNN (DGCNN): points that are close in the high-level feature space capture semantically similar structures, despite a large distance between them in the original 3D space.
Wang et al. Dynamic Graph CNN for Learning on Point Clouds.
Related Work DGCNN — EdgeConv
 Neighbors are found in feature space
 Learn from semantically similar structures
Wang et al. Dynamic Graph CNN for Learning on Point Clouds.
(figure: edge features combine global info and local info, aggregated with max)
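A minimal numpy sketch of the EdgeConv idea (learned MLP weights omitted): for each point, neighbors are found in feature space, and each edge feature concatenates the point itself (global info) with the offset to its neighbor (local info), aggregated with max.

```python
import numpy as np

def edge_conv(x, k=2):
    """Sketch of EdgeConv without learned weights.
    x: (n, d) per-point features. For each point x_i, find k nearest
    neighbors in FEATURE space (the dynamic graph), build edge features
    [x_i, x_j - x_i], and max-aggregate over neighbors."""
    n, d = x.shape
    diff = x[:, None, :] - x[None, :, :]
    dist = (diff ** 2).sum(-1)                # pairwise feature-space distances
    np.fill_diagonal(dist, np.inf)            # exclude self from neighbors
    knn = np.argsort(dist, axis=1)[:, :k]     # (n, k) neighbor indices
    edge = np.concatenate([np.repeat(x[:, None, :], k, axis=1),  # global info
                           x[knn] - x[:, None, :]], axis=-1)     # local info
    return edge.max(axis=1)                   # symmetric max aggregation

x = np.random.default_rng(1).standard_normal((8, 4))
out = edge_conv(x, k=3)
assert out.shape == (8, 8)                    # channels doubled: [x_i, x_j - x_i]
```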
Related Work self-attention
 Relation modeling: self-attention
 Gumbel Subset Sampling vs. Farthest Point Sampling:
— permutation-invariant
— operates in a high-dimensional embedding space
— differentiable
Yang et al. Modeling Point Clouds with Self-Attention and Gumbel Subset Sampling. CVPR.
Related Work self-attention
 Embedding: PointNet
 Self-attention: group convolution + channel shuffle + pre-activation
Yang et al. Modeling Point Clouds with Self-Attention and Gumbel Subset Sampling. CVPR.
Related Work self-attention
 Gumbel Subset Sampling: a multiple-point version of the discrete reparameterization trick
Maximilian et al. Attention-Based Deep Multiple Instance Learning. ICML 2018.
Yang et al. Modeling Point Clouds with Self-Attention and Gumbel Subset Sampling.
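The discrete reparameterization trick underlying Gumbel Subset Sampling is the Gumbel-softmax relaxation. A minimal single-draw sketch (the multiple-point subset extension is omitted): adding Gumbel noise to logits and applying a temperature-controlled softmax gives a differentiable, approximately categorical "selection".

```python
import numpy as np

def gumbel_softmax(logits, tau=0.5, rng=None):
    """Gumbel-softmax: a differentiable relaxation of sampling from a
    categorical distribution. Lower tau -> closer to a one-hot sample."""
    rng = rng or np.random.default_rng()
    u = rng.uniform(1e-9, 1.0, size=logits.shape)
    g = -np.log(-np.log(u))                   # Gumbel(0, 1) noise
    y = (logits + g) / tau
    y = np.exp(y - y.max())                   # numerically stable softmax
    return y / y.sum()

logits = np.array([2.0, 0.5, -1.0])
w = gumbel_softmax(logits, tau=0.5, rng=np.random.default_rng(0))
assert np.isclose(w.sum(), 1.0)               # soft, differentiable selection
assert (w >= 0).all()
```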
Related work – convolution on point cloud
Related Work Kernel Point Convolution
 kernel points:
Hugues et al. KPConv: Flexible and Deformable Convolution for Point Clouds. arXiv.
Related Work Kernel Point Convolution
 repulsive potential:
 attractive potential:
Hugues et al. KPConv: Flexible and Deformable Convolution for Point Clouds. arXiv.
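A sketch of KPConv's linear influence function only (weight matrices and features omitted): each neighbor's contribution is weighted by its correlation with every kernel point, h = max(0, 1 − ‖y_i − x_k‖/σ), so influence decays linearly and vanishes beyond σ.

```python
import numpy as np

def kernel_influence(neighbors, kernel_points, sigma=0.3):
    """KPConv-style linear correlation between neighbor positions and
    kernel points: h = max(0, 1 - ||y_i - x_k|| / sigma).
    Returns an (n_neighbors, n_kernel_points) influence matrix."""
    d = np.linalg.norm(neighbors[:, None, :] - kernel_points[None, :, :],
                       axis=-1)
    return np.maximum(0.0, 1.0 - d / sigma)

# Hypothetical kernel point layout and two neighbors (toy values).
kernel = np.array([[0.0, 0, 0], [0.25, 0, 0], [-0.25, 0, 0]])
neigh = np.array([[0.0, 0, 0], [0.5, 0, 0]])
h = kernel_influence(neigh, kernel, sigma=0.3)
assert h.shape == (2, 3)
assert np.isclose(h[0, 0], 1.0)               # neighbor exactly on a kernel point
assert h[1, 0] == 0.0                         # farther than sigma: no influence
```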
Overview – Graph convolutional networks
Overview Graph Convolutional Networks (GCN)
(figure: input graph; the target node aggregates messages from its neighbors)
In a GNN, we aggregate the neighbor messages by taking their weighted average. Can we do better?
Any differentiable function that maps a set of vectors to a single vector can be used as the Aggregate function.

Neighborhood aggregation:
$$\mathbf{h}_v^{k} = \sigma\!\left(\left[\, W_k \cdot \mathrm{Aggregate}\left(\{\mathbf{h}_u^{k-1} \mid u \in \mathcal{N}(v)\}\right),\; B_k \mathbf{h}_v^{k-1} \,\right]\right)$$

GCN defines the message passing function as
$$\mathbf{h}_v^{k} = \sigma\!\left( W_k \sum_{u \in \mathcal{N}(v) \cup \{v\}} \frac{\mathbf{h}_u^{k-1}}{\sqrt{|\mathcal{N}(u)|\,|\mathcal{N}(v)|}} \right)$$
Overview Graph Convolutional Networks (GCN)
Main idea: pass messages between pairs of nodes and aggregate.
Stack multiple layers like standard CNNs.
T. N. Kipf and M. Welling. Semi-Supervised Classification with Graph Convolutional Networks. ICLR, 2017.
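One GCN layer can be sketched directly from the aggregation rule. A minimal numpy version, using the Kipf-Welling variant where degrees are computed after adding self-loops (slightly different from counting only the original neighbors):

```python
import numpy as np

def gcn_layer(A, H, W):
    """One GCN layer: symmetric-normalized neighborhood aggregation,
    linear transform, then ReLU.
    A: (n, n) adjacency matrix, H: (n, d_in) node features, W: (d_in, d_out)."""
    A_hat = A + np.eye(A.shape[0])            # add self-loops: u in N(v) ∪ {v}
    deg = A_hat.sum(axis=1)
    D_inv_sqrt = np.diag(1.0 / np.sqrt(deg))
    A_norm = D_inv_sqrt @ A_hat @ D_inv_sqrt  # h_u / sqrt(deg(u) * deg(v))
    return np.maximum(A_norm @ H @ W, 0.0)    # aggregate, transform, ReLU

# Tiny 3-node path graph 0 - 1 - 2, one-hot features, identity weights
# (a hypothetical choice so the output exposes the normalization itself).
A = np.array([[0, 1, 0], [1, 0, 1], [0, 1, 0]], dtype=float)
H = np.eye(3)
W = np.eye(3)
out = gcn_layer(A, H, W)
assert out.shape == (3, 3)
assert np.isclose(out[0, 0], 0.5)             # self-term: 1/sqrt(2)·1/sqrt(2)
```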
Overview – Continuous conditional random fields
Overview Continuous CRF vs. Discrete CRF
 Discrete CRF (D-CRF): all output components range over a finite label alphabet.
 Continuous CRF (C-CRF): the output is relaxed to continuous values.
Energy functions
 A D-CRF usually employs the Potts model, with an indicator function for the pairwise terms.
 For a C-CRF, a quadratic cost function can be used to measure label compatibility.
Learning and inference
 Exact learning/inference for a D-CRF is usually intractable due to its discrete nature, requiring approximation techniques such as belief propagation, mean field, or Monte Carlo approaches.
 A C-CRF offers direct learning together with closed-form inference.
 D-CRFs are usually used as postprocessing to refine the labels.
 A C-CRF essentially participates in feature extraction by modeling data affinity in feature space.
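The closed-form inference can be illustrated with a standard quadratic energy (a common C-CRF choice, not necessarily the exact energy used here): E(y) = ‖y − r‖² + λ·yᵀLy, where r are unary predictions and L is the graph Laplacian of the affinity matrix. Setting the gradient to zero gives y* = (I + λL)⁻¹ r, one linear solve instead of iterative approximation.

```python
import numpy as np

def ccrf_inference(r, W_aff, lam=1.0):
    """Closed-form MAP inference for a C-CRF with quadratic energy
    E(y) = ||y - r||^2 + lam * y^T L y.
    r: (n,) unary predictions; W_aff: (n, n) symmetric affinity matrix."""
    L = np.diag(W_aff.sum(axis=1)) - W_aff    # graph Laplacian of affinities
    n = len(r)
    return np.linalg.solve(np.eye(n) + lam * L, r)

# Two strongly-affine nodes with different unary predictions: inference
# smooths the labels toward each other, in closed form.
r = np.array([0.0, 1.0])
W_aff = np.array([[0.0, 1.0], [1.0, 0.0]])
y = ccrf_inference(r, W_aff, lam=10.0)
assert abs(y[0] - y[1]) < abs(r[0] - r[1])    # labels pulled together
assert np.isclose(y.sum(), r.sum())           # mean preserved (1^T L = 0)
```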

Editor's Notes

  • #6 In recent years, a growing number of works have aimed to adapt deep learning methods, or to introduce new "deep" approaches, for classifying 3D point clouds. Among the approaches developed, four philosophies can be distinguished: (a) projecting the point cloud onto images, then classifying each image with image segmentation networks [2]; (b) projecting the cloud into a 3D occupancy grid, as in VoxNet [7] and [5]; (c) working on graphs of the point cloud via CRFs, as in SegCloud [11], or performing convolutions on graphs, as in SPGraph [6]; (d) directly taking