Slide for study session given by Dr. Daisuke Sato at Arithmer inc.
It is a summary of methods for semantic segmentation of 3D point clouds using 2D weakly-supervised learning.
Arithmer Inc. began at the University of Tokyo Graduate School of Mathematical Sciences. We apply modern mathematics and advanced AI systems to provide solutions across a wide range of fields. At Arithmer, we see it as our job to work out how to use AI well: improving work efficiency and producing results that are useful to society.
Weakly supervised semantic segmentation of 3D point clouds
1. 2021/7/18
Weakly Supervised Semantic Segmentation in 3D
Graph-Structured Point Clouds of Wild Scenes
Arithmer R3 team
Daisuke SATO
https://arxiv.org/abs/2004.12498v2
2. Introduction
Point clouds
● 3D sensors have developed rapidly in recent years.
○ The iPhone/iPad now ships with LiDAR.
○ RealSense and Azure Kinect are impressive given their low price.
● The raw data collected is a point cloud:
○ a collection of 3D points (x0, y0, z0), (x1, y1, z1), …, (xn, yn, zn).
● Processing point clouds is very important in robotics and surveying (測量).
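The point-cloud representation above can be sketched in a few lines: a cloud is just an unordered array of points, with optional per-point attributes stacked as extra columns (a minimal NumPy sketch with random data, not taken from the paper).

```python
import numpy as np

# A point cloud is an unordered set of 3D points: an (N, 3) array of
# positions (x, y, z). Per-point colors (and later normals) are simply
# stacked as additional columns.
rng = np.random.default_rng(0)
points = rng.uniform(-1.0, 1.0, size=(1000, 3))   # positions (x, y, z)
colors = rng.uniform(0.0, 1.0, size=(1000, 3))    # colors (r, g, b)
cloud = np.hstack([points, colors])               # shape (1000, 6)

print(cloud.shape)  # (1000, 6)
```

Because the set is unordered, any row permutation of `cloud` describes the same scene, which is why point-cloud networks must be permutation-invariant.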
3. Introduction
Task: semantic segmentation of point clouds
● Classify every point of a 3D point cloud.
○ With fully supervised learning, we would need to annotate each point.
○ Labeling point clouds is extremely laborious; labeling 2D images is much easier.
○ So: do weakly supervised learning with 2D projected images.
4. Introduction
Idea: weakly supervised learning with 2D projected images
● Input: a 3D point cloud.
● Output at inference stage: a 3D segmented point cloud.
● Labels at training stage: segmented 2D projected images.
5. Introduction
Note: projection of a 3D point cloud onto a 2D image
● The projection is described by two sets of camera parameters:
○ Intrinsic (internal) camera parameters
■ focal length (f)
■ principal point (c)
○ Extrinsic (external) camera parameters
■ translation (t)
■ rotation (R)
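The standard pinhole projection combining these parameters can be sketched as follows: transform a world-frame point into the camera frame with (R, t), apply the intrinsics (f, c), then divide by depth. The toy intrinsic values below are illustrative, not from the paper.

```python
import numpy as np

def project_points(points, K, R, t):
    """Project Nx3 world-frame points with a pinhole camera:
    intrinsics K (3x3), extrinsics R (3x3) and t (3,)."""
    cam = points @ R.T + t         # world frame -> camera frame
    uvw = cam @ K.T                # apply intrinsics (homogeneous)
    uv = uvw[:, :2] / uvw[:, 2:3]  # perspective divide by depth z
    return uv, cam[:, 2]           # pixel coordinates and depths

# Toy intrinsics: focal length f = 500 px, principal point c = (320, 240).
K = np.array([[500.0,   0.0, 320.0],
              [  0.0, 500.0, 240.0],
              [  0.0,   0.0,   1.0]])
R = np.eye(3)                  # identity rotation
t = np.array([0.0, 0.0, 2.0])  # camera 2 m in front of the origin

pts = np.array([[0.0, 0.0, 0.0],   # the origin projects to c
                [0.5, 0.0, 0.0]])
uv, depth = project_points(pts, K, R, t)
print(uv)  # [[320. 240.], [445. 240.]]
```

The world origin lands exactly on the principal point, confirming the role of (c); points off the optical axis are shifted by f·x/z pixels.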
6. Method
Model architecture
● Input: 3D point cloud.
● Graph convolution encoder (point cloud -> feature vectors).
● Segmentation branch (feature vectors -> segmentation mask).
● Visibility branch (feature vectors -> visibility mask).
● 2D projection (segmented point cloud + visibility mask -> 2D segmented image).
● Total loss is computed against the 2D labels.
7. Method
Perspective rendering
● Problem: multiple points can be projected onto a single pixel. Which class should the pixel receive?
● Solution: "semantic fusion" (a sophisticated voting scheme).
8. Experiments
Datasets
● SUNCG (synthetic dataset)
○ 40 classes (furniture)
○ 404,058 rooms
■ 55,000 2D rendering sets are created
● S3DIS (real-world dataset)
○ 13 classes (furniture)
○ 272 rooms
■ thousands of viewpoints are provided
● Each point has
○ position: (x, y, z)
○ color: (r, g, b)
● Normals (u, v, w) are also computed.
9. Experiments
Metrics
● mean accuracy over all classes (mAcc)
● overall accuracy (oAcc)
● mean per-class intersection-over-union (mIoU)
○ IoU = TP / (TP + FP + FN)
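The mIoU metric from the formula above can be computed directly from predicted and ground-truth label arrays (a minimal sketch with made-up labels; classes absent from both prediction and ground truth are skipped to avoid 0/0):

```python
import numpy as np

def miou(pred, target, n_classes):
    """Mean per-class IoU, with IoU_c = TP_c / (TP_c + FP_c + FN_c)."""
    ious = []
    for c in range(n_classes):
        tp = np.sum((pred == c) & (target == c))  # true positives
        fp = np.sum((pred == c) & (target != c))  # false positives
        fn = np.sum((pred != c) & (target == c))  # false negatives
        if tp + fp + fn == 0:     # class absent everywhere: skip
            continue
        ious.append(tp / (tp + fp + fn))
    return float(np.mean(ious))

pred   = np.array([0, 0, 1, 1, 2, 2])
target = np.array([0, 1, 1, 1, 2, 0])
print(miou(pred, target, 3))  # (1/3 + 2/3 + 1/2) / 3 = 0.5
```

Unlike oAcc, mIoU weights every class equally, so rare furniture classes count as much as walls and floors.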
13. Conclusion
● To the best of our knowledge, this is the first work to apply 2D supervision to 3D semantic point cloud segmentation of wild scenes without using any 3D pointwise annotations.
● Extensive experiments show that the proposed method achieves performance comparable to state-of-the-art 3D-supervised methods on the popular SUNCG and S3DIS benchmarks.