SlideShare a Scribd company logo
Real world data augmentations for
autonomous driving
N AVYA IN TRO DU CTIO N
2
Ambition level 4 for all our platforms
PASSENGERS TRANSPORT GOODS TRANSPORT
Au t o n o m ® S h u t t l e
Au t o n o m ® S h u t t l e
E v o
A u t o n o m ® T r a c t
A T 1 3 5
CUSTOM SELF-
DRIVING SOLUTION
D R I V E N B Y N AV YA
ML team at Navya principally works on:
Camera :
2D Object detection and drivable zone segm. (2D-OD, MTL)
Traffic light detection and relevancy (TLDR)
3D Monocular object detection (3D-MOD)
LiDAR :
Large scale semantic segmentation on pointclouds
Instance segmentation on pointclouds
Semantic Navya Dataset
N AVYA ML TE AM
D e e p l e a r n i n g m o d u l e s o n c a m e r a a n d L i D AR
3
Alexandre Almin Thomas Gauthier Hao Liu
Leo Lemarie Anh Duong
B Ravi Kiran
OVE RVIE W
4
What is a data augmentation (DA)
Categories of data augmentation (classification, detection, segmentation)
Theoretical frameworks for DA
Geometry preserving 2D-DA for 3D monocular detection
DA for 3D monocular object detection
Self supervision pretext tasks for 3D monocular detection
Evaluation on KITTI 3D detection dataset
DA for data redundancy in Active learning (AL) pipeline
Semantic segmentation on pointclouds
Building an AL pipeline for mining informative samples
Evaluation on Semantic-KITTI dataset
Generates augmented I/O pairs
Performs model regularization & reduce the effect of overfitting in
low dataset regime
Increases the diversity of small datasets
Fundamentally models invariance/equivariance to real world
transformations that generate samples of a dataset
Image vs pointcloud augmentations
Representations:
• Images conserve an inherent matrix representation
• Pointclouds are sets with arbitrary input domain size
Instance transform:
• Objects in pointclouds are separable from their background,
enabling for easy 3D transformations (rotation translation).
• Though this does require the transformation to account for
change in pointcloud density with change in
distance/orientation
DATA AU GME N TATIO N
A B r i e f r e v i e w
5
https://blog.waymo.com/2020/04/using-automated-data-augmentation-to.html
https://albumentations.ai/docs/ ,
https://imgaug.readthedocs.io/en/latest/
DA model invariances to represent data Anselmi, et al. 2016
Translation invariance already baked into CNNs due to convolutions
Rotations, Scaling, color transformations…
DA connection with kernel theory Dao, Tri, et al 2019
DA augmentations are seen as a Markov process with transitions defined per sample
DA are represented within a group G Jane H. Lee et al 2020
Augmented sample distribution gX are approximately invariant under the action of group elements g
X ~= gX, g from G
The probability of an augmentation sample is approximately equal to the original sample
WH Y DATA AU GME N TATIO N S WO RK
R e v i e w i n g t h e o r e t i c a l u n d e r s t a n d i n g
6
• Anselmi, et al. "On invarianceand selectivity in representation learning." Information and Inference: A Journalof the IMA 2016.
• Dao, Tri, et al. "A kernel theory of moderndata augmentation." InternationalConference onMachine Learning.PMLR, 2019.
• Chen, Shuxiao, Edgar Dobriban, and Jane H. Lee. "A group-theoretic frameworkfor data augmentation." JMLR21.245 (2020): 1-71.
3D-MOD is a key component of
obstacle detection pipeline
Redundancy & Fusion with LiDAR 3D detection pipeline
Stereo depth estimation pipelines are progressively
replaced with monocular depth estimation
Motivation : DA for 3D-MOD
Datasets for 3D-MOD are costly to create
Data augmentations on 2D object detector’s change
image geometry
View synthesis methods are robust, but heavy
How to reuse existing annotations to be a self-supervised
task ?
C A S E ST U DY 1 : 3 D M O N O C U L A R O B J E C T D E T E C T I O N ( 3 D - M O D )
E x p l o r i n g 2 D D a t a Au g m e n t a t i o n ( D A) f o r 3 D M o n o c u l a r O b j e c t D e t e c t i o n
7
Joint work: Sugirtha T, Sridevi M, Khailash Santhakumar, Hao Liu B Ravi Kiran, Thomas Gauthier, Senthil
Yogamani Accepted at ICCV Workshop on Self-supervised learning for Autonomouas Driving (SSLAD 2021)
Problem formulation : Data augmentation
What transformationsor augmentations could be
performed to (image, 2D-BB) pair that do not change
the depth, orientation or scale of the bounding box?
Problemformulation : Pretext SSL task
What scalable auxilliary task along with a methodology
to generate annotation could be added to the 3D-MOD
detector primary task ?
DATA AU GME N TATIO N S F O R O BJ E CT DE TE CTIO N
8
Weather
based
Label independent
image transforms
Bounding boxInstance based
2D boxes
Monocular 3D boxes
Mosaic
Flip
Affine
Geometry
preserving
Geometry
Aware*
Mosaic-Tile
Affine
Color Tx
Cutmix
Blur
Cutout
Mixup
Shadow
Rain , Snow
Fog, Flare
Gravel
Autumn
Box Mixup
Box
Cutpaste
View
synthesis
Camera
Postion
Apply X
tobox
Learning
based
RandAugment
FastAugment
RL-Search
Focal Length
Receptive
Field
Differentia
ble Neural
Rendering GAN based
Style Transfer
Style Transfer
Self-Supervised & Semi-supervised augmentations
Random
window
classifier
Jigsaw
Augmix
Adeversarial
Rotation
prediction
Contrastive
Losses
*Closest work : Lian, Qing, et al. "Geometry-aware data
augmentation for monocular 3D object detection." arXiv
preprint arXiv:2104.05858 (2021).
Blur, Cutout
Mask
Cutpaste
GE O ME TRY PRE SE RVIN G 2D DATA AU GME N TATIO N S
F o r m o n o c u l a r 3 d o b j e c t d e t e c t i o n
9
New data augmentations : Box-Mixup, Box-Cutpaste and Mosaic Tile
Geometry preservingaugmentation: These transformations do not change the
camera viewpoint or the 3D orientationof the objects in the scene
Self-supervised learning aims
at adding auxiliary/pretext tasks
Where the labels are either automatically
generated either by another sensors (LiDAR) or
are correlated task
The pretext task is correlated with the primary
task and thus training on the pretext task
provides better performance on the primary task
Multi-object labeling (MOL)
Established pretext tasks for 2D object detection
Generate random windows covering existing
foreground bounding boxes
Create soft label showing the proportion of areas
of different classes in the random window
SE LF -SU PE RVISE D LE ARN IN G
F o r m o n o c u l a r 3 d o b j e c t d e t e c t i o n
10
Implentation from :
Lee, Wonhee, Joonil Na, and Gunhee Kim. "Multi-task self-supervised object
detection via recycling of bounding box annotations." Proceedings of the IEEE/CVF
Conference on Computer Vision and Pattern Recognition. 2019.
Soft-label over randomwindow
SE LF -SU PE RVISE D LE ARN IN G
F o r m o n o c u l a r 3 d o b j e c t d e t e c t i o n
11
Li, Peixuan, et al. "Rtm3d: Real-time monocular 3d detection from object keypoints for autonomous driving." Computer Vision–ECCV 2020: 16th
European Conference, Glasgow
, UK, August 23–28, 2020, Proceedings, Part III 16. Springer International Publishing, 2020.
E VALUATIO N ME TRICS
E v a l u a t i n g 3 D o b j e c t d e t e c t i o n
12
• We evaluateperformance using the mean Average
Precision (mAP)
• We propose a new weighted ICFW mAP using the
Inverse of the Class Frequency Weights to evaluate
gains in non majority classes (especially in KITTI)
• We use the KITTI 3D object detection metrics
• Average Precision (AP) per class
• mAP 2D and ICFW mAP 2D
• mAP 3D and ICFW mAP 3D
• mAP BEV and ICFW mAP BEV
• mAP AOS* and ICFW mAP AOS
* AOS : Averageorientation similarity
BEV mAP: Bird Eye View 2d Box Map
Normalized Class Frequency on validation set of KITTI 3D
New proposed metric
RE SU LTS : SE LF SU PE RVISE D LE ARN IN G WITH DA
13
IoU=0.5 mAP2D mAPBEV mAP3D ICFW mAP2D ICFW mAPBEV ICFW mAP3D
Baseline (B) 41.44 21.17 19.12 33 15.1 14.65
Self-Supervised Learning (SSL) with MOL
B + 8W 0.85 0.53 0.46 0.83 0.7 0.54
B + 16W 0.59 -0.75 -0.59 0.57 -1.88 -1.73
B + 32W 1.4 0.29 0.12 1.75 0.12 -0.17
Data Augmentation (DA)
B + Cutout4 -0.91 0.11 -0.71 -2.79 0.15 -0.54
B + BoxMixup 0.39 0.29 0.21 0.53 0.12 0.04
B + Cutpaste 1.63 1.10 0.34 3.22 1.91 0.49
B + Mosaic -2.61 -1.43 -0.26 -2.96 -0.17 0.09
SSL-MOL + DA
B + 16W + Cutout 1.54 1.27 0.43 2.17 2.81 1.02
B + 16 W + box mixup 1.2 1.67 1.66 1.42 2.57 2.59
B + 16 W + boxmixup cutout 3.51 1.84 1.01 5.57 2.53 1.02
B +16 W + cutpaste cutout 2.87 1.38 2.26 5 1.13 1.19
B +16 W + cutpaste 0.98 0.67 0.72 1.61 0.65 0.73
The number of windows hyper-parameter with composition of data augmentation has been
optimized for in this study and requires either a grid search or DA-search.
RE SU LTS : E XAMPLE S
14
MOL-SSL with cutout & cutpaste DA (bottom)
Baseline (top)
Baseline BEV
Left
MOL-SSL cutout &
cutpaste DA BEV Right
SSL withdataaugmentations performs better at detecting pedestrians/cyclistsas well as cars at larger distances
MTL vs SSL
2D DA for 3D-MOD
DA stand-alone improves the performance of the main task
Box mixup/cut-paste augmentation performs well
MOL-SSL task
Enables the RTM3D network to classify foreground regions with
different classes and background proportions better
This is correlated with the box localisation task
DA-SSL Synergy :
Data augmentation also helps the main task by providing
representations that generalize for both main and augmented
pretext-tasks
SSL pretext tasks and their augmentations are both good
regularizers (inductive bias) and can be combined fruitfully
SSL-pretext task provides a soft-label and makes training with DA
generalize better (hypothesis)
Cons : Seperating augmentations between main and pretext task
is not possible.
CO N CLU SIO N CASE STU DY 1
15
Backbone
Task 1
Task 2
Backbone
Main
Task
Pretext
task
MTL : https://ruder.io/multi-task/
Large scale pointcloud semantic segmentation are fundamental
building blocks in modern AD perception stacks:
Semantic Map layer in modern HDMaps
Drivable zone extraction & Path planning
Semantic re-localization and others…
CASE STU DY 2 : DATA RE DU N DAN CY O N LARGE DATASE TS
S t u d y i n g d a t a a u g m e n t a t i o n u n d e r Ac t i v e l e a r n i n g s e t u p t o r e d u c e d a t a r e d u n d a n c y
16
This work was granted access to the HPC resources of [TGCC/CINES/IDRIS] under the allocation
2021- [AD011012836] made by GENCI (Grand Equipement National de Calcul Intensif)
Compressing Semantic-KITTI: Reducing dataredundancy on pointclouds by Active learning,
Anh Duong, Alexandre Almin, Leo Lemarie, B Ravi Kiran, In Submission 2021
Pointcloud semantic
segmentation datasets have large
amounts of redundant information
Similar scans due to temporal correlation
Similar scans due to similar urban environments
Similar scans due to symmetries
Motivation :
Data augmentations on large dataset had little gains
How do we reduce redundancy or similar samples
(pointcloud, GT) by selecting
Full Dataset = A core subset + Augmentations
Approach :
Study the effect of data augmentations on the active
learning sampling
CASE STU DY 2 : DATA RE DU N DAN CY O N LARGE DATASE TS
S t u d y i n g d a t a a u g m e n t a t i o n u n d e r Ac t i v e l e a r n i n g s e t u p t o r e d u c e d a t a r e d u n d a n c y
17
Large dataset (Core-subset, Aug)
Intuition: Augmentationshere model
equivariancesand should enable us to
compress the dataset
Active learning (AL) is an interactive
learning procedure
That greedily samples the most informative sample(s)to
maximize the performanceof the model.
Multiple ways to decide the mostinformative subsetto label
: likelihood P(x|M), uncertainty P(y|x)
AL Components:
Training subsetS
Query subsetQ
Heuristic func.h (random, entropy, BALD)
Aggregation func.(aggregatespixel scores to scalar)
Redundant/large datasetD
Testset T
Goals :
AL goal : Reduce annotation requests to Oracle
• Reduces Labeling Cost
Our goal : Reduce redundancylarge datasets
• ReducesTraining Time
ACTIVE LE ARN IN G
B a c k g r o u n d
18
2. Evaluate&
Save Model
3. Query
Samples Q
4. Get
labelson Q
Annotator
Largeredundant
dataset D (w/o T)
Oracle
OR OR
5. Update
subset S with Q
1. TrainModel
With DA/no DA
0. Select Initial
subset S from D
Common Test
set T from D
Large scale pointcloud sequences with semantic labels per point
Annotations include semantic class along with instance ID information
Panoptic-Nuscenes provides panoptic tracklet level labels which are temporally consistent across pointcloud scans
Established Architectures : Rangenet++, Salsanext, Cylinder3D
PO IN TCLO U D SE MAN TIC SE GME N TATIO N DATASE T
19
Semantic
KITTI Panoptic
nuscenes
DATASE T SIZ E
S e m a n t i c s e g m e n t a t i o n o n p o i n t c l o u d s
20
Dataset Cities Sequences
(Or Points)
#classes Annotation Sequential
Semantic
KITTI
1x Germany 22 (long) 28 Point, Instance Yes
Panoptic
Nuscenes
Boston
Singapore
1000
40K scans
32 Point, Box,
Instance
Yes
PandaSet 2x USA 100 37 Point, Box Yes
Semantic
Navya(ours*)
22x Cities in 10 Countries
France, Swiss, US, Denmark,
Japan, Germany, Australia,
Israel, Norway, New Zealand
22 (long)
50K scans
24 Point,
Instance
Yes
*in construction
SE MAN TIC N AVYA
L a r g e s c a l e s e m a n t i c s e g m e n t a t i o n d a t a s e t
21
R AN GE IMAGE RE PRE SE N TATIO N
22
P o i n t c l o u d r e p r e s e n t a t i o n
Entropy heuristic :
Choose sample with the largest entropy
Samples that are most uncertain
BALD
Measures information gain between model
predictions, and perturbed* model predictions
Select samples that maximize the information
gain from model parameters
H E U RISTIC F U N CTIO N
23
DATA AU GME N TATIO N IN PO IN TCLO U DS
U s i n g t h e r a n g e i m a g e r e p r e s e n t a t i o n
24
DATA AU GME N TATIO N IN PO IN TCLO U DS
U s i n g t h e r a n g e i m a g e r e p r e s e n t a t i o n
25
E XPE RIME N TAL SE TU P O N SE MAN TIC KIT TI
26
2. Evaluate &
Save Model
3. Query Set Q
(240 samples)
4. Get labels
on Q
Annotator
Subsampled
Semantic KITTI
dataset D (6000)
Oracle
OR OR
5. Update subset
S with Q
1. TrainSqSegV2
With no-DA OR DA
0. Select Initial
subset 100 samples
Test set T
Semantic KITTI
2000 samples
Dataaugmentationsetup
Uniformsamplingon5 DA
- RandomDropout
- Cutout/FrustumDropout
- Gaussiannoise depth
- Gaussiannoise intensity
- Instance Cutpaste
AL training setup
AL steps : 25 Budget : 240 samples
Full Dataset D: 6000 samples
[Random 1/5th subset of Semantic KITTI]
Model : SqueezeSegV2
Pointcloud representation : Spherical Range image
Test Set T : 2000 samples
Heuristics : Random, Entropy, BALD
Data-augmentation : with and without
RE SU LTS
L a b e l E f f i c i e n c y
27
DA provides gains in performance at each AL step on Full dataset of 6000
samples (Semantic KITTI subset)
LABE L E F F ICIE N CY
W i t h a n d w i t h o u t d a t a a u g m e n t a t i o n
28
BALD with DA has best performance on SemanticKITTI.
R AN KE D H ARD E XAMPLE S BY TH E H E U RISTIC
B AL D v s R a n d o m
29
• Larger diversity in samples from the BALD heuristic.
• No Heuristic scores are available for Random heuristic since they are sampled uniformly
Ground truth
Prediction
Heuristic function
Ground truth
Prediction
Heuristic function
Ground truth
Prediction
Heuristic function
E F F E CT O F DATA AU GME N TATIO N F O R CLASSIF ICATIO N
30
Beck, Nathan, et al. "Effective Evaluation of Deep Active Learning on
ImageClassification Tasks." arXiv preprintarXiv:2106.15324 (2021).
DA enable better accuracies at each AL
loop step
DA provides better label efficiency by sampling pointclouds that
are different from dataset+DA samples
A heuristic’s performance with DA applied
depends on the task
Entropy with DA was better than a sophisticated heuristic
function such as BADGE
BALD along with DA was better than Random which performed
better than Entropy
Tasks : Classification vs Semantic Segmentation
Recent work on Semi-Supervised learning
to AL framework
Similar effect of Data augmentations while working with
unsupervised DA (consistency loss)
CO N CLU SIO N S & F U TU RE WO RK CASE STU DY 2
31
Complete benchmark on full semantic KITTI
and Semantic Navya dataset
Aggregation maps uncertainty scores across
a whole PC/image into a scalar
Require a way to sample regions/volumes of PCs
Heuristic functions are scalars and confound multiple regions of the
image
Find the most informativeset of data
augmentations for a dataset to reduce
redundancy
Conclusions Future Work
1. Shorten, Connor, and Taghi M. Khoshgoftaar. "A survey on image data augmentation for deep learning." Journal
of Big Data 6.1 (2019): 1-48.
2. Li, Peixuan, et al. "Rtm3d: Real-time monocular 3d detection from object keypoints for autonomous
driving." Computer Vision–ECCV 2020: 16th European Conference, Glasgow, UK, August 23–28, 2020,
Proceedings, Part III 16. Springer International Publishing, 2020.
3. Hu, Hou-Ning, et al. "Monocular Quasi-Dense 3D Object Tracking." arXiv preprint arXiv:2103.07351 (2021).
4. Santhakumar, Khailash, et al. "Exploring 2D data augmentation for 3D monocular object detection." arXiv preprint
arXiv:2104.10786 (2021).
5. Lian, Qing, et al. "Geometry-aware data augmentation for monocular 3D object detection." arXiv preprint
arXiv:2104.05858 (2021).
6. Ning, Guanghan, et al. "Data Augmentation for Object Detection via Differentiable Neural Rendering." arXiv
preprint arXiv:2103.02852 (2021).
7. Chen, Shuxiao, Edgar Dobriban, and Jane H. Lee. "A group-theoretic framework for data augmentation." Journal
of Machine Learning Research 21.245 (2020): 1-71.
8. Beck, Nathan, et al. "Effective Evaluation of Deep Active Learning on Image Classification Tasks." arXiv preprint
arXiv:2106.15324 (2021).
RE F E RE N CES I
D a t a a u g m e n t a t i o n s f o r m o n o c u l a r 3 d d e t e c t i o n
32
1. Behley, Jens, et al. "Semantickitti: A dataset for semantic scene understanding of lidar
sequences." Proceedings of the IEEE/CVF International Conference on Computer Vision. 2019.
2. Fong, Whye Kit, et al. "Panoptic nuScenes: A Large-Scale Benchmark for LiDAR Panoptic Segmentation
and Tracking." arXiv preprint arXiv:2109.03805 (2021).
3. Milioto, Andres, et al. "Rangenet++: Fast and accurate lidar semantic segmentation." 2019 IEEE/RSJ
International Conference on Intelligent Robots and Systems (IROS). IEEE, 2019.
4. Cortinhal, Tiago, George Tzelepis, and Eren Erdal Aksoy. "SalsaNext: fast, uncertainty-aware semantic
segmentation of LiDAR point clouds for autonomous driving." arXiv preprint arXiv:2003.03653 (2020).
5. Zhou, Hui, et al. "Cylinder3d: An effective 3d framework for driving-scene lidar semantic segmentation."
arXiv preprint arXiv:2008.01550 (2020).
6. Hahner, M., Dai, D., Liniger, A., & Van Gool, L. (2020). Quantifying data augmentation for lidar based 3d
object detection. arXiv preprint arXiv:2004.01643.
7. Ash, Jordan T., et al. "Deep batch active learning by diverse, uncertain gradient lower bounds." arXiv
preprint arXiv:1906.03671 (2019), ICLR 2020. BADGE
8. https://baal.readthedocs.io/en/latest/
9. https://decile-team-distil.readthedocs.io/en/latest/index.html
10. Gao, Mingfei, et al. "Consistency-based semi-supervised active learning: Towards minimizing labeling cost."
European Conference on Computer Vision. Springer, Cham, 2020.
RE F E RE N CES II
D a t a a u g m e n t a t i o n s w i t h i n Ac t i v e l e a r n i n g f o r d a t a r e d u n d a n c y
33

More Related Content

What's hot

Deep Learningによる超解像の進歩
Deep Learningによる超解像の進歩Deep Learningによる超解像の進歩
Deep Learningによる超解像の進歩
Hiroto Honda
 
更適應性的AOI-深度強化學習之應用
更適應性的AOI-深度強化學習之應用更適應性的AOI-深度強化學習之應用
更適應性的AOI-深度強化學習之應用
CHENHuiMei
 
Recent Progress on Single-Image Super-Resolution
Recent Progress on Single-Image Super-ResolutionRecent Progress on Single-Image Super-Resolution
Recent Progress on Single-Image Super-Resolution
Hiroto Honda
 
Ieee 2015 16 matlab @dreamweb techno solutions-trichy
Ieee 2015 16 matlab  @dreamweb techno solutions-trichyIeee 2015 16 matlab  @dreamweb techno solutions-trichy
Ieee 2015 16 matlab @dreamweb techno solutions-trichy
subhu8430
 
Digital image protection using adaptive watermarking techniques
Digital image protection using adaptive watermarking techniquesDigital image protection using adaptive watermarking techniques
Digital image protection using adaptive watermarking techniques
anandk10
 
Transformer 動向調査 in 画像認識
Transformer 動向調査 in 画像認識Transformer 動向調査 in 画像認識
Transformer 動向調査 in 画像認識
Kazuki Maeno
 
Manifold learning with application to object recognition
Manifold learning with application to object recognitionManifold learning with application to object recognition
Manifold learning with application to object recognitionzukun
 
A NOVEL APPROACH FOR IMAGE WATERMARKING USING DCT AND JND TECHNIQUES
A NOVEL APPROACH FOR IMAGE WATERMARKING USING DCT AND JND TECHNIQUESA NOVEL APPROACH FOR IMAGE WATERMARKING USING DCT AND JND TECHNIQUES
A NOVEL APPROACH FOR IMAGE WATERMARKING USING DCT AND JND TECHNIQUES
ijiert bestjournal
 
Medical Imaging at DCU - Kevin McGuinness - UPC Barcelona 2018
Medical Imaging at DCU - Kevin McGuinness - UPC Barcelona 2018Medical Imaging at DCU - Kevin McGuinness - UPC Barcelona 2018
Medical Imaging at DCU - Kevin McGuinness - UPC Barcelona 2018
Universitat Politècnica de Catalunya
 
3D Volumetric Data Generation with Generative Adversarial Networks
3D Volumetric Data Generation with Generative Adversarial Networks3D Volumetric Data Generation with Generative Adversarial Networks
3D Volumetric Data Generation with Generative Adversarial Networks
Preferred Networks
 
Digital Watermarking using DWT-SVD
Digital Watermarking using DWT-SVDDigital Watermarking using DWT-SVD
Digital Watermarking using DWT-SVD
Surit Datta
 
Image segmentation hj_cho
Image segmentation hj_choImage segmentation hj_cho
Image segmentation hj_cho
Hyungjoo Cho
 
MUMS Opening Workshop - Machine-Learning Error Models for Quantifying the Epi...
MUMS Opening Workshop - Machine-Learning Error Models for Quantifying the Epi...MUMS Opening Workshop - Machine-Learning Error Models for Quantifying the Epi...
MUMS Opening Workshop - Machine-Learning Error Models for Quantifying the Epi...
The Statistical and Applied Mathematical Sciences Institute
 
Image Compression using Combined Approach of EZW and LZW
Image Compression using Combined Approach of EZW and LZWImage Compression using Combined Approach of EZW and LZW
Image Compression using Combined Approach of EZW and LZW
IJERA Editor
 

What's hot (16)

Deep Learningによる超解像の進歩
Deep Learningによる超解像の進歩Deep Learningによる超解像の進歩
Deep Learningによる超解像の進歩
 
更適應性的AOI-深度強化學習之應用
更適應性的AOI-深度強化學習之應用更適應性的AOI-深度強化學習之應用
更適應性的AOI-深度強化學習之應用
 
Recent Progress on Single-Image Super-Resolution
Recent Progress on Single-Image Super-ResolutionRecent Progress on Single-Image Super-Resolution
Recent Progress on Single-Image Super-Resolution
 
crfasrnn_presentation
crfasrnn_presentationcrfasrnn_presentation
crfasrnn_presentation
 
A0540106
A0540106A0540106
A0540106
 
Ieee 2015 16 matlab @dreamweb techno solutions-trichy
Ieee 2015 16 matlab  @dreamweb techno solutions-trichyIeee 2015 16 matlab  @dreamweb techno solutions-trichy
Ieee 2015 16 matlab @dreamweb techno solutions-trichy
 
Digital image protection using adaptive watermarking techniques
Digital image protection using adaptive watermarking techniquesDigital image protection using adaptive watermarking techniques
Digital image protection using adaptive watermarking techniques
 
Transformer 動向調査 in 画像認識
Transformer 動向調査 in 画像認識Transformer 動向調査 in 画像認識
Transformer 動向調査 in 画像認識
 
Manifold learning with application to object recognition
Manifold learning with application to object recognitionManifold learning with application to object recognition
Manifold learning with application to object recognition
 
A NOVEL APPROACH FOR IMAGE WATERMARKING USING DCT AND JND TECHNIQUES
A NOVEL APPROACH FOR IMAGE WATERMARKING USING DCT AND JND TECHNIQUESA NOVEL APPROACH FOR IMAGE WATERMARKING USING DCT AND JND TECHNIQUES
A NOVEL APPROACH FOR IMAGE WATERMARKING USING DCT AND JND TECHNIQUES
 
Medical Imaging at DCU - Kevin McGuinness - UPC Barcelona 2018
Medical Imaging at DCU - Kevin McGuinness - UPC Barcelona 2018Medical Imaging at DCU - Kevin McGuinness - UPC Barcelona 2018
Medical Imaging at DCU - Kevin McGuinness - UPC Barcelona 2018
 
3D Volumetric Data Generation with Generative Adversarial Networks
3D Volumetric Data Generation with Generative Adversarial Networks3D Volumetric Data Generation with Generative Adversarial Networks
3D Volumetric Data Generation with Generative Adversarial Networks
 
Digital Watermarking using DWT-SVD
Digital Watermarking using DWT-SVDDigital Watermarking using DWT-SVD
Digital Watermarking using DWT-SVD
 
Image segmentation hj_cho
Image segmentation hj_choImage segmentation hj_cho
Image segmentation hj_cho
 
MUMS Opening Workshop - Machine-Learning Error Models for Quantifying the Epi...
MUMS Opening Workshop - Machine-Learning Error Models for Quantifying the Epi...MUMS Opening Workshop - Machine-Learning Error Models for Quantifying the Epi...
MUMS Opening Workshop - Machine-Learning Error Models for Quantifying the Epi...
 
Image Compression using Combined Approach of EZW and LZW
Image Compression using Combined Approach of EZW and LZWImage Compression using Combined Approach of EZW and LZW
Image Compression using Combined Approach of EZW and LZW
 

Similar to AUTO AI 2021 talk Real world data augmentations for autonomous driving : B Ravi Kiran

Introduction to multiple object tracking
Introduction to multiple object trackingIntroduction to multiple object tracking
Introduction to multiple object tracking
Fan Yang
 
PointNet
PointNetPointNet
IRJET- 3D Object Recognition of Car Image Detection
IRJET-  	  3D Object Recognition of Car Image DetectionIRJET-  	  3D Object Recognition of Car Image Detection
IRJET- 3D Object Recognition of Car Image Detection
IRJET Journal
 
ANALYSIS OF LUNG NODULE DETECTION AND STAGE CLASSIFICATION USING FASTER RCNN ...
ANALYSIS OF LUNG NODULE DETECTION AND STAGE CLASSIFICATION USING FASTER RCNN ...ANALYSIS OF LUNG NODULE DETECTION AND STAGE CLASSIFICATION USING FASTER RCNN ...
ANALYSIS OF LUNG NODULE DETECTION AND STAGE CLASSIFICATION USING FASTER RCNN ...
IRJET Journal
 
Laplacian-regularized Graph Bandits
Laplacian-regularized Graph BanditsLaplacian-regularized Graph Bandits
Laplacian-regularized Graph Bandits
lauratoni4
 
Matlab project
Matlab projectMatlab project
Matlab project
shahu2212
 
Shadow Detection and Removal using Tricolor Attenuation Model Based on Featur...
Shadow Detection and Removal using Tricolor Attenuation Model Based on Featur...Shadow Detection and Removal using Tricolor Attenuation Model Based on Featur...
Shadow Detection and Removal using Tricolor Attenuation Model Based on Featur...
ijtsrd
 
Learning Graph Representation for Data-Efficiency RL
Learning Graph Representation for Data-Efficiency RLLearning Graph Representation for Data-Efficiency RL
Learning Graph Representation for Data-Efficiency RL
lauratoni4
 
F010232834
F010232834F010232834
F010232834
IOSR Journals
 
Partial Object Detection in Inclined Weather Conditions
Partial Object Detection in Inclined Weather ConditionsPartial Object Detection in Inclined Weather Conditions
Partial Object Detection in Inclined Weather Conditions
IRJET Journal
 
10.1109@ICCMC48092.2020.ICCMC-000167.pdf
10.1109@ICCMC48092.2020.ICCMC-000167.pdf10.1109@ICCMC48092.2020.ICCMC-000167.pdf
10.1109@ICCMC48092.2020.ICCMC-000167.pdf
mokamojah
 
Scrdet++ analysis
Scrdet++ analysisScrdet++ analysis
Scrdet++ analysis
NEHA Kapoor
 
小數據如何實現電腦視覺,微軟AI研究首席剖析關鍵
小數據如何實現電腦視覺,微軟AI研究首席剖析關鍵小數據如何實現電腦視覺,微軟AI研究首席剖析關鍵
小數據如何實現電腦視覺,微軟AI研究首席剖析關鍵
CHENHuiMei
 
Noise Removal in Traffic Sign Detection Systems
Noise Removal in Traffic Sign Detection SystemsNoise Removal in Traffic Sign Detection Systems
Noise Removal in Traffic Sign Detection Systems
CSEIJJournal
 
Inspection of Suspicious Human Activity in the Crowd Sourced Areas Captured i...
Inspection of Suspicious Human Activity in the Crowd Sourced Areas Captured i...Inspection of Suspicious Human Activity in the Crowd Sourced Areas Captured i...
Inspection of Suspicious Human Activity in the Crowd Sourced Areas Captured i...
IRJET Journal
 
Semantic Segmentation on Satellite Imagery
Semantic Segmentation on Satellite ImagerySemantic Segmentation on Satellite Imagery
Semantic Segmentation on Satellite Imagery
RAHUL BHOJWANI
 
IRJET- An Efficient VLSI Architecture for 3D-DWT using Lifting Scheme
IRJET- An Efficient VLSI Architecture for 3D-DWT using Lifting SchemeIRJET- An Efficient VLSI Architecture for 3D-DWT using Lifting Scheme
IRJET- An Efficient VLSI Architecture for 3D-DWT using Lifting Scheme
IRJET Journal
 
Lidar for Autonomous Driving II (via Deep Learning)
Lidar for Autonomous Driving II (via Deep Learning)Lidar for Autonomous Driving II (via Deep Learning)
Lidar for Autonomous Driving II (via Deep Learning)
Yu Huang
 
kanimozhi2019.pdf
kanimozhi2019.pdfkanimozhi2019.pdf
kanimozhi2019.pdf
AshrafDabbas1
 

Similar to AUTO AI 2021 talk Real world data augmentations for autonomous driving : B Ravi Kiran (20)

Portfolio
PortfolioPortfolio
Portfolio
 
Introduction to multiple object tracking
Introduction to multiple object trackingIntroduction to multiple object tracking
Introduction to multiple object tracking
 
PointNet
PointNetPointNet
PointNet
 
IRJET- 3D Object Recognition of Car Image Detection
IRJET-  	  3D Object Recognition of Car Image DetectionIRJET-  	  3D Object Recognition of Car Image Detection
IRJET- 3D Object Recognition of Car Image Detection
 
ANALYSIS OF LUNG NODULE DETECTION AND STAGE CLASSIFICATION USING FASTER RCNN ...
ANALYSIS OF LUNG NODULE DETECTION AND STAGE CLASSIFICATION USING FASTER RCNN ...ANALYSIS OF LUNG NODULE DETECTION AND STAGE CLASSIFICATION USING FASTER RCNN ...
ANALYSIS OF LUNG NODULE DETECTION AND STAGE CLASSIFICATION USING FASTER RCNN ...
 
Laplacian-regularized Graph Bandits
Laplacian-regularized Graph BanditsLaplacian-regularized Graph Bandits
Laplacian-regularized Graph Bandits
 
Matlab project
Matlab projectMatlab project
Matlab project
 
Shadow Detection and Removal using Tricolor Attenuation Model Based on Featur...
Shadow Detection and Removal using Tricolor Attenuation Model Based on Featur...Shadow Detection and Removal using Tricolor Attenuation Model Based on Featur...
Shadow Detection and Removal using Tricolor Attenuation Model Based on Featur...
 
Learning Graph Representation for Data-Efficiency RL
Learning Graph Representation for Data-Efficiency RLLearning Graph Representation for Data-Efficiency RL
Learning Graph Representation for Data-Efficiency RL
 
F010232834
F010232834F010232834
F010232834
 
Partial Object Detection in Inclined Weather Conditions
Partial Object Detection in Inclined Weather ConditionsPartial Object Detection in Inclined Weather Conditions
Partial Object Detection in Inclined Weather Conditions
 
10.1109@ICCMC48092.2020.ICCMC-000167.pdf
10.1109@ICCMC48092.2020.ICCMC-000167.pdf10.1109@ICCMC48092.2020.ICCMC-000167.pdf
10.1109@ICCMC48092.2020.ICCMC-000167.pdf
 
Scrdet++ analysis
Scrdet++ analysisScrdet++ analysis
Scrdet++ analysis
 
小數據如何實現電腦視覺,微軟AI研究首席剖析關鍵
小數據如何實現電腦視覺,微軟AI研究首席剖析關鍵小數據如何實現電腦視覺,微軟AI研究首席剖析關鍵
小數據如何實現電腦視覺,微軟AI研究首席剖析關鍵
 
Noise Removal in Traffic Sign Detection Systems
Noise Removal in Traffic Sign Detection SystemsNoise Removal in Traffic Sign Detection Systems
Noise Removal in Traffic Sign Detection Systems
 
Inspection of Suspicious Human Activity in the Crowd Sourced Areas Captured i...
Inspection of Suspicious Human Activity in the Crowd Sourced Areas Captured i...Inspection of Suspicious Human Activity in the Crowd Sourced Areas Captured i...
Inspection of Suspicious Human Activity in the Crowd Sourced Areas Captured i...
 
Semantic Segmentation on Satellite Imagery
Semantic Segmentation on Satellite ImagerySemantic Segmentation on Satellite Imagery
Semantic Segmentation on Satellite Imagery
 
IRJET- An Efficient VLSI Architecture for 3D-DWT using Lifting Scheme
IRJET- An Efficient VLSI Architecture for 3D-DWT using Lifting SchemeIRJET- An Efficient VLSI Architecture for 3D-DWT using Lifting Scheme
IRJET- An Efficient VLSI Architecture for 3D-DWT using Lifting Scheme
 
Lidar for Autonomous Driving II (via Deep Learning)
Lidar for Autonomous Driving II (via Deep Learning)Lidar for Autonomous Driving II (via Deep Learning)
Lidar for Autonomous Driving II (via Deep Learning)
 
kanimozhi2019.pdf
kanimozhi2019.pdfkanimozhi2019.pdf
kanimozhi2019.pdf
 

Recently uploaded

Mission to Decommission: Importance of Decommissioning Products to Increase E...
Mission to Decommission: Importance of Decommissioning Products to Increase E...Mission to Decommission: Importance of Decommissioning Products to Increase E...
Mission to Decommission: Importance of Decommissioning Products to Increase E...
Product School
 
Leading Change strategies and insights for effective change management pdf 1.pdf
Leading Change strategies and insights for effective change management pdf 1.pdfLeading Change strategies and insights for effective change management pdf 1.pdf
Leading Change strategies and insights for effective change management pdf 1.pdf
OnBoard
 
GenAISummit 2024 May 28 Sri Ambati Keynote: AGI Belongs to The Community in O...
GenAISummit 2024 May 28 Sri Ambati Keynote: AGI Belongs to The Community in O...GenAISummit 2024 May 28 Sri Ambati Keynote: AGI Belongs to The Community in O...
GenAISummit 2024 May 28 Sri Ambati Keynote: AGI Belongs to The Community in O...
Sri Ambati
 
Designing Great Products: The Power of Design and Leadership by Chief Designe...
Designing Great Products: The Power of Design and Leadership by Chief Designe...Designing Great Products: The Power of Design and Leadership by Chief Designe...
Designing Great Products: The Power of Design and Leadership by Chief Designe...
Product School
 
Unsubscribed: Combat Subscription Fatigue With a Membership Mentality by Head...
Unsubscribed: Combat Subscription Fatigue With a Membership Mentality by Head...Unsubscribed: Combat Subscription Fatigue With a Membership Mentality by Head...
Unsubscribed: Combat Subscription Fatigue With a Membership Mentality by Head...
Product School
 
Knowledge engineering: from people to machines and back
Knowledge engineering: from people to machines and backKnowledge engineering: from people to machines and back
Knowledge engineering: from people to machines and back
Elena Simperl
 
Elevating Tactical DDD Patterns Through Object Calisthenics
Elevating Tactical DDD Patterns Through Object CalisthenicsElevating Tactical DDD Patterns Through Object Calisthenics
Elevating Tactical DDD Patterns Through Object Calisthenics
Dorra BARTAGUIZ
 
From Daily Decisions to Bottom Line: Connecting Product Work to Revenue by VP...
From Daily Decisions to Bottom Line: Connecting Product Work to Revenue by VP...From Daily Decisions to Bottom Line: Connecting Product Work to Revenue by VP...
From Daily Decisions to Bottom Line: Connecting Product Work to Revenue by VP...
Product School
 
Empowering NextGen Mobility via Large Action Model Infrastructure (LAMI): pav...
Empowering NextGen Mobility via Large Action Model Infrastructure (LAMI): pav...Empowering NextGen Mobility via Large Action Model Infrastructure (LAMI): pav...
Empowering NextGen Mobility via Large Action Model Infrastructure (LAMI): pav...
Thierry Lestable
 
De-mystifying Zero to One: Design Informed Techniques for Greenfield Innovati...
De-mystifying Zero to One: Design Informed Techniques for Greenfield Innovati...De-mystifying Zero to One: Design Informed Techniques for Greenfield Innovati...
De-mystifying Zero to One: Design Informed Techniques for Greenfield Innovati...
Product School
 
AI for Every Business: Unlocking Your Product's Universal Potential by VP of ...
AI for Every Business: Unlocking Your Product's Universal Potential by VP of ...AI for Every Business: Unlocking Your Product's Universal Potential by VP of ...
AI for Every Business: Unlocking Your Product's Universal Potential by VP of ...
Product School
 
JMeter webinar - integration with InfluxDB and Grafana
JMeter webinar - integration with InfluxDB and GrafanaJMeter webinar - integration with InfluxDB and Grafana
JMeter webinar - integration with InfluxDB and Grafana
RTTS
 
How world-class product teams are winning in the AI era by CEO and Founder, P...
How world-class product teams are winning in the AI era by CEO and Founder, P...How world-class product teams are winning in the AI era by CEO and Founder, P...
How world-class product teams are winning in the AI era by CEO and Founder, P...
Product School
 
Connector Corner: Automate dynamic content and events by pushing a button
Connector Corner: Automate dynamic content and events by pushing a buttonConnector Corner: Automate dynamic content and events by pushing a button
Connector Corner: Automate dynamic content and events by pushing a button
DianaGray10
 
FIDO Alliance Osaka Seminar: The WebAuthn API and Discoverable Credentials.pdf
FIDO Alliance Osaka Seminar: The WebAuthn API and Discoverable Credentials.pdfFIDO Alliance Osaka Seminar: The WebAuthn API and Discoverable Credentials.pdf
FIDO Alliance Osaka Seminar: The WebAuthn API and Discoverable Credentials.pdf
FIDO Alliance
 
Key Trends Shaping the Future of Infrastructure.pdf
Key Trends Shaping the Future of Infrastructure.pdfKey Trends Shaping the Future of Infrastructure.pdf
Key Trends Shaping the Future of Infrastructure.pdf
Cheryl Hung
 
The Future of Platform Engineering
The Future of Platform EngineeringThe Future of Platform Engineering
The Future of Platform Engineering
Jemma Hussein Allen
 
Accelerate your Kubernetes clusters with Varnish Caching
Accelerate your Kubernetes clusters with Varnish CachingAccelerate your Kubernetes clusters with Varnish Caching
Accelerate your Kubernetes clusters with Varnish Caching
Thijs Feryn
 
Epistemic Interaction - tuning interfaces to provide information for AI support
Epistemic Interaction - tuning interfaces to provide information for AI supportEpistemic Interaction - tuning interfaces to provide information for AI support
Epistemic Interaction - tuning interfaces to provide information for AI support
Alan Dix
 
FIDO Alliance Osaka Seminar: Passkeys and the Road Ahead.pdf
FIDO Alliance Osaka Seminar: Passkeys and the Road Ahead.pdfFIDO Alliance Osaka Seminar: Passkeys and the Road Ahead.pdf
FIDO Alliance Osaka Seminar: Passkeys and the Road Ahead.pdf
FIDO Alliance
 

Recently uploaded (20)

Mission to Decommission: Importance of Decommissioning Products to Increase E...
Mission to Decommission: Importance of Decommissioning Products to Increase E...Mission to Decommission: Importance of Decommissioning Products to Increase E...
Mission to Decommission: Importance of Decommissioning Products to Increase E...
 
Leading Change strategies and insights for effective change management pdf 1.pdf
Leading Change strategies and insights for effective change management pdf 1.pdfLeading Change strategies and insights for effective change management pdf 1.pdf
Leading Change strategies and insights for effective change management pdf 1.pdf
 
GenAISummit 2024 May 28 Sri Ambati Keynote: AGI Belongs to The Community in O...
GenAISummit 2024 May 28 Sri Ambati Keynote: AGI Belongs to The Community in O...GenAISummit 2024 May 28 Sri Ambati Keynote: AGI Belongs to The Community in O...
GenAISummit 2024 May 28 Sri Ambati Keynote: AGI Belongs to The Community in O...
 
Designing Great Products: The Power of Design and Leadership by Chief Designe...
Designing Great Products: The Power of Design and Leadership by Chief Designe...Designing Great Products: The Power of Design and Leadership by Chief Designe...
Designing Great Products: The Power of Design and Leadership by Chief Designe...
 
Unsubscribed: Combat Subscription Fatigue With a Membership Mentality by Head...
Unsubscribed: Combat Subscription Fatigue With a Membership Mentality by Head...Unsubscribed: Combat Subscription Fatigue With a Membership Mentality by Head...
Unsubscribed: Combat Subscription Fatigue With a Membership Mentality by Head...
 
Knowledge engineering: from people to machines and back
Knowledge engineering: from people to machines and backKnowledge engineering: from people to machines and back
Knowledge engineering: from people to machines and back
 
Elevating Tactical DDD Patterns Through Object Calisthenics
Elevating Tactical DDD Patterns Through Object CalisthenicsElevating Tactical DDD Patterns Through Object Calisthenics
Elevating Tactical DDD Patterns Through Object Calisthenics
 
From Daily Decisions to Bottom Line: Connecting Product Work to Revenue by VP...
From Daily Decisions to Bottom Line: Connecting Product Work to Revenue by VP...From Daily Decisions to Bottom Line: Connecting Product Work to Revenue by VP...
From Daily Decisions to Bottom Line: Connecting Product Work to Revenue by VP...
 
Empowering NextGen Mobility via Large Action Model Infrastructure (LAMI): pav...
Empowering NextGen Mobility via Large Action Model Infrastructure (LAMI): pav...Empowering NextGen Mobility via Large Action Model Infrastructure (LAMI): pav...
Empowering NextGen Mobility via Large Action Model Infrastructure (LAMI): pav...
 
De-mystifying Zero to One: Design Informed Techniques for Greenfield Innovati...
De-mystifying Zero to One: Design Informed Techniques for Greenfield Innovati...De-mystifying Zero to One: Design Informed Techniques for Greenfield Innovati...
De-mystifying Zero to One: Design Informed Techniques for Greenfield Innovati...
 
AI for Every Business: Unlocking Your Product's Universal Potential by VP of ...
AI for Every Business: Unlocking Your Product's Universal Potential by VP of ...AI for Every Business: Unlocking Your Product's Universal Potential by VP of ...
AI for Every Business: Unlocking Your Product's Universal Potential by VP of ...
 
JMeter webinar - integration with InfluxDB and Grafana
JMeter webinar - integration with InfluxDB and GrafanaJMeter webinar - integration with InfluxDB and Grafana
JMeter webinar - integration with InfluxDB and Grafana
 
How world-class product teams are winning in the AI era by CEO and Founder, P...
How world-class product teams are winning in the AI era by CEO and Founder, P...How world-class product teams are winning in the AI era by CEO and Founder, P...
How world-class product teams are winning in the AI era by CEO and Founder, P...
 
Connector Corner: Automate dynamic content and events by pushing a button
Connector Corner: Automate dynamic content and events by pushing a buttonConnector Corner: Automate dynamic content and events by pushing a button
Connector Corner: Automate dynamic content and events by pushing a button
 
FIDO Alliance Osaka Seminar: The WebAuthn API and Discoverable Credentials.pdf
FIDO Alliance Osaka Seminar: The WebAuthn API and Discoverable Credentials.pdfFIDO Alliance Osaka Seminar: The WebAuthn API and Discoverable Credentials.pdf
FIDO Alliance Osaka Seminar: The WebAuthn API and Discoverable Credentials.pdf
 
Key Trends Shaping the Future of Infrastructure.pdf
Key Trends Shaping the Future of Infrastructure.pdfKey Trends Shaping the Future of Infrastructure.pdf
Key Trends Shaping the Future of Infrastructure.pdf
 
The Future of Platform Engineering
The Future of Platform EngineeringThe Future of Platform Engineering
The Future of Platform Engineering
 
Accelerate your Kubernetes clusters with Varnish Caching
Accelerate your Kubernetes clusters with Varnish CachingAccelerate your Kubernetes clusters with Varnish Caching
Accelerate your Kubernetes clusters with Varnish Caching
 
Epistemic Interaction - tuning interfaces to provide information for AI support
Epistemic Interaction - tuning interfaces to provide information for AI supportEpistemic Interaction - tuning interfaces to provide information for AI support
Epistemic Interaction - tuning interfaces to provide information for AI support
 
FIDO Alliance Osaka Seminar: Passkeys and the Road Ahead.pdf
FIDO Alliance Osaka Seminar: Passkeys and the Road Ahead.pdfFIDO Alliance Osaka Seminar: Passkeys and the Road Ahead.pdf
FIDO Alliance Osaka Seminar: Passkeys and the Road Ahead.pdf
 

AUTO AI 2021 talk Real world data augmentations for autonomous driving : B Ravi Kiran

  • 1. Real world data augmentations for autonomous driving
  • 2. N AVYA IN TRO DU CTIO N 2 Ambition level 4 for all our platforms PASSENGERS TRANSPORT GOODS TRANSPORT Au t o n o m ® S h u t t l e Au t o n o m ® S h u t t l e E v o A u t o n o m ® T r a c t A T 1 3 5 CUSTOM SELF- DRIVING SOLUTION D R I V E N B Y N AV YA
  • 3. ML team at Navya principally works on: Camera : 2D Object detection and drivable zone segm. (2D-OD, MTL) Traffic light detection and relevancy (TLDR) 3D Monocular object detection (3D-MOD) LiDAR : Large scale semantic segmentation on pointclouds Instance segmentation on pointclouds Semantic Navya Dataset N AVYA ML TE AM D e e p l e a r n i n g m o d u l e s o n c a m e r a a n d L i D AR 3 Alexandre Almin Thomas Gauthier Hao Liu Leo Lemarie Anh Duong B Ravi Kiran
  • 4. OVE RVIE W 4 What is a data augmentation (DA) Categories of data augmentation (classification, detection, segmentation) Theoretical frameworks for DA Geometry preserving 2D-DA for 3D monocular detection DA for 3D monocular object detection Self supervision pretext tasks for 3D monocular detection Evaluation on KITTI 3D detection dataset DA for data redundancy in Active learning (AL) pipeline Semantic segmentation on pointclouds Building an AL pipeline for mining informative samples Evaluation on Semantic-KITTI dataset
  • 5. Generates augmented I/O pairs Performs model regularization & reduce the effect of overfitting in low dataset regime Increases the diversity of small datasets Fundamentally models invariance/equivariance to real world transformations that generate samples of a dataset Image vs pointcloud augmentations Representations: • Images conserve an inherent matrix representation • Pointclouds are sets with arbitrary input domain size Instance transform: • Objects in pointclouds are separable from their background, enabling for easy 3D transformations (rotation translation). • Though this does require the transformation to account for change in pointcloud density with change in distance/orientation DATA AU GME N TATIO N A B r i e f r e v i e w 5 https://blog.waymo.com/2020/04/using-automated-data-augmentation-to.html https://albumentations.ai/docs/ , https://imgaug.readthedocs.io/en/latest/
  • 6. DA model invariances to represent data Anselmi, et al. 2016 Translation invariance already baked into CNNs due to convolutions Rotations, Scaling, color transformations… DA connection with kernel theory Dao, Tri, et al 2019 DA augmentations are seen as a Markov process with transitions defined per sample DA are represented within a group G Jane H. Lee et al 2020 Augmented sample distribution gX are approximately invariant under the action of group elements g X ~= gX, g from G The probability of an augmentation sample is approximately equal to the original sample WH Y DATA AU GME N TATIO N S WO RK R e v i e w i n g t h e o r e t i c a l u n d e r s t a n d i n g 6 • Anselmi, et al. "On invarianceand selectivity in representation learning." Information and Inference: A Journalof the IMA 2016. • Dao, Tri, et al. "A kernel theory of moderndata augmentation." InternationalConference onMachine Learning.PMLR, 2019. • Chen, Shuxiao, Edgar Dobriban, and Jane H. Lee. "A group-theoretic frameworkfor data augmentation." JMLR21.245 (2020): 1-71.
  • 7. 3D-MOD is a key component of obstacle detection pipeline Redundancy & Fusion with LiDAR 3D detection pipeline Stereo depth estimation pipelines are progressively replaced with monocular depth estimation Motivation : DA for 3D-MOD Datasets for 3D-MOD are costly to create Data augmentations on 2D object detector’s change image geometry View synthesis methods are robust, but heavy How to reuse existing annotations to be a self-supervised task ? C A S E ST U DY 1 : 3 D M O N O C U L A R O B J E C T D E T E C T I O N ( 3 D - M O D ) E x p l o r i n g 2 D D a t a Au g m e n t a t i o n ( D A) f o r 3 D M o n o c u l a r O b j e c t D e t e c t i o n 7 Joint work: Sugirtha T, Sridevi M, Khailash Santhakumar, Hao Liu B Ravi Kiran, Thomas Gauthier, Senthil Yogamani Accepted at ICCV Workshop on Self-supervised learning for Autonomouas Driving (SSLAD 2021) Problem formulation : Data augmentation What transformationsor augmentations could be performed to (image, 2D-BB) pair that do not change the depth, orientation or scale of the bounding box? Problemformulation : Pretext SSL task What scalable auxilliary task along with a methodology to generate annotation could be added to the 3D-MOD detector primary task ?
  • 8. DATA AU GME N TATIO N S F O R O BJ E CT DE TE CTIO N 8 Weather based Label independent image transforms Bounding boxInstance based 2D boxes Monocular 3D boxes Mosaic Flip Affine Geometry preserving Geometry Aware* Mosaic-Tile Affine Color Tx Cutmix Blur Cutout Mixup Shadow Rain , Snow Fog, Flare Gravel Autumn Box Mixup Box Cutpaste View synthesis Camera Postion Apply X tobox Learning based RandAugment FastAugment RL-Search Focal Length Receptive Field Differentia ble Neural Rendering GAN based Style Transfer Style Transfer Self-Supervised & Semi-supervised augmentations Random window classifier Jigsaw Augmix Adeversarial Rotation prediction Contrastive Losses *Closest work : Lian, Qing, et al. "Geometry-aware data augmentation for monocular 3D object detection." arXiv preprint arXiv:2104.05858 (2021). Blur, Cutout Mask Cutpaste
  • 9. GE O ME TRY PRE SE RVIN G 2D DATA AU GME N TATIO N S F o r m o n o c u l a r 3 d o b j e c t d e t e c t i o n 9 New data augmentations : Box-Mixup, Box-Cutpaste and Mosaic Tile Geometry preservingaugmentation: These transformations do not change the camera viewpoint or the 3D orientationof the objects in the scene
  • 10. Self-supervised learning aims at adding auxiliary/pretext tasks Where the labels are either automatically generated either by another sensors (LiDAR) or are correlated task The pretext task is correlated with the primary task and thus training on the pretext task provides better performance on the primary task Multi-object labeling (MOL) Established pretext tasks for 2D object detection Generate random windows covering existing foreground bounding boxes Create soft label showing the proportion of areas of different classes in the random window SE LF -SU PE RVISE D LE ARN IN G F o r m o n o c u l a r 3 d o b j e c t d e t e c t i o n 10 Implentation from : Lee, Wonhee, Joonil Na, and Gunhee Kim. "Multi-task self-supervised object detection via recycling of bounding box annotations." Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 2019. Soft-label over randomwindow
  • 11. SE LF -SU PE RVISE D LE ARN IN G F o r m o n o c u l a r 3 d o b j e c t d e t e c t i o n 11 Li, Peixuan, et al. "Rtm3d: Real-time monocular 3d detection from object keypoints for autonomous driving." Computer Vision–ECCV 2020: 16th European Conference, Glasgow , UK, August 23–28, 2020, Proceedings, Part III 16. Springer International Publishing, 2020.
  • 12. E VALUATIO N ME TRICS E v a l u a t i n g 3 D o b j e c t d e t e c t i o n 12 • We evaluateperformance using the mean Average Precision (mAP) • We propose a new weighted ICFW mAP using the Inverse of the Class Frequency Weights to evaluate gains in non majority classes (especially in KITTI) • We use the KITTI 3D object detection metrics • Average Precision (AP) per class • mAP 2D and ICFW mAP 2D • mAP 3D and ICFW mAP 3D • mAP BEV and ICFW mAP BEV • mAP AOS* and ICFW mAP AOS * AOS : Averageorientation similarity BEV mAP: Bird Eye View 2d Box Map Normalized Class Frequency on validation set of KITTI 3D New proposed metric
  • 13. RE SU LTS : SE LF SU PE RVISE D LE ARN IN G WITH DA 13 IoU=0.5 mAP2D mAPBEV mAP3D ICFW mAP2D ICFW mAPBEV ICFW mAP3D Baseline (B) 41.44 21.17 19.12 33 15.1 14.65 Self-Supervised Learning (SSL) with MOL B + 8W 0.85 0.53 0.46 0.83 0.7 0.54 B + 16W 0.59 -0.75 -0.59 0.57 -1.88 -1.73 B + 32W 1.4 0.29 0.12 1.75 0.12 -0.17 Data Augmentation (DA) B + Cutout4 -0.91 0.11 -0.71 -2.79 0.15 -0.54 B + BoxMixup 0.39 0.29 0.21 0.53 0.12 0.04 B + Cutpaste 1.63 1.10 0.34 3.22 1.91 0.49 B + Mosaic -2.61 -1.43 -0.26 -2.96 -0.17 0.09 SSL-MOL + DA B + 16W + Cutout 1.54 1.27 0.43 2.17 2.81 1.02 B + 16 W + box mixup 1.2 1.67 1.66 1.42 2.57 2.59 B + 16 W + boxmixup cutout 3.51 1.84 1.01 5.57 2.53 1.02 B +16 W + cutpaste cutout 2.87 1.38 2.26 5 1.13 1.19 B +16 W + cutpaste 0.98 0.67 0.72 1.61 0.65 0.73 The number of windows hyper-parameter with composition of data augmentation has been optimized for in this study and requires either a grid search or DA-search.
  • 14. RE SU LTS : E XAMPLE S 14 MOL-SSL with cutout & cutpaste DA (bottom) Baseline (top) Baseline BEV Left MOL-SSL cutout & cutpaste DA BEV Right SSL withdataaugmentations performs better at detecting pedestrians/cyclistsas well as cars at larger distances
  • 15. MTL vs SSL 2D DA for 3D-MOD DA stand-alone improves the performance of the main task Box mixup/cut-paste augmentation performs well MOL-SSL task Enables the RTM3D network to classify foreground regions with different classes and background proportions better This is correlated with the box localisation task DA-SSL Synergy : Data augmentation also helps the main task by providing representations that generalize for both main and augmented pretext-tasks SSL pretext tasks and their augmentations are both good regularizers (inductive bias) and can be combined fruitfully SSL-pretext task provides a soft-label and makes training with DA generalize better (hypothesis) Cons : Seperating augmentations between main and pretext task is not possible. CO N CLU SIO N CASE STU DY 1 15 Backbone Task 1 Task 2 Backbone Main Task Pretext task MTL : https://ruder.io/multi-task/
  • 16. Large scale pointcloud semantic segmentation are fundamental building blocks in modern AD perception stacks: Semantic Map layer in modern HDMaps Drivable zone extraction & Path planning Semantic re-localization and others… CASE STU DY 2 : DATA RE DU N DAN CY O N LARGE DATASE TS S t u d y i n g d a t a a u g m e n t a t i o n u n d e r Ac t i v e l e a r n i n g s e t u p t o r e d u c e d a t a r e d u n d a n c y 16 This work was granted access to the HPC resources of [TGCC/CINES/IDRIS] under the allocation 2021- [AD011012836] made by GENCI (Grand Equipement National de Calcul Intensif) Compressing Semantic-KITTI: Reducing dataredundancy on pointclouds by Active learning, Anh Duong, Alexandre Almin, Leo Lemarie, B Ravi Kiran, In Submission 2021
  • 17. Pointcloud semantic segmentation datasets have large amounts of redundant information Similar scans due to temporal correlation Similar scans due to similar urban environments Similar scans due to symmetries Motivation : Data augmentations on large dataset had little gains How do we reduce redundancy or similar samples (pointcloud, GT) by selecting Full Dataset = A core subset + Augmentations Approach : Study the effect of data augmentations on the active learning sampling CASE STU DY 2 : DATA RE DU N DAN CY O N LARGE DATASE TS S t u d y i n g d a t a a u g m e n t a t i o n u n d e r Ac t i v e l e a r n i n g s e t u p t o r e d u c e d a t a r e d u n d a n c y 17 Large dataset (Core-subset, Aug) Intuition: Augmentationshere model equivariancesand should enable us to compress the dataset
  • 18. Active learning (AL) is an interactive learning procedure That greedily samples the most informative sample(s)to maximize the performanceof the model. Multiple ways to decide the mostinformative subsetto label : likelihood P(x|M), uncertainty P(y|x) AL Components: Training subsetS Query subsetQ Heuristic func.h (random, entropy, BALD) Aggregation func.(aggregatespixel scores to scalar) Redundant/large datasetD Testset T Goals : AL goal : Reduce annotation requests to Oracle • Reduces Labeling Cost Our goal : Reduce redundancylarge datasets • ReducesTraining Time ACTIVE LE ARN IN G B a c k g r o u n d 18 2. Evaluate& Save Model 3. Query Samples Q 4. Get labelson Q Annotator Largeredundant dataset D (w/o T) Oracle OR OR 5. Update subset S with Q 1. TrainModel With DA/no DA 0. Select Initial subset S from D Common Test set T from D
  • 19. Large scale pointcloud sequences with semantic labels per point Annotations include semantic class along with instance ID information Panoptic-Nuscenes provides panoptic tracklet level labels which are temporally consistent across pointcloud scans Established Architectures : Rangenet++, Salsanext, Cylinder3D PO IN TCLO U D SE MAN TIC SE GME N TATIO N DATASE T 19 Semantic KITTI Panoptic nuscenes
  • 20. DATASE T SIZ E S e m a n t i c s e g m e n t a t i o n o n p o i n t c l o u d s 20 Dataset Cities Sequences (Or Points) #classes Annotation Sequential Semantic KITTI 1x Germany 22 (long) 28 Point, Instance Yes Panoptic Nuscenes Boston Singapore 1000 40K scans 32 Point, Box, Instance Yes PandaSet 2x USA 100 37 Point, Box Yes Semantic Navya(ours*) 22x Cities in 10 Countries France, Swiss, US, Denmark, Japan, Germany, Australia, Israel, Norway, New Zealand 22 (long) 50K scans 24 Point, Instance Yes *in construction
  • 21. SE MAN TIC N AVYA L a r g e s c a l e s e m a n t i c s e g m e n t a t i o n d a t a s e t 21
  • 22. R AN GE IMAGE RE PRE SE N TATIO N 22 P o i n t c l o u d r e p r e s e n t a t i o n
  • 23. Entropy heuristic : Choose sample with the largest entropy Samples that are most uncertain BALD Measures information gain between model predictions, and perturbed* model predictions Select samples that maximize the information gain from model parameters H E U RISTIC F U N CTIO N 23
  • 24. DATA AU GME N TATIO N IN PO IN TCLO U DS U s i n g t h e r a n g e i m a g e r e p r e s e n t a t i o n 24
  • 25. DATA AU GME N TATIO N IN PO IN TCLO U DS U s i n g t h e r a n g e i m a g e r e p r e s e n t a t i o n 25
  • 26. E XPE RIME N TAL SE TU P O N SE MAN TIC KIT TI 26 2. Evaluate & Save Model 3. Query Set Q (240 samples) 4. Get labels on Q Annotator Subsampled Semantic KITTI dataset D (6000) Oracle OR OR 5. Update subset S with Q 1. TrainSqSegV2 With no-DA OR DA 0. Select Initial subset 100 samples Test set T Semantic KITTI 2000 samples Dataaugmentationsetup Uniformsamplingon5 DA - RandomDropout - Cutout/FrustumDropout - Gaussiannoise depth - Gaussiannoise intensity - Instance Cutpaste AL training setup AL steps : 25 Budget : 240 samples Full Dataset D: 6000 samples [Random 1/5th subset of Semantic KITTI] Model : SqueezeSegV2 Pointcloud representation : Spherical Range image Test Set T : 2000 samples Heuristics : Random, Entropy, BALD Data-augmentation : with and without
  • 27. RE SU LTS L a b e l E f f i c i e n c y 27 DA provides gains in performance at each AL step on Full dataset of 6000 samples (Semantic KITTI subset)
  • 28. LABE L E F F ICIE N CY W i t h a n d w i t h o u t d a t a a u g m e n t a t i o n 28 BALD with DA has best performance on SemanticKITTI.
  • 29. R AN KE D H ARD E XAMPLE S BY TH E H E U RISTIC B AL D v s R a n d o m 29 • Larger diversity in samples from the BALD heuristic. • No Heuristic scores are available for Random heuristic since they are sampled uniformly Ground truth Prediction Heuristic function Ground truth Prediction Heuristic function Ground truth Prediction Heuristic function
  • 30. E F F E CT O F DATA AU GME N TATIO N F O R CLASSIF ICATIO N 30 Beck, Nathan, et al. "Effective Evaluation of Deep Active Learning on ImageClassification Tasks." arXiv preprintarXiv:2106.15324 (2021).
  • 31. DA enable better accuracies at each AL loop step DA provides better label efficiency by sampling pointclouds that are different from dataset+DA samples A heuristic’s performance with DA applied depends on the task Entropy with DA was better than a sophisticated heuristic function such as BADGE BALD along with DA was better than Random which performed better than Entropy Tasks : Classification vs Semantic Segmentation Recent work on Semi-Supervised learning to AL framework Similar effect of Data augmentations while working with unsupervised DA (consistency loss) CO N CLU SIO N S & F U TU RE WO RK CASE STU DY 2 31 Complete benchmark on full semantic KITTI and Semantic Navya dataset Aggregation maps uncertainty scores across a whole PC/image into a scalar Require a way to sample regions/volumes of PCs Heuristic functions are scalars and confound multiple regions of the image Find the most informativeset of data augmentations for a dataset to reduce redundancy Conclusions Future Work
  • 32. 1. Shorten, Connor, and Taghi M. Khoshgoftaar. "A survey on image data augmentation for deep learning." Journal of Big Data 6.1 (2019): 1-48. 2. Li, Peixuan, et al. "Rtm3d: Real-time monocular 3d detection from object keypoints for autonomous driving." Computer Vision–ECCV 2020: 16th European Conference, Glasgow, UK, August 23–28, 2020, Proceedings, Part III 16. Springer International Publishing, 2020. 3. Hu, Hou-Ning, et al. "Monocular Quasi-Dense 3D Object Tracking." arXiv preprint arXiv:2103.07351 (2021). 4. Santhakumar, Khailash, et al. "Exploring 2D data augmentation for 3D monocular object detection." arXiv preprint arXiv:2104.10786 (2021). 5. Lian, Qing, et al. "Geometry-aware data augmentation for monocular 3D object detection." arXiv preprint arXiv:2104.05858 (2021). 6. Ning, Guanghan, et al. "Data Augmentation for Object Detection via Differentiable Neural Rendering." arXiv preprint arXiv:2103.02852 (2021). 7. Chen, Shuxiao, Edgar Dobriban, and Jane H. Lee. "A group-theoretic framework for data augmentation." Journal of Machine Learning Research 21.245 (2020): 1-71. 8. Beck, Nathan, et al. "Effective Evaluation of Deep Active Learning on Image Classification Tasks." arXiv preprint arXiv:2106.15324 (2021). RE F E RE N CES I D a t a a u g m e n t a t i o n s f o r m o n o c u l a r 3 d d e t e c t i o n 32
  • 33. 1. Behley, Jens, et al. "Semantickitti: A dataset for semantic scene understanding of lidar sequences." Proceedings of the IEEE/CVF International Conference on Computer Vision. 2019. 2. Fong, Whye Kit, et al. "Panoptic nuScenes: A Large-Scale Benchmark for LiDAR Panoptic Segmentation and Tracking." arXiv preprint arXiv:2109.03805 (2021). 3. Milioto, Andres, et al. "Rangenet++: Fast and accurate lidar semantic segmentation." 2019 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS). IEEE, 2019. 4. Cortinhal, Tiago, George Tzelepis, and Eren Erdal Aksoy. "SalsaNext: fast, uncertainty-aware semantic segmentation of LiDAR point clouds for autonomous driving." arXiv preprint arXiv:2003.03653 (2020). 5. Zhou, Hui, et al. "Cylinder3d: An effective 3d framework for driving-scene lidar semantic segmentation." arXiv preprint arXiv:2008.01550 (2020). 6. Hahner, M., Dai, D., Liniger, A., & Van Gool, L. (2020). Quantifying data augmentation for lidar based 3d object detection. arXiv preprint arXiv:2004.01643. 7. Ash, Jordan T., et al. "Deep batch active learning by diverse, uncertain gradient lower bounds." arXiv preprint arXiv:1906.03671 (2019), ICLR 2020. BADGE 8. https://baal.readthedocs.io/en/latest/ 9. https://decile-team-distil.readthedocs.io/en/latest/index.html 10. Gao, Mingfei, et al. "Consistency-based semi-supervised active learning: Towards minimizing labeling cost." European Conference on Computer Vision. Springer, Cham, 2020. RE F E RE N CES II D a t a a u g m e n t a t i o n s w i t h i n Ac t i v e l e a r n i n g f o r d a t a r e d u n d a n c y 33