Fisheye/Omnidirectional View in
Autonomous Driving V
Yu Huang
Outline
• Road-line detection and 3D reconstruction using fisheye cameras
• Vehicle Re-ID for Surround-view Camera System
• SynDistNet: Self-Supervised Monocular Fisheye Camera Distance
Estimation Synergized with Semantic Segmentation for Autonomous
Driving
• Universal Semantic Segmentation for Fisheye Urban Driving Images
• UnRectDepthNet: Self-Supervised Monocular Depth Estimation using a
Generic Framework for Handling Common Camera Distortion Models
• OmniDet: Surround View Cameras based Multi-task Visual Perception
Network for Autonomous Driving
• Adversarial Attacks on Multi-task Visual Perception for Autonomous Driving
Road-line detection and 3D reconstruction
using fisheye cameras
• In future ADAS, smart monitoring of the vehicle environment is a key issue.
• Fisheye cameras have become popular as they provide a panoramic view with
a few low-cost sensors.
• However, current ADAS systems make limited use of them, as most of the underlying image processing has been designed for perspective views only.
• This article illustrates how the theoretical work done in omnidirectional vision over the past ten years can help to tackle this issue.
• To do so, the authors evaluate a simple algorithm for road-line detection based on the unified sphere model in real conditions.
• They first highlight the interest of using fisheye cameras in a vehicle, then present experimental results on the detection of lines on a set of 180 images,
• and finally show how the 3D position of the lines can be recovered by triangulation.
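The two geometric ingredients of this pipeline, lifting a fisheye pixel to a viewing ray on the unit sphere (unified sphere model) and triangulating a 3D point from two such rays, can be sketched as follows. This is a minimal NumPy illustration, not the paper's implementation; the intrinsics and the mirror parameter ξ in the usage below are made-up values.

```python
import numpy as np

def lift_to_sphere(u, v, fx, fy, cx, cy, xi):
    """Lift a fisheye pixel (u, v) to a unit-norm viewing ray under the
    unified sphere model; xi = 0 reduces to an ordinary pinhole camera."""
    x, y = (u - cx) / fx, (v - cy) / fy          # normalized image coordinates
    r2 = x * x + y * y
    eta = (xi + np.sqrt(1.0 + (1.0 - xi * xi) * r2)) / (r2 + 1.0)
    return np.array([eta * x, eta * y, eta - xi])  # point on the unit sphere

def triangulate_midpoint(o1, d1, o2, d2):
    """Midpoint triangulation of two rays o + t*d (d need not be unit)."""
    A = np.array([[d1 @ d1, -(d1 @ d2)],
                  [d1 @ d2, -(d2 @ d2)]])
    b = np.array([(o2 - o1) @ d1, (o2 - o1) @ d2])
    t1, t2 = np.linalg.solve(A, b)               # closest points on each ray
    return 0.5 * ((o1 + t1 * d1) + (o2 + t2 * d2))
```

Lifting a pixel observed in two camera poses and intersecting the two rays recovers the 3D line point, which is how the road lines are reconstructed here.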
Vehicle Re-ID for Surround-view Camera System
• Vehicle re-identification (Re-ID) plays a critical role in the perception system of autonomous driving and has attracted more and more attention in recent years.
• However, there is no existing complete solution for the surround-view camera system mounted on the vehicle.
• Two main challenges arise in this scenario: i) in a single-camera view, it is difficult to recognize the same vehicle across past image frames due to fisheye distortion, occlusion, truncation, etc.; ii) in a multi-camera view, the appearance of the same vehicle varies greatly across different camera viewpoints.
• Thus, the paper presents an integral vehicle Re-ID solution to address these problems.
• Specifically, it proposes a quality evaluation mechanism to balance the effects of tracking-box drift and target consistency.
• Besides, it takes advantage of an attention-based Re-ID network, combined with a spatial constraint strategy, to further boost performance across different cameras.
• The experiments demonstrate that the solution achieves state-of-the-art accuracy while being real-time in practice.
• The authors will also release the code and an annotated fisheye dataset for the benefit of the community.
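The combination of an appearance metric with a spatial gate can be illustrated as below. This is a simplified sketch of the idea, not the paper's exact association rule; the function name, thresholds, and the ground-plane positions are all illustrative assumptions.

```python
import numpy as np

def match_with_spatial_gate(query_feat, query_xy, gallery_feats, gallery_xy,
                            max_dist_m=3.0, max_cos_dist=0.4):
    """Illustrative cross-camera Re-ID association.

    Appearance cost: cosine distance between L2-normalized embeddings.
    Spatial constraint: a gallery candidate is admissible only if its
    ground-plane position lies within max_dist_m of the query's.
    Returns the index of the best gallery match, or -1 if none qualifies.
    """
    q = query_feat / np.linalg.norm(query_feat)
    G = gallery_feats / np.linalg.norm(gallery_feats, axis=1, keepdims=True)
    cos_dist = 1.0 - G @ q                                   # appearance cost
    spatial_ok = np.linalg.norm(gallery_xy - query_xy, axis=1) <= max_dist_m
    cos_dist[~spatial_ok] = np.inf                           # gate out far targets
    best = int(np.argmin(cos_dist))
    return best if cos_dist[best] <= max_cos_dist else -1
```

The gate is what makes look-alike vehicles in different parts of the scene (such as the two black vehicles in the figure below) distinguishable even when their embeddings are close.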
Vehicle Re-ID for Surround-view Camera System
Vehicles in a single fisheye camera view. (a) The features of the same vehicle change dramatically in consecutive frames, and vehicles tend to obscure each other. (b) Matching errors are caused by tracking results. (c) The vehicle center indicated by the orange box is stable, while the IoU in consecutive frames, indicated by the yellow box, decreases with movement.
Vehicle Re-ID for Surround-view Camera System
The overall framework of vehicle Re-ID in a single camera. Each object is assigned a single tracker to realize Re-ID in a single channel. Tracking templates are initialized with object detection results. All tracking outputs are post-processed by the quality evaluation module to deal with distorted or occluded objects.
Vehicle Re-ID for Surround-view Camera System
Samples captured by different cameras. (a) The appearance of the same vehicle captured by different cameras varies greatly; the same color represents the same object. (b) Objects with a similar appearance may appear in the same camera view, as shown by the two black vehicles in green boxes.
Vehicle Re-ID for Surround-view Camera System
Illustration of the multi-camera Re-ID network. The network is a two-branch parallel structure. The top branch is employed to make the network pay more attention to object regions, and the other branch extracts global features.
Vehicle Re-ID for Surround-view Camera System
Projection uncertainty of key points. Ellipse 1 and ellipse 2 are uncertainty
ranges of front and left (right) cameras, respectively.
Vehicle Re-ID for Surround-view Camera System
The overall framework of vehicle Re-ID across multiple cameras. For a new target, the Re-ID model first extracts the features, and then distance metrics are computed between this feature and the features in the gallery. Besides, the spatial constraint strategy is adopted to improve the association.
SynDistNet: Self-Supervised Monocular Fisheye
Camera Distance Estimation Synergized with
Semantic Segmentation for Autonomous Driving
• Self-supervised learning approaches for monocular depth estimation usually suffer from scale ambiguity.
• They also do not generalize well when applied to distance estimation for complex projection models such as fisheye and omnidirectional cameras.
• This work introduces a multi-task learning strategy to improve self-supervised monocular distance estimation on fisheye and pinhole camera images.
• The contribution is threefold:
• Firstly, it introduces a distance estimation network architecture using a self-attention based encoder coupled with robust semantic feature guidance to the decoder, which can be trained in a one-stage fashion.
• Secondly, it integrates a generalized robust loss function, which improves performance significantly while removing the need for hyperparameter tuning of the reprojection loss.
• Finally, it reduces the artifacts caused by dynamic objects violating the static-world assumption by using a semantic masking strategy.
• The method significantly improves upon previous work on fisheye images, with a 25% reduction in RMSE.
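The generalized robust loss referred to above is Barron's general and adaptive robust loss, a single family ρ(x, α, c) that interpolates between the classic penalties. A NumPy sketch of the function (the paper learns α during training rather than hand-tuning it):

```python
import numpy as np

def general_robust_loss(x, alpha, c=1.0):
    """Barron's general robust loss rho(x, alpha, c).

    alpha = 2 -> L2, alpha = 1 -> Charbonnier / pseudo-Huber,
    alpha = 0 -> Cauchy/Lorentzian, alpha -> -inf -> Welsch.
    c sets the scale at which the loss transitions to its robust tail.
    """
    z = (x / c) ** 2
    if alpha == 2.0:                     # removable singularities of the
        return 0.5 * z                   # general expression below
    if alpha == 0.0:
        return np.log1p(0.5 * z)
    if np.isneginf(alpha):
        return 1.0 - np.exp(-0.5 * z)
    b = abs(alpha - 2.0)
    return (b / alpha) * ((z / b + 1.0) ** (alpha / 2.0) - 1.0)
```

Using this family for the photometric reprojection term replaces a fixed, hand-tuned penalty with one whose shape parameter can be optimized alongside the network.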
SynDistNet: Self-Supervised Monocular Fisheye
Camera Distance Estimation Synergized with
Semantic Segmentation for Autonomous Driving
Overview of the joint prediction of distance D and semantic segmentation M from a single input image I. Compared to previous approaches, semantically guided distance estimation produces sharper depth edges and reasonable distance estimates for dynamic objects.
SynDistNet: Self-Supervised Monocular Fisheye
Camera Distance Estimation Synergized with
Semantic Segmentation for Autonomous Driving
Overview of the proposed framework for the joint prediction of distance and semantic segmentation. The upper part (blue blocks) describes the individual steps for the depth estimation, while the green blocks describe the individual steps needed for the prediction of the semantic segmentation. Both tasks are optimized inside a multi-task network using the weighted total loss.
SynDistNet: Self-Supervised Monocular Fisheye
Camera Distance Estimation Synergized with
Semantic Segmentation for Autonomous Driving
Application of semantic masking methods to handle potentially dynamic objects. The dynamic objects inside the segmentation masks from consecutive frames in (b) and (d) are accumulated into a dynamic-object mask, which is used to mask the photometric error (e), as shown in (h).
SynDistNet: Self-Supervised Monocular Fisheye
Camera Distance Estimation Synergized with
Semantic Segmentation for Autonomous Driving
Visualization of our proposed network architecture to
semantically guide the depth estimation. We utilize a
self-attention based encoder and a semantically guided
decoder using pixel-adaptive convolutions.
Universal Semantic Segmentation for Fisheye
Urban Driving Images
• Semantic segmentation is a critical method in the field of autonomous driving. When performing semantic image segmentation, a wider field of view (FoV) helps to obtain more information about the surrounding environment, making automated driving safer and more reliable; such a wider FoV can be offered by fisheye cameras.
• In this paper, a seven-DoF augmentation method is proposed to transform rectilinear images into fisheye images in a more comprehensive way.
• In the training process, rectilinear images are transformed into fisheye images in seven DoF, which simulates fisheye images taken by cameras of different positions, orientations and focal lengths. The results show that training with the seven-DoF augmentation improves the model's accuracy and robustness against differently distorted fisheye data.
• This seven-DoF augmentation provides a universal semantic segmentation solution for fisheye cameras in different autonomous driving applications.
• The paper also provides specific parameter settings of the augmentation for autonomous driving.
• Finally, the universal semantic segmentation model was tested on real fisheye images and obtained satisfactory results.
• The code and configurations are released at https://github.com/Yaozhuwa/FisheyeSeg.
Universal Semantic Segmentation for Fisheye
Urban Driving Images
Projection model of the fisheye camera. PW is a point on a rectilinear image that is placed on the x-y plane of the world coordinate system. θ is the angle of incidence of the point relative to the fisheye camera. P is the imaging point of PW on the fisheye image, with |OP| = fθ. The relative rotation and translation between the world coordinate system and the camera coordinate system yield six degrees of freedom.
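The base warp behind this augmentation, mapping a rectilinear pixel to the equidistant fisheye image via |OP| = fθ, can be sketched as follows for the zero-rotation, zero-translation case (the seven-DoF augmentation additionally rotates/translates the virtual camera and varies the focal length before applying this mapping; the function name and focal lengths are illustrative):

```python
import numpy as np

def rectilinear_to_fisheye(x, y, f_rect, f_fish):
    """Map a rectilinear pixel offset (x, y) from the principal point to the
    equidistant fisheye image (|OP| = f*theta), camera axes aligned."""
    r = np.hypot(x, y)                       # radial distance on the image plane
    theta = np.arctan2(r, f_rect)            # angle of incidence of the ray
    rho = f_fish * theta                     # equidistant model: |OP| = f*theta
    # radial rescaling factor; at the centre it tends to f_fish / f_rect
    scale = np.where(r > 0, rho / np.maximum(r, 1e-12), f_fish / f_rect)
    return x * scale, y * scale
```

A ray hitting the rectilinear plane at 45° (x = f_rect, y = 0) lands at fisheye radius f_fish·π/4, which is why the synthetic images bend increasingly toward the periphery.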
Universal Semantic Segmentation for Fisheye
Urban Driving Images
The six-DoF augmentation. Except for the first row, every image is transformed using a virtual fisheye camera with a focal length of 300 pixels. The letter in brackets indicates which axis the camera is panning along or rotating around.
Universal Semantic Segmentation for Fisheye
Urban Driving Images
The synthetic fisheye images with different focal lengths f.
Universal Semantic Segmentation for Fisheye
Urban Driving Images
Semantic segmentation of real fisheye images.
UnRectDepthNet: Self-Supervised Monocular
Depth Estimation using a Generic Framework for
Handling Common Camera Distortion Models
• Image rectification simplifies depth estimation significantly, and thus it has been adopted in CNN approaches.
• However, rectification has several side effects, including a reduced field of view (FOV), resampling distortion, and sensitivity to calibration errors.
• This paper proposes a generic scale-aware self-supervised pipeline for estimating depth, Euclidean distance, and visual odometry from unrectified monocular videos.
• It demonstrates precision on the unrectified KITTI dataset with barrel distortion comparable to that on the rectified KITTI dataset.
• The intuition is that the rectification step can be implicitly absorbed within the CNN model, which learns the distortion model without increasing complexity.
• The model does not suffer from a reduced field of view and avoids the computational cost of rectification at inference time.
• To further illustrate the general applicability of the proposed framework, it is applied to wide-angle fisheye cameras with a 190° horizontal field of view.
• The training framework, UnRectDepthNet, takes the camera distortion model as an argument and adapts the projection and unprojection functions accordingly.
UnRectDepthNet: Self-Supervised Monocular
Depth Estimation using a Generic Framework for
Handling Common Camera Distortion Models
Depth obtained from a single unrectified (left) and rectified KITTI image (right). Our scale-aware model, UnRectDepthNet, yields precise boundaries and fine-grained depth maps.
UnRectDepthNet: Self-Supervised Monocular
Depth Estimation using a Generic Framework for
Handling Common Camera Distortion Models
Illustration of distortion correction in the KITTI and WoodScape datasets. The first row shows a raw KITTI image with barrel distortion and the corresponding rectified image. The red box was used to crop out black pixels in the periphery, causing a loss of FOV. The second row shows a raw WoodScape image with strong fisheye lens distortion and the corresponding rectified image exhibiting a drastic loss of FOV.
UnRectDepthNet: Self-Supervised Monocular
Depth Estimation using a Generic Framework for
Handling Common Camera Distortion Models
Compared to regular lenses, the fisheye projection is a complex multi-stage process; the detailed steps are listed below:
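The staged nature of the projection can be sketched as follows: a 3D camera-frame point is reduced to its incidence angle, the angle is mapped to an image radius by a distortion polynomial, and the radius is distributed along the in-plane direction. This is a WoodScape-style polynomial sketch under stated assumptions; the coefficients and principal point are made-up values, not real calibration.

```python
import numpy as np

def project_fisheye(X, k, cx, cy):
    """Project a 3-D camera-frame point with a polynomial fisheye model.

    rho(theta) = k1*theta + k2*theta^2 + k3*theta^3 + k4*theta^4 maps the
    incidence angle to the image radius; (cx, cy) is the principal point.
    """
    x, y, z = X
    chi = np.hypot(x, y)                 # distance from the optical axis
    theta = np.arctan2(chi, z)           # angle of incidence
    rho = sum(ki * theta ** (i + 1) for i, ki in enumerate(k))
    if chi > 0:                          # unit in-plane direction times radius
        u, v = rho * x / chi, rho * y / chi
    else:
        u, v = 0.0, 0.0                  # point on the optical axis
    return u + cx, v + cy
```

With k = (f, 0, 0, 0) this degenerates to the equidistant model ρ = fθ used earlier; the higher-order terms let the calibration fit a real lens.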
UnRectDepthNet: Self-Supervised Monocular
Depth Estimation using a Generic Framework for
Handling Common Camera Distortion Models
The radial distortion models are summarized below:
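For a unit focal length, the classical radial models differ only in how the incidence angle θ maps to the image radius r; a compact summary of the models typically compared in fisheye calibration (illustrative, matching the common textbook forms):

```python
import numpy as np

# Mapping theta -> r for focal length 1, from least to most compressive:
RADIAL_MODELS = {
    "rectilinear":   lambda t: np.tan(t),           # pinhole, r = f*tan(theta)
    "stereographic": lambda t: 2.0 * np.tan(t / 2.0),
    "equidistant":   lambda t: t,                   # r = f*theta
    "equisolid":     lambda t: 2.0 * np.sin(t / 2.0),
    "orthographic":  lambda t: np.sin(t),           # limited to theta < 90 deg
}
```

The rectilinear model diverges as θ approaches 90°, which is precisely why wide-FOV fisheye lenses use one of the other mappings.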
UnRectDepthNet: Self-Supervised Monocular
Depth Estimation using a Generic Framework for
Handling Common Camera Distortion Models
A self-supervised monocular structure-from-motion (SfM) formulation:
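The SfM objective minimizes a photometric reconstruction error between the target frame and views synthesized from adjacent frames via the predicted depth and pose. A common form of that error (an assumption about the exact weighting, in the style of Monodepth-type pipelines) mixes SSIM and L1; the sketch below uses a simplified global-statistics SSIM, whereas real pipelines use a 3x3 local SSIM in the deep learning framework's differentiable ops:

```python
import numpy as np

def ssim_global(a, b, c1=0.01 ** 2, c2=0.03 ** 2):
    """Global-statistics SSIM between two images in [0, 1] (simplification)."""
    mu_a, mu_b = a.mean(), b.mean()
    var_a, var_b = a.var(), b.var()
    cov = ((a - mu_a) * (b - mu_b)).mean()
    return ((2 * mu_a * mu_b + c1) * (2 * cov + c2) /
            ((mu_a ** 2 + mu_b ** 2 + c1) * (var_a + var_b + c2)))

def photometric_loss(target, reconstructed, alpha=0.85):
    """pe(I_t, I_hat): weighted mix of (1 - SSIM)/2 and L1, the per-frame
    reconstruction error minimised by the self-supervised depth + pose loss."""
    l1 = np.abs(target - reconstructed).mean()
    return alpha * 0.5 * (1.0 - ssim_global(target, reconstructed)) + (1.0 - alpha) * l1
```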
UnRectDepthNet: Self-Supervised Monocular
Depth Estimation using a Generic Framework for
Handling Common Camera Distortion Models
• The UnRectDepthNet training block on the right enables the usage of various camera models, generically listed in the black box.
• The distortion is then handled internally in the unprojection and projection steps of the transformation from It to It−1.
• The paper tests it with KITTI barrel-distorted and WoodScape fisheye-distorted video sequences.
• The block on the left indicates the entire workflow of the training pipeline, where the top row depicts the ego masks Mt→t−1, Mt→t+1, representing the valid pixel coordinates while synthesizing Ît−1→t from It−1 and Ît+1→t from It+1, respectively.
• The following row showcases the masks used to filter static pixels, obtained after training for two epochs; the black pixels are removed from the reconstruction loss.
• Dynamic objects moving at a speed similar to the ego car's, as well as homogeneous areas, are filtered out to prevent contamination of the reconstruction loss.
• The third row shows the depth predictions, where the scale ambiguity is resolved using the ego vehicle's odometry data.
• Finally, the top block illustrates the inference output.
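The static-pixel filtering described above can be illustrated with a Monodepth2-style auto-mask (an assumption about the exact criterion; the paper's mask computation may differ): a pixel is kept only when warping the source frame explains it better than the raw, unwarped source frame does.

```python
import numpy as np

def static_pixel_mask(target, warped, source):
    """Keep a pixel iff the warped source reconstructs it better than the
    unwarped source. Objects moving with the ego vehicle and textureless
    regions fail this test and are dropped from the reconstruction loss."""
    err_warped = np.abs(target - warped)     # per-pixel photometric error
    err_source = np.abs(target - source)
    return err_warped < err_source           # True = pixel kept in the loss
```

For objects moving at the camera's own speed, the unwarped source already matches the target, so the inequality fails and those pixels are excluded, exactly the failure mode the bullets describe.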
OmniDet: Surround View Cameras based Multi-task
Visual Perception Network for Autonomous Driving
• Surround-view fisheye cameras are commonly deployed in automated driving for 360° near-field sensing around the vehicle.
• This work presents a multi-task visual perception network on unrectified fisheye images to enable the vehicle to sense its surrounding environment.
• It consists of six primary tasks necessary for an autonomous driving system: depth estimation, visual odometry, semantic segmentation, motion segmentation, object detection, and lens soiling detection.
• The jointly trained model is demonstrated to perform better than the respective single-task versions.
• The multi-task model has a shared encoder, providing a significant computational advantage, and has synergized decoders where tasks support each other.
• A novel camera-geometry based adaptation mechanism is proposed to encode the fisheye distortion model both at training and inference time.
• This was crucial to enable training on the WoodScape dataset, comprised of data from different parts of the world collected by 12 different cameras mounted on three different cars with different intrinsics and viewpoints.
• Given that bounding boxes are not a good representation for distorted fisheye images, object detection is also extended to use a polygon with non-uniformly sampled vertices.
• The model is additionally evaluated on standard automotive datasets, namely KITTI and Cityscapes.
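One plausible reading of the camera-geometry adaptation is to give the network per-pixel channels derived from each camera's intrinsics, so one shared model can serve 12 differently calibrated cameras. The sketch below builds a single angle-of-incidence channel and is purely illustrative: the function name, image size, and the inverse-distortion argument are hypothetical, and the paper encodes several such channels.

```python
import numpy as np

def camera_geometry_tensor(h, w, cx, cy, theta_of_rho):
    """Per-pixel map of the incidence angle, recovered from each pixel's
    radial distance via the camera's inverse distortion function
    theta_of_rho. Concatenated to the RGB input, it tells the shared
    network which camera (and which part of its FOV) it is looking at."""
    v, u = np.mgrid[0:h, 0:w].astype(float)   # pixel row/column grids
    rho = np.hypot(u - cx, v - cy)            # radial distance in pixels
    return theta_of_rho(rho)                  # angle-of-incidence map, h x w

# usage with an equidistant model rho = f*theta  ->  theta = rho / f:
# tensor = camera_geometry_tensor(964, 1280, 640.0, 482.0, lambda r: r / 300.0)
```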
Overview of our Surround View cameras based multi-task visual perception framework.
Adversarial Attacks on Multi-task Visual
Perception for Autonomous Driving
• In recent years, deep neural networks (DNNs) have accomplished impressive success in various applications, including autonomous driving perception tasks.
• On the other hand, deep neural networks are easily fooled by adversarial attacks.
• This vulnerability raises significant concerns, particularly in safety-critical applications.
• As a result, research into attacking and defending DNNs has gained much coverage.
• In this work, detailed adversarial attacks are applied to a diverse multi-task visual perception deep network across distance estimation, semantic segmentation, motion detection, and object detection.
• The experiments consider both white-box and black-box attacks for targeted and untargeted cases, attacking one task and inspecting the effect on all the others, in addition to inspecting the effect of applying a simple defense method.
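As a concrete example of the simplest white-box method in this family, an untargeted FGSM step perturbs the input along the sign of the loss gradient with respect to the input. The sketch below is framework-agnostic: the gradient itself would come from the framework's autodiff in practice, and the epsilon value is illustrative.

```python
import numpy as np

def fgsm(x, grad_loss_x, eps=0.03):
    """Untargeted FGSM: step in the direction that increases the task loss,
    clipped to the valid image range [0, 1].

    x           : input image (values in [0, 1])
    grad_loss_x : gradient of the task loss w.r.t. x at the current point
    eps         : L-infinity budget of the perturbation
    """
    x_adv = x + eps * np.sign(grad_loss_x)
    return np.clip(x_adv, 0.0, 1.0)
```

A targeted variant instead steps down the gradient of the loss toward the attacker's chosen output; black-box variants estimate or transfer the gradient rather than reading it from the model.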
Adversarial Attacks on Multi-task Visual
Perception for Autonomous Driving
Adversarial attacks on the OmniDet MTL model. The distance, segmentation, motion and detection perception tasks are attacked by white-box and black-box methods with targeted and un-targeted objectives, resulting in incorrect model predictions.
Adversarial Attacks on Multi-task Visual
Perception for Autonomous Driving
Illustration of the baseline multi-task architecture comprising four tasks.
Adversarial Attacks on Multi-task Visual
Perception for Autonomous Driving
White-box un-targeted, white-box targeted, black-box un-targeted, and black-box targeted attacks. Within each group, from top to bottom and left to right: original results, adversarial perturbations, and the impacted results.
Simulation for autonomous driving at uber atgSimulation for autonomous driving at uber atg
Simulation for autonomous driving at uber atg
 
Multi sensor calibration by deep learning
Multi sensor calibration by deep learningMulti sensor calibration by deep learning
Multi sensor calibration by deep learning
 
Prediction and planning for self driving at waymo
Prediction and planning for self driving at waymoPrediction and planning for self driving at waymo
Prediction and planning for self driving at waymo
 
Jointly mapping, localization, perception, prediction and planning
Jointly mapping, localization, perception, prediction and planningJointly mapping, localization, perception, prediction and planning
Jointly mapping, localization, perception, prediction and planning
 
Data pipeline and data lake for autonomous driving
Data pipeline and data lake for autonomous drivingData pipeline and data lake for autonomous driving
Data pipeline and data lake for autonomous driving
 
Open Source codes of trajectory prediction & behavior planning
Open Source codes of trajectory prediction & behavior planningOpen Source codes of trajectory prediction & behavior planning
Open Source codes of trajectory prediction & behavior planning
 
Lidar in the adverse weather: dust, fog, snow and rain
Lidar in the adverse weather: dust, fog, snow and rainLidar in the adverse weather: dust, fog, snow and rain
Lidar in the adverse weather: dust, fog, snow and rain
 
Autonomous Driving of L3/L4 Commercial trucks
Autonomous Driving of L3/L4 Commercial trucksAutonomous Driving of L3/L4 Commercial trucks
Autonomous Driving of L3/L4 Commercial trucks
 

Recently uploaded

INFLUENCE OF NANOSILICA ON THE PROPERTIES OF CONCRETE
INFLUENCE OF NANOSILICA ON THE PROPERTIES OF CONCRETEINFLUENCE OF NANOSILICA ON THE PROPERTIES OF CONCRETE
INFLUENCE OF NANOSILICA ON THE PROPERTIES OF CONCRETEroselinkalist12
 
Risk Assessment For Installation of Drainage Pipes.pdf
Risk Assessment For Installation of Drainage Pipes.pdfRisk Assessment For Installation of Drainage Pipes.pdf
Risk Assessment For Installation of Drainage Pipes.pdfROCENODodongVILLACER
 
UNIT III ANALOG ELECTRONICS (BASIC ELECTRONICS)
UNIT III ANALOG ELECTRONICS (BASIC ELECTRONICS)UNIT III ANALOG ELECTRONICS (BASIC ELECTRONICS)
UNIT III ANALOG ELECTRONICS (BASIC ELECTRONICS)Dr SOUNDIRARAJ N
 
Effects of rheological properties on mixing
Effects of rheological properties on mixingEffects of rheological properties on mixing
Effects of rheological properties on mixingviprabot1
 
Call Girls Narol 7397865700 Independent Call Girls
Call Girls Narol 7397865700 Independent Call GirlsCall Girls Narol 7397865700 Independent Call Girls
Call Girls Narol 7397865700 Independent Call Girlsssuser7cb4ff
 
EduAI - E learning Platform integrated with AI
EduAI - E learning Platform integrated with AIEduAI - E learning Platform integrated with AI
EduAI - E learning Platform integrated with AIkoyaldeepu123
 
Application of Residue Theorem to evaluate real integrations.pptx
Application of Residue Theorem to evaluate real integrations.pptxApplication of Residue Theorem to evaluate real integrations.pptx
Application of Residue Theorem to evaluate real integrations.pptx959SahilShah
 
Churning of Butter, Factors affecting .
Churning of Butter, Factors affecting  .Churning of Butter, Factors affecting  .
Churning of Butter, Factors affecting .Satyam Kumar
 
Architect Hassan Khalil Portfolio for 2024
Architect Hassan Khalil Portfolio for 2024Architect Hassan Khalil Portfolio for 2024
Architect Hassan Khalil Portfolio for 2024hassan khalil
 
An experimental study in using natural admixture as an alternative for chemic...
An experimental study in using natural admixture as an alternative for chemic...An experimental study in using natural admixture as an alternative for chemic...
An experimental study in using natural admixture as an alternative for chemic...Chandu841456
 
Software and Systems Engineering Standards: Verification and Validation of Sy...
Software and Systems Engineering Standards: Verification and Validation of Sy...Software and Systems Engineering Standards: Verification and Validation of Sy...
Software and Systems Engineering Standards: Verification and Validation of Sy...VICTOR MAESTRE RAMIREZ
 
pipeline in computer architecture design
pipeline in computer architecture  designpipeline in computer architecture  design
pipeline in computer architecture designssuser87fa0c1
 
CCS355 Neural Network & Deep Learning Unit II Notes with Question bank .pdf
CCS355 Neural Network & Deep Learning Unit II Notes with Question bank .pdfCCS355 Neural Network & Deep Learning Unit II Notes with Question bank .pdf
CCS355 Neural Network & Deep Learning Unit II Notes with Question bank .pdfAsst.prof M.Gokilavani
 
Sachpazis Costas: Geotechnical Engineering: A student's Perspective Introduction
Sachpazis Costas: Geotechnical Engineering: A student's Perspective IntroductionSachpazis Costas: Geotechnical Engineering: A student's Perspective Introduction
Sachpazis Costas: Geotechnical Engineering: A student's Perspective IntroductionDr.Costas Sachpazis
 
DATA ANALYTICS PPT definition usage example
DATA ANALYTICS PPT definition usage exampleDATA ANALYTICS PPT definition usage example
DATA ANALYTICS PPT definition usage examplePragyanshuParadkar1
 
Call Us ≽ 8377877756 ≼ Call Girls In Shastri Nagar (Delhi)
Call Us ≽ 8377877756 ≼ Call Girls In Shastri Nagar (Delhi)Call Us ≽ 8377877756 ≼ Call Girls In Shastri Nagar (Delhi)
Call Us ≽ 8377877756 ≼ Call Girls In Shastri Nagar (Delhi)dollysharma2066
 
Artificial-Intelligence-in-Electronics (K).pptx
Artificial-Intelligence-in-Electronics (K).pptxArtificial-Intelligence-in-Electronics (K).pptx
Artificial-Intelligence-in-Electronics (K).pptxbritheesh05
 

Recently uploaded (20)

INFLUENCE OF NANOSILICA ON THE PROPERTIES OF CONCRETE
INFLUENCE OF NANOSILICA ON THE PROPERTIES OF CONCRETEINFLUENCE OF NANOSILICA ON THE PROPERTIES OF CONCRETE
INFLUENCE OF NANOSILICA ON THE PROPERTIES OF CONCRETE
 
Risk Assessment For Installation of Drainage Pipes.pdf
Risk Assessment For Installation of Drainage Pipes.pdfRisk Assessment For Installation of Drainage Pipes.pdf
Risk Assessment For Installation of Drainage Pipes.pdf
 
UNIT III ANALOG ELECTRONICS (BASIC ELECTRONICS)
UNIT III ANALOG ELECTRONICS (BASIC ELECTRONICS)UNIT III ANALOG ELECTRONICS (BASIC ELECTRONICS)
UNIT III ANALOG ELECTRONICS (BASIC ELECTRONICS)
 
Effects of rheological properties on mixing
Effects of rheological properties on mixingEffects of rheological properties on mixing
Effects of rheological properties on mixing
 
Call Girls Narol 7397865700 Independent Call Girls
Call Girls Narol 7397865700 Independent Call GirlsCall Girls Narol 7397865700 Independent Call Girls
Call Girls Narol 7397865700 Independent Call Girls
 
EduAI - E learning Platform integrated with AI
EduAI - E learning Platform integrated with AIEduAI - E learning Platform integrated with AI
EduAI - E learning Platform integrated with AI
 
young call girls in Green Park🔝 9953056974 🔝 escort Service
young call girls in Green Park🔝 9953056974 🔝 escort Serviceyoung call girls in Green Park🔝 9953056974 🔝 escort Service
young call girls in Green Park🔝 9953056974 🔝 escort Service
 
Application of Residue Theorem to evaluate real integrations.pptx
Application of Residue Theorem to evaluate real integrations.pptxApplication of Residue Theorem to evaluate real integrations.pptx
Application of Residue Theorem to evaluate real integrations.pptx
 
Churning of Butter, Factors affecting .
Churning of Butter, Factors affecting  .Churning of Butter, Factors affecting  .
Churning of Butter, Factors affecting .
 
Architect Hassan Khalil Portfolio for 2024
Architect Hassan Khalil Portfolio for 2024Architect Hassan Khalil Portfolio for 2024
Architect Hassan Khalil Portfolio for 2024
 
young call girls in Rajiv Chowk🔝 9953056974 🔝 Delhi escort Service
young call girls in Rajiv Chowk🔝 9953056974 🔝 Delhi escort Serviceyoung call girls in Rajiv Chowk🔝 9953056974 🔝 Delhi escort Service
young call girls in Rajiv Chowk🔝 9953056974 🔝 Delhi escort Service
 
9953056974 Call Girls In South Ex, Escorts (Delhi) NCR.pdf
9953056974 Call Girls In South Ex, Escorts (Delhi) NCR.pdf9953056974 Call Girls In South Ex, Escorts (Delhi) NCR.pdf
9953056974 Call Girls In South Ex, Escorts (Delhi) NCR.pdf
 
An experimental study in using natural admixture as an alternative for chemic...
An experimental study in using natural admixture as an alternative for chemic...An experimental study in using natural admixture as an alternative for chemic...
An experimental study in using natural admixture as an alternative for chemic...
 
Software and Systems Engineering Standards: Verification and Validation of Sy...
Software and Systems Engineering Standards: Verification and Validation of Sy...Software and Systems Engineering Standards: Verification and Validation of Sy...
Software and Systems Engineering Standards: Verification and Validation of Sy...
 
pipeline in computer architecture design
pipeline in computer architecture  designpipeline in computer architecture  design
pipeline in computer architecture design
 
CCS355 Neural Network & Deep Learning Unit II Notes with Question bank .pdf
CCS355 Neural Network & Deep Learning Unit II Notes with Question bank .pdfCCS355 Neural Network & Deep Learning Unit II Notes with Question bank .pdf
CCS355 Neural Network & Deep Learning Unit II Notes with Question bank .pdf
 
Sachpazis Costas: Geotechnical Engineering: A student's Perspective Introduction
Sachpazis Costas: Geotechnical Engineering: A student's Perspective IntroductionSachpazis Costas: Geotechnical Engineering: A student's Perspective Introduction
Sachpazis Costas: Geotechnical Engineering: A student's Perspective Introduction
 
DATA ANALYTICS PPT definition usage example
DATA ANALYTICS PPT definition usage exampleDATA ANALYTICS PPT definition usage example
DATA ANALYTICS PPT definition usage example
 
Call Us ≽ 8377877756 ≼ Call Girls In Shastri Nagar (Delhi)
Call Us ≽ 8377877756 ≼ Call Girls In Shastri Nagar (Delhi)Call Us ≽ 8377877756 ≼ Call Girls In Shastri Nagar (Delhi)
Call Us ≽ 8377877756 ≼ Call Girls In Shastri Nagar (Delhi)
 
Artificial-Intelligence-in-Electronics (K).pptx
Artificial-Intelligence-in-Electronics (K).pptxArtificial-Intelligence-in-Electronics (K).pptx
Artificial-Intelligence-in-Electronics (K).pptx
 

Vehicle Re-ID for Surround-view Camera System
• Vehicle re-identification (Re-ID) plays a critical role in the perception system of autonomous driving and has attracted increasing attention in recent years.
• However, there is no existing complete solution for the surround-view camera system mounted on the vehicle.
• There are two main challenges in this scenario: i) in single-camera view, it is difficult to recognize the same vehicle across past image frames due to fisheye distortion, occlusion, truncation, etc.; ii) in multi-camera view, the appearance of the same vehicle varies greatly across camera viewpoints.
• Thus, an integral vehicle Re-ID solution is presented to address these problems.
• Specifically, a quality evaluation mechanism is proposed to balance the effects of tracking-box drift and target consistency.
• In addition, a Re-ID network based on an attention mechanism is combined with a spatial constraint strategy to further boost performance across different cameras.
• Experiments demonstrate that the solution achieves state-of-the-art accuracy while running in real time in practice.
• The code and annotated fisheye dataset will be released for the benefit of the community.
Vehicle Re-ID for Surround-view Camera System
Vehicle Re-ID for Surround-view Camera System
Vehicles in a single fisheye camera view. (a) The same vehicle's features change dramatically across consecutive frames, and vehicles tend to obscure each other. (b) Matching errors are caused by tracking results. (c) The vehicle center indicated by the orange box is stable, while the IoU across consecutive frames indicated by the yellow box decreases with movement.
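The effect described in (c) can be made concrete with a toy computation: when a box translates between frames, IoU decays much faster than the normalized center displacement. This only illustrates the observation; the box coordinates are invented and this is not the paper's quality metric.

```python
def iou(a, b):
    """Intersection-over-union of two axis-aligned boxes (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area = lambda r: (r[2] - r[0]) * (r[3] - r[1])
    union = area(a) + area(b) - inter
    return inter / union if union > 0 else 0.0

def center_shift(a, b):
    """Center displacement normalized by the first box's diagonal."""
    cax, cay = (a[0] + a[2]) / 2, (a[1] + a[3]) / 2
    cbx, cby = (b[0] + b[2]) / 2, (b[1] + b[3]) / 2
    diag = ((a[2] - a[0]) ** 2 + (a[3] - a[1]) ** 2) ** 0.5
    return ((cbx - cax) ** 2 + (cby - cay) ** 2) ** 0.5 / diag

# A box translated by a quarter of its width between two frames:
prev, cur = (100, 100, 200, 200), (125, 100, 225, 200)
print(iou(prev, cur))           # IoU already drops to 0.6
print(center_shift(prev, cur))  # center moved only ~0.18 of the diagonal
```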
Vehicle Re-ID for Surround-view Camera System
The overall framework of vehicle Re-ID in a single camera. Each object is assigned its own tracker to realize Re-ID within a single channel. Tracking templates are initialized with object detection results. All tracking outputs are post-processed by the quality evaluation module to deal with distorted or occluded objects.
Vehicle Re-ID for Surround-view Camera System
Samples captured by different cameras. (a) The appearance of the same vehicle captured by different cameras varies greatly; the same color represents the same object. (b) Objects with a similar appearance may appear in the same camera view, as shown by the two black vehicles in green boxes.
Vehicle Re-ID for Surround-view Camera System
Illustration of the multi-camera Re-ID network. The network is a two-branch parallel structure: the top branch makes the network pay more attention to object regions, and the other branch extracts global features.
Vehicle Re-ID for Surround-view Camera System
Projection uncertainty of key points. Ellipse 1 and ellipse 2 are the uncertainty ranges of the front and left (right) cameras, respectively.
Vehicle Re-ID for Surround-view Camera System
The overall framework of vehicle Re-ID across multiple cameras. For a new target, the Re-ID model first extracts its features, then a distance metric is computed between this feature and the features in the gallery. In addition, a spatial constraint strategy is adopted to improve the association.
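A minimal sketch of such gallery matching with a spatial gate. The gating rule, thresholds, and function names here are illustrative assumptions, not the paper's exact formulation:

```python
import math

def cosine_distance(a, b):
    """1 - cosine similarity between two feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return 1.0 - dot / (na * nb)

def match_to_gallery(feat, pos, gallery, dist_thresh=0.5, spatial_radius=3.0):
    """Return the index of the best gallery match, or -1 if none qualifies.

    gallery: list of (feature, position) pairs; positions live in a common
    frame (e.g. the ground plane around the ego vehicle). A candidate is
    gated out if it lies outside the spatial radius, then the remaining
    candidates are ranked by appearance (cosine) distance.
    """
    best, best_d = -1, dist_thresh
    for i, (gf, gp) in enumerate(gallery):
        if math.dist(pos, gp) > spatial_radius:
            continue  # spatial constraint: physically implausible association
        d = cosine_distance(feat, gf)
        if d < best_d:
            best, best_d = i, d
    return best
```

The spatial gate is what suppresses look-alike vehicles seen far apart (the two black cars in the previous figure), which appearance features alone cannot separate.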
Vehicle Re-ID for Surround-view Camera System
SynDistNet: Self-Supervised Monocular Fisheye Camera Distance Estimation Synergized with Semantic Segmentation for Autonomous Driving
• Self-supervised learning approaches for monocular depth estimation usually suffer from scale ambiguity.
• They also do not generalize well when applied to distance estimation for complex projection models such as fisheye and omnidirectional cameras.
• This work introduces a multi-task learning strategy to improve self-supervised monocular distance estimation on fisheye and pinhole camera images.
• The contribution is threefold:
• Firstly, a distance estimation network architecture using a self-attention based encoder coupled with robust semantic feature guidance to the decoder, trainable in a one-stage fashion.
• Secondly, a generalized robust loss function, which improves performance significantly while removing the need for hyperparameter tuning of the reprojection loss.
• Finally, artifacts caused by dynamic objects violating the static-world assumption are reduced by a semantic masking strategy.
• The method significantly improves upon previous work on fisheye, with a 25% reduction in RMSE.
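The generalized robust loss referred to here follows Barron's general adaptive loss, in which a single parameter α interpolates between familiar losses. A scalar sketch (treating α and c as fixed hyperparameters rather than learned, which is a simplification):

```python
import math

def general_robust_loss(x, alpha, c=1.0):
    """Scalar form of the general robust loss rho(x, alpha, c).

    alpha = 2 -> quadratic (L2); alpha = 1 -> Charbonnier-like;
    alpha = 0 -> Cauchy/log; alpha = -2 -> Geman-McClure.
    Smaller alpha down-weights outliers more aggressively; c sets the
    scale below which residuals are treated as roughly quadratic.
    """
    z = (x / c) ** 2
    if alpha == 2.0:          # removable singularity of the general form
        return 0.5 * z
    if alpha == 0.0:          # removable singularity of the general form
        return math.log(0.5 * z + 1.0)
    b = abs(alpha - 2.0)
    return (b / alpha) * ((z / b + 1.0) ** (alpha / 2.0) - 1.0)
```

For negative α the loss saturates (e.g. at α = −2 it approaches 2 for large residuals), which is what makes it robust to photometric outliers without a hand-tuned cutoff.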
SynDistNet: Self-Supervised Monocular Fisheye Camera Distance Estimation Synergized with Semantic Segmentation for Autonomous Driving
Overview of the joint prediction of distance D and semantic segmentation M from a single input image I. Compared to previous approaches, semantically guided distance estimation produces sharper depth edges and reasonable distance estimates for dynamic objects.
SynDistNet: Self-Supervised Monocular Fisheye Camera Distance Estimation Synergized with Semantic Segmentation for Autonomous Driving
Overview of the proposed framework for the joint prediction of distance and semantic segmentation. The upper part (blue blocks) describes the individual steps of the depth estimation, while the green blocks describe the steps needed for predicting the semantic segmentation. Both tasks are optimized inside a multi-task network using the weighted total loss.
SynDistNet: Self-Supervised Monocular Fisheye Camera Distance Estimation Synergized with Semantic Segmentation for Autonomous Driving
Application of semantic masking to handle potentially dynamic objects. The dynamic objects inside the segmentation masks from consecutive frames in (b) and (d) are accumulated into a dynamic-object mask, which is used to mask the photometric error (e), as shown in (h).
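The masking step can be sketched as: accumulate dynamic-class pixels from the consecutive segmentations and drop them from the photometric error. Class IDs and the plain-list array layout below are illustrative:

```python
def masked_photometric_error(error, seg_t, seg_s, dynamic_classes):
    """Zero out the per-pixel photometric error wherever either the target
    frame segmentation (seg_t) or the warped source segmentation (seg_s)
    hits a dynamic class (car, pedestrian, ...).

    error, seg_t, seg_s: equally sized 2-D lists.
    Returns (masked_error, n_valid_pixels).
    """
    dyn = set(dynamic_classes)
    masked, valid = [], 0
    for e_row, t_row, s_row in zip(error, seg_t, seg_s):
        row = []
        for e, t, s in zip(e_row, t_row, s_row):
            if t in dyn or s in dyn:
                row.append(0.0)   # potentially moving object: excluded
            else:
                row.append(e)
                valid += 1
        masked.append(row)
    return masked, valid
```

Averaging the loss over `n_valid_pixels` rather than all pixels keeps the masked regions from silently shrinking the loss magnitude.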
SynDistNet: Self-Supervised Monocular Fisheye Camera Distance Estimation Synergized with Semantic Segmentation for Autonomous Driving
Visualization of the proposed network architecture to semantically guide the depth estimation, utilizing a self-attention based encoder and a semantically guided decoder with pixel-adaptive convolutions.
SynDistNet: Self-Supervised Monocular Fisheye Camera Distance Estimation Synergized with Semantic Segmentation for Autonomous Driving
SynDistNet: Self-Supervised Monocular Fisheye Camera Distance Estimation Synergized with Semantic Segmentation for Autonomous Driving
SynDistNet: Self-Supervised Monocular Fisheye Camera Distance Estimation Synergized with Semantic Segmentation for Autonomous Driving
Universal Semantic Segmentation for Fisheye Urban Driving Images
• Semantic segmentation is a critical method in the field of autonomous driving. A wider field of view (FoV), as offered by fisheye cameras, provides more information about the surrounding environment, making automated driving safer and more reliable.
• In this paper, a seven-DoF augmentation method is proposed to transform rectilinear images into fisheye images in a comprehensive way.
• During training, rectilinear images are transformed into fisheye images in seven DoF, simulating fisheye images taken by cameras at different positions, with different orientations and focal lengths. The results show that training with the seven-DoF augmentation improves the model's accuracy and robustness against differently distorted fisheye data.
• The seven-DoF augmentation provides a universal semantic segmentation solution for fisheye cameras in different autonomous driving applications.
• Specific augmentation parameter settings for autonomous driving are also provided.
• Finally, the universal semantic segmentation model is tested on real fisheye images and obtains satisfactory results.
• The code and configurations are released at https://github.com/Yaozhuwa/FisheyeSeg.
Universal Semantic Segmentation for Fisheye Urban Driving Images
Projection model of a fisheye camera. PW is a point on a rectilinear image placed on the x-y plane of the world coordinate system. θ is the angle of incidence of the point relative to the fisheye camera. P is the imaging point of PW on the fisheye image, with |OP| = fθ. The relative rotation and translation between the world coordinate system and the camera coordinate system yield six degrees of freedom.
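The rule |OP| = fθ is the equidistant fisheye model: the radial image distance grows linearly with the angle of incidence. Projecting a camera-frame 3-D point under it looks like this (the focal length and principal point are example values, not the paper's calibration):

```python
import math

def project_equidistant(X, Y, Z, f=300.0, cx=640.0, cy=480.0):
    """Project a camera-frame 3-D point under the equidistant model r = f*theta."""
    theta = math.atan2(math.hypot(X, Y), Z)  # angle of incidence w.r.t. optical axis
    r = f * theta                            # radial image distance |OP|
    phi = math.atan2(Y, X)                   # azimuth is preserved by the lens
    return cx + r * math.cos(phi), cy + r * math.sin(phi)
```

A point on the optical axis lands on the principal point, and a point at 45° incidence lands at radius f·π/4 — unlike a pinhole camera, the mapping stays finite even as θ approaches 90° and beyond, which is why fisheye lenses can exceed a 180° FoV.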
Universal Semantic Segmentation for Fisheye Urban Driving Images
The six-DoF augmentation. Except for the first row, every image is transformed using a virtual fisheye camera with a focal length of 300 pixels. The letter in brackets indicates the axis along which the camera pans or around which it rotates.
Universal Semantic Segmentation for Fisheye Urban Driving Images
Synthetic fisheye images with different focal lengths f.
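The augmentation itself is an inverse warp: for every pixel of the synthetic fisheye image, recover the incidence angle θ = r/f, turn it into a ray, and sample the rectilinear source image there. A minimal per-pixel sketch under the equidistant model, ignoring the six pose DoF (identity rotation/translation); the pinhole focal length f_pin and the image centers are assumed parameters:

```python
import math

def fisheye_to_rectilinear(u, v, f_fish, f_pin, cx_f, cy_f, cx_r, cy_r):
    """Map a fisheye pixel (u, v) to the rectilinear source pixel it samples.

    Returns None when the ray leaves the forward hemisphere, i.e. the
    fisheye pixel sees something a pinhole image cannot contain.
    """
    dx, dy = u - cx_f, v - cy_f
    r = math.hypot(dx, dy)
    theta = r / f_fish                 # equidistant model: r = f * theta
    if theta >= math.pi / 2:
        return None                    # behind the pinhole image plane
    if r == 0.0:
        return cx_r, cy_r              # optical axis maps to the center
    tan_t = math.tan(theta)            # pinhole model: r_pin = f_pin * tan(theta)
    return cx_r + f_pin * tan_t * dx / r, cy_r + f_pin * tan_t * dy / r
```

Varying f_fish reproduces the focal-length sweep shown in the figure; the remaining six DoF enter by rotating/translating the ray before the pinhole lookup.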
Universal Semantic Segmentation for Fisheye Urban Driving Images
Universal Semantic Segmentation for Fisheye Urban Driving Images
Semantic segmentation of real fisheye images.
UnRectDepthNet: Self-Supervised Monocular Depth Estimation using a Generic Framework for Handling Common Camera Distortion Models
• Image rectification simplifies depth estimation significantly, and thus it has been adopted in CNN approaches.
• However, rectification has several side effects, including a reduced field of view (FoV), resampling distortion, and sensitivity to calibration errors.
• This paper proposes a generic scale-aware self-supervised pipeline for estimating depth, Euclidean distance, and visual odometry from unrectified monocular videos.
• It demonstrates a level of precision on the unrectified, barrel-distorted KITTI dataset comparable to that on the rectified KITTI dataset.
• The intuition is that the rectification step can be implicitly absorbed within the CNN model, which learns the distortion model without increasing complexity.
• The approach does not suffer from a reduced field of view and avoids the computational cost of rectification at inference time.
• To further illustrate the general applicability of the proposed framework, it is applied to wide-angle fisheye cameras with a 190° horizontal field of view.
• The training framework, UnRectDepthNet, takes the camera distortion model as an argument and adapts the projection and unprojection functions accordingly.
UnRectDepthNet: Self-Supervised Monocular Depth Estimation using a Generic Framework for Handling Common Camera Distortion Models
Depth obtained from a single unrectified (left) and rectified (right) KITTI image. The scale-aware model, UnRectDepthNet, yields precise boundaries and fine-grained depth maps.
UnRectDepthNet: Self-Supervised Monocular Depth Estimation using a Generic Framework for Handling Common Camera Distortion Models
Illustration of distortion correction in the KITTI and WoodScape datasets. The first row shows a raw KITTI image with barrel distortion and the corresponding rectified image; the red box was used to crop out black pixels in the periphery, causing a loss of FoV. The second row shows a raw WoodScape image with strong fisheye lens distortion and the corresponding rectified image, exhibiting a drastic loss of FoV.
UnRectDepthNet: Self-Supervised Monocular Depth Estimation using a Generic Framework for Handling Common Camera Distortion Models
The projection is a complex multi-stage process compared to regular lenses; the detailed steps are listed:
UnRectDepthNet: Self-Supervised Monocular Depth Estimation using a Generic Framework for Handling Common Camera Distortion Models
The radial distortion models are summarized below:
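One common generic form for such radial models is a polynomial in the incidence angle, r(θ) = k1·θ + k2·θ² + k3·θ³ + k4·θ⁴ (a polynomial of this kind is, for instance, used for WoodScape's fisheye lenses). The inverse mapping needed for unprojection has no closed form and is typically found numerically. A sketch with made-up coefficients:

```python
def r_of_theta(theta, k):
    """Radial image distance for incidence angle theta (polynomial model)."""
    return sum(ki * theta ** (i + 1) for i, ki in enumerate(k))

def theta_of_r(r, k, theta_max=1.8, iters=60):
    """Invert r(theta) by bisection; r(theta) is monotone over the lens FoV,
    so the bracket [0, theta_max] always contains exactly one solution."""
    lo, hi = 0.0, theta_max
    for _ in range(iters):
        mid = (lo + hi) / 2.0
        if r_of_theta(mid, k) < r:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0
```

Round-tripping θ → r → θ recovers the angle to floating-point precision, which is what the unprojection step of the generic framework relies on.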
UnRectDepthNet: Self-Supervised Monocular Depth Estimation using a Generic Framework for Handling Common Camera Distortion Models
A self-supervised monocular structure-from-motion (SfM) framework:
UnRectDepthNet: Self-Supervised Monocular Depth Estimation using a Generic Framework for Handling Common Camera Distortion Models
UnRectDepthNet: Self-Supervised Monocular Depth Estimation using a Generic Framework for Handling Common Camera Distortion Models
• The UnRectDepthNet training block on the right enables the generic usage of the various camera models listed in the black box.
• The distortion is then handled internally in the unprojection and projection steps of the transformation from It to It−1.
• The framework was tested with KITTI barrel-distorted and WoodScape fisheye-distorted video sequences.
• The block on the left indicates the entire workflow of the training pipeline, where the top row depicts the ego masks Mt→t−1 and Mt→t+1, representing the valid pixel coordinates when synthesizing Iˆt−1→t from It−1 and Iˆt+1→t from It+1, respectively.
• The following row showcases the masks used to filter static pixels, obtained after training for two epochs; the black pixels are removed from the reconstruction loss.
• Dynamic objects moving at a speed similar to the ego vehicle's, as well as homogeneous areas, are filtered out to prevent contamination of the reconstruction loss.
• The third row shows the depth predictions, where the scale ambiguity is resolved using the ego vehicle's odometry data.
• Finally, the top block illustrates the inference output.
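The reconstruction loss over the two warped frames is commonly taken as a per-pixel minimum over the source views (the Monodepth2-style formulation), restricted to pixels the ego masks declare valid. A toy sketch of that reduction, using L1 only and omitting the SSIM term for brevity:

```python
def min_reprojection_l1(target, warped_prev, warped_next, valid_prev, valid_next):
    """Per-pixel minimum L1 reprojection error over the two warped sources.

    valid_prev / valid_next play the role of the ego masks M_{t->t-1} and
    M_{t->t+1}: only pixels with a valid reprojection contribute.
    Returns the mean error over contributing pixels.
    """
    total, count = 0.0, 0
    for i in range(len(target)):
        for j in range(len(target[0])):
            errs = []
            if valid_prev[i][j]:
                errs.append(abs(target[i][j] - warped_prev[i][j]))
            if valid_next[i][j]:
                errs.append(abs(target[i][j] - warped_next[i][j]))
            if errs:
                total += min(errs)  # per-pixel minimum handles occlusions:
                count += 1          # a pixel occluded in one source view is
    return total / count if count else 0.0  # still supervised by the other
```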
UnRectDepthNet: Self-Supervised Monocular Depth Estimation using a Generic Framework for Handling Common Camera Distortion Models
UnRectDepthNet: Self-Supervised Monocular Depth Estimation using a Generic Framework for Handling Common Camera Distortion Models
OmniDet: Surround View Cameras based Multi-task Visual Perception Network for Autonomous Driving
• Surround-view fisheye cameras are commonly deployed in automated driving for 360° near-field sensing around the vehicle.
• This work presents a multi-task visual perception network on unrectified fisheye images that enables the vehicle to sense its surrounding environment.
• It consists of six primary tasks necessary for an autonomous driving system: depth estimation, visual odometry, semantic segmentation, motion segmentation, object detection, and lens soiling detection.
• The jointly trained model is shown to perform better than the respective single-task versions.
• The multi-task model has a shared encoder, providing a significant computational advantage, and synergized decoders where tasks support each other.
• A novel camera-geometry based adaptation mechanism is proposed to encode the fisheye distortion model both at training and inference time.
• This was crucial to enable training on the WoodScape dataset, comprising data from different parts of the world collected by 12 different cameras mounted on three different cars with different intrinsics and viewpoints.
• Given that bounding boxes are not a good representation for distorted fisheye images, object detection is also extended to use a polygon with non-uniformly sampled vertices.
• The model is additionally evaluated on standard automotive datasets, namely KITTI and Cityscapes.
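One way to realize such a polygon representation is to store per-angle radii around the object center, which follows curved fisheye contours far better than an axis-aligned box. The sampling scheme below is an illustrative guess, not the paper's exact parametrization:

```python
import math

def polygon_from_polar(center, radii, angles):
    """Build polygon vertices from per-angle radii around a center point.

    Non-uniform 'angles' let the representation spend more vertices where
    the (distorted) object contour bends most.
    """
    cx, cy = center
    return [(cx + r * math.cos(a), cy + r * math.sin(a))
            for r, a in zip(radii, angles)]

def polygon_area(pts):
    """Shoelace formula, usable e.g. for polygon-based overlap or
    regression targets in place of box IoU."""
    s = 0.0
    for (x1, y1), (x2, y2) in zip(pts, pts[1:] + pts[:1]):
        s += x1 * y2 - x2 * y1
    return abs(s) / 2.0
```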
OmniDet: Surround View Cameras based Multi-task Visual Perception Network for Autonomous Driving
Adversarial Attacks on Multi-task Visual Perception for Autonomous Driving
• Deep neural networks (DNNs) have achieved impressive success in various applications in recent years, including autonomous driving perception tasks.
• On the other hand, deep neural networks can be fooled by adversarial attacks.
• This vulnerability raises significant concerns, particularly in safety-critical applications.
• As a result, research into attacking and defending DNNs has gained wide coverage.
• In this work, detailed adversarial attacks are applied to a diverse multi-task visual perception deep network across distance estimation, semantic segmentation, motion detection, and object detection.
• The experiments consider both white-box and black-box attacks for targeted and untargeted cases, attacking one task while inspecting the effect on all the others, in addition to inspecting the effect of applying a simple defense method.
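A representative white-box method in this family is the fast gradient sign method (FGSM): perturb the input by ε in the direction of the sign of the loss gradient. A toy sketch on a hand-rolled linear scorer (the actual experiments attack the full OmniDet network, not this toy model):

```python
def fgsm_perturb(x, grad, eps):
    """Untargeted FGSM step: shift every input element by eps along
    the sign of the loss gradient dL/dx."""
    sign = [1.0 if g > 0 else (-1.0 if g < 0 else 0.0) for g in grad]
    return [xi + eps * s for xi, s in zip(x, sign)]

# Toy linear scorer: score = w . x. Take loss = -score, so the attack
# should push the score down; then dL/dx = -w.
w = [0.5, -1.0, 2.0]
x = [1.0, 1.0, 1.0]
x_adv = fgsm_perturb(x, [-wi for wi in w], eps=0.1)
assert sum(wi * a for wi, a in zip(w, x_adv)) < sum(wi * b for wi, b in zip(w, x))
```

The sign (rather than the raw gradient) caps the per-pixel change at ε, which is what keeps the perturbation visually imperceptible; a targeted variant simply descends the loss of the attacker-chosen output instead.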
Adversarial Attacks on Multi-task Visual Perception for Autonomous Driving
Adversarial attacks on the OmniDet MTL model. The distance, segmentation, motion, and detection perception tasks are attacked by white-box and black-box methods with targeted and untargeted objectives, resulting in incorrect model predictions.
Adversarial Attacks on Multi-task Visual Perception for Autonomous Driving
Illustration of the baseline multi-task architecture comprising four tasks.
Adversarial Attacks on Multi-task Visual Perception for Autonomous Driving
Adversarial Attacks on Multi-task Visual Perception for Autonomous Driving
White-box untargeted, white-box targeted, black-box untargeted, and black-box targeted attacks. Within each group, from top to bottom and left to right: original results, adversarial perturbations, and the impacted results.