2. The Fundamentals:
A. Computer Vision
B. Computer Vision Tasks
C. Convolutional neural network - basics of Modern Computer Vision
D. Semantic Segmentation and Autonomous Vehicle
3. A. Computer Vision
● Computer vision is a field of artificial intelligence (AI) that enables computers
and systems to derive meaningful information from digital images, videos, and
other visual inputs.
● The goal of this field is to enable machines to perceive the world as humans do.
5. B. Computer Vision Tasks
● The computer vision tasks most frequently encountered in AI work
include:
○ Image classification
○ Object detection
○ Image segmentation: semantic segmentation and instance segmentation
7. C. Convolutional neural network - basics of
Modern Computer Vision
● In deep learning, convolutional neural networks (CNNs) are most commonly
applied to analyze visual imagery.
● Most computer vision algorithms are based on convolutional neural networks.
● CNNs operate directly on images as matrices of pixel values and extract spatial
features from them, such as texture, edges, and depth. They do this by using
convolutional layers and pooling.
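The convolution and pooling steps can be illustrated with a small NumPy sketch. It is not taken from any particular CNN: the 6x6 toy image, the fixed Sobel kernel (a real CNN learns its kernels), and the helper names `conv2d` and `max_pool2d` are all assumptions made for illustration.

```python
import numpy as np

def conv2d(image, kernel):
    """'Valid' 2D cross-correlation, the operation CNN libraries call convolution."""
    kh, kw = kernel.shape
    out_h = image.shape[0] - kh + 1
    out_w = image.shape[1] - kw + 1
    out = np.zeros((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            # Weighted sum of the image patch under the kernel.
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

def max_pool2d(feature_map, size=2):
    """Non-overlapping max pooling; rows/columns that do not fill a window are dropped."""
    h = feature_map.shape[0] // size * size
    w = feature_map.shape[1] // size * size
    blocks = feature_map[:h, :w].reshape(h // size, size, w // size, size)
    return blocks.max(axis=(1, 3))

# Toy image (assumption): dark left half, bright right half, i.e. one vertical edge.
image = np.zeros((6, 6))
image[:, 3:] = 1.0

# Fixed Sobel kernel that responds to left-to-right intensity changes.
sobel_x = np.array([[-1, 0, 1],
                    [-2, 0, 2],
                    [-1, 0, 1]], dtype=float)

edges = conv2d(image, sobel_x)   # 4x4 map, non-zero only around the edge
pooled = max_pool2d(edges)       # 2x2 map, keeps the strongest response per window
```

The convolution output is non-zero only where the window straddles the edge, and pooling halves the resolution while keeping the strongest response in each 2x2 window.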
9. D. Semantic Segmentation and Autonomous
Vehicle
● For fully autonomous vehicles, semantic segmentation is a core element where
neural networks need to output high-resolution feature maps.
● Semantic segmentation is a key technology for autonomous vehicles to
understand the surrounding scenes.
● Semantic segmentation is a fundamental task in which each pixel of the input
image is assigned its corresponding class label.
● It plays a vital role in many practical applications, such as medical image
segmentation, navigation of autonomous vehicles, and robots.
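The per-pixel labelling described above can be sketched as follows, assuming a network has already produced one score map per class. The class names, the score values, and the helper `scores_to_label_map` are hypothetical and serve only to show the idea.

```python
import numpy as np

# Hypothetical label set for a driving scene (assumption, not from the slides).
CLASS_NAMES = ["road", "vehicle", "pedestrian"]

def scores_to_label_map(scores):
    """Turn per-class scores of shape (num_classes, H, W) into an (H, W) label map.

    Each pixel receives the index of the class with the highest score.
    """
    return np.argmax(scores, axis=0)

# Made-up scores for a 2x3 image: "road" wins everywhere except two pixels.
scores = np.zeros((len(CLASS_NAMES), 2, 3))
scores[0] = 1.0          # baseline score for "road" at every pixel
scores[1, 0, 1] = 5.0    # strong "vehicle" score at row 0, column 1
scores[2, 1, 2] = 3.0    # strong "pedestrian" score at row 1, column 2

label_map = scores_to_label_map(scores)
```

The output has the same height and width as the input image, which is why segmentation networks need high-resolution feature maps.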
11. DDRNets - Introduction
● Real-time semantic segmentation is the task of achieving
computationally efficient semantic segmentation.
● The strong accuracy of contemporary models usually comes at
the expense of heavy computation and long inference time, which
is unacceptable for self-driving.
● Most methods are very time-consuming at the inference stage and
cannot be deployed directly on actual autonomous vehicles.
● DDRNets tackle this problem and achieve a new state-of-the-art
trade-off between accuracy and speed on both the Cityscapes and CamVid
datasets.
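The speed side of this trade-off is usually reported as frames per second (FPS). The sketch below shows one way to measure it. The helper `measure_fps` and the stand-in function `dummy_infer` are hypothetical, and this is not the benchmarking protocol used for DDRNets.

```python
import time

def measure_fps(infer, sample, warmup=3, runs=20):
    """Estimate frames per second of `infer` on a single input.

    Warm-up calls are excluded so one-time setup cost does not distort the result.
    """
    for _ in range(warmup):
        infer(sample)
    start = time.perf_counter()
    for _ in range(runs):
        infer(sample)
    elapsed = time.perf_counter() - start
    return runs / elapsed

def dummy_infer(pixels):
    """Stand-in for a segmentation model (assumption): does a little arithmetic per pixel."""
    return [p * 2 for p in pixels]

fps = measure_fps(dummy_infer, list(range(10_000)))
```

On a GPU the device would also need to be synchronized before each clock reading, because kernels are launched asynchronously.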
13. DDRNets
● Semantic segmentation is a dense prediction task, which makes it
computationally expensive. This problem is especially critical for scene parsing
in autonomous driving.
● DDRNets consist of two main components:
the Deep Dual-resolution Network and the Deep Aggregation Pyramid Pooling
Module (DAPPM).
14. Architecture
Deep Dual-resolution Network: A family of novel bilateral networks with deep
dual-resolution branches and multiple bilateral fusions is proposed as efficient
backbones for real-time semantic segmentation.
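The idea of a bilateral fusion between two resolution branches can be sketched in NumPy as below. This is a simplified illustration, not the published architecture: fixed average pooling and nearest-neighbour upsampling stand in for whatever learned layers the real network uses, both branches are assumed to have the same channel count, and the function names are hypothetical.

```python
import numpy as np

def downsample(x, factor):
    """Average-pool a (C, H, W) feature map by `factor` along H and W."""
    c, h, w = x.shape
    return x.reshape(c, h // factor, factor, w // factor, factor).mean(axis=(2, 4))

def upsample(x, factor):
    """Nearest-neighbour upsampling of a (C, H, W) feature map."""
    return x.repeat(factor, axis=1).repeat(factor, axis=2)

def bilateral_fusion(high, low, factor):
    """Exchange information in both directions between the two branches."""
    fused_high = high + upsample(low, factor)    # low-to-high: inject context
    fused_low = low + downsample(high, factor)   # high-to-low: inject detail
    return fused_high, fused_low

# Toy branches (assumption): 4 channels, high branch at 8x8, low branch at 4x4.
high = np.ones((4, 8, 8))
low = np.full((4, 4, 4), 2.0)

fused_high, fused_low = bilateral_fusion(high, low, factor=2)
```

Each branch keeps its own resolution after the fusion, so the step can be repeated several times along the depth of the network.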
Deep Aggregation Pyramid Pooling Module: A novel module designed to harvest rich
context information by combining feature aggregation with pyramid pooling. When
executed on low-resolution feature maps, it adds little to the inference time.
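Combining pyramid pooling with feature aggregation can be sketched roughly as follows. This is only an illustration of the general idea under stated assumptions: the pool sizes, the use of plain average pooling, the omission of all convolutions, and the name `pyramid_context` are not taken from the module itself.

```python
import numpy as np

def avg_pool(x, size):
    """Average-pool a (C, H, W) feature map with a non-overlapping size x size window."""
    c, h, w = x.shape
    return x.reshape(c, h // size, size, w // size, size).mean(axis=(2, 4))

def upsample(x, factor):
    """Nearest-neighbour upsampling back to the input resolution."""
    return x.repeat(factor, axis=1).repeat(factor, axis=2)

def pyramid_context(x, pool_sizes=(2, 4)):
    """Pool `x` at several scales and aggregate each scale with the previous one."""
    branches = [x]
    previous = x
    for size in pool_sizes:
        context = upsample(avg_pool(x, size), size)  # coarse context at this scale
        previous = context + previous                # aggregate with the finer branch
        branches.append(previous)
    # Stack all branches along the channel axis.
    return np.concatenate(branches, axis=0)

# Toy low-resolution feature map (assumption): 2 channels, 8x8.
features = np.ones((2, 8, 8))
context = pyramid_context(features)
```

Because the input map here is only 8x8, every pooling and upsampling step touches very few values, which is the intuition behind running such a module at low resolution.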
17. Conclusion
● With widely used test augmentation, this method is superior to most state-of-
the-art models and requires much less computation.
● Due to the simplicity and efficiency of the method, it can be seen as a strong
baseline for unifying real-time and high-accuracy semantic segmentation.