ObjectDetection.pptx

1. DEEP LEARNING FOR OBJECT DETECTION AND SCENE PERCEPTION IN SELF-DRIVING CARS
   RV College of Engineering, "Go, change the world"
   WIRIN Internship Presentation
   AMAL SABU (1RV20LDC02), RITIK PABBARAJU (1RV19IM047)
   Mentor: B. Roja Reddy, Assistant Professor, Department of ETE
   1/30/2023
2. Internship Outline
   • 1st week: basics of image analysis and Python
   • 2nd week: literature review and exploratory data analysis
   • 3rd week: information documentation and implementation of methodologies
   • 4th week: results verification and testing
3. Outline
   • Introduction
   • Motivation
   • Problem Statement
   • Objectives of the Project
   • Literature Survey
   • Methodology
   • Results
   • Conclusion and Future Work
   • References
4. Introduction
   With recent advances in artificial intelligence (AI), machine learning (ML), and deep learning (DL), applications of these techniques have gained prominence and come to the fore. One such application is self-driving cars, which are anticipated to have a profound, revolutionary impact on society and the way people commute. Driven by rapid advances in AI and associated technologies, cars are poised to evolve into autonomous robots entrusted with human lives, with a diverse socio-economic impact. For these cars to become a functional reality, however, they must be equipped with the perception and cognition to handle high-pressure real-life scenarios, arrive at suitable decisions, and take the safest appropriate action at all times.
5. Introduction (contd.)
   Accurate object detection algorithms are therefore needed. One challenge, for example, is processing the numerous candidate object locations (often called "proposals"): these candidates provide only rough localization and must be refined to achieve precise localization. Solutions to these problems often compromise speed, accuracy, or simplicity. Recent state-of-the-art deep learning models for object detection include Region-Based Convolutional Neural Networks (R-CNN), first introduced in 2013, and their improved versions Fast R-CNN and Faster R-CNN, designed for model performance. A second model, introduced in 2015, is YOLO, designed for speed and real-time use.
6. Motivation
   • One of the most anticipated technologies and an active research topic.
   • A major challenge for computer vision and machine learning.
   • Object detection is a core requirement of autonomous driving.
   • Accurate, real-time object detection algorithms are needed.
7. Problem Statement
   Implement a deep learning algorithm for object detection and scene perception in self-driving cars using Faster R-CNN and YOLO.
8. Objectives of the Project
   • Collect a dataset for training.
   • Train YOLO on the dataset.
   • Create the model to be used in the test code.
   • Test the algorithm.
9. Literature Survey

   1. "Deep learning for object detection and scene perception in self-driving cars"
      Abhishek Gupta, Alagan Anpalagan, Ling Guan, Ahmed Shaharyar Khwaja
      A comprehensive survey of deep learning applications for object detection and scene perception in autonomous vehicles. Unlike existing review papers, it examines the theory underlying self-driving vehicles from a deep learning perspective together with current implementations, followed by critical evaluations. Deep learning is one potential solution for object detection and scene perception problems, enabling algorithm-driven and data-driven cars.

   2. "Multi-Traffic Scene Perception Based on Supervised Learning"
      Lisheng Jin, Mei Chen, Yuying Jiang, Haipeng Xia
      Classification is used to identify the type of optical conditions so that vision enhancement algorithms can be made more efficient. First, underlying visual features are extracted from multi-traffic scene images and expressed as an eight-dimensional feature matrix. Second, five supervised learning algorithms are used to train classifiers. The analysis shows that the extracted features accurately describe the image semantics and that the classifiers have a high recognition accuracy rate and adaptive ability. The method provides a basis for further enhancing the detection of vehicles ahead during nighttime illumination changes, as well as enhancing the driver's field of vision on a foggy day.

   3. "Real-time Object Detection for Autonomous Driving using Deep Learning"
      Duy Anh Tran, Pascal Fischer, Alen Smajic, Yujin So
      Datasets drive vision progress, yet existing driving datasets are limited in visual content, scene variation, richness of annotations, geographic distribution, and supported tasks for studying multitask learning for autonomous driving. In 2018, Yu et al. released BDD100K, the largest driving video dataset, with 100K videos and 10 tasks for evaluating the progress of image recognition algorithms on autonomous driving. The dataset possesses geographic, environmental, and weather diversity, which is useful for training models that are less likely to be surprised by new conditions. It provides 2D bounding box annotations on the 100,000 reference frames of the videos for 13 categories: "other vehicle", "pedestrian", "traffic light", "traffic sign", "truck", "train", "other person", "bus", "car", "rider", "motorcycle", "bicycle", and "trailer".
10. Literature Survey (contd.)

   4. "Driving Scene Perception Network: Real-time Joint Detection, Depth Estimation and Semantic Segmentation"
      Liangfu Chen, Zeng Yang, Jianjun Ma, Zheng Luo
      As the demand for high-level autonomous driving has increased in recent years and visual perception is one of the critical features enabling fully autonomous driving, this paper introduces an efficient approach for simultaneous object detection, depth estimation, and pixel-level semantic segmentation using a shared convolutional architecture. The proposed Driving Scene Perception Network (DSPNet) uses multi-level feature maps and multi-task learning to improve the accuracy and efficiency of object detection, depth estimation, and image segmentation from a single input image. The resulting model uses less than 850 MiB of GPU memory and achieves 14.0 fps on an NVIDIA GeForce GTX 1080 with a 1024x512 input image, improving both precision and efficiency over a combination of single-task models.

   5. "ACDC: The Adverse Conditions Dataset with Correspondences for Semantic Driving Scene Understanding"
      Christos Sakaridis, Dengxin Dai, Luc Van Gool
      Level 5 autonomy for self-driving cars requires a robust visual perception system that can parse input images under any visual condition. However, existing semantic segmentation datasets are either dominated by images captured under normal conditions or are small in scale. To address this, the authors introduce ACDC, the Adverse Conditions Dataset with Correspondences, for training and testing semantic segmentation methods under adverse visual conditions. ACDC consists of 4006 images equally distributed across four common adverse conditions: fog, nighttime, rain, and snow. Each adverse-condition image comes with a high-quality, fine pixel-level semantic annotation, a corresponding image of the same scene taken under normal conditions, and a binary mask that distinguishes intra-image regions of clear and uncertain semantic content. ACDC thus supports both standard semantic segmentation and the newly introduced uncertainty-aware semantic segmentation.
11. Design Methodology
   • Real-time object detection of traffic objects in video.
   • Train YOLO and Faster R-CNN on the BDD100K dataset.
   • Compare performance in FPS and mAP.
   • 13 classes: "other vehicle", "pedestrian", "traffic light", "traffic sign", "truck", "train", "other person", "bus", "car", "rider", "motorcycle", "bicycle", "trailer".
12. Driving Datasets
   • KITTI: traffic scenes of Karlsruhe, 8 object classes
   • Waymo Open: 1950 driving videos, 4 object classes
   • nuScenes: 1.4 million images from Singapore and Boston, 1000 videos, 23 object classes
   • BDD100K: 100K images and driving videos from cities in the US, 13 object classes
13. System Description
   YOLO is a single end-to-end CNN that treats object detection as a single regression problem. It divides the image into an S x S grid and predicts, for each cell:
   • B (= 2) bounding boxes, each with 4 coordinates and a confidence score
   • C class probabilities
   Each object is mapped to the grid cell containing its center.
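   To make the mapping concrete, here is a minimal sketch (our own illustration, not code from the project) that encodes ground-truth boxes into a YOLO-style S x S x (B*5 + C) target tensor, with S, B, and C set to this project's values:

```python
# Illustrative sketch: map each object to the grid cell holding its center
# and write a YOLO-style training target. Not the project's actual code.
import numpy as np

S, B, C = 14, 2, 13  # grid size, boxes per cell, classes (as in this deck)

def encode_target(boxes, labels):
    """boxes: list of (cx, cy, w, h) normalized to [0, 1]; labels: class ids."""
    target = np.zeros((S, S, B * 5 + C), dtype=np.float32)
    for (cx, cy, w, h), cls in zip(boxes, labels):
        col = min(int(cx * S), S - 1)             # cell containing the center
        row = min(int(cy * S), S - 1)
        x_cell, y_cell = cx * S - col, cy * S - row  # offset within that cell
        target[row, col, 0:5] = [x_cell, y_cell, w, h, 1.0]  # box + confidence
        target[row, col, B * 5 + cls] = 1.0       # one-hot class probability
    return target
```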
14. YOLO (You Only Look Once)
   S = 7 in the paper; S = 14 in our implementation from scratch.
   Because the images contain numerous objects:
   → increase the grid size to S = 14
   → increase the output parameters from 1127 to 4508
   → give the 23rd convolutional layer stride 1 instead of 2 to retain the 14 x 14 size
15. YOLO (You Only Look Once)
   • 24 convolutional layers
   • 4 pooling layers
   • 2 fully connected layers
   • 1 x 1 convolutions to reduce the number of feature maps
   • Leaky ReLU activation after layers
   • Dropout between the two fully connected layers
   → Added batch normalization between layers
   → 23rd convolutional layer: changed stride to 1 to retain the 14 x 14 size
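   The modifications above (batch normalization between layers, leaky ReLU activations, 1 x 1 reductions) correspond to a PyTorch building block along these lines; the channel sizes shown are illustrative assumptions, not the actual layer configuration:

```python
# Illustrative PyTorch building block matching the slide's description,
# not the actual network definition: conv + batch norm + leaky ReLU.
import torch.nn as nn

def conv_block(in_ch, out_ch, kernel_size, stride=1):
    return nn.Sequential(
        nn.Conv2d(in_ch, out_ch, kernel_size, stride=stride,
                  padding=kernel_size // 2, bias=False),
        nn.BatchNorm2d(out_ch),   # batch norm added between layers
        nn.LeakyReLU(0.1),        # leaky ReLU after each layer
    )

reduce_maps = conv_block(1024, 512, kernel_size=1)           # 1x1 cuts channels
keep_size = conv_block(512, 1024, kernel_size=3, stride=1)   # stride 1 keeps 14x14
```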
16. YOLO (You Only Look Once)
   Output tensor of YOLO: S x S x (B x 5 + C)
   Paper: 7 x 7 x (2 x 5 + 20) → each grid cell represented by a vector of length 30
   Our model: 14 x 14 x (2 x 5 + 13) → each grid cell represented by a vector of length 23
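   The output-parameter counts quoted on slide 14 follow directly from this formula:

```python
# Output size S x S x (B*5 + C) for the three configurations mentioned:
S, B = 7, 2
print(S * S * (B * 5 + 20))   # paper, C = 20 classes:     1470
print(S * S * (B * 5 + 13))   # S = 7 with our 13 classes: 1127
S = 14
print(S * S * (B * 5 + 13))   # our model:                 4508
```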
17. YOLO (You Only Look Once): Loss Function
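   For reference, the YOLOv1 loss (Redmon et al.) is a weighted sum of squared errors over localization, confidence, and classification, with $\lambda_{\mathrm{coord}} = 5$ and $\lambda_{\mathrm{noobj}} = 0.5$:

$$
\begin{aligned}
\mathcal{L} ={}& \lambda_{\mathrm{coord}} \sum_{i=0}^{S^2-1} \sum_{j=0}^{B-1} \mathbb{1}_{ij}^{\mathrm{obj}} \Big[ (x_i-\hat{x}_i)^2 + (y_i-\hat{y}_i)^2 + \big(\sqrt{w_i}-\sqrt{\hat{w}_i}\big)^2 + \big(\sqrt{h_i}-\sqrt{\hat{h}_i}\big)^2 \Big] \\
&+ \sum_{i=0}^{S^2-1} \sum_{j=0}^{B-1} \mathbb{1}_{ij}^{\mathrm{obj}} \big(C_i-\hat{C}_i\big)^2
 + \lambda_{\mathrm{noobj}} \sum_{i=0}^{S^2-1} \sum_{j=0}^{B-1} \mathbb{1}_{ij}^{\mathrm{noobj}} \big(C_i-\hat{C}_i\big)^2 \\
&+ \sum_{i=0}^{S^2-1} \mathbb{1}_{i}^{\mathrm{obj}} \sum_{c\,\in\,\mathrm{classes}} \big(p_i(c)-\hat{p}_i(c)\big)^2
\end{aligned}
$$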
18. Faster R-CNN
   Consists of three parts:
   1. Backbone: a classification network (e.g. VGG, ResNet), pretrained on ImageNet, that generates high-resolution feature maps (input: 640x640)
   2. RPN (Region Proposal Network)
   3. Fast R-CNN detection head (next slide)
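   As a hedged illustration of this three-part design (our sketch, not the project's code), recent versions of torchvision bundle a pretrained backbone, RPN, and detection head behind a single model:

```python
# Minimal inference sketch with a pretrained torchvision Faster R-CNN
# (requires a recent torchvision); not the project's training setup.
import torch
import torchvision

model = torchvision.models.detection.fasterrcnn_resnet50_fpn(weights="DEFAULT")
model.eval()

image = torch.rand(3, 640, 640)   # stand-in for a real, normalized RGB image
with torch.no_grad():
    predictions = model([image])  # list with one dict per input image

boxes = predictions[0]["boxes"]    # (N, 4) corrected bounding boxes
labels = predictions[0]["labels"]  # (N,) predicted class indices
scores = predictions[0]["scores"]  # (N,) per-detection confidence scores
```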
19. Fast R-CNN (detection head)
   Receives the high-resolution feature maps and the region proposals.
   ROI pooling layer: produces fixed-size region proposals.
   Outputs:
   • softmax score for every class
   • corrected bounding boxes for the region proposals
   Losses:
   • classification: log loss
   • bounding-box regression: smooth L1 loss
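   The smooth L1 box-regression loss mentioned above has the following textbook form (PyTorch also ships it as torch.nn.functional.smooth_l1_loss); this sketch assumes beta = 1:

```python
# Smooth L1 loss for bounding-box regression: quadratic near zero,
# linear for large errors, so outliers are penalized less than with L2.
import torch

def smooth_l1(pred, target, beta=1.0):
    diff = torch.abs(pred - target)
    return torch.where(diff < beta,
                       0.5 * diff ** 2 / beta,    # quadratic region
                       diff - 0.5 * beta).mean()  # linear region
```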
20. Software Used
   • Python in the Spyder IDE.
21. Experimental Results

   Training setup:
   • YOLO: 81 epochs, learning rate 1e-5 (decreasing), batch size 10
   • Faster R-CNN: 60 epochs, learning rate 1e-4 (decreasing), batch size 16

   Model                    mAP     FPS
   YOLO                     18.6    212.4
   Faster R-CNN             41.8    17.1
   Hybrid incremental net   45.7    unknown
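   FPS figures like those in the table come from timing repeated single-frame forward passes; a minimal sketch of such a measurement (our assumption, not the project's benchmark script):

```python
# Time n_runs single-frame forward passes and report frames per second.
import time
import torch

def measure_fps(forward, n_runs=100):
    """`forward` is a zero-argument callable wrapping one model call,
    e.g. lambda: model([image]) for a torchvision detection model."""
    with torch.no_grad():
        start = time.perf_counter()
        for _ in range(n_runs):
            forward()
        elapsed = time.perf_counter() - start
    return n_runs / elapsed
```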
22. Experimental Results (contd.)
23. Experimental Results (contd.)
24. Conclusion
   • Faster R-CNN: high accuracy, but lower FPS
   • YOLO: high FPS, but lower accuracy

   Future Work:
   • We used the first version of YOLO → try newer YOLO versions
   • Further experiments with other models
   • Achieve high accuracy AND high FPS
25. References
   1. Abhishek Gupta, Alagan Anpalagan, Ling Guan, Ahmed Shaharyar Khwaja, "Deep learning for object detection and scene perception in self-driving cars".
   2. Lisheng Jin, Mei Chen, Yuying Jiang, Haipeng Xia, "Multi-Traffic Scene Perception Based on Supervised Learning".
   3. Duy Anh Tran, Pascal Fischer, Alen Smajic, Yujin So, "Real-time Object Detection for Autonomous Driving using Deep Learning".
   4. Liangfu Chen, Zeng Yang, Jianjun Ma, Zheng Luo, "Driving Scene Perception Network: Real-time Joint Detection, Depth Estimation and Semantic Segmentation".
   5. Christos Sakaridis, Dengxin Dai, Luc Van Gool, "ACDC: The Adverse Conditions Dataset with Correspondences for Semantic Driving Scene Understanding".
27. Key Learnings
   • Hands-on experience with advanced computer vision techniques.
   • The core concepts and math behind these techniques.
   • How to learn from research papers.
   • Hands-on experience adds far more than classroom education alone.

   Thanks to RV College of Engineering for providing us with this opportunity.
28. Thank You
