Building HD maps with dashcams


About my work at DeNA Co., Ltd. We built HD maps for self-driving cars from images taken by dashcams. Deep learning is used extensively for detecting objects on the road, and SfM is used for reconstructing 3D points from 2D images. These slides were presented at TechCon 2019.

Published in: Engineering

  1. 1. #denatechcon Building HD Maps with Dashcams Kosuke Kuzuoka AI System Group DeNA Co., Ltd.
  2. 2. #denatechcon Agenda • Who I am • Our Goal • Intro to DL and SfM • 3D Point Reconstruction • Recognizing Objects • Putting It All Together
  3. 3. #denatechcon Who I am • Profile • Kosuke Kuzuoka • 22 years old • Experience • June 2018 - Present AI Research Engineer at DeNA Co., Ltd. • March 2017 - June 2018 R&D manager at CONCORE’S, inc. • Interests • Self Driving Cars • Computer Vision
  4. 4. #denatechcon What I have done before ● Detecting objects from construction plans using deep learning algorithms ● Patent-pending algorithm that I developed for detecting pillars across multiple tiled images
  5. 5. #denatechcon Our Goal ● To create high definition maps at a lower price ● 3D point reconstruction and object detection in dashcam images ● No use of expensive equipment, such as LiDAR
  6. 6. #denatechcon Isn’t it like Google Maps? ● Google Maps is a map designed for humans ● It has useful information for humans ● An HD map is a map designed for machines ● It has useful information for cars, such as where traffic signs are
  7. 7. #denatechcon Is it for self-driving cars? ● It’s extensively used in self-driving cars, such as for localization and path planning ● Therefore, the location accuracy of HD maps needs to be within a few centimeters ● A self-driving car needs to know which direction the lane is leading, where the traffic signs are, etc.
  8. 8. #denatechcon Introduction to Deep Learning ● The ideas behind deep learning date back to the late 1950s, when Frank Rosenblatt invented the Perceptron. ● The Perceptron was able to solve linearly separable problems. ● Later, it turned out that a simple (single-layer) Perceptron cannot solve problems that are not linearly separable, such as XOR.
  9. 9. #denatechcon Why is deep learning popular nowadays? ● Large-scale datasets such as ImageNet have been made public for research purposes ● High-performance computing resources such as GPUs are more accessible than ever before
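To make the point about linear separability concrete, here is a minimal sketch of a single-layer perceptron trained with the classic update rule. It is an illustration only, not code from the talk: the function names, the learning rate, the epoch count, and the AND/XOR toy datasets are all assumptions chosen for the example. AND is linearly separable, so the perceptron learns it; XOR is not, so no setting of the weights can classify all four points.

```python
def predict(weights, bias, x):
    """Return 1 if the weighted sum is positive, else 0 (step activation)."""
    total = weights[0] * x[0] + weights[1] * x[1] + bias
    return 1 if total > 0 else 0


def train_perceptron(samples, epochs=20, lr=1.0):
    """Train a two-input perceptron; samples is a list of ((x1, x2), label)."""
    weights = [0.0, 0.0]
    bias = 0.0
    for _ in range(epochs):
        for x, label in samples:
            error = label - predict(weights, bias, x)
            # Move the decision boundary only when the prediction is wrong.
            weights[0] += lr * error * x[0]
            weights[1] += lr * error * x[1]
            bias += lr * error
    return weights, bias


def accuracy(samples, weights, bias):
    """Fraction of samples classified correctly."""
    correct = sum(predict(weights, bias, x) == label for x, label in samples)
    return correct / len(samples)


# Toy datasets (assumed for illustration).
AND = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]
XOR = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 0)]

if __name__ == "__main__":
    print("AND accuracy:", accuracy(AND, *train_perceptron(AND)))
    print("XOR accuracy:", accuracy(XOR, *train_perceptron(XOR)))
```

Stacking several layers with non-linear activations removes this limitation, which is the step from the perceptron to the deep networks discussed in the following slides.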
  10. 10. #denatechcon Okay, but what can you do with DL? ● Using deep learning, we can solve object detection and instance segmentation problems ● Object detection finds multiple objects in the image, while instance segmentation outlines the boundary of each object ● Using deep learning, we can also solve image classification and image localization problems ● Image classification identifies what is in the image, while image localization identifies both what is in the image and where it is
  11. 11. #denatechcon Okay, let’s sum that up • Deep learning is not new • Data is important for deep learning • High computational resources are necessary • You can do so many things with deep learning
  12. 12. #denatechcon Introduction to SfM SfM stands for Structure from Motion. It is an algorithm that reconstructs 3D points (the structure) from images taken from different angles or positions (the motion). Large-scale applications include, for example, reconstructing all of Rome using only images found on the web.
  13. 13. #denatechcon How does SfM work? ● Extracts features from images, e.g. corners or edges ● Matches the features across images taken from different positions ● Calculates the corresponding points in 3D coordinates using triangulation ● Calculates the camera positions and optimizes the reconstructed 3D points
  14. 14. #denatechcon What can you do with SfM? SfM has been used to build a 3D representation of Rome within a day from images found on the web. The reconstruction used 150k images, and the processing time was around 21 hours using 496 CPU cores.
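The triangulation step can be sketched as follows. The slides do not say which triangulation method was used, so this example uses the simple midpoint method: each matched feature defines a ray from its camera, and the 3D point is taken as the midpoint of the shortest segment between the two rays. The pinhole intrinsics (`focal`, `cx`, `cy`), the camera positions, and the assumption that both cameras share the same orientation are all hypothetical values chosen for illustration.

```python
def pixel_to_ray(pixel, focal, cx, cy):
    """Direction of the viewing ray through a pixel (pinhole camera, no rotation)."""
    u, v = pixel
    return ((u - cx) / focal, (v - cy) / focal, 1.0)


def dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def triangulate_midpoint(center1, dir1, center2, dir2):
    """Midpoint of the shortest segment between two rays center + t * dir."""
    w0 = tuple(p - q for p, q in zip(center1, center2))
    a, b, c = dot(dir1, dir1), dot(dir1, dir2), dot(dir2, dir2)
    d, e = dot(dir1, w0), dot(dir2, w0)
    denom = a * c - b * b
    if abs(denom) < 1e-12:
        raise ValueError("rays are parallel; the point cannot be triangulated")
    s = (b * e - c * d) / denom  # parameter along ray 1
    t = (a * e - b * d) / denom  # parameter along ray 2
    on_ray1 = tuple(p + s * k for p, k in zip(center1, dir1))
    on_ray2 = tuple(p + t * k for p, k in zip(center2, dir2))
    return tuple((p + q) / 2 for p, q in zip(on_ray1, on_ray2))


if __name__ == "__main__":
    # Hypothetical setup: the same feature seen from two positions 1 m apart.
    focal, cx, cy = 1000.0, 640.0, 360.0
    ray1 = pixel_to_ray((740.0, 360.0), focal, cx, cy)
    ray2 = pixel_to_ray((540.0, 360.0), focal, cx, cy)
    print(triangulate_midpoint((0.0, 0.0, 0.0), ray1, (1.0, 0.0, 0.0), ray2))
```

A full SfM pipeline also estimates the camera poses themselves and refines poses and points jointly, which corresponds to the last bullet on the slide; that part is omitted here.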
  15. 15. #denatechcon Let’s sum that up • SfM can reconstruct 3D shapes from 2D images • 3D representation of Rome can be built in a day using images from the web
  16. 16. #denatechcon So we have tools. What now? ● Dashcam images are used for reconstructing 3D points by SfM ● The same images are used for detecting objects in 2D space ● Both results are integrated to get 3D representations of each object
  17. 17. #denatechcon 3D Point Reconstruction ● Images are taken by driving in the highlighted region in Minatomirai ● Dashcam images are used for SfM and object detection
  18. 18. #denatechcon Overall shape looks good ● 3D modeling in a relatively small region in Minatomirai ● Reconstructed shape matches the highlighted region in the map
  19. 19. #denatechcon Slightly larger region, still good ● Red arrows indicate the direction the car was driving ● The reconstructed shape matches the highlighted region in the map
  20. 20. #denatechcon Hooray, view from top is good ● SfM was applied in a larger region in the Minatomirai area ● Overall shape still matches the map
  21. 21. #denatechcon What about the closer view? ● The detail of road markings and speed limit signs can be found, though some information is unnecessary. ● Lanes are reconstructed well on the left side, but the center lane markings on the right are missing. This is caused by the divider.
  22. 22. #denatechcon Some findings with SfM are: • Reconstructed 3D points contain small details • A GPU can reduce the processing time significantly • The more images, the better the result
  23. 23. #denatechcon Recognizing Objects ● We chose Faster R-CNN for detecting traffic signs ● Faster R-CNN was a state-of-the-art detector in 2016 ● Faster R-CNN is more accurate than real-time detectors, but it’s slower
  24. 24. #denatechcon Objects are detected correctly ● Most traffic signs are detected correctly, though one small traffic sign is missed by the detector ● The network predicts the category for each box, and there are more than 100 categories to choose from
  25. 25. #denatechcon Another example for traffic signs
  26. 26. #denatechcon What now for lane detection? ● We chose LaneNet, published in 2018, as the lane detector ● LaneNet transforms the original image to a bird’s eye image with learned parameters ● It can detect multiple lane instances at real-time speed and with high accuracy
  27. 27. #denatechcon Deep learning can detect lanes! ● Different colors indicate different instances ● You can see that the lanes are detected correctly ● It can detect curved lanes as well, though none appear in this image
  28. 28. #denatechcon Another example for lane detection
  29. 29. #denatechcon What about road markings? ● Bird’s eye transformation on original image ● Inverse transformation on bird’s eye image ● Faster R-CNN on bird’s eye image
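As a rough sketch of the transformations named on this slide, a bird's eye transformation can be modeled as a 3x3 homography applied to pixel coordinates, and the inverse transformation as its matrix inverse. The slides do not give the matrix or say how it was obtained, so the matrix `H` and the helper names below are hypothetical. The last helper maps a detection box found on the bird's eye image back to a quadrilateral in the original image.

```python
def apply_homography(H, point):
    """Map a 2D point through a 3x3 homography using homogeneous coordinates."""
    x, y = point
    xh = H[0][0] * x + H[0][1] * y + H[0][2]
    yh = H[1][0] * x + H[1][1] * y + H[1][2]
    w = H[2][0] * x + H[2][1] * y + H[2][2]
    return (xh / w, yh / w)


def invert_3x3(M):
    """Invert a 3x3 matrix via the adjugate; raises if the matrix is singular."""
    (a, b, c), (d, e, f), (g, h, i) = M
    det = a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)
    if abs(det) < 1e-12:
        raise ValueError("matrix is singular")
    return [
        [(e * i - f * h) / det, (c * h - b * i) / det, (b * f - c * e) / det],
        [(f * g - d * i) / det, (a * i - c * g) / det, (c * d - a * f) / det],
        [(d * h - e * g) / det, (b * g - a * h) / det, (a * e - b * d) / det],
    ]


def box_to_original(H, box):
    """Map a bird's eye box (x_min, y_min, x_max, y_max) back to the original image.

    Returns the four corners as a quadrilateral, because a rectangle in the
    bird's eye view is generally not a rectangle in the original view.
    """
    x_min, y_min, x_max, y_max = box
    H_inv = invert_3x3(H)
    corners = [(x_min, y_min), (x_max, y_min), (x_max, y_max), (x_min, y_max)]
    return [apply_homography(H_inv, corner) for corner in corners]


# Hypothetical original-to-bird's-eye homography, for illustration only.
H = [
    [1.0, -0.5, 100.0],
    [0.0, 2.0, 50.0],
    [0.0, 0.002, 1.0],
]

if __name__ == "__main__":
    print(apply_homography(H, (320.0, 240.0)))
    print(box_to_original(H, (10.0, 20.0, 30.0, 40.0)))
```

Running the detector on the bird's eye image has the advantage that road markings appear with roughly constant shape and size regardless of their distance from the camera.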
  30. 30. #denatechcon Deep learning works for road markings! ● Road markings are detected correctly. ● It distinguishes the lane from the stop sign ● The detection fits objects, though not perfectly
  31. 31. #denatechcon Another example for road markings
  32. 32. #denatechcon The result is impressive
  33. 33. #denatechcon Objects are detected precisely
  34. 34. #denatechcon Let’s sum that up • Traffic sign recognition with more than 100 categories can be solved with deep learning • Deep learning works well on complicated tasks such as lane and road marking detection • The more data, the better the results
  35. 35. #denatechcon Putting It All Together ● Green points indicate the region used for 3D reconstruction ● The detection has to be done in frames where the objects are highlighted in green
  36. 36. #denatechcon Results are now integrated We can get a 3D representation of the detected objects by integrating both results. The final result will look like the image above.
  37. 37. #denatechcon Now, objects are represented in 3D ● Detected traffic signs and road markings are converted to 3D ● Each object has a 3D representation after integrating both SfM and object detection results
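The slides do not describe how the two results are combined, so the following is only one plausible sketch of the integration step: project the SfM points into the frame where an object was detected, keep the points that fall inside the 2D detection box, and summarize them with a per-axis median as the object's 3D position. The pinhole intrinsics, the assumption that points are already expressed in that frame's camera coordinates, and all function names are hypothetical.

```python
from statistics import median


def project(point, focal, cx, cy):
    """Project a 3D point in camera coordinates to pixel coordinates."""
    x, y, z = point
    return (cx + focal * x / z, cy + focal * y / z)


def lift_box_to_3d(points_3d, box, focal, cx, cy):
    """Estimate a 3D position for a 2D detection box.

    points_3d: SfM points in the camera frame of the image the box came from.
    box: (x_min, y_min, x_max, y_max) in pixels.
    Returns the per-axis median of the points that project inside the box,
    or None if no reconstructed point falls inside it.
    """
    x_min, y_min, x_max, y_max = box
    inside = []
    for point in points_3d:
        if point[2] <= 0:
            continue  # behind the camera, cannot be visible in this frame
        u, v = project(point, focal, cx, cy)
        if x_min <= u <= x_max and y_min <= v <= y_max:
            inside.append(point)
    if not inside:
        return None
    return tuple(median(axis) for axis in zip(*inside))


if __name__ == "__main__":
    # Hypothetical SfM points and a hypothetical traffic-sign detection box.
    points = [
        (1.0, -0.5, 10.0),
        (1.1, -0.4, 10.2),
        (0.9, -0.6, 9.8),
        (-3.0, 1.0, 8.0),
        (0.0, 0.0, -1.0),
    ]
    print(lift_box_to_3d(points, (700.0, 280.0, 780.0, 340.0), 1000.0, 640.0, 360.0))
```

The median is used here rather than the mean so that a few background points that happen to project inside the box have less influence on the estimate.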
  38. 38. #denatechcon We are done! ● Reconstructed 3D view seen from the top ● You can see the detected lanes and road markings now have a 3D representation
  39. 39. #denatechcon Using this technique, we could: • Automate the map creation process • Create HD maps for other services • Detect changes automatically
  40. 40. #denatechcon Thanks!
  41. 41. #denatechcon #denatechcon