Bidirectional Information Exchange
Subtitle as needed (paper subtitle)
Authors Name/s per 1st Affiliation (Author)
line 1 (of Affiliation): dept. name of organization
line 2-name of organization, acronyms acceptable
line 3-City, Country
line 4-e-mail address if desired
Authors Name/s per 2nd Affiliation (Author)
line 1 (of Affiliation): dept. name of organization
line 2-name of organization, acronyms acceptable
line 3-City, Country
line 4-e-mail address if desired
Abstract— It is essential to collect and analyze
environmental information surrounding the vehicle in an
autonomous driving environment. To do this, it is
necessary to analyze real-time information such as
location, orientation and size of objects. In particular,
handling of occlusion or truncation helps to ensure safe
driving by providing exact information about the
surroundings of the vehicle. In this paper, we propose a
new method to generate object proposals around a vehicle
using projected LiDAR information and to perform object
detection more accurately by exploiting a 2D image based
classifier. This method generates proposals by filtering the LiDAR
information into 2D edges with a simple but strong affinity measure.
The objects in each proposal are then classified through R-FCN, and
the resulting class labels are mapped back to the edge points used to
make the proposal. Each class label is combined with the orientation
information of the edge in 3D space to complete the 3D box of the
object. This compensates for the main disadvantage of CCD-based CNN
classifiers, namely the difficulty of obtaining spatial information,
and addresses problems such as occlusion. We compared our method to
state-of-the-art results on the KITTI dataset and obtained good
results in terms of speed and accuracy (especially IoU with the
ground truth).
Keywords—LiDAR; CCD; Proposal generation; Object
Detection; R-FCN
I. INTRODUCTION
In recent years, CNNs have made major contributions to fields such
as object classification and object detection. In particular, as
autonomous driving has become an active topic, deep-network research
on object recognition and behavior analysis in the driving
environment is being carried out intensively. Recent deep
network-based object detection methods fall into two categories:
two-stage and one-stage. In the two-stage type, the 'proposal
generator' and the 'classifier' are implemented as independent
components, whereas in the one-stage type they are performed
together.
In the two-stage type, R-CNN [1] shows good performance in object
classification by generating region proposals, scoring the features
that exist in each region, and classifying the proposed regions on
the feature maps. This method significantly reduces the computational
cost compared to the sliding-window approach and thus has advantages
in terms of speed. Faster R-
CNN [2] devised the 'RPN (Region Proposal Network)' that
generates the proposal through feature sharing, taking
advantage of the fact that the convolution layer itself contains
enough information to represent the object. HyperNet [3]
replaces the features used in Faster R-CNN by pooling at several
scales to compensate for the fact that the FC5 layer information is
insufficient for small objects; it creates a hyper feature set and
uses it to match objects of various sizes. There is, however, a
dilemma between the translation invariance desired for classification
and the translation variance desired for detection when the two are
linked. To solve this dilemma, R-FCN [4] divides the ROI into k × k
cells and finds objects centered in the ROI through local voting.
In the one-stage type, bounding-box prediction and class-probability
calculation are performed together, as in YOLO [5]. In YOLO, the
class probability map and the bounding-box confidence map are
generated on a grid without producing a separate proposal, and the
class of each bounding box is predicted immediately, greatly
simplifying the overall calculation. Fast YOLO [ ] simplifies the
network slightly and reaches a computation speed of 155 fps. In a
similar approach, the single shot detector (SSD) [6] evaluates boxes
and confidences through multi-scale features of the convolution
layers, similar to HyperNet.
However, for autonomous navigation, additional information is needed
beyond detecting and classifying objects well. The most important
such information is the occlusion and truncation among objects. These
two factors are key to understanding the actual position and
arrangement of nearby objects in the driving environment. Of course,
existing methods are also good at classifying occluded and truncated
objects, but they are weak at handling the obscured part of the
object.
In both types, a bounding box is created based on the features of the
unobscured part of the occluded object, and the object is classified
through the part that appears in the box. So, although the class of
the object can be found, the actual size of the object in the hidden
area cannot be estimated.
Another problem is obtaining spatial information such as the 3D
location and orientation of the detected object and its actual size.
To grasp the traffic flow around the vehicle for autonomous driving,
it is very difficult to estimate accurate 3D spatial information
using only a single RGB image. Therefore, using 3D depth information
is the most intuitive approach. 3DOP [7] generates a depth map from a
stereo image pair and projects each pixel of the RGB image into 3D
space; it defines the relationship between pixels as an MRF energy
function with several terms and classifies the objects with a linear
SVM. However, the sophisticated stereo-based depth-map generation
used in 3DOP has a high computational cost. Depth can also be
obtained from sources other than stereo, among which laser sensors
such as LiDAR are widely used, since they can collect 3D spatial
information very quickly. Among these studies, Vote3D [8] learns
directly from LiDAR in 3D space through 3D voxels. 3DVP classifies
the LiDAR point cloud into 3D voxels and applies 2D alignment with a
3D CAD model. In addition, there are studies in which LiDAR points
are projected onto 2D to generate 2D depth maps. However, LiDAR
information is fundamentally sparse and is difficult to learn from
consistently, because it produces non-uniform results depending on
the surface of the environment or object. Based on
the structure of RPN + classifier of Faster R-CNN,
SubCNN[9] adds subcategory information to both sides and corrects the
region according to the size and orientation information of the
subclass.

Figure 1. System overview. The upper network is characterized by scoring through voting over the k × k area with R-FCN. In this study, the ROI of this network is generated by filtering the LiDAR point cloud. Through this ROI, the object is classified by voting, and the classified class label is mapped back to the points to expand it in 3D space. This creates the final bounding box for each object.

Particularly in the 'Car' category,
3DVP[10] was applied to identify the 3D location and
orientation of the vehicle and perform 2D and 3D
segmentation. However, this method can only be applied to rigid-body
models, and many additional samples are required for each subcategory
class. In addition, manual alignment is required for all of them, and
the sparse LiDAR point cloud makes it difficult to construct
sufficient voxels for distant objects. The computation speed is also
not fast enough for autonomous driving environments, where real-time
processing is essential. Therefore, this study proposes a fast and
accurate 2D and 3D object detection and classification method suited
to the autonomous driving environment by combining the CCD image with
LiDAR sensor information. Instead of 3D voxels, the proposed method
uses the LiDAR points as 2D data in which all points along the z-axis
are collapsed onto one plane, which guarantees a much higher density
than the conventional representation. We filter these points, group
the resulting edges by affinity, and generate 2D proposals. Each
generated proposal is used as an ROI of the R-FCN network, and the
bounding box of the object in 3D space is created by mapping the
classification result back onto the constituent edge points. Finally,
by projecting this box onto 2D again, it can be expanded into the
hidden region that does not appear in the 2D image.
The contributions of this study are as follows:
- The method replaces the existing proposal generator and improves
speed by combining simplified LiDAR sensor information with a
powerful CCD-based CNN architecture.
- 3D spatial information about the objects around the vehicle can be
acquired at the same time, using only the information generated while
making the proposals, without any additional process.
- The method expands the bounding box over the occluded or truncated
region, extending the box closer to the actual size of the object.
To this end, Section II-A shows how to perform a simple mutual
projection between 2D and 3D points. Section II-B uses this to detect
objects with the state-of-the-art CCD-based classifier R-FCN. Section
II-C expands the box, based on the orientation information obtained
in II-A and in 3D space through the detected class label, to
determine the optimal bounding box for the actual class. Section III
quantitatively verifies the average precision (AP) on the object and
tracking sequences of the KITTI dataset and compares it with previous
studies; it also confirms the expansion results for obscured areas
with example images.
II. BIDIRECTIONAL INFORMATION EXCHANGE
A. Proposal Generation with LiDAR
We propose 'bidirectional projection filtering' (BPF), which
exchanges 2D and 3D information through the calibration matrix
between the CCD image and the 3D point cloud, as shown in Figure 2.
Through this method, it is possible to generate object proposals,
estimate obscured areas, and obtain the actual size and 3D spatial
information of the detected object based on the CCD image. In this
section, we describe how to remove point information that is not
needed for proposal generation and how to exchange information
between 2D and 3D space through projection. As shown in Figure 2-(b),
we remove the ground to separate the object points and eliminate
points outside the CCD field of view. Next, the points are projected
along the z-axis onto one plane to obtain a dense 2D object shape.
Then, error points are filtered out, the remaining edges are grouped
based on the affinity between neighboring edges, and a proposal is
generated for each resulting edge group.
Figure 2. Point filtering through 2D projection of LiDAR.
(a) original image, (b) LiDAR point cloud with ground removed,
(c) removal of points outside the field of view and dimension
reduction to increase density, (d) 2D projection and filtering of
points, (e) 3D point cloud before filtering, and (f) 3D point cloud
after filtering.
Ground plane removal - In the LiDAR environment, the simplest way to
remove the ground is to treat the plane under the sensor-equipped
vehicle as the ground plane and remove all LiDAR points below the
corresponding height. In this paper, we instead use a surface
detection method that considers the curvature of the road surface,
since the road is not infinitely flat and obstacle-free. Among the
many possible methods, this study uses the fact that objects in the
driving environment have a volume with a certain height. Using a
voxel grid of 20x20x10 cm, the height of the points constituting each
cell is obtained for every XY cell that contains points.
(1)
(2)
When the coordinates of the XY plane are i and j, each grid cell is
denoted C_ij and the point set belonging to C_ij is denoted PS_ij. In
equation (1), HPS_ij is the height range of the points in each cell.
The ground G(i,j) in equation (2) is the set of grid cells with zero
height range. Figure 2-(b) shows the point cloud with the ground area
removed in this way.
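Since equations (1) and (2) are not reproduced in this draft, the following Python sketch only illustrates one plausible reading of the described ground-removal step: points are binned into XY grid cells, the height range of each occupied cell is measured, and cells whose height range is near zero are treated as ground. The cell size, threshold, and function name are illustrative assumptions rather than the authors' exact formulation.

```python
import numpy as np

def remove_ground(points, cell=0.2, height_eps=0.1):
    """Drop LiDAR points belonging to near-flat (ground) XY cells.

    points     : (N, 3) array of x, y, z coordinates in metres.
    cell       : XY grid resolution (the paper uses 20 x 20 cm cells).
    height_eps : cells whose z-range stays below this are treated as ground.
    """
    # Integer cell index (i, j) for every point, as in C_ij / PS_ij.
    ij = np.floor(points[:, :2] / cell).astype(np.int64)
    keys, inverse = np.unique(ij, axis=0, return_inverse=True)

    # Height range HPS_ij of the points falling into each occupied cell.
    z = points[:, 2]
    z_min = np.full(len(keys), np.inf)
    z_max = np.full(len(keys), -np.inf)
    np.minimum.at(z_min, inverse, z)
    np.maximum.at(z_max, inverse, z)
    height = z_max - z_min

    # Ground cells G(i, j): cells with (near) zero height range.
    is_ground_cell = height < height_eps
    keep = ~is_ground_cell[inverse]
    return points[keep]
```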
Outside point elimination - The point set with the ground removed
must be simplified further. First, the LiDAR points are multiplied by
the calibration matrix, all points outside the x-axis range of the
CCD image are removed, and only the remaining points are processed.
The first reason for projecting onto the CCD is that LiDAR points
outside the field of view can be removed easily. Second, since the
reference of the horizontal-axis search changes from radians to
pixels through the projection, filtering along the horizontal axis
can be performed in a very simple manner. Figure 2-(c) shows that the
range of points to be processed is greatly reduced.
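As a rough illustration of this step, the sketch below projects the LiDAR points with a single 3 x 4 camera projection matrix P (as provided with KITTI calibration files) and discards everything that falls outside the image; the exact calibration pipeline and the function name are assumptions.

```python
import numpy as np

def project_and_crop(points, P, img_width, img_height):
    """Project 3D LiDAR points with a 3x4 calibration matrix P and keep
    only points that fall inside the image, i.e. inside the camera FOV.

    Returns the surviving 3D points and their pixel coordinates.
    """
    homo = np.hstack([points, np.ones((len(points), 1))])   # (N, 4)
    cam = homo @ P.T                                         # (N, 3)
    in_front = cam[:, 2] > 0                                 # drop points behind the camera
    uv = cam[:, :2] / cam[:, 2:3]                            # perspective divide

    inside = (
        in_front
        & (uv[:, 0] >= 0) & (uv[:, 0] < img_width)
        & (uv[:, 1] >= 0) & (uv[:, 1] < img_height)
    )
    return points[inside], uv[inside]
```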
Z-axis elimination - After the ground is removed, the sparse LiDAR
points are arranged in a two-dimensional plane by projecting all
points along the z-axis, which increases the density and at the same
time speeds up computation. Since the objects of interest reach a
height of 50 cm or more, PS^z_ij = -0.9 is set for every point set
PS_ij so that the projection plane lies 50 cm above the ground,
relative to the height at which the LiDAR sensor is mounted on the
vehicle. However, as shown in Figure 2-(b), since the 3D points of
the original space are pressed into one plane, the density of the
points increases, but some regions look like noise because of the
depth differences.
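A minimal sketch of this flattening step is shown below; the -0.9 m value is taken from the text, and the function name is illustrative.

```python
def flatten_z(points, plane_z=-0.9):
    """Collapse all points onto one horizontal plane to densify the cloud
    (PS^z_ij = -0.9 for every point set, as described in the text)."""
    flat = points.copy()
    flat[:, 2] = plane_z
    return flat
```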
Edge filtering in CCD - Using 2D LiDAR point clouds obtained by CCD
projection has several advantages. In 3D space, LiDAR spreads
radially from the origin and obtains distance information. To extract
the line of an object closest to the capture point, the distance
would have to be computed over all angles, which are continuous
values. However, if we project the points onto the CCD, the discrete
pixel grid becomes the reference domain, as shown in Figure 2-(d). As
shown in Figure 3, duplicate points can then easily be removed by
height or depth within each pixel column. In addition, since error
pixels can be discriminated easily through a gradient operation
proportional to the distance between neighboring pixels, the edge
closest to the capturing vehicle can be generated in each column of
the x-axis.
(3)
In equation (3), Cl_i denotes each x-axis column containing at least
one projected LiDAR point and hPs_i denotes the minimum height within
Cl_i. hPs_i is computed for all i, and noise is removed with a
one-dimensional median filter [11]. At this time, not only the height
of the point selected by the median filter but also the index of the
corresponding point is kept, so that the noise point is removed in 3D
space as well. Figure 2-(f) shows the three-dimensional point cloud
obtained in this way.
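Since equation (3) is likewise missing from this draft, the sketch below shows one possible realization of the described column-wise edge extraction: for every image column that contains projected LiDAR points, the minimum-height point is kept, and the resulting 1D profile is cleaned with a one-dimensional median filter while the point indices are carried along so the corresponding 3D points can be cleaned as well. The kernel size and interface are assumptions.

```python
import numpy as np
from scipy.ndimage import median_filter

def column_edges(uv, heights, img_width, ksize=5):
    """Pick the minimum-height point per image column (Cl_i -> hPs_i)
    and smooth the profile with a 1D median filter, keeping indices."""
    cols = uv[:, 0].astype(np.int64)
    best_h = np.full(img_width, np.nan)
    best_idx = np.full(img_width, -1)

    for idx, (c, h) in enumerate(zip(cols, heights)):
        if best_idx[c] < 0 or h < best_h[c]:
            best_h[c], best_idx[c] = h, idx

    occupied = best_idx >= 0
    smoothed = median_filter(best_h[occupied], size=ksize)
    return smoothed, best_idx[occupied]
```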
Segmentation by edge affinity – The point map determined through the
above process is segmented according to two conditions. First,
consecutive edges are grouped by the edge-grouping method introduced
in EdgeBoxes [12]. Here, the edge magnitude m_p is taken as the
inverse of the Euclidean distance between the adjacent points P_i and
P_j of two neighboring edge groups, as shown in equation (4).
(4)
(5)
The affinity score a(s_i, s_j) is obtained for all edge groups S_i,
and the edges are segmented at the boundaries of groups whose score
changes rapidly; equations (4) and (5) give the affinity score
between neighboring edge groups. For the points p_i and p_j located
at the boundary of each edge set, PG^z_i and PG^z_j, which are the
ground heights, and PG^{z1.5}_i and PG^{z1.5}_j, which lie 1.5 m
above the ground, are projected onto the CCD image. A bounding box is
then created by connecting them horizontally to the adjacent
boundary.
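Because equations (4) and (5) are not reproduced here, the following sketch captures only the idea described in the text: the affinity between consecutive edge points is taken as the inverse of their Euclidean distance, and a segmentation boundary is inserted wherever that affinity drops below a threshold. The threshold and grouping interface are assumptions.

```python
import numpy as np

def segment_by_affinity(edge_points, min_affinity=0.5):
    """Split an ordered sequence of edge points into groups wherever the
    affinity (inverse distance) between consecutive points drops sharply."""
    groups, current = [], [edge_points[0]]
    for prev, curr in zip(edge_points[:-1], edge_points[1:]):
        dist = np.linalg.norm(np.asarray(curr) - np.asarray(prev))
        affinity = 1.0 / (dist + 1e-6)   # m_p ~ inverse Euclidean distance
        if affinity < min_affinity:      # weak link -> new object boundary
            groups.append(current)
            current = []
        current.append(curr)
    groups.append(current)
    return groups
```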
Proposal generation based on actual size – We create a bounding box
between segmented boundaries. The bottom of each boundary point is
determined from the ground position obtained during the
ground-removal process. The height of the bounding box is set to
1.5 m, which covers the three classes mainly detected in the driving
environment, and the box is projected onto the CCD image to generate
the proposal. In Figure 4, the yellow line represents the projected
LiDAR edge and the orange lines represent the boundaries obtained
from the edge affinity; the green boxes at the bottom are the 1.5 m
high boxes created between these boundaries.
Figure 4. A box proposal created based on edge group
boundaries divided by edge affinity.
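A hedged sketch of this proposal-generation step follows: each segmented edge group spans a horizontal extent, the edge is raised by the fixed 1.5 m object height, and both the ground-level and raised points are projected onto the CCD image to close the 2D box. The helper projection and parameter names are illustrative.

```python
import numpy as np

def make_proposal(group_points_3d, P, object_height=1.5):
    """Build one 2D box proposal from a segmented 3D edge group.

    group_points_3d : (N, 3) points of one edge group (ground-level edge).
    P               : 3x4 camera projection matrix.
    object_height   : fixed box height in metres covering the main classes.
    """
    bottom = group_points_3d.copy()
    top = group_points_3d.copy()
    top[:, 2] += object_height          # raise the edge by 1.5 m

    def project_to_image(pts):          # same projection as earlier (assumed helper)
        homo = np.hstack([pts, np.ones((len(pts), 1))])
        cam = homo @ P.T
        return cam[:, :2] / cam[:, 2:3]

    uv = np.vstack([project_to_image(bottom), project_to_image(top)])
    x1, y1 = uv.min(axis=0)
    x2, y2 = uv.max(axis=0)
    return x1, y1, x2, y2               # proposal box in image coordinates
```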
B. Classification with R-FCN
The generated proposals are combined with R-FCN, which shows
state-of-the-art performance, to classify the objects. Figure 5-(b)
shows the R-FCN implemented with ResNet-101 [13]. R-FCN solves the
translation invariance and variance dilemma between detection and
classification by dividing the ROI into an n x n grid and training a
feature set through each grid cell. If a box proposal is determined
according to this learning scheme, a high voting score is returned
when the object is sufficiently enclosed. Using this property,
redundant boxes caused by LiDAR noise can be suppressed based on
their voting scores.

Figure 3. The process by which objects are detected by bidirectional projection filtering. (a) original image, (b) 2D LiDAR projection, (c) 2D gradient-based edge cleanup, (d) proposal creation using edge affinity and classification result by R-FCN, (e) 3D bounding box expansion by LiDAR projection (red boxes), (f) final result.
Figure 5. Detailed structure of proposed method. (a) 3D to
2D Projection Filtering (b) R-FCN (c) 2D to 3D Projection
Filtering. (d) Result
C. 3D projection and box extension
Next, we assign the class label of each box classified by R-FCN to
the edge group inside its edge boundary. Each labeled edge is
projected back into 3D space through the indices kept in Figure
5-(c), and the orientation of the object is estimated from the
extreme points along the X-Y axes among the points constituting the
edge. Once the orientation is determined, the 3D box is expanded to
match the class-specific size and aspect ratio. The edge used as the
base of the expansion is chosen according to the positional
relationship between the edge groups and the height difference of
hP_i and hP_j at the segmentation boundary surrounding the edge in 2D
space. Using this, we extend the box around the edge that corresponds
to the visible side of the object. Figure 6 shows the process of
expanding the box through the labeled edges and merging the
overlapping boxes and edges.
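The sketch below gives a simplified, assumption-laden reading of this extension step for the 'Car' class: the labeled edge points are taken in 3D, the dominant direction of the edge is estimated from its extreme points in the X-Y plane, and the box is grown along that direction to a class-specific size. The class dimensions and the anchoring choice are illustrative and not the authors' exact rule.

```python
import numpy as np

# Rough class-specific sizes (length, width, height) in metres; illustrative only.
CLASS_SIZE = {"Car": (4.0, 1.7, 1.5)}

def extend_box(edge_points_3d, label):
    """Grow a 3D box from the visible LiDAR edge toward the occluded side."""
    length, width, height = CLASS_SIZE[label]
    pts = np.asarray(edge_points_3d)

    # Orientation from the two extreme points of the edge (X-Y plane).
    p_min, p_max = pts[np.argmin(pts[:, 0])], pts[np.argmax(pts[:, 0])]
    direction = p_max[:2] - p_min[:2]
    yaw = np.arctan2(direction[1], direction[0])

    # Anchor the box at the visible edge and extend it to the class length.
    center_xy = p_min[:2] + 0.5 * length * np.array([np.cos(yaw), np.sin(yaw)])
    center_z = pts[:, 2].min() + 0.5 * height
    return np.array([*center_xy, center_z]), (length, width, height), yaw
```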
III. EXPERIMENT RESULTS
This study was implemented in the Caffe framework and run on an
NVIDIA TITAN X GPU. The two networks we used for evaluation, Faster
R-CNN and R-FCN, were originally developed and evaluated on the
PASCAL VOC dataset, but this study uses the KITTI dataset because
LiDAR information is required as well. For an objective comparison,
we also compared the original R-FCN using Selective Search [14]
against the results of this study on the KITTI object dataset, using
the R-FCN model trained on PASCAL VOC. The metrics used in the
evaluation are the standard mean average precision (mAP) and the mean
intersection-over-union (IoU).
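For reference, the intersection-over-union used in the evaluation can be computed for two axis-aligned 2D boxes as in the short sketch below.

```python
def iou(box_a, box_b):
    """IoU of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    ix2, iy2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    return inter / (area_a + area_b - inter)
```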
A. Experiments on KITTI
We first train on the KITTI object dataset and tracking sequences,
which are representative public datasets that provide both CCD and
LiDAR information and can be used with R-FCN at present. In the
training phase, the four classes 'Car', 'Pedestrian', 'Cyclist', and
'background' were combined and learned together. Since the proposed
method does not require additional learning for LiDAR, there is no
need to change the R-FCN architecture to learn the CCD image; only
the volume of the feature set changes with the set of classes.
However, the valid range of the LiDAR sensor data given in the KITTI
dataset is about 50 m, which is not wide enough to cover the entire
CCD image, so objects of hard difficulty lying at distances of 50 m
or more could not be evaluated.
Table 1 compares the object detection rates of existing
state-of-the-art methods and our method by measuring the mAP per
difficulty for the three classes. We measured the precision of boxes
with an IoU of 50% or more with the ground-truth area and found good
results for 'Car' and 'Pedestrian'. In the case of 'Cyclist',
however, the LiDAR points were not uniformly distributed because of
the spoke shape of the bicycle wheels, so box proposals of sufficient
size could not be generated, resulting in a relatively low value.
Method           Car (E)   Car (M)   Pedestrian (E)   Pedestrian (M)   Cyclist (E)   Cyclist (M)
Regionlet [15]   84.75     76.45     73.14            61.15            70.41         58.72
3DVP [10]        87.46     75.77     -                -                -             -
SubCat [16]      84.14     75.46     -                -                -             -
SDP [17]         90.33     83.53     77.74            64.19            74.08         61.31
Ours             95.41     88.54     81.78            65.71            72.11         60.85
Table 1. Results of our study on the KITTI dataset compared with
state-of-the-art methods (E: Easy, M: Moderate). Hard difficulty was
excluded from the comparison because of the LiDAR sensor's
measurement-distance limit of about 50 m.
B. Experiments on PASCAL VOC & KITTI
In fact, this study is essentially a proposal generation method, so
the classifier R-FCN itself is used without any modification.
Therefore, measuring mAP on KITTI alone is not enough to analyze the
performance of our study, because the result is influenced more by
the performance of R-FCN itself. To objectively measure performance
as a proposal generation and result correction tool, we therefore
compared our results with Faster R-CNN using RPN and with R-FCN using
RPN or Selective Search. In this comparison, we used the Caffe models
trained on PASCAL VOC 07+12 released with the original papers as they
are, to avoid problems that may occur when retraining the two
networks on the KITTI dataset. Table 2 compares the accuracy and
computation time of the two other approaches with this study. Table 3
shows how AP varies with the IoU threshold against the ground truth.
As shown in the table, the overlap ratio is higher than that of the
conventional RPN or SS. This is because the box extension through the
class label reduces the error by re-expanding the box closer to the
actual object size. We have confirmed that, as long as the object
lies within the LiDAR sensor range, this study represents the object
with a bounding box closer to the ground truth than existing research
does. The box extension is currently performed only for the vehicle
category because of the difficulty of orientation acquisition, as
shown in the experiment of Zhang et al. [18]. However, if the aspect
ratio and size of objects such as pedestrians can be generalized,
additional categories can be covered.

Figure 6. Expanded box determined by edge orientation and corner
detection. (a) shows the object box classified by R-FCN, and (b)
shows the box extended according to the actual size of the class
through 2D-to-3D projection and then projected back to 2D.
Method               Training Data   Test Data   mAP (%)   Test time (sec/img)
RPN + Faster R-CNN   07+12           KITTI       75.7      0.37
RPN + R-FCN          07+12           KITTI       77.4      0.20
SS + R-FCN           07+12           KITTI       80.4      2.21
Ours + R-FCN         07+12           KITTI       82.4      0.17
Table 2. Comparison of detection rates on the KITTI object and
tracking datasets. To reduce errors, we used the original pre-trained
models.
Method        Training Data   Test Data     AP@0.5   AP@0.7   AP@0.9
RPN + R-FCN   07+12           KITTI (Car)   84.8     77.4     55.2
SS + R-FCN    07+12           KITTI (Car)   86.3     80.4     58.4
Ours          07+12           KITTI (Car)   89.7     82.4     80.1
Table 3. AP as a function of the IoU threshold for the Car category.
The proposed method shows a higher overlap ratio with the ground
truth than RGB-feature-based proposal generators such as Selective
Search.
IV. CONCLUSION
In this study we propose a simple but strong method for autonomous
driving. We have shown better results than state-of-the-art methods
by effectively combining a CCD-based classifier with LiDAR
information through the proposed bidirectional projection filtering
(BPF). In particular, the 3D spatial information of a partially
obscured object is obtained in real time, and the detection area is
extended based on it, so the result shows a high IoU with the ground
truth. However, problems remain to be solved, such as how to handle
objects that appear on the CCD but lie outside the LiDAR range, and
how to extend the box on both sides of an object. In addition, the
LiDAR sensing range is shorter than the visible range of the CCD, so
objects detected only in the CCD must be compensated for. We will
address these issues in future research.
REFERENCES
[1] Girshick, Ross, et al. "Rich feature hierarchies for accurate object
detection and semantic segmentation." Proceedings of the IEEE
conference on computer vision and pattern recognition. 2014.
[2] Ren, Shaoqing, et al. "Faster R-CNN: Towards real-time object
detection with region proposal networks." Advances in neural
information processing systems. 2015.
[3] Kong, Tao, et al. "HyperNet: Towards Accurate Region Proposal
Generation and Joint Object Detection." arXiv preprint
arXiv:1604.00600 (2016).
[4] Dai, Jifeng, et al. "R-FCN: Object Detection via Region-based Fully
Convolutional Networks." arXiv preprint arXiv:1605.06409 (2016).
[5] Redmon, Joseph, et al. "You only look once: Unified, real-time object
detection." arXiv preprint arXiv:1506.02640 (2015).
[6] Liu, Wei, et al. "SSD: Single Shot MultiBox Detector." arXiv preprint
arXiv:1512.02325 (2015).
[7] Chen, Xiaozhi, et al. "3d object proposals for accurate object class
detection." Advances in Neural Information Processing Systems. 2015.
[8] Wang, Dominic Zeng, and Ingmar Posner. "Voting for voting in online
point cloud object detection." Proceedings of the Robotics: Science and
Systems, Rome, Italy 1317 (2015).
[9] Xiang, Yu, et al. "Subcategory-aware Convolutional Neural Networks
for Object Proposals and Detection." arXiv preprint arXiv:1604.04693
(2016).
[10] Xiang, Yu, et al. "Data-driven 3d voxel patterns for object category
recognition." 2015 IEEE Conference on Computer Vision and Pattern
Recognition (CVPR). IEEE, 2015.
[11] T. Huang, G. Yang, and G. Tang, "A fast two-dimensional median
filtering algorithm", IEEE Trans. Acoust., Speech, Signal Processing,
vol. 27, no. 1, pp. 13–18, 1979.
[12] Zitnick, C. Lawrence, and Piotr Dollár. "Edge boxes: Locating object
proposals from edges." European Conference on Computer Vision.
Springer International Publishing, 2014.
[13] He, Kaiming, et al. "Deep residual learning for image recognition."
arXiv preprint arXiv:1512.03385 (2015).
[14] Uijlings, Jasper RR, et al. "Selective search for object recognition."
International journal of computer vision 104.2 (2013): 154-171.
[15] Wang, Xiaoyu, et al. "Regionlets for generic object detection." IEEE
transactions on pattern analysis and machine intelligence 37.10 (2015):
2071-2084.
[16] Ohn-Bar, Eshed, and Mohan Manubhai Trivedi. "Learning to detect
vehicles by clustering appearance patterns." IEEE Transactions on
Intelligent Transportation Systems 16.5 (2015): 2511-2521.
[17] Yang, Fan, Wongun Choi, and Yuanqing Lin. "Exploit all the layers:
Fast and accurate cnn object detector with scale dependent pooling and
cascaded rejection classifiers." Proceedings of the IEEE Conference on
Computer Vision and Pattern Recognition. 2016.
[18] Zhang, Liliang, et al. "Is Faster R-CNN Doing Well for Pedestrian
Detection?." European Conference on Computer Vision. Springer
International Publishing, 2016.
Mmpaper draft10

More Related Content

What's hot

高精地图数据协议标准探究
高精地图数据协议标准探究高精地图数据协议标准探究
高精地图数据协议标准探究Weijun Zhong
 
Depth Fusion from RGB and Depth Sensors III
Depth Fusion from RGB and Depth Sensors  IIIDepth Fusion from RGB and Depth Sensors  III
Depth Fusion from RGB and Depth Sensors IIIYu Huang
 
Classified 3d Model Retrieval Based on Cascaded Fusion of Local Descriptors
Classified 3d Model Retrieval Based on Cascaded Fusion of Local Descriptors  Classified 3d Model Retrieval Based on Cascaded Fusion of Local Descriptors
Classified 3d Model Retrieval Based on Cascaded Fusion of Local Descriptors ijcga
 
Deep Learning’s Application in Radar Signal Data II
Deep Learning’s Application in Radar Signal Data IIDeep Learning’s Application in Radar Signal Data II
Deep Learning’s Application in Radar Signal Data IIYu Huang
 
Fisheye Omnidirectional View in Autonomous Driving II
Fisheye Omnidirectional View in Autonomous Driving IIFisheye Omnidirectional View in Autonomous Driving II
Fisheye Omnidirectional View in Autonomous Driving IIYu Huang
 
Camera-based road Lane detection by deep learning III
Camera-based road Lane detection by deep learning IIICamera-based road Lane detection by deep learning III
Camera-based road Lane detection by deep learning IIIYu Huang
 
fusion of Camera and lidar for autonomous driving II
fusion of Camera and lidar for autonomous driving IIfusion of Camera and lidar for autonomous driving II
fusion of Camera and lidar for autonomous driving IIYu Huang
 
Simulation for autonomous driving at uber atg
Simulation for autonomous driving at uber atgSimulation for autonomous driving at uber atg
Simulation for autonomous driving at uber atgYu Huang
 
3-d interpretation from single 2-d image IV
3-d interpretation from single 2-d image IV3-d interpretation from single 2-d image IV
3-d interpretation from single 2-d image IVYu Huang
 
3-d interpretation from single 2-d image for autonomous driving
3-d interpretation from single 2-d image for autonomous driving3-d interpretation from single 2-d image for autonomous driving
3-d interpretation from single 2-d image for autonomous drivingYu Huang
 
3-d interpretation from stereo images for autonomous driving
3-d interpretation from stereo images for autonomous driving3-d interpretation from stereo images for autonomous driving
3-d interpretation from stereo images for autonomous drivingYu Huang
 
Depth Fusion from RGB and Depth Sensors by Deep Learning
Depth Fusion from RGB and Depth Sensors by Deep LearningDepth Fusion from RGB and Depth Sensors by Deep Learning
Depth Fusion from RGB and Depth Sensors by Deep LearningYu Huang
 
Pedestrian behavior/intention modeling for autonomous driving V
Pedestrian behavior/intention modeling for autonomous driving VPedestrian behavior/intention modeling for autonomous driving V
Pedestrian behavior/intention modeling for autonomous driving VYu Huang
 
Deep learning for 3-D Scene Reconstruction and Modeling
Deep learning for 3-D Scene Reconstruction and Modeling Deep learning for 3-D Scene Reconstruction and Modeling
Deep learning for 3-D Scene Reconstruction and Modeling Yu Huang
 
Driving behaviors for adas and autonomous driving xiv
Driving behaviors for adas and autonomous driving xivDriving behaviors for adas and autonomous driving xiv
Driving behaviors for adas and autonomous driving xivYu Huang
 
Deep Learning’s Application in Radar Signal Data
Deep Learning’s Application in Radar Signal DataDeep Learning’s Application in Radar Signal Data
Deep Learning’s Application in Radar Signal DataYu Huang
 
Stereo Matching by Deep Learning
Stereo Matching by Deep LearningStereo Matching by Deep Learning
Stereo Matching by Deep LearningYu Huang
 
LiDAR in the Adverse Weather: Dust, Snow, Rain and Fog (2)
LiDAR in the Adverse Weather: Dust, Snow, Rain and Fog (2)LiDAR in the Adverse Weather: Dust, Snow, Rain and Fog (2)
LiDAR in the Adverse Weather: Dust, Snow, Rain and Fog (2)Yu Huang
 
Depth Fusion from RGB and Depth Sensors IV
Depth Fusion from RGB and Depth Sensors  IVDepth Fusion from RGB and Depth Sensors  IV
Depth Fusion from RGB and Depth Sensors IVYu Huang
 

What's hot (20)

高精地图数据协议标准探究
高精地图数据协议标准探究高精地图数据协议标准探究
高精地图数据协议标准探究
 
Depth Fusion from RGB and Depth Sensors III
Depth Fusion from RGB and Depth Sensors  IIIDepth Fusion from RGB and Depth Sensors  III
Depth Fusion from RGB and Depth Sensors III
 
Classified 3d Model Retrieval Based on Cascaded Fusion of Local Descriptors
Classified 3d Model Retrieval Based on Cascaded Fusion of Local Descriptors  Classified 3d Model Retrieval Based on Cascaded Fusion of Local Descriptors
Classified 3d Model Retrieval Based on Cascaded Fusion of Local Descriptors
 
Deep Learning’s Application in Radar Signal Data II
Deep Learning’s Application in Radar Signal Data IIDeep Learning’s Application in Radar Signal Data II
Deep Learning’s Application in Radar Signal Data II
 
Survey 1 (project overview)
Survey 1 (project overview)Survey 1 (project overview)
Survey 1 (project overview)
 
Fisheye Omnidirectional View in Autonomous Driving II
Fisheye Omnidirectional View in Autonomous Driving IIFisheye Omnidirectional View in Autonomous Driving II
Fisheye Omnidirectional View in Autonomous Driving II
 
Camera-based road Lane detection by deep learning III
Camera-based road Lane detection by deep learning IIICamera-based road Lane detection by deep learning III
Camera-based road Lane detection by deep learning III
 
fusion of Camera and lidar for autonomous driving II
fusion of Camera and lidar for autonomous driving IIfusion of Camera and lidar for autonomous driving II
fusion of Camera and lidar for autonomous driving II
 
Simulation for autonomous driving at uber atg
Simulation for autonomous driving at uber atgSimulation for autonomous driving at uber atg
Simulation for autonomous driving at uber atg
 
3-d interpretation from single 2-d image IV
3-d interpretation from single 2-d image IV3-d interpretation from single 2-d image IV
3-d interpretation from single 2-d image IV
 
3-d interpretation from single 2-d image for autonomous driving
3-d interpretation from single 2-d image for autonomous driving3-d interpretation from single 2-d image for autonomous driving
3-d interpretation from single 2-d image for autonomous driving
 
3-d interpretation from stereo images for autonomous driving
3-d interpretation from stereo images for autonomous driving3-d interpretation from stereo images for autonomous driving
3-d interpretation from stereo images for autonomous driving
 
Depth Fusion from RGB and Depth Sensors by Deep Learning
Depth Fusion from RGB and Depth Sensors by Deep LearningDepth Fusion from RGB and Depth Sensors by Deep Learning
Depth Fusion from RGB and Depth Sensors by Deep Learning
 
Pedestrian behavior/intention modeling for autonomous driving V
Pedestrian behavior/intention modeling for autonomous driving VPedestrian behavior/intention modeling for autonomous driving V
Pedestrian behavior/intention modeling for autonomous driving V
 
Deep learning for 3-D Scene Reconstruction and Modeling
Deep learning for 3-D Scene Reconstruction and Modeling Deep learning for 3-D Scene Reconstruction and Modeling
Deep learning for 3-D Scene Reconstruction and Modeling
 
Driving behaviors for adas and autonomous driving xiv
Driving behaviors for adas and autonomous driving xivDriving behaviors for adas and autonomous driving xiv
Driving behaviors for adas and autonomous driving xiv
 
Deep Learning’s Application in Radar Signal Data
Deep Learning’s Application in Radar Signal DataDeep Learning’s Application in Radar Signal Data
Deep Learning’s Application in Radar Signal Data
 
Stereo Matching by Deep Learning
Stereo Matching by Deep LearningStereo Matching by Deep Learning
Stereo Matching by Deep Learning
 
LiDAR in the Adverse Weather: Dust, Snow, Rain and Fog (2)
LiDAR in the Adverse Weather: Dust, Snow, Rain and Fog (2)LiDAR in the Adverse Weather: Dust, Snow, Rain and Fog (2)
LiDAR in the Adverse Weather: Dust, Snow, Rain and Fog (2)
 
Depth Fusion from RGB and Depth Sensors IV
Depth Fusion from RGB and Depth Sensors  IVDepth Fusion from RGB and Depth Sensors  IV
Depth Fusion from RGB and Depth Sensors IV
 

Viewers also liked

Presentación1
Presentación1Presentación1
Presentación1elcoroso
 
SEOday 2017 - Local Search - Rasmus Himmelstrup
SEOday 2017 - Local Search - Rasmus HimmelstrupSEOday 2017 - Local Search - Rasmus Himmelstrup
SEOday 2017 - Local Search - Rasmus HimmelstrupRasmus Gi
 
Short Term Rental Statistics Report Altadena (10-2014 to 10-2016)
Short Term Rental Statistics Report Altadena (10-2014 to 10-2016)Short Term Rental Statistics Report Altadena (10-2014 to 10-2016)
Short Term Rental Statistics Report Altadena (10-2014 to 10-2016)Robert StGenis
 
FBR / Passer de l'idée au système
FBR / Passer de l'idée au systèmeFBR / Passer de l'idée au système
FBR / Passer de l'idée au systèmeCOP_HHA
 
Helikopter - bölüm 04 - helikopter sistemleri - emre akar
Helikopter - bölüm 04 - helikopter sistemleri -  emre akarHelikopter - bölüm 04 - helikopter sistemleri -  emre akar
Helikopter - bölüm 04 - helikopter sistemleri - emre akarİstanbul Arel Üniversitesi
 
Seguridad informatica;robo de identidad
Seguridad informatica;robo de identidadSeguridad informatica;robo de identidad
Seguridad informatica;robo de identidadSofia Rivera
 
Luchterhand Verkehrsrecht
Luchterhand VerkehrsrechtLuchterhand Verkehrsrecht
Luchterhand VerkehrsrechtAhmet Özdemir
 
Marcel Hild - Spryker (e)commerce framework als Alternative zu traditioneller...
Marcel Hild - Spryker (e)commerce framework als Alternative zu traditioneller...Marcel Hild - Spryker (e)commerce framework als Alternative zu traditioneller...
Marcel Hild - Spryker (e)commerce framework als Alternative zu traditioneller...AboutYouGmbH
 
Mobiles Arbeiten: das komplette Firmenwissen auf dem Handy
Mobiles Arbeiten:  das komplette Firmenwissen auf dem HandyMobiles Arbeiten:  das komplette Firmenwissen auf dem Handy
Mobiles Arbeiten: das komplette Firmenwissen auf dem HandyOPTIMAL SYSTEMS GmbH
 

Viewers also liked (11)

Presentación1
Presentación1Presentación1
Presentación1
 
SEOday 2017 - Local Search - Rasmus Himmelstrup
SEOday 2017 - Local Search - Rasmus HimmelstrupSEOday 2017 - Local Search - Rasmus Himmelstrup
SEOday 2017 - Local Search - Rasmus Himmelstrup
 
Short Term Rental Statistics Report Altadena (10-2014 to 10-2016)
Short Term Rental Statistics Report Altadena (10-2014 to 10-2016)Short Term Rental Statistics Report Altadena (10-2014 to 10-2016)
Short Term Rental Statistics Report Altadena (10-2014 to 10-2016)
 
SHEREIF MOSTAFA v5
SHEREIF MOSTAFA v5SHEREIF MOSTAFA v5
SHEREIF MOSTAFA v5
 
FBR / Passer de l'idée au système
FBR / Passer de l'idée au systèmeFBR / Passer de l'idée au système
FBR / Passer de l'idée au système
 
Helikopter - bölüm 04 - helikopter sistemleri - emre akar
Helikopter - bölüm 04 - helikopter sistemleri -  emre akarHelikopter - bölüm 04 - helikopter sistemleri -  emre akar
Helikopter - bölüm 04 - helikopter sistemleri - emre akar
 
Seguridad informatica;robo de identidad
Seguridad informatica;robo de identidadSeguridad informatica;robo de identidad
Seguridad informatica;robo de identidad
 
Luchterhand Verkehrsrecht
Luchterhand VerkehrsrechtLuchterhand Verkehrsrecht
Luchterhand Verkehrsrecht
 
Evolving the Kanban Board
Evolving the Kanban BoardEvolving the Kanban Board
Evolving the Kanban Board
 
Marcel Hild - Spryker (e)commerce framework als Alternative zu traditioneller...
Marcel Hild - Spryker (e)commerce framework als Alternative zu traditioneller...Marcel Hild - Spryker (e)commerce framework als Alternative zu traditioneller...
Marcel Hild - Spryker (e)commerce framework als Alternative zu traditioneller...
 
Mobiles Arbeiten: das komplette Firmenwissen auf dem Handy
Mobiles Arbeiten:  das komplette Firmenwissen auf dem HandyMobiles Arbeiten:  das komplette Firmenwissen auf dem Handy
Mobiles Arbeiten: das komplette Firmenwissen auf dem Handy
 

Similar to Mmpaper draft10

Real-time 3D Object Detection on LIDAR Point Cloud using Complex- YOLO V4
Real-time 3D Object Detection on LIDAR Point Cloud using Complex- YOLO V4Real-time 3D Object Detection on LIDAR Point Cloud using Complex- YOLO V4
Real-time 3D Object Detection on LIDAR Point Cloud using Complex- YOLO V4IRJET Journal
 
10.1109@ICCMC48092.2020.ICCMC-000167.pdf
10.1109@ICCMC48092.2020.ICCMC-000167.pdf10.1109@ICCMC48092.2020.ICCMC-000167.pdf
10.1109@ICCMC48092.2020.ICCMC-000167.pdfmokamojah
 
LiDAR-based Autonomous Driving III (by Deep Learning)
LiDAR-based Autonomous Driving III (by Deep Learning)LiDAR-based Autonomous Driving III (by Deep Learning)
LiDAR-based Autonomous Driving III (by Deep Learning)Yu Huang
 
IRJET- Automatic Traffic Sign Detection and Recognition using CNN
IRJET- Automatic Traffic Sign Detection and Recognition using CNNIRJET- Automatic Traffic Sign Detection and Recognition using CNN
IRJET- Automatic Traffic Sign Detection and Recognition using CNNIRJET Journal
 
Goal location prediction based on deep learning using RGB-D camera
Goal location prediction based on deep learning using RGB-D cameraGoal location prediction based on deep learning using RGB-D camera
Goal location prediction based on deep learning using RGB-D camerajournalBEEI
 
Object gripping algorithm for robotic assistance by means of deep leaning
Object gripping algorithm for robotic assistance by means of deep leaning Object gripping algorithm for robotic assistance by means of deep leaning
Object gripping algorithm for robotic assistance by means of deep leaning IJECEIAES
 
An Analysis of Various Deep Learning Algorithms for Image Processing
An Analysis of Various Deep Learning Algorithms for Image ProcessingAn Analysis of Various Deep Learning Algorithms for Image Processing
An Analysis of Various Deep Learning Algorithms for Image Processingvivatechijri
 
A REVIEW ON IMPROVING TRAFFIC-SIGN DETECTION USING YOLO ALGORITHM FOR OBJECT ...
A REVIEW ON IMPROVING TRAFFIC-SIGN DETECTION USING YOLO ALGORITHM FOR OBJECT ...A REVIEW ON IMPROVING TRAFFIC-SIGN DETECTION USING YOLO ALGORITHM FOR OBJECT ...
A REVIEW ON IMPROVING TRAFFIC-SIGN DETECTION USING YOLO ALGORITHM FOR OBJECT ...IRJET Journal
 
Udacity-Didi Challenge Finalists
Udacity-Didi Challenge FinalistsUdacity-Didi Challenge Finalists
Udacity-Didi Challenge FinalistsDavid Silver
 
Traffic Light Detection and Recognition for Self Driving Cars using Deep Lear...
Traffic Light Detection and Recognition for Self Driving Cars using Deep Lear...Traffic Light Detection and Recognition for Self Driving Cars using Deep Lear...
Traffic Light Detection and Recognition for Self Driving Cars using Deep Lear...ijtsrd
 
Licence Plate Recognition Using Supervised Learning and Deep Learning
Licence Plate Recognition Using Supervised Learning and Deep LearningLicence Plate Recognition Using Supervised Learning and Deep Learning
Licence Plate Recognition Using Supervised Learning and Deep LearningIRJET Journal
 
Traffic Sign Detection and Recognition for Automated Driverless Cars Based on...
Traffic Sign Detection and Recognition for Automated Driverless Cars Based on...Traffic Sign Detection and Recognition for Automated Driverless Cars Based on...
Traffic Sign Detection and Recognition for Automated Driverless Cars Based on...ijtsrd
 
Lane and Object Detection for Autonomous Vehicle using Advanced Computer Vision
Lane and Object Detection for Autonomous Vehicle using Advanced Computer VisionLane and Object Detection for Autonomous Vehicle using Advanced Computer Vision
Lane and Object Detection for Autonomous Vehicle using Advanced Computer VisionYogeshIJTSRD
 
Automatic Detection of Unexpected Accidents Monitoring Conditions in Tunnels
Automatic Detection of Unexpected Accidents Monitoring Conditions in TunnelsAutomatic Detection of Unexpected Accidents Monitoring Conditions in Tunnels
Automatic Detection of Unexpected Accidents Monitoring Conditions in TunnelsIRJET Journal
 
IRJET - A Survey Paper on Efficient Object Detection and Matching using F...
IRJET -  	  A Survey Paper on Efficient Object Detection and Matching using F...IRJET -  	  A Survey Paper on Efficient Object Detection and Matching using F...
IRJET - A Survey Paper on Efficient Object Detection and Matching using F...IRJET Journal
 
Object Detection for Autonomous Cars using AI/ML
Object Detection for Autonomous Cars using AI/MLObject Detection for Autonomous Cars using AI/ML
Object Detection for Autonomous Cars using AI/MLIRJET Journal
 

Similar to Mmpaper draft10 (20)

Real-time 3D Object Detection on LIDAR Point Cloud using Complex- YOLO V4
Real-time 3D Object Detection on LIDAR Point Cloud using Complex- YOLO V4Real-time 3D Object Detection on LIDAR Point Cloud using Complex- YOLO V4
Real-time 3D Object Detection on LIDAR Point Cloud using Complex- YOLO V4
 
information-11-00583-v3.pdf
information-11-00583-v3.pdfinformation-11-00583-v3.pdf
information-11-00583-v3.pdf
 
10.1109@ICCMC48092.2020.ICCMC-000167.pdf
10.1109@ICCMC48092.2020.ICCMC-000167.pdf10.1109@ICCMC48092.2020.ICCMC-000167.pdf
10.1109@ICCMC48092.2020.ICCMC-000167.pdf
 
LiDAR-based Autonomous Driving III (by Deep Learning)
LiDAR-based Autonomous Driving III (by Deep Learning)LiDAR-based Autonomous Driving III (by Deep Learning)
LiDAR-based Autonomous Driving III (by Deep Learning)
 
IRJET- Automatic Traffic Sign Detection and Recognition using CNN
IRJET- Automatic Traffic Sign Detection and Recognition using CNNIRJET- Automatic Traffic Sign Detection and Recognition using CNN
IRJET- Automatic Traffic Sign Detection and Recognition using CNN
 
Goal location prediction based on deep learning using RGB-D camera
Goal location prediction based on deep learning using RGB-D cameraGoal location prediction based on deep learning using RGB-D camera
Goal location prediction based on deep learning using RGB-D camera
 
Object gripping algorithm for robotic assistance by means of deep leaning
Object gripping algorithm for robotic assistance by means of deep leaning Object gripping algorithm for robotic assistance by means of deep leaning
Object gripping algorithm for robotic assistance by means of deep leaning
 
An Analysis of Various Deep Learning Algorithms for Image Processing
An Analysis of Various Deep Learning Algorithms for Image ProcessingAn Analysis of Various Deep Learning Algorithms for Image Processing
An Analysis of Various Deep Learning Algorithms for Image Processing
 
A REVIEW ON IMPROVING TRAFFIC-SIGN DETECTION USING YOLO ALGORITHM FOR OBJECT ...
A REVIEW ON IMPROVING TRAFFIC-SIGN DETECTION USING YOLO ALGORITHM FOR OBJECT ...A REVIEW ON IMPROVING TRAFFIC-SIGN DETECTION USING YOLO ALGORITHM FOR OBJECT ...
A REVIEW ON IMPROVING TRAFFIC-SIGN DETECTION USING YOLO ALGORITHM FOR OBJECT ...
 
Udacity-Didi Challenge Finalists
Udacity-Didi Challenge FinalistsUdacity-Didi Challenge Finalists
Udacity-Didi Challenge Finalists
 
Traffic Light Detection and Recognition for Self Driving Cars using Deep Lear...
Traffic Light Detection and Recognition for Self Driving Cars using Deep Lear...Traffic Light Detection and Recognition for Self Driving Cars using Deep Lear...
Traffic Light Detection and Recognition for Self Driving Cars using Deep Lear...
 
Licence Plate Recognition Using Supervised Learning and Deep Learning
Licence Plate Recognition Using Supervised Learning and Deep LearningLicence Plate Recognition Using Supervised Learning and Deep Learning
Licence Plate Recognition Using Supervised Learning and Deep Learning
 
kanimozhi2019.pdf
kanimozhi2019.pdfkanimozhi2019.pdf
kanimozhi2019.pdf
 
Traffic Sign Detection and Recognition for Automated Driverless Cars Based on...
Traffic Sign Detection and Recognition for Automated Driverless Cars Based on...Traffic Sign Detection and Recognition for Automated Driverless Cars Based on...
Traffic Sign Detection and Recognition for Automated Driverless Cars Based on...
 
Lane and Object Detection for Autonomous Vehicle using Advanced Computer Vision
Lane and Object Detection for Autonomous Vehicle using Advanced Computer VisionLane and Object Detection for Autonomous Vehicle using Advanced Computer Vision
Lane and Object Detection for Autonomous Vehicle using Advanced Computer Vision
 
Automatic Detection of Unexpected Accidents Monitoring Conditions in Tunnels
Automatic Detection of Unexpected Accidents Monitoring Conditions in TunnelsAutomatic Detection of Unexpected Accidents Monitoring Conditions in Tunnels
Automatic Detection of Unexpected Accidents Monitoring Conditions in Tunnels
 
IRJET - A Survey Paper on Efficient Object Detection and Matching using F...
IRJET -  	  A Survey Paper on Efficient Object Detection and Matching using F...IRJET -  	  A Survey Paper on Efficient Object Detection and Matching using F...
IRJET - A Survey Paper on Efficient Object Detection and Matching using F...
 
AR/SLAM for end-users
AR/SLAM for end-usersAR/SLAM for end-users
AR/SLAM for end-users
 
final_report
final_reportfinal_report
final_report
 
Object Detection for Autonomous Cars using AI/ML
Object Detection for Autonomous Cars using AI/MLObject Detection for Autonomous Cars using AI/ML
Object Detection for Autonomous Cars using AI/ML
 

Recently uploaded

Organic Name Reactions for the students and aspirants of Chemistry12th.pptx
Organic Name Reactions  for the students and aspirants of Chemistry12th.pptxOrganic Name Reactions  for the students and aspirants of Chemistry12th.pptx
Organic Name Reactions for the students and aspirants of Chemistry12th.pptxVS Mahajan Coaching Centre
 
How to Make a Pirate ship Primary Education.pptx
How to Make a Pirate ship Primary Education.pptxHow to Make a Pirate ship Primary Education.pptx
How to Make a Pirate ship Primary Education.pptxmanuelaromero2013
 
“Oh GOSH! Reflecting on Hackteria's Collaborative Practices in a Global Do-It...
“Oh GOSH! Reflecting on Hackteria's Collaborative Practices in a Global Do-It...“Oh GOSH! Reflecting on Hackteria's Collaborative Practices in a Global Do-It...
“Oh GOSH! Reflecting on Hackteria's Collaborative Practices in a Global Do-It...Marc Dusseiller Dusjagr
 
Full Stack Web Development Course for Beginners
Full Stack Web Development Course  for BeginnersFull Stack Web Development Course  for Beginners
Full Stack Web Development Course for BeginnersSabitha Banu
 
Final demo Grade 9 for demo Plan dessert.pptx
Final demo Grade 9 for demo Plan dessert.pptxFinal demo Grade 9 for demo Plan dessert.pptx
Final demo Grade 9 for demo Plan dessert.pptxAvyJaneVismanos
 
Enzyme, Pharmaceutical Aids, Miscellaneous Last Part of Chapter no 5th.pdf
Enzyme, Pharmaceutical Aids, Miscellaneous Last Part of Chapter no 5th.pdfEnzyme, Pharmaceutical Aids, Miscellaneous Last Part of Chapter no 5th.pdf
Enzyme, Pharmaceutical Aids, Miscellaneous Last Part of Chapter no 5th.pdfSumit Tiwari
 
Historical philosophical, theoretical, and legal foundations of special and i...
Historical philosophical, theoretical, and legal foundations of special and i...Historical philosophical, theoretical, and legal foundations of special and i...
Historical philosophical, theoretical, and legal foundations of special and i...jaredbarbolino94
 
Presiding Officer Training module 2024 lok sabha elections
Presiding Officer Training module 2024 lok sabha electionsPresiding Officer Training module 2024 lok sabha elections
Presiding Officer Training module 2024 lok sabha electionsanshu789521
 
Types of Journalistic Writing Grade 8.pptx
Types of Journalistic Writing Grade 8.pptxTypes of Journalistic Writing Grade 8.pptx
Types of Journalistic Writing Grade 8.pptxEyham Joco
 
Introduction to AI in Higher Education_draft.pptx
Introduction to AI in Higher Education_draft.pptxIntroduction to AI in Higher Education_draft.pptx
Introduction to AI in Higher Education_draft.pptxpboyjonauth
 
Crayon Activity Handout For the Crayon A
Crayon Activity Handout For the Crayon ACrayon Activity Handout For the Crayon A
Crayon Activity Handout For the Crayon AUnboundStockton
 
Roles & Responsibilities in Pharmacovigilance
Roles & Responsibilities in PharmacovigilanceRoles & Responsibilities in Pharmacovigilance
Roles & Responsibilities in PharmacovigilanceSamikshaHamane
 
Pharmacognosy Flower 3. Compositae 2023.pdf
Pharmacognosy Flower 3. Compositae 2023.pdfPharmacognosy Flower 3. Compositae 2023.pdf
Pharmacognosy Flower 3. Compositae 2023.pdfMahmoud M. Sallam
 
How to Configure Email Server in Odoo 17
How to Configure Email Server in Odoo 17How to Configure Email Server in Odoo 17
How to Configure Email Server in Odoo 17Celine George
 
call girls in Kamla Market (DELHI) 🔝 >༒9953330565🔝 genuine Escort Service 🔝✔️✔️
call girls in Kamla Market (DELHI) 🔝 >༒9953330565🔝 genuine Escort Service 🔝✔️✔️call girls in Kamla Market (DELHI) 🔝 >༒9953330565🔝 genuine Escort Service 🔝✔️✔️
call girls in Kamla Market (DELHI) 🔝 >༒9953330565🔝 genuine Escort Service 🔝✔️✔️9953056974 Low Rate Call Girls In Saket, Delhi NCR
 
Hierarchy of management that covers different levels of management
Hierarchy of management that covers different levels of managementHierarchy of management that covers different levels of management
Hierarchy of management that covers different levels of managementmkooblal
 
EPANDING THE CONTENT OF AN OUTLINE using notes.pptx
EPANDING THE CONTENT OF AN OUTLINE using notes.pptxEPANDING THE CONTENT OF AN OUTLINE using notes.pptx
EPANDING THE CONTENT OF AN OUTLINE using notes.pptxRaymartEstabillo3
 
Earth Day Presentation wow hello nice great
Earth Day Presentation wow hello nice greatEarth Day Presentation wow hello nice great
Earth Day Presentation wow hello nice greatYousafMalik24
 
MARGINALIZATION (Different learners in Marginalized Group
MARGINALIZATION (Different learners in Marginalized GroupMARGINALIZATION (Different learners in Marginalized Group
MARGINALIZATION (Different learners in Marginalized GroupJonathanParaisoCruz
 

Recently uploaded (20)

Organic Name Reactions for the students and aspirants of Chemistry12th.pptx
In the one-stage type, bounding-box prediction and class-probability estimation are performed together, as in YOLO [5]. YOLO generates a class probability map and a bounding-box confidence map directly on a grid, without producing separate proposals, and classifies each box immediately, which greatly simplifies the overall computation. Fast YOLO [ ] simplifies the network further and reaches a computation speed of 155 fps. In a similar approach, the Single Shot Detector (SSD) [6] evaluates boxes and confidences through features at several steps of the convolution layers, much like HyperNet.

However, autonomous driving requires more than detecting and classifying objects well; additional information is needed. The most important such information is the occlusion and truncation between objects, two factors that are essential for understanding the actual position and arrangement of nearby objects in the driving environment. Methods of both types can classify occluded and truncated objects reasonably well, but they are weak at handling the hidden part of an object: the bounding box is created from the visible part of the occluded object, and the object is classified through the part represented in that box. As a result, although the class of the object can be found, the size of the actual object in the hidden area cannot be estimated.
Another problem is obtaining spatial information about the detected object, such as its 3D location, orientation, and actual size. To grasp the traffic flow around the vehicle, as required for autonomous driving, it is very difficult to estimate 3D spatial information from a single RGB image alone. Using 3D depth information is therefore the most intuitive approach. 3DOP [7] generates a depth map from a stereo image and projects each pixel of the RGB image into 3D space; it defines the relationship between pixels as an MRF energy function over several properties and classifies objects with a linear SVM. However, the sophisticated stereo-based depth map generation used in 3DOP consumes a lot of computation. There are therefore other ways to obtain depth besides stereo, among them laser sensors such as LiDAR, which can collect 3D spatial information very quickly. Among these studies, Vote3D [8] learns directly in 3D space from LiDAR through 3D voxels. 3DVP classifies the LiDAR point cloud into 3D voxels and applies 2D alignment with a 3D CAD model. There are also studies in which LiDAR points are projected onto 2D to generate 2D depth maps. However, LiDAR information is inherently sparse, and it is difficult to learn from it stably because it gives non-uniform results depending on the surface of the environment or object.

Figure 1. System overview. The upper network scores the k x k area through voting with R-FCN. In this study, the ROI of this network is generated by filtering the LiDAR point cloud. Through this ROI, the object is classified by voting, and the classified class label is mapped back to the points to expand it in 3D space. This creates the final bounding box for each object.

Based on the RPN + classifier structure of Faster R-CNN, SubCNN [9] adds subcategory information to both sides and
corrects the region according to the size and orientation information of the subclass. In the 'Car' category in particular, 3DVP [10] was applied to identify the 3D location and orientation of the vehicle and to perform 2D and 3D segmentation. However, this method can only be applied to rigid-body models, and many additional samples are required for each subcategory class. In addition, manual alignment is required for all of them, and the sparse LiDAR point cloud makes it difficult to construct sufficient voxels for distant objects. The computation speed is also not fast enough for autonomous driving environments, where real-time processing is essential.

This study therefore proposes a fast and accurate 2D and 3D object detection and classification method suitable for the autonomous driving environment by combining the CCD image with LiDAR sensor information. The proposed method uses the LiDAR points as 2D data in which all points along the z axis are collapsed onto one plane instead of a 3D voxel, which guarantees a much higher density than the conventional representation. We filter these points and group edges with high affinity to generate 2D proposals. Each generated proposal is used as an ROI of the R-FCN network, and the bounding box of the object in 3D space is created by mapping the classification result back to the constituent edge points. Finally, by projecting this box onto 2D again, it can be expanded to the hidden region that does not appear in the 2D image.

Our contributions through this study are:

- This method replaces the existing proposal generator and improves speed by combining simplified LiDAR sensor information with a powerful CCD-based CNN architecture.
- 3D spatial information about the objects around the vehicle is acquired at the same time, using only the information generated while making the proposals, without any additional process.
- The method expands the bounding box over the occluded or truncated region, bringing the box closer to the actual size of the object.
To do this, Section II-A shows how to perform a mutual projection between 2D and 3D points using a simple method. Section II-B uses this to detect objects with the CCD-based state-of-the-art classifier, the R-FCN network. Section II-C expands the box using the detected class label and the orientation information obtained in 3D space to determine the optimal bounding box for the actual object. Section III quantitatively verifies the average precision (AP) on the object and tracking sequences of the KITTI dataset, compares it with previous studies, and confirms the expansion results for occluded areas through example images.

II. BIDIRECTIONAL INFORMATION EXCHANGE

A. Proposal Generation with LiDAR

We propose 'bidirectional projection filtering' using 2D and 3D information exchanged through a calibration matrix between the CCD image and the 3D point cloud, as shown in Figure 2. With this method, it is possible to generate object proposals, estimate occluded areas, and obtain the actual size and 3D spatial information of the detected object based on the CCD image. In this section, we describe how to remove point information that is not needed for proposal generation and how to exchange information between the 2D and 3D spaces through projection. As shown in Figure 2-(b), we remove the ground to separate object points and eliminate points outside the CCD angle of view. Next, we project along the z axis onto one plane to obtain a dense 2D object shape. Then, the error points are filtered, the remaining edges are grouped based on the affinity between neighboring edges, and a proposal is generated per edge group.

Figure 2. Point filtering through 2D projection of LiDAR. (a) original image, (b) LiDAR point cloud with ground removed, (c) removal of points outside the angle of view and dimension reduction to increase density, (d) 2D projection and filtering of points, (e) 3D point cloud before filtering, and (f) 2D point cloud after filtering.

Ground plane removal - In the LiDAR environment, the simplest way to remove the ground is to treat the base of the vehicle carrying the sensor as the ground plane and remove all LiDAR points below that height. This, however, assumes that the road is infinitely flat and free of obstacles, so we propose a surface detection method that considers the curvature of the road. Although there are many such methods, this study relies on the fact that objects in the driving environment have a volume with a certain height. Using a voxel grid of 20 x 20 x 10 cm, the height of the points in each cell is obtained for all XY cells that contain points.

(1)
(2)

When the coordinates on the XY plane are i and j, each grid cell is denoted C_ij, and the point set belonging to C_ij is PS_ij. In equation (1), HPS_ij is the height of each cell. The ground G(i,j) in equation (2) is the set of grid cells whose height is zero. Figure 2-(b) shows the point cloud with the ground area removed in this way.

Outside point elimination - The point set with the ground removed needs to be simplified further. First, the LiDAR points are multiplied by the calibration matrix, all points outside the x-axis range of the CCD image are removed, and only the remaining points are processed. The first reason for the CCD projection is that LiDAR points outside the angle of view can easily be removed. Second, since the reference of the horizontal-axis search changes from radians to pixels through the projection, filtering along the horizontal axis can be performed in a very simple manner. Figure 2-(c) shows that the range of points to be processed is greatly reduced.
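To make these two filtering steps concrete, the following is a minimal NumPy sketch, assuming an N x 3 point array in the LiDAR frame and a 3 x 4 calibration/projection matrix. The function names, the zero-height tolerance, and the exact ground criterion are illustrative assumptions rather than the authors' implementation.

```python
import numpy as np

def remove_ground(points, cell_xy=0.2, height_thresh=0.1):
    """Drop points in XY grid cells with (near-)zero height extent, treating
    such flat cells as ground (cf. Eqs. (1)-(2)); the 0.1 m tolerance is assumed."""
    ij = np.floor(points[:, :2] / cell_xy).astype(np.int64)   # XY cell index of each point
    keep = np.ones(len(points), dtype=bool)
    for cell in np.unique(ij, axis=0):
        mask = np.all(ij == cell, axis=1)
        z = points[mask, 2]
        if z.max() - z.min() < height_thresh:                 # flat cell -> ground
            keep[mask] = False
    return points[keep]

def keep_in_view(points, P, image_width):
    """Project points with the 3x4 calibration matrix P and keep only points in
    front of the camera whose horizontal pixel coordinate falls inside the image."""
    homog = np.hstack([points, np.ones((len(points), 1))])    # N x 4
    uvw = homog @ P.T                                         # N x 3 (u*w, v*w, w)
    in_front = uvw[:, 2] > 0
    u = uvw[:, 0] / np.clip(uvw[:, 2], 1e-6, None)
    return points[in_front & (u >= 0) & (u < image_width)]
```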
Z-axis elimination - The sparse LiDAR points remaining after ground removal are arranged in a two-dimensional plane by projecting all points along the z axis, which increases the density and at the same time speeds up computation. Since the objects of interest reach a height of 50 cm or more, PS^z_ij = -0.9 is set for every point set PS_ij so that the plane lies 50 cm above the ground, given the height at which the LiDAR sensor is mounted on the vehicle. However, as shown in Figure 2-(b), because the 3D points of the original space are pressed into one plane, the point density increases but some parts look like noise due to the depth differences.

Edge filtering in CCD - Using 2D LiDAR point clouds obtained by CCD projection has several advantages. In 3D space, LiDAR spreads radially from the origin and obtains distance information, so extracting the line of an object closest to the shooting point requires computing the distance over all angles, which are continuous values. If we project the points onto the CCD instead, the discrete pixel values become the reference domain, as shown in Figure 2-(d). As shown in Figure 3, duplicate points can then easily be removed at the pixel level according to height or depth within one pixel column. In addition, since error pixels can easily be identified through a gradient operation proportional to the distance between neighboring pixels, it is possible to generate the edge closest to the photographed vehicle in each column of the x axis.

(3)

In equation (3), Cl_i denotes every x-axis column containing at least one projected LiDAR point, and hPs_i denotes the minimum height of each Cl_i. We compute hPs_i for all i and remove noise through a one-dimensional median filter [11]. At this time, not only the height of the point selected by the median filter but also the index of the corresponding point is carried along, so that the noise point can be removed in 3D space as well. Figure 2-(f) shows the three-dimensional point cloud arranged in this way.

Figure 3. The process by which objects are detected by bidirectional projection filtering. (a) original image, (b) 2D LiDAR projection, (c) 2D gradient-based edge cleanup, (d) proposal creation using edge affinity and classification result by R-FCN, (e) 3D bounding box expansion by LiDAR projection (red boxes), and (f) final result.
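A minimal sketch of this column-wise edge extraction is given below, under the same assumptions as the previous snippet. Selecting the nearest point per pixel column and median-filtering over columns while carrying the original point indices along is one plausible reading of the text and Eq. (3); the window size and the depth-based selection are assumptions.

```python
import numpy as np

def columnwise_edges(points, P, image_width, z_plane=-0.9, win=5):
    """For every pixel column that receives at least one LiDAR point, keep the
    point closest to the vehicle, then median-filter the result over columns.
    Returns the column indices and the indices of the selected 3D points."""
    flat = points.copy()
    flat[:, 2] = z_plane                               # collapse the z axis onto one plane
    homog = np.hstack([flat, np.ones((len(flat), 1))])
    uvw = homog @ P.T
    u = (uvw[:, 0] / np.clip(uvw[:, 2], 1e-6, None)).astype(int)   # pixel column per point
    depth = np.linalg.norm(points[:, :2], axis=1)      # horizontal distance to the sensor

    cols, best_idx = [], []
    for c in range(image_width):
        in_col = np.where(u == c)[0]
        if len(in_col):                                 # nearest point per column
            cols.append(c)
            best_idx.append(in_col[np.argmin(depth[in_col])])
    best_idx = np.array(best_idx)

    # 1D median filter over neighbouring columns; the index of the selected
    # (median) point is kept so the edge can be mapped back into 3D space.
    filtered = []
    for k in range(len(best_idx)):
        lo, hi = max(0, k - win // 2), min(len(best_idx), k + win // 2 + 1)
        window = best_idx[lo:hi]
        order = np.argsort(depth[window])
        filtered.append(window[order[len(order) // 2]])
    return np.array(cols), np.array(filtered)
```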
Segmentation by edge affinity - The point map determined through the above process is segmented according to two conditions. First, we group consecutive edges with the edge grouping method introduced in Edge Boxes [12]. In this case, the edge magnitude m_p is defined as the inverse of the Euclidean distance between the adjacent points P_i and P_j of two neighboring edge groups, as shown in equation (4).

(4)
(5)

The affinity score a(s_i, s_j) is obtained for all edge groups S_i, and the edges are segmented at the boundaries of groups whose score changes rapidly; equation (4) gives the affinity score between neighboring edge groups. For the points p_i and p_j located at the boundary of each edge set, PG^z_i and PG^z_j, which are at ground height, and PG^z1.5_i and PG^z1.5_j, which are 1.5 m above the ground, are projected onto the CCD image, and a bounding box is created by connecting them horizontally to the adjacent boundary.

Proposal generation based on actual size - We create a bounding box between segmented boundaries. The bottom of each boundary point is determined from the ground position obtained during ground removal. The height of the bounding box is set to 1.5 m, which covers the three classes mainly detected in the driving environment, and the box is projected onto the CCD image to generate the proposal. The yellow line in Figure 4 represents the projected LiDAR edge, and the orange lines represent the boundaries obtained from the edge affinity. The green boxes at the bottom of Figure 4 are the 1.5 m-high boxes created between these boundaries.

Figure 4. Box proposals created from edge group boundaries divided by edge affinity.
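Because Eqs. (4) and (5) are not reproduced here, the sketch below only illustrates the general idea: consecutive edge points are grouped while the gap between neighboring 3D points stays small, and a proposal box of the assumed 1.5 m object height is generated for each group. The distance threshold and the box construction are assumptions, not the paper's exact affinity score.

```python
import numpy as np

def group_edges(edge_points_3d, max_gap=0.5):
    """Split an ordered sequence of per-column edge points into groups wherever
    the Euclidean distance between consecutive points exceeds max_gap (assumed 0.5 m)."""
    if len(edge_points_3d) == 0:
        return []
    groups, current = [], [0]
    for k in range(1, len(edge_points_3d)):
        gap = np.linalg.norm(edge_points_3d[k] - edge_points_3d[k - 1])
        if gap > max_gap:          # low affinity -> close the group, start a new one
            groups.append(current)
            current = []
        current.append(k)
    groups.append(current)
    return groups

def proposals_from_groups(edge_points_3d, groups, P, ground_z, height_m=1.5):
    """Build one 2D proposal per edge group: project the group's points at
    ground height and at ground + 1.5 m, and take the enclosing pixel box."""
    boxes = []
    for g in groups:
        pts = edge_points_3d[g]
        corners = []
        for z in (ground_z, ground_z + height_m):
            c = pts.copy()
            c[:, 2] = z
            homog = np.hstack([c, np.ones((len(c), 1))])
            uvw = homog @ P.T
            corners.append(uvw[:, :2] / uvw[:, 2:3])
        uv = np.vstack(corners)
        boxes.append((uv[:, 0].min(), uv[:, 1].min(), uv[:, 0].max(), uv[:, 1].max()))
    return boxes
```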
B. Classification with R-FCN

The generated proposals are combined with R-FCN, which shows state-of-the-art performance, to classify the objects. Figure 5-(b) shows the R-FCN implemented on ResNet-101 [13]. R-FCN resolves the dilemma between translation invariance for classification and translation variance for detection by dividing the ROI into an n x n grid and training a feature set for each grid cell. When a box proposal is evaluated under this scheme, a high voting score is returned if the object is sufficiently enclosed. Using this property, the redundant neighboring boxes caused by LiDAR noise can be organized based on their voting scores.

Figure 5. Detailed structure of the proposed method. (a) 3D-to-2D projection filtering, (b) R-FCN, (c) 2D-to-3D projection filtering, (d) result.

C. 3D projection and box extension

Next, the class label is assigned to the edge group inside the edge boundaries of the box classified by R-FCN. Each labeled edge is projected back into 3D space through the indices carried along (Figure 5-(c)), and the orientation of the object is estimated from the extreme points along the X-Y axes among the points constituting the edge. Once the orientation is determined, the 3D box is expanded to match the class-specific size and aspect ratio. The edge used as the base of the expansion is chosen according to the positional relationship between the edge groups and the height difference between hP_i and hP_j at the segmentation boundary surrounding the edge in 2D space. Using this, we extend the box around the edge that contains the visible side of the object. Figure 6 shows the process of expanding the box through the labeled edges and integrating the overlapping boxes and edges.

Figure 6. Box expanded by the determined edge orientation and corner detection. (a) the object box classified by R-FCN; (b) the box extended to the actual size of the class through 2D-to-3D projection and then projected back onto 2D.
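The following bird's-eye-view sketch illustrates this label mapping and box extension under simplifying assumptions: the object orientation is taken from the principal direction of its labeled edge points, and the box is grown away from the sensor to an assumed class-specific footprint. The class sizes and the anchoring rule are illustrative placeholders, not the exact procedure described above.

```python
import numpy as np

# Illustrative class footprints (length, width) in metres; placeholders only.
CLASS_FOOTPRINT = {'Car': (4.2, 1.8), 'Pedestrian': (0.6, 0.6), 'Cyclist': (1.8, 0.6)}

def extend_box_bev(edge_points_xy, label):
    """Fit an oriented 2D (bird's-eye-view) box of class-specific size to the
    labeled edge points, anchoring it at the visible edge and growing it away
    from the sensor so the occluded part of the object is covered."""
    length, width = CLASS_FOOTPRINT[label]
    centroid = edge_points_xy.mean(axis=0)

    # Principal direction of the visible edge -> orientation estimate.
    d = edge_points_xy - centroid
    _, _, vt = np.linalg.svd(d, full_matrices=False)
    axis = vt[0]                                  # unit vector along the edge
    normal = np.array([-axis[1], axis[0]])        # perpendicular to the edge

    # Make the normal point away from the sensor (at the origin), then place the
    # box so the visible edge lies on its near side.
    if np.dot(normal, centroid) < 0:
        normal = -normal
    center = centroid + normal * (width / 2.0)
    heading = np.arctan2(axis[1], axis[0])
    return center, length, width, heading
```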
III. EXPERIMENT RESULTS

This study was implemented in the Caffe framework and run on an NVIDIA TITAN X GPU. The two networks used for evaluation, Faster R-CNN and R-FCN, were originally studied and evaluated on the PASCAL VOC dataset, but this study was conducted on the KITTI dataset because LiDAR information is required as well. For an objective comparison, we also compared the original R-FCN using Selective Search [14] with the results of this study on the KITTI object dataset, using the R-FCN model trained on PASCAL VOC. The metrics used in the evaluation are the standard mean average precision and the mean intersection-over-union.

A. Experiments on KITTI

We first train on the KITTI object dataset and tracking sequences, which are the typical public datasets that provide both CCD and LiDAR information and can be used with R-FCN at present. In the training phase, the four classes 'Car', 'Pedestrian', 'Cyclist', and 'Background' were used. Since the proposed method requires no additional learning for LiDAR, there is no need to change the R-FCN architecture used to learn the CCD images; only the volume of the feature set changes with the class definitions. However, the valid range of the LiDAR sensor data given in the KITTI dataset is about 50 m, which is not wide enough to cover the entire CCD image, so objects of hard difficulty lying beyond 50 m could not be evaluated.

Table 1 compares the object detection rates of existing state-of-the-art methods and our method by measuring the mAP per difficulty level for the three classes. We measured the precision of boxes with an IOU of 50% or more against the ground-truth area and obtained good results for 'Car' and 'Pedestrian'. In the case of 'Cyclist', however, the LiDAR points are not uniformly distributed because of the spoke shape of the bicycle wheels, so box proposals of sufficient size could not be generated, resulting in a relatively low value.

Method          Car (E)  Car (M)  Pedestrian (E)  Pedestrian (M)  Cyclist (E)  Cyclist (M)
Regionlet [15]  84.75    76.45    73.14           61.15           70.41        58.72
3DVP [10]       87.46    75.77    -               -               -            -
SubCat [16]     84.14    75.46    -               -               -            -
SDP [17]        90.33    83.53    77.74           64.19           74.08        61.31
Ours            95.41    88.54    81.78           65.71           72.11        60.85

Table 1. Results on the KITTI dataset compared with state-of-the-art methods (E = Easy, M = Moderate). Hard difficulty was excluded from the comparison because of the LiDAR measurement range of about 50 m.
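For reference, the overlap criterion used in this evaluation is the standard intersection-over-union between a detected box and the ground-truth box; a minimal implementation for axis-aligned boxes given as (x1, y1, x2, y2) is sketched below.

```python
def iou(box_a, box_b):
    """Intersection-over-union of two axis-aligned boxes given as (x1, y1, x2, y2)."""
    x1 = max(box_a[0], box_b[0])
    y1 = max(box_a[1], box_b[1])
    x2 = min(box_a[2], box_b[2])
    y2 = min(box_a[3], box_b[3])
    inter = max(0.0, x2 - x1) * max(0.0, y2 - y1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    return inter / (area_a + area_b - inter) if inter > 0 else 0.0

# A detection counts as correct at the 50% threshold used in Table 1 when
# iou(detection, ground_truth) >= 0.5.
```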
B. Experiments on PASCAL VOC & KITTI

Since this study is essentially a proposal generation method and the R-FCN classifier itself is used without modification, measuring the mAP on KITTI alone is not enough to analyze its performance, because that figure is dominated by the performance of R-FCN. Therefore, to objectively measure our method as a proposal generation and result correction tool, we compared it against Faster R-CNN and R-FCN using RPN and against R-FCN using Selective Search. In this comparison, we used the Caffe models trained on PASCAL VOC 07+12 published with the original papers as they are, to avoid problems that could occur when re-training the two networks on the KITTI dataset. Table 2 compares the accuracy and computation time of these methods with this study, and Table 3 shows how the AP varies with the IOU threshold against the ground truth. As the tables show, the overlap ratio is higher than that of the conventional RPN or Selective Search, because the box extension through the class label re-expands the box closer to the actual object size and thus reduces the error. We have confirmed that, as long as the object lies within the LiDAR sensor range, this study represents the object with a bounding box closer to the ground truth than the existing research. The box extension is currently performed only for the vehicle category because of the difficulty of orientation acquisition, as also observed in the experiment of Zhang et al. [18]; however, if the aspect ratio and size of objects such as pedestrians can be generalized, additional categories can be covered as well.

Method              Training Data  Test Data  mAP (%)  Test time (s/img)
RPN + Faster R-CNN  07+12          KITTI      75.7     0.37
RPN + R-FCN         07+12          KITTI      77.4     0.20
SS + R-FCN          07+12          KITTI      80.4     2.21
Ours + R-FCN        07+12          KITTI      82.4     0.17

Table 2. Comparison of detection rates on the KITTI object and tracking dataset. To reduce errors, the original pre-trained models were used.

Method        Training Data  Test Data    AP@0.5  AP@0.7  AP@0.9
RPN + R-FCN   07+12          KITTI (Car)  84.8    77.4    55.2
SS + R-FCN    07+12          KITTI (Car)  86.3    80.4    58.4
Ours          07+12          KITTI (Car)  89.7    82.4    80.1

Table 3. AP change according to the IOU threshold for the 'Car' category. The proposed method shows a high overlap ratio with the ground truth compared to RGB-feature-based proposal generators such as Selective Search.

IV. CONCLUSION

We propose a simple but strong method for autonomous driving. By effectively combining a CCD-based classifier with LiDAR information through the proposed bidirectional projection filtering (BPF), we obtain better results than state-of-the-art methods. In particular, the 3D spatial information of a partially occluded object is obtained in real time, and the detection area is extended based on it, so the result shows a high IOU with the ground truth. However, problems remain to be solved, such as how to handle objects that appear on the CCD but lie outside the LiDAR range, and how to expand both sides of an object. In addition, the LiDAR sensing range is shorter than the visible range of the CCD, so compensation for such CCD-only objects is necessary. We will address these issues in further research.

REFERENCES

[1] Girshick, Ross, et al. "Rich feature hierarchies for accurate object detection and semantic segmentation." Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2014.
[2] Ren, Shaoqing, et al. "Faster R-CNN: Towards real-time object detection with region proposal networks." Advances in Neural Information Processing Systems, 2015.
[3] Kong, Tao, et al. "HyperNet: Towards accurate region proposal generation and joint object detection." arXiv preprint arXiv:1604.00600, 2016.
[4] Dai, Jifeng, et al. "R-FCN: Object detection via region-based fully convolutional networks." arXiv preprint arXiv:1605.06409, 2016.
[5] Redmon, Joseph, et al. "You only look once: Unified, real-time object detection." arXiv preprint arXiv:1506.02640, 2015.
[6] Liu, Wei, et al. "SSD: Single shot multibox detector." arXiv preprint arXiv:1512.02325, 2015.
[7] Chen, Xiaozhi, et al. "3D object proposals for accurate object class detection." Advances in Neural Information Processing Systems, 2015.
[8] Wang, Dominic Zeng, and Ingmar Posner. "Voting for voting in online point cloud object detection." Proceedings of Robotics: Science and Systems, Rome, Italy, 1317 (2015).
[9] Xiang, Yu, et al. "Subcategory-aware convolutional neural networks for object proposals and detection." arXiv preprint arXiv:1604.04693, 2016.
[10] Xiang, Yu, et al. "Data-driven 3D voxel patterns for object category recognition." 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), IEEE, 2015.
[11] Huang, T., G. Yang, and G. Tang. "A fast two-dimensional median filtering algorithm." IEEE Transactions on Acoustics, Speech, and Signal Processing, vol. 27, no. 1, pp. 13-18, 1979.
[12] Zitnick, C. Lawrence, and Piotr Dollár. "Edge boxes: Locating object proposals from edges." European Conference on Computer Vision, Springer International Publishing, 2014.
[13] He, Kaiming, et al. "Deep residual learning for image recognition." arXiv preprint arXiv:1512.03385, 2015.
[14] Uijlings, Jasper R. R., et al. "Selective search for object recognition." International Journal of Computer Vision 104.2 (2013): 154-171.
[15] Wang, Xiaoyu, et al. "Regionlets for generic object detection." IEEE Transactions on Pattern Analysis and Machine Intelligence 37.10 (2015): 2071-2084.
[16] Ohn-Bar, Eshed, and Mohan Manubhai Trivedi. "Learning to detect vehicles by clustering appearance patterns." IEEE Transactions on Intelligent Transportation Systems 16.5 (2015): 2511-2521.
[17] Yang, Fan, Wongun Choi, and Yuanqing Lin. "Exploit all the layers: Fast and accurate CNN object detector with scale dependent pooling and cascaded rejection classifiers." Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2016.
[18] Zhang, Liliang, et al. "Is Faster R-CNN Doing Well for Pedestrian Detection?" European Conference on Computer Vision, Springer International Publishing, 2016.