[Keywords]
[Abstract]
BEV-based multi-sensor fusion perception algorithms for autonomous driving have made significant progress in recent years and continue to drive the field forward. Within this line of research, the transformation of multi-view images into the BEV space and the fusion of multi-modal features remain the key challenges. In this paper, we propose MSEPE-CRN, a camera and millimeter-wave radar fusion perception algorithm for 3D object detection. It exploits edge features and radar point clouds to improve the accuracy of depth prediction, thereby enabling an accurate transformation of multi-view images into BEV features. In addition, a multi-scale deformable large-kernel attention mechanism is introduced for modal fusion to resolve the misalignment caused by the large discrepancy between features from different sensors. Experimental results on the open-source nuScenes dataset show that, compared with the baseline network, mAP improves by 2.17% and NDS by 1.93%, while mATE, mAOE, and mAVE improve by 2.58%, 8.08%, and 2.13%, respectively. These results indicate that our algorithm can effectively improve a vehicle's ability to perceive moving obstacles on the road and has practical value.
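To make the fusion step described above concrete, the following is a minimal, hypothetical sketch of a multi-scale large-kernel attention block for fusing camera and radar BEV features. It is not the authors' implementation: the deformable sampling is approximated here with dilated depthwise convolutions, and the module names, channel sizes, and dilation settings (MultiScaleFusion, 80-channel BEV inputs, dilations 1/2/3) are illustrative assumptions.

# Sketch only: multi-scale large-kernel attention fusion of camera/radar BEV maps.
# Deformable sampling is approximated with dilated depthwise convolutions;
# all names and hyperparameters below are assumptions, not the paper's code.
import torch
import torch.nn as nn


class LargeKernelAttention(nn.Module):
    """LKA-style attention: depthwise conv + dilated depthwise conv + 1x1 conv."""

    def __init__(self, channels: int, dilation: int = 3):
        super().__init__()
        self.dw = nn.Conv2d(channels, channels, 5, padding=2, groups=channels)
        self.dw_dilated = nn.Conv2d(
            channels, channels, 7, padding=3 * dilation,
            groups=channels, dilation=dilation,
        )
        self.pw = nn.Conv2d(channels, channels, 1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # The convolution stack produces an attention map that reweights the input.
        attn = self.pw(self.dw_dilated(self.dw(x)))
        return x * attn


class MultiScaleFusion(nn.Module):
    """Fuse camera and radar BEV features with large-kernel attention at several dilations."""

    def __init__(self, cam_channels: int = 80, radar_channels: int = 80,
                 out_channels: int = 128, dilations=(1, 2, 3)):
        super().__init__()
        self.reduce = nn.Conv2d(cam_channels + radar_channels, out_channels, 1)
        self.branches = nn.ModuleList(
            [LargeKernelAttention(out_channels, d) for d in dilations]
        )
        self.out = nn.Conv2d(out_channels * len(dilations), out_channels, 1)

    def forward(self, cam_bev: torch.Tensor, radar_bev: torch.Tensor) -> torch.Tensor:
        # Concatenate the two modalities, then attend at multiple receptive-field scales.
        x = self.reduce(torch.cat([cam_bev, radar_bev], dim=1))
        x = torch.cat([branch(x) for branch in self.branches], dim=1)
        return self.out(x)


if __name__ == "__main__":
    cam = torch.randn(1, 80, 128, 128)    # camera BEV features (B, C, H, W)
    radar = torch.randn(1, 80, 128, 128)  # radar BEV features
    fused = MultiScaleFusion()(cam, radar)
    print(fused.shape)                    # torch.Size([1, 128, 128, 128])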
[CLC Number]
U469.79
[Funding Project]
Open Fund of Foshan Xianhu Laboratory, Advanced Energy Science and Technology Guangdong Laboratory (XHD2020-003)