[Keywords]
[Abstract]
An object recognition method based on multi-modal information fusion is proposed for autonomous driving, aiming at vehicle and pedestrian detection in autonomous driving environments. The method first improves the ResNet50 network by introducing a spatial attention mechanism and hybrid dilated convolution: some of the standard convolution layers are replaced with selective kernel (SK) convolutions, which allow the network to dynamically adjust the size of its receptive field according to the feature scale, and the remaining convolution layers use hybrid dilated convolutions with a sawtooth dilation-rate pattern so that the network captures multi-scale contextual information and extracts features more effectively. The localization loss function in YOLOv3 is then replaced with the GIoU loss, which is easier to apply in practice. Finally, a pedestrian and vehicle classification and recognition algorithm based on fusing the two kinds of data is proposed, which effectively improves detection accuracy. Experimental results show that, compared with the OFTNet, VoxelNet, and Faster R-CNN networks, the method improves mAP by up to 0.05 in daytime scenes and up to 0.09 at night, and converges well.
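For reference, the abstract does not spell out the GIoU loss itself; the form below is the standard published definition (Rezatofighi et al., 2019), where $A$ is the predicted box, $B$ the ground-truth box, and $C$ the smallest box enclosing both:

\[
\mathrm{GIoU}(A,B) = \mathrm{IoU}(A,B) - \frac{\lvert C \setminus (A \cup B)\rvert}{\lvert C \rvert},
\qquad
\mathcal{L}_{\mathrm{GIoU}} = 1 - \mathrm{GIoU}(A,B)
\]

Unlike a plain IoU loss, the enclosing-box penalty keeps the loss (and its gradient) informative even when the predicted and ground-truth boxes do not overlap, which is what makes it a practical drop-in replacement for YOLOv3's localization loss.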
[CLC number]
[Funding]
Supported by the National Natural Science Foundation of China (51975217).