Multimodal information fusion dynamic target recognition for autonomous driving

Authors: Zhang Mingrong, Yu Hao, Lyu Hui, Jiang Libiao, Li Liping, Lu Lei

Affiliations:

1. School of Automotive Technology, Guangdong Industry Polytechnic, Guangzhou 510000, P. R. China; 2. R&D Center, GAC AION New Energy Automobile Co., Ltd., Guangzhou 511400, P. R. China; 3. School of Mechanical & Automotive Engineering, South China University of Technology, Guangzhou 510641, P. R. China; 4. Engineering Research Institute, Guangzhou City University of Technology, Guangzhou 510800, P. R. China

About the first author: Zhang Mingrong (1983—), female, Ph.D., associate professor; her research focuses on intelligent and connected vehicles. E-mail: 153155269@qq.com.

Corresponding author: Yu Hao, male, senior engineer. E-mail: yuhao@gacne.com.cn.

Fund project: Supported by the National Natural Science Foundation of China (51975217).

    Abstract:

    A multimodal information fusion object recognition method for autonomous driving is proposed to address vehicle and pedestrian detection in autonomous driving environments. The method first improves the ResNet50 network by introducing a spatial attention mechanism and hybrid dilated convolution: selective kernel (SK) convolution replaces part of the standard convolution layers, allowing the network to adjust the size of its receptive field dynamically according to the feature scale. Then, a sawtooth hybrid dilated convolution schedule is used in the convolution layers so that the network captures multi-scale contextual information and its feature extraction capability improves. The localization loss function of YOLOv3 is replaced with the GIoU loss, which is more tractable in practical applications. Finally, a human-vehicle target classification and recognition algorithm based on data fusion is proposed, which effectively improves target detection accuracy. Experimental results show that, compared with the OFTNet, VoxelNet, and Faster R-CNN networks, the proposed method improves mAP by up to 0.05 in daytime scenes and up to 0.09 at night, with good convergence.
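The sawtooth hybrid dilated convolution schedule described in the abstract can be illustrated with a small receptive-field calculation (a sketch under the usual stride-1, 3×3-kernel assumptions, not code from the paper): dilation rates (1, 2, 3) yield the same receptive field as three stacked layers with a fixed rate of 2, while covering every pixel in that field and thus avoiding the gridding artifact.

```python
def receptive_field(dilations, kernel=3):
    """Receptive field of a stride-1 stack of dilated convolutions.

    Each layer with kernel size k and dilation d widens the
    receptive field by (k - 1) * d pixels.
    """
    rf = 1
    for d in dilations:
        rf += (kernel - 1) * d
    return rf

# Sawtooth schedule vs. a fixed dilation rate:
print(receptive_field([1, 2, 3]))  # sawtooth: 13
print(receptive_field([2, 2, 2]))  # fixed rate 2: also 13, but with "holes"
```

Both stacks reach a 13-pixel receptive field, but the fixed-rate stack samples only every second pixel, which is precisely the gridding problem the sawtooth schedule is meant to avoid.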
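The GIoU loss that replaces YOLOv3's localization loss can be sketched in a few lines (a minimal, self-contained illustration of the standard GIoU definition for axis-aligned boxes; it is not code from the paper):

```python
def giou_loss(box_a, box_b):
    """GIoU loss between two non-degenerate boxes given as (x1, y1, x2, y2).

    GIoU = IoU - |C \ (A ∪ B)| / |C|, where C is the smallest box
    enclosing both A and B; the loss is 1 - GIoU. Unlike plain IoU,
    the enclosing-box penalty keeps the loss informative even when
    the boxes do not overlap at all.
    """
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    area_a = (ax2 - ax1) * (ay2 - ay1)
    area_b = (bx2 - bx1) * (by2 - by1)
    # Intersection of A and B (zero if they do not overlap)
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    union = area_a + area_b - inter
    iou = inter / union
    # Smallest enclosing box C
    area_c = (max(ax2, bx2) - min(ax1, bx1)) * (max(ay2, by2) - min(ay1, by1))
    giou = iou - (area_c - union) / area_c
    return 1.0 - giou
```

For identical boxes the loss is 0; for disjoint boxes it exceeds 1, growing as the boxes move apart, which is what gives the detector a useful gradient in the non-overlapping case.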

    References
    [1] Xiong L, Wu J F, Xing X Y, et al. Review of driving risk assessment methods for autonomous vehicles[J/OL]. Automotive Engineering Journal: 1-15 [2023-04-28]. http://kns.cnki.net/kcms/detail/50.1206.U.20230425.0916.002.html (in Chinese)
    [2] Nan Y L,Zhang H C, Zeng Y. Intelligent detection of Multi-Class pitaya fruits in target picking row based on WGB-YOLO network[J]. Computers and Electronics in Agriculture,2023,208: 107780.
    [3] Li J R, Cai R Y, Tan Y, et al. Automatic detection of actual water depth of urban floods from social media images[J]. Measurement,2023,216: 1-19.
    [4] Vora S, Lang A H, Helou B, et al. PointPainting: sequential fusion for 3D object detection[C]//Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. Seattle: IEEE, 2020: 4604-4612.
    [5] Ren S, He K, Girshick R, et al. Faster R-CNN: towards real-time object detection with region proposal networks[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2016, 39(6): 1137-1149.
    [6] Ku J, Mozifian M, Lee J, et al. Joint 3D proposal generation and object detection from view aggregation[C]//2018 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS). Madrid: IEEE, 2018: 1-8.
    [7] Botha F. Data fusion of radar and stereo vision for detection and tracking of moving objects[C]//Pattern Recognition Association of South Africa & Robotics & Mechatronics International Conference. Bloemfontein: IEEE, 2017.
    [8] Li Y, Ma L, Zhong Z, et al. Deep learning for lidar point clouds in autonomous driving: a review [J]. IEEE Transactions on Neural Networks and Learning Systems, 2020(99):1-21.
    [9] Wang Y X, Xu S S, Li W B, et al. Identification and location of grapevine sucker based on information fusion of 2D laser scanner and machine vision[J]. International Journal of Agricultural and Biological Engineering, 2017, 10(2): 84-93.
    [10] Barrientos A, Garzón M, Fotiadis P E. Human detection from a mobile robot using fusion of laser and vision information[J]. Sensors, 2013, 13(9): 11603-11635.
    [11] Huang Y, Xiao Y, Wang P, et al. A seam-tracking laser welding platform with 3D and 2D visual information fusion vision sensor system[J]. The International Journal of Advanced Manufacturing Technology, 2013, 67(1-4): 415-426.
    [12] Ajayi O G, Ashi J, Guda B. Performance evaluation of YOLO v5 model for automatic crop and weed classification on UAV images[J]. Smart Agricultural Technology,2023,5: 1-10.
    [13] He K M, Zhang X, Ren S, et al. Deep residual learning for image recognition[C]//2016 IEEE Conference on Computer Vision and Pattern Recognition. IEEE, 2016: 770-778.
    [14] Li Z,Xu B L,Wu D, et al. A YOLO-GGCNN based grasping framework for mobile robots in unknown environments[J]. Expert Systems With Applications,2023, 225: 1-14.
    [15] Zhao C, Shu X, Yan X, et al. RDD-YOLO: a modified YOLO for detection of steel surface defects[J]. Measurement, 2023, 214: 1-12.
    [16] Zou C M, Xue R G. Improved YOLOv3 object detection algorithm: combining GIoU and Focal loss[J]. Computer Engineering and Applications, 2020, 56(24): 214-222. (in Chinese)
    [17] Geiger A, Lenz P, Urtasun R. Are we ready for autonomous driving? the kitti vision benchmark suite[C]//2012 IEEE conference on computer vision and pattern recognition. IEEE, 2012: 3354-3361.
    [18] Roddick T, Kendall A, Cipolla R. Orthographic feature transform for monocular 3d object detection[J]. arXiv preprint arXiv:1811.08188, 2018.
    [19] Zhou Y, Tuzel O. VoxelNet: end-to-end learning for point cloud based 3D object detection[C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. Salt Lake City: IEEE, 2018: 4490-4499.
    [20] Wu Z. Research on ship detection and tracking in dynamic background based on deep learning[D]. Yichang: China Three Gorges University, 2019. (in Chinese)
Cite this article

Zhang Mingrong, Yu Hao, Lyu Hui, Jiang Libiao, Li Liping, Lu Lei. Multimodal information fusion dynamic target recognition for autonomous driving[J]. Journal of Chongqing University, 2024, 47(4): 139-156.

History
  • Received: 2023-05-12
  • Published online: 2024-05-06