Multimodal information fusion dynamic target recognition for autonomous driving

Authors: Zhang Mingrong, Yu Hao, Lyu Hui, Jiang Libiao, Li Liping, Lu Lei

Affiliations:

1. School of Automotive Technology, Guangdong Industry Polytechnic, Guangzhou 510000, P. R. China; 2. R&D Center, GAC AION New Energy Automobile Co., Ltd., Guangzhou 511400, P. R. China; 3. School of Mechanical & Automotive Engineering, South China University of Technology, Guangzhou 510641, P. R. China; 4. Engineering Research Institute, Guangzhou City University of Technology, Guangzhou 510800, P. R. China

About the first author: Zhang Mingrong (1983—), female, Ph.D., associate professor; her research focuses on intelligent and connected vehicles. E-mail: 153155269@qq.com.

Corresponding author: Yu Hao, male, senior engineer. E-mail: yuhao@gacne.com.cn.

Fund project: Supported by the National Natural Science Foundation of China (51975217).

    Abstract:

    A multimodal information fusion object recognition method for autonomous driving is proposed to address vehicle and pedestrian detection in autonomous driving environments. The method first improves the ResNet50 network by introducing a spatial attention mechanism and hybrid dilated convolution: selective kernel (SK) convolution replaces part of the standard convolution layers, allowing the network to adjust the size of its receptive field dynamically according to the feature scale. Then, a sawtooth hybrid dilated convolution schedule is used in the convolution layers so that the network captures multi-scale contextual information and its feature extraction capability improves. The localization loss function of YOLOv3 is replaced with the GIoU loss, which is more tractable in practical applications. Finally, a human-vehicle target classification and recognition algorithm based on data fusion is proposed, which effectively improves target detection accuracy. Experimental results show that, compared with the OFTNet, VoxelNet, and Faster R-CNN networks, the proposed method improves mAP by up to 0.05 in daytime scenes and up to 0.09 at night, with good convergence.
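The sawtooth hybrid dilated convolution schedule described in the abstract can be illustrated with a small receptive-field calculation (a sketch under the usual stride-1, 3×3-kernel assumptions, not code from the paper): dilation rates (1, 2, 3) yield the same receptive field as three stacked layers with a fixed rate of 2, while covering every pixel in that field and thus avoiding the gridding artifact.

```python
def receptive_field(dilations, kernel=3):
    """Receptive field of a stride-1 stack of dilated convolutions.

    Each layer with kernel size k and dilation d widens the
    receptive field by (k - 1) * d pixels.
    """
    rf = 1
    for d in dilations:
        rf += (kernel - 1) * d
    return rf

# Sawtooth schedule vs. a fixed dilation rate:
print(receptive_field([1, 2, 3]))  # sawtooth: 13
print(receptive_field([2, 2, 2]))  # fixed rate 2: also 13, but with "holes"
```

Both stacks reach a 13-pixel receptive field, but the fixed-rate stack samples only every second pixel, which is precisely the gridding problem the sawtooth schedule is meant to avoid.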
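The GIoU loss that replaces YOLOv3's localization loss can be sketched in a few lines (a minimal, self-contained illustration of the standard GIoU definition for axis-aligned boxes; it is not code from the paper):

```python
def giou_loss(box_a, box_b):
    """GIoU loss between two non-degenerate boxes given as (x1, y1, x2, y2).

    GIoU = IoU - |C \ (A ∪ B)| / |C|, where C is the smallest box
    enclosing both A and B; the loss is 1 - GIoU. Unlike plain IoU,
    the enclosing-box penalty keeps the loss informative even when
    the boxes do not overlap at all.
    """
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    area_a = (ax2 - ax1) * (ay2 - ay1)
    area_b = (bx2 - bx1) * (by2 - by1)
    # Intersection of A and B (zero if they do not overlap)
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    union = area_a + area_b - inter
    iou = inter / union
    # Smallest enclosing box C
    area_c = (max(ax2, bx2) - min(ax1, bx1)) * (max(ay2, by2) - min(ay1, by1))
    giou = iou - (area_c - union) / area_c
    return 1.0 - giou
```

For identical boxes the loss is 0; for disjoint boxes it exceeds 1, growing as the boxes move apart, which is what gives the detector a useful gradient in the non-overlapping case.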

    References
    [1] Xiong L, Wu J F, Xing X Y, et al. Review of driving risk assessment methods for autonomous vehicles[J/OL]. Automotive Engineering Journal: 1-15 [2023-04-28]. http://kns.cnki.net/kcms/detail/50.1206.U.20230425.0916.002.html (in Chinese)
    [2] Nan Y L,Zhang H C, Zeng Y. Intelligent detection of Multi-Class pitaya fruits in target picking row based on WGB-YOLO network[J]. Computers and Electronics in Agriculture,2023,208: 107780.
    [3] Li J R, Cai R Y, Tan Y, et al. Automatic detection of actual water depth of urban floods from social media images[J]. Measurement,2023,216: 1-19.
    [4] Vora S, Lang A H, Helou B, et al. PointPainting: sequential fusion for 3D object detection[C]//Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. Seattle: IEEE, 2020: 4604-4612.
    [5] Ren S, He K, Girshick R, et al. Faster R-CNN: towards real-time object detection with region proposal networks[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2016, 39(6): 1137-1149.
    [6] Ku J, Mozifian M, Lee J, et al. Joint 3D proposal generation and object detection from view aggregation[C]//2018 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS). Madrid: IEEE, 2018: 1-8.
    [7] Botha F. Data fusion of radar and stereo vision for detection and tracking of moving objects[C]//Pattern Recognition Association of South Africa & Robotics & Mechatronics International Conference. Bloemfontein: IEEE, 2017.
    [8] Li Y, Ma L, Zhong Z, et al. Deep learning for lidar point clouds in autonomous driving: a review [J]. IEEE Transactions on Neural Networks and Learning Systems, 2020(99):1-21.
    [9] Wang Y X, Xu S S, Li W B, et al. Identification and location of grapevine sucker based on information fusion of 2D laser scanner and machine vision[J]. International Journal of Agricultural and Biological Engineering, 2017, 10(2): 84-93.
    [10] Barrientos A, Garzón M, Fotiadis P E. Human detection from a mobile robot using fusion of laser and vision information[J]. Sensors, 2013, 13(9): 11603-11635.
    [11] Huang Y, Xiao Y, Wang P, et al. A seam-tracking laser welding platform with 3D and 2D visual information fusion vision sensor system[J]. The International Journal of Advanced Manufacturing Technology, 2013, 67(1-4): 415-426.
    [12] Ajayi O G, Ashi J, Guda B. Performance evaluation of YOLO v5 model for automatic crop and weed classification on UAV images[J]. Smart Agricultural Technology,2023,5: 1-10.
    [13] He K M, Zhang X, Ren S, et al. Deep residual learning for image recognition[C]//2016 IEEE Conference on Computer Vision and Pattern Recognition. IEEE, 2016: 770-778.
    [14] Li Z,Xu B L,Wu D, et al. A YOLO-GGCNN based grasping framework for mobile robots in unknown environments[J]. Expert Systems With Applications,2023, 225: 1-14.
    [15] Zhao C, Shu X, Yan X, et al. RDD-YOLO: a modified YOLO for detection of steel surface defects[J]. Measurement, 2023, 214: 1-12.
    [16] Zou C M, Xue R G. Improved YOLOv3 object detection algorithm: combining GIoU and Focal loss[J]. Computer Engineering and Applications, 2020, 56(24): 214-222. (in Chinese)
    [17] Geiger A, Lenz P, Urtasun R. Are we ready for autonomous driving? the kitti vision benchmark suite[C]//2012 IEEE conference on computer vision and pattern recognition. IEEE, 2012: 3354-3361.
    [18] Roddick T, Kendall A, Cipolla R. Orthographic feature transform for monocular 3d object detection[J]. arXiv preprint arXiv:1811.08188, 2018.
    [19] Zhou Y, Tuzel O. VoxelNet: end-to-end learning for point cloud based 3D object detection[C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. Salt Lake City: IEEE, 2018: 4490-4499.
    [20] Wu Z. Research on ship detection and tracking in dynamic background based on deep learning[D]. Yichang: China Three Gorges University, 2019. (in Chinese)
Cite this article

Zhang Mingrong, Yu Hao, Lyu Hui, Jiang Libiao, Li Liping, Lu Lei. Multimodal information fusion dynamic target recognition for autonomous driving[J]. Journal of Chongqing University, 2024, 47(4): 139-156.

History
  • Received: 2023-05-12
  • Published online: 2024-05-06