Abstract: Vehicle detection in autonomous driving environments is complicated by the large number of small targets and by severe target occlusion. This paper proposes a multimodal information fusion method for dynamic target recognition in autonomous driving. The method makes three main improvements. (1) An improved ResNet50 network based on a spatial attention mechanism and hybrid dilated convolution: the 3×3 standard convolutions in the conv2_x and conv3_x stages are replaced with selective kernel convolutions, which allow the network to dynamically adjust the size of its receptive field according to the feature scale. A sawtooth hybrid dilated convolution with dilation rates [1,2,1,2,1,2] is used in the conv4_x stage so that the network captures multi-scale contextual information and extracts features more effectively. (2) A GIoU loss function: the localization loss in YOLOv3 is replaced with the GIoU loss, which is better suited to practical applications. (3) Human-vehicle target classification and recognition based on the fusion of two types of data: the proposed fusion algorithm effectively improves target detection accuracy. Experimental results show that, compared with OFTNet, VoxelNet and Faster R-CNN, the proposed method improves mAP by 0.05 in the daytime and 0.09 in the evening, and converges better.
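For reference, the GIoU localization loss mentioned above can be sketched as follows. This is a minimal illustration of the standard GIoU definition, not the implementation used in this work: the function names `giou` and `giou_loss` are hypothetical, boxes are assumed to be axis-aligned tuples (x1, y1, x2, y2) with positive area, and the loss is taken as 1 - GIoU.

```python
def giou(box_a, box_b):
    """Generalized IoU of two axis-aligned boxes given as (x1, y1, x2, y2).

    Assumes both boxes have positive area (x2 > x1 and y2 > y1).
    """
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b

    area_a = (ax2 - ax1) * (ay2 - ay1)
    area_b = (bx2 - bx1) * (by2 - by1)

    # Intersection rectangle; width and height are clamped at zero
    # so that disjoint boxes give an intersection area of zero.
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h

    union = area_a + area_b - inter
    iou = inter / union

    # Smallest axis-aligned box that encloses both inputs.
    enclose_w = max(ax2, bx2) - min(ax1, bx1)
    enclose_h = max(ay2, by2) - min(ay1, by1)
    enclose = enclose_w * enclose_h

    # GIoU subtracts the fraction of the enclosing box not covered by the union.
    return iou - (enclose - union) / enclose


def giou_loss(box_pred, box_true):
    """Localization loss defined as 1 - GIoU; ranges from 0 up to (but not reaching) 2."""
    return 1.0 - giou(box_pred, box_true)
```

Unlike the plain IoU, this quantity remains informative when the predicted and ground-truth boxes do not overlap: the penalty term grows as the boxes move apart, so the loss still distinguishes a near miss from a distant one.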