Abstract: To address the imbalance between efficiency and safety in cooperative lane changing of autonomous vehicles triggered by abnormal events in mixed traffic flow, this paper proposes a trajectory optimization model based on a Multi-Agent Dueling Double Deep Q-Network (MAD3QN). The model dynamically couples global traffic efficiency with local safety indicators through a hierarchical reward mechanism, and combines it with a centralized shared experience replay pool to eliminate policy oscillations and information silos in multi-vehicle coordination. It suppresses Q-value overestimation bias by separating the state value function from the advantage function and by introducing a double Q-learning asynchronous update mechanism. The study builds a CARLA-SUMO co-simulation platform that integrates microscopic driving behavior modeling with physics engine interaction, calibrates the key parameters of the car-following model, and designs dynamic abnormal scenarios such as accidental lane occupancy and emergency obstacle avoidance, filling a gap in research on the active control of autonomous vehicles in unsteady traffic flow. Experimental results show that, compared with advanced baseline models, MAD3QN achieves a significantly higher average reward across different traffic flow densities, realizes a Pareto optimization of safety and traffic efficiency, and provides a framework with both robustness and generalization ability for the cooperative decision-making of autonomous vehicles under abnormal events.
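The two mechanisms named above for suppressing overestimation can be illustrated with a minimal NumPy sketch. It assumes the standard formulations of the dueling aggregation (state value plus mean-centered advantage) and the double Q-learning target (the online network selects the next action, the target network evaluates it); the abstract does not give the model's exact equations, so the function names `dueling_q` and `double_dqn_target`, the array shapes, and the default discount factor are illustrative assumptions rather than the authors' implementation.

```python
import numpy as np


def dueling_q(value, advantage):
    """Combine V(s) and A(s, a) into Q(s, a) (assumed standard dueling form).

    value:     shape (batch,)           state values V(s)
    advantage: shape (batch, n_actions) advantages A(s, a)
    """
    value = np.asarray(value, dtype=float)
    advantage = np.asarray(advantage, dtype=float)
    # Subtracting the mean advantage keeps V and A identifiable.
    return value[..., None] + advantage - advantage.mean(axis=-1, keepdims=True)


def double_dqn_target(reward, done, q_online_next, q_target_next, gamma=0.99):
    """Compute the double Q-learning target for a batch of transitions.

    The online network picks the greedy next action; the target network
    supplies its value. Decoupling selection from evaluation is what
    reduces the overestimation of a plain max over one network.
    """
    reward = np.asarray(reward, dtype=float)
    done = np.asarray(done, dtype=float)
    q_online_next = np.asarray(q_online_next, dtype=float)
    q_target_next = np.asarray(q_target_next, dtype=float)

    best_action = np.argmax(q_online_next, axis=-1)
    evaluated = np.take_along_axis(
        q_target_next, best_action[..., None], axis=-1
    )[..., 0]
    # Terminal transitions (done == 1) do not bootstrap from the next state.
    return reward + gamma * (1.0 - done) * evaluated
```

In this sketch, a transition whose online network prefers action 1 is evaluated with the target network's estimate for action 1, even if the target network assigns a larger value to another action; a plain max over the target network would use that larger value instead.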