Optimization of Autonomous Vehicle Trajectories under Abnormal Events in Mixed Traffic Flow

Author:

Affiliation:

School of Traffic Management, People's Public Security University of China

Funding:

National Key R&D Program of China (2023YFB4302701)

Abstract:

To address the imbalance between efficiency and safety in the cooperative lane changing of autonomous vehicles triggered by abnormal events in mixed traffic flow, this paper proposes a trajectory optimization model based on a multi-agent dueling double deep Q-network (MAD3QN). The model dynamically couples global traffic efficiency with local safety indicators through a hierarchical reward mechanism, and uses a centralized shared experience pool to eliminate policy oscillation and information silos in multi-vehicle coordination. By separating the state-value function from the advantage function and introducing an asynchronous double Q-learning update, the model effectively suppresses Q-value overestimation bias. The study builds a CARLA-SUMO co-simulation platform that integrates microscopic driving-behavior modeling with physics-engine interaction and calibrates the key parameters of the car-following model; it designs dynamic abnormal scenarios such as accident lane occupancy and emergency obstacle avoidance, filling a research gap in the active control of autonomous vehicles under non-steady-state traffic flow. Experimental results show that, compared with state-of-the-art baseline models, the MAD3QN model achieves significantly higher average rewards across different traffic flow densities, realizes a Pareto optimum between safety and traffic efficiency, and provides a robust and generalizable framework for the cooperative decision-making of autonomous vehicles under abnormal events.
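The dueling decomposition and double Q-learning target described in the abstract follow well-established formulations (the dueling network architecture of Wang et al. and the Double DQN of van Hasselt et al.); the paper's exact per-agent variant may differ. In their standard form, with \theta^{-} denoting the target-network parameters:

\[
Q(s, a; \theta) = V(s; \theta_V) + A(s, a; \theta_A) - \frac{1}{|\mathcal{A}|} \sum_{a'} A(s, a'; \theta_A)
\]
\[
y = r + \gamma \, Q\!\left(s', \arg\max_{a'} Q(s', a'; \theta);\ \theta^{-}\right)
\]

The online network selects the greedy next action while a separate target network evaluates it, which is the mechanism that suppresses the Q-value overestimation bias mentioned above.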

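A minimal PyTorch sketch of these two ingredients, together with a centralized experience pool shared by all agents, follows. It is illustrative only: the state dimension, action count, network width, and hyperparameters are assumptions, not the paper's published configuration.

    # Illustrative sketch; dimensions and hyperparameters are assumed.
    import random
    from collections import deque

    import torch
    import torch.nn as nn

    class DuelingQNet(nn.Module):
        """Dueling head: separate state-value and advantage streams."""
        def __init__(self, state_dim=6, n_actions=3, hidden=128):
            super().__init__()
            self.trunk = nn.Sequential(nn.Linear(state_dim, hidden), nn.ReLU())
            self.value = nn.Linear(hidden, 1)              # V(s)
            self.advantage = nn.Linear(hidden, n_actions)  # A(s, a)

        def forward(self, s):
            h = self.trunk(s)
            v, a = self.value(h), self.advantage(h)
            # Q(s,a) = V(s) + A(s,a) - mean_a' A(s,a')  (identifiability)
            return v + a - a.mean(dim=1, keepdim=True)

    class SharedReplay:
        """Centralized experience pool shared across agents."""
        def __init__(self, capacity=100_000):
            self.buf = deque(maxlen=capacity)

        def push(self, s, a, r, s2, done):
            self.buf.append((s, a, r, s2, done))

        def sample(self, k):
            s, a, r, s2, d = zip(*random.sample(self.buf, k))
            return (torch.stack(s), torch.tensor(a),
                    torch.tensor(r, dtype=torch.float32),
                    torch.stack(s2), torch.tensor(d, dtype=torch.float32))

    def double_dqn_loss(online, target, batch, gamma=0.99):
        s, a, r, s2, done = batch
        q = online(s).gather(1, a.unsqueeze(1)).squeeze(1)
        with torch.no_grad():
            # Double DQN: the online net picks the action, the target net scores it.
            a_star = online(s2).argmax(dim=1, keepdim=True)
            y = r + gamma * (1.0 - done) * target(s2).gather(1, a_star).squeeze(1)
        return nn.functional.smooth_l1_loss(q, y)

Sharing one replay buffer lets every agent learn from the transitions the others collect, which is the abstract's remedy for information silos in multi-vehicle coordination.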
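The abstract does not give the reward formula, so the following is only a hypothetical illustration of a hierarchical reward that couples a global efficiency term with a local safety term; the weights, the time-to-collision (TTC) threshold, and the normalization by free-flow speed are invented for the example.

    # Hypothetical hierarchical reward, NOT the paper's formula.
    def hierarchical_reward(mean_flow_speed, free_flow_speed, ttc,
                            ttc_threshold=3.0, w_eff=0.5, w_safe=0.5):
        r_eff = mean_flow_speed / free_flow_speed  # global efficiency, in [0, 1]
        # Local safety: zero when TTC is comfortable, growing penalty below it.
        r_safe = 0.0 if ttc >= ttc_threshold else ttc / ttc_threshold - 1.0
        return w_eff * r_eff + w_safe * r_safe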
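On the simulation side, the SUMO half of a CARLA-SUMO co-simulation can be driven through TraCI (CARLA ships synchronization scripts for its half, omitted here). Below is a hedged sketch of calibrating car-following parameters and injecting an accident-occupies-lane event; the config file name, vehicle-type ID, vehicle ID, and all numeric values are placeholders.

    # Placeholder names and values throughout; CARLA synchronization omitted.
    import traci

    traci.start(["sumo", "-c", "mixed_flow.sumocfg"])

    # Calibrate key car-following parameters of the human-driven vehicle type.
    traci.vehicletype.setTau("human", 1.2)    # desired time headway, s
    traci.vehicletype.setAccel("human", 2.6)  # max acceleration, m/s^2
    traci.vehicletype.setDecel("human", 4.5)  # comfortable deceleration, m/s^2

    for step in range(3600):
        traci.simulationStep()
        # Emulate an accident: stop one vehicle in its lane for 120 s,
        # a short distance ahead so the stop is still reachable.
        if step == 300 and "veh_accident" in traci.vehicle.getIDList():
            traci.vehicle.setStop(
                "veh_accident",
                traci.vehicle.getRoadID("veh_accident"),
                pos=traci.vehicle.getLanePosition("veh_accident") + 10.0,
                laneIndex=traci.vehicle.getLaneIndex("veh_accident"),
                duration=120.0,
            )

    traci.close()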
History
  • Received: 2024-11-15
  • Last revised: 2025-05-13
  • Accepted: 2025-06-03