Situational awareness and path prediction combining game theory and reinforcement learning
Authors: Yang Yun, Liang Hua, Wei Xingshen, Li Yang, Liu Jun
Affiliation:

1.State Grid Chongqing Electric Power Company, Chongqing 400014, P. R. China;2.Electric Power Research Institute of State Grid Chongqing Electric Power Company, Chongqing 401123, P. R. China;3.State Grid Electric Power Research Institute Co., Ltd., Nanjing 211106, P. R. China;4.Nanjing NARI Information Communication Technology Co., Ltd., NARI Group Co., Ltd., Nanjing 211106, P. R. China;5.School of Software Engineering, Chongqing University of Posts and Telecommunications, Chongqing 400000, P. R. China

About the author:

Yang Yun (born 1964), male, senior engineer; research interests: power communication technology, smart grid, and information security technology; (E-mail) yy@cq.sgcc.com.cn.

Corresponding author:

Liu Jun (born 1978), male, master's supervisor; (E-mail) junliu@cqupt.edu.cn.

CLC number:

TP393

Fund Project:

Supported by Science and Technology Projects of State Grid Chongqing Electric Power Company (520626190067).

    Abstract:

    Cybersecurity situational awareness plays a critical role in assessing network security status, predicting attack paths, and helping administrators mount effective defenses. Traditional network situation assessment methods mostly focus on static analysis at the theoretical level, which makes them hard to apply in practice. In addition, the data collected by sensors is large and heterogeneous, which easily overloads storage. To address these problems, this paper proposes a dynamic network attack-defense awareness model that combines game theory and reinforcement learning to analyze the network security situation and predict attack paths. First, an analytic hierarchy process (AHP) with a priority relation matrix is designed to calculate system loss and the security situation. Second, the Boltzmann probability distribution is introduced to calculate the mixed-strategy Nash equilibrium. Finally, an improved Q-learning algorithm is combined with game theory to dynamically analyze network state transitions, so as to accurately predict attack paths and select the optimal defense strategy. Network simulation experiments verify the effectiveness and feasibility of the model.
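
The abstract names two building blocks that are easy to picture in code: a Boltzmann distribution that turns payoffs into a mixed strategy, and a Q-learning update over network states. The following is a minimal sketch of the textbook versions only, not the paper's improved algorithm and not its Nash equilibrium computation. The function names `boltzmann_policy` and `q_update`, the parameters `temperature`, `alpha`, and `gamma`, and the toy table are all assumptions made for illustration.

```python
import math


def boltzmann_policy(values, temperature=1.0):
    """Map payoffs (or Q-values) to a probability distribution.

    Standard Boltzmann/softmax form: p_i is proportional to exp(v_i / T).
    A lower temperature concentrates probability on the best action.
    """
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    top = max(values)  # subtract the maximum for numerical stability
    weights = [math.exp((v - top) / temperature) for v in values]
    total = sum(weights)
    return [w / total for w in weights]


def q_update(q_table, state, action, reward, next_state,
             alpha=0.1, gamma=0.9):
    """Apply one standard tabular Q-learning update in place.

    Q(s, a) <- Q(s, a) + alpha * (r + gamma * max_a' Q(s', a') - Q(s, a))
    Returns the updated value of Q(s, a).
    """
    best_next = max(q_table[next_state])
    td_target = reward + gamma * best_next
    q_table[state][action] += alpha * (td_target - q_table[state][action])
    return q_table[state][action]


if __name__ == "__main__":
    # Hypothetical toy setting: 2 network states, 2 defense actions.
    q = [[0.0, 0.0], [0.0, 0.0]]
    q_update(q, state=0, action=1, reward=1.0, next_state=1)
    # Mixed strategy over the defense actions available in state 0.
    print(boltzmann_policy(q[0], temperature=0.5))
```

In this sketch the temperature controls how deterministic the resulting strategy is. How the paper sets it, and how it couples the distribution to the game-theoretic payoffs, is not stated in the abstract.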

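Cite this article

Yang Yun, Liang Hua, Wei Xingshen, Li Yang, Liu Jun. Situational awareness and path prediction combining game theory and reinforcement learning[J]. Journal of Chongqing University, 2025, 48(6): 84-97.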

History
  • Received: 2020-10-12
  • Published online: 2025-07-11