Mixed-coordinated Decision-making Method for Arterial Signals Based on Reinforcement Learning

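Cite this article:	MA Dong-fang, CHEN Xi, WU Xiao-dong, JIN Sheng. Mixed-coordinated Decision-making Method for Arterial Signals Based on Reinforcement Learning[J]. Transportation Systems Engineering and Information, 2022, 22(2): 145-153.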

Authors:	MA Dong-fang, CHEN Xi, WU Xiao-dong, JIN Sheng

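Affiliations:	1. Institute of Marine Sensing and Networking, Zhejiang University, Hangzhou 310058, China; 2. Traffic Management Research Institute, Ministry of Public Security, Wuxi 214151, Jiangsu, China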

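Funding:	Zhejiang Provincial Natural Science Foundation, Distinguished Young Scholars Program; National Natural Science Foundation of China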

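Abstract:	Traffic congestion has become a widespread social problem in many large and medium-sized cities. Signal control, one of the key measures for relieving congestion and keeping traffic flowing, is receiving growing public attention. Signal optimization approaches fall into two categories, model-driven and data-driven, and as traffic big data continues to accumulate, data-driven methods based on reinforcement learning are increasingly becoming an emerging direction. However, existing data-driven studies focus mainly on the design of the decision model and rarely examine the structure of the agent. In addition, multi-intersection coordination mostly adopts distributed strategies that ignore information exchange between agents and cannot guarantee overall optimality at the regional level. This paper therefore takes arterial signals as its subject and develops a signal optimization method based on mixed multi-agent cooperative decision-making. First, to address the diversity, heterogeneity, and data imbalance of traffic states, a single-agent decision model with distributed training and partitioned memory is designed, and the state space and reward function are optimized to determine the best scheme for single-intersection control. Second, the strengths of distributed and centralized learning are combined to design a multi-agent interaction method: on top of distributed control at each intersection, a central agent evaluates the decisions of the local agents and feeds back an additional reward to adjust their decision models, achieving coordinated operation of multiple signals along the arterial. Finally, a simulation platform is built for performance testing and algorithm comparison. The results show that, compared with independent optimization and distributed coordination, the new method reduces the number of stops on the arterial by 14.8% and 13.6%, respectively, with branch-road traffic essentially unaffected, and thus delivers better control performance.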
Keywords:	intelligent transportation; cooperative decision-making; deep reinforcement learning; agent design; central agent
Received:	2021-10-19
Mixed-coordinated Decision-making Method for Arterial Signals Based on Reinforcement Learning

MA Dong-fang, CHEN Xi, WU Xiao-dong, JIN Sheng. Mixed-coordinated Decision-making Method for Arterial Signals Based on Reinforcement Learning[J]. Transportation Systems Engineering and Information, 2022, 22(2): 145-153.

Authors:	MA Dong-fang, CHEN Xi, WU Xiao-dong, JIN Sheng

Institution:	1. Institute of Marine Sensing and Networking, Zhejiang University, Hangzhou 310058, China; 2. Traffic Management Research Institute, Ministry of Public Security, Wuxi 214151, Jiangsu, China

Abstract:	Traffic congestion has become a common social problem in many large and medium-sized cities. Signal control, as one of the important measures to alleviate congestion, has attracted great attention. Signal optimization methods can be divided into two types: model-driven and data-driven. With the development of traffic big data, data-driven methods based on reinforcement learning have become an emerging direction. However, existing data-driven research focuses mainly on algorithm design and rarely discusses agent design. Meanwhile, multi-intersection coordination mostly uses a distributed strategy, which ignores communication between agents and cannot guarantee overall optimality. Therefore, this paper proposes a multi-agent cooperative decision-making optimization method for arterial signals. First, given the diversity, heterogeneity, and data imbalance of traffic states, a single-agent model with a memory palace is designed, in which the state space and reward function are optimized. Second, the advantages of distributed and centralized learning are integrated to design an interaction method: on top of distributed control, a central agent is set up to evaluate the behavior of the local agents and provide additional rewards that adjust the local agents' models, realizing coordinated control. Finally, a simulation platform is built to test the method and compare it with other algorithms. The results show that, compared with independent control and distributed coordination, the proposed method reduces the number of stops on the arterial road by 14.8% and 13.6%, respectively, and achieves a better control effect without affecting branch-road traffic flow.

Keywords:	intelligent transportation; cooperative decision-making; deep reinforcement learning; agent design; central agent
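
Illustrative sketch:	The mixed-coordination idea summarized in the abstract, in which local agents control their own intersections while a central agent scores their behavior and feeds back an additional reward, might look roughly like the following. This is only a sketch under stated assumptions and is not the authors' implementation: the function names `central_bonus` and `shaped_rewards`, the use of negative stop counts as the local reward, the weight `beta`, and the choice of the arterial-wide mean as the central score are all hypothetical.

```python
from typing import List


def central_bonus(arterial_stops: List[float]) -> float:
    """Hypothetical central-agent score: negative mean stops along the arterial."""
    return -sum(arterial_stops) / len(arterial_stops)


def shaped_rewards(local_stops: List[float], beta: float = 0.5) -> List[float]:
    """Add the central agent's feedback to each local agent's own reward.

    local_stops[i] is the number of stops observed at intersection i during
    the last control step (assumed measure); beta weights the central
    feedback (assumed hyperparameter).
    """
    bonus = central_bonus(local_stops)
    # Local reward is the negative stop count; the bonus is shared by all agents.
    return [-stops + beta * bonus for stops in local_stops]


# Three intersections along one arterial (made-up numbers).
rewards = shaped_rewards([4.0, 10.0, 1.0], beta=0.5)
```

With `beta = 0` each agent sees only its own reward, which corresponds to independent control; a larger `beta` ties every local agent more strongly to arterial-wide performance.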
This article is indexed in Wanfang Data and other databases.