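Distributed Signal Control of Multi-agent Reinforcement Learning Based on Game
Cite this article: QU Zhao-wei, PAN Zhao-tian, CHEN Yong-heng, LI Hai-tao, WANG Xin. Distributed Signal Control of Multi-agent Reinforcement Learning Based on Game[J]. Transportation Systems Engineering and Information, 2020, 20(2): 76-82.
Authors: QU Zhao-wei  PAN Zhao-tian  CHEN Yong-heng  LI Hai-tao  WANG Xin
Affiliation: College of Transportation, Jilin University, Changchun 130022, China
Funding: National Natural Science Foundation of China (51705196).
Abstract: Unbalanced and fluctuating traffic demand makes distributed signal control harder to optimize. Because existing independent-action multi-agent reinforcement learning (IA-MARL) makes decisions based only on each agent's own historical experience, distributed signal control based on IA-MARL cannot promptly mitigate the effects of unbalanced and fluctuating demand. This paper incorporates the game-theoretic concept of mixed-strategy Nash equilibrium to improve the decision-making process of IA-MARL and proposes a game-based multi-agent reinforcement learning (G-MARL) framework. Distributed control methods based on IA-MARL and G-MARL are each simulated numerically on a grid network with unbalanced road-network input flows and Poisson arrival rates, yielding curves of unit travel time and unit average vehicle delay. The results show that, compared with IA-MARL, G-MARL improves unit travel time by 59.94% and unit average vehicle delay by 81.45%. This demonstrates that G-MARL is suitable for distributed signal control under unsaturated conditions with unbalanced and fluctuating traffic demand.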

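Keywords: intelligent transportation  distributed traffic signal control  multi-agent reinforcement learning  urban road network under unbalanced demand  game theory  numerical simulation
Received: 2019-12-10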
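
The abstract describes a grid network fed by unbalanced input flows with Poisson arrival rates but gives no numerical settings. As a minimal sketch of what such a demand input could look like, the following generates Poisson arrival times per network entry from exponential headways. The entry names, the flow values in `ENTRY_FLOWS_VPH`, and the one-hour horizon are assumptions chosen for illustration, not values from the paper.

```python
import random

def poisson_arrival_times(rate_veh_per_s, horizon_s, rng):
    """Arrival times (s) of a Poisson process, built from exponential headways."""
    times = []
    t = rng.expovariate(rate_veh_per_s)
    while t < horizon_s:
        times.append(t)
        t += rng.expovariate(rate_veh_per_s)
    return times

# Hypothetical, deliberately unbalanced entry flows in vehicles per hour.
ENTRY_FLOWS_VPH = {"west": 900, "east": 300, "north": 600, "south": 200}

def generate_demand(flows_vph, horizon_s, seed=0):
    """Return {entry: sorted arrival times in seconds} for every entry."""
    rng = random.Random(seed)
    return {
        entry: poisson_arrival_times(vph / 3600.0, horizon_s, rng)
        for entry, vph in flows_vph.items()
    }

demand = generate_demand(ENTRY_FLOWS_VPH, horizon_s=3600)
for entry, times in demand.items():
    print(entry, len(times), "vehicles in one hour")
```

Exponential headways are one standard way to realize a Poisson arrival process; fluctuating demand could be approximated by regenerating the arrivals with different rates per time interval.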

Distributed Signal Control of Multi-agent Reinforcement Learning Based on Game
QU Zhao-wei,PAN Zhao-tian,CHEN Yong-heng,LI Hai-tao,WANG Xin.Distributed Signal Control of Multi-agent Reinforcement Learning Based on Game[J].Transportation Systems Engineering and Information,2020,20(2):76-82.
Authors:QU Zhao-wei  PAN Zhao-tian  CHEN Yong-heng  LI Hai-tao  WANG Xin
Institution:College of Transportation, Jilin University, Changchun 130022, China
Abstract:Unbalanced and fluctuating traffic demand makes distributed signal control increasingly difficult. Because decision-making in existing independent-action multi-agent reinforcement learning (IA-MARL) relies only on each agent's own historical experience, distributed signal control based on IA-MARL struggles to alleviate the impact of unbalanced and fluctuating traffic demand in a timely manner. In this paper, a framework of multi-agent reinforcement learning based on the game (G-MARL) is proposed, which improves the decision-making of IA-MARL by integrating the mixed-strategy Nash equilibrium, a concept from game theory. In a grid network with unbalanced input flows and Poisson arrival rates, the distributed control methods based on IA-MARL and G-MARL were simulated to obtain the unit travel time and unit average vehicle delay curves. The results show that, compared with IA-MARL, G-MARL reduces the unit travel time and the unit average vehicle delay by 59.94% and 81.45%, respectively. This demonstrates that G-MARL is suitable for distributed signal control in the unsaturated state when traffic demand is unbalanced and fluctuating.
Keywords:intelligent transportation  distributed traffic signal control  multi-agent reinforcement learning  urban road network under unbalanced demand  game theory  numerical simulation  
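
The abstract names the mixed-strategy Nash equilibrium as the game-theoretic ingredient of G-MARL but does not state how payoffs are defined or how the equilibrium enters the agents' decisions. The sketch below therefore only illustrates the general concept: it solves a 2x2 bimatrix game between two neighboring signal agents through the indifference conditions. The function names, the two actions (0 = keep the current phase, 1 = switch), and the integer payoff matrices are hypothetical and are not taken from the paper.

```python
import random
from fractions import Fraction

def mixed_nash_2x2(A, B):
    """Interior mixed-strategy Nash equilibrium of a 2x2 bimatrix game.

    A[i][j] is the row agent's payoff and B[i][j] the column agent's payoff
    when the row agent plays i and the column agent plays j (integer payoffs).
    Returns (p, q), the probabilities that the row and the column agent play
    action 0, or None if no fully mixed equilibrium exists.
    """
    # p makes the column agent indifferent between its two actions.
    denom_p = B[0][0] - B[1][0] - B[0][1] + B[1][1]
    # q makes the row agent indifferent between its two actions.
    denom_q = A[0][0] - A[0][1] - A[1][0] + A[1][1]
    if denom_p == 0 or denom_q == 0:
        return None
    p = Fraction(B[1][1] - B[1][0], denom_p)
    q = Fraction(A[1][1] - A[0][1], denom_q)
    if not (0 < p < 1 and 0 < q < 1):
        return None
    return p, q

def sample_action(prob_action0, rng):
    """Draw action 0 with probability prob_action0, otherwise action 1."""
    return 0 if rng.random() < prob_action0 else 1

# Hypothetical payoffs for two adjacent intersections (assumed values).
A = [[0, 6], [3, 2]]  # row agent
B = [[1, 2], [5, 1]]  # column agent

p, q = mixed_nash_2x2(A, B)
rng = random.Random(0)
print("row plays 0 with prob", p, "and column plays 0 with prob", q)
print("sampled joint action:", (sample_action(p, rng), sample_action(q, rng)))
```

At the equilibrium each agent randomizes so that the other is indifferent between its actions, which is why the row agent's probability `p` is computed from the column agent's payoffs `B` and vice versa.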
This article is indexed in CNKI, Wanfang Data, and other databases.