Intelligent Delay Matching Method for Parking Allocation System via Multi-agent Deep Reinforcement Learning
Cite this article: ZHAO Cong, ZHANG Xin-yuan, LI Xing-hua, DU Yu-chuan. Intelligent Delay Matching Method for Parking Allocation System via Multi-agent Deep Reinforcement Learning[J]. China Journal of Highway and Transport, 2022, 35(7): 261-272.
Authors: ZHAO Cong  ZHANG Xin-yuan  LI Xing-hua  DU Yu-chuan
Institution: 1. Key Laboratory of Road and Traffic Engineering of the Ministry of Education, Tongji University, Shanghai 201804, China; 2. College of Transportation Engineering, Tongji University, Shanghai 201804, China; 3. Urban Mobility Institute, Tongji University, Shanghai 200092, China
Fund projects: Scientific Research and Innovation Plan of the Shanghai Municipal Education Commission (2021-01-07-00-07-E00092); Scientific Research Plan of the Shanghai Municipal Science and Technology Commission (19DZ1208700)
Abstract: Under the "Internet+" model, optimized matching between users and regional parking resources is an effective way to ease the difficulty of finding a parking spot. Traditional research focuses mainly on the design of dynamic matching mechanisms and pays little attention to the timing of matching. In a stochastic, dynamic environment, a user who waits an appropriate time after arriving near the destination can often obtain a better parking spot, but the benefit depends on the current parking supply-demand pattern. Accordingly, an intelligent delay matching strategy is proposed for the first time: each parking user is modeled as an agent, and a multi-agent deep Q-learning (M-DQN) model is constructed. Based on the learned supply-demand state of the system, each user autonomously decides how long to wait before entering the allocation pool, after which the system matches users to parking spots with the Hungarian algorithm. Because the total number of agents varies, a centralized-training and distributed-execution framework is adopted to achieve cooperative multi-agent optimization. For comparison, a zero-wait strategy (Greedy) and a maximum-wait strategy (Max Delay) are designed. In the numerical experiments, three parking supply-demand scenarios are built from parking data measured at the Siping Road Campus of Tongji University. During the weekday morning peak, Greedy is the best matching strategy; M-DQN and Max Delay increase the average total parking time and lower the matching success rate. During weekday off-peak hours, M-DQN reduces the average total parking time by 23.8% and 22.4% relative to Greedy and Max Delay, respectively, a clear improvement. During the weekday evening peak, M-DQN reduces the average total parking time by 12.8% and 14.5% relative to Greedy and Max Delay, respectively, showing that M-DQN can learn the optimal matching strategy from the supply-demand state. The results show that, when parking supply and demand are relatively balanced, the proposed delay matching strategy and multi-agent deep reinforcement learning method effectively reduce users' average driving time and walking distance, and the benefit increases with the parking turnover rate. The delay strategy nevertheless has limitations: it is not suitable for scenarios with tight parking supply and low parking turnover.

Keywords: traffic engineering; urban parking; optimized matching; deep reinforcement learning; multi-agent; Markov decision process
Received: 2020-09-10
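
The abstract states that once users enter the allocation pool, the system assigns parking spots with the Hungarian algorithm, but it gives no implementation details. The following is a minimal sketch of that matching step, assuming a cost matrix whose entries are a generalized parking cost (e.g., driving time plus weighted walking distance); it relies on scipy.optimize.linear_sum_assignment, which solves the same optimal-assignment problem the Hungarian algorithm addresses, and all variable and function names are illustrative rather than taken from the paper.

```python
# Minimal sketch of the allocation-pool matching step (illustrative, not the paper's code).
# Assumption: cost[i, j] is a generalized cost (e.g., driving time plus weighted
# walking distance) of assigning pooled user i to free parking spot j.
import numpy as np
from scipy.optimize import linear_sum_assignment


def match_pool(cost: np.ndarray) -> list[tuple[int, int]]:
    """Return (user, spot) pairs that minimize the total generalized cost.

    linear_sum_assignment handles rectangular matrices, so when there are
    fewer free spots than pooled users, the unmatched users simply remain
    in the pool for the next matching round.
    """
    users, spots = linear_sum_assignment(cost)
    return list(zip(users.tolist(), spots.tolist()))


if __name__ == "__main__":
    # Three pooled users and four free spots; the numbers are hypothetical.
    cost = np.array([[4.0, 1.0, 3.0, 2.5],
                     [2.0, 0.5, 5.0, 3.0],
                     [3.0, 2.0, 2.0, 4.0]])
    print(match_pool(cost))  # prints a minimum-cost (user, spot) pairing
```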

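For the delay decision itself, the abstract describes each user as a deep Q-learning agent that observes the parking supply-demand state and chooses how long to wait before joining the pool, trained centrally but executed in a distributed fashion. The paper's state features, action set, and network architecture are not reproduced here, so the sketch below only illustrates the general shape of such an agent: the state vector, the discrete delay actions, and the shared Q-network are assumptions standing in for the centralized-training, distributed-execution setup described in the abstract.

```python
# Illustrative sketch of a per-user delay-decision agent (not the paper's model).
# Assumptions: the state is a small vector of supply-demand features (e.g., nearby
# occupancy, expected departures, time of day), the action is a discrete waiting
# time chosen from DELAY_STEPS, and one Q-network is shared by all agents
# (centralized training) while each agent queries it locally at execution time.
import random

import torch
import torch.nn as nn

DELAY_STEPS = [0, 60, 120, 180, 240, 300]  # candidate waiting times in seconds (assumed)
STATE_DIM = 8                              # number of supply-demand features (assumed)


class DelayQNetwork(nn.Module):
    """Small MLP mapping a supply-demand state to Q-values over delay actions."""

    def __init__(self, state_dim: int = STATE_DIM, n_actions: int = len(DELAY_STEPS)):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(state_dim, 64), nn.ReLU(),
            nn.Linear(64, 64), nn.ReLU(),
            nn.Linear(64, n_actions),
        )

    def forward(self, state: torch.Tensor) -> torch.Tensor:
        return self.net(state)


def choose_delay(q_net: DelayQNetwork, state: torch.Tensor, epsilon: float = 0.05) -> int:
    """Epsilon-greedy selection of a waiting time for one arriving user."""
    if random.random() < epsilon:
        return random.choice(DELAY_STEPS)
    with torch.no_grad():
        action = int(q_net(state).argmax().item())
    return DELAY_STEPS[action]


if __name__ == "__main__":
    q_net = DelayQNetwork()
    dummy_state = torch.rand(STATE_DIM)  # stand-in for real supply-demand features
    print("chosen delay (s):", choose_delay(q_net, dummy_state))
```

A full training loop would additionally need a replay buffer, a target network, and a reward derived from the realized total parking time after matching, as in standard DQN training.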