Design of Reinforcement Learning Parameters for Seamless Application of Adaptive Traffic Signal Control
Authors:Samah El-Tantawy  Baher Abdulhai  Hossam Abdelgawad
Institution:1. Civil Engineering Department, University of Toronto, Toronto, Ontario, Canada;2. Engineering Mathematics Department, Cairo University, Giza, Egypt;3. Department of Civil Engineering, Faculty of Engineering, Cairo University, Giza, Egypt
Abstract:Adaptive traffic signal control (ATSC) is a promising technique for alleviating traffic congestion. This article focuses on the development of an adaptive traffic signal control system using Reinforcement Learning (RL), an efficient approach to solving this kind of stochastic closed-loop optimal control problem. A generic RL control engine is developed and applied to a multi-phase traffic signal at an isolated intersection in downtown Toronto in a simulation environment. Paramics, a microscopic simulation platform, is used to train and evaluate the adaptive traffic control system. This article investigates the following dimensions of the control problem: 1) RL learning methods, 2) traffic state representations, 3) action selection methods, 4) traffic signal phasing schemes, 5) reward definitions, and 6) variability of flow arrivals to the intersection. The system was tested on three networks (small, medium, and large-scale) to ensure seamless transferability of the system design and results. The RL controller is benchmarked against optimized pretimed control and actuated control. The RL-based controller reduces average vehicle delay by 48% compared with the optimized pretimed controller and the fully actuated controller. In addition, the best design of the RL-based ATSC system is tested on a large-scale application of 59 intersections in downtown Toronto, and the results are compared with the base case of the signal control systems in the field, which are a mix of pretimed and actuated controllers. The RL-based ATSC yields the following savings: average delay (27%), queue length (28%), and CO2 emission factors (28%).
Keywords:Adaptive Traffic Signal Control  Reinforcement Learning  Temporal Difference Learning
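
Illustrative sketch (not from the article): the abstract names temporal difference learning, state representation, action selection, and reward definition as design dimensions. The Python fragment below shows one common way these pieces can fit together, using tabular Q-learning with epsilon-greedy action selection. The class name QLearningSignalAgent, the queue-length state tuple, the negative-delay reward, and all parameter values are assumptions chosen for illustration and do not reproduce the authors' design.

```python
import random
from collections import defaultdict


class QLearningSignalAgent:
    """Hypothetical tabular Q-learning agent for one signalized intersection."""

    def __init__(self, n_phases, alpha=0.1, gamma=0.9, epsilon=0.1, seed=0):
        self.n_phases = n_phases      # number of signal phases (actions)
        self.alpha = alpha            # learning rate (assumed value)
        self.gamma = gamma            # discount factor (assumed value)
        self.epsilon = epsilon        # exploration probability (assumed value)
        self.rng = random.Random(seed)
        # Q-table: state -> list of action values, initialised to zero.
        self.q = defaultdict(lambda: [0.0] * n_phases)

    def select_phase(self, state):
        """Epsilon-greedy action selection over the signal phases."""
        if self.rng.random() < self.epsilon:
            return self.rng.randrange(self.n_phases)
        values = self.q[state]
        return max(range(self.n_phases), key=values.__getitem__)

    def update(self, state, phase, reward, next_state):
        """One-step Q-learning (temporal difference) update; returns the TD error."""
        td_target = reward + self.gamma * max(self.q[next_state])
        td_error = td_target - self.q[state][phase]
        self.q[state][phase] += self.alpha * td_error
        return td_error


# Assumed state: discretised queue lengths per approach; assumed reward: negative delay.
agent = QLearningSignalAgent(n_phases=2, alpha=0.5, gamma=0.9, epsilon=0.0)
state, next_state = (3, 1), (2, 1)
td_error = agent.update(state, phase=0, reward=-4.0, next_state=next_state)
best_phase = agent.select_phase(state)
```

In a simulation study, the state and reward would come from the traffic simulator at each decision point; here they are hard-coded so the fragment runs on its own.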