Human-like autonomous car-following model with deep reinforcement learning
Institution: 1. Key Laboratory of Road and Traffic Engineering, Ministry of Education, Shanghai 201804, China; 2. School of Transportation Engineering, Tongji University, Shanghai 201804, China; 3. Department of Civil and Environmental Engineering, University of Washington, Seattle, WA 98195-2700, USA; 4. Department of Automation, Tsinghua University, Beijing 100084, China; 5. College of Civil Engineering and Architecture, Zhejiang University, Hangzhou 310058, China; 6. Department of Civil and Environmental Engineering, University of Maryland, College Park, MD 20742, United States
Abstract: This study proposes a framework for human-like autonomous car-following planning based on deep reinforcement learning (deep RL). Historical driving data are fed into a simulation environment in which an RL agent learns through trial-and-error interactions, guided by a reward function that signals how much the agent deviates from the empirical data. Through these interactions, the agent obtains an optimal policy: a car-following model that maps the following vehicle's speed, the relative speed between the lead and following vehicles, and the inter-vehicle spacing to the following vehicle's acceleration in a human-like way. The model can be continuously updated as more data are fed in. Two thousand car-following periods extracted from the 2015 Shanghai Naturalistic Driving Study were used to train the model and to compare its performance with that of traditional and recent data-driven car-following models. The results show that a deep deterministic policy gradient car-following model that uses the disparity between simulated and observed speed as the reward function and considers a reaction delay of 1 s, denoted DDPGvRT, reproduces human-like car-following behavior more accurately than traditional and recent data-driven car-following models. Specifically, the DDPGvRT model has a spacing validation error of 18% and a speed validation error of 5%, both lower than those of other models, including the intelligent driver model, models based on locally weighted regression, and conventional neural-network-based models. Moreover, DDPGvRT generalizes well to various driving situations and can adapt to different drivers through continuous learning. This study demonstrates that reinforcement learning can offer insight into driver behavior and can contribute to the development of human-like autonomous driving algorithms and traffic-flow models.
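The training setup described in the abstract (a state of speed, relative speed, and spacing; acceleration as the action; a speed-disparity reward; a 1-s reaction delay) can be illustrated with a minimal sketch. The code below is a hypothetical reconstruction, not the authors' implementation: the class name `CarFollowingEnv`, the 0.1-s time step, the kinematic update, and the absolute-deviation form of the reward are all assumptions; only the state and action variables, the speed-disparity reward idea, and the 1-s delay come from the abstract.

```python
import numpy as np

# Hypothetical sketch of the simulation environment implied by the abstract.
# All names and numeric choices other than the 1-s reaction delay are assumed.

DT = 0.1                      # simulation time step (s), assumed
TAU = 1.0                     # follower reaction delay (s), from the abstract
DELAY_STEPS = int(TAU / DT)   # delay expressed in simulation steps

class CarFollowingEnv:
    """One empirical car-following period replayed as an RL episode.

    State:  (follower speed v, relative speed dv = v_lead - v, spacing s)
    Action: follower acceleration a, applied after the reaction delay.
    Reward: negative disparity between simulated and observed follower speed.
    """

    def __init__(self, lead_speed, obs_speed, init_spacing):
        self.lead_speed = lead_speed        # observed lead-vehicle speed trace
        self.obs_speed = obs_speed          # observed follower speed trace
        self.s = init_spacing               # inter-vehicle spacing (m)
        self.v = obs_speed[0]               # simulated follower speed (m/s)
        self.t = 0
        self.pending = [0.0] * DELAY_STEPS  # action buffer modeling the delay

    def state(self):
        dv = self.lead_speed[self.t] - self.v
        return np.array([self.v, dv, self.s], dtype=np.float32)

    def step(self, action):
        # Delay the commanded acceleration by the reaction time.
        self.pending.append(float(action))
        a = self.pending.pop(0)
        # Kinematic update of follower speed and inter-vehicle spacing.
        v_new = max(self.v + a * DT, 0.0)
        self.s += (self.lead_speed[self.t] - 0.5 * (self.v + v_new)) * DT
        self.v = v_new
        self.t += 1
        # Reward: how closely the simulated speed tracks the human driver.
        reward = -abs(self.v - self.obs_speed[self.t])
        done = self.t >= len(self.obs_speed) - 1 or self.s <= 0.0
        return self.state(), reward, done
```

Under these assumptions, a DDPG agent would be trained by repeatedly resetting the environment to a new empirical car-following period and maximizing cumulative reward, which here amounts to reproducing the human driver's observed speed profile.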
Keywords: Autonomous car following; Human-like driving planning; Deep reinforcement learning; Naturalistic driving study; Deep deterministic policy gradient
This article is indexed in ScienceDirect and other databases.