A Car-following Control Algorithm Based on Deep Reinforcement Learning
Cite this article: ZHU Bing, JIANG Yuan-de, ZHAO Jian, CHEN Hong, DENG Wei-wen. A Car-following Control Algorithm Based on Deep Reinforcement Learning[J]. China Journal of Highway and Transport, 2019, 32(6): 53-60.
Authors: ZHU Bing  JIANG Yuan-de  ZHAO Jian  CHEN Hong  DENG Wei-wen
Institution: 1. State Key Laboratory of Automotive Simulation and Control, Jilin University, Changchun 130025, Jilin, China; 2. School of Transportation Science and Engineering, Beihang University, Beijing 100083, China
Foundation items: National Key Research and Development Program of China (2016YFB0100904); National Natural Science Foundation of China (51775235); Key Research and Development Project of the Jilin Province Science and Technology Development Plan (20180201056GX); Science and Technology Research and Development Project of the Jilin Provincial Development and Reform Commission (2019C036-6)
Abstract: In car-following control, an adaptive cruise control system is affected by uncertainty in the motion of the preceding vehicle. To address this problem, a car-following control strategy that accounts for the stochastic motion of the preceding vehicle is proposed on the basis of an analysis of vehicle motion characteristics. A real-vehicle driving-data-acquisition platform was built, drivers were recruited for real-vehicle car-following road tests, and a database of real human-driving data was established. Under the assumption that a vehicle's future acceleration decisions are mainly influenced by the motion of the target vehicles ahead, a longitudinal control architecture for the host vehicle was built on a two-predecessor following structure. Driving data from the database were treated as the motion histories of the first and second preceding vehicles, and a stochastic process model of the preceding vehicle's longitudinal acceleration was established with a Gaussian process algorithm, providing a probabilistic model of the distribution of the target vehicle's motion state. Car-following was then formulated as a Markov decision process under a given reward function, and deep reinforcement learning was introduced to study the host-vehicle car-following control problem. A car-following control policy was built with the proximal policy optimization algorithm: through interactive, iterative learning against the stochastic process model of the preceding vehicle, a longitudinal control policy for the host vehicle in a car-following environment with motion uncertainty was obtained, realizing optimal longitudinal control decisions. Finally, the control strategy was tested on real driving data. The results show that the policy establishes a mapping between the longitudinal control of the host vehicle and the states of the host vehicle and the two preceding vehicles, accounts for the randomness of the preceding vehicle's motion during iterative learning, requires no additional probabilistic prediction of the preceding vehicle's motion during car-following control, and enables the host vehicle to follow the preceding vehicle stably at low computational cost.
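The first modeling step described above, fitting a Gaussian process to the preceding vehicle's recorded motion so that its next acceleration can be sampled as a distribution, can be sketched as follows. This is a minimal illustration assuming scikit-learn; the choice of input features (current speed and acceleration of the preceding vehicle), the kernel, and the toy data are hypothetical stand-ins, not the paper's actual configuration.

import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, WhiteKernel

# Hypothetical training samples extracted from a driving database:
# each row is a preceding-vehicle state [speed (m/s), acceleration (m/s^2)];
# the target is that vehicle's acceleration at the next time step.
X = np.array([[15.0, 0.2], [14.8, -0.1], [15.2, 0.4], [15.5, 0.0]])
y = np.array([-0.1, 0.3, 0.1, -0.2])

# RBF kernel plus a noise term; hyperparameters are fit by maximum likelihood.
kernel = RBF(length_scale=1.0) + WhiteKernel(noise_level=0.1)
gp = GaussianProcessRegressor(kernel=kernel, normalize_y=True).fit(X, y)

# The GP predicts a distribution (mean and standard deviation) over the next
# acceleration; sampling from it reproduces a stochastic preceding vehicle.
mean, std = gp.predict(np.array([[15.1, 0.1]]), return_std=True)
next_accel = np.random.normal(mean[0], std[0])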

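The second step, casting car-following as a Markov decision process and training the policy with proximal policy optimization, might look like the sketch below. It assumes a Gymnasium-style environment and the PPO implementation from stable-baselines3; the five-dimensional state (host speed plus gap and speed for each of the two predecessors), the time-gap reward, and the white-noise predecessor accelerations standing in for the Gaussian-process model are all illustrative assumptions, not the paper's settings.

import numpy as np
import gymnasium as gym
from gymnasium import spaces
from stable_baselines3 import PPO

class CarFollowingEnv(gym.Env):
    """Toy two-predecessor car-following MDP.

    State: [host speed, gap to vehicle 1#, speed of vehicle 1#,
            gap to vehicle 2#, speed of vehicle 2#].
    Action: host longitudinal acceleration (m/s^2).
    """
    def __init__(self):
        self.observation_space = spaces.Box(-np.inf, np.inf, shape=(5,), dtype=np.float32)
        self.action_space = spaces.Box(-3.0, 3.0, shape=(1,), dtype=np.float32)
        self.dt = 0.1

    def reset(self, seed=None, options=None):
        super().reset(seed=seed)
        self.state = np.array([15.0, 20.0, 15.0, 45.0, 15.0], dtype=np.float32)
        return self.state, {}

    def step(self, action):
        v, gap1, v1, gap2, v2 = self.state
        a = float(action[0])
        # Stand-in for the learned Gaussian-process model: the two
        # predecessors accelerate randomly at each step.
        a1, a2 = np.random.normal(0.0, 0.3, size=2)
        v = max(v + a * self.dt, 0.0)
        v1 = max(v1 + a1 * self.dt, 0.0)
        v2 = max(v2 + a2 * self.dt, 0.0)
        gap1 += (v1 - v) * self.dt
        gap2 += (v2 - v) * self.dt
        self.state = np.array([v, gap1, v1, gap2, v2], dtype=np.float32)
        # Illustrative reward: track a 1.5 s time gap to vehicle 1# and
        # penalize harsh acceleration.
        desired = 1.5 * v + 5.0
        reward = -0.1 * (gap1 - desired) ** 2 - 0.5 * a ** 2
        terminated = bool(gap1 <= 0.0)  # collision ends the episode
        return self.state, float(reward), terminated, False, {}

model = PPO("MlpPolicy", CarFollowingEnv(), verbose=0)
model.learn(total_timesteps=10_000)

In the paper, the predecessor motion during training comes from the learned Gaussian-process model rather than the white-noise stand-in used here, which is how the trained policy internalizes the motion uncertainty and needs no separate probabilistic prediction at run time.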
Keywords: automotive engineering  car-following control  deep reinforcement learning  adaptive cruise control  motion uncertainty  Gaussian process
Received: 2019-03-19
