Human-machine integration method for steering decision-making of intelligent vehicle based on reinforcement learning
Citation: WU Chao-zhong, LENG Yao, CHEN Zhi-jun, LUO Peng. Human-machine integration method for steering decision-making of intelligent vehicle based on reinforcement learning[J]. Journal of Traffic and Transportation Engineering, 2022, 22(3): 55-67.
Authors:WU Chao-zhong  LENG Yao  CHEN Zhi-jun  LUO Peng
Affiliation: 1. Intelligent Transportation Systems Research Center, Wuhan University of Technology, Wuhan 430063, Hubei, China; 2. School of Transportation and Logistics Engineering, Wuhan University of Technology, Wuhan 430063, Hubei, China; 3. School of Computer Science and Artificial Intelligence, Wuhan University of Technology, Wuhan 430063, Hubei, China
Foundation items: National Natural Science Foundation of China (52172394); National Key Research and Development Program of China (2018YFB1600600); Major Science and Technology Project of Hubei Province (2020AAA001)
Abstract: To address the continuous dynamic allocation of driving weights between the human driver and the autonomous driving system in the human-machine integration (HMI) driving system of intelligent vehicles, and in particular the low adaptability of weight allocation methods caused by modeling errors, an HMI steering decision-making method based on reinforcement learning was proposed. In view of drivers' steering characteristics, a driver model based on two-point preview was built, and an autonomous steering control model of intelligent vehicles was established using predictive control theory. On this basis, a steering control framework with the human and the machine simultaneously in the loop was constructed. Based on the Actor-Critic reinforcement learning architecture, a deep deterministic policy gradient (DDPG) agent for human-machine driving weight allocation was designed, and a model-based reward function was proposed with curvature fit, tracking accuracy, and ride comfort as objectives. A reinforcement learning framework for HMI driving weight allocation was then constructed, containing the driver model, the autonomous steering model, the driving weight allocation agent, and the reward function. To verify the effectiveness of the proposed method, eight drivers were recruited, and a total of 48 simulated driving trials were carried out. The results show that in the curvature adaptability verification, the HMI-DDPG method outperforms both manual driving and the HMI-Fuzzy method, improving tracking performance by an average of 70.69% and 39.67%, respectively, and ride comfort by an average of 18.34% and 7.55%, respectively. In the speed adaptability verification, at vehicle speeds of 40, 60, and 80 km·h⁻¹, the proportions of time in which the driver's weight exceeds 0.5 are 90.00%, 85.76%, and 60.74%, respectively, and the phase trajectories of both tracking performance and ride comfort converge effectively. Therefore, the proposed method adapts to changes in curvature and vehicle speed, and improves tracking performance and ride comfort while ensuring safety.
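The sketch below is a minimal Python illustration of the shared-control scheme the abstract describes: the DDPG agent outputs a driver weight w that blends the driver's and the autonomous controller's steering commands, and a model-based reward trades off curvature fit, tracking accuracy, and ride comfort. The blending convention, signal names, and k_* weights are illustrative assumptions, not the authors' exact formulation.

import numpy as np

def blend_steering(delta_driver, delta_auto, w):
    # Shared steering command under driver weight w in [0, 1]
    # (assumed convention): delta = w * driver + (1 - w) * autonomous.
    w = float(np.clip(w, 0.0, 1.0))
    return w * delta_driver + (1.0 - w) * delta_auto

def reward(curvature_error, lateral_error, lateral_accel,
           k_curv=1.0, k_track=1.0, k_comfort=0.1):
    # Model-based reward penalizing poor curvature fit, tracking error,
    # and lateral acceleration (ride comfort); the k_* weights are assumed.
    return -(k_curv * curvature_error ** 2
             + k_track * lateral_error ** 2
             + k_comfort * lateral_accel ** 2)

# One control step: a trained DDPG actor (omitted here) would map the
# current driving state to the weight w; the reward drives its training.
delta = blend_steering(delta_driver=0.08, delta_auto=0.05, w=0.7)
r = reward(curvature_error=0.01, lateral_error=0.2, lateral_accel=0.6)
print(f"blended steering: {delta:.4f} rad, reward: {r:.4f}")

In the paper's framework this reward feeds the Actor-Critic updates that train the allocation agent; the full DDPG machinery (actor and critic networks, replay buffer, target networks) is beyond the scope of this sketch.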
Keywords: intelligent vehicle; human-machine integration; steering decision-making; driving weight allocation; reinforcement learning
Received: 2021-12-23