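Lane-change Control for Unmanned Vehicle Based on REINFORCE Algorithm and Neural Network
Cite this article: YAN Hao (闫浩), LIU Xiaozhu (刘小珠), SHI Ying (石英). Lane-change Control for Unmanned Vehicle Based on REINFORCE Algorithm and Neural Network [J]. Journal of Transport Information and Safety (交通信息与安全), 2021, 39(1): 164-172.
Authors: YAN Hao (闫浩), LIU Xiaozhu (刘小珠), SHI Ying (石英)
Affiliation: School of Automation, Wuhan University of Technology, Wuhan 430070
Funding: National Natural Science Foundation of China; Major Technological Innovation Project of Hubei Province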
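Abstract: For lane-change and overtaking scenarios of unmanned vehicles, this paper studies a lane-change control strategy based on the REINFORCE algorithm and neural network techniques. The feedback variables, control variables, and output-limiting requirements are determined from the vehicle dynamics model. The structure of the neural network controller is designed, and a controller training scheme is designed according to the REINFORCE algorithm. The problem of excessively large values and variance in the experience-pool data is analyzed, and a preprocessing method for the experience-pool data is proposed to improve the training scheme. Taking the operating scenarios of unmanned vehicles into account, the sparse reward distribution that arises during reinforcement learning is analyzed, and a reward-shaping solution based on a logarithmic function is proposed for it. Comparative experiments against a PID controller and an LQR controller are carried out for verification. The results show that, compared with PID, the control strategy has a smaller maximum error and a safer lane-change process; compared with LQR, its performance is close, which demonstrates its feasibility for the lane-change control task of unmanned vehicles. In addition, the execution time of the control strategy on different platforms is recorded to demonstrate its real-time performance and the feasibility of running it on lightweight platforms.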

Keywords: traffic control; unmanned vehicle; lane-change control; reinforcement learning
Received: 2020-09-25
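
The abstract does not say how the experience-pool data are preprocessed. One common way to tame large, high-variance returns in REINFORCE is to standardize the discounted returns of an episode before they weight the policy gradient. The sketch below illustrates that generic technique only, not the authors' method; the function names, the discount factor, and the sample rewards are all hypothetical.

```python
from statistics import fmean, pstdev


def discounted_returns(rewards, gamma=0.99):
    """Compute G_t = r_t + gamma * G_{t+1} for one episode (assumed discounting)."""
    returns = []
    running = 0.0
    # Walk the episode backwards so each return reuses the one after it.
    for r in reversed(rewards):
        running = r + gamma * running
        returns.append(running)
    returns.reverse()
    return returns


def standardize(values, eps=1e-8):
    """Shift values to zero mean and scale them to (almost) unit standard deviation."""
    mean = fmean(values)
    std = pstdev(values)
    # eps guards against division by zero when all returns are equal.
    return [(v - mean) / (std + eps) for v in values]


# Hypothetical per-step rewards from one lane-change episode.
episode_rewards = [-120.0, -80.0, -30.0, -5.0]
returns = discounted_returns(episode_rewards, gamma=0.9)
scaled = standardize(returns)
print(returns)
print(scaled)
```

In a REINFORCE update, the standardized values would replace the raw returns as the weights on the log-probability gradients, which keeps the gradient magnitude comparable across episodes.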

Lane-change Control for Unmanned Vehicle Based on REINFORCE Algorithm and Neural Network
YAN Hao, LIU Xiaozhu, SHI Ying. Lane-change Control for Unmanned Vehicle Based on REINFORCE Algorithm and Neural Network [J]. Journal of Transport Information and Safety, 2021, 39(1): 164-172.
Authors: YAN Hao, LIU Xiaozhu, SHI Ying
Institution: School of Automation, Wuhan University of Technology, Wuhan 430070, China
Abstract: This paper studies a lane-change control strategy for unmanned vehicles in lane-change and overtaking scenarios, based on the REINFORCE algorithm and a neural network. The feedback variables, control inputs, and output-limit requirements are determined from the vehicle dynamics model. The structure of the neural network controller is designed, and a training scheme for the controller is designed according to the REINFORCE algorithm. Because the values and variance of the experience-pool data are too large, a preprocessing method for the experience-pool data is proposed to improve the training scheme. The sparse reward distribution that arises during reinforcement learning is analyzed in the context of unmanned-vehicle operating scenarios, and a reward-shaping solution based on a logarithmic function is proposed. The strategy is compared experimentally with PID and LQR controllers. The results show that, compared with PID, the proposed strategy has a smaller maximum error and a safer lane-change process, and that its performance is close to that of LQR, which demonstrates its feasibility for the lane-change control task of unmanned vehicles. In addition, the execution time of the strategy on different platforms is recorded to demonstrate its real-time performance and its feasibility on lightweight platforms.
Keywords: traffic control; unmanned vehicle; lane-change control; reinforcement learning
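
The abstract names a logarithm-based reward shaping but does not give its formula. As a rough illustration of why such shaping helps with sparse rewards, the sketch below contrasts a sparse reward, which is nonzero only near the target, with one possible logarithmic shaping of the lateral error. The functional form, the `scale` and `tolerance` parameters, and the sample errors are assumptions made for this example and should not be read as the authors' design.

```python
import math


def sparse_reward(lateral_error, tolerance=0.1):
    """Reward 1 only when the error is within `tolerance` of the target lane centre."""
    return 1.0 if abs(lateral_error) <= tolerance else 0.0


def log_shaped_reward(lateral_error, scale=1.0):
    """Dense reward: 0 at zero error, decreasing as -log(1 + scale * |error|)."""
    return -math.log1p(scale * abs(lateral_error))


# Hypothetical lateral errors (in metres) over the course of a lane change.
for error in (3.5, 2.0, 0.5, 0.0):
    print(
        f"error={error:.1f}  "
        f"sparse={sparse_reward(error):.1f}  "
        f"shaped={log_shaped_reward(error):.3f}"
    )
```

The sparse reward is identical for every error outside the tolerance, so it gives the policy no signal about progress, whereas the shaped reward increases monotonically as the error shrinks.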
This article is indexed in CNKI, Wanfang Data, and other databases.