
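Track-following control of intelligent ships based on deep reinforcement learning
Cite this article: ZHU Kang, HUANG Zhen, WANG Xuming. Track-following control of intelligent ships based on deep reinforcement learning[J]. Chinese Journal of Ship Research, 2021, 0(1)
Authors: ZHU Kang  HUANG Zhen  WANG Xuming
Affiliations: School of Automation, Wuhan University of Technology; Intelligent Transport System Research Center, Wuhan University of Technology
Funding: National Key R&D Program of China (2018YFB1601500).
Abstract: [Objectives] Track-following control of intelligent ships often has to cope with a complex control environment, low controller stability, and a heavy computational load. To achieve precise track-following control, a heading controller that incorporates deep reinforcement learning is proposed. [Methods] First, combined with line-of-sight (LOS) guidance and based on the ship's maneuvering characteristics and control requirements, the track-following problem is modeled as a Markov decision process, and its state space, action space, and reward function are designed. Then, the deep deterministic policy gradient (DDPG) algorithm is used to implement the controller, which is trained by off-line learning. Finally, the trained controller is compared with a BP-PID controller and the control performance is analyzed. [Results] Simulation results show that the designed deep reinforcement learning controller converges quickly during training and meets the control requirements. Compared with the BP-PID controller, the trained network tracks faster, with smaller yaw error and less frequent rudder angle changes. [Conclusions] The results can serve as a reference for track-following control of intelligent ships.

Keywords: intelligent ships  track-following control  deep reinforcement learning  line-of-sight guidance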

Tracking control of intelligent ship based on deep reinforcement learning
ZHU Kang,HUANG Zhen,WANG Xuming. Tracking control of intelligent ship based on deep reinforcement learning[J]. Chinese Journal of Ship Research (CJSR), 2021, 0(1)
Authors:ZHU Kang  HUANG Zhen  WANG Xuming
Affiliation:(School of Automation, Wuhan University of Technology, Wuhan 430070, China; Intelligent Transport System Research Center, Wuhan University of Technology, Wuhan 430063, China)
Abstract:[Objectives] Tracking control of intelligent ships often faces a complex control environment, low controller stability, and a heavy computational load. To achieve precise tracking control, this paper proposes a heading controller based on deep reinforcement learning (DRL). [Methods] First, combining line-of-sight (LOS) guidance with the maneuvering characteristics and control requirements of ships, the path-following problem is modeled as a Markov decision process, and its state space, action space, and reward function are designed. Then, the deep deterministic policy gradient (DDPG) algorithm is used to implement the controller, which is trained with an off-line learning method. Finally, the trained controller is compared with a BP-PID controller to analyze the control performance. [Results] Simulation results show that the DRL controller converges quickly during training and meets the control requirements. Compared with the BP-PID controller, the trained network tracks faster, with smaller yaw error and less frequent rudder angle changes. [Conclusions] The results can provide a reference for the tracking control of intelligent ships.
Keywords:intelligent ships  tracking control  deep reinforcement learning(DRL)  line-of-sight algorithm
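
The abstract names LOS guidance and a designed reward function but gives neither in detail. As a rough illustration only, the sketch below uses the common lookahead-based form of LOS guidance together with a made-up reward that penalizes cross-track error, heading error, and rudder change. The function names, the angle convention, the lookahead distance, and the reward weights are all assumptions for this example and are not taken from the paper.

```python
import math


def wrap_angle(a):
    """Wrap an angle in radians to the interval [-pi, pi)."""
    return (a + math.pi) % (2.0 * math.pi) - math.pi


def los_guidance(x, y, wp_prev, wp_next, lookahead):
    """Lookahead-based LOS guidance (assumed form, not the paper's).

    Angles are measured from the x-axis toward the y-axis.
    Returns (desired_heading, cross_track_error).
    """
    x0, y0 = wp_prev
    x1, y1 = wp_next
    # Direction of the straight path segment between the two waypoints.
    path_angle = math.atan2(y1 - y0, x1 - x0)
    # Signed lateral offset of the ship from the path segment.
    e = -(x - x0) * math.sin(path_angle) + (y - y0) * math.cos(path_angle)
    # Steer toward a point `lookahead` ahead along the path.
    desired = wrap_angle(path_angle + math.atan2(-e, lookahead))
    return desired, e


def reward(cross_track_error, heading_error, rudder_change,
           w_e=1.0, w_psi=1.0, w_delta=0.1):
    """Hypothetical reward: weights are illustrative placeholders."""
    return -(w_e * abs(cross_track_error)
             + w_psi * abs(heading_error)
             + w_delta * abs(rudder_change))


if __name__ == "__main__":
    # Ship offset 5 m to the +y side of a path running along the x-axis.
    psi_d, e = los_guidance(5.0, 5.0, (0.0, 0.0), (10.0, 0.0), lookahead=5.0)
    print(psi_d, e)
    print(reward(e, wrap_angle(psi_d - 0.0), rudder_change=0.0))
```

In a DDPG setup of the kind the abstract describes, quantities such as the cross-track error and the heading error would plausibly enter the state, with the rudder command as the continuous action. The actual state, action, and reward definitions are given only in the full paper.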
This article is indexed in CNKI, VIP (维普), and other databases.