Journal of Systems Engineering and Electronics ›› 2022, Vol. 33 ›› Issue (1): 170-179. doi: 10.23919/JSEE.2022.000017

• CONTROL THEORY AND APPLICATION •

Reinforcement learning based parameter optimization of active disturbance rejection control for autonomous underwater vehicle

Wanping SONG1, Zengqiang CHEN1,2,*, Mingwei SUN1, Qinglin SUN1

  1. College of Artificial Intelligence, Nankai University, Tianjin 300350, China
    2. Key Laboratory of Intelligent Robotics of Tianjin, Nankai University, Tianjin 300350, China
  • Received: 2020-11-26 Accepted: 2021-11-24 Online: 2022-01-18 Published: 2022-02-22
  • Contact: Zengqiang CHEN E-mail: chenzq@nankai.edu.cn
  • About author:
    SONG Wanping was born in 1998. She received her B.S. degree from Nanjing Agricultural University, Nanjing, China, in 2019. She is currently a graduate student at Nankai University, Tianjin, China. Her current research interests include active disturbance rejection control and reinforcement learning. E-mail: 1422501596@qq.com
    CHEN Zengqiang was born in 1964. He received his B.S., M.E., and Ph.D. degrees from Nankai University in 1987, 1990, and 1997, respectively. He is currently a professor of control theory and engineering at Nankai University and deputy director of the Institute of Robotics and Information Automation. His current research interests include intelligent predictive control, chaotic systems and complex dynamic networks, and multi-agent system control. E-mail: chenzq@nankai.edu.cn
    SUN Mingwei was born in 1972. He received his Ph.D. degree from the Department of Computer and Systems Science, Nankai University, Tianjin, China, in 2000. From 2000 to 2008, he was a flight control engineer with the Beijing Electro-mechanical Engineering Research Institute, Beijing, China. Since 2009, he has been with Nankai University as a professor. His research interests include flight control, guidance, model predictive control, active disturbance rejection control, and nonlinear optimization. E-mail: smw_sunmingwei@163.com
    SUN Qinglin received his B.E. and M.E. degrees in control theory and control engineering from Tianjin University, Tianjin, China, in 1985 and 1990, respectively, and his Ph.D. degree in control science and engineering from Nankai University, Tianjin, China, in 2003. He is currently a professor at the Intelligence Predictive Adaptive Control Laboratory of Nankai University and associate dean of the College of Artificial Intelligence. His research interests include self-adaptive control, modeling and control of flexible spacecraft, and embedded control systems. E-mail: sunql@nankai.edu.cn
  • Supported by:
    This work was supported by the National Natural Science Foundation of China (61973175; 61973172) and Tianjin Natural Science Foundation (19JCZDJC32800).

Abstract:

This paper proposes a linear active disturbance rejection control (LADRC) method based on the Q-learning algorithm of reinforcement learning (RL) to control the six-degree-of-freedom motion of an autonomous underwater vehicle (AUV). The number of controllers is increased to realize AUV motion decoupling. At the same time, to keep the algorithm from growing excessively large, a simplified Q-learning algorithm tailored to the controlled quantities is constructed to realize adaptive tuning of the LADRC parameters. Finally, simulation experiments comparing the fixed-parameter controller with the Q-learning-based controller verify the rationality of the simplified algorithm, the effectiveness of parameter adaptation, and the distinctive advantages of the LADRC controller.
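To make the idea concrete, the following is a minimal illustrative sketch, not the authors' implementation: a tabular Q-learning agent selects the LADRC controller bandwidth wc and observer bandwidth wo from a small candidate set, using a coarse discretization of the tracking error as the state and the negative accumulated tracking error as the reward. The toy first-order plant, the candidate bandwidth pairs, the error binning, and the helper names (error_bin, run_segment) are assumptions for illustration only; the paper applies the scheme to the six-degree-of-freedom AUV model.

```python
# Minimal sketch (not the paper's code): Q-learning-tuned LADRC on a toy first-order plant.
import numpy as np

dt = 0.01                                             # integration step
b0 = 1.0                                              # nominal input gain used by the LESO
actions = [(5.0, 20.0), (10.0, 40.0), (20.0, 80.0)]   # candidate (wc, wo) pairs (assumed)
n_states = 5                                          # coarse bins of |tracking error|

def error_bin(e):
    """Map |tracking error| to one of n_states coarse bins (assumed discretization)."""
    return min(int(abs(e) / 0.2), n_states - 1)

def run_segment(wc, wo, y, z1, z2, r, steps=50):
    """Simulate one segment of first-order LADRC: LESO plus proportional control law."""
    cost = 0.0
    for _ in range(steps):
        u = (wc * (r - z1) - z2) / b0                 # LADRC law: kp = wc, z2 = estimated disturbance
        d = 0.3                                       # constant unknown disturbance the LESO must reject
        y += dt * (-y + b0 * u + d)                   # toy first-order plant
        e_obs = y - z1
        z1 += dt * (z2 + b0 * u + 2.0 * wo * e_obs)   # LESO state 1, beta1 = 2*wo
        z2 += dt * (wo ** 2 * e_obs)                  # LESO state 2, beta2 = wo^2
        cost += abs(r - y) * dt
    return cost, y, z1, z2

Q = np.zeros((n_states, len(actions)))                # tabular Q-function
alpha, gamma, eps = 0.1, 0.9, 0.2                     # learning rate, discount, exploration rate
y, z1, z2 = 0.0, 0.0, 0.0

for episode in range(300):
    r = float(np.random.choice([0.5, 1.0, 1.5]))      # vary the setpoint to visit more error bins
    state = error_bin(r - y)
    if np.random.rand() < eps:
        a = np.random.randint(len(actions))           # explore
    else:
        a = int(np.argmax(Q[state]))                  # exploit current estimate
    wc, wo = actions[a]
    cost, y, z1, z2 = run_segment(wc, wo, y, z1, z2, r)
    reward = -cost                                    # smaller tracking cost -> larger reward
    next_state = error_bin(r - y)
    Q[state, a] += alpha * (reward + gamma * Q[next_state].max() - Q[state, a])

print("greedy (wc, wo) per error bin:", [actions[int(i)] for i in Q.argmax(axis=1)])
```

In this sketch the learned greedy action per error bin plays the role of the adaptive LADRC parameters; the paper's simplified Q-learning similarly restricts the state-action space to what the controlled quantities require, rather than learning the control signal itself.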

Key words: autonomous underwater vehicle (AUV), reinforcement learning (RL), Q-learning, linear active disturbance rejection control (LADRC), motion decoupling, parameter optimization