Journal of Systems Engineering and Electronics ›› 2024, Vol. 35 ›› Issue (2): 374-385. doi: 10.23919/JSEE.2023.000157

• SYSTEMS ENGINEERING •

Real-time UAV path planning based on LSTM network

Jiandong ZHANG1, Yukun GUO1,2, Lihui ZHENG1,3, Qiming YANG1,*, Guoqing SHI1, Yong WU1

    1 School of Electronics and Information, Northwestern Polytechnical University, Xi’an 710129, China
    2 The Flight Automatic Control Research Institute of AVIC, Xi’an 710065, China
    3 Military Representative Office of Marine Wuhan Bureau in Luoyang Area, Luoyang 471000, China
  • Received: 2022-09-15 Online: 2024-04-18 Published: 2024-04-18
  • Contact: Qiming YANG E-mail: jdzhang@nwpu.edu.cn; 2020202124@mail.nwpu.edu.cn; lihuizheng@mail.nwpu.edu.cn; yangqm@nwpu.edu.cn; shiguoqing@nwpu.edu.cn; yongwu@nwpu.edu.cn
  • About author:
    ZHANG Jiandong was born in 1974. He received both his M.S. and Ph.D. degrees in system engineering from Northwestern Polytechnical University. He is an associate professor at the Department of System and Control Engineering in Northwestern Polytechnical University, China. His research interests include modeling simulation and effectiveness evaluation of complex systems, development and design of integrated avionics system, and system measurement & test technologies. E-mail: jdzhang@nwpu.edu.cn

    GUO Yukun was born in 1999. He received his B.S. degree in detection, guidance and control technology from Northwestern Polytechnical University in Xi’an, China. He is currently working toward his M.S. degree in electronic science and technology at the School of Electronics and Information Technology, Northwestern Polytechnical University. His research interests include unmanned aerial vehicle path planning and deep reinforcement learning. E-mail: 2020202124@mail.nwpu.edu.cn

    ZHENG Lihui was born in 1988. He received his B.S. degree in fire command and control engineering from the Naval Aviation University in Yantai, China. He is currently pursuing his M.S. degree in the School of Electronics and Information Technology at Northwestern Polytechnical University. His research interests include advanced firepower and command and control theory, mission planning and combat flight software, artificial intelligence, and multi-unmanned system mission decision technology. E-mail: lihuizheng@mail.nwpu.edu.cn

    YANG Qiming was born in 1988. He received his M.S. degree from Northwestern Polytechnical University (NPU), Xi’an, China in 2013. He was awarded a Ph.D. degree in electronic science and technology in 2020. He is an assistant researcher at NPU. His main research interests include artificial intelligence and its application to the control and decision making of unmanned aerial vehicles. E-mail: yangqm@nwpu.edu.cn

    SHI Guoqing was born in 1974. He received his M.S. and Ph.D. degrees in system engineering from Northwestern Polytechnical University. He is an associate professor at the Department of System and Control Engineering in Northwestern Polytechnical University, China. His research interests include integrated avionics system measurement & test technologies, development and design of embedded real-time systems, modeling simulation and effectiveness evaluation of complex systems. E-mail: shiguoqing@nwpu.edu.cn

    WU Yong was born in 1964. He received his B.S. degree in aeronautical fire control and M.S. degree in fire control from Northwestern Polytechnical University. He is a professor in the Department of Systems and Control Engineering, Northwestern Polytechnical University, China. His research interests include avionics integrated systems and simulation techniques, complex systems modeling, and simulation and effectiveness assessment. E-mail: yongwu@nwpu.edu.cn
  • Supported by:
    This work was supported by the Natural Science Basic Research Program of Shaanxi (2022JQ-593).

Abstract:

To address the shortcomings of single-step decision making in existing deep reinforcement learning based real-time path planning for unmanned aerial vehicles (UAVs), a real-time UAV path planning algorithm based on a long short-term memory network (RPP-LSTM) is proposed, which combines the memory characteristics of recurrent neural networks (RNNs) with a deep reinforcement learning algorithm. In this algorithm, LSTM networks serve as the Q-value networks of the deep Q network (DQN) algorithm, which gives the decisions of the Q-value network a degree of memory. With the LSTM network, the Q-value network can use previous environmental information and action information, which effectively avoids the problem of single-step decisions that consider only the current environment. In addition, the algorithm introduces a hierarchical reward and punishment function tailored to the specific problem of UAV real-time path planning, so that the UAV can plan its path more reasonably. Simulation results show that, compared with the traditional feed-forward neural network (FNN) based UAV autonomous path planning algorithm, the proposed RPP-LSTM can adapt to more complex environments and achieves significantly improved robustness and accuracy in real-time UAV path planning.

Key words: deep Q network, path planning, neural network, unmanned aerial vehicle (UAV), long short-term memory (LSTM)
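
The idea in the abstract of using an LSTM as the Q-value network can be illustrated with a minimal sketch. The code below is not the authors' implementation: the class name `LSTMQNetwork`, the helper functions, the layer sizes, and the random, untrained weights are all assumptions made for illustration. It only shows how a recurrent Q-value network lets the same current observation yield different Q-values depending on the observations that preceded it, which a feed-forward network cannot do.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def matvec(W, v):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, v)) for row in W]


def rand_matrix(rows, cols, rng, scale=0.5):
    return [[rng.uniform(-scale, scale) for _ in range(cols)] for _ in range(rows)]


class LSTMQNetwork:
    """Hypothetical Q-value network: one LSTM cell plus a linear Q-value head."""

    def __init__(self, obs_dim, hidden_dim, n_actions, seed=0):
        rng = random.Random(seed)
        in_dim = obs_dim + hidden_dim
        # One weight matrix and bias per gate:
        # input (i), forget (f), output (o), candidate (g).
        self.W = {k: rand_matrix(hidden_dim, in_dim, rng) for k in "ifog"}
        self.b = {k: [0.0] * hidden_dim for k in "ifog"}
        self.W_q = rand_matrix(n_actions, hidden_dim, rng)
        self.hidden_dim = hidden_dim

    def initial_state(self):
        # Hidden state h and cell state c both start at zero.
        return [0.0] * self.hidden_dim, [0.0] * self.hidden_dim

    def step(self, obs, state):
        """Consume one observation; return Q-values and the updated (h, c)."""
        h, c = state
        x = list(obs) + h
        pre = {k: [a + b for a, b in zip(matvec(self.W[k], x), self.b[k])]
               for k in "ifog"}
        i_gate = [sigmoid(v) for v in pre["i"]]
        f_gate = [sigmoid(v) for v in pre["f"]]
        o_gate = [sigmoid(v) for v in pre["o"]]
        cand = [math.tanh(v) for v in pre["g"]]
        c_new = [f * ck + i * g for f, ck, i, g in zip(f_gate, c, i_gate, cand)]
        h_new = [o * math.tanh(ck) for o, ck in zip(o_gate, c_new)]
        q_values = matvec(self.W_q, h_new)
        return q_values, (h_new, c_new)


def q_values_for_sequence(net, observations):
    """Feed a whole observation history; return Q-values at the last step."""
    state = net.initial_state()
    q_values = None
    for obs in observations:
        q_values, state = net.step(obs, state)
    return q_values


def greedy_action(q_values):
    """Index of the action with the largest Q-value."""
    return max(range(len(q_values)), key=lambda a: q_values[a])


# Same current observation, two different histories (all values are made up).
net = LSTMQNetwork(obs_dim=2, hidden_dim=4, n_actions=3)
current = [0.5, -0.2]
q_a = q_values_for_sequence(net, [[1.0, 0.0], current])
q_b = q_values_for_sequence(net, [[-1.0, 1.0], current])
print(q_a != q_b)  # the history changes the Q-values
```

In a full DQN setup the weights would be trained from replayed experience and the reward would come from the hierarchical reward and punishment function mentioned in the abstract; neither is reproduced here because this page does not give their details.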