Real-time UAV path planning based on LSTM network

doi:10.23919/JSEE.2023.000157

Journal of Systems Engineering and Electronics, 2024, Vol. 35, Issue (2): 374-385.

• SYSTEMS ENGINEERING •

Jiandong ZHANG¹, Yukun GUO¹,², Lihui ZHENG¹,³, Qiming YANG¹,*, Guoqing SHI¹, Yong WU¹

¹ School of Electronics and Information, Northwestern Polytechnical University, Xi’an 710129, China
² The Flight Automatic Control Research Institute of AVIC, Xi’an 710065, China
³ Military Representative Office of Marine Wuhan Bureau in Luoyang Area, Luoyang 471000, China

Received: 2022-09-15 Online: 2024-04-18 Published: 2024-04-18
Contact: Qiming YANG
E-mail: jdzhang@nwpu.edu.cn; 2020202124@mail.nwpu.edu.cn; lihuizheng@mail.nwpu.edu.cn; yangqm@nwpu.edu.cn; shiguoqing@nwpu.edu.cn; yongwu@nwpu.edu.cn
About the authors:
ZHANG Jiandong was born in 1974. He received both his M.S. and Ph.D. degrees in system engineering from Northwestern Polytechnical University. He is an associate professor at the Department of System and Control Engineering, Northwestern Polytechnical University, China. His research interests include modeling, simulation, and effectiveness evaluation of complex systems; development and design of integrated avionics systems; and system measurement and test technologies. E-mail: jdzhang@nwpu.edu.cn

GUO Yukun was born in 1999. He received his B.S. degree in detection, guidance and control technology from Northwestern Polytechnical University in Xi’an, China. He is currently working toward his M.S. degree in electronic science and technology at the School of Electronics and Information Technology, Northwestern Polytechnical University. His research interests include unmanned aerial vehicle path planning and deep reinforcement learning. E-mail: 2020202124@mail.nwpu.edu.cn

ZHENG Lihui was born in 1988. He received his B.S. degree in fire command and control engineering from the Naval Aviation University in Yantai, China. He is currently pursuing his M.S. degree at the School of Electronics and Information Technology, Northwestern Polytechnical University. His research interests include advanced firepower and command and control theory, mission planning and combat flight software, and artificial intelligence and multi-unmanned-system mission decision technology. E-mail: lihuizheng@mail.nwpu.edu.cn

YANG Qiming was born in 1988. He received his M.S. degree from Northwestern Polytechnical University (NPU), Xi’an, China in 2013. He was awarded a Ph.D. degree in electronic science and technology in 2020. He is an assistant researcher at NPU. His main research interests include artificial intelligence and its application to the control and decision making of unmanned aerial vehicles. E-mail: yangqm@nwpu.edu.cn

SHI Guoqing was born in 1974. He received his M.S. and Ph.D. degrees in system engineering from Northwestern Polytechnical University. He is an associate professor at the Department of System and Control Engineering, Northwestern Polytechnical University, China. His research interests include integrated avionics system measurement and test technologies; development and design of embedded real-time systems; and modeling, simulation, and effectiveness evaluation of complex systems. E-mail: shiguoqing@nwpu.edu.cn

WU Yong was born in 1964. He received his B.S. degree in aeronautical fire control and M.S. degree in fire control from Northwestern Polytechnical University. He is a professor in the Department of Systems and Control Engineering, Northwestern Polytechnical University, China. His research interests include avionics integrated systems and simulation techniques, complex systems modeling, and simulation and effectiveness assessment. E-mail: yongwu@nwpu.edu.cn
Supported by:
This work was supported by the Natural Science Basic Research Program of Shaanxi (2022JQ-593).

Abstract

To address the shortcomings of single-step decision making in existing deep reinforcement learning based methods for real-time unmanned aerial vehicle (UAV) path planning, a real-time UAV path planning algorithm based on a long short-term memory network (RPP-LSTM) is proposed, which combines the memory characteristics of recurrent neural networks (RNNs) with a deep reinforcement learning algorithm. In this algorithm, LSTM networks serve as the Q-value networks of the deep Q network (DQN) algorithm, which gives the decisions of the Q-value network a degree of memory. Thanks to the LSTM network, the Q-value network can use previous environmental and action information, which avoids the problem of single-step decisions that consider only the current environment. In addition, the algorithm introduces a hierarchical reward and punishment function tailored to real-time UAV path planning, so that the UAV can plan paths more reasonably. Simulation results show that, compared with the traditional feed-forward neural network (FNN) based UAV autonomous path planning algorithm, the proposed RPP-LSTM adapts to more complex environments and is significantly more robust and accurate in real-time UAV path planning.
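
To make the idea of a Q-value network with memory concrete, the following is a minimal sketch of a single LSTM cell followed by a linear Q-value head, written with the Python standard library only. It is not the network used in the paper: the class name `TinyLSTMQNet`, the layer sizes, the random initialization, and the helper `greedy_action` are all assumptions chosen for illustration, and no training loop is shown.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


class TinyLSTMQNet:
    """Toy recurrent Q-value network (hypothetical): one LSTM cell + linear head."""

    def __init__(self, obs_dim, hidden_dim, n_actions, seed=0):
        rng = random.Random(seed)

        def mat(rows, cols):
            return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]

        in_dim = obs_dim + hidden_dim
        # One weight matrix per gate: input (i), forget (f), output (o), candidate (g).
        self.W = {k: mat(hidden_dim, in_dim) for k in "ifog"}
        self.b = {k: [0.0] * hidden_dim for k in "ifog"}
        # Linear head mapping the hidden state to one Q-value per action.
        self.W_q = mat(n_actions, hidden_dim)
        self.hidden_dim = hidden_dim

    def initial_state(self):
        """Zero hidden state h and cell state c (no memory yet)."""
        return [0.0] * self.hidden_dim, [0.0] * self.hidden_dim

    def step(self, obs, state):
        """Consume one observation; return Q-values and the updated (h, c)."""
        h, c = state
        x = list(obs) + h  # current observation concatenated with previous hidden state

        def affine(k):
            return [sum(w * v for w, v in zip(row, x)) + bias
                    for row, bias in zip(self.W[k], self.b[k])]

        i = [sigmoid(z) for z in affine("i")]
        f = [sigmoid(z) for z in affine("f")]
        o = [sigmoid(z) for z in affine("o")]
        g = [math.tanh(z) for z in affine("g")]
        # Cell state keeps part of the old memory and adds gated new information.
        c_new = [fj * cj + ij * gj for fj, cj, ij, gj in zip(f, c, i, g)]
        h_new = [oj * math.tanh(cj) for oj, cj in zip(o, c_new)]
        q = [sum(w * v for w, v in zip(row, h_new)) for row in self.W_q]
        return q, (h_new, c_new)


def greedy_action(q_values):
    """Index of the action with the largest Q-value."""
    return max(range(len(q_values)), key=lambda a: q_values[a])
```

Calling `step` repeatedly while passing the returned state back in lets earlier observations influence later Q-values; the same observation evaluated from `initial_state()` and from a state built up over previous steps will in general yield different Q-values, which is the memory effect the abstract describes.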

Key words: deep Q network, path planning, neural network, unmanned aerial vehicle (UAV), long short-term memory (LSTM)

Jiandong ZHANG, Yukun GUO, Lihui ZHENG, Qiming YANG, Guoqing SHI, Yong WU. Real-time UAV path planning based on LSTM network[J]. Journal of Systems Engineering and Electronics, 2024, 35(2): 374-385.

Table 1
Reward type	Priority
Collision reward	1
Arrival reward	2
Track angle reward	3
Distance reward	4

Table 2
Parameter	Value
Collision reward	−10
Arrival reward	20
Track angle reward	5
Distance reward	5
$\gamma$	0.8
$\alpha$	0.8
$D_{\rm safe}$/km	0.2
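
The priorities and values tabulated above suggest how a hierarchical reward could be evaluated. The sketch below is one possible reading, not the paper's formula: it assumes that a higher-priority reward pre-empts all lower ones, that a collision is declared when the obstacle distance falls below $D_{\rm safe}$, and that the track angle and distance terms are scaled shaping terms. The function name, the `arrival_radius_km` and `max_step_km` parameters, and the cosine and progress shaping are hypothetical.

```python
import math

# Values taken from the parameter table; how they combine is an assumption.
COLLISION_REWARD = -10.0
ARRIVAL_REWARD = 20.0
TRACK_ANGLE_REWARD = 5.0
DISTANCE_REWARD = 5.0
D_SAFE_KM = 0.2


def hierarchical_reward(obstacle_dist_km, goal_dist_km, prev_goal_dist_km,
                        heading_error_rad, arrival_radius_km=0.5, max_step_km=0.5):
    """Return a reward by checking conditions in priority order (hypothetical)."""
    # Priority 1: collision overrides everything else.
    if obstacle_dist_km < D_SAFE_KM:
        return COLLISION_REWARD
    # Priority 2: arrival at the goal.
    if goal_dist_km <= arrival_radius_km:
        return ARRIVAL_REWARD
    # Priority 3: track angle term, largest when heading straight at the goal.
    angle_term = TRACK_ANGLE_REWARD * math.cos(heading_error_rad)
    # Priority 4: distance term, proportional to progress toward the goal,
    # clipped to [-1, 1] so it never exceeds the tabulated magnitude.
    progress = (prev_goal_dist_km - goal_dist_km) / max_step_km
    progress = max(-1.0, min(1.0, progress))
    distance_term = DISTANCE_REWARD * progress
    return angle_term + distance_term
```

Under these assumptions, the two shaping terms together can never exceed 10, so the arrival reward of 20 remains the largest positive signal.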

Table 3
Evaluation indicator	FNN	LSTM
Path length/km	24.3	24.12
Trajectory yaw angle variance	738.85	188.58
Minimum distance to obstacle/m	430.7	354.4

Table 4
Evaluation indicator	FNN	LSTM
Path length/km	21.24	21.24
Trajectory yaw angle variance	269.79	257.85

Table 5
Evaluation indicator	FNN	LSTM
Path length/km	Not completed	25.2
Trajectory yaw angle variance	Not completed	363.1
Minimum distance to obstacle/m	Not completed	344.7
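
The three evaluation indicators used in the comparison tables can be computed from a planned trajectory roughly as sketched below. The exact definitions used in the paper are not given on this page, so the sketch assumes a 2D path given as a list of waypoints, circular obstacles given as (center, radius) pairs, and yaw angle taken as the heading of each path segment in degrees. All function names are hypothetical.

```python
import math
from statistics import pvariance


def path_length(points):
    """Sum of straight-line segment lengths along the waypoint sequence."""
    return sum(math.dist(p, q) for p, q in zip(points, points[1:]))


def yaw_angles_deg(points):
    """Heading of each segment in degrees, measured from the x-axis."""
    return [math.degrees(math.atan2(q[1] - p[1], q[0] - p[0]))
            for p, q in zip(points, points[1:])]


def yaw_angle_variance(points):
    """Population variance of segment headings (needs at least two waypoints).

    Headings near +/-180 degrees are not unwrapped here, which a real
    evaluation would need to handle.
    """
    return pvariance(yaw_angles_deg(points))


def min_obstacle_distance(points, obstacles):
    """Smallest clearance between any waypoint and any obstacle boundary."""
    return min(math.dist(p, center) - radius
               for p in points for center, radius in obstacles)
```

A lower yaw angle variance indicates a smoother trajectory with fewer sharp heading changes, which is how the FNN and LSTM columns in the tables can be compared.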
