DQN-based decentralized multi-agent JSAP resource allocation for UAV swarm communication

doi:10.23919/JSEE.2023.000045

Journal of Systems Engineering and Electronics, 2023, Vol. 34, Issue (2): 289-298. doi: 10.23919/JSEE.2023.000045

• ELECTRONICS TECHNOLOGY •

Jie LI, Xiaoyu DANG, Sai LI

¹ College of Electronic and Information Engineering, Nanjing University of Aeronautics and Astronautics, Nanjing 211106, China

Received: 2021-12-03 Online: 2023-04-18 Published: 2023-04-18
Contact: Xiaoyu DANG E-mail: lj2022@nuaa.edu.cn; dang@nuaa.edu.cn; li_sai@nuaa.edu.cn
About author:
LI Jie was born in 1985. She received her B.S. degree in electronic information and engineering and her M.S. degree in signal and information processing from University of Jinan, Ji’nan, China, in 2009 and 2011, respectively. She is now pursuing her Ph.D. degree at Nanjing University of Aeronautics and Astronautics. Her research interests are UAV swarm communications, wireless communication, and signal processing. E-mail: lj2022@nuaa.edu.cn

DANG Xiaoyu was born in 1973. He received his Ph.D. degree in electrical engineering from Brigham Young University, Provo, UT, USA. He is a professor in the College of Electronic and Information Engineering, Nanjing University of Aeronautics and Astronautics. His research interests include deep space communications, satellite positioning and navigation, unmanned aerial vehicle swarm communications, and aeronautical and astronautical telemetry. E-mail: dang@nuaa.edu.cn

LI Sai was born in 1993. He received his B.S. degree in electronic science and technology from Henan University of Engineering in 2016, and his M.S. degree in electronic science and technology from North China University of Technology in 2019. He is pursuing his Ph.D. degree at Nanjing University of Aeronautics and Astronautics. His research interests include unmanned aerial vehicle communication and non-orthogonal multiple access. E-mail: li_sai@nuaa.edu.cn
Supported by:
This work was supported by the National Natural Science Foundation of China (62031017; 61971221).

Abstract

Maximizing capacity while satisfying the transmission delay requirement is essential for an unmanned aerial vehicle (UAV) swarm communication system. To address this challenge, a dynamic decentralized optimization mechanism based on deep Q-learning networks (DQNs) is presented for joint spectrum and power (JSAP) resource allocation. Each UAV-to-UAV (U2U) link is regarded as an agent capable of identifying the optimal spectrum and power for communication. A convolutional neural network, a target network, and experience replay are adopted during training. Simulation results indicate that the proposed method can improve both the communication capacity and the probability of successful data transmission compared with random centralized assignment and multichannel access methods.

Key words: joint spectrum and power (JSAP), unmanned aerial vehicle (UAV) swarm communication, deep Q-learning network (DQN), UAV to UAV (U2U)
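The training ingredients named in the abstract (a joint spectrum-and-power action per U2U agent, experience replay, and a target network) can be illustrated with a minimal sketch. This is not the authors' implementation: the power discretization `POWER_LEVELS_DBM`, the helper names, and the discount factor in the usage are assumptions made for illustration, and the Q-values are passed in as plain lists instead of coming from a convolutional network. Only the 20 spectrum bands and the 23 dBm maximum power come from this page.

```python
import random
from collections import deque

N_BANDS = 20                      # number of spectrum bands (from the parameter table)
POWER_LEVELS_DBM = [23, 10, 5]    # ASSUMED discretization; only the 23 dBm maximum is from the source
N_ACTIONS = N_BANDS * len(POWER_LEVELS_DBM)


def decode_action(index):
    """Map a flat action index to a joint (band, power_dBm) choice."""
    band, level = divmod(index, len(POWER_LEVELS_DBM))
    return band, POWER_LEVELS_DBM[level]


def epsilon_greedy(q_values, epsilon, rng=random):
    """Explore with probability epsilon, otherwise pick the highest Q-value."""
    if rng.random() < epsilon:
        return rng.randrange(len(q_values))
    return max(range(len(q_values)), key=q_values.__getitem__)


class ReplayBuffer:
    """Fixed-size experience replay; the oldest transitions are dropped first."""

    def __init__(self, capacity):
        self.buffer = deque(maxlen=capacity)

    def push(self, state, action, reward, next_state):
        self.buffer.append((state, action, reward, next_state))

    def sample(self, batch_size, rng=random):
        return rng.sample(list(self.buffer), batch_size)

    def __len__(self):
        return len(self.buffer)


def td_target(reward, next_q_from_target_net, gamma):
    """Regression target y = r + gamma * max_a' Q_target(s', a')."""
    return reward + gamma * max(next_q_from_target_net)
```

In a full agent, each U2U link would hold its own online network and a periodically synchronized target network, sample minibatches from its `ReplayBuffer`, and regress the online Q-value of the taken action toward `td_target`.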

Jie LI, Xiaoyu DANG, Sai LI. DQN-based decentralized multi-agent JSAP resource allocation for UAV swarm communication[J]. Journal of Systems Engineering and Electronics, 2023, 34(2): 289-298.

Parameter	Value
Carrier frequency/GHz	1
Number of spectrum bands	20
Bandwidth/Mbps	2
Maximum transmission power/dBm	23
Gaussian noise power/dBm	−96
Maximum UAV speed/(m·s⁻¹)	10
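To make the power figures in the table concrete, the sketch below converts dBm to watts and evaluates the Shannon spectral efficiency of a single link. It is an illustration under stated assumptions, not the paper's channel model: the channel gain `gain_db` and the interference term are hypothetical inputs, and the result is given per hertz so that no assumption about the bandwidth entry is needed.

```python
import math


def dbm_to_watt(p_dbm):
    """Convert a power level from dBm to watts (30 dBm = 1 W)."""
    return 10 ** ((p_dbm - 30) / 10)


def spectral_efficiency(tx_dbm, gain_db, noise_dbm, interference_watt=0.0):
    """Shannon spectral efficiency log2(1 + SINR) in bit/s/Hz.

    gain_db is a HYPOTHETICAL end-to-end channel gain (negative for loss);
    interference_watt is the summed power of co-channel links, if any.
    """
    signal = dbm_to_watt(tx_dbm + gain_db)
    noise = dbm_to_watt(noise_dbm)
    return math.log2(1 + signal / (noise + interference_watt))


# Table values: 23 dBm maximum transmit power, -96 dBm Gaussian noise power.
max_tx_watt = dbm_to_watt(23)                  # roughly 0.2 W
efficiency = spectral_efficiency(23, -100, -96)  # -100 dB gain is an assumed example
```

When two links share a band, the second link's received power enters `interference_watt`, which lowers the efficiency; this coupling is what makes the spectrum and power choices of the agents interdependent.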
