Journal of Systems Engineering and Electronics ›› 2025, Vol. 36 ›› Issue (4): 985-993. doi: 10.23919/JSEE.2025.000111

• SYSTEMS ENGINEERING •

Adaptive dwell scheduling based on Q-learning for multifunctional radar system

Siyu HENG, Ting CHENG, Zishu HE, Yuanqing WANG, Luqing LIU

  • Received: 2023-08-04 Online: 2025-08-18 Published: 2025-09-04
  • Contact: Ting CHENG E-mail: sy_heng1999@qq.com; citrus@uestc.edu.cn; zshe@uestc.edu.cn; wyq13069883010@163.com; 2421305188@qq.com
  • About author:
    HENG Siyu was born in 1999. She received her B.S. degree from Jilin University, Changchun, China, in 2021. She is currently pursuing her M.S. degree with the School of Information and Communication Engineering, University of Electronic Science and Technology of China, Chengdu, China. Her research interests include radar dwell scheduling and radar resource management. E-mail: sy_heng1999@qq.com

    CHENG Ting was born in 1982. She received her B.S. and Ph.D. degrees in electronics engineering from the University of Electronic Science and Technology of China, Chengdu, China, in 2006 and 2008, respectively. She is currently an associate professor with the School of Information and Communication Engineering, University of Electronic Science and Technology of China. Her research interests include target tracking, radar dwell scheduling, radar resource management, and cognitive radar. E-mail: citrus@uestc.edu.cn

    HE Zishu was born in 1962. He received his B.S., M.S., and Ph.D. degrees in signal and information processing from the University of Electronic Science and Technology of China (UESTC), Chengdu, China, in 1984, 1988, and 2000, respectively. He is currently a professor of signal and information processing in the School of Electronic Engineering, UESTC. His research interests include array signal processing, digital beamforming, the theory of multiple-input multiple-output (MIMO) communication and MIMO radar, adaptive signal processing, and interference cancellation. E-mail: zshe@uestc.edu.cn

    WANG Yuanqing was born in 1997. He received his B.S. degree from Harbin Engineering University, Harbin, China, in 2020. He is currently pursuing his M.S. degree with the School of Information and Communication Engineering, University of Electronic Science and Technology of China, Chengdu, China. His research interests include suppression jamming and radar resource management. E-mail: wyq13069883010@163.com

    LIU Luqing was born in 2001. She received her B.S. degree from the Wuhan University of Technology, Wuhan, China, in 2022. She is currently pursuing her M.S. degree with the School of Information and Communication Engineering, University of Electronic Science and Technology of China, Chengdu, China. Her research interests include radar dwell scheduling and radar resource management. E-mail: 2421305188@qq.com
  • Supported by:
    This work was supported by the National Natural Science Foundation of China (61771095; 62031007).

Abstract:

The dwell scheduling problem for a multifunctional radar system is formulated as an optimization problem. To solve it, the dwell scheduling process within a scheduling interval (SI) is modeled as a Markov decision process (MDP), for which the state, action, and reward are specified. Specifically, the action is defined as scheduling the task on the left side, on the right side, or in the middle of the radar idle timeline, which effectively reduces the action space and accelerates the convergence of training. In this way, a model-free reinforcement learning framework is established. An adaptive dwell scheduling method based on Q-learning is then proposed, in which the converged Q-value table obtained from training guides the scheduling process. Simulation results demonstrate that, compared with existing dwell scheduling algorithms, the proposed method achieves better scheduling performance when the urgency, importance, and desired execution time criteria are considered together. The average running time indicates that the proposed algorithm meets real-time requirements.
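
The idea described in the abstract, choosing among left, middle, and right placement on the idle timeline and learning a Q-value table for that choice, can be illustrated with a minimal tabular Q-learning sketch. This is not the authors' implementation: the state encoding (task index plus the tuple of idle gaps), the reward (priority minus a penalty for shifting away from the desired start time, and a negative priority for a dropped task), the hyperparameters, and all function names are assumptions made only for illustration.

```python
import random
from collections import defaultdict

ACTIONS = ("left", "middle", "right")  # where to put a dwell inside an idle gap


def place(idle, dwell, action):
    """Put a dwell into the first idle gap that can hold it.

    idle is a tuple of (begin, end) gaps. Returns (start, new_idle),
    or (None, idle) when no gap is long enough and the task is dropped.
    """
    for i, (a, b) in enumerate(idle):
        if b - a >= dwell:
            if action == "left":
                start = a
            elif action == "right":
                start = b - dwell
            else:
                start = a + (b - a - dwell) // 2
            gaps = tuple(g for g in ((a, start), (start + dwell, b)) if g[1] > g[0])
            return start, idle[:i] + gaps + idle[i + 1:]
    return None, idle


def step(state, action, tasks):
    """One MDP transition. Assumed toy reward: priority minus a time-shift penalty."""
    k, idle = state
    dwell, desired, priority = tasks[k]
    start, new_idle = place(idle, dwell, action)
    if start is None:
        reward = -priority  # assumed penalty for a dropped task
    else:
        reward = priority - 0.1 * abs(start - desired)
    return (k + 1, new_idle), reward, k + 1 == len(tasks), start


def train(tasks, horizon, episodes=3000, alpha=0.1, gamma=0.9, epsilon=0.2, seed=0):
    """Tabular Q-learning with epsilon-greedy exploration."""
    rng = random.Random(seed)
    q = defaultdict(float)  # maps (state, action) to a Q value
    for _ in range(episodes):
        state, done = (0, ((0, horizon),)), False
        while not done:
            if rng.random() < epsilon:
                action = rng.choice(ACTIONS)
            else:
                action = max(ACTIONS, key=lambda a: q[(state, a)])
            nxt, reward, done, _ = step(state, action, tasks)
            best_next = 0.0 if done else max(q[(nxt, a)] for a in ACTIONS)
            q[(state, action)] += alpha * (reward + gamma * best_next - q[(state, action)])
            state = nxt
    return q


def greedy_schedule(q, tasks, horizon):
    """Follow the trained table greedily; returns one start time (or None) per task."""
    state, done, starts = (0, ((0, horizon),)), False, []
    while not done:
        action = max(ACTIONS, key=lambda a: q[(state, a)])
        state, _, done, start = step(state, action, tasks)
        starts.append(start)
    return starts


if __name__ == "__main__":
    # Hypothetical tasks: (dwell length, desired start time, priority).
    tasks = [(4, 6, 3.0), (3, 0, 2.0), (3, 3, 1.0)]
    q_table = train(tasks, horizon=10)
    print(greedy_schedule(q_table, tasks, horizon=10))
```

Under these toy rewards, the highest-return schedule starts the three tasks at times 6, 0, and 3, so every task runs at its desired time. Restricting the agent to three placements per task keeps the table small, which is the action-space reduction the abstract refers to.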

Key words: multifunctional radar, dwell scheduling, reinforcement learning, Q-learning