Journal of Systems Engineering and Electronics ›› 2020, Vol. 31 ›› Issue (4): 734-742.doi: 10.23919/JSEE.2020.000048

• Systems Engineering •

Deep reinforcement learning and its application in autonomous fitting optimization for attack areas of UCAVs

Yue LI1,2, Xiaohui QIU2,*, Xiaodong LIU3, Qunli XIA1

  1. 1 School of Aerospace Engineering, Beijing Institute of Technology, Beijing 100081, China
    2 Science and Technology on Electro-Optic Control Laboratory, Luoyang 471000, China
    3 Beijing Aerospace Automatic Control Research Institute, Beijing 100854, China
  • Received:2019-11-12 Online:2020-08-25 Published:2020-08-25
  • Contact: Xiaohui QIU E-mail:liyue627167955@163.com;qiuxh759@163.com;k.start@163.com;1010@bit.edu.cn
  • About author: LI Yue was born in 1995. He received his B.E. degree from Beijing Institute of Technology in 2016. He is currently a doctoral student in the School of Aerospace Engineering, Beijing Institute of Technology. His main research interests include flight vehicle design, guidance and control. E-mail: liyue627167955@163.com | QIU Xiaohui was born in 1975. He received his B.E. degree in industrial automation from Xi'an Jiaotong University in 1997, and M.E. degree in electronic information engineering from Northwestern Polytechnical University in 2008. He is a senior engineer in the Science and Technology on Electro-Optic Control Laboratory. He is currently a doctoral student in the School of Aerospace Engineering, Beijing Institute of Technology. His research interest is system simulation. E-mail: qiuxh759@163.com | LIU Xiaodong was born in 1987. He received his B.E. degree from Qingdao University in 2008, and Ph.D. degree in automation control science and engineering from Beihang University in 2013. From 2013 to 2015, he was a post-doctoral researcher with the Beijing Aerospace Automatic Control Institute, Beijing, China. Currently, he is a senior engineer of the Beijing Aerospace Automatic Control Institute. His research interests include adaptive control and integrated guidance and control for aircraft. E-mail: k.start@163.com | XIA Qunli was born in 1971. He received his B.E. degree in launcher design, M.E. degree in flight mechanics, and Ph.D. degree in aircraft design from Beijing Institute of Technology in 1993, 1996, and 1999, respectively. He is currently an adjunct professor with Beijing Institute of Technology. His research interests are control and guidance technology. E-mail: 1010@bit.edu.cn
  • Supported by:
    This work was supported by the Key Laboratory of Defense Science and Technology Foundation of Luoyang Electro-optical Equipment Research Institute (6142504200108)

Abstract:

The ever-changing battlefield environment requires robust, adaptive technologies integrated into a reliable platform. Unmanned combat aerial vehicles (UCAVs) aim to integrate such advanced technologies while increasing the tactical capabilities of combat aircraft. Conventional UCAVs use a neural-network fitting strategy to obtain attack-area values; however, this simple strategy can neither cope with complex environmental changes nor autonomously optimize decision-making. To address this problem, this paper proposes a new strategy based on the deep deterministic policy gradient (DDPG) algorithm, a deep reinforcement learning method, for fitting the attack areas of UCAVs on the future battlefield. Simulation results show that the training process converges quickly and that the new DDPG algorithm improves the autonomy and environmental adaptability of UCAVs. With the well-trained deep network, the optimal attack-area values can be obtained in real time throughout the flight.
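To make the two coupled updates at the heart of DDPG concrete, the sketch below is an illustrative toy, not the paper's networks or attack-area model: a linear actor a = θs and a quadratic linear-in-features critic Q(s, a) are trained on an assumed one-step task with reward r = -(a - s)², whose optimal deterministic policy is a* = s (i.e., θ → 1). The critic is regressed toward the observed return, and the actor follows the deterministic policy gradient ∂Q/∂a · ∂a/∂θ; replay buffers and target networks of the full DDPG algorithm are omitted for brevity.

```python
import numpy as np

rng = np.random.default_rng(0)

def phi(s, a):
    """Critic features, quadratic in (s, a) so Q can represent r exactly."""
    return np.array([1.0, s, s * s, a, a * a, s * a])

theta = 0.0                            # actor parameter: a = theta * s
w = np.zeros(6)                        # critic weights: Q(s, a) = w . phi(s, a)
alpha, beta, sigma = 0.02, 0.03, 0.3   # actor lr, critic lr, exploration noise

for _ in range(5000):
    s = rng.uniform(-1.0, 1.0)                      # sample a state
    a = theta * s + sigma * rng.standard_normal()   # noisy exploratory action
    r = -(a - s) ** 2                               # one-step reward (gamma = 0)

    # Critic update: regression step toward the observed return.
    w += beta * (r - w @ phi(s, a)) * phi(s, a)

    # Actor update: deterministic policy gradient, ascend dQ/da * da/dtheta.
    a_pi = theta * s
    dq_da = w[3] + 2.0 * w[4] * a_pi + w[5] * s
    theta += alpha * dq_da * s

print(f"learned actor gain theta = {theta:.3f} (optimal is 1.0)")
```

The actor never sees the reward directly: it improves only by climbing the critic's estimate of Q, which is the defining design choice of DDPG that the paper exploits for continuous attack-area optimization.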

Key words: attack area, neural network, deep deterministic policy gradient (DDPG), unmanned combat aerial vehicle (UCAV)