Journal of Systems Engineering and Electronics ›› 2021, Vol. 32 ›› Issue (3): 642-657.doi: 10.23919/JSEE.2021.000055

• SYSTEMS ENGINEERING •

A single-task and multi-decision evolutionary game model based on multi-agent reinforcement learning

Ye MA1,*, Tianqing CHANG1, Wenhui FAN2

  1 Academy of Army Armored Force, Beijing 100072, China
  2 Department of Automation, Tsinghua University, Beijing 100084, China
  • Received: 2020-03-03 Online: 2021-06-18 Published: 2021-07-26
  • Contact: Ye MA
  • About author:
    MA Ye was born in 1993. She received her master’s degree from the Academy of Army Armored Force, Beijing, in 2017. Currently, she is a Ph.D. candidate at the Academy of Army Armored Force. Her research interests include intelligent technology of control systems, and modeling and simulation of complex systems.
    CHANG Tianqing was born in 1963. He received his Ph.D. degree in concurrent engineering from Tsinghua University in 1999. Since 2000, he has been a professor with the Academy of Army Armored Force. His current research interests include target detection and recognition, as well as navigation, guidance and control.
    FAN Wenhui was born in 1968. He received his Ph.D. degree in control science and engineering from Zhejiang University in 1998. Currently, he is a professor, doctoral supervisor, and vice-director in the Department of Automation, Tsinghua University. His research interests include modeling and simulation of complex systems, product information integration modeling technology, product lifecycle management technology, and collaborative design platform technology.
  • Supported by:
    This work was supported by the National Key R&D Program of China (2017YFB1400105).


In an evolutionary game in which a group pursues a single shared task, changes in the game rules, personal interests, group size, and external supervision have uncertain effects on individual decision-making and on the game results. Within the Markov decision framework, a single-task multi-decision evolutionary game model based on multi-agent reinforcement learning is proposed to explore the evolutionary rules of the game process. The model can improve the result of an evolutionary game and facilitate the completion of the task. First, based on multi-agent theory, a negative feedback tax penalty mechanism is proposed to solve the existing problems in the original model and to guide the strategy selection of individuals in the group. In addition, a method for calculating the group intelligence level is defined to evaluate the evolutionary game results of the group in the model. Second, the Q-learning algorithm is used to improve the guiding effect of the negative feedback tax penalty mechanism. In the model, the selection strategy of the Q-learning algorithm is improved, and a bounded rationality evolutionary game strategy is proposed based on the rules of evolutionary games and the bounded rationality of individuals. Finally, simulation results show that, under different negative feedback factor values and different group sizes, the proposed model can effectively guide individuals to choose cooperation strategies that benefit task completion and stability, thereby improving the group intelligence level.

Key words: multi-agent, reinforcement learning, evolutionary game, Q-learning