
Journal of Systems Engineering and Electronics ›› 2021, Vol. 32 ›› Issue (3): 642-657.doi: 10.23919/JSEE.2021.000055
• SYSTEMS ENGINEERING •

Ye MA1,*, Tianqing CHANG1, Wenhui FAN2
Received: 2020-03-03
Online: 2021-06-18
Published: 2021-07-26
Contact: Ye MA
E-mail: mayegf@126.com; oliver_chan1214@126.com; fanwenhui@tsinghua.edu.cn

Ye MA, Tianqing CHANG, Wenhui FAN. A single-task and multi-decision evolutionary game model based on multi-agent reinforcement learning[J]. Journal of Systems Engineering and Electronics, 2021, 32(3): 642-657.
 
													
Table 2 Percentage of each strategy

| Algorithm | Proportion of cooperation strategies | Proportion of competition strategies | Proportion of inaction strategies |
| --- | --- | --- | --- |
| Algorithm of this article | 0.54 | 0.23 | 0.23 |
| Nash-Q learning algorithm | 0.51 | 0.25 | 0.24 |
| Monte Carlo method | 0.47 | 0.27 | 0.26 |
| Genetic algorithm | 0.45 | 0.26 | 0.29 |