Journal of Systems Engineering and Electronics, 2021, Vol. 32, Issue 3: 642-657. doi: 10.23919/JSEE.2021.000055
• SYSTEMS ENGINEERING •
Ye MA1,*, Tianqing CHANG1, Wenhui FAN2
Received: 2020-03-03
Online: 2021-06-18
Published: 2021-07-26
Contact: Ye MA
E-mail: mayegf@126.com; oliver_chan1214@126.com; fanwenhui@tsinghua.edu.cn
Ye MA, Tianqing CHANG, Wenhui FAN. A single-task and multi-decision evolutionary game model based on multi-agent reinforcement learning[J]. Journal of Systems Engineering and Electronics, 2021, 32(3): 642-657.
Table 2 Percentage of each strategy
Algorithm | Proportion of cooperation strategies | Proportion of competition strategies | Proportion of inaction strategies |
Algorithm proposed in this article | 0.54 | 0.23 | 0.23 |
Nash-Q learning algorithm | 0.51 | 0.25 | 0.24 |
Monte Carlo method | 0.47 | 0.27 | 0.26 |
Genetic algorithm | 0.45 | 0.26 | 0.29 |