Equilibrium learning for multi-stage cyber-physical multi-domain security game in island air defense

doi:10.23919/JSEE.2024.000006

Journal of Systems Engineering and Electronics, 2026, Vol. 37, Issue 2: 567-578. doi: 10.23919/JSEE.2024.000006

• SYSTEMS ENGINEERING •

Weilin YUAN¹, Shaofei CHEN²,*, Lina LU², Zhenzhen HU², Yu XIE², Jing CHEN²

¹College of Information and Communication, National University of Defense Technology, Wuhan 430014, China
²College of Intelligence Science and Technology, National University of Defense Technology, Changsha 410073, China

Received: 2023-07-06; Online: 2026-04-18; Published: 2026-04-30
Contact: Shaofei CHEN. E-mail: yuanweilin12@nudt.edu.cn; chensf005@163.com; lulina16@nudt.edu.cn; hzzmail@163.com; xieyu_nudt@139.com; Chenjing001@vip.sina.com
About the authors:
YUAN Weilin was born in 1994. He received his B.S., M.S., and Ph.D. degrees in control science and engineering from National University of Defense Technology, Changsha, China, in 2012, 2016, and 2023, respectively. He is a lecturer at National University of Defense Technology. His research interests include cognitive decision-making, intelligent gaming, opponent modeling, reinforcement learning, and multi-agent systems. E-mail: yuanweilin12@nudt.edu.cn

CHEN Shaofei was born in 1987. He received his B.S. degree from Harbin Institute of Technology, Harbin, China, in 2009, and his M.S. and Ph.D. degrees in control science and engineering from National University of Defense Technology, Changsha, in 2011 and 2016, respectively. Since 2019, he has been an associate professor with the College of Intelligence Science and Technology, National University of Defense Technology. His research interests include artificial intelligence, multi-agent systems, and reinforcement learning. E-mail: chensf005@163.com

LU Lina was born in 1984. She received her Ph.D. degree in control science and engineering from National University of Defense Technology, Changsha, China, in 2020. Since 2020, she has been a lecturer with the College of Intelligence Science and Technology, National University of Defense Technology. Her current research interests include reinforcement learning, opponent modeling, and complex networks. E-mail: lulina16@nudt.edu.cn

HU Zhenzhen was born in 1984. He received his B.S. degree in thermal energy and power engineering from Shanghai Jiaotong University in 2006 and his M.S. degree in fluid mechanics from China Aerodynamics Research and Development Center in 2009. He is currently pursuing his Ph.D. degree in control science and engineering at National University of Defense Technology, Changsha, China. His current research interests include artificial intelligence, opponent modeling, and game theory. E-mail: hzzmail@163.com

XIE Yu was born in 1982. He received his M.S. and Ph.D. degrees in aeronautical and astronautical science and technology from National University of Defense Technology, Changsha, in 2007 and 2012, respectively. Since 2015, he has been an associate professor with the College of Intelligence Science and Technology, National University of Defense Technology. His research interests include intelligent decision-making and planning. E-mail: xieyu_nudt@139.com

CHEN Jing was born in 1972. He received his M.S. and Ph.D. degrees in control science and engineering from National University of Defense Technology, Changsha, China, in 1993 and 1999, respectively. He is a full professor with the Department of Intelligence Science and Technology, College of Intelligence Science and Technology, National University of Defense Technology. His current research interests include artificial intelligence, intelligent control, and unmanned vehicle mission planning. E-mail: Chenjing001@vip.sina.com
Supported by:
This work was supported by the National Natural Science Foundation of China (92271108; 61702528; 61806212; 62173336).

Abstract

Multi-domain competition is evolving toward disintegrating the components of the opponent’s operational system and gaining an advantage in the decision space. Island air defense is a typical multi-domain security problem whose decision-making complexity rises sharply once multi-stage decisions, multi-domain settings, imperfect information, and uncertain events are considered. However, current research on island air defense security problems is sparse and does not account for these key factors. To help human commanders make sound decisions in a complex environment, we build a multi-domain multi-state island air defense model and propose corresponding solution algorithms. We study the whole process of island air defense and propose a multi-domain, multi-stage imperfect-information security game that captures the key characteristics of the adversarial island air defense scenario. In addition, considering the possible strategies of a boundedly rational opponent, we propose an opponent-aware Monte Carlo counterfactual regret minimization algorithm for learning a robust defensive strategy in the security game. We evaluate our methods in various adversarial scenarios. The results show that our equilibrium learning method plays effectively against an opponent with bounded rationality and significantly outperforms some advanced algorithms.

Key words: island air defense, counterfactual regret minimization, Nash equilibrium, security game, cyber-physical system
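
The abstract names counterfactual regret minimization without showing its core update. As a rough illustration only, the sketch below runs plain regret matching (the per-decision rule that counterfactual regret minimization builds on) in self-play on a small zero-sum matrix game. It is not the opponent-aware Monte Carlo variant proposed in the article; the payoff matrix, the function names, and the iteration count are all hypothetical choices made for this example.

```python
def regret_matching(regrets):
    """Mix over actions in proportion to positive cumulative regret."""
    positive = [max(r, 0.0) for r in regrets]
    total = sum(positive)
    n = len(regrets)
    return [p / total for p in positive] if total > 0 else [1.0 / n] * n


def solve_matrix_game(payoff, iterations=20000):
    """Self-play regret matching; payoff[i][j] is the row player's payoff.

    Returns the average strategies of the row and column players, which
    approach a Nash equilibrium in two-player zero-sum games.
    """
    n_rows, n_cols = len(payoff), len(payoff[0])
    row_regret, col_regret = [0.0] * n_rows, [0.0] * n_cols
    row_sum, col_sum = [0.0] * n_rows, [0.0] * n_cols
    for _ in range(iterations):
        row = regret_matching(row_regret)
        col = regret_matching(col_regret)
        # Expected value of each pure action against the opponent's current mix.
        row_values = [sum(payoff[i][j] * col[j] for j in range(n_cols))
                      for i in range(n_rows)]
        col_values = [-sum(payoff[i][j] * row[i] for i in range(n_rows))
                      for j in range(n_cols)]
        row_ev = sum(row[i] * row_values[i] for i in range(n_rows))
        col_ev = sum(col[j] * col_values[j] for j in range(n_cols))
        for i in range(n_rows):
            row_regret[i] += row_values[i] - row_ev
            row_sum[i] += row[i]
        for j in range(n_cols):
            col_regret[j] += col_values[j] - col_ev
            col_sum[j] += col[j]
    return ([s / iterations for s in row_sum],
            [s / iterations for s in col_sum])


# Hypothetical 2x2 zero-sum game; its unique equilibrium mixes 0.4 / 0.6
# for both players.
PAYOFF = [[2.0, -1.0], [-1.0, 1.0]]
row_avg, col_avg = solve_matrix_game(PAYOFF)
print(row_avg, col_avg)
```

The current strategies cycle from iteration to iteration; it is the averaged strategies that settle near the equilibrium, which is why the function returns averages.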

Weilin YUAN, Shaofei CHEN, Lina LU, Zhenzhen HU, Yu XIE, Jing CHEN. Equilibrium learning for multi-stage cyber-physical multi-domain security game in island air defense[J]. Journal of Systems Engineering and Electronics, 2026, 37(2): 567-578.

Figures/Tables 13

Fig. 1, Fig. 2, Fig. 3, Table 1, Table 2, Table 3, Table 4, Fig. 4, Table 5, Fig. 5, Fig. 6, Fig. 7, Fig. 8


Stage	Player	Domain	$|\boldsymbol{A}_i^j|$	$|\boldsymbol{I}_i^j|$
1	Defender	Physical	$C_{|D|}^{n_{\mathrm{pd}}}$	−
2	Nature	Cyber	$C_{n_{\mathrm{pd}}}^{e_{\mathrm{cd}}}$	−
3	Nature	Cyber	$C_{n_{\mathrm{pd}}}^{e_{\mathrm{ca}}}$	−
4	Attacker	Cyber	$C_{e_{\mathrm{ca}}}^{n_{\mathrm{ca}}}$	$C_{|D|}^{n_{\mathrm{pd}}} C_{n_{\mathrm{pd}}}^{e_{\mathrm{ca}}}$
5	Nature	Cyber	$2^{n_{\mathrm{ca}}}$	−
6	Defender	Cyber	$C_{e_{\mathrm{cd}}}^{n_{\mathrm{cd}}}$	$C_{|D|}^{n_{\mathrm{pd}}} C_{n_{\mathrm{pd}}}^{e_{\mathrm{cd}}} \sum\limits_{i=0}^{n_{\mathrm{ca}}} C_{n_{\mathrm{pd}}}^{i}$
7	Attacker	Physical	$C_{|D|}^{n_{\mathrm{pa}}}$	$C_{|D|}^{n_{\mathrm{pd}}} C_{n_{\mathrm{pd}}}^{e_{\mathrm{ca}}} C_{e_{\mathrm{cd}}}^{n_{\mathrm{cd}}} C_{n_{\mathrm{pd}}}^{n_{\mathrm{cd}}}$
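
To give a feel for the magnitudes in the seven-stage table above, the following sketch evaluates each action-space size at the base values listed in the parameter table. It reads $C_n^k$ as "choose $k$ out of $n$", and it assumes $|D| = 7$ because the island tables list seven islands; the dictionary and function names are hypothetical.

```python
from math import comb

# Base values from the parameter table. D = 7 is an assumption taken from
# the seven islands listed in the island tables.
BASE = {"D": 7, "n_pd": 6, "n_pa": 3, "e_cd": 4, "e_ca": 4, "n_cd": 2, "n_ca": 2}


def action_space_sizes(p):
    """Return {stage: |A|} for the seven-stage table, reading C_n^k as comb(n, k)."""
    return {
        1: comb(p["D"], p["n_pd"]),     # defender places IADS on islands
        2: comb(p["n_pd"], p["e_cd"]),  # nature picks defense-capable cyber nodes
        3: comb(p["n_pd"], p["e_ca"]),  # nature picks attack-capable cyber nodes
        4: comb(p["e_ca"], p["n_ca"]),  # attacker picks cyber targets
        5: 2 ** p["n_ca"],              # nature resolves each cyber attack
        6: comb(p["e_cd"], p["n_cd"]),  # defender allocates cyber defense
        7: comb(p["D"], p["n_pa"]),     # attacker picks physical targets
    }


sizes = action_space_sizes(BASE)
for stage, size in sizes.items():
    print(f"stage {stage}: {size} actions")
```

Under these assumptions the largest single-stage action space is the attacker's final physical strike, with 35 options.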

Stage	Player	Domain	$|\boldsymbol{A}_i^j|$	$|\boldsymbol{I}_i^j|$
1	Defender	Physical	$C_{|D|}^{n_{\mathrm{pd}}}$	−
2	Defender	Physical	$C_{n_{\mathrm{pd}}}^{n_{\mathrm{pu}}}$	−
3	Nature	Cyber	$C_{n_{\mathrm{pu}}}^{e_{\mathrm{cd}}}$	−
4	Nature	Cyber	$C_{n_{\mathrm{pu}}}^{e_{\mathrm{ca}}}$	−
5	Attacker	Cyber	$C_{e_{\mathrm{ca}}}^{n_{\mathrm{ca}}}$	$C_{n_{\mathrm{pu}}}^{n_{\mathrm{pd}}} C_{n_{\mathrm{pu}}}^{e_{\mathrm{ca}}}$
6	Nature	Cyber	$2^{n_{\mathrm{ca}}}$	−
7	Defender	Cyber	$C_{e_{\mathrm{cd}}}^{n_{\mathrm{cd}}}$	$C_{|D|}^{n_{\mathrm{pd}}} C_{n_{\mathrm{pd}}}^{e_{\mathrm{cd}}} \sum\limits_{i=0}^{n_{\mathrm{ca}}} C_{n_{\mathrm{pu}}}^{i}$
8	Attacker	Physical	$C_{|D|}^{n_{\mathrm{pa}}}$	$C_{n_{\mathrm{pu}}}^{n_{\mathrm{pd}}} C_{n_{\mathrm{pu}}}^{e_{\mathrm{ca}}} C_{e_{\mathrm{cd}}}^{n_{\mathrm{cd}}} C_{n_{\mathrm{pu}}}^{n_{\mathrm{cd}}}$

Parameter	Base value	Description	Constraint
$D$	−	Set of islands	−
$n_{\mathrm{pd}}$	6	Number of IADS	$n_{\mathrm{pd}} \leqslant |D|$
$n_{\mathrm{pu}}$	5	Number of public IADS	$n_{\mathrm{pu}} \leqslant n_{\mathrm{pd}}$
$n_{\mathrm{pa}}$	3	Number of AMS	$n_{\mathrm{pa}} \leqslant |D|$
$e_{\mathrm{cd}}$	4	Number of defense-capable cyber nodes	$e_{\mathrm{cd}} \leqslant n_{\mathrm{pu}}$
$e_{\mathrm{ca}}$	4	Number of attack-capable cyber nodes	$e_{\mathrm{ca}} \leqslant n_{\mathrm{pd}}$
$n_{\mathrm{cd}}$	2	Number of cyber defense sources	$n_{\mathrm{cd}} \leqslant e_{\mathrm{cd}}$
$n_{\mathrm{ca}}$	2	Number of cyber attack sources	$n_{\mathrm{ca}} \leqslant e_{\mathrm{ca}}$
$r$	0.3	Coverage radius of each IADS	−
$p_{\mathrm{pd}}$	0.9	Physical defense effectiveness	−
$p_{\mathrm{cd}}$	0.8	Cyber defense effectiveness	−
$p_{\mathrm{ca}}$	0.7	Cyber attack effectiveness	−
$p_{\mathrm{cs}}$	0.8	Cyber sensor detection effectiveness	−
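
The constraint column of the parameter table can be checked mechanically before a scenario is generated. The sketch below encodes the seven inequalities listed in the table and tests them against the base values. The value $|D| = 7$ is an assumption taken from the seven islands in the island tables, and `violated_constraints` is a hypothetical helper, not something described in the article.

```python
# Base values from the parameter table. D = 7 is an assumption taken from
# the seven islands listed in the island tables.
BASE = {"D": 7, "n_pd": 6, "n_pu": 5, "n_pa": 3,
        "e_cd": 4, "e_ca": 4, "n_cd": 2, "n_ca": 2}

# Each pair (lhs, rhs) stands for the constraint lhs <= rhs.
CONSTRAINTS = [
    ("n_pd", "D"),
    ("n_pu", "n_pd"),
    ("n_pa", "D"),
    ("e_cd", "n_pu"),
    ("e_ca", "n_pd"),
    ("n_ca", "e_ca"),
    ("n_cd", "e_cd"),
]


def violated_constraints(params):
    """Return the constraints, as strings, that the given values break."""
    return [f"{lhs} <= {rhs}" for lhs, rhs in CONSTRAINTS
            if params[lhs] > params[rhs]]


print(violated_constraints(BASE))
print(violated_constraints({**BASE, "n_pu": 7}))
```

The second call raises the number of public IADS above the total number of IADS, so it reports that one inequality as broken.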

ID	X	Y	Value	Physical defense	Physical attack	Cyber defense	Cyber attack
1	0.475492	0.793995	0.865261	IADS	−	√	√
2	0.295861	0.958631	0.534103	−	AM	−	−
3	0.363062	0.517419	0.763656	IADS	AM	−	−
4	0.527012	0.737675	0.257084	IADS	−	√	√
5	0.764324	0.928512	0.820001	IADS	−	−	−
6	0.763333	0.856612	0.942176	IADS	−	−	−
7	0.423657	0.946227	0.529040	IADS	AM	−	−

ID	X	Y	Value	Physical defense	Physical attack	Cyber defense	Cyber attack
1	0.475492	0.793995	0.865261	IADS	−	√	√
2	0.295861	0.958631	0.534103	−	AM	−	−
3	0.363062	0.517419	0.763656	−	AM	−	−
4	0.527012	0.737675	0.257084	IADS	−	√	√
5	0.764324	0.928512	0.820001	−	AM	−	−
6	0.763333	0.856612	0.942176	IADS	AM	−	−
7	0.423657	0.946227	0.529040	−	AM	−	−
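
The island table above can be combined with the coverage radius $r = 0.3$ from the parameter table. The sketch below assumes, purely for illustration, that an island counts as covered when its Euclidean distance to the nearest IADS-equipped island is at most $r$. The page does not state the article's actual coverage rule, so both the rule and the helper names are hypothetical.

```python
from math import hypot

# (x, y, has_iads) per island ID, copied from the island table above.
ISLANDS = {
    1: (0.475492, 0.793995, True),
    2: (0.295861, 0.958631, False),
    3: (0.363062, 0.517419, False),
    4: (0.527012, 0.737675, True),
    5: (0.764324, 0.928512, False),
    6: (0.763333, 0.856612, True),
    7: (0.423657, 0.946227, False),
}
R = 0.3  # coverage radius of each IADS, from the parameter table


def nearest_iads_distance(island_id, islands):
    """Euclidean distance from an island to the closest IADS-equipped island."""
    x, y, _ = islands[island_id]
    return min(hypot(x - ox, y - oy)
               for ox, oy, has_iads in islands.values() if has_iads)


# Assumed rule: covered if the nearest IADS lies within radius R.
covered = {i: nearest_iads_distance(i, ISLANDS) <= R for i in ISLANDS}
for i in sorted(ISLANDS):
    print(i, round(nearest_iads_distance(i, ISLANDS), 4), covered[i])
```

An island that hosts an IADS has distance zero to itself, so it is always covered under this rule.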

Related Articles 4


[1]	Ruhao JIANG, He LUO, Yingying MA, Guoqiang WANG. Multicriteria game approach to air-to-air combat tactical decisions for multiple UAVs [J]. Journal of Systems Engineering and Electronics, 2023, 34(6): 1447-1464.
[2]	Qiuni LI, Rennong YANG, Haoliang LI, Huan ZHANG, Chao FENG. Modeling and game strategy analysis of suppressing IADS for multiple fighters' cooperation [J]. Journal of Systems Engineering and Electronics, 2018, 29(2): 296-304.
[3]	Guopeng Zhang, Peng Liu, and Enjie Ding. Energy efficient resource allocation in non-cooperative multi-cell OFDMA systems [J]. Journal of Systems Engineering and Electronics, 2011, 22(1): 175-182.
[4]	Xiaohui Yu and Qiang Zhang. Fuzzy Nash equilibrium of fuzzy n-person non-cooperative game [J]. Journal of Systems Engineering and Electronics, 2010, 21(1): 47-56.