Journal of Systems Engineering and Electronics ›› 2026, Vol. 37 ›› Issue (1): 242-256. doi: 10.23919/JSEE.2023.000165

• SYSTEMS ENGINEERING •

Optimal competitive resource assignment in two-stage Colonel Blotto game with Lanchester-type attrition

Weilin YUAN, Shaofei CHEN, Zhenzhen HU, Xiang JI, Lina LU, Xiaolong SU, Jing CHEN

  • Received: 2022-12-12 Online: 2026-02-18 Published: 2026-03-11
  • Contact: Shaofei CHEN E-mail: yuanweilin12@nudt.edu.cn;chensf005@163.com;hzzmail@163.com;jixiang14@nudt.edu.cn;lulina16@nudt.edu.cn;xiaolongsu@nudt.edu.cn;Chenjing001@vip.sina.com
  • About author:
    YUAN Weilin was born in 1994. He received his B.S., M.S., and Ph.D. degrees in control science and engineering from the National University of Defense Technology, Changsha, China, in 2012, 2016, and 2023, respectively. He is a lecturer at the National University of Defense Technology. His research interests include cognitive decision-making, intelligent gaming, opponent modeling, reinforcement learning, and multi-agent systems. E-mail: yuanweilin12@nudt.edu.cn

    CHEN Shaofei was born in 1987. He received his B.S. degree from Harbin Institute of Technology, Harbin, China, in 2009, and his M.S. and Ph.D. degrees in control science and engineering from the National University of Defense Technology, Changsha, China, in 2011 and 2016, respectively. Since 2019, he has been an associate professor with the College of Intelligence Science and Technology, National University of Defense Technology. His research interests include artificial intelligence, multi-agent systems, and reinforcement learning. E-mail: chensf005@163.com

    HU Zhenzhen was born in 1984. He received his B.S. degree in thermal energy and power engineering from Shanghai Jiaotong University in 2006 and his M.S. degree in fluid mechanics from the China Aerodynamics Research and Development Center in 2009. He is currently pursuing his Ph.D. degree in control science and engineering at the National University of Defense Technology, Changsha, China. His current research interests include artificial intelligence, opponent modeling, and game theory. E-mail: hzzmail@163.com

    JI Xiang was born in 1991. He received his M.S. and Ph.D. degrees in control science and engineering from the National University of Defense Technology, Changsha, China, in 2014 and 2022, respectively. He is a lecturer at the National University of Defense Technology. His research interests include swarm intelligence, complex networks, and intelligent gaming. E-mail: jixiang14@nudt.edu.cn

    LU Lina was born in 1984. She received her Ph.D. degree in control science and engineering from the National University of Defense Technology, Changsha, China, in 2020. Since 2020, she has been a lecturer with the College of Intelligence Science and Technology, National University of Defense Technology. Her current research interests include reinforcement learning, opponent modeling, and complex networks. E-mail: lulina16@nudt.edu.cn

    SU Xiaolong was born in 2000. He received his B.S. degree from the National University of Defense Technology, Changsha, China, in 2021. He is currently pursuing his M.S. degree in control science and engineering at the National University of Defense Technology, Changsha, China. His research interests include game theory, intelligent gaming, and reinforcement learning. E-mail: xiaolongsu@nudt.edu.cn

    CHEN Jing was born in 1972. He received his M.S. and Ph.D. degrees in control science and engineering from the National University of Defense Technology, Changsha, China, in 1993 and 1999, respectively. He is a full professor with the Department of Intelligence Science and Technology, College of Intelligence Science and Technology, National University of Defense Technology. His current research interests include artificial intelligence, intelligent control, and unmanned vehicle mission planning. E-mail: Chenjing001@vip.sina.com
  • Supported by:
    This work was supported by the National Natural Science Foundation of China (61702528, 61806212, 62173336).

Abstract:

In strategic decision-making tasks, determining how to assign limited and costly resources between the defender and the attacker is a central problem. However, a pre-allocated resource assignment is hard to adapt to dynamic fighting scenarios, and the scenario and rules of the classical Colonel Blotto (CB) game are often too restrictive for real-world settings. To address these issues, a support stage is added to supplement the pre-allocated results, and a novel two-stage competitive resource assignment problem is formulated based on the CB game and the stochastic Lanchester equation (SLE). Furthermore, the force attrition in these two stages is modeled as a stochastic process to capture the complexity of the fighting, including the case where the player with fewer resources defeats the player with more resources and wins the battlefield. To solve this two-stage resource assignment problem, nested solving and no-regret learning are proposed to search for the optimal resource assignment strategies. Numerical experiments are conducted to analyze the effectiveness of the proposed model and to study the assignment strategies in various cases.
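The paper's two-stage solver is not reproduced here, but the two building blocks named in the abstract — stochastic Lanchester-style attrition and no-regret learning over Blotto allocations — can be illustrated on a toy one-stage game. In the sketch below (all names, the 5-unit budget, and the square-law attrition chain are illustrative assumptions, not the paper's model), each side splits 5 units over 3 battlefields, each battlefield is resolved by a stochastic attrition chain in which the next casualty falls on a side with probability proportional to the opponent's strength (so the weaker side wins with positive probability), and regret matching in self-play drives the average strategies toward equilibrium:

```python
import itertools
import random
from functools import lru_cache

def blotto_actions(total=5, fields=3):
    """All ways to split `total` indivisible units across `fields` battlefields."""
    return [a for a in itertools.product(range(total + 1), repeat=fields)
            if sum(a) == total]

@lru_cache(maxsize=None)
def win_prob(x, y):
    """P(side with x units annihilates the side with y units) under a
    stochastic square-law attrition chain: the next casualty falls on
    the y-side with probability x / (x + y)."""
    if x == 0 and y == 0:
        return 0.5          # empty battlefield: treat as a fair tie
    if y == 0:
        return 1.0
    if x == 0:
        return 0.0
    p = x / (x + y)
    return p * win_prob(x, y - 1) + (1 - p) * win_prob(x - 1, y)

def expected_payoff(a, b):
    """Zero-sum payoff to allocation a: +1 per battlefield won, -1 per
    battlefield lost, in expectation over the stochastic attrition."""
    return sum(2.0 * win_prob(x, y) - 1.0 for x, y in zip(a, b))

def regret_matching(iters=20000, seed=0):
    """Self-play regret matching; returns both players' average
    strategies (probability vectors over actions) and the action list."""
    rng = random.Random(seed)
    acts = blotto_actions()
    n = len(acts)
    pay = [[expected_payoff(a, b) for b in acts] for a in acts]
    regrets = [[0.0] * n for _ in range(2)]
    strat_sum = [[0.0] * n for _ in range(2)]
    for _ in range(iters):
        probs = []
        for p in range(2):
            # play proportionally to positive cumulative regret
            pos = [max(r, 0.0) for r in regrets[p]]
            s = sum(pos)
            pr = [v / s for v in pos] if s > 0 else [1.0 / n] * n
            probs.append(pr)
            for i in range(n):
                strat_sum[p][i] += pr[i]
        i0 = rng.choices(range(n), probs[0])[0]
        i1 = rng.choices(range(n), probs[1])[0]
        u0 = pay[i0][i1]                 # player 0's payoff this round
        for alt in range(n):             # counterfactual regret updates
            regrets[0][alt] += pay[alt][i1] - u0
            regrets[1][alt] += -pay[i0][alt] + u0
    avg = [[v / iters for v in strat_sum[p]] for p in range(2)]
    return avg, acts
```

In a two-player zero-sum game such as this one, the time-averaged strategies of regret matching approach a Nash equilibrium, which is why the no-regret loop can serve as the strategy-search component; the paper's formulation replaces the toy `win_prob` chain with its SLE-based attrition and nests this search inside the two-stage structure.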

Key words: resource assignment, Colonel Blotto (CB) game, stochastic Lanchester equation (SLE), regret matching