Monopoly Deal: A Benchmark Environment for Bounded One-Sided Response Games
October 29, 2025
Author: Will Wolf
cs.AI
Abstract
Card games are widely used to study sequential decision-making under
uncertainty, with real-world analogues in negotiation, finance, and
cybersecurity. These games typically fall into three categories based on the
flow of control: strictly sequential (players alternate single actions),
deterministic response (some actions trigger a fixed outcome), and unbounded
reciprocal response (alternating counterplays are permitted). A less-explored
but strategically rich structure is the bounded one-sided response, where a
player's action briefly transfers control to the opponent, who must satisfy a
fixed condition through one or more moves before the turn resolves. We term
games featuring this mechanism Bounded One-Sided Response Games (BORGs). We
introduce a modified version of Monopoly Deal as a benchmark environment that
isolates this dynamic, where a Rent action forces the opponent to choose
payment assets. The gold-standard algorithm, Counterfactual Regret Minimization
(CFR), converges on effective strategies without novel algorithmic extensions.
A lightweight full-stack research platform unifies the environment, a
parallelized CFR runtime, and a human-playable web interface. The trained CFR
agent and source code are available at https://monopolydeal.ai.
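
The bounded one-sided response described above can be illustrated with a minimal Python sketch. This is not the paper's implementation: the names `Player`, `resolve_rent`, and `pay_smallest_first` are hypothetical, assets are reduced to integer cash values, and the no-change-on-overpayment rule is an assumption borrowed from the standard card game rather than something stated in the abstract.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Player:
    """Hypothetical player state: assets are modeled as integer cash values."""
    name: str
    assets: List[int] = field(default_factory=list)


def resolve_rent(
    actor: Player,
    responder: Player,
    amount_due: int,
    choose: Callable[[Player, int], int],
) -> List[int]:
    """Run one response phase and return the values paid, in order.

    One-sided: only the responder makes choices inside the loop.
    Bounded: each iteration removes one asset, so the phase lasts at most
    len(responder.assets) moves and ends once the debt is covered or the
    responder has nothing left to pay with.
    """
    paid: List[int] = []
    remaining = amount_due
    while remaining > 0 and responder.assets:
        idx = choose(responder, remaining)   # responder picks a payment asset
        value = responder.assets.pop(idx)
        actor.assets.append(value)           # asset moves to the rent collector
        paid.append(value)
        remaining -= value                   # assumption: overpaying gives no change
    return paid


def pay_smallest_first(responder: Player, remaining: int) -> int:
    """Toy response policy: always hand over the lowest-value asset."""
    return min(range(len(responder.assets)), key=responder.assets.__getitem__)


if __name__ == "__main__":
    alice = Player("alice")
    bob = Player("bob", assets=[1, 5, 2])
    print(resolve_rent(alice, bob, amount_due=3, choose=pay_smallest_first))  # [1, 2]
```

In this sketch the `choose` callback is where a learned policy, such as one trained with CFR, would plug in; the surrounding loop only enforces that control returns to the acting player after a finite number of responder moves.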