Monopoly Deal: A Benchmark Environment for Bounded One-Sided Response Games

October 29, 2025
Author: Will Wolf
cs.AI

Abstract

Card games are widely used to study sequential decision-making under uncertainty, with real-world analogues in negotiation, finance, and cybersecurity. These games typically fall into three categories based on the flow of control: strictly sequential (players alternate single actions), deterministic response (some actions trigger a fixed outcome), and unbounded reciprocal response (alternating counterplays are permitted). A less-explored but strategically rich structure is the bounded one-sided response, where a player's action briefly transfers control to the opponent, who must satisfy a fixed condition through one or more moves before the turn resolves. We term games featuring this mechanism Bounded One-Sided Response Games (BORGs). We introduce a modified version of Monopoly Deal as a benchmark environment that isolates this dynamic, where a Rent action forces the opponent to choose payment assets. The gold-standard algorithm, Counterfactual Regret Minimization (CFR), converges on effective strategies without novel algorithmic extensions. A lightweight full-stack research platform unifies the environment, a parallelized CFR runtime, and a human-playable web interface. The trained CFR agent and source code are available at https://monopolydeal.ai.
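
The bounded one-sided response described in the abstract can be illustrated with a small Python sketch. It is not the paper's implementation: the names `Player`, `RentResponse`, and `play_rent` are hypothetical, assets are reduced to integer cash values, and the payment rules are simplified assumptions. The point is only the control flow: a Rent action hands control to the opponent, who makes one or more payment moves until a fixed condition holds, after which control returns to the acting player.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from typing import Callable


@dataclass
class Player:
    name: str
    # Hypothetical simplification: each asset is just its cash value.
    assets: list[int] = field(default_factory=list)


@dataclass
class RentResponse:
    """One bounded one-sided response: only the responder acts here."""
    responder: Player
    amount_due: int
    paid: list[int] = field(default_factory=list)

    def is_resolved(self) -> bool:
        # Fixed condition: the debt is covered, or nothing is left to pay with.
        return sum(self.paid) >= self.amount_due or not self.responder.assets

    def legal_moves(self) -> list[int]:
        # While unresolved, the only legal moves are "pay one of my assets".
        return [] if self.is_resolved() else sorted(set(self.responder.assets))

    def pay(self, asset: int) -> None:
        if asset not in self.legal_moves():
            raise ValueError(f"illegal payment: {asset}")
        self.responder.assets.remove(asset)
        self.paid.append(asset)


def play_rent(
    actor: Player,
    responder: Player,
    amount_due: int,
    choose: Callable[[RentResponse], int],
) -> list[int]:
    """Resolve a Rent action; `choose` is the responder's payment policy."""
    response = RentResponse(responder, amount_due)
    # Control stays with the responder until the condition is met. The loop is
    # bounded: every move removes one asset from the responder.
    while not response.is_resolved():
        response.pay(choose(response))
    actor.assets.extend(response.paid)  # control returns to the actor
    return response.paid


if __name__ == "__main__":
    alice = Player("alice", [1])
    bob = Player("bob", [1, 2, 5])
    # Example policy (an assumption): always pay the cheapest asset first.
    print(play_rent(alice, bob, 3, lambda r: min(r.legal_moves())))
```

Unlike an unbounded reciprocal response, the actor never gets a counterplay inside the loop, and the number of responder moves is capped by the responder's asset count. The responder's decision at each step (which asset to give up) is the kind of choice a CFR agent would have to learn.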