Agent Bazaar: Achieving Economic Alignment in Multi-Agent Markets

Abstract

Deploying Large Language Models (LLMs) as autonomous economic agents introduces systemic risks that extend beyond individual capability failures. As agents begin to interact directly with marketplaces, their collective behavior can amplify volatility and mask deception at scale. We introduce Agent Bazaar, a multi-agent simulation framework for evaluating Economic Alignment: the capacity of agentic systems to preserve market stability and integrity. We identify two failure modes: (1) Algorithmic Instability in a B2C market ("The Crash"), where firms amplify price volatility until the market collapses, and (2) Sybil Deception in a C2C market ("The Lemon Market"), where a single deceptive agent controlling multiple coordinated seller identities floods the market with fraudulent listings, eroding trust and consumer welfare. We evaluate frontier and open-weight models in both scenarios and find that models largely fail to self-regulate, with failure severity varying by model rather than by model size. We propose two economically aligned harnesses, Stabilizing Firms and Skeptical Guardians, which improve outcomes but remain fragile under harder market conditions. To close this gap, we train agents with REINFORCE++ under an adaptive curriculum, producing a 9B model that outperforms all evaluated frontier and open-weight models. We also propose the Economic Alignment Score (EAS), a four-component scalar metric that aggregates stability, integrity, welfare, and profitability and enables direct cross-model comparison. Our results indicate that economic alignment is orthogonal to general capability and can be trained directly with targeted RL.