AgentFugue: Agent Scalability for Long-Horizon Tasks through Collective Reasoning

Abstract

Recent progress on long-horizon agentic tasks has been driven largely by scaling up individual agents through stronger models, better tools, and more effective scaffolding. In contrast, much less is understood about scaling out: whether multiple peer agents, all targeting the same task, can become an additional source of capability without relying on explicit role specialization or workflow orchestration. We study this question and propose AgentFugue, a collective reasoning framework built around a shared reasoning hub. As peer agents explore the same task in parallel, the hub records concise notes on what each agent has established, attempted, or ruled out, and enables each agent to selectively access what other agents have discovered in a form useful for its current search. This design turns otherwise isolated trajectories into a connected ecology of reusable intermediate reasoning without requiring centralized planning. We instantiate the hub as a plug-in communication layer, trained with supervised fine-tuning and end-to-end reinforcement learning. Across the challenging long-horizon settings we study, AgentFugue improves over strong baselines. Our results suggest that collective reasoning can turn scaling out peer agent systems into a distinct source of capability gains, rather than merely a way of spending more compute.
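To make the hub's read/write interface concrete, the following is a minimal sketch, not the authors' implementation. It assumes notes are short strings tagged as established, attempted, or ruled out, and it substitutes plain word overlap for the selective-access step, which in AgentFugue is a trained component. The names `ReasoningHub`, `Note`, `NoteKind`, `record`, and `query` are all hypothetical.

```python
from dataclasses import dataclass, field
from enum import Enum


class NoteKind(Enum):
    """The three note types named in the abstract."""
    ESTABLISHED = "established"
    ATTEMPTED = "attempted"
    RULED_OUT = "ruled_out"


@dataclass(frozen=True)
class Note:
    agent_id: str
    kind: NoteKind
    text: str


@dataclass
class ReasoningHub:
    """Hypothetical shared store that peer agents write to and read from."""
    notes: list[Note] = field(default_factory=list)

    def record(self, agent_id: str, kind: NoteKind, text: str) -> None:
        """Append a concise note about what an agent found, tried, or excluded."""
        self.notes.append(Note(agent_id, kind, text))

    def query(self, agent_id: str, query: str, top_k: int = 3) -> list[Note]:
        """Return up to top_k notes written by *other* agents.

        Assumption: relevance is scored by word overlap with the query.
        This is only a stand-in for the learned selective access.
        """
        query_words = set(query.lower().split())
        scored = []
        for note in self.notes:
            if note.agent_id == agent_id:
                continue  # an agent does not need its own notes back
            overlap = len(query_words & set(note.text.lower().split()))
            if overlap > 0:
                scored.append((overlap, note))
        # Highest overlap first; the sort is stable, so ties keep insertion order.
        scored.sort(key=lambda pair: pair[0], reverse=True)
        return [note for _, note in scored[:top_k]]


if __name__ == "__main__":
    hub = ReasoningHub()
    hub.record("a1", NoteKind.RULED_OUT, "the 2019 report does not list the founder")
    hub.record("a2", NoteKind.ESTABLISHED, "founder name appears in the 2021 filing")
    hub.record("a1", NoteKind.ATTEMPTED, "searched archive for founder biography")
    for note in hub.query("a2", "who is the founder"):
        print(note.agent_id, note.kind.value, note.text)
```

In this sketch there is no central planner: each agent decides when to write and what to ask for, which mirrors the peer setting described above. How notes are condensed and how relevance is judged are exactly the parts the paper trains with supervised fine-tuning and reinforcement learning, so they are left as simple placeholders here.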