
Agent0: Unleashing Self-Evolving Agents from Zero Data via Tool-Integrated Reasoning

November 20, 2025
Authors: Peng Xia, Kaide Zeng, Jiaqi Liu, Can Qin, Fang Wu, Yiyang Zhou, Caiming Xiong, Huaxiu Yao
cs.AI

Abstract

Large Language Model (LLM) Agents, often trained with Reinforcement Learning (RL), are constrained by a dependency on human-curated data, limiting scalability and tethering AI to human knowledge. Existing self-evolution frameworks offer an alternative but are typically restricted by the model's inherent capabilities and single-round interactions, hindering the development of complex curricula involving tool use or dynamic reasoning. We introduce Agent0, a fully autonomous framework that evolves high-performing agents without external data through multi-step co-evolution and seamless tool integration. Agent0 establishes a symbiotic competition between two agents initialized from the same base LLM: a curriculum agent that proposes increasingly challenging frontier tasks, and an executor agent that learns to solve them. We integrate external tools to enhance the executor's problem-solving capacity; this improvement, in turn, pressures the curriculum agent to construct more complex, tool-aware tasks. Through this iterative process, Agent0 establishes a self-reinforcing cycle that continuously produces high-quality curricula. Empirically, Agent0 substantially boosts reasoning capabilities, improving the Qwen3-8B-Base model by 18% on mathematical reasoning and 24% on general reasoning benchmarks. Code is available at https://github.com/aiming-lab/Agent0.
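
The co-evolution loop described above can be illustrated with a toy simulation. This is a minimal sketch under strong assumptions, not Agent0's actual training procedure: the real system trains two LLM agents with RL, while here the executor is reduced to a single scalar skill, the tool is a fixed additive bonus, and the curriculum agent simply picks the candidate difficulty that the executor solves about half the time. All names (ToyExecutor, frontier_reward, co_evolve) and the reward shape are hypothetical and do not come from the paper or its repository.

```python
from dataclasses import dataclass


@dataclass
class ToyExecutor:
    """Hypothetical stand-in for the executor agent: one scalar skill."""
    skill: float = 1.0
    tool_bonus: float = 0.5  # assumed fixed gain from an external tool

    def success_rate(self, difficulty: float) -> float:
        # Assumed model: 50% success when difficulty equals effective skill,
        # clamped to the range [0, 1].
        effective = self.skill + self.tool_bonus
        return max(0.0, min(1.0, 0.5 + (effective - difficulty)))


def frontier_reward(rate: float) -> float:
    """Assumed curriculum reward: highest when the executor succeeds half the time."""
    return 1.0 - 2.0 * abs(rate - 0.5)


def co_evolve(executor: ToyExecutor, rounds: int = 20,
              step: float = 0.5, lr: float = 0.1) -> list[tuple[float, float]]:
    """Run the toy loop and return (difficulty, skill) after each round."""
    difficulty = 0.0
    history = []
    for _ in range(rounds):
        # Curriculum step: propose easier, equal, and harder tasks, then keep
        # the one closest to the executor's frontier (ties favor harder tasks).
        candidates = [difficulty - step, difficulty, difficulty + step]
        difficulty = max(
            candidates,
            key=lambda d: (frontier_reward(executor.success_rate(d)), d),
        )
        # Executor step: learn more from tasks that sit near the frontier.
        reward = frontier_reward(executor.success_rate(difficulty))
        executor.skill += lr * reward
        history.append((difficulty, executor.skill))
    return history


if __name__ == "__main__":
    agent = ToyExecutor()
    for round_index, (d, s) in enumerate(co_evolve(agent), start=1):
        print(f"round {round_index:2d}: difficulty={d:.2f} skill={s:.2f}")
```

In this toy setting, task difficulty first climbs until it reaches the executor's tool-augmented ability and then rises together with the executor's skill, which mirrors the self-reinforcing cycle the abstract describes. Changing tool_bonus shifts the frontier the curriculum settles on, a loose analogue of the curriculum agent being pushed toward harder, tool-aware tasks.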