
Youtu-Agent: Scaling Agent Productivity with Automated Generation and Hybrid Policy Optimization

December 31, 2025
Authors: Yuchen Shi, Yuzheng Cai, Siqi Cai, Zihan Xu, Lichao Chen, Yulei Qin, Zhijian Zhou, Xiang Fei, Chaofan Qiu, Xiaoyu Tan, Gang Li, Zongyi Li, Haojia Lin, Guocan Cai, Yong Mao, Yunsheng Wu, Ke Li, Xing Sun
cs.AI

Abstract

Existing Large Language Model (LLM) agent frameworks face two significant challenges: high configuration costs and static capabilities. Building a high-quality agent often requires extensive manual effort in tool integration and prompt engineering, while deployed agents struggle to adapt to dynamic environments without expensive fine-tuning. To address these issues, we propose Youtu-Agent, a modular framework designed for the automated generation and continuous evolution of LLM agents. Youtu-Agent features a structured configuration system that decouples execution environments, toolkits, and context management, enabling flexible reuse and automated synthesis. We introduce two generation paradigms: a Workflow mode for standard tasks and a Meta-Agent mode for complex, non-standard requirements, capable of automatically generating tool code, prompts, and configurations. Furthermore, Youtu-Agent establishes a hybrid policy optimization system: (1) an Agent Practice module that enables agents to accumulate experience and improve performance through in-context optimization without parameter updates; and (2) an Agent RL module that integrates with distributed training frameworks to enable scalable and stable reinforcement learning of any Youtu-Agent in an end-to-end, large-scale manner. Experiments demonstrate that Youtu-Agent achieves state-of-the-art performance on WebWalkerQA (71.47%) and GAIA (72.8%) using open-weight models. Our automated generation pipeline achieves a tool synthesis success rate above 81%, while the Practice module improves performance on AIME 2024 and 2025 by +2.7% and +5.4%, respectively. Moreover, Agent RL training achieves a 40% speedup with steady performance improvement on 7B LLMs, improving coding/reasoning capabilities by up to 35% on math benchmarks and search capabilities by up to 21% on general and multi-hop QA benchmarks.
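To make the "structured configuration system" concrete, the following is a minimal sketch of what decoupling execution environments, toolkits, and context management into independently reusable pieces could look like. All class and field names here are illustrative assumptions, not the actual Youtu-Agent API.

```python
from dataclasses import dataclass, field

# Hypothetical illustration of a decoupled agent configuration:
# environment, toolkits, and context management are separate objects
# that can be mixed, reused, or synthesized independently.

@dataclass(frozen=True)
class Environment:
    name: str
    workdir: str = "/tmp/agent"

@dataclass(frozen=True)
class Toolkit:
    name: str
    tools: tuple = ()

@dataclass
class ContextManager:
    max_turns: int = 20
    memory: list = field(default_factory=list)

@dataclass
class AgentConfig:
    env: Environment
    toolkits: list
    context: ContextManager

    def tool_names(self):
        # Flatten tool names across all attached toolkits.
        return [t for tk in self.toolkits for t in tk.tools]

# The same toolkit object is reused by two differently configured agents.
search_tk = Toolkit("search", ("web_search", "fetch_page"))
code_tk = Toolkit("code", ("run_python",))

research_agent = AgentConfig(Environment("sandbox"), [search_tk],
                             ContextManager())
coding_agent = AgentConfig(Environment("docker"), [search_tk, code_tk],
                           ContextManager(max_turns=50))

print(research_agent.tool_names())
print(coding_agent.tool_names())
```

Because each piece is a plain, self-contained object, an automated generator (the Workflow or Meta-Agent mode described above) could, in principle, emit or recombine these fragments without touching the rest of the agent.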
January 6, 2026