Youtu-Agent: Scaling Agent Productivity with Automated Generation and Hybrid Policy Optimization
December 31, 2025
作者: Yuchen Shi, Yuzheng Cai, Siqi Cai, Zihan Xu, Lichao Chen, Yulei Qin, Zhijian Zhou, Xiang Fei, Chaofan Qiu, Xiaoyu Tan, Gang Li, Zongyi Li, Haojia Lin, Guocan Cai, Yong Mao, Yunsheng Wu, Ke Li, Xing Sun
cs.AI
Abstract
Existing Large Language Model (LLM) agent frameworks face two significant challenges: high configuration costs and static capabilities. Building a high-quality agent often requires extensive manual effort in tool integration and prompt engineering, while deployed agents struggle to adapt to dynamic environments without expensive fine-tuning. To address these issues, we propose Youtu-Agent, a modular framework designed for the automated generation and continuous evolution of LLM agents. Youtu-Agent features a structured configuration system that decouples execution environments, toolkits, and context management, enabling flexible reuse and automated synthesis. We introduce two generation paradigms: a Workflow mode for standard tasks and a Meta-Agent mode for complex, non-standard requirements, capable of automatically generating tool code, prompts, and configurations. Furthermore, Youtu-Agent establishes a hybrid policy optimization system: (1) an Agent Practice module that enables agents to accumulate experience and improve performance through in-context optimization without parameter updates; and (2) an Agent RL module that integrates with distributed training frameworks to enable scalable and stable reinforcement learning of any Youtu-Agent agent in an end-to-end, large-scale manner. Experiments demonstrate that Youtu-Agent achieves state-of-the-art performance on WebWalkerQA (71.47%) and GAIA (72.8%) using open-weight models. Our automated generation pipeline achieves a tool synthesis success rate of over 81%, while the Practice module improves performance on AIME 2024/2025 by 2.7% and 5.4%, respectively. Moreover, our Agent RL training achieves a 40% speedup with steady performance improvement on 7B LLMs, enhancing coding/reasoning capabilities by up to 35% on math benchmarks and search capabilities by up to 21% on general and multi-hop QA benchmarks.
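To make the decoupling described above concrete, the sketch below illustrates how an agent configuration might separate execution environment, toolkit, and context management so that each part can be reused or synthesized independently. This is a hypothetical illustration in Python, not the actual Youtu-Agent API; all class and field names are assumptions.

```python
# Hypothetical sketch (not the actual Youtu-Agent API): a structured agent
# configuration that decouples environment, toolkit, and context management.
from dataclasses import dataclass, field


@dataclass
class EnvironmentConfig:
    """Where the agent executes, e.g. a sandboxed shell or a browser."""
    kind: str = "local_sandbox"
    timeout_s: int = 60


@dataclass
class ToolkitConfig:
    """Which tools the agent may call; toolkits can be reused across agents."""
    tools: list = field(default_factory=lambda: ["web_search", "python_executor"])


@dataclass
class ContextConfig:
    """How conversation history and intermediate results are managed."""
    max_turns: int = 20
    summarize_after: int = 10


@dataclass
class AgentConfig:
    """An agent is the composition of the three parts plus a system prompt."""
    name: str
    system_prompt: str
    environment: EnvironmentConfig = field(default_factory=EnvironmentConfig)
    toolkit: ToolkitConfig = field(default_factory=ToolkitConfig)
    context: ContextConfig = field(default_factory=ContextConfig)


# Swapping only the toolkit or environment yields a new agent without touching
# the other parts, which is what makes automated synthesis (e.g., generation of
# a new configuration by a meta-agent) tractable.
research_agent = AgentConfig(
    name="research_agent",
    system_prompt="Answer multi-hop questions by searching and reading the web.",
)
```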