Youtu-LLM: Unlocking the Native Agentic Potential for Lightweight Large Language Models
December 31, 2025
Authors: Junru Lu, Jiarui Qin, Lingfeng Qiao, Yinghui Li, Xinyi Dai, Bo Ke, Jianfeng He, Ruizhi Qiao, Di Yin, Xing Sun, Yunsheng Wu, Yinsong Liu, Shuangyin Liu, Mingkong Tang, Haodong Lin, Jiayi Kuang, Fanxu Meng, Xiaojuan Tang, Yunjia Xi, Junjie Huang, Haotong Yang, Zhenyi Shen, Yangning Li, Qianwen Zhang, Yifei Yu, Siyu An, Junnan Dong, Qiufeng Wang, Jie Wang, Keyu Chen, Wei Wen, Taian Guo, Zhifeng Shen, Daohai Yu, Jiahao Li, Ke Li, Zongyi Li, Xiaoyu Tan
cs.AI
Abstract
We introduce Youtu-LLM, a lightweight yet powerful language model that harmonizes high computational efficiency with native agentic intelligence. Unlike typical small models that rely on distillation, Youtu-LLM (1.96B) is pre-trained from scratch to systematically cultivate reasoning and planning capabilities. The key technical advancements are as follows: (1) Compact Architecture with Long-Context Support: Built on a dense Multi-Latent Attention (MLA) architecture with a novel STEM-oriented vocabulary, Youtu-LLM supports a 128k context window. This design enables robust long-context reasoning and state tracking within a minimal memory footprint, making it ideal for long-horizon agent and reasoning tasks. (2) Principled "Commonsense-STEM-Agent" Curriculum: We curated a massive corpus of approximately 11T tokens and implemented a multi-stage training strategy. By progressively shifting the pre-training data distribution from general commonsense to complex STEM and agentic tasks, we ensure the model acquires deep cognitive abilities rather than superficial alignment. (3) Scalable Agentic Mid-training: Specifically for the agentic mid-training, we employ diverse data construction schemes to synthesize rich and varied trajectories across math, coding, and tool-use domains. This high-quality data enables the model to internalize planning and reflection behaviors effectively. Extensive evaluations show that Youtu-LLM sets a new state-of-the-art for sub-2B LLMs. On general benchmarks, it achieves competitive performance against larger models, while on agent-specific tasks, it significantly surpasses existing SOTA baselines, demonstrating that lightweight models can possess strong intrinsic agentic capabilities.
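The abstract's first point rests on compressing the key-value cache so that a 128k context fits in a small memory footprint. Below is a minimal, self-contained PyTorch sketch of that latent KV-compression idea behind Multi-Latent Attention (MLA); the dimensions, module names, and the omission of RoPE, causal masking, and other production details are simplifying assumptions for illustration, not the authors' released implementation.

```python
# Minimal sketch of latent KV compression (MLA-style); hypothetical dimensions.
import math
import torch
import torch.nn as nn


class SimplifiedMLA(nn.Module):
    def __init__(self, d_model=1024, n_heads=16, d_latent=128):
        super().__init__()
        self.n_heads = n_heads
        self.d_head = d_model // n_heads
        # Queries are projected as in standard multi-head attention.
        self.w_q = nn.Linear(d_model, d_model, bias=False)
        # Keys/values are first compressed into a small shared latent vector...
        self.w_down_kv = nn.Linear(d_model, d_latent, bias=False)
        # ...and re-expanded per head only at attention time.
        self.w_up_k = nn.Linear(d_latent, d_model, bias=False)
        self.w_up_v = nn.Linear(d_latent, d_model, bias=False)
        self.w_o = nn.Linear(d_model, d_model, bias=False)

    def forward(self, x, latent_cache=None):
        b, t, _ = x.shape
        # Only the compressed latent is cached across decoding steps,
        # which is what keeps the long-context memory footprint small.
        c_kv = self.w_down_kv(x)                                  # (b, t, d_latent)
        if latent_cache is not None:
            c_kv = torch.cat([latent_cache, c_kv], dim=1)
        q = self.w_q(x).view(b, t, self.n_heads, self.d_head).transpose(1, 2)
        k = self.w_up_k(c_kv).view(b, -1, self.n_heads, self.d_head).transpose(1, 2)
        v = self.w_up_v(c_kv).view(b, -1, self.n_heads, self.d_head).transpose(1, 2)
        # Causal masking is omitted here for brevity.
        attn = torch.softmax(q @ k.transpose(-2, -1) / math.sqrt(self.d_head), dim=-1)
        out = (attn @ v).transpose(1, 2).reshape(b, t, -1)
        return self.w_o(out), c_kv                                 # updated latent cache


layer = SimplifiedMLA()
y, cache = layer(torch.randn(2, 8, 1024))                          # prefill
y2, cache = layer(torch.randn(2, 1, 1024), latent_cache=cache)     # one decode step
```

Because only the compressed latent is cached per token rather than full per-head keys and values, cache memory scales with the latent width instead of the model width, which is the property the abstract credits for long-horizon reasoning and state tracking at low memory cost.

The "Commonsense-STEM-Agent" curriculum can likewise be pictured as a staged sampling schedule over the ~11T-token corpus. The stage boundaries and mixture weights below are purely hypothetical placeholders used to illustrate the progressive distribution shift the abstract describes, not the ratios actually used in training.

```python
# Hypothetical staged data-mixing schedule illustrating the curriculum idea.
STAGES = [
    # (stage name, token budget, {domain: sampling weight})
    ("commonsense", 6.0e12, {"web_general": 0.70, "stem": 0.20, "agent": 0.10}),
    ("stem",        3.5e12, {"web_general": 0.35, "stem": 0.50, "agent": 0.15}),
    ("agentic_mid", 1.5e12, {"web_general": 0.15, "stem": 0.35, "agent": 0.50}),
]


def domain_weights(tokens_seen):
    """Return the stage name and sampling mixture for the current training step."""
    consumed = 0.0
    for name, budget, weights in STAGES:
        consumed += budget
        if tokens_seen < consumed:
            return name, weights
    return STAGES[-1][0], STAGES[-1][2]
```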