Youtu-LLM: Unlocking the Native Agentic Potential for Lightweight Large Language Models
December 31, 2025
Authors: Junru Lu, Jiarui Qin, Lingfeng Qiao, Yinghui Li, Xinyi Dai, Bo Ke, Jianfeng He, Ruizhi Qiao, Di Yin, Xing Sun, Yunsheng Wu, Yinsong Liu, Shuangyin Liu, Mingkong Tang, Haodong Lin, Jiayi Kuang, Fanxu Meng, Xiaojuan Tang, Yunjia Xi, Junjie Huang, Haotong Yang, Zhenyi Shen, Yangning Li, Qianwen Zhang, Yifei Yu, Siyu An, Junnan Dong, Qiufeng Wang, Jie Wang, Keyu Chen, Wei Wen, Taian Guo, Zhifeng Shen, Daohai Yu, Jiahao Li, Ke Li, Zongyi Li, Xiaoyu Tan
cs.AI
Abstract
We introduce Youtu-LLM, a lightweight yet powerful language model that harmonizes high computational efficiency with native agentic intelligence. Unlike typical small models that rely on distillation, Youtu-LLM (1.96B) is pre-trained from scratch to systematically cultivate reasoning and planning capabilities. The key technical advancements are as follows: (1) Compact Architecture with Long-Context Support: Built on a dense Multi-Latent Attention (MLA) architecture with a novel STEM-oriented vocabulary, Youtu-LLM supports a 128k context window. This design enables robust long-context reasoning and state tracking within a minimal memory footprint, making it ideal for long-horizon agent and reasoning tasks. (2) Principled "Commonsense-STEM-Agent" Curriculum: We curated a massive corpus of approximately 11T tokens and implemented a multi-stage training strategy. By progressively shifting the pre-training data distribution from general commonsense to complex STEM and agentic tasks, we ensure the model acquires deep cognitive abilities rather than superficial alignment. (3) Scalable Agentic Mid-training: Specifically for the agentic mid-training, we employ diverse data construction schemes to synthesize rich and varied trajectories across math, coding, and tool-use domains. This high-quality data enables the model to internalize planning and reflection behaviors effectively. Extensive evaluations show that Youtu-LLM sets a new state-of-the-art for sub-2B LLMs. On general benchmarks, it achieves competitive performance against larger models, while on agent-specific tasks, it significantly surpasses existing SOTA baselines, demonstrating that lightweight models can possess strong intrinsic agentic capabilities.
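The abstract attributes the model's small long-context memory footprint to a dense Multi-Latent Attention (MLA) backbone, whose key idea is caching a shared low-rank latent instead of full per-head keys and values. The sketch below is a minimal, hypothetical PyTorch illustration of that idea under assumed dimensions (d_model, d_latent, n_heads); it omits RoPE, normalization, and any Youtu-LLM-specific choices, and is not the released model's implementation.

```python
# Minimal sketch of Multi-Latent Attention (MLA)-style KV compression.
# All dimensions and module names are illustrative assumptions, not the
# actual Youtu-LLM configuration.
import torch
import torch.nn as nn
import torch.nn.functional as F

class MultiLatentAttention(nn.Module):
    def __init__(self, d_model=1024, n_heads=16, d_latent=128):
        super().__init__()
        self.n_heads = n_heads
        self.d_head = d_model // n_heads
        # Queries are projected as in standard multi-head attention.
        self.w_q = nn.Linear(d_model, d_model, bias=False)
        # Keys/values are first compressed into a small shared latent ...
        self.w_down_kv = nn.Linear(d_model, d_latent, bias=False)
        # ... and expanded per head on the fly; only the latent is cached.
        self.w_up_k = nn.Linear(d_latent, d_model, bias=False)
        self.w_up_v = nn.Linear(d_latent, d_model, bias=False)
        self.w_o = nn.Linear(d_model, d_model, bias=False)

    def forward(self, x, latent_cache=None):
        b, t, _ = x.shape
        q = self.w_q(x).view(b, t, self.n_heads, self.d_head).transpose(1, 2)

        # Compress the new tokens' hidden states into the KV latent.
        latent = self.w_down_kv(x)                       # (b, t, d_latent)
        if latent_cache is not None:                     # append to cached latents
            latent = torch.cat([latent_cache, latent], dim=1)

        s = latent.size(1)
        k = self.w_up_k(latent).view(b, s, self.n_heads, self.d_head).transpose(1, 2)
        v = self.w_up_v(latent).view(b, s, self.n_heads, self.d_head).transpose(1, 2)

        # Standard scaled dot-product attention over the reconstructed K/V.
        out = F.scaled_dot_product_attention(q, k, v, is_causal=latent_cache is None)
        out = out.transpose(1, 2).reshape(b, t, -1)
        return self.w_o(out), latent                     # cache only the latent

if __name__ == "__main__":
    mla = MultiLatentAttention()
    x = torch.randn(2, 8, 1024)
    y, cache = mla(x)
    print(y.shape, cache.shape)  # (2, 8, 1024) and (2, 8, 128)
```

In this sketch, the per-layer cache holds a single (batch, seq, d_latent) tensor rather than separate per-head K and V of total width 2*d_model, shrinking the KV cache by roughly a factor of 2*d_model/d_latent; a reduction of this kind is what makes a 128k context window practical within a small memory budget.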