Ling and Ring 2.6 Technical Report: Efficient and Instant Agentic Intelligence at Trillion-Parameter Scale

Abstract

Efficient and scalable agentic intelligence requires models that deliver both low-latency responses and strong reasoning capabilities while remaining practical to train, serve, and deploy. In this report, we present Ling-2.6 and Ring-2.6, a family of models designed to address this challenge at scale. Ling-2.6 is optimized for instant response generation and high capability per output token, whereas Ring-2.6 is tailored for deeper reasoning and more advanced agentic workflows. Instead of training from scratch, we upgrade the Ling-2.0 base model through architectural migration pre-training and large-scale post-training. This upgrade is guided by a unified co-design of model architecture, optimization objectives, serving systems, and agent training environments, enabling improvements in both model capability and deployment efficiency. At the architectural level, we introduce a hybrid linear attention design that integrates Lightning Attention with Multi-head Latent Attention (MLA), improving the efficiency of long-context training and decoding. To further enhance token efficiency, we optimize capability per output token through Evolutionary Chain-of-Thought, Linguistic Unit Policy Optimization, bidirectional preference alignment, and shortest-correct-response distillation. For agentic capabilities, we propose KPop, a reinforcement learning framework designed to support stable training of Ring-2.6-1T on large-scale environment-grounded data. KPop improves training efficiency through asynchronous scheduling across coding, search, tool use, and workflow execution, enabling scalable learning from complex agent-environment interactions. Together, Ling-2.6 and Ring-2.6 provide a practical pathway toward efficient, scalable, and open agentic systems. We open-source all checkpoints in the 2.6 family to support further research and development in practical agentic intelligence.
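
To illustrate why a linear attention layer can make long-context decoding cheaper, the following is a minimal sketch of generic causal linear attention with an optional scalar decay. It is not the Lightning Attention kernel or the hybrid layout used in Ling-2.6 and Ring-2.6; the function names, the `decay` parameter, and the plain-list representation are assumptions made for illustration. The recurrent form keeps a fixed-size state, so each decoding step costs the same no matter how many tokens precede it, while the quadratic form recomputes scores against every earlier token and serves here only as a reference.

```python
from typing import List

Matrix = List[List[float]]


def linear_attention_recurrent(q: Matrix, k: Matrix, v: Matrix, decay: float = 1.0) -> Matrix:
    """Causal linear attention computed token by token with a fixed-size state.

    state[i][j] holds the decayed sum over past tokens of k[s][i] * v[s][j],
    so one decoding step costs O(d_k * d_v) regardless of sequence length.
    """
    d_k, d_v = len(k[0]), len(v[0])
    state = [[0.0] * d_v for _ in range(d_k)]
    outputs = []
    for q_t, k_t, v_t in zip(q, k, v):
        # Fold the current key/value pair into the running state.
        for i in range(d_k):
            for j in range(d_v):
                state[i][j] = decay * state[i][j] + k_t[i] * v_t[j]
        # Read out with the current query: o_t = q_t^T state.
        outputs.append([sum(q_t[i] * state[i][j] for i in range(d_k)) for j in range(d_v)])
    return outputs


def linear_attention_quadratic(q: Matrix, k: Matrix, v: Matrix, decay: float = 1.0) -> Matrix:
    """Reference form: explicit causal scores against every earlier token, O(n^2)."""
    outputs = []
    for t in range(len(q)):
        out = [0.0] * len(v[0])
        for s in range(t + 1):
            score = sum(a * b for a, b in zip(q[t], k[s])) * decay ** (t - s)
            for j in range(len(out)):
                out[j] += score * v[s][j]
        outputs.append(out)
    return outputs
```

Both functions return the same outputs up to floating-point rounding; the difference is that the recurrent form never revisits earlier tokens, which is the property that makes linear attention attractive for long-context decoding.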