Ling and Ring 2.6 Technical Report: Efficient and Instant Agentic Intelligence at Trillion-Parameter Scale

Abstract

Efficient and scalable agentic intelligence requires models that deliver both low-latency responses and strong reasoning capabilities while remaining practical to train, serve, and deploy. In this report, we present Ling-2.6 and Ring-2.6, a family of models designed to address this challenge at scale. Ling-2.6 is optimized for instant response generation and high capability per output token, whereas Ring-2.6 is tailored for deeper reasoning and more advanced agentic workflows. Instead of training from scratch, we upgrade the Ling-2.0 base model through architectural migration pre-training and large-scale post-training. This upgrade is guided by a unified co-design of model architecture, optimization objectives, serving systems, and agent training environments, enabling improvements in both model capability and deployment efficiency. At the architectural level, we introduce a hybrid linear attention design that integrates Lightning Attention with MLA, improving the efficiency of long-context training and decoding. To further enhance token efficiency, we optimize capability per output token through Evolutionary Chain-of-Thought, Linguistic Unit Policy Optimization, bidirectional preference alignment, and shortest-correct-response distillation. For agentic capabilities, we propose KPop, a reinforcement learning framework designed to support stable training of Ring-2.6-1T on large-scale environment-grounded data. KPop improves training efficiency through asynchronous scheduling across coding, search, tool use, and workflow execution, enabling scalable learning from complex agent-environment interactions. Together, Ling-2.6 and Ring-2.6 provide a practical pathway toward efficient, scalable, and open agentic systems. We open-source all checkpoints in the 2.6 family to support further research and development in practical agentic intelligence.
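To illustrate why a linear attention component can make decoding cheaper, the following is a minimal sketch of generic causal linear attention without softmax. It is not the report's Lightning Attention kernel or its hybrid layout with MLA; the function names and the toy inputs are assumptions made for illustration only. The sketch shows that a token-by-token recurrence over a fixed-size state gives the same outputs as the explicit pairwise form, so per-token decoding cost does not grow with context length.

```python
def linear_attention_recurrent(qs, ks, vs):
    """Causal linear attention, one token at a time, with a fixed-size state.

    Hypothetical helper for illustration; not taken from the report.
    qs, ks: lists of length-d_k vectors; vs: list of length-d_v vectors.
    """
    d_k, d_v = len(ks[0]), len(vs[0])
    # state[i][j] accumulates the sum over seen tokens of k[i] * v[j].
    state = [[0.0] * d_v for _ in range(d_k)]
    outputs = []
    for q, k, v in zip(qs, ks, vs):
        # Fold the current token into the state (outer product k v^T).
        for i in range(d_k):
            for j in range(d_v):
                state[i][j] += k[i] * v[j]
        # Read out with the query: o = q^T state.
        outputs.append(
            [sum(q[i] * state[i][j] for i in range(d_k)) for j in range(d_v)]
        )
    return outputs


def linear_attention_quadratic(qs, ks, vs):
    """Same computation via explicit causal pairwise scores (no softmax)."""
    d_v = len(vs[0])
    outputs = []
    for t, q in enumerate(qs):
        out = [0.0] * d_v
        for s in range(t + 1):  # causal: attend only to positions <= t
            score = sum(qi * ki for qi, ki in zip(q, ks[s]))
            for j in range(d_v):
                out[j] += score * vs[s][j]
        outputs.append(out)
    return outputs


# Toy inputs (assumed values, chosen only to exercise the code).
qs = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
ks = [[1.0, 2.0], [0.5, 0.0], [0.0, 1.0]]
vs = [[1.0], [2.0], [3.0]]

recurrent_out = linear_attention_recurrent(qs, ks, vs)
quadratic_out = linear_attention_quadratic(qs, ks, vs)
```

The recurrent form keeps only a d_k by d_v state regardless of how many tokens have been processed, whereas the pairwise form revisits every earlier position at each step. Production kernels typically add refinements such as decay and blockwise computation that this sketch omits.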