

Technical Report of TeleChat2, TeleChat2.5 and T1

July 24, 2025
Authors: Zihan Wang, Xinzhang Liu, Yitong Yao, Chao Wang, Yu Zhao, Zhihao Yang, Wenmin Deng, Kaipeng Jia, Jiaxin Peng, Yuyao Huang, Sishi Xiong, Zhuo Jiang, Kaidong Yu, Xiaohui Hu, Fubei Yao, Ruiyu Fang, Zhuoru Jiang, Ruiting Song, Qiyi Xie, Rui Xue, Xuewei He, Yanlei Xue, Zhu Yuan, Zhaoxi Zhang, Zilu Huang, Shiquan Wang, Xin Wang, Hanming Wu, Mingyuan Wang, Xufeng Zhan, Yuhan Sun, Zhaohu Xing, Yuhao Jiang, Bingkai Yang, Shuangyong Song, Yongxiang Li, Zhongjiang He, Xuelong Li
cs.AI

Abstract

We introduce the latest series of TeleChat models: TeleChat2, TeleChat2.5, and T1, offering a significant upgrade over their predecessor, TeleChat. Despite minimal changes to the model architecture, the new series achieves substantial performance gains through enhanced training strategies in both the pretraining and post-training stages. The series begins with TeleChat2, which undergoes pretraining on 10 trillion high-quality and diverse tokens. This is followed by Supervised Fine-Tuning (SFT) and Direct Preference Optimization (DPO) to further enhance its capabilities. TeleChat2.5 and T1 expand the pipeline by incorporating a continual pretraining phase with domain-specific datasets, combined with reinforcement learning (RL) to improve performance on code generation and mathematical reasoning tasks. The T1 variant is designed for complex reasoning, supporting long Chain-of-Thought (CoT) reasoning and demonstrating substantial improvements in mathematics and coding. In contrast, TeleChat2.5 prioritizes speed, delivering rapid inference. The flagship models of both T1 and TeleChat2.5 are dense Transformer-based architectures with 115B parameters, showcasing significant advancements in reasoning and general task performance compared to the original TeleChat. Notably, T1-115B outperforms proprietary models such as OpenAI's o1-mini and GPT-4o. We publicly release TeleChat2, TeleChat2.5, and T1, including post-trained versions with 35B and 115B parameters, to empower developers and researchers with state-of-the-art language models tailored for diverse applications.
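
For readers unfamiliar with the preference-alignment step mentioned in the abstract, the following is a minimal sketch of the standard DPO objective (Rafailov et al., 2023) in PyTorch. The function name, tensor shapes, and the `beta` value are illustrative assumptions for exposition; this is not the authors' training code, whose details are given in the full report.

```python
import torch
import torch.nn.functional as F

def dpo_loss(policy_chosen_logps, policy_rejected_logps,
             ref_chosen_logps, ref_rejected_logps, beta=0.1):
    """Direct Preference Optimization loss.

    Each argument is a 1-D tensor of summed token log-probabilities for a
    batch of (chosen, rejected) response pairs, scored by the policy being
    trained and by the frozen reference (SFT) model.
    """
    # Implicit reward: scaled log-ratio of policy to reference model.
    chosen_rewards = beta * (policy_chosen_logps - ref_chosen_logps)
    rejected_rewards = beta * (policy_rejected_logps - ref_rejected_logps)
    # Maximize the margin between chosen and rejected implicit rewards.
    return -F.logsigmoid(chosen_rewards - rejected_rewards).mean()

# Toy usage with random log-probabilities for a batch of 4 pairs.
if __name__ == "__main__":
    b = 4
    loss = dpo_loss(torch.randn(b), torch.randn(b),
                    torch.randn(b), torch.randn(b))
    print(loss.item())
```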