

Technical Report of TeleChat2, TeleChat2.5 and T1

July 24, 2025
Authors: Zihan Wang, Xinzhang Liu, Yitong Yao, Chao Wang, Yu Zhao, Zhihao Yang, Wenmin Deng, Kaipeng Jia, Jiaxin Peng, Yuyao Huang, Sishi Xiong, Zhuo Jiang, Kaidong Yu, Xiaohui Hu, Fubei Yao, Ruiyu Fang, Zhuoru Jiang, Ruiting Song, Qiyi Xie, Rui Xue, Xuewei He, Yanlei Xue, Zhu Yuan, Zhaoxi Zhang, Zilu Huang, Shiquan Wang, Xin Wang, Hanming Wu, Mingyuan Wang, Xufeng Zhan, Yuhan Sun, Zhaohu Xing, Yuhao Jiang, Bingkai Yang, Shuangyong Song, Yongxiang Li, Zhongjiang He, Xuelong Li
cs.AI

Abstract

We introduce the latest series of TeleChat models: TeleChat2, TeleChat2.5, and T1, offering a significant upgrade over their predecessor, TeleChat. Despite minimal changes to the model architecture, the new series achieves substantial performance gains through enhanced training strategies in both the pretraining and post-training stages. The series begins with TeleChat2, which undergoes pretraining on 10 trillion high-quality and diverse tokens. This is followed by Supervised Fine-Tuning (SFT) and Direct Preference Optimization (DPO) to further enhance its capabilities. TeleChat2.5 and T1 expand the pipeline by incorporating a continual pretraining phase with domain-specific datasets, combined with reinforcement learning (RL) to improve performance on code generation and mathematical reasoning tasks. The T1 variant is designed for complex reasoning, supporting long Chain-of-Thought (CoT) reasoning and demonstrating substantial improvements in mathematics and coding. In contrast, TeleChat2.5 prioritizes speed, delivering rapid inference. The flagship models of both T1 and TeleChat2.5 are dense Transformer-based architectures with 115B parameters, showcasing significant advancements in reasoning and general task performance compared to the original TeleChat. Notably, T1-115B outperforms proprietary models such as OpenAI's o1-mini and GPT-4o. We publicly release TeleChat2, TeleChat2.5, and T1, including post-trained versions with 35B and 115B parameters, to empower developers and researchers with state-of-the-art language models tailored for diverse applications.
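
For readers unfamiliar with the preference-alignment step mentioned in the abstract, the following is a minimal sketch of the standard DPO objective (Rafailov et al., 2023) in PyTorch. The function name, tensor shapes, and the `beta` value are illustrative assumptions for exposition; this is not the authors' training code, whose details are given in the full report.

```python
import torch
import torch.nn.functional as F

def dpo_loss(policy_chosen_logps, policy_rejected_logps,
             ref_chosen_logps, ref_rejected_logps, beta=0.1):
    """Direct Preference Optimization loss.

    Each argument is a 1-D tensor of summed token log-probabilities for a
    batch of (chosen, rejected) response pairs, scored by the policy being
    trained and by the frozen reference (SFT) model.
    """
    # Implicit reward: scaled log-ratio of policy to reference model.
    chosen_rewards = beta * (policy_chosen_logps - ref_chosen_logps)
    rejected_rewards = beta * (policy_rejected_logps - ref_rejected_logps)
    # Maximize the margin between chosen and rejected implicit rewards.
    return -F.logsigmoid(chosen_rewards - rejected_rewards).mean()

# Toy usage with random log-probabilities for a batch of 4 pairs.
if __name__ == "__main__":
    b = 4
    loss = dpo_loss(torch.randn(b), torch.randn(b),
                    torch.randn(b), torch.randn(b))
    print(loss.item())
```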