GLM-4.5: Agentic, Reasoning, and Coding (ARC) Foundation Models

Abstract

We present GLM-4.5, an open-source Mixture-of-Experts (MoE) large language model with 355B total parameters and 32B activated parameters, featuring a hybrid reasoning method that supports both thinking and direct response modes. Through multi-stage training on 23T tokens and comprehensive post-training with expert model iteration and reinforcement learning, GLM-4.5 achieves strong performance across agentic, reasoning, and coding (ARC) tasks, scoring 70.1% on TAU-Bench, 91.0% on AIME 24, and 64.2% on SWE-bench Verified. With far fewer parameters than several competitors, GLM-4.5 ranks 3rd overall among all evaluated models and 2nd on agentic benchmarks. We release both GLM-4.5 (355B parameters) and a compact version, GLM-4.5-Air (106B parameters), to advance research in reasoning and agentic AI systems. Code, models, and more information are available at https://github.com/zai-org/GLM-4.5.