

ChatGLM: A Family of Large Language Models from GLM-130B to GLM-4 All Tools

June 18, 2024
Authors: Team GLM, Aohan Zeng, Bin Xu, Bowen Wang, Chenhui Zhang, Da Yin, Diego Rojas, Guanyu Feng, Hanlin Zhao, Hanyu Lai, Hao Yu, Hongning Wang, Jiadai Sun, Jiajie Zhang, Jiale Cheng, Jiayi Gui, Jie Tang, Jing Zhang, Juanzi Li, Lei Zhao, Lindong Wu, Lucen Zhong, Mingdao Liu, Minlie Huang, Peng Zhang, Qinkai Zheng, Rui Lu, Shuaiqi Duan, Shudan Zhang, Shulin Cao, Shuxun Yang, Weng Lam Tam, Wenyi Zhao, Xiao Liu, Xiao Xia, Xiaohan Zhang, Xiaotao Gu, Xin Lv, Xinghan Liu, Xinyi Liu, Xinyue Yang, Xixuan Song, Xunkai Zhang, Yifan An, Yifan Xu, Yilin Niu, Yuantao Yang, Yueyan Li, Yushi Bai, Yuxiao Dong, Zehan Qi, Zhaoyu Wang, Zhen Yang, Zhengxiao Du, Zhenyu Hou, Zihan Wang
cs.AI

Abstract

We introduce ChatGLM, an evolving family of large language models that we have been developing over time. This report primarily focuses on the GLM-4 language series, which includes GLM-4, GLM-4-Air, and GLM-4-9B. They represent our most capable models, trained with all the insights and lessons gained from the preceding three generations of ChatGLM. To date, the GLM-4 models are pre-trained on ten trillion tokens, mostly in Chinese and English, along with a small corpus from 24 languages, and aligned primarily for Chinese and English usage. The high-quality alignment is achieved via a multi-stage post-training process, which involves supervised fine-tuning and learning from human feedback. Evaluations show that GLM-4 1) closely rivals or outperforms GPT-4 on general metrics such as MMLU, GSM8K, MATH, BBH, GPQA, and HumanEval, 2) gets close to GPT-4-Turbo in instruction following as measured by IFEval, 3) matches GPT-4 Turbo (128K) and Claude 3 on long-context tasks, and 4) outperforms GPT-4 in Chinese alignment as measured by AlignBench. The GLM-4 All Tools model is further aligned to understand user intent and autonomously decide when and which tool(s) to use -- including the web browser, Python interpreter, text-to-image model, and user-defined functions -- to effectively complete complex tasks. In practical applications, it matches and even surpasses GPT-4 All Tools in tasks such as accessing online information via web browsing and solving math problems using the Python interpreter. Along the way, we have open-sourced a series of models, including ChatGLM-6B (three generations), GLM-4-9B (128K, 1M), GLM-4V-9B, WebGLM, and CodeGeeX, attracting over 10 million downloads on Hugging Face in 2023 alone. The open models can be accessed through https://github.com/THUDM and https://huggingface.co/THUDM.
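The abstract notes that the open models are distributed through the THUDM organization on GitHub and Hugging Face. As a minimal sketch (not taken from the paper), the snippet below shows how one might load and query an open GLM-4-9B chat checkpoint with the Hugging Face `transformers` library; the repo id `THUDM/glm-4-9b-chat`, dtype, and generation settings are assumptions and may differ from the officially documented usage.

```python
# Minimal sketch: querying an open GLM-4-9B chat checkpoint via transformers.
# Repo id and settings are assumptions; consult the THUDM model card for the
# officially supported usage.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

repo = "THUDM/glm-4-9b-chat"  # assumed checkpoint under https://huggingface.co/THUDM
tokenizer = AutoTokenizer.from_pretrained(repo, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    repo,
    torch_dtype=torch.bfloat16,   # assumed precision; adjust to your hardware
    device_map="auto",
    trust_remote_code=True,       # GLM checkpoints ship custom modeling code
)

# Build a chat-formatted prompt and generate a reply.
messages = [{"role": "user", "content": "Summarize what GLM-4 All Tools can do."}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

outputs = model.generate(inputs, max_new_tokens=256)
# Strip the prompt tokens and decode only the newly generated text.
print(tokenizer.decode(outputs[0][inputs.shape[1]:], skip_special_tokens=True))
```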
