

ChatGLM: A Family of Large Language Models from GLM-130B to GLM-4 All Tools

June 18, 2024
Authors: Team GLM, Aohan Zeng, Bin Xu, Bowen Wang, Chenhui Zhang, Da Yin, Diego Rojas, Guanyu Feng, Hanlin Zhao, Hanyu Lai, Hao Yu, Hongning Wang, Jiadai Sun, Jiajie Zhang, Jiale Cheng, Jiayi Gui, Jie Tang, Jing Zhang, Juanzi Li, Lei Zhao, Lindong Wu, Lucen Zhong, Mingdao Liu, Minlie Huang, Peng Zhang, Qinkai Zheng, Rui Lu, Shuaiqi Duan, Shudan Zhang, Shulin Cao, Shuxun Yang, Weng Lam Tam, Wenyi Zhao, Xiao Liu, Xiao Xia, Xiaohan Zhang, Xiaotao Gu, Xin Lv, Xinghan Liu, Xinyi Liu, Xinyue Yang, Xixuan Song, Xunkai Zhang, Yifan An, Yifan Xu, Yilin Niu, Yuantao Yang, Yueyan Li, Yushi Bai, Yuxiao Dong, Zehan Qi, Zhaoyu Wang, Zhen Yang, Zhengxiao Du, Zhenyu Hou, Zihan Wang
cs.AI

Abstract

We introduce ChatGLM, an evolving family of large language models that we have been developing over time. This report primarily focuses on the GLM-4 language series, which includes GLM-4, GLM-4-Air, and GLM-4-9B. They represent our most capable models, trained with all the insights and lessons gained from the preceding three generations of ChatGLM. To date, the GLM-4 models have been pre-trained on ten trillion tokens, mostly in Chinese and English, along with a small set of corpora from 24 languages, and are aligned primarily for Chinese and English usage. The high-quality alignment is achieved via a multi-stage post-training process that involves supervised fine-tuning and learning from human feedback. Evaluations show that GLM-4 1) closely rivals or outperforms GPT-4 on general metrics such as MMLU, GSM8K, MATH, BBH, GPQA, and HumanEval, 2) gets close to GPT-4-Turbo in instruction following as measured by IFEval, 3) matches GPT-4 Turbo (128K) and Claude 3 on long-context tasks, and 4) outperforms GPT-4 in Chinese alignment as measured by AlignBench. The GLM-4 All Tools model is further aligned to understand user intent and autonomously decide when and which tool(s) to use -- including a web browser, Python interpreter, text-to-image model, and user-defined functions -- to effectively complete complex tasks. In practical applications, it matches and even surpasses GPT-4 All Tools on tasks such as accessing online information via web browsing and solving math problems with the Python interpreter. Over the course of this work, we have open-sourced a series of models, including ChatGLM-6B (three generations), GLM-4-9B (128K, 1M), GLM-4V-9B, WebGLM, and CodeGeeX, attracting over 10 million downloads on Hugging Face in 2023 alone. The open models can be accessed through https://github.com/THUDM and https://huggingface.co/THUDM.
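
The open checkpoints mentioned above can typically be loaded with the standard Hugging Face transformers API. The following is a minimal sketch only: the repository id THUDM/glm-4-9b-chat, the need for trust_remote_code=True, and the chat-template call are assumptions based on common Hugging Face conventions, not details given in the abstract.

```python
# Minimal sketch: load an open GLM-4 checkpoint from the THUDM organization on Hugging Face.
# Repo id and trust_remote_code are assumptions; check https://huggingface.co/THUDM for specifics.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "THUDM/glm-4-9b-chat"  # assumed checkpoint name

tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # half precision to fit on a single GPU
    device_map="auto",
    trust_remote_code=True,
)

# Build a chat-style prompt and generate a reply.
messages = [{"role": "user", "content": "Summarize what GLM-4 All Tools can do."}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

outputs = model.generate(inputs, max_new_tokens=256)
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```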
