

Compression Represents Intelligence Linearly

April 15, 2024
Authors: Yuzhen Huang, Jinghan Zhang, Zifei Shan, Junxian He
cs.AI

Abstract
There is a belief that learning to compress well will lead to intelligence. Recently, language modeling has been shown to be equivalent to compression, which offers a compelling rationale for the success of large language models (LLMs): the development of more advanced language models essentially enhances compression, which in turn facilitates intelligence. Despite such appealing discussions, little empirical evidence exists for the interplay between compression and intelligence. In this work, we examine their relationship in the context of LLMs, treating LLMs as data compressors. Given the abstract nature of "intelligence", we adopt average downstream benchmark scores as a surrogate, specifically targeting intelligence related to knowledge and commonsense, coding, and mathematical reasoning. Across 12 benchmarks, our study brings together 30 public LLMs originating from diverse organizations. Remarkably, we find that LLMs' intelligence -- reflected by average benchmark scores -- correlates almost linearly with their ability to compress external text corpora. These results provide concrete evidence supporting the belief that superior compression indicates greater intelligence. Furthermore, our findings suggest that compression efficiency, as an unsupervised metric derived from raw text corpora, serves as a reliable evaluation measure that is linearly associated with model capabilities. We open-source our compression datasets as well as our data collection pipelines so that future researchers can assess compression properly.
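The equivalence the abstract invokes rests on arithmetic coding: a language model that assigns probability p to the next token can encode that token in -log2(p) bits, so a model's total negative log2-likelihood on a corpus is its compressed code length, and dividing by the character count yields bits per character (lower means better compression). The following is a minimal sketch of that metric, not the paper's actual pipeline; the function name and the toy uniform-probability tokens are illustrative assumptions, since in practice the per-token log-likelihoods would come from an LLM's forward pass.

```python
import math

def bits_per_character(token_logprobs, text):
    """Compression efficiency of a language model on `text`.

    Under arithmetic coding, a token assigned probability p costs
    -log2(p) bits, so the total code length is the summed negative
    log2-likelihood. Dividing by the character count normalizes away
    tokenizer differences between models.
    """
    total_bits = -sum(lp / math.log(2) for lp in token_logprobs)  # nats -> bits
    return total_bits / len(text)

# Toy example: a 20-character string split into 5 tokens, each assigned
# probability 0.25 by a hypothetical model: 5 tokens * 2 bits = 10 bits,
# over 20 characters, i.e. 0.5 bits per character.
text = "hello compression!!!"
logprobs = [math.log(0.25)] * 5  # natural-log probabilities, one per token
print(bits_per_character(logprobs, text))  # → 0.5
```

Normalizing by characters rather than tokens is what makes the metric comparable across models with different tokenizers, which is essential when ranking 30 LLMs from diverse organizations against one benchmark corpus.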

