Memory-based Language Models: An Efficient, Explainable, and Eco-friendly Approach to Large Language Modeling
October 25, 2025
Authors: Antal van den Bosch, Ainhoa Risco Patón, Teun Buijse, Peter Berck, Maarten van Gompel
cs.AI
Abstract
We present memory-based language modeling as an efficient, eco-friendly
alternative to deep neural network-based language modeling. It offers
log-linearly scalable next-token prediction performance and strong memorization
capabilities. Implementing fast approximations of k-nearest neighbor
classification, memory-based language modeling leaves a relatively small
ecological footprint both in training and in inference mode, as it relies fully
on CPUs and attains low token latencies. Its internal workings are simple and
fully transparent. We compare our implementation of memory-based language
modeling, OLIFANT, with GPT-2 and GPT-Neo on next-token prediction accuracy,
estimated emissions, and speeds, and offer some deeper analyses of the model.
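
To make the idea concrete, below is a minimal Python sketch of memory-based next-token prediction under assumed simplifications: contexts are fixed-width token windows stored in a flat list, similarity is positional overlap, and the k nearest neighbors are found by exhaustive search. This is not the OLIFANT implementation, which uses fast approximations of k-nearest neighbor classification; the window size n, the value of k, and all function names here are illustrative choices.

# Illustrative sketch of memory-based next-token prediction, NOT the
# OLIFANT implementation. "Training" stores every context window verbatim;
# prediction is a majority vote over the k most similar stored contexts.
from collections import Counter

def build_memory(tokens, n=3):
    # Store each length-n context window together with the token that follows it.
    return [(tuple(tokens[i:i + n]), tokens[i + n])
            for i in range(len(tokens) - n)]

def overlap(a, b):
    # Assumed similarity: number of positions where the two contexts match.
    return sum(x == y for x, y in zip(a, b))

def predict_next(memory, context, k=3):
    # Exhaustive k-NN search (OLIFANT approximates this step for speed):
    # rank stored contexts by overlap with the query, vote over the top k.
    neighbors = sorted(memory, key=lambda m: overlap(m[0], context), reverse=True)[:k]
    votes = Counter(next_tok for _, next_tok in neighbors)
    return votes.most_common(1)[0][0]

tokens = "the cat sat on the mat and the cat sat on the chair".split()
memory = build_memory(tokens, n=3)
print(predict_next(memory, ("the", "cat", "sat")))  # -> "on"

Because training amounts to storing windows, the model memorizes its training data exactly, which matches the strong memorization capabilities claimed in the abstract; inference cost is dominated by the neighbor search, which is where the fast k-NN approximations mentioned above come into play.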