Time is Encoded in the Weights of Finetuned Language Models

December 20, 2023
Authors: Kai Nylund, Suchin Gururangan, Noah A. Smith
cs.AI

Abstract

We present time vectors, a simple tool to customize language models to new time periods. Time vectors are created by finetuning a language model on data from a single time (e.g., a year or month), and then subtracting the weights of the original pretrained model. This vector specifies a direction in weight space that, as our experiments show, improves performance on text from that time period. Time vectors specialized to adjacent time periods appear to be positioned closer together in a manifold. Using this structure, we interpolate between time vectors to induce new models that perform better on intervening and future time periods, without any additional training. We demonstrate the consistency of our findings across different tasks, domains, model sizes, and time scales. Our results suggest that time is encoded in the weight space of finetuned models.
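
The weight arithmetic described in the abstract can be illustrated with a toy sketch. This is not the authors' code: the dictionary-of-lists weight format, the function names (`time_vector`, `interpolate`, `apply_vector`), the example years, and the linear mixing rule with coefficient `alpha` are all assumptions made here for illustration. The abstract states only that time vectors are obtained by subtraction and then interpolated.

```python
from typing import Dict, List

# Toy stand-in for model weights: parameter name -> flat list of values.
# (Hypothetical format; a real model would use framework tensors.)
Weights = Dict[str, List[float]]


def time_vector(finetuned: Weights, pretrained: Weights) -> Weights:
    """Subtract the pretrained weights from the finetuned weights, per parameter."""
    return {
        name: [f - p for f, p in zip(finetuned[name], pretrained[name])]
        for name in pretrained
    }


def interpolate(tau_a: Weights, tau_b: Weights, alpha: float) -> Weights:
    """Linear mix alpha * tau_a + (1 - alpha) * tau_b (assumed interpolation rule)."""
    return {
        name: [alpha * a + (1 - alpha) * b for a, b in zip(tau_a[name], tau_b[name])]
        for name in tau_a
    }


def apply_vector(pretrained: Weights, tau: Weights) -> Weights:
    """Add a time vector back onto the pretrained weights to get a new model."""
    return {
        name: [p + t for p, t in zip(pretrained[name], tau[name])]
        for name in pretrained
    }


# Made-up numbers: one parameter with two values, finetuned on two different years.
pretrained = {"w": [1.0, 2.0]}
finetuned_2015 = {"w": [1.5, 2.0]}
finetuned_2017 = {"w": [1.0, 3.0]}

tau_2015 = time_vector(finetuned_2015, pretrained)
tau_2017 = time_vector(finetuned_2017, pretrained)

# Halfway mix as a candidate model for the intervening year, with no extra training.
tau_2016 = interpolate(tau_2015, tau_2017, alpha=0.5)
model_2016 = apply_vector(pretrained, tau_2016)
print(model_2016)
```

In this sketch, `alpha` controls how much each year's vector contributes. Whether an even mix is the best choice for an intervening period is an empirical question that the abstract does not settle.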