Time is Encoded in the Weights of Finetuned Language Models

December 20, 2023
Authors: Kai Nylund, Suchin Gururangan, Noah A. Smith
cs.AI

Abstract

We present time vectors, a simple tool to customize language models to new time periods. Time vectors are created by finetuning a language model on data from a single time (e.g., a year or month), and then subtracting the weights of the original pretrained model. This vector specifies a direction in weight space that, as our experiments show, improves performance on text from that time period. Time vectors specialized to adjacent time periods appear to be positioned closer together in a manifold. Using this structure, we interpolate between time vectors to induce new models that perform better on intervening and future time periods, without any additional training. We demonstrate the consistency of our findings across different tasks, domains, model sizes, and time scales. Our results suggest that time is encoded in the weight space of finetuned models.
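The weight arithmetic described in the abstract can be sketched in a few lines. The snippet below is an illustrative sketch only, not the authors' released code: the function names, the `scale` and `alpha` parameters, and the toy tensors are assumptions introduced here to make the idea concrete.

```python
import torch

def time_vector(finetuned, pretrained):
    """Time vector: finetuned weights minus the original pretrained weights."""
    return {k: finetuned[k] - pretrained[k] for k in pretrained}

def apply_time_vector(pretrained, tau, scale=1.0):
    """Move the pretrained model along a (scaled) time vector."""
    return {k: pretrained[k] + scale * tau[k] for k in pretrained}

def interpolate_time_vectors(tau_a, tau_b, alpha):
    """Linearly interpolate between two time vectors, with alpha in [0, 1]."""
    return {k: (1 - alpha) * tau_a[k] + alpha * tau_b[k] for k in tau_a}

# Toy usage with random "weights"; real use would load the state_dicts of the
# pretrained model and of models finetuned on, e.g., 2015 and 2020 text.
pretrained = {"w": torch.randn(4, 4)}
ft_2015 = {"w": pretrained["w"] + 0.1 * torch.randn(4, 4)}
ft_2020 = {"w": pretrained["w"] + 0.1 * torch.randn(4, 4)}

tau_2015 = time_vector(ft_2015, pretrained)
tau_2020 = time_vector(ft_2020, pretrained)
tau_mid = interpolate_time_vectors(tau_2015, tau_2020, alpha=0.5)
model_mid = apply_time_vector(pretrained, tau_mid)
```

Interpolating between two time vectors and adding the result back onto the pretrained weights yields a new model for an intervening period without any additional training, which is the key property the paper reports.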