Time is Encoded in the Weights of Finetuned Language Models

Abstract

We present time vectors, a simple tool to customize language models to new time periods. Time vectors are created by finetuning a language model on data from a single time (e.g., a year or month), and then subtracting the weights of the original pretrained model. This vector specifies a direction in weight space that, as our experiments show, improves performance on text from that time period. Time vectors specialized to adjacent time periods appear to be positioned closer together in a manifold. Using this structure, we interpolate between time vectors to induce new models that perform better on intervening and future time periods, without any additional training. We demonstrate the consistency of our findings across different tasks, domains, model sizes, and time scales. Our results suggest that time is encoded in the weight space of finetuned models.
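The two operations the abstract describes, subtracting pretrained weights from finetuned weights and interpolating between the resulting vectors, can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: weights are modeled as plain dictionaries of float lists instead of real tensors, the helper names (`time_vector`, `interpolate`, `apply_vector`) are hypothetical, the numbers are made up, and the convex combination with a single coefficient `alpha` is an assumed form of the interpolation.

```python
from typing import Dict, List

# Toy stand-in for model weights: parameter name -> flattened values.
Weights = Dict[str, List[float]]


def time_vector(finetuned: Weights, pretrained: Weights) -> Weights:
    """Subtract the pretrained weights from the finetuned weights, per parameter."""
    return {
        name: [f - p for f, p in zip(finetuned[name], pretrained[name])]
        for name in pretrained
    }


def interpolate(tau_a: Weights, tau_b: Weights, alpha: float) -> Weights:
    """Assumed interpolation form: alpha * tau_a + (1 - alpha) * tau_b."""
    return {
        name: [alpha * a + (1.0 - alpha) * b for a, b in zip(tau_a[name], tau_b[name])]
        for name in tau_a
    }


def apply_vector(pretrained: Weights, tau: Weights) -> Weights:
    """Add a time vector back onto the pretrained weights to obtain a new model."""
    return {
        name: [p + t for p, t in zip(pretrained[name], tau[name])]
        for name in pretrained
    }


# Made-up numbers purely for illustration.
pretrained = {"w": [1.0, 2.0]}
finetuned_2015 = {"w": [1.5, 2.0]}  # hypothetical model finetuned on 2015 text
finetuned_2017 = {"w": [2.0, 3.0]}  # hypothetical model finetuned on 2017 text

tau_2015 = time_vector(finetuned_2015, pretrained)
tau_2017 = time_vector(finetuned_2017, pretrained)

# A model for the intervening year, built without any additional training.
model_2016 = apply_vector(pretrained, interpolate(tau_2015, tau_2017, alpha=0.5))
```

In this toy setting, `model_2016` lies halfway between the two finetuned models in weight space. With real checkpoints the same arithmetic would be applied tensor by tensor over the model's parameters.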
