Time is Encoded in the Weights of Finetuned Language Models

Abstract

We present time vectors, a simple tool to customize language models to new time periods. Time vectors are created by finetuning a language model on data from a single time (e.g., a year or month), and then subtracting the weights of the original pretrained model. This vector specifies a direction in weight space that, as our experiments show, improves performance on text from that time period. Time vectors specialized to adjacent time periods appear to be positioned closer together in a manifold. Using this structure, we interpolate between time vectors to induce new models that perform better on intervening and future time periods, without any additional training. We demonstrate the consistency of our findings across different tasks, domains, model sizes, and time scales. Our results suggest that time is encoded in the weight space of finetuned models.
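The construction described in the abstract can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: it assumes model weights are available as a mapping from parameter names to flat lists of floats, and it assumes plain linear interpolation with a mixing coefficient `alpha`. The helper names (`time_vector`, `interpolate`, `apply_vector`), the toy weights, and the years 2012 and 2016 are all hypothetical.

```python
from typing import Dict, List

# Assumption: a model is represented as {parameter name: flat list of floats}.
Weights = Dict[str, List[float]]


def time_vector(finetuned: Weights, pretrained: Weights) -> Weights:
    """Subtract the pretrained weights from the finetuned weights, per parameter."""
    return {
        name: [f - p for f, p in zip(finetuned[name], pretrained[name])]
        for name in pretrained
    }


def interpolate(tau_a: Weights, tau_b: Weights, alpha: float) -> Weights:
    """Linear interpolation (assumed form): (1 - alpha) * tau_a + alpha * tau_b."""
    return {
        name: [(1 - alpha) * a + alpha * b for a, b in zip(tau_a[name], tau_b[name])]
        for name in tau_a
    }


def apply_vector(pretrained: Weights, tau: Weights) -> Weights:
    """Add a time vector back onto the pretrained weights to obtain a new model."""
    return {
        name: [p + t for p, t in zip(pretrained[name], tau[name])]
        for name in pretrained
    }


# Toy stand-ins for real checkpoints (hypothetical values and years).
pretrained = {"layer.weight": [0.0, 1.0, 2.0]}
finetuned_2012 = {"layer.weight": [1.0, 1.0, 4.0]}
finetuned_2016 = {"layer.weight": [3.0, 3.0, 2.0]}

tau_2012 = time_vector(finetuned_2012, pretrained)
tau_2016 = time_vector(finetuned_2016, pretrained)

# A model for an intervening period, obtained without any further training.
tau_mid = interpolate(tau_2012, tau_2016, alpha=0.5)
model_mid = apply_vector(pretrained, tau_mid)
print(model_mid)
```

With real checkpoints the same arithmetic would be applied to each tensor in the model's parameter dictionary; how `alpha` should be chosen for a given target period is not specified in the abstract and is left open here.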
