Does Time Have Its Place? Temporal Heads: Where Language Models Recall Time-specific Information

Abstract

While the ability of language models to recall facts has been widely investigated, how they handle temporally changing facts remains underexplored. Through circuit analysis, we discover Temporal Heads, specific attention heads that are primarily responsible for processing temporal knowledge. We confirm that these heads are present across multiple models, though their specific locations may vary, and that their responses differ depending on the type of knowledge and its corresponding years. Disabling these heads degrades the model's ability to recall time-specific knowledge, while its general capabilities are maintained: performance on time-invariant knowledge and on question answering is not compromised. Moreover, the heads are activated not only by numeric conditions ("In 2004") but also by textual aliases ("In the year ..."), indicating that they encode a temporal dimension beyond simple numerical representation. Furthermore, we expand the potential of our findings by demonstrating how temporal knowledge can be edited by adjusting the values of these heads.
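As a rough illustration of what "disabling" an attention head can mean, the sketch below assumes zero-ablation: the head's contribution is dropped before the per-head outputs are summed. The abstract does not state which ablation method is used, so this is only one plausible reading. The function name `attention_output`, the `(layer, head)` addressing, and the toy numbers are all hypothetical and do not come from the paper.

```python
from typing import Dict, List, Set, Tuple

Head = Tuple[int, int]  # hypothetical (layer, head) address


def attention_output(head_outputs: Dict[Head, List[float]],
                     ablated: Set[Head]) -> List[float]:
    """Sum per-head contributions, zero-ablating every head in `ablated`."""
    dim = len(next(iter(head_outputs.values())))
    total = [0.0] * dim
    for head, vec in head_outputs.items():
        if head in ablated:
            continue  # a disabled head contributes nothing to the sum
        total = [t + v for t, v in zip(total, vec)]
    return total


# Toy per-head contributions (made-up numbers, not taken from the paper).
toy = {
    (0, 1): [1.0, 0.0],
    (0, 2): [0.0, 2.0],
    (1, 0): [0.5, 0.5],
}
full = attention_output(toy, ablated=set())
without_head = attention_output(toy, ablated={(0, 2)})
```

Comparing `full` with `without_head` shows how much a single head contributes. In an actual ablation study the same comparison would be made on model outputs for time-specific versus time-invariant prompts.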

