Reverse Training to Nurse the Reversal Curse

Abstract

Large language models (LLMs) have a surprising failure: when trained on "A has a feature B", they do not generalize to "B is a feature of A". This failure is termed the Reversal Curse. Because of Zipf's law, the issue still appears even when training on trillions of tokens, and hence even when training on the entire internet. This work proposes an alternative training scheme, called reverse training, in which all words are used twice, doubling the amount of available tokens. The LLM is trained in both the forward and reverse directions by reversing the training strings while preserving (i.e., not reversing) chosen substrings, such as entities. We show that data-matched reverse-trained models provide superior performance to standard models on standard tasks, and that compute-matched reverse-trained models provide far superior performance on reversal tasks, helping resolve the Reversal Curse.
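The reversal step described in the abstract can be illustrated with a minimal sketch. It assumes word-level reversal on whitespace and a caller-supplied list of entity strings. The function names `reverse_preserving_entities` and `build_training_set` and the example sentence are hypothetical and serve only as illustration. They are not taken from the paper's implementation, which may segment and reverse text differently.

```python
import re


def reverse_preserving_entities(text, entities):
    """Reverse the word order of `text`, keeping each entity's words in order.

    Assumption: entities are matched as plain substrings, so a real pipeline
    would need a proper entity detector and boundary handling.
    """
    entities = [e for e in entities if e]
    if entities:
        # Try longer entities first so they win over their own prefixes.
        ordered = sorted(entities, key=len, reverse=True)
        pattern = re.compile("(" + "|".join(re.escape(e) for e in ordered) + ")")
        # A capturing group makes re.split keep the matched entities.
        parts = pattern.split(text)
    else:
        parts = [text]

    entity_set = set(entities)
    units = []
    for part in parts:
        if part in entity_set:
            units.append(part)          # entity: kept as one unreversed unit
        else:
            units.extend(part.split())  # ordinary text: one unit per word
    return " ".join(reversed(units))


def build_training_set(examples, entities):
    """Return each example followed by its reversed copy (every word used twice)."""
    augmented = []
    for example in examples:
        augmented.append(example)
        augmented.append(reverse_preserving_entities(example, entities))
    return augmented


if __name__ == "__main__":
    # Hypothetical sentence and entities, for illustration only.
    sentence = "Alice Smith wrote The Blue Door"
    names = ["Alice Smith", "The Blue Door"]
    for line in build_training_set([sentence], names):
        print(line)
```

In this sketch the reversed copy places "The Blue Door" before "Alice Smith" while leaving both names readable, so a model trained on both copies sees each entity in either position relative to the other.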
