Reverse Training to Nurse the Reversal Curse
March 20, 2024
Authors: Olga Golovneva, Zeyuan Allen-Zhu, Jason Weston, Sainbayar Sukhbaatar
cs.AI
Abstract
Large language models (LLMs) have a surprising failure: when trained on "A
has a feature B", they do not generalize to "B is a feature of A", which is
termed the Reversal Curse. Even when training with trillions of tokens this
issue still appears due to Zipf's law - hence even if we train on the entire
internet. This work proposes an alternative training scheme, called reverse
training, whereby all words are used twice, doubling the amount of available
tokens. The LLM is trained in both forward and reverse directions by reversing
the training strings while preserving (i.e., not reversing) chosen substrings,
such as entities. We show that data-matched reverse-trained models provide
superior performance to standard models on standard tasks, and compute-matched
reverse-trained models provide far superior performance on reversal tasks,
helping resolve the reversal curse issue.Summary
AI-Generated Summary
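
The core data operation the abstract describes is reversing each training string while keeping chosen substrings, such as entity names, in their original order. Below is a minimal sketch of that idea, assuming word-level reversal and a pre-supplied entity list; the function name `reverse_with_entities`, the regex-based entity matching, and the example sentence are illustrative assumptions, not the paper's implementation.

```python
import re

def reverse_with_entities(text: str, entities: list[str]) -> str:
    """Reverse the word order of `text`, treating each listed entity as a single unit."""
    if not entities:
        return " ".join(reversed(text.split()))
    # Match longer entities first so overlapping names are handled greedily.
    entity_pattern = "|".join(re.escape(e) for e in sorted(entities, key=len, reverse=True))
    # Split the string into entity spans and ordinary whitespace-delimited words.
    tokens = re.findall(rf"{entity_pattern}|\S+", text)
    # Reverse the sequence of units; words inside an entity keep their order.
    return " ".join(reversed(tokens))

# The reversed copy of each string is added to the training data alongside the
# original, which is how the scheme uses every word twice.
forward = "Daphne Barrington is the director of A Journey Through Time"
backward = reverse_with_entities(forward, ["Daphne Barrington", "A Journey Through Time"])
print(backward)
# -> "A Journey Through Time of director the is Daphne Barrington"
```

In this sketch the entity list is given explicitly; in practice it would come from whatever entity-detection step the training pipeline uses, and reversal could equally be applied at the token or substring level rather than the word level.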