Reverse Training to Nurse the Reversal Curse

Abstract

Large language models (LLMs) have a surprising failure: when trained on "A has a feature B", they do not generalize to "B is a feature of A", which is termed the Reversal Curse. Even when training with trillions of tokens, this issue still appears due to Zipf's law, and hence would persist even if we trained on the entire internet. This work proposes an alternative training scheme, called reverse training, whereby all words are used twice, doubling the amount of available tokens. The LLM is trained in both forward and reverse directions by reversing the training strings while preserving (i.e., not reversing) chosen substrings, such as entities. We show that data-matched reverse-trained models provide superior performance to standard models on standard tasks, and compute-matched reverse-trained models provide far superior performance on reversal tasks, helping resolve the reversal curse issue.
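The reversal step described above can be illustrated with a minimal sketch. It assumes word-level reversal on whitespace-separated text and a hand-supplied list of entity strings. The abstract does not specify how entities are detected, how text is tokenized, or how forward and reversed examples are mixed, so the function names `reverse_preserving` and `build_training_examples` and the example sentence are hypothetical, not the authors' implementation.

```python
from typing import List


def reverse_preserving(text: str, entities: List[str]) -> str:
    """Reverse the word order of `text`, keeping each entity's words in order.

    Assumption: entities are given explicitly and matched on whitespace-split
    words, so punctuation attached to a word would prevent a match.
    """
    words = text.split()
    # Try longer entities first so a multi-word entity wins over a shorter one.
    entity_words = sorted((e.split() for e in entities), key=len, reverse=True)

    segments = []
    i = 0
    while i < len(words):
        for ent in entity_words:
            if ent and words[i:i + len(ent)] == ent:
                # Keep the whole entity as one unit that is not reversed inside.
                segments.append(" ".join(ent))
                i += len(ent)
                break
        else:
            # No entity starts here: the single word is its own segment.
            segments.append(words[i])
            i += 1

    return " ".join(reversed(segments))


def build_training_examples(text: str, entities: List[str]) -> List[str]:
    """Return the forward string and its entity-preserving reversal."""
    return [text, reverse_preserving(text, entities)]


if __name__ == "__main__":
    sentence = "Ada Lovelace wrote the first program"
    for example in build_training_examples(sentence, ["Ada Lovelace"]):
        print(example)
```

Each training string here yields two examples, one per direction, which is one way to read the abstract's statement that all words are used twice. In the reversed example the entity name stays readable left to right, so a model trained on it sees the name following the fact rather than preceding it.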
