Be like a Goldfish, Don't Memorize! Mitigating Memorization in Generative LLMs

Abstract

Large language models can memorize and repeat their training data, causing privacy and copyright risks. To mitigate memorization, we introduce a subtle modification to the next-token training objective that we call the goldfish loss. During training, a randomly sampled subset of tokens is excluded from the loss computation. These dropped tokens are not memorized by the model, which prevents verbatim reproduction of a complete chain of tokens from the training set. We run extensive experiments training billion-scale Llama-2 models, both pre-trained and trained from scratch, and demonstrate significant reductions in extractable memorization with little to no impact on downstream benchmarks.
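
The masking idea described in the abstract can be illustrated with a minimal sketch in plain Python. It assumes each token is dropped independently with a fixed probability; the abstract only states that a randomly sampled subset of tokens is excluded, so the drop rate, the seed, and all function names here (`sample_keep_mask`, `masked_next_token_loss`) are hypothetical choices made for illustration, not the authors' implementation.

```python
import math
import random


def log_softmax(logits):
    """Numerically stable log-softmax over one list of logits."""
    m = max(logits)
    log_z = m + math.log(sum(math.exp(x - m) for x in logits))
    return [x - log_z for x in logits]


def sample_keep_mask(num_tokens, drop_prob, rng):
    """Return 1 for tokens that contribute to the loss, 0 for dropped tokens.

    Assumption: independent per-token dropping with probability drop_prob.
    """
    return [0 if rng.random() < drop_prob else 1 for _ in range(num_tokens)]


def masked_next_token_loss(logits_per_position, targets, keep_mask):
    """Mean negative log-likelihood over the kept positions only."""
    total, kept = 0.0, 0
    for logits, target, keep in zip(logits_per_position, targets, keep_mask):
        if keep:
            total -= log_softmax(logits)[target]
            kept += 1
    # If every token was dropped there is nothing to average over.
    return total / kept if kept else 0.0


# Toy usage: 4 positions, vocabulary of 3 tokens (illustrative values only).
rng = random.Random(0)
logits = [[2.0, 0.5, -1.0], [0.1, 0.2, 0.3], [1.0, 1.0, 1.0], [-0.5, 2.5, 0.0]]
targets = [0, 2, 1, 1]
mask = sample_keep_mask(len(targets), drop_prob=0.25, rng=rng)
loss = masked_next_token_loss(logits, targets, mask)
```

Dropped positions still appear in the input context; they simply produce no gradient signal as prediction targets, which is what keeps the model from learning to reproduce the full token chain.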
