Be like a Goldfish, Don't Memorize! Mitigating Memorization in Generative LLMs

Abstract

Large language models can memorize and repeat their training data, causing privacy and copyright risks. To mitigate memorization, we introduce a subtle modification to the next-token training objective that we call the goldfish loss. During training, a randomly sampled subset of tokens is excluded from the loss computation. These dropped tokens are not memorized by the model, which prevents verbatim reproduction of a complete chain of tokens from the training set. We run extensive experiments training billion-scale Llama-2 models, both pre-trained and trained from scratch, and demonstrate significant reductions in extractable memorization with little to no impact on downstream benchmarks.
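The core idea, excluding a randomly sampled subset of tokens from the loss, can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the `drop_prob` parameter, and the independent per-token sampling are assumptions made for illustration, and the per-token loss values are made-up numbers standing in for the cross-entropy terms a real model would produce.

```python
import random


def goldfish_mask(num_tokens, drop_prob, rng):
    """Return one boolean per token; True means the token is kept in the loss.

    Assumption: each token is dropped independently with probability
    `drop_prob`. The abstract only says the subset is randomly sampled.
    """
    return [rng.random() >= drop_prob for _ in range(num_tokens)]


def masked_next_token_loss(token_losses, mask):
    """Average the per-token losses over the kept positions only.

    Dropped positions contribute nothing, so no gradient signal would push
    the model to reproduce those exact tokens.
    """
    kept = [loss for loss, keep in zip(token_losses, mask) if keep]
    if not kept:
        # Every token was dropped; return zero rather than divide by zero.
        return 0.0
    return sum(kept) / len(kept)


# Hypothetical per-token next-token losses for one training sequence.
token_losses = [2.3, 0.7, 1.1, 3.0, 0.2, 1.6, 0.9, 2.4]

rng = random.Random(0)  # fixed seed so the sketch is reproducible
mask = goldfish_mask(len(token_losses), drop_prob=0.25, rng=rng)
loss = masked_next_token_loss(token_losses, mask)
print(mask, loss)
```

In a real training loop the same mask would be applied to the per-token cross-entropy before averaging. The dropped tokens still appear in the input context; they are only removed from the training targets.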
