Analyzing the Effects of Supervised Fine-Tuning on Model Knowledge from Token and Parameter Levels

Abstract

Large language models (LLMs) acquire substantial world knowledge during pre-training, which is further shaped by post-training techniques such as supervised fine-tuning (SFT). However, the impact of SFT on a model's knowledge remains underexplored, limiting our ability to control how knowledge changes in fine-tuned models. To address this gap, we evaluate closed-book question answering (CBQA) performance across five LLMs from the LLaMA-2 and LLaMA-3 families. Surprisingly, models fine-tuned on 1,920 samples perform up to 14% worse than those fine-tuned on only 240 samples. Furthermore, varying the level of knowledge mastery in the fine-tuning data leads to performance fluctuations of over 12%. To investigate these effects, we analyze model behavior at both the token and parameter levels. Our analysis reveals that up to 90% of parameter updates during SFT do not contribute to knowledge enhancement. Restoring these updates can improve performance on the CBQA task, depending on the characteristics of the fine-tuning data. These insights offer practical guidance for developing fine-tuning strategies that more effectively strengthen model knowledge.
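
To make the idea of "restoring" parameter updates concrete, the following is a minimal sketch, not the authors' procedure. The abstract does not say how the updates to restore are chosen; the sketch assumes, purely for illustration, that the entries with the smallest absolute change are reverted to their pre-trained values. The function name `restore_updates`, the parameter `restore_fraction`, and the toy values are all hypothetical, and plain Python lists stand in for model weight tensors.

```python
def restore_updates(pretrained, finetuned, restore_fraction):
    """Revert a fraction of fine-tuned parameters to their pre-trained values.

    Assumption (not stated in the abstract): the entries restored are those
    with the smallest absolute change |finetuned - pretrained|.
    """
    if len(pretrained) != len(finetuned):
        raise ValueError("parameter vectors must have the same length")
    if not 0.0 <= restore_fraction <= 1.0:
        raise ValueError("restore_fraction must be in [0, 1]")

    # Rank parameter indices by the magnitude of the SFT update, smallest first.
    order = sorted(
        range(len(pretrained)),
        key=lambda i: abs(finetuned[i] - pretrained[i]),
    )
    n_restore = int(len(order) * restore_fraction)
    to_restore = set(order[:n_restore])

    # Keep the fine-tuned value unless the index was selected for restoration.
    return [
        pretrained[i] if i in to_restore else finetuned[i]
        for i in range(len(finetuned))
    ]


# Toy values, invented for illustration only.
pretrained = [0.10, -0.20, 0.30, 0.40, -0.50, 0.60]
finetuned = [0.11, -0.60, 0.30, 0.45, -0.52, 0.90]

# Revert the half of the parameters that changed least during fine-tuning.
restored = restore_updates(pretrained, finetuned, restore_fraction=0.5)
```

In a real setting the same comparison would run over the weight tensors of the pre-trained and fine-tuned checkpoints, and the restored model would then be re-evaluated on the CBQA task to see whether performance changes.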
