
Analyzing the Effects of Supervised Fine-Tuning on Model Knowledge from Token and Parameter Levels

September 20, 2025
作者: Junjie Ye, Yuming Yang, Yang Nan, Shuo Li, Qi Zhang, Tao Gui, Xuanjing Huang, Peng Wang, Zhongchao Shi, Jianping Fan
cs.AI

Abstract

Large language models (LLMs) acquire substantial world knowledge during pre-training, which is further shaped by post-training techniques such as supervised fine-tuning (SFT). However, the impact of SFT on a model's knowledge remains underexplored, limiting our ability to control knowledge change behavior in fine-tuned models. To address this gap, we evaluate closed-book question answering (CBQA) performance across five LLMs from the LLaMA-2 and LLaMA-3 families. Surprisingly, models fine-tuned on 1,920 samples perform up to 14% worse than those fine-tuned on only 240 samples. Furthermore, varying the level of knowledge mastery in the fine-tuning data leads to performance fluctuations of over 12%. To investigate these effects, we analyze model behavior at both the token and parameter levels. Our analysis reveals that up to 90% of parameter updates during SFT do not contribute to knowledge enhancement. Restoring these updates can improve performance on the CBQA task, depending on the characteristics of the fine-tuning data. These insights offer practical guidance for developing fine-tuning strategies that more effectively strengthen model knowledge.
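The restoration step described above can be sketched in plain Python. This is a minimal illustration under assumptions the abstract does not specify: the helper `restore_updates` and its magnitude-based criterion for deciding which SFT updates to revert are hypothetical, standing in for whatever token- and parameter-level analysis the paper actually uses. Each weight is reverted to its pre-trained value unless its update ranks among the largest by magnitude.

```python
def restore_updates(pretrained, finetuned, keep_fraction=0.1):
    """Revert most SFT parameter updates to their pre-trained values.

    pretrained / finetuned: dicts mapping parameter names to scalar weights
    (a toy stand-in for full weight tensors). Only the top `keep_fraction`
    of updates by absolute magnitude are kept; the rest are restored.
    """
    # Compute the SFT update (delta) for every parameter.
    deltas = {name: finetuned[name] - pretrained[name] for name in pretrained}

    # Rank parameters by update magnitude, largest first.
    ranked = sorted(deltas, key=lambda name: abs(deltas[name]), reverse=True)
    kept = set(ranked[: max(1, int(len(ranked) * keep_fraction))])

    # Keep the large updates; restore everything else to pre-trained values.
    return {
        name: finetuned[name] if name in kept else pretrained[name]
        for name in pretrained
    }
```

In practice this would operate on model weight tensors rather than scalars, and the paper conditions the benefit of restoration on the characteristics of the fine-tuning data, which this sketch does not model.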