Analyzing the Effects of Supervised Fine-Tuning on Model Knowledge from Token and Parameter Levels

Abstract

Large language models (LLMs) acquire substantial world knowledge during pre-training, which is further shaped by post-training techniques such as supervised fine-tuning (SFT). However, the impact of SFT on a model's knowledge remains underexplored, limiting our ability to control how knowledge changes in fine-tuned models. To address this gap, we evaluate closed-book question answering (CBQA) performance across five LLMs from the LLaMA-2 and LLaMA-3 families. Surprisingly, models fine-tuned on 1,920 samples perform up to 14% worse than those fine-tuned on only 240 samples. Furthermore, varying the level of knowledge mastery in the fine-tuning data leads to performance fluctuations of over 12%. To investigate these effects, we analyze model behavior at both the token and parameter levels. Our analysis reveals that up to 90% of parameter updates during SFT do not contribute to knowledge enhancement. Restoring these updates can improve performance on the CBQA task, depending on the characteristics of the fine-tuning data. These insights offer practical guidance for developing fine-tuning strategies that more effectively strengthen model knowledge.

