ProFit: Leveraging High-Value Signals in SFT via Probability-Guided Token Selection
January 14, 2026
Authors: Tao Liu, Taiqiang Wu, Runming Yang, Shaoning Sun, Junjie Wang, Yujiu Yang
cs.AI
Abstract
Supervised fine-tuning (SFT) is a fundamental post-training strategy for aligning Large Language Models (LLMs) with human intent. However, traditional SFT often ignores the one-to-many nature of language by forcing alignment with a single reference answer, causing the model to overfit to non-core expressions. Although our empirical analysis suggests that introducing multiple reference answers can mitigate this issue, the prohibitive data and computational costs necessitate a strategic shift: prioritizing the mitigation of single-reference overfitting over the costly pursuit of answer diversity. To this end, we reveal an intrinsic connection between token probability and semantic importance: high-probability tokens carry the core logical framework, while low-probability tokens are mostly replaceable expressions. Based on this insight, we propose ProFit, which selectively masks low-probability tokens during training to prevent surface-level overfitting. Extensive experiments confirm that ProFit consistently outperforms traditional SFT baselines on general reasoning and mathematical benchmarks.
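The abstract does not include implementation details, so the sketch below illustrates one plausible reading of probability-guided token masking: score each reference token by the probability the model currently assigns to it, and exclude low-probability tokens from the cross-entropy loss. The function name `profit_style_loss`, the fixed threshold `tau`, and the choice to score tokens with the fine-tuned model's own probabilities (rather than a separate reference model) are assumptions for illustration, not ProFit's confirmed design.

```python
import torch
import torch.nn.functional as F

def profit_style_loss(logits, labels, tau=0.1, ignore_index=-100):
    """A minimal sketch of probability-guided token masking for SFT.

    NOTE: the threshold rule (p >= tau) and scoring tokens with the
    current model's own probabilities are assumptions; the abstract
    does not specify ProFit's exact selection criterion.

    logits: (batch, seq_len, vocab) raw model outputs
    labels: (batch, seq_len) reference-answer token ids
    """
    # Shift so position t predicts token t+1, as in causal LM training.
    logits = logits[:, :-1, :]
    labels = labels[:, 1:]

    # Per-token probability the model assigns to the reference token.
    log_probs = F.log_softmax(logits, dim=-1)
    safe_labels = labels.clamp(min=0)  # placeholder ids at padded positions
    token_log_p = log_probs.gather(-1, safe_labels.unsqueeze(-1)).squeeze(-1)
    token_p = token_log_p.exp()

    # Keep high-probability tokens (core logic); mask low-probability
    # tokens, treated here as replaceable surface expressions.
    valid = labels != ignore_index
    keep = (token_p >= tau) & valid  # boolean mask; carries no gradient

    nll = -token_log_p
    return (nll * keep).sum() / keep.sum().clamp(min=1)

if __name__ == "__main__":
    # Smoke test with random tensors (shapes only; not a real model).
    logits = torch.randn(2, 16, 100, requires_grad=True)
    labels = torch.randint(0, 100, (2, 16))
    loss = profit_style_loss(logits, labels, tau=0.05)
    loss.backward()
    print(f"masked SFT loss: {loss.item():.4f}")
```

In a standard SFT loop this would stand in for the usual all-token cross-entropy over the response. Whether ProFit thresholds on a fixed probability, a per-sequence quantile, or some other statistic is not stated in the abstract; the fixed `tau` here is only the simplest variant.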