

Beyond Log Likelihood: Probability-Based Objectives for Supervised Fine-Tuning across the Model Capability Continuum

October 1, 2025
Authors: Gaotang Li, Ruizhong Qiu, Xiusi Chen, Heng Ji, Hanghang Tong
cs.AI

Abstract

Supervised fine-tuning (SFT) is the standard approach for post-training large language models (LLMs), yet it often shows limited generalization. We trace this limitation to its default training objective: negative log likelihood (NLL). While NLL is classically optimal when training from scratch, post-training operates in a different paradigm, one in which models already encode task-relevant priors and supervision can be long and noisy, so NLL's optimality assumptions may no longer hold. To this end, we study a general family of probability-based objectives and characterize their effectiveness under different conditions. Through comprehensive experiments and extensive ablation studies across 7 model backbones, 14 benchmarks, and 3 domains, we uncover a critical dimension that governs objective behavior: the model-capability continuum. Near the model-strong end, prior-leaning objectives that downweight low-probability tokens (e.g., -p, -p^{10}, and thresholded variants) consistently outperform NLL; toward the model-weak end, NLL dominates; in between, no single objective prevails. Our theoretical analysis further elucidates how objectives trade places across the continuum, providing a principled foundation for adapting objectives to model capability. Our code is available at https://github.com/GaotangLi/Beyond-Log-Likelihood.
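
The abstract names the probability-based objectives only at a high level (-p, -p^{10}, thresholded variants versus NLL). The sketch below is a minimal illustration of how such token-level losses could be written in a PyTorch-style training loop; the function name, the specific thresholded form, and the mean reduction are illustrative assumptions, not the authors' implementation (see the linked repository for the actual code).

```python
import torch
import torch.nn.functional as F

def sft_token_loss(logits, targets, objective="nll", k=10, tau=0.1):
    """Token-level SFT loss for a family of probability-based objectives.

    logits:  (batch, seq, vocab) model outputs
    targets: (batch, seq) ground-truth token ids
    """
    log_probs = F.log_softmax(logits, dim=-1)
    # Log-probability assigned by the model to each target token.
    tgt_logp = log_probs.gather(-1, targets.unsqueeze(-1)).squeeze(-1)
    p = tgt_logp.exp()  # target-token probability

    if objective == "nll":          # standard SFT objective: -log p
        loss = -tgt_logp
    elif objective == "neg_p":      # prior-leaning: -p
        loss = -p
    elif objective == "neg_p_pow":  # prior-leaning: -p^k (e.g., k = 10)
        loss = -p.pow(k)
    elif objective == "thresholded":
        # One possible thresholded variant (assumed form): apply NLL only to
        # tokens the model already deems sufficiently likely, downweighting
        # low-probability (potentially noisy) supervision to zero.
        loss = torch.where(p >= tau, -tgt_logp, torch.zeros_like(p))
    else:
        raise ValueError(f"unknown objective: {objective}")
    return loss.mean()
```

Relative to NLL, the -p and -p^k losses place vanishing gradient weight on tokens the model assigns low probability, which is what the abstract means by "prior-leaning" objectives that favor the model's existing knowledge near the model-strong end of the continuum.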