Beyond Log Likelihood: Probability-Based Objectives for Supervised Fine-Tuning across the Model Capability Continuum
October 1, 2025
Authors: Gaotang Li, Ruizhong Qiu, Xiusi Chen, Heng Ji, Hanghang Tong
cs.AI
Abstract
Supervised fine-tuning (SFT) is the standard approach for post-training large
language models (LLMs), yet it often shows limited generalization. We trace
this limitation to its default training objective: negative log likelihood
(NLL). While NLL is classically optimal when training from scratch,
post-training operates in a different paradigm that can violate its optimality
assumptions: models already encode task-relevant priors, and supervision
can be long and noisy. Motivated by this, we study a general family of
probability-based objectives and characterize their effectiveness under
different conditions. Through comprehensive experiments and extensive ablation
studies across 7 model backbones, 14 benchmarks, and 3 domains, we uncover a
critical dimension that governs objective behavior: the model-capability
continuum. Near the model-strong end, prior-leaning objectives that downweight
low-probability tokens (e.g., -p, -p^10, and thresholded variants)
consistently outperform NLL; toward the model-weak end, NLL dominates; in
between, no single objective prevails. Our theoretical analysis further
elucidates how objectives trade places across the continuum, providing a
principled foundation for adapting objectives to model capability. Our code is
available at https://github.com/GaotangLi/Beyond-Log-Likelihood.
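To make the objective family concrete, here is a minimal PyTorch sketch of the per-token losses named in the abstract. The function name, the exponent k, and the threshold tau are illustrative assumptions; the abstract does not pin down the exact form of the thresholded variants, so the version here (dropping supervision on tokens below a probability threshold) is one plausible reading, not the paper's definitive implementation.

```python
import torch
import torch.nn.functional as F

def probability_loss(logits, targets, kind="nll", k=10, tau=0.5):
    """Illustrative per-token probability-based SFT losses.

    `kind`, `k`, and `tau` are hypothetical names/defaults; the paper's
    exact thresholded variants may differ.
    """
    # Probability the model assigns to each supervised target token.
    log_p = F.log_softmax(logits, dim=-1)                          # (batch, seq, vocab)
    log_p_t = log_p.gather(-1, targets.unsqueeze(-1)).squeeze(-1)  # (batch, seq)
    p_t = log_p_t.exp()

    if kind == "nll":          # default SFT objective: -log p
        loss = -log_p_t
    elif kind == "neg_p":      # prior-leaning: -p
        loss = -p_t
    elif kind == "neg_p_pow":  # sharper prior-leaning: -p^k (e.g., -p^10)
        loss = -p_t.pow(k)
    elif kind == "threshold":  # one reading of a "thresholded variant":
        # ignore tokens the model already deems unlikely (p < tau)
        loss = torch.where(p_t >= tau, -p_t, torch.zeros_like(p_t))
    else:
        raise ValueError(f"unknown objective: {kind}")
    return loss.mean()

# Dummy usage with random logits and target token ids.
logits = torch.randn(2, 8, 32000)
targets = torch.randint(0, 32000, (2, 8))
print(probability_loss(logits, targets, kind="neg_p"))
```

Note the gradient behavior behind the "downweight low-probability tokens" claim: for -p the per-token gradient is -p ∇log p, i.e., the NLL gradient rescaled by p itself, so tokens the model already finds unlikely contribute little; -p^k sharpens this rescaling to k·p^k.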