Beyond Log Likelihood: Probability-Based Objectives for Supervised Fine-Tuning across the Model Capability Continuum
October 1, 2025
Authors: Gaotang Li, Ruizhong Qiu, Xiusi Chen, Heng Ji, Hanghang Tong
cs.AI
Abstract
Supervised fine-tuning (SFT) is the standard approach for post-training large
language models (LLMs), yet it often shows limited generalization. We trace
this limitation to its default training objective: negative log likelihood
(NLL). While NLL is classically optimal when training from scratch,
post-training operates in a different paradigm that can violate its optimality
assumptions: models already encode task-relevant priors, and supervision
can be long and noisy. Motivated by this, we study a general family of
probability-based objectives and characterize their effectiveness under
different conditions. Through comprehensive experiments and extensive ablation
studies across 7 model backbones, 14 benchmarks, and 3 domains, we uncover a
critical dimension that governs objective behavior: the model-capability
continuum. Near the model-strong end, prior-leaning objectives that downweight
low-probability tokens (e.g., -p, -p^10, and thresholded variants)
consistently outperform NLL; toward the model-weak end, NLL dominates; in
between, no single objective prevails. Our theoretical analysis further
elucidates how objectives trade places across the continuum, providing a
principled foundation for adapting objectives to model capability. Our code is
available at https://github.com/GaotangLi/Beyond-Log-Likelihood.
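To make the objective family concrete, here is a minimal PyTorch sketch of the per-token losses named in the abstract. The function name, the exponent k, and the threshold tau are illustrative assumptions; the abstract does not pin down the exact form of the thresholded variants, so the version here (dropping supervision on tokens below a probability threshold) is one plausible reading, not the paper's definitive implementation.

```python
import torch
import torch.nn.functional as F

def probability_loss(logits, targets, kind="nll", k=10, tau=0.5):
    """Illustrative per-token probability-based SFT losses.

    `kind`, `k`, and `tau` are hypothetical names/defaults; the paper's
    exact thresholded variants may differ.
    """
    # Probability the model assigns to each supervised target token.
    log_p = F.log_softmax(logits, dim=-1)                          # (batch, seq, vocab)
    log_p_t = log_p.gather(-1, targets.unsqueeze(-1)).squeeze(-1)  # (batch, seq)
    p_t = log_p_t.exp()

    if kind == "nll":          # default SFT objective: -log p
        loss = -log_p_t
    elif kind == "neg_p":      # prior-leaning: -p
        loss = -p_t
    elif kind == "neg_p_pow":  # sharper prior-leaning: -p^k (e.g., -p^10)
        loss = -p_t.pow(k)
    elif kind == "threshold":  # one reading of a "thresholded variant":
        # ignore tokens the model already deems unlikely (p < tau)
        loss = torch.where(p_t >= tau, -p_t, torch.zeros_like(p_t))
    else:
        raise ValueError(f"unknown objective: {kind}")
    return loss.mean()

# Dummy usage with random logits and target token ids.
logits = torch.randn(2, 8, 32000)
targets = torch.randint(0, 32000, (2, 8))
print(probability_loss(logits, targets, kind="neg_p"))
```

Note the gradient behavior behind the "downweight low-probability tokens" claim: for -p the per-token gradient is -p ∇log p, i.e., the NLL gradient rescaled by p itself, so tokens the model already finds unlikely contribute little; -p^k sharpens this rescaling to k·p^k.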