Scaling Reasoning, Losing Control: Evaluating Instruction Following in Large Reasoning Models
May 20, 2025
Authors: Tingchen Fu, Jiawei Gu, Yafu Li, Xiaoye Qu, Yu Cheng
cs.AI
Abstract
Instruction-following is essential for aligning large language models (LLMs)
with user intent. While recent reasoning-oriented models exhibit impressive
performance on complex mathematical problems, their ability to adhere to
natural language instructions remains underexplored. In this work, we introduce
MathIF, a dedicated benchmark for evaluating instruction-following in
mathematical reasoning tasks. Our empirical analysis reveals a consistent
tension between scaling up reasoning capacity and maintaining controllability,
as models that reason more effectively often struggle to comply with user
directives. We find that models tuned on distilled long chains-of-thought or
trained with reasoning-oriented reinforcement learning often degrade in
instruction adherence, especially when generation length increases.
Furthermore, we show that even simple interventions can partially recover
obedience, though at the cost of reasoning performance. These findings
highlight a fundamental tension in current LLM training paradigms and motivate
the need for more instruction-aware reasoning models. We release the code and
data at https://github.com/TingchenFu/MathIF.
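
To make the kind of evaluation described above concrete, below is a minimal, hypothetical sketch of how instruction-following could be scored on math-reasoning outputs using programmatically verifiable constraints (for example, a \boxed{} answer format or a required closing phrase). The checker names, constraints, and scoring here are illustrative assumptions, not MathIF's actual implementation; refer to the released code for the real benchmark.

```python
import re

# Hypothetical constraint checkers; MathIF's actual constraints and
# scoring may differ. Each checker returns True if the output complies.

def check_boxed_answer(output: str) -> bool:
    """Instruction: put the final answer inside \\boxed{...}."""
    return re.search(r"\\boxed\{.+?\}", output) is not None

def check_max_words(output: str, limit: int = 200) -> bool:
    """Instruction: keep the response under a word limit."""
    return len(output.split()) <= limit

def check_ends_with(output: str, suffix: str = "Q.E.D.") -> bool:
    """Instruction: end the response with a fixed phrase."""
    return output.rstrip().endswith(suffix)

def instruction_following_rate(outputs, checkers) -> float:
    """Fraction of outputs that satisfy all attached constraints."""
    passed = sum(all(check(o) for check in checkers) for o in outputs)
    return passed / len(outputs) if outputs else 0.0

if __name__ == "__main__":
    sample_outputs = [
        "The sum is \\boxed{42}. Q.E.D.",
        "After a long chain of thought, the answer is 42.",  # no \boxed{}, no suffix
    ]
    rate = instruction_following_rate(
        sample_outputs, [check_boxed_answer, check_ends_with]
    )
    print(f"Instruction-following rate: {rate:.2f}")
```

A setup like this makes it possible to measure compliance independently of answer correctness, which is how a tension between reasoning accuracy and controllability can be observed at all.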