Scaling Reasoning, Losing Control: Evaluating Instruction Following in Large Reasoning Models
May 20, 2025
Authors: Tingchen Fu, Jiawei Gu, Yafu Li, Xiaoye Qu, Yu Cheng
cs.AI
Abstract
Instruction-following is essential for aligning large language models (LLMs)
with user intent. While recent reasoning-oriented models exhibit impressive
performance on complex mathematical problems, their ability to adhere to
natural language instructions remains underexplored. In this work, we introduce
MathIF, a dedicated benchmark for evaluating instruction-following in
mathematical reasoning tasks. Our empirical analysis reveals a consistent
tension between scaling up reasoning capacity and maintaining controllability,
as models that reason more effectively often struggle to comply with user
directives. We find that models tuned on distilled long chains-of-thought or
trained with reasoning-oriented reinforcement learning often degrade in
instruction adherence, especially when generation length increases.
Furthermore, we show that even simple interventions can partially recover
obedience, though at the cost of reasoning performance. These findings
highlight a fundamental tension in current LLM training paradigms and motivate
the need for more instruction-aware reasoning models. We release the code and
data at https://github.com/TingchenFu/MathIF.
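MathIF scores whether a model's solution respects user-specified constraints alongside solving the math problem. As a rough sketch of how verifiable-constraint scoring of this kind can be implemented, the snippet below checks two illustrative constraints (a word limit and a required \boxed{} answer format) and reports the fraction of fully compliant responses. The checker functions and constraint set are assumptions for illustration, not MathIF's actual implementation; see the repository linked above for the real benchmark code.

```python
import re

# Hypothetical checkers for verifiable constraints (illustrative only;
# the actual MathIF constraint set lives in the repository linked above).

def check_max_words(response: str, limit: int) -> bool:
    """Constraint: the response contains at most `limit` words."""
    return len(response.split()) <= limit

def check_boxed_answer(response: str) -> bool:
    """Constraint: the final answer appears inside \\boxed{...}."""
    return re.search(r"\\boxed\{[^}]+\}", response) is not None

def instruction_following_rate(examples) -> float:
    """Fraction of responses that satisfy every constraint attached to them.

    `examples` is a list of (response, constraints) pairs, where each
    constraint is a callable taking the response string.
    """
    if not examples:
        return 0.0
    passed = sum(
        all(check(response) for check in constraints)
        for response, constraints in examples
    )
    return passed / len(examples)

# Toy usage: one compliant response, one that violates the format constraint.
examples = [
    ("The area is \\boxed{42}.",
     [check_boxed_answer, lambda r: check_max_words(r, 50)]),
    ("After some reasoning, the answer is 7.",
     [check_boxed_answer]),
]
print(f"Instruction-following rate: {instruction_following_rate(examples):.2f}")  # 0.50
```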