Steering LLM Thinking with Budget Guidance

Abstract

Recent deep-thinking large language models often reason extensively to improve performance, but such lengthy reasoning is not always desirable, as it incurs excessive inference costs with disproportionate performance gains. Controlling reasoning length without sacrificing performance is therefore important, but remains challenging, especially under tight thinking budgets. We propose budget guidance, a simple yet effective method for steering the reasoning process of LLMs toward a target budget without requiring any LLM fine-tuning. Our approach introduces a lightweight predictor that models a Gamma distribution over the remaining thinking length during next-token generation. This signal is then used to guide generation in a soft, token-level manner, ensuring that the overall reasoning trace adheres to the specified thinking budget. Budget guidance enables natural control of the thinking length, along with significant token efficiency improvements over baseline methods on challenging math benchmarks. For instance, it achieves up to a 26% accuracy gain on the MATH-500 benchmark under tight budgets compared to baseline methods, while maintaining competitive accuracy with only 63% of the thinking tokens used by the full-thinking model. Budget guidance also generalizes to broader task domains and exhibits emergent capabilities, such as estimating question difficulty. The source code is available at: https://github.com/UMass-Embodied-AGI/BudgetGuidance.
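As a rough illustration of the token-level guidance described above, the following minimal sketch reweights next-token logits by the probability that the remaining thinking length fits within the remaining budget. The abstract does not specify how the Gamma signal is combined with the model's distribution, so the reweighting rule here (adding the log of the Gamma CDF to each logit) is an assumption rather than the authors' implementation. The names `gamma_cdf` and `guide_logits`, the two candidate tokens, and all Gamma parameters are hypothetical.

```python
import math


def gamma_cdf(x, shape, scale):
    """P(X <= x) for X ~ Gamma(shape, scale).

    Uses the power series of the regularized lower incomplete gamma
    function. Adequate for moderate x / scale in this sketch; a library
    routine such as scipy.special.gammainc would be more robust.
    """
    if x <= 0:
        return 0.0
    z = x / scale
    term = 1.0 / shape
    total = term
    n = 1
    while term > 1e-12 * total and n < 10_000:
        term *= z / (shape + n)
        total += term
        n += 1
    log_prefix = shape * math.log(z) - z - math.lgamma(shape)
    return min(1.0, math.exp(log_prefix) * total)


def guide_logits(logits, gamma_params, remaining_budget):
    """Return guided next-token probabilities.

    Assumption: each candidate token i comes with hypothetical Gamma
    parameters (shape, scale) describing the remaining thinking length
    if that token is chosen. Each logit is shifted by
    log P(remaining length <= remaining_budget), then softmax-normalized.
    """
    guided = []
    for logit, (shape, scale) in zip(logits, gamma_params):
        p_fit = gamma_cdf(remaining_budget, shape, scale)
        guided.append(logit + math.log(max(p_fit, 1e-12)))
    top = max(guided)
    exps = [math.exp(g - top) for g in guided]
    norm = sum(exps)
    return [e / norm for e in exps]


# Two hypothetical candidates with equal logits:
#   index 0: a "keep reasoning" token, predicted remaining length around 400 tokens
#   index 1: an "end of thinking" token, predicted remaining length around 5 tokens
logits = [0.0, 0.0]
params = [(4.0, 100.0), (1.0, 5.0)]

tight = guide_logits(logits, params, remaining_budget=50)
loose = guide_logits(logits, params, remaining_budget=2000)
print("tight budget:", tight)
print("loose budget:", loose)
```

Under a tight remaining budget, the probability mass in this toy setup shifts toward the token that ends thinking; under a generous budget, the original distribution is left nearly unchanged. This is the sense in which such guidance is soft: no token is forbidden outright, and candidates are only down-weighted in proportion to how unlikely they are to fit the budget.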
