Thought Manipulation: External Thought Can Be Efficient for Large Reasoning Models
April 18, 2025
Authors: Yule Liu, Jingyi Zheng, Zhen Sun, Zifan Peng, Wenhan Dong, Zeyang Sha, Shiwen Cui, Weiqiang Wang, Xinlei He
cs.AI
Abstract
Recent advancements in large reasoning models (LRMs) have demonstrated the
effectiveness of scaling test-time computation to enhance reasoning
capabilities in multiple tasks. However, LRMs typically suffer from
"overthinking" problems, where models generate significantly redundant
reasoning steps while bringing only limited performance gains. Existing work relies
on fine-tuning to mitigate overthinking, but this requires additional data,
unconventional training setups, risks safety misalignment, and generalizes
poorly.
Through empirical analysis, we reveal an important characteristic of LRM
behavior: placing external CoTs generated by smaller models between the
thinking tokens (<think> and </think>) can effectively
manipulate the model into generating fewer thoughts. Building on these insights, we
propose a simple yet efficient pipeline, ThoughtMani, to enable LRMs to bypass
unnecessary intermediate steps and reduce computational costs significantly. We
conduct extensive experiments to validate the utility and efficiency of
ThoughtMani. For instance, when applied to QwQ-32B on the LiveBench/Code
dataset, ThoughtMani keeps the original performance and reduces output token
counts by approximately 30%, with little overhead from the CoT generator.
Furthermore, we find that ThoughtMani enhances safety alignment by an average
of 10%. Since model vendors typically serve models of different sizes
simultaneously, ThoughtMani provides an effective way to construct more
efficient and accessible LRMs for real-world applications.
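As a rough illustration (not the authors' implementation), the core idea of prefilling the thinking block with an external CoT can be sketched as follows; the function name `build_manipulated_prompt` and the exact prompt layout are assumptions for illustration only:

```python
def build_manipulated_prompt(question: str, external_cot: str) -> str:
    """Prefill the LRM's thinking block with an externally generated CoT.

    The CoT (e.g., produced by a smaller, cheaper model) is placed between
    the <think> and </think> tokens, nudging the LRM to skip its own lengthy
    intermediate reasoning and answer directly. Hypothetical sketch; the
    real tag format depends on the specific LRM's chat template.
    """
    return f"{question}\n<think>\n{external_cot}\n</think>\n"


# Usage: in the actual pipeline a small model would supply external_cot;
# it is hard-coded here to keep the sketch self-contained.
prompt = build_manipulated_prompt(
    "What is 12 * 13?",
    "12 * 13 = 12 * 10 + 12 * 3 = 120 + 36 = 156.",
)
print(prompt)
```

The resulting string would then be sent to the LRM as a partially completed response, so decoding resumes after `</think>`.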