OThink-R1: Intrinsic Fast/Slow Thinking Mode Switching for Over-Reasoning Mitigation
June 3, 2025
Authors: Shengjia Zhang, Junjie Wu, Jiawei Chen, Changwang Zhang, Xingyu Lou, Wangchunshu Zhou, Sheng Zhou, Can Wang, Jun Wang
cs.AI
Abstract
Recent advanced large reasoning models (LRMs) leverage extended chain-of-thought (CoT) reasoning to solve complex tasks, achieving state-of-the-art performance. Despite their success, we identify a critical issue: a substantial portion of the simple tasks solved by LRMs can also be addressed by non-reasoning LLMs using significantly fewer tokens, indicating that complex reasoning is not always necessary. To address this, we systematically analyze the reasoning trajectories of LRMs and present a method that uses the identified paradigms together with an LLM-Judge to classify these trajectories as either Redundant Reasoning or Essential Reasoning. We further introduce OThink-R1, a method that prunes redundant reasoning steps while preserving logical validity. OThink-R1 dynamically employs the non-thinking mode (fast-thinking) for straightforward problems while engaging in deliberate thinking (slow-thinking) for complex problems. Experiments across mathematical and question-answering tasks demonstrate that OThink-R1 reduces reasoning redundancy by almost 23% on average without compromising accuracy, offering practical guidelines for efficient reasoning models. The code is available at https://github.com/AgenticIR-Lab/OThink-R1.
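The abstract describes two mechanisms: an LLM-Judge that labels CoT trajectories as redundant or essential, and a dynamic switch between the non-thinking (fast) and thinking (slow) modes. Below is a minimal Python sketch of that pipeline under stated assumptions: the judge prompt and the names `classify_trajectory`, `answer`, and `is_simple` are illustrative and are not taken from the paper or the repository.

```python
from typing import Callable, Literal

# Sketch of the redundant/essential trajectory labeling and the fast/slow
# mode switch described in the abstract. The prompt text and all names here
# are hypothetical; they are NOT from the OThink-R1 paper or its repo.

JUDGE_PROMPT = """\
Question: {question}
Answer from a non-reasoning LLM: {fast_answer}
Final answer of a reasoning model's chain of thought: {slow_answer}

If both answers agree and the chain of thought adds no step needed to
justify the result, reply REDUNDANT. Otherwise reply ESSENTIAL."""


def classify_trajectory(
    question: str,
    fast_answer: str,
    slow_answer: str,
    judge: Callable[[str], str],  # any LLM completion function
) -> Literal["redundant", "essential"]:
    """Label one CoT trajectory via an LLM-Judge (illustrative prompt)."""
    verdict = judge(
        JUDGE_PROMPT.format(
            question=question,
            fast_answer=fast_answer,
            slow_answer=slow_answer,
        )
    )
    return "redundant" if "REDUNDANT" in verdict.upper() else "essential"


def answer(
    question: str,
    is_simple: Callable[[str], bool],  # difficulty gate (heuristic stand-in)
    fast_model: Callable[[str], str],  # non-thinking mode: answer directly
    slow_model: Callable[[str], str],  # thinking mode: full CoT reasoning
) -> str:
    """Fast/slow switching: skip the CoT when the input looks simple."""
    return fast_model(question) if is_simple(question) else slow_model(question)
```

Note that, per the abstract, OThink-R1 internalizes this switching in the model itself by pruning redundant steps from training trajectories; the external `is_simple` gate above is only a stand-in for that learned behavior.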