
OThink-R1: Intrinsic Fast/Slow Thinking Mode Switching for Over-Reasoning Mitigation

June 3, 2025
作者: Shengjia Zhang, Junjie Wu, Jiawei Chen, Changwang Zhang, Xingyu Lou, Wangchunshu Zhou, Sheng Zhou, Can Wang, Jun Wang
cs.AI

Abstract

Recent advanced large reasoning models (LRMs) leverage extended chain-of-thought (CoT) reasoning to solve complex tasks, achieving state-of-the-art performance. Despite their success, we identify a critical issue: a substantial portion of the simple tasks solved by LRMs can also be addressed by non-reasoning LLMs using significantly fewer tokens, indicating that complex reasoning is not always necessary. To address this, we systematically analyze the reasoning trajectories of LRMs and present a method that uses identified paradigms and an LLM-Judge to classify these trajectories as either Redundant Reasoning or Essential Reasoning. We then introduce OThink-R1, a method that prunes redundant reasoning steps while preserving logical validity. OThink-R1 dynamically employs the non-thinking mode (fast-thinking) for straightforward problems while engaging in deliberate thinking (slow-thinking) for complex problems. Experiments across mathematical and question-answering tasks demonstrate that OThink-R1 reduces reasoning redundancy by almost 23% on average without compromising accuracy, offering practical guidelines for efficient reasoning models. The code is available at https://github.com/AgenticIR-Lab/OThink-R1.
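The routing idea in the abstract can be sketched in a few lines. This is a minimal illustration, not the paper's implementation: the `judge_trajectory` heuristic below (agreement between the direct answer and the CoT answer) is a hypothetical stand-in for the paper's paradigm-based rules and LLM-Judge, and all function names are invented for this sketch.

```python
def judge_trajectory(trajectory: list[str],
                     answer_direct: str,
                     answer_cot: str) -> str:
    """Toy proxy for the LLM-Judge: if the non-reasoning (direct) answer
    already matches the answer reached via CoT, the trajectory adds no
    value and is labeled Redundant; otherwise it is Essential."""
    return "Redundant" if answer_direct == answer_cot else "Essential"


def route(question: str,
          trajectory: list[str],
          answer_direct: str,
          answer_cot: str) -> dict:
    """Dynamic mode switching: fast-thinking (skip the CoT) when the
    reasoning is judged redundant, slow-thinking (keep the full CoT)
    when it is essential."""
    label = judge_trajectory(trajectory, answer_direct, answer_cot)
    if label == "Redundant":
        # Fast-thinking: emit only the final answer, pruning the CoT tokens.
        return {"mode": "fast", "output": answer_direct}
    # Slow-thinking: keep the deliberate reasoning steps plus the answer.
    return {"mode": "slow",
            "output": "\n".join(trajectory + [answer_cot])}
```

In this sketch, token savings come from dropping the trajectory entirely on the fast path, which mirrors the reported ~23% average reduction in reasoning redundancy without changing the answer on easy inputs.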

