OThink-R1: Intrinsic Fast/Slow Thinking Mode Switching for Over-Reasoning Mitigation
June 3, 2025
Authors: Shengjia Zhang, Junjie Wu, Jiawei Chen, Changwang Zhang, Xingyu Lou, Wangchunshu Zhou, Sheng Zhou, Can Wang, Jun Wang
cs.AI
Abstract
Recent advanced large reasoning models (LRMs) leverage extended chain-of-thought (CoT) reasoning to solve complex tasks, achieving state-of-the-art performance. Despite their success, we identify a critical issue: a substantial portion of the simple tasks solved by LRMs can also be addressed by non-reasoning LLMs using significantly fewer tokens, indicating that complex reasoning is not always necessary. To address this, we systematically analyze the reasoning trajectories of LRMs and present a method that uses the identified paradigms together with an LLM-Judge to classify these trajectories as either Redundant Reasoning or Essential Reasoning. We further introduce OThink-R1, a method that prunes redundant reasoning steps while preserving logical validity. OThink-R1 dynamically employs the non-thinking mode (fast-thinking) for straightforward problems while engaging in deliberate thinking (slow-thinking) for complex problems. Experiments across mathematical and question-answering tasks demonstrate that OThink-R1 reduces reasoning redundancy by almost 23% on average without compromising accuracy, offering practical guidelines for efficient reasoning models. The code is available at https://github.com/AgenticIR-Lab/OThink-R1.
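The abstract describes two mechanisms: an LLM-Judge that labels CoT trajectories as redundant or essential, and a dynamic switch between the non-thinking (fast) and thinking (slow) modes. Below is a minimal Python sketch of that pipeline under stated assumptions: the judge prompt and the names `classify_trajectory`, `answer`, and `is_simple` are illustrative and are not taken from the paper or the repository.

```python
from typing import Callable, Literal

# Sketch of the redundant/essential trajectory labeling and the fast/slow
# mode switch described in the abstract. The prompt text and all names here
# are hypothetical; they are NOT from the OThink-R1 paper or its repo.

JUDGE_PROMPT = """\
Question: {question}
Answer from a non-reasoning LLM: {fast_answer}
Final answer of a reasoning model's chain of thought: {slow_answer}

If both answers agree and the chain of thought adds no step needed to
justify the result, reply REDUNDANT. Otherwise reply ESSENTIAL."""


def classify_trajectory(
    question: str,
    fast_answer: str,
    slow_answer: str,
    judge: Callable[[str], str],  # any LLM completion function
) -> Literal["redundant", "essential"]:
    """Label one CoT trajectory via an LLM-Judge (illustrative prompt)."""
    verdict = judge(
        JUDGE_PROMPT.format(
            question=question,
            fast_answer=fast_answer,
            slow_answer=slow_answer,
        )
    )
    return "redundant" if "REDUNDANT" in verdict.upper() else "essential"


def answer(
    question: str,
    is_simple: Callable[[str], bool],  # difficulty gate (heuristic stand-in)
    fast_model: Callable[[str], str],  # non-thinking mode: answer directly
    slow_model: Callable[[str], str],  # thinking mode: full CoT reasoning
) -> str:
    """Fast/slow switching: skip the CoT when the input looks simple."""
    return fast_model(question) if is_simple(question) else slow_model(question)
```

Note that, per the abstract, OThink-R1 internalizes this switching in the model itself by pruning redundant steps from training trajectories; the external `is_simple` gate above is only a stand-in for that learned behavior.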