Don't Overthink It: A Survey of Efficient R1-style Large Reasoning Models

Abstract

Large Reasoning Models (LRMs) have recently become a research hotspot because of their strong performance on complex tasks. Among them, DeepSeek R1 has drawn significant attention for its exceptional performance and open-source nature, driving research on R1-style LRMs. Unlike traditional Large Language Models (LLMs), these models strengthen logical deduction and decision-making during reasoning by incorporating mechanisms such as long chain-of-thought and self-reflection learned through reinforcement learning. As these models have come into widespread use, however, the problem of overthinking has emerged. Specifically, when generating answers, these models often construct excessively long reasoning chains containing redundant or repetitive steps, which reduces reasoning efficiency and may harm the accuracy of the final answer. In response, various efficient reasoning methods have been proposed that aim to shorten reasoning paths without compromising model performance or reasoning capability. We systematically review current research on efficient reasoning methods and, through the lens of single-model optimization versus model collaboration, categorize existing work into two main directions: (1) Efficient Reasoning with Single Model, which focuses on improving the reasoning efficiency of an individual model; and (2) Efficient Reasoning with Model Collaboration, which explores optimizing reasoning paths through collaboration among multiple models. We also maintain a public GitHub repository that tracks the latest progress in efficient reasoning methods.
