

Two Experts Are All You Need for Steering Thinking: Reinforcing Cognitive Effort in MoE Reasoning Models Without Additional Training

May 20, 2025
作者: Mengru Wang, Xingyu Chen, Yue Wang, Zhiwei He, Jiahao Xu, Tian Liang, Qiuzhi Liu, Yunzhi Yao, Wenxuan Wang, Ruotian Ma, Haitao Mi, Ningyu Zhang, Zhaopeng Tu, Xiaolong Li, Dong Yu
cs.AI

Abstract

Mixture-of-Experts (MoE) architectures within Large Reasoning Models (LRMs) have achieved impressive reasoning capabilities by selectively activating experts to facilitate structured cognitive processes. Despite notable advances, existing reasoning models often suffer from cognitive inefficiencies like overthinking and underthinking. To address these limitations, we introduce a novel inference-time steering methodology called Reinforcing Cognitive Experts (RICE), designed to improve reasoning performance without additional training or complex heuristics. Leveraging normalized Pointwise Mutual Information (nPMI), we systematically identify specialized experts, termed "cognitive experts," that orchestrate meta-level reasoning operations characterized by tokens like "<think>". Empirical evaluations with leading MoE-based LRMs (DeepSeek-R1 and Qwen3-235B) on rigorous quantitative and scientific reasoning benchmarks demonstrate noticeable and consistent improvements in reasoning accuracy, cognitive efficiency, and cross-domain generalization. Crucially, our lightweight approach substantially outperforms prevalent reasoning-steering techniques, such as prompt design and decoding constraints, while preserving the model's general instruction-following skills. These results highlight reinforcing cognitive experts as a promising, practical, and interpretable direction to enhance cognitive efficiency within advanced reasoning models.
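The abstract does not spell out how nPMI is computed, but the general statistic is standard: nPMI(x, y) = log(p(x,y) / (p(x)p(y))) / −log p(x,y), bounded in [−1, 1]. A minimal sketch of how one might score experts by their co-occurrence with a reasoning token such as "<think>" is below; the data format (per-sample sets of routed expert IDs plus a boolean token mask) is a hypothetical simplification, not the paper's actual pipeline.

```python
import math

def npmi(p_xy, p_x, p_y):
    """Normalized pointwise mutual information in [-1, 1].

    npmi(x, y) = log(p(x,y) / (p(x) * p(y))) / -log p(x,y)
    """
    if p_xy == 0.0:
        return -1.0          # never co-occur
    if p_xy == 1.0:
        return 1.0           # always co-occur (avoids 0/0)
    return math.log(p_xy / (p_x * p_y)) / -math.log(p_xy)

def rank_cognitive_experts(activations, token_mask):
    """Rank experts by nPMI between expert activation and the
    presence of a reasoning token (e.g. '<think>') in a sample.

    activations: list of sets; activations[i] holds the expert IDs
                 routed to for sample i (hypothetical format).
    token_mask:  list of bools; True if sample i contains the token.
    Returns expert IDs sorted from highest to lowest nPMI.
    """
    n = len(activations)
    experts = set().union(*activations)
    p_tok = sum(token_mask) / n
    scores = {}
    for e in experts:
        p_e = sum(e in acts for acts in activations) / n
        p_joint = sum(e in acts and tok
                      for acts, tok in zip(activations, token_mask)) / n
        scores[e] = npmi(p_joint, p_e, p_tok)
    return sorted(scores, key=scores.get, reverse=True)
```

Experts at the top of this ranking are the candidate "cognitive experts" whose routing weights RICE-style steering would reinforce at inference time.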

