
PATS: Process-Level Adaptive Thinking Mode Switching

May 25, 2025
Authors: Yi Wang, Junxiao Liu, Shimao Zhang, Jiajun Chen, Shujian Huang
cs.AI

Abstract

Current large language models (LLMs) typically adopt a fixed reasoning strategy, either simple or complex, for all questions regardless of their difficulty. This neglect of variation in task and reasoning-process complexity leads to an imbalance between performance and efficiency. Existing methods attempt training-free fast-slow thinking system switching to handle problems of varying difficulty, but they are limited to coarse-grained, solution-level strategy adjustments. To address this issue, we propose a novel reasoning paradigm: Process-Level Adaptive Thinking Mode Switching (PATS), which enables LLMs to dynamically adjust their reasoning strategy based on the difficulty of each step, optimizing the balance between accuracy and computational efficiency. Our approach integrates Process Reward Models (PRMs) with beam search and incorporates progressive mode switching and a bad-step penalty mechanism. Experiments on diverse mathematical benchmarks demonstrate that our method achieves high accuracy while maintaining moderate token usage. This study highlights the importance of process-level, difficulty-aware adaptation of reasoning strategies and offers valuable insights into efficient inference for LLMs.
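
The abstract describes the method only at a high level. The following Python sketch is a rough illustration of the core idea, not the paper's implementation: step-level PRM scores drive both beam selection and the switch between reasoning modes. The helper functions (generate_step, prm_score), the three modes, the beam widths, the thresholds, and the penalty value are all hypothetical assumptions introduced for this example.

```python
# Illustrative sketch of PATS-style process-level adaptive reasoning.
# All concrete values and helpers below are assumptions, not the paper's setup.

MODES = ["fast", "medium", "slow"]           # increasing reasoning effort
BEAM_WIDTHS = {"fast": 1, "medium": 2, "slow": 4}
SWITCH_UP_THRESHOLD = 0.5                    # best step scores below this -> escalate
SWITCH_DOWN_THRESHOLD = 0.9                  # best step scores above this -> de-escalate
BAD_STEP_SCORE = 0.3                         # PRM score marking a step as "bad"
BAD_STEP_PENALTY = 0.2                       # extra penalty applied to bad steps

def generate_step(prefix: str, mode: str) -> list[str]:
    """Placeholder: sample candidate next reasoning steps from an LLM,
    spending more compute (longer, more deliberate steps) in slower modes."""
    raise NotImplementedError

def prm_score(prefix: str, step: str) -> float:
    """Placeholder: Process Reward Model score in [0, 1] for a single step."""
    raise NotImplementedError

def pats_solve(question: str, max_steps: int = 16) -> str:
    mode_idx = 0                     # start in the cheapest mode
    beam = [(0.0, question)]         # (cumulative PRM score, reasoning prefix)
    for _ in range(max_steps):
        mode = MODES[mode_idx]
        candidates, step_scores = [], []
        for total, prefix in beam:
            for step in generate_step(prefix, mode):
                s = prm_score(prefix, step)
                if s < BAD_STEP_SCORE:
                    s -= BAD_STEP_PENALTY    # bad-step penalty: discourage
                                             # continuing from low-quality steps
                step_scores.append(s)
                candidates.append((total + s, prefix + "\n" + step))
        candidates.sort(key=lambda c: c[0], reverse=True)
        beam = candidates[:BEAM_WIDTHS[mode]]
        # Progressive mode switching: escalate reasoning effort when even the
        # best candidate step scores poorly, de-escalate when steps look easy.
        best = max(step_scores)
        if best < SWITCH_UP_THRESHOLD and mode_idx < len(MODES) - 1:
            mode_idx += 1
        elif best > SWITCH_DOWN_THRESHOLD and mode_idx > 0:
            mode_idx -= 1
    return beam[0][1]
```

The intended trade-off this sketch captures: easy stretches of a solution run in a cheap, narrow-beam mode, while steps the PRM flags as difficult trigger a wider beam and heavier generation, so extra tokens are spent only where the reasoning process actually needs them.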
