Parallel-Probe: Towards Efficient Parallel Thinking via 2D Probing

February 3, 2026
Authors: Tong Zheng, Chengsong Huang, Runpeng Dai, Yun He, Rui Liu, Xin Ni, Huiwen Bao, Kaishen Wang, Hongtu Zhu, Jiaxin Huang, Furong Huang, Heng Huang
cs.AI

Abstract

Parallel thinking has emerged as a promising paradigm for reasoning, yet it imposes significant computational burdens. Existing efficiency methods primarily rely on local, per-trajectory signals and lack principled mechanisms to exploit global dynamics across parallel branches. We introduce 2D probing, an interface that exposes the width-depth dynamics of parallel thinking by periodically eliciting intermediate answers from all branches. Our analysis reveals three key insights: non-monotonic scaling across width-depth allocations, heterogeneous reasoning branch lengths, and early stabilization of global consensus. Guided by these insights, we introduce Parallel-Probe, a training-free controller designed to optimize online parallel thinking. Parallel-Probe employs consensus-based early stopping to regulate reasoning depth and deviation-based branch pruning to dynamically adjust width. Extensive experiments across three benchmarks and multiple models demonstrate that Parallel-Probe establishes a superior Pareto frontier for test-time scaling. Compared to standard majority voting, it reduces sequential tokens by up to 35.8% and total token cost by over 25.8% while maintaining competitive accuracy.
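The abstract names two control signals, consensus-based early stopping for depth and deviation-based pruning for width, without giving their exact rules. The Python sketch below is one possible reading, not the authors' implementation. It replays a precomputed probe matrix (`probes[t][b]` is the intermediate answer of branch `b` at probe step `t`), stops once the majority answer has stayed unchanged for `patience` consecutive probes, and prunes a branch after it disagrees with the majority `max_deviations` times in a row. The function names, the matrix layout, and both thresholds are assumptions made for illustration.

```python
from collections import Counter
from typing import Dict, List, Optional, Sequence, Tuple


def majority(answers: Sequence[Optional[str]]) -> Optional[str]:
    """Return the most common non-empty answer, or None if there is none."""
    counts = Counter(a for a in answers if a is not None)
    if not counts:
        return None
    return counts.most_common(1)[0][0]


def control(
    probes: List[List[Optional[str]]],
    patience: int = 2,         # hypothetical threshold, not from the paper
    max_deviations: int = 2,   # hypothetical threshold, not from the paper
) -> Tuple[Optional[str], int, List[int]]:
    """Replay a probe matrix; return (answer, stopping step, pruned branch ids)."""
    if not probes:
        return None, -1, []

    active = set(range(len(probes[0])))
    deviations: Dict[int, int] = {b: 0 for b in active}
    pruned: List[int] = []
    previous: Optional[str] = None
    consensus: Optional[str] = None
    stable = 0
    step = -1

    for step, row in enumerate(probes):
        consensus = majority([row[b] for b in sorted(active)])

        # Depth control: count consecutive probes with an unchanged majority.
        if consensus is not None and consensus == previous:
            stable += 1
        else:
            stable = 0
        previous = consensus
        if consensus is not None and stable >= patience:
            break

        # Width control: prune branches that keep disagreeing with the majority.
        # A branch with no answer yet (None) is not counted as deviating.
        for b in sorted(active):
            if consensus is not None and row[b] is not None and row[b] != consensus:
                deviations[b] += 1
            else:
                deviations[b] = 0
            if deviations[b] >= max_deviations and len(active) > 1:
                active.remove(b)
                pruned.append(b)

    return consensus, step, pruned


if __name__ == "__main__":
    # Toy probe matrix: 5 probe steps (rows) by 4 branches (columns).
    toy = [
        ["3", "5", None, "7"],
        ["4", "5", "4", "7"],
        ["4", "4", "4", "7"],
        ["4", "4", "4", "7"],
        ["4", "4", "4", "9"],
    ]
    print(control(toy))
```

In an online setting the rows would arrive one probe at a time from the running branches rather than from a stored matrix, and stopping early or pruning a branch is what would save sequential and total tokens respectively.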