
Parallel-Probe: Towards Efficient Parallel Thinking via 2D Probing

February 3, 2026
Authors: Tong Zheng, Chengsong Huang, Runpeng Dai, Yun He, Rui Liu, Xin Ni, Huiwen Bao, Kaishen Wang, Hongtu Zhu, Jiaxin Huang, Furong Huang, Heng Huang
cs.AI

Abstract

Parallel thinking has emerged as a promising paradigm for reasoning, yet it imposes significant computational burdens. Existing efficiency methods primarily rely on local, per-trajectory signals and lack principled mechanisms to exploit global dynamics across parallel branches. We introduce 2D probing, an interface that exposes the width-depth dynamics of parallel thinking by periodically eliciting intermediate answers from all branches. Our analysis reveals three key insights: non-monotonic scaling across width-depth allocations, heterogeneous reasoning branch lengths, and early stabilization of global consensus. Guided by these insights, we introduce Parallel-Probe, a training-free controller designed to optimize online parallel thinking. Parallel-Probe employs consensus-based early stopping to regulate reasoning depth and deviation-based branch pruning to dynamically adjust width. Extensive experiments across three benchmarks and multiple models demonstrate that Parallel-Probe establishes a superior Pareto frontier for test-time scaling. Compared to standard majority voting, it reduces sequential tokens by up to 35.8% and total token cost by over 25.8% while maintaining competitive accuracy.
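
The two control signals named in the abstract, consensus-based early stopping for depth and deviation-based pruning for width, can be illustrated with a small sketch. This is not the authors' implementation: it replays a pre-recorded probe matrix instead of driving a live model, and the function names and the `stop_patience` and `prune_patience` thresholds are hypothetical choices made only for illustration.

```python
from collections import Counter
from typing import List, Optional, Tuple


def majority(answers: List[Optional[str]]) -> Optional[str]:
    """Most common non-empty answer, or None if no branch has answered yet."""
    counts = Counter(a for a in answers if a is not None)
    return counts.most_common(1)[0][0] if counts else None


def control(probes: List[List[Optional[str]]],
            stop_patience: int = 2,   # hypothetical threshold
            prune_patience: int = 2   # hypothetical threshold
            ) -> Tuple[Optional[str], int, List[int]]:
    """Replay a probe matrix; return (answer, probes_used, surviving_branches).

    probes[t][b] is the intermediate answer of branch b at probe step t,
    i.e. one row per depth step and one column per parallel branch.
    """
    active = list(range(len(probes[0])))
    deviations = {b: 0 for b in active}
    consensus, stable = None, 0
    for t, row in enumerate(probes):
        current = majority([row[b] for b in active])
        # Depth control: count how many consecutive probes share one consensus.
        if current is not None and current == consensus:
            stable += 1
        else:
            stable = 1 if current is not None else 0
        consensus = current
        if stable >= stop_patience:
            return consensus, t + 1, active
        # Width control: drop branches that keep disagreeing with the consensus.
        for b in active:
            disagrees = row[b] is not None and row[b] != consensus
            deviations[b] = deviations[b] + 1 if disagrees else 0
        active = [b for b in active if deviations[b] < prune_patience]
    return consensus, len(probes), active
```

In a toy run with four branches, a branch that repeatedly answers differently from the majority is dropped once its consecutive-deviation count reaches `prune_patience`, and decoding stops for all branches once the majority answer has held for `stop_patience` probes. The paper's actual stopping and pruning criteria may differ.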