PaCoRe: Learning to Scale Test-Time Compute with Parallel Coordinated Reasoning

January 9, 2026
Authors: Jingcheng Hu, Yinmin Zhang, Shijie Shang, Xiaobo Yang, Yue Peng, Zhewei Huang, Hebin Zhou, Xin Wu, Jie Cheng, Fanqi Wan, Xiangwen Kong, Chengyuan Yao, Kaiwen Yan, Ailin Huang, Hongyu Zhou, Qi Han, Zheng Ge, Daxin Jiang, Xiangyu Zhang, Heung-Yeung Shum
cs.AI

Abstract

We introduce Parallel Coordinated Reasoning (PaCoRe), a training-and-inference framework designed to overcome a central limitation of contemporary language models: their inability to scale test-time compute (TTC) far beyond sequential reasoning under a fixed context window. PaCoRe departs from the traditional sequential paradigm by driving TTC through massively parallel exploration, coordinated across multiple rounds by a message-passing architecture. Each round launches many parallel reasoning trajectories, compacts their findings into context-bounded messages, and synthesizes those messages to guide the next round and ultimately produce the final answer. Trained end-to-end with large-scale, outcome-based reinforcement learning, the model masters the synthesis abilities PaCoRe requires and scales to multi-million-token effective TTC without exceeding context limits. The approach yields strong improvements across diverse domains and notably pushes reasoning beyond frontier systems in mathematics: by scaling effective TTC to roughly two million tokens, an 8B model reaches 94.5% on HMMT 2025, surpassing GPT-5's 93.2%. We open-source model checkpoints, training data, and the full inference pipeline to accelerate follow-up work.
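
To make the round structure concrete, below is a minimal Python sketch of the inference loop as the abstract describes it. It assumes a generic `Model.generate(prompt) -> str` interface; the function name `pacore_infer`, the prompt templates, and the `rounds`/`width` parameters are illustrative assumptions for this sketch, not the authors' released pipeline (the actual implementation is in the open-sourced code).

```python
"""Minimal sketch of PaCoRe-style multi-round parallel inference.

Illustrative only: the Model interface, prompts, and parameters below
are assumptions made for this example, not the released pipeline.
"""
from concurrent.futures import ThreadPoolExecutor
from typing import Protocol


class Model(Protocol):
    def generate(self, prompt: str) -> str: ...


def pacore_infer(model: Model, problem: str, rounds: int = 3, width: int = 8) -> str:
    """Run `rounds` rounds of `width` parallel trajectories with message passing."""
    messages: list[str] = []  # context-bounded summaries carried across rounds
    for _ in range(rounds):
        # Inject the previous round's compacted messages into the prompt so
        # each new trajectory builds on earlier exploration.
        context = "\n".join(f"- {m}" for m in messages)
        prompt = (
            f"{problem}\n\nFindings from prior rounds:\n{context}\n\n"
            "Reason step by step."
        )
        # Launch many independent reasoning trajectories in parallel.
        with ThreadPoolExecutor(max_workers=width) as pool:
            trajectories = list(pool.map(model.generate, [prompt] * width))
        # Compact each trajectory into a short message; this keeps the total
        # context bounded even as the aggregate generated tokens grow.
        messages = [
            model.generate(f"Summarize the key findings briefly:\n{t}")
            for t in trajectories
        ]
    # Final synthesis over the last round's messages yields the answer.
    return model.generate(
        f"{problem}\n\nFindings:\n" + "\n".join(messages) + "\n\nGive the final answer."
    )
```

The design point the sketch illustrates is compaction: because each trajectory is reduced to a short message before the next round, the prompt stays within the context window while effective TTC, the total tokens generated across all trajectories, grows with `rounds × width`.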