
PETS: A Principled Framework Towards Optimal Trajectory Allocation for Efficient Test-Time Self-Consistency

February 18, 2026
Authors: Zhangyi Liu, Huaizhi Qu, Xiaowei Yin, He Sun, Yanjun Han, Tianlong Chen, Zhun Deng
cs.AI

Abstract

Test-time scaling can improve model performance by aggregating stochastic reasoning trajectories. However, achieving sample-efficient test-time self-consistency under a limited budget remains an open challenge. We introduce PETS (Principled and Efficient Test-Time Self-Consistency), which initiates a principled study of trajectory allocation through an optimization framework. Central to our approach is the self-consistency rate, a new measure defined as agreement with the infinite-budget majority vote. This formulation makes sample-efficient test-time allocation theoretically grounded and amenable to rigorous analysis. We study both offline and online settings. In the offline regime, where all questions are known in advance, we connect trajectory allocation to crowdsourcing, a classic and well-developed area, by modeling reasoning traces as workers. This perspective allows us to leverage rich existing theory, yielding theoretical guarantees and an efficient majority-voting-based allocation algorithm. In the online streaming regime, where questions arrive sequentially and allocations must be made on the fly, we propose a novel method inspired by the offline framework. Our approach adapts budgets to question difficulty while preserving strong theoretical guarantees and computational efficiency. Experiments show that PETS consistently outperforms uniform allocation. On GPQA, PETS achieves perfect self-consistency in both settings while reducing the sampling budget by up to 75% (offline) and 55% (online) relative to uniform allocation. Code is available at https://github.com/ZDCSlab/PETS.
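The core idea of difficulty-adaptive budget allocation can be illustrated with a minimal sketch. This is our own toy heuristic under assumed names, not the paper's PETS algorithm: each question is a sampler that yields one answer per reasoning trajectory, and spare budget is greedily given to the question whose current majority vote is least settled (smallest margin between the top two answers), since easy questions converge to the infinite-budget majority with few samples while hard ones need more.

```python
from collections import Counter


def majority(votes):
    """Plurality answer among the sampled trajectory answers."""
    return Counter(votes).most_common(1)[0][0]


def vote_margin(votes):
    """Top count minus runner-up count; small margin = unsettled vote."""
    counts = [c for _, c in Counter(votes).most_common(2)]
    return counts[0] - (counts[1] if len(counts) > 1 else 0)


def adaptive_allocate(samplers, total_budget, batch=4):
    """Toy greedy allocator (illustrative, not PETS itself).

    `samplers` is a list of zero-argument callables, one per question,
    each returning the final answer of one sampled trajectory.
    Every question gets one seed batch; remaining budget goes, batch by
    batch, to the question with the smallest current vote margin.
    Returns the per-question majority answers and the budget spent.
    """
    votes = [[s() for _ in range(batch)] for s in samplers]
    spent = batch * len(samplers)
    while spent + batch <= total_budget:
        i = min(range(len(samplers)), key=lambda j: vote_margin(votes[j]))
        votes[i].extend(samplers[i]() for _ in range(batch))
        spent += batch
    return [majority(v) for v in votes], spent
```

For example, a question whose trajectories almost always agree keeps its seed batch only, while a question with a near-tied vote absorbs the rest of the budget, mirroring the abstract's claim that allocation should track question difficulty.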
March 28, 2026