Tracing the Traces: Latent Temporal Signals for Efficient and Accurate Reasoning
October 12, 2025
Authors: Martina G. Vilas, Safoora Yousefi, Besmira Nushi, Eric Horvitz, Vidhisha Balachandran
cs.AI
Abstract
Reasoning models improve their problem-solving ability through inference-time
scaling, allocating more compute via longer token budgets. Identifying which
reasoning traces are likely to succeed remains a key opportunity: reliably
predicting productive paths can substantially reduce wasted computation and
improve overall efficiency. We introduce Latent-Trajectory signals that
characterize the temporal evolution of a model's internal representations
during the generation of intermediate reasoning tokens. By measuring the
overall change in latent representations between the start and end of
reasoning, the change accumulated across intermediate steps, and the extent to
which these changes advance toward the final state, we show that these signals
predict solution accuracy more reliably than both cross-layer metrics and
output-based confidence measures. When used to guide answer selection across
multiple sampled generations, Latent-Trajectory signals make test-time scaling
more effective and efficient than majority voting, reducing token usage by up
to 70% while preserving accuracy and even improving it by 2.6% on average.
Moreover, these predictive signals often emerge early in the reasoning trace,
enabling early selection and allocation of compute to the most promising
candidates. Our findings contribute not only practical strategies for
inference-time efficiency, but also a deeper interpretability perspective on
how reasoning processes are represented and differentiated in latent space.
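
To make the three quantities named in the abstract concrete, here is a minimal Python sketch. The abstract does not give exact formulas, so everything here is an assumption made for illustration: the function names, the choice of Euclidean norms, the use of mean cosine similarity as the "advance toward the final state" measure, and the rule that a higher signal marks the more promising trace. It is not the authors' implementation. Each trace is assumed to be summarized as an array of shape (T, d) holding one latent vector per reasoning segment.

```python
from collections import Counter

import numpy as np


def trajectory_signals(hidden_states):
    """Compute three illustrative signals for one reasoning trace.

    hidden_states: array-like of shape (T, d), T >= 2, one latent vector
    per reasoning segment (hypothetical input format).
    """
    h = np.asarray(hidden_states, dtype=float)
    steps = np.diff(h, axis=0)        # update between consecutive segments
    overall = h[-1] - h[0]            # displacement from start to end

    # Overall change between the start and end of reasoning.
    net_change = float(np.linalg.norm(overall))
    # Change accumulated across intermediate steps (path length).
    step_norms = np.linalg.norm(steps, axis=1)
    cumulative_change = float(step_norms.sum())
    # How far each update points toward the final state (assumed: mean cosine).
    eps = 1e-12                       # guards against division by zero
    cosines = steps @ overall / (step_norms * net_change + eps)
    aligned_change = float(cosines.mean())

    return {
        "net": net_change,
        "cumulative": cumulative_change,
        "aligned": aligned_change,
    }


def select_by_signal(candidates, key="aligned"):
    """Pick the answer whose trace has the highest signal (assumed direction).

    candidates: list of (answer, hidden_states) pairs.
    """
    scored = [(trajectory_signals(h)[key], answer) for answer, h in candidates]
    return max(scored, key=lambda pair: pair[0])[1]


def majority_vote(answers):
    """Baseline: return the most frequent answer among the samples."""
    return Counter(answers).most_common(1)[0][0]


# Toy traces: one moves straight to its end state, one backtracks.
straight = [[0.0, 0.0], [1.0, 0.0], [2.0, 0.0]]
wandering = [[0.0, 0.0], [1.0, 0.0], [0.0, 0.0], [1.0, 0.0]]
choice = select_by_signal([("A", straight), ("B", wandering)])
```

In this toy setup the straight trace has equal net and cumulative change, while the backtracking trace accumulates more movement than it nets, so the signal-based selector prefers the first answer even where a majority vote over sampled answers might not. Because such signals can be computed on a prefix of the trace, the same scoring could in principle be applied early to decide which candidates deserve further compute.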