

FROST: Filtering Reasoning Outliers with Attention for Efficient Reasoning

January 26, 2026
Authors: Haozheng Luo, Zhuolin Jiang, Md Zahid Hasan, Yan Chen, Soumalya Sarkar
cs.AI

Abstract

We propose FROST, an attention-aware method for efficient reasoning. Unlike traditional approaches, FROST leverages attention weights to prune non-critical reasoning paths, yielding shorter and more reliable reasoning trajectories. Methodologically, we introduce the concept of reasoning outliers and design an attention-based mechanism to remove them. Theoretically, FROST preserves and enhances the model's reasoning capacity while eliminating outliers at the sentence level. Empirically, we validate FROST on four benchmarks using two strong reasoning models (Phi-4-Reasoning and GPT-OSS-20B), outperforming state-of-the-art methods such as TALE and ThinkLess. Notably, FROST achieves an average 69.68% reduction in token usage and a 26.70% improvement in accuracy over the base model. Furthermore, in evaluations of attention outlier metrics, FROST reduces the maximum infinity norm by 15.97% and the average kurtosis by 91.09% compared to the base model. Code is available at https://github.com/robinzixuan/FROST
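To make the two outlier statistics and the sentence-level pruning idea concrete, here is a minimal, hypothetical sketch in NumPy. It assumes a per-head attention matrix whose rows are probability distributions, uses textbook definitions of the infinity norm and excess kurtosis, and stands in for sentence selection with a simple top-k over aggregate attention scores; the paper's actual formulation and pruning criterion may differ.

```python
import numpy as np

def attention_outlier_metrics(attn):
    """Illustrative versions of the abstract's outlier metrics:
    the maximum infinity norm over rows of the attention matrix,
    and the average (excess) kurtosis of each row's weights.
    These are standard definitions, not necessarily the paper's."""
    max_inf_norm = np.abs(attn).max(axis=-1).max()
    mean = attn.mean(axis=-1, keepdims=True)
    std = attn.std(axis=-1, keepdims=True)
    z = (attn - mean) / (std + 1e-12)
    avg_kurtosis = ((z ** 4).mean(axis=-1) - 3.0).mean()  # excess kurtosis
    return float(max_inf_norm), float(avg_kurtosis)

def prune_sentences(sentence_scores, keep_ratio=0.5):
    """Toy stand-in for sentence-level outlier removal: keep the
    sentences whose aggregate attention scores are highest."""
    k = max(1, int(len(sentence_scores) * keep_ratio))
    keep = np.argsort(sentence_scores)[-k:]
    return sorted(keep.tolist())

# Toy example: four "sentences" with aggregate attention scores.
scores = np.array([0.05, 0.40, 0.10, 0.45])
print(prune_sentences(scores, keep_ratio=0.5))  # → [1, 3]
```

A heavier-tailed (more peaked) attention distribution yields a larger kurtosis and a larger infinity norm, so reductions in both metrics indicate smoother, less outlier-dominated attention after pruning.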