AlphaOne: Reasoning Models Thinking Slow and Fast at Test Time
May 30, 2025
Authors: Junyu Zhang, Runpei Dong, Han Wang, Xuying Ning, Haoran Geng, Peihao Li, Xialin He, Yutong Bai, Jitendra Malik, Saurabh Gupta, Huan Zhang
cs.AI
Abstract
This paper presents AlphaOne (α1), a universal framework for
modulating reasoning progress in large reasoning models (LRMs) at test time.
α1 first introduces the α moment, which represents the scaled
thinking phase with a universal parameter α. Within this scaled
pre-α-moment phase, it dynamically schedules slow-thinking transitions
by modeling the insertion of reasoning transition tokens as a Bernoulli
stochastic process. After the α moment, α1 deterministically
terminates slow thinking with the end-of-thinking token, thereby fostering fast
reasoning and efficient answer generation. This approach unifies and
generalizes existing monotonic scaling methods by enabling flexible and dense
slow-to-fast reasoning modulation. Extensive empirical studies on various
challenging benchmarks across mathematical, coding, and scientific domains
demonstrate α1's superior reasoning capability and efficiency. Project
page: https://alphaone-project.github.io/
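
The two-phase schedule described in the abstract can be sketched as a per-step decision rule. This is a minimal illustration under stated assumptions, not the authors' implementation. The abstract does not say how the α (alpha) moment is measured, how the Bernoulli probability is chosen, or which tokens are involved. The step-count threshold `alpha_moment`, the linearly decaying probability in `linear_anneal`, the default `p_start`, and the action names are therefore all hypothetical.

```python
import random
from typing import List


def linear_anneal(step: int, alpha_moment: int, p_start: float = 0.4) -> float:
    """Hypothetical schedule: the insertion probability decays linearly
    from p_start at step 0 to 0 at the alpha moment (an assumption)."""
    if step >= alpha_moment:
        return 0.0
    return p_start * (1.0 - step / alpha_moment)


def next_action(step: int, alpha_moment: int, rng: random.Random,
                p_start: float = 0.4) -> str:
    """Decide what to do at one decoding step.

    Before the alpha moment: a Bernoulli draw decides whether a
    slow-thinking transition token is inserted.
    At or after the alpha moment: thinking is ended deterministically.
    """
    if step >= alpha_moment:
        return "end_thinking"
    p = linear_anneal(step, alpha_moment, p_start)
    return "insert_transition" if rng.random() < p else "continue"


def simulate(num_steps: int, alpha_moment: int, seed: int = 0,
             p_start: float = 0.4) -> List[str]:
    """Run the rule over num_steps steps and stop once thinking ends."""
    rng = random.Random(seed)
    actions: List[str] = []
    for step in range(num_steps):
        action = next_action(step, alpha_moment, rng, p_start)
        actions.append(action)
        if action == "end_thinking":
            break
    return actions


if __name__ == "__main__":
    trace = simulate(num_steps=50, alpha_moment=20)
    print(len(trace), trace[-1])
```

In a real decoder the `insert_transition` and `end_thinking` actions would correspond to appending a reasoning transition token or the end-of-thinking token to the generated sequence. Here they are plain strings so the control flow can be inspected without a model.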