LoopUS: Recasting Pretrained LLMs into Looped Latent Refinement Models
May 10, 2026
Authors: Taekhyun Park, Yongjae Lee, Dohee Kim, Hyerim Bae
cs.AI
Abstract
Looped computation shows promise in improving the reasoning-oriented performance of LLMs by scaling test-time compute. However, existing approaches typically require either training recurrent models from scratch or applying disruptive retrofits, both of which incur substantial computational costs and may compromise pretrained capabilities. To address these limitations, we introduce Looped Depth Up-Scaling (LoopUS), a post-training framework that converts a standard pretrained LLM into a looped architecture. As a key technical contribution, LoopUS recasts the pretrained LLM into an encoder, a looped reasoning block, and a decoder. It operationalizes this latent-refinement architecture through four core components: (1) block decomposition guided by staged representation dynamics; (2) an input-dependent selective gate to mitigate hidden-state drift; (3) random deep supervision for memory-efficient learning over long recursive horizons; and (4) a confidence head for adaptive early exiting. Collectively, these mechanisms transform a standard non-looped model into a looped form while stabilizing it against both computational bottlenecks and representation collapse. Through stable latent looping, LoopUS improves reasoning-oriented performance without extending the generated traces or requiring recurrent training from scratch. For more details, see https://thrillcrazyer.github.io/LoopUS.
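To make the encoder / looped-block / decoder decomposition concrete, the following is a minimal PyTorch sketch of the forward pass described above, not the authors' implementation. The module choices, dimensions, loop budget, exit threshold, and the gating formulation are illustrative assumptions; LoopUS operates on the blocks of a pretrained LLM rather than the stand-in layers used here.

```python
# Minimal sketch (assumed, not the LoopUS implementation) of a looped
# latent-refinement forward pass: encoder -> looped reasoning block with an
# input-dependent selective gate and a confidence head for early exit -> decoder.
import torch
import torch.nn as nn


class LoopedLatentRefiner(nn.Module):
    def __init__(self, d_model: int = 512, max_loops: int = 8, exit_threshold: float = 0.9):
        super().__init__()
        # Stand-ins for the encoder blocks, the looped reasoning block,
        # and the decoder blocks + LM head of a pretrained LLM.
        self.encoder = nn.TransformerEncoderLayer(d_model, nhead=8, batch_first=True)
        self.loop_block = nn.TransformerEncoderLayer(d_model, nhead=8, batch_first=True)
        self.decoder = nn.Linear(d_model, d_model)
        self.gate = nn.Linear(2 * d_model, d_model)      # input-dependent selective gate
        self.confidence_head = nn.Linear(d_model, 1)     # per-step exit confidence
        self.max_loops = max_loops
        self.exit_threshold = exit_threshold

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, seq_len, d_model) token embeddings.
        h0 = self.encoder(x)  # encoder output, reused as the gate's reference input
        h = h0
        for _ in range(self.max_loops):
            h_new = self.loop_block(h)
            # Selective gate: mix the refined state with the previous one,
            # conditioned on the encoder output, to limit hidden-state drift.
            g = torch.sigmoid(self.gate(torch.cat([h_new, h0], dim=-1)))
            h = g * h_new + (1.0 - g) * h
            # Adaptive early exit once the confidence head is sure enough.
            conf = torch.sigmoid(self.confidence_head(h)).mean()
            if conf.item() > self.exit_threshold:
                break
        return self.decoder(h)


if __name__ == "__main__":
    model = LoopedLatentRefiner()
    tokens = torch.randn(2, 16, 512)  # dummy batch
    print(model(tokens).shape)        # torch.Size([2, 16, 512])
```

In this sketch the loop count is bounded by `max_loops` and cut short by the confidence head, mirroring the paper's adaptive early exiting; the random deep supervision used during training is not shown.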