
LoopUS: Recasting Pretrained LLMs into Looped Latent Refinement Models

May 10, 2026
Authors: Taekhyun Park, Yongjae Lee, Dohee Kim, Hyerim Bae
cs.AI

Abstract

Looped computation shows promise for improving the reasoning-oriented performance of LLMs by scaling test-time compute. However, existing approaches typically require either training recurrent models from scratch or applying disruptive retrofits, both of which incur substantial computational cost and may compromise pretrained capabilities. To address these limitations, we introduce Looped Depth Up-Scaling (LoopUS), a post-training framework that converts a standard pretrained LLM into a looped architecture. As a key technical contribution, LoopUS recasts the pretrained LLM into an encoder, a looped reasoning block, and a decoder. It operationalizes this latent-refinement architecture through four core components: (1) block decomposition, guided by staged representation dynamics; (2) an input-dependent selective gate to mitigate hidden-state drift; (3) random deep supervision for memory-efficient learning over long recursive horizons; and (4) a confidence head for adaptive early exiting. Collectively, these mechanisms transform a standard non-looped model into a looped form while stabilizing it against both computational bottlenecks and representation collapse. Through stable latent looping, LoopUS improves reasoning-oriented performance without lengthening the generated traces or requiring recurrent training from scratch. For more details, see https://thrillcrazyer.github.io/LoopUS
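The abstract describes the control flow concretely enough to sketch: an encoder produces an initial latent state, a reasoning block is applied repeatedly with a gated update that limits hidden-state drift, and a confidence head decides when to stop looping. Below is a minimal, hypothetical PyTorch sketch of that pattern. The module names, the sigmoid gate form, and the mean-pooled exit rule are assumptions made for illustration; they are not the LoopUS implementation, whose details are in the paper and project page.

```python
import torch
import torch.nn as nn

class LoopedLatentRefiner(nn.Module):
    """Minimal sketch of the looped latent-refinement pattern from the
    abstract: encoder -> repeated reasoning block -> decoder. All names
    and design choices here are illustrative, not the LoopUS code."""

    def __init__(self, encoder: nn.Module, loop_block: nn.Module,
                 decoder: nn.Module, d_model: int,
                 max_loops: int = 8, exit_threshold: float = 0.9):
        super().__init__()
        self.encoder = encoder        # early layers of the pretrained LLM
        self.loop_block = loop_block  # middle layers, applied recurrently
        self.decoder = decoder        # final layers plus the LM head
        self.max_loops = max_loops
        self.exit_threshold = exit_threshold
        # Input-dependent selective gate (assumed form): blends the refined
        # state with the previous one to mitigate hidden-state drift.
        self.gate = nn.Sequential(nn.Linear(d_model, d_model), nn.Sigmoid())
        # Confidence head (assumed form): scalar score for adaptive early exit.
        self.confidence = nn.Sequential(nn.Linear(d_model, 1), nn.Sigmoid())

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        h = self.encoder(x)                      # (batch, seq, d_model)
        for _ in range(self.max_loops):
            refined = self.loop_block(h)
            g = self.gate(h)                     # per-dimension mixing weights
            h = g * refined + (1.0 - g) * h      # gated latent update
            # Stop refining once mean sequence-level confidence is high enough.
            if self.confidence(h.mean(dim=1)).mean() > self.exit_threshold:
                break
        return self.decoder(h)
```

The gated update keeps each iteration a bounded interpolation between the old and refined states rather than an unconstrained overwrite, which is one plausible way a selective gate could stabilize long recursive horizons; the early-exit check makes test-time compute adaptive per input, as the abstract's confidence head is meant to do.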