LoopUS: Recasting Pretrained LLMs into Looped Latent Refinement Models

Abstract

Looped computation shows promise in improving the reasoning-oriented performance of LLMs by scaling test-time compute. However, existing approaches typically require either training recurrent models from scratch or applying disruptive retrofits, which involve substantial computational costs and may compromise pretrained capabilities. To address these limitations, we introduce Looped Depth Up-Scaling (LoopUS), a post-training framework that converts a standard pretrained LLM into a looped architecture. As a key technical contribution, LoopUS recasts the pretrained LLM into an encoder, a looped reasoning block, and a decoder. It operationalizes this latent-refinement architecture through four core components: (1) block decomposition, guided by staged representation dynamics; (2) an input-dependent selective gate to mitigate hidden-state drift; (3) random deep supervision for memory-efficient learning over long recursive horizons; and (4) a confidence head for adaptive early exiting. Collectively, these mechanisms transform a standard non-looped model into a looped form while stabilizing it against both computational bottlenecks and representation collapse. Through stable latent looping, LoopUS improves reasoning-oriented performance without extending the generated traces or requiring recurrent training from scratch. For more details, see https://thrillcrazyer.github.io/LoopUS
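The encoder, looped block, and decoder flow described above can be illustrated with a small toy sketch. It is not the authors' implementation: the abstract does not give the gate formula, the confidence head's form, or any hyperparameters, so every name and choice below is a hypothetical stand-in. That covers `gated_update`, `looped_forward`, the sigmoid gate, the threshold of 0.9, and all of the toy components. Random deep supervision is a training-time mechanism and is omitted here.

```python
import math
from typing import Callable, List, Sequence, Tuple

Vector = List[float]


def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))


def gated_update(h: Vector, candidate: Vector, x: Vector,
                 gate_w: float = 1.0, gate_b: float = 0.0) -> Vector:
    """Blend the previous state with the block's candidate state.

    ASSUMPTION: the gate is an element-wise sigmoid of the encoded input x.
    The abstract only says the gate is input-dependent and selective.
    """
    out = []
    for h_i, c_i, x_i in zip(h, candidate, x):
        g = sigmoid(gate_w * x_i + gate_b)
        out.append((1.0 - g) * h_i + g * c_i)
    return out


def looped_forward(tokens: Sequence[int],
                   encoder: Callable[[Sequence[int]], Vector],
                   block: Callable[[Vector], Vector],
                   decoder: Callable[[Vector], float],
                   confidence: Callable[[Vector], float],
                   max_loops: int = 8,
                   threshold: float = 0.9) -> Tuple[float, int]:
    """Encode once, refine the latent state in a loop, then decode once.

    Returns the decoded output and the number of loop iterations used.
    """
    x = encoder(tokens)          # runs once
    h = list(x)                  # initial latent state
    steps = 0
    for steps in range(1, max_loops + 1):
        candidate = block(h)     # same block (shared weights) every iteration
        h = gated_update(h, candidate, x)
        if confidence(h) >= threshold:   # adaptive early exit
            break
    return decoder(h), steps     # runs once


# Toy stand-ins for the three stages and the confidence head (all hypothetical).
def toy_encoder(tokens: Sequence[int]) -> Vector:
    return [t / 10.0 for t in tokens]


def toy_block(h: Vector) -> Vector:
    return [0.5 * v for v in h]          # contracts the state toward zero


def toy_decoder(h: Vector) -> float:
    return sum(h)


def toy_confidence(h: Vector) -> float:
    return 1.0 - max(abs(v) for v in h)  # "confident" once the state is small


if __name__ == "__main__":
    output, steps = looped_forward([2, 4], toy_encoder, toy_block,
                                   toy_decoder, toy_confidence)
    print(f"output={output:.4f} after {steps} loop(s)")
```

In this sketch the encoder and decoder each run once, while the same block is reused on every iteration, so extra test-time compute comes from additional loop passes rather than from a longer generated trace. Setting the threshold above any reachable confidence value makes the loop run for the full `max_loops` budget.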
