Exploring the Latent Capacity of LLMs for One-Step Text Generation

Abstract

A recent study showed that large language models (LLMs) can reconstruct surprisingly long texts, up to thousands of tokens, via autoregressive generation from a single specially trained input embedding. In this work, we explore whether such reconstruction is possible without autoregression. We show that a frozen LLM can generate hundreds of accurate tokens in a single forward pass when provided with only two learned embeddings. This reveals a surprising and underexplored capability of LLMs: multi-token generation without iterative decoding. We investigate the behaviour of these embeddings and provide insight into the type of information they encode. We also show empirically that, although these representations are not unique for a given text, they form connected and local regions in embedding space, a property that suggests the potential of learning a dedicated encoder into that space.
