Exploring the Latent Capacity of LLMs for One-Step Text Generation
May 27, 2025
Authors: Gleb Mezentsev, Ivan Oseledets
cs.AI
Abstract
A recent study showed that large language models (LLMs) can reconstruct
surprisingly long texts - up to thousands of tokens - via autoregressive
generation from just one specially trained input embedding. In this work, we
explore whether such reconstruction is possible without autoregression. We show
that frozen LLMs can generate hundreds of accurate tokens in just one forward
pass, when provided with only two learned embeddings. This reveals a surprising
and underexplored capability of LLMs - multi-token generation without iterative
decoding. We investigate the behaviour of these embeddings and provide insight
into the type of information they encode. We also empirically show that
although these representations are not unique for a given text, they form
connected and local regions in embedding space - a property that suggests the
potential of learning a dedicated encoder into that space.
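
The abstract describes the setup only at a high level, so the PyTorch sketch below is one plausible reading rather than the authors' implementation: two trainable input embeddings are optimized against a frozen causal LM so that a single forward pass reproduces every token of a target text. The model name (gpt2), the input layout (the second embedding repeated as a placeholder at each remaining position), and all hyperparameters are assumptions made for illustration only.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Assumed setup: a small frozen causal LM; the paper's actual models may differ.
model_name = "gpt2"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)
model.eval()
for p in model.parameters():
    p.requires_grad_(False)  # the LLM stays frozen; only the two embeddings are trained

target_text = "..."  # text to be reconstructed in a single forward pass
target_ids = tokenizer(target_text, return_tensors="pt").input_ids  # shape (1, N)
n_tokens = target_ids.shape[1]
d_model = model.config.hidden_size

# The only learned parameters: two input embeddings.
e1 = torch.nn.Parameter(torch.randn(1, 1, d_model) * 0.02)
e2 = torch.nn.Parameter(torch.randn(1, 1, d_model) * 0.02)
optimizer = torch.optim.Adam([e1, e2], lr=1e-2)

for step in range(1000):
    # Assumed input layout: e1 at the first position, e2 repeated as a placeholder
    # at the remaining positions, so one forward pass yields one logit vector per
    # target token without any autoregressive feedback of generated tokens.
    inputs_embeds = torch.cat([e1, e2.expand(1, n_tokens - 1, d_model)], dim=1)
    logits = model(inputs_embeds=inputs_embeds).logits  # (1, N, vocab)
    loss = torch.nn.functional.cross_entropy(
        logits.view(-1, logits.size(-1)), target_ids.view(-1)
    )
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()

# After optimization, a single non-autoregressive forward pass decodes the text greedily.
with torch.no_grad():
    inputs_embeds = torch.cat([e1, e2.expand(1, n_tokens - 1, d_model)], dim=1)
    pred_ids = model(inputs_embeds=inputs_embeds).logits.argmax(dim=-1)
print(tokenizer.decode(pred_ids[0]))  # ideally matches target_text if training converged
```

In this reading, repeating the second embedding simply fills the input positions so the frozen model produces one prediction per position in parallel; whether the paper uses this exact layout is not stated in the abstract.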