

Exploring the Latent Capacity of LLMs for One-Step Text Generation

May 27, 2025
Authors: Gleb Mezentsev, Ivan Oseledets
cs.AI

Abstract

A recent study showed that large language models (LLMs) can reconstruct surprisingly long texts - up to thousands of tokens - via autoregressive generation from just one specially trained input embedding. In this work, we explore whether such reconstruction is possible without autoregression. We show that frozen LLMs can generate hundreds of accurate tokens in just one forward pass, when provided with only two learned embeddings. This reveals a surprising and underexplored capability of LLMs - multi-token generation without iterative decoding. We investigate the behaviour of these embeddings and provide insight into the type of information they encode. We also empirically show that although these representations are not unique for a given text, they form connected and local regions in embedding space - a property that suggests the potential of learning a dedicated encoder into that space.
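As a rough illustration of the setup described above, the sketch below optimizes two soft input embeddings against a frozen causal LM so that a single forward pass reproduces a target text, with no autoregressive decoding. The model choice, the use of a fixed placeholder embedding for the remaining positions, and all hyperparameters are assumptions made for illustration; they are not the paper's actual configuration.

```python
# Hypothetical sketch: train two input embeddings so that a frozen causal LM
# reconstructs a target text from ONE forward pass (no iterative decoding).
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "gpt2"  # placeholder choice; the paper may use other frozen LLMs
tok = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)
model.eval()
for p in model.parameters():          # the LLM itself stays frozen
    p.requires_grad_(False)

target_text = "some paragraph of a few hundred tokens to reconstruct"
target_ids = tok(target_text, return_tensors="pt").input_ids   # (1, N)
n_tokens = target_ids.shape[1]
hidden = model.config.hidden_size

# Only these two vectors are trained; the rest of the input sequence is a
# fixed placeholder embedding (an assumption -- the exact layout may differ).
learned = torch.nn.Parameter(torch.randn(1, 2, hidden) * 0.02)
placeholder = model.get_input_embeddings()(
    torch.tensor([[tok.eos_token_id] * n_tokens])
).detach()                            # (1, N, hidden), not optimized

opt = torch.optim.Adam([learned], lr=1e-2)
for step in range(1000):
    inputs_embeds = torch.cat([learned, placeholder], dim=1)   # (1, 2+N, hidden)
    logits = model(inputs_embeds=inputs_embeds).logits
    # In a causal LM, the logit at position i predicts the token at i+1,
    # so positions 1 .. n_tokens already cover all N targets in one pass.
    pred = logits[:, 1 : 1 + n_tokens, :]
    loss = torch.nn.functional.cross_entropy(
        pred.reshape(-1, pred.size(-1)), target_ids.reshape(-1)
    )
    opt.zero_grad()
    loss.backward()
    opt.step()

# One-step generation: argmax over the same positions, no autoregression.
with torch.no_grad():
    logits = model(inputs_embeds=torch.cat([learned, placeholder], dim=1)).logits
    decoded = logits[:, 1 : 1 + n_tokens, :].argmax(-1)
print(tok.decode(decoded[0]))
```

In this sketch the gradient flows only into the two learned vectors, mirroring the paper's premise that the capacity for multi-token generation resides in the frozen model and the optimized embeddings, not in any fine-tuned weights.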
