Exploring the Latent Capacity of LLMs for One-Step Text Generation

Abstract

A recent study showed that large language models (LLMs) can reconstruct surprisingly long texts, up to thousands of tokens, via autoregressive generation from a single specially trained input embedding. In this work, we explore whether such reconstruction is possible without autoregression. We show that frozen LLMs can generate hundreds of accurate tokens in a single forward pass when provided with only two learned embeddings. This reveals a surprising and underexplored capability of LLMs: multi-token generation without iterative decoding. We investigate the behaviour of these embeddings and provide insight into the type of information they encode. We also show empirically that, although these representations are not unique for a given text, they form connected and local regions in embedding space, a property that suggests the potential of learning a dedicated encoder into that space.

