Let's Predict Sentence by Sentence
May 28, 2025
Authors: Hyeonbin Hwang, Byeongguk Jeon, Seungone Kim, Jiyeon Kim, Hoyeon Chang, Sohee Yang, Seungpil Won, Dohaeng Lee, Youbin Ahn, Minjoon Seo
cs.AI
Abstract
Autoregressive language models (LMs) generate one token at a time, yet human
reasoning operates over higher-level abstractions - sentences, propositions,
and concepts. This contrast raises a central question: Can LMs likewise learn
to reason over structured semantic units rather than raw token sequences? In
this work, we investigate whether pretrained LMs can be lifted into such
abstract reasoning spaces by building on their learned representations. We
present a framework that adapts a pretrained token-level LM to operate in
sentence space by autoregressively predicting continuous embeddings of next
sentences. We explore two embedding paradigms inspired by classical
representation learning: 1) semantic embeddings, learned via autoencoding to
preserve surface meaning; and 2) contextual embeddings, trained via
next-sentence prediction to encode anticipatory structure. We evaluate both
under two inference regimes: Discretized, which decodes each predicted
embedding into text before re-encoding; and Continuous, which reasons entirely
in embedding space for improved efficiency. Across four domains - mathematics,
logic, commonsense, and planning - contextual embeddings under continuous
inference show competitive performance with Chain-of-Thought (CoT) while
reducing inference-time FLOPs on average by half. We also present early signs
of scalability and modular adaptation. Finally, to visualize latent
trajectories, we introduce SentenceLens, a diagnostic tool that decodes
intermediate model states into interpretable sentences. Together, our results
indicate that pretrained LMs can effectively transition to abstract, structured
reasoning within latent embedding spaces.
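To make the described framework concrete, below is a minimal, hypothetical sketch (not the authors' implementation) of sentence-level autoregressive prediction under the two inference regimes. The SentenceReasoner module, the encode/decode stand-ins, the embedding dimension, and the rollout loop are illustrative assumptions only; in the paper, the encoders are learned via autoencoding (semantic) or next-sentence prediction (contextual) rather than stubbed out.

# Minimal sketch of sentence-level autoregressive prediction (illustrative only).
# Assumptions: a sentence encoder, a small Transformer "reasoner" that maps the
# previous sentence embeddings to the next one, and a decoder stub that maps an
# embedding back to text. All names, shapes, and hyperparameters are hypothetical.
import torch
import torch.nn as nn

EMB_DIM = 64  # illustrative sentence-embedding size


class SentenceReasoner(nn.Module):
    """Predicts the embedding of the next sentence from the embeddings so far."""

    def __init__(self, dim: int = EMB_DIM):
        super().__init__()
        layer = nn.TransformerEncoderLayer(d_model=dim, nhead=4, batch_first=True)
        self.backbone = nn.TransformerEncoder(layer, num_layers=2)
        self.head = nn.Linear(dim, dim)

    def forward(self, sent_embs: torch.Tensor) -> torch.Tensor:
        # sent_embs: (batch, num_sentences, dim) -> (batch, dim)
        hidden = self.backbone(sent_embs)
        return self.head(hidden[:, -1])


def encode(sentence: str) -> torch.Tensor:
    """Stand-in for a learned sentence encoder (semantic or contextual)."""
    torch.manual_seed(abs(hash(sentence)) % (2 ** 31))  # deterministic toy embedding
    return torch.randn(1, EMB_DIM)


def decode(embedding: torch.Tensor) -> str:
    """Stand-in for a decoder that maps a sentence embedding back to text."""
    return f"<decoded step, embedding norm {embedding.norm():.2f}>"


def reason(question: str, steps: int = 3, discretized: bool = False) -> list[str]:
    """Roll out latent reasoning steps; optionally round-trip through text."""
    model = SentenceReasoner()
    history = encode(question).unsqueeze(1)  # (1, 1, dim)
    outputs = []
    for _ in range(steps):
        next_emb = model(history)  # predict the next-sentence embedding
        if discretized:
            # Discretized regime: decode to text, then re-encode before continuing.
            text = decode(next_emb)
            outputs.append(text)
            next_emb = encode(text)
        else:
            # Continuous regime: keep the raw embedding; decode only for inspection.
            outputs.append(decode(next_emb))
        history = torch.cat([history, next_emb.unsqueeze(1)], dim=1)
    return outputs


if __name__ == "__main__":
    for step in reason("Alice has 3 apples and buys 2 more. How many does she have?"):
        print(step)

The sketch mirrors the key design distinction stated in the abstract: the discretized regime round-trips each predicted embedding through text before re-encoding, while the continuous regime feeds the predicted embedding straight back into the model, avoiding token-level decoding at every step, which is where the reported savings in inference-time FLOPs would come from.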