Uncovering mesa-optimization algorithms in Transformers
September 11, 2023
Authors: Johannes von Oswald, Eyvind Niklasson, Maximilian Schlegel, Seijin Kobayashi, Nicolas Zucchet, Nino Scherrer, Nolan Miller, Mark Sandler, Blaise Agüera y Arcas, Max Vladymyrov, Razvan Pascanu, João Sacramento
cs.AI
Abstract
Transformers have become the dominant model in deep learning, but the reason
for their superior performance is poorly understood. Here, we hypothesize that
the strong performance of Transformers stems from an architectural bias towards
mesa-optimization, a learned process, running within the forward pass of a model,
that consists of the following two steps: (i) the construction of an internal
learning objective, and (ii) its corresponding solution found through
optimization. To test this hypothesis, we reverse-engineer a series of
autoregressive Transformers trained on simple sequence modeling tasks,
uncovering underlying gradient-based mesa-optimization algorithms driving the
generation of predictions. Moreover, we show that the learned forward-pass
optimization algorithm can be immediately repurposed to solve supervised
few-shot tasks, suggesting that mesa-optimization might underlie the in-context
learning capabilities of large language models. Finally, we propose a novel
self-attention layer, the mesa-layer, that explicitly and efficiently solves
optimization problems specified in context. We find that this layer can lead to
improved performance in synthetic and preliminary language modeling
experiments, adding weight to our hypothesis that mesa-optimization is an
important operation hidden within the weights of trained Transformers.
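
The two steps named in the abstract can be made concrete with a small numerical sketch. The following is a minimal illustration, not the authors' code: the toy in-context regression setup, the shapes, the learning rate `eta`, and starting the inner weights at zero are all illustrative assumptions. It checks that one gradient-descent step on an internal least-squares objective yields exactly the prediction of an unnormalized linear self-attention layer, which is the kind of forward-pass, gradient-based optimization the reverse-engineering described above points to.

```python
import numpy as np

# Minimal sketch (illustrative, not the paper's code): one gradient step on an
# internal least-squares objective L(W) = 1/2 * sum_i ||W x_i - y_i||^2,
# started from W = 0, coincides with an unnormalized linear attention readout.

rng = np.random.default_rng(0)
d_in, d_out, n_ctx = 4, 2, 8
eta = 0.1                              # illustrative inner learning rate

X = rng.normal(size=(n_ctx, d_in))     # in-context inputs  x_1..x_n
Y = rng.normal(size=(n_ctx, d_out))    # in-context targets y_1..y_n
x_q = rng.normal(size=(d_in,))         # query token

# (i) internal objective: L(W) = 1/2 * sum_i ||W x_i - y_i||^2
# (ii) one gradient step from W = 0 gives W_1 = eta * sum_i y_i x_i^T
W_1 = eta * Y.T @ X
pred_gd = W_1 @ x_q

# The same prediction, written as unnormalized linear attention:
# keys = x_i, values = eta * y_i, query = x_q.
attn_scores = X @ x_q                  # <x_i, x_q> for each context token
pred_attention = eta * Y.T @ attn_scores

assert np.allclose(pred_gd, pred_attention)
print(pred_gd)
```

The mesa-layer proposed in the abstract goes one step further: rather than taking a single implicit gradient step, it explicitly solves the optimization problem specified in context.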