ALPINE: Unveiling the Planning Capability of Autoregressive Learning in Language Models
May 15, 2024
Authors: Siwei Wang, Yifei Shen, Shi Feng, Haoran Sun, Shang-Hua Teng, Wei Chen
cs.AI
Abstract
In this paper, we present the findings of our Project ALPINE, which stands for
"Autoregressive Learning for Planning In NEtworks." Project ALPINE initiates a
theoretical investigation into the development of planning capabilities in
Transformer-based language models through their autoregressive learning
mechanisms, aiming to identify any potential limitations in their planning
abilities. We abstract planning as a network path-finding task where the
objective is to generate a valid path from a specified source node to a
designated target node. In terms of expressiveness, we show that the
Transformer is capable of executing path-finding by embedding the adjacency and
reachability matrices within its weights. Our theoretical analysis of the
gradient-based learning dynamics of the Transformer reveals that it can learn
both the adjacency matrix and a limited form of the reachability matrix. These
theoretical insights are then validated through experiments, which show that
the Transformer indeed learns the adjacency matrix and an incomplete
reachability matrix, in line with the predictions of our theoretical analysis.
Additionally, when applying our methodology
to a real-world planning benchmark, called Blocksworld, our observations remain
consistent. Our theoretical and empirical analyses further unveil a potential
limitation of the Transformer in path-finding: it cannot identify reachability
relationships through transitivity, and thus would fail when path concatenation
is needed to generate a path. In summary, our findings shed new light on how
the internal mechanisms of autoregressive learning enable planning in networks.
This study may contribute to our understanding of the general planning
capabilities in other related domains.
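
To make the role of the two matrices concrete, the following is a minimal Python sketch, not the paper's implementation. It assumes a small directed acyclic graph and a greedy rule that moves to an adjacent node that is either the target or recorded as reaching the target. The helper names (`build_adjacency`, `true_reachability`, `observed_reachability`, `find_path`) are hypothetical, and "observed" reachability is simplified here to mean pairs (node, target) in which the node appears on a training path ending at that target. Under these assumptions, a query that requires concatenating two training paths fails with the observed relation but succeeds with the full transitive closure.

```python
from typing import Dict, List, Optional, Set, Tuple

Pair = Tuple[int, int]


def build_adjacency(edges: List[Pair]) -> Dict[int, Set[int]]:
    """Map each node to the set of its direct successors."""
    adj: Dict[int, Set[int]] = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set())
    return adj


def true_reachability(adj: Dict[int, Set[int]]) -> Set[Pair]:
    """Full transitive closure: (a, b) is included if some path leads from a to b."""
    reach: Set[Pair] = set()
    for start in adj:
        stack = list(adj[start])
        seen: Set[int] = set()
        while stack:
            node = stack.pop()
            if node in seen:
                continue
            seen.add(node)
            reach.add((start, node))
            stack.extend(adj[node])
    return reach


def observed_reachability(paths: List[List[int]]) -> Set[Pair]:
    """Simplifying assumption: record (node, target) only when the node
    appears on a training path that ends at that target."""
    reach: Set[Pair] = set()
    for path in paths:
        target = path[-1]
        for node in path[:-1]:
            reach.add((node, target))
    return reach


def find_path(adj: Dict[int, Set[int]], reach: Set[Pair],
              source: int, target: int) -> Optional[List[int]]:
    """Greedy path-finding: step to an unvisited neighbor that is the target
    or is recorded in `reach` as reaching the target; give up otherwise."""
    path = [source]
    current = source
    while current != target:
        candidates = sorted(
            n for n in adj[current]
            if n not in path and (n == target or (n, target) in reach)
        )
        if not candidates:
            return None
        current = candidates[0]
        path.append(current)
    return path


# Chain graph 0 -> 1 -> 2 -> 3; the training paths never span 0 to 3 directly.
adj = build_adjacency([(0, 1), (1, 2), (2, 3)])
observed = observed_reachability([[0, 1, 2], [2, 3]])
full = true_reachability(adj)

print(find_path(adj, observed, 0, 2))  # [0, 1, 2]
print(find_path(adj, observed, 0, 3))  # None: (1, 3) was never observed
print(find_path(adj, full, 0, 3))      # [0, 1, 2, 3]
```

In this toy setting, the pair (1, 3) follows from (1, 2) and (2, 3) by transitivity, but it is absent from the observed relation, which mirrors the path-concatenation failure described above.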