Large Language Models as General Pattern Machines

July 10, 2023
Authors: Suvir Mirchandani, Fei Xia, Pete Florence, Brian Ichter, Danny Driess, Montserrat Gonzalez Arenas, Kanishka Rao, Dorsa Sadigh, Andy Zeng
cs.AI

Abstract

We observe that pre-trained large language models (LLMs) are capable of autoregressively completing complex token sequences -- from arbitrary ones procedurally generated by probabilistic context-free grammars (PCFG), to richer spatial patterns found in the Abstraction and Reasoning Corpus (ARC), a general AI benchmark, prompted in the style of ASCII art. Surprisingly, pattern completion proficiency can be partially retained even when the sequences are expressed using tokens randomly sampled from the vocabulary. These results suggest that, without any additional training, LLMs can serve as general sequence modelers, driven by in-context learning. In this work, we investigate how these zero-shot capabilities may be applied to problems in robotics -- from extrapolating sequences of numbers that represent states over time to complete simple motions, to least-to-most prompting of reward-conditioned trajectories that can discover and represent closed-loop policies (e.g., a stabilizing controller for CartPole). While difficult to deploy today for real systems due to latency, context size limitations, and compute costs, the approach of using LLMs to drive low-level control may provide an exciting glimpse into how the patterns among words could be transferred to actions.
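
To make the sequence-extrapolation idea concrete, below is a minimal sketch (not the authors' code) of treating an LLM as a zero-shot trajectory completer: states are discretized into integer tokens, serialized into a comma-delimited prompt, and the model's continuation is parsed back into states. `llm_complete` is a hypothetical stand-in for any text-completion API, the bin range and delimiter are illustrative choices, and the stub `fake_llm` exists only so the example runs.

```python
# Sketch: zero-shot trajectory extrapolation with an LLM.
# `llm_complete` is a HYPOTHETICAL stand-in for any text-completion API.
from typing import Callable, List

def serialize(states: List[float], lo: float = -1.0, hi: float = 1.0,
              bins: int = 100) -> str:
    """Discretize each state into an integer token in [0, bins) and
    join the tokens with commas."""
    tokens = [round((s - lo) / (hi - lo) * (bins - 1)) for s in states]
    return ",".join(str(t) for t in tokens)

def deserialize(text: str, lo: float = -1.0, hi: float = 1.0,
                bins: int = 100) -> List[float]:
    """Map integer tokens in a completion back to continuous states."""
    return [lo + int(t) / (bins - 1) * (hi - lo)
            for t in text.split(",") if t.strip().isdigit()]

def extrapolate(states: List[float], llm_complete: Callable[[str], str],
                horizon: int = 5) -> List[float]:
    """Ask the LLM to continue the serialized trajectory in-context."""
    completion = llm_complete(serialize(states) + ",")
    return deserialize(completion)[:horizon]

# Stub "LLM" so the demo runs: continues the last linear step it sees.
def fake_llm(prompt: str) -> str:
    toks = [int(t) for t in prompt.rstrip(",").split(",")]
    step = toks[-1] - toks[-2]
    return ",".join(str(toks[-1] + step * k) for k in range(1, 6))

if __name__ == "__main__":
    motion = [0.0, 0.1, 0.2, 0.3, 0.4]   # a simple ramp over time
    print(extrapolate(motion, fake_llm))  # -> roughly [0.5, 0.6, 0.7, ...]
```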
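The closed-loop variant the abstract mentions (e.g., CartPole) can be sketched the same way: high-reward trajectories serve as in-context examples, and at each timestep the current observation is appended to the prompt and the model's next token is decoded as an action. This is again a hypothetical sketch, not the paper's implementation: `llm_complete`, the prompt format, and the reward target are assumptions, and the environment is assumed to follow the Gymnasium step API.

```python
# Sketch: a closed-loop policy represented entirely by in-context text.
# `llm_complete` is a HYPOTHETICAL text-completion stand-in; the prompt
# format and reward conditioning below are illustrative assumptions.
from typing import Callable

def act_in_context(env, llm_complete: Callable[[str], str],
                   context: str, episodes: int = 1) -> float:
    """Roll out a policy driven by in-context pattern completion.

    `context` holds prior trajectories such as
    "reward:200 obs:12,55 act:1 obs:13,54 act:0 ...", grown from shorter,
    lower-reward episodes first (least-to-most style), so the model is
    conditioned to continue a high-reward pattern.
    """
    total = 0.0
    for _ in range(episodes):
        obs, _ = env.reset()
        prompt = context + " reward:200"  # condition on a high target return
        done = False
        while not done:
            # Discretize the observation into small integers, as in the
            # serialization sketch above.
            obs_tokens = ",".join(str(int(round(x * 10))) for x in obs)
            prompt += f" obs:{obs_tokens} act:"
            action = int(llm_complete(prompt).strip()[0])  # e.g. "0" or "1"
            prompt += str(action)
            obs, reward, terminated, truncated, _ = env.step(action)
            total += reward
            done = terminated or truncated
    return total
```

With Gymnasium installed, `env = gymnasium.make("CartPole-v1")` supplies a compatible two-action environment; the same loop applies to other discrete-action tasks, subject to the latency and context-length limits the abstract notes.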