Large Language Models as General Pattern Machines
July 10, 2023
Authors: Suvir Mirchandani, Fei Xia, Pete Florence, Brian Ichter, Danny Driess, Montserrat Gonzalez Arenas, Kanishka Rao, Dorsa Sadigh, Andy Zeng
cs.AI
Abstract
We observe that pre-trained large language models (LLMs) are capable of
autoregressively completing complex token sequences -- from arbitrary ones
procedurally generated by probabilistic context-free grammars (PCFG), to richer
spatial patterns found in the Abstraction and Reasoning Corpus (ARC), a general
AI benchmark, prompted in the style of ASCII art. Surprisingly, pattern
completion proficiency can be partially retained even when the sequences are
expressed using tokens randomly sampled from the vocabulary. These results
suggest that without any additional training, LLMs can serve as general
sequence modelers, driven by in-context learning. In this work, we investigate
how these zero-shot capabilities may be applied to problems in robotics -- from
extrapolating sequences of numbers that represent states over time to complete
simple motions, to least-to-most prompting of reward-conditioned trajectories
that can discover and represent closed-loop policies (e.g., a stabilizing
controller for CartPole). While difficult to deploy today for real systems due
to latency, context size limitations, and compute costs, the approach of using
LLMs to drive low-level control may provide an exciting glimpse into how the
patterns among words could be transferred to actions.
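
To make the sequence-completion claim concrete, below is a minimal sketch (not the authors' code) of the random-token probe described above: a few-shot pattern task keeps its structure even after every symbol is remapped to randomly chosen stand-in tokens. The reversal task, the stand-in vocabulary, and the llm_complete call mentioned in the final comment are all illustrative assumptions, not details from the paper.

```python
import random

# Toy in-context pattern task: reverse a sequence of tokens. The abstract
# reports that completion proficiency is partially retained even when the
# sequences are expressed with tokens randomly sampled from the vocabulary,
# which the remapping below simulates. (Task and vocabulary are illustrative.)
examples = [("1 2 3", "3 2 1"), ("4 5 6", "6 5 4")]
query = "7 8 9"

symbols = sorted({tok for pair in examples for seq in pair for tok in seq.split()}
                 | set(query.split()))
stand_ins = ["apple", "Bd", "~", "quux", "zeta", "07", "mu", "Kp", "&&", "lorem"]
remap = dict(zip(symbols, random.sample(stand_ins, len(symbols))))

def encode(seq: str) -> str:
    return " ".join(remap[tok] for tok in seq.split())

prompt = "".join(f"{encode(a)} -> {encode(b)}\n" for a, b in examples)
prompt += f"{encode(query)} ->"
print(prompt)
# Feed `prompt` to any text-completion API (a hypothetical llm_complete(prompt))
# and check whether the model emits encode("9 8 7").
```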
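
A companion sketch of the closed-loop control idea, in the spirit of the CartPole example: serialize each observation as fixed-point integers, append it to a growing context, and let a language model continue the pattern with an action token. It assumes the gymnasium package is installed; llm_complete is again a hypothetical placeholder, implemented as a random stub only so the loop executes end to end.

```python
import random
import gymnasium as gym  # assumption: gymnasium is installed

def serialize(obs, decimals: int = 2) -> str:
    """Render a continuous state as fixed-point integers, e.g.
    [0.013, -0.42, ...] -> "1 -42 ..." with two implied decimals."""
    scale = 10 ** decimals
    return " ".join(str(int(round(float(x) * scale))) for x in obs)

def llm_complete(prompt: str) -> str:
    """Hypothetical LLM call. A real implementation would send `prompt` to a
    text-completion API and return the next token; this random stub exists
    only so the sketch executes."""
    return random.choice(["0", "1"])

env = gym.make("CartPole-v1")
obs, _ = env.reset(seed=0)
context = ""  # past (state, action) pairs; prior episodes could be prepended
episode_return = 0.0
for _ in range(200):
    context += serialize(obs) + " -> "
    action = int(llm_complete(context))  # the model continues the pattern
    context += f"{action}\n"
    obs, reward, terminated, truncated, _ = env.step(action)
    episode_return += reward
    if terminated or truncated:
        break
print(f"episode return: {episode_return}")
```

With a real model behind llm_complete, the reward-conditioned, least-to-most variant described in the abstract would prepend earlier episodes annotated with their returns, nudging the model to continue only the higher-reward patterns.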