Statler: State-Maintaining Language Models for Embodied Reasoning
June 30, 2023
Authors: Takuma Yoneda, Jiading Fang, Peng Li, Huanyu Zhang, Tianchong Jiang, Shengjie Lin, Ben Picker, David Yunis, Hongyuan Mei, Matthew R. Walter
cs.AI
Abstract
Large language models (LLMs) provide a promising tool that enables robots to
perform complex reasoning tasks. However, the limited context window of
contemporary LLMs makes reasoning over long time horizons difficult. Embodied
tasks such as those that one might expect a household robot to perform
typically require that the planner consider information acquired a long time
ago (e.g., properties of the many objects that the robot previously encountered
in the environment). Attempts to capture the world state using an LLM's
implicit internal representation are complicated by the paucity of task- and
environment-relevant information available in a robot's action history, while
methods that rely on the ability to convey information via the prompt to the
LLM are subject to its limited context window. In this paper, we propose
Statler, a framework that endows LLMs with an explicit representation of the
world state as a form of "memory" that is maintained over time. Integral to
Statler is its use of two instances of general LLMs -- a world-model reader and
a world-model writer -- that interface with and maintain the world state. By
providing access to this world state "memory", Statler improves the ability
of existing LLMs to reason over longer time horizons without the constraint of
context length. We evaluate the effectiveness of our approach on three
simulated table-top manipulation domains and a real robot domain, and show that
it improves the state-of-the-art in LLM-based robot reasoning. Project website:
https://statler-lm.github.io/
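
To make the reader/writer design concrete, here is a minimal Python sketch of the pattern the abstract describes. It is an illustration under stated assumptions, not the paper's implementation: the `query_llm` function, the prompt templates, and the text-rendered state representation are all hypothetical.

```python
# A minimal sketch of the two-LLM pattern described above, assuming a
# hypothetical `query_llm` completion function and a text-rendered world
# state; the actual prompts and state format used by Statler may differ.

def query_llm(prompt: str) -> str:
    """Hypothetical stand-in for any LLM completion API."""
    raise NotImplementedError("Replace with a real LLM call.")


class Statler:
    """Keeps an explicit world-state 'memory' outside the LLM context."""

    def __init__(self, initial_state: str):
        self.world_state = initial_state  # e.g., object names, positions, properties

    def step(self, instruction: str) -> str:
        # World-model reader: generate the next action conditioned on the
        # current state summary rather than the full interaction history.
        action = query_llm(
            f"World state:\n{self.world_state}\n\n"
            f"Instruction: {instruction}\n"
            f"Next action:"
        )
        # World-model writer: revise the state to reflect the action's
        # effects, so later queries need no access to past turns.
        self.world_state = query_llm(
            f"World state:\n{self.world_state}\n\n"
            f"Action just executed: {action}\n"
            f"Updated world state:"
        )
        return action
```

Because each call sees only the current state summary and the latest instruction, the prompt size stays bounded no matter how long the robot has been acting, which is the property the abstract describes as reasoning "without the constraint of context length".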