Embodied-RAG: General non-parametric Embodied Memory for Retrieval and Generation
September 26, 2024
Authors: Quanting Xie, So Yeon Min, Tianyi Zhang, Aarav Bajaj, Ruslan Salakhutdinov, Matthew Johnson-Roberson, Yonatan Bisk
cs.AI
Abstract
There is no limit to how much a robot might explore and learn, but all of
that knowledge needs to be searchable and actionable. Within language research,
retrieval-augmented generation (RAG) has become the workhorse of large-scale
non-parametric knowledge; however, existing techniques do not directly transfer
to the embodied domain, which is multimodal, where data is highly correlated,
and where perception requires abstraction.
To address these challenges, we introduce Embodied-RAG, a framework that
enhances the foundational model of an embodied agent with a non-parametric
memory system capable of autonomously constructing hierarchical knowledge for
both navigation and language generation. Embodied-RAG handles a full range of
spatial and semantic resolutions across diverse environments and query types,
whether for a specific object or a holistic description of ambiance. At its
core, Embodied-RAG's memory is structured as a semantic forest, storing
language descriptions at varying levels of detail. This hierarchical
organization allows the system to efficiently generate context-sensitive
outputs across different robotic platforms. We demonstrate that Embodied-RAG
effectively bridges RAG to the robotics domain, successfully handling over 200
explanation and navigation queries across 19 environments, highlighting its
promise as a general-purpose non-parametric system for embodied agents.
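
The abstract describes the memory as a semantic forest that stores language descriptions at varying levels of detail. The following is a minimal Python sketch of that idea only, not the authors' implementation: the names `Node`, `make_parent`, `overlap`, and `retrieve` are hypothetical, the parent summary is a plain string join standing in for a model-generated summary, and the word-overlap score stands in for whatever relevance measure the real system uses.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class Node:
    description: str                      # language description at this level
    position: Tuple[float, float]         # (x, y); mean of children for inner nodes
    children: List["Node"] = field(default_factory=list)


def make_parent(children: List[Node]) -> Node:
    """Group children under one coarser node (assumed stand-in for a learned summary)."""
    xs = [c.position[0] for c in children]
    ys = [c.position[1] for c in children]
    summary = "; ".join(c.description for c in children)
    return Node(summary, (sum(xs) / len(xs), sum(ys) / len(ys)), children)


def overlap(query: str, text: str) -> int:
    """Toy relevance score (an assumption): number of shared lowercase words."""
    words = set(text.lower().replace(";", " ").split())
    return len(set(query.lower().split()) & words)


def retrieve(roots: List[Node], query: str) -> Node:
    """Pick the best-matching root, then descend greedily to a leaf."""
    node = max(roots, key=lambda n: overlap(query, n.description))
    while node.children:
        node = max(node.children, key=lambda n: overlap(query, n.description))
    return node


# Two small trees form the forest; the leaves are fine-grained observations.
kitchen = make_parent([
    Node("red kettle on counter", (1.0, 2.0)),
    Node("fridge near window", (3.0, 2.0)),
])
garden = make_parent([
    Node("wooden bench under tree", (10.0, 8.0)),
    Node("quiet pond with lilies", (12.0, 10.0)),
])
forest = [kitchen, garden]

hit = retrieve(forest, "where is the kettle")
print(hit.description, hit.position)
```

In this sketch a query for a specific object descends to a leaf whose position could serve as a navigation goal, while a broader query about ambiance could instead stop at an inner node and use its coarser description.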