FARE: Fast-Slow Agentic Robotic Exploration
January 21, 2026
Authors: Shuhao Liao, Xuxin Lv, Jeric Lew, Shizhe Zhang, Jingsong Liang, Peizhuo Li, Yuhong Cao, Wenjun Wu, Guillaume Sartoretti
cs.AI
Abstract
This work advances autonomous robot exploration by integrating agent-level semantic reasoning with fast local control. We introduce FARE, a hierarchical autonomous exploration framework that combines a large language model (LLM) for global reasoning with a reinforcement learning (RL) policy for local decision making. FARE follows a fast-slow thinking paradigm. The slow-thinking LLM module interprets a concise textual description of the unknown environment and synthesizes an agent-level exploration strategy, which is then grounded into a sequence of global waypoints through a topological graph. To further improve reasoning efficiency, this module employs a modularity-based pruning mechanism that reduces redundant graph structures. The fast-thinking RL module executes exploration by reacting to local observations while being guided by the LLM-generated global waypoints. The RL policy is additionally shaped by a reward term that encourages adherence to the global waypoints, enabling coherent and robust closed-loop behavior. This architecture decouples semantic reasoning from geometric decision making, allowing each module to operate at its appropriate temporal and spatial scale. In challenging simulated environments, our results show that FARE achieves substantial improvements in exploration efficiency over state-of-the-art baselines. We further deploy FARE on hardware and validate it in a complex, large-scale 200 m × 130 m building environment.
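
The waypoint-adherence reward term mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not give the actual reward formulation, so everything here is an assumption made for illustration: the function names `waypoint_progress` and `shaped_reward`, the use of Euclidean distance, the progress-based form of the bonus, and the `weight` parameter are all hypothetical and are not taken from FARE.

```python
import math


def waypoint_progress(prev_pos, pos, waypoint):
    """Return how much closer the last step brought the robot to the waypoint.

    Positive when the robot moved toward the waypoint, negative when it
    moved away. Euclidean distance is an assumption of this sketch.
    """
    return math.dist(prev_pos, waypoint) - math.dist(pos, waypoint)


def shaped_reward(exploration_reward, prev_pos, pos, waypoint, weight=0.5):
    """Add a hypothetical waypoint-adherence bonus to a base exploration reward.

    exploration_reward: reward from the local exploration objective
                        (its definition is outside the scope of this sketch).
    weight:             assumed trade-off between local exploration and
                        following the global waypoint.
    """
    return exploration_reward + weight * waypoint_progress(prev_pos, pos, waypoint)


# Example: the waypoint lies 3 m ahead along the x-axis.
toward = shaped_reward(1.0, (0.0, 0.0), (1.0, 0.0), (3.0, 0.0))
away = shaped_reward(1.0, (0.0, 0.0), (-1.0, 0.0), (3.0, 0.0))
print(toward, away)
```

In this sketch, a step toward the waypoint raises the reward above the base exploration reward and a step away lowers it, which is one simple way a local policy could be nudged to stay consistent with a global plan while still reacting to local observations.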