SwiftSage: A Generative Agent with Fast and Slow Thinking for Complex Interactive Tasks

Abstract

We introduce SwiftSage, a novel agent framework inspired by the dual-process theory of human cognition, designed to excel in action planning for complex interactive reasoning tasks. SwiftSage integrates the strengths of behavior cloning and prompting large language models (LLMs) to enhance task completion performance. The framework comprises two primary modules: the Swift module, representing fast and intuitive thinking, and the Sage module, emulating deliberate thought processes. The Swift module is a small encoder-decoder LM fine-tuned on the oracle agent's action trajectories, while the Sage module employs LLMs such as GPT-4 for subgoal planning and grounding. We develop a heuristic method to harmoniously integrate the two modules, resulting in a more efficient and robust problem-solving process. In 30 tasks from the ScienceWorld benchmark, SwiftSage significantly outperforms other methods such as SayCan, ReAct, and Reflexion, demonstrating its effectiveness in solving complex real-world tasks.

