Agentic Reasoning for Large Language Models
January 18, 2026
Authors: Tianxin Wei, Ting-Wei Li, Zhining Liu, Xuying Ning, Ze Yang, Jiaru Zou, Zhichen Zeng, Ruizhong Qiu, Xiao Lin, Dongqi Fu, Zihao Li, Mengting Ai, Duo Zhou, Wenxuan Bao, Yunzhe Li, Gaotang Li, Cheng Qian, Yu Wang, Xiangru Tang, Yin Xiao, Liri Fang, Hui Liu, Xianfeng Tang, Yuji Zhang, Chi Wang, Jiaxuan You, Heng Ji, Hanghang Tong, Jingrui He
cs.AI
Abstract
Reasoning is a fundamental cognitive process underlying inference, problem-solving, and decision-making. While large language models (LLMs) demonstrate strong reasoning capabilities in closed-world settings, they struggle in open-ended and dynamic environments. Agentic reasoning marks a paradigm shift by reframing LLMs as autonomous agents that plan, act, and learn through continual interaction. In this survey, we organize agentic reasoning along three complementary dimensions. First, we characterize environmental dynamics through three layers: foundational agentic reasoning, which establishes core single-agent capabilities including planning, tool use, and search in stable environments; self-evolving agentic reasoning, which studies how agents refine these capabilities through feedback, memory, and adaptation; and collective multi-agent reasoning, which extends intelligence to collaborative settings involving coordination, knowledge sharing, and shared goals. Across these layers, we distinguish in-context reasoning, which scales test-time interaction through structured orchestration, from post-training reasoning, which optimizes behaviors via reinforcement learning and supervised fine-tuning. We further review representative agentic reasoning frameworks across real-world applications and benchmarks, including science, robotics, healthcare, autonomous research, and mathematics. This survey synthesizes agentic reasoning methods into a unified roadmap bridging thought and action, and outlines open challenges and future directions, including personalization, long-horizon interaction, world modeling, scalable multi-agent training, and governance for real-world deployment.