Role-Agent: Bootstrapping LLM Agents via Dual-Role Evolution

June 9, 2026
Authors: Xucong Wang, Ziyu Ma, Shidong Yang, Tongwen Huang, Pengkun Wang, Yong Wang, Xiangxiang Chu
cs.AI

Abstract

Although large language model (LLM) agents perform well on complex tasks, their learning is often limited by inefficient interaction feedback and static training environments, which hinder broader generalization. To address these limitations, this paper introduces Role-Agent, a framework that uses a single LLM as both the agent and the environment, enabling bootstrapped co-evolution. Role-Agent comprises two synergistic components: World-In-Agent (WIA) and Agent-In-World (AIW). In WIA, the LLM acts as the agent and predicts the future state after each action; the alignment between the predicted and actual states is then used as a process reward, which encourages environment-aware reasoning. In AIW, the LLM analyzes the failure modes in failed trajectories and retrieves tasks with similar failure patterns, thereby reshaping the training data distribution for targeted practice. Experiments on multiple benchmarks show that Role-Agent consistently improves performance, with an average gain of over 4% over strong baselines.
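The two mechanisms named in the abstract can be illustrated with a minimal Python sketch. The abstract does not say how alignment is measured or how tasks are retrieved, so everything here is an assumption made for illustration: `alignment_reward` uses token-level F1 between the predicted and observed state text as a stand-in for the WIA process reward, and `reweight_tasks` uses pre-assigned failure-mode labels with a hypothetical `boost` parameter as a stand-in for the LLM-driven analysis and retrieval in AIW. Neither function is the authors' implementation.

```python
from collections import Counter
from typing import Dict, List


def alignment_reward(predicted_state: str, actual_state: str) -> float:
    """Assumed WIA-style process reward: token-level F1 between the
    state the agent predicted and the state the environment returned."""
    pred = Counter(predicted_state.lower().split())
    actual = Counter(actual_state.lower().split())
    overlap = sum((pred & actual).values())  # multiset intersection size
    if overlap == 0:
        return 0.0
    precision = overlap / sum(pred.values())
    recall = overlap / sum(actual.values())
    return 2 * precision * recall / (precision + recall)


def reweight_tasks(
    task_modes: Dict[str, str],
    observed_failures: List[str],
    boost: float = 1.0,
) -> Dict[str, float]:
    """Assumed AIW-style reshaping: raise the sampling probability of
    tasks whose (hypothetical, pre-assigned) failure-mode label matches
    failure modes seen in recent failed trajectories."""
    counts = Counter(observed_failures)
    # Every task keeps a base weight of 1; matching tasks gain
    # `boost` for each observed failure of their mode.
    weights = {
        task: 1.0 + boost * counts[mode] for task, mode in task_modes.items()
    }
    total = sum(weights.values())
    return {task: w / total for task, w in weights.items()}


if __name__ == "__main__":
    r = alignment_reward("the drawer is open", "the drawer is closed")
    print(f"process reward: {r:.2f}")

    tasks = {"t1": "wrong_object", "t2": "loop", "t3": "wrong_object"}
    print(reweight_tasks(tasks, ["wrong_object"]))
```

In this sketch a perfect prediction scores 1.0 and a prediction sharing no tokens with the observed state scores 0.0, while tasks tagged with a recently observed failure mode are sampled more often than the rest. A real system would likely replace both stand-ins with model-based judgments.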