Agent-Omit: Training Efficient LLM Agents for Adaptive Thought and Observation Omission via Agentic Reinforcement Learning

February 4, 2026
Authors: Yansong Ning, Jun Fang, Naiqiang Tan, Hao Liu
cs.AI

Abstract

Managing agent thought and observation during multi-turn agent-environment interactions is an emerging strategy for improving agent efficiency. However, existing studies treat every turn of an interaction trajectory equally, overlooking that thought necessity and observation utility vary across turns. To this end, we first conduct quantitative investigations into how thought and observation affect agent effectiveness and efficiency. Based on our findings, we propose Agent-Omit, a unified training framework that empowers LLM agents to adaptively omit redundant thoughts and observations. Specifically, we first synthesize a small amount of cold-start data, covering both single-turn and multi-turn omission scenarios, to fine-tune the agent for omission behaviors. Furthermore, we introduce an omit-aware agentic reinforcement learning approach that incorporates a dual sampling mechanism and a tailored omission reward to incentivize the agent's adaptive omission capability. Theoretically, we prove that the deviation of our omission policy is upper-bounded by a KL divergence. Experimental results on five agent benchmarks show that our constructed Agent-Omit-8B achieves performance comparable to seven frontier LLM agents, and achieves the best effectiveness-efficiency trade-off when compared with seven efficient LLM agent methods. Our code and data are available at https://github.com/usail-hkust/Agent-Omit.
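To make the idea of omitting thoughts and observations concrete, the following is a minimal illustrative sketch, not the authors' implementation. It assumes a trajectory is a list of turns, each with a thought, an action, and an observation, and that omission amounts to leaving selected thoughts or observations out of the context passed to the model. The names `Turn`, `build_context`, and `rough_token_count` are hypothetical, and whitespace splitting is only a crude stand-in for a real tokenizer.

```python
from dataclasses import dataclass
from typing import List, Set


@dataclass
class Turn:
    """One agent-environment interaction turn (hypothetical structure)."""
    thought: str
    action: str
    observation: str


def build_context(turns: List[Turn],
                  omit_thought: Set[int],
                  omit_observation: Set[int]) -> str:
    """Render a trajectory as text, skipping thoughts and observations
    at the given turn indices. Actions are always kept."""
    parts = []
    for i, turn in enumerate(turns):
        if i not in omit_thought:
            parts.append(f"Thought: {turn.thought}")
        parts.append(f"Action: {turn.action}")
        if i not in omit_observation:
            parts.append(f"Observation: {turn.observation}")
    return "\n".join(parts)


def rough_token_count(text: str) -> int:
    """Crude token estimate by whitespace splitting (an assumption;
    a real system would use the model's tokenizer)."""
    return len(text.split())


turns = [
    Turn("I should list files first", "ls", "a.txt b.txt"),
    Turn("Now open the file", "cat a.txt", "hello world"),
]
full = build_context(turns, set(), set())
# Drop the thought of turn 1 and the observation of turn 0.
lean = build_context(turns, omit_thought={1}, omit_observation={0})
print(rough_token_count(full), rough_token_count(lean))
```

In this sketch the omission sets are supplied by hand. In the framework described above, the agent itself learns which turns need a thought and which observations remain useful; how that decision is made and rewarded is not specified in the abstract and is not modeled here.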