Harnessing LLM Agents with Skill Programs
May 18, 2026
Authors: Hongjun Liu, Yifei Ming, Shafiq Joty, Chen Zhao
cs.AI
Abstract
Equipping LLM agents with reusable skills derived from past experience has become a popular and successful approach to complex, long-horizon tasks. However, such lessons are usually encoded as textual guidance that remains largely advisory: it provides no explicit mechanism for deciding when and how to intervene in the agent loop. To bridge this gap, we introduce HASP (Harnessing LLM Agents with Skill Programs), a framework that upgrades skills into executable Program Functions (PFs). Rather than offering passive advice, PFs act as executable guardrails that activate on failure-prone states and either modify the next action or inject corrective context. HASP is highly modular: it can be applied at inference time for direct agent-loop intervention, during post-training to provide structured supervision, or for self-improvement by evolving validated, teacher-reviewed PFs. Empirically, HASP yields substantial gains over both training-free and training-based methods on web-search, math reasoning, and coding tasks. For example, on web-search reasoning, inference-time PFs alone improve average performance by 25% over a (multi-loop) ReAct agent, while post-training combined with controlled evolution achieves a 30.4% gain over Search-R1. To provide deeper insight into HASP, our mechanism analysis reveals how PFs trigger and intervene, how skills are internalized, and what is required for stable skill-library evolution.
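The abstract describes PFs only at a high level, so the following is a minimal illustrative sketch of the general idea rather than the authors' implementation. It assumes a PF can be modeled as a trigger predicate plus an intervention that may rewrite the proposed action and append corrective context. Every name here (`AgentState`, `ProgramFunction`, `apply_pfs`, the repeated-query rule, and the `search:` action format) is hypothetical and does not come from the paper.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class AgentState:
    """Hypothetical agent state: the task, past actions, and extra context."""
    question: str
    history: List[str] = field(default_factory=list)   # actions already taken
    context: List[str] = field(default_factory=list)   # notes shown to the policy


@dataclass
class ProgramFunction:
    """Hypothetical PF: a trigger on (state, action) plus an intervention."""
    name: str
    trigger: Callable[[AgentState, str], bool]
    intervene: Callable[[AgentState, str], str]  # returns the action to execute


def repeated_query_trigger(state: AgentState, action: str) -> bool:
    # Assumed failure-prone state: the agent re-issues an identical search.
    return action.startswith("search:") and action in state.history


def repeated_query_fix(state: AgentState, action: str) -> str:
    # Inject corrective context and replace the action with a fresh query.
    state.context.append("This query was already issued; rephrase it or answer.")
    return "search:" + state.question


def apply_pfs(state: AgentState, proposed: str, pfs: List[ProgramFunction]) -> str:
    """Run each PF in order; a PF only acts when its trigger fires."""
    action = proposed
    for pf in pfs:
        if pf.trigger(state, action):
            action = pf.intervene(state, action)
    state.history.append(action)
    return action


if __name__ == "__main__":
    pfs = [ProgramFunction("no_repeat", repeated_query_trigger, repeated_query_fix)]
    state = AgentState(question="who wrote Dune", history=["search:dune"])
    print(apply_pfs(state, "search:dune", pfs))  # the trigger fires on the repeat
    print(state.context)
```

In this sketch the PF leaves the action untouched unless its trigger fires, which mirrors the abstract's contrast between executable guardrails and purely advisory text. How HASP actually represents states, triggers, and interventions is not specified in the abstract.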