SkillGrad: Optimizing Agent Skills Like Gradient Descent
May 26, 2026
Authors: Hanyu Wang, Yifan Lan, Bochuan Cao, Lu Lin, Jinghui Chen
cs.AI
Abstract
Agent skills provide a lightweight way to adapt LLM agents to specialized domains by storing reusable procedural knowledge in structured files. However, whether downloaded from third parties or self-generated, these skills are often unreliable, incomplete, or outdated. Existing skill-evolution methods often address these deficiencies through heuristic reflections without an explicit optimization formulation. In this paper, we propose SkillGrad, a gradient-descent-inspired framework for optimizing agent skills. SkillGrad treats the skill package as a structured parameter to optimize in a gradient descent fashion: task executions provide trajectory-level loss evidence, and automatic diagnoses then provide text-based gradients that indicate the correction directions. To stabilize optimization across iterations, a momentum agent accumulates recurring diagnostic patterns into a persistent memory overlay. Finally, an LLM-based patcher executes the parameter update by applying layer-aware edits to the skill package. Evaluated on SpreadsheetBench Verified and WikiTableQuestions, SkillGrad consistently outperforms training-based skill evolution baselines across two backbone LLMs, improving over the strongest training-based baseline by 6.7 percentage points on average. Ablations further show that momentum and contrastive diagnosis both contribute to the final skill quality.
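The loop described in the abstract (execute tasks, diagnose failures, accumulate recurring diagnoses, patch the skill) can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the names `optimize_skill`, `Momentum`, `run_task`, `diagnose`, and `patch`, the dict-based skill representation, and the toy rule-matching tasks are hypothetical stand-ins for the LLM-driven components, not the authors' implementation.

```python
from collections import Counter
from typing import Callable, Dict, List, Tuple

# Hypothetical skill package: section name -> list of text entries.
Skill = Dict[str, List[str]]


class Momentum:
    """Hypothetical momentum memory: counts diagnoses across iterations."""

    def __init__(self, min_count: int = 2) -> None:
        self.counts: Counter = Counter()
        self.min_count = min_count

    def update(self, diagnoses: List[str]) -> List[str]:
        self.counts.update(diagnoses)
        # A diagnosis counts as "recurring" once seen min_count times.
        return sorted(d for d, n in self.counts.items() if n >= self.min_count)


def optimize_skill(
    skill: Skill,
    tasks: List[str],
    run_task: Callable[[Skill, str], Tuple[bool, str]],
    diagnose: Callable[[str, str], str],
    patch: Callable[[Skill, List[str], List[str]], Skill],
    steps: int = 5,
) -> Skill:
    momentum = Momentum()
    for _ in range(steps):
        # "Loss evidence": trajectories of the tasks that failed.
        failures = []
        for task in tasks:
            ok, trajectory = run_task(skill, task)
            if not ok:
                failures.append((task, trajectory))
        if not failures:
            break
        # "Text gradients": one textual diagnosis per failed trajectory.
        gradients = [diagnose(task, traj) for task, traj in failures]
        # "Momentum": diagnoses that keep recurring across iterations.
        recurring = momentum.update(gradients)
        # "Parameter update": the patcher edits the skill package.
        skill = patch(skill, gradients, recurring)
    return skill


# Toy stand-ins for the LLM components (assumptions, not real agents).
def run_task(skill: Skill, task: str) -> Tuple[bool, str]:
    ok = task in skill["rules"]
    return ok, f"task={task} ok={ok}"


def diagnose(task: str, trajectory: str) -> str:
    return f"missing:{task}"


def patch(skill: Skill, gradients: List[str], recurring: List[str]) -> Skill:
    # Apply a single edit per step and store recurring diagnoses in a
    # separate "memory" section of the skill package.
    rule = gradients[0].split(":", 1)[1]
    return {"rules": skill["rules"] + [rule], "memory": list(recurring)}


result = optimize_skill(
    {"rules": [], "memory": []}, ["sum", "pivot"], run_task, diagnose, patch
)
print(result)
```

In this toy run the patcher fixes one failure per step, so the diagnosis for the second task appears twice and ends up in the memory section. A real system would replace each stub with an LLM call and a task-specific success check.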