ASA: Training-Free Representation Engineering for Tool-Calling Agents

February 4, 2026
Authors: Youjin Wang, Run Zhou, Rong Fu, Shuaishuai Cao, Hongwei Zeng, Jiaxuan Lu, Sicheng Fan, Jiaqiao Zhao, Liangming Pan
cs.AI

Abstract

Adapting LLM agents to domain-specific tool calling remains notably brittle under evolving interfaces. Prompt and schema engineering is easy to deploy but often fragile under distribution shift and strict parsers, while continual parameter-efficient fine-tuning improves reliability at the cost of training, maintenance, and potential forgetting. We identify a critical Lazy Agent failure mode where tool necessity is nearly perfectly decodable from mid-layer activations, yet the model remains conservative in entering tool mode, revealing a representation-behavior gap. We propose Activation Steering Adapter (ASA), a training-free, inference-time controller that performs a single-shot mid-layer intervention and targets tool domains via a router-conditioned mixture of steering vectors with a probe-guided signed gate to amplify true intent while suppressing spurious triggers. On MTU-Bench with Qwen2.5-1.5B, ASA improves strict tool-use F1 from 0.18 to 0.50 while reducing the false positive rate from 0.15 to 0.05, using only about 20KB of portable assets and no weight updates.
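
The intervention described in the abstract can be illustrated with a rough sketch. This is not the authors' implementation: the abstract gives no formulas, so the softmax router, the linear probe, the gate thresholds `lo` and `hi`, the strength `alpha`, and every function name below are assumptions chosen only to show how a router-conditioned mixture of steering vectors and a signed gate could combine in a single edit to one mid-layer hidden state.

```python
import math


def dot(a, b):
    """Inner product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def signed_gate(p_tool, lo=0.4, hi=0.6):
    """Hypothetical gate: +1 amplifies, -1 suppresses, 0 leaves the state alone.

    The thresholds are illustrative assumptions, not values from the paper.
    """
    if p_tool >= hi:
        return 1.0
    if p_tool <= lo:
        return -1.0
    return 0.0


def steer(h, router_w, steering_vecs, probe_w, probe_b, alpha=1.0):
    """Apply one additive edit to a hidden state h (assumed form).

    router_w      : one weight vector per tool domain (assumed linear router)
    steering_vecs : one steering vector per tool domain
    probe_w, probe_b : assumed linear probe for "a tool call is needed"
    """
    # Router-conditioned mixture: weight each domain's steering vector.
    weights = softmax([dot(w, h) for w in router_w])
    mixed = [
        sum(wk * v[i] for wk, v in zip(weights, steering_vecs))
        for i in range(len(h))
    ]
    # Probe-guided signed gate decides the direction of the push.
    gate = signed_gate(sigmoid(dot(probe_w, h) + probe_b))
    return [x + alpha * gate * m for x, m in zip(h, mixed)]


if __name__ == "__main__":
    h = [1.0, 0.0]
    router_w = [[10.0, 0.0], [-10.0, 0.0]]      # router strongly prefers domain 0
    steering_vecs = [[0.0, 1.0], [1.0, 0.0]]
    print(steer(h, router_w, steering_vecs, probe_w=[5.0, 0.0], probe_b=0.0))
```

In this toy setup the probe is confident that a tool is needed, so the state is pushed along the steering vector of the domain the router selects. A confident negative probe would push in the opposite direction, and an uncertain probe would leave the state unchanged.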
February 13, 2026