
Agentic Knowledgeable Self-awareness

April 4, 2025
Authors: Shuofei Qiao, Zhisong Qiu, Baochang Ren, Xiaobin Wang, Xiangyuan Ru, Ningyu Zhang, Xiang Chen, Yong Jiang, Pengjun Xie, Fei Huang, Huajun Chen
cs.AI

Abstract

Large Language Models (LLMs) have achieved considerable performance across various agentic planning tasks. However, traditional agent planning approaches adopt a "flood irrigation" methodology that indiscriminately injects gold trajectories, external feedback, and domain knowledge into agent models. This practice overlooks a fundamental principle of human cognition, situational self-awareness: the ability to dynamically assess situational demands and strategically employ resources during decision-making. To address this gap, we propose agentic knowledgeable self-awareness, a novel paradigm that enables LLM-based agents to autonomously regulate knowledge utilization. Specifically, we propose KnowSelf, a data-centric approach that equips agents with human-like knowledgeable self-awareness. Concretely, we devise a heuristic situation judgement criterion to mark special tokens on the agent's self-explored trajectories for collecting training data. Through a two-stage training process, the agent model can switch between different situations by generating specific special tokens, achieving optimal planning effects with minimal costs. Our experiments demonstrate that KnowSelf outperforms various strong baselines across different tasks and models with minimal use of external knowledge. Code is available at https://github.com/zjunlp/KnowSelf.
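The token-based switching described in the abstract can be illustrated with a minimal sketch. This is not the KnowSelf implementation: the token names `<knowledge>` and `<reflect>`, the function `plan_step`, and the toy `generate`/`retrieve` stand-ins are all hypothetical, since the abstract does not specify them. The sketch only shows the general idea that external knowledge is fetched when the model itself emits a signalling token, rather than being injected at every step.

```python
from typing import Callable, Tuple

# Hypothetical token names; the abstract does not specify the actual tokens.
KNOWLEDGE_TOKEN = "<knowledge>"
REFLECT_TOKEN = "<reflect>"


def plan_step(generate: Callable[[str], str],
              retrieve: Callable[[str], str],
              context: str) -> Tuple[str, int]:
    """Run one planning step; return (action, number of knowledge lookups)."""
    output = generate(context)
    lookups = 0
    if output.startswith(KNOWLEDGE_TOKEN):
        # The model signalled that it needs external knowledge: fetch it once
        # and generate again with the knowledge appended to the context.
        knowledge = retrieve(context)
        lookups = 1
        output = generate(f"{context}\n{KNOWLEDGE_TOKEN} {knowledge}")
    elif output.startswith(REFLECT_TOKEN):
        # The model signalled that it wants to reconsider: generate again
        # with its own first attempt appended, without any external lookup.
        output = generate(f"{context}\n{output}")
    # Remove any signalling tokens so only the action text is returned.
    for token in (KNOWLEDGE_TOKEN, REFLECT_TOKEN):
        output = output.replace(token, "")
    return output.strip(), lookups


def toy_generate(context: str) -> str:
    """Stand-in for an LLM: requests knowledge only on 'hard' contexts."""
    if "hard" in context and KNOWLEDGE_TOKEN not in context:
        return KNOWLEDGE_TOKEN
    if KNOWLEDGE_TOKEN in context:
        return "follow the retrieved hint"
    return "act directly"


def toy_retrieve(context: str) -> str:
    """Stand-in for a knowledge base lookup (hypothetical content)."""
    return "hint: check the drawer first"
```

With these stand-ins, `plan_step(toy_generate, toy_retrieve, "easy task")` acts without any lookup, while a context containing "hard" triggers exactly one retrieval. Counting lookups per step is one simple way to track how much external knowledge an agent consumes.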
