Agent KB: Leveraging Cross-Domain Experience for Agentic Problem Solving

Abstract

As language agents tackle increasingly complex tasks, they struggle with effective error correction and experience reuse across domains. We introduce Agent KB, a hierarchical experience framework that enables complex agentic problem solving via a novel Reason-Retrieve-Refine pipeline. Agent KB addresses a core limitation: agents traditionally cannot learn from each other's experiences. By capturing both high-level strategies and detailed execution logs, Agent KB creates a shared knowledge base that enables cross-agent knowledge transfer. Evaluated on the GAIA benchmark, Agent KB improves success rates by up to 16.28 percentage points. On the most challenging tasks, Claude-3 improves from 38.46% to 57.69%, while GPT-4 improves from 53.49% to 73.26% on intermediate tasks. On SWE-bench code repair, Agent KB enables Claude-3 to improve from 41.33% to 53.33%. Our results suggest that Agent KB provides a modular, framework-agnostic infrastructure for enabling agents to learn from past experiences and generalize successful strategies to new tasks.
