Agent KB: Leveraging Cross-Domain Experience for Agentic Problem Solving

Abstract

As language agents tackle increasingly complex tasks, they struggle with effective error correction and experience reuse across domains. We introduce Agent KB, a hierarchical experience framework that enables complex agentic problem solving via a novel Reason-Retrieve-Refine pipeline. Agent KB addresses a core limitation: agents traditionally cannot learn from each other's experiences. By capturing both high-level strategies and detailed execution logs, Agent KB creates a shared knowledge base that enables cross-agent knowledge transfer. Evaluated on the GAIA benchmark, Agent KB improves success rates by up to 16.28 percentage points. On the most challenging tasks, Claude-3 improves from 38.46% to 57.69%, while GPT-4 improves from 53.49% to 73.26% on intermediate tasks. On SWE-bench code repair, Agent KB enables Claude-3 to improve from 41.33% to 53.33%. Our results suggest that Agent KB provides a modular, framework-agnostic infrastructure for enabling agents to learn from past experiences and generalize successful strategies to new tasks.
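
The reported before/after success rates can be restated as absolute percentage-point gains. The short Python sketch below does only that arithmetic on the figures quoted in the abstract; the dictionary keys and the function name `point_gain` are illustrative labels chosen here, not names from the paper.

```python
# Before/after success rates (%) exactly as quoted in the abstract.
# The keys are illustrative labels, not identifiers from the paper.
REPORTED = {
    "GAIA hardest tasks, Claude-3": (38.46, 57.69),
    "GAIA intermediate tasks, GPT-4": (53.49, 73.26),
    "SWE-bench code repair, Claude-3": (41.33, 53.33),
}


def point_gain(before: float, after: float) -> float:
    """Absolute improvement in percentage points, rounded to two decimals."""
    return round(after - before, 2)


# Gain per reported comparison, in percentage points.
GAINS = {name: point_gain(b, a) for name, (b, a) in REPORTED.items()}

if __name__ == "__main__":
    for name, gain in GAINS.items():
        print(f"{name}: +{gain:.2f} points")
```

The abstract does not state which comparison yields the 16.28-point figure, so it is not recomputed here.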
