Learning to Configure Agentic AI Systems

Abstract

Configuring LLM-based agent systems means choosing workflows, tools, token budgets, and prompts from a large combinatorial design space. Today this is typically handled with fixed large templates or hand-tuned heuristics. Because the same cumbersome configuration is often applied to easy and hard queries alike, the result is brittle behavior and unnecessary compute. We formulate agent configuration as a query-wise decision problem and introduce ARC (Agentic Resource & Configuration learner), which uses reinforcement learning to learn a lightweight hierarchical policy that dynamically tailors these configurations to each query. Across multiple benchmarks spanning reasoning and tool-augmented question answering, the learned policy consistently outperforms strong hand-designed and other baselines, achieving up to 25% higher task accuracy while also reducing token and runtime costs. These results demonstrate that learning per-query agent configurations is a powerful alternative to "one size fits all" designs.
