Learning to Configure Agentic AI Systems

Abstract

Configuring LLM-based agent systems involves choosing workflows, tools, token budgets, and prompts from a large combinatorial design space, and is typically handled today by fixed large templates or hand-tuned heuristics. This leads to brittle behavior and unnecessary compute, since the same cumbersome configuration is often applied to both easy and hard input queries. We formulate agent configuration as a query-wise decision problem and introduce ARC (Agentic Resource & Configuration learner), which uses reinforcement learning to learn a lightweight hierarchical policy that dynamically tailors these configurations to each query. Across multiple benchmarks spanning reasoning and tool-augmented question answering, the learned policy consistently outperforms strong hand-designed and other baselines, achieving up to 25% higher task accuracy while also reducing token and runtime costs. These results demonstrate that learning per-query agent configurations is a powerful alternative to "one size fits all" designs.
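To make the query-wise decision framing concrete, the following is a minimal toy sketch, not the ARC algorithm itself. It swaps the hierarchical reinforcement-learning policy for a flat epsilon-greedy contextual bandit, and everything in it is an assumption made for illustration: the two-entry configuration space, the observable "easy"/"hard" query label, the simulated success rates, and the cost weight `LAMBDA`.

```python
import random

# Hypothetical configuration space: (workflow, token_budget). Not from the paper.
CONFIGS = [("direct", 256), ("tool_augmented", 2048)]
LAMBDA = 0.0002  # assumed penalty per budgeted token


def simulate_reward(difficulty, config, rng):
    """Toy environment: reward = correctness - LAMBDA * token_budget."""
    workflow, budget = config
    # Assumed success rates: easy queries succeed with either workflow,
    # hard queries mostly need the tool-augmented one.
    if difficulty == "easy" or workflow == "tool_augmented":
        p_correct = 0.9
    else:
        p_correct = 0.3
    correct = 1.0 if rng.random() < p_correct else 0.0
    return correct - LAMBDA * budget


def train(steps=5000, epsilon=0.1, seed=0):
    """Learn one configuration per query type with an epsilon-greedy bandit."""
    rng = random.Random(seed)
    arms = range(len(CONFIGS))
    # Running-mean reward estimate for each (query difficulty, config index).
    value = {(d, i): 0.0 for d in ("easy", "hard") for i in arms}
    count = dict.fromkeys(value, 0)
    for _ in range(steps):
        difficulty = rng.choice(["easy", "hard"])
        if rng.random() < epsilon:
            action = rng.randrange(len(CONFIGS))  # explore
        else:
            action = max(arms, key=lambda i: value[(difficulty, i)])  # exploit
        reward = simulate_reward(difficulty, CONFIGS[action], rng)
        key = (difficulty, action)
        count[key] += 1
        value[key] += (reward - value[key]) / count[key]
    # Greedy policy: the best-estimated configuration for each query type.
    return {
        d: CONFIGS[max(arms, key=lambda i: value[(d, i)])]
        for d in ("easy", "hard")
    }


if __name__ == "__main__":
    print(train())
```

Under these assumed numbers, the cheap configuration has the higher expected reward on easy queries and the tool-augmented one on hard queries, so a learned per-query policy can beat either fixed choice on the combined accuracy-and-cost objective. That is the intuition behind replacing a single "one size fits all" template.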