Combee: Scaling Prompt Learning for Self-Improving Language Model Agents

Abstract

Recent advances in prompt learning allow large language model agents to acquire task-relevant knowledge from inference-time context without changing model parameters. For example, existing methods such as ACE and GEPA learn system prompts from previous agent runs to improve accuracy. However, these methods focus primarily on single-agent or low-parallelism settings, which fundamentally limits their ability to learn efficiently from a large set of collected agentic traces. As learning from many agentic traces or parallel agent executions becomes more common, running prompt learning in parallel would be both efficient and beneficial. Yet without a principled scaling strategy, current methods suffer quality degradation at high parallelism. To improve both the efficiency and the quality of prompt learning, we propose Combee, a novel framework for scaling parallel prompt learning for self-improving agents. Combee speeds up learning and enables many agents to run in parallel while learning from their aggregate traces without quality degradation. To achieve this, Combee leverages parallel scans and employs an augmented shuffle mechanism; it also introduces a dynamic batch size controller to balance quality and latency. Evaluations on AppWorld, Terminal-Bench, Formula, and FiNER show that Combee achieves up to 17x speedup over previous methods with comparable or better accuracy and equivalent cost.
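The abstract names parallel scans as one ingredient but does not say how Combee applies them. As a generic illustration only, the sketch below shows an inclusive prefix scan (Hillis-Steele style) over an associative merge operator. The names `merge_notes` and `inclusive_scan`, and the idea of merging per-trace lists of notes, are assumptions made for this example and are not taken from the paper.

```python
from typing import Callable, List, Sequence, TypeVar

T = TypeVar("T")


def merge_notes(left: List[str], right: List[str]) -> List[str]:
    """Hypothetical associative merge: ordered union of two note lists."""
    seen = set(left)
    merged = list(left)
    for note in right:
        if note not in seen:
            seen.add(note)
            merged.append(note)
    return merged


def inclusive_scan(items: Sequence[T], op: Callable[[T, T], T]) -> List[T]:
    """Hillis-Steele inclusive scan, simulated round by round.

    Within a round, every combination reads only the previous round's
    values, so a real system could run them concurrently. Here they run
    in a plain loop. `op` must be associative for the result to equal
    the sequential prefix results.
    """
    result = list(items)
    step = 1
    while step < len(result):
        previous = result  # values from the previous round only
        result = [
            previous[i] if i < step else op(previous[i - step], previous[i])
            for i in range(len(previous))
        ]
        step *= 2
    return result


if __name__ == "__main__":
    # Hypothetical notes extracted from four agent traces.
    per_trace = [["a"], ["b", "a"], ["c"], ["b", "d"]]
    for prefix in inclusive_scan(per_trace, merge_notes):
        print(prefix)
```

With n inputs the scan finishes in about log2(n) rounds instead of n sequential merge steps, which is the general reason scans are attractive for aggregating many results. Whether Combee's merge operator or scan variant resembles this one is not stated in the abstract.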
