AgentConductor: Topology Evolution for Multi-Agent Competition-Level Code Generation

Abstract

Large language model (LLM)-driven multi-agent systems (MAS) coordinate specialized agents through predefined interaction topologies and have shown promise on complex tasks such as competition-level code generation. Recent studies demonstrate that carefully designed multi-agent workflows and communication graphs can significantly improve code generation performance by leveraging collaborative reasoning. However, existing methods neither adapt topology density to task difficulty nor iteratively refine the topology within an instance using execution feedback, which leads to redundant communication and performance bottlenecks. To address these issues, we propose AgentConductor, a reinforcement learning-optimized MAS built around an LLM-based orchestrator agent that enables end-to-end, feedback-driven dynamic generation of interaction topologies. For each query, AgentConductor infers agent roles and task difficulty, then constructs a task-adapted, density-aware layered directed acyclic graph (DAG) topology. This design is underpinned by two key innovations. First, we design a novel topological density function that captures communication-aware mathematical characterizations of multi-agent interactions. Second, we adopt difficulty interval partitioning, which avoids excessive pruning, to measure the topological density upper bound precisely for each difficulty level and to enable finer-grained control. Empirically, across three competition-level and two foundational code datasets, AgentConductor achieves state-of-the-art accuracy, outperforming the strongest baseline by up to 14.6% in pass@1 accuracy, 13% in density reduction, and 68% in token cost reduction.
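To make the idea of a density-aware layered DAG more concrete, the following is a minimal sketch under stated assumptions. The abstract does not define the density function or the difficulty intervals, so everything here is hypothetical: density is taken to be the fraction of possible forward (earlier-layer to later-layer) edges that are used, and the role names, the `DENSITY_CAPS` values, and the helper `within_cap` are illustrative placeholders, not the authors' method.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set, Tuple

# Hypothetical density caps per difficulty interval (illustrative values only;
# the abstract does not report the actual bounds).
DENSITY_CAPS: Dict[str, float] = {"easy": 0.3, "medium": 0.6, "hard": 1.0}


@dataclass
class LayeredDAG:
    """A layered DAG of agents; edges may only point to a later layer."""
    layers: List[List[str]]  # agent names per layer, unique across layers
    edges: Set[Tuple[str, str]] = field(default_factory=set)

    def _layer_of(self) -> Dict[str, int]:
        return {agent: i for i, layer in enumerate(self.layers) for agent in layer}

    def add_edge(self, src: str, dst: str) -> None:
        layer_of = self._layer_of()
        # Forward-only edges guarantee the graph stays acyclic.
        if layer_of[src] >= layer_of[dst]:
            raise ValueError("edges must go from an earlier layer to a later one")
        self.edges.add((src, dst))

    def max_edges(self) -> int:
        # Number of possible forward edges between all pairs of distinct layers.
        sizes = [len(layer) for layer in self.layers]
        return sum(
            sizes[i] * sizes[j]
            for i in range(len(sizes))
            for j in range(i + 1, len(sizes))
        )

    def density(self) -> float:
        # Assumed density: used forward edges divided by possible forward edges.
        possible = self.max_edges()
        return len(self.edges) / possible if possible else 0.0


def within_cap(dag: LayeredDAG, difficulty: str) -> bool:
    """Check the assumed density against the hypothetical cap for a difficulty."""
    return dag.density() <= DENSITY_CAPS[difficulty]


# Usage: a three-layer topology with hypothetical roles.
dag = LayeredDAG(layers=[["planner"], ["coder_a", "coder_b"], ["tester"]])
dag.add_edge("planner", "coder_a")
dag.add_edge("coder_a", "tester")
print(dag.density(), within_cap(dag, "easy"), within_cap(dag, "medium"))
```

In this toy setup an orchestrator would be expected to propose sparser graphs for easy queries and denser ones for hard queries. A cap check like `within_cap` is one simple way such a constraint could be enforced, or turned into a penalty term during training.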
