AgentConductor: Topology Evolution for Multi-Agent Competition-Level Code Generation

Abstract

Large language model (LLM)-driven multi-agent systems (MAS) coordinate specialized agents through predefined interaction topologies and have shown promise for complex tasks such as competition-level code generation. Recent studies demonstrate that carefully designed multi-agent workflows and communication graphs can significantly improve code generation performance through collaborative reasoning. However, existing methods neither adapt topology density to task difficulty nor use execution feedback to iteratively refine the topology within an instance, which leads to redundant communication and performance bottlenecks. To address these issues, we propose AgentConductor, a reinforcement learning-optimized MAS built around an LLM-based orchestrator agent that generates interaction topologies dynamically, driven end to end by feedback. For each query, AgentConductor infers agent roles and task difficulty, then constructs a task-adapted, density-aware layered directed acyclic graph (DAG) topology. This construction rests on two key innovations. First, we design a novel topological density function that mathematically characterizes the communication properties of multi-agent interactions. Second, we partition task difficulty into intervals, which avoids excessive pruning, allows the upper bound on topological density to be measured precisely for each difficulty level, and gives finer-grained control. Across three competition-level and two foundational code datasets, AgentConductor achieves state-of-the-art accuracy, outperforming the strongest baseline by up to 14.6% in pass@1 accuracy, with a 13% reduction in topological density and a 68% reduction in token cost.
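The abstract does not define the topological density function or the difficulty intervals, so the following is only an illustrative sketch of what a density-aware layered DAG could look like. It assumes density is the ratio of actual edges to the maximum number of forward edges between layers, which is a stand-in and not the paper's function. The agent names (`planner`, `coder`, `searcher`, `tester`), the `DENSITY_CAPS` values, and both function names are hypothetical.

```python
def layered_dag_density(layers, edges):
    """Edge-ratio density of a layered DAG (an assumed stand-in metric).

    layers: list of lists of agent names, ordered from first to last layer.
    edges:  set of (src, dst) pairs; each edge must point to a later layer.
    """
    # Map each agent to the index of the layer it sits in.
    layer_of = {agent: idx for idx, layer in enumerate(layers) for agent in layer}

    # Forward-only edges between layers guarantee the graph is acyclic.
    for src, dst in edges:
        if layer_of[src] >= layer_of[dst]:
            raise ValueError(f"edge {src}->{dst} does not point to a later layer")

    # Maximum number of forward edges: every agent linked to every later agent.
    sizes = [len(layer) for layer in layers]
    max_edges = sum(
        sizes[i] * sizes[j]
        for i in range(len(sizes))
        for j in range(i + 1, len(sizes))
    )
    return len(edges) / max_edges if max_edges else 0.0


# Hypothetical per-difficulty upper bounds on density (not from the paper).
DENSITY_CAPS = {"easy": 0.4, "medium": 0.7, "hard": 1.0}


def within_cap(density, difficulty):
    """Check a topology's density against the cap for its difficulty interval."""
    return density <= DENSITY_CAPS[difficulty]


layers = [["planner"], ["coder", "searcher"], ["tester"]]
edges = {("planner", "coder"), ("planner", "searcher"), ("coder", "tester")}

density = layered_dag_density(layers, edges)  # 3 of 5 possible forward edges
print(density, within_cap(density, "easy"), within_cap(density, "medium"))
# prints: 0.6 False True
```

Under these assumptions, a topology that is too dense for an easy task can still be acceptable for a medium one, which is the kind of difficulty-dependent control the abstract describes.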
