AgentArk: Distilling Multi-Agent Intelligence into a Single LLM Agent

Abstract

While large language model (LLM) multi-agent systems achieve superior reasoning performance through iterative debate, their practical deployment is limited by high computational cost and error propagation. This paper proposes AgentArk, a novel framework that distills multi-agent dynamics into the weights of a single model, effectively transforming explicit test-time interactions into implicit model capabilities. This equips a single agent with the intelligence of multi-agent systems while remaining computationally efficient. Specifically, we investigate three hierarchical distillation strategies across various models, tasks, scaling, and scenarios: reasoning-enhanced fine-tuning, trajectory-based augmentation, and process-aware distillation. By shifting the burden of computation from inference to training, the distilled models preserve the efficiency of one agent while exhibiting the strong reasoning and self-correction performance of multiple agents. They further demonstrate enhanced robustness and generalization across diverse reasoning tasks. We hope this work can shed light on future research on efficient and robust multi-agent development. Our code is at https://github.com/AIFrontierLab/AgentArk.
