MegaFlow: Large-Scale Distributed Orchestration System for the Agentic Era
January 12, 2026
Authors: Lei Zhang, Mouxiang Chen, Ruisheng Cao, Jiawei Chen, Fan Zhou, Yiheng Xu, Jiaxi Yang, Liang Chen, Changwei Luo, Kai Zhang, Fan Yan, KaShun Shum, Jiajun Zhang, Zeyu Cui, Hu Feng, Junyang Lin, Binyuan Hui, Min Yang
cs.AI
Abstract
The rapid development of interactive and autonomous AI systems signals our entry into the agentic era. Training and evaluating agents on complex agentic tasks such as software engineering and computer use requires not only efficient model computation but also sophisticated infrastructure capable of coordinating vast numbers of agent-environment interactions. However, no existing open-source infrastructure effectively supports large-scale training and evaluation on such complex agentic tasks. To address this challenge, we present MegaFlow, a large-scale distributed orchestration system that enables efficient scheduling, resource allocation, and fine-grained task management for agent-environment workloads. MegaFlow abstracts agent training infrastructure into three independent services (Model Service, Agent Service, and Environment Service) that interact through unified interfaces, enabling independent scaling and flexible resource allocation across diverse agent-environment configurations. In our agent training deployments, MegaFlow successfully orchestrates tens of thousands of concurrent agent tasks while maintaining high system stability and achieving efficient resource utilization. By enabling such large-scale agent training, MegaFlow addresses a critical infrastructure gap in the emerging agentic AI landscape.
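To make the three-service decomposition concrete, the following is a minimal sketch of how a model service, an agent service, and an environment service might interact through narrow interfaces. It is not MegaFlow's actual API: every class, method, and parameter name here (`ModelService`, `EnvironmentService`, `AgentService`, `EchoModel`, `CountdownEnv`, `max_concurrency`) is a hypothetical placeholder, and the toy in-process implementations stand in for what would be separately deployed and separately scaled services.

```python
import asyncio
from typing import Dict, List, Protocol, Tuple


class ModelService(Protocol):
    """Hypothetical interface: maps an observation to the next action."""
    async def generate(self, observation: str) -> str: ...


class EnvironmentService(Protocol):
    """Hypothetical interface: owns the per-task environment state."""
    async def reset(self, task_id: str) -> str: ...
    async def step(self, task_id: str, action: str) -> Tuple[str, bool]: ...


class EchoModel:
    """Toy stand-in for a model server: derives the action from the observation."""
    async def generate(self, observation: str) -> str:
        return f"act-on:{observation}"


class CountdownEnv:
    """Toy environment: every task finishes after a fixed number of steps."""
    def __init__(self, steps_per_task: int = 3) -> None:
        self.steps_per_task = steps_per_task
        self._remaining: Dict[str, int] = {}

    async def reset(self, task_id: str) -> str:
        self._remaining[task_id] = self.steps_per_task
        return f"{task_id}:start"

    async def step(self, task_id: str, action: str) -> Tuple[str, bool]:
        self._remaining[task_id] -= 1
        left = self._remaining[task_id]
        return f"{task_id}:left={left}", left == 0


class AgentService:
    """Runs the agent loop; it only sees the two interfaces above."""
    def __init__(self, model: ModelService, env: EnvironmentService,
                 max_concurrency: int) -> None:
        self.model = model
        self.env = env
        self.max_concurrency = max_concurrency

    async def _run_task(self, task_id: str, slots: asyncio.Semaphore) -> int:
        async with slots:  # bound the number of in-flight tasks
            observation = await self.env.reset(task_id)
            done, steps = False, 0
            while not done:
                action = await self.model.generate(observation)
                observation, done = await self.env.step(task_id, action)
                steps += 1
            return steps

    async def run_all(self, task_ids: List[str]) -> Dict[str, int]:
        # Create the semaphore inside the running event loop.
        slots = asyncio.Semaphore(self.max_concurrency)
        results = await asyncio.gather(
            *(self._run_task(t, slots) for t in task_ids)
        )
        return dict(zip(task_ids, results))


def demo(num_tasks: int = 100) -> Dict[str, int]:
    """Run num_tasks toy tasks and return the step count per task."""
    agent = AgentService(EchoModel(), CountdownEnv(3), max_concurrency=8)
    return asyncio.run(agent.run_all([f"task-{i}" for i in range(num_tasks)]))
```

Because `AgentService` depends only on the two interfaces, either side could in principle be swapped for a remote client or given more capacity without touching the agent loop, which is the kind of independent scaling the abstract describes. The semaphore is only a simple illustration of bounded concurrency and makes no claim about how MegaFlow schedules tasks.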