Alita: Generalist Agent Enabling Scalable Agentic Reasoning with Minimal Predefinition and Maximal Self-Evolution
May 26, 2025
Authors: Jiahao Qiu, Xuan Qi, Tongcheng Zhang, Xinzhe Juan, Jiacheng Guo, Yifu Lu, Yimin Wang, Zixin Yao, Qihan Ren, Xun Jiang, Xing Zhou, Dongrui Liu, Ling Yang, Yue Wu, Kaixuan Huang, Shilong Liu, Hongru Wang, Mengdi Wang
cs.AI
Abstract
Recent advances in large language models (LLMs) have enabled agents to
autonomously perform complex, open-ended tasks. However, many existing
frameworks depend heavily on manually predefined tools and workflows, which
hinder their adaptability, scalability, and generalization across domains. In
this work, we introduce Alita, a generalist agent designed on the principle
that "Simplicity is the ultimate sophistication," enabling scalable agentic
reasoning through minimal predefinition and maximal self-evolution. For minimal
predefinition, Alita is equipped with only one component for direct
problem-solving, making it much simpler and neater than previous approaches
that relied heavily on hand-crafted, elaborate tools and workflows. This clean
design enhances its potential to generalize to challenging questions, without
being limited by tools. For maximal self-evolution, we provide a suite of
general-purpose components that let Alita autonomously construct, refine, and
reuse external capabilities by generating task-related Model Context Protocols
(MCPs) from open-source resources, which contributes to scalable
agentic reasoning. Notably, Alita achieves 75.15% pass@1 and 87.27% pass@3
accuracy on the GAIA benchmark validation dataset, ranking among the top
general-purpose agents, and 74.00% and 52.00% pass@1 on Mathvista and PathVQA,
respectively, outperforming many agent systems with far greater complexity.
More details will be updated at https://github.com/CharlesQ9/Alita.
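The pass@1 and pass@3 figures above can be read with a small worked example. The sketch below assumes one common reading of pass@k: a task counts as solved if any of the first k independent attempts is correct. The abstract does not spell out the exact evaluation protocol, so the function name `pass_at_k` and the toy data are illustrative assumptions, not the authors' evaluation code.

```python
from typing import Sequence


def pass_at_k(attempts: Sequence[Sequence[bool]], k: int) -> float:
    """Fraction of tasks with at least one correct result in the first k attempts.

    `attempts[i][j]` is True if attempt j on task i was correct.
    This is an assumed reading of pass@k, not taken from the paper.
    """
    if k < 1:
        raise ValueError("k must be at least 1")
    if not attempts:
        raise ValueError("need at least one task")
    solved = sum(1 for task in attempts if any(task[:k]))
    return solved / len(attempts)


# Toy outcomes for four tasks with three attempts each.
# These values are made up for illustration and are not results from the paper.
toy_results = [
    [True, False, False],   # solved on the first attempt
    [False, True, False],   # solved on the second attempt
    [False, False, False],  # never solved
    [False, False, True],   # solved on the third attempt
]

pass1 = pass_at_k(toy_results, 1)  # 0.25
pass3 = pass_at_k(toy_results, 3)  # 0.75
print(f"pass@1 = {pass1:.2%}, pass@3 = {pass3:.2%}")
```

Under this reading, pass@3 can never be lower than pass@1 on the same set of tasks, which is consistent with the 75.15% and 87.27% figures reported for GAIA.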