Alita: Generalist Agent Enabling Scalable Agentic Reasoning with Minimal Predefinition and Maximal Self-Evolution

Abstract

Recent advances in large language models (LLMs) have enabled agents to autonomously perform complex, open-ended tasks. However, many existing frameworks depend heavily on manually predefined tools and workflows, which limits their adaptability, scalability, and generalization across domains. In this work, we introduce Alita, a generalist agent designed on the principle that "Simplicity is the ultimate sophistication," which enables scalable agentic reasoning through minimal predefinition and maximal self-evolution. For minimal predefinition, Alita is equipped with only one component for direct problem-solving, making it much simpler than previous approaches that rely heavily on elaborate, hand-crafted tools and workflows. This clean design enhances its potential to generalize to challenging questions without being limited by its tools. For maximal self-evolution, we provide a suite of general-purpose components that let Alita autonomously construct, refine, and reuse external capabilities by generating task-related model context protocols (MCPs) from open-source resources, which contributes to scalable agentic reasoning. Notably, on the GAIA benchmark validation dataset, Alita achieves 75.15% pass@1 and 87.27% pass@3 accuracy, which is top-ranking among general-purpose agents. On MathVista and PathVQA, it achieves 74.00% and 52.00% pass@1 accuracy, respectively, outperforming many agent systems with far greater complexity. More details will be updated at https://github.com/CharlesQ9/Alita.
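
The pass@1 and pass@3 figures can be read with a small sketch. The abstract does not spell out the evaluation protocol, so the code below assumes the simplest definition: a task counts as solved at pass@k if at least one of its first k attempts is correct. The function name `pass_at_k` and the toy `results` data are hypothetical and are not taken from the Alita repository.

```python
from typing import Sequence


def pass_at_k(attempts: Sequence[Sequence[bool]], k: int) -> float:
    """Fraction of tasks solved by at least one of the first k attempts.

    Assumed definition for illustration only; the paper's exact
    scoring procedure is not given in the abstract.
    """
    if k < 1:
        raise ValueError("k must be at least 1")
    if not attempts:
        raise ValueError("at least one task is required")
    # A task is solved if any of its first k attempts is marked correct.
    solved = sum(1 for task in attempts if any(task[:k]))
    return solved / len(attempts)


# Hypothetical toy data: 4 tasks, 3 attempts each (True = correct answer).
results = [
    [True, False, True],
    [False, False, True],
    [False, False, False],
    [True, True, True],
]

print(f"pass@1 = {pass_at_k(results, 1):.2%}")
print(f"pass@3 = {pass_at_k(results, 3):.2%}")
```

Under this definition pass@k cannot decrease as k grows, which is consistent with the reported pass@3 being higher than pass@1 on GAIA.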