Alita: Generalist Agent Enabling Scalable Agentic Reasoning with Minimal Predefinition and Maximal Self-Evolution

Abstract

Recent advances in large language models (LLMs) have enabled agents to autonomously perform complex, open-ended tasks. However, many existing frameworks depend heavily on manually predefined tools and workflows, which hinders their adaptability, scalability, and generalization across domains. In this work, we introduce Alita, a generalist agent designed around the principle that "simplicity is the ultimate sophistication," which enables scalable agentic reasoning through minimal predefinition and maximal self-evolution. For minimal predefinition, Alita is equipped with only one component for direct problem-solving, making it much simpler than previous approaches that rely heavily on elaborate, hand-crafted tools and workflows. This clean design enhances its potential to generalize to challenging questions without being limited by its tools. For maximal self-evolution, we give Alita a suite of general-purpose components with which it autonomously constructs, refines, and reuses external capabilities by generating task-related Model Context Protocols (MCPs) from open-source resources, which contributes to scalable agentic reasoning. Notably, Alita achieves 75.15% pass@1 and 87.27% pass@3 accuracy on the GAIA benchmark validation dataset, which is top-ranking among general-purpose agents, and 74.00% and 52.00% pass@1 on MathVista and PathVQA, respectively, outperforming many agent systems with far greater complexity. More details will be updated at https://github.com/CharlesQ9/Alita.
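The "construct and reuse" idea described above can be illustrated with a minimal sketch. This is not Alita's implementation: `ToolBox`, `build_tool`, and `solve` are hypothetical names chosen for illustration, the "generation" step is a hard-coded stand-in for producing a tool from open-source resources, and the refinement step is omitted for brevity. The sketch only shows the caching pattern in which a capability is built once when a task first needs it and is reused afterwards.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict

# Hypothetical names throughout: ToolBox, build_tool, and solve are
# illustrative and are not taken from the Alita codebase.

Tool = Callable[[str], str]


@dataclass
class ToolBox:
    """Stores generated capabilities so later tasks can reuse them."""
    tools: Dict[str, Tool] = field(default_factory=dict)
    builds: int = 0  # how many times a new tool had to be constructed

    def get_or_build(self, name: str, builder: Callable[[str], Tool]) -> Tool:
        # Reuse: return a previously generated tool if one exists.
        if name in self.tools:
            return self.tools[name]
        # Construct: otherwise generate the tool once and register it.
        tool = builder(name)
        self.tools[name] = tool
        self.builds += 1
        return tool


def build_tool(name: str) -> Tool:
    """Stand-in for generating a tool from open-source resources."""
    if name == "word_count":
        return lambda text: str(len(text.split()))
    if name == "reverse":
        return lambda text: text[::-1]
    raise ValueError(f"no recipe for tool {name!r}")


def solve(box: ToolBox, tool_name: str, payload: str) -> str:
    """Fetch (or create) the capability a task needs, then apply it."""
    tool = box.get_or_build(tool_name, build_tool)
    return tool(payload)


box = ToolBox()
first = solve(box, "word_count", "simplicity is the ultimate sophistication")
second = solve(box, "word_count", "minimal predefinition")  # reuses the tool
third = solve(box, "reverse", "Alita")                      # builds a new one
```

In this toy setting the second `word_count` task does not trigger a new build, so `box.builds` counts only the two distinct capabilities that were actually constructed.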