LIMI: Less is More for Agency
September 22, 2025
Authors: Yang Xiao, Mohan Jiang, Jie Sun, Keyu Li, Jifan Lin, Yumin Zhuang, Ji Zeng, Shijie Xia, Qishuo Hua, Xuefeng Li, Xiaojie Cai, Tongyu Wang, Yue Zhang, Liming Liu, Xia Wu, Jinlong Hou, Yuan Cheng, Wenjie Li, Xiang Wang, Dequan Wang, Pengfei Liu
cs.AI
Abstract
We define Agency as the emergent capacity of AI systems to function as
autonomous agents, actively discovering problems, formulating hypotheses, and
executing solutions through self-directed engagement with environments and
tools. This fundamental capability marks the dawn of the Age of AI Agency,
driven by a critical industry shift: the urgent need for AI systems that don't
just think, but work. While current AI excels at reasoning and generating
responses, industries demand autonomous agents that can execute tasks, operate
tools, and drive real-world outcomes. As agentic intelligence becomes the
defining characteristic separating cognitive systems from productive workers,
efficiently cultivating machine autonomy becomes paramount. Current approaches
assume that more data yields better agency, following traditional scaling laws
from language modeling. We fundamentally challenge this paradigm. LIMI (Less Is
More for Intelligent Agency) demonstrates that agency follows radically
different development principles. Through strategic focus on collaborative
software development and scientific research workflows, we show that
sophisticated agentic intelligence can emerge from minimal but strategically
curated demonstrations of autonomous behavior. Using only 78 carefully designed
training samples, LIMI achieves 73.5% on comprehensive agency benchmarks,
dramatically outperforming state-of-the-art models: Kimi-K2-Instruct (24.1%),
DeepSeek-V3.1 (11.9%), Qwen3-235B-A22B-Instruct (27.5%), and GLM-4.5 (45.1%).
Most strikingly, LIMI demonstrates a 53.7% improvement over models trained on
10,000 samples, achieving superior agentic intelligence with 128 times fewer
samples. Our findings establish the Agency Efficiency Principle: machine
autonomy emerges not from data abundance but from strategic curation of
high-quality agentic demonstrations.
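
The headline figures above can be sanity-checked with a little arithmetic. The following is a minimal sketch, not the authors' evaluation code: it takes only the numbers quoted in the abstract, and the helper names (`sample_ratio`, `margins_over_baselines`) are hypothetical. It shows that 10,000 / 78 is roughly 128, and computes LIMI's lead over each listed baseline in percentage points. It makes no attempt to reproduce the 53.7% improvement figure, since the abstract does not give the score of the 10,000-sample model.

```python
# Figures quoted in the abstract; nothing here comes from the paper's code.
LIMI_SAMPLES = 78
BASELINE_SAMPLES = 10_000
SCORES = {
    "LIMI": 73.5,
    "Kimi-K2-Instruct": 24.1,
    "DeepSeek-V3.1": 11.9,
    "Qwen3-235B-A22B-Instruct": 27.5,
    "GLM-4.5": 45.1,
}


def sample_ratio(large: int, small: int) -> float:
    """How many times larger the big training set is than the small one."""
    return large / small


def margins_over_baselines(scores: dict, reference: str = "LIMI") -> dict:
    """Lead of the reference model over every other model, in percentage points."""
    ref = scores[reference]
    return {
        name: round(ref - score, 1)
        for name, score in scores.items()
        if name != reference
    }


ratio = sample_ratio(BASELINE_SAMPLES, LIMI_SAMPLES)
margins = margins_over_baselines(SCORES)

print(f"Sample ratio: {ratio:.1f}x")
for name, margin in margins.items():
    print(f"LIMI leads {name} by {margin} points")
```

On these numbers the smallest lead is over GLM-4.5, the strongest of the listed baselines.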