Small Language Models are the Future of Agentic AI
June 2, 2025
Authors: Peter Belcak, Greg Heinrich, Shizhe Diao, Yonggan Fu, Xin Dong, Saurav Muralidharan, Yingyan Celine Lin, Pavlo Molchanov
cs.AI
Abstract
Large language models (LLMs) are often praised for exhibiting near-human
performance on a wide range of tasks and valued for their ability to hold a
general conversation. The rise of agentic AI systems is, however, ushering in a
mass of applications in which language models perform a small number of
specialized tasks repetitively and with little variation.
Here we lay out the position that small language models (SLMs) are
sufficiently powerful, inherently more suitable, and necessarily more
economical for many invocations in agentic systems, and are therefore the
future of agentic AI. Our argumentation is grounded in the current level of
capabilities exhibited by SLMs, the common architectures of agentic systems,
and the economy of LM deployment. We further argue that in situations where
general-purpose conversational abilities are essential, heterogeneous agentic
systems (i.e., agents invoking multiple different models) are the natural
choice. We discuss the potential barriers to the adoption of SLMs in agentic
systems and outline a general LLM-to-SLM agent conversion algorithm.
Our position, formulated as a value statement, highlights how significant the
operational and economic impact of even a partial shift from LLMs to SLMs would
be for the AI agent industry. We aim to stimulate discussion on the effective
use of AI resources and hope to advance efforts to lower the present-day costs
of AI. Calling for both contributions to and critique
of our position, we commit to publishing all such correspondence at
https://research.nvidia.com/labs/lpr/slm-agents.