Small Language Models are the Future of Agentic AI
June 2, 2025
Authors: Peter Belcak, Greg Heinrich, Shizhe Diao, Yonggan Fu, Xin Dong, Saurav Muralidharan, Yingyan Celine Lin, Pavlo Molchanov
cs.AI
Abstract
Large language models (LLMs) are often praised for exhibiting near-human
performance on a wide range of tasks and valued for their ability to hold a
general conversation. The rise of agentic AI systems is, however, ushering in a
mass of applications in which language models perform a small number of
specialized tasks repetitively and with little variation.
Here we lay out the position that small language models (SLMs) are
sufficiently powerful, inherently more suitable, and necessarily more
economical for many invocations in agentic systems, and are therefore the
future of agentic AI. Our argumentation is grounded in the current level of
capabilities exhibited by SLMs, the common architectures of agentic systems,
and the economy of LM deployment. We further argue that in situations where
general-purpose conversational abilities are essential, heterogeneous agentic
systems (i.e., agents invoking multiple different models) are the natural
choice. We discuss the potential barriers to the adoption of SLMs in agentic
systems and outline a general LLM-to-SLM agent conversion algorithm.
Our position, formulated as a value statement, highlights the significance of
the operational and economic impact that even a partial shift from LLMs to SLMs
will have on the AI agent industry. We aim to stimulate the discussion on the
effective use of AI resources and hope to advance the efforts to lower the
present-day costs of AI. Calling for both contributions to and critique
of our position, we commit to publishing all such correspondence at
https://research.nvidia.com/labs/lpr/slm-agents.