Large Language Models as Tool Makers
May 26, 2023
Authors: Tianle Cai, Xuezhi Wang, Tengyu Ma, Xinyun Chen, Denny Zhou
cs.AI
Abstract
Recent research shows the potential of enhancing the problem-solving ability
of large language models (LLMs) through the use of external tools. However,
prior work along this line depends on the availability of existing tools. In
this work, we take an initial step towards removing this dependency by
proposing a closed-loop framework, referred to as LLMs As Tool Makers (LATM),
where LLMs create their own reusable tools for problem-solving. Our approach
consists of two key phases: 1) tool making: an LLM acts as the tool maker that
crafts tools for given tasks, where a tool is implemented as a Python utility
function. 2) tool using: an LLM acts as the tool user, which applies the tool
built by the tool maker for problem-solving. The tool user can be either the
same LLM as the tool maker or a different one. Tool-making enables an LLM to
continually generate tools that can be applied to different requests so that
future requests can call the corresponding APIs when beneficial for solving the
tasks. Furthermore, the division of labor among LLMs for tool-making and
tool-using phases introduces the opportunity to achieve cost effectiveness
without degrading the quality of generated tools and problem solutions. For
example, recognizing that tool-making demands more sophisticated capabilities
than tool-using, we can apply a powerful yet resource-intensive model as the
tool maker, and a lightweight, cost-effective model as the tool user. We
validate the effectiveness of our approach across a variety of complex
reasoning tasks, including Big-Bench tasks. With GPT-4 as the tool maker and
GPT-3.5 as the tool user, LATM can achieve performance that is on par with
using GPT-4 for both tool making and tool using, while the inference cost is
significantly reduced.
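
The two phases described above can be illustrated with a minimal sketch. It is not the authors' implementation: `tool_maker`, `tool_user`, `load_tool`, and `ToolCache` are hypothetical names, and both "models" are hard-coded stubs standing in for real LLM calls. The sketch assumes only what the abstract states, namely that a tool is a Python utility function, that it is made once, and that it is then reused by a cheaper tool user for later requests.

```python
from typing import Callable, Dict


def tool_maker(task: str) -> str:
    """Stub for the powerful, expensive model (assumption: a real system
    would prompt an LLM here). Returns Python source defining `tool`."""
    if task == "sort_words":
        return (
            "def tool(words):\n"
            "    return ' '.join(sorted(words.split()))\n"
        )
    raise ValueError(f"no tool can be made for task {task!r}")


def load_tool(source: str) -> Callable[[str], str]:
    """Turn generated source into a callable. Only safe here because the
    source is hard-coded; model-generated code would need sandboxing."""
    namespace: Dict[str, object] = {}
    exec(source, namespace)
    return namespace["tool"]  # type: ignore[return-value]


def tool_user(tool: Callable[[str], str], request: str) -> str:
    """Stub for the lightweight model (assumption: a real tool user would
    translate the request into a function call). Here the argument is
    simply the text after the first colon."""
    argument = request.split(":", 1)[1].strip()
    return tool(argument)


class ToolCache:
    """Hypothetical helper: make a tool once per task, then reuse it."""

    def __init__(self) -> None:
        self.tools: Dict[str, Callable[[str], str]] = {}
        self.maker_calls = 0  # counts calls to the expensive stub

    def solve(self, task: str, request: str) -> str:
        if task not in self.tools:
            self.maker_calls += 1
            self.tools[task] = load_tool(tool_maker(task))
        return tool_user(self.tools[task], request)


cache = ToolCache()
first = cache.solve("sort_words", "Sort these words: pear apple fig")
second = cache.solve("sort_words", "Sort these words: kiwi banana")
print(first, "|", second, "| maker calls:", cache.maker_calls)
```

In this sketch the expensive stub runs once however many requests arrive for the same task, which is the intuition behind the cost reduction claimed in the abstract; the actual savings depend on the models and workloads used.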