Large Language Models as Optimizers
September 7, 2023
Authors: Chengrun Yang, Xuezhi Wang, Yifeng Lu, Hanxiao Liu, Quoc V. Le, Denny Zhou, Xinyun Chen
cs.AI
Abstract
Optimization is ubiquitous. While derivative-based algorithms have been
powerful tools for various problems, the absence of gradients imposes challenges
on many real-world applications. In this work, we propose Optimization by
PROmpting (OPRO), a simple and effective approach to leverage large language
models (LLMs) as optimizers, where the optimization task is described in
natural language. In each optimization step, the LLM generates new solutions
from the prompt that contains previously generated solutions with their
values; the new solutions are then evaluated and added to the prompt for the next
optimization step. We first showcase OPRO on linear regression and traveling
salesman problems, then move on to prompt optimization where the goal is to
find instructions that maximize the task accuracy. With a variety of LLMs, we
demonstrate that the best prompts optimized by OPRO outperform human-designed
prompts by up to 8% on GSM8K, and by up to 50% on Big-Bench Hard tasks.
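
The optimization loop described in the abstract is simple enough to sketch in a few lines of Python. The sketch below is only an illustration of the procedure as stated, not the paper's released implementation: the llm (optimizer model) and evaluate (objective scorer) callables are hypothetical placeholders, and the meta-prompt wording, step count, samples per step, and top-k history window are assumed illustrative defaults.

    def opro(llm, evaluate, task_description, num_steps=50,
             samples_per_step=8, top_k=20):
        # Optimization trajectory: (solution, score) pairs found so far.
        trajectory = []
        for _ in range(num_steps):
            # Build the meta-prompt: the task description plus the best
            # previous solutions sorted by score (ascending), so the LLM
            # can infer what distinguishes high-scoring solutions.
            history = sorted(trajectory, key=lambda pair: pair[1])[-top_k:]
            meta_prompt = task_description + "\n\nPrevious solutions and scores:\n"
            meta_prompt += "\n".join(f"solution: {s}\nscore: {v}" for s, v in history)
            meta_prompt += "\n\nWrite a new solution with a higher score:"
            # Sample several candidate solutions, score them, and add them
            # to the trajectory used in the next step's meta-prompt.
            for _ in range(samples_per_step):
                candidate = llm(meta_prompt)
                trajectory.append((candidate, evaluate(candidate)))
        # Return the highest-scoring solution found over all steps.
        return max(trajectory, key=lambda pair: pair[1])

In the prompt-optimization setting from the abstract, each solution would be a candidate instruction and evaluate would measure task accuracy on a scored benchmark such as GSM8K, so the loop searches instruction space directly through the LLM.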