Evolving Deeper LLM Thinking

Abstract

We explore an evolutionary search strategy for scaling inference-time compute in Large Language Models. The proposed approach, Mind Evolution, uses a language model to generate, recombine, and refine candidate responses. Whenever a solution evaluator is available, it avoids the need to formalize the underlying inference problem. Controlling for inference cost, we find that Mind Evolution significantly outperforms other inference strategies such as Best-of-N and Sequential Revision in natural language planning tasks. In the TravelPlanner and Natural Plan benchmarks, Mind Evolution solves more than 98% of the problem instances using Gemini 1.5 Pro without the use of a formal solver.
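
The generate, recombine, and refine cycle described above can be illustrated with a generic evolutionary loop. The abstract does not give the algorithm's details, so what follows is only a minimal sketch under stated assumptions, not the method used in Mind Evolution: the language-model calls are replaced by toy string operations, the evaluator simply counts characters that match a hypothetical target, and all names (TARGET, generate, recombine, refine, evaluate, evolve) are invented for illustration.

```python
import random
import string

TARGET = "plan"  # hypothetical toy goal; stands in for a real planning task


def evaluate(candidate: str) -> int:
    """Toy solution evaluator: count positions that match TARGET."""
    return sum(c == t for c, t in zip(candidate, TARGET))


def generate(rng: random.Random) -> str:
    """Stand-in for an LLM proposing a fresh candidate (here: random letters)."""
    return "".join(rng.choice(string.ascii_lowercase) for _ in TARGET)


def recombine(a: str, b: str, rng: random.Random) -> str:
    """Stand-in for an LLM merging two candidates (here: per-position pick)."""
    return "".join(rng.choice(pair) for pair in zip(a, b))


def refine(candidate: str, rng: random.Random) -> str:
    """Stand-in for an LLM revising a candidate using evaluator feedback.

    Changes one position and keeps the revision only if the score does not drop.
    """
    i = rng.randrange(len(candidate))
    revised = candidate[:i] + rng.choice(string.ascii_lowercase) + candidate[i + 1:]
    return revised if evaluate(revised) >= evaluate(candidate) else candidate


def evolve(population_size: int = 8, generations: int = 20, seed: int = 0) -> str:
    """Generic evolutionary search: select, recombine, refine, repeat."""
    rng = random.Random(seed)
    population = [generate(rng) for _ in range(population_size)]
    for _ in range(generations):
        # Keep the better half as parents, so the best candidate is never lost.
        ranked = sorted(population, key=evaluate, reverse=True)
        parents = ranked[: max(2, population_size // 2)]
        children = []
        while len(parents) + len(children) < population_size:
            a, b = rng.sample(parents, 2)
            children.append(refine(recombine(a, b, rng), rng))
        population = parents + children
    return max(population, key=evaluate)


if __name__ == "__main__":
    best = evolve()
    print(best, evaluate(best))
```

The sketch needs only a scoring function, with no formal model of the task, which mirrors the abstract's point that an evaluator suffices. Because the parents are carried over unchanged, the best score in the population cannot decrease from one generation to the next.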
