

A Stitch in Time Saves Nine: Proactive Self-Refinement for Language Models

August 18, 2025
Authors: Jinyi Han, Xinyi Wang, Haiquan Zhao, Tingyun Li, Zishang Jiang, Sihang Jiang, Jiaqing Liang, Xin Lin, Weikang Zhou, Zeye Sun, Fei Yu, Yanghua Xiao
cs.AI

Abstract

Recent advances in self-refinement have demonstrated significant potential for improving the outputs of large language models (LLMs) through iterative refinement. However, most existing self-refinement methods rely on a reactive process with a fixed number of iterations, making it difficult to determine the optimal timing and content of refinement based on the evolving generation context. Inspired by the way humans dynamically refine their thoughts during execution, we propose ProActive Self-Refinement (PASR), a novel method that enables LLMs to refine their outputs during the generation process. Unlike methods that regenerate entire responses, PASR proactively decides whether, when, and how to refine based on the model's internal state and evolving context. We conduct extensive experiments on a diverse set of 10 tasks to evaluate the effectiveness of PASR. Experimental results show that PASR significantly enhances problem-solving performance. In particular, on Qwen3-8B, PASR reduces average token consumption by 41.6 percent compared to standard generation, while also achieving an 8.2 percent improvement in accuracy. Our code and all baselines used in the paper are available on GitHub.
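
To make the "whether, when, and how" decision concrete, the following is a minimal Python sketch of an in-generation refinement loop. It is not the authors' PASR implementation: generate_segment, should_refine, and refine_segment are hypothetical stubs standing in for model calls and for the model's internal refinement signal. Only the control flow, revising the current segment while generation is still in progress rather than regenerating the full response afterwards, mirrors the idea described in the abstract.

```python
# Minimal sketch of proactive, in-generation self-refinement (not the paper's code).
# generate_segment / should_refine / refine_segment are hypothetical stubs that
# stand in for model calls and for the model's internal "refine now?" signal.

from typing import List


def generate_segment(context: str) -> str:
    """Stub: produce the next reasoning segment given the context so far."""
    return f"[next step, conditioned on {len(context)} chars of context]"


def should_refine(segment: str, context: str) -> bool:
    """Stub for the proactive decision ("whether" and "when" to refine).
    Placeholder criterion only; PASR bases this on the model's internal state."""
    return len(context) < 40


def refine_segment(segment: str, context: str) -> str:
    """Stub for "how" to refine: rewrite only the flagged segment in place."""
    return segment + " -> (revised in light of the current context)"


def proactive_generation(prompt: str, max_segments: int = 4) -> str:
    """Generate segment by segment, interleaving refinement decisions with
    generation instead of regenerating the whole response after the fact."""
    context = prompt
    segments: List[str] = []
    for _ in range(max_segments):
        segment = generate_segment(context)
        if should_refine(segment, context):
            segment = refine_segment(segment, context)
        segments.append(segment)
        context += "\n" + segment  # a refined segment conditions all later steps
    return "\n".join(segments)


if __name__ == "__main__":
    print(proactive_generation("Solve: 17 * 23 = ?"))
```

The contrast with reactive self-refinement baselines is that the revision happens inside the generation loop, so later segments are produced from already-corrected context rather than being patched over a fixed number of post-hoc iterations.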