
POLCA: Stochastic Generative Optimization with LLM

March 16, 2026
Authors: Xuanfei Ren, Allen Nie, Tengyang Xie, Ching-An Cheng
cs.AI

Abstract

Optimizing complex systems, ranging from LLM prompts to multi-turn agents, traditionally requires labor-intensive manual iteration. We formalize this challenge as a stochastic generative optimization problem in which a generative language model acts as the optimizer, guided by numerical rewards and text feedback to discover the best system. We introduce Prioritized Optimization with Local Contextual Aggregation (POLCA), a scalable framework designed to handle stochasticity in optimization (such as noisy feedback, minibatch sampling, and stochastic system behaviors) while effectively managing the unconstrained expansion of the solution space. POLCA maintains a priority queue to manage the exploration-exploitation tradeoff, systematically tracking candidate solutions and their evaluation histories. To enhance efficiency, we integrate an ε-Net mechanism to maintain parameter diversity and an LLM Summarizer to perform meta-learning across historical trials. We theoretically prove that POLCA converges to near-optimal candidate solutions under stochasticity. We evaluate our framework on diverse benchmarks, including τ-bench, HotpotQA (agent optimization), VeriBench (code translation), and KernelBench (CUDA kernel generation). Experimental results demonstrate that POLCA achieves robust, sample-efficient, and time-efficient performance, consistently outperforming state-of-the-art algorithms in both deterministic and stochastic problems. The codebase for this work is publicly available at https://github.com/rlx-lab/POLCA.
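To make the priority-queue and ε-Net ideas in the abstract concrete, here is a minimal toy sketch in Python. It is not the POLCA implementation. Everything in it is an assumption made for illustration: candidates are plain floats rather than prompts or programs, `propose` is a random perturbation standing in for the LLM optimizer, `noisy_reward` is a made-up stochastic objective, the priority is the empirical mean reward, and the ε-Net is a simple distance check. The LLM Summarizer is not modeled.

```python
import heapq
import itertools
import random
from dataclasses import dataclass, field


@dataclass
class Candidate:
    params: float                                # toy stand-in for a prompt or program
    rewards: list = field(default_factory=list)  # evaluation history of this candidate

    def mean(self) -> float:
        return sum(self.rewards) / len(self.rewards)


def noisy_reward(x: float, rng: random.Random) -> float:
    # Hypothetical stochastic objective: best at x = 3, observed with Gaussian noise.
    return -(x - 3.0) ** 2 + rng.gauss(0.0, 0.3)


def propose(x: float, rng: random.Random) -> float:
    # Stand-in for the LLM optimizer: a random local edit of the parent.
    return x + rng.uniform(-1.0, 1.0)


def is_novel(x: float, pool: list, eps: float) -> bool:
    # Epsilon-net style filter: reject proposals within eps of a stored candidate.
    return all(abs(x - c.params) >= eps for c in pool)


def optimize(steps: int = 300, eps: float = 0.1, seed: int = 0, min_evals: int = 3):
    rng = random.Random(seed)
    tie = itertools.count()  # unique tie-breaker so Candidate objects are never compared
    root = Candidate(0.0)
    root.rewards.append(noisy_reward(root.params, rng))
    pool = [root]
    heap = [(-root.mean(), next(tie), root)]  # min-heap on negated priority

    for _ in range(steps):
        # Pop the candidate with the highest empirical mean and re-evaluate it,
        # so a lucky early score is corrected by further samples.
        _, _, cand = heapq.heappop(heap)
        cand.rewards.append(noisy_reward(cand.params, rng))
        heapq.heappush(heap, (-cand.mean(), next(tie), cand))

        # Generate a child from it; keep the child only if it is eps-novel.
        child_params = propose(cand.params, rng)
        if is_novel(child_params, pool, eps):
            child = Candidate(child_params)
            child.rewards.append(noisy_reward(child.params, rng))
            pool.append(child)
            heapq.heappush(heap, (-child.mean(), next(tie), child))

    # Report the best candidate among those evaluated several times.
    trusted = [c for c in pool if len(c.rewards) >= min_evals] or pool
    return max(trusted, key=Candidate.mean), pool


if __name__ == "__main__":
    best, pool = optimize()
    print(f"best params={best.params:.3f}, evals={len(best.rewards)}, pool size={len(pool)}")
```

In this sketch the distance check bounds how densely candidates can pack into any region of the toy parameter space, and repeated evaluation of the top-priority candidate averages out noise. How POLCA actually defines its priority function, its ε-Net distance, and its summarizer is specified in the paper and codebase, not here.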