Test-time scaling of diffusions with flow maps

Abstract

A common recipe for improving diffusion models at test time, so that samples score highly against a user-specified reward, is to introduce the gradient of the reward into the dynamics of the diffusion itself. This procedure is often ill-posed, because user-specified rewards are usually well defined only on the data distribution at the end of generation. A common workaround is to use a denoiser to estimate what a sample would have been at the end of generation; we instead propose a simple solution that works directly with a flow map. By exploiting a relationship between the flow map and the velocity field governing the instantaneous transport, we construct an algorithm, Flow Map Trajectory Tilting (FMTT), which provably performs better ascent on the reward than standard test-time methods involving the gradient of the reward. The approach can be used either to perform exact sampling via importance weighting or to conduct a principled search that identifies local maximizers of the reward-tilted distribution. We demonstrate the efficacy of our approach against other look-ahead techniques, and show how the flow map enables engagement with complicated reward functions that make possible new forms of image editing, e.g., by interfacing with vision language models.
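To make the look-ahead idea concrete, here is a minimal sketch in a 1D toy setting. It is an illustration under stated assumptions, not the FMTT algorithm itself. The Gaussian base and data distributions, the constants M and S, the quadratic reward, and all function names are hypothetical. The closed-form flow map holds only for this toy case. The weights are the generic self-normalized exp(reward) weights for a reward-tilted distribution and may differ from the weights the paper uses along trajectories.

```python
import math

# Toy 1D setting (an assumption for illustration, not the paper's setup):
# base N(0, 1), data N(M, S^2), linear interpolant x_t = (1 - t) * z + t * x_1,
# so the marginal at time t is N(t * M, (1 - t)^2 + (t * S)^2).
M, S = 2.0, 0.5


def sigma(t):
    """Standard deviation of the marginal at time t."""
    return math.sqrt((1.0 - t) ** 2 + (t * S) ** 2)


def flow_map(x, t):
    """Send a state at time t to its endpoint at time 1.

    In 1D the flow preserves quantiles, so for Gaussian marginals the map
    is affine. This closed form is specific to the toy problem.
    """
    return M + S * (x - t * M) / sigma(t)


def flow_map_jacobian(t):
    """Derivative of flow_map in x (constant in x because the map is affine)."""
    return S / sigma(t)


def reward(x1):
    """Hypothetical reward, meaningful only on data-space samples."""
    return -(x1 - 3.0) ** 2


def reward_grad(x1):
    """Derivative of the hypothetical reward."""
    return -2.0 * (x1 - 3.0)


def lookahead_reward_grad(x, t):
    """Gradient in x of reward(flow_map(x, t)), computed by the chain rule."""
    return reward_grad(flow_map(x, t)) * flow_map_jacobian(t)


def self_normalized_weights(rewards):
    """Weights proportional to exp(reward), normalized to sum to 1."""
    top = max(rewards)  # subtract the max for numerical stability
    unnorm = [math.exp(r - top) for r in rewards]
    total = sum(unnorm)
    return [u / total for u in unnorm]
```

At an intermediate time such as t = 0.3, `lookahead_reward_grad(x, 0.3)` evaluates the reward where it is defined, on the predicted endpoint, rather than on the noisy state x. Passing the endpoint rewards of several particles to `self_normalized_weights` gives the kind of weights one could use for resampling in this toy problem.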
