D-Flow: Differentiating through Flows for Controlled Generation

Abstract

Taming the generation outcome of state-of-the-art Diffusion and Flow-Matching (FM) models without having to re-train a task-specific model unlocks a powerful tool for solving inverse problems, conditional generation, and controlled generation in general. In this work we introduce D-Flow, a simple framework for controlling the generation process by differentiating through the flow and optimizing for the source (noise) point. We motivate this framework by our key observation that, for Diffusion/FM models trained with Gaussian probability paths, differentiating through the generation process projects the gradient onto the data manifold, implicitly injecting the prior into the optimization process. We validate our framework on linear and non-linear controlled generation problems, including image and audio inverse problems and conditional molecule generation, reaching state-of-the-art performance across all of them.
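The core idea, optimizing the source point by differentiating a cost on the generated sample back through the flow, can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and not the paper's implementation: a 1-D Gaussian data distribution whose velocity field is known in closed form stands in for a trained model, a fixed-step Euler solver stands in for the ODE solver, and plain gradient descent stands in for the optimizer. The names `velocity`, `solve_and_grad`, and `d_flow` are hypothetical.

```python
# Toy 1-D setting (all values are illustrative assumptions):
# source p_0 = N(0, 1), data p_1 = N(MU, SIGMA^2), Gaussian path
# p_t = N(t * MU, (1 - t)^2 + t^2 * SIGMA^2).
MU, SIGMA = 2.0, 0.5


def velocity(x, t):
    """Closed-form marginal velocity u_t(x) and its derivative du/dx."""
    var = (1.0 - t) ** 2 + (t * SIGMA) ** 2   # s_t^2, variance of p_t
    a = (t * SIGMA ** 2 - (1.0 - t)) / var    # s_t' / s_t
    return a * (x - t * MU) + MU, a


def solve_and_grad(x0, y, steps=50):
    """Euler-integrate x0 -> x1 and differentiate (x1 - y)^2 w.r.t. x0."""
    h = 1.0 / steps
    x, dx1_dx0 = x0, 1.0
    for k in range(steps):
        u, du_dx = velocity(x, k * h)
        dx1_dx0 *= 1.0 + h * du_dx            # chain rule through one Euler step
        x += h * u
    cost = (x - y) ** 2
    return x, cost, 2.0 * (x - y) * dx1_dx0


def d_flow(y, x0=0.0, lr=1.0, iters=100):
    """Plain gradient descent on the source point x0 (hypothetical helper)."""
    for _ in range(iters):
        _, _, grad = solve_and_grad(x0, y)
        x0 -= lr * grad
    x1, _, _ = solve_and_grad(x0, y)
    return x0, x1


x0_opt, x1_opt = d_flow(y=2.5)
print(f"optimized source x0 = {x0_opt:.4f}, generated sample x1 = {x1_opt:.4f}")
```

In this sketch the only free variable is the source point: the cost is evaluated on the generated sample `x1`, and its gradient reaches `x0` only through the solver steps. With a trained network in place of the closed-form `velocity`, the hand-written derivative would be replaced by automatic differentiation through the solver.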
