FlowEdit: Inversion-Free Text-Based Editing Using Pre-Trained Flow Models

Abstract

Editing real images using a pre-trained text-to-image (T2I) diffusion/flow model often involves inverting the image into its corresponding noise map. However, inversion by itself is typically insufficient for obtaining satisfactory results, and therefore many methods additionally intervene in the sampling process. Such methods achieve improved results but are not seamlessly transferable between model architectures. Here, we introduce FlowEdit, a text-based editing method for pre-trained T2I flow models, which is inversion-free, optimization-free, and model-agnostic. Our method constructs an ODE that directly maps between the source and target distributions (corresponding to the source and target text prompts) and achieves a lower transport cost than the inversion approach. This leads to state-of-the-art results, as we illustrate with Stable Diffusion 3 and FLUX. Code and examples are available on the project's webpage.
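The abstract does not spell out how the source-to-target ODE is integrated, so the following is only a toy sketch under stated assumptions, not the authors' implementation. It assumes the rectified-flow convention X_t = (1 - t) * X_0 + t * N (t = 1 is pure noise, t = 0 is data), replaces the pre-trained T2I model with closed-form velocity fields for one-dimensional Gaussians, and assumes an update that steps along the difference between the target-prompt and source-prompt velocities evaluated at coupled noisy points. The names gaussian_velocity and flowedit_toy are hypothetical.

```python
import random


def gaussian_velocity(x, t, mu, sigma):
    """Closed-form velocity E[N - X0 | X_t = x] for X0 ~ N(mu, sigma^2).

    Stands in for a pre-trained flow model conditioned on one prompt
    (an assumption made purely for illustration).
    """
    var_t = (1 - t) ** 2 * sigma ** 2 + t ** 2
    gain = (t - (1 - t) * sigma ** 2) / var_t
    return gain * (x - (1 - t) * mu) - mu


def flowedit_toy(x_src, mu_src, mu_tar, sigma=1.0, steps=50, seed=0):
    """Move x_src from the source to the target distribution without inversion."""
    rng = random.Random(seed)
    ts = [i / steps for i in range(steps + 1)]  # ts[0] = 0 (data), ts[-1] = 1 (noise)
    z_edit = x_src  # the editing path starts at the source sample itself
    for i in range(steps, 0, -1):
        t, t_next = ts[i], ts[i - 1]
        noise = rng.gauss(0.0, 1.0)
        # Noisy version of the source sample at time t.
        z_src = (1 - t) * x_src + t * noise
        # Coupled target point: same noise, offset by the edit accumulated so far.
        z_tar = z_edit + z_src - x_src
        # Difference of the two conditional velocities drives the edit.
        v_delta = (gaussian_velocity(z_tar, t, mu_tar, sigma)
                   - gaussian_velocity(z_src, t, mu_src, sigma))
        z_edit += (t_next - t) * v_delta  # Euler step from t down to t_next
    return z_edit


if __name__ == "__main__":
    # Source N(0, 1), target N(2, 1): the result should be close to 0.3 + 2.0.
    print(flowedit_toy(x_src=0.3, mu_src=0.0, mu_tar=2.0))
```

In this equal-variance Gaussian toy the noise terms cancel in the velocity difference, so the path reduces to a plain shift by mu_tar - mu_src. That shift is the minimum-cost transport map between the two Gaussians, which gives a small-scale picture of the abstract's claim about lower transport cost. With a real T2I model the velocities are nonlinear, and no such closed form is implied.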
