D-Flow: Differentiating through Flows for Controlled Generation

Abstract

Taming the generation outcome of state-of-the-art Diffusion and Flow-Matching (FM) models without re-training a task-specific model unlocks a powerful tool for solving inverse problems, conditional generation, and controlled generation in general. In this work we introduce D-Flow, a simple framework for controlling the generation process by differentiating through the flow and optimizing for the source (noise) point. We motivate this framework by our key observation that, for Diffusion/FM models trained with Gaussian probability paths, differentiating through the generation process projects the gradient onto the data manifold, implicitly injecting the prior into the optimization process. We validate our framework on linear and non-linear controlled generation problems, including image and audio inverse problems and conditional molecule generation, reaching state-of-the-art performance on all of them.
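
The idea of optimizing the source point by differentiating through the flow can be pictured with a small toy sketch. It is not the paper's implementation. It assumes a hypothetical diagonal-Gaussian data distribution (`MU`, `S`), for which the flow-matching velocity field along the path x_t = t*x1 + (1-t)*x0 is available in closed form, so no trained network is needed. The function names, the Euler solver, the quadratic cost, and the step size are all illustrative choices. Because the toy velocity is linear in x, the derivative of the solver output with respect to x0 is accumulated by hand instead of using an autodiff library.

```python
# Toy sketch: optimize the source point x0 by differentiating through an
# Euler-discretized flow. Assumption: data ~ N(MU, diag(S^2)), so the velocity
# field is known in closed form. MU, S and all function names are hypothetical.

MU = [2.0, -1.0]  # assumed data mean
S = [1.0, 0.05]   # assumed data std per axis (the second axis is nearly flat)


def slope(t, s):
    """Coefficient a_t in u_t(x) = a_t * (x - t*mu) + mu for x_t = t*x1 + (1-t)*x0."""
    var_t = (1.0 - t) ** 2 + (t * s) ** 2
    return (t * s * s - (1.0 - t)) / var_t


def flow_with_jacobian(x0, n_steps=100):
    """Euler-integrate dx/dt = u_t(x) from t=0 to t=1.

    Returns x(1) and the diagonal of dx(1)/dx0, accumulated step by step.
    """
    h = 1.0 / n_steps
    x = list(x0)
    jac = [1.0] * len(x0)
    for k in range(n_steps):
        t = k * h
        for i in range(len(x)):
            a = slope(t, S[i])
            x[i] = x[i] + h * (a * (x[i] - t * MU[i]) + MU[i])
            jac[i] *= 1.0 + h * a  # derivative of one Euler step w.r.t. x[i]
    return x, jac


def d_flow_optimize(x0, target, lr=0.5, n_iters=200):
    """Gradient descent on x0 for the cost 0.5 * ||x(1) - target||^2."""
    x0 = list(x0)
    for _ in range(n_iters):
        x1, jac = flow_with_jacobian(x0)
        # Chain rule through the solver: dL/dx0 = (dx(1)/dx0) * (x(1) - target)
        grad = [jac[i] * (x1[i] - target[i]) for i in range(len(x0))]
        x0 = [x0[i] - lr * grad[i] for i in range(len(x0))]
    x1, _ = flow_with_jacobian(x0)
    return x0, x1
```

Calling `d_flow_optimize([0.0, 0.0], [3.0, 1.0])` drives the first coordinate of the generated sample to the target, while the second coordinate, along which the assumed data distribution has almost no variance, moves only part of the way in the same number of iterations. In this toy the Jacobian of the flow is roughly `diag(S)`, so gradients along the low-variance axis are strongly damped. That is only a loose illustration of the abstract's remark that differentiating through the flow injects the prior into the optimization; a real setting would use a trained velocity network and automatic differentiation.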
