CAR-Flow: Condition-Aware Reparameterization Aligns Source and Target for Better Flow Matching

Abstract

Conditional generative modeling aims to learn a conditional data distribution from samples of data-condition pairs. Diffusion and flow-based methods have achieved compelling results on this task. These methods use a learned (flow) model to transport initial standard Gaussian noise, which ignores the condition, to the conditional data distribution. The model must therefore learn both mass transport and condition injection. To ease this demand on the model, we propose Condition-Aware Reparameterization for Flow Matching (CAR-Flow), a lightweight, learned shift that conditions the source, the target, or both distributions. By relocating these distributions, CAR-Flow shortens the probability path the model must learn, leading to faster training in practice. On low-dimensional synthetic data, we visualize and quantify the effects of CAR. On higher-dimensional natural image data (ImageNet-256), equipping SiT-XL/2 with CAR-Flow reduces FID from 2.07 to 1.68 while introducing fewer than 0.6% additional parameters.
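The claim that relocating the source shortens the path the model must learn can be illustrated with a small toy calculation. The sketch below is not the authors' implementation: it assumes a 1-D setting with two hypothetical conditions, a straight-line path x_t = (1 - t) * x0 + t * x1, and a hand-fixed per-condition shift standing in for the shift that CAR-Flow would learn. All names and constants are illustrative.

```python
import random
import statistics

# Hypothetical 1-D toy setup: two conditions with well-separated target means.
TARGET_MEAN = {0: -4.0, 1: 4.0}
TARGET_STD = 0.5


def sample_pair(y, rng, shift=0.0):
    """Draw (x0, x1): a source sample moved by `shift` and a target sample for condition y."""
    x0 = rng.gauss(0.0, 1.0) + shift  # condition-aware source when shift != 0
    x1 = rng.gauss(TARGET_MEAN[y], TARGET_STD)
    return x0, x1


def mean_path_length(shift_fn, n=20000, seed=0):
    """Average |x1 - x0|, the length of the straight path from x0 to x1 (an assumed path choice)."""
    rng = random.Random(seed)
    lengths = []
    for _ in range(n):
        y = rng.choice([0, 1])
        x0, x1 = sample_pair(y, rng, shift_fn(y))
        lengths.append(abs(x1 - x0))
    return statistics.fmean(lengths)


# Condition-agnostic source: standard Gaussian noise for every condition.
baseline = mean_path_length(lambda y: 0.0)
# Source shifted per condition; the shift is fixed by hand here, whereas CAR-Flow learns it.
shifted = mean_path_length(lambda y: TARGET_MEAN[y])

print(f"baseline mean path length: {baseline:.3f}")
print(f"shifted  mean path length: {shifted:.3f}")
```

In this toy setting the shifted source yields a noticeably shorter average displacement, because the shift already accounts for the per-condition mean and the remaining transport only has to reshape the noise. It says nothing about the reported FID numbers, which come from the full SiT-XL/2 experiments.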
