Negative-Guided Subject Fidelity Optimization for Zero-Shot Subject-Driven Generation

Abstract

We present Subject Fidelity Optimization (SFO), a novel comparative learning framework that improves subject fidelity in zero-shot subject-driven generation. Unlike supervised fine-tuning methods, which rely only on positive targets and reuse the diffusion loss from the pre-training stage, SFO introduces synthetic negative targets and explicitly guides the model to favor positives over negatives through pairwise comparison. To obtain negative targets, we propose Condition-Degradation Negative Sampling (CDNS), which automatically generates distinctive and informative negatives by intentionally degrading visual and textual cues, without expensive human annotation. Moreover, we reweight the diffusion timesteps to focus fine-tuning on the intermediate steps where subject details emerge. Extensive experiments demonstrate that SFO with CDNS significantly outperforms baselines in both subject fidelity and text alignment on a subject-driven generation benchmark. Project page: https://subjectfidelityoptimization.github.io/
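The abstract does not state the training objective, so the following is only a minimal sketch of how a pairwise comparison with timestep reweighting could be wired together. It assumes a logistic (Bradley-Terry style) loss on per-sample denoising errors and a bell-shaped timestep weight; the function names, the `beta`, `center`, and `width` parameters, and the weight shape are all hypothetical and are not taken from the paper.

```python
import math


def softplus(x):
    """Numerically stable log(1 + exp(x))."""
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


def timestep_weight(t, num_steps=1000, center=0.5, width=0.2):
    """Hypothetical bell-shaped weight that peaks at intermediate timesteps.

    Assumption: the paper only says intermediate steps are emphasized;
    the Gaussian shape and its parameters are illustrative.
    """
    x = t / num_steps
    return math.exp(-0.5 * ((x - center) / width) ** 2)


def pairwise_preference_loss(err_pos, err_neg, beta=1.0):
    """Logistic pairwise loss on denoising errors (assumed form).

    err_pos / err_neg are the model's denoising errors on the positive
    and the synthetic negative target. The loss is small when the
    positive target is fit better than the negative one.
    """
    margin = beta * (err_neg - err_pos)
    # -log(sigmoid(margin)) equals softplus(-margin)
    return softplus(-margin)


def sfo_style_loss(samples, num_steps=1000, beta=1.0):
    """Weighted average of pairwise losses.

    samples: list of (timestep, err_pos, err_neg) tuples.
    """
    weights = [timestep_weight(t, num_steps) for t, _, _ in samples]
    losses = [pairwise_preference_loss(p, n, beta) for _, p, n in samples]
    return sum(w * l for w, l in zip(weights, losses)) / sum(weights)
```

In this sketch, a pair where the positive error is lower than the negative error yields a smaller loss than the reverse pair, and pairs sampled near the middle of the schedule contribute more to the average than pairs near either end.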
