ConsisLoRA: Enhancing Content and Style Consistency for LoRA-based Style Transfer
March 13, 2025
Authors: Bolin Chen, Baoquan Zhao, Haoran Xie, Yi Cai, Qing Li, Xudong Mao
cs.AI
Abstract
Style transfer involves transferring the style from a reference image to the
content of a target image. Recent advancements in LoRA-based (Low-Rank
Adaptation) methods have shown promise in effectively capturing the style of a
single image. However, these approaches still face significant challenges such
as content inconsistency, style misalignment, and content leakage. In this
paper, we comprehensively analyze the limitations of the standard diffusion
parameterization, which learns to predict noise, in the context of style
transfer. To address these issues, we introduce ConsisLoRA, a LoRA-based method
that enhances both content and style consistency by optimizing the LoRA weights
to predict the original image rather than noise. We also propose a two-step
training strategy that decouples the learning of content and style from the
reference image. To effectively capture both the global structure and local
details of the content image, we introduce a stepwise loss transition strategy.
Additionally, we present an inference guidance method that enables continuous
control over content and style strengths during inference. Through both
qualitative and quantitative evaluations, our method demonstrates significant
improvements in content and style consistency while effectively reducing
content leakage.
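The abstract contrasts two training targets: predicting the noise and predicting the original image. The toy sketch below illustrates how the two relate under the standard forward diffusion process, x_t = sqrt(alpha_bar) * x0 + sqrt(1 - alpha_bar) * eps. It is not the paper's implementation: there is no network and no LoRA here, a flat list of floats stands in for an image or latent, and all function names (`add_noise`, `x0_from_eps`, `noise_loss`, `x0_loss`) and the value of `alpha_bar` are assumptions chosen for illustration.

```python
import math
import random


def add_noise(x0, eps, alpha_bar):
    """Forward diffusion: x_t = sqrt(alpha_bar) * x0 + sqrt(1 - alpha_bar) * eps."""
    a, b = math.sqrt(alpha_bar), math.sqrt(1.0 - alpha_bar)
    return [a * x + b * e for x, e in zip(x0, eps)]


def x0_from_eps(x_t, eps_pred, alpha_bar):
    """Invert the forward process to turn a noise prediction into an image estimate."""
    a, b = math.sqrt(alpha_bar), math.sqrt(1.0 - alpha_bar)
    return [(xt - b * e) / a for xt, e in zip(x_t, eps_pred)]


def mse(u, v):
    """Mean squared error between two equally sized lists."""
    return sum((p - q) ** 2 for p, q in zip(u, v)) / len(u)


def noise_loss(eps_pred, eps):
    """Standard parameterization: compare predicted noise with the true noise."""
    return mse(eps_pred, eps)


def x0_loss(x_t, eps_pred, x0, alpha_bar):
    """Image-space objective: compare the implied image estimate with the clean image."""
    return mse(x0_from_eps(x_t, eps_pred, alpha_bar), x0)


# Toy data (hypothetical): a 16-value "image" and Gaussian noise.
rng = random.Random(0)
x0 = [rng.uniform(-1.0, 1.0) for _ in range(16)]
eps = [rng.gauss(0.0, 1.0) for _ in range(16)]
alpha_bar = 0.25  # assumed cumulative signal level at some timestep

x_t = add_noise(x0, eps, alpha_bar)
eps_pred = [e + 0.1 for e in eps]  # stand-in for an imperfect model output

l_noise = noise_loss(eps_pred, eps)
l_x0 = x0_loss(x_t, eps_pred, x0, alpha_bar)
print(l_noise, l_x0, l_x0 / l_noise)
```

For the same prediction error, the image-space loss equals the noise loss scaled by (1 - alpha_bar) / alpha_bar, so it weights heavily noised timesteps more strongly. How ConsisLoRA actually formulates and schedules its losses is described in the paper itself, not in this sketch.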