ConsisLoRA: Enhancing Content and Style Consistency for LoRA-based Style Transfer

Abstract

Style transfer applies the style of a reference image to the content of a target image. Recent LoRA-based (Low-Rank Adaptation) methods have shown promise in capturing the style of a single image. However, these approaches still face significant challenges, including content inconsistency, style misalignment, and content leakage. In this paper, we comprehensively analyze the limitations of the standard diffusion parameterization, which learns to predict noise, in the context of style transfer. To address these issues, we introduce ConsisLoRA, a LoRA-based method that enhances both content and style consistency by optimizing the LoRA weights to predict the original image rather than noise. We also propose a two-step training strategy that decouples the learning of content and style from the reference image. To capture both the global structure and the local details of the content image, we introduce a stepwise loss transition strategy. Additionally, we present an inference guidance method that enables continuous control over content and style strengths during inference. In both qualitative and quantitative evaluations, our method shows significant improvements in content and style consistency while effectively reducing content leakage.
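The difference between predicting noise and predicting the original image can be illustrated with the standard DDPM forward process, x_t = sqrt(ᾱ_t)·x_0 + sqrt(1 − ᾱ_t)·ε. The sketch below is not the authors' implementation: the abstract does not give their loss or LoRA setup, and the function names, the toy arrays, and the chosen ᾱ_t value are assumptions made for illustration. It only shows that a noise prediction can be converted into an image estimate, and that the same prediction error is weighted differently depending on which target the loss is measured against.

```python
import numpy as np

def add_noise(x0, eps, alpha_bar_t):
    """Standard DDPM forward step: x_t = sqrt(a) * x0 + sqrt(1 - a) * eps."""
    return np.sqrt(alpha_bar_t) * x0 + np.sqrt(1.0 - alpha_bar_t) * eps

def x0_from_eps(x_t, eps_pred, alpha_bar_t):
    """Invert the forward step to turn a noise prediction into an image estimate."""
    return (x_t - np.sqrt(1.0 - alpha_bar_t) * eps_pred) / np.sqrt(alpha_bar_t)

def eps_loss(eps_pred, eps):
    """Noise-prediction objective: mean squared error against the true noise."""
    return float(np.mean((eps_pred - eps) ** 2))

def x0_loss(x_t, eps_pred, x0, alpha_bar_t):
    """Image-prediction objective: mean squared error against the clean image."""
    x0_pred = x0_from_eps(x_t, eps_pred, alpha_bar_t)
    return float(np.mean((x0_pred - x0) ** 2))

# Toy data (assumption): random arrays stand in for image latents.
rng = np.random.default_rng(0)
x0 = rng.standard_normal((4, 8, 8))
eps = rng.standard_normal(x0.shape)

alpha_bar_t = 0.1  # hypothetical heavily noised timestep
x_t = add_noise(x0, eps, alpha_bar_t)

# Stand-in for a network output (assumption): the true noise plus a small error.
eps_pred = eps + 0.05 * rng.standard_normal(x0.shape)

loss_eps = eps_loss(eps_pred, eps)
loss_x0 = x0_loss(x_t, eps_pred, x0, alpha_bar_t)

# The two losses differ by the factor (1 - alpha_bar_t) / alpha_bar_t.
ratio = loss_x0 / loss_eps
print(f"eps loss: {loss_eps:.6f}  x0 loss: {loss_x0:.6f}  ratio: {ratio:.3f}")
```

Under these assumptions the ratio equals (1 − ᾱ_t)/ᾱ_t, so an image-space loss penalizes the same noise error more heavily at strongly noised timesteps. How ConsisLoRA uses this parameterization, and how its stepwise loss transition and inference guidance are defined, is described in the paper itself rather than in this abstract.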
