T-LoRA: Single Image Diffusion Model Customization Without Overfitting
July 8, 2025
Authors: Vera Soboleva, Aibek Alanov, Andrey Kuznetsov, Konstantin Sobolev
cs.AI
Abstract
While diffusion model fine-tuning offers a powerful approach for customizing
pre-trained models to generate specific objects, it frequently suffers from
overfitting when training samples are limited, compromising both generalization
capability and output diversity. This paper tackles the challenging yet most
impactful task of adapting a diffusion model using just a single concept image,
as single-image customization holds the greatest practical potential. We
introduce T-LoRA, a Timestep-Dependent Low-Rank Adaptation framework
specifically designed for diffusion model personalization. In our work, we show
that higher diffusion timesteps are more prone to overfitting than lower ones,
necessitating a timestep-sensitive fine-tuning strategy. T-LoRA incorporates
two key innovations: (1) a dynamic fine-tuning strategy that adjusts
rank-constrained updates based on diffusion timesteps, and (2) a weight
parametrization technique that ensures independence between adapter components
through orthogonal initialization. Extensive experiments show that T-LoRA and
its individual components outperform standard LoRA and other diffusion model
personalization techniques. They achieve a superior balance between concept
fidelity and text alignment, highlighting the potential of T-LoRA in
data-limited and resource-constrained scenarios. Code is available at
https://github.com/ControlGenAI/T-LoRA.
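
The abstract's two ingredients can be illustrated with a minimal PyTorch sketch, not the paper's actual implementation. The class name `TimestepLoRALinear`, the linear rank schedule in `active_rank`, and the zero-initialized diagonal gate are hypothetical stand-ins; only the timestep-dependent rank constraint and the orthogonal initialization of the adapter factors come from the abstract itself.

```python
import torch
import torch.nn as nn


class TimestepLoRALinear(nn.Module):
    """Timestep-dependent LoRA around a frozen linear layer (illustrative only)."""

    def __init__(self, base: nn.Linear, rank: int = 16, max_timestep: int = 1000):
        super().__init__()
        self.base = base
        self.base.requires_grad_(False)  # only the adapter is trained
        self.rank = rank
        self.max_timestep = max_timestep
        # Orthogonally initialized factors keep the rank-one components
        # independent; the zero-initialized diagonal gate makes the adapter
        # a no-op at the start of training.
        self.down = nn.Parameter(torch.empty(rank, base.in_features))
        self.up = nn.Parameter(torch.empty(base.out_features, rank))
        self.gate = nn.Parameter(torch.zeros(rank))
        nn.init.orthogonal_(self.down)
        nn.init.orthogonal_(self.up)

    def active_rank(self, t: torch.Tensor) -> torch.Tensor:
        # Hypothetical schedule: allow fewer active components at high (noisy)
        # timesteps, which the abstract identifies as most prone to overfitting.
        frac = 1.0 - t.float() / self.max_timestep
        return (frac * self.rank).clamp(min=1).long()

    def forward(self, x: torch.Tensor, t: torch.Tensor) -> torch.Tensor:
        # Mask out the trailing rank components for the current timestep.
        mask = (torch.arange(self.rank, device=x.device) < self.active_rank(t)).float()
        delta = ((x @ self.down.T) * (self.gate * mask)) @ self.up.T
        return self.base(x) + delta


# Example usage on an arbitrarily sized projection:
layer = TimestepLoRALinear(nn.Linear(320, 320), rank=8)
y = layer(torch.randn(4, 320), t=torch.tensor(800))
```

In a real fine-tuning loop such a wrapper would replace the attention projections of the diffusion backbone, with the sampled diffusion timestep passed through so that noisier steps receive lower-rank updates.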