T-LoRA: Single Image Diffusion Model Customization Without Overfitting

Abstract

While diffusion model fine-tuning offers a powerful approach for customizing pre-trained models to generate specific objects, it frequently suffers from overfitting when training samples are limited, compromising both generalization capability and output diversity. This paper tackles the challenging yet impactful task of adapting a diffusion model using just a single concept image, the setting that holds the greatest practical potential. We introduce T-LoRA, a Timestep-Dependent Low-Rank Adaptation framework specifically designed for diffusion model personalization. We show that higher diffusion timesteps are more prone to overfitting than lower ones, which calls for a timestep-sensitive fine-tuning strategy. T-LoRA incorporates two key innovations: (1) a dynamic fine-tuning strategy that adjusts rank-constrained updates based on diffusion timesteps, and (2) a weight parametrization technique that ensures independence between adapter components through orthogonal initialization. Extensive experiments show that T-LoRA and its individual components outperform standard LoRA and other diffusion model personalization techniques. They achieve a superior balance between concept fidelity and text alignment, highlighting the potential of T-LoRA in data-limited and resource-constrained scenarios. Code is available at https://github.com/ControlGenAI/T-LoRA.
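To make the first innovation concrete, the following is a minimal sketch of what a timestep-dependent rank constraint could look like. It is not taken from the T-LoRA repository: the function names, the linear schedule, and the default values (`max_t=1000`, `r_max=64`, `r_min=16`) are all assumptions chosen for illustration. The only idea drawn from the abstract is that higher timesteps, which are more prone to overfitting, receive a more tightly rank-constrained update. The orthogonal initialization is not covered here.

```python
def active_rank(t, max_t=1000, r_max=64, r_min=16):
    """Hypothetical linear schedule: r_max at t=0, shrinking to r_min at t=max_t."""
    if not 0 <= t <= max_t:
        raise ValueError("t must lie in [0, max_t]")
    # Integer arithmetic keeps the result an exact int.
    return r_min + (r_max - r_min) * (max_t - t) // max_t


def rank_mask(t, max_t=1000, r_max=64, r_min=16):
    """0/1 mask over the rank dimension; only the first active_rank(t) entries are on."""
    r = active_rank(t, max_t, r_max, r_min)
    return [1.0] * r + [0.0] * (r_max - r)


def masked_delta(B, A, mask):
    """Compute B @ diag(mask) @ A on nested lists.

    B has shape (out_dim, r) and A has shape (r, in_dim); masked-out rank
    components contribute nothing to the weight update.
    """
    out_dim, in_dim = len(B), len(A[0])
    return [
        [sum(B[i][k] * mask[k] * A[k][j] for k in range(len(mask)))
         for j in range(in_dim)]
        for i in range(out_dim)
    ]
```

Under this assumed schedule, a noisy timestep such as `t=1000` updates the weights through only 16 rank components, while `t=0` uses all 64. In a real adapter the mask would sit between the two low-rank matrices in the forward pass, selected by the timestep sampled for each training example.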
