LoRAShop: Training-Free Multi-Concept Image Generation and Editing with Rectified Flow Transformers

Abstract

We introduce LoRAShop, the first framework for multi-concept image editing with LoRA models. LoRAShop builds on a key observation about the feature interaction patterns inside Flux-style diffusion transformers: concept-specific transformer features activate spatially coherent regions early in the denoising process. We use this observation to derive a disentangled latent mask for each concept in a prior forward pass and blend the corresponding LoRA weights only within the regions bounding the concepts to be personalized. The resulting edits seamlessly integrate multiple subjects or styles into the original scene while preserving global context, lighting, and fine details. Our experiments demonstrate that LoRAShop delivers better identity preservation than the baselines. By eliminating retraining and external constraints, LoRAShop turns personalized diffusion models into a practical 'photoshop-with-LoRAs' tool and opens new avenues for compositional visual storytelling and rapid creative iteration.
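The abstract does not spell out how the masked blending is implemented, so the following is only a toy sketch of the general idea, under loudly stated assumptions: each concept has a per-token mask with values in [0, 1], and each LoRA adapter contributes a low-rank update that is added to the base layer's output only where that concept's mask is nonzero. All names here (`masked_lora_layer`, `lora_delta`, the tiny matrices) are hypothetical and are not taken from the paper or from any library. How the masks are extracted from transformer features is not sketched.

```python
from typing import List, Sequence, Tuple

Vector = List[float]
Matrix = List[List[float]]


def matvec(m: Matrix, v: Sequence[float]) -> Vector:
    """Multiply matrix m (rows x cols) by vector v (length cols)."""
    return [sum(w * x for w, x in zip(row, v)) for row in m]


def lora_delta(down: Matrix, up: Matrix, x: Sequence[float]) -> Vector:
    """Low-rank update up @ (down @ x) contributed by one adapter."""
    return matvec(up, matvec(down, x))


def masked_lora_layer(
    tokens: List[Vector],
    base: Matrix,
    adapters: List[Tuple[Matrix, Matrix]],
    masks: List[List[float]],
) -> List[Vector]:
    """Apply the base layer to every token, adding each adapter's
    update scaled by that concept's mask value at the token.

    masks[i][t] is the (assumed) weight of concept i at token t.
    """
    outputs = []
    for t, x in enumerate(tokens):
        y = matvec(base, x)  # base model output, always applied
        for (down, up), mask in zip(adapters, masks):
            if mask[t] > 0.0:  # adapter is active only inside its region
                delta = lora_delta(down, up, x)
                y = [yi + mask[t] * di for yi, di in zip(y, delta)]
        outputs.append(y)
    return outputs


# Toy example: three tokens, identity base layer, two rank-1 adapters.
base = [[1.0, 0.0], [0.0, 1.0]]
adapter_a = ([[1.0, 0.0]], [[0.0], [1.0]])  # update is [0, x0]
adapter_b = ([[0.0, 1.0]], [[1.0], [0.0]])  # update is [x1, 0]
tokens = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
masks = [
    [1.0, 0.0, 0.0],  # concept A covers token 0 only
    [0.0, 1.0, 0.0],  # concept B covers token 1 only
]
result = masked_lora_layer(tokens, base, [adapter_a, adapter_b], masks)
```

In this toy setup the third token lies outside both masks, so it passes through the base layer unchanged. This mirrors the abstract's claim that content outside the personalized regions is preserved.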
