IDAdapter: Learning Mixed Features for Tuning-Free Personalization of Text-to-Image Models

Abstract

Stable Diffusion has emerged as a powerful tool for personalized portrait generation, enabling users to create high-fidelity, custom character avatars from their own prompts. However, existing personalization methods face several challenges: test-time fine-tuning, the need for multiple input images, poor identity preservation, and limited diversity in the generated outputs. To overcome these challenges, we introduce IDAdapter, a tuning-free approach that enhances both diversity and identity preservation in personalized image generation from a single face image. IDAdapter integrates a personalized concept into the generation process through a combination of textual and visual injections and a face identity loss. During the training phase, we incorporate mixed features from multiple reference images of a specific identity to enrich identity-related content details, guiding the model to generate images with more diverse styles, expressions, and angles than previous works. Extensive evaluations demonstrate the effectiveness of our method, which achieves both diversity and identity fidelity in generated images.

