ResAdapter: Domain Consistent Resolution Adapter for Diffusion Models

Abstract

Recent advances in text-to-image models (e.g., Stable Diffusion) and the corresponding personalization techniques (e.g., DreamBooth and LoRA) enable individuals to generate high-quality, imaginative images. However, these models often struggle when generating images at resolutions outside their trained domain. To overcome this limitation, we present the Resolution Adapter (ResAdapter), a domain-consistent adapter designed for diffusion models to generate images with unrestricted resolutions and aspect ratios. Unlike other multi-resolution generation methods, which process images of static resolution with complex post-processing operations, ResAdapter directly generates images at dynamic resolutions. In particular, after learning pure resolution priors, ResAdapter, trained on a general dataset, generates resolution-free images with personalized diffusion models while preserving their original style domain. Comprehensive experiments demonstrate that ResAdapter, with only 0.5M parameters, can process images at flexible resolutions for arbitrary diffusion models. Further experiments demonstrate that ResAdapter is compatible with other modules (e.g., ControlNet, IP-Adapter, and LCM-LoRA) for image generation across a broad range of resolutions, and that it can be integrated into other multi-resolution models (e.g., ElasticDiffusion) to generate higher-resolution images efficiently. The project link is https://res-adapter.github.io
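As a small illustration of what "unrestricted resolutions and aspect ratios" means for a caller, the sketch below picks a width and height for a requested pixel budget and aspect ratio. It is not part of ResAdapter: the function name `snap_resolution` and its parameters are hypothetical, and the multiple-of-8 constraint is an assumption based on the 8x spatial downsampling of the Stable Diffusion VAE.

```python
import math


def snap_resolution(target_pixels: int, aspect_ratio: float, multiple: int = 8) -> tuple[int, int]:
    """Return (width, height) near target_pixels with width/height close to aspect_ratio.

    Hypothetical helper for illustration only. Both sides are rounded to the
    nearest multiple of `multiple` (assumed to be 8 for Stable Diffusion-style
    latent models).
    """
    if target_pixels <= 0 or aspect_ratio <= 0 or multiple <= 0:
        raise ValueError("target_pixels, aspect_ratio, and multiple must be positive")

    # Solve width * height = target_pixels with width / height = aspect_ratio.
    height = math.sqrt(target_pixels / aspect_ratio)
    width = height * aspect_ratio

    def snap(value: float) -> int:
        # Round to the nearest multiple, never going below one multiple.
        return max(multiple, round(value / multiple) * multiple)

    return snap(width), snap(height)


if __name__ == "__main__":
    # Same pixel budget as a 512x512 image, at several aspect ratios.
    for ratio in (1.0, 4 / 3, 16 / 9, 2.0):
        print(ratio, snap_resolution(512 * 512, ratio))
```

The returned pair could then be passed as the `width` and `height` arguments of whatever generation pipeline is in use; how well a given model handles each size is exactly the question the paper addresses.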
