BRAT: Bonus oRthogonAl Token for Architecture Agnostic Textual Inversion

Abstract

Textual Inversion remains a popular method for personalizing diffusion models by teaching them new subjects and styles. We note that textual inversion has been underexplored with alternatives to the UNet, and we experiment with textual inversion using a vision transformer. We also seek to optimize textual inversion with a strategy that does not require explicit use of the UNet and its idiosyncratic layers, so we add bonus tokens and enforce orthogonality. We find that the bonus token improves adherence to the source images and that the vision transformer improves adherence to the prompt. Code is available at https://github.com/jamesBaker361/tex_inv_plus.
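
To illustrate what "enforce orthogonality" between a learned token and a bonus token could look like, the sketch below computes a squared cosine similarity penalty between two embedding vectors. The abstract does not state the exact loss used, so the penalty form, the function names, and the toy 4-dimensional embeddings are all assumptions made for illustration, not the authors' implementation.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two embedding vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def orthogonality_penalty(placeholder, bonus):
    """Assumed penalty: squared cosine similarity.

    It is 0 when the two vectors are orthogonal and 1 when they are parallel,
    so minimizing it pushes the bonus token away from the placeholder token's
    direction.
    """
    return cosine_similarity(placeholder, bonus) ** 2


# Hypothetical toy embeddings; real text encoders use far more dimensions.
placeholder_token = [1.0, 0.0, 2.0, 0.0]
bonus_token = [0.0, 3.0, 0.0, 1.0]    # orthogonal to the placeholder
aligned_token = [2.0, 0.0, 4.0, 0.0]  # parallel to the placeholder

penalty_orthogonal = orthogonality_penalty(placeholder_token, bonus_token)
penalty_aligned = orthogonality_penalty(placeholder_token, aligned_token)
```

In a training loop, a term like this would typically be added to the usual denoising loss with a weighting coefficient. Because it depends only on the token embeddings, it makes no reference to UNet-specific layers, which is in keeping with the architecture-agnostic goal stated above.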
