BRAT: Bonus oRthogonAl Token for Architecture Agnostic Textual Inversion

Abstract

Textual Inversion remains a popular method for personalizing diffusion models by teaching them new subjects and styles. We note that textual inversion has been underexplored with alternatives to the UNet, and we experiment with textual inversion using a vision transformer. We also seek to optimize textual inversion with a strategy that does not require explicit use of the UNet and its idiosyncratic layers, so we add bonus tokens and enforce orthogonality. We find that the bonus token improves adherence to the source images and that the vision transformer improves adherence to the prompt. Code is available at https://github.com/jamesBaker361/tex_inv_plus.
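
The abstract does not state how orthogonality between tokens is enforced. As a minimal sketch, one common choice is to penalize the squared cosine similarity between the embedding of the learned placeholder token and the embedding of the bonus token. The function names, the example vectors, and the choice of penalty below are all assumptions made for illustration and are not taken from the paper or its repository.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same length")
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_u * norm_v)


def orthogonality_penalty(u, v):
    """Hypothetical penalty: squared cosine similarity.

    It is 0 when the two embeddings are orthogonal and 1 when they are
    parallel or anti-parallel, so minimizing it pushes them apart.
    """
    return cosine_similarity(u, v) ** 2


# Toy 3-dimensional embeddings (assumed values, for illustration only).
placeholder_embedding = [1.0, 0.0, 0.0]
bonus_orthogonal = [0.0, 2.0, 0.0]
bonus_parallel = [3.0, 0.0, 0.0]

penalty_orthogonal = orthogonality_penalty(placeholder_embedding, bonus_orthogonal)
penalty_parallel = orthogonality_penalty(placeholder_embedding, bonus_parallel)
```

In a training loop, a term of this kind would typically be added to the usual denoising loss with a small weight. Because it depends only on the token embeddings, it would not require access to any UNet-specific layers, which is consistent with the architecture-agnostic goal described in the abstract.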
