FontStudio: Shape-Adaptive Diffusion Model for Coherent and Consistent Font Effect Generation
June 12, 2024
Authors: Xinzhi Mu, Li Chen, Bohan Chen, Shuyang Gu, Jianmin Bao, Dong Chen, Ji Li, Yuhui Yuan
cs.AI
Abstract
Recently, the application of modern diffusion-based text-to-image generation
models for creating artistic fonts, traditionally the domain of professional
designers, has garnered significant interest. Diverging from the majority of
existing studies that concentrate on generating artistic typography, our
research aims to tackle a novel and more demanding challenge: the generation of
text effects for multilingual fonts. This task essentially requires generating
coherent and consistent visual content within the confines of a font-shaped
canvas, as opposed to a traditional rectangular canvas. To address this task,
we introduce a novel shape-adaptive diffusion model capable of interpreting the
given shape and strategically planning pixel distributions within the irregular
canvas. To achieve this, we curate a high-quality shape-adaptive image-text
dataset and incorporate the segmentation mask as a visual condition to steer
the image generation process within the irregular canvas. This approach enables
a diffusion model conventionally tied to a rectangular canvas to produce the desired
concepts in accordance with the provided geometric shapes. In addition, to maintain
consistency across multiple letters, we also present a training-free,
shape-adaptive effect transfer method for transferring textures from a
generated reference letter to others. The key insights are building a font
effect noise prior and propagating the font effect information in a
concatenated latent space. The efficacy of our FontStudio system is confirmed
through user preference studies, which show a marked preference (a 78% win rate
on aesthetics) for our system even when compared to the latest unrivaled
commercial product, Adobe Firefly.
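
The abstract names three ingredients without giving their details: a shape mask that confines content to an irregular canvas, a font effect noise prior, and a concatenated latent space. The sketch below is only an illustration of how such pieces could look on toy 2D grids. Everything in it is an assumption rather than the paper's method: the function names `apply_shape_mask`, `noise_prior`, and `concat_latents` are hypothetical, the prior is modeled with the standard forward-diffusion formula sqrt(a) * x0 + sqrt(1 - a) * eps, and concatenation is taken to be side by side along the width.

```python
import math
import random


def apply_shape_mask(latent, mask, fill=0.0):
    """Keep values inside the shape (mask == 1); replace the rest with `fill`."""
    return [[v if m else fill for v, m in zip(lrow, mrow)]
            for lrow, mrow in zip(latent, mask)]


def noise_prior(ref_latent, alpha_bar, rng):
    """Assumed prior: forward-diffuse the reference latent to noise level alpha_bar.

    Uses the standard formula sqrt(alpha_bar) * x0 + sqrt(1 - alpha_bar) * eps,
    with eps drawn from a unit Gaussian.
    """
    a = math.sqrt(alpha_bar)
    b = math.sqrt(1.0 - alpha_bar)
    return [[a * v + b * rng.gauss(0.0, 1.0) for v in row] for row in ref_latent]


def concat_latents(ref, tgt):
    """Assumed layout: reference and target latents side by side along the width."""
    return [r + t for r, t in zip(ref, tgt)]


# Toy demo: a 2x2 "reference letter" latent and a target glyph mask.
rng = random.Random(0)
reference = [[1.0, 1.0], [1.0, 1.0]]
target_mask = [[1, 0], [1, 1]]

# Start the target from a noised copy of the reference, restricted to its shape.
target_init = apply_shape_mask(noise_prior(reference, 0.5, rng), target_mask)

# Joint canvas on which a denoiser could see both letters at once.
joint = concat_latents(reference, target_init)
```

In this toy layout the joint canvas has the same height as each letter and twice the width, so a model operating on it can attend to the reference while denoising the target. Whether FontStudio uses this exact arrangement is not stated in the abstract.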