Calligrapher: Freestyle Text Image Customization
June 30, 2025
Authors: Yue Ma, Qingyan Bai, Hao Ouyang, Ka Leong Cheng, Qiuyu Wang, Hongyu Liu, Zichen Liu, Haofan Wang, Jingye Chen, Yujun Shen, Qifeng Chen
cs.AI
Abstract
We introduce Calligrapher, a diffusion-based framework that integrates text
customization with artistic typography for digital calligraphy and design
applications. To address the challenges of precise style control and data
dependency in typographic customization, the framework makes three key
technical contributions. First, we develop a self-distillation mechanism that
uses the pre-trained text-to-image generative model itself, together with a
large language model, to automatically construct a style-centric typography
benchmark. Second, we introduce a localized style injection framework built on
a trainable style encoder, comprising a Qformer and linear layers, that
extracts robust style features from reference images. In addition, an
in-context generation mechanism embeds reference images directly into the
denoising process, further refining alignment with the target style. Extensive
quantitative and qualitative evaluations across diverse fonts and design
contexts confirm that Calligrapher accurately reproduces intricate stylistic
details and positions glyphs precisely. By automating high-quality, visually
consistent typography, Calligrapher surpasses traditional models and supports
creative practitioners in digital art, branding, and contextual typographic
design.
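
The style encoder described above (a Qformer followed by linear layers) can be illustrated with a rough sketch. The code below is not the authors' implementation: it assumes a single cross-attention layer standing in for the Qformer, random untrained weights, and hypothetical names and sizes (`ToyStyleEncoder`, `feat_dim`, `num_queries`, `out_dim`). It only shows how a fixed set of learnable queries can pool reference-image patch features into a fixed number of style tokens.

```python
import numpy as np


def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


class ToyStyleEncoder:
    """Hypothetical stand-in for a Qformer-plus-linear style encoder.

    Assumption: one cross-attention layer with random (untrained) weights;
    the real model would be deeper and trained end to end.
    """

    def __init__(self, feat_dim=32, num_queries=4, out_dim=16, seed=0):
        rng = np.random.default_rng(seed)
        scale = 1.0 / np.sqrt(feat_dim)
        # Learnable query tokens (here just randomly initialised).
        self.queries = rng.normal(size=(num_queries, feat_dim)) * scale
        # Projections for the cross-attention step.
        self.w_q = rng.normal(size=(feat_dim, feat_dim)) * scale
        self.w_k = rng.normal(size=(feat_dim, feat_dim)) * scale
        self.w_v = rng.normal(size=(feat_dim, feat_dim)) * scale
        # Final linear layer mapping pooled features to style tokens.
        self.w_out = rng.normal(size=(feat_dim, out_dim)) * scale
        self.b_out = np.zeros(out_dim)

    def __call__(self, patch_feats):
        # patch_feats: (num_patches, feat_dim) features of the reference image.
        q = self.queries @ self.w_q
        k = patch_feats @ self.w_k
        v = patch_feats @ self.w_v
        # Each query attends over all patches: (num_queries, num_patches).
        attn = softmax(q @ k.T / np.sqrt(q.shape[-1]), axis=-1)
        pooled = attn @ v
        return pooled @ self.w_out + self.b_out


encoder = ToyStyleEncoder()
patches = np.random.default_rng(1).normal(size=(49, 32))  # e.g. a 7x7 patch grid
style_tokens = encoder(patches)
print(style_tokens.shape)  # (4, 16)
```

In this sketch the number of style tokens is set by the number of queries rather than by the size of the reference image, so references with different patch counts yield outputs of the same shape.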