VecFusion: Vector Font Generation with Diffusion

Abstract

We present VecFusion, a new neural architecture that can generate vector fonts with varying topological structures and precise control point positions. Our approach is a cascaded diffusion model consisting of a raster diffusion model followed by a vector diffusion model. The raster model generates low-resolution rasterized fonts with auxiliary control point information, capturing the global style and shape of the font, while the vector model synthesizes vector fonts conditioned on the low-resolution raster fonts from the first stage. To synthesize long and complex curves, our vector diffusion model uses a transformer architecture and a novel vector representation that enables the modeling of diverse vector geometry and the precise prediction of control points. Our experiments show that, in contrast to previous generative models for vector graphics, our new cascaded vector diffusion model generates higher-quality vector fonts with complex structures and diverse styles.
