DreamStyler: Paint by Style Inversion with Text-to-Image Diffusion Models

Abstract

Recent progress in large-scale text-to-image models has yielded remarkable accomplishments, with various applications in the art domain. However, expressing the unique characteristics of an artwork (e.g., brushwork, color tone, or composition) with text prompts alone may run into limitations due to the inherent constraints of verbal description. To address this, we introduce DreamStyler, a novel framework for artistic image synthesis that is proficient in both text-to-image synthesis and style transfer. DreamStyler optimizes a multi-stage textual embedding with a context-aware text prompt, resulting in outstanding image quality. In addition, with content and style guidance, DreamStyler is flexible enough to accommodate a range of style references. Experimental results demonstrate its superior performance across multiple scenarios, suggesting promising potential for artistic product creation.
