Illustrious: an Open Advanced Illustration Model

Abstract

In this work, we share insights for achieving state-of-the-art quality with Illustrious, our text-to-image anime image generation model. To achieve high-resolution images with a dynamic color range and high restoration ability, we focus on three approaches to model improvement. First, we examine the significance of batch size and dropout control, which enable faster learning of controllable, token-based concept activations. Second, we increase the training image resolution, which improves the accurate depiction of character anatomy at much higher resolutions and, with proper methods, extends generation capability beyond 20 MP. Finally, we propose refined multi-level captions, covering all tags and various natural language captions, as a critical factor in model development. Through extensive analysis and experiments, Illustrious demonstrates state-of-the-art performance in anime style, outperforming widely used models in the illustration domain, while its open-source nature makes customization and personalization easier. We plan to release updated models in the Illustrious series publicly and sequentially, along with sustainable plans for improvement.
