Illustrious: an Open Advanced Illustration Model
September 30, 2024
Authors: Sang Hyun Park, Jun Young Koh, Junha Lee, Joy Song, Dongha Kim, Hoyeon Moon, Hyunju Lee, Min Song
cs.AI
Abstract
In this work, we share insights for achieving state-of-the-art quality in
our text-to-image anime image generation model, Illustrious. To achieve
high-resolution images with dynamic color range and high restoration ability,
we focus on three key approaches to model improvement. First, we examine
the significance of batch size and dropout control, which enable faster
learning of controllable, token-based concept activations. Second, we increase
the training resolution of images, which affects how accurately character
anatomy is depicted at much higher resolutions and, with proper methods,
extends generation capability beyond 20MP. Finally, we propose refined
multi-level captions, covering all tags and various natural language captions,
as a critical factor for model development. Through extensive analysis and
experiments, Illustrious demonstrates state-of-the-art performance in
animation style, outperforming widely used models in the illustration domain
and, as an open-source model, enabling easier customization and
personalization. We plan to publicly release updated models in the Illustrious
series sequentially, along with sustainable plans for improvement.