YaART: Yet Another ART Rendering Technology

April 8, 2024
Authors: Sergey Kastryulin, Artem Konev, Alexander Shishenya, Eugene Lyapustin, Artem Khurshudov, Alexander Tselousov, Nikita Vinokurov, Denis Kuznedelev, Alexander Markovich, Grigoriy Livshits, Alexey Kirillov, Anastasiia Tabisheva, Liubov Chubarova, Marina Kaminskaia, Alexander Ustyuzhanin, Artemii Shvetsov, Daniil Shlenskii, Valerii Startsev, Dmitrii Kornilov, Mikhail Romanov, Artem Babenko, Sergei Ovcharenko, Valentin Khrulkov
cs.AI

Abstract

In the rapidly progressing field of generative models, the development of efficient, high-fidelity text-to-image diffusion systems represents a significant frontier. This study introduces YaART, a novel production-grade text-to-image cascaded diffusion model aligned to human preferences using Reinforcement Learning from Human Feedback (RLHF). During the development of YaART, we focus especially on the choice of model size and training dataset size, aspects that have not previously been investigated systematically for text-to-image cascaded diffusion models. In particular, we comprehensively analyze how these choices affect both the efficiency of the training process and the quality of the generated images, both of which are highly important in practice. Furthermore, we demonstrate that models trained on smaller datasets of higher-quality images can successfully compete with those trained on larger datasets, establishing a more efficient scenario for training diffusion models. From the quality perspective, YaART is consistently preferred by users over many existing state-of-the-art models.

