LayoutNUWA: Revealing the Hidden Layout Expertise of Large Language Models

September 18, 2023
Authors: Zecheng Tang, Chenfei Wu, Juntao Li, Nan Duan
cs.AI

Abstract

Graphic layout generation, a growing research field, plays a significant role in user engagement and information perception. Existing methods primarily treat layout generation as a numerical optimization task, focusing on quantitative aspects while overlooking the semantic information of a layout, such as the relationships between layout elements. In this paper, we propose LayoutNUWA, the first model that treats layout generation as a code generation task in order to enhance semantic information and harness the hidden layout expertise of large language models (LLMs). More concretely, we develop a Code Instruct Tuning (CIT) approach comprising three interconnected modules: 1) the Code Initialization (CI) module quantifies the numerical conditions and initializes them as HTML code with strategically placed masks; 2) the Code Completion (CC) module employs the formatting knowledge of LLMs to fill in the masked portions of the HTML code; 3) the Code Rendering (CR) module transforms the completed code into the final layout output, ensuring a highly interpretable and transparent layout generation procedure that maps code directly to a visualized layout. We attain significant state-of-the-art performance (with improvements of over 50% in some cases) on multiple datasets, showcasing the strong capabilities of LayoutNUWA. Our code is available at https://github.com/ProjectNUWA/LayoutNUWA.
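The three modules described in the abstract can be pictured with a minimal sketch. Everything below is an illustrative assumption rather than the authors' implementation: the `<M>` mask token, the `<rect>` tag template, rounding to a pixel grid as the way numerical conditions are discretized, and the function names `code_initialization`, `code_completion`, and `code_rendering` are all hypothetical. In particular, the `fill` callable merely stands in for the fine-tuned LLM that would predict the masked values.

```python
import re
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple

MASK = "<M>"  # hypothetical mask token; the real token may differ


@dataclass
class Element:
    """One layout element; coordinates are normalized to [0, 1].

    A value of None means "unknown, to be generated".
    """
    category: str
    x: Optional[float] = None
    y: Optional[float] = None
    w: Optional[float] = None
    h: Optional[float] = None


def quantize(value: Optional[float], size: int) -> str:
    """Map a normalized value to an integer pixel, or to MASK if unknown."""
    return MASK if value is None else str(round(value * size))


def code_initialization(elements: List[Element], width: int, height: int) -> str:
    """CI (sketch): turn numerical conditions into HTML with masks."""
    rows = [
        f'<rect data-category="{e.category}" '
        f'x="{quantize(e.x, width)}" y="{quantize(e.y, height)}" '
        f'width="{quantize(e.w, width)}" height="{quantize(e.h, height)}"/>'
        for e in elements
    ]
    body = "\n".join(rows)
    return f'<svg width="{width}" height="{height}">\n{body}\n</svg>'


def code_completion(masked_html: str, fill: Callable[[], int]) -> str:
    """CC (sketch): fill each mask left to right; `fill` stands in for the LLM."""
    return re.sub(re.escape(MASK), lambda _m: str(fill()), masked_html)


# Matches only fully completed <rect> tags (no masks left).
RECT = re.compile(
    r'<rect data-category="([^"]+)" x="(\d+)" y="(\d+)" '
    r'width="(\d+)" height="(\d+)"/>'
)


def code_rendering(html: str) -> List[Tuple[str, int, int, int, int]]:
    """CR (sketch): map completed code back to (category, x, y, w, h) boxes."""
    return [
        (cat, int(x), int(y), int(w), int(h))
        for cat, x, y, w, h in RECT.findall(html)
    ]
```

In this sketch, an element with known geometry is written out as integers, while an element given only a category produces four masks. Passing a `fill` such as `lambda: next(values)` over an iterator of integers completes the code, and `code_rendering` then recovers one box per element. Because the intermediate representation is plain markup, each stage can be inspected directly, which is the interpretability property the abstract emphasizes.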