LayoutNUWA: Revealing the Hidden Layout Expertise of Large Language Models

September 18, 2023
Authors: Zecheng Tang, Chenfei Wu, Juntao Li, Nan Duan
cs.AI

Abstract

Graphic layout generation, a growing research field, plays a significant role in user engagement and information perception. Existing methods primarily treat layout generation as a numerical optimization task, focusing on quantitative aspects while overlooking the semantic information of a layout, such as the relationships between layout elements. In this paper, we propose LayoutNUWA, the first model that treats layout generation as a code generation task to enhance semantic information and harness the hidden layout expertise of large language models (LLMs). More concretely, we develop a Code Instruct Tuning (CIT) approach comprising three interconnected modules: 1) the Code Initialization (CI) module quantifies the numerical conditions and initializes them as HTML code with strategically placed masks; 2) the Code Completion (CC) module employs the formatting knowledge of LLMs to fill in the masked portions within the HTML code; 3) the Code Rendering (CR) module transforms the completed code into the final layout output, ensuring a highly interpretable and transparent layout generation procedure that directly maps code to a visualized layout. We attain significant state-of-the-art performance (even over 50% improvements) on multiple datasets, showcasing the strong capabilities of LayoutNUWA. Our code is available at https://github.com/ProjectNUWA/LayoutNUWA.
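
The three CIT modules described above can be pictured with a minimal Python sketch. It is an illustration under stated assumptions, not the authors' implementation: the `<M>` mask token, the `<div>` template, the function names, and the pixel-rounding scheme are all hypothetical, and the `predict` callable is a stand-in for the LLM that the paper uses for completion.

```python
import re
from typing import Callable, Dict, List, Optional

MASK = "<M>"  # hypothetical mask token; the paper's actual token is not given here
KEYS = ("left", "top", "width", "height")
DIV_RE = re.compile(r'<div class="([^"]+)" style="([^"]*)"></div>')


def quantize(value: float, size: int) -> int:
    """Map a normalized coordinate in [0, 1] to an integer pixel position."""
    return min(size, max(0, round(value * size)))


def code_initialization(elements: List[Dict[str, Optional[float]]],
                        width: int, height: int) -> str:
    """CI (sketch): write known numeric conditions as HTML, mask the unknown ones."""
    sizes = {"left": width, "top": height, "width": width, "height": height}
    rows = []
    for el in elements:
        decls = []
        for key in KEYS:
            value = el.get(key)
            text = MASK if value is None else f"{quantize(value, sizes[key])}px"
            decls.append(f"{key}:{text}")
        style = ";".join(decls)
        category = el["category"]
        rows.append(f'<div class="{category}" style="{style}"></div>')
    return "\n".join(rows)


def code_completion(html: str, predict: Callable[[str], str]) -> str:
    """CC (sketch): fill masks left to right; `predict` stands in for the LLM."""
    while MASK in html:
        prefix = html[: html.index(MASK)]
        html = html.replace(MASK, predict(prefix), 1)
    return html


def code_rendering(html: str) -> List[Dict[str, object]]:
    """CR (sketch): parse the completed HTML back into layout boxes."""
    boxes = []
    for category, style in DIV_RE.findall(html):
        box: Dict[str, object] = {"category": category}
        for decl in style.split(";"):
            key, value = decl.split(":")
            box[key] = int(value.replace("px", ""))
        boxes.append(box)
    return boxes


# Usage: one element with an unknown height on a 120x160 canvas.
elements = [{"category": "title", "left": 0.25, "top": 0.1,
             "width": 0.5, "height": None}]
masked = code_initialization(elements, width=120, height=160)
completed = code_completion(masked, predict=lambda prefix: "20px")  # dummy "model"
print(code_rendering(completed))
```

Because the intermediate representation is plain HTML, each stage can be inspected as text, which is the interpretability property the abstract emphasizes.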