LayoutPrompter: Awaken the Design Ability of Large Language Models

Abstract

Conditional graphic layout generation, which automatically maps user constraints to high-quality layouts, has attracted widespread attention. Although recent work has achieved promising performance, a lack of versatility and data efficiency hinders practical application. In this work, we propose LayoutPrompter, which leverages large language models (LLMs) to address these problems through in-context learning. LayoutPrompter consists of three key components: input-output serialization, dynamic exemplar selection, and layout ranking. Specifically, the input-output serialization component designs the input and output formats for each layout generation task. Dynamic exemplar selection chooses the most helpful prompting exemplars for a given input, and a layout ranker picks the highest-quality layout from multiple LLM outputs. We conduct experiments on all existing layout generation tasks using four public datasets. Despite the simplicity of our approach, the results show that LayoutPrompter can compete with or even outperform state-of-the-art approaches on these tasks without any model training or fine-tuning, demonstrating the effectiveness of this versatile, training-free approach. In addition, ablation studies show that LayoutPrompter is significantly superior to the training-based baseline in a low-data regime, further indicating its data efficiency. Our project is available at https://github.com/microsoft/LayoutGeneration/tree/main/LayoutPrompter.
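
To make the three components concrete, here is a minimal Python sketch of how serialization, exemplar selection, and ranking could fit together. The abstract does not specify the serialization format, the similarity measure, or the ranking criterion, so everything here is an illustrative assumption rather than the project's actual implementation: the `Element` type, the text format, the label-based similarity, and the overlap-based ranking are all hypothetical stand-ins. The LLM call itself is omitted.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Element:
    """One layout element: a type label plus a bounding box (hypothetical schema)."""
    label: str
    x: int
    y: int
    w: int
    h: int


def serialize_constraint(labels):
    """Turn an element-type constraint into prompt text (assumed format)."""
    return "Element types: " + ", ".join(sorted(labels))


def serialize_layout(layout):
    """Turn a layout into one text line per element (assumed format)."""
    return "\n".join(f"{e.label} {e.x} {e.y} {e.w} {e.h}" for e in layout)


def label_similarity(a, b):
    """Multiset Jaccard similarity between two label lists (stand-in measure)."""
    ca, cb = Counter(a), Counter(b)
    union = sum((ca | cb).values())
    return sum((ca & cb).values()) / union if union else 0.0


def select_exemplars(query_labels, pool, k):
    """Pick the k layouts in the pool whose labels best match the query."""
    def score(layout):
        return label_similarity(query_labels, [e.label for e in layout])
    return sorted(pool, key=score, reverse=True)[:k]


def build_prompt(query_labels, exemplars):
    """Concatenate serialized exemplars, then the serialized query constraint."""
    blocks = [
        serialize_constraint([e.label for e in ex]) + "\n" + serialize_layout(ex)
        for ex in exemplars
    ]
    blocks.append(serialize_constraint(query_labels))
    return "\n\n".join(blocks)


def overlap_area(a, b):
    """Area of the intersection of two element boxes (0 if they are disjoint)."""
    dx = min(a.x + a.w, b.x + b.w) - max(a.x, b.x)
    dy = min(a.y + a.h, b.y + b.h) - max(a.y, b.y)
    return max(dx, 0) * max(dy, 0)


def total_overlap(layout):
    """Sum of pairwise overlap areas; lower is treated as better here."""
    return sum(
        overlap_area(a, b)
        for i, a in enumerate(layout)
        for b in layout[i + 1:]
    )


def rank_layouts(candidates):
    """Return the candidate with the least overlap (stand-in quality score)."""
    return min(candidates, key=total_overlap)
```

In use, `select_exemplars` would choose a few layouts from a training pool, `build_prompt` would assemble the in-context prompt, the LLM would be sampled several times, and `rank_layouts` would pick one of the parsed outputs. Any real system would likely use task-specific similarity and quality measures in place of the ones assumed here.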
