LayoutPrompter: Awaken the Design Ability of Large Language Models

Abstract

Conditional graphic layout generation, which automatically maps user constraints to high-quality layouts, has attracted widespread attention. Although recent works have achieved promising performance, their lack of versatility and data efficiency hinders practical application. In this work, we propose LayoutPrompter, which leverages large language models (LLMs) to address these problems through in-context learning. LayoutPrompter consists of three key components: input-output serialization, dynamic exemplar selection, and layout ranking. The input-output serialization component carefully designs the input and output formats for each layout generation task. Dynamic exemplar selection chooses the most helpful prompting exemplars for a given input. Finally, a layout ranker picks the highest-quality layout from multiple LLM outputs. We conduct experiments on all existing layout generation tasks using four public datasets. Despite the simplicity of our approach, experimental results show that LayoutPrompter can compete with or even outperform state-of-the-art approaches on these tasks without any model training or fine-tuning, which demonstrates the effectiveness of this versatile, training-free approach. In addition, ablation studies show that LayoutPrompter is significantly superior to the training-based baseline in a low-data regime, further indicating its data efficiency. Our project is available at https://github.com/microsoft/LayoutGeneration/tree/main/LayoutPrompter.
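
The three components named in the abstract can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration and is not taken from the project: the `Element` type, the one-line-per-element text format, the label-overlap similarity used to select exemplars, and the total-overlap score used to rank candidates are all hypothetical stand-ins. The abstract does not specify the actual serialization formats, selection criterion, or ranking measure.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Element:
    """Hypothetical layout element: a label plus an axis-aligned box."""
    label: str
    left: int
    top: int
    width: int
    height: int


def serialize_constraint(labels):
    """Serialize the input constraint (here: element types) as one text line."""
    return "Element types: " + ", ".join(labels)


def serialize_layout(elements):
    """Serialize a layout as one text line per element (assumed format)."""
    return "\n".join(
        f"{e.label} {e.left} {e.top} {e.width} {e.height}" for e in elements
    )


def label_similarity(a, b):
    """Multiset Jaccard similarity between two lists of element labels."""
    ca, cb = Counter(a), Counter(b)
    union = sum((ca | cb).values())
    return sum((ca & cb).values()) / union if union else 1.0


def select_exemplars(query_labels, pool, k):
    """Pick the k pool layouts whose labels best match the query (assumed criterion)."""
    return sorted(
        pool,
        key=lambda layout: label_similarity(query_labels, [e.label for e in layout]),
        reverse=True,
    )[:k]


def build_prompt(query_labels, exemplars):
    """Concatenate serialized exemplars, then the query constraint to be completed."""
    parts = []
    for layout in exemplars:
        parts.append(serialize_constraint([e.label for e in layout]))
        parts.append(serialize_layout(layout))
        parts.append("")  # blank line between exemplars
    parts.append(serialize_constraint(query_labels))
    return "\n".join(parts)


def overlap_area(a, b):
    """Area of the intersection of two element boxes (0 if disjoint)."""
    w = min(a.left + a.width, b.left + b.width) - max(a.left, b.left)
    h = min(a.top + a.height, b.top + b.height) - max(a.top, b.top)
    return max(w, 0) * max(h, 0)


def total_overlap(layout):
    """Sum of pairwise overlap areas; lower is treated as better here."""
    return sum(
        overlap_area(a, b) for i, a in enumerate(layout) for b in layout[i + 1:]
    )


def rank_layouts(candidates):
    """Order candidate layouts from least to most overlap (stand-in quality score)."""
    return sorted(candidates, key=total_overlap)
```

In use, the prompt from `build_prompt` would be sent to an LLM several times, each response parsed back into a list of `Element` objects, and the first entry of `rank_layouts` kept. The LLM call and the parser are deliberately left out, since they depend on the model and format chosen.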
