DEsignBench: Exploring and Benchmarking DALL-E 3 for Imagining Visual Design

October 23, 2023
Authors: Kevin Lin, Zhengyuan Yang, Linjie Li, Jianfeng Wang, Lijuan Wang
cs.AI

Abstract

We introduce DEsignBench, a text-to-image (T2I) generation benchmark tailored for visual design scenarios. Recent T2I models such as DALL-E 3 have demonstrated remarkable capabilities in generating photorealistic images that align closely with textual inputs. While the allure of creating visually captivating images is undeniable, our emphasis extends beyond mere aesthetic pleasure: we aim to investigate the potential of using these powerful models in authentic design contexts. In pursuit of this goal, we develop DEsignBench, which incorporates test samples designed to assess T2I models on both "design technical capability" and "design application scenario." Each of these two dimensions is supported by a diverse set of specific design categories. We explore DALL-E 3 together with other leading T2I models on DEsignBench, resulting in a comprehensive visual gallery for side-by-side comparisons. For DEsignBench benchmarking, we perform human evaluations on the generated images in the DEsignBench gallery against the criteria of image-text alignment, visual aesthetics, and design creativity. Our evaluation also considers other specialized design capabilities, including text rendering, layout composition, color harmony, 3D design, and medium style. In addition to human evaluations, we introduce the first automatic image generation evaluator powered by GPT-4V. This evaluator provides ratings that align well with human judgments, while being easily replicable and cost-efficient. A high-resolution version of the paper is available at https://github.com/design-bench/design-bench.github.io/raw/main/designbench.pdf?download=
