LeX-Art: Rethinking Text Generation via Scalable High-Quality Data Synthesis

Abstract

We introduce LeX-Art, a comprehensive suite for high-quality text-image synthesis that systematically bridges the gap between prompt expressiveness and text rendering fidelity. Our approach follows a data-centric paradigm: we construct a high-quality data synthesis pipeline based on Deepseek-R1 and use it to curate LeX-10K, a dataset of 10K high-resolution, aesthetically refined 1024×1024 images. Beyond dataset construction, we develop LeX-Enhancer, a robust prompt enrichment model, and train two text-to-image models, LeX-FLUX and LeX-Lumina, which achieve state-of-the-art text rendering performance. To systematically evaluate visual text generation, we introduce LeX-Bench, a benchmark that assesses fidelity, aesthetics, and alignment, complemented by Pairwise Normalized Edit Distance (PNED), a novel metric for robust text accuracy evaluation. Experiments demonstrate significant improvements: LeX-Lumina achieves a 79.81% PNED gain on CreateBench, and LeX-FLUX outperforms baselines in color (+3.18%), positional (+4.45%), and font accuracy (+3.81%). Our code, models, datasets, and demo are publicly available.
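
The abstract names PNED but does not define it. As a rough illustration of the general idea only, and not the paper's definition, the sketch below computes a normalized edit distance between word pairs and averages, for each target word, the best match among the words read back from an image. The function names, the best-match pairing rule, and the penalty of 1.0 for a word with no candidate are all assumptions made for this example.

```python
from typing import Sequence


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance using a rolling one-row dynamic program."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution
        prev = curr
    return prev[-1]


def normalized_edit_distance(a: str, b: str) -> float:
    """Edit distance divided by the longer string's length, in [0, 1]."""
    longest = max(len(a), len(b))
    return edit_distance(a, b) / longest if longest else 0.0


def pairwise_ned_sketch(target_words: Sequence[str],
                        ocr_words: Sequence[str]) -> float:
    """Hypothetical order-insensitive score (NOT the paper's PNED).

    Each target word is paired with its closest OCR word; a target word
    with no candidate at all is assumed to cost the maximum of 1.0.
    Lower is better.
    """
    if not target_words:
        return 0.0
    total = 0.0
    for target in target_words:
        total += min(
            (normalized_edit_distance(target, ocr) for ocr in ocr_words),
            default=1.0,
        )
    return total / len(target_words)
```

Because each target word is scored against every OCR word, the result does not depend on the order in which the words are detected, which is the property a pairwise formulation is meant to provide.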
