

Fine-T2I: An Open, Large-Scale, and Diverse Dataset for High-Quality T2I Fine-Tuning

February 10, 2026
Authors: Xu Ma, Yitian Zhang, Qihua Dong, Yun Fu
cs.AI

Abstract

High-quality and open datasets remain a major bottleneck for text-to-image (T2I) fine-tuning. Despite rapid progress in model architectures and training pipelines, most publicly available fine-tuning datasets suffer from low resolution, poor text-image alignment, or limited diversity, resulting in a clear performance gap between open research models and enterprise-grade models. In this work, we present Fine-T2I, a large-scale, high-quality, and fully open dataset for T2I fine-tuning. Fine-T2I spans 10 task combinations, 32 prompt categories, 11 visual styles, and 5 prompt templates, and combines synthetic images generated by strong modern models with carefully curated real images from professional photographers. All samples are rigorously filtered for text-image alignment, visual fidelity, and prompt quality, with over 95% of initial candidates removed. The final dataset contains over 6 million text-image pairs, around 2 TB on disk, approaching the scale of pretraining datasets while maintaining fine-tuning-level quality. Across a diverse set of pretrained diffusion and autoregressive models, fine-tuning on Fine-T2I consistently improves both generation quality and instruction adherence, as validated by human evaluation, visual comparison, and automatic metrics. We release Fine-T2I under an open license to help close the data gap in T2I fine-tuning for the open community.
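The abstract states that candidate pairs are filtered for text-image alignment, visual fidelity, and prompt quality, but does not describe the filtering implementation. Below is a minimal, illustrative sketch of one common way to score text-image alignment with an off-the-shelf CLIP model; the model name, `clip_alignment_score` helper, and threshold are assumptions for illustration and are not taken from the paper.

```python
# Illustrative sketch only: the paper does not specify its filtering pipeline.
# Scores text-image alignment with a public CLIP model; pairs below a chosen
# threshold would be dropped. Model checkpoint and cutoff value are assumptions.
import torch
from PIL import Image
from transformers import CLIPModel, CLIPProcessor

model = CLIPModel.from_pretrained("openai/clip-vit-base-patch32")
processor = CLIPProcessor.from_pretrained("openai/clip-vit-base-patch32")
model.eval()

def clip_alignment_score(image_path: str, prompt: str) -> float:
    """Return CLIP's image-text similarity (cosine similarity scaled by the logit scale)."""
    image = Image.open(image_path).convert("RGB")
    inputs = processor(text=[prompt], images=image, return_tensors="pt", padding=True)
    with torch.no_grad():
        outputs = model(**inputs)
    return outputs.logits_per_image.item()

# Hypothetical usage: keep only well-aligned pairs.
THRESHOLD = 25.0  # assumed cutoff; a real pipeline would calibrate this empirically
pairs = [("img_0001.jpg", "a watercolor painting of a lighthouse at dusk")]
kept = [(img, txt) for img, txt in pairs if clip_alignment_score(img, txt) >= THRESHOLD]
```

In practice, alignment scoring like this would be only one stage of the multi-criteria filtering the abstract describes, alongside checks on visual fidelity and prompt quality.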