OneIG-Bench: Omni-dimensional Nuanced Evaluation for Image Generation
June 9, 2025
Authors: Jingjing Chang, Yixiao Fang, Peng Xing, Shuhan Wu, Wei Cheng, Rui Wang, Xianfang Zeng, Gang Yu, Hai-Bao Chen
cs.AI
Abstract
Text-to-image (T2I) models have garnered significant attention for generating
high-quality images aligned with text prompts. However, the rapid advancement of
T2I models has exposed the limitations of early benchmarks, which lack
comprehensive evaluation of, for example, reasoning, text rendering, and style.
Notably, recent state-of-the-art models, with their rich knowledge-modeling
capabilities, show promising results on image generation problems that require
strong reasoning ability, yet existing evaluation systems have not adequately
addressed this frontier. To systematically address these gaps, we
introduce OneIG-Bench, a meticulously designed comprehensive benchmark
framework for fine-grained evaluation of T2I models across multiple dimensions,
including prompt-image alignment, text rendering precision, reasoning-generated
content, stylization, and diversity. By structuring the evaluation, this
benchmark enables in-depth analysis of model performance, helping researchers
and practitioners pinpoint strengths and bottlenecks in the full pipeline of
image generation. Specifically, OneIG-Bench enables flexible evaluation by
allowing users to focus on a particular evaluation subset. Instead of
generating images for the entire set of prompts, users can generate images only
for the prompts associated with the selected dimension and then run the
corresponding evaluation. Our codebase and dataset are now publicly
available to facilitate reproducible evaluation studies and cross-model
comparisons within the T2I research community.
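
As a minimal sketch of the dimension-specific evaluation workflow described above: the prompt file layout (a CSV with `prompt` and `dimension` columns), the function names, and the user-supplied `generate` and `score` callables are all illustrative assumptions, not the actual OneIG-Bench API or data format.

```python
import csv

def load_prompts(path, dimension):
    # Hypothetical prompt file: one row per prompt, tagged with the evaluation
    # dimension it belongs to (e.g. "alignment", "text", "reasoning", "style",
    # "diversity"). Column names are assumptions for illustration only.
    with open(path, newline="", encoding="utf-8") as f:
        return [row["prompt"] for row in csv.DictReader(f)
                if row["dimension"] == dimension]

def evaluate_subset(path, dimension, generate, score):
    """Generate and score images only for the selected dimension.

    `generate` maps a prompt to an image; `score` maps (prompt, image) to a
    metric value. Both are stand-ins for a real T2I model and a
    dimension-specific metric supplied by the user.
    """
    prompts = load_prompts(path, dimension)
    results = [score(p, generate(p)) for p in prompts]
    return sum(results) / len(results) if results else float("nan")
```

In this sketch, only the prompts tagged with the chosen dimension are ever passed to the generator, which mirrors the benchmark's stated goal of letting users skip image generation for dimensions they do not intend to evaluate.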