ViStoryBench: Comprehensive Benchmark Suite for Story Visualization

Abstract

Story visualization, which aims to generate a sequence of visually coherent images aligning with a given narrative and reference images, has seen significant progress with recent advancements in generative models. To further enhance the performance of story visualization frameworks in real-world scenarios, we introduce a comprehensive evaluation benchmark, ViStoryBench. We collect a diverse dataset encompassing various story types and artistic styles, ensuring models are evaluated across multiple dimensions such as different plots (e.g., comedy, horror) and visual aesthetics (e.g., anime, 3D renderings). ViStoryBench is carefully curated to balance narrative structures and visual elements, featuring stories with single and multiple protagonists to test models' ability to maintain character consistency. Additionally, it includes complex plots and intricate world-building to challenge models in generating accurate visuals. To ensure comprehensive comparisons, our benchmark incorporates a wide range of evaluation metrics assessing critical aspects. This structured and multifaceted framework enables researchers to thoroughly identify both the strengths and weaknesses of different models, fostering targeted improvements.