ShowTable: Unlocking Creative Table Visualization with Collaborative Reflection and Refinement

December 15, 2025
Authors: Zhihang Liu, Xiaoyi Bao, Pandeng Li, Junjie Zhou, Zhaohe Liao, Yefei He, Kaixun Jiang, Chen-Wei Xie, Yun Zheng, Hongtao Xie
cs.AI

Abstract

While existing generation and unified models excel at general image generation, they struggle with tasks that require deep reasoning, planning, and precise data-to-visual mapping beyond general scenarios. To push past these limitations, we introduce a new and challenging task: creative table visualization, which requires a model to generate an infographic that faithfully and aesthetically visualizes the data in a given table. To address this challenge, we propose ShowTable, a pipeline that synergizes multimodal large language models (MLLMs) with diffusion models via a progressive self-correcting process. The MLLM acts as the central orchestrator, reasoning about the visual plan and judging visual errors to provide refined instructions, while the diffusion model executes the MLLM's commands to achieve high-fidelity results. To support this task and our pipeline, we introduce three automated data construction pipelines for training the different modules. Furthermore, we introduce TableVisBench, a new benchmark with 800 challenging instances across 5 evaluation dimensions, to assess performance on this task. Experiments demonstrate that our pipeline, instantiated with different models, significantly outperforms baselines, highlighting its effective multimodal reasoning, generation, and error-correction capabilities.
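The plan, generate, judge, and refine loop described in the abstract can be sketched roughly as follows. This is only an illustrative sketch, not the authors' implementation: the function `visualize_table`, the `Verdict` type, the `max_rounds` stopping rule, and the string-based stand-ins for the MLLM and the diffusion model are all hypothetical names and assumptions introduced here.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Tuple


@dataclass
class Verdict:
    """Hypothetical judge output: whether the image is acceptable, plus a fix instruction."""
    ok: bool
    instruction: str = ""


def visualize_table(
    table: str,
    plan: Callable[[str], str],                      # MLLM role: table -> visual plan
    generate: Callable[[str, Optional[str]], str],   # diffusion role: (prompt, previous image) -> image
    judge: Callable[[str, str], Verdict],            # MLLM role: (table, image) -> verdict
    max_rounds: int = 3,                             # assumed stopping rule, not from the paper
) -> Tuple[str, int]:
    """Return the final image and the number of refinement rounds that were applied."""
    image = generate(plan(table), None)              # initial draft from the visual plan
    for round_idx in range(max_rounds):
        verdict = judge(table, image)                # reflection: look for visual errors
        if verdict.ok:
            return image, round_idx
        image = generate(verdict.instruction, image)  # refinement: apply the instruction
    return image, max_rounds


# Toy stand-ins so the sketch runs without any model; images are plain strings here.
def toy_plan(table: str) -> str:
    return f"bar chart of {table}"


def toy_generate(prompt: str, previous: Optional[str]) -> str:
    return "draft" if previous is None else previous + "+fix"


def toy_judge(table: str, image: str) -> Verdict:
    # Pretend the image is acceptable once two fixes have been applied.
    if image.count("+fix") >= 2:
        return Verdict(ok=True)
    return Verdict(ok=False, instruction="correct the axis labels")


result = visualize_table("sales by region", toy_plan, toy_generate, toy_judge)
```

In a real instantiation, the three callables would wrap an MLLM and a diffusion model; keeping them as plain function arguments mirrors the abstract's remark that the pipeline can be instantiated with different models.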
December 18, 2025