PosterGen: Aesthetic-Aware Paper-to-Poster Generation via Multi-Agent LLMs

August 24, 2025
Authors: Zhilin Zhang, Xiang Zhang, Jiaqi Wei, Yiwei Xu, Chenyu You
cs.AI

Abstract

Multi-agent systems built on large language models (LLMs) have demonstrated remarkable capabilities in tackling complex compositional tasks. In this work, we apply this paradigm to paper-to-poster generation, a practical yet time-consuming task faced by researchers preparing for conferences. While recent approaches have attempted to automate this task, most neglect core design and aesthetic principles, resulting in posters that require substantial manual refinement. To address these design limitations, we propose PosterGen, a multi-agent framework that mirrors the workflow of professional poster designers. It consists of four collaborative specialized agents: (1) the Parser and Curator agents extract content from the paper and organize it into a storyboard; (2) the Layout agent maps the content into a coherent spatial layout; (3) the Stylist agents apply visual design elements such as color and typography; and (4) the Renderer composes the final poster. Together, these agents produce posters that are both semantically grounded and visually appealing. To evaluate design quality, we introduce a vision-language model (VLM)-based rubric that measures layout balance, readability, and aesthetic coherence. Experimental results show that PosterGen consistently matches existing methods in content fidelity and significantly outperforms them in visual design, generating posters that are presentation-ready with minimal human refinement.
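
The four-stage workflow described in the abstract can be pictured as a sequential pipeline in which each stage reads and extends a shared state. The sketch below is illustrative only and is not the authors' implementation: the names PosterState, parse_and_curate, lay_out, stylize, render, and generate_poster are hypothetical, and each stage is a trivial placeholder standing in for what would be an LLM-backed agent in the real system.

```python
from dataclasses import dataclass, field


@dataclass
class PosterState:
    """Shared state passed between stages (hypothetical structure)."""
    paper_text: str
    storyboard: list = field(default_factory=list)  # ordered section texts
    layout: dict = field(default_factory=dict)      # section index -> (column, row)
    style: dict = field(default_factory=dict)       # visual design choices
    rendered: str = ""                              # final text mock-up


def parse_and_curate(state: PosterState) -> PosterState:
    # Placeholder for the Parser and Curator agents:
    # treat every non-empty line of the paper as one storyboard section.
    state.storyboard = [ln.strip() for ln in state.paper_text.splitlines() if ln.strip()]
    return state


def lay_out(state: PosterState, columns: int = 3) -> PosterState:
    # Placeholder for the Layout agent:
    # fill a grid left to right, top to bottom.
    for i in range(len(state.storyboard)):
        state.layout[i] = (i % columns, i // columns)
    return state


def stylize(state: PosterState) -> PosterState:
    # Placeholder for the Stylist agents: fixed, assumed design values.
    state.style = {"font": "sans-serif", "accent_color": "#1f4e79"}
    return state


def render(state: PosterState) -> PosterState:
    # Placeholder for the Renderer: emit a plain-text mock-up of the poster.
    lines = [
        f"[col {col}, row {row}] {state.storyboard[i]}"
        for i, (col, row) in state.layout.items()
    ]
    state.rendered = "\n".join(lines)
    return state


# Stages run in the order given in the abstract.
PIPELINE = [parse_and_curate, lay_out, stylize, render]


def generate_poster(paper_text: str) -> PosterState:
    state = PosterState(paper_text=paper_text)
    for stage in PIPELINE:
        state = stage(state)
    return state
```

Calling generate_poster("Intro\nMethod\nResults\nConclusion") returns a state whose rendered field lists each section with its assumed grid position. Keeping every stage behind the same state-in, state-out signature is one simple way to let individual agents be swapped or re-run independently.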