
Presenting a Paper is an Art: Self-Improvement Aesthetic Agents for Academic Presentations

October 7, 2025
Authors: Chengzhi Liu, Yuzhe Yang, Kaiwen Zhou, Zhen Zhang, Yue Fan, Yannan Xie, Peng Qi, Xin Eric Wang
cs.AI

Abstract

The promotion of academic papers has become an important means of enhancing research visibility. However, existing automated methods struggle with limited storytelling, insufficient aesthetic quality, and constrained self-adjustment, making it difficult to achieve efficient and engaging dissemination. At the heart of these challenges is a simple principle: you cannot improve what you cannot evaluate accurately. To address this, we introduce EvoPresent, a self-improvement agent framework that unifies coherent narratives, aesthetic-aware designs, and realistic presentation delivery via virtual characters. Central to EvoPresent is PresAesth, a multi-task reinforcement learning (RL) aesthetic model that provides reliable aesthetic scoring, defect adjustment, and comparative feedback, enabling iterative self-improvement even with limited aesthetic training data. To systematically evaluate these methods, we introduce the EvoPresent Benchmark, a comprehensive benchmark with two parts: Presentation Generation Quality, built on 650 top-tier AI conference papers with multimodal resources (slides, videos, and scripts) to assess both content and design; and Aesthetic Awareness, consisting of 2,000 slide pairs with varying aesthetic levels, supporting joint training and evaluation on scoring, defect adjustment, and comparison. Our findings highlight that (i) high-quality feedback is essential for agent self-improvement, while initial capability alone does not guarantee effective self-correction; (ii) automated generation pipelines exhibit a trade-off between visual design and content construction; and (iii) multi-task RL training shows stronger generalization in aesthetic awareness tasks.