SlideTailor: Personalized Presentation Slide Generation for Scientific Papers
December 23, 2025
Authors: Wenzheng Zeng, Mingyu Ouyang, Langyuan Cui, Hwee Tou Ng
cs.AI
Abstract
Automatic presentation slide generation can greatly streamline content creation. However, because user preferences vary, existing under-specified formulations often produce suboptimal results that fail to align with individual needs. We introduce a novel task that conditions paper-to-slides generation on user-specified preferences. We propose a human behavior-inspired agentic framework, SlideTailor, that progressively generates editable slides in a user-aligned manner. Instead of requiring users to write out their preferences in detailed text, our system asks only for a paper-slides example pair and a visual template: natural, easy-to-provide artifacts that implicitly encode rich user preferences across content and visual style. Despite the implicit and unlabeled nature of these inputs, our framework effectively distills and generalizes the preferences to guide customized slide generation. We also introduce a novel chain-of-speech mechanism that aligns slide content with planned oral narration. This design significantly enhances the quality of generated slides and enables downstream applications such as video presentations. To support this new task, we construct a benchmark dataset that captures diverse user preferences, with carefully designed interpretable metrics for robust evaluation. Extensive experiments demonstrate the effectiveness of our framework.