SlideTailor: Personalized Presentation Slide Generation for Scientific Papers
December 23, 2025
Authors: Wenzheng Zeng, Mingyu Ouyang, Langyuan Cui, Hwee Tou Ng
cs.AI
Abstract
Automatic presentation slide generation can greatly streamline content creation. However, because preferences vary from user to user, existing under-specified formulations often lead to suboptimal results that fail to align with individual user needs. We introduce a novel task that conditions paper-to-slides generation on user-specified preferences. We propose a human behavior-inspired agentic framework, SlideTailor, that progressively generates editable slides in a user-aligned manner. Instead of requiring users to write their preferences in detailed textual form, our system asks only for a paper-slides example pair and a visual template: natural, easy-to-provide artifacts that implicitly encode rich user preferences across content and visual style. Despite the implicit and unlabeled nature of these inputs, our framework effectively distills and generalizes the preferences to guide customized slide generation. We also introduce a novel chain-of-speech mechanism to align slide content with planned oral narration. This design significantly enhances the quality of generated slides and enables downstream applications such as video presentations. To support this new task, we construct a benchmark dataset that captures diverse user preferences, with carefully designed interpretable metrics for robust evaluation. Extensive experiments demonstrate the effectiveness of our framework.