Large-Scale Automatic Audiobook Creation

September 7, 2023
Authors: Brendan Walsh, Mark Hamilton, Greg Newby, Xi Wang, Serena Ruan, Sheng Zhao, Lei He, Shaofei Zhang, Eric Dettinger, William T. Freeman, Markus Weimer
cs.AI

Abstract

Audiobooks can dramatically improve a work of literature's accessibility and reader engagement. However, an audiobook can take hundreds of hours of human effort to create, edit, and publish. In this work, we present a system that can automatically generate high-quality audiobooks from online e-books. In particular, we leverage recent advances in neural text-to-speech to create and release thousands of human-quality, open-license audiobooks from the Project Gutenberg e-book collection. Our method can identify the proper subset of e-book content to read for a wide collection of diversely structured books and can operate on hundreds of books in parallel. Our system allows users to customize an audiobook's speaking speed and style, as well as its emotional intonation, and can even match a desired voice using a small amount of sample audio. This work contributes over five thousand open-license audiobooks and an interactive demo that allows users to quickly create their own customized audiobooks. To listen to the audiobook collection, visit https://aka.ms/audiobook.