Large-Scale Automatic Audiobook Creation

September 7, 2023
Authors: Brendan Walsh, Mark Hamilton, Greg Newby, Xi Wang, Serena Ruan, Sheng Zhao, Lei He, Shaofei Zhang, Eric Dettinger, William T. Freeman, Markus Weimer
cs.AI

Abstract

An audiobook can dramatically improve a work of literature's accessibility and increase reader engagement. However, audiobooks can take hundreds of hours of human effort to create, edit, and publish. In this work, we present a system that can automatically generate high-quality audiobooks from online e-books. In particular, we leverage recent advances in neural text-to-speech to create and release thousands of human-quality, open-license audiobooks from the Project Gutenberg e-book collection. Our method can identify the proper subset of e-book content to read for a wide collection of diversely structured books and can operate on hundreds of books in parallel. Our system allows users to customize an audiobook's speaking speed, style, and emotional intonation, and can even match a desired voice using a small amount of sample audio. This work contributed over five thousand open-license audiobooks and an interactive demo that allows users to quickly create their own customized audiobooks. To listen to the audiobook collection, visit https://aka.ms/audiobook.