MuCodec: Ultra Low-Bitrate Music Codec

September 20, 2024
Authors: Yaoxun Xu, Hangting Chen, Jianwei Yu, Wei Tan, Rongzhi Gu, Shun Lei, Zhiwei Lin, Zhiyong Wu
cs.AI

Abstract

Music codecs are a vital part of audio codec research, and ultra low-bitrate compression is highly important for music transmission and generation. Because musical backgrounds are complex and vocals are rich, modeling semantic or acoustic information alone cannot effectively reconstruct music that contains both vocals and backgrounds. To address this issue, we propose MuCodec, which specifically targets music compression and reconstruction at ultra low bitrates. MuCodec employs MuEncoder to extract both acoustic and semantic features, discretizes them with residual vector quantization (RVQ), and obtains Mel-VAE features via flow matching. The music is then reconstructed using a pre-trained Mel-VAE decoder and HiFi-GAN. MuCodec can reconstruct high-fidelity music at ultra low (0.35 kbps) or high (1.35 kbps) bitrates, achieving the best results to date on both subjective and objective metrics. Code and demo: https://xuyaoxun.github.io/MuCodec_demo/.
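
The abstract names RVQ as the discretization step and quotes bitrates in kbps. As a rough illustration of how residual vector quantization works in general, and of how a token bitrate follows from frame rate, number of codebooks, and codebook size, here is a minimal sketch. It is not MuCodec's implementation: the helper names, the toy codebooks, and the frame rate and codebook sizes in the usage lines are all hypothetical and are not taken from the paper.

```python
import math


def nearest(codebook, vec):
    """Return the index of the codebook entry closest to vec (squared L2 distance)."""
    return min(
        range(len(codebook)),
        key=lambda i: sum((c - v) ** 2 for c, v in zip(codebook[i], vec)),
    )


def rvq_encode(vec, codebooks):
    """Quantize vec stage by stage; each stage codes the residual left by the previous one."""
    residual = list(vec)
    indices = []
    for codebook in codebooks:
        idx = nearest(codebook, residual)
        indices.append(idx)
        residual = [r - c for r, c in zip(residual, codebook[idx])]
    return indices


def rvq_decode(indices, codebooks):
    """Reconstruct the vector by summing the selected entry from every stage."""
    out = [0.0] * len(codebooks[0][0])
    for idx, codebook in zip(indices, codebooks):
        out = [o + c for o, c in zip(out, codebook[idx])]
    return out


def bitrate_kbps(frame_rate_hz, num_codebooks, codebook_size):
    """Token bitrate: frames per second * codebooks per frame * bits per index."""
    return frame_rate_hz * num_codebooks * math.log2(codebook_size) / 1000.0


# Toy two-stage codebooks (hypothetical values, 2-dimensional vectors).
toy_codebooks = [
    [[0.0, 0.0], [1.0, 1.0]],   # stage 1: coarse
    [[0.0, 0.0], [0.5, 0.0]],   # stage 2: refines the stage-1 residual
]
codes = rvq_encode([1.4, 1.0], toy_codebooks)
approx = rvq_decode(codes, toy_codebooks)

# Hypothetical settings, NOT the paper's: 50 frames/s, 2 codebooks of 1024 entries.
example_rate = bitrate_kbps(50, 2, 1024)
```

Under this formula, adding quantizer stages raises the bitrate linearly while each extra stage only has to code what the earlier stages missed, which is one reason RVQ is a common choice when a codec needs to operate at more than one bitrate.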
