Apollo: Band-sequence Modeling for High-Quality Audio Restoration
September 13, 2024
Authors: Kai Li, Yi Luo
cs.AI
Abstract
Audio restoration has become increasingly significant in modern society, not
only due to the demand for high-quality auditory experiences enabled by
advanced playback devices, but also because the growing capabilities of
generative audio models necessitate high-fidelity audio. Typically, audio
restoration is defined as a task of predicting undistorted audio from damaged
input, often trained using a GAN framework to balance perception and
distortion. Since audio degradation is primarily concentrated in mid- and
high-frequency ranges, especially due to codecs, a key challenge lies in
designing a generator capable of preserving low-frequency information while
accurately reconstructing high-quality mid- and high-frequency content.
Inspired by recent advancements in high-sample-rate music separation, speech
enhancement, and audio codec models, we propose Apollo, a generative model
designed for high-sample-rate audio restoration. Apollo employs an explicit
frequency band split module to model the relationships between different
frequency bands, allowing for more coherent and higher-quality restored audio.
Evaluated on the MUSDB18-HQ and MoisesDB datasets, Apollo consistently
outperforms existing SR-GAN models across various bit rates and music genres,
particularly excelling in complex scenarios involving mixtures of multiple
instruments and vocals. Apollo significantly improves music restoration quality
while maintaining computational efficiency. The source code for Apollo is
publicly available at https://github.com/JusperLee/Apollo.
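
To make the idea of an explicit frequency band split more concrete, here is a minimal sketch of cutting one spectrogram frame into contiguous sub-bands and stitching them back together. It is an illustration only, not Apollo's actual module: the function names (`band_edges`, `split_bands`, `merge_bands`), the uniform 1 kHz band width, the 44.1 kHz sample rate, and the 1025-bin frame are all assumptions chosen for the example, and the abstract does not state the band layout the model uses.

```python
from typing import List, Sequence, Tuple


def band_edges(n_bins: int, sample_rate: int, band_hz: float) -> List[Tuple[int, int]]:
    """Return (start, stop) bin-index pairs that tile [0, n_bins).

    Assumption for this sketch: every band is roughly `band_hz` wide.
    A real model may use narrower bands at low frequencies.
    """
    nyquist = sample_rate / 2
    hz_per_bin = nyquist / (n_bins - 1)          # bin 0 is DC, last bin is Nyquist
    bins_per_band = max(1, round(band_hz / hz_per_bin))
    edges = []
    start = 0
    while start < n_bins:
        stop = min(start + bins_per_band, n_bins)  # last band may be shorter
        edges.append((start, stop))
        start = stop
    return edges


def split_bands(frame: Sequence[float], edges: List[Tuple[int, int]]) -> List[List[float]]:
    """Slice one spectrogram frame into per-band chunks."""
    return [list(frame[a:b]) for a, b in edges]


def merge_bands(bands: List[List[float]]) -> List[float]:
    """Concatenate per-band chunks back into a full-band frame."""
    return [value for band in bands for value in band]


# Hypothetical setup: 2048-point FFT (1025 bins) at 44.1 kHz, 1 kHz bands.
edges = band_edges(n_bins=1025, sample_rate=44100, band_hz=1000.0)
frame = [float(i) for i in range(1025)]          # stand-in for magnitude values
bands = split_bands(frame, edges)
restored = merge_bands(bands)
```

In a band-split model, each chunk in `bands` would typically be projected to a fixed-size feature so that the sequence of bands can be modeled jointly; the split and merge steps themselves are lossless, which is what lets low-frequency content pass through unchanged while the higher bands are reconstructed.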