SonicMaster: Towards Controllable All-in-One Music Restoration and Mastering
August 5, 2025
Authors: Jan Melechovsky, Ambuj Mehrish, Dorien Herremans
cs.AI
Abstract
Music recordings often suffer from audio quality issues such as excessive
reverberation, distortion, clipping, tonal imbalances, and a narrowed stereo
image, especially when created in non-professional settings without specialized
equipment or expertise. These problems are typically corrected using separate
specialized tools and manual adjustments. In this paper, we introduce
SonicMaster, the first unified generative model for music restoration and
mastering that addresses a broad spectrum of audio artifacts with text-based
control. SonicMaster is conditioned on natural language instructions to apply
targeted enhancements, or can operate in an automatic mode for general
restoration. To train this model, we construct the SonicMaster dataset, a large
dataset of paired degraded and high-quality tracks built by simulating common
degradation types with nineteen degradation functions belonging to five
enhancement groups: equalization, dynamics, reverb, amplitude, and stereo. Our
approach leverages a flow-matching generative training paradigm to learn an
audio transformation that maps degraded inputs to their cleaned, mastered
versions guided by text prompts. Objective audio quality metrics demonstrate
that SonicMaster significantly improves sound quality across all artifact
categories. Furthermore, subjective listening tests confirm that listeners
prefer SonicMaster's enhanced outputs over the original degraded audio,
highlighting the effectiveness of our unified approach.
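
To make the paired-data and flow-matching ideas in the abstract concrete, here is a minimal sketch in Python with NumPy. It is not the authors' implementation. The functions `hard_clip`, `narrow_stereo`, and `flow_matching_target` are hypothetical names, and the two degradations are simple stand-ins for the amplitude and stereo groups rather than any of the nineteen functions used to build the SonicMaster dataset. The sketch also assumes a straight-line path from the degraded signal to the clean one, which is one common flow-matching formulation. The abstract does not say which path the model uses or whether it works on waveforms or a latent representation.

```python
import numpy as np


def hard_clip(audio, threshold=0.5):
    """Illustrative amplitude degradation: clamp samples to [-threshold, threshold]."""
    return np.clip(audio, -threshold, threshold)


def narrow_stereo(audio, width=0.3):
    """Illustrative stereo degradation on a (2, n_samples) array.

    Splits the signal into mid and side components and scales the side
    component by `width` (1.0 keeps the image, 0.0 collapses it to mono).
    """
    mid = 0.5 * (audio[0] + audio[1])
    side = 0.5 * (audio[0] - audio[1])
    return np.stack([mid + width * side, mid - width * side])


def flow_matching_target(degraded, clean, t):
    """Assumed linear-path flow matching from degraded to clean.

    Returns the point on the straight path at time t in [0, 1] and the
    constant velocity a model would be trained to predict at that point.
    """
    x_t = (1.0 - t) * degraded + t * clean
    velocity = clean - degraded
    return x_t, velocity


# Toy one-second stereo signal standing in for a high-quality track.
sample_rate = 8000
time = np.arange(sample_rate) / sample_rate
clean = np.stack([
    0.8 * np.sin(2 * np.pi * 220 * time),
    0.8 * np.sin(2 * np.pi * 330 * time),
])

# Build a (degraded, clean) training pair by chaining two degradations.
degraded = narrow_stereo(hard_clip(clean))

# One training target: a point on the path and the velocity toward clean.
x_t, velocity = flow_matching_target(degraded, clean, t=0.25)
```

In a real system, a neural network conditioned on the text prompt would be trained to regress `velocity` from `x_t` and `t`, and restoration would integrate the predicted velocity from the degraded input toward the clean output.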