Versatile Framework for Song Generation with Prompt-based Control
April 27, 2025
Authors: Yu Zhang, Wenxiang Guo, Changhao Pan, Zhiyuan Zhu, Ruiqi Li, Jingyu Lu, Rongjie Huang, Ruiyuan Zhang, Zhiqing Hong, Ziyue Jiang, Zhou Zhao
cs.AI
Abstract
Song generation focuses on producing controllable high-quality songs based on
various prompts. However, existing methods struggle to generate vocals and
accompaniments with prompt-based control and proper alignment. Additionally,
they fall short in supporting various tasks. To address these challenges, we
introduce VersBand, a multi-task song generation framework for synthesizing
high-quality, aligned songs with prompt-based control. VersBand comprises these
primary models: 1) VocalBand, a decoupled model, leverages the flow-matching
method for generating singing styles, pitches, and mel-spectrograms, allowing
fast, high-quality vocal generation with style control. 2) AccompBand, a
flow-based transformer model, incorporates the Band-MOE, selecting suitable
experts for enhanced quality, alignment, and control. This model allows for
generating controllable, high-quality accompaniments aligned with vocals. 3)
Two generation models, LyricBand for lyrics and MelodyBand for melodies,
contribute to the comprehensive multi-task song generation system, allowing for
extensive control based on multiple prompts. Experimental results demonstrate
that VersBand outperforms baseline models across multiple song generation
tasks on both objective and subjective metrics. Audio samples are
available at https://VersBand.github.io.
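
The abstract names flow matching as the generation method behind VocalBand and AccompBand but does not describe the sampler. As a rough illustration only, the sketch below shows the generic idea: start from Gaussian noise and integrate a velocity field from t = 0 to t = 1 with Euler steps. It is not the VersBand implementation. The names `toy_velocity` and `euler_sample`, the step count, and the 4-value "mel frame" are all hypothetical, and the velocity function is an analytic stand-in for what would be a trained network conditioned on prompts.

```python
import random

def toy_velocity(x, t, target):
    """Hypothetical stand-in for a learned velocity network.

    For the straight-line path x_t = (1 - t) * x0 + t * x1, the velocity
    that carries the current point x to the target x1 is (x1 - x) / (1 - t).
    A real model would predict this from (x, t, conditioning) instead.
    """
    return [(x1 - xi) / (1.0 - t) for xi, x1 in zip(x, target)]

def euler_sample(x0, target, num_steps=8):
    """Integrate dx/dt = v(x, t) from t = 0 to t = 1 with fixed Euler steps."""
    x = list(x0)
    dt = 1.0 / num_steps
    for k in range(num_steps):
        t = k * dt  # t stays below 1, so (1 - t) is never zero
        v = toy_velocity(x, t, target)
        x = [xi + dt * vi for xi, vi in zip(x, v)]
    return x

random.seed(0)
noise = [random.gauss(0.0, 1.0) for _ in range(4)]  # x0 drawn from N(0, I)
target = [0.5, -1.0, 2.0, 0.0]  # assumed toy values standing in for one mel frame
sample = euler_sample(noise, target)
```

Because the path is a straight line, a small number of Euler steps is enough in this toy setting, which is the intuition usually given for why flow-matching samplers can be fast.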