Simple and Controllable Music Generation
June 8, 2023
Authors: Jade Copet, Felix Kreuk, Itai Gat, Tal Remez, David Kant, Gabriel Synnaeve, Yossi Adi, Alexandre Défossez
cs.AI
Abstract
We tackle the task of conditional music generation. We introduce MusicGen, a
single Language Model (LM) that operates over several streams of compressed
discrete music representation, i.e., tokens. Unlike prior work, MusicGen is
composed of a single-stage transformer LM together with efficient token
interleaving patterns, which eliminates the need for cascading several models,
e.g., hierarchically or via upsampling. Following this approach, we demonstrate
how MusicGen can generate high-quality samples while being conditioned on a
textual description or melodic features, allowing better control over the
generated output. We conduct extensive empirical evaluation, considering both
automatic metrics and human studies, showing that the proposed approach is
superior to the evaluated baselines on a standard text-to-music benchmark.
Through ablation studies, we shed light on the importance of each of the
components comprising MusicGen. Music samples, code, and models are available
at https://github.com/facebookresearch/audiocraft.
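
The key idea of modeling several parallel token streams with a single-stage LM can be illustrated with a small sketch. The code below is not the authors' implementation; it shows one plausible "delay" interleaving pattern, in which stream k is shifted right by k positions so that one transformer step emits a token for every codebook while earlier codebooks still precede later ones. The `PAD` sentinel is a hypothetical placeholder token introduced here for illustration.

```python
# Illustrative sketch of a "delay" token-interleaving pattern for K parallel
# codebook streams (an assumption about how interleaving might work, not the
# paper's exact scheme). PAD is a hypothetical padding-token id.
PAD = -1

def delay_interleave(streams):
    """Shift stream k right by k steps, producing a K x (T + K - 1) grid."""
    k_count = len(streams)
    t_len = len(streams[0])
    total = t_len + k_count - 1
    grid = []
    for k, stream in enumerate(streams):
        # k leading pads, the original tokens, then trailing pads to align rows
        row = [PAD] * k + list(stream) + [PAD] * (total - t_len - k)
        grid.append(row)
    return grid

def delay_deinterleave(grid):
    """Invert the delay pattern, recovering the original K streams."""
    k_count = len(grid)
    t_len = len(grid[0]) - k_count + 1
    return [grid[k][k:k + t_len] for k in range(k_count)]
```

A round trip over two toy streams (`delay_deinterleave(delay_interleave(streams)) == streams`) confirms the pattern is lossless; the benefit is that a single autoregressive model can predict one column of the grid per step instead of requiring a cascade of per-codebook models.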