Seed-Music: A Unified Framework for High Quality and Controlled Music Generation

Abstract

We introduce Seed-Music, a suite of music generation systems capable of producing high-quality music with fine-grained style control. Our unified framework leverages both auto-regressive language modeling and diffusion approaches to support two key music creation workflows: controlled music generation and post-production editing. For controlled music generation, our system enables vocal music generation with performance controls from multi-modal inputs, including style descriptions, audio references, musical scores, and voice prompts. For post-production editing, it offers interactive tools for editing lyrics and vocal melodies directly in the generated audio. We encourage readers to listen to the demo audio examples at https://team.doubao.com/seed-music.