Seed-Music: A Unified Framework for High-Quality and Controlled Music Generation

Abstract

We introduce Seed-Music, a suite of music generation systems capable of producing high-quality music with fine-grained style control. Our unified framework leverages both auto-regressive language modeling and diffusion approaches to support two key music creation workflows: controlled music generation and post-production editing. For controlled music generation, our system enables vocal music generation with performance controls from multi-modal inputs, including style descriptions, audio references, musical scores, and voice prompts. For post-production editing, it offers interactive tools for editing lyrics and vocal melodies directly in the generated audio. We encourage readers to listen to the demo audio examples at https://team.doubao.com/seed-music.