JEN-1 DreamStyler: Customized Musical Concept Learning via Pivotal Parameters Tuning
June 18, 2024
Authors: Boyu Chen, Peike Li, Yao Yao, Alex Wang
cs.AI
Abstract
Large models for text-to-music generation have achieved significant progress,
facilitating the creation of high-quality and varied musical compositions from
provided text prompts. However, input text prompts may not precisely capture
user requirements, particularly when the objective is to generate music that
embodies a specific concept derived from a designated reference collection. In
this paper, we propose a novel method for customized text-to-music generation,
which can capture the concept from a two-minute piece of reference music and generate a
new piece of music conforming to the concept. We achieve this by fine-tuning a
pretrained text-to-music model using the reference music. However, directly
fine-tuning all parameters leads to overfitting issues. To address this
problem, we propose a Pivotal Parameters Tuning method that enables the model
to assimilate the new concept while preserving its original generative
capabilities. Additionally, we identify a potential concept conflict when
introducing multiple concepts into the pretrained model. We present a concept
enhancement strategy to distinguish multiple concepts, enabling the fine-tuned
model to generate music incorporating either individual or multiple concepts
simultaneously. Since we are the first to work on the customized music
generation task, we also introduce a new dataset and evaluation protocol for
the new task. Our proposed Jen1-DreamStyler outperforms several baselines in
both qualitative and quantitative evaluations. Demos will be available at
https://www.jenmusic.ai/research#DreamStyler.