Anticipatory Music Transformer
June 14, 2023
Authors: John Thickstun, David Hall, Chris Donahue, Percy Liang
cs.AI
Abstract
We introduce anticipation: a method for constructing a controllable
generative model of a temporal point process (the event process) conditioned
asynchronously on realizations of a second, correlated process (the control
process). We achieve this by interleaving sequences of events and controls,
such that controls appear following stopping times in the event sequence. This
work is motivated by problems arising in the control of symbolic music
generation. We focus on infilling control tasks, whereby the controls are a
subset of the events themselves, and conditional generation completes a
sequence of events given the fixed control events. We train anticipatory
infilling models using the large and diverse Lakh MIDI music dataset. These
models match the performance of autoregressive models for prompted music
generation, with the additional capability to perform infilling control tasks,
including accompaniment. Human evaluators report that an anticipatory model
produces accompaniments whose musicality is comparable even to music composed
by humans, over 20-second clips.
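The core construction in the abstract, interleaving event and control sequences so that controls surface at appropriate points in the event stream, can be illustrated with a small sketch. This is not the paper's implementation: the tuple representation, the `delta` anticipation offset, and all names here are illustrative assumptions, and real anticipatory models operate on tokenized MIDI events with a stopping-time rule rather than this simple timestamp merge.

```python
def interleave(events, controls, delta=0.0):
    """Merge two time-stamped sequences (time, payload) into one stream.

    A control with timestamp t is emitted after all events with
    timestamp <= t - delta, so controls can be "anticipated" delta
    seconds ahead of their realization time. This is a simplified
    stand-in for the stopping-time construction described in the
    abstract; the representation is hypothetical.
    """
    events = sorted(events)
    controls = sorted(controls)
    out = []
    i = j = 0
    while i < len(events) or j < len(controls):
        take_event = j >= len(controls) or (
            i < len(events) and events[i][0] <= controls[j][0] - delta
        )
        if take_event:
            out.append(("event", events[i]))
            i += 1
        else:
            out.append(("control", controls[j]))
            j += 1
    return out
```

With `delta=0` this reduces to an ordinary merge by timestamp; a positive `delta` moves each control earlier in the interleaved sequence, which is what lets a model condition on upcoming controls (e.g. fixed accompaniment notes) before generating the events around them.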