Anticipatory Music Transformer

June 14, 2023
Authors: John Thickstun, David Hall, Chris Donahue, Percy Liang
cs.AI

Abstract

We introduce anticipation: a method for constructing a controllable generative model of a temporal point process (the event process) conditioned asynchronously on realizations of a second, correlated process (the control process). We achieve this by interleaving sequences of events and controls, such that controls appear following stopping times in the event sequence. This work is motivated by problems arising in the control of symbolic music generation. We focus on infilling control tasks, whereby the controls are a subset of the events themselves, and conditional generation completes a sequence of events given the fixed control events. We train anticipatory infilling models using the large and diverse Lakh MIDI music dataset. These models match the performance of autoregressive models for prompted music generation, with the additional capability to perform infilling control tasks, including accompaniment. Human evaluators report that an anticipatory model produces accompaniments with musicality comparable even to human-composed music over 20-second clips.
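To make the interleaving concrete, the sketch below is a minimal illustration of the idea, not the authors' implementation: an event stream and a control stream are merged so that each control is revealed an anticipation interval before its own onset. The function name `interleave`, the `(time, value)` tuple encoding, the merge rule, and the 5-second default for `delta` are illustrative assumptions rather than details taken from the paper.

```python
# Minimal sketch of anticipatory interleaving (illustrative; not the
# authors' code). Events and controls are (time, value) pairs. Each
# control is inserted into the merged sequence once the event stream
# passes (control_time - delta), so an autoregressive model sees the
# control `delta` seconds before its onset.

def interleave(events, controls, delta=5.0):
    """Merge events and controls into one sequence, anticipating each
    control by `delta` seconds."""
    events, controls = sorted(events), sorted(controls)
    merged, i, j = [], 0, 0
    while i < len(events) or j < len(controls):
        next_event = events[i][0] if i < len(events) else float("inf")
        next_control = controls[j][0] - delta if j < len(controls) else float("inf")
        if next_control < next_event:
            merged.append(("control",) + controls[j])
            j += 1
        else:
            merged.append(("event",) + events[i])
            i += 1
    return merged

# Example: accompaniment-style infilling, where the control is a fixed
# bass note that the model must anticipate while generating melody events.
events = [(0.0, "C4"), (1.0, "E4"), (2.0, "G4"), (6.5, "C5")]
controls = [(6.0, "C3")]
print(interleave(events, controls))
# The control at t=6.0 appears right after the events up to t=1.0,
# i.e. 5 seconds ahead of its own onset.
```

Training on merged sequences of this form is what lets a standard autoregressive model condition on upcoming control events: by the time the model generates events near a control's onset, that control already appears earlier in its context.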