PerceiverS: A Multi-Scale Perceiver with Effective Segmentation for Long-Term Expressive Symbolic Music Generation
November 13, 2024
Authors: Yungang Yi, Weihua Li, Matthew Kuo, Quan Bai
cs.AI
Abstract
Music generation has progressed significantly, especially in the domain of
audio generation. However, generating symbolic music that combines long-term
structure with expressiveness remains a major challenge. In this paper,
we propose PerceiverS (Segmentation and Scale), a novel architecture designed
to address this issue by leveraging both Effective Segmentation and Multi-Scale
attention mechanisms. Our approach enhances symbolic music generation by
simultaneously learning long-term structural dependencies and short-term
expressive details. By combining cross-attention and self-attention in a
Multi-Scale setting, PerceiverS captures long-range musical structure while
preserving performance nuances. The proposed model, evaluated on datasets such as
Maestro, demonstrates improvements in generating coherent and diverse music
with both structural consistency and expressive variation. The project demos
and the generated music samples can be accessed through the link:
https://perceivers.github.io.