MuPT: A Generative Symbolic Music Pretrained Transformer

April 9, 2024
作者: Xingwei Qu, Yuelin Bai, Yinghao Ma, Ziya Zhou, Ka Man Lo, Jiaheng Liu, Ruibin Yuan, Lejun Min, Xueling Liu, Tianyu Zhang, Xinrun Du, Shuyue Guo, Yiming Liang, Yizhi Li, Shangda Wu, Junting Zhou, Tianyu Zheng, Ziyang Ma, Fengze Han, Wei Xue, Gus Xia, Emmanouil Benetos, Xiang Yue, Chenghua Lin, Xu Tan, Stephen W. Huang, Wenhu Chen, Jie Fu, Ge Zhang
cs.AI

Abstract

In this paper, we explore the application of Large Language Models (LLMs) to the pre-training of music. While the prevalent use of MIDI in music modeling is well-established, our findings suggest that LLMs are inherently more compatible with ABC Notation, which aligns more closely with their design and strengths, thereby enhancing the model's performance in musical composition. To address the challenges associated with misaligned measures from different tracks during generation, we propose the development of a Synchronized Multi-Track ABC Notation (SMT-ABC Notation), which aims to preserve coherence across multiple musical tracks. Our contributions include a series of models capable of handling up to 8192 tokens, covering 90% of the symbolic music data in our training set. Furthermore, we explore the implications of the Symbolic Music Scaling Law (SMS Law) on model performance. The results indicate a promising direction for future research in music generation, offering extensive resources for community-led research through our open-source contributions.
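The core idea behind SMT-ABC Notation is to keep the bars of all tracks aligned bar by bar, rather than concatenating whole tracks one after another. The snippet below is a minimal sketch of that bar-level synchronization, not the authors' implementation or their exact token format; the `split_bars` and `synchronize_tracks` helpers, the sample tune bodies, and the `[V:n]` bar-group layout are illustrative assumptions.

```python
# Illustrative sketch (not the paper's implementation): interleave bars from
# several single-track ABC tune bodies so that bar i of every voice is emitted
# together, approximating bar-level synchronization across tracks.

def split_bars(abc_body: str) -> list[str]:
    """Split an ABC tune body into bars on the '|' barline symbol."""
    return [bar.strip() for bar in abc_body.strip().split("|") if bar.strip()]

def synchronize_tracks(tracks: list[str], voice_ids: list[str]) -> str:
    """Emit bar-aligned text: for each bar index, list every voice's bar."""
    per_track_bars = [split_bars(t) for t in tracks]
    n_bars = min(len(bars) for bars in per_track_bars)  # truncate to common length
    lines = []
    for i in range(n_bars):
        for vid, bars in zip(voice_ids, per_track_bars):
            lines.append(f"[V:{vid}] {bars[i]} |")
        lines.append("")  # blank line between synchronized bar groups
    return "\n".join(lines)

if __name__ == "__main__":
    # Hypothetical two-voice example: a melody line and a bass line.
    melody = "G2 A2 B2 c2 | d4 B4 | c2 B2 A2 G2 |"
    bass = "G,4 D,4 | G,8 | C,4 D,4 |"
    print(synchronize_tracks([melody, bass], ["1", "2"]))
```

Grouping corresponding bars in this way means a decoder generating the sequence left to right always sees all voices of bar i before moving to bar i+1, which is the kind of cross-track coherence the abstract attributes to SMT-ABC Notation.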
