Personalizable Long-Context Symbolic Music Infilling with MIDI-RWKV

Abstract

Existing work in automatic music generation has primarily focused on end-to-end systems that produce complete compositions or continuations. However, because musical composition is typically an iterative process, such systems make it difficult to engage in the back-and-forth between human and machine that is essential to computer-assisted creativity. In this study, we address the task of personalizable, multi-track, long-context, and controllable symbolic music infilling to enhance the process of computer-assisted composition. We present MIDI-RWKV, a novel model based on the RWKV-7 linear architecture, to enable efficient and coherent musical cocreation on edge devices. We also demonstrate that MIDI-RWKV admits an effective method of finetuning its initial state for personalization in the very-low-sample regime. We evaluate MIDI-RWKV and its state tuning on several quantitative and qualitative metrics, and release model weights and code at https://github.com/christianazinn/MIDI-RWKV.
