Personalizable Long-Context Symbolic Music Infilling with MIDI-RWKV
June 16, 2025
Authors: Christian Zhou-Zheng, Philippe Pasquier
cs.AI
Abstract
Existing work in automatic music generation has primarily focused on
end-to-end systems that produce complete compositions or continuations.
However, because musical composition is typically an iterative process, such
systems make it difficult to engage in the back-and-forth between human and
machine that is essential to computer-assisted creativity. In this study, we
address the task of personalizable, multi-track, long-context, and controllable
symbolic music infilling to enhance the process of computer-assisted
composition. We present MIDI-RWKV, a novel model based on the RWKV-7 linear
architecture, to enable efficient and coherent musical cocreation on edge
devices. We also demonstrate that MIDI-RWKV admits an effective method of
finetuning its initial state for personalization in the very-low-sample regime.
We evaluate MIDI-RWKV and its state tuning on several quantitative and
qualitative metrics, and release model weights and code at
https://github.com/christianazinn/MIDI-RWKV.