BACHI: Boundary-Aware Symbolic Chord Recognition Through Masked Iterative Decoding on Pop and Classical Music
October 8, 2025
Authors: Mingyang Yao, Ke Chen, Shlomo Dubnov, Taylor Berg-Kirkpatrick
cs.AI
Abstract
Automatic chord recognition (ACR) via deep learning models has gradually
achieved promising recognition accuracy, yet two key challenges remain. First,
prior work has primarily focused on audio-domain ACR, while symbolic music
(e.g., score) ACR has received limited attention due to data scarcity. Second,
existing methods still overlook strategies that are aligned with human music
analytical practices. To address these challenges, we make two contributions:
(1) we introduce POP909-CL, an enhanced version of the POP909 dataset with
tempo-aligned content and human-corrected labels of chords, beats, keys, and
time signatures; and (2) we propose BACHI, a symbolic chord recognition model
that decomposes the task into different decision steps, namely boundary
detection and iterative ranking of chord root, quality, and bass (inversion).
This mechanism mirrors human ear-training practice. Experiments
demonstrate that BACHI achieves state-of-the-art chord recognition performance
on both classical and pop music benchmarks, with ablation studies validating
the effectiveness of each module.
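
The iterative ranking step can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes one plausible reading of "masked iterative decoding" in which the three chord components (root, quality, bass) start masked and are filled one at a time, most confident first, so that later decisions can condition on earlier ones. The names `iterative_decode`, `toy_score_fn`, and all probabilities are hypothetical, and boundary detection is omitted.

```python
from typing import Callable, Dict, Optional

COMPONENTS = ("root", "quality", "bass")
Scores = Dict[str, Dict[str, float]]  # component -> label -> probability
Partial = Dict[str, Optional[str]]    # component -> chosen label, None if masked


def iterative_decode(score_fn: Callable[[Partial], Scores]) -> Partial:
    """Fill masked chord components one at a time, most confident first."""
    state: Partial = {c: None for c in COMPONENTS}
    while any(v is None for v in state.values()):
        scores = score_fn(state)
        # Among still-masked components, pick the one whose top label
        # has the highest probability.
        best_comp, best_label, best_p = None, None, -1.0
        for comp in COMPONENTS:
            if state[comp] is not None:
                continue
            label, p = max(scores[comp].items(), key=lambda kv: kv[1])
            if p > best_p:
                best_comp, best_label, best_p = comp, label, p
        # Unmask it; the next call to score_fn can condition on this choice.
        state[best_comp] = best_label
    return state


def toy_score_fn(state: Partial) -> Scores:
    """Hypothetical stand-in for a trained model (hand-written numbers)."""
    # Assumption for illustration: the bass is ambiguous until the root is known.
    bass = {"C": 0.9, "E": 0.1} if state["root"] == "C" else {"C": 0.5, "E": 0.5}
    return {
        "root": {"C": 0.8, "G": 0.2},
        "quality": {"maj": 0.6, "min": 0.4},
        "bass": bass,
    }


if __name__ == "__main__":
    print(iterative_decode(toy_score_fn))
```

In this toy run the root is fixed first, which sharpens the bass estimate enough for it to be decided before the quality. The decoding order is therefore chosen per chord rather than fixed in advance.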