BACHI: Boundary-Aware Symbolic Chord Recognition Through Masked Iterative Decoding on Pop and Classical Music
October 8, 2025
Authors: Mingyang Yao, Ke Chen, Shlomo Dubnov, Taylor Berg-Kirkpatrick
cs.AI
Abstract
Automatic chord recognition (ACR) via deep learning models has gradually
achieved promising recognition accuracy, yet two key challenges remain. First,
prior work has focused primarily on audio-domain ACR, while ACR for symbolic
music (e.g., scores) has received limited attention due to data scarcity. Second,
existing methods still overlook strategies that align with how humans analyze
music. To address these challenges, we make two contributions:
(1) we introduce POP909-CL, an enhanced version of the POP909 dataset with
tempo-aligned content and human-corrected labels for chords, beats, keys, and
time signatures; and (2) we propose BACHI, a symbolic chord recognition model
that decomposes the task into distinct decision steps, namely boundary
detection and iterative ranking of chord root, quality, and bass (inversion).
This mechanism mirrors human ear-training practice. Experiments
demonstrate that BACHI achieves state-of-the-art chord recognition performance
on both classical and pop music benchmarks, with ablation studies validating
the effectiveness of each module.
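
The two-stage decomposition described above (boundary detection, then iterative ranking of root, quality, and bass) can be illustrated with a minimal Python sketch. The abstract does not specify the decoding procedure, so this is not the authors' implementation: it assumes a generic confidence-ordered scheme in which all three fields start masked and the most confident field is committed at each step. The names `split_at_boundaries`, `iterative_decode`, and `toy_score_fn` are hypothetical, and the probabilities are hand-written stand-ins for a trained model.

```python
from typing import Callable, Dict, List, Optional, Sequence, Tuple

FIELDS = ("root", "quality", "bass")
Scores = Dict[str, Dict[str, float]]   # field -> {candidate label: probability}
Partial = Dict[str, Optional[str]]     # field -> chosen label, or None if masked


def split_at_boundaries(flags: Sequence[bool]) -> List[Tuple[int, int]]:
    """Turn per-frame boundary flags into half-open (start, end) segments.

    Frame 0 always starts a segment, even if its flag is False.
    """
    starts = [i for i, b in enumerate(flags) if b or i == 0]
    ends = starts[1:] + [len(flags)]
    return list(zip(starts, ends))


def iterative_decode(score_fn: Callable[[Partial], Scores]) -> Partial:
    """Fill masked chord fields one at a time, most confident field first.

    Assumption: score_fn returns probabilities for every field, conditioned
    on the fields that have already been committed.
    """
    chord: Partial = {f: None for f in FIELDS}
    while any(v is None for v in chord.values()):
        scores = score_fn(chord)
        best = None  # (field, label, probability) of the most confident guess
        for f in FIELDS:
            if chord[f] is not None:
                continue  # already committed; only masked fields compete
            label, p = max(scores[f].items(), key=lambda kv: kv[1])
            if best is None or p > best[2]:
                best = (f, label, p)
        chord[best[0]] = best[1]  # unmask exactly one field per iteration
    return chord


def toy_score_fn(partial: Partial) -> Scores:
    """Hypothetical stand-in for a trained model (hand-written numbers)."""
    scores = {
        "root": {"C": 0.9, "A": 0.1},
        "quality": {"maj": 0.55, "min": 0.45},
        "bass": {"C": 0.5, "E": 0.5},
    }
    # Committing one field sharpens the distribution over the others.
    if partial["root"] == "C":
        scores["quality"] = {"maj": 0.8, "min": 0.2}
    if partial["quality"] == "maj":
        scores["bass"] = {"C": 0.3, "E": 0.7}
    return scores


segments = split_at_boundaries([True, False, False, True, False])
chord = iterative_decode(toy_score_fn)
print(segments)  # [(0, 3), (3, 5)]
print(chord)     # {'root': 'C', 'quality': 'maj', 'bass': 'E'}
```

In this toy run the root is committed first because it is the most confident field, and each committed field then conditions the remaining ones, much as a listener might settle the root before deciding quality and inversion. In a real system the decoder would run once per segment returned by the boundary step.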