BeepBank-500: A Synthetic Earcon Mini-Corpus for UI Sound Research and Psychoacoustics Research
September 21, 2025
Author: Mandip Goswami
cs.AI
Abstract
We introduce BeepBank-500, a compact, fully synthetic earcon/alert dataset
(300-500 clips) designed for rapid, rights-clean experimentation in
human-computer interaction and audio machine learning. Each clip is generated
from a parametric recipe controlling waveform family (sine, square, triangle,
FM), fundamental frequency, duration, amplitude envelope, amplitude modulation
(AM), and lightweight Schroeder-style reverberation. We use three reverberation
settings: dry, and two synthetic rooms denoted 'rir small' ('small') and 'rir
medium' ('medium') throughout the paper and in the metadata. We release mono 48
kHz WAV audio (16-bit), a rich metadata table (signal/spectral features), and
tiny reproducible baselines for (i) waveform-family classification and (ii) f0
regression on single tones. The corpus targets tasks such as earcon
classification, timbre analyses, and onset detection, with clearly stated
licensing and limitations. Audio is dedicated to the public domain via CC0-1.0;
code is under MIT. Data DOI: https://doi.org/10.5281/zenodo.17172015. Code:
https://github.com/mandip42/earcons-mini-500.
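
To make the parametric recipe concrete, the following is a minimal sketch of how one such clip could be synthesized. It is not the released generator: the function names, the envelope times, the FM ratio and index, and the comb/all-pass delays and gains behind the 'rir small' and 'rir medium' presets are all illustrative assumptions. Only the 48 kHz mono 16-bit format, the four waveform families, and the recipe parameters come from the abstract.

```python
import math
import struct
import wave

SR = 48_000  # sample rate stated in the abstract

# Hypothetical room presets (delays in ms, feedback gain, wet mix); assumed values.
ROOMS = {
    "rir small": dict(comb_ms=(11.3, 13.7, 15.1, 17.9), g=0.60, wet=0.25),
    "rir medium": dict(comb_ms=(29.7, 37.1, 41.1, 43.7), g=0.75, wet=0.35),
}

def oscillator(family, f0, n):
    """Return n samples of one waveform family at fundamental f0 (Hz)."""
    out = []
    for i in range(n):
        t = i / SR
        phase = (f0 * t) % 1.0
        if family == "sine":
            s = math.sin(2 * math.pi * f0 * t)
        elif family == "square":
            s = 1.0 if phase < 0.5 else -1.0
        elif family == "triangle":
            s = 4.0 * abs(phase - 0.5) - 1.0
        elif family == "fm":
            # Assumed FM settings: modulator at 2*f0, modulation index 2.
            s = math.sin(2 * math.pi * f0 * t + 2.0 * math.sin(2 * math.pi * 2 * f0 * t))
        else:
            raise ValueError(f"unknown waveform family: {family}")
        out.append(s)
    return out

def envelope(n, attack_s=0.005, release_s=0.03):
    """Linear attack/release amplitude envelope (assumed shape and times)."""
    a = max(1, int(attack_s * SR))
    r = max(1, int(release_s * SR))
    return [min(1.0, i / a, (n - 1 - i) / r) for i in range(n)]

def comb(x, delay, g):
    """Feedback comb filter: y[n] = x[n] + g * y[n - delay]."""
    y = list(x)
    for i in range(delay, len(y)):
        y[i] += g * y[i - delay]
    return y

def allpass(x, delay, g):
    """Schroeder all-pass: y[n] = -g*x[n] + x[n - delay] + g*y[n - delay]."""
    y = [0.0] * len(x)
    for i in range(len(x)):
        xd = x[i - delay] if i >= delay else 0.0
        yd = y[i - delay] if i >= delay else 0.0
        y[i] = -g * x[i] + xd + g * yd
    return y

def schroeder(x, comb_ms, g, wet, tail_s=0.3):
    """Parallel combs followed by two series all-passes, mixed with the dry signal."""
    padded = list(x) + [0.0] * int(round(tail_s * SR))  # room for the reverb tail
    combs = [comb(padded, int(ms * SR / 1000), g) for ms in comb_ms]
    y = [sum(v) / len(combs) for v in zip(*combs)]
    for ms in (5.0, 1.7):
        y = allpass(y, int(ms * SR / 1000), 0.7)
    return [(1 - wet) * d + wet * r for d, r in zip(padded, y)]

def render(family, f0, dur_s, room="dry", am_rate=0.0, am_depth=0.0):
    """Synthesize one clip as floats, peak-normalized to 0.9."""
    n = int(round(dur_s * SR))
    osc, env = oscillator(family, f0, n), envelope(n)
    x = []
    for i in range(n):
        # Amplitude modulation: gain swings between 1 and 1 - am_depth.
        am = 1.0 - am_depth * 0.5 * (1.0 - math.cos(2 * math.pi * am_rate * i / SR))
        x.append(osc[i] * env[i] * am)
    if room != "dry":
        x = schroeder(x, **ROOMS[room])
    peak = max(abs(s) for s in x) or 1.0
    return [0.9 * s / peak for s in x]

def write_wav(path, samples):
    """Write mono 16-bit PCM at 48 kHz, the format named in the abstract."""
    with wave.open(path, "wb") as w:
        w.setnchannels(1)
        w.setsampwidth(2)
        w.setframerate(SR)
        w.writeframes(b"".join(struct.pack("<h", int(s * 32767)) for s in samples))
```

For example, `write_wav("beep.wav", render("fm", 660.0, 0.25, room="rir small", am_rate=8.0, am_depth=0.5))` would write a quarter-second FM tone with 8 Hz tremolo and a short synthetic tail. Because every parameter is explicit, the same recipe values can be logged alongside each clip as metadata.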