
BeepBank-500: A Synthetic Earcon Mini-Corpus for UI Sound Research and Psychoacoustics Research

September 21, 2025
Author: Mandip Goswami
cs.AI

Abstract

We introduce BeepBank-500, a compact, fully synthetic earcon/alert dataset (300-500 clips) designed for rapid, rights-clean experimentation in human-computer interaction and audio machine learning. Each clip is generated from a parametric recipe controlling waveform family (sine, square, triangle, FM), fundamental frequency, duration, amplitude envelope, amplitude modulation (AM), and lightweight Schroeder-style reverberation. We use three reverberation settings: dry, and two synthetic rooms denoted 'rir small' ('small') and 'rir medium' ('medium') throughout the paper and in the metadata. We release mono 48 kHz WAV audio (16-bit), a rich metadata table (signal/spectral features), and tiny reproducible baselines for (i) waveform-family classification and (ii) f0 regression on single tones. The corpus targets tasks such as earcon classification, timbre analyses, and onset detection, with clearly stated licensing and limitations. Audio is dedicated to the public domain via CC0-1.0; code is under MIT. Data DOI: https://doi.org/10.5281/zenodo.17172015. Code: https://github.com/mandip42/earcons-mini-500.
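
The parametric recipe described in the abstract can be illustrated with a minimal Python sketch that uses only the standard library. It is not the released generator: the function names, the FM modulator ratio and index, the 10 ms linear attack/release ramps, the AM gain formula, and the 0.8 peak level are all assumptions chosen for illustration, and the Schroeder-style reverberation stage is omitted. Only the four waveform families and the mono 48 kHz, 16-bit WAV format come from the abstract.

```python
import io
import math
import struct
import wave

SAMPLE_RATE = 48_000  # the abstract states mono 48 kHz, 16-bit WAV


def oscillator(family, f0, t):
    """Return one sample in [-1, 1] for a waveform family at time t (seconds)."""
    phase = (f0 * t) % 1.0  # normalized phase in [0, 1)
    if family == "sine":
        return math.sin(2 * math.pi * phase)
    if family == "square":
        return 1.0 if phase < 0.5 else -1.0
    if family == "triangle":
        return 4.0 * abs(phase - 0.5) - 1.0
    if family == "fm":
        # Assumed FM settings (not from the paper): modulator at 2*f0, index 2.0.
        mod = 2.0 * math.sin(2 * math.pi * 2 * f0 * t)
        return math.sin(2 * math.pi * f0 * t + mod)
    raise ValueError(f"unknown waveform family: {family}")


def envelope(i, n, ramp):
    """Linear attack/release gain for sample i of n, with `ramp` samples per edge."""
    return min(1.0, i / ramp, (n - 1 - i) / ramp)


def synth_clip(family, f0, duration, am_rate=0.0, am_depth=0.0, ramp_s=0.01):
    """Synthesize one clip as a list of floats in [-1, 1].

    am_depth in [0, 1] scales a sinusoidal gain that dips from 1 to 1 - am_depth.
    """
    n = int(round(duration * SAMPLE_RATE))
    ramp = max(1, int(ramp_s * SAMPLE_RATE))
    samples = []
    for i in range(n):
        t = i / SAMPLE_RATE
        am = 1.0 - am_depth * 0.5 * (1.0 + math.sin(2 * math.pi * am_rate * t))
        samples.append(oscillator(family, f0, t) * envelope(i, n, ramp) * am)
    return samples


def clip_to_wav_bytes(samples, peak=0.8):
    """Encode samples as a mono 16-bit PCM WAV file and return its bytes."""
    pcm = struct.pack(
        f"<{len(samples)}h",
        *(int(round(s * peak * 32767)) for s in samples),
    )
    buf = io.BytesIO()
    with wave.open(buf, "wb") as w:
        w.setnchannels(1)
        w.setsampwidth(2)
        w.setframerate(SAMPLE_RATE)
        w.writeframes(pcm)
    return buf.getvalue()
```

For example, `clip_to_wav_bytes(synth_clip("triangle", 880.0, 0.25, am_rate=8.0, am_depth=0.5))` would yield the bytes of a quarter-second amplitude-modulated triangle tone, which can be written to a `.wav` file. Keeping every parameter explicit in the function signature mirrors the idea of a recipe whose settings can be recorded alongside each clip in a metadata table.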