BeepBank-500: A Synthetic Earcon Mini-Corpus for UI Sound Research and Psychoacoustics Research
September 21, 2025
Author: Mandip Goswami
cs.AI
Abstract
We introduce BeepBank-500, a compact, fully synthetic earcon/alert dataset
(300-500 clips) designed for rapid, rights-clean experimentation in
human-computer interaction and audio machine learning. Each clip is generated
from a parametric recipe controlling waveform family (sine, square, triangle,
FM), fundamental frequency, duration, amplitude envelope, amplitude modulation
(AM), and lightweight Schroeder-style reverberation. We use three reverberation
settings: dry, and two synthetic rooms denoted 'rir small' ('small') and 'rir
medium' ('medium') throughout the paper and in the metadata. We release mono 48
kHz WAV audio (16-bit), a rich metadata table (signal/spectral features), and
tiny reproducible baselines for (i) waveform-family classification and (ii) f0
regression on single tones. The corpus targets tasks such as earcon
classification, timbre analyses, and onset detection, with clearly stated
licensing and limitations. Audio is dedicated to the public domain via CC0-1.0;
code is released under the MIT license. Data DOI: https://doi.org/10.5281/zenodo.17172015. Code:
https://github.com/mandip42/earcons-mini-500.
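
The parametric recipe described in the abstract can be illustrated with a minimal sketch. It is not the authors' generator: the function names, the FM ratio and index, the linear attack/release envelope, the comb and allpass delays and gains, and the wet/dry mix are all assumptions chosen for illustration. Only the waveform families, the 48 kHz mono 16-bit format, and the general idea of a Schroeder-style reverb (parallel feedback combs followed by an allpass) come from the text. The sketch uses only the Python standard library.

```python
import math
import struct
import wave

SR = 48_000  # sample rate stated in the abstract


def oscillator(family, f0, t, fm_ratio=2.0, fm_index=1.5):
    """One sample of the chosen waveform family at time t (seconds).

    fm_ratio and fm_index are hypothetical FM parameters, not from the paper.
    """
    phase = (f0 * t) % 1.0
    if family == "sine":
        return math.sin(2 * math.pi * phase)
    if family == "square":
        return 1.0 if phase < 0.5 else -1.0
    if family == "triangle":
        return 4.0 * abs(phase - 0.5) - 1.0
    if family == "fm":
        mod = fm_index * math.sin(2 * math.pi * fm_ratio * f0 * t)
        return math.sin(2 * math.pi * f0 * t + mod)
    raise ValueError(f"unknown waveform family: {family}")


def envelope(i, n, attack, release):
    """Linear attack/release gain in [0, 1]; attack and release are in samples."""
    gain = 1.0
    if attack > 0:
        gain = min(gain, i / attack)
    if release > 0:
        gain = min(gain, (n - 1 - i) / release)
    return max(gain, 0.0)


def synthesize(family, f0, duration, attack=0.01, release=0.05,
               am_rate=0.0, am_depth=0.0, peak=0.8):
    """Render one clip as a list of floats in [-peak, peak]."""
    n = int(round(duration * SR))
    a, r = int(attack * SR), int(release * SR)
    out = []
    for i in range(n):
        t = i / SR
        # Amplitude modulation: gain swings between (1 - am_depth) and 1.
        am = 1.0 - am_depth * 0.5 * (1.0 + math.sin(2 * math.pi * am_rate * t))
        out.append(peak * envelope(i, n, a, r) * am * oscillator(family, f0, t))
    return out


def schroeder_reverb(x, comb_delays=(1557, 1617), comb_gain=0.7,
                     ap_delay=225, ap_gain=0.5, mix=0.2):
    """Parallel feedback combs, then one allpass, mixed with the dry signal.

    The output has the same length as the input, so the tail is truncated.
    """
    wet = [0.0] * len(x)
    for d in comb_delays:
        y = [0.0] * len(x)
        for i, s in enumerate(x):
            y[i] = s + (comb_gain * y[i - d] if i >= d else 0.0)
            wet[i] += y[i] / len(comb_delays)
    # Allpass: y[n] = -g * x[n] + x[n - D] + g * y[n - D]
    ap = [0.0] * len(x)
    for i, s in enumerate(wet):
        past_in = wet[i - ap_delay] if i >= ap_delay else 0.0
        past_out = ap[i - ap_delay] if i >= ap_delay else 0.0
        ap[i] = -ap_gain * s + past_in + ap_gain * past_out
    return [(1.0 - mix) * dry + mix * w for dry, w in zip(x, ap)]


def write_wav(path, samples):
    """Write mono 16-bit PCM at SR, clamping samples to [-1, 1]."""
    frames = b"".join(
        struct.pack("<h", int(max(-1.0, min(1.0, s)) * 32767)) for s in samples
    )
    with wave.open(path, "wb") as w:
        w.setnchannels(1)
        w.setsampwidth(2)
        w.setframerate(SR)
        w.writeframes(frames)


# Example: a 250 ms triangle tone at 880 Hz with mild 8 Hz tremolo.
clip = synthesize("triangle", f0=880.0, duration=0.25, am_rate=8.0, am_depth=0.3)
wet_clip = schroeder_reverb(clip)
```

Calling `write_wav("example_earcon.wav", wet_clip)` would save the result in the format the abstract describes. Sweeping `family`, `f0`, `duration`, and the reverb settings over a grid is one plausible way to build a corpus of a few hundred clips with known ground-truth labels.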