Tadabur: A Large-Scale Quran Audio Dataset

Abstract

Despite growing interest in Quranic data research, existing Quran datasets remain limited in both scale and diversity. To address this gap, we present Tadabur, a large-scale Quran audio dataset. Tadabur comprises more than 1,400 hours of recitation audio from over 600 distinct reciters, providing substantial variation in recitation styles, vocal characteristics, and recording conditions. This diversity makes Tadabur a comprehensive and representative resource for Quranic speech research and analysis. By significantly expanding both the total duration and the variability of available Quran data, Tadabur aims to support future research and facilitate the development of standardized Quranic speech benchmarks.
