Vibravox: A Dataset of French Speech Captured with Body-conduction Audio Sensors

Abstract

Vibravox is a dataset compliant with the General Data Protection Regulation (GDPR) that contains audio recordings made with five different body-conduction audio sensors: two in-ear microphones, two bone-conduction vibration pickups, and a laryngophone. The dataset also includes audio from an airborne microphone used as a reference. The Vibravox corpus contains 38 hours of speech samples and physiological sounds recorded from 188 participants under different acoustic conditions imposed by a high-order ambisonics 3D spatializer. Annotations about the recording conditions and linguistic transcriptions are also included in the corpus. We conducted a series of experiments on various speech-related tasks, including speech recognition, speech enhancement, and speaker verification. These experiments used state-of-the-art models to evaluate and compare their performance on signals captured by the different audio sensors in the Vibravox dataset, with the aim of better understanding the individual characteristics of each sensor.
