Vibravox: A Dataset of French Speech Captured with Body-conduction Audio Sensors

July 16, 2024
Authors: Julien Hauret, Malo Olivier, Thomas Joubaud, Christophe Langrenne, Sarah Poirée, Véronique Zimpfer, Éric Bavu
cs.AI

Abstract

Vibravox is a dataset compliant with the General Data Protection Regulation (GDPR) containing audio recordings made with five different body-conduction audio sensors: two in-ear microphones, two bone-conduction vibration pickups, and a laryngophone. The dataset also includes audio from an airborne microphone used as a reference. The Vibravox corpus contains 38 hours of speech samples and physiological sounds recorded from 188 participants under different acoustic conditions imposed by a high-order ambisonics 3D spatializer. Annotations about the recording conditions and linguistic transcriptions are also included in the corpus. We conducted a series of experiments on various speech-related tasks, including speech recognition, speech enhancement, and speaker verification. These experiments were carried out using state-of-the-art models to evaluate and compare their performance on signals captured by the different audio sensors offered by the Vibravox dataset, with the aim of gaining a better grasp of their individual characteristics.
