FADI-AEC: Fast Score Based Diffusion Model Guided by Far-end Signal for Acoustic Echo Cancellation

Abstract

Despite the potential of diffusion models in speech enhancement, their use in Acoustic Echo Cancellation (AEC) has been limited. In this paper, we propose DI-AEC, a pioneering diffusion-based stochastic regeneration approach dedicated to AEC. We further propose FADI-AEC, a fast score-based diffusion AEC framework that reduces computational demands, making it suitable for edge devices. It runs the score model only once per frame, which yields a significant gain in processing efficiency. In addition, we introduce a novel noise generation technique that uses the far-end signal, combining both far-end and near-end signals to improve the score model's accuracy. We evaluate the proposed method on the ICASSP 2023 Microsoft Deep Echo Cancellation Challenge evaluation dataset, where it outperforms some end-to-end methods and other diffusion-based echo cancellation methods.
