FADI-AEC: Fast Score Based Diffusion Model Guided by Far-end Signal for Acoustic Echo Cancellation

Abstract

Despite the potential of diffusion models in speech enhancement, their application to acoustic echo cancellation (AEC) has been limited. In this paper, we propose DI-AEC, the first diffusion-based stochastic regeneration approach dedicated to AEC. We further propose FADI-AEC, a fast score-based diffusion AEC framework that reduces computational demands, making it suitable for edge devices. FADI-AEC runs the score model only once per frame, which substantially improves processing efficiency. In addition, we introduce a novel noise generation technique that uses the far-end signal, so that both far-end and near-end signals contribute to improving the score model's accuracy. We evaluate the proposed method on the ICASSP 2023 Microsoft deep echo cancellation challenge evaluation dataset, where it outperforms some end-to-end methods and other diffusion-based echo cancellation methods.
