Dynamic ASR Pathways: An Adaptive Masking Approach Towards Efficient Pruning of A Multilingual ASR Model
September 22, 2023
作者: Jiamin Xie, Ke Li, Jinxi Guo, Andros Tjandra, Yuan Shangguan, Leda Sari, Chunyang Wu, Junteng Jia, Jay Mahadeokar, Ozlem Kalinli
cs.AI
Abstract
Neural network pruning offers an effective method for compressing a
multilingual automatic speech recognition (ASR) model with minimal performance
loss. However, it entails several rounds of pruning and re-training that must
be run for each language. In this work, we propose the use of an adaptive
masking approach in two scenarios for pruning a multilingual ASR model
efficiently, each resulting in sparse monolingual models or a sparse
multilingual model (named Dynamic ASR Pathways). Our approach dynamically
adapts the sub-network, avoiding premature decisions about a fixed sub-network
structure. We show that our approach outperforms existing pruning methods when
targeting sparse monolingual models. Further, we illustrate that Dynamic ASR
Pathways jointly discovers and trains better sub-networks (pathways) of a
single multilingual model by adapting from different sub-network
initializations, thereby reducing the need for language-specific pruning.
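The abstract describes adaptive masking only at a high level. As a rough illustration of the general idea, the sketch below shows magnitude-based pruning in which the binary mask is periodically re-estimated during fine-tuning rather than frozen after an initial prune. This is not the authors' implementation; the function names, the sparsity level, and the re-masking interval (`magnitude_mask`, `remask_every`, `sparsity=0.7`) are illustrative assumptions.

```python
# Minimal sketch (assumed, not the paper's code) of adaptive masking:
# the pruning mask is re-estimated from current weight magnitudes at a
# fixed interval during fine-tuning, so the sub-network ("pathway") can
# change instead of being fixed by a premature pruning decision.

import torch
import torch.nn as nn


def magnitude_mask(weight: torch.Tensor, sparsity: float) -> torch.Tensor:
    """Binary mask that keeps the largest-magnitude entries of `weight`."""
    k = int(weight.numel() * sparsity)  # number of entries to prune
    if k == 0:
        return torch.ones_like(weight)
    threshold = weight.abs().flatten().kthvalue(k).values
    return (weight.abs() > threshold).float()


def apply_masks(model: nn.Module, masks: dict) -> None:
    """Zero out the pruned weights in place."""
    with torch.no_grad():
        for name, param in model.named_parameters():
            if name in masks:
                param.mul_(masks[name])


def adaptive_mask_finetune(model, data_loader, loss_fn, sparsity=0.7,
                           remask_every=1000, steps=10000, lr=1e-4):
    """Fine-tune while periodically re-estimating the pruning masks."""
    optimizer = torch.optim.Adam(model.parameters(), lr=lr)
    masks = {n: magnitude_mask(p, sparsity)
             for n, p in model.named_parameters() if p.dim() > 1}
    for step, (inputs, targets) in enumerate(data_loader):
        if step >= steps:
            break
        apply_masks(model, masks)           # enforce the current sub-network
        loss = loss_fn(model(inputs), targets)
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
        if (step + 1) % remask_every == 0:  # adapt the mask instead of fixing it
            masks = {n: magnitude_mask(p, sparsity)
                     for n, p in model.named_parameters() if p.dim() > 1}
    apply_masks(model, masks)
    return model, masks
```

Re-estimating the mask lets weights that regain magnitude during training re-enter the sub-network, which captures the intuition of adapting the pathway rather than committing to a fixed sub-network structure up front.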