Low-rank Adaptation of Large Language Model Rescoring for Parameter-Efficient Speech Recognition

Abstract

We propose a neural language modeling system based on low-rank adaptation (LoRA) for speech recognition output rescoring. Although pretrained language models (LMs) such as BERT have shown superior performance in second-pass rescoring, the high computational cost of scaling up the pretraining stage and of adapting the pretrained models to specific domains limits their practical use in rescoring. Here we present a method based on low-rank decomposition to train a rescoring BERT model and adapt it to new domains using only a fraction (0.08%) of the pretrained parameters. The inserted low-rank matrices are optimized with a discriminative training objective together with a correlation-based regularization loss. The proposed low-rank adaptation Rescore-BERT (LoRB) architecture is evaluated on LibriSpeech and internal datasets, where it reduces training time by factors between 3.6 and 5.4.
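
To make the low-rank decomposition concrete, the following is a minimal sketch of a single adapted linear layer, assuming the common LoRA formulation in which a frozen weight W0 is augmented by a trainable product B·A of rank r. The class name `LoRALinear`, the helper `matvec`, the scaling factor `alpha`, and the toy dimensions are all illustrative assumptions; the abstract does not say which BERT layers receive the inserted matrices, and the 0.08% figure depends on the actual model and rank, which this toy does not reproduce.

```python
import random


def matvec(m, v):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, v)) for row in m]


class LoRALinear:
    """Frozen weight W0 plus a trainable low-rank update B @ A (hypothetical helper)."""

    def __init__(self, d_in, d_out, rank, alpha=1.0, seed=0):
        rng = random.Random(seed)
        # Stand-in for a pretrained weight; it stays frozen during adaptation.
        self.w0 = [[rng.gauss(0, 0.02) for _ in range(d_in)] for _ in range(d_out)]
        # Trainable factors: A is small and random, B starts at zero, so the
        # adapted layer initially reproduces the pretrained layer exactly.
        self.a = [[rng.gauss(0, 0.02) for _ in range(d_in)] for _ in range(rank)]
        self.b = [[0.0] * rank for _ in range(d_out)]
        self.scale = alpha / rank  # assumed scaling convention

    def forward(self, x):
        base = matvec(self.w0, x)
        update = matvec(self.b, matvec(self.a, x))
        return [y + self.scale * u for y, u in zip(base, update)]

    def trainable_fraction(self):
        """Share of this layer's parameters that would be updated."""
        frozen = len(self.w0) * len(self.w0[0])
        trainable = len(self.a) * len(self.a[0]) + len(self.b) * len(self.b[0])
        return trainable / (frozen + trainable)


layer = LoRALinear(d_in=64, d_out=64, rank=2)
x = [1.0] * 64
print(f"trainable fraction: {layer.trainable_fraction():.4f}")
```

Only `a` and `b` would receive gradient updates in such a setup, which is why the trainable share shrinks as the frozen matrix grows relative to the rank.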
