Beyond Transcription: Mechanistic Interpretability in ASR
August 21, 2025
Authors: Neta Glazer, Yael Segal-Feldman, Hilit Segev, Aviv Shamsian, Asaf Buchnick, Gill Hetz, Ethan Fetaya, Joseph Keshet, Aviv Navon
cs.AI
Abstract
Interpretability methods have recently gained significant attention,
particularly in the context of large language models, enabling insights into
linguistic representations, error detection, and model behaviors such as
hallucinations and repetitions. However, these techniques remain underexplored
in automatic speech recognition (ASR), despite their potential to advance both
the performance and interpretability of ASR systems. In this work, we adapt and
systematically apply established interpretability methods such as logit lens,
linear probing, and activation patching to examine how acoustic and semantic
information evolves across layers in ASR systems. Our experiments reveal
previously unknown internal dynamics, including specific encoder-decoder
interactions responsible for repetition hallucinations and semantic biases
encoded deep within acoustic representations. These insights demonstrate the
benefits of extending and applying interpretability techniques to speech
recognition, opening promising directions for future research on improving
model transparency and robustness.
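The logit-lens technique mentioned above has a simple core mechanic: project each intermediate layer's hidden state through the model's final unembedding matrix, as if that layer were the last one, and read off a token distribution. The following is a minimal sketch of that idea with hypothetical dimensions and random stand-in weights, not the paper's actual ASR setup:

```python
import numpy as np

# Hypothetical sizes and random stand-ins for a decoder's per-layer
# hidden states (at one position) and its unembedding projection.
rng = np.random.default_rng(0)
d_model, vocab_size, n_layers = 16, 32, 4
hidden_states = [rng.standard_normal(d_model) for _ in range(n_layers)]
unembed = rng.standard_normal((vocab_size, d_model))

def logit_lens(h, W):
    """Decode an intermediate hidden state as if it were the final one."""
    logits = W @ h
    e = np.exp(logits - logits.max())  # numerically stable softmax
    return e / e.sum()

# Inspect how the implied token distribution evolves across layers.
for layer, h in enumerate(hidden_states):
    probs = logit_lens(h, unembed)
    top = int(probs.argmax())
    print(f"layer {layer}: top token id = {top}, p = {probs[top]:.3f}")
```

In a real encoder-decoder ASR model, `hidden_states` would come from the decoder's residual stream and `unembed` from the trained output projection; the sketch only illustrates the projection-and-softmax step itself.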