

Mind-Paced Speaking: A Dual-Brain Approach to Real-Time Reasoning in Spoken Language Models

October 10, 2025
Authors: Donghang Wu, Haoyang Zhang, Jun Chen, Xiangyu Zhang, Hexin Liu, Eng Siong Chng, Fei Tian, Xuerui Yang, Daxin Jiang, Gang Yu
cs.AI

Abstract

Real-time Spoken Language Models (SLMs) struggle to leverage Chain-of-Thought (CoT) reasoning due to the prohibitive latency of generating the entire thought process sequentially. Enabling SLMs to think while speaking, similar to humans, is attracting increasing attention. We present, for the first time, Mind-Paced Speaking (MPS), a brain-inspired framework that enables high-fidelity, real-time reasoning. Similar to how humans utilize distinct brain regions for thinking and responding, we propose a novel dual-brain approach, employing a "Formulation Brain" for high-level reasoning to pace and guide a separate "Articulation Brain" for fluent speech generation. This division of labor eliminates mode-switching, preserving the integrity of the reasoning process. Experiments show that MPS significantly outperforms existing think-while-speaking methods and achieves reasoning performance comparable to models that pre-compute the full CoT before speaking, while drastically reducing latency. Under a zero-latency configuration, the proposed method achieves an accuracy of 92.8% on the mathematical reasoning task Spoken-MQA and attains a score of 82.5 on the speech conversation task URO-Bench. Our work effectively bridges the gap between high-quality reasoning and real-time interaction.
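The sketch below is a minimal, purely illustrative rendering of the "think-while-speaking" pacing loop the abstract describes: a "Formulation Brain" emits high-level reasoning steps that pace a separate "Articulation Brain" producing speech, so speaking can begin before the full chain of thought is complete. The function names (`formulate_step`, `articulate_chunk`, `mind_paced_speaking`) and the queue-based pacing are assumptions for illustration, not the authors' implementation or API.

```python
# Conceptual sketch of a dual-brain, think-while-speaking loop (assumed design,
# not the paper's code). The two model calls below are hypothetical placeholders.
import queue
import threading


def formulate_step(question: str, step_idx: int) -> str:
    """Hypothetical 'Formulation Brain': produce one high-level reasoning step."""
    return f"[reasoning step {step_idx} for: {question}]"


def articulate_chunk(reasoning_step: str) -> str:
    """Hypothetical 'Articulation Brain': turn a reasoning step into fluent speech text."""
    return f"<spoken chunk grounded in {reasoning_step}>"


def mind_paced_speaking(question: str, num_steps: int = 3) -> list[str]:
    """Run formulation and articulation concurrently: the formulation thread
    paces the articulation loop through a queue, so speech starts before the
    full chain of thought is finished and neither model switches modes."""
    steps: queue.Queue = queue.Queue()

    def think() -> None:
        for i in range(num_steps):
            steps.put(formulate_step(question, i))  # push each step as soon as it is ready
        steps.put(None)  # sentinel: reasoning finished

    threading.Thread(target=think, daemon=True).start()

    spoken: list[str] = []
    while (step := steps.get()) is not None:
        spoken.append(articulate_chunk(step))  # speech output paced by incoming reasoning
    return spoken


if __name__ == "__main__":
    for chunk in mind_paced_speaking("What is 17 * 24?"):
        print(chunk)
```

In this toy setup the queue is what enforces the pacing: the articulation side only consumes reasoning steps as they arrive, mirroring the abstract's claim that the division of labor removes mode-switching while keeping the reasoning process intact.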