AsyncVoice Agent: Real-Time Explanation for LLM Planning and Reasoning
October 17, 2025
Authors: Yueqian Lin, Zhengmian Hu, Jayakumar Subramanian, Qinsi Wang, Nikos Vlassis, Hai "Helen" Li, Yiran Chen
cs.AI
Abstract
Effective human-AI collaboration on complex reasoning tasks requires that users understand and interact with the model's process, not just receive an output. However, the monolithic text from methods like Chain-of-Thought (CoT) prevents this, as current interfaces lack real-time verbalization and robust user barge-in. We present AsyncVoice Agent, a system whose asynchronous architecture decouples a streaming LLM backend from a conversational voice frontend. This design allows narration and inference to run in parallel, empowering users to interrupt, query, and steer the model's reasoning process at any time. Objective benchmarks show this approach reduces interaction latency by more than 600x compared to monolithic baselines while ensuring high fidelity and competitive task accuracy. By enabling a two-way dialogue with a model's thought process, AsyncVoice Agent offers a new paradigm for building more effective, steerable, and trustworthy human-AI systems for high-stakes tasks.
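
To make the decoupling described above concrete, below is a minimal sketch (not the authors' implementation) of the general pattern: a streaming "LLM backend" task and a "voice frontend" task run concurrently, connected by a queue, while a barge-in signal lets the user interrupt narration without stopping inference. All names here (llm_backend, voice_frontend, barge_in) are illustrative assumptions, and the reasoning stream is simulated.

```python
# Hypothetical sketch of an asynchronous backend/frontend split with barge-in.
# The real AsyncVoice Agent architecture may differ; this only illustrates the
# "narration and inference run in parallel" idea from the abstract.
import asyncio


async def llm_backend(trace: asyncio.Queue) -> None:
    """Simulated streaming reasoning trace (stand-in for a real LLM stream)."""
    for step in ["parse the question", "recall relevant facts",
                 "derive an intermediate result", "compose the final answer"]:
        await asyncio.sleep(0.2)   # stands in for token-generation latency
        await trace.put(step)
    await trace.put(None)          # sentinel: reasoning finished


async def voice_frontend(trace: asyncio.Queue, barge_in: asyncio.Event) -> None:
    """Narrates reasoning steps as they arrive; yields immediately on barge-in."""
    while True:
        step = await trace.get()
        if step is None:
            break
        if barge_in.is_set():
            print("[frontend] narration paused to handle the user's query")
            barge_in.clear()
        print(f"[frontend] narrating: {step}")


async def user(barge_in: asyncio.Event) -> None:
    """Simulated user who interrupts partway through the narration."""
    await asyncio.sleep(0.5)
    print("[user] barge-in: 'why that step?'")
    barge_in.set()


async def main() -> None:
    trace: asyncio.Queue = asyncio.Queue()
    barge_in = asyncio.Event()
    # Backend and frontend run concurrently, so narration never blocks inference.
    await asyncio.gather(
        llm_backend(trace),
        voice_frontend(trace, barge_in),
        user(barge_in),
    )


asyncio.run(main())
```

Because the backend only writes to the queue and the frontend only reads from it, an interruption on the voice side never stalls token generation, which is the property the abstract credits for the large reduction in interaction latency.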