Proactive Hearing Assistants that Isolate Egocentric Conversations

November 14, 2025
Authors: Guilin Hu, Malek Itani, Tuochao Chen, Shyamnath Gollakota
cs.AI

Abstract
We introduce proactive hearing assistants that automatically identify and separate the wearer's conversation partners, without requiring explicit prompts. Our system operates on egocentric binaural audio and uses the wearer's self-speech as an anchor, leveraging turn-taking behavior and dialogue dynamics to infer conversational partners and suppress others. To enable real-time, on-device operation, we propose a dual-model architecture: a lightweight streaming model runs every 12.5 ms for low-latency extraction of the conversation partners, while a slower model runs less frequently to capture longer-range conversational dynamics. Results on real-world 2- and 3-speaker conversation test sets, collected with binaural egocentric hardware from 11 participants totaling 6.8 hours, show generalization in identifying and isolating conversational partners in multi-conversation settings. Our work marks a step toward hearing assistants that adapt proactively to conversational dynamics and engagement. More information can be found on our website: https://proactivehearing.cs.washington.edu/
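The dual-model scheduling described above (a lightweight streaming model running every 12.5 ms, with a slower model updating conversational context less frequently) can be sketched as a simple frame loop. This is a minimal illustration only: the class and parameter names below are hypothetical, the slow-model period is an assumed value, and the real system's models and state are far richer.

```python
# Hedged sketch of the dual-model scheduling from the abstract.
# All names here are hypothetical; only the 12.5 ms hop comes from the text.

FRAME_MS = 12.5   # streaming-model hop size stated in the abstract
SLOW_EVERY = 80   # assumed period for the slower model (80 frames = 1 s)

class ProactiveHearingSketch:
    def __init__(self, fast_model, slow_model):
        self.fast = fast_model        # low-latency partner-speech extractor
        self.slow = slow_model        # longer-range conversational-dynamics model
        self.partner_state = None     # inferred set of conversation partners
        self.frame_idx = 0

    def process_frame(self, binaural_frame):
        # Slow path: occasionally re-infer who the partners are, using
        # turn-taking around the wearer's self-speech as the anchor.
        if self.frame_idx % SLOW_EVERY == 0:
            self.partner_state = self.slow(binaural_frame, self.partner_state)
        self.frame_idx += 1
        # Fast path: every 12.5 ms frame, extract partner speech and
        # suppress other talkers, conditioned on the current partner state.
        return self.fast(binaural_frame, self.partner_state)
```

Splitting the work this way keeps per-frame latency low while still letting the system adapt as conversations form and dissolve; the slower model amortizes the expensive reasoning over many frames.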