

Proactive Hearing Assistants that Isolate Egocentric Conversations

November 14, 2025
Authors: Guilin Hu, Malek Itani, Tuochao Chen, Shyamnath Gollakota
cs.AI

Abstract

We introduce proactive hearing assistants that automatically identify and separate the wearer's conversation partners, without requiring explicit prompts. Our system operates on egocentric binaural audio and uses the wearer's self-speech as an anchor, leveraging turn-taking behavior and dialogue dynamics to infer conversational partners and suppress others. To enable real-time, on-device operation, we propose a dual-model architecture: a lightweight streaming model runs every 12.5 ms for low-latency extraction of the conversation partners, while a slower model runs less frequently to capture longer-range conversational dynamics. Results on real-world 2- and 3-speaker conversation test sets, collected with binaural egocentric hardware from 11 participants totaling 6.8 hours, show generalization in identifying and isolating conversational partners in multi-conversation settings. Our work marks a step toward hearing assistants that adapt proactively to conversational dynamics and engagement. More information can be found on our website: https://proactivehearing.cs.washington.edu/
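The dual-model architecture described above can be sketched as a simple two-rate scheduling loop: a lightweight streaming model processes every 12.5 ms audio frame, while a slower model updates longer-range conversational context only occasionally and conditions the fast path. This is a minimal illustrative sketch, not the paper's implementation; the function names, the one-second slow-model period, and the scalar "partner gain" context are all assumptions for illustration.

```python
# Hypothetical sketch of the dual-model scheduling from the abstract.
# All names, rates (except the 12.5 ms hop), and the scalar-gain
# context are illustrative assumptions, not the paper's actual design.

FRAME_MS = 12.5   # streaming hop size stated in the abstract
SLOW_EVERY = 80   # assumed: slow model runs once per second (80 * 12.5 ms)

def fast_model(frame, context):
    # Placeholder for low-latency extraction of partner speech:
    # scale the frame by the current partner gain.
    return [s * context["partner_gain"] for s in frame]

def slow_model(history):
    # Placeholder for long-range inference (e.g. turn-taking analysis):
    # here it simply "detects" a partner once any history exists.
    return {"partner_gain": 1.0 if history else 0.0}

def process_stream(frames):
    context = {"partner_gain": 0.0}
    history, outputs = [], []
    for i, frame in enumerate(frames):
        if i % SLOW_EVERY == 0:
            # Infrequent update capturing longer-range dynamics.
            context = slow_model(history)
        # Low-latency path: runs on every 12.5 ms frame.
        outputs.append(fast_model(frame, context))
        history.append(frame)
    return outputs
```

The key design point mirrored here is that the expensive, context-aware computation is amortized across many frames, so the per-frame latency is bounded by the lightweight model alone.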