LLAMAPIE: Proactive In-Ear Conversation Assistants

Abstract

We introduce LlamaPIE, the first real-time proactive assistant designed to enhance human conversations through discreet, concise guidance delivered via hearable devices. Unlike traditional language models that require explicit user invocation, this assistant operates in the background, anticipating user needs without interrupting conversations. We address several challenges, including determining when to respond, crafting concise responses that enhance conversations, leveraging knowledge of the user for context-aware assistance, and processing in real time on the device. To achieve this, we construct a semi-synthetic dialogue dataset and propose a two-model pipeline: a small model that decides when to respond and a larger model that generates the response. We evaluate our approach on real-world datasets, demonstrating its effectiveness in providing helpful, unobtrusive assistance. User studies with our assistant, implemented on Apple Silicon M2 hardware, show a strong preference for the proactive assistant over both a baseline with no assistance and a reactive model, highlighting the potential of LlamaPIE to enhance live conversations.
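The two-model pipeline described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the class `ProactivePipeline`, the callables `should_respond` and `generate`, and the toy stand-ins below are all hypothetical names chosen for illustration. The sketch only shows the control flow, in which a cheap gate runs on every utterance and the larger generator runs only when the gate fires.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional


@dataclass
class ProactivePipeline:
    # Hypothetical interfaces: the abstract does not specify how the models are called.
    should_respond: Callable[[List[str]], bool]  # small "when to respond" model
    generate: Callable[[List[str]], str]         # larger response-generation model
    history: List[str] = field(default_factory=list)

    def on_utterance(self, utterance: str) -> Optional[str]:
        """Record one transcribed utterance; return a short hint or None."""
        self.history.append(utterance)
        if not self.should_respond(self.history):
            return None  # stay silent; the larger model is never invoked
        return self.generate(self.history)


# Toy stand-ins so the sketch runs without any real models (assumptions only).
def toy_gate(history: List[str]) -> bool:
    # Fire only when the latest utterance is a question.
    return history[-1].rstrip().endswith("?")


def toy_generator(history: List[str]) -> str:
    return "hint for: " + history[-1]


pipeline = ProactivePipeline(should_respond=toy_gate, generate=toy_generator)
silent = pipeline.on_utterance("Nice weather today.")
hint = pipeline.on_utterance("What year was it founded?")
```

Keeping the gate separate from the generator means the expensive model is skipped on most utterances, which is one plausible reading of why the abstract pairs a small decision model with a larger generation model for real-time, on-device use.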
