

Beyond the Turn-Based Game: Enabling Real-Time Conversations with Duplex Models

June 22, 2024
作者: Xinrong Zhang, Yingfa Chen, Shengding Hu, Xu Han, Zihang Xu, Yuanwei Xu, Weilin Zhao, Maosong Sun, Zhiyuan Liu
cs.AI

Abstract

As large language models (LLMs) increasingly permeate daily lives, there is a growing demand for real-time interactions that mirror human conversations. Traditional turn-based chat systems driven by LLMs prevent users from verbally interacting with the system while it is generating responses. To overcome these limitations, we adapt existing LLMs into duplex models so that they can listen to users while generating output and dynamically adjust themselves to provide instant feedback, such as in response to interruptions. Specifically, we divide the queries and responses of conversations into several time slices and then adopt a time-division-multiplexing (TDM) encoding-decoding strategy to pseudo-simultaneously process these slices. Furthermore, to make LLMs proficient enough to handle real-time conversations, we build a fine-tuning dataset consisting of alternating time slices of queries and responses, covering the typical feedback types found in instantaneous interactions. Our experiments show that although the queries and responses of conversations are segmented into incomplete slices for processing, LLMs can preserve their original performance on standard benchmarks after a few fine-tuning steps on our dataset. Automatic and human evaluations indicate that duplex models make user-AI interactions more natural and human-like, and greatly improve user satisfaction compared to vanilla LLMs. Our duplex model and dataset will be released.
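
The time-division-multiplexing idea described in the abstract can be pictured as a simple interleaving loop: each time slot, the model first ingests any newly arrived query slice and then emits at most one response slice, so listening and speaking appear simultaneous. The sketch below is only an illustration of that idea, not the authors' released implementation; the names `user_stream.poll()`, `user_stream.closed`, and `model.generate()` are assumed placeholders.

```python
# Hypothetical sketch of a TDM-style duplex conversation loop.
# All interfaces (user_stream, model.generate) are illustrative assumptions.
from collections import deque


def tdm_conversation_loop(model, user_stream, slice_tokens=16):
    """Interleave incoming query slices with outgoing response slices."""
    context = []             # running conversation context, as (role, slice) pairs
    pending_user = deque()   # user slices received but not yet encoded

    while True:
        # 1. Listen: pull the query slice (if any) produced in this time slot.
        user_slice = user_stream.poll()          # returns None if the user is silent
        if user_slice is not None:
            pending_user.append(user_slice)

        # 2. Encode the newest user slices before decoding, so the model can
        #    react mid-response (e.g. to an interruption).
        while pending_user:
            context.append(("user", pending_user.popleft()))

        # 3. Speak: decode at most one short response slice for this time slot.
        response_slice = model.generate(context, max_new_tokens=slice_tokens)
        if response_slice:                       # the model may choose to stay silent
            context.append(("assistant", response_slice))
            yield response_slice

        # Stop once the user stream is closed and nothing remains to process.
        if user_stream.closed and not pending_user and not response_slice:
            break
```

Under this framing, the fine-tuning data simply consists of such alternating query/response slices, so the model learns when to emit a slice and when to remain silent.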
