Beyond the Turn-Based Game: Enabling Real-Time Conversations with Duplex Models
June 22, 2024
Authors: Xinrong Zhang, Yingfa Chen, Shengding Hu, Xu Han, Zihang Xu, Yuanwei Xu, Weilin Zhao, Maosong Sun, Zhiyuan Liu
cs.AI
Abstract
As large language models (LLMs) increasingly permeate daily lives, there is a
growing demand for real-time interactions that mirror human conversations.
Traditional turn-based chat systems driven by LLMs prevent users from verbally
interacting with the system while it is generating responses. To overcome these
limitations, we adapt existing LLMs to duplex models so that these
LLMs can listen for users while generating output and dynamically adjust
themselves to provide users with instant feedback, such as in response to
interruptions. Specifically, we divide the queries and responses of
conversations into several time slices and then adopt a
time-division-multiplexing (TDM) encoding-decoding strategy to
pseudo-simultaneously process these slices. Furthermore, to make LLMs
proficient enough to handle real-time conversations, we build a fine-tuning
dataset consisting of alternating time slices of queries and responses as well
as covering typical feedback types in instantaneous interactions. Our
experiments show that although the queries and responses of conversations are
segmented into incomplete slices for processing, LLMs can preserve their
original performance on standard benchmarks with a few fine-tuning steps on our
dataset. Automatic and human evaluations indicate that duplex models make
user-AI interactions more natural and human-like, and greatly improve user
satisfaction compared to vanilla LLMs. Our duplex model and dataset will be
released.
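
The time-slicing idea in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the abstract gives no code, slice length, or model interface, so `slice_tokens`, `duplex_loop`, `toy_respond`, and the slice length of 2 are all hypothetical names and values chosen for illustration. The sketch only shows the alternation pattern, in which the system reads one query slice, then gets a chance to emit one response slice (possibly empty), and so on.

```python
from typing import Callable, List, Tuple

# A context is the interleaved history of (role, slice) pairs.
Context = List[Tuple[str, List[str]]]


def slice_tokens(tokens: List[str], slice_len: int) -> List[List[str]]:
    """Split a token list into consecutive slices of at most slice_len tokens."""
    if slice_len <= 0:
        raise ValueError("slice_len must be positive")
    return [tokens[i:i + slice_len] for i in range(0, len(tokens), slice_len)]


def duplex_loop(
    query_slices: List[List[str]],
    respond: Callable[[Context], List[str]],
) -> List[Tuple[List[str], List[str]]]:
    """Alternate between consuming one query slice and emitting one response slice.

    `respond` stands in for the model (an assumption of this sketch): it sees
    the interleaved history so far and returns the next response slice, which
    may be empty if the model chooses to keep listening.
    """
    context: Context = []
    transcript: List[Tuple[List[str], List[str]]] = []
    for query_slice in query_slices:
        context.append(("user", query_slice))
        response_slice = respond(context)
        context.append(("model", response_slice))
        transcript.append((query_slice, response_slice))
    return transcript


def toy_respond(context: Context) -> List[str]:
    """Toy stand-in for a model: stay silent until the user slice ends with '?'."""
    _, last_slice = context[-1]
    if last_slice and last_slice[-1].endswith("?"):
        return ["<answer>"]
    return []


query = "what is a duplex model ?".split()
transcript = duplex_loop(slice_tokens(query, 2), toy_respond)
for query_slice, response_slice in transcript:
    print(query_slice, "->", response_slice)
```

In this toy run the stand-in model stays silent for the first two slices and responds only after the slice that completes the question. A real duplex model would make that decision from learned behavior rather than a hard-coded rule.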