Know You First and Be You Better: Modeling Human-Like User Simulators via Implicit Profiles
February 26, 2025
Authors: Kuang Wang, Xianfei Li, Shenghao Yang, Li Zhou, Feng Jiang, Haizhou Li
cs.AI
Abstract
User simulators are crucial for replicating human interactions with dialogue
systems, supporting both collaborative training and automatic evaluation,
especially for large language models (LLMs). However, existing simulators often
rely solely on text utterances, missing implicit user traits such as
personality, speaking style, and goals. In contrast, persona-based methods lack
generalizability, as they depend on predefined profiles of famous individuals
or archetypes. To address these challenges, we propose User Simulator with
implicit Profiles (USP), a framework that infers implicit user profiles from
human-machine conversations and uses them to generate more personalized and
realistic dialogues. We first develop an LLM-driven extractor with a
comprehensive profile schema. Then, we refine the simulation through
conditional supervised fine-tuning and reinforcement learning with cycle
consistency, optimizing it at both the utterance and conversation levels.
Finally, we adopt a diverse profile sampler to capture the distribution of
real-world user profiles. Experimental results demonstrate that USP outperforms
strong baselines in terms of authenticity and diversity while achieving
comparable performance in consistency. Furthermore, dynamic multi-turn
evaluations based on USP strongly align with mainstream benchmarks,
demonstrating its effectiveness in real-world applications.
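
To make the idea of a cycle-consistency signal more concrete, the following is a minimal sketch, not the authors' implementation. It assumes the reward compares the profile the simulator was conditioned on with a profile re-extracted from the dialogue it produced. The names `cycle_consistency_reward`, `toy_extractor`, and `jaccard` are hypothetical, and token-level Jaccard overlap is a toy stand-in for whatever similarity measure and LLM-driven extractor the paper actually uses.

```python
from typing import Callable, List


def jaccard(a: str, b: str) -> float:
    """Token-level Jaccard similarity (a toy stand-in for a real profile similarity)."""
    tokens_a, tokens_b = set(a.lower().split()), set(b.lower().split())
    if not tokens_a and not tokens_b:
        return 1.0  # two empty profiles are treated as identical
    return len(tokens_a & tokens_b) / len(tokens_a | tokens_b)


def cycle_consistency_reward(
    target_profile: str,
    simulated_user_turns: List[str],
    extract_profile: Callable[[List[str]], str],
) -> float:
    """Score how well the target profile can be recovered from simulated turns.

    Assumption: a higher score means the simulated dialogue stays closer to the
    profile it was conditioned on.
    """
    recovered_profile = extract_profile(simulated_user_turns)
    return jaccard(target_profile, recovered_profile)


def toy_extractor(turns: List[str]) -> str:
    """Hypothetical extractor: concatenates the user turns.

    In the framework described above, this role is played by an LLM-driven extractor.
    """
    return " ".join(turns)


if __name__ == "__main__":
    profile = "curious student asks about python"
    faithful = ["curious student", "asks about python"]
    off_profile = ["impatient manager", "asks about budgets"]
    print(cycle_consistency_reward(profile, faithful, toy_extractor))
    print(cycle_consistency_reward(profile, off_profile, toy_extractor))
```

In this toy setup, dialogue that reflects the target profile scores higher than dialogue that drifts away from it, which is the kind of signal a reinforcement learning stage could optimize at the conversation level.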