Goal Alignment in LLM-Based User Simulators for Conversational AI
July 27, 2025
Authors: Shuhaib Mehri, Xiaocheng Yang, Takyoung Kim, Gokhan Tur, Shikib Mehri, Dilek Hakkani-Tür
cs.AI
Abstract
User simulators are essential to conversational AI, enabling scalable agent development and evaluation through simulated interactions. While current Large Language Models (LLMs) have advanced user simulation capabilities, we reveal that they struggle to consistently demonstrate goal-oriented behavior across multi-turn conversations, a critical limitation that compromises their reliability in downstream applications. We introduce User Goal State Tracking (UGST), a novel framework that tracks user goal progression throughout conversations. Leveraging UGST, we present a three-stage methodology for developing user simulators that can autonomously track goal progression and reason to generate goal-aligned responses. Moreover, we establish comprehensive evaluation metrics for measuring goal alignment in user simulators, and demonstrate that our approach yields substantial improvements across two benchmarks (MultiWOZ 2.4 and τ-Bench). Our contributions address a critical gap in conversational AI and establish UGST as an essential framework for developing goal-aligned user simulators.