Goal Alignment in LLM-Based User Simulators for Conversational AI
July 27, 2025
Authors: Shuhaib Mehri, Xiaocheng Yang, Takyoung Kim, Gokhan Tur, Shikib Mehri, Dilek Hakkani-Tür
cs.AI
Abstract
User simulators are essential to conversational AI, enabling scalable agent development and evaluation through simulated interactions. While current Large Language Models (LLMs) have advanced user simulation capabilities, we reveal that they struggle to consistently demonstrate goal-oriented behavior across multi-turn conversations, a critical limitation that compromises their reliability in downstream applications. We introduce User Goal State Tracking (UGST), a novel framework that tracks user goal progression throughout conversations. Leveraging UGST, we present a three-stage methodology for developing user simulators that can autonomously track goal progression and reason to generate goal-aligned responses. Moreover, we establish comprehensive evaluation metrics for measuring goal alignment in user simulators, and demonstrate that our approach yields substantial improvements across two benchmarks (MultiWOZ 2.4 and τ-Bench). Our contributions address a critical gap in conversational AI and establish UGST as an essential framework for developing goal-aligned user simulators.