Know You First and Be You Better: Modeling Human-Like User Simulators via Implicit Profiles

Abstract

User simulators are crucial for replicating human interactions with dialogue systems, supporting both collaborative training and automatic evaluation, especially for large language models (LLMs). However, existing simulators often rely solely on text utterances and miss implicit user traits such as personality, speaking style, and goals. Persona-based methods, in turn, lack generalizability because they depend on predefined profiles of famous individuals or archetypes. To address these challenges, we propose User Simulator with implicit Profiles (USP), a framework that infers implicit user profiles from human-machine conversations and uses them to generate more personalized and realistic dialogues. We first develop an LLM-driven extractor with a comprehensive profile schema. We then refine the simulation through conditional supervised fine-tuning and reinforcement learning with cycle consistency, optimizing it at both the utterance and conversation levels. Finally, we adopt a diverse profile sampler to capture the distribution of real-world user profiles. Experimental results demonstrate that USP outperforms strong baselines in terms of authenticity and diversity while achieving comparable performance in consistency. Furthermore, dynamic multi-turn evaluations based on USP strongly align with mainstream benchmarks, demonstrating USP's effectiveness in real-world applications.
