SPASM：面向多轮对话生成的稳定角色驱动型智能体仿真框架

摘要

大型语言模型正日益广泛应用于多轮对话场景，如教学辅导、技术支持与心理疏导等，其可靠性取决于能否在长对话中保持角色、人设与目标的一致性。当LLM被用于生成训练和评估所需的合成对话时，这一要求尤为关键，因为LLM与LLM的对话会累积身份相关故障，例如人设漂移、角色混淆以及“回声效应”（即一方逐渐模仿对话伙伴的言行）。我们提出SPASM（基于稳定人设的智能体多轮对话生成框架），这一模块化框架以稳定性为核心，将对话仿真分解为三个步骤：（i）通过模式采样、合理性验证及自然语言人设构建实现人设创建；（ii）客户端-应答端对话生成；（iii）基于连贯性判断的终止检测。为在不改变模型权重的前提下提升长对话稳定性，我们提出自我中心语境投射技术：对话历史以视角无关的形式存储，并在生成对话前确定性地投射至每个智能体的自我中心视角。基于三种LLM骨干模型（GPT-4o-mini、DeepSeek-V3.2、Qwen-Plus）和九组客户端-应答端配对，我们构建了包含4,500种人设和45,000段对话的数据集（每组配对包含500种人设×10段对话）。消融实验表明，自我中心语境投射技术显著降低了人设漂移，并经人工验证完全消除了回声效应；嵌入分析不仅还原了人设结构，还揭示了应答端主导的强交互几何模式。代码已开源：https://github.com/lhannnn/SPASM。

English

Large language models are increasingly deployed in multi-turn settings such as tutoring, support, and counseling, where reliability depends on preserving consistent roles, personas, and goals across long horizons. This requirement becomes critical when LLMs are used to generate synthetic dialogues for training and evaluation, since LLM--LLM conversations can accumulate identity-related failures such as persona drift, role confusion, and "echoing", where one agent gradually mirrors its partner. We introduce SPASM (Stable Persona-driven Agent Simulation for Multi-turn dialogue generation), a modular, stability-first framework that decomposes simulation into (i) persona creation via schema sampling, plausibility validation, and natural-language persona crafting, (ii) Client--Responder dialogue generation, and (iii) termination detection for coherent stopping. To improve long-horizon stability without changing model weights, we propose Egocentric Context Projection (ECP): dialogue history is stored in a perspective-agnostic representation and deterministically projected into each agent's egocentric view before generation. Across three LLM backbones (GPT-4o-mini, DeepSeek-V3.2, Qwen-Plus) and nine Client--Responder pairings, we construct a dataset of 4,500 personas and 45,000 conversations (500 personas X 10 conversations per pairing). Ablations show ECP substantially reduces persona drift and, under human validation, eliminates echoing; embedding analyses recover persona structure and reveal strong responder-driven interaction geometry. Our code is available at https://github.com/lhannnn/SPASM.