When to Act, When to Wait: Modeling Structural Trajectories for Intent Triggerability in Task-Oriented Dialogue
June 2, 2025
作者: Yaoyao Qian, Jindan Huang, Yuanli Wang, Simon Yu, Kyrie Zhixuan Zhou, Jiayuan Mao, Mingfu Liang, Hanhan Zhou
cs.AI
Abstract
Task-oriented dialogue systems often face difficulties when user utterances seem semantically complete but lack the structural information necessary to trigger appropriate system action. This difficulty arises because users frequently do not fully understand their own needs, while systems require precise intent definitions. Current LLM-based agents cannot effectively distinguish between linguistically complete and contextually triggerable expressions, and they lack frameworks for collaborative intent formation. We present STORM, a framework that models asymmetric information dynamics through conversations between a UserLLM (with full internal access) and an AgentLLM (observing external behavior only). STORM produces annotated corpora that capture expression trajectories and latent cognitive transitions, enabling systematic analysis of how collaborative understanding develops. Our contributions include: (1) formalizing asymmetric information processing in dialogue systems; (2) modeling intent formation by tracking the evolution of collaborative understanding; and (3) proposing evaluation metrics that measure internal cognitive improvements alongside task performance. Experiments across four language models reveal that moderate uncertainty (40-60%) can outperform complete transparency in certain scenarios, and model-specific patterns suggest reconsidering the optimal degree of information completeness in human-AI collaboration. These findings advance our understanding of asymmetric reasoning dynamics and inform the design of uncertainty-calibrated dialogue systems.
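To make the asymmetric setup concrete, the minimal sketch below simulates a UserLLM that holds a latent goal it can only partially articulate at first, and an AgentLLM that sees only the observable utterances and must decide when an expression becomes triggerable (act) versus when to clarify (wait). All class names, policies, and thresholds here (Turn, simulate, clarity_gain, trigger_threshold) are hypothetical illustrations written for this summary, not the paper's actual implementation.

# Hypothetical sketch of STORM-style asymmetric dialogue simulation.
from dataclasses import dataclass

@dataclass
class Turn:
    speaker: str        # "user" or "agent"
    utterance: str      # externally observable text
    latent_state: dict  # UserLLM-side annotation, hidden from AgentLLM

@dataclass
class UserLLM:
    """Simulated user with full internal access to a goal it can only
    partially articulate at first."""
    goal: str
    clarity: float = 0.4  # fraction of the goal the user can express so far

    def speak(self) -> Turn:
        revealed = self.goal[: int(len(self.goal) * min(1.0, self.clarity))]
        turn = Turn("user", f"I need {revealed}...",
                    {"true_goal": self.goal, "clarity": self.clarity})
        self.clarity = min(1.0, self.clarity + 0.2)  # articulation improves
        return turn

@dataclass
class AgentLLM:
    """Agent that sees only observable utterances and must decide whether
    an expression is triggerable: act now, or wait and ask for more."""
    trigger_threshold: int = 40  # toy proxy for structural completeness

    def respond(self, history: list) -> Turn:
        observable = " ".join(t.utterance for t in history
                              if t.speaker == "user")
        if len(observable) >= self.trigger_threshold:
            return Turn("agent", "[ACT] Executing the requested task.", {})
        return Turn("agent", "Could you tell me more about what you need?", {})

def simulate(user: UserLLM, agent: AgentLLM, max_turns: int = 6) -> list:
    """Run the asymmetric loop, returning an annotated trajectory."""
    history = []
    for _ in range(max_turns):
        history.append(user.speak())
        reply = agent.respond(history)
        history.append(reply)
        if reply.utterance.startswith("[ACT]"):
            break
    return history

def clarity_gain(history: list) -> float:
    """Toy 'internal cognitive improvement' metric: change in the user's
    latent clarity between the first and last user turns. This signal lives
    only on the UserLLM side, mirroring the paper's information asymmetry."""
    user_turns = [t for t in history if t.speaker == "user"]
    return (user_turns[-1].latent_state["clarity"]
            - user_turns[0].latent_state["clarity"])

if __name__ == "__main__":
    trajectory = simulate(UserLLM(goal="a refund for a duplicate charge"),
                          AgentLLM())
    for turn in trajectory:
        print(f"{turn.speaker}: {turn.utterance}")
    print("clarity gain:", round(clarity_gain(trajectory), 2))

The hidden latent_state channel logged alongside each observable utterance is what makes the trajectory an annotated corpus in the abstract's sense: the agent's triggering decisions can be evaluated against latent cognitive transitions it never saw.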