BEDA: Belief Estimation as Probabilistic Constraints for Performing Strategic Dialogue Acts
December 31, 2025
作者: Hengli Li, Zhaoxin Yu, Qi Shen, Chenxi Li, Mengmeng Wang, Tinglang Wu, Yipeng Kang, Yuxuan Wang, Song-Chun Zhu, Zixia Jia, Zilong Zheng
cs.AI
Abstract
Strategic dialogue requires agents to execute distinct dialogue acts, for which belief estimation is essential. While prior work often estimates beliefs accurately, it lacks a principled mechanism for using those beliefs during generation. We bridge this gap by first formalizing two core acts, Adversarial and Alignment, and by operationalizing them via probabilistic constraints on what an agent may generate. We instantiate this idea in BEDA, a framework consisting of a world set, a belief estimator that infers interlocutor beliefs, and a conditional generator that selects acts and realizes utterances consistent with the inferred beliefs. Across three settings, Conditional Keeper Burglar (CKBG, adversarial), Mutual Friends (MF, cooperative), and CaSiNo (negotiation), BEDA consistently outperforms strong baselines: on CKBG it improves success rate by at least 5.0 points across backbones and by 20.6 points with GPT-4.1-nano; on Mutual Friends it achieves an average improvement of 9.3 points; and on CaSiNo it achieves the best deal relative to all baselines. These results indicate that casting belief estimation as probabilistic constraints provides a simple, general mechanism for reliable strategic dialogue.
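To make the framework's structure concrete, the following is a minimal toy sketch of the world-set / belief-estimator / conditional-generator pipeline described above. All names, the keyword-count belief model, and the 0.5 threshold are hypothetical illustrations, not the paper's actual implementation: an Alignment act requires the simulated partner belief in a target world to rise above a threshold, while an Adversarial act requires it to stay below.

```python
# Hypothetical sketch of BEDA's three components (names and belief model
# are illustrative assumptions, not the paper's implementation).

# World set: the candidate hypotheses the dialogue is about.
WORLDS = ["w1", "w2", "w3"]

def belief_estimator(history):
    """Toy stand-in for belief estimation: a smoothed keyword-count
    distribution over worlds given the dialogue history."""
    counts = {w: 1.0 for w in WORLDS}  # add-one smoothing
    for utterance in history:
        for w in WORLDS:
            if w in utterance:
                counts[w] += 1.0
    total = sum(counts.values())
    return {w: c / total for w, c in counts.items()}

def satisfies_constraint(belief, target_world, act, threshold=0.5):
    """Probabilistic constraint per act:
    Alignment  -> partner's belief in target_world >= threshold
    Adversarial -> partner's belief in target_world <  threshold"""
    p = belief[target_world]
    return p >= threshold if act == "alignment" else p < threshold

def conditional_generator(history, candidates, target_world, act):
    """Select the first candidate utterance whose simulated effect on the
    partner's belief satisfies the chosen act's constraint."""
    for candidate in candidates:
        belief = belief_estimator(history + [candidate])
        if satisfies_constraint(belief, target_world, act):
            return candidate
    return None  # no candidate satisfies the constraint

# Alignment: pick the utterance that pushes belief toward w1.
print(conditional_generator([], ["hello", "it is w1"], "w1", "alignment"))
# Adversarial: pick the utterance that keeps belief in w1 low.
print(conditional_generator([], ["it is w1", "hello"], "w1", "adversarial"))
```

The design point this sketch isolates is that the belief estimate is not merely reported but acts as a filter on generation: the same candidate pool yields different utterances depending on which act's constraint is in force.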