Delayed Verification Destabilizes Multi-Agent LLM Beliefs: Instability Thresholds and Optimal Corrector Placement

Abstract

Multi-agent large language model (LLM) systems often rely on verifier and critic agents to suppress hallucinations, but verification takes time, and during that delay false claims can propagate through the agent network. We model this process as delayed consensus on a graph with grounded corrector nodes. Spectral decomposition of the grounded Laplacian yields a closed-form stability threshold on the verification dose (the strength of correction): correction that is too strong or too delayed can turn consensus into oscillation. The most unstable regime occurs when the communication and verification delays coincide; for a delay of two, the threshold is the inverse golden ratio (about 0.618). The same framework gives a supermodular placement objective and a greedy (1-1/e)-approximation rule for assigning a limited corrector budget to influential nodes. Experiments across five open models confirm the predicted dose-delay oscillations. By contrast, grounded factual answering makes truth an absorbing boundary and eliminates the effect, suggesting that the instability is specific to signed-belief tasks while grounded verification remains stabilizing.
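The inverse-golden-ratio threshold at delay two can be illustrated with a minimal sketch. It assumes, purely for illustration, that a single spectral mode of the belief dynamics obeys the scalar recursion x[t+1] = x[t] - k * x[t - tau], where k stands in for the verification dose and tau for the shared delay. This recursion, the names `simulate`, `tail_amplitude`, and `classical_threshold`, and the chosen test doses are assumptions and not taken from the paper. For this recursion the classical stability bound is k < 2 sin(pi / (2(2 tau + 1))), which equals the inverse golden ratio when tau = 2.

```python
import math

# Assumption: one spectral mode follows x[t+1] = x[t] - k * x[t - tau].
# k plays the role of the verification dose, tau the shared delay.
# This is an illustrative stand-in, not the paper's stated model.

INVERSE_GOLDEN_RATIO = (math.sqrt(5) - 1) / 2


def classical_threshold(tau):
    """Classical stability bound on k for x[t+1] = x[t] - k * x[t - tau]."""
    return 2.0 * math.sin(math.pi / (2 * (2 * tau + 1)))


def simulate(k, tau, steps=2000):
    """Iterate the delayed recursion from a constant unit history."""
    history = [1.0] * (tau + 1)  # x[-tau], ..., x[0]
    for _ in range(steps):
        # history[-1] is x[t]; history[-1 - tau] is x[t - tau].
        history.append(history[-1] - k * history[-1 - tau])
    return history


def tail_amplitude(k, tau, steps=2000, window=50):
    """Largest |x| over the last `window` steps of the trajectory."""
    trajectory = simulate(k, tau, steps)
    return max(abs(x) for x in trajectory[-window:])


if __name__ == "__main__":
    tau = 2
    print("threshold at tau=2:", classical_threshold(tau))
    print("inverse golden ratio:", INVERSE_GOLDEN_RATIO)
    for k in (0.55, 0.70):
        print("k =", k, "tail amplitude =", tail_amplitude(k, tau))
```

In this toy recursion, a dose slightly below the bound decays toward consensus, while a dose slightly above it produces growing oscillations, which mirrors the dose-delay behavior the abstract describes.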