ERGO: Entropy-guided Resetting for Generation Optimization in Multi-turn Language Models

Abstract

Large Language Models (LLMs) suffer significant performance degradation in multi-turn conversations when information is presented incrementally. Because multi-turn conversations characterize everyday interactions with LLMs, this degradation poses a severe challenge to real-world usability. We hypothesize that abrupt increases in model uncertainty signal misalignment in multi-turn LLM interactions, and we exploit this insight to dynamically realign conversational context. We introduce ERGO (Entropy-guided Resetting for Generation Optimization), which continuously quantifies internal uncertainty via Shannon entropy over next-token distributions and triggers adaptive prompt consolidation when a sharp spike in entropy is detected. By treating uncertainty as a first-class signal rather than a nuisance to eliminate, ERGO embraces variability in language and modeling, representing and responding to uncertainty. In multi-turn tasks with incrementally revealed instructions, ERGO yields a 56.6% average performance gain over standard baselines, increases aptitude (peak performance capability) by 24.7%, and decreases unreliability (variability in performance) by 35.3%, demonstrating that uncertainty-aware interventions can improve both accuracy and reliability in conversational AI.
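The entropy signal described in the abstract can be illustrated with a minimal sketch. The abstract specifies only that Shannon entropy is computed over next-token distributions and that a sharp spike triggers prompt consolidation. Everything else here is an assumption made for illustration and is not the authors' implementation: averaging entropy over a response, detecting a spike as a turn-to-turn difference, the threshold value, and the hypothetical `consolidate` helper.

```python
import math
from typing import Sequence


def shannon_entropy(probs: Sequence[float]) -> float:
    """Shannon entropy (in nats) of a single next-token distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def mean_entropy(distributions: Sequence[Sequence[float]]) -> float:
    """Average per-token entropy over one generated response (assumed aggregation)."""
    if not distributions:
        return 0.0
    return sum(shannon_entropy(d) for d in distributions) / len(distributions)


def should_reset(prev_entropy: float, curr_entropy: float, threshold: float) -> bool:
    """Flag a spike when entropy rises by more than `threshold` between turns (assumed rule)."""
    return (curr_entropy - prev_entropy) > threshold


def consolidate(user_turns: Sequence[str]) -> str:
    """Hypothetical stand-in for prompt consolidation: merge user turns into one prompt."""
    return "\n".join(user_turns)


# Toy example: a confident turn followed by a maximally uncertain one.
confident_turn = [[0.97, 0.01, 0.01, 0.01]]
uncertain_turn = [[0.25, 0.25, 0.25, 0.25]]

prev = mean_entropy(confident_turn)
curr = mean_entropy(uncertain_turn)

if should_reset(prev, curr, threshold=0.5):  # threshold chosen arbitrarily
    fresh_prompt = consolidate(["Write a function.", "It must handle empty input."])
```

In practice the distributions would come from the model's output probabilities at each decoding step, and the threshold would need to be tuned per model and task.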
