Context Revision and Simulated Stance Shift: Auditing LLM-Based Stance Simulation in Online Discussions

Abstract

Large language models (LLMs) are increasingly used to simulate social media users and to infer how individuals may respond to online discussions. However, it remains unclear whether these simulations reflect precise, user-specific beliefs or are instead highly sensitive to semantically independent changes in the conversational context. In this work, we study counterfactual context revision as a framework for auditing LLM-based stance simulation. Given an original online conversation, we first infer a target user's stance toward a specific topic. We then apply controlled revision strategies to the conversational context and simulate the user's stance again under the revised context. We compare text-only revision strategies with a multimodal strategy that incorporates meme-based context, and we evaluate two main effectiveness metrics: average directional stance shift and stance transition rate. The results show effective and robust stance transitions for both text-only and multimodal strategies across different polarization-preference mechanisms. Our study contributes an evaluation framework for understanding the context sensitivity of LLM-based stance simulation. More broadly, it highlights both the promise and the risk of using LLMs to simulate online opinion dynamics.
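The abstract names two metrics without defining them. The sketch below shows one plausible reading, not necessarily the definitions used in this work. It assumes stances are encoded numerically (here -1 for against, 0 for neutral, +1 for favor), that each revision has an intended direction of +1 or -1, and that a "transition" means the simulated stance label changed at all. The function names and the encoding are illustrative assumptions.

```python
from typing import Sequence


def average_directional_stance_shift(
    before: Sequence[int], after: Sequence[int], direction: int
) -> float:
    """Mean signed stance change, oriented along the intended direction.

    Assumption: stances are numeric (-1, 0, +1) and `direction` is +1 or -1.
    A positive result means stances moved, on average, the intended way.
    """
    if len(before) != len(after) or not before:
        raise ValueError("before and after must be non-empty and equal length")
    if direction not in (1, -1):
        raise ValueError("direction must be +1 or -1")
    total = sum(direction * (a - b) for b, a in zip(before, after))
    return total / len(before)


def stance_transition_rate(before: Sequence[int], after: Sequence[int]) -> float:
    """Fraction of users whose simulated stance label changed after revision.

    Assumption: any change of label counts as a transition.
    """
    if len(before) != len(after) or not before:
        raise ValueError("before and after must be non-empty and equal length")
    changed = sum(1 for b, a in zip(before, after) if a != b)
    return changed / len(before)


# Hypothetical stances for four users, before and after a revision
# intended to push stances toward "favor" (+1).
original = [-1, 0, 1, -1]
revised = [0, 0, 1, 1]

print(average_directional_stance_shift(original, revised, direction=1))  # 0.75
print(stance_transition_rate(original, revised))  # 0.5
```

Under this reading, the first metric captures how far stances move in the intended direction, while the second captures only how often they move at all, so a revision could score high on one and low on the other.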