Parametric Social Identity Injection and Diversification in Public Opinion Simulation

Abstract

Large language models (LLMs) have recently been adopted as synthetic agents for public opinion simulation, offering a promising alternative to costly and slow human surveys. Despite their scalability, current LLM-based simulation methods fail to capture social diversity: they flatten inter-group differences and produce overly homogeneous responses across demographic groups. We trace this limitation to a Diversity Collapse phenomenon in LLM hidden representations, in which distinct social identities become increasingly indistinguishable across layers. Motivated by this observation, we propose Parametric Social Identity Injection (PSII), a general framework that injects explicit, parametric representations of demographic attributes and value orientations directly into the intermediate hidden states of LLMs. Unlike prompt-based persona conditioning, PSII enables fine-grained and controllable identity modulation at the representation level. Extensive experiments on the World Values Survey using multiple open-source LLMs show that PSII significantly improves distributional fidelity and diversity, reducing KL divergence to real-world survey data while enhancing overall diversity. This work provides new insights into representation-level control of LLM agents and advances scalable, diversity-aware public opinion simulation.
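
To make the fidelity metric concrete, the following is a minimal sketch of how the KL divergence between a survey answer distribution and a simulated one could be computed for a single question. It is an illustration under stated assumptions, not the paper's evaluation code: the direction KL(survey || simulated), the additive smoothing constant, the helper names `answer_distribution` and `kl_divergence`, and the toy response lists are all hypothetical.

```python
import math
from collections import Counter


def answer_distribution(responses, options, smoothing=1e-6):
    """Empirical distribution over answer options, with additive smoothing
    (an assumption here) so that no option has zero probability."""
    counts = Counter(responses)
    total = len(responses) + smoothing * len(options)
    return [(counts[o] + smoothing) / total for o in options]


def kl_divergence(p, q):
    """KL(p || q) in nats for two discrete distributions over the same options."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


# Toy data (hypothetical): answers on a 4-point scale for one question.
options = [1, 2, 3, 4]
survey = [1, 1, 2, 2, 3, 3, 4, 4]      # human respondents
collapsed = [2, 2, 2, 2, 2, 2, 2, 3]   # homogeneous simulated agents
diverse = [1, 1, 2, 2, 3, 3, 4, 3]     # more varied simulated agents

p = answer_distribution(survey, options)
kl_collapsed = kl_divergence(p, answer_distribution(collapsed, options))
kl_diverse = kl_divergence(p, answer_distribution(diverse, options))

print(f"KL(survey || collapsed) = {kl_collapsed:.3f}")
print(f"KL(survey || diverse)   = {kl_diverse:.3f}")
```

In this toy setup the homogeneous responses yield a much larger divergence than the varied ones, which is the sense in which a lower KL divergence indicates closer agreement with the survey distribution.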