Beyond Alignment: Value Diversity as a Collective Property in Multicultural Agent Systems

Abstract

Multicultural multi-agent systems are increasingly deployed in globally diverse settings, where different agents are grounded in different cultural backgrounds. Existing cultural evaluation focuses on value alignment: how closely a single agent matches a target culture. Yet alignment is a per-agent property and cannot reveal whether a system, taken as a whole, preserves the cultural plurality it is meant to represent. We propose value diversity as a system-level evaluation axis for multicultural agent systems, defined through the dissimilarity between culturally conditioned agents' responses on a shared value survey. Using the World Values Survey, we evaluate 19 cultures and 18 backbone models across a wide range of system configurations. We find that diversity is largely uncorrelated with alignment, indicating that the two capture complementary system properties, and that current multicultural agent systems fall substantially below human societies in value diversity. Mixed-backbone systems narrow this gap but do not close it, and the gap persists across culture compositions and agent scales. Social interaction further erodes diversity by driving agents toward consensus, and a participatory budgeting case study shows that this homogenization narrows the breadth of collective decision-making. Together, our results establish value diversity as a distinct evaluation axis for multicultural multi-agent systems and reveal a persistent homogenization tendency in current LLM-based societies. Our code and data are publicly available at https://github.com/iNLP-Lab/MultiAgent-Diversity.
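As a rough illustration of the system-level idea described above, the following minimal sketch scores a system by the average pairwise dissimilarity between agents' answers to the same survey items. It is not the paper's metric: the normalized absolute-difference distance, the 1-10 answer scale, the function names `pairwise_dissimilarity` and `value_diversity`, and the sample answers are all assumptions made for illustration.

```python
from itertools import combinations


def pairwise_dissimilarity(a, b, scale_min=1, scale_max=10):
    """Mean absolute difference between two answer vectors, scaled to [0, 1].

    Assumption: every item is answered on the same ordinal scale.
    """
    if len(a) != len(b):
        raise ValueError("agents must answer the same set of questions")
    span = scale_max - scale_min
    return sum(abs(x - y) for x, y in zip(a, b)) / (span * len(a))


def value_diversity(responses, scale_min=1, scale_max=10):
    """Average pairwise dissimilarity over all agent pairs in one system.

    `responses` maps an agent name to its list of survey answers.
    Returns 0.0 for systems with fewer than two agents.
    """
    pairs = list(combinations(responses.values(), 2))
    if not pairs:
        return 0.0
    total = sum(pairwise_dissimilarity(a, b, scale_min, scale_max) for a, b in pairs)
    return total / len(pairs)


# Hypothetical answers from three culturally conditioned agents to four items.
system = {
    "agent_a": [1, 5, 10, 3],
    "agent_b": [2, 5, 9, 3],
    "agent_c": [10, 1, 1, 8],
}
# A system whose agents all give identical answers has no diversity.
homogeneous = {"x": [4, 4, 4, 4], "y": [4, 4, 4, 4]}

if __name__ == "__main__":
    print(value_diversity(system))
    print(value_diversity(homogeneous))
```

Because the score is computed over pairs of agents, it is a property of the system rather than of any single agent. Under this sketch, a system could contain agents that each match their target culture reasonably well and still score low if their answers cluster together.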