Survey of Cultural Awareness in Language Models: Text and Beyond

Abstract

Large-scale deployment of large language models (LLMs) in applications such as chatbots and virtual assistants requires LLMs to be culturally sensitive to their users in order to ensure inclusivity. Culture has been widely studied in psychology and anthropology, and there has been a recent surge in research on making LLMs more culturally inclusive that goes beyond multilinguality and builds on findings from those two fields. In this paper, we survey efforts towards incorporating cultural awareness into text-based and multimodal LLMs. We start by defining cultural awareness in LLMs, taking the definitions of culture from anthropology and psychology as a point of departure. We then examine the methodologies adopted for creating cross-cultural datasets, strategies for cultural inclusion in downstream tasks, and the methodologies that have been used for benchmarking cultural awareness in LLMs. Further, we discuss the ethical implications of cultural alignment, the role of Human-Computer Interaction in driving cultural inclusion in LLMs, and the role of cultural alignment in driving social science research. Finally, we provide pointers to future research based on the gaps we identify in the literature.