Probing Cultural Signals in Large Language Models through Author Profiling

Abstract

Large language models (LLMs) are increasingly deployed in applications with societal impact, raising concerns about the cultural biases they encode. We probe these representations by evaluating whether LLMs can perform author profiling from song lyrics in a zero-shot setting, inferring singers' gender and ethnicity without task-specific fine-tuning. Across several open-source models evaluated on more than 10,000 lyrics, we find that LLMs achieve non-trivial profiling performance but exhibit systematic cultural alignment: most models default toward North American ethnicity, while DeepSeek-1.5B aligns more strongly with Asian ethnicity. This finding emerges from both the models' prediction distributions and an analysis of their generated rationales. To quantify these disparities, we introduce two fairness metrics, Modality Accuracy Divergence (MAD) and Recall Divergence (RD), and show that Ministral-8B displays the strongest ethnicity bias among the evaluated models, whereas Gemma-12B shows the most balanced behavior. Our code is available on GitHub (https://github.com/ValentinLafargue/CulturalProbingLLM).
