Probing Cultural Signals in Large Language Models through Author Profiling
March 17, 2026
Authors: Valentin Lafargue, Ariel Guerra-Adames, Emmanuelle Claeys, Elouan Vuichard, Jean-Michel Loubes
cs.AI
Abstract
Large language models (LLMs) are increasingly deployed in applications with societal impact, raising concerns about the cultural biases they encode. We probe these representations by evaluating whether LLMs can perform author profiling from song lyrics in a zero-shot setting, inferring singers' gender and ethnicity without task-specific fine-tuning. Across several open-source models evaluated on more than 10,000 lyrics, we find that LLMs achieve non-trivial profiling performance but demonstrate systematic cultural alignment: most models default toward North American ethnicity, while DeepSeek-1.5B aligns more strongly with Asian ethnicity. This finding emerges from both the models' prediction distributions and an analysis of their generated rationales. To quantify these disparities, we introduce two fairness metrics, Modality Accuracy Divergence (MAD) and Recall Divergence (RD), and show that Ministral-8B displays the strongest ethnicity bias among the evaluated models, whereas Gemma-12B shows the most balanced behavior. Our code is available on GitHub (https://github.com/ValentinLafargue/CulturalProbingLLM).
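The abstract names two fairness metrics, Modality Accuracy Divergence (MAD) and Recall Divergence (RD), but does not give their formulas. A minimal sketch of one plausible reading, assuming both measure the gap between the best- and worst-scoring demographic group (the exact definitions, function names, and the max-minus-min aggregation are assumptions, not taken from the paper):

```python
import numpy as np

def modality_accuracy_divergence(y_true, y_pred, groups):
    """Hypothetical MAD: spread of per-group prediction accuracies,
    taken here as the gap between the best- and worst-served group."""
    accs = [np.mean(y_pred[groups == g] == y_true[groups == g])
            for g in np.unique(groups)]
    return max(accs) - min(accs)

def recall_divergence(y_true, y_pred, groups, positive):
    """Hypothetical RD: spread of per-group recalls for one target label
    (e.g., one ethnicity), again as a best-minus-worst gap."""
    recalls = []
    for g in np.unique(groups):
        mask = (groups == g) & (y_true == positive)
        if mask.any():  # skip groups with no positive examples
            recalls.append(np.mean(y_pred[mask] == positive))
    return max(recalls) - min(recalls)
```

Under this reading, a perfectly balanced model scores 0 on both metrics, and larger values indicate stronger disparity across groups, which matches the abstract's use of the metrics to rank Ministral-8B as most biased and Gemma-12B as most balanced.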