The State of Multilingual LLM Safety Research: From Measuring the Language Gap to Mitigating It
May 30, 2025
Authors: Zheng-Xin Yong, Beyza Ermis, Marzieh Fadaee, Stephen H. Bach, Julia Kreutzer
cs.AI
Abstract
This paper presents a comprehensive analysis of the linguistic diversity of LLM safety research, highlighting the English-centric nature of the field. Through a systematic review of nearly 300 publications from 2020--2024 across major NLP conferences and workshops at *ACL, we identify a significant and growing language gap in LLM safety research, with even high-resource non-English languages receiving minimal attention. We further observe that non-English languages are rarely studied as standalone languages and that English safety research exhibits poor language documentation practices. To motivate future research into multilingual safety, we make several recommendations based on our survey, and we then pose three concrete future directions on safety evaluation, training data generation, and crosslingual safety generalization. Based on our survey and proposed directions, the field can develop more robust, inclusive AI safety practices for diverse global populations.