SEA-Guard: Culturally Grounded Multilingual Safeguard for Southeast Asia
February 2, 2026
Authors: Panuthep Tasawong, Jian Gang Ngui, Alham Fikri Aji, Trevor Cohn, Peerat Limkonchotiwat
cs.AI
Abstract
Culturally aware safeguards are crucial for AI alignment in real-world settings, where safety extends beyond common sense and encompasses diverse local values, norms, and region-specific regulations. However, building large-scale, culturally grounded datasets is challenging due to limited resources and a scarcity of native annotators. Consequently, many safeguard models rely on machine translation of English datasets, often missing regional and cultural nuances. We present a novel agentic data-generation framework to scalably create authentic, region-specific safety datasets for Southeast Asia (SEA). On this foundation, we introduce the SEA-Guard family, the first multilingual safeguard models grounded in SEA cultural contexts. Evaluated across multiple benchmarks and cultural variants, SEA-Guard consistently outperforms existing safeguards at detecting regionally sensitive or harmful content while maintaining strong general safety performance.