Understanding Gen Alpha Digital Language: Evaluation of LLM Safety Systems for Content Moderation

Abstract

This research evaluates how AI systems interpret the digital language of Generation Alpha (Gen Alpha, born 2010-2024). As the first cohort raised alongside AI, Gen Alpha faces new forms of online risk, driven by immersive digital engagement and a growing mismatch between their evolving communication and existing safety tools. Their distinct language, shaped by gaming, memes, and AI-driven trends, often conceals harmful interactions from both human moderators and automated systems. We assess four leading AI models (GPT-4, Claude, Gemini, and Llama 3) on their ability to detect masked harassment and manipulation within Gen Alpha discourse. Using a dataset of 100 recent expressions drawn from gaming platforms, social media, and video content, the study reveals critical comprehension failures with direct implications for online safety. This work contributes: (1) a first-of-its-kind dataset capturing Gen Alpha expressions; (2) a framework to improve AI moderation systems for youth protection; (3) a multi-perspective evaluation including AI systems, human moderators, and parents, with direct input from Gen Alpha co-researchers; and (4) an analysis of how linguistic divergence increases youth vulnerability. The findings highlight the urgent need to redesign safety systems so that they are attuned to youth communication, especially given Gen Alpha's reluctance to seek help when adults fail to understand their digital world. This study combines the insight of a Gen Alpha researcher with systematic academic analysis to address critical digital safety challenges.