ChatPaper.aiChatPaper

凡人数学:评估推理目标与紧急情境间的冲突

MortalMATH: Evaluating the Conflict Between Reasoning Objectives and Emergency Contexts

January 26, 2026
作者: Etienne Lanzeray, Stephane Meilliez, Malo Ruelle, Damien Sileo
cs.AI

摘要

大型语言模型正日益针对深度推理进行优化,将复杂任务的正确执行置于通用对话能力之上。我们研究这种对计算能力的专注是否会造成"隧道视野",在危急情境下忽视安全考量。我们推出MortalMATH基准测试,包含150个场景:用户在描述逐渐危及生命的紧急情况(如中风症状、自由落体)时请求代数帮助。研究发现存在显著的行为分化:通用模型(如Llama-3.1)能成功拒绝数学请求以处理危险;而专用推理模型(如Qwen-3-32b和GPT-5-nano)往往完全忽略紧急情况,在用户描述濒死状态时仍保持超过95%的任务完成率。更严重的是,推理所需的计算时间会导致危险延迟:在提供任何潜在帮助前耗时长达15秒。这些结果表明,训练模型不懈追求正确答案的做法,可能会在无意中削弱安全部署所需的生存本能。
English
Large Language Models are increasingly optimized for deep reasoning, prioritizing the correct execution of complex tasks over general conversation. We investigate whether this focus on calculation creates a "tunnel vision" that ignores safety in critical situations. We introduce MortalMATH, a benchmark of 150 scenarios where users request algebra help while describing increasingly life-threatening emergencies (e.g., stroke symptoms, freefall). We find a sharp behavioral split: generalist models (like Llama-3.1) successfully refuse the math to address the danger. In contrast, specialized reasoning models (like Qwen-3-32b and GPT-5-nano) often ignore the emergency entirely, maintaining over 95 percent task completion rates while the user describes dying. Furthermore, the computational time required for reasoning introduces dangerous delays: up to 15 seconds before any potential help is offered. These results suggest that training models to relentlessly pursue correct answers may inadvertently unlearn the survival instincts required for safe deployment.
PDF11January 28, 2026