CLINIC: Evaluating Multilingual Trustworthiness in Language Models for Healthcare

December 12, 2025
Authors: Akash Ghosh, Srivarshinee Sridhar, Raghav Kaushik Ravi, Muhsin Muhsin, Sriparna Saha, Chirag Agarwal
cs.AI

Abstract

Integrating language models (LMs) into healthcare systems holds great promise for improving medical workflows and decision-making. However, a critical barrier to their real-world adoption is the lack of reliable evaluation of their trustworthiness, especially in multilingual healthcare settings. Existing LMs are predominantly trained on high-resource languages, which leaves them ill-equipped to handle the complexity and diversity of healthcare queries in mid- and low-resource languages. This poses significant challenges for deploying them in global healthcare contexts where linguistic diversity is key. In this work, we present CLINIC, a Comprehensive Multilingual Benchmark to evaluate the trustworthiness of language models in healthcare. CLINIC systematically benchmarks LMs across five key dimensions of trustworthiness: truthfulness, fairness, safety, robustness, and privacy. These dimensions are operationalized through 18 diverse tasks spanning 15 languages (covering all the major continents) and a wide array of critical healthcare topics, such as disease conditions, preventive actions, diagnostic tests, treatments, surgeries, and medications. Our extensive evaluation reveals that LMs struggle with factual correctness, demonstrate bias across demographic and linguistic groups, and are susceptible to privacy breaches and adversarial attacks. By highlighting these shortcomings, CLINIC lays the foundation for enhancing the global reach and safety of LMs in healthcare across diverse languages.