Baichuan-M3: Modeling Clinical Inquiry for Reliable Medical Decision-Making

February 6, 2026
Authors: Baichuan-M3 Team, Chengfeng Dou, Fan Yang, Fei Li, Jiyuan Jia, Qiang Ju, Shuai Wang, Tianpeng Li, Xiangrong Zeng, Yijie Zhou, Hongda Zhang, Jinyang Tai, Linzhuang Sun, Peidong Guo, Yichuan Mo, Xiaochuan Wang, Hengfu Cui, Zhishou Zhang
cs.AI

Abstract

We introduce Baichuan-M3, a medical-enhanced large language model engineered to shift the paradigm from passive question-answering to active, clinical-grade decision support. Addressing the limitations of existing systems in open-ended consultations, Baichuan-M3 utilizes a specialized training pipeline to model the systematic workflow of a physician. Key capabilities include: (i) proactive information acquisition to resolve ambiguity; (ii) long-horizon reasoning that unifies scattered evidence into coherent diagnoses; and (iii) adaptive hallucination suppression to ensure factual reliability. Empirical evaluations demonstrate that Baichuan-M3 achieves state-of-the-art results on HealthBench, the newly introduced HealthBench-Hallu and ScanBench, significantly outperforming GPT-5.2 in clinical inquiry, advisory and safety. The models are publicly available at https://huggingface.co/collections/baichuan-inc/baichuan-m3.