Baichuan-M3: Modeling Clinical Inquiry for Reliable Medical Decision-Making

Abstract

We introduce Baichuan-M3, a medical-enhanced large language model designed to shift the paradigm from passive question-answering to active, clinical-grade decision support. To address the limitations of existing systems in open-ended consultations, Baichuan-M3 uses a specialized training pipeline to model the systematic workflow of a physician. Its key capabilities include: (i) proactive information acquisition to resolve ambiguity; (ii) long-horizon reasoning that unifies scattered evidence into coherent diagnoses; and (iii) adaptive hallucination suppression to ensure factual reliability. Empirical evaluations show that Baichuan-M3 achieves state-of-the-art results on HealthBench and the newly introduced HealthBench-Hallu and ScanBench, significantly outperforming GPT-5.2 in clinical inquiry, advisory, and safety. The models are publicly available at https://huggingface.co/collections/baichuan-inc/baichuan-m3.