MedGemma 1.5 Technical Report

Abstract

We introduce MedGemma 1.5 4B, the latest model in the MedGemma collection. MedGemma 1.5 expands on MedGemma 1 by integrating additional capabilities: high-dimensional medical imaging (CT/MRI volumes and histopathology whole-slide images), anatomical localization via bounding boxes, multi-timepoint chest X-ray analysis, and improved medical document understanding (lab reports, electronic health records). We detail the innovations required to enable these modalities within a single architecture, including new training data, long-context 3D volume slicing, and whole-slide pathology sampling. Compared to MedGemma 1 4B, MedGemma 1.5 4B demonstrates significant gains in these new areas, improving 3D MRI condition classification accuracy by 11% and 3D CT condition classification accuracy by 3% (absolute improvements). In whole-slide pathology imaging, MedGemma 1.5 4B achieves a 47% macro F1 gain. It also improves anatomical localization on chest X-rays, with a 35% increase in Intersection over Union, and achieves a 4% gain in macro accuracy for longitudinal (multi-timepoint) chest X-ray analysis. Beyond its improved multimodal performance, MedGemma 1.5 advances text-based clinical knowledge and reasoning over MedGemma 1, with gains of 5% in MedQA accuracy and 22% in EHRQA accuracy. It also achieves an average gain of 18% in macro F1 across four lab report information extraction datasets (EHR Datasets 2, 3, 4, and Mendeley Clinical Laboratory Test Reports). Taken together, MedGemma 1.5 serves as a robust, open resource for the community, designed as an improved foundation on which developers can create the next generation of medical AI systems. Resources and tutorials for building upon MedGemma 1.5 can be found at https://goo.gle/MedGemma.
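
For readers unfamiliar with the localization metric cited above, the following is a minimal sketch of Intersection over Union for two axis-aligned bounding boxes. The `(x_min, y_min, x_max, y_max)` box convention and the function name `box_iou` are assumptions made for illustration; the report's own evaluation code and coordinate format may differ.

```python
from typing import Tuple

# Assumed convention (not taken from the report): (x_min, y_min, x_max, y_max).
Box = Tuple[float, float, float, float]


def box_iou(a: Box, b: Box) -> float:
    """Return the Intersection over Union of two axis-aligned boxes."""
    # Corners of the overlapping rectangle, if the boxes overlap at all.
    ix_min = max(a[0], b[0])
    iy_min = max(a[1], b[1])
    ix_max = min(a[2], b[2])
    iy_max = min(a[3], b[3])

    # Clamp at zero so disjoint boxes contribute no intersection area.
    inter = max(0.0, ix_max - ix_min) * max(0.0, iy_max - iy_min)

    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter

    # Degenerate (zero-area) inputs are scored as 0 rather than dividing by zero.
    return inter / union if union > 0 else 0.0
```

Under this definition, a predicted box identical to the reference box scores 1.0 and a box with no overlap scores 0.0.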