Med-Flamingo: A Multimodal Medical Few-shot Learner
July 27, 2023
Authors: Michael Moor, Qian Huang, Shirley Wu, Michihiro Yasunaga, Cyril Zakka, Yash Dalmia, Eduardo Pontes Reis, Pranav Rajpurkar, Jure Leskovec
cs.AI
Abstract
Medicine, by its nature, is a multifaceted domain that requires the synthesis
of information across various modalities. Medical generative vision-language
models (VLMs) make a first step in this direction and promise many exciting
clinical applications. However, existing models typically have to be fine-tuned
on sizeable downstream datasets, which poses a significant limitation because in
many medical applications data is scarce, necessitating models that are capable
of learning from few examples in real time. Here we propose Med-Flamingo, a
multimodal few-shot learner adapted to the medical domain. Based on
OpenFlamingo-9B, we continue pre-training on paired and interleaved medical
image-text data from publications and textbooks. Med-Flamingo unlocks few-shot
generative medical visual question answering (VQA) abilities, which we evaluate
on several datasets including a novel challenging open-ended VQA dataset of
visual USMLE-style problems. Furthermore, we conduct the first human evaluation
for generative medical VQA where physicians review the problems and blinded
generations in an interactive app. Med-Flamingo improves performance in
generative medical VQA by up to 20% in clinicians' ratings and, for the first
time, enables multimodal medical few-shot adaptations, such as rationale
generation. We release our model, code, and evaluation app at
https://github.com/snap-stanford/med-flamingo.