Med-Flamingo: a Multimodal Medical Few-shot Learner

July 27, 2023
Authors: Michael Moor, Qian Huang, Shirley Wu, Michihiro Yasunaga, Cyril Zakka, Yash Dalmia, Eduardo Pontes Reis, Pranav Rajpurkar, Jure Leskovec
cs.AI

Abstract

Medicine, by its nature, is a multifaceted domain that requires the synthesis of information across various modalities. Medical generative vision-language models (VLMs) make a first step in this direction and promise many exciting clinical applications. However, existing models typically have to be fine-tuned on sizeable downstream datasets, which poses a significant limitation: in many medical applications data is scarce, necessitating models that are capable of learning from few examples in real time. Here we propose Med-Flamingo, a multimodal few-shot learner adapted to the medical domain. Based on OpenFlamingo-9B, we continue pre-training on paired and interleaved medical image-text data from publications and textbooks. Med-Flamingo unlocks few-shot generative medical visual question answering (VQA) abilities, which we evaluate on several datasets, including a novel, challenging open-ended VQA dataset of visual USMLE-style problems. Furthermore, we conduct the first human evaluation for generative medical VQA, in which physicians review the problems and blinded generations in an interactive app. Med-Flamingo improves performance in generative medical VQA by up to 20% in clinicians' ratings and, for the first time, enables multimodal medical few-shot adaptations such as rationale generation. We release our model, code, and evaluation app at https://github.com/snap-stanford/med-flamingo.