Med-Flamingo: a Multimodal Medical Few-shot Learner
July 27, 2023
Authors: Michael Moor, Qian Huang, Shirley Wu, Michihiro Yasunaga, Cyril Zakka, Yash Dalmia, Eduardo Pontes Reis, Pranav Rajpurkar, Jure Leskovec
cs.AI
Abstract
Medicine, by its nature, is a multifaceted domain that requires the synthesis
of information across various modalities. Medical generative vision-language
models (VLMs) make a first step in this direction and promise many exciting
clinical applications. However, existing models typically have to be fine-tuned
on sizeable downstream datasets, which poses a significant limitation because in
many medical applications data is scarce, necessitating models that are capable
of learning from few examples in real time. Here we propose Med-Flamingo, a
multimodal few-shot learner adapted to the medical domain. Based on
OpenFlamingo-9B, we continue pre-training on paired and interleaved medical
image-text data from publications and textbooks. Med-Flamingo unlocks few-shot
generative medical visual question answering (VQA) abilities, which we evaluate
on several datasets including a novel challenging open-ended VQA dataset of
visual USMLE-style problems. Furthermore, we conduct the first human evaluation
for generative medical VQA where physicians review the problems and blinded
generations in an interactive app. Med-Flamingo improves performance in
generative medical VQA by up to 20% in clinicians' ratings and, for the first
time, enables multimodal medical few-shot adaptations, such as rationale
generation. We release our model, code, and evaluation app at
https://github.com/snap-stanford/med-flamingo.
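
To make the idea of few-shot prompting over interleaved image-text data concrete, the following is a minimal sketch of how such a prompt could be assembled. It is not taken from the Med-Flamingo repository: the `<image>` and `<|endofchunk|>` markers follow the convention commonly used with OpenFlamingo-style models and are assumed here, and the `Shot` class, the `build_few_shot_prompt` function, and the "Question: ... Answer: ..." template are hypothetical names chosen for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional

# Assumed special tokens (OpenFlamingo-style convention, not confirmed by the abstract).
IMAGE_TOKEN = "<image>"
END_OF_CHUNK = "<|endofchunk|>"


@dataclass
class Shot:
    """One image-conditioned question; `answer` is None for the final query."""
    question: str
    answer: Optional[str] = None


def build_few_shot_prompt(examples: List[Shot], query: Shot) -> str:
    """Interleave solved examples with an unanswered query (hypothetical helper).

    Each example contributes one image placeholder, its question and answer,
    and a chunk separator. The query ends with an open "Answer:" so that a
    generative model would continue from there.
    """
    parts = []
    for shot in examples:
        if shot.answer is None:
            raise ValueError("every few-shot example needs an answer")
        parts.append(
            f"{IMAGE_TOKEN}Question: {shot.question} "
            f"Answer: {shot.answer}{END_OF_CHUNK}"
        )
    parts.append(f"{IMAGE_TOKEN}Question: {query.question} Answer:")
    return "".join(parts)


if __name__ == "__main__":
    demo = build_few_shot_prompt(
        [Shot("What modality is shown?", "Chest X-ray.")],
        Shot("Is there a pleural effusion?"),
    )
    print(demo)
```

In this sketch the images themselves are passed to the model separately, in the same order as the placeholders; the number of solved examples is what "few-shot" refers to, and no model weights are updated.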