RadEdit: stress-testing biomedical vision models via diffusion image editing
December 20, 2023
Authors: Fernando Pérez-García, Sam Bond-Taylor, Pedro P. Sanchez, Boris van Breugel, Daniel C. Castro, Harshita Sharma, Valentina Salvatelli, Maria T. A. Wetscherek, Hannah Richardson, Matthew P. Lungren, Aditya Nori, Javier Alvarez-Valle, Ozan Oktay, Maximilian Ilse
cs.AI
Abstract
Biomedical imaging datasets are often small and biased, meaning that
real-world performance of predictive models can be substantially lower than
expected from internal testing. This work proposes using generative image
editing to simulate dataset shifts and diagnose failure modes of biomedical
vision models; this can be used in advance of deployment to assess readiness,
potentially reducing cost and patient harm. Existing editing methods can
produce undesirable changes because spurious correlations are learned from the
co-occurrence of disease and treatment interventions, limiting their practical
applicability. To address this, we train a text-to-image diffusion model on
multiple chest X-ray datasets and introduce a new editing method, RadEdit, which
uses multiple masks, if present, to constrain changes and ensure consistency in
the edited images. We consider three types of dataset shifts: acquisition
shift, manifestation shift, and population shift, and demonstrate that our
approach can diagnose failures and quantify model robustness without additional
data collection, complementing more qualitative tools for explainable AI.
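
The mask-constrained editing described above can be pictured as a RePaint-style inpainting loop: at each reverse-diffusion step, pixels inside the edit mask are resampled by the model, while pixels outside it are reset to a forward-noised copy of the original image, so unedited anatomy stays consistent. The sketch below is a minimal illustration under that assumption, not the paper's actual implementation; `masked_diffusion_edit`, `denoise_step`, and `add_noise` are hypothetical placeholders for a trained text-conditioned denoiser and the forward noising process (RadEdit additionally uses multiple masks, when present, to keep regions such as treatment devices intact).

```python
import torch

@torch.no_grad()
def masked_diffusion_edit(x_orig, edit_mask, denoise_step, add_noise, num_steps=50):
    """Illustrative mask-constrained diffusion edit (RePaint-style sketch).

    Pixels where edit_mask == 1 are generated by the diffusion model;
    pixels where edit_mask == 0 are replaced at every step with a
    forward-noised copy of the original, preserving unedited regions.
    """
    x_t = torch.randn_like(x_orig)                 # edit region starts as pure noise
    for t in reversed(range(num_steps)):
        x_t = denoise_step(x_t, t)                 # one reverse-diffusion step
        x_known = add_noise(x_orig, t)             # original image, noised to level t
        x_t = edit_mask * x_t + (1 - edit_mask) * x_known
    return x_t

# Toy usage with stand-in callables; a real run would plug in a trained
# text-conditioned chest X-ray denoiser.
x = torch.rand(1, 1, 64, 64)                       # "original" image
mask = torch.zeros_like(x)
mask[..., 16:48, 16:48] = 1.0                      # region to edit
denoise = lambda x_t, t: 0.95 * x_t                # placeholder reverse step
noise = lambda x0, t: x0 + 0.1 * torch.randn_like(x0)  # placeholder forward noising
edited = masked_diffusion_edit(x, mask, denoise, noise)
```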
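Once edited images simulating a shift exist, stress-testing reduces to comparing a model's performance on matched original and edited test sets; a large gap flags a failure mode tied to that shift. Below is a minimal sketch assuming a standard PyTorch classifier; `robustness_gap` and all names here are illustrative, not from the paper.

```python
import torch

@torch.no_grad()
def robustness_gap(model, originals, edited, labels):
    """Accuracy on original vs. edited images; the drop quantifies the
    model's sensitivity to the simulated dataset shift (e.g. a pathology
    removed, a device inserted, or an acquisition change)."""
    def accuracy(batch):
        return (model(batch).argmax(dim=1) == labels).float().mean().item()
    acc_orig, acc_edit = accuracy(originals), accuracy(edited)
    return acc_orig, acc_edit, acc_orig - acc_edit

# Toy usage with a random classifier and synthetic "edits".
model = torch.nn.Sequential(torch.nn.Flatten(), torch.nn.Linear(64 * 64, 2)).eval()
origs = torch.rand(8, 1, 64, 64)
edits = origs + 0.05 * torch.randn_like(origs)
labels = torch.randint(0, 2, (8,))
print(robustness_gap(model, origs, edits, labels))
```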