RadEdit: stress-testing biomedical vision models via diffusion image editing
December 20, 2023
Authors: Fernando Pérez-García, Sam Bond-Taylor, Pedro P. Sanchez, Boris van Breugel, Daniel C. Castro, Harshita Sharma, Valentina Salvatelli, Maria T. A. Wetscherek, Hannah Richardson, Matthew P. Lungren, Aditya Nori, Javier Alvarez-Valle, Ozan Oktay, Maximilian Ilse
cs.AI
Abstract
Biomedical imaging datasets are often small and biased, meaning that
real-world performance of predictive models can be substantially lower than
expected from internal testing. This work proposes using generative image
editing to simulate dataset shifts and diagnose failure modes of biomedical
vision models; this can be used in advance of deployment to assess readiness,
potentially reducing cost and patient harm. Existing editing methods can
produce undesirable changes, with spurious correlations learned due to the
co-occurrence of disease and treatment interventions, limiting practical
applicability. To address this, we train a text-to-image diffusion model on
multiple chest X-ray datasets and introduce a new editing method, RadEdit, that
uses multiple masks, if present, to constrain changes and ensure consistency in
the edited images. We consider three types of dataset shifts: acquisition
shift, manifestation shift, and population shift, and demonstrate that our
approach can diagnose failures and quantify model robustness without additional
data collection, complementing more qualitative tools for explainable AI.
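
The abstract describes the editing method only at a high level. Below is a minimal PyTorch sketch of what mask-constrained diffusion editing of this kind can look like. Everything here is illustrative: the function name, the denoiser signature, the DDIM-style update, and the exact mask semantics are assumptions drawn from the abstract, not the paper's actual algorithm.

```python
import torch

@torch.no_grad()
def masked_diffusion_edit(x0, edit_mask, keep_mask, denoiser, alphas_cumprod):
    """Mask-constrained diffusion editing (RePaint-style sketch).

    x0            : (B, C, H, W) original image
    edit_mask     : (B, 1, H, W), 1 where new content may be synthesised
    keep_mask     : (B, 1, H, W), 1 where the original must be preserved
    denoiser      : hypothetical callable (x_t, t, edit_mask) -> predicted clean
                    image; stands in for a trained text-conditioned model
    alphas_cumprod: (T,) noise schedule, ~1 at t=0 (clean) to ~0 at t=T-1 (noise)
    """
    num_steps = alphas_cumprod.shape[0]
    x_t = torch.randn_like(x0)  # start from pure noise
    for t in reversed(range(num_steps)):
        a_t = alphas_cumprod[t]
        # Model's estimate of the clean image; the edit mask is assumed to
        # condition where the prompted change is generated.
        x0_hat = denoiser(x_t, t, edit_mask)
        # Pin protected pixels (e.g. support devices) to the original image;
        # the remainder follows the model so the composite stays coherent
        # rather than being a hard paste.
        x0_hat = keep_mask * x0 + (1 - keep_mask) * x0_hat
        # DDIM-style (eta = 0) step back to the previous noise level.
        eps = (x_t - a_t.sqrt() * x0_hat) / (1 - a_t).clamp_min(1e-8).sqrt()
        a_prev = alphas_cumprod[t - 1] if t > 0 else torch.ones(())
        x_t = a_prev.sqrt() * x0_hat + (1 - a_prev).sqrt() * eps
    return x_t

# Toy usage with a dummy denoiser and a placeholder schedule.
x0 = torch.zeros(1, 1, 64, 64)
edit_mask = torch.zeros(1, 1, 64, 64)
edit_mask[..., 20:40, 20:40] = 1.0
keep_mask = torch.zeros_like(edit_mask)
dummy_denoiser = lambda x_t, t, m: torch.zeros_like(x_t)
alphas = torch.linspace(0.999, 0.001, steps=50)
edited = masked_diffusion_edit(x0, edit_mask, keep_mask, dummy_denoiser, alphas)
```

Pinning the keep region at every step, rather than compositing once at the end, is the standard inpainting trick for avoiding visible seams at mask boundaries; it is one plausible way to realise the consistency constraint the abstract mentions.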
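Likewise, a toy sketch (hypothetical names, not from the paper) of how the edited images could quantify robustness: evaluate the same classifier on the originals and on their synthetically shifted counterparts, and report the accuracy gap as a stress-test score.

```python
import torch

def robustness_gap(model, originals, edited, labels):
    """Accuracy drop when a classifier is evaluated on synthetically
    shifted (edited) images instead of the originals. A large positive
    gap flags a failure mode under the simulated dataset shift."""
    def accuracy(images):
        with torch.no_grad():
            return (model(images).argmax(dim=1) == labels).float().mean().item()
    return accuracy(originals) - accuracy(edited)
```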