RadEdit: stress-testing biomedical vision models via diffusion image editing

Abstract

Biomedical imaging datasets are often small and biased, so the real-world performance of predictive models can be substantially lower than internal testing suggests. This work proposes using generative image editing to simulate dataset shifts and diagnose failure modes of biomedical vision models. Such stress tests can be run before deployment to assess readiness, potentially reducing cost and patient harm. Existing editing methods can produce undesirable changes because they learn spurious correlations from the co-occurrence of disease and treatment interventions, which limits their practical applicability. To address this, we train a text-to-image diffusion model on multiple chest X-ray datasets and introduce RadEdit, a new editing method that uses multiple masks, if present, to constrain changes and ensure consistency in the edited images. We consider three types of dataset shift (acquisition shift, manifestation shift, and population shift) and demonstrate that our approach can diagnose failures and quantify model robustness without additional data collection, complementing more qualitative tools for explainable AI.
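
The idea of constraining an edit with more than one mask can be illustrated with a minimal sketch. This is not the RadEdit implementation: the helper name `composite_edit`, the mask names `edit_mask` and `keep_mask`, and the simple pixel-space blending rule are all assumptions made for illustration, and a diffusion-based editor would presumably apply such constraints during sampling rather than as a post-hoc blend. The sketch only shows how one mask can permit changes while another protects a region from them.

```python
import numpy as np


def composite_edit(original, edited, edit_mask, keep_mask):
    """Blend an edited image with its original under two binary masks.

    Hypothetical helper for illustration only.
    - Pixels inside keep_mask are always copied from the original.
    - Pixels inside edit_mask (and outside keep_mask) come from the edit.
    - All remaining pixels are copied from the original in this sketch.
    """
    original = np.asarray(original, dtype=np.float32)
    edited = np.asarray(edited, dtype=np.float32)
    edit_mask = np.asarray(edit_mask, dtype=bool)
    keep_mask = np.asarray(keep_mask, dtype=bool)

    if not (original.shape == edited.shape == edit_mask.shape == keep_mask.shape):
        raise ValueError("all inputs must share the same shape")

    # The keep mask takes priority wherever the two masks overlap.
    allowed = edit_mask & ~keep_mask
    return np.where(allowed, edited, original)


# Toy example: a 4x4 "image" of zeros and an "edit" of ones.
original = np.zeros((4, 4), dtype=np.float32)
edited = np.ones((4, 4), dtype=np.float32)

edit_mask = np.zeros((4, 4), dtype=bool)
edit_mask[1:3, 1:4] = True   # region where changes are permitted

keep_mask = np.zeros((4, 4), dtype=bool)
keep_mask[:, 3] = True       # region that must stay untouched

result = composite_edit(original, edited, edit_mask, keep_mask)
print(int(result.sum()))  # 4 pixels changed: rows 1-2, columns 1-2
```

In the toy example the two masks overlap in column 3, and the protected region wins, so only four of the six pixels in the editable region change.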
