Diffusion Models as Data Mining Tools
July 20, 2024
Authors: Ioannis Siglidis, Aleksander Holynski, Alexei A. Efros, Mathieu Aubry, Shiry Ginosar
cs.AI
Abstract
This paper demonstrates how to use generative models trained for image
synthesis as tools for visual data mining. Our insight is that since
contemporary generative models learn an accurate representation of their
training data, we can use them to summarize the data by mining for visual
patterns. Concretely, we show that after finetuning conditional diffusion
models to synthesize images from a specific dataset, we can use these models to
define a typicality measure on that dataset. This measure assesses how typical
visual elements are for different data labels, such as geographic location,
time stamps, semantic labels, or even the presence of a disease. This
analysis-by-synthesis approach to data mining has two key advantages. First, it
scales much better than traditional correspondence-based approaches since it
does not require explicitly comparing all pairs of visual elements. Second,
while most previous works on visual data mining focus on a single dataset, our
approach works on diverse datasets in terms of content and scale, including a
historical car dataset, a historical face dataset, a large worldwide
street-view dataset, and an even larger scene dataset. Furthermore, our
approach allows for translating visual elements across class labels and
analyzing consistent changes.
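To make the typicality measure concrete, below is a minimal sketch of one plausible way to compute it, assuming (as is common for conditional diffusion models) that typicality is scored as the reduction in denoising error when the model is conditioned on a label versus a null label. The noise predictor `eps_model(x_t, t, cond)` and its null-conditioning convention are hypothetical stand-ins for the paper's fine-tuned model, not its released code.

```python
import torch

def typicality(eps_model, x0, cond, null_cond, alphas_cumprod, n_samples=32):
    """Estimate how typical image x0 is for label `cond`.

    Averages (unconditional denoising error - conditional denoising error)
    over randomly sampled timesteps. Higher values mean the label helps the
    model denoise the image more, i.e. the image is more typical of the label.

    eps_model:       hypothetical noise predictor eps_model(x_t, t, cond)
    x0:              clean image tensor, shape (1, C, H, W)
    alphas_cumprod:  cumulative product of the diffusion schedule, shape (T,)
    """
    device = x0.device
    T = alphas_cumprod.shape[0]
    score = 0.0
    for _ in range(n_samples):
        t = torch.randint(0, T, (1,), device=device)
        a_bar = alphas_cumprod[t].view(-1, 1, 1, 1)
        eps = torch.randn_like(x0)
        # Forward-diffuse x0 to timestep t with the sampled noise.
        x_t = a_bar.sqrt() * x0 + (1.0 - a_bar).sqrt() * eps
        with torch.no_grad():
            err_uncond = (eps_model(x_t, t, null_cond) - eps).pow(2).mean()
            err_cond = (eps_model(x_t, t, cond) - eps).pow(2).mean()
        score += (err_uncond - err_cond).item()
    return score / n_samples
```

Because scoring an element only requires two forward passes per sampled timestep, ranking every image (or patch) in a dataset by this quantity scales linearly with dataset size, which is the claimed advantage over pairwise correspondence-based mining.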