
Diffusion Models as Data Mining Tools

July 20, 2024
Authors: Ioannis Siglidis, Aleksander Holynski, Alexei A. Efros, Mathieu Aubry, Shiry Ginosar
cs.AI

Abstract

This paper demonstrates how to use generative models trained for image synthesis as tools for visual data mining. Our insight is that since contemporary generative models learn an accurate representation of their training data, we can use them to summarize the data by mining for visual patterns. Concretely, we show that after finetuning conditional diffusion models to synthesize images from a specific dataset, we can use these models to define a typicality measure on that dataset. This measure assesses how typical visual elements are for different data labels, such as geographic location, time stamps, semantic labels, or even the presence of a disease. This analysis-by-synthesis approach to data mining has two key advantages. First, it scales much better than traditional correspondence-based approaches since it does not require explicitly comparing all pairs of visual elements. Second, while most previous works on visual data mining focus on a single dataset, our approach works on diverse datasets in terms of content and scale, including a historical car dataset, a historical face dataset, a large worldwide street-view dataset, and an even larger scene dataset. Furthermore, our approach allows for translating visual elements across class labels and analyzing consistent changes.
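The typicality measure described above can be sketched as the gap between the model's denoising error without and with the label conditioning: visual content that the label helps reconstruct is more typical of that label. The sketch below is a minimal Monte-Carlo illustration under assumptions not taken from the abstract — the function name `typicality`, the simple linear noising schedule, and the toy denoisers standing in for a finetuned conditional diffusion model are all hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

def typicality(x, cond_denoiser, uncond_denoiser, n_samples=64):
    """Monte-Carlo estimate of how much conditioning on a label
    improves noise prediction for image x. The two denoisers stand
    in for the same finetuned diffusion model evaluated with and
    without its label input (illustrative sketch, not the paper's
    exact implementation)."""
    gaps = []
    for _ in range(n_samples):
        t = rng.uniform(0.05, 0.95)              # random timestep (away from 0/1)
        eps = rng.standard_normal(x.shape)       # sampled Gaussian noise
        x_t = np.sqrt(1 - t) * x + np.sqrt(t) * eps  # simple noising schedule (assumption)
        err_uncond = np.mean((uncond_denoiser(x_t, t) - eps) ** 2)
        err_cond = np.mean((cond_denoiser(x_t, t) - eps) ** 2)
        gaps.append(err_uncond - err_cond)       # positive gap: label helps, element is typical
    return float(np.mean(gaps))
```

A conditional denoiser that predicts the noise better than the unconditional one yields a positive score; identical denoisers yield exactly zero, so the score isolates what the label contributes. In practice this is evaluated per image patch, and the highest-scoring patches are the mined visual elements.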

