DreamTeacher: Pretraining Image Backbones with Deep Generative Models
July 14, 2023
Authors: Daiqing Li, Huan Ling, Amlan Kar, David Acuna, Seung Wook Kim, Karsten Kreis, Antonio Torralba, Sanja Fidler
cs.AI
Abstract
In this work, we introduce a self-supervised feature representation learning
framework DreamTeacher that utilizes generative networks for pre-training
downstream image backbones. We propose to distill knowledge from a trained
generative model into standard image backbones that have been well engineered
for specific perception tasks. We investigate two types of knowledge
distillation: 1) distilling learned generative features onto target image
backbones as an alternative to pretraining these backbones on large labeled
datasets such as ImageNet, and 2) distilling labels obtained from generative
networks with task heads onto logits of target backbones. We perform extensive
analyses on multiple generative models, dense prediction benchmarks, and
several pre-training regimes. We empirically find that our DreamTeacher
significantly outperforms existing self-supervised representation learning
approaches across the board. Unsupervised ImageNet pre-training with
DreamTeacher leads to significant improvements over ImageNet classification
pre-training on downstream datasets, showcasing generative models, and
diffusion generative models specifically, as a promising approach to
representation learning on large, diverse datasets without requiring manual
annotation.
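The two distillation objectives described above can be illustrated with a minimal PyTorch sketch. The sketch below rests on assumptions of my own rather than the paper's exact recipe: a frozen generative "teacher" that exposes an intermediate feature map (and, for label distillation, logits from an attached task head), a standard image backbone as the "student", a 1x1-convolution feature regressor, and simple MSE/KL losses. Names such as FeatureRegressor and the temperature parameter are illustrative.

import torch
import torch.nn as nn
import torch.nn.functional as F

class FeatureRegressor(nn.Module):
    """Projects student features to the teacher's channel width (illustrative)."""
    def __init__(self, student_channels: int, teacher_channels: int):
        super().__init__()
        self.proj = nn.Conv2d(student_channels, teacher_channels, kernel_size=1)

    def forward(self, feat: torch.Tensor) -> torch.Tensor:
        return self.proj(feat)

def feature_distillation_loss(student_feat, teacher_feat, regressor):
    """1) Feature distillation: regress frozen generative-model features."""
    pred = regressor(student_feat)
    # Align spatial resolution before comparing feature maps.
    pred = F.interpolate(pred, size=teacher_feat.shape[-2:], mode="bilinear",
                         align_corners=False)
    return F.mse_loss(pred, teacher_feat)

def label_distillation_loss(student_logits, teacher_logits, temperature: float = 1.0):
    """2) Label distillation: match soft labels produced by the teacher's task head."""
    t = temperature
    log_p_student = F.log_softmax(student_logits / t, dim=1)
    p_teacher = F.softmax(teacher_logits / t, dim=1)
    return F.kl_div(log_p_student, p_teacher, reduction="batchmean") * (t * t)

In a pre-training loop, the teacher would be run under torch.no_grad() to produce features (and optionally soft labels), and only the student backbone and the regressor would receive gradients; the loss choices and single-scale matching here are a simplification of what the paper evaluates.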