DreamTeacher: Pretraining Image Backbones with Deep Generative Models
July 14, 2023
Authors: Daiqing Li, Huan Ling, Amlan Kar, David Acuna, Seung Wook Kim, Karsten Kreis, Antonio Torralba, Sanja Fidler
cs.AI
Abstract
In this work, we introduce a self-supervised feature representation learning
framework DreamTeacher that utilizes generative networks for pre-training
downstream image backbones. We propose to distill knowledge from a trained
generative model into standard image backbones that have been well engineered
for specific perception tasks. We investigate two types of knowledge
distillation: 1) distilling learned generative features onto target image
backbones as an alternative to pre-training these backbones on large labeled
datasets such as ImageNet, and 2) distilling labels obtained from generative
networks with task heads onto logits of target backbones. We perform extensive
analyses on multiple generative models, dense prediction benchmarks, and
several pre-training regimes. We empirically find that our DreamTeacher
significantly outperforms existing self-supervised representation learning
approaches across the board. Unsupervised ImageNet pre-training with
DreamTeacher leads to significant improvements over ImageNet classification
pre-training on downstream datasets, showcasing generative models, and
diffusion generative models specifically, as a promising approach to
representation learning on large, diverse datasets without requiring manual
annotation.
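
The abstract describes the two distillation objectives only at a high level. The sketch below is a minimal, illustrative PyTorch rendering of them under our own assumptions; the names (FeatureRegressor, feature_distillation_loss, the temperature T, and so on) are hypothetical and do not come from the paper's code.

```python
# Illustrative sketch of DreamTeacher-style distillation, under assumed
# interfaces. Not the authors' implementation.
import torch
import torch.nn as nn
import torch.nn.functional as F

class FeatureRegressor(nn.Module):
    """Projects a student (backbone) feature map to the teacher's channel width."""
    def __init__(self, in_ch: int, out_ch: int):
        super().__init__()
        self.proj = nn.Conv2d(in_ch, out_ch, kernel_size=1)

    def forward(self, x):
        return self.proj(x)

def feature_distillation_loss(student_feats, teacher_feats, regressors):
    """Objective 1: regress backbone features onto frozen generative-model
    features at matching spatial resolutions."""
    loss = 0.0
    for f_s, f_t, reg in zip(student_feats, teacher_feats, regressors):
        f_s = reg(f_s)
        # Resize the student map if resolutions differ.
        if f_s.shape[-2:] != f_t.shape[-2:]:
            f_s = F.interpolate(f_s, size=f_t.shape[-2:],
                                mode="bilinear", align_corners=False)
        loss = loss + F.mse_loss(f_s, f_t.detach())  # teacher is frozen
    return loss

def label_distillation_loss(student_logits, teacher_logits, T: float = 2.0):
    """Objective 2: distill soft labels produced by a task head on top of the
    generative network onto the student's logits (standard KD loss)."""
    p_t = F.softmax(teacher_logits.detach() / T, dim=1)
    log_p_s = F.log_softmax(student_logits / T, dim=1)
    return F.kl_div(log_p_s, p_t, reduction="batchmean") * T * T
```

In this reading, the first loss replaces supervised ImageNet pre-training (features come from a trained generative model rather than labels), while the second applies only when a task head is available to produce teacher logits.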