Vision Superalignment: Weak-to-Strong Generalization for Vision Foundation Models
February 6, 2024
Authors: Jianyuan Guo, Hanting Chen, Chengcheng Wang, Kai Han, Chang Xu, Yunhe Wang
cs.AI
Abstract
Recent advancements in large language models have sparked interest in their
extraordinary and near-superhuman capabilities, leading researchers to explore
methods for evaluating and optimizing these abilities, an effort termed
superalignment. In this context, our paper delves into the realm of vision
foundation models, focusing on the concept of weak-to-strong generalization,
which involves using a weaker model to supervise a stronger one, aiming to
enhance the latter's capabilities beyond the former's limits. We introduce a
novel and adaptively adjustable loss function for weak-to-strong supervision.
Our comprehensive experiments span various scenarios, including few-shot
learning, transfer learning, noisy label learning, and common knowledge
distillation settings. The results are striking: our approach not only exceeds
the performance benchmarks set by strong-to-strong generalization but also
surpasses the outcomes of fine-tuning strong models with whole datasets. This
compelling evidence underscores the significant potential of weak-to-strong
generalization, showcasing its capability to substantially elevate the
performance of vision foundation models. The code is available at
https://github.com/ggjy/vision_weak_to_strong.
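The abstract does not spell out the adaptive weak-to-strong loss, so the sketch below is only one plausible form of the general idea: the strong student is trained against a mixture of the weak teacher's soft labels and its own hardened predictions, with a mixing weight `alpha` that an adaptive scheme could raise as the student becomes more confident than the teacher. The function names and the specific mixture are assumptions for illustration, not the paper's actual loss; see the repository above for the real implementation.

```python
import numpy as np

def softmax(z):
    # Numerically stable softmax over the last axis.
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def weak_to_strong_loss(strong_logits, weak_probs, alpha):
    """Cross-entropy of the strong model against a mixed target.

    alpha in [0, 1] interpolates between pure weak supervision
    (alpha = 0: trust the weak teacher's soft labels) and pure
    self-training (alpha = 1: trust the strong model's own hardened
    prediction). An adaptive scheme could schedule alpha per example,
    e.g. from the strong model's confidence. This is a hypothetical
    sketch, not the loss proposed in the paper.
    """
    strong_probs = softmax(strong_logits)
    # Harden the strong model's own prediction into a one-hot target.
    hard_self = np.eye(strong_probs.shape[-1])[strong_probs.argmax(axis=-1)]
    target = (1.0 - alpha) * weak_probs + alpha * hard_self
    # Standard cross-entropy against the mixed target, averaged over the batch.
    return float(-(target * np.log(strong_probs + 1e-12)).sum(axis=-1).mean())
```

With `alpha = 0` this reduces to ordinary distillation from the weak teacher; raising `alpha` lets the strong model override weak labels it disagrees with, which is the intuition behind letting the student exceed the teacher's limits.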