

Unveiling the Backbone-Optimizer Coupling Bias in Visual Representation Learning

October 8, 2024
作者: Siyuan Li, Juanxi Tian, Zedong Wang, Luyuan Zhang, Zicheng Liu, Weiyang Jin, Yang Liu, Baigui Sun, Stan Z. Li
cs.AI

Abstract

This paper delves into the interplay between vision backbones and optimizers, unveiling an inter-dependent phenomenon termed backbone-optimizer coupling bias (BOCB). We observe that canonical CNNs, such as VGG and ResNet, exhibit a marked co-dependency with the SGD family, while recent architectures such as ViTs and ConvNeXt are tightly coupled with adaptive-learning-rate optimizers. We further show that BOCB can be introduced by both optimizers and certain backbone designs, and that it may significantly impact the pre-training and downstream fine-tuning of vision models. Through in-depth empirical analysis, we summarize takeaways on recommended optimizers and insights into robust vision backbone architectures. We hope this work inspires the community to question long-held assumptions about backbones and optimizers, stimulates further exploration, and thereby contributes to more robust vision systems. The source code and models are publicly available at https://bocb-ai.github.io/.
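The coupling the abstract describes is between backbone families and the two broad optimizer families: SGD-style updates with a single global learning rate, and adaptive methods such as Adam that rescale each parameter's step by running gradient statistics. As a reminder of that distinction (not code from the paper), here is a minimal NumPy sketch of both update rules:

```python
import numpy as np

def sgd_momentum_step(w, grad, state, lr=0.1, momentum=0.9):
    # Classic SGD with momentum: v <- mu * v + g;  w <- w - lr * v
    v = state.get("v", np.zeros_like(w))
    v = momentum * v + grad
    state["v"] = v
    return w - lr * v

def adam_step(w, grad, state, lr=0.001, b1=0.9, b2=0.999, eps=1e-8):
    # Adam: per-parameter adaptive step sizes from bias-corrected
    # first (m) and second (v) moment estimates of the gradient.
    m = state.get("m", np.zeros_like(w))
    v = state.get("v", np.zeros_like(w))
    t = state.get("t", 0) + 1
    m = b1 * m + (1 - b1) * grad
    v = b2 * v + (1 - b2) * grad ** 2
    state.update(m=m, v=v, t=t)
    m_hat = m / (1 - b1 ** t)      # bias correction
    v_hat = v / (1 - b2 ** t)
    return w - lr * m_hat / (np.sqrt(v_hat) + eps)
```

In the paper's terms, BOCB means a backbone's trainability can depend heavily on which of these update rules is used: classic CNNs tolerate the uniform SGD step well, while ViT-era architectures lean on the per-parameter rescaling that Adam-style methods provide.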
