
Unveiling the Backbone-Optimizer Coupling Bias in Visual Representation Learning

October 8, 2024
Authors: Siyuan Li, Juanxi Tian, Zedong Wang, Luyuan Zhang, Zicheng Liu, Weiyang Jin, Yang Liu, Baigui Sun, Stan Z. Li
cs.AI

Abstract

This paper delves into the interplay between vision backbones and optimizers, unveiling an inter-dependent phenomenon termed \textbf{backbone-optimizer coupling bias} (BOCB). We observe that canonical CNNs, such as VGG and ResNet, exhibit a marked co-dependency with the SGD family, while recent architectures like ViTs and ConvNeXt are tightly coupled with adaptive learning-rate methods. We further show that BOCB can be introduced by both optimizers and certain backbone designs and may significantly impact the pre-training and downstream fine-tuning of vision models. Through in-depth empirical analysis, we summarize takeaways on recommended optimizers and insights into robust vision backbone architectures. We hope this work can inspire the community to question long-held assumptions on backbones and optimizers, stimulate further explorations, and thereby contribute to more robust vision systems. The source code and models are publicly available at https://bocb-ai.github.io/.
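To make the kind of backbone-optimizer grid the abstract describes more concrete, here is a minimal sketch of such an experiment. It is not the authors' benchmark code: the backbones (torchvision's resnet18 and vit_b_16), the optimizer hyperparameters, and the synthetic data are illustrative assumptions chosen only to show how each CNN/ViT backbone can be paired with an SGD-family versus an adaptive learning-rate optimizer.

```python
# Hypothetical sketch of a backbone x optimizer grid, not the paper's actual pipeline.
import torch
import torch.nn as nn
from torchvision import models


def make_backbone(name: str, num_classes: int = 10) -> nn.Module:
    # Two representative backbones: a classical CNN and a Vision Transformer.
    if name == "resnet18":
        return models.resnet18(weights=None, num_classes=num_classes)
    if name == "vit_b_16":
        return models.vit_b_16(weights=None, num_classes=num_classes)
    raise ValueError(f"unknown backbone: {name}")


def make_optimizer(name: str, params):
    # SGD family vs. adaptive learning-rate family; hyperparameters are placeholders.
    if name == "sgd":
        return torch.optim.SGD(params, lr=0.1, momentum=0.9, weight_decay=5e-4)
    if name == "adamw":
        return torch.optim.AdamW(params, lr=1e-3, weight_decay=0.05)
    raise ValueError(f"unknown optimizer: {name}")


def short_run(backbone_name: str, optimizer_name: str, steps: int = 3) -> float:
    # Train one pairing for a few steps on synthetic data just to exercise the loop.
    torch.manual_seed(0)
    model = make_backbone(backbone_name)
    opt = make_optimizer(optimizer_name, model.parameters())
    criterion = nn.CrossEntropyLoss()
    loss = torch.tensor(0.0)
    for _ in range(steps):
        x = torch.randn(4, 3, 224, 224)      # dummy images
        y = torch.randint(0, 10, (4,))       # dummy labels
        opt.zero_grad()
        loss = criterion(model(x), y)
        loss.backward()
        opt.step()
    return loss.item()


if __name__ == "__main__":
    # The paper compares such pairings on real benchmarks over full training schedules;
    # this grid only illustrates the experimental structure.
    for backbone in ("resnet18", "vit_b_16"):
        for optimizer in ("sgd", "adamw"):
            print(backbone, optimizer, f"loss={short_run(backbone, optimizer):.3f}")
```

In the study itself, the interesting quantity is how sensitive each backbone's final performance is to the choice of optimizer and its hyperparameters; the coupling bias shows up when one backbone family degrades sharply under the "wrong" optimizer family while another remains robust.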
