Scaling Up Your Kernels: Large Kernel Design in ConvNets towards Universal Representations
October 10, 2024
Authors: Yiyuan Zhang, Xiaohan Ding, Xiangyu Yue
cs.AI
Abstract
This paper proposes the paradigm of large convolutional kernels in designing
modern Convolutional Neural Networks (ConvNets). We establish that employing a
few large kernels, instead of stacking multiple smaller ones, can be a superior
design strategy. Our work introduces a set of architecture design guidelines
for large-kernel ConvNets that optimize their efficiency and performance. We
propose the UniRepLKNet architecture, which offers systematic architecture
design principles specifically crafted for large-kernel ConvNets, emphasizing
their unique ability to capture extensive spatial information without deep
layer stacking. This results in a model that not only surpasses its
predecessors with an ImageNet accuracy of 88.0%, an ADE20K mIoU of 55.6%, and a
COCO box AP of 56.4% but also demonstrates impressive scalability and
performance on various modalities such as time-series forecasting, audio, point
cloud, and video recognition. These results indicate the universal modeling
abilities of large-kernel ConvNets with faster inference speed compared with
vision transformers. Our findings reveal that large-kernel ConvNets possess
larger effective receptive fields and a higher shape bias, moving away from the
texture bias typical of smaller-kernel CNNs. All code and models are publicly
available at https://github.com/AILab-CVC/UniRepLKNet, promoting further
research and development in the community.
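
The trade-off between a few large kernels and many stacked small ones can be made concrete with some simple arithmetic. The sketch below is illustrative only and is not taken from the UniRepLKNet code: the function names, the 13x13 kernel size, and the 64-channel width are assumptions chosen for the example. It computes the theoretical receptive field of stride-1, undilated convolutions, which is a coarser quantity than the effective receptive field discussed in the abstract.

```python
def stacked_receptive_field(kernel_size: int, num_layers: int) -> int:
    """Theoretical receptive field (per side) of `num_layers` stacked
    stride-1, undilated convolutions with a square kernel."""
    return 1 + num_layers * (kernel_size - 1)


def layers_to_match(large_kernel: int, small_kernel: int = 3) -> int:
    """Number of stacked small-kernel layers needed to reach at least the
    theoretical receptive field of one large-kernel layer."""
    # Ceiling division of (K - 1) by (k - 1) using integer arithmetic.
    return -(-(large_kernel - 1) // (small_kernel - 1))


def depthwise_params(kernel_size: int, channels: int) -> int:
    """Weight count of one depthwise 2D convolution without bias:
    one k x k filter per channel."""
    return kernel_size * kernel_size * channels


if __name__ == "__main__":
    large, small, channels = 13, 3, 64  # hypothetical sizes for illustration

    depth = layers_to_match(large, small)
    print("3x3 layers needed to match one 13x13 layer:", depth)
    print("receptive field of that stack:", stacked_receptive_field(small, depth))
    print("depthwise weights, one 13x13 layer:", depthwise_params(large, channels))
    print("depthwise weights, 3x3 stack:", depth * depthwise_params(small, channels))
```

Under these assumptions a single large-kernel layer covers the same theoretical window as a six-layer 3x3 stack, at the cost of more depthwise weights per layer. The abstract's argument concerns the effective receptive field and the depth saved, neither of which this arithmetic measures.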