Scaling Up Your Kernels: Large Kernel Design in ConvNets towards Universal Representations
October 10, 2024
Authors: Yiyuan Zhang, Xiaohan Ding, Xiangyu Yue
cs.AI
Abstract
This paper proposes the paradigm of large convolutional kernels in designing
modern Convolutional Neural Networks (ConvNets). We establish that employing a
few large kernels, instead of stacking multiple smaller ones, can be a superior
design strategy. Our work introduces a set of architecture design guidelines
for large-kernel ConvNets that optimize their efficiency and performance. We
propose the UniRepLKNet architecture, which offers systematic architecture
design principles specifically crafted for large-kernel ConvNets, emphasizing
their unique ability to capture extensive spatial information without deep
layer stacking. This results in a model that not only surpasses its
predecessors with an ImageNet accuracy of 88.0%, an ADE20K mIoU of 55.6%, and a
COCO box AP of 56.4% but also demonstrates impressive scalability and
performance on various modalities such as time-series forecasting, audio, point
cloud, and video recognition. These results indicate the universal modeling
abilities of large-kernel ConvNets with faster inference speed compared with
vision transformers. Our findings reveal that large-kernel ConvNets possess
larger effective receptive fields and a higher shape bias, moving away from the
texture bias typical of smaller-kernel CNNs. All code and models are publicly
available at https://github.com/AILab-CVC/UniRepLKNet, promoting further
research and development in the community.
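
To make the "few large kernels instead of many small ones" claim concrete, the following is a minimal sketch of the standard theoretical receptive-field arithmetic for a stack of convolutions. It is not taken from the UniRepLKNet code: the function names and the 13x13 kernel size are assumptions chosen for illustration, and the theoretical receptive field computed here is only an upper bound on the effective receptive field that the abstract refers to.

```python
def stacked_receptive_field(kernel_sizes, strides=None):
    """Theoretical receptive field (along one axis) of stacked conv layers.

    Assumes dilation 1. Each layer widens the field by (k - 1) times the
    product of the strides of all earlier layers.
    """
    if strides is None:
        strides = [1] * len(kernel_sizes)
    rf, jump = 1, 1
    for k, s in zip(kernel_sizes, strides):
        rf += (k - 1) * jump
        jump *= s
    return rf


def layers_needed(target_rf, kernel_size=3):
    """Number of stride-1 layers of `kernel_size` needed to reach `target_rf`."""
    # Ceiling division of (target_rf - 1) by (kernel_size - 1).
    return -(-(target_rf - 1) // (kernel_size - 1))


# Illustrative comparison (kernel size 13 is an assumed example value).
one_large = stacked_receptive_field([13])
six_small = stacked_receptive_field([3] * 6)
print(one_large, six_small, layers_needed(13, kernel_size=3))
```

Under these assumptions, a single 13x13 layer covers the same theoretical extent as six stacked 3x3 layers, which is the sense in which a large kernel can capture wide spatial context without deep stacking.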