Local Scale Equivariance with Latent Deep Equilibrium Canonicalizer
August 19, 2025
Authors: Md Ashiqur Rahman, Chiao-An Yang, Michael N. Cheng, Lim Jun Hao, Jeremiah Jiang, Teck-Yian Lim, Raymond A. Yeh
cs.AI
Abstract
Scale variation is a fundamental challenge in computer vision. Objects of the
same class can have different sizes, and their perceived size is further
affected by the distance from the camera. These variations are local to the
objects, i.e., different object sizes may change differently within the same
image. To effectively handle scale variations, we present a deep equilibrium
canonicalizer (DEC) to improve the local scale equivariance of a model. DEC can
be easily incorporated into existing network architectures and can be adapted
to a pre-trained model. Notably, we show that on the competitive ImageNet
benchmark, DEC improves both model performance and local scale consistency
across four popular pre-trained deep-nets: ViT, DeiT, Swin, and BEiT. Our
code is available at https://github.com/ashiq24/local-scale-equivariance.