Local Scale Equivariance with Latent Deep Equilibrium Canonicalizer

Abstract

Scale variation is a fundamental challenge in computer vision. Objects of the same class can have different sizes, and their perceived size is further affected by their distance from the camera. These variations are local to the objects, i.e., the sizes of different objects may change differently within the same image. To handle scale variations effectively, we present a deep equilibrium canonicalizer (DEC) that improves the local scale equivariance of a model. DEC can be easily incorporated into existing network architectures and can be adapted to a pre-trained model. Notably, we show that on the competitive ImageNet benchmark, DEC improves both model performance and local scale consistency across four popular pre-trained deep networks, e.g., ViT, DeiT, Swin, and BEiT. Our code is available at https://github.com/ashiq24/local-scale-equivariance.
