Local Scale Equivariance with Latent Deep Equilibrium Canonicalizer

Abstract

Scale variation is a fundamental challenge in computer vision. Objects of the same class can have different sizes, and their perceived size is further affected by their distance from the camera. These variations are local to each object, i.e., the sizes of different objects within the same image may change in different ways. To handle scale variation effectively, we present a deep equilibrium canonicalizer (DEC) that improves the local scale equivariance of a model. DEC can be easily incorporated into existing network architectures and can be adapted to a pre-trained model. Notably, we show that on the competitive ImageNet benchmark, DEC improves both model performance and local scale consistency across four popular pre-trained deep networks: ViT, DeiT, Swin, and BEiT. Our code is available at https://github.com/ashiq24/local-scale-equivariance.
