Local Scale Equivariance with Latent Deep Equilibrium Canonicalizer
August 19, 2025
Authors: Md Ashiqur Rahman, Chiao-An Yang, Michael N. Cheng, Lim Jun Hao, Jeremiah Jiang, Teck-Yian Lim, Raymond A. Yeh
cs.AI
Abstract
Scale variation is a fundamental challenge in computer vision. Objects of the
same class can have different sizes, and their perceived size is further
affected by the distance from the camera. These variations are local to the
objects, i.e., the sizes of different objects may change differently within the
same image. To effectively handle scale variations, we present a deep equilibrium
canonicalizer (DEC) to improve the local scale equivariance of a model. DEC can
be easily incorporated into existing network architectures and can be adapted
to a pre-trained model. Notably, we show that on the competitive ImageNet
benchmark, DEC improves both model performance and local scale consistency
across four popular pre-trained deep networks: ViT, DeiT, Swin, and BEiT. Our
code is available at https://github.com/ashiq24/local-scale-equivariance.
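
For readers unfamiliar with the terminology, the block below states the standard textbook notions the abstract builds on: equivariance, invariance, canonicalization, and a deep equilibrium (fixed-point) layer. The symbols f, T_s, T'_s, c, h, and z are notation assumed here for illustration only; they are not taken from the paper, and the paper's own treatment of local (per-object) scaling and of DEC may differ.

```latex
\begin{align}
% Equivariance: scaling the input by s changes the output in a predictable way.
f(T_s x) &= T'_s f(x) && \text{(equivariance to scaling } T_s\text{)} \\
% Invariance: the special case where the output does not change at all.
f(T_s x) &= f(x) && \text{(invariance, } T'_s = \mathrm{id}\text{)} \\
% Canonicalization: c maps every scaled copy of x to one canonical form,
% so composing any model f with c yields a scale-invariant model.
c(T_s x) = c(x) \;&\Rightarrow\; f(c(T_s x)) = f(c(x)) && \text{(canonicalizer } c\text{)} \\
% Deep equilibrium: the output is defined implicitly as a fixed point of a map h.
z^\star &= h(z^\star, x) && \text{(equilibrium state } z^\star\text{)}
\end{align}
```

In the global setting a single scale s acts on the whole image; the abstract's point is that in real images s varies from object to object, which is the case the global definitions above do not cover directly.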