I&S-ViT: An Inclusive & Stable Method for Pushing the Limit of Post-Training ViTs Quantization

Abstract

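Despite the scalable performance of vision transformers (ViTs), their dense computational costs in both training and inference undermine their position in industrial applications. Post-training quantization (PTQ), which tunes ViTs with a tiny dataset and runs them in a low-bit format, addresses the cost issue well but unfortunately suffers larger performance drops at lower bit-widths. In this paper, we introduce I&S-ViT, a novel method that regulates the PTQ of ViTs in an inclusive and stable fashion. I&S-ViT first identifies two issues in the PTQ of ViTs: (1) quantization inefficiency in the prevalent log2 quantizer for post-Softmax activations; (2) a rugged and magnified loss landscape under coarse-grained quantization granularity for post-LayerNorm activations. I&S-ViT then addresses these issues by introducing: (1) a novel shift-uniform-log2 quantizer (SULQ) that incorporates a shift mechanism followed by uniform quantization to achieve both an inclusive domain representation and an accurate distribution approximation; (2) a three-stage smooth optimization strategy (SOS) that amalgamates the strengths of channel-wise and layer-wise quantization to enable stable learning. Comprehensive evaluations across diverse vision tasks validate I&S-ViT's superiority over existing PTQ methods for ViTs, particularly in low-bit scenarios. For instance, I&S-ViT elevates the performance of 3-bit ViT-B by an impressive 50.68%.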
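The abstract names the SULQ pipeline (a shift, then uniform quantization around a log2 mapping) without giving formulas. The sketch below is one plausible reading of it, not the authors' implementation: the function names, the fixed shift `eta`, and the min/max calibration of the uniform step are all assumptions made for illustration. It contrasts a plain log2 quantizer with the shifted variant on a few softmax-like probabilities.

```python
import math

def log2_quantize(values, bits):
    """Plain log2 quantizer: each value becomes 2**(-k), integer k in [0, 2**bits - 1]."""
    max_code = 2 ** bits - 1
    out = []
    for v in values:
        code = round(-math.log2(max(v, 1e-12)))   # guard against log2(0)
        code = min(max(code, 0), max_code)        # clip to the available codes
        out.append(2.0 ** -code)
    return out

def shift_uniform_log2_quantize(values, bits, eta):
    """Assumed SULQ-style pipeline: shift by eta, map to the log2 domain,
    quantize uniformly there, then invert both steps. Hypothetical sketch."""
    max_code = 2 ** bits - 1
    logs = [-math.log2(v + eta) for v in values]   # shift, then log2 transform
    lo, hi = min(logs), max(logs)
    scale = max((hi - lo) / max_code, 1e-12)       # uniform step, calibrated on this batch (assumption)
    out = []
    for t in logs:
        code = min(max(round((t - lo) / scale), 0), max_code)
        out.append(2.0 ** -(code * scale + lo) - eta)   # undo the log2 map and the shift
    return out

probs = [0.9, 0.05, 0.02, 0.001, 0.0001]
print(log2_quantize(probs, 3))
print(shift_uniform_log2_quantize(probs, 3, eta=0.01))
```

In this sketch the plain 3-bit log2 quantizer cannot represent anything below 2^-7, so very small attention probabilities all collapse to that value, whereas the shifted variant spreads its eight codes over the range the inputs actually occupy. Whether this matches the paper's exact formulation is not confirmed by the abstract.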

