I&S-ViT: An Inclusive & Stable Method for Pushing the Limit of Post-Training ViTs Quantization

November 16, 2023
Authors: Yunshan Zhong, Jiawei Hu, Mingbao Lin, Mengzhao Chen, Rongrong Ji
cs.AI

Abstract

Despite the scalable performance of vision transformers (ViTs), their dense computational costs (training and inference) undermine their position in industrial applications. Post-training quantization (PTQ), which tunes ViTs with a tiny dataset and runs them in a low-bit format, addresses the cost issue well but unfortunately suffers larger performance drops in lower-bit cases. In this paper, we introduce I&S-ViT, a novel method that regulates the PTQ of ViTs in an inclusive and stable fashion. I&S-ViT first identifies two issues in the PTQ of ViTs: (1) quantization inefficiency in the prevalent log2 quantizer for post-Softmax activations; (2) a rugged and magnified loss landscape under coarse-grained quantization granularity for post-LayerNorm activations. I&S-ViT then addresses these issues by introducing: (1) a novel shift-uniform-log2 quantizer (SULQ) that incorporates a shift mechanism followed by uniform quantization to achieve both an inclusive domain representation and accurate distribution approximation; (2) a three-stage smooth optimization strategy (SOS) that amalgamates the strengths of channel-wise and layer-wise quantization to enable stable learning. Comprehensive evaluations across diverse vision tasks validate I&S-ViT's superiority over existing PTQ methods for ViTs, particularly in low-bit scenarios. For instance, I&S-ViT elevates the performance of 3-bit ViT-B by an impressive 50.68%.
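The abstract names the shift-uniform-log2 quantizer but does not give its formula. The Python sketch below is one plausible reading, not the authors' implementation: it assumes the input is shifted by a bias, passed through a negative log2, and the result is quantized uniformly. The function names and the parameters `eta` (shift) and `scale` (uniform step size) are hypothetical and introduced here only for illustration. A plain log2 quantizer is included for comparison.

```python
import math


def log2_quantize(x, bits):
    """Plain log2 quantizer for a value x in (0, 1].

    Returns the integer code q = round(-log2(x)), clipped to [0, 2^bits - 1],
    and the dequantized value 2^(-q).
    """
    qmax = 2 ** bits - 1
    x = max(x, 2.0 ** -126)  # guard against log2(0)
    q = min(max(round(-math.log2(x)), 0), qmax)
    return q, 2.0 ** (-q)


def shift_uniform_log2_quantize(x, bits, eta, scale):
    """Illustrative shift + log2 + uniform quantizer (assumed form).

    eta   : hypothetical positive shift added before the log2
    scale : hypothetical step size of the uniform quantizer applied
            to the log-domain value
    """
    qmax = 2 ** bits - 1
    y = -math.log2(x + eta)                  # shift, then map to the log domain
    q = min(max(round(y / scale), 0), qmax)  # uniform quantization of y
    x_hat = 2.0 ** (-(q * scale)) - eta      # invert the log2, undo the shift
    return q, x_hat


if __name__ == "__main__":
    for x in (0.0, 0.1, 0.5, 0.75):
        print(x, log2_quantize(x, 3), shift_uniform_log2_quantize(x, 3, 0.25, 0.5))
```

Under these assumptions, setting `eta = 0` and `scale = 1` recovers the plain log2 quantizer, while a positive `eta` lets the quantizer represent inputs at or near zero exactly, which is one way to read the abstract's "inclusive domain representation".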