COMPOT: Calibration-Optimized Matrix Procrustes Orthogonalization for Transformers Compression

Abstract

Post-training compression of Transformer models commonly relies on truncated singular value decomposition (SVD). However, enforcing a single shared subspace can degrade accuracy even at moderate compression. Sparse dictionary learning provides a more flexible union-of-subspaces representation, but existing approaches often suffer from iterative dictionary and coefficient updates. We propose COMPOT (Calibration-Optimized Matrix Procrustes Orthogonalization for Transformers), a training-free compression framework that uses a small calibration dataset to estimate a sparse weight factorization. COMPOT employs orthogonal dictionaries, which enable closed-form Procrustes updates for the dictionary and analytical single-step sparse coding for the coefficients, eliminating iterative optimization. To handle heterogeneous layer sensitivity under a global compression budget, COMPOT further introduces a one-shot dynamic allocation strategy that adaptively redistributes layer-wise compression rates. Extensive experiments across diverse architectures and tasks show that COMPOT consistently delivers a superior quality-compression trade-off over strong low-rank and sparse baselines, while remaining fully compatible with post-training quantization for extreme compression. Code is available at https://github.com/mts-ai/COMPOT.
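To make the two closed-form steps named in the abstract concrete, the following is a minimal NumPy sketch of sparse factorization with an orthogonal dictionary. It is an illustration under stated assumptions, not the authors' implementation: it minimizes a plain Frobenius error ||W - D S||_F with no calibration-data weighting, uses a square orthogonal dictionary with a random initialization, enforces sparsity by keeping the top-k entries per column, and alternates for a fixed number of steps. The function names (`hard_threshold_columns`, `procrustes_update`, `factorize`) and the parameters `k` and `n_steps` are hypothetical. The sketch relies on two standard facts: for orthogonal D, ||W - D S||_F = ||D^T W - S||_F, so the best k-sparse S is a hard threshold of D^T W; and for fixed S, the best orthogonal D is U V^T from the SVD of W S^T (the orthogonal Procrustes solution).

```python
import numpy as np

def hard_threshold_columns(C, k):
    """Keep the k largest-magnitude entries in each column of C, zero the rest."""
    S = np.zeros_like(C)
    # Row indices of the k largest |C| values in every column.
    idx = np.argpartition(-np.abs(C), k - 1, axis=0)[:k]
    np.put_along_axis(S, idx, np.take_along_axis(C, idx, axis=0), axis=0)
    return S

def procrustes_update(W, S):
    """Orthogonal D minimizing ||W - D S||_F (orthogonal Procrustes solution)."""
    U, _, Vt = np.linalg.svd(W @ S.T, full_matrices=False)
    return U @ Vt

def factorize(W, k, n_steps=5, seed=0):
    """Alternate the two closed-form steps; n_steps and the random
    orthogonal initialization are assumptions made for this sketch."""
    rng = np.random.default_rng(seed)
    m = W.shape[0]
    D, _ = np.linalg.qr(rng.standard_normal((m, m)))  # random orthogonal start
    for _ in range(n_steps):
        S = hard_threshold_columns(D.T @ W, k)  # single-step sparse coding
        D = procrustes_update(W, S)             # closed-form dictionary update
    S = hard_threshold_columns(D.T @ W, k)      # final codes for the final D
    return D, S

if __name__ == "__main__":
    rng = np.random.default_rng(1)
    W = rng.standard_normal((8, 20))
    D, S = factorize(W, k=3)
    print("relative error:", np.linalg.norm(W - D @ S) / np.linalg.norm(W))
```

Because each step is the exact minimizer for its own block of variables, the reconstruction error cannot increase from one step to the next in this simplified setting. How the calibration data enters the objective and how layer-wise compression rates are allocated are not specified in the abstract and are left out of the sketch.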
