Optimal Brain Apoptosis
February 25, 2025
Authors: Mingyuan Sun, Zheng Fang, Jiaxu Wang, Junjie Jiang, Delei Kong, Chenming Hu, Yuetong Fang, Renjing Xu
cs.AI
Abstract
The increasing complexity and parameter count of Convolutional Neural
Networks (CNNs) and Transformers pose challenges in terms of computational
efficiency and resource demands. Pruning has been identified as an effective
strategy to address these challenges by removing redundant elements such as
neurons, channels, or connections, thereby enhancing computational efficiency
without heavily compromising performance. This paper builds on the foundational
work of Optimal Brain Damage (OBD) by advancing the methodology of parameter
importance estimation using the Hessian matrix. Unlike previous approaches that
rely on approximations, we introduce Optimal Brain Apoptosis (OBA), a novel
pruning method that calculates the Hessian-vector product value directly for
each parameter. By decomposing the Hessian matrix across network layers and
identifying conditions under which inter-layer Hessian submatrices are
non-zero, we propose a highly efficient technique for computing the
second-order Taylor expansion of parameters. This approach allows for a more
precise pruning process, particularly in the context of CNNs and Transformers,
as validated in our experiments on VGG19, ResNet32, ResNet50, and
ViT-B/16 with the CIFAR10, CIFAR100, and ImageNet datasets. Our code is available at
https://github.com/NEU-REAL/OBA.
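The abstract's core idea — scoring each parameter by the second-order Taylor term of the loss via a Hessian-vector product, without forming the full Hessian — can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the least-squares loss, the choice of v = w, and the per-parameter saliency formula below are illustrative assumptions in the spirit of OBD-style criteria.

```python
import numpy as np

# Hedged sketch (not the OBA code): for a least-squares loss
# L(w) = 0.5 * ||Xw - y||^2, the gradient is X^T(Xw - y) and the
# Hessian is H = X^T X, so the Hessian-vector product Hv = X^T (X v)
# costs only two matrix-vector products and never materializes H.

rng = np.random.default_rng(0)
X = rng.standard_normal((32, 8))
y = rng.standard_normal(32)
w = rng.standard_normal(8)

grad = X.T @ (X @ w - y)   # first-order term of the Taylor expansion
Hw = X.T @ (X @ w)         # Hessian-vector product with v = w

# Illustrative per-parameter saliency from the second-order Taylor
# expansion of the loss around the current weights:
#   delta_L_i ~= -g_i * w_i + 0.5 * w_i * (H w)_i
saliency = -grad * w + 0.5 * w * Hw

# Prune the weights whose removal is predicted to change the loss least.
prune_idx = np.argsort(np.abs(saliency))[:2]
w_pruned = w.copy()
w_pruned[prune_idx] = 0.0
```

In a deep network the same Hessian-vector products are obtained with a double backward pass (e.g. `torch.autograd.grad` with `create_graph=True`), which is what makes a direct, approximation-free second-order criterion tractable at scale.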