NeuralFuse: Learning to Improve the Accuracy of Access-Limited Neural Network Inference in Low-Voltage Regimes
June 29, 2023
Authors: Hao-Lun Sun, Lei Hsiung, Nandhini Chandramoorthy, Pin-Yu Chen, Tsung-Yi Ho
cs.AI
Abstract
Deep neural networks (DNNs) have become ubiquitous in machine learning, but
their energy consumption remains a notable issue. Lowering the supply voltage
is an effective strategy for reducing energy consumption. However, aggressively
scaling down the supply voltage can lead to accuracy degradation due to random
bit flips in static random access memory (SRAM) where model parameters are
stored. To address this challenge, we introduce NeuralFuse, a novel add-on
module that mitigates the accuracy-energy tradeoff in low-voltage regimes by
learning input transformations that yield error-resistant data
representations. NeuralFuse protects DNN accuracy in both nominal and
low-voltage scenarios. Moreover, NeuralFuse is easy to implement and can be
readily applied to DNNs with limited access, such as non-configurable hardware
or remote access to cloud-based APIs. Experimental results demonstrate that, at
a 1% bit error rate, NeuralFuse can reduce SRAM memory access energy by up to
24% while improving accuracy by up to 57%. To the best of our knowledge, this
is the first model-agnostic approach (i.e., no model retraining) to address
low-voltage-induced bit errors. The source code is available at
https://github.com/IBM/NeuralFuse.
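
To make the mechanism concrete, below is a minimal PyTorch sketch of the idea the abstract describes, not the official IBM/NeuralFuse implementation: weights stored in SRAM are emulated as int8 values whose bits flip independently at a given bit error rate, and a small generator G is trained (with the base model frozen, matching the access-limited setting) so that the transformed input clamp(x + G(x)) stays correctly classified by randomly perturbed copies of the model. All names here (flip_random_bits, perturb_model, InputTransform, train_step) are illustrative, not taken from the repository.

```python
# Minimal sketch of the NeuralFuse idea (illustrative; not the official code).
# Assumptions: int8 weight storage, i.i.d. per-bit errors, a frozen base
# classifier `base`, and image inputs scaled to [0, 1].
import copy
import torch
import torch.nn as nn
import torch.nn.functional as F

def flip_random_bits(weight: torch.Tensor, ber: float = 0.01) -> torch.Tensor:
    """Emulate low-voltage SRAM faults: quantize to int8, flip each of the
    8 stored bits independently with probability `ber`, then dequantize."""
    scale = weight.abs().max().clamp(min=1e-8) / 127.0
    q = torch.clamp((weight / scale).round(), -128, 127).to(torch.int64)
    u = q & 0xFF                                  # two's-complement low byte
    flips = (torch.rand(*q.shape, 8, device=weight.device) < ber).to(torch.int64)
    u = u ^ (flips * (2 ** torch.arange(8, device=weight.device))).sum(dim=-1)
    q = torch.where(u >= 128, u - 256, u)         # back to signed int8 range
    return q.to(weight.dtype) * scale

def perturb_model(model: nn.Module, ber: float = 0.01) -> nn.Module:
    """Return a copy of `model` whose parameters carry fresh bit errors."""
    noisy = copy.deepcopy(model)
    with torch.no_grad():
        for p in noisy.parameters():
            p.copy_(flip_random_bits(p, ber))
    return noisy

class InputTransform(nn.Module):
    """Small generator G; the base model then sees clamp(x + G(x), 0, 1)."""
    def __init__(self, channels: int = 3):
        super().__init__()
        self.g = nn.Sequential(
            nn.Conv2d(channels, 16, 3, padding=1), nn.ReLU(),
            nn.Conv2d(16, channels, 3, padding=1), nn.Tanh(),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return torch.clamp(x + self.g(x), 0.0, 1.0)

def train_step(base: nn.Module, gen: InputTransform,
               opt: torch.optim.Optimizer, x: torch.Tensor, y: torch.Tensor,
               ber: float = 0.01, n_models: int = 4) -> float:
    """One update of G: average the loss over several randomly perturbed
    copies of the frozen base model (an expectation-over-perturbed-models
    style objective, per the paper's description)."""
    loss = sum(F.cross_entropy(perturb_model(base, ber)(gen(x)), y)
               for _ in range(n_models)) / n_models
    opt.zero_grad()
    loss.backward()   # gradients reach only G; the base model is not updated
    opt.step()
    return loss.item()
```

In this sketch only `gen` is trained (freeze the base first with `for p in base.parameters(): p.requires_grad_(False)`); at deployment, inputs pass through the generator before the possibly bit-error-afflicted model, e.g. `logits = base(gen(x))`, which is what lets the method work on non-configurable hardware or cloud APIs without retraining. The actual training objective, generator architectures, and energy measurements are given in the paper and the repository linked above.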