NeuralFuse: Learning to Improve the Accuracy of Access-Limited Neural Network Inference in Low-Voltage Regimes
June 29, 2023
Authors: Hao-Lun Sun, Lei Hsiung, Nandhini Chandramoorthy, Pin-Yu Chen, Tsung-Yi Ho
cs.AI
Abstract
Deep neural networks (DNNs) have become ubiquitous in machine learning, but
their energy consumption remains a notable issue. Lowering the supply voltage
is an effective strategy for reducing energy consumption. However, aggressively
scaling down the supply voltage can lead to accuracy degradation due to random
bit flips in static random access memory (SRAM) where model parameters are
stored. To address this challenge, we introduce NeuralFuse, a novel add-on
module that mitigates the accuracy-energy tradeoff in low-voltage regimes by
learning input transformations that yield error-resistant data
representations. NeuralFuse protects DNN accuracy in both nominal and
low-voltage scenarios. Moreover, NeuralFuse is easy to implement and can be
readily applied to DNNs with limited access, such as non-configurable hardware
or remote access to cloud-based APIs. Experimental results demonstrate that, at
a 1% bit error rate, NeuralFuse can reduce SRAM memory access energy by up to
24% while improving accuracy by up to 57%. To the best of our knowledge, this
is the first model-agnostic approach (i.e., no model retraining) to address
low-voltage-induced bit errors. The source code is available at
https://github.com/IBM/NeuralFuse.
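
To make the mechanism concrete, below is a minimal PyTorch sketch of the idea the abstract describes, not the official IBM/NeuralFuse implementation: weights stored in SRAM are emulated as int8 values whose bits flip independently at a given bit error rate, and a small generator G is trained (with the base model frozen, matching the access-limited setting) so that the transformed input clamp(x + G(x)) stays correctly classified by randomly perturbed copies of the model. All names here (flip_random_bits, perturb_model, InputTransform, train_step) are illustrative, not taken from the repository.

```python
# Minimal sketch of the NeuralFuse idea (illustrative; not the official code).
# Assumptions: int8 weight storage, i.i.d. per-bit errors, a frozen base
# classifier `base`, and image inputs scaled to [0, 1].
import copy
import torch
import torch.nn as nn
import torch.nn.functional as F

def flip_random_bits(weight: torch.Tensor, ber: float = 0.01) -> torch.Tensor:
    """Emulate low-voltage SRAM faults: quantize to int8, flip each of the
    8 stored bits independently with probability `ber`, then dequantize."""
    scale = weight.abs().max().clamp(min=1e-8) / 127.0
    q = torch.clamp((weight / scale).round(), -128, 127).to(torch.int64)
    u = q & 0xFF                                  # two's-complement low byte
    flips = (torch.rand(*q.shape, 8, device=weight.device) < ber).to(torch.int64)
    u = u ^ (flips * (2 ** torch.arange(8, device=weight.device))).sum(dim=-1)
    q = torch.where(u >= 128, u - 256, u)         # back to signed int8 range
    return q.to(weight.dtype) * scale

def perturb_model(model: nn.Module, ber: float = 0.01) -> nn.Module:
    """Return a copy of `model` whose parameters carry fresh bit errors."""
    noisy = copy.deepcopy(model)
    with torch.no_grad():
        for p in noisy.parameters():
            p.copy_(flip_random_bits(p, ber))
    return noisy

class InputTransform(nn.Module):
    """Small generator G; the base model then sees clamp(x + G(x), 0, 1)."""
    def __init__(self, channels: int = 3):
        super().__init__()
        self.g = nn.Sequential(
            nn.Conv2d(channels, 16, 3, padding=1), nn.ReLU(),
            nn.Conv2d(16, channels, 3, padding=1), nn.Tanh(),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return torch.clamp(x + self.g(x), 0.0, 1.0)

def train_step(base: nn.Module, gen: InputTransform,
               opt: torch.optim.Optimizer, x: torch.Tensor, y: torch.Tensor,
               ber: float = 0.01, n_models: int = 4) -> float:
    """One update of G: average the loss over several randomly perturbed
    copies of the frozen base model (an expectation-over-perturbed-models
    style objective, per the paper's description)."""
    loss = sum(F.cross_entropy(perturb_model(base, ber)(gen(x)), y)
               for _ in range(n_models)) / n_models
    opt.zero_grad()
    loss.backward()   # gradients reach only G; the base model is not updated
    opt.step()
    return loss.item()
```

In this sketch only `gen` is trained (freeze the base first with `for p in base.parameters(): p.requires_grad_(False)`); at deployment, inputs pass through the generator before the possibly bit-error-afflicted model, e.g. `logits = base(gen(x))`, which is what lets the method work on non-configurable hardware or cloud APIs without retraining. The actual training objective, generator architectures, and energy measurements are given in the paper and the repository linked above.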