

NeuralFuse: Learning to Improve the Accuracy of Access-Limited Neural Network Inference in Low-Voltage Regimes

June 29, 2023
Authors: Hao-Lun Sun, Lei Hsiung, Nandhini Chandramoorthy, Pin-Yu Chen, Tsung-Yi Ho
cs.AI

Abstract

Deep neural networks (DNNs) have become ubiquitous in machine learning, but their energy consumption remains a notable issue. Lowering the supply voltage is an effective strategy for reducing energy consumption. However, aggressively scaling down the supply voltage can lead to accuracy degradation due to random bit flips in static random access memory (SRAM) where model parameters are stored. To address this challenge, we introduce NeuralFuse, a novel add-on module that addresses the accuracy-energy tradeoff in low-voltage regimes by learning input transformations to generate error-resistant data representations. NeuralFuse protects DNN accuracy in both nominal and low-voltage scenarios. Moreover, NeuralFuse is easy to implement and can be readily applied to DNNs with limited access, such as non-configurable hardware or remote access to cloud-based APIs. Experimental results demonstrate that, at a 1% bit error rate, NeuralFuse can reduce SRAM memory access energy by up to 24% while improving accuracy by up to 57%. To the best of our knowledge, this is the first model-agnostic approach (i.e., no model retraining) to address low-voltage-induced bit errors. The source code is available at https://github.com/IBM/NeuralFuse.
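The fault model the abstract describes, random bit flips in SRAM-stored model parameters under aggressive voltage scaling, can be illustrated with a short sketch. The function below (a toy example with hypothetical names, not code from the NeuralFuse repository) flips each bit of a list of 8-bit quantized weights independently with a given bit error rate:

```python
import random

def flip_bits(weights, ber, seed=0):
    """Simulate low-voltage SRAM faults: flip each bit of each
    8-bit quantized weight independently with probability `ber`."""
    rng = random.Random(seed)  # fixed seed for reproducibility
    faulty = []
    for w in weights:
        for bit in range(8):
            if rng.random() < ber:
                w ^= (1 << bit)  # flip this bit
        faulty.append(w)
    return faulty

# Toy 8-bit quantized weights; at ber=0.01 (the 1% rate used in
# the paper's experiments) roughly 1 in 100 bits is corrupted.
weights = [37, 200, 5, 129]
faulty = flip_bits(weights, ber=0.01)
```

Under this model, NeuralFuse does not touch the corrupted weights themselves; it learns a transformation applied to the *input* so that inference through the perturbed network still lands on the correct answer, which is what makes it applicable to non-configurable hardware and remote APIs.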