NeuralFuse: Learning to Improve the Accuracy of Access-Limited Neural Network Inference in Low-Voltage Regimes

Abstract

Deep neural networks (DNNs) have become ubiquitous in machine learning, but their energy consumption remains a notable issue. Lowering the supply voltage is an effective strategy for reducing energy consumption. However, aggressively scaling down the supply voltage can degrade accuracy because of random bit flips in the static random access memory (SRAM) where model parameters are stored. To address this challenge, we introduce NeuralFuse, a novel add-on module that addresses the accuracy-energy tradeoff in low-voltage regimes by learning input transformations that generate error-resistant data representations. NeuralFuse protects DNN accuracy in both nominal and low-voltage scenarios. Moreover, NeuralFuse is easy to implement and can be readily applied to DNNs with limited access, such as non-configurable hardware or remote access to cloud-based APIs. Experimental results demonstrate that, at a 1% bit error rate, NeuralFuse can reduce SRAM access energy by up to 24% while improving accuracy by up to 57%. To the best of our knowledge, this is the first model-agnostic approach (i.e., no model retraining) to address low-voltage-induced bit errors. The source code is available at https://github.com/IBM/NeuralFuse.
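
The fault model named in the abstract, random bit flips in SRAM-stored parameters at a given bit error rate, can be illustrated with a minimal sketch. It is not taken from the NeuralFuse code base: it assumes weights stored as unsigned 8-bit integers and bits that flip independently with equal probability, and the names flip_bits and count_flipped_bits are hypothetical helpers introduced here for illustration only.

```python
import random


def flip_bits(weights, bit_error_rate, bits=8, seed=0):
    """Return a copy of `weights` where each stored bit is flipped
    independently with probability `bit_error_rate`.

    Assumption (not from the paper): each weight is an unsigned
    integer that occupies `bits` bits of memory.
    """
    rng = random.Random(seed)
    corrupted = []
    for w in weights:
        mask = 0
        for b in range(bits):
            if rng.random() < bit_error_rate:
                mask |= 1 << b  # mark bit b as flipped
        corrupted.append(w ^ mask)  # XOR toggles exactly the marked bits
    return corrupted


def count_flipped_bits(original, corrupted):
    """Count the bit positions that differ between the two weight lists."""
    return sum(bin(a ^ b).count("1") for a, b in zip(original, corrupted))


# Hypothetical example: 10,000 random 8-bit weights at a 1% bit error rate.
_rng = random.Random(1)
weights = [_rng.randrange(256) for _ in range(10_000)]
noisy = flip_bits(weights, bit_error_rate=0.01)
observed_rate = count_flipped_bits(weights, noisy) / (len(weights) * 8)
print(f"observed bit error rate: {observed_rate:.4f}")
```

Even at a 1% rate, a flip in a high-order bit changes a weight by a large amount, which is one plausible reason why accuracy can drop sharply at low voltage. A model perturbed in this way is the kind of target that an input transformation would need to be robust against.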
