Regression Language Models for Code

Abstract

We study code-to-metric regression: predicting the numeric outcomes of code execution, a challenging task because of the open-ended nature of programming languages. Prior methods have relied on heavy, domain-specific feature engineering. We show that a single unified Regression Language Model (RLM) can instead predict, directly from text and simultaneously, (i) the memory footprint of code across multiple high-level languages such as Python and C++, (ii) the latency of Triton GPU kernels, and (iii) the accuracy and speed of trained neural networks represented in ONNX. In particular, a relatively small 300M-parameter RLM initialized from T5Gemma obtains a Spearman rank correlation above 0.9 on competitive programming submissions from APPS, and a single unified model achieves an average Spearman rank correlation above 0.5 across 17 separate languages from CodeNet. Furthermore, the RLM obtains the highest average Kendall's tau, 0.46, on five classic NAS design spaces previously dominated by graph neural networks, and it can simultaneously predict architecture latencies on numerous hardware platforms.
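The results above are reported as rank correlations between predicted and measured metrics. As a minimal illustration of what those two numbers measure, the sketch below computes Spearman rank correlation and Kendall's tau (the tau-b variant) in plain Python. The function names and the sample memory values are hypothetical and chosen only for illustration; this is not the evaluation code used for the reported results.

```python
from itertools import combinations
from math import sqrt


def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the run while the next sorted value equals the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        mean_rank = (start + end) / 2 + 1
        for k in range(start, end + 1):
            ranks[order[k]] = mean_rank
        start = end + 1
    return ranks


def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / sqrt(vx * vy)


def spearman(pred, true):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(pred), average_ranks(true))


def kendall_tau_b(pred, true):
    """Kendall's tau-b: pairwise agreement in ordering, corrected for ties."""
    concordant = discordant = tied_pred_only = tied_true_only = 0
    for i, j in combinations(range(len(pred)), 2):
        dp = pred[i] - pred[j]
        dt = true[i] - true[j]
        if dp == 0 and dt == 0:
            continue  # tied in both sequences: contributes to neither term
        if dp == 0:
            tied_pred_only += 1
        elif dt == 0:
            tied_true_only += 1
        elif (dp > 0) == (dt > 0):
            concordant += 1
        else:
            discordant += 1
    pairs = concordant + discordant
    denom = sqrt((pairs + tied_pred_only) * (pairs + tied_true_only))
    return (concordant - discordant) / denom


# Hypothetical peak-memory values (MB) for five programs.
predicted = [10.2, 11.0, 35.5, 20.1, 50.3]
measured = [12.0, 10.5, 30.0, 33.0, 48.0]

print(round(spearman(predicted, measured), 3))       # 0.8
print(round(kendall_tau_b(predicted, measured), 3))  # 0.6
```

Both metrics depend only on ordering, so a model can score well without matching absolute values, provided it ranks programs or architectures correctly.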
