Regression Language Models for Code

Abstract

We study code-to-metric regression: predicting numeric outcomes of code executions, a challenging task due to the open-ended nature of programming languages. While prior methods have resorted to heavy, domain-specific feature engineering, we show that a single unified Regression Language Model (RLM) can simultaneously predict, directly from text, (i) the memory footprint of code across multiple high-level languages such as Python and C++, (ii) the latency of Triton GPU kernels, and (iii) the accuracy and speed of trained neural networks represented in ONNX. In particular, a relatively small 300M-parameter RLM initialized from T5Gemma obtains a Spearman rank correlation above 0.9 on competitive programming submissions from APPS, and a single unified model achieves an average Spearman rank correlation above 0.5 across 17 separate languages from CodeNet. Furthermore, the RLM obtains the highest average Kendall-Tau of 0.46 on five classic NAS design spaces previously dominated by graph neural networks, and simultaneously predicts architecture latencies on numerous hardware platforms.
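Both reported metrics are rank correlations: they score how well predicted values order the programs or architectures, not how close the predictions are in absolute terms. The following is a minimal sketch of how the two coefficients can be computed, not the authors' evaluation code. The function names and the memory values are hypothetical, and the Kendall variant shown is tau-a, which is an assumption since the abstract does not state which variant is used.

```python
from itertools import combinations


def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with values[order[i]].
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def spearman_rho(pred, true):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(pred), average_ranks(true))


def kendall_tau(pred, true):
    """Kendall tau-a: (concordant - discordant) / total number of pairs."""
    concordant = discordant = 0
    for i, j in combinations(range(len(pred)), 2):
        sign = (pred[i] - pred[j]) * (true[i] - true[j])
        if sign > 0:
            concordant += 1
        elif sign < 0:
            discordant += 1
    n_pairs = len(pred) * (len(pred) - 1) // 2
    return (concordant - discordant) / n_pairs


# Hypothetical measured vs. predicted peak memory (MB) for five programs.
true_memory = [10.0, 20.0, 30.0, 40.0, 50.0]
pred_memory = [12.0, 18.0, 45.0, 33.0, 60.0]

print(spearman_rho(pred_memory, true_memory))  # 0.9
print(kendall_tau(pred_memory, true_memory))   # 0.8
```

In this toy example only one pair of programs is ordered incorrectly, so both coefficients stay high even though several predictions are far from the measured values.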
