LLM Program Optimization via Retrieval Augmented Search

Abstract

Recent work has demonstrated the potential of large language models (LLMs) for program optimization, a key challenge in programming languages. We propose a blackbox adaptation method called Retrieval Augmented Search (RAS) that performs beam search over candidate optimizations; at each step, it retrieves in-context examples from a given training dataset of slow-fast program pairs to guide the LLM. Critically, we find that contextual retrieval based on an LLM-generated natural language description of the program significantly outperforms retrieval based on the source code itself. We also propose AEGIS, a method for improving interpretability by decomposing training examples into "atomic edits" that are significantly more incremental in nature. We show that RAS performs up to 2.06× better than prior state-of-the-art blackbox adaptation strategies on optimizing C++ programs, and that AEGIS performs up to 1.37× better while making significantly smaller edits. We also show that using RAS improves the mean runtime percentile of Python programs by 10.27 compared to baselines.
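
The search loop described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: every name here (`Example`, `retrieve`, `retrieval_augmented_search`, and the `describe`, `propose`, and `measure` callables) is hypothetical, the LLM is replaced by toy string-rewriting stubs, runtime measurement is replaced by a made-up cost function, and token-overlap (Jaccard) similarity stands in for whatever retriever is actually used. The beam width, step count, and number of retrieved examples are arbitrary.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence


@dataclass(frozen=True)
class Example:
    """A slow-fast training pair plus a natural-language description (hypothetical layout)."""
    slow: str
    fast: str
    description: str


def jaccard(a: str, b: str) -> float:
    """Token-overlap similarity; an assumed stand-in for a real retriever."""
    ta, tb = set(a.lower().split()), set(b.lower().split())
    return len(ta & tb) / len(ta | tb) if (ta or tb) else 0.0


def retrieve(query: str, examples: Sequence[Example], k: int) -> List[Example]:
    """Return the k examples whose descriptions best match the query description."""
    ranked = sorted(examples, key=lambda ex: jaccard(query, ex.description), reverse=True)
    return ranked[:k]


def retrieval_augmented_search(
    program: str,
    examples: Sequence[Example],
    describe: Callable[[str], str],                      # stands in for the LLM describing a program
    propose: Callable[[str, List[Example]], List[str]],  # stands in for the LLM proposing rewrites
    measure: Callable[[str], float],                     # stands in for timing a candidate
    beam_width: int = 2,
    steps: int = 3,
    k: int = 2,
) -> str:
    """Beam search over candidates, retrieving in-context examples at every step."""
    beam = [program]
    for _ in range(steps):
        candidates = set(beam)  # keep the current beam so the best cost never regresses
        for prog in beam:
            shots = retrieve(describe(prog), examples, k)  # description-based retrieval
            candidates.update(propose(prog, shots))
        # Keep the lowest-cost candidates; the program text breaks ties deterministically.
        beam = sorted(candidates, key=lambda p: (measure(p), p))[:beam_width]
    return beam[0]


# --- Toy stand-ins, for illustration only ---
TOY_EXAMPLES = [
    Example("sort_bubble", "sort_std", "sorting with nested loops"),
    Example("sum_loop", "sum_builtin", "summing with an explicit loop"),
    Example("io_endl", "io_newline", "printing output with flushes"),
]
TOY_PROGRAM = "sort_bubble(xs); sum_loop(xs)"


def toy_describe(prog: str) -> str:
    """Describe which slow patterns appear in the program."""
    return " ".join(ex.description for ex in TOY_EXAMPLES if ex.slow in prog)


def toy_propose(prog: str, shots: List[Example]) -> List[str]:
    """Apply each retrieved example's slow-to-fast rewrite as a plain substitution."""
    return [prog.replace(ex.slow, ex.fast) for ex in shots]


def toy_measure(prog: str) -> float:
    """Made-up cost: slow patterns are expensive, everything else is cheap."""
    return 1.0 + 10.0 * prog.count("sort_bubble") + 5.0 * prog.count("sum_loop")
```

Calling `retrieval_augmented_search(TOY_PROGRAM, TOY_EXAMPLES, toy_describe, toy_propose, toy_measure)` runs the loop on the toy data. In a real setting, `describe` and `propose` would be LLM calls and `measure` would compile and time each candidate; the sketch only shows how retrieval on descriptions plugs into each beam-search step.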