Guiding Language Models of Code with Global Context using Monitors

Abstract

Language models of code (LMs) work well when the code surrounding the point of generation provides sufficient context. This is not the case when the model must use types or functionality defined in another module or library, especially those not seen during training. LMs have limited awareness of such global context and end up hallucinating, e.g., using types defined in other files incorrectly. Recent work tries to overcome this issue by retrieving global information to augment the local context. However, this bloats the prompt or requires architecture modifications and additional training. Integrated development environments (IDEs) assist developers by bringing the global context to their fingertips using static analysis. We extend this assistance, enjoyed by developers, to LMs. We propose a notion of monitors that use static analysis in the background to guide decoding. Unlike a priori retrieval, static analysis is invoked iteratively throughout the decoding process, providing the most relevant suggestions on demand. We demonstrate the usefulness of our proposal by monitoring for type-consistent use of identifiers whenever an LM generates code for object dereference. To evaluate our approach, we curate PragmaticCode, a dataset of open-source projects with their development environments. On models of varying parameter scale, we show that monitor-guided decoding consistently improves not only the ability of an LM to generate identifiers that match the ground truth but also compilation rates and agreement with ground truth. We find that LMs with fewer parameters, when guided with our monitor, can outperform larger LMs. With monitor-guided decoding, SantaCoder-1.1B achieves better compilation rate and next-identifier match than the much larger text-davinci-003 model. The datasets and code will be released at https://aka.ms/monitors4codegen.
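To make the idea of monitoring object dereferences concrete, here is a minimal toy sketch in Python. It is not the authors' implementation. The static analysis is replaced by a hard-coded table (`TYPE_MEMBERS`), the language model by a fixed score function (`toy_scores`), and every name in the block is a hypothetical chosen for illustration. It only shows the general shape of the technique: at each decoding step after a dereference, the monitor is consulted again and tokens that cannot extend to a valid member name are masked out.

```python
import math

# Assumption: stand-in for a static-analysis query. Maps a receiver type to the
# member names that may legally follow a dereference ("."). A real monitor
# would obtain this from an analysis of the whole repository.
TYPE_MEMBERS = {"ServerNode": ["getHost", "getPort", "withIp"]}

# Assumption: a tiny subword vocabulary and a fixed "model" that always prefers
# the hallucinated name "host" regardless of context.
VOCAB = ["get", "Host", "Port", "with", "Ip", "host", "Name", "set"]
TOY_LOGITS = [2.0, 1.0, 1.5, 0.5, 0.2, 3.0, 2.5, 0.1]


def toy_scores(partial):
    """Return one logit per vocabulary token (the toy model ignores context)."""
    return list(TOY_LOGITS)


def monitor_allowed_tokens(vocab, members, partial):
    """Indices of tokens that keep partial + token a prefix of some member."""
    return {
        i for i, tok in enumerate(vocab)
        if any(m.startswith(partial + tok) for m in members)
    }


def mask_logits(logits, allowed):
    """Set the logit of every disallowed token to negative infinity."""
    return [x if i in allowed else -math.inf for i, x in enumerate(logits)]


def guided_identifier(vocab, score_fn, receiver_type):
    """Greedily decode one member name, consulting the monitor at every step."""
    members = TYPE_MEMBERS[receiver_type]
    partial = ""
    while partial not in members:
        allowed = monitor_allowed_tokens(vocab, members, partial)
        if not allowed:
            raise ValueError("vocabulary cannot spell any valid member")
        masked = mask_logits(score_fn(partial), allowed)
        best = max(range(len(vocab)), key=lambda i: masked[i])
        partial += vocab[best]
    return partial


# Unguided greedy decoding would start with "host"; the guided run cannot.
print(guided_identifier(VOCAB, toy_scores, "ServerNode"))  # prints "getPort"
```

In this toy setup the unmasked argmax is the token `host`, which is not a prefix of any member of the hypothetical `ServerNode` type, whereas the masked decode is restricted to `get`/`with` and then `Host`/`Port`. The model's own preferences still decide among the type-consistent options.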
