
Statically Contextualizing Large Language Models with Typed Holes

September 2, 2024
Authors: Andrew Blinn, Xiang Li, June Hyung Kim, Cyrus Omar
cs.AI

Abstract

Large language models (LLMs) have reshaped the landscape of program synthesis. However, contemporary LLM-based code completion systems often hallucinate broken code because they lack appropriate context, particularly when working with definitions not in the training data nor near the cursor. This paper demonstrates that tight integration with the type and binding structure of a language, as exposed by its language server, can address this contextualization problem in a token-efficient manner. In short, we contend that AIs need IDEs, too! In particular, we integrate LLM code generation into the Hazel live program sketching environment. The Hazel Language Server identifies the type and typing context of the hole being filled, even in the presence of errors, ensuring that a meaningful program sketch is always available. This allows prompting with codebase-wide contextual information not lexically local to the cursor, nor necessarily in the same file, but that is likely to be semantically local to the developer's goal. Completions synthesized by the LLM are then iteratively refined via further dialog with the language server. To evaluate these techniques, we introduce MVUBench, a dataset of model-view-update (MVU) web applications. These applications serve as challenge problems due to their reliance on application-specific data structures. We find that contextualization with type definitions is particularly impactful. After introducing our ideas in the context of Hazel we duplicate our techniques and port MVUBench to TypeScript in order to validate the applicability of these methods to higher-resource languages. Finally, we outline ChatLSP, a conservative extension to the Language Server Protocol (LSP) that language servers can implement to expose capabilities that AI code completion systems of various designs can use to incorporate static context when generating prompts for an LLM.
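To make the contextualization idea concrete, below is a minimal TypeScript sketch of how a completion system might assemble a token-efficient prompt from the static context a language server reports for the hole at the cursor: the hole's expected type, relevant type definitions, and in-scope bindings. The names and shapes here (`HoleContext`, `buildPrompt`, `completeHole`, and the two callback parameters) are illustrative assumptions, not the ChatLSP interface or the Hazel Language Server API described in the paper.

```typescript
// Hedged sketch: prompt assembly from static context reported by a language
// server. All names below are hypothetical, not the ChatLSP specification.

interface Position {
  line: number;
  character: number;
}

interface HoleContext {
  expectedType: string;      // e.g. "(model: Model, action: Action) => Model"
  relevantTypes: string[];   // definitions of types reachable from the expected type
  relevantHeaders: string[]; // signatures of in-scope bindings likely to be useful
}

// Rather than pasting whole files, include only the hole's expected type,
// semantically relevant definitions, and the program sketch itself.
function buildPrompt(ctx: HoleContext, sketch: string): string {
  return [
    "Complete the hole marked `??` in the program sketch below.",
    `Expected type of the hole: ${ctx.expectedType}`,
    "Relevant type definitions (not necessarily near the cursor):",
    ...ctx.relevantTypes,
    "Relevant bindings in scope:",
    ...ctx.relevantHeaders,
    "Program sketch:",
    sketch,
  ].join("\n");
}

async function completeHole(
  fetchHoleContext: (uri: string, pos: Position) => Promise<HoleContext>,
  queryLLM: (prompt: string) => Promise<string>,
  uri: string,
  pos: Position,
  sketch: string,
): Promise<string> {
  const ctx = await fetchHoleContext(uri, pos);
  const completion = await queryLLM(buildPrompt(ctx, sketch));
  // The paper additionally describes iterative refinement: static errors the
  // language server reports for a candidate completion are fed back to the LLM
  // in further rounds of dialog. That loop is omitted from this sketch.
  return completion;
}
```

In a real integration, `fetchHoleContext` would be backed by a language server exposing ChatLSP-style capabilities, and `queryLLM` by whatever model endpoint the completion system uses; both are stand-ins here.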