

Guidelines to Prompt Large Language Models for Code Generation: An Empirical Characterization

January 19, 2026
作者: Alessandro Midolo, Alessandro Giagnorio, Fiorella Zampetti, Rosalia Tufano, Gabriele Bavota, Massimiliano Di Penta
cs.AI

Abstract

Large Language Models (LLMs) are nowadays extensively used for various software engineering tasks, primarily code generation. Previous research has shown that suitable prompt engineering can help developers improve their code generation prompts. So far, however, no specific guidelines exist to drive developers toward writing suitable prompts for code generation. In this work, we derive and evaluate development-specific prompt optimization guidelines. First, we use an iterative, test-driven approach to automatically refine code generation prompts, and we analyze the outcome of this process to identify the prompt improvement items that lead to passing tests. From these elements we elicit 10 guidelines for prompt improvement, related to better specifying I/O and pre/post conditions, providing examples and various types of detail, and clarifying ambiguities. We then conduct an assessment with 50 practitioners, who report how often they use the elicited prompt improvement patterns as well as how useful they perceive them to be; perceived usefulness does not always match their actual usage before knowing our guidelines. Our results carry implications not only for practitioners and educators, but also for those aiming to create better LLM-aided software development tools.
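The iterative, test-driven refinement process the abstract describes can be sketched as a simple loop: generate code from the current prompt, run a test suite against it, and, on failure, extend the prompt with an improvement item (e.g., specifying I/O) until the tests pass or a budget is exhausted. This is a minimal illustrative sketch, not the paper's actual tooling; `fake_llm`, `IMPROVEMENT_ITEMS`, and the toy task are all hypothetical stand-ins.

```python
def fake_llm(prompt: str) -> str:
    """Hypothetical stand-in for a real LLM call: it returns correct code
    only once the prompt specifies the return type (one of the guideline
    categories: better specifying I/O)."""
    if "returns an int" in prompt:
        return "def add(a, b):\n    return int(a) + int(b)"
    return "def add(a, b):\n    return str(a) + str(b)"  # buggy: concatenates

def run_tests(code: str) -> bool:
    """Execute the generated code and check a small test suite."""
    ns: dict = {}
    try:
        exec(code, ns)
        return ns["add"](2, 3) == 5
    except Exception:
        return False

# Improvement items in the spirit of the elicited guidelines
# (specify I/O, provide examples, clarify ambiguities).
IMPROVEMENT_ITEMS = [
    "The function returns an int.",
    "Example: add(2, 3) == 5.",
]

def refine(prompt: str, max_rounds: int = 3) -> tuple[str, bool]:
    """Iteratively refine the prompt until the generated code passes the tests."""
    for round_idx in range(max_rounds):
        code = fake_llm(prompt)
        if run_tests(code):
            return prompt, True  # tests pass: this improvement item worked
        if round_idx < len(IMPROVEMENT_ITEMS):
            prompt += " " + IMPROVEMENT_ITEMS[round_idx]
    return prompt, False

final_prompt, passed = refine("Write a function add(a, b).")
print(passed)  # True once the refined prompt specifies the return type
```

In the study, the prompt additions that flipped failing tests to passing ones were then analyzed to derive the 10 guidelines; the loop above only mimics that mechanism on a toy example.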