Learning to Commit: Generating Organic Pull Requests via Online Repository Memory

Abstract

Large language model (LLM)-based coding agents achieve impressive results on controlled benchmarks yet routinely produce pull requests that real maintainers reject. The root cause is not functional incorrectness but a lack of organicity: generated code ignores project-specific conventions, duplicates functionality already provided by internal APIs, and violates implicit architectural constraints accumulated over years of development. Simply exposing an agent to the latest repository snapshot is not enough: the snapshot reveals the final state of the codebase, but not the repository-specific change patterns by which that state was reached. We introduce Learning to Commit, a framework that closes this gap through Online Repository Memory. Given a repository with a strict chronological split, the agent performs supervised contrastive reflection on earlier commits: it blindly attempts to resolve each historical issue, compares its prediction against the oracle diff, and distils the gap into a continuously growing set of skills: reusable patterns that capture coding style, internal API usage, and architectural invariants. When a new PR description arrives, the agent conditions its generation on these accumulated skills, producing changes grounded in the project's own evolution rather than in generic pretraining priors. Evaluation is conducted on genuinely future, merged pull requests that could not have been seen during the skill-building phase, and spans multiple dimensions, including functional correctness, code-style consistency, internal API reuse rate, and modified-region plausibility. Experiments on an expert-maintained repository with a rich commit history show that Online Repository Memory improves organicity scores on held-out future tasks.
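The skill-building loop described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: every name here (`Commit`, `RepositoryMemory`, `chronological_split`, `build_memory`, and the `toy_*` functions) is hypothetical, the two toy functions merely stand in for LLM calls, and deduplicating skills by exact string match is an assumption made for brevity.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Sequence, Tuple


@dataclass(frozen=True)
class Commit:
    timestamp: int     # commit time in any monotonically increasing unit
    issue: str         # task description shown to the agent
    oracle_diff: str   # the change that was actually merged


@dataclass
class RepositoryMemory:
    """Growing list of skills (hypothetical representation: plain strings)."""
    skills: List[str] = field(default_factory=list)

    def add(self, skill: str) -> None:
        # Assumption: skills are deduplicated by exact match.
        if skill and skill not in self.skills:
            self.skills.append(skill)


def chronological_split(
    commits: Sequence[Commit], cutoff: int
) -> Tuple[List[Commit], List[Commit]]:
    """Commits strictly before `cutoff` build memory; the rest are held out."""
    ordered = sorted(commits, key=lambda c: c.timestamp)
    past = [c for c in ordered if c.timestamp < cutoff]
    future = [c for c in ordered if c.timestamp >= cutoff]
    return past, future


def build_memory(
    past: Sequence[Commit],
    attempt: Callable[[str, List[str]], str],
    reflect: Callable[[str, str], List[str]],
) -> RepositoryMemory:
    """Blind attempt, comparison with the oracle diff, then skill distillation."""
    memory = RepositoryMemory()
    for commit in past:
        # The oracle diff is hidden from the agent during the attempt.
        prediction = attempt(commit.issue, memory.skills)
        # Reflection sees both the prediction and the oracle diff.
        for skill in reflect(prediction, commit.oracle_diff):
            memory.add(skill)
    return memory


def toy_attempt(issue: str, skills: List[str]) -> str:
    # Stand-in for an LLM call: the "patch" simply applies every known skill.
    return " ".join(skills)


def toy_reflect(prediction: str, oracle_diff: str) -> List[str]:
    # Stand-in for contrastive reflection: whatever the oracle diff contains
    # that the prediction missed becomes a new skill.
    known = set(prediction.split())
    return [token for token in oracle_diff.split() if token not in known]
```

In this sketch, generation for a new PR description would amount to calling `attempt` with `memory.skills`, and only the commits in `future` would be used for evaluation, so the held-out tasks never influence the memory.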
