
Investigating Autonomous Agent Contributions in the Wild: Activity Patterns and Code Change over Time

April 1, 2026
Authors: Razvan Mihai Popescu, David Gros, Andrei Botocan, Rahul Pandita, Prem Devanbu, Maliheh Izadi
cs.AI

Abstract

The rise of large language models for code has reshaped software development. Autonomous coding agents, able to create branches, open pull requests, and perform code reviews, now actively contribute to real-world projects. Their growing role offers a unique and timely opportunity to investigate AI-driven contributions and their effects on code quality, team dynamics, and software maintainability. In this work, we construct a novel dataset of approximately 110,000 open-source pull requests, including associated commits, comments, reviews, issues, and file changes, collectively representing millions of lines of source code. We compare five popular coding agents (OpenAI Codex, Claude Code, GitHub Copilot, Google Jules, and Devin), examining how their usage differs across development aspects such as merge frequency, edited file types, and developer interaction signals, including comments and reviews. Furthermore, we emphasize that code authoring and review are only a small part of the larger software engineering process, as the resulting code must also be maintained and updated over time. Hence, we offer several longitudinal estimates of survival and churn rates for agent-generated versus human-authored code. Ultimately, our findings indicate increasing agent activity in open-source projects, although agent contributions are associated with more churn over time than human-authored code.