Empirical Study on the Characteristics and Evolution of AI-usage in GitHub Repositories: Evidence from Code Comments
June 5, 2026
Authors: Abdullah Al Mujahid, Preetha Chatterjee, Mia Mohammad Imran
cs.AI
Abstract
Developers increasingly use AI tools such as ChatGPT, Copilot, and Claude in everyday software workflows, but prior studies often evaluate LLM outputs in isolation rather than examining how developers adapt them in real projects. We analyze 35,361 GitHub code comments that explicitly reference AI use and their associated code blocks. We first open-code 500 unique comments and code blocks to derive a taxonomy of AI-assisted development activities, then annotate the full dataset using two LLM-based classifiers and aggregate predictions with Dawid-Skene expectation-maximization. We also analyze 12,996 subsequent commit messages to study how AI-assisted code evolves after introduction, and examine temporal trends from December 2022 to March 2026. Our results show that developers primarily use LLMs for code implementation, followed by code enhancement, debugging, documentation, and testing. Subsequent commits frequently involve refactoring and cleanup, feature integration and extension, and bug fixing, indicating sustained human oversight in adapting AI-assisted code. Over time, AI-referencing comments shift from direct code generation toward knowledge and conceptual support and code enhancement. These findings suggest that AI tools are becoming embedded not only as code-generation aids, but also as collaborative support mechanisms whose outputs are refined, extended, and corrected by developers over time.
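The aggregation step mentioned in the abstract, combining the predictions of two LLM-based classifiers with Dawid-Skene expectation-maximization, can be illustrated with a minimal sketch. The abstract gives no implementation details, so everything below is an assumption made for illustration: the function name `dawid_skene`, the input layout, the smoothing constant, the iteration count, and the toy labels. The category names are taken from the activities listed in the abstract, not from the paper's full taxonomy.

```python
from math import exp, log

# Illustrative category names only; the paper's actual taxonomy may differ.
CLASSES = ["implementation", "enhancement", "debugging", "documentation", "testing"]


def dawid_skene(labels, n_classes, n_iter=50, smoothing=0.01):
    """Sketch of Dawid-Skene EM for aggregating categorical labels.

    labels: one list per item, holding (annotator_id, class_id) pairs.
            Every item is assumed to have at least one label.
    Returns (posteriors, class_priors, confusion_matrices).
    """
    n_annotators = 1 + max(a for item in labels for a, _ in item)

    # Initialise each item's posterior from its vote fractions (soft majority vote).
    post = []
    for item in labels:
        counts = [0.0] * n_classes
        for _, c in item:
            counts[c] += 1.0
        total = sum(counts)
        post.append([x / total for x in counts])

    for _ in range(n_iter):
        # M-step: re-estimate class priors and per-annotator confusion matrices.
        prior = [sum(p[k] for p in post) / len(post) for k in range(n_classes)]
        conf = [[[smoothing] * n_classes for _ in range(n_classes)]
                for _ in range(n_annotators)]
        for p, item in zip(post, labels):
            for a, c in item:
                for k in range(n_classes):
                    conf[a][k][c] += p[k]  # expected count: true class k, reported c
        for a in range(n_annotators):
            for k in range(n_classes):
                row_sum = sum(conf[a][k])
                conf[a][k] = [v / row_sum for v in conf[a][k]]

        # E-step: recompute each item's posterior over the true class (log space).
        new_post = []
        for item in labels:
            logp = [log(prior[k] + 1e-12) for k in range(n_classes)]
            for a, c in item:
                for k in range(n_classes):
                    logp[k] += log(conf[a][k][c])
            m = max(logp)
            weights = [exp(v - m) for v in logp]
            s = sum(weights)
            new_post.append([w / s for w in weights])
        post = new_post

    return post, prior, conf


if __name__ == "__main__":
    # Hypothetical predictions from two classifiers (annotator ids 0 and 1).
    toy = [
        [(0, 0), (1, 0)],  # both say "implementation"
        [(0, 2), (1, 2)],  # both say "debugging"
        [(0, 3), (1, 3)],  # both say "documentation"
        [(0, 0), (1, 1)],  # disagreement: "implementation" vs "enhancement"
    ]
    posteriors, _, _ = dawid_skene(toy, n_classes=len(CLASSES))
    for row in posteriors:
        best = max(range(len(CLASSES)), key=lambda k: row[k])
        print(CLASSES[best], round(row[best], 3))
```

With only two annotators the confusion matrices are weakly identified, so the posterior on items where the classifiers disagree depends noticeably on the initialization and the smoothing constant. The sketch is meant to show the mechanics of the E-step and M-step rather than to reproduce the study's pipeline.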