An Empirical Study of the Characteristics and Evolution of AI Use in GitHub Repositories: Evidence from Code Comments

Abstract

Developers increasingly use AI tools such as ChatGPT, Copilot, and Claude in everyday software workflows, but prior studies often evaluate LLM outputs in isolation rather than examining how developers adapt them in real projects. We analyze 35,361 GitHub code comments that explicitly reference AI use, together with their associated code blocks. We first open-code 500 unique comments and code blocks to derive a taxonomy of AI-assisted development activities, then annotate the full dataset using two LLM-based classifiers and aggregate their predictions with Dawid-Skene expectation-maximization. We also analyze 12,996 subsequent commit messages to study how AI-assisted code evolves after introduction, and examine temporal trends from December 2022 to March 2026. Our results show that developers primarily use LLMs for code implementation, followed by code enhancement, debugging, documentation, and testing. Subsequent commits frequently involve refactoring and cleanup, feature integration and extension, and bug fixing, indicating sustained human oversight in adapting AI-assisted code. Over time, AI-referencing comments shift away from direct code generation and toward two other uses: knowledge and conceptual support, and code enhancement. These findings suggest that AI tools are becoming embedded not only as code-generation aids, but also as collaborative support mechanisms whose outputs are refined, extended, and corrected by developers over time.
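
To make the aggregation step concrete, the following is a minimal sketch of Dawid-Skene expectation-maximization for combining the labels of several annotators (here, two classifiers). It is an illustration under stated assumptions, not the implementation used in this study: the function name `dawid_skene`, the `smoothing` constant, the vote-share initialization, and the toy labels are all hypothetical choices made for the example.

```python
def dawid_skene(labels, n_classes, n_iter=50, smoothing=0.01):
    """Aggregate annotator labels with Dawid-Skene EM (illustrative sketch).

    labels[i][a] is the class index annotator a gave to item i, or None.
    Returns (posteriors, prior, confusion), where posteriors[i][k] is the
    estimated probability that item i truly belongs to class k.
    """
    n_items = len(labels)
    n_annotators = len(labels[0])

    # Initialise each item's posterior from its per-item vote shares.
    post = []
    for row in labels:
        counts = [0.0] * n_classes
        for lab in row:
            if lab is not None:
                counts[lab] += 1.0
        total = sum(counts)
        if total:
            post.append([c / total for c in counts])
        else:
            post.append([1.0 / n_classes] * n_classes)

    for _ in range(n_iter):
        # M-step: class prior and one confusion matrix per annotator.
        prior = [sum(post[i][k] for i in range(n_items)) / n_items
                 for k in range(n_classes)]
        confusion = []
        for a in range(n_annotators):
            # mat[k][l] accumulates soft counts: true class k, emitted label l.
            # The smoothing constant (an assumption) avoids zero probabilities.
            mat = [[smoothing] * n_classes for _ in range(n_classes)]
            for i, row in enumerate(labels):
                lab = row[a]
                if lab is None:
                    continue
                for k in range(n_classes):
                    mat[k][lab] += post[i][k]
            confusion.append([[v / sum(r) for v in r] for r in mat])

        # E-step: posterior over the true class of each item.
        new_post = []
        for row in labels:
            scores = []
            for k in range(n_classes):
                score = prior[k]
                for a, lab in enumerate(row):
                    if lab is not None:
                        score *= confusion[a][k][lab]
                scores.append(score)
            z = sum(scores)
            new_post.append([s / z for s in scores])
        post = new_post

    return post, prior, confusion


# Hypothetical toy data: 0 = "code implementation", 1 = "debugging".
# Each row holds the labels from two classifiers for one comment.
toy_labels = [(0, 0), (0, 0), (0, 0), (1, 1), (1, 1), (0, 1)]
posteriors, prior, confusion = dawid_skene(toy_labels, n_classes=2)
final_labels = [max(range(2), key=lambda k: p[k]) for p in posteriors]
```

Unlike a plain majority vote, this scheme estimates a per-annotator confusion matrix, so a classifier that systematically confuses two categories is down-weighted on exactly those categories. With only two annotators the model is weakly identified, so the result is sensitive to the initialization and smoothing choices assumed here.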